A Note on SYSVIPC and Jails on FreeBSD

2018-03-26

The previous blogpost on jails claimed that there were “known problems” with allowing sysvipc inside jails on FreeBSD and called out postgres as a program that requires allow.sysvipc when run inside a jail, which creates security problems. As it turns out, the release of FreeBSD-11 has brought improvements in the way sysvipc can be used in jails, and it seems worthwhile to take a quick look at them.

If even a quick look is too much time for you, here’s an even shorter TL;DR:

If you need to use sysvipc inside a jail, don’t use allow.sysvipc=1; set sysvmsg, sysvsem and sysvshm to "new" instead.

The “Classic” Situation:

By default, jails do not permit any of the sysvipc mechanisms available on a usual FreeBSD system. The reason for this is that enabling sysvipc gives a jail access to the global (system-wide) sysvipc space. The only thing that prevents a program in one jail from accessing the shared sysv resources of another jail (or even of the host system itself) is the UID of the processes (as far as the author can tell). This means that the only way to securely run apps that require sysvipc inside jails is to ensure that they run under uids different from those of any other user on the system, including users in other jails.

This collides with the fact that packages (such as the postgres package) usually create system users with the same uid/gid everywhere. Hence, running several instances of such a program in different jails without creating extra user accounts may have security consequences down the road.

Releases prior to 10.4 and 11 had one simple option to enable sysvipc in jails:

allow.sysvipc = 1;

While this can be set on a per-jail basis, it still suffers from the shortcomings described above.

Example:

If we have postgres running inside a jail, then running ipcs would show something like:

pgsql01 / # ipcs
Message Queues:
T           ID          KEY MODE        OWNER    GROUP

Shared Memory:
T           ID          KEY MODE        OWNER    GROUP
m        65536      5432001 --rw------- postgres postgres

Semaphores:
T           ID          KEY MODE        OWNER    GROUP

The output on the host would be identical (but the uid/gid would be displayed numerically in the absence of a postgres user in /etc/passwd). If sysvipc has not been enabled for a jail, then the same command would show no sysv resources at all:

test01 / # ipcs
Message Queues:
T           ID          KEY MODE        OWNER    GROUP

Shared Memory:
T           ID          KEY MODE        OWNER    GROUP

Semaphores:
T           ID          KEY MODE        OWNER    GROUP

However, as soon as we enter a jail where allow.sysvipc has been set to 1, ipcs again shows all available resources, even if they belong to a different jail:

test02 / # ipcs
Message Queues:
T           ID          KEY MODE        OWNER    GROUP

Shared Memory:
T           ID          KEY MODE        OWNER    GROUP
m        65536            0 --rw------- postgres postgres

Semaphores:
T           ID          KEY MODE        OWNER    GROUP

SYSVIPC in Modern Jails:

In FreeBSD-11 and (at least according to the manpage) in FreeBSD-10.4, the allow.sysvipc option was deprecated in favor of three separate parameters: sysvmsg, sysvsem and sysvshm, which oddly do not fall into the allow subcategory of options.

Each of these parameters controls access to one type of IPC resource (again, either globally or on a per-jail basis) and accepts one of three values, one of which is new:

sysvshm = ["disable"|"inherit"|"new"];

The default “disable” disables access to this type of resource. The “inherit” option reflects the old allow.sysvipc behavior, where the jail shares the IPC “namespace” of the system, resulting in the classic situation described above.

However, setting any of these options to "new" means that the jail will get its own namespace for this resource, which seems to fix the problems described above.
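As a rough sketch, a jail.conf entry following the TL;DR above might look like the fragment below. The jail name is taken from the examples in this post; the path and hostname are assumptions made purely for illustration and will differ on a real system.

```
# /etc/jail.conf (fragment)
pgsql01 {
    path = "/jails/pgsql01";        # hypothetical location of the jail root
    host.hostname = "pgsql01";      # hypothetical hostname

    # Give the jail its own namespace for each sysvipc mechanism
    # instead of setting allow.sysvipc = 1.
    sysvmsg = "new";
    sysvsem = "new";
    sysvshm = "new";
}
```

Mechanisms a jail does not need can simply be left at their default, which keeps them disabled.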

Example:

If we have two instances of postgresql, each running inside its own jail, then the “classic” allow.sysvipc-method would yield the following ipcs-output in each of the jails:

pgsql02 / # ipcs
Message Queues:
T           ID          KEY MODE        OWNER    GROUP

Shared Memory:
T           ID          KEY MODE        OWNER    GROUP
m        65536            0 --rw------- postgres postgres
m       589825      5432001 --rw------- postgres postgres

Semaphores:
T           ID          KEY MODE        OWNER    GROUP

With the new method, it suffices to set sysvshm = "new"; for the jails in question. In this case, each jail can only see its own information:

pgsql01 / # ipcs
Message Queues:
T           ID          KEY MODE        OWNER    GROUP

Shared Memory:
T           ID          KEY MODE        OWNER    GROUP
m       131072      5432001 --rw------- postgres postgres

Semaphores:
T           ID          KEY MODE        OWNER    GROUP

pgsql02 / # ipcs
Message Queues:
T           ID          KEY MODE        OWNER    GROUP

Shared Memory:
T           ID          KEY MODE        OWNER    GROUP
m       655361      5432001 --rw------- postgres postgres

Semaphores:
T           ID          KEY MODE        OWNER    GROUP

Interestingly, the host on which the jails are running would still be able to see both “namespaces”:

~ # ipcs
Message Queues:
T           ID          KEY MODE        OWNER    GROUP

Shared Memory:
T           ID          KEY MODE        OWNER    GROUP
m       131072            0 --rw------- 770      770
m       655361            0 --rw------- 770      770

Semaphores:
T           ID          KEY MODE        OWNER    GROUP
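To compare these views without reading the tables by eye, a small helper along the following lines can count the shared memory segments in ipcs output. This is only a sketch: count_shm is a made-up name, and it assumes the output format shown above, where segment lines start with the type letter "m".

```shell
#!/bin/sh
# Count the shared memory segments listed in ipcs output read from stdin.
# Segment lines carry the type letter "m" in the first column; headers,
# blank lines and other resource types are ignored.
count_shm() {
    awk '$1 == "m" { n++ } END { print n + 0 }'
}

# Hypothetical usage on the host (jail names as in the examples above):
#   ipcs | count_shm                  # segments visible to the host
#   jexec pgsql01 ipcs | count_shm    # segments visible inside pgsql01
```

With sysvshm = "new", the count inside each postgres jail should cover only that jail's own segments, while the host count covers all of them.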

Acknowledgements:

The use of sysvshm was first suggested to us by Harald Eilertsen (@harald@quitter.no). Thanks!