Re: blocking / mount using containers

2018-07-11 Thread Daniel Walsh

On 07/10/2018 10:00 AM, Mclain, Warren wrote:


I am trying to find a solution for blocking the mounting of / from 
containers. This is a major security hole for Docker and all of those 
types of applications.


I found the mount_anyfile  Boolean but nothing that digs into that to 
show how to disable specific mountings.


Looking for any information that would help the container community in 
general.


This seems mighty arbitrary. I would think you would want to block lots 
of directories from being mounted into the container in addition to /: 
/home, /var, and /etc, for example.


What tool are you using, and what access do you want to grant to your users?


thanks

___

Warren McLain

Enterprise Engineering Services

IEI Foundation Engineering - Compute, Optum Technology

 warren_mcl...@optum.com Office: 763-744-3107


This e-mail, including attachments, may include confidential and/or
proprietary information, and may be used only by the person or entity
to which it is addressed. If the reader of this e-mail is not the intended
recipient or his or her authorized agent, the reader is hereby notified
that any dissemination, distribution or copying of this e-mail is
prohibited. If you have received this e-mail in error, please notify the
sender by replying to this message and delete this e-mail immediately.



___
Selinux mailing list
Selinux@tycho.nsa.gov
To unsubscribe, send email to selinux-le...@tycho.nsa.gov.
To get help, send an email containing "help" to selinux-requ...@tycho.nsa.gov.




Re: SELinux Namespace on bind mounted files

2018-03-08 Thread Daniel Walsh

On 03/08/2018 01:20 PM, Stephen Smalley wrote:

On 03/08/2018 05:55 AM, Zvonko Kosic wrote:

I've seen the presentation by James Morris about namespacing SELinux and I 
have a question regarding a special case we have in our environment.

We have third party prestart runtime hooks for docker which bind mount
files from the host into the container image, which have the wrong label.

To change the SELinux labels on the host is not an option because
it breaks stuff on the host.

Will the SELinux namespacing work on files that are bind mounted?

I believe the answer is yes, since my patches support per-namespace in-core 
inode SIDs and James' additional patches support per-namespace on-disk xattrs 
(so the bind-mounted files can have two distinct labels, one of which will be 
presented to processes in the root/init namespace and the other to processes 
within the child namespace).  That said, this is all very much work in progress.


I am not a big fan of namespaced SELinux.  I think it complicates things 
and will confuse people.  I would think a better solution would be to 
run your container with a different type so that you could allow access 
to these file types.


It would be a lot easier to create a type based on container-selinux 
policy and just run your container with it.



podman run -ti --security-opt label=type:mycontainer_t -v /SRC:/DEST IMAGE

Or, if you must:

docker run -ti --security-opt label=type:mycontainer_t -v /SRC:/DEST IMAGE






Re: Does selinux work with kernel namespaces?

2018-02-08 Thread Daniel Walsh

On 02/07/2018 04:10 PM, Matt Callaway wrote:

Hello,

I am attempting to run Docker on CentOS 7.4 with selinux and kernel
namespaces enabled. When I do so I observe an error that leads me to
an issue filed in github and a kernel patch that suggests that the
cause should be fixed in kernel 4.11+. Yet I cannot run docker
containers in this fashion on a 4.15 kernel.

Not sure what you mean by Kernel Namespace. Are you talking about User 
Namespace?

Should docker with selinux-enabled work on a 4.15.1 kernel on CentOS
7.4 with namespaces enabled?

Yes.

This might be a docker question, but the details I'll present below
suggest it might be more appropriate for this forum.

Details about the host and environment:

What AVC messages are you seeing?

[root@localhost ~]# uname -r
4.15.1-1.el7.elrepo.x86_64

[root@localhost ~]# cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)

[root@localhost ~]# docker --version
Docker version 17.12.0-ce, build c97c6d6

This is the latest docker-ce package from Docker's repository:

[root@localhost ~]# repoquery -i docker-ce

Name: docker-ce
Version : 17.12.0.ce
Release : 1.el7.centos
Architecture: x86_64
Size: 128453687
Packager: Docker 
Group   : Tools/Docker
URL : https://www.docker.com
Repository  : docker-ce-stable
Summary : The open-source application container engine
Source  : docker-ce-17.12.0.ce-1.el7.centos.src.rpm

The kernel is 4.15.1 from ElRepo, because that seems to be the
accepted way to get a 4.x kernel on CentOS, which I did because data
suggested I needed at least 4.11+

[root@localhost ~]# repoquery -i kernel-ml

Name: kernel-ml
Version : 4.15.1
Release : 1.el7.elrepo
Architecture: x86_64
Size: 204626242
Packager: Alan Bartlett 
Group   : System Environment/Kernel
URL : https://www.kernel.org/
Repository  : elrepo-kernel
Summary : The Linux kernel. (The core of any Linux-based operating system.)
Source  : kernel-ml-4.15.1-1.el7.elrepo.src.rpm


Here we see selinux-enabled is true and userns-remap is set to default:

[root@localhost ~]# cat /etc/docker/daemon.json
{
   "debug": true,
   "selinux-enabled": true,
   "userns-remap": "default"
}

[root@localhost ~]# docker info 2>&1 | grep -A3 Security
Security Options:
  seccomp
   Profile: default
  selinux


So when I try it I get:

[root@localhost ~]# docker run hello-world
docker: Error response from daemon: OCI runtime create failed:
container_linux.go:296: starting container process caused
"process_linux.go:301: running exec setns process for init caused
\"exit status 40\"": unknown.

When running in Permissive mode I see a different error:

[root@localhost ~]# setenforce 0

[root@localhost ~]# docker run hello-world
docker: Error response from daemon: OCI runtime create failed:
container_linux.go:296: starting container process caused
"process_linux.go:398: container init caused \"rootfs_linux.go:58:
mounting \\\"devpts\\\" to rootfs
\\\"/var/lib/docker/1001.1001/overlay2/6798981d1cf925a748187e0f2e9151f47bca9352457aa5b933a2bcb55eff9570/merged\\\"
at 
\\\"/var/lib/docker/1001.1001/overlay2/6798981d1cf925a748187e0f2e9151f47bca9352457aa5b933a2bcb55eff9570/merged/dev/pts\\\"
caused \\\"invalid argument\\\"\"": unknown.


Looking around for these symptoms I find these references...

First message goes here:

https://github.com/moby/moby/issues/35336

Suggests the namespace.unpriv_enable=1 flag. I have already enabled that:

[root@localhost ~]# grep unpriv /boot/grub2/grub.cfg
linux16 /vmlinuz-4.15.1-1.el7.elrepo.x86_64
root=/dev/mapper/VolGroup00-LogVol00 ro no_timer_check console=tty0
console=ttyS0,115200n8 net.ifnames=0 biosdevname=0 crashkernel=auto
rd.lvm.lv=VolGroup00/LogVol00 rd.lvm.lv=VolGroup00/LogVol01 rhgb quiet
LANG=en_US.UTF-8 namespace.unpriv_enable=1


Then I do setenforce 0 and we get the second devpts error which leads to:

https://github.com/opencontainers/runc/issues/1215

which leads to:

https://bugzilla.redhat.com/show_bug.cgi?id=1401537

which leads to a kernel patch:

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=01593d3299a1cfdb5e08acf95f63ec59dd674906

I've since discovered via the author that that patch was included in
kernel 4.11.

So... what am I missing? All signs suggest that this *should* work,
and yet does not.

Thank you for your time.






Re: More problems with bounds checking.

2018-01-09 Thread Daniel Walsh

Lukas Vrabec informs me that there is a new allow rule, nnp_transition:

allow container_runtime_t spc_t:process2 nnp_transition;

This allows me to get rid of all of the typebounds cruft.  Very nice. 
It will be supported in the RHEL 7.5 release.



On 01/09/2018 10:45 AM, Daniel Walsh wrote:

On 01/09/2018 10:40 AM, Stephen Smalley wrote:

On Tue, 2018-01-09 at 10:19 -0500, Daniel Walsh wrote:

For some reason semodule will not allow me to install container.pp. I
am trying to have type bounds from container_runtime_t to spc_t to
container_t.

Any reason this isn't on list?

Nope bad habit. I will send this to the list.

BTW, if you apply my "Generalize support for NNP/nosuid SELinux domain
transitions" kernel patch (or use a kernel that includes it, >= 4.14)
and enable the nnp_nosuid_transition policy capability, you shouldn't
have to use type bounds at all anymore.


I start with no type bounds.

# seinfo --typebounds

Typebounds: 0

During the install it tells me spc_t is already bound by a parent, no
clue what parent.  And the cil file does not exist when the command
completes.

Yes, that's a real pain.  To work around it, I will often run
/usr/libexec/selinux/hll/pp on the pp file to generate the cil file so
I can look at the line numbers reported, ala:
/usr/libexec/selinux/hll/pp container.pp container.cil
vi container.cil


# semodule -X 400 -i container.pp
Type spc_t already bound by parent at
/var/lib/selinux/targeted/tmp/modules/400/container/cil:35
Bad bounds statement at
/var/lib/selinux/targeted/tmp/modules/400/container/cil:1583
semodule:  Failed!


I have only the two commands added.

# grep typebounds container.te
# Added to make typebounds check work.
# typebounds container_runtime_exec_t exec_type;
# typebounds container_runtime_exec_t mountpoint;
#    unconfined_typebounds(container_runtime_t)
typebounds container_runtime_t spc_t;
typebounds spc_t container_t;

This is what is generated in the tmp file.

# grep typebounds.* tmp/container.tmp
  # unconfined_exec_typebounds(container_runtime_exec_t)
  #    unconfined_exec_typebounds(container_auth_exec_t)
# Added to make typebounds check work.
# typebounds container_runtime_exec_t exec_type;
# typebounds container_runtime_exec_t mountpoint;
#    unconfined_typebounds(container_runtime_t)
typebounds container_runtime_t spc_t;
typebounds spc_t container_t;













Re: [BUG]kernel softlockup due to sidtab_search_context run for long time because of too many sidtab context node

2017-12-15 Thread Daniel Walsh

On 12/15/2017 08:56 AM, Stephen Smalley wrote:

On Fri, 2017-12-15 at 03:09 +, yangjihong wrote:

On 12/15/2017 10:31 PM, yangjihong wrote:

On 12/14/2017 12:42 PM, Casey Schaufler wrote:

On 12/14/2017 9:15 AM, Stephen Smalley wrote:

On Thu, 2017-12-14 at 09:00 -0800, Casey Schaufler wrote:

On 12/14/2017 8:42 AM, Stephen Smalley wrote:

On Thu, 2017-12-14 at 08:18 -0800, Casey Schaufler wrote:

On 12/13/2017 7:18 AM, Stephen Smalley wrote:

On Wed, 2017-12-13 at 09:25 +, yangjihong wrote:

Hello,

I am doing stress testing on a 3.10 kernel (CentOS 7.4), constantly
starting large numbers of docker containers with selinux enabled, and
after about 2 days the kernel hits a softlockup panic:
 [] sched_show_task+0xb8/0x120
   [] show_lock_info+0x20f/0x3a0
   [] watchdog_timer_fn+0x1da/0x2f0
   [] ? watchdog_enable_all_cpus.part.4+0x40/0x40
   [] __hrtimer_run_queues+0xd2/0x260
   [] hrtimer_interrupt+0xb0/0x1e0
   [] local_apic_timer_interrupt+0x37/0x60
   [] smp_apic_timer_interrupt+0x50/0x140
   [] apic_timer_interrupt+0x6d/0x80
 [] ? sidtab_context_to_sid+0xb3/0x480
   [] ? sidtab_context_to_sid+0x110/0x480
   [] ? mls_setup_user_range+0x145/0x250
   [] security_get_user_sids+0x3f7/0x550
   [] sel_write_user+0x12b/0x210
   [] ? sel_write_member+0x200/0x200
   [] selinux_transaction_write+0x48/0x80
   [] vfs_write+0xbd/0x1e0
   [] SyS_write+0x7f/0xe0
   [] system_call_fastpath+0x16/0x1b

My opinion:
when the docker container starts, it would mount overlay filesystem
with different selinux context, mount point such as:
overlay on /var/lib/docker/overlay2/be3ef517730d92fc4530e0e952eae4f6cb0f07b4bc326cb07495ca08fc9ddb66/merged type overlay (rw,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c414,c873",lowerdir=/var/lib/docker/overlay2/l/Z4U7WY6ASNV5CFWLADPARHHWY7:/var/lib/docker/overlay2/l/V2S3HOKEFEOQLHBVAL5WLA3YLS:/var/lib/docker/overlay2/l/46YGYO474KLOULZGDSZDW2JPRI,upperdir=/var/lib/docker/overlay2/be3ef517730d92fc4530e0e952eae4f6cb0f07b4bc326cb07495ca08fc9ddb66/diff,workdir=/var/lib/docker/overlay2/be3ef517730d92fc4530e0e952eae4f6cb0f07b4bc326cb07495ca08fc9ddb66/work)
shm on /var/lib/docker/containers/9fd65e177d2132011d7b422755793449c91327ca577b8f5d9d6a4adf218d4876/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c414,c873",size=65536k)
overlay on /var/lib/docker/overlay2/38d1544d080145c7d76150530d0255991dfb7258cbca14ff6d165b94353eefab/merged type overlay (rw,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c431,c651",lowerdir=/var/lib/docker/overlay2/l/3MQQXB4UCLFB7ANVRHPAVRCRSS:/var/lib/docker/overlay2/l/46YGYO474KLOULZGDSZDW2JPRI,upperdir=/var/lib/docker/overlay2/38d1544d080145c7d76150530d0255991dfb7258cbca14ff6d165b94353eefab/diff,workdir=/var/lib/docker/overlay2/38d1544d080145c7d76150530d0255991dfb7258cbca14ff6d165b94353eefab/work)
shm on /var/lib/docker/containers/662e7f798fc08b09eae0f0f944537a4bcedc1dcf05a65866458523ffd4a71614/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c431,c651",size=65536k)

sidtab_search_context checks whether the context is in the sidtab
list. If it is not found, a new node is generated and inserted into the
list. As the number of containers increases, there are more and more
context nodes; in our test the final number of nodes reached 300,000+,
and sidtab_context_to_sid needs 100-200ms to run, which leads to the
system softlockup.

Is this a selinux bug? When a filesystem is unmounted, why is the
context node not deleted?  I cannot find the relevant function to delete
the node in sidtab.c

Thanks for reading and looking forward to your reply.

So, does docker just keep allocating a unique category set for every
new container, never reusing them even if the container is destroyed?
That would be a bug in docker IMHO.  Or are you creating an unbounded
number of containers and never destroying the older ones?

You can't reuse the security context. A process in ContainerA sends
a labeled packet to MachineB. ContainerA goes away and its context
is recycled in ContainerC. MachineB responds some time later, again
with a labeled packet. ContainerC gets information intended for
ContainerA, and uses the information to take over the Elbonian
government.

Docker isn't using labeled networking (nor is anything else by
default; it is only enabled if explicitly configured).

If labeled networking weren't an issue we'd have full security
module stacking by now. Yes, it's an edge case. If you want to
use labeled NFS or a local filesystem that gets mounted in each
container (don't tell me that nobody would do that) you've got
the same problem.

Even if someone were to configure labeled networking, Docker is not
presently relying on that or SELinux network enforcement for any
security properties, so it really doesn't matter.

True enough. I can imagine a use case, but as you point out, it
would be a very 

Re: [BUG]kernel softlockup due to sidtab_search_context run for long time because of too many sidtab context node

2017-12-14 Thread Daniel Walsh

On 12/14/2017 12:42 PM, Casey Schaufler wrote:

On 12/14/2017 9:15 AM, Stephen Smalley wrote:

On Thu, 2017-12-14 at 09:00 -0800, Casey Schaufler wrote:

On 12/14/2017 8:42 AM, Stephen Smalley wrote:

On Thu, 2017-12-14 at 08:18 -0800, Casey Schaufler wrote:

On 12/13/2017 7:18 AM, Stephen Smalley wrote:

On Wed, 2017-12-13 at 09:25 +, yangjihong wrote:

Hello,

I am doing stress testing on a 3.10 kernel (CentOS 7.4), constantly
starting large numbers of docker containers with selinux enabled, and
after about 2 days the kernel hits a softlockup panic:
[] sched_show_task+0xb8/0x120
  [] show_lock_info+0x20f/0x3a0
  [] watchdog_timer_fn+0x1da/0x2f0
  [] ? watchdog_enable_all_cpus.part.4+0x40/0x40
  [] __hrtimer_run_queues+0xd2/0x260
  [] hrtimer_interrupt+0xb0/0x1e0
  [] local_apic_timer_interrupt+0x37/0x60
  [] smp_apic_timer_interrupt+0x50/0x140
  [] apic_timer_interrupt+0x6d/0x80
[] ? sidtab_context_to_sid+0xb3/0x480
  [] ? sidtab_context_to_sid+0x110/0x480
  [] ? mls_setup_user_range+0x145/0x250
  [] security_get_user_sids+0x3f7/0x550
  [] sel_write_user+0x12b/0x210
  [] ? sel_write_member+0x200/0x200
  [] selinux_transaction_write+0x48/0x80
  [] vfs_write+0xbd/0x1e0
  [] SyS_write+0x7f/0xe0
  [] system_call_fastpath+0x16/0x1b

My opinion:
when the docker container starts, it would mount overlay filesystem
with different selinux context, mount point such as:
overlay on /var/lib/docker/overlay2/be3ef517730d92fc4530e0e952eae4f6cb0f07b4bc326cb07495ca08fc9ddb66/merged type overlay (rw,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c414,c873",lowerdir=/var/lib/docker/overlay2/l/Z4U7WY6ASNV5CFWLADPARHHWY7:/var/lib/docker/overlay2/l/V2S3HOKEFEOQLHBVAL5WLA3YLS:/var/lib/docker/overlay2/l/46YGYO474KLOULZGDSZDW2JPRI,upperdir=/var/lib/docker/overlay2/be3ef517730d92fc4530e0e952eae4f6cb0f07b4bc326cb07495ca08fc9ddb66/diff,workdir=/var/lib/docker/overlay2/be3ef517730d92fc4530e0e952eae4f6cb0f07b4bc326cb07495ca08fc9ddb66/work)
shm on /var/lib/docker/containers/9fd65e177d2132011d7b422755793449c91327ca577b8f5d9d6a4adf218d4876/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c414,c873",size=65536k)
overlay on /var/lib/docker/overlay2/38d1544d080145c7d76150530d0255991dfb7258cbca14ff6d165b94353eefab/merged type overlay (rw,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c431,c651",lowerdir=/var/lib/docker/overlay2/l/3MQQXB4UCLFB7ANVRHPAVRCRSS:/var/lib/docker/overlay2/l/46YGYO474KLOULZGDSZDW2JPRI,upperdir=/var/lib/docker/overlay2/38d1544d080145c7d76150530d0255991dfb7258cbca14ff6d165b94353eefab/diff,workdir=/var/lib/docker/overlay2/38d1544d080145c7d76150530d0255991dfb7258cbca14ff6d165b94353eefab/work)
shm on /var/lib/docker/containers/662e7f798fc08b09eae0f0f944537a4bcedc1dcf05a65866458523ffd4a71614/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,context="system_u:object_r:svirt_sandbox_file_t:s0:c431,c651",size=65536k)

sidtab_search_context checks whether the context is in the sidtab
list. If it is not found, a new node is generated and inserted into the
list. As the number of containers increases, there are more and more
context nodes; in our test the final number of nodes reached 300,000+,
and sidtab_context_to_sid needs 100-200ms to run, which leads to the
system softlockup.
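
The growth described in this report can be illustrated with a toy model. The sketch below is a deliberately simplified Python stand-in, not the kernel's sidtab code: it assumes a plain list that is scanned linearly, that entries are never removed, and that every new container brings one previously unseen context. The names ToySidtab and simulate, and the synthetic category pairs, are hypothetical.

```python
class ToySidtab:
    """Simplified model: contexts are only ever appended, never freed."""

    def __init__(self):
        self.contexts = []     # position + 1 plays the role of the SID
        self.comparisons = 0   # total context comparisons performed

    def context_to_sid(self, context):
        # Linear search over every known context.
        for sid, known in enumerate(self.contexts, start=1):
            self.comparisons += 1
            if known == context:
                return sid
        # Not found: add a new node and hand out a new SID.
        self.contexts.append(context)
        return len(self.contexts)


def simulate(containers):
    """Start `containers` containers, each with a synthetic unique label."""
    table = ToySidtab()
    for i in range(containers):
        ctx = ("system_u:object_r:svirt_sandbox_file_t:"
               f"s0:c{i // 1024},c{i % 1024}")
        table.context_to_sid(ctx)
    return table
```

In this model the n-th new context costs n-1 comparisons before it is inserted, so total work grows quadratically with the number of distinct contexts ever seen, which is in line with lookups getting slower as the node count climbs.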

Is this a selinux bug? When a filesystem is unmounted, why is the
context node not deleted?  I cannot find the relevant function to delete
the node in sidtab.c

Thanks for reading and looking forward to your reply.

So, does docker just keep allocating a unique category set for every
new container, never reusing them even if the container is destroyed?
That would be a bug in docker IMHO.  Or are you creating an unbounded
number of containers and never destroying the older ones?

You can't reuse the security context. A process in ContainerA sends
a labeled packet to MachineB. ContainerA goes away and its context
is recycled in ContainerC. MachineB responds some time later, again
with a labeled packet. ContainerC gets information intended for
ContainerA, and uses the information to take over the Elbonian
government.

Docker isn't using labeled networking (nor is anything else by
default; it is only enabled if explicitly configured).

If labeled networking weren't an issue we'd have full security
module stacking by now. Yes, it's an edge case. If you want to
use labeled NFS or a local filesystem that gets mounted in each
container (don't tell me that nobody would do that) you've got
the same problem.

Even if someone were to configure labeled networking, Docker is not
presently relying on that or SELinux network enforcement for any
security properties, so it really doesn't matter.

True enough. I can imagine a use case, but as you point out, it
would be a very complex configuration and coordination exercise
using SELinux.


And if they wanted
to do that, they'd have to coordinate category assignments across all
systems involved, for which no 

Re: [atomic-devel] New SELinux/Container Blog

2017-06-20 Thread Daniel Walsh

On 06/20/2017 03:50 PM, Jeff Ligon wrote:

Looks like crap on a phone.
Is that livejournal?

Yes.




New SELinux/Container Blog

2017-06-20 Thread Daniel Walsh

http://danwalsh.livejournal.com/76358.html

Please social Media it.




Re: Collecting ideas for audit2allow improvement

2017-06-19 Thread Daniel Walsh

On 06/16/2017 12:08 PM, Dominick Grift wrote:

On Fri, Jun 16, 2017 at 08:21:25AM -0400, Daniel Walsh wrote:

On 06/14/2017 10:47 AM, Dominick Grift wrote:

On Wed, Jun 14, 2017 at 04:35:41PM +0200, Dominick Grift wrote:

On Wed, Jun 14, 2017 at 10:30:25AM -0400, Stephen Smalley wrote:

On Wed, 2017-06-14 at 09:01 -0400, Jan Zarsky wrote:

Hi,

I would like to improve SELinux audit2allow tool as my bachelor
thesis.
I collected ideas from my colleagues from RedHat SELinux team and I
would also
like to hear your ideas - what would you improve to make audit2allow
smarter or
easier to use.

Ideas collected so far:

* offer dac_read_search when sufficient instead of dac_override
  (see <https://github.com/SELinuxProject/selinux/issues/31>)

The hard part here is knowing when it is sufficient.  Might require
further kernel patches to get to the point where it is completely
unambiguous from the audit messages alone.  You could perhaps default
to only allowing dac_read_search and only allow dac_override if you see
that dac_read_search is already allowed and you are still getting a
dac_override denial.

This should not be an issue anymore, because now (linux v4.12) dac_read_search 
is checked first, so translating a rule for dac_read_search will always be the 
most secure option. It might not be enough, but you'll notice if it isn't. Just 
rerun it and you'll end up with dac_override if needed.


* offer multiple solutions to a problem (example: 1) add allow rule
for
  execute + execute_no_trans or 2) add allow rule for execute
  + type_transition rule)

Is this type_transition to an existing type that already exists, or
defining a new type and transitioning to it, or both?  Generating new
domains and types dynamically is one of the major gaps in current
audit2allow, and to date has only been supported in separate tools like
sepolicy generate.

One should be extremely careful here. "execute" does not automatically imply 
"execute_no_trans". That assumption can lead to disaster 
(https://www.cvedetails.com/cve/CVE-2015-1815/)

* interactive mode: ask questions and choose best solution
* warn when solution touches trusted computing base (rules you
should not be
  adding)
* suggest alternate labels for content, example: httpd_t not
allowed to write
  to user_home_t, might suggest that changing the label to
  httpd_user_content_t

This seems more along the lines of what setroubleshoot does, and tends
to be policy-specific. We need to preserve the usability of audit2allow
for other policies (e.g. Android, DSSP), so any policy-specific logic
needs to be encapsulated, configuration-driven, and optional.

setroubleshoot is not smart enough to suggest httpd_user_content_t, instead it 
suggest you allow full access to /home/* by toggling the boolean that grants 
access to user home content.

I know this from experience. There was a bug in fedora's 
apache_user_content_template for more than a year and hardly anyone noticed it, 
probably because setroubleshoot would have just suggested you allow full access 
to user home content.

https://bugzilla.redhat.com/show_bug.cgi?id=1457406

I actually think suggesting an accessible type is the best improvement.
audit2allow has defaulted for years to suggesting adding allow rules,
usually the worst solution.  Later we added boolean support, which was
better but, as you point out, not a great solution in a lot of cases.

The "best" improvement. I agree, but that says more about the overall state of 
things than about the quality of this particular improvement.

The guy in the tweet was overwhelmed and intimidated by the large list of 
"accessible" types returned by setroubleshoot (and audit2allow would probably 
do the same).
Also, things aren't that simple; it's not just about "what type grants the access 
needed to my domain". Other domains' access to that type should also be considered.

Which is why I suggested we filter it down to a list that has the same 
prefix.  Also I think we should, at least in the beginning, stick to 
permissions that can affect the content of data, rather than just reveal 
the content.


write, append, add_name, unlink, setattr, remove_name, rmdir, link, create ...

I think anything we can do to make it easier for the user to get a 
proper type would be really helpful.

Most AVC's are caused by mislabeled files or processes.  Having audit2allow
attempt to examine the policy and figure out which types would have been
allowed with some simple sorting rules might be a big improvement.

audit2allow IMHO shouldn't "examine" the policy (at least not in any significant 
way) because it cannot do it right. audit2allow should just stick with the basics + 
audit2why.

audit2why is the only functionality that works reasonably well, provided that some of the 
more recent "improvements" to audit2why get removed.
audit2why does nothing to reveal the cr

Re: Collecting ideas for audit2allow improvement

2017-06-16 Thread Daniel Walsh

On 06/14/2017 10:47 AM, Dominick Grift wrote:

On Wed, Jun 14, 2017 at 04:35:41PM +0200, Dominick Grift wrote:

On Wed, Jun 14, 2017 at 10:30:25AM -0400, Stephen Smalley wrote:

On Wed, 2017-06-14 at 09:01 -0400, Jan Zarsky wrote:

Hi,

I would like to improve SELinux audit2allow tool as my bachelor
thesis.
I collected ideas from my colleagues from RedHat SELinux team and I
would also
like to hear your ideas - what would you improve to make audit2allow
smarter or
easier to use.

Ideas collected so far:

   * offer dac_read_search when sufficient instead of dac_override
 (see <https://github.com/SELinuxProject/selinux/issues/31>)

The hard part here is knowing when it is sufficient.  Might require
further kernel patches to get to the point where it is completely
unambiguous from the audit messages alone.  You could perhaps default
to only allowing dac_read_search and only allow dac_override if you see
that dac_read_search is already allowed and you are still getting a
dac_override denial.

This should not be an issue anymore, because now (linux v4.12) dac_read_search 
is checked first, so translating a rule for dac_read_search will always be the 
most secure option. It might not be enough, but you'll notice if it isn't. Just 
rerun it and you'll end up with dac_override if needed.
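
The "narrower capability first, rerun if needed" approach discussed in this exchange can be sketched in a few lines of Python. This is only an illustration of the heuristic, not audit2allow code; the function name and its two set arguments are hypothetical.

```python
def suggest_dac_permission(denied, already_allowed):
    """Return the capability to suggest for a DAC-related denial, or None.

    denied:          capability names seen in the denial
    already_allowed: capability names the domain already has
    """
    dac = {"dac_read_search", "dac_override"}
    if not (denied & dac):
        return None  # not a DAC capability denial at all
    # Prefer the narrower capability whenever it has not been granted yet.
    if "dac_read_search" not in already_allowed:
        return "dac_read_search"
    # dac_read_search is already granted and a dac_override denial persists.
    if "dac_override" in denied and "dac_override" not in already_allowed:
        return "dac_override"
    return None
```

On a first pass a dac_override denial yields dac_read_search; only when that is already allowed and the denial shows up again does the sketch fall through to dac_override.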


   * offer multiple solutions to a problem (example: 1) add allow rule
for
 execute + execute_no_trans or 2) add allow rule for execute
 + type_transition rule)

Is this type_transition to an existing type that already exists, or
defining a new type and transitioning to it, or both?  Generating new
domains and types dynamically is one of the major gaps in current
audit2allow, and to date has only been supported in separate tools like
sepolicy generate.

One should be extremely careful here. "execute" does not automatically imply 
"execute_no_trans". That assumption can lead to disaster 
(https://www.cvedetails.com/cve/CVE-2015-1815/)

   * interactive mode: ask questions and choose best solution
   * warn when solution touches trusted computing base (rules you
should not be
 adding)
   * suggest alternate labels for content, example: httpd_t not
allowed to write
 to user_home_t, might suggest that changing the label to
 httpd_user_content_t

This seems more along the lines of what setroubleshoot does, and tends
to be policy-specific. We need to preserve the usability of audit2allow
for other policies (e.g. Android, DSSP), so any policy-specific logic
needs to be encapsulated, configuration-driven, and optional.

setroubleshoot is not smart enough to suggest httpd_user_content_t, instead it 
suggest you allow full access to /home/* by toggling the boolean that grants 
access to user home content.

I know this from experience. There was a bug in fedora's 
apache_user_content_template for more than a year and hardly anyone noticed it, 
probably because setroubleshoot would have just suggested you allow full access 
to user home content.

https://bugzilla.redhat.com/show_bug.cgi?id=1457406

I actually think suggesting an accessible type is the best 
improvement.  audit2allow has defaulted for years to suggesting adding 
allow rules, usually the worst solution.  Later we added boolean 
support, which was better but, as you point out, not a great solution 
in a lot of cases.  Most AVCs are caused by mislabeled files or 
processes.  Having audit2allow attempt to examine the policy and figure 
out which types would have been allowed with some simple sorting rules 
might be a big improvement.


Example: an AVC denying httpd_t the ability to write to httpd_sys_content_t.

audit2allow should examine policy and figure out which types httpd_t 
can write to with the current policy settings.  Then it should sort the 
solutions based on the characters of the source context, perhaps 
chopping characters off the prefix, maybe down to 3.


 sesearch -A -s httpd_t -p write -c | cut -f1 -d ":"  | cut -d" " -f 3 | sort -u | grep httpd

httpd_cache_t
httpdcontent
httpd_lock_t
httpd_squirrelmail_t
httpd_sys_rw_content_t
httpd_t
httpd_tmpfs_t
httpd_tmp_t
httpd_user_rw_content_t
httpd_var_lib_t
httpd_var_run_t

Now eliminate the non-file types, which removes httpd_t.

That would give the user a much better chance of solving the problem 
without becoming policy specific.


Now if you wanted to be a little smarter, you could look at the 
second field of the label, and have a table that says "var" indicates 
the path begins with /var and "tmp" indicates a directory that ends with 
tmp.  But users could probably determine this themselves.





Sort by types starting with httpd, http, htt
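
The sorting idea above can be sketched roughly as follows. This is only 
an illustration, not audit2allow code: the function names, the 
min_prefix parameter and the way non-file types are passed in are all 
assumptions made for the example. It takes the list of types the source 
domain can already write to (for instance, the sesearch output above), 
drops the non-file types, and ranks the rest by how many leading 
characters they share with the source type (httpd, http, htt).

```python
def common_prefix_len(a, b):
    """Count how many leading characters two strings share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def rank_candidate_types(source_type, writable_types,
                         non_file_types=(), min_prefix=3):
    """Rank writable types by shared prefix with the source type.

    Hypothetical helper: min_prefix=3 mirrors "maybe down to 3" from
    the discussion; non_file_types must be supplied by the caller.
    """
    # Compare against the stem, e.g. "httpd" for "httpd_t".
    if source_type.endswith("_t"):
        stem = source_type[:-2]
    else:
        stem = source_type
    # The source domain itself is never a useful file label.
    excluded = set(non_file_types) | {source_type}
    scored = []
    for candidate in writable_types:
        if candidate in excluded:
            continue
        score = common_prefix_len(stem, candidate)
        if score >= min_prefix:
            scored.append((score, candidate))
    # Longest shared prefix first, then alphabetical for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [candidate for _, candidate in scored]
```

Feeding it the list above with "httpd_t" as the source would suggest 
httpd_sys_rw_content_t, httpd_user_rw_content_t and friends, leaving it 
to the user to pick the one that matches the path.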


Re: Why does Python want to read /proc/meminfo

2017-05-14 Thread Daniel Walsh

On 05/06/2017 12:54 AM, Ian Pilcher wrote:

I am trying to write an SELinux policy to confine a simple service that
I have written in Python, and I'm trying to decide whether to allow or
dontaudit various denials.

To start, I've reduced my service to the simplest case:

  #!/usr/bin/python

  import sys

  sys.exit()

Running this program in a confined domain generated the following
denial:

avc:  denied  { read } for  pid=2024 comm="denatc" name="meminfo" 
dev="proc" ino=4026532028 scontext=system_u:system_r:denatc_t:s0 
tcontext=system_u:object_r:proc_t:s0 tclass=file


The program does continue on and exit cleanly, so it doesn't seem to
strictly require the access.

Does anyone know why Python is trying to access this file, or what
functionality I might be missing if I don't allow the access?

Usually tools read /proc/meminfo to figure out how much memory is 
available on the system and then make some assumption about how much 
memory they can use.  A tool might allocate a memory buffer as X% of 
total memory on the system. (This is a bad assumption, since cgroups 
could alter the total amount of memory available to the process, but 
/proc/meminfo does not reflect the amount of memory available in the 
cgroup.)  The code that looks at /proc/meminfo might be built into 
libc.  I would figure that whatever is trying to read /proc/meminfo 
expects to fail in certain situations, so it falls back to an alternate 
code path. This is most likely what you are seeing.



In most situations reading /proc/meminfo would not be considered a 
security risk, especially considering that, in the case of cgroups, the 
kernel will LIE.  :^)
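
To make the "expects to fail and falls back" pattern concrete, here is 
a minimal sketch of how a tool might read MemTotal and carry on when 
the read is denied. It is not what Python or libc actually does; the 
function names and the default argument are assumptions for 
illustration only.

```python
def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of integers."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def total_memory_kb(path="/proc/meminfo", default=None):
    """Return MemTotal in kB, or `default` if the file is unreadable.

    An SELinux denial shows up as EACCES, i.e. PermissionError, which
    is a subclass of OSError, so the caller just gets the fallback.
    """
    try:
        with open(path, "r") as f:
            return parse_meminfo(f.read()).get("MemTotal", default)
    except OSError:
        return default
```

A caller would then do something like total_memory_kb(default=None) and 
pick a conservative buffer size when it gets None back, which is why a 
denied read in a confined domain does not have to be fatal.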




Re: Docker daemon in enforcing state

2017-04-25 Thread Daniel Walsh

On 04/24/2017 05:30 PM, Umair Sarfraz wrote:

Hi,

So, I have been trying to play around with MLS (which I have 
successfully configured) on CentOS 7. I'm aiming to apply some 
security policies (categorization of docker containers) via MLS, but I 
can't seem to access the docker daemon or get correct labels for it when 
I am in `enforcing` mode. However, changing to permissive mode allows me 
to access the service and get the correct label. In enforcing mode, I get 
unlabeled_t on the docker dirs.



I am fairly new to SELinux so please excuse me if this is a silly 
question but I am pretty sure I am missing something here. Any sort of 
help would be appreciated. Thanks.


--

Umair Sarfraz



Do you have container-selinux installed?



Re: let's revert e3cab998b48ab293a9962faf9779d70ca339c65d

2017-04-17 Thread Daniel Walsh
On 04/17/2017 10:49 AM, Stephen Smalley wrote:
> On Mon, 2017-04-17 at 10:40 -0400, Daniel Walsh wrote:
>> On 04/17/2017 09:34 AM, Stephen Smalley wrote:
>>> On Sat, 2017-04-15 at 06:23 -0400, Daniel Walsh wrote:
>>>> I believe that libselinux still reports that the system is running
>>>> with SELinux, if the selinuxfs is not mounted inside of the
>>>> container at all.
>>> Not after the commit referenced in the subject line; you removed the
>>> fallback code to check /proc/filesystems for selinuxfs from
>>> is_selinux_enabled(), so if selinuxfs is not mounted at all, it will
>>> return 0 (not enabled).  On non-Android, you can also cause
>>> is_selinux_enabled() to return 0 by not providing an
>>> /etc/selinux/config file in your container's root directory (see
>>> commit c08c4eacab8d55598b9e5caaef8a871a7a476cab), i.e. as long as you
>>> do not install selinux-policy in your container root, then it will
>>> return disabled.
>> That seems to be a chancy way of handling this, since I can see it
>> being pretty easy to accidentally pull the selinux-policy package into
>> a container; then the container gets /etc/selinux/config and stuff
>> starts blowing up.  Not sure why the availability of this file should
>> indicate SELinux is enabled.
> The existence of /etc/selinux/config is necessary but not sufficient;
> is_selinux_enabled() only returns 1 if selinuxfs is mounted (read-write 
> with the current logic) _and_ (on non-Android) if /etc/selinux/config
> exists.  The /etc/selinux/config test was added to avoid a regression
> when we dropped the old no-policy-loaded test.
>
> In any event, not mounting selinuxfs within the container would suffice
> to cause is_selinux_enabled() to return 0.

If that is the case, then I have no problem removing the read-only
check.  We can make sure /sys/fs/selinux is not mounted into the
container.
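
For readers following along, the decision described in this thread can
be modelled roughly like this. It is a sketch of the logic as stated
above, not the libselinux implementation (which does not parse text this
way); the function names and the require_rw switch are assumptions added
so the effect of dropping the read-only check is visible.

```python
def selinuxfs_mount_options(mounts_text):
    """Return the options of the first selinuxfs mount, or None.

    Expects text in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[2] == "selinuxfs":
            return fields[3].split(",")
    return None


def looks_enabled(mounts_text, config_exists, require_rw=True):
    """Model of the check discussed in the thread (hypothetical helper).

    Enabled only if selinuxfs is mounted (read-write when require_rw
    is set) and /etc/selinux/config exists.
    """
    options = selinuxfs_mount_options(mounts_text)
    if options is None:
        return False  # not mounted at all: always reported as disabled
    if require_rw and "rw" not in options:
        return False  # the read-only check under discussion
    return config_exists
```

With require_rw=False the read-only remount no longer hides SELinux, but
a container that simply does not have /sys/fs/selinux mounted still sees
it as disabled, which is the behaviour we want.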



Re: let's revert e3cab998b48ab293a9962faf9779d70ca339c65d

2017-04-15 Thread Daniel Walsh
On 04/14/2017 04:41 PM, Stephen Smalley wrote:
> On Fri, 2017-04-14 at 21:43 +0200, Nicolas Iooss wrote:
>> On Fri, Apr 14, 2017 at 8:49 PM, Dominick Grift <dac.override@gmail.com> wrote:
>>> On Fri, Apr 14, 2017 at 01:56:30PM -0400, Stephen Smalley wrote:
>>>> On Fri, 2017-04-14 at 13:47 -0400, Daniel Walsh wrote:
>>>>> On 04/14/2017 11:33 AM, Stephen Smalley wrote:
>>>>>> On Fri, 2017-04-14 at 16:57 +0200, Dominick Grift wrote:
>>>>>>> Bear with me please, because i might not fully grasp the issue
>>>>>>> (i received help with diagnosing this issue):
>>>>>>>
>>>>>>> This commit causes issues (and is, i think, a lousy hack):
>>>>>>> e3cab998b48ab293a9962faf9779d70ca339c65d
>>>>>>>
>>>>>>> The commit causes entities to "think" that SELinux is disabled
>>>>>>> after "mount -o remount,ro /sys/fs/selinux"
>>>>>>>
>>>>>>> It is "neat" to be able to make processes "think" that selinux
>>>>>>> is disabled on a selinux enabled system but not if it breaks
>>>>>>> anything
>>>>>>>
>>>>>>> The above results in the following:
>>>>>>>
>>>>>>> Systemd services that have ProtectKernelTunables=yes set in
>>>>>>> their respective service units, think that SELinux is disabled.
>>>>>>>
>>>>>>> However we have found that some of these services actually rely
>>>>>>> on SELinux to ensure proper labeling.
>>>>>>>
>>>>>>> So we have the option to make people aware that if you set
>>>>>>> ProtectKernelTunables=yes that then the process cannot be
>>>>>>> SELinux-aware properly, or we can just get rid of the commit
>>>>>>> above and just accept that processes know that SELinux is
>>>>>>> enabled.
>>>>>>>
>>>>>>> Actual bug that caused me to look into this: systemd-localed
>>>>>>> selinux awareness is broken due to it having
>>>>>>> ProtectKernelTunables=yes in its service unit
>>>>>> If selinuxfs is mounted read-only, then they can't use most of
>>>>>> the selinuxfs interfaces, including even the ability to validate
>>>>>> or canonicalize security contexts.  That will break most
>>>>>> SELinux-aware services if we tell them that SELinux is enabled.
>>>>>> Are you sure systemd-localed would actually work if you told it
>>>>>> SELinux was enabled when selinuxfs was mounted read-only?  What
>>>>>> SELinux interfaces is it using?
>>>>>>
>>>>>> The other question is whether ProtectKernelTunables ought to be
>>>>>> mounting selinuxfs read-only.  SELinux already controls the
>>>>>> ability to use its interfaces, including limiting even root, so
>>>>>> it is unclear what benefit we derive from having systemd add a
>>>>>> further restriction on top.
>>>>>>
>>>>> Why is selinuxfs mounted readonly in this case?
>>>> I don't actually see this in upstream systemd unless I am just
>>>> missing it.
>>>>
>>>> systemd/src/core/namespace.c:
>>>> /* ProtectKernelTunables= option and the related filesystem APIs */
>>>> static const MountEntry protect_kernel_tunables_table[] = {
>>>> { "/proc/sys",   READONLY, false },
>>>> { "/proc/sysrq-trigger", READONLY, true  },
>>>> { "/proc/latency_stats", READONLY, true  },
>>>

Re: let's revert e3cab998b48ab293a9962faf9779d70ca339c65d

2017-04-14 Thread Daniel Walsh
On 04/14/2017 11:33 AM, Stephen Smalley wrote:
> On Fri, 2017-04-14 at 16:57 +0200, Dominick Grift wrote:
>> Bear with me please, because i might not fully grasp the issue (i
>> received help with diagnosing this issue):
>>
>> This commit causes issues (and is, i think, a lousy hack):
>> e3cab998b48ab293a9962faf9779d70ca339c65d
>>
>> The commit causes entities to "think" that SELinux is disabled after
>> "mount -o remount,ro /sys/fs/selinux"
>>
>> It is "neat" to be able to make processes "think" that selinux is
>> disabled on a selinux enabled system but not if it breaks anything
>>
>> The above results in the following:
>>
>> Systemd services that have ProtectKernelTunables=yes set in their
>> respective service units, think that SELinux is disabled.
>>
>> However we have found that some of these services actually rely on
>> SELinux to ensure proper labeling.
>>
>> So we have the option to make people aware that if you set
>> ProtectKernelTunables=yes that then the process cannot be SELinux-
>> aware properly, or we can just get rid of the commit above and just
>> accept that processes know that SELinux is enabled.
>>
>> Actual bug that caused me to look into this: systemd-localed selinux
>> awareness is broken due to it having ProtectKernelTunables=yes in its
>> service unit
> If selinuxfs is mounted read-only, then they can't use most of the
> selinuxfs interfaces, including even the ability to validate or
> canonicalize security contexts.  That will break most SELinux-aware
> services if we tell them that SELinux is enabled.  Are you sure
> systemd-localed would actually work if you told it SELinux was enabled
> when selinuxfs was mounted read-only?  What SELinux interfaces is it
> using?
>
> The other question is whether ProtectKernelTunables ought to be
> mounting selinuxfs read-only.  SELinux already controls the ability to
> use its interfaces, including limiting even root, so it is unclear what
> benefit we derive from having systemd add a further restriction on top.
>
Why is selinuxfs mounted readonly in this case? 


The reason we want this is so that processes inside of containers do not
attempt to do SELinux stuff. 

http://danwalsh.livejournal.com/73099.html

