Re: [lxc-users] Issue with physical nictype

2016-11-04 Thread Wolfgang Bumiller

> On November 4, 2016 at 1:36 PM "Tardif, Christian" wrote:
> 
> No issue when I bring the container up. The ipvlan0 interface is passed
> to the container, disappearing from the host. But when I shut the
> container off, the NIC is not released to the LXD host. As such, I can't
> restart the container again, as the NIC is no longer available from the
> LXD server to be offered to the container. 
> 
> How can I fix this behavior?

Does it not reappear on the host at all, or is it perhaps just renamed (to a
"dev"-style name)? (This happens when the name the interface has inside the
container conflicts with a name on the host.)
In that case https://github.com/lxc/lxc/pull/1269 should help.
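
If it is just a rename, you should also be able to rename the device back by
hand as a stopgap until that fix is available. Roughly (substitute whatever
name `ip link` actually shows for the device on the host; this is only a
sketch):

    ip link
    ip link set dev <renamed-device> down
    ip link set dev <renamed-device> name ipvlan0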


Re: [lxc-users] Establish a bind mount to a running container

2016-10-07 Thread Wolfgang Bumiller

> On October 7, 2016 at 11:45 AM Stéphane Graber  wrote:
> 
> 
> On Fri, Oct 07, 2016 at 07:03:21AM +, Jäkel, Guido wrote:
> > Dear experts,
> > 
> > I wonder if it's possible to establish a bind-mounted filesystem resource
> > from the LXC host to an already running container in a manual way,
> > analogous to how it is done at startup time.
> > 
> > I already figured out that releasing an existing bind mount is no problem;
> > just umount it from inside the container. But is there a way to establish
> > one, i.e. to shift the destination of a bind mount into the right namespace?
> > 
> > I'm asking because in a couple of days I have to change an (NFS) filesystem
> > source (due to a hardware migration) that is common to a large number of
> > running containers but not frequently used, and I want to avoid restarting
> > all the containers and their services.
> > 
> > Thank you for any advice
> > 
> > Guido
> 
> It's very difficult due to a number of restrictions in place in the kernel.
> 
> The only way of doing this that I'm aware of is what we do in LXD. We
> create a path on the host before the container starts, put that on a
> rshared mountpoint, then bind-mount that directory into the container
> under some arbitrary path.

But the container can break this from the inside by turning the inner slave
mount point into a private mount point with a single make-private call (which
cannot be undone). Then again, the standard AppArmor profile still has the
make-private rules on ** commented out, with the note that AppArmor treats
them as allowing all mounts, so I suppose in the default case it will be hard
to break this functionality.
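
For reference, the host-side part of what Stéphane describes boils down to
roughly this (paths made up, just to illustrate the propagation setup):

    mkdir -p /srv/ct-shared/mycontainer
    mount --bind /srv/ct-shared/mycontainer /srv/ct-shared/mycontainer
    mount --make-rshared /srv/ct-shared/mycontainer

That directory then gets bind-mounted into the container at startup, and any
mount the host later places below it (e.g. `mount --bind /srv/new-nfs
/srv/ct-shared/mycontainer/data`) propagates into the container, unless, as
said, the container has turned its side private.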

I've been wondering if there's a more reliable way for a while now...


Re: [lxc-users] LXD Bridged IPv6

2016-04-26 Thread Wolfgang Bumiller
Curious: the symptoms are almost consistent with trying to do routing within
a single subnet (in which case NDP packets won't reach their destination and
you need to either set up neighbor proxying per container with `ip neighbor`,
or run an NDP proxy daemon such as ndppd), yet your container did get
autoconfigured successfully? Are you running a router advertisement daemon
on your host (radvd, dnsmasq, ...), or are the routes advertised by your
provider? (If the latter, is eth0 attached to lxdbr0, or are you really only
routing?)
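
If it does turn out to be the single-subnet case, the per-container variant
would look roughly like this on the host (using the container address from
your output and assuming eth0 is the uplink):

    sysctl -w net.ipv6.conf.eth0.proxy_ndp=1
    ip -6 neigh add proxy 2604:a880:0:1010:216:3eff:fe87:ff20 dev eth0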

The host's `ip a` and `ip -6 r` output would be useful (ifconfig
lacks bridge port information and instead contains lots of useless stuff).

(Obviously any relevant firewall configuration would also be useful)

You can also use `tcpdump` to track the NDP and ping packets and see which
part fails, and where.
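
Something along these lines is usually enough to see whether the neighbor
solicitations and echo requests make it across (run it on eth0 and lxdbr0 in
turn and compare):

    tcpdump -ni eth0 icmp6
    tcpdump -ni lxdbr0 icmp6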

How does your IPv4 setup compare to this, and do you use proxy_arp
with IPv4?

> On April 25, 2016 at 2:30 PM Nick Falcone  wrote:
> 
> 
> root@test9001:~# ip -6 r
> 2604:a880:0:1010::/64 dev eth0  proto kernel  metric 256  expires
> 3434sec pref medium
> fe80::/64 dev eth0  proto kernel  metric 256  pref medium
> default via fe80::684e:dcff:feae:fd61 dev eth0  proto ra  metric 1024 
> expires 1634sec hoplimit 64 pref medium
> 
> 
> root@test9001:~# default via fe80::1 dev eth0  metric 1024  pref medium
> 
> 
> after adding the route you suggested I still get:
> ip -6 route del default
> ip -6 route add default via fe80::1 dev eth0
> From 2604:a880:0:1010:216:3eff:fe87:ff20 icmp_seq=15 Destination
> unreachable: Address unreachable
> 
> On Mon, Apr 25, 2016, at 07:25 AM, Wojciech Arabczyk wrote:
> > What are your route settings in the container?
> > ip -6 route show
> > 
> > Have you tried adding the generic default route via:
> > ip -6 route add default via fe80::1 dev eth0
> > on the container itself?
> > 
> > On 25 April 2016 at 13:11, Nick Falcone  wrote:
> > > In my sysctl.conf I have:
> > >
> > > net.ipv4.ip_forward=1
> > > net.ipv6.conf.all.forwarding=1
> > >
> > >
> > > and just to double check
> > >
> > > root@lxdtest:~# sysctl net.ipv4.ip_forward
> > > net.ipv4.ip_forward = 1
> > > root@lxdtest:~# sysctl net.ipv6.conf.all.forwarding
> > > net.ipv6.conf.all.forwarding = 1
> > >
> > > On Mon, Apr 25, 2016, at 03:44 AM, Wojciech Arabczyk wrote:
> > >> Are you sure, you have enabled ipv6 forwarding via sysctl?
> > >>
> > >> On 22 April 2016 at 18:10, Nick Falcone  wrote:
> > >> > Hello
> > >> >
> > >> > I have been banging my head up against a wall for a few days now trying
> > >> > to get IPv6 to work across my bridged interface for my containers.
> > >> >
> > >> > I have tried different VPS and dedicated servers as well as versions of
> > >> > Ubuntu 14.04, 15.10, and 16.04 to get this working.  The latest test and
> > >> > all this info is from an Ubuntu 16.04 with the included version of LXD.
> > >> >
> > >> > First I install and run lxd init, I configure the bridge like so.
> > >> >
> > >> > lxdbr0Link encap:Ethernet  HWaddr fe:82:af:f0:5d:ce
> > >> >   inet addr:10.195.87.1  Bcast:0.0.0.0  Mask:255.255.255.0
> > >> >   inet6 addr: 2604:a880:0:1010::623:2/64 Scope:Global
> > >> >   inet6 addr: fe80::40c6:84ff:fe18:22fb/64 Scope:Link
> > >> >   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> > >> >   RX packets:294 errors:0 dropped:0 overruns:0 frame:0
> > >> >   TX packets:18 errors:0 dropped:0 overruns:0 carrier:0
> > >> >   collisions:0 txqueuelen:1000
> > >> >   RX bytes:21612 (21.6 KB)  TX bytes:2127 (2.1 KB)
> > >> >
> > >> > This is my host information too
> > >> >
> > >> > eth0  Link encap:Ethernet  HWaddr 04:01:d4:50:c4:01
> > >> >   inet addr:162.243.200.170  Bcast:162.243.200.255
> > >> >   Mask:255.255.255.0
> > >> >   inet6 addr: fe80::601:d4ff:fe50:c401/64 Scope:Link
> > >> >   inet6 addr: 2604:a880:0:1010::623:1/64 Scope:Global
> > >> >   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> > >> >   RX packets:76258 errors:0 dropped:0 overruns:0 frame:0
> > >> >   TX packets:8187 errors:0 dropped:0 overruns:0 carrier:0
> > >> >   collisions:0 txqueuelen:1000
> > >> >   RX bytes:111074998 (111.0 MB)  TX bytes:1230729 (1.2 MB)
> > >> >
> > >> > I launch and enter the first container it has this info:
> > >> >
> > >> > eth0  Link encap:Ethernet  HWaddr 00:16:3e:87:ff:20
> > >> >   inet addr:10.195.87.69  Bcast:10.195.87.255
> > >> >   Mask:255.255.255.0
> > >> >   inet6 addr: 2604:a880:0:1010:216:3eff:fe87:ff20/64
> > >> >   Scope:Global
> > >> >   inet6 addr: fe80::216:3eff:fe87:ff20/64 Scope:Link
> > >> >   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> > >> >   RX packets:20 errors:0 dropped:0 overruns:0 frame:0
> > >> >   TX packets:294 errors:0 dropped:0 overruns:0 carrier:0
> > >> >   collisions:0 txqueuele

Re: [lxc-users] Is there anything in LXC that would prevent DHCPv6 from working?

2016-03-19 Thread Wolfgang Bumiller
> On March 18, 2016 at 1:19 PM John Lewis  wrote:
> 
> 
> I am using wide-dhcpv6-server and wide-dhcpv6-client in two different LXCs
> with an iproute2-created bridge and lxc-created tun/tap devices, and my
> kernel is 3.16.0-4-amd64 #1 SMP. I don't have any firewall that would block
> the IPv6 requests and responses that would occur on ports 546 and 547, but
> when I tcpdump the client interface I don't see the packets I am looking
> for going out of it. It is probably an application issue, but I just want
> to double check.

There shouldn't be anything lxc-specific here as far as I know.
Are you saying you have no firewall at all which could block anything,
or just that you think it should allow everything? You might still
be blocking neighbor discovery packets (which come from a MAC-derived
link-local IP address, so you also need to make sure you don't block
those by address either).
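
If there is a firewall somewhere in the path, the rules would need to cover
roughly this (ip6tables syntax assumed here, adjust to whatever you actually
use; 546/547 are the DHCPv6 client/server ports):

    ip6tables -A INPUT -p ipv6-icmp -j ACCEPT
    ip6tables -A INPUT -p udp --dport 546:547 -j ACCEPT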

(Oh also, just in case you're using Alpine Linux containers: busybox's
DHCPv6 client is still unfinished / broken (it uses wrong addresses), so
that won't work.)


Re: [lxc-users] Missing /proc/self after lxc-attach ?

2016-02-15 Thread Wolfgang Bumiller
> On February 15, 2016 at 1:29 PM Mateusz Korniak wrote:
> [1]: After lxc-attach entry to container:
> # ls -la /proc/self
> ls: cannot read symbolic link /proc/self: No such file or directory
> lrwxrwxrwx 1 root root 0 Feb 15 12:05 /proc/self

What is the exact command you used to attach? Because if you, e.g., only
enter the mount namespace but not the PID namespace, then you see the
container's /proc with the *container*'s processes, and thus your PID
doesn't show up, making /proc/self a dead symlink.
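
For example (container name made up, the -s values are the namespace names
from the lxc-attach man page):

    lxc-attach -n mycontainer -s MOUNT -- ls -la /proc/self
    (dead symlink: your PID isn't in the container's /proc)

    lxc-attach -n mycontainer -s 'MOUNT|PID' -- ls -la /proc/self
    (works, since you also entered the PID namespace)

A plain `lxc-attach -n mycontainer` attaches to all namespaces and shouldn't
show this either.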


Re: [lxc-users] Does memory limit affects file system cache?

2016-01-26 Thread Wolfgang Bumiller
> On January 26, 2016 at 2:21 PM Eax Melanhovich  wrote:
> 
> 
> Hello.
> 
> Let's say I would like to test how PostgreSQL behaves under some
> circumstances, in particular when some table does not fit into memory.
> The problem is that I have 16 GB of RAM, so to test this I would have to
> create quite a large table. Can I just set:
> 
> lxc.cgroup.memory.limit_in_bytes = 1024M
> 
> ... instead? Will it guarantee that the database will not be cached by
> the host's file system?

I'd like to know more details about this, too. As far as I can tell
`kmem.limit_in_bytes` can affect the file system cache, and is
pretty devastating when the container runs on ZFS as it seems to
count the ZFS ARC and sooner or later starts OOM-killing the container.
This does not happen when only using 'memsw.*'.
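
For reference, the knobs in question as lxc config keys (cgroup v1 names,
values only for illustration):

    lxc.cgroup.memory.limit_in_bytes = 1024M
    lxc.cgroup.memory.memsw.limit_in_bytes = 1024M
    lxc.cgroup.memory.kmem.limit_in_bytes = 512M

The last one is the kernel-memory limit that, as described above, seems to
end up counting the ZFS ARC as well.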


Re: [lxc-users] Convert LXC Guests from privileged to unprivileged

2015-12-03 Thread Wolfgang Bumiller
> fwiw lxd also ships with 'fuidshift' which has the same functionality.

After a quick glance over the code I only see it handling file ownership.
What about ACLs? (And perhaps other extra attributes I'm unaware of.)

I was thinking the most "complete" conversion happens when you tar up the
container in one namespace, with -p --acls --numeric-owner --xattrs etc.,
and then untar it in the other namespace. This however fails to extract
device nodes into user namespaces... ;-/
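
I.e. roughly something like this (paths and the id map are made up, and it
assumes a tar with --acls/--xattrs support; the extract step has to run
inside the target user namespace, e.g. via lxc-usernsexec, so that ownership
gets shifted on the way in):

    tar -C /var/lib/lxc/foo/rootfs -cpf foo.tar \
        --numeric-owner --acls --xattrs .
    lxc-usernsexec -m u:0:100000:65536 -m g:0:100000:65536 -- \
        tar -C /path/to/unpriv/rootfs -xpf foo.tar \
            --numeric-owner --acls --xattrs

And the device nodes are exactly where that extract step fails.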

(Offtopic: I'm still puzzled by the fact that mknod doesn't work in a
user namespace. There's a capability (CAP_MKNOD) for _just_ _that_ after
all, and there's the devices cgroup. I'd much rather have a rule that a
non-root user starting a userns doesn't gain CAP_MKNOD unless it's already
there.)


Re: [lxc-users] Running Docker inside LXD

2015-11-30 Thread Wolfgang Bumiller

> On November 30, 2015 at 1:57 PM Tamas Papp  wrote:
> One is LXC (this mailing list, linuxcontainers.org) and the other 
> is libvirt based and it's quite different.
> 
> If I'm not wrong, Proxmox uses the latter.

Proxmox uses LXC.


Re: [lxc-users] pre-mount hook namespace

2015-11-17 Thread Wolfgang Bumiller
> On November 17, 2015 at 1:37 AM Serge Hallyn  wrote:
>
> I think Stéphane has found lxc with cgfs to be broken right now, although
> I thought that was only nested on top of lxcfs.  I haven't looked into it,
> but will try to in the near future.  If someone else wants to, all the
> better.  (I try to stay away from the cgfs code)

Ah, I didn't see this mail before replying to the other one. Okay, so this
means we'll have to stick to cgmanager for now, I suppose.


Re: [lxc-users] pre-mount hook namespace

2015-11-17 Thread Wolfgang Bumiller
On Mon, Nov 16, 2015 at 04:33:25PM +, Serge Hallyn wrote:
> Quoting Serge Hallyn (serge.hal...@ubuntu.com):
> > Quoting Wolfgang Bumiller (w.bumil...@proxmox.com):
> > > So we ended up doing just that, but now with the latest lxcfs
> > > upgrades (I suspect cgmanager/cgfs changes) AppArmor suddenly
> > > denies lxc-start to bind mount something. Here's what happens
> > > with raw lxc-start commands:
> > > 
> > > # lxc-start -n 406
> > > 
> > > works, but (simplified to just unshare -m):
> > > 
> > > # unshare -m -- lxc-start -n 406
> > > 
> > > audit: type=1400 audit(1447670720.554:74): apparmor="DENIED" 
> > > operation="mount"
> > > profile="/usr/bin/lxc-start"
> > > name="/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/cgroup/hugetlb/lxc/406/"
> > > pid=21536 comm="lxc-start" flags="rw, bind"
> > > 
> > > This doesn't make sense to me, I don't see how the namespace
> > > change would affect this? (Using unshare -m and then running
> > > `mount --make-r{slave,private,shared} /` doesn't change the
> > > outcome.)
> > 
> > Can you make sure that your apparmor profile has the
> > attach_disconnected flag?
> 
> Sorry, make that /etc/apparmor.d/usr.bin.lxc-start.

Okay, it's not AppArmor's fault (or at least not only AppArmor's). (And yes,
the flag is there.)
If I put the profiles in complain mode I get the same message with
apparmor="ALLOWED", but the mount still fails with a permission-denied
error.

Note that this is only cgfs with --disable-cgmanager (which I suspect is
not meant to work?). And I'm currently wondering how that would be
possible anyway. E.g. in lxcfs/cgfs I see that mkdir requests use the
fuse_context's uid/gid to reown files for cgroups, but the cgroups are
mounted _as_ cgroups, so how would that code even be reached in the fuse
fs?

And how does it connect to mount namespaces...?
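
(For what it's worth, comparing the mount propagation flags inside and
outside the extra mount namespace is probably the first thing to check,
e.g.:

    findmnt -R -o TARGET,PROPAGATION /sys/fs/cgroup
    unshare -m -- findmnt -R -o TARGET,PROPAGATION /sys/fs/cgroup

assuming a findmnt recent enough to know the PROPAGATION column.)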


Re: [lxc-users] pre-mount hook namespace

2015-11-16 Thread Wolfgang Bumiller

> On November 16, 2015 at 12:33 PM Dietmar Maurer  wrote:
> > On November 16, 2015 at 11:48 AM Wolfgang Bumiller 
> > wrote:
> > > On November 11, 2015 at 6:04 PM Serge Hallyn 
> > > wrote:
> > > Oh, right.  I forget that even when starting as root, this only works
> > > for the rootfs itself, not other mounts.  (Lxd actually does handle this,
> > > but at the cost of having a MS_SLAVE mount per container)
> > 
> > So we ended up doing just that, but now with the latest lxcfs
> > upgrades (I suspect cgmanager/cgfs changes) AppArmor suddenly
> > denies lxc-start to bind mount something. Here's what happens
> > with raw lxc-start commands
> 
> Seems to be related to the lxc update. lxc 1.1.4 works with the latest
> lxcfs, so the problem was introduced between lxc 1.1.4 and lxc 1.1.5.

Ah, actually it seems it's the change from --enable-cgmanager to
--disable-cgmanager that we made between those versions
(read: --enable-cgmanager works with both 1.1.4 and 1.1.5,
--disable-cgmanager with neither).
Still don't know how that connects to AppArmor, though.


Re: [lxc-users] pre-mount hook namespace

2015-11-16 Thread Wolfgang Bumiller
> On November 11, 2015 at 6:04 PM Serge Hallyn  wrote:
> > > 2.
> > > If you are just using unpriv containers to use user namespaces, you can
> > > actually have the container be owned/started by root.  That's what I do
> > > for some containers where their rootfs is a dmcrypt device which I
> > > couldn't mount as an unpriv user.
> > 
> > They are started as root, which means I can prepare the mounts as you
> > suggested above, but I'd again be clobbering the host's namespace.
> 
> Oh, right.  I forget that even when starting as root, this only works
> for the rootfs itself, not other mounts.  (Lxd actually does handle this,
> but at the cost of having a MS_SLAVE mount per container)

So we ended up doing just that, but now, with the latest lxcfs
upgrades (I suspect cgmanager/cgfs changes), AppArmor suddenly
denies lxc-start permission to bind-mount something. Here's what happens
with raw lxc-start commands:

# lxc-start -n 406

works, but (simplified to just unshare -m):

# unshare -m -- lxc-start -n 406

audit: type=1400 audit(1447670720.554:74): apparmor="DENIED" operation="mount"
profile="/usr/bin/lxc-start"
name="/usr/lib/x86_64-linux-gnu/lxc/rootfs/sys/fs/cgroup/hugetlb/lxc/406/"
pid=21536 comm="lxc-start" flags="rw, bind"

This doesn't make sense to me; I don't see how the namespace
change would affect this. (Using unshare -m and then running
`mount --make-r{slave,private,shared} /` doesn't change the
outcome.)

406/config:
lxc.arch = amd64
lxc.include = /usr/share/lxc/config/debian.common.conf
lxc.include = /usr/share/lxc/config/debian.userns.conf
lxc.id_map = u 0 10 65536
lxc.id_map = g 0 10 65536
lxc.tty = 2
lxc.environment = TERM=linux
lxc.utsname = testit
lxc.cgroup.memory.limit_in_bytes = 536870912
lxc.cgroup.memory.memsw.limit_in_bytes = 1073741824
lxc.cgroup.cpu.shares = 1024
lxc.rootfs = /var/lib/lxc/406/rootfs
lxc.network.type = veth
lxc.network.veth.pair = veth406i0
lxc.network.hwaddr = 32:34:36:33:31:34
lxc.network.name = eth0


Re: [lxc-users] pre-mount hook namespace

2015-11-11 Thread Wolfgang Bumiller

> On November 11, 2015 at 5:07 PM Serge Hallyn  wrote:
> > Mount a filesystem for the unprivileged user which they cannot
> > mount by themselves due to a lack of permissions.
> > # mount -o loop /path/you/don't/have/access/to.img /the/container
> 
> A few things,
> 
> 1.
> If you just want this to be a container in a user namespace, you could
> pre-mount it to a path where the user does have access so they can use
> a regular lxc.mount.entry.

Yes I know. I was just wondering if I can avoid having to mount it in
the host's namespace.

> 2.
> If you are just using unpriv containers to use user namespaces, you can
> actually have the container be owned/started by root.  That's what I do
> for some containers where their rootfs is a dmcrypt device which I
> couldn't mount as an unpriv user.

They are started as root, which means I can prepare the mounts as you
suggested above, but I'd again be clobbering the host's namespace.

> 3.
> Seth Forshee is working on support for several things that would help you
> here - in particular unprivileged users mounting ext4, using loop devices,
> and fuse.  Doesn't help you right now, but soon it might.

Sounds interesting, but not all our storage backends use loop devices
(or are ext4; e.g. a ZFS subvolume...).
Btw., does that imply giving the container's user access to a loop device?
That can be problematic, at least if the user can unmount and detach the
loop device: they could then wait for the next victim to reuse it (with the
next `mount -o loop` etc.). Just something to look out for, unless you can
forbid the detaching. Seccomp maybe?


Re: [lxc-users] pre-mount hook namespace

2015-11-11 Thread Wolfgang Bumiller
Thanks for the reply.

> On November 11, 2015 at 4:40 PM Serge Hallyn  wrote:
> > This puts us in a bit of a pickle as we'd like to setup mountpoints
> > for an unprivileged container without giving it access to more than it
> > needs (in particular, the storage configuration and processes involved
> > in managing and activating them.)
> 
> Please give a specific example of what you want.

Mount a filesystem for the unprivileged user which they cannot
mount by themselves due to a lack of permissions.
# mount -o loop /path/you/don't/have/access/to.img /the/container

> In order for an unprivileged user to be able to manipulate the mounts
> table, he must *first* unshare the user namespace.  That is so that
> if he mounts something over /etc/shadow, he can only trick setuid-root
> programs (like login) owned by his own user namespace.

Ah yes. I just read up on the mount namespace restrictions
section in user_namespaces(7).

Looks like it will have to be mounting in the pre-start hook and
unmounting in the post-stop hook, letting the mounts stay
visible in the host's namespace.
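
I.e. something along these lines (hook script names and paths are made up):

    lxc.hook.pre-start = /usr/local/sbin/ct-storage-mount
    lxc.hook.post-stop = /usr/local/sbin/ct-storage-umount

where ct-storage-mount does essentially a
`mount -o loop /path/to/ct.img /var/lib/lxc/$LXC_NAME/rootfs` (LXC exports
LXC_NAME to its hooks) and the post-stop script does the corresponding
umount.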


[lxc-users] pre-mount hook namespace

2015-11-11 Thread Wolfgang Bumiller
The pre-mount hook documentation states that it is "a hook to be run
in the container's fs namespace", which seems a little confusing to me,
as I'm not quite sure what the 'fs' namespace is supposed to
represent. clone(2)'s CLONE_FS just refers to the root directory, umask and
current working directory, but when running an unprivileged container
the user namespace will also be set up.
This puts us in a bit of a pickle, as we'd like to set up mount points
for an unprivileged container without giving it access to more than it
needs (in particular, the storage configuration and the processes involved
in managing and activating it).

For us this currently seems to be possible only in the pre-start hook, but
then the mounts are also reflected on the host.
I've thought about running lxc-start in a mount namespace, but then I'd
have another namespace to clean up afterwards (for the same reasons we
added the 'stop' hook).

Since the CLONE_NEWUSER flag is used in the call to clone() I don't see
any convenient solution here, maybe someone has an idea?

Either way it would probably be a good idea to update the documentation
to reflect this. Maybe have yet another hook? (The user namespace could
be entered later with unshare(CLONE_NEWUSER), and the sync barriers
already control when the parent can run lxc_map_ids().)

Or maybe I'm just missing something obvious?


Re: [lxc-users] lxc-attach

2015-09-16 Thread Wolfgang Bumiller
> On September 16, 2015 at 9:21 AM Andrey Repin  wrote:
> > I have a question about the lxc-attach command.
> 
> > Is this normal behavior, if I use
> > "lxc-attach -n container"
> > and then shut down the container?
> 
> > Then bash shows no input anymore until I reset it with `reset`.
> > The output is OK.
> 
> This is the one reason I always exit the shell before shutdown takes place.
> I.e.
> shutdown -h now; exit

Seems to depend on something inside the guest system. (Maybe the shell?)
E.g. I don't get this behavior with an Arch Linux guest.
(Btw. `stty sane` also fixes it without clearing the screen, in case
anyone's interested.)
