Evgeni Golov wrote on 30-10-2016 16:53:

Given that we
1) ship a config that works just fine by default (but does not have
networking at all)
2) provide an easy way to enable DHCP on a bridge
do you think this report can be closed, or do you see any more room
for improvement here?

Thank you for responding. I was indeed mistaken about the default config.

There is a page on the wiki that mentions lxc-net and the option you mention.

However, there is scarcely any documentation on it.

lxc-net seems rather opaque: you don't know what it is or what it does.

At some point I did check out https://github.com/CameronNemo/lxc-net but that was way after. It says it has been obsoleted by inclusion in LXC.

However when you look at the sources at that repo there is no indication of any DHCP.

The scripts are (in that repo) also rather ... minimal. It is not the complete set of masquerading rules you'd need for a truly functioning system.

I certainly cannot find any documentation on lxc-net directly.

So I guess the improvement would be that your point (2) actually reaches people...

All the documentation you can find instructs you to set the network configuration in your lxc config for the container. But then you have that hanging system.

In order to avoid that you have to use lxc-net, but that is much more opaque and harder to find, so you won't do that. So the average newcomer will run into the problem I described. No one is going to run their container without networking. The first thing you do is set up networking and then see if you can connect to it.

What I did eventually was to create my own bridge networking. It took a lot of time. I wrote a wiki article about it (not the time, but the instructions on how to do it ;-)). I put in some firewall rules to get the full loopback functionality and so on. So I'm still not using lxc-net.

So what was the point where I troubleshot the network? SystemD put me on the wrong track by saying that there is no timeout. That's one thing that can be improved. See, I *did* attach to the console, or I would never even have seen those messages.

I was just impatient enough to reboot the LXC container or proceed with my next attempt, prior to the dhcp script ever having finished, because SystemD told me it *wouldn't ever finish*.

If you were to give the systemd unit file for that (which is not related to LXC) a timeout value (explicitly) of say 20 seconds I wouldn't have gotten into that mess. This miscommunication causes you to spend more time on it than you otherwise would have.
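To make that suggestion concrete, here is a minimal sketch of what such an explicit timeout could look like as a systemd drop-in. TimeoutStartSec= is a real systemd [Service] setting and the .service.d/ drop-in directory is the standard override mechanism, but the unit name networking.service, the file name timeout.conf and the helper function are assumptions of mine for illustration; I have not checked which unit actually runs the dhcp script in the container.

```python
def render_timeout_dropin(seconds: int) -> str:
    """Return the text of a systemd drop-in that caps a unit's start time."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    # A bare number in TimeoutStartSec= is interpreted by systemd as seconds.
    return f"[Service]\nTimeoutStartSec={seconds}\n"


if __name__ == "__main__":
    # Assumption: the unit that raises the interfaces is networking.service.
    unit = "networking.service"
    # Drop-ins for a unit live in a directory named <unit>.d.
    print(f"# /etc/systemd/system/{unit}.d/timeout.conf")
    print(render_timeout_dropin(20), end="")
```

With a bounded start time the boot message would presumably show a real limit instead of claiming there is none, which is the part that sent me down the wrong path.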

SystemD communicates something that it does not actually know, and that is really the biggest issue here for people first running into this. How on earth are you supposed to know that SystemD is lying to ya? But then again, a more honest message only helps you realize that you need to change your setting (in the container).

The issue then remains that there are two classes of people:

- those with DHCP in the network who depend on the setting to be dhcp and who subsequently do not set a fixed IP address
- those with no DHCP in the network and who do set a fixed IP address (in the container config).

It seems a clear separation of people due to the config, something that could be treated as a defining characteristic.

I don't know how LXC can change that but I only see 3 solutions:

- don't set it to DHCP, which you say will offend the other half of the people. I guess for a general home computer DHCP is the logical default, but you'd be flabbergasted if your DHCP-less computer or network hung for 15+ seconds while booting the first time, right?

Any computer not on a DHCP network now hangs while booting? That's not good, is it? You can only solve that in one of two ways:

- create a shorter timeout for the dhcp thing (or don't wait for networking to come on before you give a login prompt).
- or, allow systemd to communicate more clearly that it is not gonna wait forever for ya.

But really the strange thing from a user point of view is that you configure the network in LXC and then *it doesn't work* because it is not evident that the inner container is going to use DHCP by default.

But LXC doesn't determine the inner system. It could be anything, right, not just Debian. It could be anything that does its networking in whatever way, so it is up to the (Debian) LXC people to make sure that it works with Debian; it's not like LXC can handle that itself.

So the only solution comes down to providing that DHCP server by default (as LXC) instead of waiting for the user to select to use lxc-net for that.

That is actually what you expect as a user. You expect LXC to do the DHCP thing when you configure the networking inside (the container config).

So I would assume that the answer would need to lie in having LXC start that DHCP server when you configure a fixed IP address and maybe that is not perfect but it follows the model of what needs to happen anyway:

* you define a static address ---> inner container config must be set to manual/static OR dhcp must exist.

* you don't define a static address ---> nothing needs to happen because DHCP will work or you expect yourself to already have it
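The two cases above can be written down as a small decision function. This is only a sketch of the model as I describe it, not anything LXC actually implements; the function name and its three parameters are hypothetical.

```python
from typing import Optional


def must_start_dhcp(host_static_ipv4: Optional[str],
                    guest_mode: str,
                    dhcp_available: bool) -> bool:
    """Decide whether a DHCP server has to be started for a container.

    host_static_ipv4: address set in the container's config on the host,
                      or None if no static address is defined there.
    guest_mode:       how the system inside the container configures its
                      interface: "dhcp", "static" or "manual".
    dhcp_available:   whether something already answers DHCP on the bridge.
    """
    if host_static_ipv4 is None:
        # No static address: DHCP either works or the user provides it.
        return False
    if guest_mode in ("static", "manual"):
        # The inner config does not ask for a lease, so nothing can hang.
        return False
    # Static address on the host, but the guest still waits for a lease.
    return not dhcp_available
```

The case I ran into is must_start_dhcp("10.0.3.10/24", "dhcp", False): a static address in the host-side config, a guest that defaults to dhcp, and nothing on the bridge to answer it. The address here is just an example value.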

So the biggest problem at this point is that the container's network config, as set in the external configuration file (for the container) on the host, is completely disjoint from any lxc-net business as far as the configuration model goes.

lxc-net apparently evolved as a standalone thing, and apparently it still is that way.

But people do not want to use lxc-net if they can't see what it is going to do for them. I don't know ... I have never come across the scripts on my computer (VPS).

So I can only suggest this thing and these are the 3 solutions I mentioned, perhaps? :P.

1. lxc-net must be better documented so that people do not set up networking without it (but some may still not want to use it)
2. the dhcp "server" must be started instantly and automatically when a static IP address has been configured (and there could be another configuration flag to control that) and it should not be dependent on another (external) configuration file like /etc/default/lxc (which doesn't even exist).

And the third solution was to change the debian config to manual configuration so that it doesn't override the static IP setting of the container externally (the config of the container on the host).

So if you say the 3rd option is unavailable (and it should be, I guess) that leaves:

* clear documentation that setting the network in LXC config file (for the container) is not enough.
* automatically starting DHCP on static config
* make lxc-net more available, more accessible, and more transparent.

I *saw* the reference on the Debian Wiki: https://wiki.debian.org/LXC/SimpleBridge

However, the documentation is *so minimal* and the lxc-net service *doesn't exist*.

But hold on, you were mentioning 2.0? The Debian version in Jessie is 1.0.6-6+deb8u2.

The one in Stretch is 2.0.5-1... and LXC mentions that 1.1 has reached end of life (but 1.0 hasn't), yet it seems they urge everyone to upgrade anyway?

My bug was against Jessie, I'm not sure I mentioned that. Does that mean everyone on Jessie is going to stay stuck with this behaviour?

I guess all we can do then is make the documentation more explicit on the Wiki? I will seek to improve it if I have time to update on this 'anomaly' or this current status quo ;-).

Thanks for responding, bye.
