Bug#768188: Jessie Installer hangs after processing DHCPv6 stateful addressing
Hello,

Thanks for the notice,

Cyril Brulebois wrote on Wed 18 Feb 2015 22:29:23 +0100:
> Philipp Kern (2015-02-18):
> > So now I guess the question is if we revert the change that broke it:
> >
> >   Don't kill_dhcp_client without reason (Closes: #757711, #757988)
> >
> >   Do not kill_dhcp_client after setting the hostname and
> >   domain, otherwise Linux udhcpc will stop renewing its lease, and
> >   on other platforms dhclient will de-configure the network interface
> >   (Closes: #757711, #757988)
>
> (No idea about hurd; anyway, adding both porter lists to Cc.)

dhclient gets killed indeed, but for some reason the interface is not
deconfigured, so it's fine for the hurd port. We already have to ship
our own version of netcfg anyway because of #769189, which introduces a
1-minute sleep.

Samuel

--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Bug#768188: Jessie Installer hangs after processing DHCPv6 stateful addressing
Philipp Kern wrote:
> one-shot mode (-1) and will exit after it acquired a lease successfully.

dhclient isn't doing that, at least on kfreebsd. I'm not sure that's
what -1 means. It will try only once to get a lease, initially. If
successful it stays running (I assumed it continues to refresh the
lease) and, starting in the jessie version, will also give up the lease
on SIGINT (that was #757711).

I think reverting to what we had before reintroduces bugs, and would
break downstream Ubuntu. I think a workaround should be more targeted
at udhcpc/dhcp6c.

Regards,
--
Steven Chamberlain
ste...@pyro.eu.org
Bug#768188: Jessie Installer hangs after processing DHCPv6 stateful addressing
On Wed, Feb 18, 2015 at 10:05:27PM +, Steven Chamberlain wrote:
> We did expect that during freeze, some regressions may be introduced
> that affect only GNU/kFreeBSD, and we'd have to fix things up in our
> unofficial release, perhaps rolling packages back to an older version,
> or uploading a patched version with +kfreebsd suffix. So, I'm happy if
> you decide to revert this.
>
> At first glance, it reads like a limitation of udhcpc/dhcp6c only?
> Killing it sounds like a workaround (which perhaps creates other
> issues), and an ifdef linux also seems wrong in this context (and for
> Ubuntu).
>
> kill-all-dhcp could be told never to kill ISC dhclient, but that too is
> wrong, as this is also used to implement the 'Cancel' button in the
> netcfg dialogs.
>
> Maybe there is still a better solution?
>
> Or perhaps we could add something that kills *only* udhcpc/dhcp6c, and
> clearly annotate it as "this is a workaround for bug #768188". Then it
> shouldn't affect Ubuntu, or derivatives/ports using ISC DHCP at all.
> And if many years pass before someone comes back to look at this, they
> should understand why it's there.

Not killing the DHCP clients is the right thing to do. Leases really
should be refreshed during d-i, everything else is madness. But that's
not even what's happening with dhclient because it's being run in
one-shot mode (-1) and will exit after it acquired a lease successfully.

The revert I supposed would've been for jessie, as the DHCPv6 hang is
quite a nasty regression on Linux. But for the future we should really
a) use one client on all the platforms and b) let it renew the lease
properly.

Kind regards
Philipp Kern
Bug#768188: Jessie Installer hangs after processing DHCPv6 stateful addressing
Cyril Brulebois wrote:
> Philipp Kern (2015-02-18):
> > So now I guess the question is if we revert the change that broke it:
> >
> >   Don't kill_dhcp_client without reason (Closes: #757711, #757988)
> >
> >   Do not kill_dhcp_client after setting the hostname and
> >   domain, otherwise Linux udhcpc will stop renewing its lease, and
> >   on other platforms dhclient will de-configure the network interface
> >   (Closes: #757711, #757988)
> >
> > At this point kFreeBSD is no longer a release architecture and the other
> > platform using dhclient is Ubuntu.
>
> GNU/kFreeBSD people are (AFAICT) going to try and get an unofficial
> release out, so pushing a regression in their way doesn't look too good
> to me. Maybe using an #ifdef here to avoid killing the DHCP client on
> kfreebsd, and reinstating the previous codepath on linux would be an
> acceptable compromise until some evolved signal/process handling pops
> up (during the stretch release cycle)?

Firstly, thanks for the heads-up.

We did expect that during freeze, some regressions may be introduced
that affect only GNU/kFreeBSD, and we'd have to fix things up in our
unofficial release, perhaps rolling packages back to an older version,
or uploading a patched version with +kfreebsd suffix. So, I'm happy if
you decide to revert this.

At first glance, it reads like a limitation of udhcpc/dhcp6c only?
Killing it sounds like a workaround (which perhaps creates other
issues), and an ifdef linux also seems wrong in this context (and for
Ubuntu).

kill-all-dhcp could be told never to kill ISC dhclient, but that too is
wrong, as this is also used to implement the 'Cancel' button in the
netcfg dialogs.

Maybe there is still a better solution?

Or perhaps we could add something that kills *only* udhcpc/dhcp6c, and
clearly annotate it as "this is a workaround for bug #768188". Then it
shouldn't affect Ubuntu, or derivatives/ports using ISC DHCP at all.
And if many years pass before someone comes back to look at this, they
should understand why it's there.

Regards,
--
Steven Chamberlain
ste...@pyro.eu.org
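The "kill only udhcpc/dhcp6c, and annotate why" idea could look roughly like the sketch below. It is an illustration only, not netcfg code: `struct dhcp_proc`, `is_workaround_768188_target()` and `kill_workaround_768188()` are hypothetical names, and how the list of running clients is obtained is assumed to be solved elsewhere.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical record for one running DHCP client. */
struct dhcp_proc {
    const char *name;   /* program name, e.g. "udhcpc" */
    pid_t pid;
};

/*
 * This is a workaround for bug #768188: only the clients listed here
 * are stopped. ISC dhclient is deliberately absent so that ports and
 * derivatives using it are unaffected.
 */
static const char *const workaround_768188_clients[] = { "udhcpc", "dhcp6c" };

/* Returns 1 if the named client is covered by the workaround, else 0. */
int is_workaround_768188_target(const char *name)
{
    size_t n = sizeof(workaround_768188_clients) /
               sizeof(workaround_768188_clients[0]);
    size_t i;

    assert(name != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(name, workaround_768188_clients[i]) == 0)
            return 1;
    }
    return 0;
}

/* Sends SIGTERM to the covered clients; returns how many were signalled. */
int kill_workaround_768188(const struct dhcp_proc *procs, size_t count)
{
    int signalled = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (procs[i].pid > 0 &&
            is_workaround_768188_target(procs[i].name) &&
            kill(procs[i].pid, SIGTERM) == 0)
            signalled++;
    }
    return signalled;
}
```

Keeping the client names in one annotated table means the 'Cancel' path through kill-all-dhcp stays untouched, and the workaround can be removed in one place later.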
Bug#768188: Jessie Installer hangs after processing DHCPv6 stateful addressing
Philipp Kern (2015-02-18):
> On Tue, Feb 10, 2015 at 09:22:25AM +0100, Philipp Kern wrote:
> > On Sun, Feb 08, 2015 at 04:21:25PM +0100, Philipp Kern wrote:
> > > On the other hand it also seems wrong for di_exec_shell_log to continue
> > > after the invoked binary exited. I suspect that'd mean ppoll() and
> > > proper signal handling, but I'm at a loss right now how to do that
> > > properly in C. Maybe that's the right place to fix it in the meantime.
> >
> > I guess signalfd would make this rather neat, but it's not available
> > on FreeBSD. :(
> >
> > The alternative would be to overwrite the SIGCHLD signal handler
> > regardless of what has been set before and handle the signal in the
> > library.
>
> So now I guess the question is if we revert the change that broke it:
>
>   Don't kill_dhcp_client without reason (Closes: #757711, #757988)
>
>   Do not kill_dhcp_client after setting the hostname and
>   domain, otherwise Linux udhcpc will stop renewing its lease, and
>   on other platforms dhclient will de-configure the network interface
>   (Closes: #757711, #757988)
>
> At this point kFreeBSD is no longer a release architecture and the other
> platform using dhclient is Ubuntu.

GNU/kFreeBSD people are (AFAICT) going to try and get an unofficial
release out, so pushing a regression in their way doesn't look too good
to me. Maybe using an #ifdef here to avoid killing the DHCP client on
kfreebsd, and reinstating the previous codepath on linux would be an
acceptable compromise until some evolved signal/process handling pops
up (during the stretch release cycle)?

(No idea about hurd; anyway, adding both porter lists to Cc.)

Mraw,
KiBi.
Bug#768188: Jessie Installer hangs after processing DHCPv6 stateful addressing
On Tue, Feb 10, 2015 at 09:22:25AM +0100, Philipp Kern wrote:
> On Sun, Feb 08, 2015 at 04:21:25PM +0100, Philipp Kern wrote:
> > On the other hand it also seems wrong for di_exec_shell_log to continue
> > after the invoked binary exited. I suspect that'd mean ppoll() and
> > proper signal handling, but I'm at a loss right now how to do that
> > properly in C. Maybe that's the right place to fix it in the meantime.
>
> I guess signalfd would make this rather neat, but it's not available
> on FreeBSD. :(
>
> The alternative would be to overwrite the SIGCHLD signal handler
> regardless of what has been set before and handle the signal in the
> library.

So now I guess the question is if we revert the change that broke it:

  Don't kill_dhcp_client without reason (Closes: #757711, #757988)

  Do not kill_dhcp_client after setting the hostname and
  domain, otherwise Linux udhcpc will stop renewing its lease, and
  on other platforms dhclient will de-configure the network interface
  (Closes: #757711, #757988)

At this point kFreeBSD is no longer a release architecture and the other
platform using dhclient is Ubuntu.

Kind regards
Philipp Kern
Bug#768188: Jessie Installer hangs after processing DHCPv6 stateful addressing
On Sun, Feb 08, 2015 at 04:21:25PM +0100, Philipp Kern wrote:
> On the other hand it also seems wrong for di_exec_shell_log to continue
> after the invoked binary exited. I suspect that'd mean ppoll() and
> proper signal handling, but I'm at a loss right now how to do that
> properly in C. Maybe that's the right place to fix it in the meantime.

I guess signalfd would make this rather neat, but it's not available
on FreeBSD. :(

The alternative would be to overwrite the SIGCHLD signal handler
regardless of what has been set before and handle the signal in the
library.

Kind regards
Philipp Kern
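One portable way to approximate signalfd is the classic self-pipe trick: the SIGCHLD handler writes a byte to a pipe, and the main loop poll()s the read end. The sketch below assumes that approach; it is not the libdebian-installer implementation, `run_and_reap()` is a hypothetical name, and the global pipe makes it suitable for one caller at a time only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Self-pipe shared with the signal handler (one caller at a time). */
static int sigchld_pipe[2] = { -1, -1 };

static void on_sigchld(int signo)
{
    int saved_errno = errno;

    (void)signo;
    /* write() is async-signal-safe. If the pipe is full, a wakeup is
     * already pending, so a failure here can be ignored. */
    if (write(sigchld_pipe[1], "x", 1) < 0) {
        /* nothing to do */
    }
    errno = saved_errno;
}

/* Hypothetical helper: run cmd via /bin/sh and return its exit status,
 * or -1 on failure. */
int run_and_reap(const char *cmd)
{
    struct sigaction sa, old_sa;
    struct pollfd pfd;
    int status = 0;
    int reaped = 0;
    pid_t pid;

    assert(cmd != NULL);
    if (pipe(sigchld_pipe) < 0)
        return -1;
    /* Non-blocking ends: neither the handler nor the drain loop may stall. */
    fcntl(sigchld_pipe[0], F_SETFL, O_NONBLOCK);
    fcntl(sigchld_pipe[1], F_SETFL, O_NONBLOCK);

    /* Replace the SIGCHLD handler regardless of what was set before. */
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_NOCLDSTOP;
    sigaction(SIGCHLD, &sa, &old_sa);

    pid = fork();
    if (pid == 0) {
        close(sigchld_pipe[0]);
        close(sigchld_pipe[1]);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);
    }

    pfd.fd = sigchld_pipe[0];
    pfd.events = POLLIN;
    while (pid > 0 && !reaped) {
        char drain[16];

        /* A real caller would poll the log pipe in the same pollfd set. */
        if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
            break;
        while (read(sigchld_pipe[0], drain, sizeof(drain)) > 0)
            ;
        if (waitpid(pid, &status, WNOHANG) == pid)
            reaped = 1;
    }

    /* Restore the caller's handler before the pipe goes away. */
    sigaction(SIGCHLD, &old_sa, NULL);
    close(sigchld_pipe[0]);
    close(sigchld_pipe[1]);

    return (reaped && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```

Because the handler is installed before fork(), a child that exits immediately still leaves a byte in the pipe, so the poll() cannot miss the wakeup.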
Bug#768188: Jessie Installer hangs after processing DHCPv6 stateful addressing
On Thu, Dec 18, 2014 at 03:04:38PM +0100, Peter Valdemar Mørch wrote:
> netcfg: Do not kill_dhcp_client after setting the hostname and domain,
> otherwise Linux udhcpc will stop renewing its lease, and on other
> platforms dhclient will de-configure the network interface (#757711,
> #757988).

The call chain is this:

  udpkg --configure --force-configure netcfg \_ netcfg Z \_ dhcp6c \_ udhcpc

udpkg does not collect netcfg's exit code. Instead it continues
poll()ing to forward stderr. It receives the SIGCHLD but does not act
upon it with a wait() or waitpid().

The function udpkg uses to invoke netcfg's configure comes from
libdebian-installer:

  [...]
  snprintf(buf, sizeof(buf), "exec %s configure", config);
  if ((r = di_exec_shell_log(buf)) != 0)
  [...]

Essentially dhcp6c and udhcpc need to be daemonized off correctly once
they go into "lease acquired, renew in the background" mode and close
their file descriptors[*]. However, doing that early likely loses
logging, so it'd be best if the programs would do the right thing. At
least udhcpc calls bb_daemonize(0), which doesn't do any fd closing. For
dhcp6c (wide-dhcpv6-client) we currently force foreground mode (-f).

It is not sufficient for netcfg to simply close stderr, as all producers
need to close it, as far as I understand.

On the other hand it also seems wrong for di_exec_shell_log to continue
after the invoked binary exited. I suspect that'd mean ppoll() and
proper signal handling, but I'm at a loss right now how to do that
properly in C. Maybe that's the right place to fix it in the meantime.

Kind regards
Philipp Kern

[*] It has been a long-standing problem with some d-i (maybe just Ubuntu
    with isc-dhcp-client) that leases are not renewed during the runtime
    of the installation, which might break networking when the switch
    throws you off post-expiry.
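To make the hang concrete: a log-forwarding loop that waits only for EOF on the pipe never finishes while a background dhcp6c or udhcpc still holds the write end. A rough sketch of a loop that also checks for the direct child's exit follows. It substitutes a short poll() timeout plus waitpid(WNOHANG) for ppoll() and signal handling, and `run_shell_logged()` is a hypothetical stand-in, not the real di_exec_shell_log.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run cmd via /bin/sh, forward its output to
 * stderr, and return its exit status (or -1 on failure). */
int run_shell_logged(const char *cmd)
{
    struct pollfd pfd;
    int fds[2];
    int status = 0;
    int exited = 0;
    pid_t pid;

    assert(cmd != NULL);
    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: route stdout and stderr into the pipe, then exec. */
        dup2(fds[1], STDOUT_FILENO);
        dup2(fds[1], STDERR_FILENO);
        close(fds[0]);
        close(fds[1]);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);
    }
    close(fds[1]);

    pfd.fd = fds[0];
    pfd.events = POLLIN;
    for (;;) {
        char buf[512];
        /* After the child exited, only drain output already queued. */
        int n = poll(&pfd, 1, exited ? 0 : 100);

        if (n > 0) {
            ssize_t len = read(fds[0], buf, sizeof(buf));
            if (len > 0)
                fwrite(buf, 1, (size_t)len, stderr);
            else if (len == 0 || errno != EINTR)
                break;      /* EOF (all writers closed) or read error */
        } else if (n == 0 && exited) {
            break;          /* child is gone and nothing is queued */
        } else if (n < 0 && errno != EINTR) {
            break;          /* poll() failed */
        }

        /* Reap the direct child even if descendants keep the pipe open. */
        if (!exited && waitpid(pid, &status, WNOHANG) == pid)
            exited = 1;
    }
    close(fds[0]);

    while (!exited) {
        if (waitpid(pid, &status, 0) == pid)
            exited = 1;
        else if (errno != EINTR)
            break;
    }
    return (exited && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
}
```

With a command such as "sleep 2 & exit 0", this returns shortly after the shell exits rather than after the background process closes the pipe. The cost is that anything a surviving daemon writes later is lost, and such a write may fail with EPIPE (or raise SIGPIPE) once the read end is closed.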
Bug#768188: Jessie Installer hangs after processing DHCPv6 stateful addressing
On Thu, Dec 18, 2014 at 3:04 PM, Peter Valdemar Mørch wrote:
> This occurs in our work environment in VMware Workstation and
> Proxmox when using bridged eth0, but not when using NAT.

I have also tried this on physical hardware without virtualization, and
got the same hang.

Peter
Bug#768188: Jessie Installer hangs after processing DHCPv6 stateful addressing
On Mon, 22 Dec 2014 12:03:24 +0100, Peter Valdemar Mørch <pe...@morch.com> wrote:
> If it adds value, I can try booting the image on e.g. a laptop to see
> if it is VMware specific. But I'm pretty sure it will experience the
> same symptoms. That is only possible from January 5th onwards, since
> I'm on vacation until then.
>
> Peter

Hi Peter,

I have done my testing under KVM, so it is not a VMware issue. I have
also tried the latest RC version of the installer
(debian-jessie-DI-rc1-amd64-netinst.iso) with the same results.

Radek
Bug#768188: Jessie Installer hangs after processing DHCPv6 stateful addressing
On Mon, Dec 22, 2014 at 11:52 AM, Philipp Kern wrote:
> But please tell me: Why is there no Router Advertisement in the packet
> dump? I see Router Solicitations and DHCPv6 interactions, but no RA.

I have absolutely no idea. Perhaps that is the reason for the hang?

All I know is that it hangs with Jessie Beta 2
(debian-jessie-DI-b2-amd64-netinst.iso) and not with any previous Debian
installer as far back as sarge. And it doesn't hang when booting in the
newly created/installed image.

If it adds value, I can try booting the image on e.g. a laptop to see
if it is VMware specific. But I'm pretty sure it will experience the
same symptoms. That is only possible from January 5th onwards, since
I'm on vacation until then.

Peter
Bug#768188: Jessie Installer hangs after processing DHCPv6 stateful addressing
On Thu, Dec 18, 2014 at 03:04:38PM +0100, Peter Valdemar Mørch wrote:
> This occurs in our work environment in VMware Workstation and
> Proxmox when using bridged eth0, but not when using NAT. In my home
> network, the exact same procedure goes through without any hangs for
> both bridged and NAT.

VMware Workstation's IPv6 "support" is full of sadness. Which
virtualization do you use with Proxmox?

But please tell me: Why is there no Router Advertisement in the packet
dump? I see Router Solicitations and DHCPv6 interactions, but no RA.

Kind regards
Philipp Kern
Bug#768188: Jessie Installer hangs after processing DHCPv6 stateful addressing
Thanks for the details. Adding Philipp to the loop:

Peter Valdemar Mørch (2014-12-18):
> I'm also seeing that d-i hangs after DHCP setup.
>
> But only in Jessie Beta 2 - debian-jessie-DI-b2-amd64-netinst.iso. Not
> with Beta 1. The OP also used debian-jessie-DI-b2-amd64-netinst.iso.
>
> While it hangs, if I go to another terminal with ALT-F2, and issue:
>
> > kill-all-dhcp
>
> Then d-i continues past the hang.
>
> This occurs in our work environment in VMware Workstation and
> Proxmox when using bridged eth0, but not when using NAT. In my home
> network, the exact same procedure goes through without any hangs for
> both bridged and NAT.
>
> I've put a wireshark capture of everything from the virtual machine's
> MAC address and /var/log/syslog from the installation up until after
> running kill-all-dhcp at http://ge.tt/7b1wK872?c and also attached.
>
> It seems that in our network, IPv6 reverse DNS lookups fail. It is a
> likely suspect to why it hangs, but I can't be sure. Misconfigured
> IPv6 networks are probably not uncommon! ;-)
>
> The release announcement: "Debian Installer Jessie Beta 2 release" at
> https://www.debian.org/devel/debian-installer/News/2014/20141005
> says:
>
> netcfg: Do not kill_dhcp_client after setting the hostname and domain,
> otherwise Linux udhcpc will stop renewing its lease, and on other
> platforms dhclient will de-configure the network interface (#757711,
> #757988).
>
> This comes from a fix to:
>
> #757711 - netcfg: promptly kills dhclient, deconfigures interface -
> Debian Bug report logs
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=757711
>
> which is:
>
> Don't kill_dhcp_client without reason (Closes: #757711, #757988)
> http://anonscm.debian.org/cgit/d-i/netcfg.git/commit/?id=48f1de7076f8d17a9bf4d11cb05968cb9d8987f7
>
> that essentially is this diff:
>
> diff --git a/dhcp.c b/dhcp.c
> index aa37bd0..5ef0dbc 100644
> --- a/dhcp.c
> +++ b/dhcp.c
> @@ -614,7 +614,6 @@ int netcfg_activate_dhcp (struct debconfclient *client, struct netcfg_interface
>      netcfg_write_loopback();
>      netcfg_write_interface(interface);
>      netcfg_write_resolv(domain, interface);
> -    kill_dhcp_client();
>      stop_rdnssd();
>
>      return 0;
>
> Since killing the dhcp client makes it continue for me, I'm pretty
> sure the introduction of this fix for #757711 introduced in Jessie
> Beta2 is the reason we're now seeing this.
>
> Sincerely,
>
> Peter

Mraw,
KiBi.
Bug#768188: Jessie Installer hangs after processing DHCPv6 stateful addressing
In my setup, I've got DHCP for IPv4, and IPv6 without DHCP. Same result.
I could reproduce the bug in a VM and on a notebook.

There is a process in zombie state: [netcfg]. Its parent process seems
to be

  udpkg --configure --force-configure netcfg

A workaround is to kill the process

  dhcp6c -c /var/lib/netcfg/dhcp6c.conf -f eth0