Re: Roaming, and libresolv being stuck in the 1980's mindset
On 21.4.2015 16:32, Dan Williams wrote:
> On Mon, 2015-04-20 at 16:27 -0600, Philip Prindeville wrote:
>> On Apr 20, 2015, at 12:23 AM, Siddhesh Poyarekar wrote:
>>
>>> On Sat, Apr 18, 2015 at 01:49:57PM -0600, Philip Prindeville wrote:
>>>> If you go back through the previous glibc bugs, you'll find:
>>>>
>>>> https://sourceware.org/bugzilla/show_bug.cgi?id=984
>>>>
>>>> from 2005 which was closed out as "RESOLVED, WONTFIX" with the text:
>>>>
>>>> There is a solution, already implemented.
>>>> Use nscd and nscd -i hosts in the script that rewrites your
>>>> resolv.conf (or nsswitch.conf etc.).
>>>
>>> Yes, that has been the upstream stance for quite some time now, so I
>>> don't know if filing a fresh bug would change that. You could however
>>> start a discussion upstream (libc-alpha at sourceware dot org) and
>>> make a case for the resolver to watch for changes in resolv.conf.
>>
>> Yeah… I think inotify() might be the cleanest fix for that, but it’s not
>> particularly portable.
>>
>> stat() would work everywhere else.
>>
>> Threading complicates things a bit, though.
>
> We tried to push inotify() style watching in glibc in the past, and I
> think even Debian ships a patch for that. But the response was always
> "use a local caching nameserver" because glibc cannot be all things to
> everyone. And honestly, that's not an unreasonable answer. If you use
> a local caching nameserver then you get split DNS too, which libresolv
> cannot give you. You may encounter this same argument from upstream.
>
> However, when we last tried in NetworkManager land, that was like 2
> glibc maintainers ago, and current maintainers might have softened their
> stance. In any case, I think it makes sense to just go with a local
> caching nameserver anyway, for the split DNS. Then the network
> management daemon (or whatever you use to switch networks) can forward
> the DNS+domain info to the caching nameserver and everything is happy.

Exactly this is planned for Fedora 23:
https://fedoraproject.org//wiki/Changes/Default_Local_DNS_Resolver

Stay tuned!

Petr^2 Spacek

> Obviously, this doesn't excuse apps like Evolution/Thunderbird/etc from
> the equation, since they should really be smart enough to know that if
> they are not connected to the corporate VPN, they can't pull mail from
> it. But that's between the network management daemon and the app, and
> doesn't involve the resolver at all.
>
> Dan
>
>>>> Problem with that is that no one seems to have gravitated towards
>>>> this solution, and I don't blame them. It adds an extra layer of
>>>> complexity and makes debugging issues that much more murky.
>>>>
>>>> A simpler fix is to grab mtime from stat()ing _PATH_RESCONF each
>>>> time through res_query() and see if it's changed since the last
>>>> time. Perhaps caching the inode # also and checking that, since an
>>>> older version of the file might have been renamed as
>>>> /etc/resolv.conf.
>>>
>>> That is conceptually simple, but expensive, since you'll be adding
>>> syscalls to every lookup. One may argue that it is not much overhead
>>> for a network lookup since the latter will still take up the bulk of
>>> the time, but it is an added cost nevertheless.
>>>
>>> Siddhesh
>>
>> syscalls are a lot cheaper than network round-trips. At least in 99% of
>> the cases.
>>
>> How often does name resolution happen, anyway? Not very often. You
>> typically call it before opening a socket, and then that socket
>> persists a long time…
>>
>> In the case of split-horizon DNS service in a corporate environment,
>> this problem still wouldn’t be solved, at least not directly.
>>
>> What might happen is you resolve imap.mycorp.com to some
>> inside-the-firewall 10.x.x.x address, and connect to that. Then you
>> roam off the network, but of course Thunderbird (for instance) knows
>> nothing about this… it just eventually drops the connection because
>> that address becomes unreachable (or points to a new host with no
>> knowledge of this TCP connection and it promptly RESETs). TB then tries
>> to reopen a connection… I’ve not looked at TB source in about 8 years
>> so I don’t know if it would redo the name resolution or not… if yes,
>> then it might point to a new exterior name server and get the external
>> (public-facing) address of imap.mycorp.com and things work again… But
>> that’s being optimistic.
>>
>> The behavior I’ve seen implies that it caches the address from the
>> original resolution and keeps trying to reconnect to that.
>>
>> But having libresolv transparently relearn the /etc/resolv.conf
>> settings is the first step toward doing the right thing.
>>
>> -Philip

--
Petr Spacek @ Red Hat
--
devel mailing list
devel@lists.fedoraproject.org
https://admin.fedoraproject.org/mailman/listinfo/devel
Fedora Code of Conduct: http://fedoraproject.org/code-of-conduct
Re: Roaming, and libresolv being stuck in the 1980's mindset
On Mon, 2015-04-20 at 16:27 -0600, Philip Prindeville wrote:
> On Apr 20, 2015, at 12:23 AM, Siddhesh Poyarekar wrote:
>
>> On Sat, Apr 18, 2015 at 01:49:57PM -0600, Philip Prindeville wrote:
>>> If you go back through the previous glibc bugs, you'll find:
>>>
>>> https://sourceware.org/bugzilla/show_bug.cgi?id=984
>>>
>>> from 2005 which was closed out as "RESOLVED, WONTFIX" with the text:
>>>
>>> There is a solution, already implemented.
>>> Use nscd and nscd -i hosts in the script that rewrites your
>>> resolv.conf (or nsswitch.conf etc.).
>>
>> Yes, that has been the upstream stance for quite some time now, so I
>> don't know if filing a fresh bug would change that. You could however
>> start a discussion upstream (libc-alpha at sourceware dot org) and
>> make a case for the resolver to watch for changes in resolv.conf.
>
> Yeah… I think inotify() might be the cleanest fix for that, but it’s not
> particularly portable.
>
> stat() would work everywhere else.
>
> Threading complicates things a bit, though.

We tried to push inotify() style watching in glibc in the past, and I
think even Debian ships a patch for that. But the response was always
"use a local caching nameserver" because glibc cannot be all things to
everyone. And honestly, that's not an unreasonable answer. If you use
a local caching nameserver then you get split DNS too, which libresolv
cannot give you. You may encounter this same argument from upstream.

However, when we last tried in NetworkManager land, that was like 2
glibc maintainers ago, and current maintainers might have softened their
stance. In any case, I think it makes sense to just go with a local
caching nameserver anyway, for the split DNS. Then the network
management daemon (or whatever you use to switch networks) can forward
the DNS+domain info to the caching nameserver and everything is happy.

Obviously, this doesn't excuse apps like Evolution/Thunderbird/etc from
the equation, since they should really be smart enough to know that if
they are not connected to the corporate VPN, they can't pull mail from
it. But that's between the network management daemon and the app, and
doesn't involve the resolver at all.

Dan

>>> Problem with that is that no one seems to have gravitated towards
>>> this solution, and I don't blame them. It adds an extra layer of
>>> complexity and makes debugging issues that much more murky.
>>>
>>> A simpler fix is to grab mtime from stat()ing _PATH_RESCONF each
>>> time through res_query() and see if it's changed since the last
>>> time. Perhaps caching the inode # also and checking that, since an
>>> older version of the file might have been renamed as
>>> /etc/resolv.conf.
>>
>> That is conceptually simple, but expensive, since you'll be adding
>> syscalls to every lookup. One may argue that it is not much overhead
>> for a network lookup since the latter will still take up the bulk of
>> the time, but it is an added cost nevertheless.
>>
>> Siddhesh
>
> syscalls are a lot cheaper than network round-trips. At least in 99% of
> the cases.
>
> How often does name resolution happen, anyway? Not very often. You
> typically call it before opening a socket, and then that socket
> persists a long time…
>
> In the case of split-horizon DNS service in a corporate environment,
> this problem still wouldn’t be solved, at least not directly.
>
> What might happen is you resolve imap.mycorp.com to some
> inside-the-firewall 10.x.x.x address, and connect to that. Then you
> roam off the network, but of course Thunderbird (for instance) knows
> nothing about this… it just eventually drops the connection because
> that address becomes unreachable (or points to a new host with no
> knowledge of this TCP connection and it promptly RESETs). TB then tries
> to reopen a connection… I’ve not looked at TB source in about 8 years
> so I don’t know if it would redo the name resolution or not… if yes,
> then it might point to a new exterior name server and get the external
> (public-facing) address of imap.mycorp.com and things work again… But
> that’s being optimistic.
>
> The behavior I’ve seen implies that it caches the address from the
> original resolution and keeps trying to reconnect to that.
>
> But having libresolv transparently relearn the /etc/resolv.conf
> settings is the first step toward doing the right thing.
>
> -Philip
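To make the split DNS point concrete: what a local caching nameserver does, and libresolv cannot, is choose an upstream server per domain. The following is only a minimal Python sketch of that routing decision, not how any particular nameserver implements it. The function name pick_upstream, the routes table, and the addresses are all made up for illustration.

```python
def pick_upstream(name, routes, default):
    """Return the upstream DNS server for `name`.

    `routes` maps a domain suffix to a server address (hypothetical data,
    e.g. pushed by the network management daemon when a VPN comes up).
    The most specific matching suffix wins; otherwise `default` is used.
    """
    labels = name.rstrip(".").lower().split(".")
    # Try a.b.c first, then b.c, then c, so longer suffixes take priority.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in routes:
            return routes[suffix]
    return default


# Hypothetical setup: corporate names go to the VPN resolver,
# everything else goes to whatever the current network handed out.
routes = {"mycorp.com": "10.0.0.53"}
print(pick_upstream("imap.mycorp.com", routes, "192.0.2.1"))
print(pick_upstream("www.example.org", routes, "192.0.2.1"))
```

Matching on whole labels rather than raw string suffixes matters here: a name such as notmycorp.com must not be sent to the corporate resolver.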
Re: Roaming, and libresolv being stuck in the 1980's mindset
On 04/21/2015 09:26 AM, Björn Persson wrote:
> Philip Prindeville wrote:
>> The behavior I’ve seen implies that [Thunderbird] caches the address
>> from the original resolution and keeps trying to reconnect to that.
>
> Well, if it cached the address for longer than its time to live,
> then it would be doing it wrong.

There is a widespread belief that web browsers have to do that to
counter certain vulnerabilities (the exploit technique is sometimes
called “DNS rebinding”).

--
Florian Weimer / Red Hat Product Security
Re: Roaming, and libresolv being stuck in the 1980's mindset
Philip Prindeville wrote:
> The behavior I’ve seen implies that [Thunderbird] caches the address
> from the original resolution and keeps trying to reconnect to that.

Well, if it cached the address for longer than its time to live,
then it would be doing it wrong.

Björn Persson
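As an illustration of the time-to-live point, here is a minimal Python sketch of an application-side address cache that never serves an entry past its TTL. It is not taken from Thunderbird or any real client. The class name TtlCache, its methods, and the injectable clock are assumptions made for the example.

```python
import time


class TtlCache:
    """Cache name -> address, honouring the TTL that came with the answer."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the expiry logic can be tested
        self._entries = {}       # name -> (address, expiry timestamp)

    def put(self, name, address, ttl):
        """Store an address that may be reused for `ttl` seconds."""
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        """Return the cached address, or None if absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires = entry
        if self._clock() >= expires:
            # Expired: drop it so the caller is forced to resolve again.
            del self._entries[name]
            return None
        return address
```

A caller that gets None back would redo the lookup, which is the point at which a changed resolver configuration has a chance to take effect.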
Re: Roaming, and libresolv being stuck in the 1980's mindset
On Apr 20, 2015, at 12:23 AM, Siddhesh Poyarekar wrote:

> On Sat, Apr 18, 2015 at 01:49:57PM -0600, Philip Prindeville wrote:
>> If you go back through the previous glibc bugs, you'll find:
>>
>> https://sourceware.org/bugzilla/show_bug.cgi?id=984
>>
>> from 2005 which was closed out as "RESOLVED, WONTFIX" with the text:
>>
>> There is a solution, already implemented.
>> Use nscd and nscd -i hosts in the script that rewrites your
>> resolv.conf (or nsswitch.conf etc.).
>
> Yes, that has been the upstream stance for quite some time now, so I
> don't know if filing a fresh bug would change that. You could however
> start a discussion upstream (libc-alpha at sourceware dot org) and
> make a case for the resolver to watch for changes in resolv.conf.

Yeah… I think inotify() might be the cleanest fix for that, but it’s not
particularly portable.

stat() would work everywhere else.

Threading complicates things a bit, though.

>> Problem with that is that no one seems to have gravitated towards
>> this solution, and I don't blame them. It adds an extra layer of
>> complexity and makes debugging issues that much more murky.
>>
>> A simpler fix is to grab mtime from stat()ing _PATH_RESCONF each
>> time through res_query() and see if it's changed since the last
>> time. Perhaps caching the inode # also and checking that, since an
>> older version of the file might have been renamed as
>> /etc/resolv.conf.
>
> That is conceptually simple, but expensive, since you'll be adding
> syscalls to every lookup. One may argue that it is not much overhead
> for a network lookup since the latter will still take up the bulk of
> the time, but it is an added cost nevertheless.
>
> Siddhesh

syscalls are a lot cheaper than network round-trips. At least in 99% of
the cases.

How often does name resolution happen, anyway? Not very often. You
typically call it before opening a socket, and then that socket
persists a long time…

In the case of split-horizon DNS service in a corporate environment,
this problem still wouldn’t be solved, at least not directly.

What might happen is you resolve imap.mycorp.com to some
inside-the-firewall 10.x.x.x address, and connect to that. Then you
roam off the network, but of course Thunderbird (for instance) knows
nothing about this… it just eventually drops the connection because
that address becomes unreachable (or points to a new host with no
knowledge of this TCP connection and it promptly RESETs). TB then tries
to reopen a connection… I’ve not looked at TB source in about 8 years
so I don’t know if it would redo the name resolution or not… if yes,
then it might point to a new exterior name server and get the external
(public-facing) address of imap.mycorp.com and things work again… But
that’s being optimistic.

The behavior I’ve seen implies that it caches the address from the
original resolution and keeps trying to reconnect to that.

But having libresolv transparently relearn the /etc/resolv.conf
settings is the first step toward doing the right thing.

-Philip
Re: Roaming, and libresolv being stuck in the 1980's mindset
On Sat, Apr 18, 2015 at 01:49:57PM -0600, Philip Prindeville wrote:
> If you go back through the previous glibc bugs, you'll find:
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=984
>
> from 2005 which was closed out as "RESOLVED, WONTFIX" with the text:
>
> There is a solution, already implemented.
> Use nscd and nscd -i hosts in the script that rewrites your resolv.conf
> (or nsswitch.conf etc.).

Yes, that has been the upstream stance for quite some time now, so I
don't know if filing a fresh bug would change that. You could however
start a discussion upstream (libc-alpha at sourceware dot org) and
make a case for the resolver to watch for changes in resolv.conf.

> Problem with that is that no one seems to have gravitated towards
> this solution, and I don't blame them. It adds an extra layer of
> complexity and makes debugging issues that much more murky.
>
> A simpler fix is to grab mtime from stat()ing _PATH_RESCONF each
> time through res_query() and see if it's changed since the last
> time. Perhaps caching the inode # also and checking that, since an
> older version of the file might have been renamed as
> /etc/resolv.conf.

That is conceptually simple, but expensive, since you'll be adding
syscalls to every lookup. One may argue that it is not much overhead
for a network lookup since the latter will still take up the bulk of
the time, but it is an added cost nevertheless.

Siddhesh
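The cost of the extra syscall is easy to measure rather than argue about. Below is a rough Python sketch that times repeated os.stat() calls on a path. The function name mean_stat_cost is made up for the example, and Python adds interpreter overhead that a C caller inside glibc would not pay, so the figure is at best an upper bound. No numbers are claimed here; the result depends on the machine and filesystem.

```python
import os
import time


def mean_stat_cost(path, iterations=10000):
    """Return the average wall-clock seconds per os.stat() call on `path`."""
    start = time.perf_counter()
    for _ in range(iterations):
        os.stat(path)
    elapsed = time.perf_counter() - start
    return elapsed / iterations


if __name__ == "__main__":
    # "." is used so the sketch runs anywhere; substitute /etc/resolv.conf
    # on a system where that file exists.
    print("seconds per stat():", mean_stat_cost("."))
```

Comparing the printed figure with the round-trip time to the configured nameserver on the same machine gives the ratio the two sides of this argument are talking about.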
Re: Roaming, and libresolv being stuck in the 1980's mindset
Philip Prindeville wrote:
> If you're getting bad resolver addresses from your DHCP server,
> aren't you also potentially getting a bad default gateway and hence
> setting yourself up for a man-in-the-middle attack?

Man-in-the-middle attacks can be carried out from any computer on any of
the networks that your packets pass through, not just from your default
gateway. For most protocols the way to prevent them is to use TLS or
IPsec. Man-in-the-middle attacks on DNS resolution are prevented with
DNSSEC.

Björn Persson
Re: Roaming, and libresolv being stuck in the 1980's mindset
On 04/18/2015 02:25 PM, Björn Persson wrote:
> Philip Prindeville wrote:
>> I recently opened a bug with glibc because persistent programs (like
>> Thunderbird, etc) don't seem to handle roaming onto different
>> networks very well.
>>
>> Or rather, they rely on libresolv which opens /etc/resolv.conf at
>> startup and then ignores changes to the file for the rest of the time
>> the process it is linked to is running.
>>
>> This might have been fine for desktop tower computers in the 1980s
>> (though even then we had PPP and dynamic network settings), but we're
>> in the era of pervasive laptops with internet connections and your
>> settings are going to be volatile. Period.
>
> On the other hand those laptops are moving around in a rather hostile
> environment, so they really ought to start doing DNSSEC validation
> locally as soon as possible, preferably several years ago. That means
> that libresolv will only ever query the resolver daemon on the local
> host, and has no need to check for updates to resolv.conf.
>
> Some installations may be able to rely on a trusted DNS server doing
> the validation for them, but then their resolv.conf is static, so again
> there is no need to check for updates.
>
> Björn Persson

If you're getting bad resolver addresses from your DHCP server, aren't
you also potentially getting a bad default gateway and hence setting
yourself up for a man-in-the-middle attack?

-Philip
Re: Roaming, and libresolv being stuck in the 1980's mindset
Please see:

https://fedoraproject.org//wiki/Changes/Default_Local_DNS_Resolver

It's an F23 change (deferred from F22).

kevin
Re: Roaming, and libresolv being stuck in the 1980's mindset
Philip Prindeville wrote:
> I recently opened a bug with glibc because persistent programs (like
> Thunderbird, etc) don't seem to handle roaming onto different
> networks very well.
>
> Or rather, they rely on libresolv which opens /etc/resolv.conf at
> startup and then ignores changes to the file for the rest of the time
> the process it is linked to is running.
>
> This might have been fine for desktop tower computers in the 1980s
> (though even then we had PPP and dynamic network settings), but we're
> in the era of pervasive laptops with internet connections and your
> settings are going to be volatile. Period.

On the other hand those laptops are moving around in a rather hostile
environment, so they really ought to start doing DNSSEC validation
locally as soon as possible, preferably several years ago. That means
that libresolv will only ever query the resolver daemon on the local
host, and has no need to check for updates to resolv.conf.

Some installations may be able to rely on a trusted DNS server doing
the validation for them, but then their resolv.conf is static, so again
there is no need to check for updates.

Björn Persson
Re: Roaming, and libresolv being stuck in the 1980's mindset
On Sat, 18 Apr 2015 21:49:57 +0200, Philip Prindeville wrote:
> I recently opened a bug with glibc because persistent programs (like
> Thunderbird, etc) don't seem to handle roaming onto different networks
> very well.

dnf install bind-chroot, enable it, start it, then:

    echo "nameserver 127.0.0.1" >/etc/resolv.conf
    chattr +i /etc/resolv.conf

ISPs commonly have broken DNS servers, they hijack the resolving, etc.
I am aware it may not be friendly to the root nameservers, but it is the
only way to get reliable networking (OT: with openvpn to access the
public Internet).

Jan
Roaming, and libresolv being stuck in the 1980's mindset
I recently opened a bug with glibc because persistent programs (like
Thunderbird, etc) don't seem to handle roaming onto different networks
very well.

Or rather, they rely on libresolv which opens /etc/resolv.conf at
startup and then ignores changes to the file for the rest of the time
the process it is linked to is running.

This might have been fine for desktop tower computers in the 1980s
(though even then we had PPP and dynamic network settings), but we're
in the era of pervasive laptops with internet connections and your
settings are going to be volatile. Period.

It's naive of a vital library like libresolv to assume either (1) that
the process it's running in is going to be short-lived or (2) that the
network isn't dynamic and that nameserver settings (and the default
domain and domain search path, i.e. basically everything in
/etc/resolv.conf) aren't going to change.

(For those who've never looked too closely, /etc/resolv.conf gets
rewritten by ISC's dhclient when new DHCP settings are received that
include a domain name, a domain search path, or server addresses.)

Here's the bug I've opened:

https://sourceware.org/bugzilla/show_bug.cgi?id=18279

If you go back through the previous glibc bugs, you'll find:

https://sourceware.org/bugzilla/show_bug.cgi?id=984

from 2005 which was closed out as "RESOLVED, WONTFIX" with the text:

    There is a solution, already implemented.
    Use nscd and nscd -i hosts in the script that rewrites your resolv.conf
    (or nsswitch.conf etc.).

Problem with that is that no one seems to have gravitated towards this
solution, and I don't blame them. It adds an extra layer of complexity
and makes debugging issues that much more murky.

A simpler fix is to grab mtime from stat()ing _PATH_RESCONF each time
through res_query() and see if it's changed since the last time.
Perhaps caching the inode # also and checking that, since an older
version of the file might have been renamed as /etc/resolv.conf.

Or one could use inotify(), but that's a whole lot less portable.

One obvious pitfall is multi-threading: it's possible that res_init()
got called by the main thread before additional worker threads got
created, in which case calling res_init() again later becomes a little
more hairy.

I suspect that bug #18279 might meet the same fate as #984, but I hope
not: the network and the way that we use it have both evolved these
last 10 years, and let's hope that the way the glibc maintainers view
both has also changed.

Please add yourself to this bug and if the glibc folks try to argue for
closing it, perhaps someone out there has a more compelling argument
than I do.

Thanks,

-Philip
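For anyone who wants to see the shape of the stat()-based check described in this message, here is a minimal sketch. It is written in Python for brevity, not as glibc C, and it is not a proposed patch. The class ResolvConfWatcher and its methods are hypothetical names. Comparing the file size as well as the inode and mtime is an extra assumption on top of what the message suggests, and the lock is only one simple way to address the threading concern.

```python
import os
import threading


class ResolvConfWatcher:
    """Re-read a resolver config file when its inode, mtime or size changes."""

    def __init__(self, path="/etc/resolv.conf"):
        self.path = path
        self._lock = threading.Lock()  # worker threads may race on a reload
        self._stamp = None             # (inode, mtime_ns, size) last seen
        self.nameservers = []

    def _read_stamp(self):
        try:
            st = os.stat(self.path)
        except FileNotFoundError:
            return None
        # The inode catches a file renamed into place; mtime catches edits.
        return (st.st_ino, st.st_mtime_ns, st.st_size)

    def _parse(self):
        servers = []
        with open(self.path) as fh:
            for line in fh:
                fields = line.split()
                if len(fields) >= 2 and fields[0] == "nameserver":
                    servers.append(fields[1])
        return servers

    def reload_if_changed(self):
        """Call before each lookup; returns True if the file was re-parsed."""
        stamp = self._read_stamp()
        with self._lock:
            if stamp == self._stamp:
                return False
            self._stamp = stamp
            self.nameservers = self._parse() if stamp is not None else []
            return True
```

A caller would invoke reload_if_changed() at the top of every lookup, which costs one stat() per query. The sketch does not handle the file being replaced between the stat() and the read; a real implementation would need to consider that window.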