Re: glibc's getaddrinfo() sort order
On Fri, Sep 07, 2007 at 06:54:21PM +1000, Anthony Towns wrote:
> OTOH, getaddrinfo is meant to give a "close" answer, and doing prefix
> matching on NATed addresses isn't the Right Thing. For IPv6, that's fine
> because it's handled by earlier scoping rules. For NATed IPv4 though the
> prefix we should be using is whatever the host is going to be NATed *to*.
> And that would imply that the Right Thing would be to have an option
> more like:
>
>     pretend-that 10/8 is-really 1.2.3.4/32
>
> That doesn't seem likely to work though because it requires extra
> manual configuration, which won't happen.
>
> Giving up on actually getting getaddrinfo to give "close" answers for
> NATed boxes leaves the option of trying to avoid getaddrinfo going out
> of its way to give "far" answers instead, which would mean turning off
> prefix-matching for NATed boxes; which could be done by ignoring rule
> 9 by default for private IPv4 addresses.

The problem with IPv4 is not only about NAT. NAT just happens to show the problem more clearly.

With the IPv6 allocation policies, it's likely that the more high-order bits match, the closer the destination is network-wise. That is rather unlikely in the IPv4 case, especially if you go above /16.

Kurt

--
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]
Re: glibc's getaddrinfo() sort order
Kurt Roeckx writes ("Re: glibc's getaddrinfo() sort order"):
> It's atleast in the spirit of the rfc to prefer one that's on the local
> network. It might be the intention of rule 9, but then rule 9 isn't
> very well written.

I agree that applying RFC3484 section 6 rule 9 to IPv4 addresses is a mistake and that therefore we should change the default in Debian accordingly. I would encourage Kurt to take this matter up with the relevant IETF working group.

Others have already written about problems involving NAT. I agree with this argument (although I don't approve of NAT and it galls me to use some braindamage involving NAT as an argument for anything).

However there is another argument I would like to make:

A host using getaddrinfo configured to apply rule 9 to IPv4 addresses will behave quite differently to a host using gethostbyname. I think that this change in behaviour is unwarranted. Whether an application uses gethostbyname or getaddrinfo is an implementation detail (related closely to whether that particular application's source code has been modified to try to support IPv6) and this should not change the behaviour.

Presently when connecting to a service offering only IPv4 addresses, most hosts will use gethostbyname and use the addresses offered in round-robin DNS order. That is to say, the meaning (pre-RFC3484, and current de-facto) of a DNS RRset containing several IP addresses is that the addresses should be tried `uniformly at random' by callers, as done by the nameserver round-robin RRset rotation algorithm.

RFC3484 section 6 rule 9 applied to IPv4 appears to be an attempt to change that meaning. This interpretation of rule 9 for IPv4 as an attempt to change the meaning of existing deployed DNS RRsets is supported by the fact that proponents of rule 9 for IPv4 claim that it will fix existing problems, as in http://udrepper.livejournal.com/16116.html.

However, it is obviously wrongheaded to attempt to change the defined meaning of all existing multi-record A RRsets. On the existing Internet, zone administrators use multi-record A RRsets in the knowledge that those RRsets will be used by callers in an evenly-distributed round-robin fashion as currently implemented by bind and gethostbyname.

This meaning for multiple A records had been established for well over a decade by the time RFC 3484 was written, and in the intervening years it has continued to be dominant. New systems, and systems newly modified to support IPv6, should continue to interpret existing A RRsets in the same way as before.

A few cursory web searches show that this new behaviour of getaddrinfo is indeed causing trouble as applications are converted to IPv6 and the change in behaviour with IPv4 is found to be undesirable.

Finally, I would like to preemptively address the line "but this is an RFC and we must do what it says". There are two responses:

The most obvious one is that RFC3484 is merely Proposed Standard. At this stage of the standardisation process one can expect to find errors, mistaken deviations from existing practice, and so on. (The IETF standardisation process has been broken so that documents often get stuck in this state; but that doesn't mean that we should treat draft documents as if they were gospel, let alone documents that aren't even drafts.)

The second is a more general point: if a standards document tells us to do something which is wrong, then we should not do it. Obviously we should think fairly hard before making the decision to go against a standard, but our job is to do the right thing, and standards documents are there to help us, not to constrain us. I think my argument above about the existing meaning of multiple A records is irrefutable.

> I already suggested that maybe rule 9 should be limited to the common
> prefix length of the netmask you're using. An other option is that you
> extend rule 2 to have the same behaviour with ipv4, and that 10/8,
> 172.16/12 and 192.168/16 should be considered organization-local.

Replacing rule 9 with something more limited based on local network interfaces (ie, prefer what appear to be locally-attached addresses) would be fine. Or a default based on routing metrics would be fine too. (Although I think these may be too much work to do in getaddrinfo.)

The problem occurs when we start ranking IPv4 addresses of foreign systems about whose topology we have no special knowledge.

Ranking RFC1918 addresses ahead of others is not entirely a safe thing to do because people sometimes foolishly publish RFC1918 addresses for public services and expect callers to skip those addresses somehow. But at least it wouldn't break people who weren't already doing wrong things.

Ian.
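As a rough illustration of the "prefer what appear to be locally-attached addresses" idea above, here is a minimal Python sketch. It is not glibc code: the function name `prefer_local` and the way the local networks are passed in are assumptions made for the example, and a real implementation would have to discover the interface networks itself.

```python
import ipaddress

def prefer_local(destinations, local_networks):
    """Move destinations on a locally attached network to the front.

    `local_networks` is a hypothetical list of CIDR strings describing
    the host's own interfaces. Everything else keeps the order in which
    the resolver returned it, because sorted() is stable.
    """
    nets = [ipaddress.ip_network(n) for n in local_networks]

    def is_remote(dest):
        addr = ipaddress.ip_address(dest)
        # False sorts before True, so local addresses come first.
        return not any(addr in net for net in nets)

    return sorted(destinations, key=is_remote)

if __name__ == "__main__":
    answer = ["203.0.113.7", "192.168.1.20", "198.51.100.9"]
    print(prefer_local(answer, ["192.168.1.0/24"]))
```

Because foreign addresses are never compared with each other, the round-robin order supplied by the nameserver survives for everything that is not locally attached.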
Re: glibc's getaddrinfo() sort order
On Fri, Sep 07, 2007 at 01:06:06AM +0200, Kurt Roeckx wrote:
> It's atleast in the spirit of the rfc to prefer one that's on the local
> network. It might be the intention of rule 9, but then rule 9 isn't
> very well written.

Rule 9 seems perfectly well written; it just does something you (reasonably) consider undesirable. The RFC says:

] Rule 9: Use longest matching prefix.
] When DA and DB belong to the same address family (both are IPv6 or
] both are IPv4): If CommonPrefixLen(DA, Source(DA)) >
] CommonPrefixLen(DB, Source(DB)), then prefer DA. Similarly, if
] CommonPrefixLen(DA, Source(DA)) < CommonPrefixLen(DB, Source(DB)),
] then prefer DB.
]
] Rule 10: Otherwise, leave the order unchanged.
] If DA preceded DB in the original list, prefer DA. Otherwise prefer
] DB.
]
] Rules 9 and 10 may be superseded if the implementation has other
] means of sorting destination addresses. For example, if the
] implementation somehow knows which destination addresses will result
] in the "best" communications performance.

"The admin says that rule 9 isn't appropriate" seems to fit "somehow knows which destination address will result in the "best" communications performance", so afaict, the description in the new gai.conf,

    # sortv4
    #If set to no, getaddrinfo(3) will ignore IPv4 adresses in rule 9. See
    #section 6 in RFC 3484. The default is yes. Setting this option to
    #no breaks conformance to RFC 3484.

is incorrect, in that the implementation is still in conformance with the RFC.

In addition, I think there are two different aspects here: the first is "should getaddrinfo() return results in random order to aid in load distribution?" and the second is "is prefix matching a reasonable way to determine a good host to use?"

AFAICS, the answer to the first question is simply "no, it shouldn't" -- randomised load balancing like that needs to be done at the application level, or by giving different sets of IPs in response to DNS queries by different hosts, such as using BGP or similar.

As far as pool.ntp.org is concerned, that looks like the end of the story, afaics: ntp can't rely on getaddrinfo to give a suitably random answer.

OTOH, getaddrinfo is meant to give a "close" answer, and doing prefix matching on NATed addresses isn't the Right Thing. For IPv6, that's fine because it's handled by earlier scoping rules. For NATed IPv4 though the prefix we should be using is whatever the host is going to be NATed *to*. And that would imply that the Right Thing would be to have an option more like:

    pretend-that 10/8 is-really 1.2.3.4/32

That doesn't seem likely to work though because it requires extra manual configuration, which won't happen.

Giving up on actually getting getaddrinfo to give "close" answers for NATed boxes leaves the option of trying to avoid getaddrinfo going out of its way to give "far" answers instead, which would mean turning off prefix-matching for NATed boxes; which could be done by ignoring rule 9 by default for private IPv4 addresses.

Actually, it might also be reasonable to ignore rule 9 if

    scope(DA) > scope(source(DA)) and scope(DB) > scope(source(DB))

which seems reasonably equivalent to "DA and DB are only reachable through a NAT" for both IPv4 and IPv6. The corner case is if the destination is in a DMZ and can access both the Internet and local boxes directly, but I don't think you can get the right answer for that atm anyway.

Doing it by changing Rule 9 to:

    Rule 9: Use longest matching prefix.
    When DA and DB belong to the same address family (both are IPv6 or
    both are IPv4): If xCommonPrefixLen(DA, Source(DA)) >
    xCommonPrefixLen(DB, Source(DB)), then prefer DA. Similarly, if
    xCommonPrefixLen(DA, Source(DA)) < xCommonPrefixLen(DB, Source(DB)),
    then prefer DB.

    If scope(X) > scope(Y) then
        xCommonPrefixLen(X,Y) = 0
    Else:
        xCommonPrefixLen(X,Y) = CommonPrefixLen(X,Y)

would give reasonable behaviour, I think (preferring addresses that can be reached without NAT first, then leaving addresses that require NAT in the order received).

In essence, the problem is that comparing prefixes of real addresses against addresses that will be NATed is not adding information, and is possibly losing information -- eg, if your site DNS already orders A addresses by prefix matching on your actual IP range.

> I already suggested that maybe rule 9 should be limited to the common
> prefix length of the netmask you're using. An other option is that you
> extend rule 2 to have the same behaviour with ipv4, and that 10/8,
> 172.16/12 and 192.168/16 should be considered organization-local.

Those are specified as having site-local scope in 3.2; but Rule 2 only comes into play if one of the IPs returned by the nameserver is also site-local anyway, which isn't particularly useful.

Cheers,
aj
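The xCommonPrefixLen variant proposed in the message above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not glibc's implementation: it handles IPv4 only, the function names are made up for the example, and the scope table follows my reading of RFC 3484 section 3.2 (private ranges site-local, 169.254/16 and 127/8 link-local, everything else global).

```python
import ipaddress

# Scope values as used for multicast scopes in RFC 3484 section 3.1.
LINK_LOCAL, SITE_LOCAL, GLOBAL = 0x2, 0x5, 0xE

_LINK_NETS = [ipaddress.ip_network(n) for n in ("169.254.0.0/16", "127.0.0.0/8")]
_SITE_NETS = [ipaddress.ip_network(n)
              for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def scope(addr):
    """Scope of an IPv4 address (assumed mapping, see lead-in)."""
    a = ipaddress.IPv4Address(addr)
    if any(a in n for n in _LINK_NETS):
        return LINK_LOCAL
    if any(a in n for n in _SITE_NETS):
        return SITE_LOCAL
    return GLOBAL

def common_prefix_len(x, y):
    """Number of leading bits two IPv4 addresses share (0..32)."""
    diff = int(ipaddress.IPv4Address(x)) ^ int(ipaddress.IPv4Address(y))
    return 32 - diff.bit_length()

def x_common_prefix_len(dest, source):
    """Prefix match that is ignored when dest has a wider scope than source."""
    if scope(dest) > scope(source):
        return 0  # e.g. global destination seen from a private (NATed) source
    return common_prefix_len(dest, source)

if __name__ == "__main__":
    print(x_common_prefix_len("8.8.8.8", "10.0.0.5"))   # global dest, private source
    print(x_common_prefix_len("10.0.0.9", "10.0.0.5"))  # both private
```

With a private source address, every global destination scores 0, so a stable sort on this value leaves such destinations in the order the nameserver returned them.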
r2535 - in glibc-package/trunk/debian: . sysdeps
Author: aurel32
Date: 2007-09-07 08:22:29 + (Fri, 07 Sep 2007)
New Revision: 2535

Modified:
   glibc-package/trunk/debian/changelog
   glibc-package/trunk/debian/sysdeps/amd64.mk
Log:
  * sysdeps/amd64.mk: uses x86_64 headers also for the i486 flavour now
    that they are compatible.

Modified: glibc-package/trunk/debian/changelog
===
--- glibc-package/trunk/debian/changelog	2007-09-05 07:28:37 UTC (rev 2534)
+++ glibc-package/trunk/debian/changelog	2007-09-07 08:22:29 UTC (rev 2535)
@@ -1,3 +1,10 @@
+glibc (2.6.1-3) unstable; urgency=low
+
+  * sysdeps/amd64.mk: uses x86_64 headers also for the i486 flavour now
+    that they are compatible.
+
+ -- Aurelien Jarno <[EMAIL PROTECTED]>  Fri,  7 Sep 2007 10:21:30 +0200
+
 glibc (2.6.1-2) unstable; urgency=low
 
   [ Samuel Thibault ]

Modified: glibc-package/trunk/debian/sysdeps/amd64.mk
===
--- glibc-package/trunk/debian/sysdeps/amd64.mk	2007-09-05 07:28:37 UTC (rev 2534)
+++ glibc-package/trunk/debian/sysdeps/amd64.mk	2007-09-07 08:22:29 UTC (rev 2535)
@@ -24,9 +24,15 @@
 i386_libdir = /emul/ia32-linux/usr/lib
 
 define libc6-dev-i386_extra_pkg_install
-mkdir -p debian/libc6-dev-i386/usr/include
-cp -af debian/tmp-i386/usr/include/i486-linux-gnu \
-	debian/libc6-dev-i386/usr/include
+mkdir -p debian/libc6-dev-i386/usr/include/gnu
+cp -af debian/tmp-i386/usr/include/i486-linux-gnu/gnu/stubs-32.h \
+	debian/libc6-dev-i386/usr/include/gnu
+mkdir -p debian/libc6-dev-i386/usr/include/sys
+cp -af debian/tmp-i386/usr/include/i486-linux-gnu/sys/elf.h \
+	debian/libc6-dev-i386/usr/include/sys
+cp -af debian/tmp-i386/usr/include/i486-linux-gnu/sys/vm86.h \
+	debian/libc6-dev-i386/usr/include/sys
+ln -sf . debian/libc6-dev-i386/usr/include/i486-linux-gnu
 endef
 
 define libc6-i386_extra_pkg_install
Re: glibc's getaddrinfo() sort order
On Fri, Sep 07, 2007 at 07:45:52 +, Pierre Habouzit wrote:
> On Fri, Sep 07, 2007 at 07:15:42 +, Pierre Habouzit wrote:
> > On Thu, Sep 06, 2007 at 11:46:54PM +, Joey Hess wrote:
> > > Pierre Habouzit wrote:
> > Also note that probably many many Windows machines work that way (the
> > RFC was written by a MS guy). And this behaviour impacts software
> > developpers, and people that hoped that having multiple A records for
> > their service will see a perfect round robin will be stuck anyways. I
> > mean, it's non previous-practice-backward-compliant and one can argue
> > reasonably it sucks. But hel-llooo ! this kind of "design" choice is not
> > only local. If every one (or the majority) on the internet behaves like
> > this, fixing this "bug" (if it is really one) in Debian will _not_, I
> > say _not_ prevent us from fixing many software that rely on DNS round
> > robin, because OTHER PARTIES will use the RFC-foo algorithm, and WE will
> > have to cope with that whatever choice is made.
>
> On that matter, according to Aurélien, Vista (maybe XP),
> {Open,Net,Free}BSD follow the RFC. Other OSes could be tested (MacOS X
> and solaris come to mind). So it's kind of a decision of Debian vs. the
> rest of the world. And if I don't really care about the issue of the
> decision technically, this aspect worries me.

Still one technical point. Here is the excerpt from the RFC on the offending rule:

    Rule 9: Use longest matching prefix.
    When DA and DB belong to the same address family (both are IPv6 or
    both are IPv4): If CommonPrefixLen(DA, Source(DA)) >
    CommonPrefixLen(DB, Source(DB)), then prefer DA. Similarly, if
    CommonPrefixLen(DA, Source(DA)) < CommonPrefixLen(DB, Source(DB)),
    then prefer DB.

What it means is that for IPs with the same common prefix length, the order of the addresses is unchanged with respect to how they came up in the DNS answer.

In practice, when I use apt to fetch from ftp.debian.org from my home ISP (proxad), it picks the mirror that proxad runs (ftp.fr.d.o). When I go to my parents, who use wanadoo (now Orange), it picks the Oleane one (ftp.fr2.d.o), which is indeed nearer. That makes complete sense. And under the common-prefix rule, on a local network, RR can still be assumed on a given VLAN. It actually makes quite some sense to me.

Maybe that's why Joey Hess saw variability: the RFC does not specify a *full* ordering, it just aims to restrict the RR to the servers "nearest" to the client.

Of course, ISP IPs usually have a first octet smaller than 128, so if you host a service with RR on a network whose first octet is 128 or greater and a mirror on an IP whose first octet is smaller than 128, a client of your service coming from such an ISP will never choose the former because of this rule.

This is an RFC that favors people with large mirroring networks for their service, and hinders people with small mirroring networks, because they have to choose the IPs for their servers with care.

I think I've described everything important for the Ctte to rule on this, so unless a question pops up, I'll let you rule in peace :)

--
·O·  Pierre Habouzit
··O  [EMAIL PROTECTED]
OOO  http://www.madism.org
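The first-octet effect described in the message above can be reproduced with a small Python sketch. It is a simplification, not glibc's algorithm: it assumes a single source address for all destinations (the RFC uses Source(D) per destination), covers IPv4 only, and the names `common_prefix_len` and `rule9_sort` as well as the sample addresses are made up for the example.

```python
import ipaddress

def common_prefix_len(a, b):
    """Number of leading bits two IPv4 addresses have in common (0..32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()

def rule9_sort(source, destinations):
    """Order destinations by longest common prefix with `source`.

    sorted() is stable, so destinations with equal prefix length keep
    their DNS order, which corresponds to rule 10.
    """
    return sorted(destinations,
                  key=lambda d: common_prefix_len(source, d),
                  reverse=True)

if __name__ == "__main__":
    # Client whose first octet is below 128 (top bit 0).
    client = "82.66.1.1"
    # Two servers with first octet >= 128, one with first octet < 128.
    answer = ["193.0.0.1", "80.1.2.3", "194.2.0.1"]
    for dest in rule9_sort(client, answer):
        print(dest, common_prefix_len(client, dest))
```

Any destination whose top bit differs from the client's shares a prefix of length 0, so it always sorts behind a destination that matches even a single leading bit, however the nameserver rotated the answer.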
Re: glibc's getaddrinfo() sort order
On Fri, Sep 07, 2007 at 07:15:42 +, Pierre Habouzit wrote:
> On Thu, Sep 06, 2007 at 11:46:54PM +, Joey Hess wrote:
> > Pierre Habouzit wrote:
> Also note that probably many many Windows machines work that way (the
> RFC was written by a MS guy). And this behaviour impacts software
> developpers, and people that hoped that having multiple A records for
> their service will see a perfect round robin will be stuck anyways. I
> mean, it's non previous-practice-backward-compliant and one can argue
> reasonably it sucks. But hel-llooo ! this kind of "design" choice is not
> only local. If every one (or the majority) on the internet behaves like
> this, fixing this "bug" (if it is really one) in Debian will _not_, I
> say _not_ prevent us from fixing many software that rely on DNS round
> robin, because OTHER PARTIES will use the RFC-foo algorithm, and WE will
> have to cope with that whatever choice is made.

On that matter, according to Aurélien, Vista (maybe XP) and {Open,Net,Free}BSD follow the RFC. Other OSes could be tested (MacOS X and Solaris come to mind). So it's kind of a decision of Debian vs. the rest of the world. And while I don't really care about the outcome of the decision technically, this aspect worries me.

--
·O·  Pierre Habouzit
··O  [EMAIL PROTECTED]
OOO  http://www.madism.org
Re: glibc's getaddrinfo() sort order
On Thu, Sep 06, 2007 at 11:46:54PM +, Joey Hess wrote:
> Pierre Habouzit wrote:
> > The point is, there is an RFC, and we put a patch so that admins can
> > disable it using gai.conf.
>
> "There is an RFC" is not always a good excuse for breaking existing systems.
>
> "Admins can disable it" is not a good argument when one common class of
> the breakage is all the systems that _don't_ disable it hammering
> systems that have round-robins set up to distribute load. More
> generally, "we added an option so your bug is fixed" is a common
> fallacy.

The point is: the option is there, and I don't really care whether the Ctte decides to set it to true or false by default. My underlying point was just that the switch is easy for us. So yes, if we change the default, the "bug" is definitely fixed, and that is anything but a fallacy.

OTOH there is no way upstream will change that (Ulrich refused the patch with blatant aggressiveness), so every other distribution (Fedora and RedHat, probably many others) will work that way. So we can rule whatever we want here (and I absolutely don't care about the outcome of the decision; I was just giving some pointers to the pros, as I assumed that the cons were obvious to anyone), but this will not change how upstream glibc works, so many people (probably a majority?) will use this new scheme anyway.

And I'll also say that, knowing Uli (and knowing how deeply I care about this issue ;p), I won't spend a minute trying to argue with him. I'm not insane, and I don't have the man-years to do that.

Also note that probably a great many Windows machines work that way (the RFC was written by an MS guy). This behaviour impacts software developers, and people who hoped that having multiple A records for their service would give them a perfect round robin will be stuck anyway. I mean, it's not backward compatible with previous practice, and one can reasonably argue that it sucks. But hel-llooo! This kind of "design" choice is not only local. If everyone (or the majority) on the internet behaves like this, fixing this "bug" (if it really is one) in Debian will _not_, I say _not_, spare us from having to fix the many programs that rely on DNS round robin, because OTHER PARTIES will use the RFC-foo algorithm, and WE will have to cope with that whatever choice is made.

> BTW, I'm seeing some programs that use getaddrinfo and still don't have
> the RFC 3484 sorting behavior. Is this controlled by the AI_ADDRCONFIG flag?

TTBOMK it's a "bug" wrt intended behaviour as per upstream.

--
·O·  Pierre Habouzit
··O  [EMAIL PROTECTED]
OOO  http://www.madism.org