Re: glibc's getaddrinfo() sort order

2007-09-07 Thread Kurt Roeckx
On Fri, Sep 07, 2007 at 06:54:21PM +1000, Anthony Towns wrote:
> OTOH, getaddrinfo is meant to give a "close" answer, and doing prefix
> matching on NATed addresses isn't the Right Thing. For IPv6, that's fine
> because it's handled by earlier scoping rules. For NATed IPv4 though the
> prefix we should be using is whatever the host is going to be NATed *to*.
> And that would imply that the Right Thing would be to have an option
> more like:
> 
>   pretend-that 10/8 is-really 1.2.3.4/32
> 
> That doesn't seem likely to work though because it requires extra
> manual configuration, which won't happen.
> 
> Giving up on actually getting getaddrinfo to give "close" answers for
> NATed boxes leaves the option of trying to avoid getaddrinfo going out
> of its way to give "far" answers instead, which would mean turning off
> prefix-matching for NATed boxes; which could be done by ignoring rule
> 9 by default for private IPv4 addresses.

The problem with IPv4 is not only about NAT.  NAT just happens to show
the problem better.

With the IPv6 allocation policies, it's likely that the more high-order
bits match, the closer the address is network-wise.  That is rather
unlikely in the IPv4 case, especially if you go above /16.


Kurt


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]



Re: glibc's getaddrinfo() sort order

2007-09-07 Thread Ian Jackson
Kurt Roeckx writes ("Re: glibc's getaddrinfo() sort order"):
> It's at least in the spirit of the RFC to prefer one that's on the local
> network.  It might be the intention of rule 9, but then rule 9 isn't
> very well written.

I agree that applying RFC3484 section 6 rule 9 to IPv4 addresses is a
mistake and that therefore we should change the default in Debian
accordingly.  I would encourage Kurt to take this matter up with the
relevant IETF working group.


Others have already written about problems involving NAT.  I agree
with this argument (although I don't approve of NAT and it galls me to
use some braindamage involving NAT as an argument for anything).

However there is another argument I would like to make:

A host using getaddrinfo configured to apply rule 9 to IPv4 addresses
will behave quite differently to a host using gethostbyname.  I think
that this change in behaviour is unwarranted.  Whether an application
uses gethostbyname or getaddrinfo is an implementation detail (related
closely to whether that particular application's source code has been
modified to try to support IPv6) and this should not change the
behaviour.

Presently when connecting to a service offering only IPv4 addresses,
most hosts will use gethostbyname and use the addresses offered in
round-robin DNS order.  That is to say, the meaning (pre-RFC3484, and
current de-facto) of a DNS RRset containing several IP addresses is
that the addresses should be tried `uniformly at random' by callers,
as done by the nameserver round-robin RRset rotation algorithm.

RFC3484 section 6 rule 9 applied to IPv4 appears to be an attempt to
change that meaning.  This interpretation of rule 9 for IPv4 as an
attempt to change the meaning of existing deployed DNS RRsets is
supported by the fact that proponents of rule 9 for IPv4 claim that it
will fix existing problems, as in
http://udrepper.livejournal.com/16116.html.

However, it is obviously wrongheaded to attempt to change the defined
meaning of all existing multi-record A RRsets.  On the existing
Internet, zone administrators use multi-record A RRsets in the
knowledge that those RRsets will be used by callers in an
evenly-distributed round-robin fashion as currently implemented by
bind and gethostbyname.

This meaning for multiple A records had been established for well over
a decade by the time RFC3484 was written and in the intervening years it
has continued to be dominant.  New systems, and systems newly modified
to support IPv6, should continue to interpret existing A RRsets in the
same way as before.

A few cursory web searches show that this new behaviour of getaddrinfo
is indeed causing trouble as applications are converted to IPv6 and
the change in behaviour with IPv4 is found to be undesirable.


Finally, I would like to preemptively address the line "but this is an
RFC and we must do what it says".  There are two responses:

The most obvious one is that RFC3484 is merely Proposed Standard.  At
this stage of the standardisation process one can expect to find
errors, mistaken deviations from existing practice, and so on.
(The IETF standardisation process has been broken so that documents
often get stuck in this state; but that doesn't mean that we should
treat draft documents as if they were gospel, let alone documents that
aren't even drafts.)

The second is a more general point: if a standards document tells us
to do something which is wrong, then we should not do it.  Obviously
we should think fairly hard before making the decision to go against a
standard, but our job is to do the right thing and standards documents
are there to help us, not to constrain us.  I think my argument above
about the existing meaning of multiple A records is irrefutable.


> I already suggested that maybe rule 9 should be limited to the common
> prefix length of the netmask you're using.  Another option is that you
> extend rule 2 to have the same behaviour with IPv4, and that 10/8,
> 172.16/12 and 192.168/16 should be considered organization-local.

Replacing rule 9 with something more limited based on local network
interfaces (ie, prefer what appear to be locally-attached addresses)
would be fine.  Or a default based on routing metrics would be fine
too.  (Although I think these may be too much work to do in
getaddrinfo.)

The problem occurs when we start ranking IPv4 addresses of foreign
systems about whose topology we have no special knowledge.

Ranking RFC1918 addresses ahead of others is not entirely a safe thing
to do because people sometimes foolishly publish RFC1918 addresses for
public services and expect callers to skip those addresses somehow.
But at least it wouldn't break people who weren't already doing wrong
things.


Ian.





Re: glibc's getaddrinfo() sort order

2007-09-07 Thread Anthony Towns
On Fri, Sep 07, 2007 at 01:06:06AM +0200, Kurt Roeckx wrote:
> It's at least in the spirit of the RFC to prefer one that's on the local
> network.  It might be the intention of rule 9, but then rule 9 isn't
> very well written.

Rule 9 seems perfectly well written, it just does something you
(reasonably) consider undesirable.

The RFC says:

]   Rule 9:  Use longest matching prefix.
]   When DA and DB belong to the same address family (both are IPv6 or
]   both are IPv4): If CommonPrefixLen(DA, Source(DA)) >
]   CommonPrefixLen(DB, Source(DB)), then prefer DA.  Similarly, if
]   CommonPrefixLen(DA, Source(DA)) < CommonPrefixLen(DB, Source(DB)),
]   then prefer DB.
]
]   Rule 10:  Otherwise, leave the order unchanged.
]   If DA preceded DB in the original list, prefer DA.  Otherwise prefer
]   DB.
]
]   Rules 9 and 10 may be superseded if the implementation has other
]   means of sorting destination addresses.  For example, if the
]   implementation somehow knows which destination addresses will result
]   in the "best" communications performance.

"The admin says that rule 9 isn't appropriate" seems to fit "somehow
knows which destination address will result in the "best" communications
performance", so afaict, the description in the new gai.conf,

# sortv4  
#If set to no, getaddrinfo(3) will ignore IPv4 adresses in rule 9.  See
#section 6 in RFC 3484.  The default is yes.  Setting this option to 
#no breaks conformance to RFC 3484.

is incorrect, in that the implementation is still in conformance
with the RFC.

In addition, I think there are two different aspects here: the first is
"should getaddrinfo() return results in random order to aid in load
distribution?" and the second is "is prefix matching a reasonable way
to determine a good host to use?"

AFAICS, the answer to the first question is simply "no, it shouldn't" --
randomised load balancing like that needs to be done at the application
level, or by giving different sets of IPs in response to DNS queries by
different hosts, such as using BGP or similar. As far as pool.ntp.org
is concerned, that looks like the end of the story, afaics: ntp can't
rely on getaddrinfo to give a suitably random answer.

OTOH, getaddrinfo is meant to give a "close" answer, and doing prefix
matching on NATed addresses isn't the Right Thing. For IPv6, that's fine
because it's handled by earlier scoping rules. For NATed IPv4 though the
prefix we should be using is whatever the host is going to be NATed *to*.
And that would imply that the Right Thing would be to have an option
more like:

  pretend-that 10/8 is-really 1.2.3.4/32

That doesn't seem likely to work though because it requires extra
manual configuration, which won't happen.

Giving up on actually getting getaddrinfo to give "close" answers for
NATed boxes leaves the option of trying to avoid getaddrinfo going out
of its way to give "far" answers instead, which would mean turning off
prefix-matching for NATed boxes; which could be done by ignoring rule
9 by default for private IPv4 addresses.

Actually, it might also be reasonable to ignore rule 9 if

   scope(DA) > scope(source(DA)) and scope(DB) > scope(source(DB))

which seems reasonably equivalent to "DA and DB are only reachable through
a NAT" for both IPv4 and IPv6. The corner case is if the destination
is in a DMZ and can access both the Internet and local boxes directly,
but I don't think you can get the right answer for that atm anyway.

Doing it by changing Rule 9 to:

   Rule 9:  Use longest matching prefix.
   When DA and DB belong to the same address family (both are IPv6 or
   both are IPv4): If xCommonPrefixLen(DA, Source(DA)) >
   xCommonPrefixLen(DB, Source(DB)), then prefer DA.  Similarly, if
   xCommonPrefixLen(DA, Source(DA)) < xCommonPrefixLen(DB, Source(DB)),
   then prefer DB.

   If scope(X) > scope(Y) then
      xCommonPrefixLen(X,Y) = 0
   Else:
      xCommonPrefixLen(X,Y) = CommonPrefixLen(X,Y)

would give reasonable behaviour, I think (preferring addresses that can
be reached without NAT first, then leaving addresses that require NAT
in the order received).
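
The modified rule above can be sketched in a few lines of Python. This
is only an illustration, not glibc code: it handles IPv4 only, works on
plain 32-bit addresses rather than the IPv4-mapped form the RFC uses
(which shifts every length by the same constant, so comparisons are
unaffected), and the helper names scope, common_prefix_len and
x_common_prefix_len are hypothetical. The scope assignment follows
section 3.2 of the RFC; only the ordering of the scope values matters.

```python
import ipaddress

# Scope values; only their relative order is used in this sketch.
LINK_LOCAL, SITE_LOCAL, GLOBAL = 0x2, 0x5, 0xE

_LINK_NETS = [ipaddress.ip_network(n) for n in ("169.254.0.0/16", "127.0.0.0/8")]
_SITE_NETS = [ipaddress.ip_network(n)
              for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def scope(addr):
    """Scope of an IPv4 address as assigned by RFC 3484 section 3.2."""
    a = ipaddress.IPv4Address(addr)
    if any(a in net for net in _LINK_NETS):
        return LINK_LOCAL
    if any(a in net for net in _SITE_NETS):
        return SITE_LOCAL
    return GLOBAL


def common_prefix_len(x, y):
    """Number of leading bits shared by two IPv4 addresses (0..32)."""
    diff = int(ipaddress.IPv4Address(x)) ^ int(ipaddress.IPv4Address(y))
    return 32 - diff.bit_length()


def x_common_prefix_len(x, y):
    """Proposed variant: X is the destination, Y its source address.

    If the destination has a wider scope than the source (for example a
    global destination reached from an RFC 1918 source), the prefix
    comparison carries no information and is treated as 0.
    """
    if scope(x) > scope(y):
        return 0
    return common_prefix_len(x, y)
```

With a source of 10.0.0.5, a global destination such as 11.0.0.1 shares
7 leading bits with the source but gets an xCommonPrefixLen of 0, while
a site-local destination such as 10.1.2.3 keeps its real value of 15.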

In essence, the problem is that comparing prefixes of real addresses
against addresses that will be NATed is not adding information, and is
possibly losing information -- eg, if your site DNS already orders A
addresses by prefix matching on your actual IP range.

> I already suggested that maybe rule 9 should be limited to the common
> prefix length of the netmask you're using.  Another option is that you
> extend rule 2 to have the same behaviour with IPv4, and that 10/8,
> 172.16/12 and 192.168/16 should be considered organization-local.

Those are specified as having site-local scope in section 3.2; but Rule 2
only comes into play if one of the IPs returned by the nameserver is also
site-local anyway, which isn't particularly useful.

Cheers,
aj





r2535 - in glibc-package/trunk/debian: . sysdeps

2007-09-07 Thread aurel32
Author: aurel32
Date: 2007-09-07 08:22:29 +0000 (Fri, 07 Sep 2007)
New Revision: 2535

Modified:
   glibc-package/trunk/debian/changelog
   glibc-package/trunk/debian/sysdeps/amd64.mk
Log:
  * sysdeps/amd64.mk: uses x86_64 headers also for the i486 flavour now
    that they are compatible.



Modified: glibc-package/trunk/debian/changelog
===================================================================
--- glibc-package/trunk/debian/changelog	2007-09-05 07:28:37 UTC (rev 2534)
+++ glibc-package/trunk/debian/changelog	2007-09-07 08:22:29 UTC (rev 2535)
@@ -1,3 +1,10 @@
+glibc (2.6.1-3) unstable; urgency=low
+
+  * sysdeps/amd64.mk: uses x86_64 headers also for the i486 flavour now
+    that they are compatible.
+
+ -- Aurelien Jarno <[EMAIL PROTECTED]>  Fri,  7 Sep 2007 10:21:30 +0200
+
 glibc (2.6.1-2) unstable; urgency=low
 
   [ Samuel Thibault ]

Modified: glibc-package/trunk/debian/sysdeps/amd64.mk
===================================================================
--- glibc-package/trunk/debian/sysdeps/amd64.mk	2007-09-05 07:28:37 UTC (rev 2534)
+++ glibc-package/trunk/debian/sysdeps/amd64.mk	2007-09-07 08:22:29 UTC (rev 2535)
@@ -24,9 +24,15 @@
 i386_libdir = /emul/ia32-linux/usr/lib
 
 define libc6-dev-i386_extra_pkg_install
-mkdir -p debian/libc6-dev-i386/usr/include
-cp -af debian/tmp-i386/usr/include/i486-linux-gnu \
-   debian/libc6-dev-i386/usr/include
+mkdir -p debian/libc6-dev-i386/usr/include/gnu
+cp -af debian/tmp-i386/usr/include/i486-linux-gnu/gnu/stubs-32.h \
+   debian/libc6-dev-i386/usr/include/gnu
+mkdir -p debian/libc6-dev-i386/usr/include/sys
+cp -af debian/tmp-i386/usr/include/i486-linux-gnu/sys/elf.h \
+   debian/libc6-dev-i386/usr/include/sys
+cp -af debian/tmp-i386/usr/include/i486-linux-gnu/sys/vm86.h \
+   debian/libc6-dev-i386/usr/include/sys
+ln -sf . debian/libc6-dev-i386/usr/include/i486-linux-gnu
 endef
 
 define libc6-i386_extra_pkg_install





Re: glibc's getaddrinfo() sort order

2007-09-07 Thread Pierre Habouzit
On ven, sep 07, 2007 at 07:45:52 +, Pierre Habouzit wrote:
> On ven, sep 07, 2007 at 07:15:42 +, Pierre Habouzit wrote:
> > On Thu, Sep 06, 2007 at 11:46:54PM +, Joey Hess wrote:
> > > Pierre Habouzit wrote:
> >   Also note that probably many, many Windows machines work that way (the
> > RFC was written by a MS guy). And this behaviour impacts software
> > developers, and people who hoped that having multiple A records for
> > their service would give a perfect round robin will be stuck anyway. I
> > mean, it's not backward-compatible with previous practice and one can
> > reasonably argue it sucks. But hel-llooo! This kind of "design" choice is
> > not only local. If everyone (or the majority) on the internet behaves
> > like this, fixing this "bug" (if it really is one) in Debian will _not_,
> > I say _not_, save us from fixing the many programs that rely on DNS
> > round robin, because OTHER PARTIES will use the RFC-foo algorithm, and WE
> > will have to cope with that whatever choice is made.
> 
>   On that matter, according to Aurélien, Vista (maybe XP) and
> {Open,Net,Free}BSD follow the RFC. Other OSes could be tested (Mac OS X
> and Solaris come to mind). So it's kind of a decision of Debian vs. the
> rest of the world. And while I don't really care about the outcome of
> the decision technically, this aspect worries me.

  One more technical point: here is the excerpt from the RFC on the
offending rule:
   Rule 9:  Use longest matching prefix.
   When DA and DB belong to the same address family (both are IPv6 or
   both are IPv4): If CommonPrefixLen(DA, Source(DA)) >
   CommonPrefixLen(DB, Source(DB)), then prefer DA.  Similarly, if
   CommonPrefixLen(DA, Source(DA)) < CommonPrefixLen(DB, Source(DB)),
   then prefer DB.

  What it means is that for IPs with the same common prefix length, the
order of the addresses is unchanged wrt how they came up in the DNS answer.
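
  A minimal Python sketch of that reading, treating rule 9 as a sort key
and rule 10 as the stability of the sort. It is not glibc's code: it
assumes IPv4 only, a single source address for every destination (the
RFC allows Source(D) to differ per destination), and the names
common_prefix_len and sort_rule9 are made up for the example. The
addresses come from the documentation ranges.

```python
import ipaddress


def common_prefix_len(a, b):
    """Number of leading bits shared by two IPv4 addresses (0..32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def sort_rule9(destinations, source):
    """Order destinations by longest matching prefix with the source.

    sorted() is stable, so destinations with equal prefix lengths keep
    the order they had in the DNS answer, which is what rule 10 asks for.
    """
    return sorted(destinations, key=lambda d: -common_prefix_len(d, source))


dns_answer = ["203.0.113.7", "192.0.2.129", "198.51.100.1", "192.0.2.200"]
ordered = sort_rule9(dns_answer, "192.0.2.10")
print(ordered)
```

  Here 192.0.2.129 and 192.0.2.200 both share 24 bits with the source, so
they move to the front but stay in their DNS order relative to each
other; any rotation done by the nameserver survives within that group
only.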

  What it means is that when I use apt to fetch from ftp.debian.org
from my home ISP (proxad), it picks the mirror that proxad hosts
(ftp.fr.d.o). When I go to my parents, who use wanadoo (now Orange), it
picks the Oleane one (ftp.fr2.d.o), which indeed is nearer. It makes
complete sense.

  And as per the common-prefix rule, on a local network, RR can still
be assumed on a given VLAN. It actually makes quite some sense to me.

  Maybe that's why Joey Hess saw variability: the RFC does not specify a
*full* ordering, it just aims to restrict the RR to the "nearest"
servers to the client.


  Of course, ISP IPs usually have a first octet smaller than 128, so if
you host a service with RR on a network with a first octet of 128 or
greater and a mirror on an IP with a first octet smaller than 128, a
client of your service at such an ISP will never choose the former
because of this rule. This is an RFC that favors people with large
mirroring networks for their service, and hinders people with small
mirroring networks because they have to choose the IPs for their servers
with care.
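
  The first-octet effect comes down to the most significant bit of the
address. A small Python check, with arbitrary example addresses and a
hypothetical helper name, assuming plain 32-bit IPv4 comparison:

```python
import ipaddress


def common_prefix_len(a, b):
    """Number of leading bits shared by two IPv4 addresses (0..32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


client = "88.1.2.3"          # first octet below 128: top bit is 0
rr_server = "195.10.20.30"   # first octet of 128 or more: top bit is 1
mirror = "12.34.56.78"       # first octet below 128: top bit is 0

# The top bits differ, so the round-robin server can never match at all.
rr_len = common_prefix_len(rr_server, client)
# The top bits agree, so the mirror always matches at least one bit.
mirror_len = common_prefix_len(mirror, client)
print(rr_len, mirror_len)
```

  Any destination in the other half of the address space scores 0, and any
destination in the client's half scores at least 1, so rule 9 always
puts the mirror first for this client, however the DNS answer was
ordered.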


  I think I've described everything important for the Ctte to rule on this,
so unless a question pops up, I'll let you rule in peace :)

-- 
·O·  Pierre Habouzit
··O  [EMAIL PROTECTED]
OOO  http://www.madism.org




Re: glibc's getaddrinfo() sort order

2007-09-07 Thread Pierre Habouzit
On ven, sep 07, 2007 at 07:15:42 +, Pierre Habouzit wrote:
> On Thu, Sep 06, 2007 at 11:46:54PM +, Joey Hess wrote:
> > Pierre Habouzit wrote:
>   Also note that probably many, many Windows machines work that way (the
> RFC was written by a MS guy). And this behaviour impacts software
> developers, and people who hoped that having multiple A records for
> their service would give a perfect round robin will be stuck anyway. I
> mean, it's not backward-compatible with previous practice and one can
> reasonably argue it sucks. But hel-llooo! This kind of "design" choice is
> not only local. If everyone (or the majority) on the internet behaves
> like this, fixing this "bug" (if it really is one) in Debian will _not_,
> I say _not_, save us from fixing the many programs that rely on DNS
> round robin, because OTHER PARTIES will use the RFC-foo algorithm, and WE
> will have to cope with that whatever choice is made.

  On that matter, according to Aurélien, Vista (maybe XP) and
{Open,Net,Free}BSD follow the RFC. Other OSes could be tested (Mac OS X
and Solaris come to mind). So it's kind of a decision of Debian vs. the
rest of the world. And while I don't really care about the outcome of
the decision technically, this aspect worries me.

-- 
·O·  Pierre Habouzit
··O  [EMAIL PROTECTED]
OOO  http://www.madism.org




Re: glibc's getaddrinfo() sort order

2007-09-07 Thread Pierre Habouzit
On Thu, Sep 06, 2007 at 11:46:54PM +, Joey Hess wrote:
> Pierre Habouzit wrote:
> >   The point is, there is an RFC, and we put a patch so that admins can
> > disable it using gai.conf.
> 
> "There is an RFC" is not always a good excuse for breaking existing systems.
> 
> "Admins can disable it" is not a good argument when one common class of
> the breakage is all the systems that _don't_ disable it hammering
> systems that have round-robins set up to distribute load. More
> generally, "we added an option so your bug is fixed" is a common
> fallacy.

  The point is: the option is here, I don't really care if the Ctte
decides to set true or false by default. My underlying point was just
that the switch is easy for us. So yeah, if we change the default
option, the "bug" is definitely fixed and is anything but a fallacy.

  OTOH there is no way upstream will change that (Ulrich refused the
patch with blatant aggressiveness), so every other distribution (Fedora
and Red Hat, probably many others) will work that way. So we can rule
whatever we want here (and I absolutely don't care about the outcome of
the decision, I was just giving some pointers to the "pros" as I
assumed that the cons were obvious to anyone), this will not change how
upstream glibc works, so many people (probably a majority?) will use
this new scheme anyway.

  And I also say that knowing Uli, (and knowing how deeply I care about
this issue ;p) I won't spend a minute trying to argue with Uli, I'm not
insane, and don't have the man-years to do that.

  Also note that probably many, many Windows machines work that way (the
RFC was written by a MS guy). And this behaviour impacts software
developers, and people who hoped that having multiple A records for
their service would give a perfect round robin will be stuck anyway. I
mean, it's not backward-compatible with previous practice and one can
reasonably argue it sucks. But hel-llooo! This kind of "design" choice is
not only local. If everyone (or the majority) on the internet behaves
like this, fixing this "bug" (if it really is one) in Debian will _not_,
I say _not_, save us from fixing the many programs that rely on DNS
round robin, because OTHER PARTIES will use the RFC-foo algorithm, and WE
will have to cope with that whatever choice is made.

> BTW, I'm seeing some programs that use getaddrinfo and still don't have
> the RFC 3484 sorting behavior. Is this controlled by the AI_ADDRCONFIG flag?

  TTBOMK it's a "bug" wrt intended behaviour as per upstream.

-- 
·O·  Pierre Habouzit
··O  [EMAIL PROTECTED]
OOO  http://www.madism.org

