On Tuesday 27 January 2009 11:59:46 Tobias Klausmann wrote:
> On Tue, 27 Jan 2009, Peter Alfredsen wrote:
> > On Tue, 27 Jan 2009 16:47:50 +0100 Tobias Klausmann wrote:
> > > glibc 2.9 uses a different way to implement getaddrinfo() which
> > > triggers a race condition in most (if not all) Netfilter
> > > firewalls that use connection tracking. glibc does nothing wrong
> > > per se, it just triggers the condition. (technical details here:
> > > http://marc.info/?l=linux-netdev&m=123304473331445)
> >
> > [...]
> >
> > > I don't have any experience with glibc upstream but pestering
> > > them about this out of the blue might only cause a flame war
> > > between kernel and glibc folks. Thus, I'm asking you, my fellow
> > > devs (and the glibc and kernel teams specifically), what you
> > > think is the best idea/course of action.
> >
> > The connection with IPv6 leads me to believe that this is
> > http://bugs.gentoo.org/250468
> > http://sourceware.org/bugzilla/show_bug.cgi?id=7060
>
> I doubt it: sometimes the lookups work, as this race is very
> timing-critical. When experimenting, I had to go below 50
> microseconds between the two packets to actually trigger it plus
> the forwarding machines always were multicore. Also, the content
> is irrelevant. I implemented this myself sending the payloads
> with sendto() and it didn't matter if I sent the A or AAAA query
> first.
>
> That said, without seeing a tcpdump from those people with the
> error described in those two bugs, I can not rule it out.
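
A minimal sketch of the kind of sendto() test described above, not the
original test program. It assumes both queries leave from one UDP socket
(hence one source port); the function names, transaction IDs and timeout
are hypothetical. Python may not reach the sub-50-microsecond spacing
mentioned above, so a C version may be needed to actually hit the race.

```python
import socket
import struct

QTYPE_A = 1      # IPv4 address record
QTYPE_AAAA = 28  # IPv6 address record


def build_query(name, qtype, txid):
    """Build a minimal DNS query packet with one question."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: length-prefixed labels, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE followed by QCLASS=IN
    return header + qname + struct.pack("!HH", qtype, 1)


def send_pair(server, name, timeout=2.0):
    """Send A and AAAA queries back to back; return the number of replies."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        first = build_query(name, QTYPE_A, 0x1001)
        second = build_query(name, QTYPE_AAAA, 0x1002)
        # Assumption: same socket, so both packets share one source port
        sock.sendto(first, (server, 53))
        sock.sendto(second, (server, 53))
        replies = 0
        for _ in range(2):
            try:
                sock.recvfrom(4096)
                replies += 1
            except socket.timeout:
                break
        return replies
    finally:
        sock.close()
```

Calling send_pair() repeatedly against a nameserver behind the firewall and
counting the runs that return fewer than 2 replies would give a rough idea
of how often the second packet goes missing.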

the referenced bug generally deals with broken nameservers that can't handle 
the type of requests that glibc sends out (the requests are correct according 
to the relevant standards/RFCs, but apparently many DNS servers out there 
mishandle them due to the IPv4/IPv6 combo).

the referenced thread seems to indicate even more strongly that the issue is 
in the netfilter code.

> On the wire between the client and the firewall, this happens:
>
> a packet 1 is sent
> b packet 2 is sent
> c answer 1 is received
> d answer 2 is received
>
> Sometimes d doesn't happen because b is lost in the firewall
> along the way (where the race condition happens).

does this affect actual userspace behavior? in other words, does this lead 
to lost lookups and errors from the resolver?
-mike
