Hi! 

(fixed the subject)

On Tue, 27 Jan 2009, Peter Alfredsen wrote:
> [Mike: This looks like your field of expertise]
> On Tue, 27 Jan 2009 16:47:50 +0100
> Tobias Klausmann <klaus...@gentoo.org> wrote:
> 
> > glibc 2.9 uses a different way to implement getaddrinfo() which
> > triggers a race condition in most (if not all) Netfilter
> > firewalls that use connection tracking. glibc does nothing wrong
> > per se, it just triggers the condition. (technical details here:
> > http://marc.info/?l=linux-netdev&m=123304473331445)
> [...]
> > I don't have any experience with glibc upstream but pestering
> > them about this out of the blue might only cause a flame war
> > between kernel and glibc folks. Thus, I'm asking you, my fellow
> > devs (and the glibc and kernel teams specifically), what you
> > think is the best idea/course of action.
> 
> The connection with IPv6 leads me to believe that this is
> http://bugs.gentoo.org/250468
> http://sourceware.org/bugzilla/show_bug.cgi?id=7060

I doubt it: sometimes the lookups work, because this race is very
timing-critical. When experimenting, I had to go below 50
microseconds between the two packets to actually trigger it, and
the forwarding machines were always multicore. Also, the packet
content is irrelevant: I implemented this myself, sending the
payloads with sendto(), and it didn't matter whether I sent the A
or the AAAA query first.
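
For illustration, here is a minimal sketch of that kind of test
sender. It is not the code I actually used; it is written in Python
for brevity, and the names build_query and send_pair, as well as the
transaction IDs, are made up for this example. It only shows the
idea: two DNS queries leave the same UDP socket (and thus the same
source port) back-to-back, in either order.

```python
import socket
import struct

QTYPE_A = 1      # IPv4 address record
QTYPE_AAAA = 28  # IPv6 address record


def build_query(name, qtype, txid):
    """Build a minimal DNS query packet for one name and record type."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE followed by QCLASS=IN (1).
    return header + qname + struct.pack("!HH", qtype, 1)


def send_pair(sock, server, name, first=QTYPE_A, second=QTYPE_AAAA):
    """Send two queries back-to-back from the same socket.

    Swap `first` and `second` to test the other ordering.
    (Hypothetical helper; the IDs 0x1001/0x1002 are arbitrary.)
    """
    p1 = build_query(name, first, 0x1001)
    p2 = build_query(name, second, 0x1002)
    sock.sendto(p1, server)
    sock.sendto(p2, server)
    return p1, p2
```

To use it, create a socket with socket.socket(socket.AF_INET,
socket.SOCK_DGRAM) and pass the resolver's (address, 53) tuple as
`server`. Whether an interpreted sender gets the two packets close
enough together to hit the race depends on the machine; a C version
using the same sendto() calls would be tighter.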

That said, without seeing a tcpdump from the people who hit the
error described in those two bugs, I cannot rule it out.

On the wire between the client and the firewall, this happens:

a) packet 1 is sent
b) packet 2 is sent
c) answer 1 is received
d) answer 2 is received

Sometimes d doesn't happen because b is lost in the firewall
along the way (where the race condition happens). 
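
On the client side, the lost packet shows up as a missing answer.
The following is a rough sketch of how one could check for that; it
assumes answers are matched to queries by the DNS transaction ID
(the first two bytes of the DNS header). The function name
missing_answers and the default timeout are made up for this
example.

```python
import socket
import struct
import time


def missing_answers(sock, expected_ids, timeout=2.0):
    """Return the set of DNS transaction IDs for which no answer arrived.

    `sock` is the UDP socket the queries were sent from.
    (Hypothetical helper, not taken from any existing tool.)
    """
    pending = set(expected_ids)
    deadline = time.monotonic() + timeout
    while pending:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        sock.settimeout(remaining)
        try:
            data, _addr = sock.recvfrom(4096)
        except socket.timeout:
            break
        if len(data) < 2:
            continue  # too short to carry a DNS header
        (txid,) = struct.unpack("!H", data[:2])
        pending.discard(txid)
    return pending
```

If the function returns an empty set, both answers came back; if it
returns the ID of the second query, that lookup hit the case where
the second packet was lost and answer 2 never arrived.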

> Mike has added a patch to Gentoo's patchset but hasn't bumped the
> revision yet. It does look spectacularly hacky, though :-)
> 
> Anyway, if this is your problem, it looks like upstream is already
> working on it and that we just need to *prod* Mike a bit to get a fix
> into the tarball.

The bug is in the kernel, not in glibc. The latter just triggers it
because the newer resolver has more aggressive timing. Note that
I think what the glibc guys did is a *good* idea. It just happens
to rub Netfilter the wrong way.

Regards,
Tobias

-- 
printk("Cool stuff's happening!\n")
        linux-2.4.3/fs/jffs/intrep.c
