Re: getaddrinfo() return value chaos

2013-07-07 Thread Kurt Roeckx
On Sun, Jul 07, 2013 at 02:30:33PM +0200, Thomas Hood wrote:
> Continuing on from the "boot ordering and resolvconf" thread;
> cc:ed to Helmut in case this gets filtered again; bcc:ed to
> 683...@bugs.debian.org since this is relevant for how that
> issue is addressed...

A related bug is #582916

Thanks for looking into this.


Kurt


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20130707124139.ga29...@roeckx.be



Re: getaddrinfo() return value chaos

2013-07-07 Thread Thomas Hood
On Sun, Jul 7, 2013 at 2:41 PM, Kurt Roeckx wrote:

> A related bug is #582916
>

Thanks.  Besides that, #671789 and #713799 are probably also related.

The latter was closed by the 2.17-7 release, which is the most recent in
unstable; I should be testing with that version rather than with the wheezy
version, 2.13-38 (not 2.13-37 as I incorrectly claimed in my previous
message, sigh).
-- 
Thomas


Re: getaddrinfo() return value chaos

2013-07-07 Thread Russ Allbery
Thomas Hood writes:

> Executive summary: getaddrinfo() returns different values depending
> on the OS and on nsswitch.conf settings, making it very difficult to use
> getaddrinfo() return values to decide how to handle an error.

As you note, I believe glibc upstream is actively working on fixing this.
I've seen quite a few patches flow by that fix the return status in
various system error conditions, and the effort struck me as comprehensive
and well thought out rather than ad hoc.

-- 
Russ Allbery (r...@debian.org)   


-- 
Archive: http://lists.debian.org/87mwpynis4@windlord.stanford.edu



Re: getaddrinfo() return value chaos

2013-07-07 Thread Helmut Grohne
On Sun, Jul 07, 2013 at 02:30:33PM +0200, Thomas Hood wrote:
> Executive summary: getaddrinfo() returns different values
> depending on the OS and on nsswitch.conf settings, making it
> very difficult to use getaddrinfo() return values to decide how
> to handle an error.

Thanks for not giving up here. I attempted to reproduce your results
with your attached test program. Here are my results:


Making resolv.conf empty
Results of looking up www.google.com: status = -11, errno = 111
Results of looking up a bogus name: status = -11, errno = 111
Writing nameserver option to resolv.conf
Results of looking up www.google.com: status = 0, errno = 101
Results of looking up a bogus name: status = -2, errno = 101
Making resolv.conf empty
Results of looking up www.google.com: status = -11, errno = 111
Results of looking up a bogus name: status = -11, errno = 111
Writing incorrect nameserver option to resolv.conf
Results of looking up www.google.com: status = -11, errno = 110
Results of looking up a bogus name: status = -11, errno = 110


This is an almost-sid box with libc6 2.17-5. The results are consistent
with my earlier observations.
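
For readers without the attachment, output lines of the shape shown above could be produced by something like the following. This is only a minimal sketch, not the test program from the original message: the names `lookup` and `report` are hypothetical, and the resolv.conf rewriting steps are left out.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <errno.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not the attached test program): resolve `name`
 * and hand back getaddrinfo()'s status plus the errno value that the
 * call left behind. */
int lookup(const char *name, int flags, int *saved_errno)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    int status;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = flags;

    errno = 0;                  /* avoid reporting a stale errno */
    status = getaddrinfo(name, NULL, &hints, &res);
    *saved_errno = errno;       /* capture before any other libc call */

    if (status == 0)
        freeaddrinfo(res);
    return status;
}

/* Print one line in the same shape as the results quoted above. */
void report(const char *label, const char *name)
{
    int saved_errno;
    int status = lookup(name, 0, &saved_errno);

    printf("Results of looking up %s: status = %d, errno = %d\n",
           label, status, saved_errno);
}
```

Calling `report("www.google.com", "www.google.com")` after each change to resolv.conf would give one line per case; the numbers printed depend on the libc version and nsswitch.conf, which is the point of this thread.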

> Helmut got different results. Is the difference between my machine
> and Helmut's machine attributable to some diff in nsswitch.conf,
> perhaps?  I have:
> 
> hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4

My system above has another variant:

  hosts: files gw_name dns

> This is different in two ways. First the status is -11 (EAI_SYSTEM)
> instead of -2 (EAI_NONAME) when no nameserver can be reached.
> Second, there is now a difference between the empty-resolv.conf
> case and the resolv.conf-with-bogus-address case. In the latter
> case errno is 110 (ETIMEDOUT) instead of 111 (ECONNREFUSED).
> This is better.

Indeed. So maybe mdns is to blame here for part of the trouble? Can you
verify that the last mdns4 entry really accounts for the difference?

> I don't get the impression that the handling of return values
> by the various eglibc layers has been well thought out and
> documented; the developers seem to be making changes ad-hoc.

Thanks for the extensive research here. I concur with your observation.

> In any case, because of all these differences and changes we
> won't have a good, stable getaddrinfo() interface to program
> against until Jessie.  In the meantime a program that needs to
> distinguish between different causes for a name resolution
> failure will have to do more than just check the status and
> errno from getaddrinfo().

In particular one of the more recent bugs set out to return EAI_NONAME
when the network is unavailable. This may be a condition where a lookup
needs to be retried though.

Helmut


-- 
Archive: http://lists.debian.org/20130708062327.ga7...@alf.mars



Re: getaddrinfo() return value chaos

2013-07-08 Thread Thomas Hood
It looked to me as if #582916 and its rough duplicate #671789 could have been
fixed in libc6 2.17-7, which includes two commits


http://sourceware.org/git/?p=glibc.git;a=commitdiff;h=cfde9b463d63092ff0908d4c2748ace648e2ead8

http://sourceware.org/git/?p=glibc.git;a=commitdiff;h=3d04f5db20c8f0d1ba3881b5f5373586a18cf188

the first of which is included in eglibc 2.17


http://www.eglibc.org/cgi-bin/viewvc.cgi/branches/eglibc-2_17/libc/NEWS?view=markup

and the second of which is included as a patch named
'cvs-getaddrinfo-EAI_NONAME.diff' in 2.17-7.

So I upgraded libc6 on my Debian 7.0 machine.

With the standard nsswitch.conf there is no change in the behavior of my
test program.  As before, whether resolv.conf is empty or bogus, or the
domain name is bogus, getaddrinfo() returns -2 with errno 2 (which is
supposedly not significant, since the status is not EAI_SYSTEM).

With nsswitch.conf changed to have simply "hosts: dns", the following is
the output of the test program.


Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 111
Results of looking up a bogus name: status = -2, errno = 111
Writing nameserver option to resolv.conf
Results of looking up www.google.com: status = 0, errno = 101
Results of looking up a bogus name: status = -2, errno = 101
Making resolv.conf empty
Results of looking up www.google.com: status = -2, errno = 111
Results of looking up a bogus name: status = -2, errno = 111
Writing incorrect nameserver option to resolv.conf
Results of looking up www.google.com: status = -2, errno = 110
Results of looking up a bogus name: status = -2, errno = 110


This is different from both Debian 7.0 and Ubuntu 13.04 and sort of half
way between the two. As in Debian 7.0 the status is still always -2 in case
of error. As in Ubuntu 13.04 errno is 110 when an incorrect nameserver
address is given, as opposed to 111 when resolv.conf is empty (but errno is
not supposed to be significant here because status is -2).
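
The remark about errno not being significant follows from POSIX, which gives errno a defined meaning only when getaddrinfo() returns EAI_SYSTEM. A caller that wants to respect this could filter the value as in the sketch below; the function name `significant_errno` is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <errno.h>
#include <netdb.h>

/* Hypothetical helper: return the errno value a caller may act on,
 * or 0 when the status gives errno no defined meaning.  Only
 * EAI_SYSTEM says "look at errno"; for every other status the errno
 * value is whatever some internal call happened to leave behind. */
int significant_errno(int status, int saved_errno)
{
    return status == EAI_SYSTEM ? saved_errno : 0;
}
```

With this filter, the 110 versus 111 distinction seen in the run above would be discarded, because the status there is -2 (EAI_NONAME) rather than -11 (EAI_SYSTEM).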

Callers of getaddrinfo() will only be able to rely on these return values
once the values have stabilized in eglibc (which may not yet have happened)
and once the bug (assuming it's a bug and not a feature) whereby the wrong
errno is returned with the standard nsswitch.conf has been fixed.

Interested parties might want to enter into discussion with upstream in
order to ensure that there is a clear specification of what these return
values should be under different circumstances. Ideally tests would be
added which check whether the specification has been adhered to.
-- 
Thomas


Re: getaddrinfo() return value chaos

2013-07-08 Thread Thomas Hood
On Mon, Jul 8, 2013 at 8:23 AM, Helmut Grohne wrote:

> Here are my results:
>
> 
> Making resolv.conf empty
> Results of looking up www.google.com: status = -11, errno = 111
> Results of looking up a bogus name: status = -11, errno = 111
> Writing nameserver option to resolv.conf
> Results of looking up www.google.com: status = 0, errno = 101
> Results of looking up a bogus name: status = -2, errno = 101
> Making resolv.conf empty
> Results of looking up www.google.com: status = -11, errno = 111
> Results of looking up a bogus name: status = -11, errno = 111
> Writing incorrect nameserver option to resolv.conf
> Results of looking up www.google.com: status = -11, errno = 110
> Results of looking up a bogus name: status = -11, errno = 110
> 
>
> This is an almost-sid box with libc6 2.17-5. The results are consistent
> with my earlier observations.
>
> My system above has [...]:
>
>   hosts: files gw_name dns
> [...] So maybe mdns is to blame here for part of the trouble? Can you
> verify that the last mdns4 entry really accounts for the difference?
>


Just checked on Debian 7.0 with libc6 upgraded to version 2.17-7 from sid.

Behavior is exactly the same as yours, quoted above, except with
"status = -2" where you have "status = -11".

Same behavior with "hosts: dns" and "hosts: files mdns4_minimal
[NOTFOUND=return] dns".
-- 
Thomas


Re: getaddrinfo() return value chaos

2013-07-08 Thread Kurt Roeckx
On Mon, Jul 08, 2013 at 08:23:28AM +0200, Helmut Grohne wrote:
> 
> Indeed. So maybe mdns is to blame here for part of the trouble? Can you
> verify that the last mdns4 entry really accounts for the difference?

mdns has always been a problem in my experience.  I thought there
was a bug open about that against libnss-mdns but I can't find it.

> > In any case, because of all these differences and changes we
> > won't have a good, stable getaddrinfo() interface to program
> > against until Jessie.  In the meantime a program that needs to
> > distinguish between different causes for a name resolution
> > failure will have to do more than just check the status and
> > errno from getaddrinfo().
> 
> In particular one of the more recent bugs set out to return EAI_NONAME
> when the network is unavailable. This may be a condition where a lookup
> needs to be retried though.

I think it needs to return EAI_AGAIN in that case.
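
To make the retry question concrete, a caller's policy might look like the sketch below. It is an illustration under stated assumptions, not behavior guaranteed by any libc version discussed here: the enum, the function name `classify`, and the choice of which errno values count as transient are all hypothetical. EAI_NONAME is deliberately reported as unclear, because on the versions tested in this thread it is also returned when no nameserver can be reached.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <errno.h>
#include <netdb.h>

/* Hypothetical policy values for a caller deciding what to do next. */
enum gai_action { GAI_OK, GAI_RETRY, GAI_GIVE_UP, GAI_UNCLEAR };

/* Hypothetical classifier: map a getaddrinfo() status (and the errno
 * saved right after the call) to an action. */
enum gai_action classify(int status, int saved_errno)
{
    switch (status) {
    case 0:
        return GAI_OK;
    case EAI_AGAIN:
        /* Temporary failure: the lookup is worth retrying. */
        return GAI_RETRY;
    case EAI_SYSTEM:
        /* errno is meaningful only here.  Treating these two values
         * (111 and 110 in the output quoted in this thread) as
         * transient is an assumption of this sketch. */
        if (saved_errno == ECONNREFUSED || saved_errno == ETIMEDOUT)
            return GAI_RETRY;
        return GAI_GIVE_UP;
    case EAI_NONAME:
        /* May mean "no such host" or "network unavailable" on the
         * libc versions tested above, so the status alone cannot
         * settle whether a retry makes sense. */
        return GAI_UNCLEAR;
    default:
        return GAI_GIVE_UP;
    }
}
```

A program that gets GAI_UNCLEAR has to use other evidence, such as checking whether any nameserver is configured, before deciding whether to retry.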


Kurt


-- 
Archive: http://lists.debian.org/20130708164331.ga13...@roeckx.be



Re: getaddrinfo() return value chaos

2013-07-11 Thread Thomas Hood
Kurt has filed a new bug report against eglibc

http://sourceware.org/bugzilla/show_bug.cgi?id=15726

which draws the developers' attention to RFC 3493, which specifies the
return values of getaddrinfo(). These should be as follows.

> - Things work as expected: return 0
> - The nameserver replies that the hostname does not exist: EAI_FAIL
> - The nameserver doesn't reply, or replies with a temporary failure: EAI_AGAIN
> - You used AI_NUMERICHOST or AI_NUMERICSERV and didn't give a number: EAI_NONAME

Further discussion can best be carried on in the upstream Bugzilla ticket.
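
The list quoted above can be written down as a small table that a regression test could check an implementation against. The sketch below encodes the four cases exactly as they are quoted in this message; the enum and the function name `expected_status` are hypothetical, and nothing here claims that any libc currently behaves this way.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <netdb.h>

/* Hypothetical names for the four situations in the quoted list. */
enum gai_condition {
    COND_SUCCESS,               /* things work as expected */
    COND_NO_SUCH_HOST,          /* nameserver says the name does not exist */
    COND_NO_REPLY_OR_TEMPFAIL,  /* no reply, or a temporary failure */
    COND_NOT_NUMERIC            /* AI_NUMERICHOST/AI_NUMERICSERV, no number */
};

/* Return the status the quoted list asks for in each situation. */
int expected_status(enum gai_condition c)
{
    switch (c) {
    case COND_SUCCESS:
        return 0;
    case COND_NO_SUCH_HOST:
        return EAI_FAIL;
    case COND_NO_REPLY_OR_TEMPFAIL:
        return EAI_AGAIN;
    case COND_NOT_NUMERIC:
        return EAI_NONAME;
    }
    return EAI_FAIL;            /* not reached for valid enum values */
}
```

A test harness could set up each situation, call getaddrinfo(), and compare the result with `expected_status()`.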
-- 
Thomas Hood



Re: getaddrinfo() return value chaos

2013-08-28 Thread Thomas Hood
I wrote:
> Kurt has filed a new bug report against eglibc
>
> http://sourceware.org/bugzilla/show_bug.cgi?id=15726
>
> which draws the developers' attention to RFC3493 which specifies the
> return values of getaddrinfo(). These should be as follows.
> [...]
> Further discussion can best be carried on in the upstream Bugzilla ticket.

Arising from the discussion in that ticket, an effort has begun to
better describe the desired behavior of getaddrinfo(). Initial results
can be seen here: http://sourceware.org/glibc/wiki/NameResolver
-- 
Thomas Hood


-- 
Archive: http://lists.debian.org/521dde8f.5000...@gmail.com