Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-08-29 Thread Eric S. Raymond
Processing old mail... Hal Murray : > > I believe you're right that these platforms don't have it. The question is, > > how important is that fact? Is the performance hit from synchronous DNS > > really a showstopper? I don't know the answer. > > There are two cases I know of where ntpd does

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-29 Thread Matthew Selsky
On Tue, Jun 28, 2016 at 11:39:16PM -0700, Hal Murray wrote: > > matthew.sel...@twosigma.com said: > > "rlimit memlock 0" using Classic causes ntpd to die after 3 minutes with > > this error 2016-06-29T00:13:21.903+00:00 host.example.com ntpd[27206]: > > libgcc_s.so.1 must be installed for pthread

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Hal Murray
matthew.sel...@twosigma.com said: > "rlimit memlock 0" using Classic causes ntpd to die after 3 minutes with > this error 2016-06-29T00:13:21.903+00:00 host.example.com ntpd[27206]: > libgcc_s.so.1 must be installed for pthread_cancel to work What version of Classic are you running? I though t

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Matthew Selsky
On Tue, Jun 28, 2016 at 07:26:39PM -0400, Eric S. Raymond wrote: > Hal Murray : > > I think you have extrapolated from some modern systems to our whole target > > environment. I don't remember any discussion supporting memlock not being > > interesting/important. > > There were actually two thr

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Gary E. Miller
Yo Eric! On Tue, 28 Jun 2016 19:47:14 -0400 "Eric S. Raymond" wrote: > Gary E. Miller : > > Yo Eric! > > > > On Tue, 28 Jun 2016 19:26:39 -0400 > > "Eric S. Raymond" wrote: > > > > > (You should camp on #ntpsec. Also join our Signal channel - > > > because that's secured, most of the vuln

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Eric S. Raymond
Gary E. Miller : > Yo Eric! > > On Tue, 28 Jun 2016 19:26:39 -0400 > "Eric S. Raymond" wrote: > > > (You should camp on #ntpsec. Also join our Signal channel - because > > that's secured, most of the vuln discussions happen there.) > > Ah, how do we join the Signal channel? Install Signal on

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Gary E. Miller
Yo Eric! On Tue, 28 Jun 2016 19:26:39 -0400 "Eric S. Raymond" wrote: > (You should camp on #ntpsec. Also join our Signal channel - because > that's secured, most of the vuln discussions happen there.) Ah, how do we join the Signal channel? RGDS GARY --

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Eric S. Raymond
Hal Murray : > I think you have extrapolated from some modern systems to our whole target > environment. I don't remember any discussion supporting memlock not being > interesting/important. There were actually two threads about this attached to memlock-related bug reports in Classic. They ini

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-28 Thread Hal Murray
e...@thyrsus.com said: > After discussion with Daniel about the performance and security issues I > deleted the memlock code. As the comment explains: I think changes like that are worthy of a general announcement. > on modern systems, which swap so seldom > that many people don't bother wit

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-27 Thread Hal Murray
cbwie...@gmail.com said: > I was thinking of setting up associations using the DNS lookup code. If the > mechanism for adding new pool servers was blocking on the DNS call but > asynchronous to the rest of the daemon, I was figuring to call the lookup > with the name provided by the server direct

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-27 Thread Clark B. Wierda
On Mon, Jun 27, 2016 at 3:47 PM, Hal Murray wrote: > > cbwie...@gmail.com said: > > How are pool entries added when the service decides it needs more? > > There is some background stuff that roughly says "need more?", and if so > fires off the DNS lookup. > > > > Would it be possible to leverage

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-27 Thread Hal Murray
cbwie...@gmail.com said: > How are pool entries added when the service decides it needs more? There is some background stuff that roughly says "need more?", and if so fires off the DNS lookup. > Would it be possible to leverage this code for adding all servers specified > by name? Probably n

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-27 Thread Clark B. Wierda
A question: How are pool entries added when the service decides it needs more? Would it be possible to leverage this code for adding all servers specified by name? The DNS cost would be the same. The only difference is the name used for the query. Once a server is associated, the IP is used.

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Eric S. Raymond
Hal Murray : > > e...@thyrsus.com said: > > Ugh. Our options have just narrowed. I've just seen > > libgcc_s.so.1 must be installed for pthread_cancel to work Aborted (core > > dumped) > > > with memlock off in the build. > > Can you reproduce it? > > My guess is that you didn't really get me

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Hal Murray
e...@thyrsus.com said: > Ugh. Our options have just narrowed. I've just seen > libgcc_s.so.1 must be installed for pthread_cancel to work Aborted (core > dumped) > with memlock off in the build. Can you reproduce it? My guess is that you didn't really get memlock turned off. How about puttin

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Hal Murray
Possible crazy idea... How about we never kill the DNS helper thread. Just let it sit there in case it gets more work to do. The only cost is a bit of memory. Or maybe only do that if we are locking stuff into memory. -- These are my opinions. I hate spam.
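
The idea above (start the DNS helper once and leave it idle between lookups instead of tearing it down) can be sketched roughly as follows. This is only an illustration of the design, not ntpd's code: ntpd is written in C with POSIX threads, and the Python below does not reproduce the pthread_cancel/libgcc_s.so.1 failure discussed in this thread. The names `DnsWorker`, `lookup`, and the injectable `resolve` parameter are hypothetical, and port 123 is used only because it is the NTP port.

```python
import queue
import socket
import threading


class DnsWorker:
    """Hypothetical long-lived resolver thread: started once, never cancelled."""

    def __init__(self, resolve=socket.getaddrinfo):
        # The resolver is injectable (an assumption of this sketch) so the
        # worker can be exercised without real DNS traffic.
        self._resolve = resolve
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Block on the queue indefinitely; an idle worker costs only memory.
        while True:
            name, callback = self._requests.get()
            try:
                infos = self._resolve(name, 123, type=socket.SOCK_DGRAM)
                addrs = [info[4][0] for info in infos]
            except OSError:
                addrs = []  # lookup failed; report no addresses
            callback(name, addrs)

    def lookup(self, name, callback):
        """Queue a lookup and return at once; callback runs on the worker."""
        self._requests.put((name, callback))
```

Because the thread is never joined or cancelled, no thread-teardown code runs after startup, which is the property the suggestion relies on. Whether that would be enough to avoid the crash in the real C daemon is not something this sketch can show.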

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Hal Murray
e...@thyrsus.com said: >> We could try simplifying things to only supporting lock-everything-I-need >> rather than specifying how much. There might be a slippery slope if >> something like a thread stack needs a sane size specified. > I'm not intimate with mlockall, but it looks like it works

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Eric S. Raymond
Hal Murray : > If it uses threads, we still have the problem of not being able to load the > thread cleanup code. Maybe. We don't know if the libc implementation is vulnerable to that bug or not. I should do an experimental implementation on a branch and find out. -- http://www

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Hal Murray
e...@thyrsus.com said: >> Is getaddrinfo_a() in RTEMS? QNX? BSD? > It's not an OS thing, it's a toolchain thing. getaddrinfo_a() is > implemented using standard C and POSIX threads, it doesn't need OS-specific > support. Or it's in an optional extra library. > Linux has it because Linux uses

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Eric S. Raymond
Mark Atwood : > Is getaddrinfo_a() in RTEMS? QNX? BSD? It's not an OS thing, it's a toolchain thing. getaddrinfo_a() is implemented using standard C and POSIX threads, it doesn't need OS-specific support. Linux has it because Linux uses libc whether you're compiling with gcc or clang. Any of

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Mark Atwood
Is getaddrinfo_a() in RTEMS? QNX? BSD? On Sun, Jun 26, 2016 at 7:06 AM Eric S. Raymond wrote: > Eric S. Raymond : > > > What would you do if we discovered a case where we wanted it? > > > > Cry a lot. Then add logic to force synchronous DNS when memlocking is > > selected, and document this

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Eric S. Raymond
Eric S. Raymond : > > What would you do if we discovered a case where we wanted it? > > Cry a lot. Then add logic to force synchronous DNS when memlocking is > selected, and document this as a workaround for a bug we haven't fixed yet. Ugh. Our options have just narrowed. I've just seen libgc

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-26 Thread Eric S. Raymond
Hal Murray : > > e...@thyrsus.com said: > > In this case, we have two possible complexity-reducing fixes. One is to > > drop the memlock feature entirely. The other is to drop the buggy homebrew > > asynchronous-DNS lookup from Classic and use libc's. > > Dropping memlock is an interesting idea

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Hal Murray
e...@thyrsus.com said: > In this case, we have two possible complexity-reducing fixes. One is to > drop the memlock feature entirely. The other is to drop the buggy homebrew > asynchronous-DNS lookup from Classic and use libc's. Dropping memlock is an interesting idea. I can't think of any pla

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Eric S. Raymond
Mark: Heads up! Policy issue. Important but not urgent. Hal Murray : > > e...@thyrsus.com said: > > I think the hack is to force libgcc_s to be loaded early. I don't know how > > to do that in waf. > > There are two problems in this area. One is the end-of-thread code not > getting locked i

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Hal Murray
e...@thyrsus.com said: > I think the hack is to force libgcc_s to be loaded early. I don't know how > to do that in waf. There are two problems in this area. One is the end-of-thread code not getting locked into memory. I think that is what you are running into. The other is a tangle of erro

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Eric S. Raymond
Kurt Roeckx : > > This matches what I remember, except for "use more memory". There was a > > third > > workaround involved weird linker options to force early loading of the > > library. > > Like -Wl,-z,now? That's not such a weird option. No, something related to the message I got when I cam

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Kurt Roeckx
On Sat, Jun 25, 2016 at 06:13:56PM -0400, Eric S. Raymond wrote: > Hal Murray : > > > > e...@thyrsus.com said: > > > 1. Apply Classic's workaround for the problem, which I don't remember the > > > details of but involved some dodgy nonstandard linker hacks done through > > > the > > > build syste

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Eric S. Raymond
Hal Murray : > > e...@thyrsus.com said: > > 1. Apply Classic's workaround for the problem, which I don't remember the > > details of but involved some dodgy nonstandard linker hacks done through the > > build system. *However, I did not trust this method when I understood it.* > > It seemed sure

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Eric S. Raymond
Kurt Roeckx : > On Sat, Jun 25, 2016 at 11:00:39AM -0400, Eric S. Raymond wrote: > > > > While this did enable me to recover from my errors, it also turned up > > a serious problem. The combination of the buggy async-DNS code we > > inherited from Classic and use of pool servers causes *very* fre

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Hal Murray
e...@thyrsus.com said: > 1. Apply Classic's workaround for the problem, which I don't remember the > details of but involved some dodgy nonstandard linker hacks done through the > build system. *However, I did not trust this method when I understood it.* > It seemed sure to cause porting difficul

Re: Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Kurt Roeckx
On Sat, Jun 25, 2016 at 11:00:39AM -0400, Eric S. Raymond wrote: > > While this did enable me to recover from my errors, it also turned up > a serious problem. The combination of the buggy async-DNS code we > inherited from Classic and use of pool servers causes *very* frequent > crashes. Can yo

Use of pool servers reveals unacceptable crash rate in async DNS

2016-06-25 Thread Eric S. Raymond
Yesterday I pushed some erroneous commits that got out because my smoke-test procedure was throwing false negatives. To deal with this, I've improved the way I test; everything now gets tried on snark before being pushed to the public repo so the test farm machines can see it. While this did enabl