Re: Unbound exiting on stats write failure?

2016-09-20 Thread Havard Eidnes via Unbound-users
> The error is on a pipe between unbound processes (threads).  It should
> not be out of resources (it might block of course, waiting for them, and
> blocking pipes are not a problem for unbound, but this error is like a
> pipe randomly breaks up).

Hm.

> Are you on OpenBSD?  Perhaps upgrade the kernel?

Nope, on NetBSD 7.0.

Regards,

- Håvard


Re: Unbound exiting on stats write failure?

2016-10-03 Thread Havard Eidnes via Unbound-users
>> one of our unbound hosts recently exited, and before it did, it
>> logged this:
>> 
>>   Sep 19 14:25:56 xxx unbound: [96:4] error: tube msg write failed: Resource temporarily unavailable
>>   Sep 19 14:25:56 xxx unbound: [96:4] fatal error: could not write stat values over cmd channel
>
> The error is on a pipe between unbound processes (threads).  It should
> not be out of resources (it might block of course, waiting for them, and
> blocking pipes are not a problem for unbound, but this error is like a
> pipe randomly breaks up).

This turned out to be caused by our running too old a version of
unbound, 1.5.4.  I've since upgraded to 1.5.9, so this exact
problem should not happen again for us.  Somewhere between those
two versions, tube_write_msg() grew a test for EAGAIN (causing a
retry) in the non-blocking case.
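
For readers unfamiliar with the pattern: "Resource temporarily
unavailable" is the usual strerror() text for EAGAIN, which a
non-blocking write() returns when the pipe is currently full.  The
sketch below illustrates the general retry-on-EAGAIN idea only.  It is
not unbound's actual tube_write_msg() code; the helper names
set_nonblocking() and write_all_retry() are made up for illustration,
and waiting with poll() is one assumed way of doing the retry.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to non-blocking mode. Returns 0 or -1. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: write all len bytes to fd, treating EAGAIN as
 * "try again later" instead of as a failure.
 * Returns 0 on success, -1 on a real error (errno is set by the
 * failing call).
 */
static int write_all_retry(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n >= 0) {
            /* Partial writes are normal; advance and keep going. */
            p += n;
            len -= (size_t)n;
            continue;
        }
        if (errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /* Pipe is full: wait until it is writable, then retry. */
            struct pollfd pfd = { .fd = fd, .events = POLLOUT };
            if (poll(&pfd, 1, -1) == -1 && errno != EINTR)
                return -1;
            continue;
        }
        return -1;                  /* anything else is a real error */
    }
    return 0;
}
```

Without the EAGAIN branch, a momentarily full pipe would be reported
as a write failure, which matches the symptom in the log above.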

Regards,

- Håvard


Unbound exiting on stats write failure?

2016-09-20 Thread Havard Eidnes via Unbound-users
Hi,

one of our unbound hosts recently exited, and before it did, it
logged this:

  Sep 19 14:25:56 xxx unbound: [96:4] error: tube msg write failed: Resource temporarily unavailable
  Sep 19 14:25:56 xxx unbound: [96:4] fatal error: could not write stat values over cmd channel

Now, we're periodically polling stats via "unbound-control stats" and
feeding this into collectd, and our collectd hasn't exactly been fully
stable.  However, is there a good reason the failure to write the
stats values is considered a fatal error?  One would have thought that
it would not be, and that abandoning the output channel would be a
reasonable error recovery mechanism, allowing the main task of unbound
to proceed uninterrupted?

Regards,

- Håvard


Re: Unbound exiting on stats write failure?

2016-09-20 Thread W.C.A. Wijngaards via Unbound-users
Hi Havard,

The error is on a pipe between unbound processes (threads).  It should
not be out of resources (it might block of course, waiting for them, and
blocking pipes are not a problem for unbound, but this error is like a
pipe randomly breaks up).

Are you on OpenBSD?  Perhaps upgrade the kernel?

Best regards, Wouter

On 20/09/16 09:47, Havard Eidnes via Unbound-users wrote:
> Hi,
> 
> one of our unbound hosts recently exited, and before it did, it
> logged this:
> 
>   Sep 19 14:25:56 xxx unbound: [96:4] error: tube msg write failed: Resource temporarily unavailable
>   Sep 19 14:25:56 xxx unbound: [96:4] fatal error: could not write stat values over cmd channel
> 
> Now, we're periodically polling stats via "unbound-control stats" and
> feeding this into collectd, and our collectd hasn't exactly been fully
> stable.  However, is there a good reason the failure to write the
> stats values is considered a fatal error?  One would have thought that
> it would not be, and that abandoning the output channel would be a
> reasonable error recovery mechanism, allowing the main task of unbound
> to proceed uninterrupted?
> 
> Regards,
> 
> - Håvard
> 



