Here's what I can tell you:

When the system is under high PPS load, relayd seems to restart (and
frequently at that) unless I significantly raise the check delays and
timeouts.  With the lower, preferable values, relayd otherwise
functions normally (apart from the lost hce child).

This bug is elusive as hell and doesn't rear its head often.  But
when it does, it usually recurs repeatedly and continuously.  I use a
command to auto-restart relayd when it dies with signal 6 (SIGABRT),
and the output ends up looking like:

Tue Jan 17 13:33:40 MST 2012
restarted
Tue Jan 17 13:34:04 MST 2012
restarted
Tue Jan 17 13:34:28 MST 2012
restarted
Tue Jan 17 13:34:56 MST 2012
restarted
Tue Jan 17 13:35:06 MST 2012
restarted
Tue Jan 17 13:35:24 MST 2012
restarted
Tue Jan 17 13:35:48 MST 2012
restarted
Tue Jan 17 13:35:55 MST 2012
restarted
Tue Jan 17 13:36:20 MST 2012
restarted

So, as you can see, this occurs rather frequently during periods of
high PPS load.
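
A restart loop of this kind could look roughly like the sketch below.
It is not the command that produced the log above: RUN_CMD,
MAX_RESTARTS and PAUSE are illustrative names, and it assumes relayd
is kept in the foreground with -d so the loop can see it exit.

```shell
#!/bin/sh
# Hypothetical supervisor sketch; not the actual command used in this report.
# Assumption: "relayd -d" stays in the foreground, so the loop blocks on it.
RUN_CMD=${RUN_CMD:-"relayd -d"}
MAX_RESTARTS=${MAX_RESTARTS:-0}   # 0 means keep restarting forever
PAUSE=${PAUSE:-1}                 # seconds to wait between restarts

supervise() {
    count=0
    while :; do
        # Blocks until the daemon exits; a nonzero exit status is ignored
        $RUN_CMD || true
        date
        echo restarted
        count=$((count + 1))
        if [ "$MAX_RESTARTS" -gt 0 ] && [ "$count" -ge "$MAX_RESTARTS" ]; then
            break
        fi
        sleep "$PAUSE"            # avoid a tight crash loop
    done
}

# Start the loop only when asked, so the file can also be sourced.
if [ "${1:-}" = "run" ]; then
    supervise
fi
```

Run with the argument "run", this prints a timestamp and "restarted"
each time the supervised command exits, then starts it again.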

The error I see when running relayd with -dv is:

relayd in free(): error: bogus pointer (double free?) 0x206ac8000
lost child: hce terminated; signal 6
pfe exiting, pid 12691
relay exiting, pid 31468
relay exiting, pid 5714
relay exiting, pid 2319
relay exiting, pid 19145
relay exiting, pid 20233
parent terminating, pid 5977

dmesg.boot.bz2 as requested by the FAQ is attached.

I've also included a copy of relayd.conf.bz2.

I wish I could provide you with more information, but this is as much
as I can provide at this point in time.  Unfortunately, this problem
is mostly an issue on our production router (as it's the only one
that receives such high traffic at any given point in time).  I can't
experiment with it enough to get further trace information, and I
don't have the time/resources to dig further into this issue at the
moment.

I hope this is enough to get started on the bug.  If you need any more
information from me on my environment, I will do my best to get it for
you.

Happy hacking and all the best,

Zack

[demime 1.01d removed an attachment of type application/x-bzip2 which had a 
name of relayd.conf.bz2]

[demime 1.01d removed an attachment of type application/x-bzip2 which had a 
name of dmesg.boot.bz2]
