Here's what I can tell you: when the system is under high PPS load, relayd seems to restart (and frequently at that!) unless I significantly raise the check delays and timeouts. Otherwise, relayd functions normally (excepting the lost hce child) with the lower, more preferable values.
This bug is elusive as hell and doesn't rear its head often. But when it does, it usually does so repeatedly and continuously. I use a command to auto-restart relayd when it dies with signal 6, and the output ends up looking like:

    Tue Jan 17 13:33:40 MST 2012 restarted
    Tue Jan 17 13:34:04 MST 2012 restarted
    Tue Jan 17 13:34:28 MST 2012 restarted
    Tue Jan 17 13:34:56 MST 2012 restarted
    Tue Jan 17 13:35:06 MST 2012 restarted
    Tue Jan 17 13:35:24 MST 2012 restarted
    Tue Jan 17 13:35:48 MST 2012 restarted
    Tue Jan 17 13:35:55 MST 2012 restarted
    Tue Jan 17 13:36:20 MST 2012 restarted

So, as you can see, this occurs rather frequently (here, every 7 to 28 seconds) during periods of high PPS load.

The error I see when running relayd with -dv is:

    relayd in free(): error: bogus pointer (double free?) 0x206ac8000
    lost child: hce terminated; signal 6
    pfe exiting, pid 12691
    relay exiting, pid 31468
    relay exiting, pid 5714
    relay exiting, pid 2319
    relay exiting, pid 19145
    relay exiting, pid 20233
    parent terminating, pid 5977

dmesg.boot.bz2, as requested by the FAQ, is attached. I've also included a copy of relayd.conf.bz2.

I wish I could provide you with more information, but this is as much as I can provide at this point in time. Unfortunately, this problem is mostly an issue on our production router, as it's the only one that receives such high traffic at any given point in time. I can't tweak it enough to get further trace information, and I don't have the time or resources to dig further into this issue at the moment.

I hope this is enough to get started on the bug. If you need any more information from me on my environment, I will do my best to get it for you.

Happy hacking and all the best,
Zack

[demime 1.01d removed an attachment of type application/x-bzip2 which had a name of relayd.conf.bz2]
[demime 1.01d removed an attachment of type application/x-bzip2 which had a name of dmesg.boot.bz2]