OK, created a JIRA here https://issues.apache.org/jira/browse/QPID-3618
Thanks,
-Brandon

On Tue, Nov 15, 2011 at 11:04 AM, Gordon Sim <[email protected]> wrote:
> On 11/14/2011 10:28 PM, Brandon Pedersen wrote:
>>
>> On Mon, Nov 14, 2011 at 4:01 AM, Gordon Sim<[email protected]> wrote:
>>>
>>> On 11/13/2011 07:45 PM, Brandon Pedersen wrote:
>>>>
>>>> I have a durable federation link set up. When I start the broker that
>>>> initializes the connection there are sometimes when I get a weird
>>>> error about the connection receiving an invalid frame and subsequently
>>>> kills the qpid daemon. Is this expected behavior?
>>>
>>> No, that is a bug. What version are you using and are you able to isolate a
>>> reproducible test case?
>>
>> Running version 0.12. I think I have narrowed it down to flakiness in
>> the link between the 2 brokers. I am doing this over a cellular
>> connection and this seems to happen only when the cell connection is
>> brought up and perhaps not been fully initialized yet. It is somewhat
>> tricky to reproduce, but what happens is I fire up the broker (which
>> has a durable route/link) and then fire up the cellular connection.
>> Sometimes the connection will succeed, other times it will fail and
>> then seg fault.
>
> Could you raise a JIRA for this? Sounds like a dangling pointer to a failed
> session or similar perhaps... perhaps related to heartbeat induced
> connection abort, perhaps related to push routes specifically.
>
> I'll try and get some time to reproduce and find a fix, but am tied up right
> at the minute.
>
>>
>>>> It seems if there is
>>>> an error trying to connect it should just retry. Here is what I see in
>>>> the log:
>>>>
>>>> Nov 13 13:31:28 mtcdp daemon.err qpidd[1579]: 2011-11-13 13:31:28
>>>> error Connection local:59780-remote:5672 closed by error: Connection
>>>> not yet open, invalid frame received.(501)
>>>>
>>>> Any idea how to fix this?
>>>
>>> Can you enable core dumps on the broker and get a backtrace?
>>
>> I enabled core dumps and got a couple, both of them have the following
>> trace:
>> Core was generated by `qpidd'.
>> Program terminated with signal 11, Segmentation fault.
>> #0 0x403fdc98 in qpid::SessionState::disableReceiverTracking() ()
>> from /usr/lib/libqpidcommon.so.2
>> (gdb) backtrace
>> #0 0x403fdc98 in qpid::SessionState::disableReceiverTracking() ()
>> from /usr/lib/libqpidcommon.so.2
>> #1 0x4010e8a8 in
>> qpid::broker::Bridge::create(qpid::broker::Connection&) () from
>> /usr/lib/libqpidbroker.so.2
>> #2 0x4017ed18 in qpid::broker::Link::ioThreadProcessing() () from
>> /usr/lib/libqpidbroker.so.2
>> #3 0x40180920 in ?? () from /usr/lib/libqpidbroker.so.2
>> Cannot access memory at address 0x2d74c0f8
>>
>> Also, I am a little suspicious of the log as well. That message that
>> is output to the log actually appears twice, one right after another,
>> just before it dies. So it looks like:
>> Nov 14 15:43:35 mtcdp daemon.err qpidd[6790]: 2011-11-14 15:43:35
>> error Connection local:55764-remote:5672 closed by error: Connection
>> not yet open, invalid frame received.(501)
>> Nov 14 15:43:35 mtcdp daemon.err qpidd[6790]: 2011-11-14 15:43:35
>> error Connection local:55764-remote:5672 closed by error: Connection
>> not yet open, invalid frame received.(501)
>> I'm not sure if that helps or not....
>
> Yes, add all that in the JIRA, it will certainly help.
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project: http://qpid.apache.org
> Use/Interact: mailto:[email protected]
