We're planning a 6.0.5 to be released really soon which should be a drop in replacement for 6.0.4 - this fix should be in it.
The code here hasn't changed for a long, *long* time so you'll have been exposed to it for as long as you've been running Qpid... I think you'd have to be *really* unlucky to hit the bug I identified if you are not using TTL (even with TTL it's still a pretty unlucky thing to hit). -- Rob On 7 October 2016 at 00:05, Ramayan Tiwari <[email protected]> wrote: > Yeah, we have been using this version of the broker in production for more > than a year. I am not aware of any change with respect to our production > environment. We only saw this issue past weekend in only one of the > instance (all the brokers in that instance) among more than 100 instances. > Broker hardware and configs are same across all the instances. I will keep > digging. > > For 0.32, we have patched the broker before so we will be able apply this > patch (if we go that route). We are currently performance testing 6.0.4 and > if that goes well, we will probably using the fix that you make in 6.0.x. > > Thanks > Ramayan > > On Thu, Oct 6, 2016 at 2:46 PM, Rob Godfrey <[email protected]> wrote: > >> OK - TTL was the most likely way... there are other possibilities >> (basically it just needs the message to go from being available to >> dequeued simultaneous to it being evaluated to be sent to a consumer - >> competing consumers *could* do it, though I would have guessed it to >> be unlikely). It's possible there's another bug separate to the one I >> identified that you are running in to, though nothing jumps out at me. >> I believe you guys have been running 0.32 for a while... have you just >> started seeing this issue - has something changed in your environment >> or your messaging usage? >> >> We won't be doing a 0.32.x release, there's a *lot* of other stuff >> that has been fixed in 6.0 which I would include in that before I'd >> include this change... I think you may have already patched your 0.32 >> broker anyway, in which case you should be able to add the patch I put >> on the JIRA. >> >> On 6 October 2016 at 23:33, Ramayan Tiwari <[email protected]> >> wrote: >> > Hi Rob, >> > >> > Thanks so much a prompt response and create JIRA to track this. >> > >> > We don't set a TTL on our messages though -- i.e., we use 0 (unlimited). >> We >> > do send it in non-persistent mode; not sure if that matters or could >> > trigger the same bug? >> > >> > For the fix, would it be possible to fix it in 0.32 as well as in 6.0.x ? >> > >> > CCing, Helen from our team. >> > >> > Thanks, >> > >> > Ramayan >> > >> > On Thu, Oct 6, 2016 at 1:21 PM, Rob Godfrey <[email protected]> >> wrote: >> > >> >> I've added a patch file which should apply to 0.32 on the following >> >> JIRA: https://issues.apache.org/jira/browse/QPID-7451 >> >> >> >> Hope this helps, >> >> Rob >> >> >> >> On 6 October 2016 at 22:04, Rob Godfrey <[email protected]> >> wrote: >> >> > Hi Ramayan, >> >> > >> >> > this is exception indicates that a message has been deleted from the >> >> > store, but the in-memory queue still references it. Unfortunately the >> >> > exception doesn't really tell us anything about how the broker will >> >> > have got to this state. >> >> > >> >> > Having looked at the code I have an idea about what may be happening - >> >> > do your messages have a TTL set? I *think* that the AMQP 0-10 message >> >> > path may be vulnerable to a race condition if a message expires at >> >> > precisely the same time that the message is picked up to be sent to a >> >> > consumer (essentially the message has to be available to consume and >> >> > then three lines of code later it must have been deleted). The 0-9-1 >> >> > codepath is not vulnerable to this as it caches the size of the >> >> > message (as you can see in the stack trace, the consumer is checking >> >> > the size of the message to make sure that the consumer has enough >> >> > credit to receive it). >> >> > >> >> > We're unlikely to do a patch release for 0.32, but we will likely be >> >> > putting out a new 6.0.x release soon, and soon after a 6.1 release. >> >> > Would you be able to upgrade to one of these, or would you prefer me >> >> > to send you a patch file that you could apply to the 0.32 source to >> >> > test? >> >> > >> >> > -- Rob >> >> > >> >> > On 6 October 2016 at 21:27, Ramayan Tiwari <[email protected]> >> >> wrote: >> >> >> Hi, >> >> >> >> >> >> >> >> >> We are ran into this StoreException in our production environment >> >> multiple >> >> >> times on different brokers, which caused broker shutdown. We are >> running >> >> >> 0.32 Java broker with 0.16 client. I see that this was reported and >> >> fixed >> >> >> here: >> >> >> https://issues.apache.org/jira/browse/QPID-4012 >> >> >> >> >> >> This is still happening, I don't have enough context to reproduce >> this >> >> >> locally. Any help is appreciated! >> >> >> >> >> >> Thanks >> >> >> Ramayan >> >> >> >> >> >> *Exception* >> >> >> >> >> >> Uncaught exception, shutting down. >> >> >> org.apache.qpid.server.store.StoreException: Metadata not found for >> >> message >> >> >> with id 1762118451 >> >> >> at >> >> >> org.apache.qpid.server.store.berkeleydb.AbstractBDBMessageStore. >> >> getMessageMetaData(AbstractBDBMessageStore.java:343) >> >> >> at >> >> >> org.apache.qpid.server.store.berkeleydb.AbstractBDBMessageStore$ >> >> StoredBDBMessage.getMetaData(AbstractBDBMessageStore.java:1224) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.MessageTransferMessage. >> >> getMetaData(MessageTransferMessage.java:41) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.MessageTransferMessage. >> getSize( >> >> MessageTransferMessage.java:56) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.ConsumerTarget_ >> >> 0_10.allocateCredit(ConsumerTarget_0_10.java:494) >> >> >> at >> >> >> org.apache.qpid.server.queue.QueueConsumerImpl.wouldSuspend( >> >> QueueConsumerImpl.java:278) >> >> >> at >> >> >> org.apache.qpid.server.queue.AbstractQueue.attemptDelivery( >> >> AbstractQueue.java:2059) >> >> >> at >> >> >> org.apache.qpid.server.queue.AbstractQueue.flushConsumer( >> >> AbstractQueue.java:1981) >> >> >> at >> >> >> org.apache.qpid.server.queue.AbstractQueue.flushConsumer( >> >> AbstractQueue.java:1957) >> >> >> at >> >> >> org.apache.qpid.server.queue.QueueConsumerImpl.flush( >> >> QueueConsumerImpl.java:318) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.ConsumerTarget_ >> >> 0_10.flush(ConsumerTarget_0_10.java:605) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.ServerSessionDelegate. >> >> messageFlush(ServerSessionDelegate.java:521) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.ServerSessionDelegate. >> >> messageFlush(ServerSessionDelegate.java:82) >> >> >> at org.apache.qpid.transport.MessageFlush.dispatch( >> >> MessageFlush.java:87) >> >> >> at >> >> >> org.apache.qpid.transport.SessionDelegate.command( >> >> SessionDelegate.java:55) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.ServerSessionDelegate.command( >> >> ServerSessionDelegate.java:99) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.ServerSessionDelegate.command( >> >> ServerSessionDelegate.java:82) >> >> >> at org.apache.qpid.transport.Method.delegate(Method.java:159) >> >> >> at org.apache.qpid.transport.Session.received(Session.java:596) >> >> >> at org.apache.qpid.transport.Connection.dispatch( >> Connection.java:452) >> >> >> at >> >> >> org.apache.qpid.transport.ConnectionDelegate.handle( >> >> ConnectionDelegate.java:64) >> >> >> at >> >> >> org.apache.qpid.transport.ConnectionDelegate.handle( >> >> ConnectionDelegate.java:40) >> >> >> at >> >> >> org.apache.qpid.transport.MethodDelegate.messageFlush( >> >> MethodDelegate.java:143) >> >> >> at org.apache.qpid.transport.MessageFlush.dispatch( >> >> MessageFlush.java:87) >> >> >> at >> >> >> org.apache.qpid.transport.ConnectionDelegate.command( >> >> ConnectionDelegate.java:54) >> >> >> at >> >> >> org.apache.qpid.transport.ConnectionDelegate.command( >> >> ConnectionDelegate.java:40) >> >> >> at org.apache.qpid.transport.Method.delegate(Method.java:159) >> >> >> at org.apache.qpid.transport.Connection.received( >> Connection.java:405) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.ServerConnection.access$001( >> >> ServerConnection.java:64) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.ServerConnection$1.run( >> >> ServerConnection.java:316) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.ServerConnection$1.run( >> >> ServerConnection.java:312) >> >> >> at java.security.AccessController.doPrivileged(Native Method) >> >> >> at javax.security.auth.Subject.doAs(Subject.java:360) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.ServerConnection.received( >> >> ServerConnection.java:311) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.ServerConnection.received( >> >> ServerConnection.java:64) >> >> >> at org.apache.qpid.transport.network.Assembler.emit( >> Assembler.java:97) >> >> >> at org.apache.qpid.transport.network.Assembler.assemble( >> >> Assembler.java:198) >> >> >> at org.apache.qpid.transport.network.Assembler.frame( >> >> Assembler.java:131) >> >> >> at org.apache.qpid.transport.network.Frame.delegate(Frame.java:128) >> >> >> at org.apache.qpid.transport.network.Assembler.received( >> >> Assembler.java:102) >> >> >> at org.apache.qpid.transport.network.Assembler.received( >> >> Assembler.java:44) >> >> >> at >> >> >> org.apache.qpid.transport.network.InputHandler.next( >> >> InputHandler.java:199) >> >> >> at >> >> >> org.apache.qpid.transport.network.InputHandler.received( >> >> InputHandler.java:114) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.ProtocolEngine_ >> >> 0_10.received(ProtocolEngine_0_10.java:179) >> >> >> at >> >> >> org.apache.qpid.server.protocol.v0_10.ProtocolEngine_ >> >> 0_10.received(ProtocolEngine_0_10.java:43) >> >> >> at >> >> >> org.apache.qpid.server.protocol.MultiVersionProtocolEngine.received( >> >> MultiVersionProtocolEngine.java:153) >> >> >> at >> >> >> org.apache.qpid.server.protocol.MultiVersionProtocolEngine.received( >> >> MultiVersionProtocolEngine.java:51) >> >> >> at org.apache.qpid.transport.network.io.IoReceiver.run( >> >> IoReceiver.java:161) >> >> >> at java.lang.Thread.run(Thread.java:745) >> >> >> >> --------------------------------------------------------------------- >> >> To unsubscribe, e-mail: [email protected] >> >> For additional commands, e-mail: [email protected] >> >> >> >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: [email protected] >> For additional commands, e-mail: [email protected] >> >> --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
