It just occurred to me that this could be a problem with the recycler. If we recycle a buffer too early, but then keep using it, another user could pick it up, and between them they could corrupt the data that would cause it to be unhandled.
-Ivan On Thu, Mar 8, 2018 at 1:49 PM, Enrico Olivelli <[email protected]> wrote: > > > 2018-03-08 11:57 GMT+01:00 Ivan Kelly <[email protected]>: >> >> Hmm, yes, looks messed up. Do you have a reliable repro of this? >> >> "at io.netty.channel.DefaultChannelPipeline.onUnhandledInboundMessage" >> <- it seems the message itself wasn't handled. Do you have any more >> bookie side logs leading up to this event? > > > > The only suspicious logs are for logger named > "org.apache.bookkeeper.bookie.EntryLogger" > > > 18-03-08-09-22-40 Created new entry log file > /data/qasql/bookie/./bookie_data/current/4b0.log for logId 1200. > 18-03-08-09-32-41 Failed to get ledgers map index from: 53.log : No > ledgers map index found on entryLogId53 > 18-03-08-09-33-10 Failed to get ledgers map index from: 54.log : No > ledgers map index found on entryLogId54 > 18-03-08-09-33-10 Failed to get ledgers map index from: 375.log : No > ledgers map index found on entryLogId375 > 18-03-08-09-33-13 Failed to get ledgers map index from: 878.log : No > ledgers map index found on entryLogId878 > 18-03-08-09-33-22 Failed to get ledgers map index from: 1197.log : No > ledgers map index found on entryLogId1197 > 18-03-08-09-33-57 Failed to get ledgers map index from: 1198.log : No > ledgers map index found on entryLogId1198 > 18-03-08-09-34-54 Failed to get ledgers map index from: 1199.log : No > ledgers map index found on entryLogId1199 > > I will try to run Bookie with level debug, but it produces lots of spam > > Thanks > Enrico > >> >> >> -Ivan >> >> On Thu, Mar 8, 2018 at 9:18 AM, Enrico Olivelli <[email protected]> >> wrote: >> > Hi all, >> > I am seeing this bad errors in some test environments with 4.6.1 >> > The errors appear during rolling restart of the application, the test >> > env is >> > made of 6 machines: >> > - 3 bookies >> > - 3 client machines (with several BK clients, of different sub-systems) >> > - running with 4.6.1 both client and servers >> > >> > *I do not have* reports of this errors from production, already running >> > 4.6.1 for the last month >> > >> > But the problem is quite scary >> > >> > This is a sample of relevant errors on clients (in this case Majordodo >> > brokers, with log level = INFO) >> > >> > logs/org.apache.bookkeeper.client.PendingAddOp: >> > 18-03-07-09-51-43 Write of ledger entry to quorum failed: L366634 >> > E2557 >> > 18-03-07-09-51-43 Write of ledger entry to quorum failed: L366634 >> > E2558 >> > 18-03-07-09-51-43 Write of ledger entry to quorum failed: L366634 >> > E2559 >> > 18-03-07-15-59-55 Failed to write entry (366680, 1865): Bookie >> > operation timeout >> > 18-03-07-15-59-55 Failed to write entry (366680, 1865): Bookie >> > operation timeout >> > 18-03-07-16-00-00 Failed to write entry (366680, 1865): Bookie >> > operation timeout >> > 18-03-07-16-00-00 Failed to write entry (366680, 1865): Bookie >> > operation timeout >> > 18-03-07-16-00-05 Failed to write entry (366680, 1865): Bookie >> > operation timeout >> > >> > >> > org.apache.bookkeeper.proto.PerChannelBookieClient: >> > >> > 18-03-07-10-06-44 Unexpected exception caught by bookie client >> > channel >> > handler >> > 18-03-07-10-06-44 io.netty.handler.codec.DecoderException: >> > java.lang.IllegalStateException: Received unknown response : op code = 6 >> > io.netty.handler.codec.DecoderException: >> > java.lang.IllegalStateException: >> > Received unknown response : op code = 6 >> > at >> > >> > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:98) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) >> > at >> > >> > io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310) >> > at >> > >> > io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297) >> > at >> > >> > io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413) >> > at >> > >> > io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) >> > at >> > >> > io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1414) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) >> > at >> > >> > io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:945) >> > at >> > >> > io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:800) >> > at >> > >> > io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:404) >> > at >> > io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:304) >> > at >> > >> > io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886) >> > at java.base/java.lang.Thread.run(Thread.java:844) >> > Caused by: java.lang.IllegalStateException: Received unknown response : >> > op >> > code = 6 >> > at >> > >> > org.apache.bookkeeper.proto.BookieProtoEncoding$ResponseEnDeCoderPreV3.decode(BookieProtoEncoding.java:294) >> > at >> > >> > org.apache.bookkeeper.proto.BookieProtoEncoding$ResponseDecoder.decode(BookieProtoEncoding.java:478) >> > at >> > >> > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) >> > >> > >> > ... 19 more >> > >> > >> > Errors on Bookie side, with log level = INFO) >> > >> > >> > io.netty.channel.DefaultChannelPipeline: >> > 8-03-07-09-51-38 An exceptionCaught() event was fired, and it >> > reached >> > at the tail of the pipeline. It usually means the last handler in the >> > pipeline did not handle the exception. >> > 18-03-07-09-51-38 io.netty.util.IllegalReferenceCountException: >> > refCnt: 0, increment: 1 >> > io.netty.util.IllegalReferenceCountException: refCnt: 0, increment: 1 >> > at >> > >> > io.netty.buffer.AbstractReferenceCountedByteBuf.release0(AbstractReferenceCountedByteBuf.java:100) >> > at >> > >> > io.netty.buffer.AbstractReferenceCountedByteBuf.release(AbstractReferenceCountedByteBuf.java:84) >> > at >> > io.netty.util.ReferenceCountUtil.release(ReferenceCountUtil.java:88) >> > at >> > >> > io.netty.channel.DefaultChannelPipeline.onUnhandledInboundMessage(DefaultChannelPipeline.java:1202) >> > at >> > >> > io.netty.channel.DefaultChannelPipeline$TailContext.channelRead(DefaultChannelPipeline.java:1299) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) >> > at >> > >> > io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) >> > at >> > >> > org.apache.bookkeeper.proto.BookieRequestHandler.channelRead(BookieRequestHandler.java:78) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) >> > at >> > >> > io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) >> > at >> > >> > org.apache.bookkeeper.proto.AuthHandler$ServerSideHandler.channelRead(AuthHandler.java:88) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) >> > at >> > >> > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) >> > at >> > >> > io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:310) >> > at >> > >> > io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:297) >> > at >> > >> > io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:413) >> > at >> > >> > io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:265) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) >> > at >> > >> > io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1414) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) >> > at >> > >> > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) >> > at >> > >> > io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:945) >> > at >> > >> > io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:800) >> > at >> > >> > io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:404) >> > at >> > io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:304) >> > at >> > >> > io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886) >> > at java.base/java.lang.Thread.run(Thread.java:844) >> > >> > > >
