[ https://issues.apache.org/jira/browse/ZOOKEEPER-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mohammad Arshad resolved ZOOKEEPER-4247.
----------------------------------------
    Fix Version/s: 3.6.4
                   3.7.1
                   3.8.0
       Resolution: Fixed

> NPE while processing message from restarted quorum member
> ---------------------------------------------------------
>
>                 Key: ZOOKEEPER-4247
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4247
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.6.2
>         Environment: K8S
>            Reporter: Devarshi Shah
>            Assignee: Mate Szalay-Beko
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.8.0, 3.7.1, 3.6.4
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> *Problem:*
> While upgrading a K8S cluster, the containers running ZooKeeper roll over one by one while still serving their clients.
> During this rollover, a +Null Pointer Exception+ was observed, as shown below.
> After updating to the latest ZooKeeper 3.6.2 we still see the problem.
> This happens on a fresh install (and always has).
>
> *Stack trace:*
> <from zk-pod-0-log>
> {code:java}
> 2021-02-08T12:42:08.229+0000 [myid:] - ERROR [nioEventLoopGroup-4-1:NettyServerCnxnFactory$CnxnChannelHandler@329] - Unexpected exception in receive
> java.lang.NullPointerException: null
>     at org.apache.zookeeper.server.NettyServerCnxn.receiveMessage(NettyServerCnxn.java:518) ~[zookeeper-3.6.2.jar:3.6.2]
>     at org.apache.zookeeper.server.NettyServerCnxn.processMessage(NettyServerCnxn.java:368) ~[zookeeper-3.6.2.jar:3.6.2]
>     at org.apache.zookeeper.server.NettyServerCnxnFactory$CnxnChannelHandler.channelRead(NettyServerCnxnFactory.java:326) [zookeeper-3.6.2.jar:3.6.2]
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
>     at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
>     at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
>     at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
>     at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
>     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [netty-transport-4.1.50.Final.jar:4.1.50.Final]
>     at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [netty-common-4.1.50.Final.jar:4.1.50.Final]
>     at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [netty-common-4.1.50.Final.jar:4.1.50.Final]
>     at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.50.Final.jar:4.1.50.Final]
>     at java.lang.Thread.run(Thread.java:834) [?:?]
> {code}
>
> *Expectation:*
> This scenario should be handled, and ZooKeeper should not print a Null Pointer Exception in the logs when a peer member goes down as part of the upgrade procedure.
> We kindly request the Apache ZooKeeper team to fix this issue.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)