[ https://issues.apache.org/jira/browse/CASSANDRA-15199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878488#comment-16878488 ]
Sam Tunnicliffe commented on CASSANDRA-15199:
---------------------------------------------

Just for reference, the stacktrace from the Kong issue. Looks like the {PagingState} contains a null partition key, so potentially an issue with the lua client, but I seem to recall seeing some similar stacktraces when looking into paging issues in mixed-version (2.1/3.0) clusters during upgrades.

{code}
WARN [ReadStage-6] 2019-05-31 01:08:07,971 ReadCommand.java:569 - Read 1000 live rows and 1003 tombstone cells for query SELECT * FROM kong_dev.routes WHERE partition = routes LIMIT 1000 (see tombstone_warn_threshold)
ERROR [Native-Transport-Requests-4] 2019-05-31 01:08:07,997 QueryMessage.java:129 - Unexpected error during query
java.lang.NullPointerException: null
    at org.apache.cassandra.dht.Murmur3Partitioner.getHash(Murmur3Partitioner.java:230) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.dht.Murmur3Partitioner.decorateKey(Murmur3Partitioner.java:66) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.config.CFMetaData.decorateKey(CFMetaData.java:666) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.service.pager.PartitionRangeQueryPager.<init>(PartitionRangeQueryPager.java:44) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.db.PartitionRangeReadCommand.getPager(PartitionRangeReadCommand.java:269) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.cql3.statements.SelectStatement.getPager(SelectStatement.java:474) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:287) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:117) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:225) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:256) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:241) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:566) [apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) [apache-cassandra-3.11.4.jar:3.11.4]
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_181]
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:114) [apache-cassandra-3.11.4.jar:3.11.4]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
ERROR [Native-Transport-Requests-4] 2019-05-31 01:08:07,997 ErrorMessage.java:384 - Unexpected exception during request
java.lang.NullPointerException: null
    at org.apache.cassandra.dht.Murmur3Partitioner.getHash(Murmur3Partitioner.java:230) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.dht.Murmur3Partitioner.decorateKey(Murmur3Partitioner.java:66) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.config.CFMetaData.decorateKey(CFMetaData.java:666) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.service.pager.PartitionRangeQueryPager.<init>(PartitionRangeQueryPager.java:44) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.db.PartitionRangeReadCommand.getPager(PartitionRangeReadCommand.java:269) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.cql3.statements.SelectStatement.getPager(SelectStatement.java:474) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:287) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:117) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:225) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:256) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:241) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) ~[apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:566) [apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) [apache-cassandra-3.11.4.jar:3.11.4]
    at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_181]
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [apache-cassandra-3.11.4.jar:3.11.4]
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:114) [apache-cassandra-3.11.4.jar:3.11.4]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
{code}

> Cassandra throwing occasional NPE 3.11.x
> ----------------------------------------
>
>                 Key: CASSANDRA-15199
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15199
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Messaging/Client
>            Reporter: Jeremy Justus
>            Priority: Normal
>
> Hey folks, decided to raise an official Jira (never done one of these before)
> about an issue we have found between Kong API Gateway leveraging Cassandra as
> our db.
> We run a C* cluster in 2 close DCs, 6 C* nodes total, 3 in each DC. Data
> replicated across all nodes. We have found C* sometimes throws NPEs based on
> the calls made by the lua-cassandra driver Kong
> leverages ([https://github.com/thibaultcha/lua-cassandra]). Very specifically
> it seems to occur when attempting to do paging across multiple C* nodes. When
> persistently paging to a single C* node we can't reproduce NPEs in C*.
> The exact error C* throws with its stack trace can be seen here:
> [https://github.com/Kong/kong/issues/4194#issuecomment-497572751]
> And again when we tried to upgrade from 3.11.2 to 3.11.4 in hopes it was
> already resolved:
> [https://github.com/Kong/kong/issues/4194#issuecomment-497590235]
> Same error, same line numbers, so the code must be the same in this portion.
>
> Sample of our C* Config:
> [https://github.com/Kong/kong/issues/4194#issuecomment-497595766]
> We discussed this in the ASF Slack flow as well:
> ASF Slack discussion
> ([https://the-asf.slack.com/archives/CJZLTM05A/p1559321422028200])
> Some of the more important technical comments I saw people posting here:
> [https://github.com/Kong/kong/issues/4194#issuecomment-497858824]
>
> I am not a C* DBA so I don't have an exact repro for you other than stating
> what the client application was attempting to do (paging across multiple C*
> nodes within a DC) when we could see the failures. Any Apache C* folk think
> they see the issue or could drop me a C* JAR with extra debugging print
> statements I could run in dev to help feed you more info? Or if you see the
> problem and can one-shot a fix so there is no more NPE and Cassandra responds
> appropriately to the client with some sort of error message around what was
> wrong, that would be insightful.
>
> Thanks!

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
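
To illustrate the behaviour the reporter asks for (a descriptive error rather than an NPE when the paging state carries a null partition key), here is a minimal Java sketch. It is not Cassandra code and not the fix for this ticket: the class `PagingStateGuard`, the stand-in `PagingState` type and `InvalidPagingStateException` are all hypothetical names invented for illustration. The only thing taken from the stacktrace above is the idea that a null key reaches the hashing code unchecked.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical sketch, not Cassandra code: rejects a client-supplied paging
 * state whose partition key is missing before any token is computed from it.
 */
final class PagingStateGuard {

    /** Stand-in for a protocol-level "invalid request" error (assumed name). */
    static final class InvalidPagingStateException extends RuntimeException {
        InvalidPagingStateException(String message) {
            super(message);
        }
    }

    /** Minimal stand-in for a deserialized paging state (assumed shape). */
    static final class PagingState {
        final ByteBuffer partitionKey; // null here models the malformed state from the report
        final int remaining;

        PagingState(ByteBuffer partitionKey, int remaining) {
            this.partitionKey = partitionKey;
            this.remaining = remaining;
        }
    }

    /**
     * Returns the partition key to resume from, or null when no paging state
     * was sent (first page). A state without a key is rejected with a
     * descriptive error instead of being passed on to hashing code.
     */
    static ByteBuffer resumeKey(PagingState state) {
        if (state == null) {
            return null; // first page: nothing to resume from
        }
        if (state.partitionKey == null) {
            throw new InvalidPagingStateException(
                "Paging state has no partition key; it may be corrupt or come "
                + "from an incompatible driver or node version");
        }
        // duplicate() gives the caller its own position/limit over the same bytes
        return state.partitionKey.duplicate();
    }
}
```

In this sketch the check sits at the point where the key is first read, so the caller either gets a usable key or an exception that can be mapped to a client-visible error. Where such a check would actually belong in Cassandra is a question for the ticket.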