[ 
https://issues.apache.org/jira/browse/CASSANDRA-15199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878488#comment-16878488
 ] 

Sam Tunnicliffe commented on CASSANDRA-15199:
---------------------------------------------

Just for reference, here is the stacktrace from the Kong issue. It looks like 
the {PagingState} contains a null partition key, so this is potentially an 
issue with the Lua client, but I seem to recall seeing similar stacktraces 
when looking into paging issues in mixed-version (2.1/3.0) clusters during 
upgrades.

{code}
WARN  [ReadStage-6] 2019-05-31 01:08:07,971 ReadCommand.java:569 - Read 1000 live rows and 1003 tombstone cells for query SELECT * FROM kong_dev.routes WHERE partition = routes LIMIT 1000 (see tombstone_warn_threshold)
ERROR [Native-Transport-Requests-4] 2019-05-31 01:08:07,997 QueryMessage.java:129 - Unexpected error during query
java.lang.NullPointerException: null
        at org.apache.cassandra.dht.Murmur3Partitioner.getHash(Murmur3Partitioner.java:230) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.dht.Murmur3Partitioner.decorateKey(Murmur3Partitioner.java:66) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.config.CFMetaData.decorateKey(CFMetaData.java:666) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.service.pager.PartitionRangeQueryPager.<init>(PartitionRangeQueryPager.java:44) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.db.PartitionRangeReadCommand.getPager(PartitionRangeReadCommand.java:269) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.cql3.statements.SelectStatement.getPager(SelectStatement.java:474) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:287) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:117) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:225) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:256) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:241) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:566) [apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) [apache-cassandra-3.11.4.jar:3.11.4]
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_181]
        at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:114) [apache-cassandra-3.11.4.jar:3.11.4]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]

ERROR [Native-Transport-Requests-4] 2019-05-31 01:08:07,997 ErrorMessage.java:384 - Unexpected exception during request
java.lang.NullPointerException: null
        at org.apache.cassandra.dht.Murmur3Partitioner.getHash(Murmur3Partitioner.java:230) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.dht.Murmur3Partitioner.decorateKey(Murmur3Partitioner.java:66) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.config.CFMetaData.decorateKey(CFMetaData.java:666) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.service.pager.PartitionRangeQueryPager.<init>(PartitionRangeQueryPager.java:44) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.db.PartitionRangeReadCommand.getPager(PartitionRangeReadCommand.java:269) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.cql3.statements.SelectStatement.getPager(SelectStatement.java:474) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:287) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:117) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:225) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:256) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:241) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.transport.messages.QueryMessage.execute(QueryMessage.java:116) ~[apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:566) [apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.transport.Message$Dispatcher.channelRead0(Message.java:410) [apache-cassandra-3.11.4.jar:3.11.4]
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at io.netty.channel.AbstractChannelHandlerContext.access$600(AbstractChannelHandlerContext.java:35) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at io.netty.channel.AbstractChannelHandlerContext$7.run(AbstractChannelHandlerContext.java:348) [netty-all-4.0.44.Final.jar:4.0.44.Final]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_181]
        at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) [apache-cassandra-3.11.4.jar:3.11.4]
        at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:114) [apache-cassandra-3.11.4.jar:3.11.4]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
{code}
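
A minimal sketch of the failure mode suggested by the trace, assuming the partition key taken from the client-supplied paging state reaches the partitioner without a null check. The names below (PagingStateCheck, State, validate) are hypothetical and are not Cassandra's actual classes or API. The idea is only that rejecting such a state up front would let the server answer the client with a descriptive error rather than an NPE.

```java
import java.nio.ByteBuffer;
import java.util.Optional;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class PagingStateCheck {

    // Simplified stand-in for a paging state decoded from a client request.
    static final class State {
        final ByteBuffer partitionKey; // null if the client sent a malformed state
        final int remaining;           // rows left to return for the query

        State(ByteBuffer partitionKey, int remaining) {
            this.partitionKey = partitionKey;
            this.remaining = remaining;
        }
    }

    // Returns a description of the problem, or empty if the state is usable.
    // A null state means "first page" and is treated as valid.
    static Optional<String> validate(State state) {
        if (state == null) {
            return Optional.empty();
        }
        if (state.partitionKey == null) {
            return Optional.of("Invalid paging state: partition key is null");
        }
        if (state.remaining < 0) {
            return Optional.of("Invalid paging state: negative remaining row count");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        State malformed = new State(null, 1000);
        // Prints the validation error instead of failing later with an NPE.
        System.out.println(validate(malformed).orElse("ok"));
    }
}
```

In this sketch a check like validate would run before the key is hashed, so a malformed state could be turned into a client-facing error message.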

> Cassandra throwing occasional NPE 3.11.x
> ----------------------------------------
>
>                 Key: CASSANDRA-15199
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15199
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Messaging/Client
>            Reporter: Jeremy Justus
>            Priority: Normal
>
> Hey folks, decided to raise an official Jira (never done one of these 
> before) about an issue we have found with Kong API Gateway, which leverages 
> Cassandra as our DB.
> We run a C* cluster in 2 close DCs, 6 C* nodes total, 3 in each DC. Data is 
> replicated across all nodes. We have found C* sometimes throws NPEs based on 
> the calls made by the lua-cassandra driver Kong leverages 
> ([https://github.com/thibaultcha/lua-cassandra]). Very specifically, it 
> seems to occur when attempting to page across multiple C* nodes. When 
> persistently paging against a single C* node, we can't reproduce the NPEs in 
> C*.
> The exact error C* throws, with its stack trace, can be seen here:
> [https://github.com/Kong/kong/issues/4194#issuecomment-497572751]
> And again when we tried to upgrade from 3.11.2 to 3.11.4 in hopes it was 
> already resolved:
> [https://github.com/Kong/kong/issues/4194#issuecomment-497590235]
> Same error, same line numbers, so the code must be the same in this portion.
>  
> Sample of our C* Config: 
> [https://github.com/Kong/kong/issues/4194#issuecomment-497595766]
> We discussed this in the ASF Slack as well:
> [https://the-asf.slack.com/archives/CJZLTM05A/p1559321422028200]
> Some of the more important technical comments I saw people posting here:
> [https://github.com/Kong/kong/issues/4194#issuecomment-497858824]
>  
> I am not a C* DBA, so I don't have an exact repro for you other than stating 
> what the client application was attempting to do (paging across multiple C* 
> nodes within a DC) when we could see the failures. Do any Apache C* folks 
> think they see the issue, or could someone drop me a C* JAR with extra 
> debugging print statements that I could run in dev to help feed you more 
> info? Or, if you see the problem and can one-shot a fix so that there is no 
> more NPE and Cassandra responds to the client with some sort of error 
> message about what was wrong, that would be insightful.
>  
> Thanks!


