[ 
https://issues.apache.org/jira/browse/CASSANDRA-13407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15960470#comment-15960470
 ] 

Alex Petrov edited comment on CASSANDRA-13407 at 4/10/17 7:55 AM:
------------------------------------------------------------------

I was able to gather a bit more information on the issue, and it confirms 
what you're saying: the failure can be reproduced locally by tweaking timeouts 
(in particular, making the gossip interval shorter to emulate the slow VM). 

{code}
INFO  [GossipTasks:1] 2017-04-03 23:05:53,433 Gossiper.java:810 - FatClient /127.0.0.4 has been silent for 1000ms, removing from gossip
DEBUG [GossipTasks:1] 2017-04-03 23:05:53,436 Gossiper.java:432 - removing endpoint /127.0.0.4
DEBUG [GossipTasks:1] 2017-04-03 23:05:53,436 Gossiper.java:407 - evicting /127.0.0.4 from gossip
{code}

After that we can get an NPE either in {{Gossiper#getHostId}} or 
{{StorageService#isStatus}}. 
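
To illustrate the failure mode only, here is a minimal Java sketch, assuming a 
simplified endpoint state map. The names {{EvictionRaceSketch}}, 
{{EndpointState}}, {{getHostIdUnguarded}} and {{getHostIdGuarded}} are 
hypothetical and are not the actual code in {{Gossiper}}. The null guard is 
shown for contrast and is not necessarily what the patch does.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for gossip endpoint state.
// This is NOT Cassandra's Gossiper; it only mimics the lookup pattern.
class EvictionRaceSketch {
    static final class EndpointState {
        final UUID hostId;

        EndpointState(UUID hostId) {
            this.hostId = hostId;
        }
    }

    // Assumed shape: endpoint address -> last known state.
    static final Map<String, EndpointState> endpointStateMap = new ConcurrentHashMap<>();

    // Dereferences the lookup result directly; throws NullPointerException
    // if the endpoint was evicted between registration and lookup.
    static UUID getHostIdUnguarded(String endpoint) {
        return endpointStateMap.get(endpoint).hostId;
    }

    // Checks for a missing entry and returns null instead of throwing.
    static UUID getHostIdGuarded(String endpoint) {
        EndpointState state = endpointStateMap.get(endpoint);
        return state == null ? null : state.hostId;
    }

    public static void main(String[] args) {
        String endpoint = "/127.0.0.4";
        endpointStateMap.put(endpoint, new EndpointState(UUID.randomUUID()));

        // Emulates the gossip task evicting a silent endpoint.
        endpointStateMap.remove(endpoint);

        System.out.println("guarded lookup: " + getHostIdGuarded(endpoint));
        try {
            getHostIdUnguarded(endpoint);
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup threw NullPointerException");
        }
    }
}
```

In this sketch the eviction is an explicit {{remove}} call; in the test it 
happens on the gossip task thread once the endpoint has been silent for longer 
than the shortened timeout.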

The patch for 2.2 and 3.0 is slightly different: if we do not initialise the 
schema there, we get the following error: 

{code}
    [junit] junit.framework.AssertionFailedError: []
    [junit]     at org.apache.cassandra.db.lifecycle.Tracker.getMemtableFor(Tracker.java:312)
    [junit]     at org.apache.cassandra.db.ColumnFamilyStore.apply(ColumnFamilyStore.java:1185)
    [junit]     at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:573)
    [junit]     at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:421)
    [junit]     at org.apache.cassandra.db.Mutation.apply(Mutation.java:210)
    [junit]     at org.apache.cassandra.db.Mutation.apply(Mutation.java:215)
    [junit]     at org.apache.cassandra.db.Mutation.apply(Mutation.java:224)
    [junit]     at org.apache.cassandra.cql3.statements.ModificationStatement.executeInternalWithoutCondition(ModificationStatement.java:566)
    [junit]     at org.apache.cassandra.cql3.statements.ModificationStatement.executeInternal(ModificationStatement.java:556)
    [junit]     at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:295)
    [junit]     at org.apache.cassandra.db.SystemKeyspace.updatePeerInfo(SystemKeyspace.java:712)
    [junit]     at org.apache.cassandra.service.StorageService.updatePeerInfo(StorageService.java:1801)
    [junit]     at org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:2014)
    [junit]     at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1669)
    [junit]     at org.apache.cassandra.Util.createInitialRing(Util.java:213)
    [junit]     at org.apache.cassandra.service.RemoveTest.setup(RemoveTest.java:77)
{code}

|[2.2|https://github.com/apache/cassandra/compare/2.2...ifesdjeen:13407-2.2]|[testall|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13407-2.2-testall/]|
|[3.0|https://github.com/apache/cassandra/compare/3.0...ifesdjeen:13407-3.0]|[testall|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13407-3.0-testall/]|
|[3.11|https://github.com/apache/cassandra/compare/3.11...ifesdjeen:13407-3.11]|[testall|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13407-3.11-testall/]|
|[trunk|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13407-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13407-trunk-testall/]|


> test failure at RemoveTest.testBadHostId
> ----------------------------------------
>
>                 Key: CASSANDRA-13407
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13407
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Alex Petrov
>            Assignee: Alex Petrov
>
> Example trace:
> {code}
> java.lang.NullPointerException
>       at org.apache.cassandra.gms.Gossiper.getHostId(Gossiper.java:881)
>       at org.apache.cassandra.gms.Gossiper.getHostId(Gossiper.java:876)
>       at org.apache.cassandra.service.StorageService.handleStateNormal(StorageService.java:2201)
>       at org.apache.cassandra.service.StorageService.onChange(StorageService.java:1855)
>       at org.apache.cassandra.Util.createInitialRing(Util.java:216)
>       at org.apache.cassandra.service.RemoveTest.setup(RemoveTest.java:89)
> {code} 
> [failure example|https://cassci.datastax.com/job/trunk_testall/1491/testReport/org.apache.cassandra.service/RemoveTest/testBadHostId/]
> [history|https://cassci.datastax.com/job/trunk_testall/lastCompletedBuild/testReport/org.apache.cassandra.service/RemoveTest/testBadHostId/history/]



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
