[ 
https://issues.apache.org/jira/browse/CASSANDRA-10104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14709072#comment-14709072
 ] 

Sam Tunnicliffe commented on CASSANDRA-10104:
---------------------------------------------

It seems like the problem is that {{Schema#getCFMetaData}} only fetches the 
{{KSMetaData}} from schema to get the {{CFMetaData}}, but during 
{{LegacySchemaTable#mergeSchema}} the {{KSMetaData}} is added in 
{{mergeKeyspaces}} before the {{ColumnFamilyStore}} is initialised in 
{{mergeTables}}. So, there's a window in which the {{CFMetaData}} is 
retrievable, but the {{ColumnFamilyStore}} isn't yet. Could we solve this by 
just changing the condition checking the table to actually retrieve the {{CFS}} 
rather than the {{CFM}}? In the case where we race, this would cause us to 
attempt a redundant {{addTable}}, but that will be harmless.

{code}
@@ -1029,7 +1029,7 @@ public class StorageService extends 
NotificationBroadcasterSupport implements IE
         // Also, the addKeyspace above can be racy if multiple nodes are 
started
         // concurrently - see CASSANDRA-9201
         for (Map.Entry<String, CFMetaData> table : 
AuthKeyspace.definition().cfMetaData().entrySet())
-            if (Schema.instance.getCFMetaData(AuthKeyspace.NAME, 
table.getKey()) == null)
+            if (Schema.instance.getCF(table.getValue().cfId) == null)
                 maybeAddTable(table.getValue());
{code}

I don't think we need the check for existence of the Keyspace because we know 
both the KS and CF will be created ultimately, we just need to make sure we 
don't progress to the role manager setup until they are and waiting for the 
table ensures that.

> Windows dtest 3.0: jmx_test.py:TestJMX.netstats_test fails
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-10104
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10104
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Joshua McKenzie
>            Assignee: Paulo Motta
>              Labels: Windows
>             Fix For: 3.0.x
>
>
> {noformat}
> Unexpected error in node1 node log: ['ERROR [HintedHandoff:2] 2015-08-16 
> 23:14:04,419 CassandraDaemon.java:191 - Exception in thread 
> Thread[HintedHandoff:2,1,main] 
> org.apache.cassandra.exceptions.WriteFailureException: Operation failed - 
> received 0 responses and 1 failures \tat 
> org.apache.cassandra.service.AbstractWriteResponseHandler.get(AbstractWriteResponseHandler.java:106)
>  ~[main/:na] \tat 
> org.apache.cassandra.db.HintedHandOffManager.checkDelivered(HintedHandOffManager.java:358)
>  ~[main/:na] \tat 
> org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:414)
>  ~[main/:na] \tat 
> org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:346)
>  ~[main/:na] \tat 
> org.apache.cassandra.db.HintedHandOffManager.access$400(HintedHandOffManager.java:91)
>  ~[main/:na] \tat 
> org.apache.cassandra.db.HintedHandOffManager$5.run(HintedHandOffManager.java:537)
>  ~[main/:na] \tat 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  ~[na:1.8.0_45] \tat 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  ~[na:1.8.0_45] \tat java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_45]']
> -------------------- >> begin captured logging << --------------------
> dtest: DEBUG: cluster ccm directory: d:\temp\dtest-j1ttp3
> dtest: DEBUG: Nodetool command 
> 'D:\jenkins\workspace\cassandra-3.0_dtest_win32\cassandra\bin\nodetool.bat -h 
> localhost -p 7100 netstats' failed; exit status: 1; stdout: Starting NodeTool
> ; stderr: nodetool: Failed to connect to 'localhost:7100' - ConnectException: 
> 'Connection refused: connect'.
> dtest: DEBUG: removing ccm cluster test at: d:\temp\dtest-j1ttp3
> dtest: DEBUG: clearing ssl stores from [d:\temp\dtest-j1ttp3] directory
> --------------------- >> end captured logging << ---------------------
> {noformat}
> Failure history: 
> [consistent|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/17/testReport/junit/jmx_test/TestJMX/netstats_test/history/].
>  Looks to have regressed on build 
> [#5|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest_win32/5/]
>  which seems unlikely given the commit.
> Env: Both, though on a local run the test fails due to:
> {noformat}
> Traceback (most recent call last):
>   File "c:\src\cassandra-dtest\dtest.py", line 532, in tearDown
>     raise AssertionError('Unexpected error in %s node log: %s' % (node.name, 
> errors))
> AssertionError: Unexpected error in node1 node log: ['ERROR [main] 2015-08-17 
> 15:42:07,717 NoSpamLogger.java:97 - This platform does not support atomic 
> directory streams (SecureDirectoryStream); race conditions when loading 
> sstable files could occurr', 'ERROR [main] 2015-08-17 15:50:43,978 
> NoSpamLogger.java:97 - This platform does not support atomic directory 
> streams (SecureDirectoryStream); race conditions when loading sstable files 
> could occurr']
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to