[jira] [Updated] (CASSANDRA-17456) Test Failures: write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Sorokoumov updated CASSANDRA-17456:
    Fix Version/s: 4.1 (was: 4.x)

> Test Failures: write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation
> -----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17456
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17456
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CI
>            Reporter: Ekaterina Dimitrova
>            Assignee: Aleksandr Sorokoumov
>            Priority: Normal
>             Fix For: 4.1
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> https://ci-cassandra.apache.org/job/Cassandra-trunk/1002/testReport/dtest-offheap.write_failures_test/TestMultiDCWriteFailures/test_oversized_mutation/
> {code:java}
> Error Message
> AssertionError: assert 0 == 8
>  +  where 8 = <bound method JolokiaAgent.read_attribute of <JolokiaAgent object at 0x7f1fca78dac0>>('org.apache.cassandra.metrics:type=Storage,name=TotalHints', 'Count')
>  +  and 'org.apache.cassandra.metrics:type=Storage,name=TotalHints' = make_mbean('metrics', type='Storage', name='TotalHints')
>
> Stacktrace
>     def test_oversized_mutation(self):
>         """
>         Test that multi-DC write failures return operation failed rather than a timeout.
>         @jira_ticket CASSANDRA-16334.
>         """
>         cluster = self.cluster
>         cluster.populate([2, 2])
>         cluster.set_configuration_options(values={'max_mutation_size_in_kb': 128})
>         cluster.start()
>
>         node1 = cluster.nodelist()[0]
>         session = self.patient_exclusive_cql_connection(node1)
>
>         session.execute("CREATE KEYSPACE k WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 2, 'dc2': 2}")
>         session.execute("CREATE TABLE k.t (key int PRIMARY KEY, val blob)")
>
>         payload = '1' * 1024 * 256
>         query = "INSERT INTO k.t (key, val) VALUES (1, textAsBlob('{}'))".format(payload)
>
>         assert_write_failure(session, query, ConsistencyLevel.LOCAL_ONE)
>         assert_write_failure(session, query, ConsistencyLevel.ONE)
>
>         # verify that no hints are created
>         with JolokiaAgent(node1) as jmx:
>             assert 0 == jmx.read_attribute(make_mbean('metrics', type='Storage', name='TotalHints'), 'Count')
> E       AssertionError: assert 0 == 8
>
> write_failures_test.py:277: AssertionError
> {code}

--
This message was sent by Atlassian Jira (v8.20.7#820007)
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
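For context on the failure above, the size arithmetic in the test is worth spelling out: the 256 KiB payload is deliberately twice the configured 128 KiB mutation limit, so the coordinator is expected to fail the write outright rather than time out and write hints. A minimal Python sketch — the constants mirror the test's values, nothing here comes from Cassandra itself:

```python
# Constants mirror the dtest configuration above; this is an illustration,
# not Cassandra code.
MAX_MUTATION_SIZE_IN_KB = 128      # cluster option set by the test
payload = '1' * 1024 * 256         # the test's payload: 256 KiB of '1' characters

payload_kb = len(payload) // 1024
print(payload_kb)                  # 256

# The payload alone already exceeds the limit, before any row/cell overhead,
# so both INSERTs must fail and the TotalHints counter should stay at 0.
assert payload_kb > MAX_MUTATION_SIZE_IN_KB
```

The asserted count of 8 in the failure (rather than 0) indicates hints were recorded for the remote replicas, which is exactly the behavior the ticket addresses.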
[jira] [Updated] (CASSANDRA-17456) Test Failures: write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Sorokoumov updated CASSANDRA-17456:
    Status: Ready to Commit (was: Review In Progress)
[jira] [Updated] (CASSANDRA-17456) Test Failures: write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Sorokoumov updated CASSANDRA-17456:
    Status: Review In Progress (was: Needs Committer)
[jira] [Updated] (CASSANDRA-17456) Test Failures: write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Sorokoumov updated CASSANDRA-17456:
    Since Version: 4.1
    Source Control Link: Committed as 9f3bc657273dfa9e20d233636adf662904f01f34 to 4.1 and 11bdf1bf8038fa7f872fe9161a0568d023e6cfac to trunk.
    Resolution: Fixed
    Status: Resolved (was: Ready to Commit)
[jira] [Commented] (CASSANDRA-17456) Test Failures: write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17533708#comment-17533708 ]

Aleksandr Sorokoumov commented on CASSANDRA-17456:

Committed as [9f3bc657273dfa9e20d233636adf662904f01f34|https://github.com/apache/cassandra/commit/9f3bc657273dfa9e20d233636adf662904f01f34] to 4.1 and [11bdf1bf8038fa7f872fe9161a0568d023e6cfac|https://github.com/apache/cassandra/commit/11bdf1bf8038fa7f872fe9161a0568d023e6cfac] to trunk.
[jira] [Commented] (CASSANDRA-17456) Test Failures: write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17532703#comment-17532703 ]

Aleksandr Sorokoumov commented on CASSANDRA-17456:

CI looks good. I am moving the issue to Ready to Commit.
[jira] [Updated] (CASSANDRA-17456) Test Failures: write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Sorokoumov updated CASSANDRA-17456:
    Status: Needs Committer (was: Patch Available)
[jira] [Commented] (CASSANDRA-17456) Test Failures: write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17530386#comment-17530386 ]

Aleksandr Sorokoumov commented on CASSANDRA-17456:

I've put the size validation back into CommitLog#add and added the NEWS entry.
[jira] [Commented] (CASSANDRA-17456) Test Failures: write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17529899#comment-17529899 ]

Aleksandr Sorokoumov commented on CASSANDRA-17456:

Looking at the CommitLog code, it will allocate a new segment if the mutation does not fit in the current segment ([link|https://github.com/apache/cassandra/blob/7ce140bd1dea311b9f98cdfbcd07dcff9fbd457c/src/java/org/apache/cassandra/db/commitlog/CommitLogSegmentManagerStandard.java#L52-L57]). This effectively gives us the desired behavior as long as a single mutation fits in a segment, or am I missing something? The one corner case I can think of is a mutation larger than an entire segment; I can add an assertion for that, just to be on the safe side.
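The allocation behavior described in the comment above can be sketched as follows. This is a simplified Python model of the linked Java logic, with invented names and sizes; the real segment manager is considerably more involved:

```python
# Simplified model of commit-log segment allocation: advance to a fresh
# segment when the mutation does not fit, and treat a mutation larger than
# an entire segment as impossible to allocate (the corner case noted above).
SEGMENT_SIZE = 32 * 1024 * 1024  # hypothetical segment size in bytes

class Segment:
    def __init__(self, size=SEGMENT_SIZE):
        self.remaining = size

    def try_allocate(self, n):
        """Reserve n bytes if they fit in this segment."""
        if n <= self.remaining:
            self.remaining -= n
            return True
        return False

def allocate(mutation_size, segments):
    if mutation_size > SEGMENT_SIZE:
        # a mutation bigger than a whole segment can never succeed
        raise ValueError("mutation larger than a commit log segment")
    for segment in segments:  # "advance" through fresh segments as needed
        if segment.try_allocate(mutation_size):
            return segment
    raise RuntimeError("ran out of segments")
```

Under this model, any mutation up to a full segment in size eventually lands in some segment — the "desired behavior" the comment refers to — and only the larger-than-a-segment case needs an explicit guard.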
[jira] [Updated] (CASSANDRA-17456) Test Failures: write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Sorokoumov updated CASSANDRA-17456:
    Test and Documentation Plan: Made the existing dtest applicable to C* versions up to 4.0.x and added an in-jvm dtest to cover rejection of oversized mutations on insert.
    Status: Patch Available (was: In Progress)

As Benedict suggested, I moved the mutation size check from the CommitLog to the client and internode connections. Patches:
* [17456-trunk|https://github.com/apache/cassandra/compare/trunk...Ge:17456-trunk?expand=1]
* [dtest|https://github.com/apache/cassandra-dtest/pull/186]

[Jenkins CI run|https://ci-cassandra.apache.org/job/Cassandra-devbranch/1626/#showFailuresLink]
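The intent of moving the size check to the client and internode connections can be illustrated with a hypothetical Python model — not the actual patch — in which an oversized mutation is rejected before replication or hinting, so the client receives a write failure and the TotalHints metric the dtest asserts on stays at zero:

```python
# Hypothetical model of coordinator-side validation; all names are invented.
MAX_MUTATION_SIZE = 128 * 1024  # bytes, mirroring max_mutation_size_in_kb: 128

class WriteFailure(Exception):
    """Stand-in for the write-failure error surfaced to the client."""

hints_written = 0  # stand-in for the Storage TotalHints JMX counter

def coordinate_write(serialized_size):
    # Validate before any replica interaction: a rejected mutation can
    # never leave a hint behind, unlike a mid-write timeout.
    if serialized_size > MAX_MUTATION_SIZE:
        raise WriteFailure("mutation of %d bytes exceeds the %d byte limit"
                           % (serialized_size, MAX_MUTATION_SIZE))
    return "applied"
```

The key design point is the ordering: because validation happens before any replica or hint interaction, a failed write leaves no state behind, which is what the dtest's `TotalHints == 0` assertion checks.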
[jira] [Assigned] (CASSANDRA-17456) Test Failures: write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Sorokoumov reassigned CASSANDRA-17456:
    Assignee: Aleksandr Sorokoumov (was: Josh McKenzie)
[jira] [Comment Edited] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17494218#comment-17494218 ] Aleksandr Sorokoumov edited comment on CASSANDRA-16349 at 2/17/22, 8:43 PM: I've rebased the patch and the dtest. [~blerer] can you please review? Links: * [patch|https://github.com/apache/cassandra/compare/trunk...Ge:16349-streaming-sstableloader-4.0?expand=1] * [dtest|https://github.com/apache/cassandra-dtest/pull/151] was (Author: ge): I've rebased the patch and the dtest. [~blerer] can you please review? > SSTableLoader reports error when SSTable(s) do not have data for some nodes > --- > > Key: CASSANDRA-16349 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16349 > Project: Cassandra > Issue Type: Bug > Components: Tool/sstable >Reporter: Serban Teodorescu >Assignee: Serban Teodorescu >Priority: Normal > Fix For: 4.0.x, 4.x > > Time Spent: 20m > Remaining Estimate: 0h > > Running SSTableLoader in verbose mode will show error(s) if there are node(s) > that do not own any data from the SSTable(s). This can happen in at least 2 > cases: > # SSTableLoader is used to stream backups while keeping the same token ranges > # SSTable(s) are created with CQLSSTableWriter to match token ranges (this > can bring better performance by using ZeroCopy streaming) > Partial output of the SSTableLoader: > {quote}ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] > Remote peer /127.0.0.4:7000 failed stream session. > ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] Remote peer > /127.0.0.3:7000 failed stream session. 
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.515KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.427KiB/s) > {quote} > > Stack trace: > {quote}java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:533) > at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:99) > at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49) > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88) > at > com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056) > at > com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) > at > com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138) > at > com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:220) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:196) > at > 
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:505) > at > org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:819) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:595) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:844) > {quote} > To reproduce create a cluster with ccm with more nodes than the RF, put some > data into it copy a SSTable and stream it. > > The error originates on the nodes, the following stack trace is shown in the > logs: > {quote}java.lang.IllegalStateException: Stream hasn't been read yet > at > com.google.common.base.Preconditions.checkState(Preconditions.java:507) > at > org.apache.cassandra.db.streaming.CassandraIncomingFile.getSize(CassandraIncomingFile.java:96) > at >
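The progress output above shows the failing peers at `0:0/1 100%`: they received nothing because none of the SSTable's data falls in their token ranges, yet their sessions are reported as failed. A toy sketch of the distinction the fix needs to draw (hypothetical helper names; this is not the actual streaming code):

```python
def session_outcome(bytes_owned, files_received, files_planned):
    # Hypothetical sketch of the intended behavior: a peer that owns no
    # byte ranges from the SSTable(s) legitimately receives nothing and
    # should still count as a successful session, not a failed one.
    if bytes_owned == 0:
        return 'success'   # nothing to stream is not an error
    return 'success' if files_received == files_planned else 'failed'
```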
[jira] [Commented] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17494218#comment-17494218 ] Aleksandr Sorokoumov commented on CASSANDRA-16349: -- I've rebased the patch and the dtest. [~blerer] can you please review? > SSTableLoader reports error when SSTable(s) do not have data for some nodes > --- > > Key: CASSANDRA-16349 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16349 > Project: Cassandra > Issue Type: Bug > Components: Tool/sstable >Reporter: Serban Teodorescu >Assignee: Serban Teodorescu >Priority: Normal > Fix For: 4.0.x, 4.x > > Time Spent: 20m > Remaining Estimate: 0h > > Running SSTableLoader in verbose mode will show error(s) if there are node(s) > that do not own any data from the SSTable(s). This can happen in at least 2 > cases: > # SSTableLoader is used to stream backups while keeping the same token ranges > # SSTable(s) are created with CQLSSTableWriter to match token ranges (this > can bring better performance by using ZeroCopy streaming) > Partial output of the SSTableLoader: > {quote}ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] > Remote peer /127.0.0.4:7000 failed stream session. > ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] Remote peer > /127.0.0.3:7000 failed stream session. 
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.515KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.427KiB/s) > {quote} > > Stack trace: > {quote}java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:533) > at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:99) > at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49) > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88) > at > com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056) > at > com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) > at > com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138) > at > com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:220) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:196) > at > 
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:505) > at > org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:819) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:595) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:844) > {quote} > To reproduce create a cluster with ccm with more nodes than the RF, put some > data into it copy a SSTable and stream it. > > The error originates on the nodes, the following stack trace is shown in the > logs: > {quote}java.lang.IllegalStateException: Stream hasn't been read yet > at > com.google.common.base.Preconditions.checkState(Preconditions.java:507) > at > org.apache.cassandra.db.streaming.CassandraIncomingFile.getSize(CassandraIncomingFile.java:96) > at > org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:789) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:587) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189) > at >
[jira] [Commented] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17469088#comment-17469088 ] Aleksandr Sorokoumov commented on CASSANDRA-16349: -- [~e.dimitrova] Do you have spare cycles to review this patch? > SSTableLoader reports error when SSTable(s) do not have data for some nodes > --- > > Key: CASSANDRA-16349 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16349 > Project: Cassandra > Issue Type: Bug > Components: Tool/sstable >Reporter: Serban Teodorescu >Assignee: Serban Teodorescu >Priority: Normal > Fix For: 4.0.x, 4.x > > Time Spent: 20m > Remaining Estimate: 0h > > Running SSTableLoader in verbose mode will show error(s) if there are node(s) > that do not own any data from the SSTable(s). This can happen in at least 2 > cases: > # SSTableLoader is used to stream backups while keeping the same token ranges > # SSTable(s) are created with CQLSSTableWriter to match token ranges (this > can bring better performance by using ZeroCopy streaming) > Partial output of the SSTableLoader: > {quote}ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] > Remote peer /127.0.0.4:7000 failed stream session. > ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] Remote peer > /127.0.0.3:7000 failed stream session. 
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.515KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.427KiB/s) > {quote} > > Stack trace: > {quote}java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:533) > at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:99) > at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49) > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88) > at > com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056) > at > com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) > at > com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138) > at > com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:220) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:196) > at > 
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:505) > at > org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:819) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:595) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:844) > {quote} > To reproduce create a cluster with ccm with more nodes than the RF, put some > data into it copy a SSTable and stream it. > > The error originates on the nodes, the following stack trace is shown in the > logs: > {quote}java.lang.IllegalStateException: Stream hasn't been read yet > at > com.google.common.base.Preconditions.checkState(Preconditions.java:507) > at > org.apache.cassandra.db.streaming.CassandraIncomingFile.getSize(CassandraIncomingFile.java:96) > at > org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:789) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:587) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189) > at >
[jira] [Updated] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-15215: - Reviewers: Benedict Elliott Smith, Branimir Lambov (was: Benedict Elliott Smith) > VIntCoding should read and write more efficiently > - > > Key: CASSANDRA-15215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15215 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction, Local/SSTable >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.x > > Attachments: testWriteRandomLongDOP_final.png, > writeUnsignedVInt_megamorphic_BB.png, writeUnsignedVInt_megamorphic_DOP.png > > Time Spent: 40m > Remaining Estimate: 0h > > Most vints occupy significantly fewer than 8 bytes, and most buffers have >= > 8 bytes spare, in which case we can construct the relevant bytes in a > register and memcpy them to the correct position. Since we read and write a > lot of vints, this waste is probably measurable, particularly during > compaction and flush, and can probably be considered a performance bug. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
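For reference, the encoded size of a Cassandra-style unsigned vint is determined purely by the magnitude of the value: the first byte carries N leading 1-bits announcing N extra bytes. Since most values land in the 1-2 byte range while buffers usually have 8 or more bytes spare, assembling the encoded bytes in a long register and storing them in one copy is exactly the win the ticket describes. A Python port of the branch-free size computation (assumed here to mirror `VIntCoding.computeUnsignedVIntSize`; treat it as a sketch):

```python
def unsigned_vint_size(value):
    # 64-bit leading-zero count of (value | 1); the OR keeps value = 0
    # at magnitude 63 so it encodes in a single byte.
    magnitude = 64 - (value | 1).bit_length()
    # Branch-free mapping of magnitude -> encoded size in bytes (1..9).
    return (639 - magnitude * 9) >> 6
```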
[jira] [Commented] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17468706#comment-17468706 ] Aleksandr Sorokoumov commented on CASSANDRA-15215: -- I added [~blambov] as a reviewer as he approved the PR to trunk. [~benedict] is there anything I can do to facilitate the merge? > VIntCoding should read and write more efficiently > - > > Key: CASSANDRA-15215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15215 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction, Local/SSTable >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.x > > Attachments: testWriteRandomLongDOP_final.png, > writeUnsignedVInt_megamorphic_BB.png, writeUnsignedVInt_megamorphic_DOP.png > > Time Spent: 40m > Remaining Estimate: 0h > > Most vints occupy significantly fewer than 8 bytes, and most buffers have >= > 8 bytes spare, in which case we can construct the relevant bytes in a > register and memcpy them to the correct position. Since we read and write a > lot of vints, this waste is probably measurable, particularly during > compaction and flush, and can probably be considered a performance bug. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17459045#comment-17459045 ] Aleksandr Sorokoumov commented on CASSANDRA-15215: -- No worries, CQLConnectionTest failures indeed looked suspicious. I agree with your commit and am looking forward to green CI and merge :) > VIntCoding should read and write more efficiently > - > > Key: CASSANDRA-15215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15215 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction, Local/SSTable >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.x > > Attachments: testWriteRandomLongDOP_final.png, > writeUnsignedVInt_megamorphic_BB.png, writeUnsignedVInt_megamorphic_DOP.png > > Time Spent: 40m > Remaining Estimate: 0h > > Most vints occupy significantly fewer than 8 bytes, and most buffers have >= > 8 bytes spare, in which case we can construct the relevant bytes in a > register and memcpy them to the correct position. Since we read and write a > lot of vints, this waste is probably measurable, particularly during > compaction and flush, and can probably be considered a performance bug. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17457638#comment-17457638 ] Aleksandr Sorokoumov commented on CASSANDRA-15215: -- I looked into the most recent test failures and I am fairly convinced that none of them are caused by this patch: * [CQLConnectionTest test failures |https://app.circleci.com/pipelines/github/belliottsmith/cassandra/216/workflows/9b2ff75d-d2fd-47ad-a4d6-a407a649780c/jobs/5659/tests#failed-test-2] - there are recent bug reports regarding this test suite failing in various ways - CASSANDRA-16677 as an "aggregate issue" and a number of linked duplicates. One example is [this build|https://app.circleci.com/pipelines/github/dcapwell/cassandra/1037/workflows/c728d370-49b9-41aa-bdfb-8c41cf0355d8/jobs/6577/tests] from CASSANDRA-16949 that has exactly the same failures. * [TestClientRequestMetrics|https://app.circleci.com/pipelines/github/belliottsmith/cassandra/216/workflows/418d4b46-8d8b-41df-ad80-06f377593caf/jobs/5646/tests#failed-test-0] - was also observed in https://issues.apache.org/jira/browse/CASSANDRA-15234?focusedCommentId=17454221=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17454221. * [MessagingServiceTest |https://app.circleci.com/pipelines/github/belliottsmith/cassandra/216/workflows/418d4b46-8d8b-41df-ad80-06f377593caf/jobs/5637] - CASSANDRA-17033. 
> VIntCoding should read and write more efficiently > - > > Key: CASSANDRA-15215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15215 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction, Local/SSTable >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.x > > Attachments: testWriteRandomLongDOP_final.png, > writeUnsignedVInt_megamorphic_BB.png, writeUnsignedVInt_megamorphic_DOP.png > > Time Spent: 40m > Remaining Estimate: 0h > > Most vints occupy significantly fewer than 8 bytes, and most buffers have >= > 8 bytes spare, in which case we can construct the relevant bytes in a > register and memcpy them to the correct position. Since we read and write a > lot of vints, this waste is probably measurable, particularly during > compaction and flush, and can probably be considered a performance bug. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17454582#comment-17454582 ] Aleksandr Sorokoumov commented on CASSANDRA-15215: -- I'll fix the last test failures on the weekend. Enjoy your holiday :) > VIntCoding should read and write more efficiently > - > > Key: CASSANDRA-15215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15215 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction, Local/SSTable >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.x > > Attachments: testWriteRandomLongDOP_final.png, > writeUnsignedVInt_megamorphic_BB.png, writeUnsignedVInt_megamorphic_DOP.png > > Time Spent: 40m > Remaining Estimate: 0h > > Most vints occupy significantly fewer than 8 bytes, and most buffers have >= > 8 bytes spare, in which case we can construct the relevant bytes in a > register and memcpy them to the correct position. Since we read and write a > lot of vints, this waste is probably measurable, particularly during > compaction and flush, and can probably be considered a performance bug. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17453616#comment-17453616 ] Aleksandr Sorokoumov commented on CASSANDRA-15215: -- The issue was caused by the slow path in {{BufferedDataOutputStreamPlus#writeBytes}} when the underlying buffer has less than 8 bytes remaining. Previously, this method fell back to {{writeSlow}}. This was not correct because it writes N least significant bytes to the wire. As {{writeBytes}} treats the register as an optimized version of a byte array, it should write N most significant bytes instead. I added a test case that isolates the issue and fixed it in all branches. [~benedict] can you please re-run the CI? > VIntCoding should read and write more efficiently > - > > Key: CASSANDRA-15215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15215 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction, Local/SSTable >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.x > > Attachments: testWriteRandomLongDOP_final.png, > writeUnsignedVInt_megamorphic_BB.png, writeUnsignedVInt_megamorphic_DOP.png > > Time Spent: 40m > Remaining Estimate: 0h > > Most vints occupy significantly fewer than 8 bytes, and most buffers have >= > 8 bytes spare, in which case we can construct the relevant bytes in a > register and memcpy them to the correct position. Since we read and write a > lot of vints, this waste is probably measurable, particularly during > compaction and flush, and can probably be considered a performance bug. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
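The distinction the comment describes is easy to get wrong: when a 64-bit register stands in for a byte array, its first N logical bytes are the N *most* significant bytes, while the broken slow path emitted the N *least* significant ones. A small illustration in plain Python (not the actual `BufferedDataOutputStreamPlus` code):

```python
import struct

REGISTER = 0x1122334455667788  # register viewed as bytes 11 22 ... 88

def top_bytes(register, n):
    # Correct: the register is an optimized byte array, so its first n
    # bytes are the n most significant bytes (big-endian view).
    return struct.pack('>Q', register)[:n]

def bottom_bytes(register, n):
    # The bug: the slow path wrote the n least significant bytes.
    return struct.pack('>Q', register)[-n:]
```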
[jira] [Commented] (CASSANDRA-17169) Flaky RecomputingSupplierTest
[ https://issues.apache.org/jira/browse/CASSANDRA-17169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17452291#comment-17452291 ] Aleksandr Sorokoumov commented on CASSANDRA-17169: -- The patch makes sense to me and seems to fix the test. +1 to merge it. > Flaky RecomputingSupplierTest > - > > Key: CASSANDRA-17169 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17169 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0.x, 4.x > > > See > https://ci-cassandra.apache.org/job/Cassandra-4.0/293/testReport/junit/org.apache.cassandra.utils/RecomputingSupplierTest/recomputingSupplierTest/ > {noformat} > java.util.concurrent.TimeoutException > at > java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886) > at > java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021) > at > org.apache.cassandra.utils.RecomputingSupplier.get(RecomputingSupplier.java:110) > at > org.apache.cassandra.utils.RecomputingSupplierTest.recomputingSupplierTest(RecomputingSupplierTest.java:120) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {noformat} -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
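For readers unfamiliar with the class under test: a recomputing supplier caches a computed value and lazily recomputes it once marked stale. A minimal single-threaded sketch of that contract (the real `RecomputingSupplier` is concurrent, which is exactly where the flaky timeout lived):

```python
class RecomputingSupplier:
    """Single-threaded sketch of the caching contract."""

    def __init__(self, compute):
        self._compute = compute
        self._value = None
        self._stale = True

    def recompute(self):
        # Mark the cached value stale; the next get() recomputes it.
        self._stale = True

    def get(self):
        if self._stale:
            self._value = self._compute()
            self._stale = False
        return self._value
```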
[jira] [Commented] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17451023#comment-17451023 ] Aleksandr Sorokoumov commented on CASSANDRA-15215: -- I'll fix test failures closer to the end of the week, probably on the weekend. > VIntCoding should read and write more efficiently > - > > Key: CASSANDRA-15215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15215 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction, Local/SSTable >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.x > > Attachments: testWriteRandomLongDOP_final.png, > writeUnsignedVInt_megamorphic_BB.png, writeUnsignedVInt_megamorphic_DOP.png > > Time Spent: 40m > Remaining Estimate: 0h > > Most vints occupy significantly fewer than 8 bytes, and most buffers have >= > 8 bytes spare, in which case we can construct the relevant bytes in a > register and memcpy them to the correct position. Since we read and write a > lot of vints, this waste is probably measurable, particularly during > compaction and flush, and can probably be considered a performance bug. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17142) Limit the maximum hints size per host
[ https://issues.apache.org/jira/browse/CASSANDRA-17142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449856#comment-17449856 ] Aleksandr Sorokoumov edited comment on CASSANDRA-17142 at 11/27/21, 3:19 PM: - This change can fit nicely in the Guardrails framework, similar to CASSANDRA-17150. was (Author: ge): This change can fit nicely in the Guardrails framework, similarly to https://issues.apache.org/jira/browse/CASSANDRA-17150. > Limit the maximum hints size per host > - > > Key: CASSANDRA-17142 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17142 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Hints >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > > The hints system defines a time window, i.e. max_hint_window_in_ms, to store > the hints. > It defines no limit on how much data can be kept during the time window. The > hints can grow excessively and make the node run out of disk. In such a > scenario, the operators have to truncate the hints manually. > I'd propose that in addition to the conventional hints window, operators > should be able to define the maximum hints size per host, i.e. > max_hints_size_per_host_in_mb, to provide another layer of protection. A > node stops storing hints for the down node whenever it reaches either the time > cap or the size cap. In order not to surprise the users, the config should be > disabled by default. It should also be configurable via JMX. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
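The proposal layers a size cap on top of the existing time window: a node hints to a down peer only while both caps hold. A sketch of the combined check (hypothetical function and parameter names; `max_hints_size_per_host_in_mb` is the option proposed in the ticket, disabled by default):

```python
def should_store_hint(down_for_ms, hinted_bytes,
                      max_hint_window_in_ms,
                      max_hints_size_per_host_in_mb=0):
    # Time cap: stop hinting once the peer has been down past the window.
    if down_for_ms > max_hint_window_in_ms:
        return False
    # Size cap: 0 means disabled, matching the proposed default.
    if (max_hints_size_per_host_in_mb > 0
            and hinted_bytes >= max_hints_size_per_host_in_mb * 1024 * 1024):
        return False
    return True
```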
[jira] [Commented] (CASSANDRA-17142) Limit the maximum hints size per host
[ https://issues.apache.org/jira/browse/CASSANDRA-17142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449856#comment-17449856 ] Aleksandr Sorokoumov commented on CASSANDRA-17142: -- This change can fit nicely in the Guardrails framework, similarly to https://issues.apache.org/jira/browse/CASSANDRA-17150. > Limit the maximum hints size per host > - > > Key: CASSANDRA-17142 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17142 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Hints >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > > The hints system defines a time window, i.e. max_hint_window_in_ms, to store > the hints. > It defines no limit on how much data can be kept during the time window. The > hints can grow excessively and make the node run out of disk. In such a > scenario, the operators have to truncate the hints manually. > I'd propose that in addition to the conventional hints window, operators > should be able to define the maximum hints size per host, i.e. > max_hints_size_per_host_in_mb, to provide another layer of protection. A > node stops storing hints for the down node whenever it reaches either the time > cap or the size cap. In order not to surprise the users, the config should be > disabled by default. It should also be configurable via JMX. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16840) Close native transport port before hint transfer during decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-16840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16840: - Reviewers: Aleksandr Sorokoumov, Brandon Williams (was: Brandon Williams) > Close native transport port before hint transfer during decommission > > > Key: CASSANDRA-16840 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16840 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Hints >Reporter: Matt Fleming >Assignee: Matt Fleming >Priority: Normal > Fix For: 4.x > > > New hints can be generated on a node when it's decommissioning which is a > problem if the node has already started hint transfer because any hints that > come in after the transfer has begun will remain on-disk and not be > transferred to a peer. > You can work around this problem by manually closing the native transport > port before starting the decommission with {{nodetool disablebinary}} but it > feels like something we might want to do automatically. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
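The ordering the ticket asks for can be made concrete with a toy model: close the client port before hint transfer begins, so no new hints can be generated mid-transfer. All names below are hypothetical; this only illustrates the sequencing, not Cassandra's actual decommission path:

```python
class DecommissioningNode:
    def __init__(self):
        self.steps = []

    def disable_binary(self):
        # Close the native transport port first, so clients can no
        # longer trigger writes (and thus new hints) on this node.
        self.steps.append('disablebinary')

    def transfer_hints(self):
        self.steps.append('transfer_hints')

    def decommission(self):
        self.disable_binary()   # the automatic step the patch adds
        self.transfer_hints()
        self.steps.append('stream_ranges')
        self.steps.append('leave_ring')

node = DecommissioningNode()
node.decommission()
# Hint transfer only starts after client traffic is shut off.
assert node.steps.index('disablebinary') < node.steps.index('transfer_hints')
```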
[jira] [Commented] (CASSANDRA-16840) Close native transport port before hint transfer during decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-16840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449855#comment-17449855 ]

Aleksandr Sorokoumov commented on CASSANDRA-16840:
--------------------------------------------------

Hey Matt! This patch looks good to me. In my opinion, the interaction between {{nodetool decommission}} and transferring and creating new hints is subtle enough to benefit from a (d)test. WDYT? Please let me know if you need a hand with it.
[jira] [Updated] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-15215: - Test and Documentation Plan: Added unit tests for new methods and benchmarks to show performance improvements. Status: Patch Available (was: In Progress) > VIntCoding should read and write more efficiently > - > > Key: CASSANDRA-15215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15215 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction, Local/SSTable >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.x > > Attachments: testWriteRandomLongDOP_final.png, > writeUnsignedVInt_megamorphic_BB.png, writeUnsignedVInt_megamorphic_DOP.png > > Time Spent: 40m > Remaining Estimate: 0h > > Most vints occupy significantly fewer than 8 bytes, and most buffers have >= > 8 bytes spare, in which case we can construct the relevant bytes in a > register and memcpy them to the correct position. Since we read and write a > lot of vints, this waste is probably measurable, particularly during > compaction and flush, and can probably be considered a performance bug. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17449488#comment-17449488 ]

Aleksandr Sorokoumov commented on CASSANDRA-15215:
--------------------------------------------------

While working on the read path I realized that it has already had the optimization we discussed since CASSANDRA-8630 - https://github.com/apache/cassandra/blob/951d72cd929d1f6c9329becbdd7604a9e709587b/src/java/org/apache/cassandra/io/util/RebufferingInputStream.java#L239-L268.

Since the last update I have added new test cases to {{VIntCodingTest}} to cover buffered and unbuffered reads and writes, and extended {{DataOutputTest}} to cover {{DataOutputPlus#writeBytes}}.

Patches:
* [3.0|https://github.com/apache/cassandra/pull/1343]
* [3.11|https://github.com/apache/cassandra/pull/1344]
* [4.0|https://github.com/apache/cassandra/pull/1345]
* [trunk|https://github.com/apache/cassandra/pull/1346]

To demonstrate the results I picked a single benchmark, {{testWriteRandomLongDOP}}, as it shows the overall performance improvement and is relevant for all Cassandra versions. The results are for the megamorphic benchmark variation.

!testWriteRandomLongDOP_final.png|width=800px!

[~benedict], [~blambov] Can you please review the patch and run the CI?
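The register technique from the ticket description can be sketched as follows. This is a simplified illustration under two stated assumptions - the buffer is big-endian (the {{ByteBuffer}} default) and the caller guarantees capacity - not the actual patch:

```java
import java.nio.ByteBuffer;

// Simplified sketch of the register technique: build the encoded vint in a
// long and write it with a single 8-byte store instead of one put() per byte.
// Assumes a big-endian buffer (the ByteBuffer default) with enough capacity.
class VIntRegisterSketch {
    // Encoded size in bytes of an unsigned vint (1..9).
    static int computeUnsignedVIntSize(long value) {
        int magnitude = Long.numberOfLeadingZeros(value | 1);
        return 9 - ((magnitude - 1) / 7);
    }

    // Reference path: one byte at a time.
    static void writeSlow(long value, ByteBuffer out) {
        int size = computeUnsignedVIntSize(value);
        if (size == 9) { // 0xff marker byte followed by the raw 8 value bytes
            out.put((byte) 0xff);
            out.putLong(value);
            return;
        }
        int extraBytes = size - 1;
        // `extraBytes` leading 1-bits in the first byte encode the length.
        long prefix = (long) (~(0xff >>> extraBytes) & 0xff) << (8 * extraBytes);
        long encoded = value | prefix;
        for (int i = extraBytes; i >= 0; --i)
            out.put((byte) (encoded >>> (8 * i)));
    }

    // Register path: one 8-byte store when the buffer has >= 8 bytes spare.
    static void writeFast(long value, ByteBuffer out) {
        int size = computeUnsignedVIntSize(value);
        if (size == 9 || out.remaining() < 8) {
            writeSlow(value, out); // rare cases keep the byte-at-a-time path
            return;
        }
        int extraBytes = size - 1;
        long prefix = (long) (~(0xff >>> extraBytes) & 0xff) << (8 * extraBytes);
        long encoded = (value | prefix) << (8 * (8 - size)); // left-align for big-endian putLong
        int pos = out.position();
        out.putLong(pos, encoded);  // stores 8 bytes, but only `size` of them matter
        out.position(pos + size);   // advance past the vint only
    }
}
```

The fast path trades up to seven wasted (later overwritten) trailing bytes for a single store, which is exactly the "construct the bytes in a register" idea the ticket describes.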
[jira] [Updated] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Sorokoumov updated CASSANDRA-15215:
---------------------------------------------
    Attachment: testWriteRandomLongDOP_final.png
[jira] [Commented] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17447439#comment-17447439 ] Aleksandr Sorokoumov commented on CASSANDRA-15215: -- Thanks for the review and the suggestions [~benedict] ! This Wednesday I plan to work on read performance, applying your changes, and adding unit tests for new code branches. Hopefully, the patch is going to be ready by the end of the week. > VIntCoding should read and write more efficiently > - > > Key: CASSANDRA-15215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15215 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction, Local/SSTable >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.x > > Attachments: writeUnsignedVInt_megamorphic_BB.png, > writeUnsignedVInt_megamorphic_DOP.png > > > Most vints occupy significantly fewer than 8 bytes, and most buffers have >= > 8 bytes spare, in which case we can construct the relevant bytes in a > register and memcpy them to the correct position. Since we read and write a > lot of vints, this waste is probably measurable, particularly during > compaction and flush, and can probably be considered a performance bug. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17444639#comment-17444639 ]

Aleksandr Sorokoumov commented on CASSANDRA-15215:
--------------------------------------------------

I implemented {{DataOutputPlus#writeBytes}} and added benchmarks that use the {{DataOutputPlus}} version of the method. The register approach definitely improves write throughput. Due to the increased number of benchmarks, I also added a visualization for megamorphic calls in addition to the raw results. "Multiple writes" below refers to the initial approach I tried, with switch-cases for different numbers of bytes. I am going to apply the same register approach to reads next.

!writeUnsignedVInt_megamorphic_DOP.png|width=800!
!writeUnsignedVInt_megamorphic_BB.png|width=800!

h4. Register
{noformat}
Benchmark                                    (allocation)  Mode  Cnt   Score   Error  Units
VIntCodingBench.testComputeUnsignedVIntSize  monomorphic   avgt   15  15.939 ± 0.235  ns/op
VIntCodingBench.testComputeUnsignedVIntSize  bimorphic     avgt   15  15.972 ± 0.170  ns/op
VIntCodingBench.testComputeUnsignedVIntSize  megamorphic   avgt   15  15.976 ± 0.225  ns/op
VIntCodingBench.testWrite1ByteBB             monomorphic   avgt   15   9.555 ± 0.059  ns/op
VIntCodingBench.testWrite1ByteBB             bimorphic     avgt   15  16.777 ± 0.107  ns/op
VIntCodingBench.testWrite1ByteBB             megamorphic   avgt   15  18.286 ± 0.155  ns/op
VIntCodingBench.testWrite1ByteDOP            monomorphic   avgt   15  10.507 ± 0.522  ns/op
VIntCodingBench.testWrite1ByteDOP            bimorphic     avgt   15  19.048 ± 0.262  ns/op
VIntCodingBench.testWrite1ByteDOP            megamorphic   avgt   15  19.339 ± 0.155  ns/op
VIntCodingBench.testWrite2BytesBB            monomorphic   avgt   15  14.688 ± 0.170  ns/op
VIntCodingBench.testWrite2BytesBB            bimorphic     avgt   15  19.421 ± 0.115  ns/op
VIntCodingBench.testWrite2BytesBB            megamorphic   avgt   15  21.975 ± 0.110  ns/op
VIntCodingBench.testWrite2BytesDOP           monomorphic   avgt   15  14.675 ± 0.102  ns/op
VIntCodingBench.testWrite2BytesDOP           bimorphic     avgt   15  22.644 ± 0.217  ns/op
VIntCodingBench.testWrite2BytesDOP           megamorphic   avgt   15  22.789 ± 0.854  ns/op
VIntCodingBench.testWrite3BytesBB            monomorphic   avgt   15  14.764 ± 0.112  ns/op
VIntCodingBench.testWrite3BytesBB            bimorphic     avgt   15  19.543 ± 0.363  ns/op
VIntCodingBench.testWrite3BytesBB            megamorphic   avgt   15  22.054 ± 0.138  ns/op
VIntCodingBench.testWrite3BytesDOP           monomorphic   avgt   15  14.706 ± 0.115  ns/op
VIntCodingBench.testWrite3BytesDOP           bimorphic     avgt   15  22.549 ± 0.151  ns/op
VIntCodingBench.testWrite3BytesDOP           megamorphic   avgt   15  22.560 ± 0.370  ns/op
VIntCodingBench.testWrite4BytesBB            monomorphic   avgt   15  14.679 ± 0.158  ns/op
VIntCodingBench.testWrite4BytesBB            bimorphic     avgt   15  19.593 ± 0.254  ns/op
VIntCodingBench.testWrite4BytesBB            megamorphic   avgt   15  22.202 ± 0.194  ns/op
VIntCodingBench.testWrite4BytesDOP           monomorphic   avgt   15  14.669 ± 0.098  ns/op
VIntCodingBench.testWrite4BytesDOP           bimorphic     avgt   15  22.469 ± 0.195  ns/op
VIntCodingBench.testWrite4BytesDOP           megamorphic   avgt   15  22.681 ± 0.643  ns/op
VIntCodingBench.testWrite5BytesBB            monomorphic   avgt   15  14.655 ± 0.142  ns/op
VIntCodingBench.testWrite5BytesBB            bimorphic     avgt   15  19.390 ± 0.100  ns/op
VIntCodingBench.testWrite5BytesBB            megamorphic   avgt   15  22.086 ± 0.185  ns/op
VIntCodingBench.testWrite5BytesDOP           monomorphic   avgt   15  14.668 ± 0.137  ns/op
VIntCodingBench.testWrite5BytesDOP           bimorphic     avgt   15  22.833 ± 0.615  ns/op
VIntCodingBench.testWrite5BytesDOP           megamorphic   avgt   15  22.127 ± 0.298  ns/op
VIntCodingBench.testWrite6BytesBB            monomorphic   avgt   15  14.766 ± 0.252  ns/op
VIntCodingBench.testWrite6BytesBB            bimorphic     avgt   15  19.502 ± 0.128  ns/op
VIntCodingBench.testWrite6BytesBB            megamorphic   avgt   15  22.386 ± 0.314  ns/op
VIntCodingBench.testWrite6BytesDOP           monomorphic   avgt   15  14.690 ± 0.122  ns/op
VIntCodingBench.testWrite6BytesDOP           bimorphic     avgt   15  22.543 ± 0.200  ns/op
VIntCodingBench.testWrite6BytesDOP           megamorphic   avgt   15  22.278 ± 0.469  ns/op
VIntCodingBench.testWrite7BytesBB            monomorphic   avgt   15  14.687 ± 0.268  ns/op
VIntCodingBench.testWrite7BytesBB            bimorphic     avgt   15  19.434 ± 0.179  ns/op
VIntCodingBench.testWrite7BytesBB            megamorphic   avgt   15  21.991 ± 0.160  ns/op
VIntCodingBench.testWrite7BytesDOP           monomorphic   avgt   15  14.677 ± 0.131  ns/op
{noformat}
[jira] [Updated] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-15215: - Attachment: writeUnsignedVInt_megamorphic_BB.png writeUnsignedVInt_megamorphic_DOP.png > VIntCoding should read and write more efficiently > - > > Key: CASSANDRA-15215 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15215 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction, Local/SSTable >Reporter: Benedict Elliott Smith >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.x > > Attachments: writeUnsignedVInt_megamorphic_BB.png, > writeUnsignedVInt_megamorphic_DOP.png > > > Most vints occupy significantly fewer than 8 bytes, and most buffers have >= > 8 bytes spare, in which case we can construct the relevant bytes in a > register and memcpy them to the correct position. Since we read and write a > lot of vints, this waste is probably measurable, particularly during > compaction and flush, and can probably be considered a performance bug. -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17440071#comment-17440071 ]

Aleksandr Sorokoumov commented on CASSANDRA-15215:
--------------------------------------------------

Short status update: I just pushed changes to the benchmarks as suggested by Branimir and Benedict. The changes are:
* All tests have monomorphic, bimorphic, and megamorphic versions.
* Added a test that writes random longs to the ByteBuffer.
* Added a test for calculating the VInt size and applied Branimir's formula.

Results:

h4. Baseline
{noformat}
Benchmark                                    (allocation)  Mode  Cnt   Score   Error  Units
VIntCodingBench.testComputeUnsignedVIntSize  monomorphic   avgt   15  17.069 ± 1.087  ns/op
VIntCodingBench.testComputeUnsignedVIntSize  bimorphic     avgt   15  17.323 ± 0.656  ns/op
VIntCodingBench.testComputeUnsignedVIntSize  megamorphic   avgt   15  16.791 ± 0.473  ns/op
VIntCodingBench.testWrite1Byte               monomorphic   avgt   15   9.047 ± 0.254  ns/op
VIntCodingBench.testWrite1Byte               bimorphic     avgt   15  16.935 ± 0.207  ns/op
VIntCodingBench.testWrite1Byte               megamorphic   avgt   15  17.835 ± 0.090  ns/op
VIntCodingBench.testWrite2Bytes              monomorphic   avgt   15  18.612 ± 0.194  ns/op
VIntCodingBench.testWrite2Bytes              bimorphic     avgt   15  25.033 ± 0.239  ns/op
VIntCodingBench.testWrite2Bytes              megamorphic   avgt   15  28.352 ± 0.115  ns/op
VIntCodingBench.testWrite3Bytes              monomorphic   avgt   15  21.333 ± 0.197  ns/op
VIntCodingBench.testWrite3Bytes              bimorphic     avgt   15  26.173 ± 0.170  ns/op
VIntCodingBench.testWrite3Bytes              megamorphic   avgt   15  29.983 ± 0.208  ns/op
VIntCodingBench.testWrite4Bytes              monomorphic   avgt   15  21.229 ± 0.245  ns/op
VIntCodingBench.testWrite4Bytes              bimorphic     avgt   15  28.966 ± 0.606  ns/op
VIntCodingBench.testWrite4Bytes              megamorphic   avgt   15  33.219 ± 1.276  ns/op
VIntCodingBench.testWrite5Bytes              monomorphic   avgt   15  22.886 ± 0.602  ns/op
VIntCodingBench.testWrite5Bytes              bimorphic     avgt   15  29.209 ± 1.077  ns/op
VIntCodingBench.testWrite5Bytes              megamorphic   avgt   15  32.731 ± 0.944  ns/op
VIntCodingBench.testWrite6Bytes              monomorphic   avgt   15  22.579 ± 0.794  ns/op
VIntCodingBench.testWrite6Bytes              bimorphic     avgt   15  29.067 ± 0.678  ns/op
VIntCodingBench.testWrite6Bytes              megamorphic   avgt   15  35.419 ± 1.496  ns/op
VIntCodingBench.testWrite7Bytes              monomorphic   avgt   15  22.823 ± 0.527  ns/op
VIntCodingBench.testWrite7Bytes              bimorphic     avgt   15  29.521 ± 1.216  ns/op
VIntCodingBench.testWrite7Bytes              megamorphic   avgt   15  34.295 ± 2.327  ns/op
VIntCodingBench.testWrite8Bytes              monomorphic   avgt   15  22.032 ± 0.918  ns/op
VIntCodingBench.testWrite8Bytes              bimorphic     avgt   15  30.388 ± 1.015  ns/op
VIntCodingBench.testWrite8Bytes              megamorphic   avgt   15  33.632 ± 1.200  ns/op
VIntCodingBench.testWrite9Bytes              monomorphic   avgt   15  22.616 ± 1.309  ns/op
VIntCodingBench.testWrite9Bytes              bimorphic     avgt   15  29.291 ± 1.096  ns/op
VIntCodingBench.testWrite9Bytes              megamorphic   avgt   15  32.597 ± 0.807  ns/op
VIntCodingBench.testWriteRandomLong          monomorphic   avgt   15  35.010 ± 1.145  ns/op
VIntCodingBench.testWriteRandomLong          bimorphic     avgt   15  43.090 ± 0.615  ns/op
VIntCodingBench.testWriteRandomLong          megamorphic   avgt   15  43.196 ± 1.742  ns/op
{noformat}

h4. Patch
{noformat}
VIntCodingBench.testComputeUnsignedVIntSize  monomorphic   avgt   15  16.339 ± 0.418  ns/op
VIntCodingBench.testComputeUnsignedVIntSize  bimorphic     avgt   15  16.340 ± 0.417  ns/op
VIntCodingBench.testComputeUnsignedVIntSize  megamorphic   avgt   15  16.435 ± 0.408  ns/op
VIntCodingBench.testWrite1Byte               monomorphic   avgt   15   9.362 ± 0.208  ns/op
VIntCodingBench.testWrite1Byte               bimorphic     avgt   15  18.164 ± 0.839  ns/op
VIntCodingBench.testWrite1Byte               megamorphic   avgt   15  19.800 ± 0.942  ns/op
VIntCodingBench.testWrite2Bytes              monomorphic   avgt   15  10.094 ± 0.444  ns/op
VIntCodingBench.testWrite2Bytes              bimorphic     avgt   15  18.310 ± 0.813  ns/op
VIntCodingBench.testWrite2Bytes              megamorphic   avgt   15  19.685 ± 0.692  ns/op
VIntCodingBench.testWrite3Bytes              monomorphic   avgt   15  11.541 ± 0.433  ns/op
VIntCodingBench.testWrite3Bytes              bimorphic     avgt   15  19.087 ± 0.720  ns/op
VIntCodingBench.testWrite3Bytes              megamorphic   avgt   15  20.518 ± 1.035  ns/op
VIntCodingBench.testWrite4Bytes              monomorphic   avgt
{noformat}
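For context on the three benchmark variants above: HotSpot can inline a virtual call site that has observed one receiver type (monomorphic) or two (bimorphic, via a guarded inline), but with three or more observed types (megamorphic) it falls back to a genuine virtual dispatch. A toy sketch of the distinction, not the actual {{VIntCodingBench}} code:

```java
// Toy illustration of call-site polymorphism, the property the
// mono/bi/megamorphic benchmark variants control for.
interface Sink { long write(long v); }

class PlusOne implements Sink { public long write(long v) { return v + 1; } }
class PlusTwo implements Sink { public long write(long v) { return v + 2; } }
class PlusThree implements Sink { public long write(long v) { return v + 3; } }

class CallSites {
    // With 1 distinct Sink class, the call below is monomorphic; with 2 it is
    // bimorphic; with 3+ it is megamorphic. The bytecode is identical - only
    // the set of receiver types the JIT observes at this call site changes.
    static long drive(Sink[] sinks, int rounds) {
        long sum = 0;
        for (int i = 0; i < rounds; i++)
            sum += sinks[i % sinks.length].write(i);
        return sum;
    }
}
```

This is why the same {{testWrite*}} body shows three different scores per row: the work is constant, but the dispatch cost grows with call-site polymorphism.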
[jira] [Commented] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438899#comment-17438899 ] Aleksandr Sorokoumov commented on CASSANDRA-16349: -- I've rebased the dtest and added {{@since("3.0")}} to it. > SSTableLoader reports error when SSTable(s) do not have data for some nodes > --- > > Key: CASSANDRA-16349 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16349 > Project: Cassandra > Issue Type: Bug > Components: Tool/sstable >Reporter: Serban Teodorescu >Assignee: Serban Teodorescu >Priority: Normal > Fix For: 4.0.x, 4.x > > Time Spent: 20m > Remaining Estimate: 0h > > Running SSTableLoader in verbose mode will show error(s) if there are node(s) > that do not own any data from the SSTable(s). This can happen in at least 2 > cases: > # SSTableLoader is used to stream backups while keeping the same token ranges > # SSTable(s) are created with CQLSSTableWriter to match token ranges (this > can bring better performance by using ZeroCopy streaming) > Partial output of the SSTableLoader: > {quote}ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] > Remote peer /127.0.0.4:7000 failed stream session. > ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] Remote peer > /127.0.0.3:7000 failed stream session. 
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.515KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.427KiB/s) > {quote} > > Stack trace: > {quote}java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:533) > at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:99) > at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49) > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88) > at > com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056) > at > com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) > at > com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138) > at > com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:220) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:196) > at > 
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:505) > at > org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:819) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:595) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:844) > {quote} > To reproduce, create a cluster with ccm with more nodes than the RF, put some data into it, copy an SSTable, and stream it. > > The error originates on the nodes; the following stack trace is shown in the logs: > {quote}java.lang.IllegalStateException: Stream hasn't been read yet > at > com.google.common.base.Preconditions.checkState(Preconditions.java:507) > at > org.apache.cassandra.db.streaming.CassandraIncomingFile.getSize(CassandraIncomingFile.java:96) > at > org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:789) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:587) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189) > at >
[jira] [Commented] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438226#comment-17438226 ]

Aleksandr Sorokoumov commented on CASSANDRA-16349:
--------------------------------------------------

Thank you for the review and running the tests [~e.dimitrova]! Tomorrow I will mark the dtest to run only since 3.0, rebase the patch against the latest trunk, and backport it to 4.0.
[jira] [Commented] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17437257#comment-17437257 ]

Aleksandr Sorokoumov commented on CASSANDRA-15215:
--------------------------------------------------

Thank you for the suggestion! I will add a benchmark to see if there is a measurable difference.
[jira] [Commented] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436539#comment-17436539 ]

Aleksandr Sorokoumov commented on CASSANDRA-15215:
--------------------------------------------------

Hey [~benedict]! Thank you so much for a quick and elaborate answer! As a next step, I am going to extend the benchmark for implementations of {{DataOutput}} to estimate how well my patch works there. After that, I will extend {{DataOutputPlus}} and {{DataInputPlus}} as you suggested. As I am working on this patch in my free time, it might take a bit. I hope to provide an update by the end of next week.
[jira] [Commented] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17436519#comment-17436519 ] Aleksandr Sorokoumov commented on CASSANDRA-15215: -- [~benedict] if I understood your idea correctly, you suggest writing the relevant bytes directly instead of preparing a thread-local byte array and memcpy'ing it into the given buffer. Is my interpretation correct? In the title and description you also mentioned reads, but I haven't figured out how to apply this idea there, so here are results for writes only.

h2. Code

||Branch||Description||
|[baseline|https://github.com/apache/cassandra/compare/trunk...Ge:15215-baseline-trunk?expand=1]|trunk + benchmark|
|[patch|https://github.com/apache/cassandra/compare/trunk...Ge:15215-trunk?expand=1]|patch + benchmark|

h2. Setup

Each benchmark calls {{VIntCoding.writeUnsignedVInt}} on a {{long}} that encodes to 1 to 9 bytes. The write target is a {{ByteBuffer}}, both on- and off-heap. The results were produced on a 2019 MacBook Pro (2.3 GHz 8-Core Intel Core i9).

h2. Results

h3. Baseline

{noformat}
Benchmark                         (allocation)  Mode  Cnt   Score   Error  Units
VIntCodingBench.testWrite1Byte    HEAP          avgt   15   9.084 ± 5.196  ns/op
VIntCodingBench.testWrite1Byte    DIRECT        avgt   15   5.037 ± 0.638  ns/op
VIntCodingBench.testWrite2Bytes   HEAP          avgt   15  15.604 ± 0.646  ns/op
VIntCodingBench.testWrite2Bytes   DIRECT        avgt   15  15.028 ± 0.568  ns/op
VIntCodingBench.testWrite3Bytes   HEAP          avgt   15  16.704 ± 0.461  ns/op
VIntCodingBench.testWrite3Bytes   DIRECT        avgt   15  17.410 ± 0.489  ns/op
VIntCodingBench.testWrite4Bytes   HEAP          avgt   15  17.086 ± 0.527  ns/op
VIntCodingBench.testWrite4Bytes   DIRECT        avgt   15  20.307 ± 0.705  ns/op
VIntCodingBench.testWrite5Bytes   HEAP          avgt   15  17.395 ± 0.578  ns/op
VIntCodingBench.testWrite5Bytes   DIRECT        avgt   15  17.558 ± 0.512  ns/op
VIntCodingBench.testWrite6Bytes   HEAP          avgt   15  18.114 ± 0.967  ns/op
VIntCodingBench.testWrite6Bytes   DIRECT        avgt   15  19.023 ± 0.591  ns/op
VIntCodingBench.testWrite7Bytes   HEAP          avgt   15  18.004 ± 0.298  ns/op
VIntCodingBench.testWrite7Bytes   DIRECT        avgt   15  19.081 ± 0.601  ns/op
VIntCodingBench.testWrite8Bytes   HEAP          avgt   15  18.466 ± 0.463  ns/op
VIntCodingBench.testWrite8Bytes   DIRECT        avgt   15  20.228 ± 5.620  ns/op
VIntCodingBench.testWrite9Bytes   HEAP          avgt   15  18.553 ± 0.537  ns/op
VIntCodingBench.testWrite9Bytes   DIRECT        avgt   15  20.101 ± 0.476  ns/op
{noformat}

h3. Patch

{noformat}
Benchmark                         (allocation)  Mode  Cnt   Score   Error  Units
VIntCodingBench.testWrite1Byte    HEAP          avgt   15   4.728 ± 0.077  ns/op
VIntCodingBench.testWrite1Byte    DIRECT        avgt   15   6.415 ± 3.157  ns/op
VIntCodingBench.testWrite2Bytes   HEAP          avgt   15   8.244 ± 0.440  ns/op
VIntCodingBench.testWrite2Bytes   DIRECT        avgt   15   9.136 ± 3.979  ns/op
VIntCodingBench.testWrite3Bytes   HEAP          avgt   15   8.714 ± 0.134  ns/op
VIntCodingBench.testWrite3Bytes   DIRECT        avgt   15   9.690 ± 2.735  ns/op
VIntCodingBench.testWrite4Bytes   HEAP          avgt   15   8.634 ± 0.164  ns/op
VIntCodingBench.testWrite4Bytes   DIRECT        avgt   15   6.830 ± 0.061  ns/op
VIntCodingBench.testWrite5Bytes   HEAP          avgt   15   8.389 ± 0.207  ns/op
VIntCodingBench.testWrite5Bytes   DIRECT        avgt   15   8.059 ± 1.537  ns/op
VIntCodingBench.testWrite6Bytes   HEAP          avgt   15  10.861 ± 0.336  ns/op
VIntCodingBench.testWrite6Bytes   DIRECT        avgt   15   9.816 ± 1.482  ns/op
VIntCodingBench.testWrite7Bytes   HEAP          avgt   15  11.045 ± 0.419  ns/op
VIntCodingBench.testWrite7Bytes   DIRECT        avgt   15  10.702 ± 2.377  ns/op
VIntCodingBench.testWrite8Bytes   HEAP          avgt   15  10.375 ± 0.423  ns/op
VIntCodingBench.testWrite8Bytes   DIRECT        avgt   15   7.237 ± 0.176  ns/op
VIntCodingBench.testWrite9Bytes   HEAP          avgt   15  11.200 ± 0.365  ns/op
VIntCodingBench.testWrite9Bytes   DIRECT        avgt   15   8.152 ± 0.282  ns/op
{noformat}

> VIntCoding should read and write more efficiently
> -
>
> Key: CASSANDRA-15215
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15215
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Compaction, Local/SSTable
>Reporter: Benedict Elliott Smith
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> Most vints occupy significantly fewer than 8 bytes, and most buffers have >=
> 8 bytes spare, in which case we can construct the relevant bytes in a
> register and memcpy them to the correct position. Since we read and write a
> lot of vints, this waste is probably measurable, particularly during
> compaction and flush, and can probably be considered a performance bug.
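The register-based write the ticket describes can be sketched as follows. This is a simplified LEB128-style varint, NOT Cassandra's actual vint format (which encodes the length in the leading bits of the first byte), and the class and method names are hypothetical; it assumes the value encodes to at most 8 bytes (value < 2^56):

```java
import java.nio.ByteBuffer;

class VIntWriteSketch
{
    /**
     * Assembles the encoded bytes in a long "register", then writes them
     * with a single putLong when the buffer has at least 8 spare bytes,
     * falling back to byte-at-a-time writes otherwise.
     */
    static int writeUnsignedVarInt(long value, ByteBuffer out)
    {
        long register = 0;
        int size = 0;
        do
        {
            long b = value & 0x7F;
            value >>>= 7;
            if (value != 0)
                b |= 0x80; // continuation bit: more 7-bit groups follow
            register |= b << (8 * size);
            size++;
        }
        while (value != 0);

        int pos = out.position();
        if (out.remaining() >= 8)
        {
            // One bounds check and one 8-byte store instead of `size` stores.
            // reverseBytes moves the first encoded byte into the MSB, so the
            // default big-endian putLong lays the bytes out in encoding order.
            out.putLong(pos, Long.reverseBytes(register));
            out.position(pos + size); // advance only past the bytes actually used
        }
        else
        {
            for (int i = 0; i < size; i++)
                out.put((byte) (register >>> (8 * i)));
        }
        return size;
    }
}
```

The same idea presumably applies in reverse on the read side (load 8 bytes with a single getLong and extract the groups from the register), which may be what the ticket title alludes to for reads.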
[jira] [Assigned] (CASSANDRA-15215) VIntCoding should read and write more efficiently
[ https://issues.apache.org/jira/browse/CASSANDRA-15215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov reassigned CASSANDRA-15215: Assignee: Aleksandr Sorokoumov -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16349: - Status: Needs Committer (was: Review In Progress) > SSTableLoader reports error when SSTable(s) do not have data for some nodes > --- > > Key: CASSANDRA-16349 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16349 > Project: Cassandra > Issue Type: Bug > Components: Tool/sstable >Reporter: Serban Teodorescu >Assignee: Serban Teodorescu >Priority: Normal > Fix For: 4.0.x > > Time Spent: 20m > Remaining Estimate: 0h > > Running SSTableLoader in verbose mode will show error(s) if there are node(s) > that do not own any data from the SSTable(s). This can happen in at least 2 > cases: > # SSTableLoader is used to stream backups while keeping the same token ranges > # SSTable(s) are created with CQLSSTableWriter to match token ranges (this > can bring better performance by using ZeroCopy streaming) > Partial output of the SSTableLoader: > {quote}ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] > Remote peer /127.0.0.4:7000 failed stream session. > ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] Remote peer > /127.0.0.3:7000 failed stream session. 
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.515KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.427KiB/s) > {quote} > > Stack trace: > {quote}java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:533) > at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:99) > at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49) > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88) > at > com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056) > at > com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) > at > com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138) > at > com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:220) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:196) > at > 
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:505) > at > org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:819) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:595) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:844) > {quote} > To reproduce, create a cluster with ccm with more nodes than the RF, put some > data into it, copy an SSTable, and stream it. > > The error originates on the nodes; the following stack trace is shown in the > logs: > {quote}java.lang.IllegalStateException: Stream hasn't been read yet > at > com.google.common.base.Preconditions.checkState(Preconditions.java:507) > at > org.apache.cassandra.db.streaming.CassandraIncomingFile.getSize(CassandraIncomingFile.java:96) > at > org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:789) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:587) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189) > at >
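For context, the {{IllegalStateException}} in the node-side stack trace comes from a fail-fast guard of this general shape. The class below is an illustrative sketch, not Cassandra's actual {{CassandraIncomingFile}}: the size of an incoming file is only known after its stream has been consumed, so asking for it before any bytes arrive (as can happen for a peer that owned no data) trips the check.

```java
// Hypothetical sketch of the guard pattern seen in the stack trace above.
class IncomingFileSketch
{
    private long size = -1;
    private boolean streamRead = false;

    // Called once the file's stream has actually been deserialized.
    void onStreamRead(long bytesReceived)
    {
        size = bytesReceived;
        streamRead = true;
    }

    // Fails fast if the stream was never consumed, mirroring
    // Preconditions.checkState(...) in the trace.
    long getSize()
    {
        if (!streamRead)
            throw new IllegalStateException("Stream hasn't been read yet");
        return size;
    }
}
```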
[jira] [Updated] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16349: - Authors: Aleksandr Sorokoumov, Serban Teodorescu (was: Serban Teodorescu)
[jira] [Commented] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17434736#comment-17434736 ] Aleksandr Sorokoumov commented on CASSANDRA-16349: -- Thank you for a quick review [~marcuse]! I fixed the nit. AFAIU, with one +1 from a committer, the correct status for this issue is {{NEEDS COMMITTER}}; I will change it accordingly.
[jira] [Updated] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16349: - Reviewers: Aleksandr Sorokoumov, Marcus Eriksson, Aleksandr Sorokoumov (was: Aleksandr Sorokoumov, Marcus Eriksson) Status: Review In Progress (was: Patch Available)
[jira] [Updated] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16349: - Reviewers: Aleksandr Sorokoumov, Marcus Eriksson (was: Aleksandr Sorokoumov)
[jira] [Commented] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17434226#comment-17434226 ] Aleksandr Sorokoumov commented on CASSANDRA-16349: -- [~bdeggleston], [~marcuse] Do you have cycles to review? [Streaming fix + SSTableLoader fix|https://github.com/apache/cassandra/compare/trunk...Ge:16349-streaming-sstableloader-4.0?expand=1] from the comment above is the patch I think we should merge.
[jira] [Commented] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17429575#comment-17429575 ] Aleksandr Sorokoumov commented on CASSANDRA-16334: -- Thank you for the review and running the CI [~adelapena]! I added the non-null check in 3.0 and 3.11 branches. The same check is not necessary in 4.0 onward, because BatchlogManager no longer passes null as a configured CL level. > Replica failure causes timeout on multi-DC write > > > Key: CASSANDRA-16334 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16334 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Coordination, Messaging/Internode >Reporter: Paulo Motta >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.0.x, 4.x > > Time Spent: 1h > Remaining Estimate: 0h > > Inserting a mutation larger than {{max_mutation_size_in_kb}} correctly throws > a write error on a single DC keyspace with RF=3: > {noformat} > cassandra.WriteFailure: Error from server: code=1500 [Replica(s) failed to > execute write] message="Operation failed - received 0 responses and 3 > failures: UNKNOWN from /127.0.0.3:7000, UNKNOWN from /127.0.0.2:7000, UNKNOWN > from /127.0.0.1:7000" info={'consistency': 'LOCAL_ONE', 'required_responses': > 1, 'received_responses': 0, 'failures': 3} > {noformat} > The same insert wrongly causes a timeout on a keyspace with 2 dcs (RF=3 each): > {noformat} > cassandra.WriteTimeout: Error from server: code=1100 [Coordinator node timed > out waiting for replica nodes' responses] message="Operation timed out - > received only 0 responses." 
info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0}
> {noformat}
> Reproduction steps:
> {noformat}
> # Set up the cluster
> ccm create -n 3:3 test
> for i in {1..6}; do echo 'max_mutation_size_in_kb: 1000' >> ~/.ccm/test/node$i/conf/cassandra.yaml; done
> ccm start
> # Create the schema
> ccm node1 cqlsh
> CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3};
> CREATE TABLE test.test (key int PRIMARY KEY, val blob);
> exit;
> # Insert the data
> python
> from cassandra.cluster import Cluster
> cluster = Cluster()
> session = cluster.connect('test')
> blob = open("2mbBlob", "rb").read().hex()
> session.execute("INSERT INTO test (key, val) VALUES (1, textAsBlob('" + blob + "'))")
> {noformat}
> Reproduced in 3.0, 3.11, 4.0, trunk.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
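The write failure above comes from the replica-side mutation size limit. A minimal sketch of that guard, in the document's Python: the names `apply_mutation` and `MutationTooLarge` are made up for illustration (the real check lives in Cassandra's mutation serialization path), but the size arithmetic mirrors the `max_mutation_size_in_kb` setting and the dtest's 256 KiB payload.

```python
# Illustrative sketch of the replica-side size guard (not Cassandra's API).
MAX_MUTATION_SIZE_IN_KB = 128  # matches the dtest's max_mutation_size_in_kb

class MutationTooLarge(Exception):
    """Raised instead of applying the mutation. This surfaces as a write
    FAILURE (not a timeout), so the coordinator can fail fast."""

def apply_mutation(payload: bytes, max_kb: int = MAX_MUTATION_SIZE_IN_KB) -> int:
    limit = max_kb * 1024
    if len(payload) > limit:
        raise MutationTooLarge(
            f"mutation of {len(payload)} bytes exceeds limit of {limit} bytes")
    return len(payload)  # "applied": return the written size

# A 256 KiB payload, like the dtest's '1' * 1024 * 256, must be rejected.
oversized = b"1" * (1024 * 256)
try:
    apply_mutation(oversized)
    rejected = False
except MutationTooLarge:
    rejected = True
```

The point of the bug reports above is that every replica raises this kind of rejection, yet the coordinator reports a timeout instead of a failure when multiple DCs are involved.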
[jira] [Updated] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16334: - Fix Version/s: 4.x 4.0.x 3.11.x 3.0.x
> Replica failure causes timeout on multi-DC write
>
> Key: CASSANDRA-16334
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16334
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Coordination, Messaging/Internode
> Reporter: Paulo Motta
> Assignee: Aleksandr Sorokoumov
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0.x, 4.x
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16334: - Test and Documentation Plan: Added a dtest that covers both bugs causing replica failures to appear as timeouts. Status: Patch Available (was: In Progress)
> Replica failure causes timeout on multi-DC write
>
> Key: CASSANDRA-16334
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16334
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Coordination, Messaging/Internode
> Reporter: Paulo Motta
> Assignee: Aleksandr Sorokoumov
> Priority: Normal
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426575#comment-17426575 ] Aleksandr Sorokoumov commented on CASSANDRA-16334: -- I described the root cause in the previous comment. Two distinct bugs make replica failures appear as timeouts: one affects DC-local consistency levels, the other global ones. Fixing the latter also resolves the "zombie-hint" issue I described at the end of the previous message. A replica failure appears as a timeout at DC-local consistency levels because {{AbstractWriteResponseHandler}} counts nodes in all DCs as potential candidates to wait for; the fix is to wait only for the DC-local nodes. The second bug, responsible for both the "zombie hints" and the timeouts at global consistency levels, concerns forwarding replica failures to the correct address: this patch makes replicas send request failures to the original coordinator rather than to the DC-local node that forwarded them the message. In 3.0 and 3.11 I also added the missing respond-on-failure flag to the forwarded messages. Patches: * [dtest|https://github.com/apache/cassandra-dtest/pull/165] * [3.0|https://github.com/apache/cassandra/pull/1259] * [3.11|https://github.com/apache/cassandra/pull/1260] * [4.0|https://github.com/apache/cassandra/pull/1261] * [trunk|https://github.com/apache/cassandra/pull/1262] [~paulo] Can you please start the CI?
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
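The DC-local counting bug described in the comment above can be pictured with a toy response handler. This is a simulation with made-up node names, not Cassandra code: if the candidate set for a LOCAL_ONE write includes remote-DC replicas, the local failures never add up to "all candidates failed" and the coordinator waits until it times out; counting only DC-local replicas lets it fail fast.

```python
# Toy model of AbstractWriteResponseHandler candidate counting (illustrative only).

def write_outcome(replicas, local_dc, failures, count_all_dcs):
    """replicas: list of (node, dc); failures: set of nodes that rejected the write.
    Returns 'failure' when the handler can rule out success, 'timeout' when it
    would keep waiting for responses that never arrive, 'success' otherwise."""
    candidates = [n for n, dc in replicas if count_all_dcs or dc == local_dc]
    failed = [n for n in candidates if n in failures]
    # With CL=LOCAL_ONE we need one success among DC-local replicas.
    local = [n for n, dc in replicas if dc == local_dc]
    if all(n in failures for n in local):
        # Every node that could satisfy the CL has failed...
        if len(failed) == len(candidates):
            return "failure"  # all candidates failed: report fast failure
        return "timeout"      # still "waiting" on remote-DC candidates
    return "success"

replicas = [("n1", "dc1"), ("n2", "dc1"), ("n3", "dc2"), ("n4", "dc2")]
# dc1 replicas reject the oversized mutation; dc2 failures never reach the
# coordinator in this model (the second, failure-forwarding bug).
failures = {"n1", "n2"}

buggy = write_outcome(replicas, "dc1", failures, count_all_dcs=True)   # -> "timeout"
fixed = write_outcome(replicas, "dc1", failures, count_all_dcs=False)  # -> "failure"
```

With `count_all_dcs=True` the two remote replicas stay in the candidate set forever, which is exactly the WriteTimeout the issue describes; restricting candidates to `dc1` turns the same situation into an immediate WriteFailure.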
[jira] [Commented] (CASSANDRA-14795) Expose information about stored hints via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426570#comment-17426570 ] Aleksandr Sorokoumov commented on CASSANDRA-14795: -- The CI results are a bit far from green, but none of the failures seem to be related to this patch. Is there anything else I should do before this patch is ready to commit [~e.dimitrova] [~azotcsit] [~stefan.miklosovic]? > Expose information about stored hints via JMX > - > > Key: CASSANDRA-14795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14795 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Observability >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Low > Fix For: 4.x > > Time Spent: 7h 10m > Remaining Estimate: 0h > > Currently there is no way to determine what kind of hints a node has, apart > from looking at the filenames (thus host-ids) on disk. Having a way to access > this information would help with debugging hint creation/replay scenarios. > In addition to the JMX method, there is a new nodetool command: > {noformat}$ bin/nodetool -h 127.0.0.1 -p 7100 listendpointspendinghints > Host ID Address Rack DC Status Total files Newest Oldest > 5762b140-3fdf-4057-9ca7-05c070ccc9c3 127.0.0.2 rack1 datacenter1 DOWN 2 > 2018-09-18 14:05:18,835 2018-09-18 14:05:08,811 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
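Before the JMX method and nodetool command above existed, the workaround the ticket mentions was reading host IDs off the hint filenames. A rough sketch of that, assuming a `<host-id>-<millis-timestamp>-<version>.hints` naming scheme (treat the exact scheme as an assumption; check your Cassandra version's hints directory):

```python
# Sketch: recover per-endpoint pending-hint info from hint filenames alone,
# assuming names like <host-id>-<millis-timestamp>-<version>.hints.
import os
import re
import tempfile

HINT_RE = re.compile(r"^(?P<host>[0-9a-f-]{36})-(?P<ts>\d+)-\d+\.hints$")

def pending_hints(hints_dir):
    """Map host-id -> (file count, oldest timestamp, newest timestamp)."""
    summary = {}
    for name in os.listdir(hints_dir):
        m = HINT_RE.match(name)
        if not m:
            continue  # skip non-hint files (e.g. .crc32 companions)
        host, ts = m.group("host"), int(m.group("ts"))
        count, oldest, newest = summary.get(host, (0, ts, ts))
        summary[host] = (count + 1, min(oldest, ts), max(newest, ts))
    return summary

# Demo with fabricated filenames mirroring the nodetool output above
# (two files for one down endpoint).
d = tempfile.mkdtemp()
for ts in (1537272308811, 1537272318835):
    open(os.path.join(
        d, f"5762b140-3fdf-4057-9ca7-05c070ccc9c3-{ts}-1.hints"), "w").close()
info = pending_hints(d)
```

This only tells you which endpoint a hint file targets and when it was written, which is exactly the gap the `listendpointspendinghints` command closes without shell access to the data directory.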
[jira] [Comment Edited] (CASSANDRA-14795) Expose information about stored hints via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426196#comment-17426196 ] Aleksandr Sorokoumov edited comment on CASSANDRA-14795 at 10/8/21, 5:32 PM: As there are no more outstanding review feedback, I squashed the changes to prepare for commit: * [patch|https://github.com/apache/cassandra/pull/1232/commits/989ace231731a822a7d583625f3f0615ceba4a35] * [dtest|https://github.com/apache/cassandra-dtest/pull/162/commits/ca882c704e0cba48027a2f6b84e603b63b81f882] * [CI|https://ci-cassandra.apache.org/job/Cassandra-devbranch/1202/] was (Author: ge): As there are no more outstanding review feedback, I squashed the changes to prepare for commit: * [patch|https://github.com/apache/cassandra/pull/1232/commits/989ace231731a822a7d583625f3f0615ceba4a35] * [dtest|https://github.com/apache/cassandra-dtest/pull/162/commits/ca882c704e0cba48027a2f6b84e603b63b81f882] * CI (TBA) > Expose information about stored hints via JMX > - > > Key: CASSANDRA-14795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14795 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Observability >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Low > Fix For: 4.x > > Time Spent: 7h 10m > Remaining Estimate: 0h > > Currently there is no way to determine what kind of hints a node has, apart > from looking at the filenames (thus host-ids) on disk. Having a way to access > this information would help with debugging hint creation/replay scenarios. 
> In addition to the JMX method, there is a new nodetool command: > {noformat}$ bin/nodetool -h 127.0.0.1 -p 7100 listendpointspendinghints > Host ID Address Rack DC Status Total files Newest Oldest > 5762b140-3fdf-4057-9ca7-05c070ccc9c3 127.0.0.2 rack1 datacenter1 DOWN 2 > 2018-09-18 14:05:18,835 2018-09-18 14:05:08,811 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14795) Expose information about stored hints via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426196#comment-17426196 ] Aleksandr Sorokoumov edited comment on CASSANDRA-14795 at 10/8/21, 5:27 PM: As there are no more outstanding review feedback, I squashed the changes to prepare for commit: * [patch|https://github.com/apache/cassandra/pull/1232/commits/989ace231731a822a7d583625f3f0615ceba4a35] * [dtest|https://github.com/apache/cassandra-dtest/pull/162/commits/ca882c704e0cba48027a2f6b84e603b63b81f882] * CI (TBA) was (Author: ge): As there are no more outstanding review feedback, I squashed the changes to prepare for commit: * [patch|https://github.com/apache/cassandra/pull/1232/commits/989ace231731a822a7d583625f3f0615ceba4a35] * [dtest|https://github.com/apache/cassandra-dtest/pull/162/commits/ca882c704e0cba48027a2f6b84e603b63b81f882] * [CI|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1196/] > Expose information about stored hints via JMX > - > > Key: CASSANDRA-14795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14795 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Observability >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Low > Fix For: 4.x > > Time Spent: 7h 10m > Remaining Estimate: 0h > > Currently there is no way to determine what kind of hints a node has, apart > from looking at the filenames (thus host-ids) on disk. Having a way to access > this information would help with debugging hint creation/replay scenarios. 
> In addition to the JMX method, there is a new nodetool command: > {noformat}$ bin/nodetool -h 127.0.0.1 -p 7100 listendpointspendinghints > Host ID Address Rack DC Status Total files Newest Oldest > 5762b140-3fdf-4057-9ca7-05c070ccc9c3 127.0.0.2 rack1 datacenter1 DOWN 2 > 2018-09-18 14:05:18,835 2018-09-18 14:05:08,811 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14795) Expose information about stored hints via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426196#comment-17426196 ] Aleksandr Sorokoumov edited comment on CASSANDRA-14795 at 10/8/21, 5:26 PM: As there are no more outstanding review feedback, I squashed the changes to prepare for commit: * [patch|https://github.com/apache/cassandra/pull/1232/commits/989ace231731a822a7d583625f3f0615ceba4a35] * [dtest|https://github.com/apache/cassandra-dtest/pull/162/commits/ca882c704e0cba48027a2f6b84e603b63b81f882] * [CI|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1196/] was (Author: ge): As there are no more outstanding review feedback, I squashed the changes to prepare for commit: * [patch|https://github.com/apache/cassandra/pull/1232/commits/0f733e070bb6a33b6d1f7b7bde33d40383d5fcfa] * [dtest|https://github.com/apache/cassandra-dtest/pull/162/commits/ca882c704e0cba48027a2f6b84e603b63b81f882] * [CI|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1196/] > Expose information about stored hints via JMX > - > > Key: CASSANDRA-14795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14795 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Observability >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Low > Fix For: 4.x > > Time Spent: 7h 10m > Remaining Estimate: 0h > > Currently there is no way to determine what kind of hints a node has, apart > from looking at the filenames (thus host-ids) on disk. Having a way to access > this information would help with debugging hint creation/replay scenarios. 
> In addition to the JMX method, there is a new nodetool command: > {noformat}$ bin/nodetool -h 127.0.0.1 -p 7100 listendpointspendinghints > Host ID Address Rack DC Status Total files Newest Oldest > 5762b140-3fdf-4057-9ca7-05c070ccc9c3 127.0.0.2 rack1 datacenter1 DOWN 2 > 2018-09-18 14:05:18,835 2018-09-18 14:05:08,811 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14795) Expose information about stored hints via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426196#comment-17426196 ] Aleksandr Sorokoumov commented on CASSANDRA-14795: -- As there is no outstanding review feedback left, I squashed the changes to prepare for commit: * [patch|https://github.com/apache/cassandra/pull/1232/commits/0f733e070bb6a33b6d1f7b7bde33d40383d5fcfa] * [dtest|https://github.com/apache/cassandra-dtest/pull/162/commits/ca882c704e0cba48027a2f6b84e603b63b81f882] * [CI|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1196/] > Expose information about stored hints via JMX > - > > Key: CASSANDRA-14795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14795 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Observability >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Low > Fix For: 4.x > > Time Spent: 7h 10m > Remaining Estimate: 0h > > Currently there is no way to determine what kind of hints a node has, apart > from looking at the filenames (thus host-ids) on disk. Having a way to access > this information would help with debugging hint creation/replay scenarios. > In addition to the JMX method, there is a new nodetool command: > {noformat}$ bin/nodetool -h 127.0.0.1 -p 7100 listendpointspendinghints > Host ID Address Rack DC Status Total files Newest Oldest > 5762b140-3fdf-4057-9ca7-05c070ccc9c3 127.0.0.2 rack1 datacenter1 DOWN 2 > 2018-09-18 14:05:18,835 2018-09-18 14:05:08,811 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14795) Expose information about stored hints via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17426110#comment-17426110 ] Aleksandr Sorokoumov commented on CASSANDRA-14795: -- Thank you [~stefan.miklosovic] for starting the CI! In this run there were no timeout failures, which supports the overall running time of {{HintsServiceTest}} as the likely cause. To fix it, I moved the newly added {{testListPendingHints}} to a separate test suite and reverted the increased timeout. [~azotcsit] I don't see any new comments in the PR. Perhaps you need to click "Submit review" for them to appear? > Expose information about stored hints via JMX > - > > Key: CASSANDRA-14795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14795 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Observability >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Low > Fix For: 4.x > > Time Spent: 6h 10m > Remaining Estimate: 0h > > Currently there is no way to determine what kind of hints a node has, apart > from looking at the filenames (thus host-ids) on disk. Having a way to access > this information would help with debugging hint creation/replay scenarios. > In addition to the JMX method, there is a new nodetool command: > {noformat}$ bin/nodetool -h 127.0.0.1 -p 7100 listendpointspendinghints > Host ID Address Rack DC Status Total files Newest Oldest > 5762b140-3fdf-4057-9ca7-05c070ccc9c3 127.0.0.2 rack1 datacenter1 DOWN 2 > 2018-09-18 14:05:18,835 2018-09-18 14:05:08,811 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14795) Expose information about stored hints via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425425#comment-17425425 ] Aleksandr Sorokoumov commented on CASSANDRA-14795: -- I have a suspicion that the test times out because the entire suite takes too long to run for the given timeout. We observe it as {{HintsServiceTest.testListPendingHints}} failure because it is the last test case in the suite. For test report and logs see e.g. https://nightlies.apache.org/cassandra/devbranch/Cassandra-devbranch/1187/ {{test.timeout}} is set to 240 seconds. On successful runs the suite takes a bit longer than 200 seconds to finish, each case taking between 30 and 60 seconds. As a speculation, a small hiccup or a slight deviation in test time might lead to a timeout. I'd like to verify this idea by increasing {{test.timeout}} to 360 seconds and re-running the CI. If this theory is correct, {{HintsServiceTest}} should succeed and overall test duration might exceed 240 seconds on 1 or more attempts. [~azotcsit], [~e.dimitrova] Can I kindly ask you to re-run https://ci-cassandra.apache.org/job/Cassandra-devbranch/1187/? > Expose information about stored hints via JMX > - > > Key: CASSANDRA-14795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14795 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Observability >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Low > Fix For: 4.x > > Time Spent: 5.5h > Remaining Estimate: 0h > > Currently there is no way to determine what kind of hints a node has, apart > from looking at the filenames (thus host-ids) on disk. Having a way to access > this information would help with debugging hint creation/replay scenarios. 
> In addition to the JMX method, there is a new nodetool command: > {noformat}$ bin/nodetool -h 127.0.0.1 -p 7100 listendpointspendinghints > Host ID Address Rack DC Status Total files Newest Oldest > 5762b140-3fdf-4057-9ca7-05c070ccc9c3 127.0.0.2 rack1 datacenter1 DOWN 2 > 2018-09-18 14:05:18,835 2018-09-18 14:05:08,811 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
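The arithmetic behind the timeout theory in the comment above is simple enough to sketch. Only the 240-second `test.timeout`, the proposed 360 seconds, the ~200-second suite total, and the 30-60-second per-case range come from the comment; the individual durations below are illustrative.

```python
# Back-of-the-envelope check of the timeout theory: a suite whose cases each
# take 30-60 s already runs close to test.timeout, so one hiccup on the last
# case (testListPendingHints) is enough to trip the limit.
TEST_TIMEOUT_S = 240       # current test.timeout
PROPOSED_TIMEOUT_S = 360   # proposed value to verify the theory

typical_durations = [55, 50, 45, 60]  # illustrative per-case times, ~210 s total
suite_time = sum(typical_durations)
margin = TEST_TIMEOUT_S - suite_time  # only ~30 s of slack for the whole suite

hiccup = 40  # one slow case or a CI stall
over_budget = suite_time + hiccup > TEST_TIMEOUT_S    # times out today
still_fits = suite_time + hiccup <= PROPOSED_TIMEOUT_S  # passes with 360 s
```

This also explains the eventual fix in the later comment: moving `testListPendingHints` to its own suite restores the margin without keeping the larger timeout.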
[jira] [Updated] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16334: - Description: Inserting a mutation larger than {{max_mutation_size_in_kb}} correctly throws a write error on a single DC keyspace with RF=3: {noformat} cassandra.WriteFailure: Error from server: code=1500 [Replica(s) failed to execute write] message="Operation failed - received 0 responses and 3 failures: UNKNOWN from /127.0.0.3:7000, UNKNOWN from /127.0.0.2:7000, UNKNOWN from /127.0.0.1:7000" info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0, 'failures': 3} {noformat} The same insert wrongly causes a timeout on a keyspace with 2 dcs (RF=3 each): {noformat} cassandra.WriteTimeout: Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0} {noformat} Reproduction steps: {noformat} # Setup cluster ccm create -n 3:3 test for i in {1..6}; do echo 'max_mutation_size_in_kb: 1000' >> ~/.ccm/test/node$i/conf/cassandra.yaml; done ccm start # Create schema ccm node1 cqlsh CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3}; CREATE TABLE test.test (key int PRIMARY KEY, val blob); exit; # Insert data python from cassandra.cluster import Cluster cluster = Cluster() session = cluster.connect('test') blob = f = open("2mbBlob", "rb").read().hex() session.execute("INSERT INTO test (key, val) VALUES (1, textAsBlob('" + blob + "'))") {noformat} Reproduced in 3.0, 3.11, 4.0, trunk. 
was: Inserting a mutation larger than {{max_mutation_size_in_kb}} correctly throws a write error on a single DC keyspace with RF=3: {noformat} cassandra.WriteFailure: Error from server: code=1500 [Replica(s) failed to execute write] message="Operation failed - received 0 responses and 3 failures: UNKNOWN from /127.0.0.3:7000, UNKNOWN from /127.0.0.2:7000, UNKNOWN from /127.0.0.1:7000" info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0, 'failures': 3} {noformat} The same insert wrongly causes a timeout on a keyspace with 2 dcs (RF=3 each): {noformat} cassandra.WriteTimeout: Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0} {noformat} Reproduction steps: {noformat} # Setup cluster ccm create -n 3:3 test for i in {1..6}; do echo 'max_mutation_size_in_kb: 1000' >> ~/.ccm/test/node$i/conf/cassandra.yaml; done ccm start # Create schema ccm node1 cqlsh CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3}; CREATE TABLE test.test (key int PRIMARY KEY, val blob); exit; # Insert data python from cassandra.cluster import Cluster cluster = Cluster() session = cluster.connect('test') blob = f = open("2mbBlob", "rb").read().hex() session.execute("INSERT INTO test (key, val) VALUES (1, textAsBlob('" + blob + "'))") {noformat} Reproduced in 3.0, 3.11, trunk. 
[jira] [Updated] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16334: - Description: Inserting a mutation larger than {{max_mutation_size_in_kb}} correctly throws a write error on a single DC keyspace with RF=3: {noformat} cassandra.WriteFailure: Error from server: code=1500 [Replica(s) failed to execute write] message="Operation failed - received 0 responses and 3 failures: UNKNOWN from /127.0.0.3:7000, UNKNOWN from /127.0.0.2:7000, UNKNOWN from /127.0.0.1:7000" info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0, 'failures': 3} {noformat} The same insert wrongly causes a timeout on a keyspace with 2 dcs (RF=3 each): {noformat} cassandra.WriteTimeout: Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0} {noformat} Reproduction steps: {noformat} # Setup cluster ccm create -n 3:3 test for i in {1..6}; do echo 'max_mutation_size_in_kb: 1000' >> ~/.ccm/test/node$i/conf/cassandra.yaml; done ccm start # Create schema ccm node1 cqlsh CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3}; CREATE TABLE test.test (key int PRIMARY KEY, val blob); exit; # Insert data python from cassandra.cluster import Cluster cluster = Cluster() session = cluster.connect('test') blob = f = open("2mbBlob", "rb").read().hex() session.execute("INSERT INTO test (key, val) VALUES (1, textAsBlob('" + blob + "'))") {noformat} Reproduced in 3.0, 3.11, trunk. 
was: Inserting a mutation larger than {{max_mutation_size_in_kb}} correctly throws a write error on a single DC keyspace with RF=3: {noformat} cassandra.WriteFailure: Error from server: code=1500 [Replica(s) failed to execute write] message="Operation failed - received 0 responses and 3 failures: UNKNOWN from /127.0.0.3:7000, UNKNOWN from /127.0.0.2:7000, UNKNOWN from /127.0.0.1:7000" info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0, 'failures': 3} {noformat} The same insert wrongly causes a timeout on a keyspace with 2 dcs (RF=3 each): {noformat} cassandra.WriteTimeout: Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0} {noformat} Reproduction steps: {noformat} # Setup cluster ccm create -n 3:3 test for i in {1..6}; do echo 'max_mutation_size_in_kb: 1000' >> ~/.ccm/test/node$i/conf/cassandra.yaml; done ccm start # Create schema ccm node1 cqlsh CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3}; CREATE TABLE test.test (key int PRIMARY KEY, val blob); exit; # Insert data python from cassandra.cluster import Cluster cluster = Cluster() session = cluster.connect('test') blob = f = open("2mbBlob", "rb").read().hex() session.execute("INSERT INTO test (key, val) VALUES (1, textAsBlob('" + blob + "'))") {noformat} Reproduced in 3.11, trunk. 
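As a quick sanity check on the reproduction steps above, a few lines of Python confirm that the hex-encoded blob comfortably exceeds the configured limit (the sizes are taken from the steps; the 2 MB figure is just the example file's size):

```python
# The repro hex-encodes a 2 MB file before embedding it in the INSERT,
# and bytes.hex() emits two ASCII characters per input byte, so the
# CQL literal alone is ~4 MB -- well over max_mutation_size_in_kb: 1000.
blob_bytes = 2 * 1024 * 1024          # size of the "2mbBlob" example file
hex_chars = blob_bytes * 2            # bytes.hex() doubles the size
max_mutation_bytes = 1000 * 1024      # max_mutation_size_in_kb from the yaml

print(hex_chars > max_mutation_bytes)
```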
[jira] [Updated] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16334: - Description: Inserting a mutation larger than {{max_mutation_size_in_kb}} correctly throws a write error on a single DC keyspace with RF=3: {noformat} cassandra.WriteFailure: Error from server: code=1500 [Replica(s) failed to execute write] message="Operation failed - received 0 responses and 3 failures: UNKNOWN from /127.0.0.3:7000, UNKNOWN from /127.0.0.2:7000, UNKNOWN from /127.0.0.1:7000" info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0, 'failures': 3} {noformat} The same insert wrongly causes a timeout on a keyspace with 2 dcs (RF=3 each): {noformat} cassandra.WriteTimeout: Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0} {noformat} Reproduction steps: {noformat} # Setup cluster ccm create -n 3:3 test for i in {1..6}; do echo 'max_mutation_size_in_kb: 1000' >> ~/.ccm/test/node$i/conf/cassandra.yaml; done ccm start # Create schema ccm node1 cqlsh CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3}; CREATE TABLE test.test (key int PRIMARY KEY, val blob); exit; # Insert data python from cassandra.cluster import Cluster cluster = Cluster() session = cluster.connect('test') blob = f = open("2mbBlob", "rb").read().hex() session.execute("INSERT INTO test (key, val) VALUES (1, textAsBlob('" + blob + "'))") {noformat} Reproduced in 3.11, trunk. 
was: Inserting a mutation larger than {{max_mutation_size_in_kb}} correctly throws a write error on a single DC keyspace with RF=3: {noformat} cassandra.WriteFailure: Error from server: code=1500 [Replica(s) failed to execute write] message="Operation failed - received 0 responses and 3 failures: UNKNOWN from /127.0.0.3:7000, UNKNOWN from /127.0.0.2:7000, UNKNOWN from /127.0.0.1:7000" info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0, 'failures': 3} {noformat} The same insert wrongly causes a timeout on a keyspace with 2 dcs (RF=3 each): {noformat} cassandra.WriteTimeout: Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="Operation timed out - received only 0 responses." info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0} {noformat} Reproduction steps: {noformat} # Setup cluster ccm create -n 3:3 test for i in {1..6}; do echo 'max_mutation_size_in_kb: 1000' >> ~/.ccm/test/node$i/conf/cassandra.yaml; done ccm start # Create schema ccm node1 cqlsh CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3}; CREATE TABLE test.test (key int PRIMARY KEY, val blob); exit; # Insert data python from cassandra.cluster import Cluster session = cluster.connect('test') blob = f = open("2mbBlob", "rb").read().hex() session.execute("INSERT INTO test (key, val) VALUES (1, textAsBlob('" + blob + "'))") {noformat} Reproduced in 3.11, trunk. 
[jira] [Commented] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17425115#comment-17425115 ] Aleksandr Sorokoumov commented on CASSANDRA-16334: -- This bug happens in the [AbstractWriteResponseHandler#onFailure|https://github.com/apache/cassandra/blob/2e2db4dc40c4935305b9a2d5d271580e96dabe42/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java#L252-L265]: {code} @Override public void onFailure(InetAddressAndPort from, RequestFailureReason failureReason) { logger.trace("Got failure from {}", from); int n = waitingFor(from) ? failuresUpdater.incrementAndGet(this) : failures; failureReasonByEndpoint.put(from, failureReason); if (blockFor() + n > candidateReplicaCount()) signal(); } {code} In the reproduction steps, {{INSERT INTO TEST}} uses CL {{LOCAL_ONE}}. Accordingly, [DatacenterWriteResponseHandler#waitingFor|https://github.com/apache/cassandra/blob/2e2db4dc40c4935305b9a2d5d271580e96dabe42/src/java/org/apache/cassandra/service/DatacenterWriteResponseHandler.java#L59-L63] only waits for the local nodes: {code} private final Predicate waitingFor = InOurDcTester.endpoints(); @Override protected boolean waitingFor(InetAddressAndPort from) { return waitingFor.test(from); } {code} [AbstractWriteResponseHandler#candidateReplicaCount()|https://github.com/apache/cassandra/blob/2e2db4dc40c4935305b9a2d5d271580e96dabe42/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java#L205-L213] in the condition above, however, counts live and down replicas in ALL DCs as valid candidates: {code} protected int candidateReplicaCount() { return replicaPlan.liveAndDown().size(); } {code} As a result, even after all local nodes respond with {{FAILURE_RSP}}, the coordinator waits for responses from nodes in other DCs... but never counts them in. There is more! Following the timeout or request failure, the coordinator creates hints for the nodes in other DCs which it will try to deliver forever. 
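To see why the signalling condition above can never fire for a local-DC write spanning two DCs, the arithmetic can be modeled in a few lines of Python (a deliberately simplified sketch of the quoted Java, not actual Cassandra code; the names mirror the handler's fields):

```python
# LOCAL_ONE write to a keyspace with 'dc1': 3, 'dc2': 3 -- six replicas,
# all of which reject the oversized mutation with a failure response.
replica_dcs = ["dc1", "dc1", "dc1", "dc2", "dc2", "dc2"]

block_for = 1                                # LOCAL_ONE: one local response needed
candidate_replica_count = len(replica_dcs)   # liveAndDown(): replicas in ALL DCs

# waitingFor() (InOurDcTester) only matches local-DC nodes, so only the
# three dc1 failure responses ever increment the failure counter.
failures = sum(1 for dc in replica_dcs if dc == "dc1")

# signal() -- and thus a WriteFailure to the client -- requires this
# inequality; with remote replicas inflating the candidate count it never
# holds, and the client eventually gets a WriteTimeout instead.
print(block_for + failures > candidate_replica_count)   # 1 + 3 > 6 -> False
```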
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14795) Expose information about stored hints via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424934#comment-17424934 ] Aleksandr Sorokoumov commented on CASSANDRA-14795: -- Thank you for the review [~stefan.miklosovic]! I do not think that CASSANDRA-14309 collides with this patch. A brief review of the code did not show any conceptual clash - the changes in 14309 should not be affected by the changes in my patch. I also cherry-picked [https://github.com/instaclustr/cassandra/tree/CASSANDRA-14309] and [https://github.com/apache/cassandra-dtest/pull/153]. Resolving git conflicts was trivial and all tests added by both patches passed. > Expose information about stored hints via JMX > - > > Key: CASSANDRA-14795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14795 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Observability >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Low > Fix For: 4.x > > Time Spent: 5.5h > Remaining Estimate: 0h > > Currently there is no way to determine what kind of hints a node has, apart > from looking at the filenames (thus host-ids) on disk. Having a way to access > this information would help with debugging hint creation/replay scenarios. > In addition to the JMX method, there is a new nodetool command: > {noformat}$ bin/nodetool -h 127.0.0.1 -p 7100 listendpointspendinghints > Host ID Address Rack DC Status Total files Newest Oldest > 5762b140-3fdf-4057-9ca7-05c070ccc9c3 127.0.0.2 rack1 datacenter1 DOWN 2 > 2018-09-18 14:05:18,835 2018-09-18 14:05:08,811 > {noformat}
[jira] [Commented] (CASSANDRA-14795) Expose information about stored hints via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424439#comment-17424439 ] Aleksandr Sorokoumov commented on CASSANDRA-14795: -- [~e.dimitrova], [~azotcsit] I rebased the PR against latest trunk and squashed review commits; I haven't started new CI runs as I no longer have access to CircleCI's enterprise account. Please let me know if you have more suggestions.
[jira] [Commented] (CASSANDRA-16986) DROP Table should not recycle active CommitLog segments
[ https://issues.apache.org/jira/browse/CASSANDRA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17423500#comment-17423500 ] Aleksandr Sorokoumov commented on CASSANDRA-16986: -- [~maedhroz] Yes, please! > DROP Table should not recycle active CommitLog segments > --- > > Key: CASSANDRA-16986 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16986 > Project: Cassandra > Issue Type: Improvement > Components: Local/Commit Log >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Low > Fix For: 3.0.x, 3.11.x, 4.0.x, 4.x > > > Right now, DROP TABLE recycles all active CL segments and explicitly marks > intervals as clean for all dropping tables. I believe that this is not > necessary. > Recycling of CL segments was introduced in CASSANDRA-3578. Back then, it was > necessary to recycle all active segments because: > 1. CommitLog reused old segments after they were clean. This is no longer the > case, I believe, since CASSANDRA-6809. > 2. CommitLog segments must have been closed and recycled on {{DROP TABLE}} to > avoid resurrecting data if a table with the same name is created. This was an > issue because tables didn't have unique ids yet (CASSANDRA-5202). > Given that {{DROP TABLE}} triggers flush, which in turn cleans CL intervals > in Keyspace#unloadCF, I think that we can avoid the call to > {{forceRecycleAll}} there.
[jira] [Assigned] (CASSANDRA-16334) Replica failure causes timeout on multi-DC write
[ https://issues.apache.org/jira/browse/CASSANDRA-16334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov reassigned CASSANDRA-16334: Assignee: Aleksandr Sorokoumov
[jira] [Commented] (CASSANDRA-16975) CompactionTask#runMayThrow should not release new SSTables for offline transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422678#comment-17422678 ] Aleksandr Sorokoumov commented on CASSANDRA-16975: -- Ah, you are right! I am still not used to the fact that 4.0 is not trunk :) > CompactionTask#runMayThrow should not release new SSTables for offline > transactions > --- > > Key: CASSANDRA-16975 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16975 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.0.x > > > Right now, {{CompactionTask#runMayThrow}} releases new SSTables for offline > transactions > ([code|https://github.com/apache/cassandra/blob/f7c71f65c000c2c3ef7df1b034b8fdd822a396d8/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L227-L230]). > This change was added in CASSANDRA-8962, prior to the introduction of > lifecycle transactions in CASSANDRA-8568. I suspect that this behavior might > be undesired and could have just fallen through the cracks. > To my knowledge, this code does not cause any known bugs solely because > in-tree tools do not access the SSTables they produce before exiting. > However, if someone were to write, say, an offline compaction daemon, it might > break on subsequent compactions because newly created SSTables will be > released.
[jira] [Commented] (CASSANDRA-16975) CompactionTask#runMayThrow should not release new SSTables for offline transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422664#comment-17422664 ] Aleksandr Sorokoumov commented on CASSANDRA-16975: -- Thank you!
[jira] [Commented] (CASSANDRA-14795) Expose information about stored hints via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422646#comment-17422646 ] Aleksandr Sorokoumov commented on CASSANDRA-14795: -- [~e.dimitrova] {quote} I would suggest running the two new tests in a loop in the Circle CI multiplexer to ensure no weird flakiness appears in the future. {quote} * [j8 repeated tests|https://app.circleci.com/pipelines/github/Ge/cassandra/216/workflows/c789a0c0-2974-48b5-bd27-2a33de2d72b0] * [j11 repeated tests|https://app.circleci.com/pipelines/github/Ge/cassandra/216/workflows/b59c5c5a-4f25-47ba-ae3b-917b01db8b67]
[jira] [Commented] (CASSANDRA-16975) CompactionTask#runMayThrow should not release new SSTables for offline transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422601#comment-17422601 ] Aleksandr Sorokoumov commented on CASSANDRA-16975: -- [~adelapena] As this patch has two +1s, should I move it to {{READY TO COMMIT}}?
[jira] [Commented] (CASSANDRA-14795) Expose information about stored hints via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422105#comment-17422105 ] Aleksandr Sorokoumov commented on CASSANDRA-14795: -- Thank you for the review [~azotcsit], [~e.dimitrova]! I've created a PR https://github.com/apache/cassandra/pull/1232 as you asked. In my opinion, per-target hints together with the status column are helpful to understand what nodes we accumulate hints for and what nodes are ready for the hand-off. I added information about dc and rack to correlate the number of hints and the nodes' status with their location in a single output. I don't have too much experience operating C*, so maybe I am over-complicating it in an attempt to design a convenient UX :) Looking forward to seeing other opinions.
[jira] [Updated] (CASSANDRA-16975) CompactionTask#runMayThrow should not release new SSTables for offline transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16975: - Fix Version/s: 3.11.x 3.0.x
[jira] [Updated] (CASSANDRA-16986) DROP Table should not recycle active CommitLog segments
[ https://issues.apache.org/jira/browse/CASSANDRA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16986: - Change Category: Code Clarity (was: Performance)
[jira] [Commented] (CASSANDRA-16986) DROP Table should not recycle active CommitLog segments
[ https://issues.apache.org/jira/browse/CASSANDRA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420753#comment-17420753 ] Aleksandr Sorokoumov commented on CASSANDRA-16986: -- Patches: * [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...Ge:16986-3.0?expand=1] * [3.11, 4.0, 4.1|https://github.com/apache/cassandra/compare/cassandra-3.11...Ge:16986-3.11?expand=1]
[jira] [Updated] (CASSANDRA-16986) DROP Table should not recycle active CommitLog segments
[ https://issues.apache.org/jira/browse/CASSANDRA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16986: - Fix Version/s: 4.0.x 3.11.x 3.0.x > DROP Table should not recycle active CommitLog segments > --- > > Key: CASSANDRA-16986 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16986 > Project: Cassandra > Issue Type: Improvement > Components: Local/Commit Log >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.0.x, 4.x > > > Right now, DROP TABLE recycles all active CL segments and explicitly marks > intervals as clean for all dropping tables. I believe that this is not > necessary. > Recycling of CL segments was introduced in CASSANDRA-3578. Back then, it was > necessary to recycle all active segments because: > 1. CommitLog reused old segments after they were clean. This is no longer the > case, I believe, since CASSANDRA-6809. > 2. CommitLog segments must have been closed and recycled on {{DROP TABLE}} to > avoid resurrecting data if a table with the same name is created. This was an > issue because tables didn't have unique ids yet (CASSANDRA-5202). > Given that {{DROP TABLE}} triggers flush, which in turn cleans CL intervals > in Keyspace#unloadCF, I think that we can avoid the call to > {{forceRecycleAll}} there. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16986) DROP Table should not recycle active CommitLog segments
[ https://issues.apache.org/jira/browse/CASSANDRA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16986: - Priority: Low (was: Normal) > DROP Table should not recycle active CommitLog segments > --- > > Key: CASSANDRA-16986 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16986 > Project: Cassandra > Issue Type: Improvement > Components: Local/Commit Log >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Low > Fix For: 3.0.x, 3.11.x, 4.0.x, 4.x > > > Right now, DROP TABLE recycles all active CL segments and explicitly marks > intervals as clean for all dropping tables. I believe that this is not > necessary. > Recycling of CL segments was introduced in CASSANDRA-3578. Back then, it was > necessary to recycle all active segments because: > 1. CommitLog reused old segments after they were clean. This is no longer the > case, I believe, since CASSANDRA-6809. > 2. CommitLog segments must have been closed and recycled on {{DROP TABLE}} to > avoid resurrecting data if a table with the same name is created. This was an > issue because tables didn't have unique ids yet (CASSANDRA-5202). > Given that {{DROP TABLE}} triggers flush, which in turn cleans CL intervals > in Keyspace#unloadCF, I think that we can avoid the call to > {{forceRecycleAll}} there. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16986) DROP Table should not recycle active CommitLog segments
[ https://issues.apache.org/jira/browse/CASSANDRA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17420607#comment-17420607 ] Aleksandr Sorokoumov commented on CASSANDRA-16986: -- Thank you for the discussion! I agree with Caleb's points and will create a new patch later today with a comment that explains why we still need to recycle segments on DROP TABLE. > DROP Table should not recycle active CommitLog segments > --- > > Key: CASSANDRA-16986 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16986 > Project: Cassandra > Issue Type: Improvement > Components: Local/Commit Log >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 4.x > > > Right now, DROP TABLE recycles all active CL segments and explicitly marks > intervals as clean for all dropping tables. I believe that this is not > necessary. > Recycling of CL segments was introduced in CASSANDRA-3578. Back then, it was > necessary to recycle all active segments because: > 1. CommitLog reused old segments after they were clean. This is no longer the > case, I believe, since CASSANDRA-6809. > 2. CommitLog segments must have been closed and recycled on {{DROP TABLE}} to > avoid resurrecting data if a table with the same name is created. This was an > issue because tables didn't have unique ids yet (CASSANDRA-5202). > Given that {{DROP TABLE}} triggers flush, which in turn cleans CL intervals > in Keyspace#unloadCF, I think that we can avoid the call to > {{forceRecycleAll}} there. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16975) CompactionTask#runMayThrow should not release new SSTables for offline transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17419144#comment-17419144 ] Aleksandr Sorokoumov commented on CASSANDRA-16975: -- I removed {{throws Exception}} from the test and added patches + CI for 3.0 and 3.11 to the table in the previous comment. > CompactionTask#runMayThrow should not release new SSTables for offline > transactions > --- > > Key: CASSANDRA-16975 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16975 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 4.0.x > > > Right now, {{CompactionTask#runMayThrow}} releases new SSTables for offline > transactions > ([code|https://github.com/apache/cassandra/blob/f7c71f65c000c2c3ef7df1b034b8fdd822a396d8/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L227-L230]). > This change was added in CASSANDRA-8962, prior to the introduction of > lifecycle transactions in CASSANDRA-8568. I suspect that this behavior might > be undesired and could have just fallen through the cracks. > To my knowledge, this code does not cause any known bugs solely because > in-tree tools do not access the SSTables they produce before exiting. > However, if someone were to write, say, an offline compaction daemon, it might > break on subsequent compactions because newly created SSTables will be > released. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
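The failure mode this ticket describes — a long-running offline tool losing track of the sstables it just produced — can be sketched with a minimal model. The {{Tracker}} and {{Transaction}} types below are illustrative assumptions for the sketch, not Cassandra's real {{Tracker}}/{{LifecycleTransaction}} API.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, trimmed-down stand-ins for the real lifecycle machinery.
class Tracker {
    final Set<String> live = new HashSet<>();
}

class Transaction {
    private final boolean offline;
    Transaction(boolean offline) { this.offline = offline; }
    boolean isOffline() { return offline; }
}

public class CompactionSketch {
    // After a compaction, newly written sstables should stay registered even
    // for offline transactions; otherwise an offline compaction daemon would
    // lose track of its own output between compactions.
    static void finishCompaction(Tracker tracker, Transaction txn, List<String> newSSTables) {
        tracker.live.addAll(newSSTables);
        // The behavior the ticket objects to was, in effect:
        //   if (txn.isOffline()) release/deregister newSSTables;
        // The proposed fix keeps the new sstables registered regardless.
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker();
        finishCompaction(tracker, new Transaction(true), List.of("big-Data.db"));
        System.out.println(tracker.live.contains("big-Data.db")); // prints "true"
    }
}
```

The point of the sketch is only the invariant: a subsequent compaction started by the same offline process must still see "big-Data.db" as live.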
[jira] [Comment Edited] (CASSANDRA-16975) CompactionTask#runMayThrow should not release new SSTables for offline transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17417326#comment-17417326 ] Aleksandr Sorokoumov edited comment on CASSANDRA-16975 at 9/23/21, 11:24 AM: - ||Branch||CI|| |[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...Ge:16975-3.0?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/208/workflows/ed8ff9b7-f126-477b-8191-b39efd61345d]| |[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...Ge:16975-3.11?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/207/workflows/12bf105b-60b2-4fec-b98e-ab08ef6d33bc]| |[4.0|https://github.com/apache/cassandra/compare/trunk...Ge:CASSANDRA-16975?expand=1] |[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/200/workflows/9f0978f7-b363-440d-aa88-1a8a2b4b6316] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/200/workflows/6fbd5910-0e98-457f-8d1a-0b1f2048052c]| was (Author: ge): ||Branch||CI|| |[4.0|https://github.com/apache/cassandra/compare/trunk...Ge:CASSANDRA-16975?expand=1] |[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/200/workflows/9f0978f7-b363-440d-aa88-1a8a2b4b6316] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/200/workflows/6fbd5910-0e98-457f-8d1a-0b1f2048052c]| > CompactionTask#runMayThrow should not release new SSTables for offline > transactions > --- > > Key: CASSANDRA-16975 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16975 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 4.0.x > > > Right now, {{CompactionTask#runMayThrow}} releases new SSTables for offline > transactions > ([code|https://github.com/apache/cassandra/blob/f7c71f65c000c2c3ef7df1b034b8fdd822a396d8/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L227-L230]). 
> This change was added in CASSANDRA-8962, prior to the introduction of > lifecycle transactions in CASSANDRA-8568. I suspect that this behavior might > be undesired and could have just fallen through the cracks. > To my knowledge, this code does not cause any known bugs solely because > in-tree tools do not access the SSTables they produce before exiting. > However, if someone were to write, say, an offline compaction daemon, it might > break on subsequent compactions because newly created SSTables will be > released. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16986) DROP Table should not recycle active CommitLog segments
[ https://issues.apache.org/jira/browse/CASSANDRA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16986: - Test and Documentation Plan: I added a test that ensures that a dropped table does not leave dirty intervals in the active segments. Status: Patch Available (was: Open) This patch removes {{CommitLog.instance#forceRecycleAllSegments}} from the {{DROP TABLE}} path. In addition, {{AbstractCommitLogSegmentManager#forceRecycleAll}} no longer explicitly cleans up the segment intervals for the dropping table. ||Branch||CI|| |[trunk|https://github.com/apache/cassandra/compare/trunk...Ge:CASSANDRA-16986?expand=1] |[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/204/workflows/1125c27e-392f-49b7-9488-e702d5afbc84] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/204/workflows/a55543ad-e4ca-469a-8037-79dd508ba606]| > DROP Table should not recycle active CommitLog segments > --- > > Key: CASSANDRA-16986 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16986 > Project: Cassandra > Issue Type: Improvement > Components: Local/Commit Log >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 4.x > > > Right now, DROP TABLE recycles all active CL segments and explicitly marks > intervals as clean for all dropping tables. I believe that this is not > necessary. > Recycling of CL segments was introduced in CASSANDRA-3578. Back then, it was > necessary to recycle all active segments because: > 1. CommitLog reused old segments after they were clean. This is no longer the > case, I believe, since CASSANDRA-6809. > 2. CommitLog segments must have been closed and recycled on {{DROP TABLE}} to > avoid resurrecting data if a table with the same name is created. This was an > issue because tables didn't have unique ids yet (CASSANDRA-5202). 
> Given that {{DROP TABLE}} triggers flush, which in turn cleans CL intervals > in Keyspace#unloadCF, I think that we can avoid the call to > {{forceRecycleAll}} there. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16986) DROP Table should not recycle active CommitLog segments
[ https://issues.apache.org/jira/browse/CASSANDRA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16986: - Change Category: Performance Complexity: Normal Status: Open (was: Triage Needed) > DROP Table should not recycle active CommitLog segments > --- > > Key: CASSANDRA-16986 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16986 > Project: Cassandra > Issue Type: Improvement > Components: Local/Commit Log >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 4.x > > > Right now, DROP TABLE recycles all active CL segments and explicitly marks > intervals as clean for all dropping tables. I believe that this is not > necessary. > Recycling of CL segments was introduced in CASSANDRA-3578. Back then, it was > necessary to recycle all active segments because: > 1. CommitLog reused old segments after they were clean. This is no longer the > case, I believe, since CASSANDRA-6809. > 2. CommitLog segments must have been closed and recycled on {{DROP TABLE}} to > avoid resurrecting data if a table with the same name is created. This was an > issue because tables didn't have unique ids yet (CASSANDRA-5202). > Given that {{DROP TABLE}} triggers flush, which in turn cleans CL intervals > in Keyspace#unloadCF, I think that we can avoid the call to > {{forceRecycleAll}} there. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16986) DROP Table should not recycle active CommitLog segments
[ https://issues.apache.org/jira/browse/CASSANDRA-16986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16986: - Fix Version/s: 4.x > DROP Table should not recycle active CommitLog segments > --- > > Key: CASSANDRA-16986 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16986 > Project: Cassandra > Issue Type: Improvement > Components: Local/Commit Log >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 4.x > > > Right now, DROP TABLE recycles all active CL segments and explicitly marks > intervals as clean for all dropping tables. I believe that this is not > necessary. > Recycling of CL segments was introduced in CASSANDRA-3578. Back then, it was > necessary to recycle all active segments because: > 1. CommitLog reused old segments after they were clean. This is no longer the > case, I believe, since CASSANDRA-6809. > 2. CommitLog segments must have been closed and recycled on {{DROP TABLE}} to > avoid resurrecting data if a table with the same name is created. This was an > issue because tables didn't have unique ids yet (CASSANDRA-5202). > Given that {{DROP TABLE}} triggers flush, which in turn cleans CL intervals > in Keyspace#unloadCF, I think that we can avoid the call to > {{forceRecycleAll}} there. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-16986) DROP Table should not recycle active CommitLog segments
Aleksandr Sorokoumov created CASSANDRA-16986: Summary: DROP Table should not recycle active CommitLog segments Key: CASSANDRA-16986 URL: https://issues.apache.org/jira/browse/CASSANDRA-16986 Project: Cassandra Issue Type: Improvement Components: Local/Commit Log Reporter: Aleksandr Sorokoumov Assignee: Aleksandr Sorokoumov Right now, DROP TABLE recycles all active CL segments and explicitly marks intervals as clean for all dropping tables. I believe that this is not necessary. Recycling of CL segments was introduced in CASSANDRA-3578. Back then, it was necessary to recycle all active segments because: 1. CommitLog reused old segments after they were clean. This is no longer the case, I believe, since CASSANDRA-6809. 2. CommitLog segments must have been closed and recycled on {{DROP TABLE}} to avoid resurrecting data if a table with the same name is created. This was an issue because tables didn't have unique ids yet (CASSANDRA-5202). Given that {{DROP TABLE}} triggers flush, which in turn cleans CL intervals in Keyspace#unloadCF, I think that we can avoid the call to {{forceRecycleAll}} there. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14795) Expose information about stored hints via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419060#comment-17419060 ] Aleksandr Sorokoumov commented on CASSANDRA-14795: -- This patch introduces both a nodetool command and a virtual table for pending hints. ||Branch||CI|| |[dtest|https://github.com/apache/cassandra-dtest/pull/162]| | |[trunk|https://github.com/apache/cassandra/compare/trunk...Ge:14795?expand=1] |[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/203/workflows/81678055-f5ee-44cc-b975-49225a2dc6b0] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/203/workflows/3a1b9c54-befb-4a7a-957c-a18cf536ab93]| > Expose information about stored hints via JMX > - > > Key: CASSANDRA-14795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14795 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Observability >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Low > Fix For: 4.x > > > Currently there is no way to determine what kind of hints a node has, apart > from looking at the filenames (thus host-ids) on disk. Having a way to access > this information would help with debugging hint creation/replay scenarios. > In addition to the JMX method, there is a new nodetool command: > {noformat}$ bin/nodetool -h 127.0.0.1 -p 7100 listendpointspendinghints > Host ID Address Rack DC Status Total files Newest Oldest > 5762b140-3fdf-4057-9ca7-05c070ccc9c3 127.0.0.2 rack1 datacenter1 DOWN 2 > 2018-09-18 14:05:18,835 2018-09-18 14:05:08,811 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
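The ticket notes that, before this change, the only way to attribute pending hints to an endpoint was to inspect the hint file names (which embed host ids) on disk. A minimal sketch of that attribution step follows; the "<hostId>-<timestamp>-<generation>.hints" file-name layout is an assumption made for illustration, so check the actual {{HintsDescriptor}} naming before relying on it.

```java
import java.util.UUID;

// Sketch: recover the target endpoint's host id from a hint file name,
// assuming the name starts with the textual UUID of the host.
public class HintFileSketch {
    static UUID hostIdOf(String fileName) {
        // A textual UUID is exactly 36 characters; the host id is the prefix.
        return UUID.fromString(fileName.substring(0, 36));
    }

    public static void main(String[] args) {
        UUID id = hostIdOf("5762b140-3fdf-4057-9ca7-05c070ccc9c3-1537272318835-1.hints");
        System.out.println(id); // prints "5762b140-3fdf-4057-9ca7-05c070ccc9c3"
    }
}
```

The nodetool command and virtual table added by this patch do this aggregation server-side, so operators no longer need to scrape the data directory themselves.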
[jira] [Updated] (CASSANDRA-14795) Expose information about stored hints via JMX
[ https://issues.apache.org/jira/browse/CASSANDRA-14795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-14795: - Test and Documentation Plan: Added a dtest for the new nodetool command and the virtual table. Status: Patch Available (was: In Progress) > Expose information about stored hints via JMX > - > > Key: CASSANDRA-14795 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14795 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Observability >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Low > Fix For: 4.x > > > Currently there is no way to determine what kind of hints a node has, apart > from looking at the filenames (thus host-ids) on disk. Having a way to access > this information would help with debugging hint creation/replay scenarios. > In addition to the JMX method, there is a new nodetool command: > {noformat}$ bin/nodetool -h 127.0.0.1 -p 7100 listendpointspendinghints > Host ID Address Rack DC Status Total files Newest Oldest > 5762b140-3fdf-4057-9ca7-05c070ccc9c3 127.0.0.2 rack1 datacenter1 DOWN 2 > 2018-09-18 14:05:18,835 2018-09-18 14:05:08,811 > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16975) CompactionTask#runMayThrow should not release new SSTables for offline transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16975: - Test and Documentation Plan: I added a test to {{CompactionTaskTest}} that ensures that SSTables produced by offline CompactionTasks are not released. Status: Patch Available (was: Open) > CompactionTask#runMayThrow should not release new SSTables for offline > transactions > --- > > Key: CASSANDRA-16975 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16975 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 4.0.x > > > Right now, {{CompactionTask#runMayThrow}} releases new SSTables for offline > transactions > ([code|https://github.com/apache/cassandra/blob/f7c71f65c000c2c3ef7df1b034b8fdd822a396d8/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L227-L230]). > This change was added in CASSANDRA-8962, prior to the introduction of > lifecycle transactions in CASSANDRA-8568. I suspect that this behavior might > be undesired and could have just fallen through the cracks. > To my knowledge, this code does not cause any known bugs solely because > in-tree tools do not access the SSTables they produce before exiting. > However, if someone is to write, say, offline compaction daemon, it might > break on subsequent compactions because newly created SSTables will be > released. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16975) CompactionTask#runMayThrow should not release new SSTables for offline transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16975: - Bug Category: Parent values: Correctness(12982) Complexity: Normal Discovered By: Code Inspection Severity: Low Status: Open (was: Triage Needed) > CompactionTask#runMayThrow should not release new SSTables for offline > transactions > --- > > Key: CASSANDRA-16975 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16975 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 4.0.x > > > Right now, {{CompactionTask#runMayThrow}} releases new SSTables for offline > transactions > ([code|https://github.com/apache/cassandra/blob/f7c71f65c000c2c3ef7df1b034b8fdd822a396d8/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L227-L230]). > This change was added in CASSANDRA-8962, prior to the introduction of > lifecycle transactions in CASSANDRA-8568. I suspect that this behavior might > be undesired and could have just fallen through the cracks. > To my knowledge, this code does not cause any known bugs solely because > in-tree tools do not access the SSTables they produce before exiting. > However, if someone is to write, say, offline compaction daemon, it might > break on subsequent compactions because newly created SSTables will be > released. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16975) CompactionTask#runMayThrow should not release new SSTables for offline transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17417326#comment-17417326 ] Aleksandr Sorokoumov commented on CASSANDRA-16975: -- ||Branch||CI|| |[4.0|https://github.com/apache/cassandra/compare/trunk...Ge:CASSANDRA-16975?expand=1] |[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/200/workflows/9f0978f7-b363-440d-aa88-1a8a2b4b6316] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/200/workflows/6fbd5910-0e98-457f-8d1a-0b1f2048052c]| > CompactionTask#runMayThrow should not release new SSTables for offline > transactions > --- > > Key: CASSANDRA-16975 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16975 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 4.0.x > > > Right now, {{CompactionTask#runMayThrow}} releases new SSTables for offline > transactions > ([code|https://github.com/apache/cassandra/blob/f7c71f65c000c2c3ef7df1b034b8fdd822a396d8/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L227-L230]). > This change was added in CASSANDRA-8962, prior to the introduction of > lifecycle transactions in CASSANDRA-8568. I suspect that this behavior might > be undesired and could have just fallen through the cracks. > To my knowledge, this code does not cause any known bugs solely because > in-tree tools do not access the SSTables they produce before exiting. > However, if someone is to write, say, offline compaction daemon, it might > break on subsequent compactions because newly created SSTables will be > released. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-16975) CompactionTask#runMayThrow should not release new SSTables for offline transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17417326#comment-17417326 ] Aleksandr Sorokoumov edited comment on CASSANDRA-16975 at 9/19/21, 12:28 PM: - ||Branch||CI|| |[4.0|https://github.com/apache/cassandra/compare/trunk...Ge:CASSANDRA-16975?expand=1] |[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/200/workflows/9f0978f7-b363-440d-aa88-1a8a2b4b6316] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/200/workflows/6fbd5910-0e98-457f-8d1a-0b1f2048052c]| was (Author: ge): ||Branch||CI|| |[4.0|https://github.com/apache/cassandra/compare/trunk...Ge:CASSANDRA-16975?expand=1] |[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/200/workflows/9f0978f7-b363-440d-aa88-1a8a2b4b6316] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/200/workflows/6fbd5910-0e98-457f-8d1a-0b1f2048052c]| > CompactionTask#runMayThrow should not release new SSTables for offline > transactions > --- > > Key: CASSANDRA-16975 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16975 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 4.0.x > > > Right now, {{CompactionTask#runMayThrow}} releases new SSTables for offline > transactions > ([code|https://github.com/apache/cassandra/blob/f7c71f65c000c2c3ef7df1b034b8fdd822a396d8/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L227-L230]). > This change was added in CASSANDRA-8962, prior to the introduction of > lifecycle transactions in CASSANDRA-8568. I suspect that this behavior might > be undesired and could have just fallen through the cracks. > To my knowledge, this code does not cause any known bugs solely because > in-tree tools do not access the SSTables they produce before exiting. 
> However, if someone were to write, say, an offline compaction daemon, it might > break on subsequent compactions because newly created SSTables will be > released. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16975) CompactionTask#runMayThrow should not release new SSTables for offline transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16975: - Summary: CompactionTask#runMayThrow should not release new SSTables for offline transactions (was: CompactionTask#runMayThrow should not remove new SSTables from the tracker for offline transactions) > CompactionTask#runMayThrow should not release new SSTables for offline > transactions > --- > > Key: CASSANDRA-16975 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16975 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 4.0.x > > > Right now, {{CompactionTask#runMayThrow}} releases new SSTables for offline > transactions > ([code|https://github.com/apache/cassandra/blob/f7c71f65c000c2c3ef7df1b034b8fdd822a396d8/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L227-L230]). > This change was added in CASSANDRA-8962, prior to the introduction of > lifecycle transactions in CASSANDRA-8568. I suspect that this behavior might > be undesired and could have just fallen through the cracks. > To my knowledge, this code does not cause any known bugs solely because > in-tree tools do not access the SSTables they produce before exiting. > However, if someone is to write, say, offline compaction daemon, it might > break on subsequent compactions because newly created SSTables will be > released. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16975) CompactionTask#runMayThrow should not remove new SSTables from the tracker for offline transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16975: - Description: Right now, {{CompactionTask#runMayThrow}} releases new SSTables for offline transactions ([code|https://github.com/apache/cassandra/blob/f7c71f65c000c2c3ef7df1b034b8fdd822a396d8/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L227-L230]). This change was added in CASSANDRA-8962, prior to the introduction of lifecycle transactions in CASSANDRA-8568. I suspect that this behavior might be undesired and could have just fallen through the cracks. To my knowledge, this code does not cause any known bugs solely because in-tree tools do not access the SSTables they produce before exiting. However, if someone is to write, say, offline compaction daemon, it might break on subsequent compactions because newly created SSTables will be released. was: Right now, {{CompactionTask#runMayThrow}} removes new SSTables from the tracker for offline transactions ([code|https://github.com/apache/cassandra/blob/f7c71f65c000c2c3ef7df1b034b8fdd822a396d8/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L227-L230]). This change was added in CASSANDRA-8962, prior to the introduction of lifecycle transactions in CASSANDRA-8568. I suspect that this behavior might be undesired and could have just fallen through the cracks. To my knowledge, this code does not cause any known bugs solely because in-tree tools do not access the SSTables they produce before exiting. However, if someone is to write, say, offline compaction daemon, it might break on subsequent iterations because newly created SSTables won't be in the tracker. 
> CompactionTask#runMayThrow should not remove new SSTables from the tracker > for offline transactions > --- > > Key: CASSANDRA-16975 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16975 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 4.0.x > > > Right now, {{CompactionTask#runMayThrow}} releases new SSTables for offline > transactions > ([code|https://github.com/apache/cassandra/blob/f7c71f65c000c2c3ef7df1b034b8fdd822a396d8/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L227-L230]). > This change was added in CASSANDRA-8962, prior to the introduction of > lifecycle transactions in CASSANDRA-8568. I suspect that this behavior might > be undesired and could have just fallen through the cracks. > To my knowledge, this code does not cause any known bugs solely because > in-tree tools do not access the SSTables they produce before exiting. > However, if someone is to write, say, offline compaction daemon, it might > break on subsequent compactions because newly created SSTables will be > released. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-16975) CompactionTask#runMayThrow should not remove new SSTables from the tracker for offline transactions
Aleksandr Sorokoumov created CASSANDRA-16975: Summary: CompactionTask#runMayThrow should not remove new SSTables from the tracker for offline transactions Key: CASSANDRA-16975 URL: https://issues.apache.org/jira/browse/CASSANDRA-16975 Project: Cassandra Issue Type: Bug Components: Local/Compaction Reporter: Aleksandr Sorokoumov Assignee: Aleksandr Sorokoumov Right now, {{CompactionTask#runMayThrow}} removes new SSTables from the tracker for offline transactions ([code|https://github.com/apache/cassandra/blob/f7c71f65c000c2c3ef7df1b034b8fdd822a396d8/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L227-L230]). This change was added in CASSANDRA-8962, prior to the introduction of lifecycle transactions in CASSANDRA-8568. I suspect that this behavior might be undesired and could have just fallen through the cracks. To my knowledge, this code does not cause any known bugs solely because in-tree tools do not access the SSTables they produce before exiting. However, if someone were to write, say, an offline compaction daemon, it might break on subsequent iterations because newly created SSTables won't be in the tracker. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16975) CompactionTask#runMayThrow should not remove new SSTables from the tracker for offline transactions
[ https://issues.apache.org/jira/browse/CASSANDRA-16975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16975: - Fix Version/s: 4.0.x > CompactionTask#runMayThrow should not remove new SSTables from the tracker > for offline transactions > --- > > Key: CASSANDRA-16975 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16975 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Aleksandr Sorokoumov >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 4.0.x > > > Right now, {{CompactionTask#runMayThrow}} removes new SSTables from the > tracker for offline transactions > ([code|https://github.com/apache/cassandra/blob/f7c71f65c000c2c3ef7df1b034b8fdd822a396d8/src/java/org/apache/cassandra/db/compaction/CompactionTask.java#L227-L230]). > This change was added in CASSANDRA-8962, prior to the introduction of > lifecycle transactions in CASSANDRA-8568. I suspect that this behavior might > be undesired and could have just fallen through the cracks. > To my knowledge, this code does not cause any known bugs solely because > in-tree tools do not access the SSTables they produce before exiting. > However, if someone is to write, say, offline compaction daemon, it might > break on subsequent iterations because newly created SSTables won't be in the > tracker. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
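The failure mode described in CASSANDRA-16975 can be illustrated with a minimal sketch. This is a toy Python model, not Cassandra's actual tracker or {{CompactionTask}} code; the class and method names are invented for illustration only:

```python
# Hypothetical model of the behavior questioned above: after an *offline*
# compaction, the newly produced SSTable is released from the tracker,
# so a long-running offline tool's next compaction pass sees no inputs.

class Tracker:
    """Toy stand-in for an SSTable tracker; names are illustrative only."""

    def __init__(self, sstables):
        self.sstables = set(sstables)

    def compact(self, offline):
        # Merge all tracked "SSTables" into one new output.
        merged = "+".join(sorted(self.sstables))
        self.sstables = {merged}
        if offline:
            # Mirrors the suspect code path: the new SSTable is released
            # from the tracker for offline transactions.
            self.sstables.clear()
        return merged

online = Tracker({"a", "b"})
online.compact(offline=False)
assert online.sstables == {"a+b"}      # next iteration still has its input

offline_tool = Tracker({"a", "b"})
offline_tool.compact(offline=True)
assert offline_tool.sstables == set()  # a second pass would find nothing
```

In-tree tools exit after producing their output, so the empty tracker is never observed; a hypothetical daemon that loops over `compact()` would break on its second iteration.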
[jira] [Commented] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17411254#comment-17411254 ] Aleksandr Sorokoumov commented on CASSANDRA-16349: -- Hey [~ascott], Can you please try to reproduce the error with the *Streaming fix* patch I linked above? If you still can reproduce it, it'd help if you can attach relevant stack traces from the failing nodes. > SSTableLoader reports error when SSTable(s) do not have data for some nodes > --- > > Key: CASSANDRA-16349 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16349 > Project: Cassandra > Issue Type: Bug > Components: Tool/sstable >Reporter: Serban Teodorescu >Assignee: Serban Teodorescu >Priority: Normal > Fix For: 4.0.x > > Time Spent: 20m > Remaining Estimate: 0h > > Running SSTableLoader in verbose mode will show error(s) if there are node(s) > that do not own any data from the SSTable(s). This can happen in at least 2 > cases: > # SSTableLoader is used to stream backups while keeping the same token ranges > # SSTable(s) are created with CQLSSTableWriter to match token ranges (this > can bring better performance by using ZeroCopy streaming) > Partial output of the SSTableLoader: > {quote}ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] > Remote peer /127.0.0.4:7000 failed stream session. > ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] Remote peer > /127.0.0.3:7000 failed stream session. 
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.515KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.427KiB/s) > {quote} > > Stack trace: > {quote}java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:533) > at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:99) > at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49) > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88) > at > com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056) > at > com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) > at > com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138) > at > com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:220) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:196) > at > 
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:505) > at > org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:819) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:595) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:844) > {quote} > To reproduce create a cluster with ccm with more nodes than the RF, put some > data into it copy a SSTable and stream it. > > The error originates on the nodes, the following stack trace is shown in the > logs: > {quote}java.lang.IllegalStateException: Stream hasn't been read yet > at > com.google.common.base.Preconditions.checkState(Preconditions.java:507) > at > org.apache.cassandra.db.streaming.CassandraIncomingFile.getSize(CassandraIncomingFile.java:96) > at > org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:789) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:587) > at >
[jira] [Updated] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16349: - Fix Version/s: 4.0.x > SSTableLoader reports error when SSTable(s) do not have data for some nodes > --- > > Key: CASSANDRA-16349 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16349 > Project: Cassandra > Issue Type: Bug > Components: Tool/sstable >Reporter: Serban Teodorescu >Assignee: Serban Teodorescu >Priority: Normal > Fix For: 4.0.x > > Time Spent: 20m > Remaining Estimate: 0h > > Running SSTableLoader in verbose mode will show error(s) if there are node(s) > that do not own any data from the SSTable(s). This can happen in at least 2 > cases: > # SSTableLoader is used to stream backups while keeping the same token ranges > # SSTable(s) are created with CQLSSTableWriter to match token ranges (this > can bring better performance by using ZeroCopy streaming) > Partial output of the SSTableLoader: > {quote}ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] > Remote peer /127.0.0.4:7000 failed stream session. > ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] Remote peer > /127.0.0.3:7000 failed stream session. 
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.515KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.427KiB/s) > {quote} > > Stack trace: > {quote}java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:533) > at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:99) > at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49) > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88) > at > com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056) > at > com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) > at > com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138) > at > com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:220) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:196) > at > 
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:505) > at > org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:819) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:595) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:844) > {quote} > To reproduce create a cluster with ccm with more nodes than the RF, put some > data into it copy a SSTable and stream it. > > The error originates on the nodes, the following stack trace is shown in the > logs: > {quote}java.lang.IllegalStateException: Stream hasn't been read yet > at > com.google.common.base.Preconditions.checkState(Preconditions.java:507) > at > org.apache.cassandra.db.streaming.CassandraIncomingFile.getSize(CassandraIncomingFile.java:96) > at > org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:789) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:587) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at
[jira] [Commented] (CASSANDRA-15985) python dtest TestCqlsh added enable_scripted_user_defined_functions which breaks on 2.2
[ https://issues.apache.org/jira/browse/CASSANDRA-15985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17384729#comment-17384729 ] Aleksandr Sorokoumov commented on CASSANDRA-15985: -- I added fixes for the rest of the broken tests mentioned in [my comment above|https://issues.apache.org/jira/browse/CASSANDRA-15985?focusedCommentId=17264377=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17264377] and started [CI|https://app.circleci.com/pipelines/github/Ge/cassandra/196/workflows/61015777-9b3f-4994-8098-405b3485d658]. > python dtest TestCqlsh added enable_scripted_user_defined_functions which > breaks on 2.2 > --- > > Key: CASSANDRA-15985 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15985 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: David Capwell >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 2.2.x > > > {code} > ERROR [main] 2020-07-26 03:03:14,108 CassandraDaemon.java:744 - Exception > encountered during startup > org.apache.cassandra.exceptions.ConfigurationException: Invalid yaml. 
Please > remove properties [enable_scripted_user_defined_functions] from your > cassandra.yaml > at > org.apache.cassandra.config.YamlConfigurationLoader$MissingPropertiesChecker.check(YamlConfigurationLoader.java:146) > ~[main/:na] > at > org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:113) > ~[main/:na] > at > org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:85) > ~[main/:na] > at > org.apache.cassandra.config.DatabaseDescriptor.loadConfig(DatabaseDescriptor.java:151) > ~[main/:na] > at > org.apache.cassandra.config.DatabaseDescriptor.(DatabaseDescriptor.java:133) > ~[main/:na] > at > org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:604) > [main/:na] > at > org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:731) > [main/:na]] > {code} > This test doesn’t put a version limit, so all tests fail on 2.2 since the > property was added to all clusters. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
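The fix direction the ticket implies (only set `enable_scripted_user_defined_functions` on versions that know the property) can be sketched as a small version-gated config helper. This is an illustrative sketch, not the actual dtest change; the function name and the exact version gate are assumptions:

```python
# Hedged sketch: build per-version configuration so a 2.2 node is never
# handed a yaml property it does not understand (which aborts startup).

def config_for_version(version):
    """Return configuration options safe for the given Cassandra version.

    `version` is a (major, minor) tuple. Gating at (3, 0) is an assumption
    for illustration; the real cutoff is whichever release introduced the
    property.
    """
    options = {}
    if version >= (3, 0):
        options['enable_scripted_user_defined_functions'] = 'true'
    return options

# 2.2 clusters get a config without the unknown property...
assert 'enable_scripted_user_defined_functions' not in config_for_version((2, 2))
# ...while newer clusters still receive it.
assert 'enable_scripted_user_defined_functions' in config_for_version((4, 0))
```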
[jira] [Comment Edited] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17382609#comment-17382609 ] Aleksandr Sorokoumov edited comment on CASSANDRA-16349 at 7/19/21, 8:09 AM: *Short version of the review* * The bug is reproducible in 4.0+ * The fix for SSTableLoader LGTM as a way to avoid useless streaming tasks * I added a python dtest for the issue * We should also fix the way streaming handles empty SSTables after CASSANDRA-14115 *Code and CI* ||branch||CI|| |[dtest|https://github.com/apache/cassandra-dtest/pull/151]| | |[4.0 (baseline)|https://github.com/ge/cassandra/tree/cassandra-4.0-16349-dtest]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/188/workflows/27d68d7c-3ae8-4dcd-869b-d8bbd47157a4] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/188/workflows/844e33c6-1327-439d-980f-0112cf958829]| |[SSTableLoader fix|https://github.com/apache/cassandra/compare/trunk...Ge:16349-sstableloader-fix-4.0?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/189/workflows/2a308aa6-6ff6-4294-842a-6e691831c59f] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/189/workflows/c9729460-f035-49ab-873d-14f0cf6e2cc5]| |[Streaming fix|https://github.com/apache/cassandra/compare/trunk...Ge:16349-streaming-fix-4.0?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/193/workflows/3d4d2069-0dd5-4b86-9510-d3d140ed49bf] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/193/workflows/0738e555-aa9a-4367-9c77-96ff957147c5]| |[Streaming fix + SSTableLoader fix|https://github.com/apache/cassandra/compare/trunk...Ge:16349-streaming-sstableloader-4.0?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/194/workflows/a07f3909-987c-4ccc-8625-9166f74a7000] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/194/workflows/524e0774-ddd3-4d0d-b2d7-6d55d673a773]| *Long version of the review* I was able to reproduce the bug following the steps in 
the issue description in {{cassandra-4.0}} and {{trunk}}. The issue does not reproduce in the earlier versions. Since there were no changes in the SSTableLoader between {{3.11}} and {{trunk}}, I got curious whether the fix should be on the streaming side instead. AFAIU the failing assertion ([link|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/streaming/CassandraIncomingFile.java#L96]) was introduced in CASSANDRA-14115 as a sanity check that the file's size is not accessed before it has been read. However, this assertion might be incorrect as the default state for the size is -1, and the intention is to verify that the value has been updated. As an experiment, I changed the assertion in {{getSize}} and re-ran the test. Streaming tasks started to crash in [StreamReceiveTask#receive|https://github.com/apache/cassandra/blob/9cc7a0025d8b0859d8e9c947f6fdffd8455dd141/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java#L87] due to [no open SSTable writers|https://github.com/apache/cassandra/blob/9cc7a0025d8b0859d8e9c947f6fdffd8455dd141/src/java/org/apache/cassandra/io/sstable/format/RangeAwareSSTableWriter.java#L168-L171]. In my opinion, this is a bug as C* could handle streaming empty SSTables in prior versions, so I created a patch that handles empty streams without throwing exceptions. Even though it works without Serban's SSTableLoader fix, we should include it to prevent SSTableLoader from doing unnecessary work. 
was (Author: ge): *Short version of the review* * The bug is reproducible in 4.0+ * The fix for SSTableLoader LGTM as a way to avoid useless streaming tasks * I added a python dtest for the issue * We should also fix the way streaming handles empty SSTables after CASSANDRA-14115 *Code and CI* ||branch||CI|| |[dtest|https://github.com/apache/cassandra-dtest/pull/151]| | |[4.0 (baseline)|https://github.com/ge/cassandra/tree/cassandra-4.0-16349-dtest]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/188/workflows/27d68d7c-3ae8-4dcd-869b-d8bbd47157a4] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/188/workflows/844e33c6-1327-439d-980f-0112cf958829]| |[SSTableLoader fix|https://github.com/apache/cassandra/compare/trunk...Ge:16349-sstableloader-fix-4.0?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/189/workflows/2a308aa6-6ff6-4294-842a-6e691831c59f] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/189/workflows/c9729460-f035-49ab-873d-14f0cf6e2cc5]| |[Streaming fix|https://github.com/apache/cassandra/compare/trunk...Ge:16349-streaming-fix-4.0?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/191/workflows/4855a5e0-8ba3-4007-8e87-3c4094702b53]
[jira] [Comment Edited] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17382609#comment-17382609 ] Aleksandr Sorokoumov edited comment on CASSANDRA-16349 at 7/17/21, 4:12 PM: *Short version of the review* * The bug is reproducible in 4.0+ * The fix for SSTableLoader LGTM as a way to avoid useless streaming tasks * I added a python dtest for the issue * We should also fix the way streaming handles empty SSTables after CASSANDRA-14115 *Code and CI* ||branch||CI|| |[dtest|https://github.com/apache/cassandra-dtest/pull/151]| | |[4.0 (baseline)|https://github.com/ge/cassandra/tree/cassandra-4.0-16349-dtest]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/188/workflows/27d68d7c-3ae8-4dcd-869b-d8bbd47157a4] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/188/workflows/844e33c6-1327-439d-980f-0112cf958829]| |[SSTableLoader fix|https://github.com/apache/cassandra/compare/trunk...Ge:16349-sstableloader-fix-4.0?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/189/workflows/2a308aa6-6ff6-4294-842a-6e691831c59f] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/189/workflows/c9729460-f035-49ab-873d-14f0cf6e2cc5]| |[Streaming fix|https://github.com/apache/cassandra/compare/trunk...Ge:16349-streaming-fix-4.0?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/191/workflows/4855a5e0-8ba3-4007-8e87-3c4094702b53] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/191/workflows/b3e1df91-fcae-410f-a3d7-2f5176203586]| |[Streaming fix + SSTableLoader fix|https://github.com/apache/cassandra/compare/trunk...Ge:16349-streaming-sstableloader-4.0?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/192/workflows/0918fc51-7492-467b-8f87-8ea46830f262] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/192/workflows/0d15992d-c33b-4955-a531-e8c371beab15]| *Long version of the review* I was able to reproduce the bug following the steps in 
the issue description in {{cassandra-4.0}} and {{trunk}}. The issue does not reproduce in the earlier versions. Since there were no changes in the SSTableLoader between {{3.11}} and {{trunk}}, I got curious whether the fix should be on the streaming side instead. AFAIU the failing assertion ([link|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/streaming/CassandraIncomingFile.java#L96]) was introduced in CASSANDRA-14115 as a sanity check that the file's size is not accessed before it has been read. However, this assertion might be incorrect as the default state for the size is -1, and the intention is to verify that the value has been updated. As an experiment, I changed the assertion in {{getSize}} and re-ran the test. Streaming tasks started to crash in [StreamReceiveTask#receive|https://github.com/apache/cassandra/blob/9cc7a0025d8b0859d8e9c947f6fdffd8455dd141/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java#L87] due to [no open SSTable writers|https://github.com/apache/cassandra/blob/9cc7a0025d8b0859d8e9c947f6fdffd8455dd141/src/java/org/apache/cassandra/io/sstable/format/RangeAwareSSTableWriter.java#L168-L171]. In my opinion, this is a bug as C* could handle streaming empty SSTables in prior versions, so I created a patch that handles empty streams without throwing exceptions. Even though it works without Serban's SSTableLoader fix, we should include it to prevent SSTableLoader from doing unnecessary work. 
was (Author: ge): *The short version of the review* * The bug is reproducible in 4.0+ * The fix for SSTableLoader LGTM as a way to avoid useless streaming tasks * I added a python dtest for the issue * We should also fix the way streaming handles empty SSTables after CASSANDRA-14115 *Code and CI* ||branch||CI| |[dtest|https://github.com/apache/cassandra-dtest/pull/151]| | |[4.0 (baseline)|https://github.com/ge/cassandra/tree/cassandra-4.0-16349-dtest]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/188/workflows/27d68d7c-3ae8-4dcd-869b-d8bbd47157a4] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/188/workflows/844e33c6-1327-439d-980f-0112cf958829] |[SSTableLoader fix|https://github.com/apache/cassandra/compare/trunk...Ge:16349-sstableloader-fix-4.0?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/189/workflows/2a308aa6-6ff6-4294-842a-6e691831c59f] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/189/workflows/c9729460-f035-49ab-873d-14f0cf6e2cc5] |[Streaming fix|https://github.com/apache/cassandra/compare/trunk...Ge:16349-streaming-fix-4.0?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/191/workflows/4855a5e0-8ba3-4007-8e87-3c4094702b53]
[jira] [Updated] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16349: - Status: Needs Reviewer (was: Review In Progress) > SSTableLoader reports error when SSTable(s) do not have data for some nodes > --- > > Key: CASSANDRA-16349 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16349 > Project: Cassandra > Issue Type: Bug > Components: Tool/sstable >Reporter: Serban Teodorescu >Assignee: Serban Teodorescu >Priority: Normal > Time Spent: 20m > Remaining Estimate: 0h > > Running SSTableLoader in verbose mode will show error(s) if there are node(s) > that do not own any data from the SSTable(s). This can happen in at least 2 > cases: > # SSTableLoader is used to stream backups while keeping the same token ranges > # SSTable(s) are created with CQLSSTableWriter to match token ranges (this > can bring better performance by using ZeroCopy streaming) > Partial output of the SSTableLoader: > {quote}ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] > Remote peer /127.0.0.4:7000 failed stream session. > ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] Remote peer > /127.0.0.3:7000 failed stream session. 
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.611KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.515KiB/s) > progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% > [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% > 0.000KiB/s (avg: 1.427KiB/s) > {quote} > > Stack trace: > {quote}java.util.concurrent.ExecutionException: > org.apache.cassandra.streaming.StreamException: Stream failed > at > com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552) > at > com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:533) > at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:99) > at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49) > Caused by: org.apache.cassandra.streaming.StreamException: Stream failed > at > org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88) > at > com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056) > at > com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30) > at > com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138) > at > com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958) > at > com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748) > at > org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:220) > at > org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:196) > at > 
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:505) > at > org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:819) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:595) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:844) > {quote} > To reproduce create a cluster with ccm with more nodes than the RF, put some > data into it copy a SSTable and stream it. > > The error originates on the nodes, the following stack trace is shown in the > logs: > {quote}java.lang.IllegalStateException: Stream hasn't been read yet > at > com.google.common.base.Preconditions.checkState(Preconditions.java:507) > at > org.apache.cassandra.db.streaming.CassandraIncomingFile.getSize(CassandraIncomingFile.java:96) > at > org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:789) > at > org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:587) > at > org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at
[jira] [Commented] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17382609#comment-17382609 ] Aleksandr Sorokoumov commented on CASSANDRA-16349: -- *The short version of the review* * The bug is reproducible in 4.0+ * The fix for SSTableLoader LGTM as a way to avoid useless streaming tasks * I added a python dtest for the issue * We should also fix the way streaming handles empty SSTables after CASSANDRA-14115 *Code and CI* ||branch||CI| |[dtest|https://github.com/apache/cassandra-dtest/pull/151]| | |[4.0 (baseline)|https://github.com/ge/cassandra/tree/cassandra-4.0-16349-dtest]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/188/workflows/27d68d7c-3ae8-4dcd-869b-d8bbd47157a4] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/188/workflows/844e33c6-1327-439d-980f-0112cf958829] |[SSTableLoader fix|https://github.com/apache/cassandra/compare/trunk...Ge:16349-sstableloader-fix-4.0?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/189/workflows/2a308aa6-6ff6-4294-842a-6e691831c59f] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/189/workflows/c9729460-f035-49ab-873d-14f0cf6e2cc5] |[Streaming fix|https://github.com/apache/cassandra/compare/trunk...Ge:16349-streaming-fix-4.0?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/191/workflows/4855a5e0-8ba3-4007-8e87-3c4094702b53] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/191/workflows/b3e1df91-fcae-410f-a3d7-2f5176203586] |[Streaming fix + SSTableLoader fix|https://github.com/apache/cassandra/compare/trunk...Ge:16349-streaming-sstableloader-4.0?expand=1]|[j8|https://app.circleci.com/pipelines/github/Ge/cassandra/192/workflows/0918fc51-7492-467b-8f87-8ea46830f262] [j11|https://app.circleci.com/pipelines/github/Ge/cassandra/192/workflows/0d15992d-c33b-4955-a531-e8c371beab15] *Long version of the review* I was able to reproduce the bug following the steps in the issue description in 
{{cassandra-4.0}} and {{trunk}}. The issue does not reproduce in the earlier versions. Since there were no changes in the SSTableLoader between {{3.11}} and {{trunk}}, I got curious whether the fix should be on the streaming side instead. AFAIU the failing assertion ([link|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/streaming/CassandraIncomingFile.java#L96]) was introduced in CASSANDRA-14115 as a sanity check that the file's size is not accessed before it has been read. However, this assertion might be incorrect as the default state for the size is -1, and the intention is to verify that the value has been updated. As an experiment, I changed the assertion in {{getSize}} and re-ran the test. Streaming tasks started to crash in [StreamReceiveTask#receive|https://github.com/apache/cassandra/blob/9cc7a0025d8b0859d8e9c947f6fdffd8455dd141/src/java/org/apache/cassandra/streaming/StreamReceiveTask.java#L87] due to [no open SSTable writers|https://github.com/apache/cassandra/blob/9cc7a0025d8b0859d8e9c947f6fdffd8455dd141/src/java/org/apache/cassandra/io/sstable/format/RangeAwareSSTableWriter.java#L168-L171]. In my opinion, this is a bug as C* could handle streaming empty SSTables in prior versions, so I created a patch that handles empty streams without throwing exceptions. Even though it works without Serban's SSTableLoader fix, we should include it to prevent SSTableLoader from doing unnecessary work. > SSTableLoader reports error when SSTable(s) do not have data for some nodes > --- > > Key: CASSANDRA-16349 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16349 > Project: Cassandra > Issue Type: Bug > Components: Tool/sstable >Reporter: Serban Teodorescu >Assignee: Serban Teodorescu >Priority: Normal > Time Spent: 20m > Remaining Estimate: 0h > > Running SSTableLoader in verbose mode will show error(s) if there are node(s) > that do not own any data from the SSTable(s). 
This can happen in at least 2 cases:
> # SSTableLoader is used to stream backups while keeping the same token ranges
> # SSTable(s) are created with CQLSSTableWriter to match token ranges (this can bring better performance by using ZeroCopy streaming)
>
> Partial output of the SSTableLoader:
> {quote}ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] Remote peer /127.0.0.4:7000 failed stream session.
> ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] Remote peer /127.0.0.3:7000 failed stream session.
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% 0.000KiB/s (avg: 1.611KiB/s)
> progress:
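The SSTableLoader-side fix discussed in the review amounts to not creating streaming tasks for nodes that own none of the SSTable's data. The following is a toy Python model of that idea, not the actual Java patch; all names and the simplified non-wrapping range representation are illustrative assumptions:

```python
def ranges_overlap(a, b):
    # Treat ranges as half-open (start, end) intervals on the token ring;
    # ring wraparound is ignored for simplicity in this sketch.
    return a[0] < b[1] and b[0] < a[1]

def plan_streams(sstable_ranges, node_ownership):
    # Build a stream plan that only includes nodes owning data present in
    # the SSTable. Nodes with no overlap get no session at all, rather than
    # an empty session that later fails on the remote side.
    plan = {}
    for node, owned in node_ownership.items():
        overlapping = [r for r in owned
                       if any(ranges_overlap(r, s) for s in sstable_ranges)]
        if overlapping:
            plan[node] = overlapping
    return plan
```

Under this model, a node like /127.0.0.4 above, which owns no tokens covered by the SSTable, would simply be absent from the plan instead of reporting a failed (empty) stream session.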
[jira] [Commented] (CASSANDRA-15985) python dtest TestCqlsh added enable_scripted_user_defined_functions which breaks on 2.2
[ https://issues.apache.org/jira/browse/CASSANDRA-15985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17382572#comment-17382572 ] Aleksandr Sorokoumov commented on CASSANDRA-15985:
--

Thanks for reviewing my patch [~e.dimitrova]! Feel free to cherry-pick the fix for TestCqlsh#test_pycodestyle_compliance. Regarding the proposed changes, should I maybe ask in Slack? wdyt?

> python dtest TestCqlsh added enable_scripted_user_defined_functions which breaks on 2.2
> ---
>
> Key: CASSANDRA-15985
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15985
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest/python
> Reporter: David Capwell
> Assignee: Aleksandr Sorokoumov
> Priority: Normal
> Fix For: 2.2.x
>
> {code}
> ERROR [main] 2020-07-26 03:03:14,108 CassandraDaemon.java:744 - Exception encountered during startup
> org.apache.cassandra.exceptions.ConfigurationException: Invalid yaml. Please remove properties [enable_scripted_user_defined_functions] from your cassandra.yaml
> at org.apache.cassandra.config.YamlConfigurationLoader$MissingPropertiesChecker.check(YamlConfigurationLoader.java:146) ~[main/:na]
> at org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:113) ~[main/:na]
> at org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:85) ~[main/:na]
> at org.apache.cassandra.config.DatabaseDescriptor.loadConfig(DatabaseDescriptor.java:151) ~[main/:na]
> at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:133) ~[main/:na]
> at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:604) [main/:na]
> at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:731) [main/:na]
> {code}
> This test doesn't put a version limit, so all tests fail on 2.2 since the property was added to all clusters.
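The fix the ticket calls for is the usual dtest pattern of gating yaml properties on the cluster version, so that options unknown to 2.2 are never written into its cassandra.yaml. A minimal Python sketch of that pattern follows; the function name and the exact version cutoff for enable_scripted_user_defined_functions are assumptions, not the actual dtest code:

```python
def cluster_config(version):
    # Base options assumed valid on every version under test.
    opts = {'enable_user_defined_functions': 'true'}
    # Gate newer properties on the cluster version. The 3.0 cutoff for
    # enable_scripted_user_defined_functions is an assumption here; on 2.2
    # an unknown property fails startup with "Invalid yaml. Please remove
    # properties [...] from your cassandra.yaml".
    if version >= (3, 0):
        opts['enable_scripted_user_defined_functions'] = 'true'
    return opts
```

A test fixture would call this with the version tuple of the cluster being populated, so 2.2 clusters start cleanly while newer clusters still get scripted UDFs enabled.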
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes
[ https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-16349:
-
Reviewers: Aleksandr Sorokoumov (was: Aleksandr Sorokoumov)
Status: Review In Progress (was: Patch Available)

> SSTableLoader reports error when SSTable(s) do not have data for some nodes
> ---
>
> Key: CASSANDRA-16349
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16349
> Project: Cassandra
> Issue Type: Bug
> Components: Tool/sstable
> Reporter: Serban Teodorescu
> Assignee: Serban Teodorescu
> Priority: Normal
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Running SSTableLoader in verbose mode will show error(s) if there are node(s) that do not own any data from the SSTable(s). This can happen in at least 2 cases:
> # SSTableLoader is used to stream backups while keeping the same token ranges
> # SSTable(s) are created with CQLSSTableWriter to match token ranges (this can bring better performance by using ZeroCopy streaming)
>
> Partial output of the SSTableLoader:
> {quote}ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] Remote peer /127.0.0.4:7000 failed stream session.
> ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] Remote peer /127.0.0.3:7000 failed stream session.
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% 0.000KiB/s (avg: 1.611KiB/s)
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% 0.000KiB/s (avg: 1.611KiB/s)
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% 0.000KiB/s (avg: 1.515KiB/s)
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% 0.000KiB/s (avg: 1.427KiB/s)
> {quote}
>
> Stack trace:
> {quote}java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
> at com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552)
> at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:533)
> at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:99)
> at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49)
> Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
> at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88)
> at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056)
> at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
> at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138)
> at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958)
> at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748)
> at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:220)
> at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:196)
> at
org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:505)
> at org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:819)
> at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:595)
> at org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189)
> at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:844)
> {quote}
> To reproduce, create a cluster with ccm with more nodes than the RF, put some data into it, copy an SSTable, and stream it.
>
> The error originates on the nodes; the following stack trace is shown in the logs:
> {quote}java.lang.IllegalStateException: Stream hasn't been read yet
> at com.google.common.base.Preconditions.checkState(Preconditions.java:507)
> at org.apache.cassandra.db.streaming.CassandraIncomingFile.getSize(CassandraIncomingFile.java:96)
> at org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:789)
> at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:587)
> at
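The "Stream hasn't been read yet" check that fires above can be modeled with a small Python sketch of one plausible reading of the review comment (this is not the Java implementation; the class and names are hypothetical): the size field starts at a -1 sentinel, reading the stream populates it, and the guard checks the sentinel itself, so an empty-but-read stream legitimately reports size 0 instead of tripping the check.

```python
UNKNOWN = -1  # sentinel: size not populated yet

class IncomingFileModel:
    # Toy stand-in for a streaming file's size bookkeeping.
    def __init__(self):
        self._size = UNKNOWN

    def read(self, payload):
        # Reading populates the size; an empty stream yields 0, which is a
        # valid known size, distinct from the UNKNOWN sentinel.
        self._size = len(payload)

    def get_size(self):
        # Guard on the sentinel, i.e. "was the value ever updated?",
        # rather than on a separate "stream read" flag.
        if self._size == UNKNOWN:
            raise RuntimeError("Stream hasn't been read yet")
        return self._size
```

In this model, calling get_size() before read() still fails fast, but receiving an empty SSTable stream no longer does, which matches the behavior the streaming-fix branch aims for.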