[jira] [Updated] (CASSANDRA-12929) dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
[ https://issues.apache.org/jira/browse/CASSANDRA-12929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-12929:
---------------------------------------
    Status: Ready to Commit  (was: Patch Available)

> dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
> ------------------------------------------------------------------------------------------
>
>         Key: CASSANDRA-12929
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-12929
>     Project: Cassandra
>  Issue Type: Bug
>    Reporter: Michael Shuler
>    Assignee: Paulo Motta
>      Labels: dtest, test-failure
>
> example failure:
> http://cassci.datastax.com/job/trunk_novnode_dtest/494/testReport/bootstrap_test/TestBootstrap/simple_bootstrap_test_small_keepalive_period
> {noformat}
> Error Message
> Expected [['COMPLETED']] from SELECT bootstrapped FROM system.local WHERE key='local', but got [[u'IN_PROGRESS']]
>
> >> begin captured logging <<
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-YmnyEI
> dtest: DEBUG: Done setting configuration options:
> {'num_tokens': None, 'phi_convict_threshold': 5, 'start_rpc': 'true'}
> cassandra.cluster: INFO: New Cassandra host discovered
> >> end captured logging <<
>
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
>     testMethod()
>   File "/home/automaton/cassandra-dtest/tools/decorators.py", line 46, in wrapped
>     f(obj)
>   File "/home/automaton/cassandra-dtest/bootstrap_test.py", line 163, in simple_bootstrap_test_small_keepalive_period
>     assert_bootstrap_state(self, node2, 'COMPLETED')
>   File "/home/automaton/cassandra-dtest/tools/assertions.py", line 297, in assert_bootstrap_state
>     assert_one(session, "SELECT bootstrapped FROM system.local WHERE key='local'", [expected_bootstrap_state])
>   File "/home/automaton/cassandra-dtest/tools/assertions.py", line 130, in assert_one
>     assert list_res == [expected], "Expected {} from {}, but got {}".format([expected], query, list_res)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
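The assertion at the bottom of the trace comes from dtest's assert_one helper. A self-contained sketch of that pattern, with hypothetical lambdas standing in for a real driver session (a bootstrapping node still reports IN_PROGRESS in system.local):

```python
def assert_one(run_query, query, expected):
    # Mirrors the shape of the dtest helper in the stacktrace: run the
    # query, normalise each row to a list, and demand exactly one row
    # equal to the expected one.
    res = run_query(query)
    list_res = [list(row) for row in res]
    assert list_res == [expected], \
        "Expected {} from {}, but got {}".format([expected], query, list_res)

# Hypothetical stand-ins for a real driver session:
bootstrapping = lambda q: [("IN_PROGRESS",)]
finished = lambda q: [("COMPLETED",)]

query = "SELECT bootstrapped FROM system.local WHERE key='local'"
assert_one(finished, query, ["COMPLETED"])   # passes silently

try:
    assert_one(bootstrapping, query, ["COMPLETED"])
    failed = False
except AssertionError as e:
    failed = True
    message = str(e)
```

The failure above is exactly this second case: the query ran before bootstrap finished, so the single-row comparison failed.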
[jira] [Commented] (CASSANDRA-12929) dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
[ https://issues.apache.org/jira/browse/CASSANDRA-12929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950206#comment-15950206 ]

Yuki Morishita commented on CASSANDRA-12929:
--------------------------------------------
Nice catch. +1.

> dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (CASSANDRA-12929) dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
[ https://issues.apache.org/jira/browse/CASSANDRA-12929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yuki Morishita updated CASSANDRA-12929:
---------------------------------------
    Reviewer: Yuki Morishita

> dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Comment Edited] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh
[ https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950041#comment-15950041 ]

Blake Eggleston edited comment on CASSANDRA-13392 at 3/30/17 11:13 PM:
-----------------------------------------------------------------------
This seems like a documentation problem. I don't think we should change the behavior of nodetool refresh; it should be treated as a shortcut for stopping a node, adding sstables, and restarting it. That's it. If an operator wants to add sstables to a live node, we should expect that they're adding them in the correct state, and the nodetool docs should explicitly warn them to fix repaired statuses before adding the sstables to the data directories. My reasoning here is:
* You each bring up valid use cases for nodetool refresh working one way or the other, so we can't really make assumptions about the operator's intentions.
* If a node unexpectedly dies after the sstables have been added to the data directories, it will load the sstables without adjusting any metadata when it restarts, so the failure-recovery behavior would differ significantly from the normal behavior.
* After (briefly) reviewing the CFS.loadNewSSTables code, along with the other addSSTable code, I'm not confident that removing the repaired status of manually added sstables wouldn't also inadvertently clear the repaired status of legitimate sstables in some cases. Specifically, there doesn't appear to be anything preventing streamed repaired sstables from appearing on disk between when we get the initial set of sstables and when we start scanning the files on disk for sstables to add.

> Repaired status should be cleared on new sstables when issuing nodetool refresh
> -------------------------------------------------------------------------------
>
>         Key: CASSANDRA-13392
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-13392
>     Project: Cassandra
>  Issue Type: Bug
>    Reporter: Marcus Eriksson
>    Assignee: Marcus Eriksson
>     Fix For: 3.0.x, 3.11.x, 4.x
>
> We can't assume that new sstables added when doing nodetool refresh (ColumnFamilyStore#loadNewSSTables) are actually repaired if they have the repairedAt flag set
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh
[ https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950041#comment-15950041 ]

Blake Eggleston commented on CASSANDRA-13392:
---------------------------------------------
This seems like a documentation problem. I don't think we should change the behavior of nodetool refresh; it should be treated as a shortcut for stopping a node, adding sstables, and restarting it. That's it. If an operator wants to add sstables to a live node, we should expect that they're adding them in the correct state, and the nodetool docs should explicitly warn them to fix repaired statuses before adding the sstables to the data set. My reasoning here is:
* You each bring up valid use cases for nodetool refresh working one way or the other, so we can't really make assumptions about the operator's intentions.
* If a node unexpectedly dies after the sstables have been added to the data set, it will load the sstables without adjusting any metadata when it restarts, so the failure-recovery behavior would differ significantly from the normal behavior.
* After (briefly) reviewing the CFS.loadNewSSTables code, along with the other addSSTable code, I'm not confident that removing the repaired status of manually added sstables wouldn't also inadvertently clear the repaired status of legitimate sstables in some cases. Specifically, there doesn't appear to be anything preventing streamed repaired sstables from appearing on disk between when we get the initial set of sstables and when we start scanning the files on disk for sstables to add.
> Repaired status should be cleared on new sstables when issuing nodetool refresh
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
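Blake's last bullet describes a window between snapshotting the live sstable set and scanning the data directory for files to add. A toy Python model of that race (all file names here are made up; the real logic lives in ColumnFamilyStore#loadNewSSTables):

```python
import threading
import time

# Toy model of the race: "new" sstables are computed as the set of files
# on disk minus the set known when refresh took its snapshot, so anything
# streamed in during that window is misclassified as operator-added.
disk = {"legit-1-Data.db"}          # files already live in the table
lock = threading.Lock()

def load_new_sstables(known_before):
    time.sleep(0.05)                # window before the directory scan
    with lock:
        return disk - known_before  # files refresh would treat as new

def stream_in_repaired_sstable():
    # A repaired sstable arriving via streaming during the window.
    with lock:
        disk.add("streamed-repaired-2-Data.db")

snapshot = set(disk)                # refresh snapshots the live set first
t = threading.Thread(target=stream_in_repaired_sstable)
t.start()
new_files = load_new_sstables(snapshot)
t.join()
```

Here the streamed file ends up in `new_files`, so a refresh that cleared repaired status on "new" sstables would also clear it on the legitimately repaired, streamed-in one, which is exactly the concern raised above.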
[jira] [Resolved] (CASSANDRA-12406) dtest failure in pushed_notifications_test.TestPushedNotifications.move_single_node_test
[ https://issues.apache.org/jira/browse/CASSANDRA-12406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paulo Motta resolved CASSANDRA-12406.
-------------------------------------
    Resolution: Fixed

The test has been stable for the last several runs and on newer branches, so I'm closing this.

> dtest failure in pushed_notifications_test.TestPushedNotifications.move_single_node_test
> ----------------------------------------------------------------------------------------
>
>         Key: CASSANDRA-12406
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-12406
>     Project: Cassandra
>  Issue Type: Bug
>    Reporter: Sean McCarthy
>    Assignee: Paulo Motta
>      Labels: dtest
>     Fix For: 2.1.x
>
> Attachments: node1.log, node2.log, node3.log
>
> example failure:
> http://cassci.datastax.com/job/cassandra-2.1_novnode_dtest/271/testReport/pushed_notifications_test/TestPushedNotifications/move_single_node_test
> {code}
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
>     testMethod()
>   File "/home/automaton/cassandra-dtest/pushed_notifications_test.py", line 110, in move_single_node_test
>     self.assertEquals(1, len(notifications), notifications)
>   File "/usr/lib/python2.7/unittest/case.py", line 513, in assertEqual
>     assertion_func(first, second, msg=msg)
>   File "/usr/lib/python2.7/unittest/case.py", line 506, in _baseAssertEqual
>     raise self.failureException(msg)
> "[{'change_type': u'MOVED_NODE', 'address': ('127.0.0.1', 9042)}, {'change_type': u'NEW_NODE', 'address': ('127.0.0.1', 9042)}]
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (CASSANDRA-12929) dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
[ https://issues.apache.org/jira/browse/CASSANDRA-12929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paulo Motta updated CASSANDRA-12929:
------------------------------------
    Status: Patch Available  (was: Open)

> dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Resolved] (CASSANDRA-12428) dtest failure in topology_test.TestTopology.simple_decommission_test
[ https://issues.apache.org/jira/browse/CASSANDRA-12428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paulo Motta resolved CASSANDRA-12428.
-------------------------------------
    Resolution: Fixed

Closing since the test is fixed on CI.

> dtest failure in topology_test.TestTopology.simple_decommission_test
> --------------------------------------------------------------------
>
>         Key: CASSANDRA-12428
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-12428
>     Project: Cassandra
>  Issue Type: Bug
>    Reporter: Sean McCarthy
>    Assignee: Paulo Motta
>      Labels: dtest
> Attachments: node1.log, node2.log, node3.log
>
> example failure:
> http://cassci.datastax.com/job/cassandra-2.1_dtest/499/testReport/topology_test/TestTopology/simple_decommission_test
> {code}
> Stacktrace
>   File "/usr/lib/python2.7/unittest/case.py", line 358, in run
>     self.tearDown()
>   File "/home/automaton/cassandra-dtest/dtest.py", line 666, in tearDown
>     raise AssertionError('Unexpected error in log, see stdout')
> "Unexpected error in log, see stdout
> {code}
> {code}
> Standard Output
> Unexpected error in node2 log, error:
> ERROR [OptionalTasks:1] 2016-08-09 22:19:17,578 CassandraDaemon.java:231 - Exception in thread Thread[OptionalTasks:1,5,main]
> java.lang.AssertionError: -1798176113661253264 not found in
> -9176030984652505006, -871714249145979, -8567082690920363685, -7728355195270516929, -7671560790707332672, -6815296744215479977, -6611548514765694876, -6137228431100324821, -5871381962314776798, -5709026171638534111, -5696874364498510312, -4663855838820854356, -3304329091857535864, -3251864206536309230, -3188788124715894197, -2549476409976316844, -2423479156112489442, -2389574204458609132, -2160965082438649456, -2046105283339446875, -1622678693166245335, -1421783322562475411, -503110248141412377, -256005860529123222, -229477804731423425, -144610334523764289, -64851179421923626, 127314057436704028, 313816817127566322, 376139846959091135, 561504311435506912, 858207556605072954, 1261151151588160011, 1454126256475083217, 1618377671275204279, 2317929712453820894, 2560612758275508783, 2587728682790085050, 2848178890309615427, 2885660694771463522, 3140716395155672330, 3178980457497133951, 3591038406660159757, 3766734787881223437, 3769457468208792646, 3824534990286253644, 5183723622628782738, 5314317607985127226, 584580052753930, 6235156095343170404, 6242029497543352525, 6281404742986921776, 6589819833145109726, 6821551756387826137, 6889949766088620327, 7754073703959464783, 7756209389182352710, 7952201212324370303, 8053856175511744133, 8081402847785658462, 8227459864244671435, 8350507973899452057, 8826283221671184683, 912045907067355
> at org.apache.cassandra.locator.TokenMetadata.getPredecessor(TokenMetadata.java:717) ~[main/:na]
> at org.apache.cassandra.locator.TokenMetadata.getPrimaryRangesFor(TokenMetadata.java:661) ~[main/:na]
> at org.apache.cassandra.db.SizeEstimatesRecorder.run(SizeEstimatesRecorder.java:69) ~[main/:na]
> at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) ~[main/:na]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_80]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_80]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_80]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_80]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_80]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Comment Edited] (CASSANDRA-12929) dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
[ https://issues.apache.org/jira/browse/CASSANDRA-12929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950013#comment-15950013 ]

Paulo Motta edited comment on CASSANDRA-12929 at 3/30/17 10:52 PM:
-------------------------------------------------------------------
Hmm, the version check {{CassandraVersion.isSupportedBy}} only considers versions within the same major as compatible, so keep-alive was never being enabled. I updated the keep-alive version check to use {{compareTo}}, which compares both majors and minors, making 4.0 compatible with 3.10.

Also updated the dtest to use byteman so it runs faster: [PR|https://github.com/riptano/cassandra-dtest/pull/1458]

Trivial patch attached:
||trunk||dtest||
|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-12929]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:12929]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-12929-dtest/lastCompletedBuild/testReport/]|

Mind having a quick look [~yukim]? Thanks!

was (Author: pauloricardomg):
Hmm, the version check {{CassandraVersion.isSupportedBy}} only considers versions within the same major as compatible, so keep-alive was never being enabled. I updated the keep-alive version check to use {{compareTo}}, which compares both majors and minors, making 4.0 compatible with 3.10.

Also updated the dtest to use byteman so it runs faster: [PR|https://github.com/riptano/cassandra-dtest/pull/1458]

Trivial patch attached:
||12929||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-12929...pauloricardomg:12929-trunk]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:trunk]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-12929-trunk-dtest/lastCompletedBuild/testReport/]|

Mind having a quick look [~yukim]? Thanks!
> dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Commented] (CASSANDRA-12929) dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
[ https://issues.apache.org/jira/browse/CASSANDRA-12929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15950013#comment-15950013 ]

Paulo Motta commented on CASSANDRA-12929:
-----------------------------------------
Hmm, the version check {{CassandraVersion.isSupportedBy}} only considers versions within the same major as compatible, so keep-alive was never being enabled. I updated the keep-alive version check to use {{compareTo}}, which compares both majors and minors, making 4.0 compatible with 3.10.

Also updated the dtest to use byteman so it runs faster: [PR|https://github.com/riptano/cassandra-dtest/pull/1458]

Trivial patch attached:
||12929||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-12929...pauloricardomg:12929-trunk]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:trunk]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-12929-trunk-dtest/lastCompletedBuild/testReport/]|

Mind having a quick look [~yukim]? Thanks!
> dtest failure in bootstrap_test.TestBootstrap.simple_bootstrap_test_small_keepalive_period
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
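The fix Paulo describes swaps a same-major compatibility check for a plain version comparison. A toy sketch of the distinction (the class and helper names below are made up for illustration; the real logic lives in Cassandra's {{CassandraVersion}} class):

```python
from functools import total_ordering

@total_ordering
class Version:
    """Toy two-component version, standing in for CassandraVersion."""
    def __init__(self, text):
        self.major, self.minor = (int(p) for p in text.split(".")[:2])

    def __eq__(self, other):
        return (self.major, self.minor) == (other.major, other.minor)

    def __lt__(self, other):
        return (self.major, self.minor) < (other.major, other.minor)

def supported_by_same_major(min_version, peer):
    # Old-style check: a peer only qualifies if it shares the major,
    # so a 4.0 node never sees 3.10 as satisfying the minimum.
    return peer.major == min_version.major and peer >= min_version

def supported_by_compare_to(min_version, peer):
    # New-style check: an ordinary ordering comparison across majors
    # and minors, so 4.0 satisfies a 3.10 minimum.
    return peer >= min_version

keepalive_min = Version("3.10")   # minimum version for the feature
node = Version("4.0")
```

With the old-style check a 4.0 node never enables the feature against a 3.10 minimum; with the plain comparison it does, which matches the behavior change described in the comment.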
[jira] [Assigned] (CASSANDRA-11370) Display sstable count per level according to repair status on nodetool tablestats
[ https://issues.apache.org/jira/browse/CASSANDRA-11370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paulo Motta reassigned CASSANDRA-11370:
---------------------------------------
    Assignee:     (was: Paulo Motta)

> Display sstable count per level according to repair status on nodetool tablestats
> ---------------------------------------------------------------------------------
>
>         Key: CASSANDRA-11370
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-11370
>     Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>    Reporter: Paulo Motta
>    Priority: Minor
>      Labels: lhf
>
> After CASSANDRA-8004 we still display sstables in each level on nodetool tablestats as if we had a single compaction strategy, while we have one strategy for repaired and another for unrepaired data.
> We should split the display into repaired and unrepaired sets, so this:
> SSTables in each level: [2, 20/10, 15, 0, 0, 0, 0, 0, 0]
> Would become:
> SSTables in each level (repaired): [1, 10, 0, 0, 0, 0, 0, 0, 0]
> SSTables in each level (unrepaired): [1, 10, 15, 0, 0, 0, 0, 0, 0]
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
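The proposed split boils down to bucketing sstables per level twice, keyed on the repaired flag. A minimal sketch reproducing the ticket's example output (the (level, repaired) pairs below are made up; real values would come from each sstable's stats metadata):

```python
def levels_by_repair_status(sstables, max_level=9):
    """Count sstables per LCS level, split by repaired flag.

    `sstables` is an iterable of (level, is_repaired) pairs.
    """
    repaired = [0] * max_level
    unrepaired = [0] * max_level
    for level, is_repaired in sstables:
        (repaired if is_repaired else unrepaired)[level] += 1
    return repaired, unrepaired

# Made-up data matching the example in the ticket description:
sstables = ([(0, True)] + [(1, True)] * 10 +
            [(0, False)] + [(1, False)] * 10 + [(2, False)] * 15)
rep, unrep = levels_by_repair_status(sstables)
print("SSTables in each level (repaired):", rep)
print("SSTables in each level (unrepaired):", unrep)
```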
[jira] [Assigned] (CASSANDRA-3486) Node Tool command to stop repair
[ https://issues.apache.org/jira/browse/CASSANDRA-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paulo Motta reassigned CASSANDRA-3486:
--------------------------------------
    Assignee:     (was: Paulo Motta)

> Node Tool command to stop repair
> --------------------------------
>
>         Key: CASSANDRA-3486
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-3486
>     Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
> Environment: JVM
>    Reporter: Vijay
>    Priority: Minor
>      Labels: repair
>     Fix For: 2.1.x
>
> Attachments: 0001-stop-repair-3583.patch
>
> After CASSANDRA-1740, if the validation compaction is stopped then the repair will hang. This ticket will allow users to kill the original repair.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Assigned] (CASSANDRA-11190) Fail fast repairs
[ https://issues.apache.org/jira/browse/CASSANDRA-11190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta reassigned CASSANDRA-11190: --- Assignee: (was: Paulo Motta) > Fail fast repairs > - > > Key: CASSANDRA-11190 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11190 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging >Reporter: Paulo Motta >Priority: Minor > > Currently, if one node fails any phase of the repair (validation, streaming), > the repair session is aborted, but the other nodes are not notified and keep > doing either validation or syncing with other nodes. > With CASSANDRA-10070 automatically scheduling repairs and potentially > scheduling retries it would be nice to make sure all nodes abort failed > repairs in other to be able to start other repairs safely in the same nodes. > From CASSANDRA-10070: > bq. As far as I understood, if there are nodes A, B, C running repair, A is > the coordinator. If validation or streaming fails on node B, the coordinator > (A) is notified and fails the repair session, but node C will remain doing > validation and/or streaming, what could cause problems (or increased load) if > we start another repair session on the same range. > bq. We will probably need to extend the repair protocol to perform this > cleanup/abort step on failure. We already have a legacy cleanup message that > doesn't seem to be used in the current protocol that we could maybe reuse to > cleanup repair state after a failure. This repair abortion will probably have > intersection with CASSANDRA-3486. In any case, this is a separate (but > related) issue and we should address it in an independent ticket, and make > this ticket dependent on that. > On CASSANDRA-5426 [~slebresne] suggested doing this to avoid unexpected > conditions/hangs: > bq. I wonder if maybe we should have more of a fail-fast policy when there is > errors. 
For instance, if one node fails its validation phase, maybe it might > be worth failing right away and letting the user re-trigger a repair once he has > fixed whatever was the source of the error, rather than still > differencing/syncing the other nodes. > bq. Going a bit further, I think we should add 2 messages to interrupt the > validation and sync phase. If only because that could be useful to users if > they need to stop a repair for some reason, but also, if we get an error > during validation from one node, we could use that to interrupt the other > nodes and thus fail fast while minimizing the amount of work done uselessly. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (CASSANDRA-12842) testall failure in org.apache.cassandra.pig.CqlTableTest.testCqlNativeStorageCompositeKeyTable
[ https://issues.apache.org/jira/browse/CASSANDRA-12842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta resolved CASSANDRA-12842. - Resolution: Fixed Closing since it hasn't failed for the last ~5 months. > testall failure > in org.apache.cassandra.pig.CqlTableTest.testCqlNativeStorageCompositeKeyTable > - > > Key: CASSANDRA-12842 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12842 > Project: Cassandra > Issue Type: Bug >Reporter: Sean McCarthy >Assignee: Paulo Motta > Labels: test-failure, testall > > example failure: > http://cassci.datastax.com/job/cassandra-2.2_testall/598/testReport/org.apache.cassandra.pig/CqlTableTest/testCqlNativeStorageCompositeKeyTable/ > {code} > Error Message > expected:<4> but was:<9> > {code}{code} > Stacktrace > junit.framework.AssertionFailedError: expected:<4> but was:<9> > at > org.apache.cassandra.pig.CqlTableTest.compositeKeyTableTest(CqlTableTest.java:200) > at > org.apache.cassandra.pig.CqlTableTest.testCqlNativeStorageCompositeKeyTable(CqlTableTest.java:172) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh
[ https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-13392: Reviewer: Blake Eggleston > Repaired status should be cleared on new sstables when issuing nodetool > refresh > --- > > Key: CASSANDRA-13392 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13392 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 3.0.x, 3.11.x, 4.x > > > We can't assume that new sstables added when doing nodetool refresh > (ColumnFamilyStore#loadNewSSTables) are actually repaired if they have the > repairedAt flag set -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-13289) Make it possible to monitor an ideal consistency level separate from actual consistency level
[ https://issues.apache.org/jira/browse/CASSANDRA-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ariel Weisberg updated CASSANDRA-13289: --- Resolution: Fixed Status: Resolved (was: Ready to Commit) Committed as [6f647aaa0df6f90ee298d372e624c9e3c1ae937e|https://github.com/apache/cassandra/commit/6f647aaa0df6f90ee298d372e624c9e3c1ae937e] > Make it possible to monitor an ideal consistency level separate from actual > consistency level > - > > Key: CASSANDRA-13289 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13289 > Project: Cassandra > Issue Type: New Feature > Components: Core >Reporter: Ariel Weisberg >Assignee: Ariel Weisberg > Fix For: 4.0 > > > As an operator there are several issues related to multi-datacenter > replication and consistency you may want to have more information on from > your production database. > For instance: if your application writes at LOCAL_QUORUM, how often are those > writes failing to achieve EACH_QUORUM at other data centers? If you failed > your application over to one of those data centers roughly how inconsistent > might it be given the number of writes that didn't propagate since the last > incremental repair? > You might also want to know roughly what the latency of writes would be if > you switched to a different consistency level. For instance you are writing > at LOCAL_QUORUM and want to know what would happen if you switched to > EACH_QUORUM. > The proposed change is to allow an ideal_consistency_level to be specified in > cassandra.yaml as well as get/set via JMX. If no ideal consistency level is > specified no additional tracking is done. > If an ideal consistency level is specified then the > {{AbstractWriteResponseHandler}} will contain a delegate WriteResponseHandler > that tracks whether the ideal consistency level is met before a write times > out. It also tracks the latency for achieving the ideal CL of successful > writes. 
> These two metrics would be reported on a per keyspace basis. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
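The delegate-handler scheme this ticket describes can be sketched in miniature. This is an illustrative sketch only; the class and method names below are hypothetical, not the actual 4.0 patch (which lives in AbstractWriteResponseHandler):

```java
// Sketch of the idea: the handler for the requested consistency level also
// feeds each replica ack to a counter sized for a stricter "ideal" CL, so we
// can record (per keyspace) whether the ideal CL was met before the write
// timed out. All names here are illustrative.
public class IdealClTracker {
    private final int requiredForRequestedCl; // acks needed for the requested CL
    private final int requiredForIdealCl;     // acks needed for the ideal CL
    private int responses = 0;

    public IdealClTracker(int requiredForRequestedCl, int requiredForIdealCl) {
        this.requiredForRequestedCl = requiredForRequestedCl;
        this.requiredForIdealCl = requiredForIdealCl;
    }

    /** Called once per replica ack, mirroring the real handler's response(). */
    public void onResponse() {
        responses++;
    }

    public boolean requestedClMet() { return responses >= requiredForRequestedCl; }

    /** Checked when the write completes or times out: did we also hit the ideal CL? */
    public boolean idealClMet() { return responses >= requiredForIdealCl; }

    public static void main(String[] args) {
        // e.g. LOCAL_QUORUM (2 of 3 local replicas) requested,
        // EACH_QUORUM (4 acks across two DCs of RF=3 each) as the ideal CL
        IdealClTracker t = new IdealClTracker(2, 4);
        t.onResponse(); t.onResponse(); t.onResponse();
        System.out.println(t.requestedClMet() + " " + t.idealClMet()); // true false
    }
}
```

With three acks the requested LOCAL_QUORUM succeeds while the ideal EACH_QUORUM does not, which is exactly the gap the per-keyspace metric is meant to expose.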
[jira] [Resolved] (CASSANDRA-9830) Option to disable bloom filter in highest level of LCS sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-9830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta resolved CASSANDRA-9830. Resolution: Later Fix Version/s: (was: 3.11.x) > Option to disable bloom filter in highest level of LCS sstables > --- > > Key: CASSANDRA-9830 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9830 > Project: Cassandra > Issue Type: New Feature > Components: Compaction >Reporter: Jonathan Ellis >Assignee: Paulo Motta >Priority: Minor > Labels: lcs, performance > > We expect about 90% of data to be in the highest level of LCS in a fully > populated series. (See also CASSANDRA-9829.) > Thus if the user is primarily asking for data (partitions) that has actually > been inserted, the bloom filter on the highest level only helps reject > sstables about 10% of the time. > We should add an option that suppresses bloom filter creation on top-level > sstables. This will dramatically reduce memory usage for LCS and may even > improve performance as we no longer check a low-value filter. > (This is also an idea from RocksDB.) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
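The rationale in CASSANDRA-9830 is essentially arithmetic. A rough back-of-envelope sketch — only the ~90% figure comes from the ticket; the bits-per-key and partition-count constants are illustrative assumptions:

```java
// Back-of-envelope math for suppressing the top-level LCS bloom filter:
// with ~90% of data in the highest level, the filter there can only reject
// the sstable for the ~10% of existing partitions that live elsewhere,
// yet it holds memory proportional to ~90% of all keys.
public class TopLevelBloomFilter {
    /** Bloom filter memory (GB) attributable to the top level. Illustrative sizing. */
    static double memorySavedGb(long partitions, double topLevelFraction, double bitsPerKey) {
        return partitions * topLevelFraction * bitsPerKey / 8 / 1e9;
    }

    public static void main(String[] args) {
        long partitions = 1_000_000_000L; // 1B partitions, purely for scale
        double topLevelFraction = 0.90;   // the ticket's ~90% estimate
        double bitsPerKey = 10.0;         // common bloom filter sizing assumption
        System.out.printf("top-level filter useful on ~%.0f%% of existing-partition reads%n",
                          (1 - topLevelFraction) * 100);
        System.out.printf("memory saved by suppressing it: ~%.3f GB%n",
                          memorySavedGb(partitions, topLevelFraction, bitsPerKey));
    }
}
```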
[jira] [Assigned] (CASSANDRA-9244) replace_address is not topology-aware
[ https://issues.apache.org/jira/browse/CASSANDRA-9244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta reassigned CASSANDRA-9244: -- Assignee: (was: Paulo Motta) Reproduced In: 3.1, 3.0.2, 2.1.12, 2.0.17 (was: 2.0.17, 2.1.12, 3.0.2, 3.1) > replace_address is not topology-aware > - > > Key: CASSANDRA-9244 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9244 > Project: Cassandra > Issue Type: Bug > Components: Distributed Metadata, Streaming and Messaging > Environment: 2.0.12 >Reporter: Rick Branson > Fix For: 2.1.x, 2.2.x, 3.0.x, 3.11.x > > > Replaced a node with one in another rack (using replace_address) and it > caused improper distribution after the bootstrap was finished. It looks like > the ranges for the streams are not created in a way that is topology-aware. > This should probably either be prevented, or ideally, would work properly. > The use case is migrating several nodes from one rack to another. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-13289) Make it possible to monitor an ideal consistency level separate from actual consistency level
[ https://issues.apache.org/jira/browse/CASSANDRA-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ariel Weisberg updated CASSANDRA-13289: --- Status: Ready to Commit (was: Patch Available) > Make it possible to monitor an ideal consistency level separate from actual > consistency level > - > > Key: CASSANDRA-13289 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13289 > Project: Cassandra > Issue Type: New Feature > Components: Core >Reporter: Ariel Weisberg >Assignee: Ariel Weisberg > Fix For: 4.0 > > > As an operator there are several issues related to multi-datacenter > replication and consistency you may want to have more information on from > your production database. > For instance: if your application writes at LOCAL_QUORUM, how often are those > writes failing to achieve EACH_QUORUM at other data centers? If you failed > your application over to one of those data centers roughly how inconsistent > might it be given the number of writes that didn't propagate since the last > incremental repair? > You might also want to know roughly what the latency of writes would be if > you switched to a different consistency level. For instance you are writing > at LOCAL_QUORUM and want to know what would happen if you switched to > EACH_QUORUM. > The proposed change is to allow an ideal_consistency_level to be specified in > cassandra.yaml as well as get/set via JMX. If no ideal consistency level is > specified no additional tracking is done. > If an ideal consistency level is specified then the > {{AbstractWriteResponseHandler}} will contain a delegate WriteResponseHandler > that tracks whether the ideal consistency level is met before a write times > out. It also tracks the latency for achieving the ideal CL of successful > writes. > These two metrics would be reported on a per keyspace basis. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
cassandra git commit: Make it possible to monitor an ideal consistency level separate from actual consistency level
Repository: cassandra Updated Branches: refs/heads/trunk bb4c5c3c4 -> 6f647aaa0 Make it possible to monitor an ideal consistency level separate from actual consistency level Patch by Ariel Weisberg; Reviewed by Jason Brown for CASSANDRA-13289 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6f647aaa Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6f647aaa Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6f647aaa Branch: refs/heads/trunk Commit: 6f647aaa0df6f90ee298d372e624c9e3c1ae937e Parents: bb4c5c3 Author: Ariel Weisberg Authored: Thu Mar 2 16:46:13 2017 -0500 Committer: Ariel Weisberg Committed: Thu Mar 30 17:01:20 2017 -0400 -- CHANGES.txt | 1 + conf/cassandra.yaml | 5 + .../org/apache/cassandra/config/Config.java | 8 + .../cassandra/config/DatabaseDescriptor.java| 11 + .../locator/AbstractReplicationStrategy.java| 49 +++- .../cassandra/metrics/KeyspaceMetrics.java | 9 + .../service/AbstractWriteResponseHandler.java | 105 - .../DatacenterSyncWriteResponseHandler.java | 30 ++- .../service/DatacenterWriteResponseHandler.java | 8 + .../apache/cassandra/service/StorageProxy.java | 26 ++- .../cassandra/service/StorageProxyMBean.java| 5 + .../cassandra/service/WriteResponseHandler.java | 4 + .../config/DatabaseDescriptorRefTest.java | 1 + .../service/WriteResponseHandlerTest.java | 234 +++ 14 files changed, 474 insertions(+), 22 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/6f647aaa/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 6a164ee..d4b53d0 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.0 + * Make it possible to monitor an ideal consistency level separate from actual consistency level (CASSANDRA-13289) * Outbound TCP connections ignore internode authenticator (CASSANDRA-13324) * Upgrade junit from 4.6 to 4.12 (CASSANDRA-13360) * Cleanup ParentRepairSession after repairs (CASSANDRA-13359) 
http://git-wip-us.apache.org/repos/asf/cassandra/blob/6f647aaa/conf/cassandra.yaml -- diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml index d8392a0..f2c4c84 100644 --- a/conf/cassandra.yaml +++ b/conf/cassandra.yaml @@ -1120,3 +1120,8 @@ back_pressure_strategy: # Do not try to coalesce messages if we already got that many messages. This should be more than 2 and less than 128. # otc_coalescing_enough_coalesced_messages: 8 + +# Track a metric per keyspace indicating whether replication achieved the ideal consistency +# level for writes without timing out. This is different from the consistency level requested by +# each write which may be lower in order to facilitate availability. +# ideal_consistency_level: EACH_QUORUM http://git-wip-us.apache.org/repos/asf/cassandra/blob/6f647aaa/src/java/org/apache/cassandra/config/Config.java -- diff --git a/src/java/org/apache/cassandra/config/Config.java b/src/java/org/apache/cassandra/config/Config.java index 36ce576..1461cd4 100644 --- a/src/java/org/apache/cassandra/config/Config.java +++ b/src/java/org/apache/cassandra/config/Config.java @@ -32,6 +32,8 @@ import com.google.common.collect.Sets; import org.slf4j.Logger; import org.slf4j.LoggerFactory; +import org.apache.cassandra.db.ConsistencyLevel; + /** * A class that contains configuration properties for the cassandra node it runs within. * @@ -271,6 +273,12 @@ public class Config public int tracetype_query_ttl = (int) TimeUnit.DAYS.toSeconds(1); public int tracetype_repair_ttl = (int) TimeUnit.DAYS.toSeconds(7); +/** + * Maintain statistics on whether writes achieve the ideal consistency level + * before expiring and becoming hints + */ +public volatile ConsistencyLevel ideal_consistency_level = null; + /* * Strategy to use for coalescing messages in OutboundTcpConnection. * Can be fixed, movingaverage, timehorizon, disabled. 
Setting is case and leading/trailing http://git-wip-us.apache.org/repos/asf/cassandra/blob/6f647aaa/src/java/org/apache/cassandra/config/DatabaseDescriptor.java -- diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java index 465cd8a..debf161 100644 --- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java +++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java @@ -44,6 +44,7 @@ import org.apac
[jira] [Commented] (CASSANDRA-13087) Not enough bytes exception during compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-13087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949861#comment-15949861 ] Paulo Motta commented on CASSANDRA-13087: - This is probably a leftover from the bug CASSANDRA-10791 and is fixed by scrubbing the faulty sstables on the source and destination nodes before running repair again. > Not enough bytes exception during compaction > > > Key: CASSANDRA-13087 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13087 > Project: Cassandra > Issue Type: Bug > Components: Compaction > Environment: Ubuntu 14.04.3 LTS, Cassandra 2.1.14 >Reporter: FACORAT > Attachments: CASSANDRA-13087.patch > > > After a repair we have compaction exceptions on some nodes and it's spreading > {noformat} > ERROR [CompactionExecutor:14065] 2016-12-30 14:45:07,245 > CassandraDaemon.java:229 - Exception in thread > Thread[CompactionExecutor:14065,1,main] > java.lang.IllegalArgumentException: Not enough bytes. Offset: 5. Length: > 20275. 
Buffer size: 12594 > at > org.apache.cassandra.db.composites.AbstractCType.checkRemaining(AbstractCType.java:378) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.composites.AbstractCompoundCellNameType.fromByteBuffer(AbstractCompoundCellNameType.java:100) > ~[apache-cassandra-2.1.14.ja > r:2.1.14] > at > org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:398) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:382) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:75) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:52) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:46) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) > ~[guava-16.0.jar:na] > at > com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) > ~[guava-16.0.jar:na] > at > org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:171) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:202) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) > ~[guava-16.0.jar:na] > at > com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) > ~[guava-16.0.jar:na] > at > com.google.common.collect.Iterators$7.computeNext(Iterators.java:645) > ~[guava-16.0.jar:na] > at > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) > ~[guava-16.0.jar:na] > at > 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) > ~[guava-16.0.jar:na] > at > org.apache.cassandra.db.ColumnIndex$Builder.buildForCompaction(ColumnIndex.java:166) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:121) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:193) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:127) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:197) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264) > ~[apache-cassandra-2.1.14.jar:2 > .1.14] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_60] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_60] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor
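The exception at the top of that stack trace comes from a length-prefix sanity check during composite cell-name deserialization: a corrupted sstable declares a component length larger than the bytes actually remaining in the buffer. A minimal sketch of the shape of that check (not the actual AbstractCType code):

```java
import java.nio.ByteBuffer;

// Illustrative sketch of the bounds check behind "Not enough bytes": a
// serialized composite cell name carries a length prefix for each component,
// and deserialization must verify the buffer really holds that many bytes
// before reading them. A corrupt prefix fails the check, as in the log above.
public class CompositeBoundsCheck {
    static void checkRemaining(ByteBuffer bb, int offset, int length) {
        if (offset + length > bb.limit())
            throw new IllegalArgumentException(
                "Not enough bytes. Offset: " + offset + ". Length: " + length
                + ". Buffer size: " + bb.limit());
    }

    public static void main(String[] args) {
        ByteBuffer bb = ByteBuffer.allocate(12594); // buffer size from the log
        checkRemaining(bb, 5, 1000);                // a sane length prefix passes
        try {
            checkRemaining(bb, 5, 20275);           // the corrupt prefix from the log
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This is why scrubbing helps: scrub re-reads each sstable, drops or repairs the rows whose serialized lengths fail such checks, and rewrites a clean file.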
[jira] [Comment Edited] (CASSANDRA-13087) Not enough bytes exception during compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-13087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949861#comment-15949861 ] Paulo Motta edited comment on CASSANDRA-13087 at 3/30/17 9:20 PM: -- This is probably a leftover from the bug CASSANDRA-10791 and is fixed by scrubbing the faulty sstables on the source and destination nodes before running repair again. was (Author: pauloricardomg): This is probably a leftover from the bug CASSANDRA-10791 and is be fixed by scrubbing the faulty sstables on the source and destination nodes before running repair again. > Not enough bytes exception during compaction > > > Key: CASSANDRA-13087 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13087 > Project: Cassandra > Issue Type: Bug > Components: Compaction > Environment: Ubuntu 14.04.3 LTS, Cassandra 2.1.14 >Reporter: FACORAT > Attachments: CASSANDRA-13087.patch > > > After a repair we have compaction exceptions on some nodes and its spreading > {noformat} > ERROR [CompactionExecutor:14065] 2016-12-30 14:45:07,245 > CassandraDaemon.java:229 - Exception in thread > Thread[CompactionExecutor:14065,1,main] > java.lang.IllegalArgumentException: Not enough bytes. Offset: 5. Length: > 20275. 
Buffer size: 12594 > at > org.apache.cassandra.db.composites.AbstractCType.checkRemaining(AbstractCType.java:378) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.composites.AbstractCompoundCellNameType.fromByteBuffer(AbstractCompoundCellNameType.java:100) > ~[apache-cassandra-2.1.14.ja > r:2.1.14] > at > org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:398) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.composites.AbstractCType$Serializer.deserialize(AbstractCType.java:382) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:75) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:52) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.AbstractCell$1.computeNext(AbstractCell.java:46) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) > ~[guava-16.0.jar:na] > at > com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) > ~[guava-16.0.jar:na] > at > org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:171) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:202) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) > ~[guava-16.0.jar:na] > at > com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) > ~[guava-16.0.jar:na] > at > com.google.common.collect.Iterators$7.computeNext(Iterators.java:645) > ~[guava-16.0.jar:na] > at > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) > ~[guava-16.0.jar:na] > at > 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) > ~[guava-16.0.jar:na] > at > org.apache.cassandra.db.ColumnIndex$Builder.buildForCompaction(ColumnIndex.java:166) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:121) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:193) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:127) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:197) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:73) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59) > ~[apache-cassandra-2.1.14.jar:2.1.14] > at > org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionCandidate.run(CompactionManager.java:264) > ~[apache-cassandra-2.1.14.jar:2 > .1.14] > at > java.util.c
[jira] [Commented] (CASSANDRA-13257) Add repair streaming preview
[ https://issues.apache.org/jira/browse/CASSANDRA-13257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949858#comment-15949858 ] Blake Eggleston commented on CASSANDRA-13257: - bq. I think even so streaming preview covers both full and incremental repair case, and other streaming usage. No, I’m afraid it doesn’t. Part of the confusion here is that my linked patch doesn’t include the fix included in CASSANDRA-13328, which fixes how sstables are selected for streaming post #9143. Sorry about that. The other part is that, post CASSANDRA-9143, incremental repair does an anti-compaction before doing anything else, including validation or streaming. Rewriting a bunch of sstables just so we can estimate the streaming that would happen if we ran one for real is sort of a non-starter. So, I still don’t see a way we can prevent StreamSession from having some notion of what is being previewed. Previewing incremental repair streaming means that we need StreamSession to know it should only include unrepaired sstables, instead of all sstables, as it would with a full repair, since we won’t be including a pending repair id. After #13328, the isIncremental flag in StreamSession is not doing anything, and I have a note to remove it before 4.0. We could make the argument that we should leave it to support preview, but then why not just have the preview enum, which has a much clearer purpose? Also, while knowing that there was a merkle tree mismatch is technically enough to validate whether repaired data is in sync across nodes, having information about the related streaming we expect does have value which shouldn’t be dismissed just because it’s a bit abstract. From the development side, it will provide clues about the cause of the mismatch (ie: a one way transfer indicates that one node failed to promote an sstable). 
From the operational side, knowing how much data needs to be streamed to fix the out of sync data is useful; it also indicates the severity of the problem, and the worst-case data loss risk in the case of corruption. But, we can't do this without StreamSession having some notion of what's being previewed. Rebased against trunk (and CASSANDRA-13325) here: https://github.com/bdeggleston/cassandra/tree/13257-squashed-trunk > Add repair streaming preview > > > Key: CASSANDRA-13257 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13257 > Project: Cassandra > Issue Type: New Feature > Components: Streaming and Messaging >Reporter: Blake Eggleston >Assignee: Blake Eggleston > Fix For: 4.0 > > > It would be useful to be able to estimate the amount of repair streaming that > needs to be done, without actually doing any streaming. Our main motivation > for having this is validating CASSANDRA-9143 in production, > but I’d imagine it could also be a useful tool in troubleshooting. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
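The "preview enum" idea under discussion can be sketched as follows. The enum and method names are hypothetical illustrations of the argument, not the final 4.0 API:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Sketch of why StreamSession needs to know what is being previewed: the
// preview kind determines which sstables are candidates, since previewing an
// incremental repair must look only at unrepaired data (no pending-repair id
// is issued), while a full-repair preview considers everything.
public class PreviewSelection {
    enum PreviewKind { NONE, FULL, UNREPAIRED }

    static class SSTable {
        final String name;
        final boolean repaired;
        SSTable(String name, boolean repaired) { this.name = name; this.repaired = repaired; }
    }

    static List<SSTable> candidates(List<SSTable> all, PreviewKind kind) {
        if (kind == PreviewKind.UNREPAIRED)
            return all.stream().filter(s -> !s.repaired).collect(Collectors.toList());
        return all; // NONE (a real stream) and FULL consider every sstable
    }

    public static void main(String[] args) {
        List<SSTable> tables = Arrays.asList(new SSTable("a", true), new SSTable("b", false));
        System.out.println(candidates(tables, PreviewKind.FULL).size());       // 2
        System.out.println(candidates(tables, PreviewKind.UNREPAIRED).size()); // 1
    }
}
```

A boolean isIncremental flag cannot express this three-way distinction, which is the argument above for a purpose-built enum.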
[jira] [Commented] (CASSANDRA-13327) Pending endpoints size check for CAS doesn't play nicely with writes-on-replacement
[ https://issues.apache.org/jira/browse/CASSANDRA-13327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949808#comment-15949808 ] Paulo Motta commented on CASSANDRA-13327: - bq. If a node is streaming from a node that is replaced, we should probably detect that and fail the bootstrapping node since we know it will never complete (and hence has no reason to be accounted as pending anymore). Actually this is a feature and not a bug: the joining node remains in JOINING state so bootstrap can be resumed (CASSANDRA-8838) after the failure is resolved. In this case it will obviously not play along well with CAS, as it will mean 2 pending endpoints will remain unavailable, which is a pity. Is there a reason we can't lift the limitation from CASSANDRA-8346 by making the read phase use an extended (RF + P + 1) / 2 quorum? > Pending endpoints size check for CAS doesn't play nicely with > writes-on-replacement > --- > > Key: CASSANDRA-13327 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13327 > Project: Cassandra > Issue Type: Bug > Components: Coordination >Reporter: Ariel Weisberg >Assignee: Ariel Weisberg > > Consider this ring: > 127.0.0.1 MR UP JOINING -7301836195843364181 > 127.0.0.2 MR UP NORMAL -7263405479023135948 > 127.0.0.3 MR UP NORMAL -7205759403792793599 > 127.0.0.4 MR DOWN NORMAL -7148113328562451251 > where 127.0.0.1 was bootstrapping for cluster expansion. Note that, due to > the failure of 127.0.0.4, 127.0.0.1 was stuck trying to stream from it and > making no progress. > Then the down node was replaced so we had: > 127.0.0.1 MR UP JOINING -7301836195843364181 > 127.0.0.2 MR UP NORMAL -7263405479023135948 > 127.0.0.3 MR UP NORMAL -7205759403792793599 > 127.0.0.5 MR UP JOINING -7148113328562451251 > It’s confusing in the ring - the first JOINING is a genuine bootstrap, the > second is a replacement. We now had CAS unavailables (but no non-CAS > unavailables). 
I think it’s because the pending endpoints check thinks that > 127.0.0.5 is gaining a range when it’s just replacing. > The workaround is to kill the stuck JOINING node, but Cassandra shouldn’t > unnecessarily fail these requests. > It also appears like required participants is bumped by 1 during a host > replacement so if the replacing host fails you will get unavailables and > timeouts. > This is related to the check added in CASSANDRA-8346 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
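The participant arithmetic behind these CAS unavailables can be sketched as follows. This is illustrative, not the actual StorageProxy code; the extended read quorum method corresponds to the proposal in the comment above:

```java
// Back-of-envelope sketch of the CASSANDRA-8346 guard: CAS sizes its Paxos
// quorum over natural + pending replicas, so every pending (JOINING/replacing)
// endpoint raises the number of required participants. Method bodies are an
// illustration of the formula, not the real implementation.
public class CasQuorumMath {
    /** Quorum over natural + pending replicas, as required since CASSANDRA-8346. */
    static int requiredParticipants(int rf, int pending) {
        return (rf + pending) / 2 + 1;
    }

    /** The extended (RF + P + 1) / 2 read quorum proposed in the comment above. */
    static int extendedReadQuorum(int rf, int pending) {
        return (rf + pending + 1) / 2;
    }

    public static void main(String[] args) {
        // RF=3, no pending endpoints: the usual quorum of 2
        System.out.println(requiredParticipants(3, 0)); // 2
        // RF=3 with two JOINING endpoints (the scenario in this ticket):
        // 3 responses are required, so one stuck JOINING node plus any other
        // failure pushes CAS into unavailability
        System.out.println(requiredParticipants(3, 2)); // 3
    }
}
```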
[jira] [Commented] (CASSANDRA-13289) Make it possible to monitor an ideal consistency level separate from actual consistency level
[ https://issues.apache.org/jira/browse/CASSANDRA-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949755#comment-15949755 ] Jason Brown commented on CASSANDRA-13289: - [~aweisberg] has updated his branch with some improvements and a nice unit test. +1, ship it > Make it possible to monitor an ideal consistency level separate from actual > consistency level > - > > Key: CASSANDRA-13289 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13289 > Project: Cassandra > Issue Type: New Feature > Components: Core >Reporter: Ariel Weisberg >Assignee: Ariel Weisberg > Fix For: 4.0 > > > As an operator there are several issues related to multi-datacenter > replication and consistency you may want to have more information on from > your production database. > For instance: if your application writes at LOCAL_QUORUM, how often are those > writes failing to achieve EACH_QUORUM at other data centers? If you failed > your application over to one of those data centers roughly how inconsistent > might it be given the number of writes that didn't propagate since the last > incremental repair? > You might also want to know roughly what the latency of writes would be if > you switched to a different consistency level. For instance you are writing > at LOCAL_QUORUM and want to know what would happen if you switched to > EACH_QUORUM. > The proposed change is to allow an ideal_consistency_level to be specified in > cassandra.yaml as well as get/set via JMX. If no ideal consistency level is > specified no additional tracking is done. > If an ideal consistency level is specified then the > {{AbstractWriteResponseHandler}} will contain a delegate WriteResponseHandler > that tracks whether the ideal consistency level is met before a write times > out. It also tracks the latency for achieving the ideal CL of successful > writes. 
-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13065) Consistent range movements to not require MV updates to go through write paths
[ https://issues.apache.org/jira/browse/CASSANDRA-13065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949693#comment-15949693 ] Paulo Motta commented on CASSANDRA-13065: - Thanks [~brstgt], this looks mostly good, except the following nits: - Add header to {{StreamOperation}} test - CDC should always go through write path on streaming, not only repair After this can you squash and format your patch according to these [guidelines|https://cassandra.apache.org/doc/latest/development/patches.html#creating-a-patch] (including CHANGES.txt entry). I submitted CI tests for the current version of the patch: ||trunk|| |[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-13065]| |[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-13065-testall/lastCompletedBuild/testReport/]| |[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-13065-dtest/lastCompletedBuild/testReport/]| > Consistent range movements to not require MV updates to go through write > paths > --- > > Key: CASSANDRA-13065 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13065 > Project: Cassandra > Issue Type: Improvement >Reporter: Benjamin Roth >Assignee: Benjamin Roth >Priority: Critical > Fix For: 4.0 > > > Booting or decommissioning nodes with MVs is unbearably slow as all streams go > through the regular write paths. This causes read-before-writes for every > mutation and during bootstrap it causes them to be sent to batchlog. > This makes it virtually impossible to boot a new node in an acceptable amount > of time. > Using the regular streaming behaviour for consistent range movements works > much better in this case and does not break the MV local consistency contract. > Already tested on own cluster. > Bootstrap case is super easy to handle, decommission case requires > CASSANDRA-13064 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
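The review comment about CDC and the write path can be captured as a simple decision rule. This is a sketch under assumed names, not the actual streaming code:

```java
// Illustrative decision rule for the review above: streamed data must be
// applied through the regular (slow) write path when the table has CDC
// enabled -- in any streaming operation, not only repair -- or when it has
// materialized views and the range movement is not a consistent one. For a
// consistent range movement without CDC, plain streaming preserves the MV
// local-consistency contract and avoids read-before-write and the batchlog.
public class StreamWritePathRule {
    static boolean requiresWritePath(boolean hasCdc, boolean hasViews, boolean consistentRangeMovement) {
        return hasCdc || (hasViews && !consistentRangeMovement);
    }

    public static void main(String[] args) {
        System.out.println(requiresWritePath(true,  false, true));  // CDC: always write path
        System.out.println(requiresWritePath(false, true,  true));  // MV + consistent move: plain streaming
        System.out.println(requiresWritePath(false, true,  false)); // MV + repair stream: write path
    }
}
```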
[jira] [Updated] (CASSANDRA-13229) dtest failure in topology_test.TestTopology.size_estimates_multidc_test
[ https://issues.apache.org/jira/browse/CASSANDRA-13229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-13229: Status: Open (was: Patch Available) > dtest failure in topology_test.TestTopology.size_estimates_multidc_test > --- > > Key: CASSANDRA-13229 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13229 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Sean McCarthy >Assignee: Alex Petrov > Labels: dtest, test-failure > Fix For: 4.0 > > Attachments: node1_debug.log, node1_gc.log, node1.log, > node2_debug.log, node2_gc.log, node2.log, node3_debug.log, node3_gc.log, > node3.log > > > example failure: > http://cassci.datastax.com/job/trunk_novnode_dtest/508/testReport/topology_test/TestTopology/size_estimates_multidc_test > {code} > Standard Output > Unexpected error in node1 log, error: > ERROR [MemtablePostFlush:1] 2017-02-15 16:07:33,837 CassandraDaemon.java:211 > - Exception in thread Thread[MemtablePostFlush:1,5,main] > java.lang.IndexOutOfBoundsException: Index: 3, Size: 3 > at java.util.ArrayList.rangeCheck(ArrayList.java:653) ~[na:1.8.0_45] > at java.util.ArrayList.get(ArrayList.java:429) ~[na:1.8.0_45] > at > org.apache.cassandra.dht.Splitter.splitOwnedRangesNoPartialRanges(Splitter.java:92) > ~[main/:na] > at org.apache.cassandra.dht.Splitter.splitOwnedRanges(Splitter.java:59) > ~[main/:na] > at > org.apache.cassandra.service.StorageService.getDiskBoundaries(StorageService.java:5180) > ~[main/:na] > at > org.apache.cassandra.db.Memtable.createFlushRunnables(Memtable.java:312) > ~[main/:na] > at org.apache.cassandra.db.Memtable.flushRunnables(Memtable.java:304) > ~[main/:na] > at > org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1150) > ~[main/:na] > at > org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115) > ~[main/:na] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > 
~[na:1.8.0_45] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_45] > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$290(NamedThreadFactory.java:81) > [main/:na] > at > org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$5/1321203216.run(Unknown > Source) [main/:na] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45] > Unexpected error in node1 log, error: > ERROR [MigrationStage:1] 2017-02-15 16:07:33,853 CassandraDaemon.java:211 - > Exception in thread Thread[MigrationStage:1,5,main] > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.IndexOutOfBoundsException: Index: 3, Size: 3 > at > org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:401) > ~[main/:na] > at > org.apache.cassandra.schema.SchemaKeyspace.lambda$flush$496(SchemaKeyspace.java:284) > ~[main/:na] > at > org.apache.cassandra.schema.SchemaKeyspace$$Lambda$222/1949434065.accept(Unknown > Source) ~[na:na] > at java.lang.Iterable.forEach(Iterable.java:75) ~[na:1.8.0_45] > at > org.apache.cassandra.schema.SchemaKeyspace.flush(SchemaKeyspace.java:284) > ~[main/:na] > at > org.apache.cassandra.schema.SchemaKeyspace.applyChanges(SchemaKeyspace.java:1265) > ~[main/:na] > at org.apache.cassandra.schema.Schema.merge(Schema.java:577) ~[main/:na] > at > org.apache.cassandra.schema.Schema.mergeAndAnnounceVersion(Schema.java:564) > ~[main/:na] > at > org.apache.cassandra.schema.MigrationManager$1.runMayThrow(MigrationManager.java:402) > ~[main/:na] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[main/:na] > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_45] > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > ~[na:1.8.0_45] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_45] > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_45] > at > org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$290(NamedThreadFactory.java:81) > [main/:na] > at > org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$5/1321203216.run(Unknown > Source) [main/:na] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45] > Caused by: java.util.concurrent.ExecutionException: > java.lang.IndexOutOfBoundsException: Index: 3, Size: 3 > at java.util.concurrent.FutureTask.report(FutureTask.java:122)
[jira] [Commented] (CASSANDRA-13229) dtest failure in topology_test.TestTopology.size_estimates_multidc_test
[ https://issues.apache.org/jira/browse/CASSANDRA-13229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949608#comment-15949608 ] Paulo Motta commented on CASSANDRA-13229: - Nice catch! I'm afraid we can't fall back to splitting the token ranges evenly, given it's expected that a single vnode range should not span more than 1 disk (CASSANDRA-6696). Actually, in this specific case, given it's the system keyspace which spans the whole token range, we could probably split the token ranges evenly (and probably should, for better distribution), but when the {{dontSplitRanges}} flag is passed we should always assign at least 1 vnode range per disk, even if one of the disks becomes unbalanced (cases like this will become very rare after CASSANDRA-7032, but we should still protect against it). Although this will probably only happen in rare cases, when the token ranges are unbalanced and the vnode-to-disk ratio is low, we can probably tweak the {{splitOwnedRangesNoPartialRanges}} algorithm to only add more ranges to the current disk if the # of remaining tokens > # of remaining parts. Does this sound reasonable, or can you think of a simpler/better approach, [~krummas]?
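The tweak described above can be sketched in a few lines — a hypothetical, simplified model (invented names; integer "sizes" stand in for token ownership), not Cassandra's actual {{Splitter}} code. The key guard is: keep assigning ranges to the current disk only while every later disk can still receive at least one range.

```python
def split_ranges_no_partial(ranges, parts):
    """Assign whole vnode ranges to `parts` disks so that every disk
    gets at least one range. `ranges` is a list of (range_id, size)
    pairs; `parts` is the number of disks."""
    total = sum(size for _, size in ranges)
    per_part = total / parts
    result = [[] for _ in range(parts)]
    i = 0
    for part in range(parts):
        current, current_size = result[part], 0
        remaining_parts = parts - part - 1
        while i < len(ranges):
            if current:
                remaining_ranges = len(ranges) - i
                # Never take a range a later disk needs: each remaining
                # disk must still receive at least one range.
                if remaining_ranges <= remaining_parts:
                    break
                # This disk already holds its fair share; the last disk
                # (remaining_parts == 0) keeps taking whatever is left.
                if remaining_parts > 0 and current_size >= per_part:
                    break
            current.append(ranges[i][0])
            current_size += ranges[i][1]
            i += 1
    return result
```

With that guard, the unbalanced case (one huge range plus a few tiny ones, disks ≈ ranges) ends with exactly one range on each remaining disk instead of running past the end of the list, which is the kind of imbalance the traceback above points at.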
> dtest failure in topology_test.TestTopology.size_estimates_multidc_test > --- > > Key: CASSANDRA-13229 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13229 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Sean McCarthy >Assignee: Alex Petrov > Labels: dtest, test-failure > Fix For: 4.0 > > Attachments: node1_debug.log, node1_gc.log, node1.log, > node2_debug.log, node2_gc.log, node2.log, node3_debug.log, node3_gc.log, > node3.log > > > example failure: > http://cassci.datastax.com/job/trunk_novnode_dtest/508/testReport/topology_test/TestTopology/size_estimates_multidc_test > {code} > Standard Output > Unexpected error in node1 log, error: > ERROR [MemtablePostFlush:1] 2017-02-15 16:07:33,837 CassandraDaemon.java:211 > - Exception in thread Thread[MemtablePostFlush:1,5,main] > java.lang.IndexOutOfBoundsException: Index: 3, Size: 3 > at java.util.ArrayList.rangeCheck(ArrayList.java:653) ~[na:1.8.0_45] > at java.util.ArrayList.get(ArrayList.java:429) ~[na:1.8.0_45] > at > org.apache.cassandra.dht.Splitter.splitOwnedRangesNoPartialRanges(Splitter.java:92) > ~[main/:na] > at org.apache.cassandra.dht.Splitter.splitOwnedRanges(Splitter.java:59) > ~[main/:na] > at > org.apache.cassandra.service.StorageService.getDiskBoundaries(StorageService.java:5180) > ~[main/:na] > at > org.apache.cassandra.db.Memtable.createFlushRunnables(Memtable.java:312) > ~[main/:na] > at org.apache.cassandra.db.Memtable.flushRunnables(Memtable.java:304) > ~[main/:na] > at > org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1150) > ~[main/:na] > at > org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1115) > ~[main/:na] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > ~[na:1.8.0_45] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > [na:1.8.0_45] > at > 
org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$290(NamedThreadFactory.java:81) > [main/:na] > at > org.apache.cassandra.concurrent.NamedThreadFactory$$Lambda$5/1321203216.run(Unknown > Source) [main/:na] > at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45] > Unexpected error in node1 log, error: > ERROR [MigrationStage:1] 2017-02-15 16:07:33,853 CassandraDaemon.java:211 - > Exception in thread Thread[MigrationStage:1,5,main] > java.lang.RuntimeException: java.util.concurrent.ExecutionException: > java.lang.IndexOutOfBoundsException: Index: 3, Size: 3 > at > org.apache.cassandra.utils.FBUtilities.waitOnFuture(FBUtilities.java:401) > ~[main/:na] > at > org.apache.cassandra.schema.SchemaKeyspace.lambda$flush$496(SchemaKeyspace.java:284) > ~[main/:na] > at > org.apache.cassandra.schema.SchemaKeyspace$$Lambda$222/1949434065.accept(Unknown > Source) ~[na:na] > at java.lang.Iterable.forEach(Iterable.java:75) ~[na:1.8.0_45] > at > org.apache.cassandra.schema.SchemaKeyspace.flush(SchemaKeyspace.java:284) > ~[main/:na] > at > org.apache.cassandra.schema.SchemaKeyspace.applyChanges(SchemaKeyspace.java:1265) > ~[main/:na] > at org.apache.cassandra.schema.Schema.merge(Schema.java:577) ~[main/:na] > at > org.apache.cassandra.schema.Schema.mergeAndAnnounceVersion(Schema.java:564) > ~[main/:na] > at > org.apache.cassandra.schema.MigrationManager$1.runMayThrow(MigrationManager.java:402) > ~[main/:na] >
[jira] [Comment Edited] (CASSANDRA-13289) Make it possible to monitor an ideal consistency level separate from actual consistency level
[ https://issues.apache.org/jira/browse/CASSANDRA-13289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15898641#comment-15898641 ] Ariel Weisberg edited comment on CASSANDRA-13289 at 3/30/17 6:45 PM: - ||Code|utests|dtests|| |[trunk|https://github.com/apache/cassandra/compare/trunk...aweisberg:cassandra-13289?expand=1]|[utests|https://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-cassandra-13289-testall/5/]|[dtests|https://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-cassandra-13289-dtest/5/]| was (Author: aweisberg): ||Code|utests|dtests|| |[trunk|https://github.com/apache/cassandra/compare/trunk...aweisberg:cassandra-13289?expand=1]|[utests|https://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-cassandra-13289-testall/3/]|[dtests|https://cassci.datastax.com/view/Dev/view/aweisberg/job/aweisberg-cassandra-13289-dtest/3/]| > Make it possible to monitor an ideal consistency level separate from actual > consistency level > - > > Key: CASSANDRA-13289 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13289 > Project: Cassandra > Issue Type: New Feature > Components: Core >Reporter: Ariel Weisberg >Assignee: Ariel Weisberg > Fix For: 4.0 > > > As an operator there are several issues related to multi-datacenter > replication and consistency you may want to have more information on from > your production database. > For instance. If your application writes at LOCAL_QUORUM how often are those > writes failing to achieve EACH_QUORUM at other data centers. If you failed > your application over to one of those data centers roughly how inconsistent > might it be given the number of writes that didn't propagate since the last > incremental repair? > You might also want to know roughly what the latency of writes would be if > you switched to a different consistency level. For instance you are writing > at LOCAL_QUORUM and want to know what would happen if you switched to > EACH_QUORUM. 
> The proposed change is to allow an ideal_consistency_level to be specified in > cassandra.yaml as well as get/set via JMX. If no ideal consistency level is > specified, no additional tracking is done. If an ideal consistency level is specified, then the > {{AbstractWriteResponseHandler}} will contain a delegate WriteResponseHandler > that tracks whether the ideal consistency level is met before a write times > out. It also tracks the latency for achieving the ideal CL of successful > writes. > These two metrics would be reported on a per-keyspace basis. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
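A minimal sketch of the tracking the description outlines — invented names, assuming one callback per replica response and one on write timeout; the real patch lives in the Java {{AbstractWriteResponseHandler}}, this is only a model of the bookkeeping:

```python
import time

class IdealConsistencyTracker:
    """Delegate that counts replica responses against the *ideal*
    consistency level (e.g. EACH_QUORUM) while the write itself blocks
    only on its actual CL (e.g. LOCAL_QUORUM), and feeds two
    per-keyspace metrics."""

    def __init__(self, ideal_required, metrics):
        self.ideal_required = ideal_required  # responses needed for the ideal CL
        self.responses = 0
        self.reached = False
        self.start = time.monotonic()
        self.metrics = metrics

    def on_response(self):
        # Called for every replica ack, even after the actual CL was met.
        self.responses += 1
        if not self.reached and self.responses >= self.ideal_required:
            self.reached = True
            self.metrics['ideal_cl_latency'].append(time.monotonic() - self.start)

    def on_timeout(self):
        # The write may still have succeeded at its actual CL; we only
        # record that the ideal CL was not achieved before the timeout.
        if not self.reached:
            self.metrics['ideal_cl_not_reached'] += 1
```

The `ideal_cl_not_reached` counter answers the operator question above (how many LOCAL_QUORUM writes failed to reach EACH_QUORUM in time), and the latency list answers what write latency would look like at the stricter level.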
[jira] [Commented] (CASSANDRA-13304) Add checksumming to the native protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-13304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949536#comment-15949536 ] Alexandre Dutra commented on CASSANDRA-13304: - I have a compatible Java driver version [here|https://github.com/datastax/java-driver/tree/cassandra13304]. All tests pass against [~beobal]'s branch. I will run some benchmarks tomorrow to determine what is the overhead of checksumming client-side. A few remarks: # Currently checksum is enforced even for {{STARTUP}} messages, is that on purpose? Given that compression is always disabled for this message, I was wondering if we shouldn't disable checksum as well. # Currently if checksum fails an {{IOException}} is thrown which is caught by {{UnexpectedChannelExceptionHandler}} and the connection is closed. This is not very user-friendly because clients have no clue about what happened: {code} com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /127.0.1.1:58532 (com.datastax.driver.core.exceptions.TransportException: [/127.0.1.1:58532] Connection has been closed)) at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:232) {code} # Minor: {{ChunkCompressor.compressChunk()}} is always called with {{srcOffset}} = 0 and {{destOffset}} = 0; can we simplify the signature? # Nit: packages {{org.apache.cassandra.transport.frame}} and {{org.apache.cassandra.transport.frame.body}} should imo contain the word {{checksum}} because this is what they are meant for. # Nit: In {{ChecksummedFrameCompressor}} and {{ChecksummedTransformer}} I would replace {{Checksummed}} with {{Checksumming}}. 
> Add checksumming to the native protocol > --- > > Key: CASSANDRA-13304 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13304 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Kjellman >Assignee: Michael Kjellman > Labels: client-impacting > Attachments: 13304_v1.diff > > > The native binary transport implementation doesn't include checksums. This > makes it highly susceptible to silently inserting corrupted data either due > to hardware issues causing bit flips on the sender/client side, C*/receiver > side, or network in between. > Attaching an implementation that makes checksum'ing mandatory (assuming both > client and server know about a protocol version that supports checksums) -- > and also adds checksumming to clients that request compression. > The serialized format looks something like this: > {noformat} > * 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3 > * 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | Number of Compressed Chunks | Compressed Length (e1)/ > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * / Compressed Length cont. (e1) |Uncompressed Length (e1) / > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | Uncompressed Length cont. (e1)| CRC32 Checksum of Lengths (e1)| > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | Checksum of Lengths cont. 
(e1)|Compressed Bytes (e1)+// > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | CRC32 Checksum (e1) || > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * |Compressed Length (e2) | > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | Uncompressed Length (e2)| > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * |CRC32 Checksum of Lengths (e2) | > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | Compressed Bytes (e2) +// > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | CRC32 Checksum (e2) || > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * |Compressed Length (en) | > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | Uncompressed Length (en)| > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * |CRC32 Checksum of Lengths (en) | > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > * | Compressed Bytes (en) +// > * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > *
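The per-chunk layout in the diagram can be exercised with a short sketch. This is one reading of the diagram, not the attached patch's implementation (byte order is assumed big-endian): a 16-bit chunk count up front, then for each chunk the two 32-bit lengths, a CRC32 over just those lengths, the compressed bytes, and a CRC32 over those bytes.

```python
import struct
import zlib

def frame_chunks(chunks):
    """Serialize a list of (compressed_bytes, uncompressed_length)
    pairs into the checksummed chunk layout sketched above."""
    out = [struct.pack('>H', len(chunks))]          # 16-bit chunk count
    for compressed, uncompressed_len in chunks:
        lengths = struct.pack('>II', len(compressed), uncompressed_len)
        out.append(lengths)
        out.append(struct.pack('>I', zlib.crc32(lengths)))    # CRC32 of lengths
        out.append(compressed)
        out.append(struct.pack('>I', zlib.crc32(compressed))) # CRC32 of payload
    return b''.join(out)

def unframe_chunks(buf):
    """Parse and verify; raises IOError on either checksum mismatch."""
    count = struct.unpack_from('>H', buf)[0]
    pos = 2
    chunks = []
    for _ in range(count):
        lengths = buf[pos:pos + 8]
        clen, ulen = struct.unpack('>II', lengths)
        (length_crc,) = struct.unpack_from('>I', buf, pos + 8)
        if zlib.crc32(lengths) != length_crc:
            raise IOError('corrupt chunk header (lengths checksum mismatch)')
        pos += 12
        payload = buf[pos:pos + clen]
        (payload_crc,) = struct.unpack_from('>I', buf, pos + clen)
        if zlib.crc32(payload) != payload_crc:
            raise IOError('corrupt chunk payload (checksum mismatch)')
        pos += clen + 4
        chunks.append((payload, ulen))
    return chunks
```

Checksumming the lengths separately means a corrupted header is caught before the parser trusts a bogus compressed length; a decoder that raises on either CRC failure also makes the corruption visible, which ties into the review comment above about closed connections being opaque to clients.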
[jira] [Created] (CASSANDRA-13394) test failure in upgrade_tests.upgrade_through_versions_test.TestUpgrade_current_2_1_x_To_indev_3_0_x.bootstrap_multidc_test
Michael Shuler created CASSANDRA-13394: -- Summary: test failure in upgrade_tests.upgrade_through_versions_test.TestUpgrade_current_2_1_x_To_indev_3_0_x.bootstrap_multidc_test Key: CASSANDRA-13394 URL: https://issues.apache.org/jira/browse/CASSANDRA-13394 Project: Cassandra Issue Type: Bug Reporter: Michael Shuler Attachments: node1_debug.log, node1_gc.log, node1.log, node2_debug.log, node2_gc.log, node2.log, node3_debug.log, node3_gc.log, node3.log, node4_debug.log, node4_gc.log, node4.log example failure: http://cassci.datastax.com/job/cassandra-3.0_large_dtest/60/testReport/upgrade_tests.upgrade_through_versions_test/TestUpgrade_current_2_1_x_To_indev_3_0_x/bootstrap_multidc_test {code} Error Message errors={: ReadTimeout('Error from server: code=1200 [Coordinator node timed out waiting for replica nodes\' responses] message="Operation timed out - received only 2 responses." info={\'received_responses\': 2, \'required_responses\': 3, \'consistency\': \'ALL\'}',)}, last_host=127.0.0.2 {code} {code} Stacktrace File "/usr/lib/python2.7/unittest/case.py", line 329, in run testMethod() File "/home/automaton/cassandra-dtest/upgrade_tests/upgrade_through_versions_test.py", line 716, in bootstrap_multidc_test self.upgrade_scenario(populate=False, create_schema=False, after_upgrade_call=(self._bootstrap_new_node_multidc,)) File "/home/automaton/cassandra-dtest/upgrade_tests/upgrade_through_versions_test.py", line 370, in upgrade_scenario self._check_values() File "/home/automaton/cassandra-dtest/upgrade_tests/upgrade_through_versions_test.py", line 508, in _check_values result = session.execute(query) File "/home/automaton/venv/src/cassandra-driver/cassandra/cluster.py", line 2012, in execute return self.execute_async(query, parameters, trace, custom_payload, timeout, execution_profile, paging_state).result() File "/home/automaton/venv/src/cassandra-driver/cassandra/cluster.py", line 3801, in result raise self._final_exception 'errors={: ReadTimeout(\'Error from server: 
code=1200 [Coordinator node timed out waiting for replica nodes\\\' responses] message="Operation timed out - received only 2 responses." info={\\\'received_responses\\\': 2, \\\'required_responses\\\': 3, \\\'consistency\\\': \\\'ALL\\\'}\',)}, last_host=127.0.0.2 {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13356) BootstrapMonitor.progress does not store all error messages
[ https://issues.apache.org/jira/browse/CASSANDRA-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949318#comment-15949318 ] Paulo Motta commented on CASSANDRA-13356: - Thanks for the patch. I'm afraid this will not work yet, because it will throw the exception to the JMX notifier and not to the client, which will still hang. You need to do something similar to {{RepairRunner}}, which stores the exception and signals completion to the client, which then displays the exception. Can you test this manually before submitting a new patch? You can use [ccm|https://github.com/pcmanus/ccm] to bring up a cluster, and you can induce a streaming/bootstrap failure with the following code:
{noformat}
diff --git a/src/java/org/apache/cassandra/streaming/StreamSession.java b/src/java/org/apache/cassandra/streaming/StreamSession.java
index d57fae8..4a91af3 100644
--- a/src/java/org/apache/cassandra/streaming/StreamSession.java
+++ b/src/java/org/apache/cassandra/streaming/StreamSession.java
@@ -731,6 +731,10 @@ public class StreamSession implements IEndpointStateChangeSubscriber
     private void startStreamingFiles()
     {
+        if (true)
+        {
+            throw new RuntimeException("hoho");
+        }
         streamResult.handleSessionPrepared(this);
         state(State.STREAMING);
{noformat}
If you're feeling adventurous you may try to add a new [cassandra-dtest|https://github.com/riptano/cassandra-dtest/] based on {{bootstrap_test.py:resumable_bootstrap_test}}. > BootstrapMonitor.progress does not store all error messages > --- > > Key: CASSANDRA-13356 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13356 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Hao Zhong >Assignee: Hao Zhong > Labels: lhf > Fix For: 4.x > > Attachments: cassandra.patch > > > The BootstrapMonitor.progress ignores error messages when an error is > ProgressEventType.ERROR. Indeed, RepairRunner.progress once had a similar > bug, but is fixed.
The fixed code is: > {code} > public void progress(String tag, ProgressEvent event) > { > ProgressEventType type = event.getType(); > String message = String.format("[%s] %s", > format.format(System.currentTimeMillis()), event.getMessage()); > if (type == ProgressEventType.PROGRESS) > { > message = message + " (progress: " + > (int)event.getProgressPercentage() + "%)"; > } > out.println(message); > if (type == ProgressEventType.ERROR) > { > error = new RuntimeException("Repair job has failed with the > error message: " + message); > } > if (type == ProgressEventType.COMPLETE) > { > condition.signalAll(); > } > } > {code} > Please refer to CASSANDRA-12508 for details. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-13356) BootstrapMonitor.progress does not store all error messages
[ https://issues.apache.org/jira/browse/CASSANDRA-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-13356: Status: Open (was: Patch Available) > BootstrapMonitor.progress does not store all error messages > --- > > Key: CASSANDRA-13356 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13356 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Hao Zhong >Assignee: Hao Zhong > Labels: lhf > Fix For: 4.x > > Attachments: cassandra.patch > > > The BootstrapMonitor.progress ignores error messages when an error is > ProgressEventType.ERROR. Indeed, RepairRunner.progress once had a similar > bug, but is fixed. The fixed code is: > {code} > public void progress(String tag, ProgressEvent event) > { > ProgressEventType type = event.getType(); > String message = String.format("[%s] %s", > format.format(System.currentTimeMillis()), event.getMessage()); > if (type == ProgressEventType.PROGRESS) > { > message = message + " (progress: " + > (int)event.getProgressPercentage() + "%)"; > } > out.println(message); > if (type == ProgressEventType.ERROR) > { > error = new RuntimeException("Repair job has failed with the > error message: " + message); > } > if (type == ProgressEventType.COMPLETE) > { > condition.signalAll(); > } > } > {code} > Please refer to CASSANDRA-12508 for details. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-13356) BootstrapMonitor.progress does not store all error messages
[ https://issues.apache.org/jira/browse/CASSANDRA-13356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paulo Motta updated CASSANDRA-13356: Reviewer: Paulo Motta > BootstrapMonitor.progress does not store all error messages > --- > > Key: CASSANDRA-13356 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13356 > Project: Cassandra > Issue Type: Bug > Components: Tools >Reporter: Hao Zhong >Assignee: Hao Zhong > Labels: lhf > Fix For: 4.x > > Attachments: cassandra.patch > > > The BootstrapMonitor.progress ignores error messages when an error is > ProgressEventType.ERROR. Indeed, RepairRunner.progress once had a similar > bug, but is fixed. The fixed code is: > {code} > public void progress(String tag, ProgressEvent event) > { > ProgressEventType type = event.getType(); > String message = String.format("[%s] %s", > format.format(System.currentTimeMillis()), event.getMessage()); > if (type == ProgressEventType.PROGRESS) > { > message = message + " (progress: " + > (int)event.getProgressPercentage() + "%)"; > } > out.println(message); > if (type == ProgressEventType.ERROR) > { > error = new RuntimeException("Repair job has failed with the > error message: " + message); > } > if (type == ProgressEventType.COMPLETE) > { > condition.signalAll(); > } > } > {code} > Please refer to CASSANDRA-12508 for details. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13147) Secondary index query on partition key columns might not return all the rows.
[ https://issues.apache.org/jira/browse/CASSANDRA-13147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949034#comment-15949034 ] Andrés de la Peña commented on CASSANDRA-13147: --- There are initial versions of the patch for 2.1 and 2.2: ||[2.1|https://github.com/apache/cassandra/compare/cassandra-2.1...adelapena:13147-2.1]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13147-2.1-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13147-2.1-dtest/]| ||[2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...adelapena:13147-2.2]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13147-2.2-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13147-2.2-dtest/]| > Secondary index query on partition key columns might not return all the rows. > - > > Key: CASSANDRA-13147 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13147 > Project: Cassandra > Issue Type: Bug >Reporter: Benjamin Lerer >Assignee: Andrés de la Peña > > A secondary index query on a partition key column will, apparently, not > return the empty partitions with static data. > The following unit test can be used to reproduce the problem. 
> {code} > public void testIndexOnPartitionKeyWithStaticColumnAndNoRows() throws > Throwable > { > createTable("CREATE TABLE %s (pk1 int, pk2 int, c int, s int static, > v int, PRIMARY KEY((pk1, pk2), c))"); > createIndex("CREATE INDEX ON %s (pk2)"); > execute("INSERT INTO %s (pk1, pk2, c, s, v) VALUES (?, ?, ?, ?, ?)", > 1, 1, 1, 9, 1); > execute("INSERT INTO %s (pk1, pk2, c, s, v) VALUES (?, ?, ?, ?, ?)", > 1, 1, 2, 9, 2); > execute("INSERT INTO %s (pk1, pk2, s) VALUES (?, ?, ?)", 2, 1, 9); > execute("INSERT INTO %s (pk1, pk2, c, s, v) VALUES (?, ?, ?, ?, ?)", > 3, 1, 1, 9, 1); > assertRows(execute("SELECT * FROM %s WHERE pk2 = ?", 1), >row(1, 1, 1, 9, 1), >row(1, 1, 2, 9, 2), >row(2, 1, null, 9, null), <-- is not returned >row(3, 1, 1, 9, 1)); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-13147) Secondary index query on partition key columns might not return all the rows.
[ https://issues.apache.org/jira/browse/CASSANDRA-13147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-13147: --- Reviewer: Benjamin Lerer > Secondary index query on partition key columns might not return all the rows. > - > > Key: CASSANDRA-13147 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13147 > Project: Cassandra > Issue Type: Bug >Reporter: Benjamin Lerer >Assignee: Andrés de la Peña > > A secondary index query on a partition key column will, apparently, not > return the empty partitions with static data. > The following unit test can be used to reproduce the problem. > {code} > public void testIndexOnPartitionKeyWithStaticColumnAndNoRows() throws > Throwable > { > createTable("CREATE TABLE %s (pk1 int, pk2 int, c int, s int static, > v int, PRIMARY KEY((pk1, pk2), c))"); > createIndex("CREATE INDEX ON %s (pk2)"); > execute("INSERT INTO %s (pk1, pk2, c, s, v) VALUES (?, ?, ?, ?, ?)", > 1, 1, 1, 9, 1); > execute("INSERT INTO %s (pk1, pk2, c, s, v) VALUES (?, ?, ?, ?, ?)", > 1, 1, 2, 9, 2); > execute("INSERT INTO %s (pk1, pk2, s) VALUES (?, ?, ?)", 2, 1, 9); > execute("INSERT INTO %s (pk1, pk2, c, s, v) VALUES (?, ?, ?, ?, ?)", > 3, 1, 1, 9, 1); > assertRows(execute("SELECT * FROM %s WHERE pk2 = ?", 1), >row(1, 1, 1, 9, 1), >row(1, 1, 2, 9, 2), >row(2, 1, null, 9, null), <-- is not returned >row(3, 1, 1, 9, 1)); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13147) Secondary index query on partition key columns might not return all the rows.
[ https://issues.apache.org/jira/browse/CASSANDRA-13147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948942#comment-15948942 ] Andrés de la Peña commented on CASSANDRA-13147: --- It seems that it affects 2.1, 2.2, 3.0, 3.x and trunk. There are two causes for the problem: * For 2.1, 2.2 and 3.0, the partition containing only static columns is found, but [it is not returned at the CQL level|https://github.com/apache/cassandra/blob/cassandra-2.2/src/java/org/apache/cassandra/cql3/statements/SelectStatement.java#L720]. * For 3.0, 3.x and trunk, the partition containing only static columns [is never indexed|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/index/internal/CassandraIndex.java#L382]. > Secondary index query on partition key columns might not return all the rows. > - > > Key: CASSANDRA-13147 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13147 > Project: Cassandra > Issue Type: Bug >Reporter: Benjamin Lerer >Assignee: Andrés de la Peña > > A secondary index query on a partition key column will, apparently, not > return the empty partitions with static data. > The following unit test can be used to reproduce the problem. 
> {code} > public void testIndexOnPartitionKeyWithStaticColumnAndNoRows() throws > Throwable > { > createTable("CREATE TABLE %s (pk1 int, pk2 int, c int, s int static, > v int, PRIMARY KEY((pk1, pk2), c))"); > createIndex("CREATE INDEX ON %s (pk2)"); > execute("INSERT INTO %s (pk1, pk2, c, s, v) VALUES (?, ?, ?, ?, ?)", > 1, 1, 1, 9, 1); > execute("INSERT INTO %s (pk1, pk2, c, s, v) VALUES (?, ?, ?, ?, ?)", > 1, 1, 2, 9, 2); > execute("INSERT INTO %s (pk1, pk2, s) VALUES (?, ?, ?)", 2, 1, 9); > execute("INSERT INTO %s (pk1, pk2, c, s, v) VALUES (?, ?, ?, ?, ?)", > 3, 1, 1, 9, 1); > assertRows(execute("SELECT * FROM %s WHERE pk2 = ?", 1), >row(1, 1, 1, 9, 1), >row(1, 1, 2, 9, 2), >row(2, 1, null, 9, null), <-- is not returned >row(3, 1, 1, 9, 1)); > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh
[ https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948933#comment-15948933 ] Marcus Eriksson commented on CASSANDRA-13392: - bq. How would users know in which case to refresh or restart? We would tell them in the error message if they try to refresh with repaired sstables? > Repaired status should be cleared on new sstables when issuing nodetool > refresh > --- > > Key: CASSANDRA-13392 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13392 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 3.0.x, 3.11.x, 4.x > > > We can't assume that new sstables added when doing nodetool refresh > (ColumnFamilyStore#loadNewSSTables) are actually repaired if they have the > repairedAt flag set -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh
[ https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948923#comment-15948923 ] Stefan Podkowinski commented on CASSANDRA-13392: bq. Well, if we are running nodetool refresh, we will want the data to reappear on replicas right? Someone copies in a bunch of sstables on one node, runs repair, that data should end up on all nodes right? Shouldn't you use sstableloader in that case? I personally would never have thought of using nodetool refresh for this. For me it's simply a command to make copied sstables available in a running Cassandra process. I also don't understand why it should make a difference here whether you run refresh or restart the node. How would users know in which case to refresh or restart? > Repaired status should be cleared on new sstables when issuing nodetool > refresh > --- > > Key: CASSANDRA-13392 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13392 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 3.0.x, 3.11.x, 4.x > > > We can't assume that new sstables added when doing nodetool refresh > (ColumnFamilyStore#loadNewSSTables) are actually repaired if they have the > repairedAt flag set -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh
[ https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948909#comment-15948909 ] Marcus Eriksson commented on CASSANDRA-13392: - Ok, so it is not safe to remove the repair flag and it is not safe to keep it. Maybe we should just fail the refresh if there is a repaired sstable, forcing the user to either mark it unrepaired using tools/bin/sstablerepairedset or restart the node to keep the flag?
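Marcus's "fail the refresh" suggestion could look roughly like the following. This is a hypothetical sketch only, not the actual Cassandra patch: the class and method names are invented, and the only real convention borrowed from Cassandra is that a repairedAt value of 0 means "unrepaired".

```java
// Hypothetical guard for nodetool refresh: reject a newly discovered sstable
// that carries a repairedAt timestamp, and point the operator at
// sstablerepairedset or a node restart instead. Names are invented for
// illustration; this is not the real ColumnFamilyStore#loadNewSSTables code.
class RefreshGuard
{
    // By Cassandra convention, repairedAt == 0 means "unrepaired";
    // any other value is the timestamp of the repair session.
    static boolean shouldReject(long repairedAt)
    {
        return repairedAt != 0L;
    }

    static void checkRefreshable(String sstableName, long repairedAt)
    {
        if (shouldReject(repairedAt))
            throw new IllegalStateException(
                "Cannot refresh " + sstableName + ": it is marked repaired. " +
                "Either mark it unrepaired with tools/bin/sstablerepairedset, " +
                "or restart the node to load it with the flag intact.");
    }

    public static void main(String[] args)
    {
        // An unrepaired sstable is accepted silently.
        checkRefreshable("mc-1-big-Data.db", 0L);
        // A repaired one is rejected with the error message above.
        try
        {
            checkRefreshable("mc-2-big-Data.db", 1490871000000L);
            throw new AssertionError("expected the refresh to be rejected");
        }
        catch (IllegalStateException expected)
        {
            System.out.println("rejected as expected: " + expected.getMessage());
        }
    }
}
```

The error message is exactly what Marcus proposes below: it tells the user how to choose between clearing the flag and restarting.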
[jira] [Updated] (CASSANDRA-12820) testall failure in org.apache.cassandra.db.KeyspaceTest.testLimitSSTables-compression
[ https://issues.apache.org/jira/browse/CASSANDRA-12820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-12820: Resolution: Fixed Status: Resolved (was: Ready to Commit) Thank you, pushed as [81b8895151742668ff5035960612d8c4325a1761|https://github.com/apache/cassandra/commit/81b8895151742668ff5035960612d8c4325a1761]. > testall failure in > org.apache.cassandra.db.KeyspaceTest.testLimitSSTables-compression > - > > Key: CASSANDRA-12820 > URL: https://issues.apache.org/jira/browse/CASSANDRA-12820 > Project: Cassandra > Issue Type: Bug >Reporter: Sean McCarthy >Assignee: Branimir Lambov > Labels: test-failure > > example failure: > http://cassci.datastax.com/job/cassandra-3.X_testall/38/testReport/org.apache.cassandra.db/KeyspaceTest/testLimitSSTables_compression/ > {code} > Error Message > expected:<5.0> but was:<6.0> > {code}{code} > Stacktrace > junit.framework.AssertionFailedError: expected:<5.0> but was:<6.0> > at > org.apache.cassandra.db.KeyspaceTest.testLimitSSTables(KeyspaceTest.java:421) > {code}{code} > Standard Output > ERROR [main] 2016-10-20 05:56:18,156 ?:? - SLF4J: stderr > INFO [main] 2016-10-20 05:56:18,516 ?:? - Configuration location: > file:/home/automaton/cassandra/test/conf/cassandra.yaml > DEBUG [main] 2016-10-20 05:56:18,532 ?:? - Loading settings from > file:/home/automaton/cassandra/test/conf/cassandra.yaml > INFO [main] 2016-10-20 05:56:19,632 ?:? - Node > configuration:[allocate_tokens_for_keyspace=null; authenticator=null; > authorizer=null; auto_bootstrap=true; auto_snapshot=true; > back_pressure_enabled=f > ...[truncated 453203 chars]... > ableReader(path='/home/automaton/cassandra/build/test/cassandra/data:108/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/mc-26-big-Data.db')] > (1 sstables, 6.278KiB), biggest 6.278KiB, smallest 6.278KiB > DEBUG [MemtableFlushWriter:2] 2016-10-20 05:56:34,725 ?:? 
- Flushed to > [BigTableReader(path='/home/automaton/cassandra/build/test/cassandra/data:108/system/compaction_history-b4dbb7b4dc493fb5b3bfce6e434832ca/mc-22-big-Data.db')] > (1 sstables, 5.559KiB), biggest 5.559KiB, smallest 5.559KiB > {code}
[3/3] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk # Conflicts: # test/unit/org/apache/cassandra/db/KeyspaceTest.java Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/bb4c5c3c Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/bb4c5c3c Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/bb4c5c3c Branch: refs/heads/trunk Commit: bb4c5c3c4adf4c76e53fed297036489cee1a2768 Parents: b3465f9 81b8895 Author: Branimir Lambov Authored: Thu Mar 30 14:38:23 2017 +0300 Committer: Branimir Lambov Committed: Thu Mar 30 14:38:23 2017 +0300 -- CHANGES.txt | 1 + .../org/apache/cassandra/db/KeyspaceTest.java | 72 2 files changed, 45 insertions(+), 28 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/bb4c5c3c/CHANGES.txt -- diff --cc CHANGES.txt index 2040524,3ead1d1..6a164ee --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,52 -1,5 +1,53 @@@ +4.0 + * Outbound TCP connections ignore internode authenticator (CASSANDRA-13324) + * Upgrade junit from 4.6 to 4.12 (CASSANDRA-13360) + * Cleanup ParentRepairSession after repairs (CASSANDRA-13359) + * Incremental repair not streaming correct sstables (CASSANDRA-13328) + * Upgrade the jna version to 4.3.0 (CASSANDRA-13300) + * Add the currentTimestamp, currentDate, currentTime and currentTimeUUID functions (CASSANDRA-13132) + * Remove config option index_interval (CASSANDRA-10671) + * Reduce lock contention for collection types and serializers (CASSANDRA-13271) + * Make it possible to override MessagingService.Verb ids (CASSANDRA-13283) + * Avoid synchronized on prepareForRepair in ActiveRepairService (CASSANDRA-9292) + * Adds the ability to use uncompressed chunks in compressed files (CASSANDRA-10520) + * Don't flush sstables when streaming for incremental repair (CASSANDRA-13226) + * Remove unused method (CASSANDRA-13227) + * Fix minor bugs related to #9143 (CASSANDRA-13217) + * Output warning if user increases RF (CASSANDRA-13079) + * Remove 
pre-3.0 streaming compatibility code for 4.0 (CASSANDRA-13081) + * Add support for + and - operations on dates (CASSANDRA-11936) + * Fix consistency of incrementally repaired data (CASSANDRA-9143) + * Increase commitlog version (CASSANDRA-13161) + * Make TableMetadata immutable, optimize Schema (CASSANDRA-9425) + * Refactor ColumnCondition (CASSANDRA-12981) + * Parallelize streaming of different keyspaces (CASSANDRA-4663) + * Improved compactions metrics (CASSANDRA-13015) + * Speed-up start-up sequence by avoiding un-needed flushes (CASSANDRA-13031) + * Use Caffeine (W-TinyLFU) for on-heap caches (CASSANDRA-10855) + * Thrift removal (CASSANDRA-5) + * Remove pre-3.0 compatibility code for 4.0 (CASSANDRA-12716) + * Add column definition kind to dropped columns in schema (CASSANDRA-12705) + * Add (automate) Nodetool Documentation (CASSANDRA-12672) + * Update bundled cqlsh python driver to 3.7.0 (CASSANDRA-12736) + * Reject invalid replication settings when creating or altering a keyspace (CASSANDRA-12681) + * Clean up the SSTableReader#getScanner API wrt removal of RateLimiter (CASSANDRA-12422) + * Use new token allocation for non bootstrap case as well (CASSANDRA-13080) + * Avoid byte-array copy when key cache is disabled (CASSANDRA-13084) + * Require forceful decommission if number of nodes is less than replication factor (CASSANDRA-12510) + * Allow IN restrictions on column families with collections (CASSANDRA-12654) + * Log message size in trace message in OutboundTcpConnection (CASSANDRA-13028) + * Add timeUnit Days for cassandra-stress (CASSANDRA-13029) + * Add mutation size and batch metrics (CASSANDRA-12649) + * Add method to get size of endpoints to TokenMetadata (CASSANDRA-12999) + * Expose time spent waiting in thread pool queue (CASSANDRA-8398) + * Conditionally update index built status to avoid unnecessary flushes (CASSANDRA-12969) + * cqlsh auto completion: refactor definition of compaction strategy options (CASSANDRA-12946) + * Add support for 
arithmetic operators (CASSANDRA-11935) + * Add histogram for delay to deliver hints (CASSANDRA-13234) + + 3.11.0 + * Fix testLimitSSTables flake caused by concurrent flush (CASSANDRA-12820) * cdc column addition strikes again (CASSANDRA-13382) * Fix static column indexes (CASSANDRA-13277) * DataOutputBuffer.asNewBuffer broken (CASSANDRA-13298) http://git-wip-us.apache.org/repos/asf/cassandra/blob/bb4c5c3c/test/unit/org/apache/cassandra/db/KeyspaceTest.java --
[1/3] cassandra git commit: Fix testLimitSSTables flake caused by concurrent flush
Repository: cassandra Updated Branches: refs/heads/cassandra-3.11 0b1675d43 -> 81b889515 refs/heads/trunk b3465f937 -> bb4c5c3c4 Fix testLimitSSTables flake caused by concurrent flush Patch by Branimir Lambov; reviewed by Stefania Alborghetti for CASSANDRA-12820 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/81b88951 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/81b88951 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/81b88951 Branch: refs/heads/cassandra-3.11 Commit: 81b8895151742668ff5035960612d8c4325a1761 Parents: 0b1675d Author: Branimir Lambov Authored: Mon Jan 23 17:20:43 2017 +0200 Committer: Branimir Lambov Committed: Thu Mar 30 14:36:07 2017 +0300 -- CHANGES.txt | 1 + .../org/apache/cassandra/db/KeyspaceTest.java | 74 2 files changed, 46 insertions(+), 29 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/81b88951/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 9cde2d8..3ead1d1 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.11.0 + * Fix testLimitSSTables flake caused by concurrent flush (CASSANDRA-12820) * cdc column addition strikes again (CASSANDRA-13382) * Fix static column indexes (CASSANDRA-13277) * DataOutputBuffer.asNewBuffer broken (CASSANDRA-13298) http://git-wip-us.apache.org/repos/asf/cassandra/blob/81b88951/test/unit/org/apache/cassandra/db/KeyspaceTest.java -- diff --git a/test/unit/org/apache/cassandra/db/KeyspaceTest.java b/test/unit/org/apache/cassandra/db/KeyspaceTest.java index 5036749..f2a9984 100644 --- a/test/unit/org/apache/cassandra/db/KeyspaceTest.java +++ b/test/unit/org/apache/cassandra/db/KeyspaceTest.java @@ -24,6 +24,7 @@ import java.util.*; import org.apache.cassandra.Util; import org.apache.cassandra.cql3.CQLTester; import org.apache.cassandra.cql3.ColumnIdentifier; +import org.apache.cassandra.cql3.UntypedResultSet; import org.apache.cassandra.db.rows.Cell; import 
org.apache.cassandra.db.rows.Row; import org.apache.cassandra.db.rows.RowIterator; @@ -41,14 +42,34 @@ import static org.junit.Assert.*; public class KeyspaceTest extends CQLTester { +// Test needs synchronous table drop to avoid flushes causing flaky failures of testLimitSSTables + +@Override +protected String createTable(String query) +{ +return super.createTable(KEYSPACE_PER_TEST, query); +} + +@Override +protected UntypedResultSet execute(String query, Object... values) throws Throwable +{ +return executeFormattedQuery(formatQuery(KEYSPACE_PER_TEST, query), values); +} + +@Override +public ColumnFamilyStore getCurrentColumnFamilyStore() +{ +return super.getCurrentColumnFamilyStore(KEYSPACE_PER_TEST); +} + @Test public void testGetRowNoColumns() throws Throwable { -String tableName = createTable("CREATE TABLE %s (a text, b int, c int, PRIMARY KEY (a, b))"); +createTable("CREATE TABLE %s (a text, b int, c int, PRIMARY KEY (a, b))"); execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)", "0", 0, 0); -final ColumnFamilyStore cfs = Keyspace.open(KEYSPACE).getColumnFamilyStore(tableName); +final ColumnFamilyStore cfs = getCurrentColumnFamilyStore(); for (int round = 0; round < 2; round++) { @@ -69,12 +90,12 @@ public class KeyspaceTest extends CQLTester @Test public void testGetRowSingleColumn() throws Throwable { -String tableName = createTable("CREATE TABLE %s (a text, b int, c int, PRIMARY KEY (a, b))"); +createTable("CREATE TABLE %s (a text, b int, c int, PRIMARY KEY (a, b))"); for (int i = 0; i < 2; i++) execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)", "0", i, i); -final ColumnFamilyStore cfs = Keyspace.open(KEYSPACE).getColumnFamilyStore(tableName); +final ColumnFamilyStore cfs = getCurrentColumnFamilyStore(); for (int round = 0; round < 2; round++) { @@ -104,11 +125,11 @@ public class KeyspaceTest extends CQLTester @Test public void testGetSliceBloomFilterFalsePositive() throws Throwable { -String tableName = createTable("CREATE TABLE %s (a text, b int, c 
int, PRIMARY KEY (a, b))"); +createTable("CREATE TABLE %s (a text, b int, c int, PRIMARY KEY (a, b))"); execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)", "1", 1, 1); -final ColumnFamilyStore cfs = Keyspace.open(KEYSPACE).getColumnFamilyStore(tableName); +final ColumnFamilyStore cfs = getCurrentColumnFamilyStore();
[2/3] cassandra git commit: Fix testLimitSSTables flake caused by concurrent flush
Fix testLimitSSTables flake caused by concurrent flush Patch by Branimir Lambov; reviewed by Stefania Alborghetti for CASSANDRA-12820 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/81b88951 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/81b88951 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/81b88951 Branch: refs/heads/trunk Commit: 81b8895151742668ff5035960612d8c4325a1761 Parents: 0b1675d Author: Branimir Lambov Authored: Mon Jan 23 17:20:43 2017 +0200 Committer: Branimir Lambov Committed: Thu Mar 30 14:36:07 2017 +0300 -- CHANGES.txt | 1 + .../org/apache/cassandra/db/KeyspaceTest.java | 74 2 files changed, 46 insertions(+), 29 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/81b88951/CHANGES.txt -- diff --git a/CHANGES.txt b/CHANGES.txt index 9cde2d8..3ead1d1 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 3.11.0 + * Fix testLimitSSTables flake caused by concurrent flush (CASSANDRA-12820) * cdc column addition strikes again (CASSANDRA-13382) * Fix static column indexes (CASSANDRA-13277) * DataOutputBuffer.asNewBuffer broken (CASSANDRA-13298) http://git-wip-us.apache.org/repos/asf/cassandra/blob/81b88951/test/unit/org/apache/cassandra/db/KeyspaceTest.java -- diff --git a/test/unit/org/apache/cassandra/db/KeyspaceTest.java b/test/unit/org/apache/cassandra/db/KeyspaceTest.java index 5036749..f2a9984 100644 --- a/test/unit/org/apache/cassandra/db/KeyspaceTest.java +++ b/test/unit/org/apache/cassandra/db/KeyspaceTest.java @@ -24,6 +24,7 @@ import java.util.*; import org.apache.cassandra.Util; import org.apache.cassandra.cql3.CQLTester; import org.apache.cassandra.cql3.ColumnIdentifier; +import org.apache.cassandra.cql3.UntypedResultSet; import org.apache.cassandra.db.rows.Cell; import org.apache.cassandra.db.rows.Row; import org.apache.cassandra.db.rows.RowIterator; @@ -41,14 +42,34 @@ import static org.junit.Assert.*; public class 
KeyspaceTest extends CQLTester { +// Test needs synchronous table drop to avoid flushes causing flaky failures of testLimitSSTables + +@Override +protected String createTable(String query) +{ +return super.createTable(KEYSPACE_PER_TEST, query); +} + +@Override +protected UntypedResultSet execute(String query, Object... values) throws Throwable +{ +return executeFormattedQuery(formatQuery(KEYSPACE_PER_TEST, query), values); +} + +@Override +public ColumnFamilyStore getCurrentColumnFamilyStore() +{ +return super.getCurrentColumnFamilyStore(KEYSPACE_PER_TEST); +} + @Test public void testGetRowNoColumns() throws Throwable { -String tableName = createTable("CREATE TABLE %s (a text, b int, c int, PRIMARY KEY (a, b))"); +createTable("CREATE TABLE %s (a text, b int, c int, PRIMARY KEY (a, b))"); execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)", "0", 0, 0); -final ColumnFamilyStore cfs = Keyspace.open(KEYSPACE).getColumnFamilyStore(tableName); +final ColumnFamilyStore cfs = getCurrentColumnFamilyStore(); for (int round = 0; round < 2; round++) { @@ -69,12 +90,12 @@ public class KeyspaceTest extends CQLTester @Test public void testGetRowSingleColumn() throws Throwable { -String tableName = createTable("CREATE TABLE %s (a text, b int, c int, PRIMARY KEY (a, b))"); +createTable("CREATE TABLE %s (a text, b int, c int, PRIMARY KEY (a, b))"); for (int i = 0; i < 2; i++) execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)", "0", i, i); -final ColumnFamilyStore cfs = Keyspace.open(KEYSPACE).getColumnFamilyStore(tableName); +final ColumnFamilyStore cfs = getCurrentColumnFamilyStore(); for (int round = 0; round < 2; round++) { @@ -104,11 +125,11 @@ public class KeyspaceTest extends CQLTester @Test public void testGetSliceBloomFilterFalsePositive() throws Throwable { -String tableName = createTable("CREATE TABLE %s (a text, b int, c int, PRIMARY KEY (a, b))"); +createTable("CREATE TABLE %s (a text, b int, c int, PRIMARY KEY (a, b))"); execute("INSERT INTO %s (a, b, c) VALUES (?, ?, 
?)", "1", 1, 1); -final ColumnFamilyStore cfs = Keyspace.open(KEYSPACE).getColumnFamilyStore(tableName); +final ColumnFamilyStore cfs = getCurrentColumnFamilyStore(); // check empty reads on the partitions before and after the existing one for (String key : new String[]{"0", "2"}) @@ -166,14 +1
[jira] [Commented] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh
[ https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948896#comment-15948896 ] Stefan Podkowinski commented on CASSANDRA-13392: Let's say one of my sstables got corrupted and was removed/scrubbed manually afterwards, and the node has been started again. Now the admin pulls the same sstable from yesterday's backup into the data dir and runs refresh. Having the "new" sstable replicated during the next repair would be rather unexpected and not safe.
[jira] [Comment Edited] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh
[ https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948837#comment-15948837 ] Marcus Eriksson edited comment on CASSANDRA-13392 at 3/30/17 10:56 AM: --- Well, if we are running nodetool refresh, we will want the data to reappear on replicas right? Someone copies in a bunch of sstables on one node, runs repair, that data should end up on all nodes right? We will not be moving data from repaired to unrepaired on any 'live' sstables on the node we are running refresh - only on the sstables copied in to the data directory. Or do you have some other use case of 'nodetool refresh' where this is not true? was (Author: krummas): Well, if we are running nodetool refresh, we will want the data to reappear on replicas right? Someone copies in a bunch of sstables on one node, runs repair, that data should end up on all nodes right? We will not be moving data from repaired to unrepaired on any 'live' sstables on the node we are running refresh - only on the sstables copied in to the data directory. Or do you have some other use case where this is not true?
[jira] [Commented] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh
[ https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948837#comment-15948837 ] Marcus Eriksson commented on CASSANDRA-13392: - Well, if we are running nodetool refresh, we will want the data to reappear on replicas right? Someone copies in a bunch of sstables on one node, runs repair, that data should end up on all nodes right? We will not be moving data from repaired to unrepaired on any 'live' sstables on the node we are running refresh - only on the sstables copied in to the data directory. Or do you have some other use case where this is not true?
[jira] [Commented] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh
[ https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948820#comment-15948820 ] Stefan Podkowinski commented on CASSANDRA-13392: How would the repairedAt flag be set, if the sstables haven't been repaired before? Moving sstables from repaired back to unrepaired can resurrect data that has already been purged from the replicas, so it's not safe to assume that we can always move sstables to unrepaired without consistency implications.
[jira] [Commented] (CASSANDRA-12820) testall failure in org.apache.cassandra.db.KeyspaceTest.testLimitSSTables-compression
[ https://issues.apache.org/jira/browse/CASSANDRA-12820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948777#comment-15948777 ] Stefania commented on CASSANDRA-12820: -- In fact, the changes are only in the test and they LGTM, so feel free to commit today.
[jira] [Updated] (CASSANDRA-12820) testall failure in org.apache.cassandra.db.KeyspaceTest.testLimitSSTables-compression
[ https://issues.apache.org/jira/browse/CASSANDRA-12820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-12820: - Status: Ready to Commit (was: Patch Available)
[jira] [Updated] (CASSANDRA-12820) testall failure in org.apache.cassandra.db.KeyspaceTest.testLimitSSTables-compression
[ https://issues.apache.org/jira/browse/CASSANDRA-12820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-12820: - Reviewer: Stefania (was: Stefania Alborghetti)
[jira] [Commented] (CASSANDRA-12820) testall failure in org.apache.cassandra.db.KeyspaceTest.testLimitSSTables-compression
[ https://issues.apache.org/jira/browse/CASSANDRA-12820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948775#comment-15948775 ] Stefania commented on CASSANDRA-12820: -- So sorry I missed the notification, I'll review it first thing tomorrow.
[jira] [Commented] (CASSANDRA-12820) testall failure in org.apache.cassandra.db.KeyspaceTest.testLimitSSTables-compression
[ https://issues.apache.org/jira/browse/CASSANDRA-12820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948769#comment-15948769 ] Branimir Lambov commented on CASSANDRA-12820: - Gentle ping [~Stefania], this should be quick and easy to review.
[jira] [Created] (CASSANDRA-13393) Invalid row cache size (in MB) is reported by JMX and NodeTool
Fuud created CASSANDRA-13393: Summary: Invalid row cache size (in MB) is reported by JMX and NodeTool Key: CASSANDRA-13393 URL: https://issues.apache.org/jira/browse/CASSANDRA-13393 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Fuud Priority: Minor Row cache size is reported in entries but should be reported in bytes (as KeyCache does). This happens because of an incorrect OHCProvider.OHCacheAdapter.weightedSize method: it currently returns the cache's entry count but should return ohCache.memUsed() -- This message was sent by Atlassian JIRA (v6.3.15#6346)
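A minimal sketch of the mismatch the ticket describes: a "weighted size" accessor that returns the entry count where the metrics layer expects bytes. `OHCacheStub` and the method names are illustrative stand-ins, not Cassandra's or OHC's actual API.

```java
import java.util.HashMap;
import java.util.Map;

public class RowCacheSizeSketch {
    /** Minimal stand-in for an off-heap cache: tracks entry count and bytes used. */
    static class OHCacheStub {
        private final Map<String, byte[]> entries = new HashMap<>();
        private long memUsed = 0;

        void put(String key, byte[] value) {
            entries.put(key, value);
            memUsed += value.length;
        }

        long size()    { return entries.size(); } // number of entries
        long memUsed() { return memUsed; }        // bytes
    }

    /** Buggy adapter, as reported: returns entries where bytes are expected. */
    static long weightedSizeBuggy(OHCacheStub cache) { return cache.size(); }

    /** Fixed adapter, per the ticket: report memory used in bytes. */
    static long weightedSizeFixed(OHCacheStub cache) { return cache.memUsed(); }

    public static void main(String[] args) {
        OHCacheStub cache = new OHCacheStub();
        cache.put("row1", new byte[1024]);
        cache.put("row2", new byte[2048]);
        // The buggy path reports 2 ("MB"), the fixed path 3072 bytes.
        System.out.println("buggy=" + weightedSizeBuggy(cache)
                + " fixed=" + weightedSizeFixed(cache));
    }
}
```

With this shape, any JMX/NodeTool consumer that divides the reported value by 1024*1024 would show a wildly wrong "size in MB" under the buggy variant.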
[jira] [Commented] (CASSANDRA-13216) testall failure in org.apache.cassandra.net.MessagingServiceTest.testDroppedMessages
[ https://issues.apache.org/jira/browse/CASSANDRA-13216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948652#comment-15948652 ] Alex Petrov commented on CASSANDRA-13216: - CI looks good, no failures for this test in 10 builds. On trunk, the 11th build contained 64 failures, but since there were no changes in between and given the nature of the errors there, I tend to believe it's an environment issue. [~mkjellman] would you like to keep yourself as a reviewer? > testall failure in > org.apache.cassandra.net.MessagingServiceTest.testDroppedMessages > > > Key: CASSANDRA-13216 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13216 > Project: Cassandra > Issue Type: Bug > Components: Testing >Reporter: Sean McCarthy >Assignee: Alex Petrov > Labels: test-failure, testall > Fix For: 3.0.13, 3.11.0, 4.0 > > Attachments: TEST-org.apache.cassandra.net.MessagingServiceTest.log, > TEST-org.apache.cassandra.net.MessagingServiceTest.log > > > example failure: > http://cassci.datastax.com/job/cassandra-3.11_testall/81/testReport/org.apache.cassandra.net/MessagingServiceTest/testDroppedMessages > {code} > Error Message > expected:<... dropped latency: 27[30 ms and Mean cross-node dropped latency: > 2731] ms> but was:<... dropped latency: 27[28 ms and Mean cross-node dropped > latency: 2730] ms> > {code}{code} > Stacktrace > junit.framework.AssertionFailedError: expected:<... dropped latency: 27[30 ms > and Mean cross-node dropped latency: 2731] ms> but was:<... dropped latency: > 27[28 ms and Mean cross-node dropped > latency: 2730] ms> > at > org.apache.cassandra.net.MessagingServiceTest.testDroppedMessages(MessagingServiceTest.java:83) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh
[ https://issues.apache.org/jira/browse/CASSANDRA-13392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-13392: Fix Version/s: 4.x 3.11.x 3.0.x Status: Patch Available (was: Open) https://github.com/krummas/cassandra/commits/marcuse/clearrepairedrefresh > Repaired status should be cleared on new sstables when issuing nodetool > refresh > --- > > Key: CASSANDRA-13392 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13392 > Project: Cassandra > Issue Type: Bug >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson > Fix For: 3.0.x, 3.11.x, 4.x > > > We can't assume that new sstables added when doing nodetool refresh > (ColumnFamilyStore#loadNewSSTables) are actually repaired if they have the > repairedAt flag set -- This message was sent by Atlassian JIRA (v6.3.15#6346)
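A sketch of the idea behind the fix linked above: when `nodetool refresh` picks up new sstables, reset their repaired status rather than trusting whatever `repairedAt` value the files carried in from another node or cluster. The `SSTable` type and `loadNewSSTables` shape below are illustrative, not Cassandra's actual `ColumnFamilyStore` API; Cassandra does use 0 as the "unrepaired" marker.

```java
import java.util.ArrayList;
import java.util.List;

public class RefreshRepairedSketch {
    static final long UNREPAIRED = 0L; // 0 = "unrepaired" in sstable metadata

    /** Illustrative sstable stand-in: a path plus its repairedAt timestamp. */
    static class SSTable {
        final String path;
        long repairedAt;
        SSTable(String path, long repairedAt) {
            this.path = path;
            this.repairedAt = repairedAt;
        }
    }

    /**
     * On refresh, clear any repairedAt carried over in the files' metadata:
     * we can't assume foreign sstables were actually repaired on this node.
     */
    static List<SSTable> loadNewSSTables(List<SSTable> discovered) {
        List<SSTable> loaded = new ArrayList<>();
        for (SSTable sstable : discovered) {
            sstable.repairedAt = UNREPAIRED;
            loaded.add(sstable);
        }
        return loaded;
    }

    public static void main(String[] args) {
        List<SSTable> discovered = new ArrayList<>();
        discovered.add(new SSTable("mc-1-big-Data.db", 1490000000000L)); // claims repaired
        discovered.add(new SSTable("mc-2-big-Data.db", UNREPAIRED));
        for (SSTable t : loadNewSSTables(discovered))
            System.out.println(t.path + " repairedAt=" + t.repairedAt);
    }
}
```

The point of clearing the flag is safety: an sstable wrongly marked repaired would be excluded from future incremental repairs, silently leaving data unrepaired.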
[jira] [Commented] (CASSANDRA-13327) Pending endpoints size check for CAS doesn't play nicely with writes-on-replacement
[ https://issues.apache.org/jira/browse/CASSANDRA-13327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15948593#comment-15948593 ] Sylvain Lebresne commented on CASSANDRA-13327: -- FYI, some of the confusion is probably my fault as I initially read the description too quickly and thought the _replaced_ node was in pending, which is what looked unnecessary to me, but it appears this is not what is happening here. Re-reading said description, it does look like there are 2 genuine "pending" nodes: one that is bootstrapping and one that is replacing some other node. In that case, I'm afraid the code is working as designed: a replacing node _is_ gaining a range in the sense that it's not yet a replica for that range as far as reads are concerned, but it may become one at any time once the replacement ends. bq. Note that, due to the failure of 127.0.0.4, 127.0.0.1 was stuck trying to stream from it and making no progress. I'll submit that this is probably the part where we ought to do better. If a node is streaming from a node that is replaced, we should probably detect that and fail the bootstrapping node since we know it will never complete (and hence has no reason to be accounted as pending anymore). > Pending endpoints size check for CAS doesn't play nicely with > writes-on-replacement > --- > > Key: CASSANDRA-13327 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13327 > Project: Cassandra > Issue Type: Bug > Components: Coordination >Reporter: Ariel Weisberg >Assignee: Ariel Weisberg > > Consider this ring: > 127.0.0.1 MR UP JOINING -7301836195843364181 > 127.0.0.2 MR UP NORMAL -7263405479023135948 > 127.0.0.3 MR UP NORMAL -7205759403792793599 > 127.0.0.4 MR DOWN NORMAL -7148113328562451251 > where 127.0.0.1 was bootstrapping for cluster expansion. Note that, due to > the failure of 127.0.0.4, 127.0.0.1 was stuck trying to stream from it and > making no progress. 
> Then the down node was replaced so we had: > 127.0.0.1 MR UP JOINING -7301836195843364181 > 127.0.0.2 MR UP NORMAL -7263405479023135948 > 127.0.0.3 MR UP NORMAL -7205759403792793599 > 127.0.0.5 MR UP JOINING -7148113328562451251 > It’s confusing in the ring - the first JOINING is a genuine bootstrap, the > second is a replacement. We now had CAS unavailables (but no non-CAS > unavailables). I think it’s because the pending endpoints check thinks that > 127.0.0.5 is gaining a range when it’s just replacing. > The workaround is to kill the stuck JOINING node, but Cassandra shouldn’t > unnecessarily fail these requests. > It also appears that required participants is bumped by 1 during a host > replacement, so if the replacing host fails you will get unavailables and > timeouts. > This is related to the check added in CASSANDRA-8346 -- This message was sent by Atlassian JIRA (v6.3.15#6346)
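The CAS-only unavailables described above can be sketched with the quorum arithmetic involved: Paxos requires a quorum of natural replicas *plus* pending endpoints, so each pending node raises the bar. This is a simplified illustration of that math, assuming `(N + P) / 2 + 1` as the required-participants formula; the method name is illustrative, not Cassandra's API.

```java
public class CasPendingSketch {
    /**
     * Required participants for a Paxos round, counting pending endpoints
     * (bootstrapping or replacing nodes) alongside the natural replicas.
     */
    static int requiredParticipants(int naturalReplicas, int pendingEndpoints) {
        return (naturalReplicas + pendingEndpoints) / 2 + 1;
    }

    public static void main(String[] args) {
        // RF=3, no pending nodes: a normal quorum of 2.
        System.out.println(requiredParticipants(3, 0)); // 2

        // RF=3 with two pending nodes (one genuine bootstrap plus one
        // replacement, as in the ticket): required participants rises to 3,
        // so losing a single live replica now fails CAS requests even though
        // plain quorum writes would still succeed.
        System.out.println(requiredParticipants(3, 2)); // 3
    }
}
```

This is why the digest distinguishes "CAS unavailables (but no non-CAS unavailables)": the pending count inflates only the Paxos quorum, not the regular write quorum.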
[jira] [Created] (CASSANDRA-13392) Repaired status should be cleared on new sstables when issuing nodetool refresh
Marcus Eriksson created CASSANDRA-13392: --- Summary: Repaired status should be cleared on new sstables when issuing nodetool refresh Key: CASSANDRA-13392 URL: https://issues.apache.org/jira/browse/CASSANDRA-13392 Project: Cassandra Issue Type: Bug Reporter: Marcus Eriksson Assignee: Marcus Eriksson We can't assume that new sstables added when doing nodetool refresh (ColumnFamilyStore#loadNewSSTables) are actually repaired if they have the repairedAt flag set -- This message was sent by Atlassian JIRA (v6.3.15#6346)