[jira] [Commented] (CASSANDRA-14145) Detecting data resurrection during read
[ https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618326#comment-16618326 ] ASF GitHub Bot commented on CASSANDRA-14145: Github user jrwest commented on a diff in the pull request: https://github.com/apache/cassandra-dtest/pull/37#discussion_r218267799

--- Diff: repair_tests/incremental_repair_test.py ---
@@ -918,3 +931,196 @@ def test_subrange(self):
         self.assertRepairedAndUnrepaired(node1, 'ks')
         self.assertRepairedAndUnrepaired(node2, 'ks')
         self.assertRepairedAndUnrepaired(node3, 'ks')
+
+    @since('4.0')
+    def test_repaired_tracking_with_partition_deletes(self):
+        """
+        Check that, when tracking repaired data status following a digest mismatch,
+        repaired data mismatches are marked as unconfirmed, since we may skip sstables
+        after the partition delete is encountered.
+        @jira_ticket CASSANDRA-14145
+        """
+        session, node1, node2 = self.setup_for_repaired_data_tracking()
+        stmt = SimpleStatement("INSERT INTO ks.tbl (k, c, v) VALUES (%s, %s, %s)")
+        stmt.consistency_level = ConsistencyLevel.ALL
+        for i in range(10):
+            session.execute(stmt, (i, i, i))
+
+        for node in self.cluster.nodelist():
+            node.flush()
+            self.assertNoRepairedSSTables(node, 'ks')
+
+        node1.repair(options=['ks'])
+        node2.stop(wait_other_notice=True)
+
+        session.execute("delete from ks.tbl where k = 5")
+
+        node1.flush()
+        node2.start(wait_other_notice=True)
+
+        # expect unconfirmed inconsistencies as the partition deletes cause some sstables to be skipped
+        with JolokiaAgent(node1) as jmx:
+            self.query_and_check_repaired_mismatches(jmx, session, "SELECT * FROM ks.tbl WHERE k = 5",
+                                                     expect_unconfirmed_inconsistencies=True)
+            self.query_and_check_repaired_mismatches(jmx, session, "SELECT * FROM ks.tbl WHERE k = 5 AND c = 5",
+                                                     expect_unconfirmed_inconsistencies=True)
+            # no digest reads for range queries, so the blocking read repair metric isn't incremented;
+            # *all* sstables are read for partition ranges too, and as the repaired set is still in sync there should
+            # be no inconsistencies
+            self.query_and_check_repaired_mismatches(jmx, session, "SELECT * FROM ks.tbl", expect_read_repair=False)
+
+    @since('4.0')
+    def test_repaired_tracking_with_varying_sstable_sets(self):
+        """
+        Verify that repaired data digests are computed over the merged data for each replica,
+        and that the particular number of sstables on each doesn't affect the comparisons.
+        Both replicas start with the same repaired set, comprising 2 sstables. node1's is
+        then compacted and additional unrepaired data added (which overwrites some in the
+        repaired set). We expect the repaired digests to still match, as the tracking will
+        force all sstables containing the partitions to be read.
+        There are two variants of this, for single partition slice & names reads and range reads.
+        @jira_ticket CASSANDRA-14145
+        """
+        session, node1, node2 = self.setup_for_repaired_data_tracking()
+        stmt = SimpleStatement("INSERT INTO ks.tbl (k, c, v) VALUES (%s, %s, %s)")
+        stmt.consistency_level = ConsistencyLevel.ALL
+        for i in range(10):
+            session.execute(stmt, (i, i, i))
+
+        for node in self.cluster.nodelist():
+            node.flush()
+
+        for i in range(10, 20):
+            session.execute(stmt, (i, i, i))
+
+        for node in self.cluster.nodelist():
+            node.flush()
+            self.assertNoRepairedSSTables(node, 'ks')
+
+        node1.repair(options=['ks'])
+        node2.stop(wait_other_notice=True)
+
+        session.execute("insert into ks.tbl (k, c, v) values (5, 5, 55)")
+        session.execute("insert into ks.tbl (k, c, v) values (15, 15, 155)")
+        node1.flush()
+        node1.compact()
+        node1.compact()
+        node2.start(wait_other_notice=True)
+
+        # we don't expect any inconsistencies as all repaired data is read on both replicas
+        with JolokiaAgent(node1) as jmx:
+            self.query_and_check_repaired_mismatches(jmx, session, "SELECT * FROM ks.tbl WHERE k = 5")
+            self.query_and_check_repaired_mismatches(jmx, session, "SELECT * FROM ks.tbl
[jira] [Commented] (CASSANDRA-14145) Detecting data resurrection during read
[ https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618323#comment-16618323 ] ASF GitHub Bot commented on CASSANDRA-14145: Github user jrwest commented on a diff in the pull request: https://github.com/apache/cassandra-dtest/pull/37#discussion_r218262733

--- Diff: repair_tests/incremental_repair_test.py ---
@@ -207,6 +208,7 @@ def test_manual_session_fail(self):
         self.fixture_dtest_setup.setup_overrides.cluster_options = ImmutableMapping({'hinted_handoff_enabled': 'false',
                                                                                      'num_tokens': 1,
                                                                                      'commitlog_sync_period_in_ms': 500})
+        self.fixture_dtest_setup.init_default_config()
--- End diff --

I believe `self.init_default_config()` and `self.fixture_dtest_setup.init_default_config()` are synonymous: https://github.com/apache/cassandra-dtest/blob/master/dtest.py#L228-L233

> Detecting data resurrection during read
> ---------------------------------------
>
> Key: CASSANDRA-14145
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14145
> Project: Cassandra
> Issue Type: Improvement
> Reporter: sankalp kohli
> Assignee: Sam Tunnicliffe
> Priority: Minor
> Labels: pull-request-available
> Fix For: 4.0
>
> We have seen several bugs in which deleted data gets resurrected. We should try to see if we can detect this on the read path and possibly fix it. Here are a few examples which brought back data:
>
> A replica lost an sstable on startup, which caused one replica to lose the tombstone and not the data. This tombstone was past gc grace, which means it could resurrect data. We can detect such invalid states by looking at other replicas.
>
> If we are running incremental repair, Cassandra will keep repaired and non-repaired data separate. Every time incremental repair runs, it will move data from non-repaired to repaired. Repaired data across all replicas should be 100% consistent.
>
> Here is an example of how we can detect and mitigate the issue in most cases. Say we have 3 machines: A, B and C. All of these machines will have their data split between repaired and non-repaired.
> 1. Machine A, due to some bug, brings back data D. This data D is in the repaired dataset. All other replicas will have data D and tombstone T.
> 2. A read for data D comes from the application, involving replicas A and B. The data being read is in the repaired state. A will respond to the coordinator with data D, and B will send nothing, as the tombstone is past gc grace. This will cause a digest mismatch.
> 3. This patch will only kick in when there is a digest mismatch. The coordinator will ask both replicas to send back all data, as we do today, but with this patch the replicas will indicate whether the data they return comes from the repaired or non-repaired set. If the data coming from the repaired set does not match, we know something is wrong! At this point the coordinator cannot determine whether replica A has resurrected some data or replica B has lost some data, but we can still log an error saying we hit an invalid state.
> 4. Beyond the log, we can take this further and even correct the response to the query. After logging the invalid state, we can ask replicas A and B (and also C, if alive) to send back all data, including gcable tombstones. If any machine returns a tombstone which is newer than this data, we know we cannot return the data. This way we can avoid returning data which has been deleted.
>
> Some challenges with this:
> 1. When data is moved from non-repaired to repaired, there could be a race here. We can look at which incremental repairs have promoted things on which replica to avoid false positives.
> 2. If the third replica is down and the live replica does not have any tombstone, we won't be able to break the tie in deciding whether data was actually deleted or resurrected.
> 3. If the read is for the latest data only, we won't be able to detect it, as the read will be served from non-repaired data.
> 4. If the replica where we lose a tombstone is the last replica to compact the tombstone, we won't be able to decide whether data is coming back or the rest of the replicas have lost that data. But we will still detect that something is wrong.
> 5. We won't affect 99.9% of read queries, as we only do extra work during a digest mismatch.
> 6. CL.ONE reads will not be able to detect this.

-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsub
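The detection in step 3 above reduces to comparing, per replica, a digest computed over only the repaired portion of each response. The sketch below illustrates that comparison; the names (`repaired_digest`, `check_repaired_data`) and the response shape are illustrative inventions for this example, not Cassandra's internal API, which computes the digest incrementally on the read path.

```python
import hashlib


def repaired_digest(rows):
    # Digest over only the rows that came from the repaired sstable set
    # (hypothetical helper; order-independent via sorting for the sketch).
    h = hashlib.md5()
    for row in sorted(rows):
        h.update(repr(row).encode())
    return h.hexdigest()


def check_repaired_data(responses):
    """On a digest mismatch, compare per-replica digests of the repaired subset.

    `responses` maps replica -> (repaired_rows, unrepaired_rows). If the
    repaired digests diverge, some replica has resurrected (or lost) data
    that incremental repair had already made consistent.
    """
    digests = {replica: repaired_digest(repaired)
               for replica, (repaired, _unrepaired) in responses.items()}
    return len(set(digests.values())) <= 1, digests


# Replica A resurrected row (5, 5, 5) into its repaired set;
# replica B purged it via a tombstone that is now past gc grace.
responses = {
    'A': ([(1, 1, 1), (5, 5, 5)], []),
    'B': ([(1, 1, 1)], []),
}
consistent, digests = check_repaired_data(responses)
assert not consistent  # repaired-data mismatch: log it as an invalid state
```

Note that, as challenge 2 points out, the comparison only tells the coordinator *that* the repaired sets diverge, not which replica is wrong; breaking the tie requires tombstone information from a third replica.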
[jira] [Commented] (CASSANDRA-14145) Detecting data resurrection during read
[ https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618324#comment-16618324 ] ASF GitHub Bot commented on CASSANDRA-14145: Github user jrwest commented on a diff in the pull request: https://github.com/apache/cassandra-dtest/pull/37#discussion_r218264113 --- Diff: repair_tests/incremental_repair_test.py --- @@ -918,3 +931,196 @@ def test_subrange(self):
[jira] [Commented] (CASSANDRA-14145) Detecting data resurrection during read
[ https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618325#comment-16618325 ] ASF GitHub Bot commented on CASSANDRA-14145: Github user jrwest commented on a diff in the pull request: https://github.com/apache/cassandra-dtest/pull/37#discussion_r218264028 --- Diff: repair_tests/incremental_repair_test.py --- @@ -918,3 +931,196 @@ def test_subrange(self):
[jira] [Commented] (CASSANDRA-14145) Detecting data resurrection during read
[ https://issues.apache.org/jira/browse/CASSANDRA-14145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618322#comment-16618322 ] ASF GitHub Bot commented on CASSANDRA-14145: Github user jrwest commented on a diff in the pull request: https://github.com/apache/cassandra-dtest/pull/37#discussion_r218263635 --- Diff: repair_tests/incremental_repair_test.py ---

Also, I'm not sure it's clear to me why `init_default_config()` is called before every test. Is it because the config changes in the preceding lines weren't actually being picked up prior?

> Detecting data resurrection during read
> ---------------------------------------
>
> Key: CASSANDRA-14145
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14145
> Project: Cassandra
> Issue Type: Improvement
> Reporter: sankalp kohli
> Assignee: Sam Tunnicliffe
> Priority: Minor
> Labels: pull-request-available
> Fix For: 4.0

-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commit
[jira] [Updated] (CASSANDRA-14756) Transient Replication - range movement improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-14756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-14756: Status: Patch Available (was: Open) > Transient Replication - range movement improvements > --- > > Key: CASSANDRA-14756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14756 > Project: Cassandra > Issue Type: Improvement >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > * Simplify iteration in calculateRangesToFetchWithPreferredEndpoints > * Minor changes to calculateRangesToFetchWithPreferredEndpoints to improve > readability: > * Simplify RangeRelocator code > * Fix range relocation > * Simplify calculateStreamAndFetchRanges > * Unify request/transfer ranges interface (Added benefit of this change is > that we have a check for non-intersecting ranges) > * Simplify iteration in calculateRangesToFetchWithPreferredEndpoints > * Improve error messages -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
cassandra-dtest git commit: Transient Replication and Cheap Quorums tests
Repository: cassandra-dtest
Updated Branches: refs/heads/master 4e1c05565 -> 0d9c98ee1

Transient Replication and Cheap Quorums tests

Patch by Blake Eggleston, Alex Petrov, Ariel Weisberg; Reviewed by Blake Eggleston for CASSANDRA-14404

Co-authored-by: Blake Eggleston
Co-authored-by: Ariel Weisberg

Project: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/commit/0d9c98ee
Tree: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/tree/0d9c98ee
Diff: http://git-wip-us.apache.org/repos/asf/cassandra-dtest/diff/0d9c98ee

Branch: refs/heads/master
Commit: 0d9c98ee1ec006604e4f8f1787f7be5b5792cf78
Parents: 4e1c055
Author: Alex Petrov
Authored: Fri Sep 14 14:32:31 2018 +0200
Committer: Alex Petrov
Committed: Mon Sep 17 17:29:20 2018 +0200
--
 transient_replication_ring_test.py | 502
 transient_replication_test.py      | 653
 2 files changed, 1155 insertions(+)
--

http://git-wip-us.apache.org/repos/asf/cassandra-dtest/blob/0d9c98ee/transient_replication_ring_test.py
--
diff --git a/transient_replication_ring_test.py b/transient_replication_ring_test.py
new file mode 100644
index 000..a3b596e
--- /dev/null
+++ b/transient_replication_ring_test.py
@@ -0,0 +1,502 @@
+import logging
+import types
+
+from cassandra import ConsistencyLevel
+from cassandra.query import SimpleStatement
+from ccmlib.node import Node
+
+from dtest import Tester
+from tools.assertions import assert_all
+
+from flaky import flaky
+
+from cassandra.metadata import BytesToken, OrderedDict
+import pytest
+from itertools import chain
+from tools.misc import new_node
+
+logging.getLogger('cassandra').setLevel(logging.CRITICAL)
+
+NODELOCAL = 11
+
+
+def jmx_start(to_start, **kwargs):
+    kwargs['jvm_args'] = kwargs.get('jvm_args', []) + ['-XX:-PerfDisableSharedMem']
+    to_start.start(**kwargs)
+
+
+def gen_expected(*values):
+    return [["%05d" % i, i, i] for i in chain(*values)]
+
+
+def repair_nodes(nodes):
+    for node in nodes:
+        node.nodetool('repair -pr')
+
+
+def cleanup_nodes(nodes):
+    for node in nodes:
+        node.nodetool('cleanup')
+
+
+def patch_start(startable):
+    old_start = startable.start
+
+    def new_start(self, *args, **kwargs):
+        kwargs['jvm_args'] = kwargs.get('jvm_args', []) + ['-XX:-PerfDisableSharedMem'
+                                                           ' -Dcassandra.enable_nodelocal_queries=true']
+        return old_start(*args, **kwargs)
+
+    startable.start = types.MethodType(new_start, startable)
+
+
+class TestTransientReplicationRing(Tester):
+
+    keyspace = "ks"
+    table = "tbl"
+
+    def select(self):
+        return "SELECT * from %s.%s" % (self.keyspace, self.table)
+
+    def select_statement(self):
+        return SimpleStatement(self.select(), consistency_level=NODELOCAL)
+
+    def point_select(self):
+        return "SELECT * from %s.%s where pk = %%s" % (self.keyspace, self.table)
+
+    def point_select_statement(self):
+        return SimpleStatement(self.point_select(), consistency_level=NODELOCAL)
+
+    def check_expected(self, sessions, expected, node=[i for i in range(0, 1000)], cleanup=False):
+        """Check that each node has the expected values present"""
+        for idx, session, expect, node in zip(range(0, 1000), sessions, expected, node):
+            print("Checking idx " + str(idx))
+            print(str([row for row in session.execute(self.select_statement())]))
+            if cleanup:
+                node.nodetool('cleanup')
+            assert_all(session,
+                       self.select(),
+                       expect,
+                       cl=NODELOCAL)
+
+    def check_replication(self, sessions, exactly=None, gte=None, lte=None):
+        """Assert that the test values are replicated a required number of times"""
+        for i in range(0, 40):
+            count = 0
+            for session in sessions:
+                for row in session.execute(self.point_select_statement(), ["%05d" % i]):
+                    count += 1
+            if exactly:
+                assert count == exactly, "Count for %05d should be exactly %d" % (i, exactly)
+            if gte:
+                assert count >= gte, "Count for %05d should be >= %d" % (i, gte)
+            if lte:
+                assert count <= lte, "Count for %05d should be <= %d" % (i, lte)
+
+    @pytest.fixture
+    def cheap_quorums(self):
+        return False
+
+    @pytest.fixture(scope='function', autouse=True)
+    def setup_cluster(self, fixture_dtest_setup):
+        self.tokens = ['00010', '00020', '00030']
+
+        patch_start(self.cluster)
+        self.cluster.set_configuration_options(value
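For reference, the `gen_expected` helper in the commit above flattens one or more integer ranges into the `["%05d"-padded key, i, i]` row shape that `assert_all` compares against. A standalone example of the same logic:

```python
from itertools import chain


def gen_expected(*values):
    # same logic as the helper in the commit above: each integer i becomes
    # a row [zero-padded key, i, i], across all the supplied ranges
    return [["%05d" % i, i, i] for i in chain(*values)]


rows = gen_expected(range(0, 2), range(10, 12))
assert rows == [["00000", 0, 0], ["00001", 1, 1],
                ["00010", 10, 10], ["00011", 11, 11]]
```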
[jira] [Updated] (CASSANDRA-14756) Transient Replication - range movement improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-14756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict updated CASSANDRA-14756: - Description: * Simplify iteration in calculateRangesToFetchWithPreferredEndpoints * Minor changes to calculateRangesToFetchWithPreferredEndpoints to improve readability: * Simplify RangeRelocator code * Fix range relocation * Simplify calculateStreamAndFetchRanges * Unify request/transfer ranges interface (Added benefit of this change is that we have a check for non-intersecting ranges) * Simplify iteration in calculateRangesToFetchWithPreferredEndpoints * Improve error messages was: Simplify iteration in calculateRangesToFetchWithPreferredEndpoints Minor changes to calculateRangesToFetchWithPreferredEndpoints to improve readability: Simplify RangeRelocator code Fix range relocation Simplify calculateStreamAndFetchRanges Unify request/transfer ranges interface (Added benefit of this change is that we have a check for non-intersecting ranges) Simplify iteration in calculateRangesToFetchWithPreferredEndpoints Improve error messages > Transient Replication - range movement improvements > --- > > Key: CASSANDRA-14756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14756 > Project: Cassandra > Issue Type: Improvement >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > * Simplify iteration in calculateRangesToFetchWithPreferredEndpoints > * Minor changes to calculateRangesToFetchWithPreferredEndpoints to improve > readability: > * Simplify RangeRelocator code > * Fix range relocation > * Simplify calculateStreamAndFetchRanges > * Unify request/transfer ranges interface (Added benefit of this change is > that we have a check for non-intersecting ranges) > * Simplify iteration in calculateRangesToFetchWithPreferredEndpoints > * Improve error messages -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To 
unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14756) Transient Replication - range movement improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-14756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-14756: Reviewers: Ariel Weisberg, Benedict > Transient Replication - range movement improvements > --- > > Key: CASSANDRA-14756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14756 > Project: Cassandra > Issue Type: Improvement >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Simplify iteration in calculateRangesToFetchWithPreferredEndpoints > Minor changes to calculateRangesToFetchWithPreferredEndpoints to improve > readability: > Simplify RangeRelocator code > Fix range relocation > Simplify calculateStreamAndFetchRanges > Unify request/transfer ranges interface (Added benefit of this change is > that we have a check for non-intersecting ranges) > Simplify iteration in calculateRangesToFetchWithPreferredEndpoints > Improve error messages -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14756) Transient Replication - range movement improvements
[ https://issues.apache.org/jira/browse/CASSANDRA-14756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated CASSANDRA-14756: --- Labels: pull-request-available (was: ) > Transient Replication - range movement improvements > --- > > Key: CASSANDRA-14756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14756 > Project: Cassandra > Issue Type: Improvement >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Major > Labels: pull-request-available > > Simplify iteration in calculateRangesToFetchWithPreferredEndpoints > Minor changes to calculateRangesToFetchWithPreferredEndpoints to improve > readability: > Simplify RangeRelocator code > Fix range relocation > Simplify calculateStreamAndFetchRanges > Unify request/transfer ranges interface (Added benefit of this change is > that we have a check for non-intersecting ranges) > Simplify iteration in calculateRangesToFetchWithPreferredEndpoints > Improve error messages -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14724) Duration addition to Date/Timestamp is broken for leapseconds
[ https://issues.apache.org/jira/browse/CASSANDRA-14724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617691#comment-16617691 ] Benedict commented on CASSANDRA-14724: -- Ah, yes, so it will. So I guess the only question remaining is if your concept of 'an hour' is consistent with how other date libraries interpret the concept of an hour across a DST boundary. It seems reasonable to me that they might, and either way I guess we can leave this until DST is implemented (if ever). The original scope of this ticket is clearly not addressable since none of our libraries handle leapseconds, anyway, so I'll close as Not a Problem. > Duration addition to Date/Timestamp is broken for leapseconds > - > > Key: CASSANDRA-14724 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14724 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benjamin Lerer >Priority: Major > Labels: correctness > > Hours, Minutes and Seconds are not always of the same duration; it varies > when they cross a leap second (or DST boundary). When we add durations to > instants, we do not account for this (as we have by then lost the necessary > information). Duration should take (and store) all components of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Resolved] (CASSANDRA-14724) Duration addition to Date/Timestamp is broken for leapseconds
[ https://issues.apache.org/jira/browse/CASSANDRA-14724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict resolved CASSANDRA-14724. -- Resolution: Not A Problem > Duration addition to Date/Timestamp is broken for leapseconds > - > > Key: CASSANDRA-14724 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14724 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benjamin Lerer >Priority: Major > Labels: correctness > > Hours, Minutes and Seconds are not always of the same duration; it varies > when they cross a leap second (or DST boundary). When we add durations to > instants, we do not account for this (as we have by then lost the necessary > information). Duration should take (and store) all components of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-14756) Transient Replication - range movement improvements
Alex Petrov created CASSANDRA-14756: --- Summary: Transient Replication - range movement improvements Key: CASSANDRA-14756 URL: https://issues.apache.org/jira/browse/CASSANDRA-14756 Project: Cassandra Issue Type: Improvement Reporter: Alex Petrov Assignee: Alex Petrov Simplify iteration in calculateRangesToFetchWithPreferredEndpoints Minor changes to calculateRangesToFetchWithPreferredEndpoints to improve readability: Simplify RangeRelocator code Fix range relocation Simplify calculateStreamAndFetchRanges Unify request/transfer ranges interface (Added benefit of this change is that we have a check for non-intersecting ranges) Simplify iteration in calculateRangesToFetchWithPreferredEndpoints Improve error messages -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Resolved] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict resolved CASSANDRA-14705. -- Resolution: Fixed Yes, thanks. > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into {{ReplicaPlan}} (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well-defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > * {{commitPaxos}} was using only live nodes, when it needed to include down nodes > * We were not writing to pending transient replicas > * On write, we were not hinting to full nodes with transient replication > enabled (since we filtered to {{liveOnly}}, in order to include our transient > replicas above {{blockFor}}) > * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we > would only consult the same node we had speculated to. This also applied to > {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. > * Transient->Full movements mishandled consistency level upgrade > ** While we need to treat a transitioning node as ‘full’ for writes, so that > it can safely begin serving full data requests once it has finished, we > cannot maintain it in the ‘pending’ collection, else we will also increase our > consistency requirements by a node that doesn’t exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
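The last bullet above is essentially an arithmetic point, and can be sketched in a few lines. This is hypothetical Python, not Cassandra's actual code; `block_for` and the replica names are invented for illustration:

```python
# Sketch of the Transient->Full concern: counting a transitioning node both
# as a write target and as a "pending" replica inflates the ack count.

def block_for(full_replicas, pending_replicas):
    """QUORUM of full replicas, plus one extra ack per pending range movement."""
    return len(full_replicas) // 2 + 1 + len(pending_replicas)

full = ["n1", "n2", "n3"]  # RF=3, quorum = 2

# Correct: the transitioning node is written to as if it were full, but is
# NOT kept in the pending set, so the consistency requirement is unchanged.
assert block_for(full, pending_replicas=[]) == 2

# Incorrect: leaving the transitioning node in "pending" demands an extra
# ack for a range movement that does not really exist.
assert block_for(full, pending_replicas=["n3"]) == 3
```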
[jira] [Commented] (CASSANDRA-14705) ReplicaLayout follow-up
[ https://issues.apache.org/jira/browse/CASSANDRA-14705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617639#comment-16617639 ] Ariel Weisberg commented on CASSANDRA-14705: This got merged as [047bcd7ad171d6a4aa89128c5e6c6ed5f012b1c0|https://github.com/apache/cassandra/commit/047bcd7ad171d6a4aa89128c5e6c6ed5f012b1c0]? Is this issue ready to resolve? > ReplicaLayout follow-up > --- > > Key: CASSANDRA-14705 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14705 > Project: Cassandra > Issue Type: Improvement >Reporter: Benedict >Assignee: Benedict >Priority: Major > > Clarify the new {{ReplicaLayout}} code, separating it into {{ReplicaPlan}} (for > what we want to do) and {{ReplicaLayout}} (for what we know about the > cluster), with well-defined semantics (and comments in the rare cases those > semantics are weird) > Found and fixed some bugs: > * {{commitPaxos}} was using only live nodes, when it needed to include down nodes > * We were not writing to pending transient replicas > * On write, we were not hinting to full nodes with transient replication > enabled (since we filtered to {{liveOnly}}, in order to include our transient > replicas above {{blockFor}}) > * If we speculated, in {{maybeSendAdditionalReads}} (in read repair) we > would only consult the same node we had speculated to. This also applied to > {{maybeSendAdditionalWrites}} - and this issue was also true pre-TR. > * Transient->Full movements mishandled consistency level upgrade > ** While we need to treat a transitioning node as ‘full’ for writes, so that > it can safely begin serving full data requests once it has finished, we > cannot maintain it in the ‘pending’ collection, else we will also increase our > consistency requirements by a node that doesn’t exist. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14742) Race Condition in batchlog replica collection
[ https://issues.apache.org/jira/browse/CASSANDRA-14742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-14742: Status: Patch Available (was: Open) > Race Condition in batchlog replica collection > - > > Key: CASSANDRA-14742 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14742 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > When we collect nodes for it in {{StorageProxy#getBatchlogReplicas}}, we > already filter out down replicas; subsequently they get picked up and taken > for liveAndDown. > There's a possible race condition due to picking tokens from token metadata > twice (once in {{StorageProxy#getBatchlogReplicas}} and second one in > {{ReplicaPlan#forBatchlogWrite}}) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
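The race described above is a classic check-then-act pattern: two separate reads of mutable token metadata can observe different ring states. A minimal sketch, with hypothetical names (not Cassandra's actual classes), of why snapshotting once removes the race:

```python
# Two reads of mutable metadata (as in getBatchlogReplicas followed by
# forBatchlogWrite) can disagree if the ring changes in between; a single
# immutable snapshot threaded through both steps cannot.

class TokenMetadata:
    def __init__(self, endpoints):
        self.endpoints = list(endpoints)

    def snapshot(self):
        # Immutable copy; concurrent updates no longer affect the caller.
        return tuple(self.endpoints)

def racy_plan(metadata):
    first = list(metadata.endpoints)   # read #1 (replica collection)
    metadata.endpoints.append("n3")    # simulated concurrent topology change
    second = list(metadata.endpoints)  # read #2 (plan construction)
    return first, second

def safe_plan(metadata):
    snap = metadata.snapshot()         # single consistent view
    return list(snap), list(snap)

first, second = racy_plan(TokenMetadata(["n1", "n2"]))
assert first != second                 # the two reads disagree

first, second = safe_plan(TokenMetadata(["n1", "n2"]))
assert first == second                 # one snapshot, one answer
```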
[jira] [Commented] (CASSANDRA-14742) Race Condition in batchlog replica collection
[ https://issues.apache.org/jira/browse/CASSANDRA-14742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617589#comment-16617589 ] Alex Petrov commented on CASSANDRA-14742: - We're picking batchlog CL of {{ONE}} or {{TWO}} since the logic in {{BatchlogManager#EndpointFilter}} allows either one node (local, in case of single-node data centers), or two replicas from different racks. > Race Condition in batchlog replica collection > - > > Key: CASSANDRA-14742 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14742 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > When we collect nodes for it in {{StorageProxy#getBatchlogReplicas}}, we > already filter out down replicas; subsequently they get picked up and taken > for liveAndDown. > There's a possible race condition due to picking tokens from token metadata > twice (once in {{StorageProxy#getBatchlogReplicas}} and second one in > {{ReplicaPlan#forBatchlogWrite}}) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
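The selection rule described in the comment above can be sketched roughly as follows. This is a simplified, hypothetical Python rendering of the behaviour, not {{BatchlogManager.EndpointFilter}} itself:

```python
# A single-node DC stores the batchlog locally (hence CL ONE); otherwise two
# replicas are preferred from distinct racks (hence CL TWO).

import random

def pick_batchlog_endpoints(local, endpoints_by_rack):
    """endpoints_by_rack: {rack_name: [endpoint, ...]} for the local DC."""
    all_endpoints = [e for eps in endpoints_by_rack.values() for e in eps]
    if all_endpoints == [local]:
        return [local]                  # single-node DC: write locally
    # Prefer two endpoints in different racks, excluding the coordinator.
    by_rack = {}
    for rack, eps in endpoints_by_rack.items():
        remote = [e for e in eps if e != local]
        if remote:
            by_rack[rack] = remote
    if len(by_rack) >= 2:
        rack_a, rack_b = random.sample(sorted(by_rack), 2)
        return [random.choice(by_rack[rack_a]), random.choice(by_rack[rack_b])]
    # Fall back to two endpoints within the only available rack.
    candidates = [e for e in all_endpoints if e != local]
    return candidates[:2]

assert pick_batchlog_endpoints("n1", {"r1": ["n1"]}) == ["n1"]
picks = pick_batchlog_endpoints("n1", {"r1": ["n1", "n2"], "r2": ["n3"]})
assert len(picks) == 2 and "n1" not in picks
```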
[jira] [Updated] (CASSANDRA-14742) Race Condition in batchlog replica collection
[ https://issues.apache.org/jira/browse/CASSANDRA-14742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated CASSANDRA-14742: --- Labels: pull-request-available (was: ) > Race Condition in batchlog replica collection > - > > Key: CASSANDRA-14742 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14742 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Assignee: Alex Petrov >Priority: Major > Labels: pull-request-available > > When we collect nodes for it in {{StorageProxy#getBatchlogReplicas}}, we > already filter out down replicas; subsequently they get picked up and taken > for liveAndDown. > There's a possible race condition due to picking tokens from token metadata > twice (once in {{StorageProxy#getBatchlogReplicas}} and second one in > {{ReplicaPlan#forBatchlogWrite}}) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14724) Duration addition to Date/Timestamp is broken for leapseconds
[ https://issues.apache.org/jira/browse/CASSANDRA-14724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617583#comment-16617583 ] Benjamin Lerer commented on CASSANDRA-14724: We do not accept any TZ conversion, but even if we do one day, the current Duration type will be able to support it as long as we change the addition logic. > Duration addition to Date/Timestamp is broken for leapseconds > - > > Key: CASSANDRA-14724 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14724 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benjamin Lerer >Priority: Major > Labels: correctness > > Hours, Minutes and Seconds are not always of the same duration; it varies > when they cross a leap second (or DST boundary). When we add durations to > instants, we do not account for this (as we have by then lost the necessary > information). Duration should take (and store) all components of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14672) After deleting data in 3.11.3, reads fail with "open marker and close marker have different deletion times"
[ https://issues.apache.org/jira/browse/CASSANDRA-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617522#comment-16617522 ] Jeff Jirsa commented on CASSANDRA-14672: In chatting with the folks who wrote CASSANDRA-14515 offline (namely [~iamaleksey] and [~benedict] ), it sounds like what you're seeing is likely corruption that CASSANDRA-14515 was meant to protect against. That is: the bug in Cassandra 3.11.0 to 3.11.2 that causes data loss (14515) is also leaving your data files in an invalid, corrupt state. The exception is letting you know it's broken, and in this case, that you've probably lost data due to that bug. [~iamaleksey] / [~benedict] - any thoughts on how to prove this is really just 14515 corruption? Any ideas on recovery (will scrub help here)? > After deleting data in 3.11.3, reads fail with "open marker and close marker > have different deletion times" > --- > > Key: CASSANDRA-14672 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14672 > Project: Cassandra > Issue Type: Bug > Environment: CentOS 7, GCE, 9 nodes, 4TB disk/~2TB full each, level > compaction, timeseries data >Reporter: Spiros Ioannou >Priority: Blocker > > We had 3.11.0, then we upgraded to 3.11.3 last week. We routinely perform > deletions like the one described below. 
After upgrading we run the following > deletion query: > > {code:java} > DELETE FROM measurement_events_dbl WHERE measurement_source_id IN ( > 9df798a2-6337-11e8-b52b-42010afa015a, 9df7717e-6337-11e8-b52b-42010afa015a, > a08b8042-6337-11e8-b52b-42010afa015a, a08e52cc-6337-11e8-b52b-42010afa015a, > a08e6654-6337-11e8-b52b-42010afa015a, a08e6104-6337-11e8-b52b-42010afa015a, > a08e6c76-6337-11e8-b52b-42010afa015a, a08e5a9c-6337-11e8-b52b-42010afa015a, > a08bcc50-6337-11e8-b52b-42010afa015a) AND year IN (2018) AND measurement_time > >= '2018-07-19 04:00:00'{code} > > Immediately after that, trying to read the last value produces an error: > {code:java} > select * FROM measurement_events_dbl WHERE measurement_source_id = > a08b8042-6337-11e8-b52b-42010afa015a AND year IN (2018) order by > measurement_time desc limit 1; > ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] > message="Operation failed - received 0 responses and 2 failures" > info={'failures': 2, 'received_responses': 0, 'required_responses': 1, > 'consistency': 'ONE'}{code} > > And the following exception: > {noformat} > WARN [ReadStage-4] 2018-08-29 06:59:53,505 > AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread > Thread[ReadStage-4,5,main]: {} > java.lang.RuntimeException: java.lang.IllegalStateException: > UnfilteredRowIterator for pvpms_mevents.measurement_events_dbl has an illegal > RT bounds sequence: open marker and close marker have different deletion times > at > org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2601) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > ~[na:1.8.0_181] > at > org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at > 
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) > [apache-cassandra-3.11.3.jar:3.11.3] > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) > [apache-cassandra-3.11.3.jar:3.11.3] > at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181] > Caused by: java.lang.IllegalStateException: UnfilteredRowIterator for > pvpms_mevents.measurement_events_dbl has an illegal RT bounds sequence: open > marker and close marker have different deletion times > at > org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.ise(RTBoundValidator.java:103) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at > org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.applyToMarker(RTBoundValidator.java:81) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:148) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:136) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:92) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at > org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at > org.apache
[jira] [Resolved] (CASSANDRA-14749) Collection Deletions for Dropped Columns in 2.1/3.0 mixed-mode can delete rows
[ https://issues.apache.org/jira/browse/CASSANDRA-14749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict resolved CASSANDRA-14749. -- Resolution: Fixed > Collection Deletions for Dropped Columns in 2.1/3.0 mixed-mode can delete rows > -- > > Key: CASSANDRA-14749 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14749 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benedict >Priority: Major > Fix For: 3.0.18 > > > Similar to CASSANDRA-14568, if a 2.1 node sends a response to a 3.0 node > containing a deletion for a dropped collection column, instead of deleting > the collection, we will delete the row containing the collection. > > This is an admittedly unlikely cluster state but, during such a state, a > great deal of data loss could happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14724) Duration addition to Date/Timestamp is broken for leapseconds
[ https://issues.apache.org/jira/browse/CASSANDRA-14724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617504#comment-16617504 ] Benedict commented on CASSANDRA-14724: -- Do we accept any TZ conversion? If not, I guess we just need to annotate these classes with comments indicating that - should we ever support TZs - we need to create a new Duration type and deprecate this one. > Duration addition to Date/Timestamp is broken for leapseconds > - > > Key: CASSANDRA-14724 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14724 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benjamin Lerer >Priority: Major > Labels: correctness > > Hours, Minutes and Seconds are not always of the same duration; it varies > when they cross a leap second (or DST boundary). When we add durations to > instants, we do not account for this (as we have by then lost the necessary > information). Duration should take (and store) all components of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-14713) Add docker testing image to cassandra-builds
[ https://issues.apache.org/jira/browse/CASSANDRA-14713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Podkowinski reassigned CASSANDRA-14713: -- Assignee: Stefan Podkowinski > Add docker testing image to cassandra-builds > > > Key: CASSANDRA-14713 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14713 > Project: Cassandra > Issue Type: New Feature > Components: Testing >Reporter: Stefan Podkowinski >Assignee: Stefan Podkowinski >Priority: Major > Attachments: Dockerfile > > > Tests executed on builds.apache.org ({{docker/jenkins/jenkinscommand.sh}}) > and circleCI ({{.circleci/config.yml}}) will currently use the same > [cassandra-test|https://hub.docker.com/r/kjellman/cassandra-test/] docker > image ([github|https://github.com/mkjellman/cassandra-test-docker]) by > [~mkjellman]. > We should manage this image on our own as part of cassandra-builds, to keep > it updated. There's also a [Apache > user|https://hub.docker.com/u/apache/?page=1] on docker hub for publishing > images. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14755) Reinstate repaired data tracking when ReadRepair == NONE
[ https://issues.apache.org/jira/browse/CASSANDRA-14755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-14755: Status: Patch Available (was: Open) Here's the dtest PR which contains the new tests and also turns on repaired tracking for all dtests by default: https://github.com/apache/cassandra-dtest/pull/37 C* Patch and CI runs: ||branch||utests||dtests|| |[14755-trunk|https://github.com/beobal/cassandra/tree/14755-trunk]|[utests|https://circleci.com/gh/beobal/cassandra/444]|[vnodes|https://circleci.com/gh/beobal/cassandra/445] / [no vnodes|https://circleci.com/gh/beobal/cassandra/446]| > Reinstate repaired data tracking when ReadRepair == NONE > > > Key: CASSANDRA-14755 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14755 > Project: Cassandra > Issue Type: Bug > Components: Local Write-Read Paths >Reporter: Sam Tunnicliffe >Assignee: Sam Tunnicliffe >Priority: Major > Fix For: 4.0 > > > Some of the refactoring in CASSANDRA-14698 breaks repaired data tracking when > read repair is disabled as it skips wrapping the {{MergeIterator}} in > {{DataResolver::wrapMergeListener}}. If repaired tracking is enabled, the > iterator still needs to be extended so that it calls > {{RepairedDataTracker::verify}} on close. This wasn't easy to spot as the new > dtests for CASSANDRA-14145 haven't yet been merged. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-14755) Reinstate repaired data tracking when ReadRepair == NONE
Sam Tunnicliffe created CASSANDRA-14755: --- Summary: Reinstate repaired data tracking when ReadRepair == NONE Key: CASSANDRA-14755 URL: https://issues.apache.org/jira/browse/CASSANDRA-14755 Project: Cassandra Issue Type: Bug Components: Local Write-Read Paths Reporter: Sam Tunnicliffe Assignee: Sam Tunnicliffe Fix For: 4.0 Some of the refactoring in CASSANDRA-14698 breaks repaired data tracking when read repair is disabled as it skips wrapping the {{MergeIterator}} in {{DataResolver::wrapMergeListener}}. If repaired tracking is enabled, the iterator still needs to be extended so that it calls {{RepairedDataTracker::verify}} on close. This wasn't easy to spot as the new dtests for CASSANDRA-14145 haven't yet been merged. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14724) Duration addition to Date/Timestamp is broken for leapseconds
[ https://issues.apache.org/jira/browse/CASSANDRA-14724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617389#comment-16617389 ] Benjamin Lerer commented on CASSANDRA-14724: Sorry, I got confused too. :-( Internally, C* always stores {{timestamps}} and {{dates}} in UTC time. As UTC does not change with the seasons, we are not impacted by DST. > Duration addition to Date/Timestamp is broken for leapseconds > - > > Key: CASSANDRA-14724 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14724 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benjamin Lerer >Priority: Major > Labels: correctness > > Hours, Minutes and Seconds are not always of the same duration; it varies > when they cross a leap second (or DST boundary). When we add durations to > instants, we do not account for this (as we have by then lost the necessary > information). Duration should take (and store) all components of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
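A small illustration of the point above, in plain Python rather than Cassandra code: arithmetic on UTC instants is pure fixed-duration arithmetic, so it cannot be perturbed by DST, which only exists as a local-time display rule, not in the stored UTC value.

```python
from datetime import datetime, timedelta, timezone

# A spring-forward date in many time zones; in UTC nothing special happens.
t = datetime(2018, 3, 25, 0, 30, tzinfo=timezone.utc)
later = t + timedelta(hours=2)

# Exactly 7200 elapsed seconds, regardless of any local DST transition.
assert (later - t).total_seconds() == 7200
assert later == datetime(2018, 3, 25, 2, 30, tzinfo=timezone.utc)
```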
[2/3] cassandra git commit: ninja fix: bad merge in LegacyLayoutTest
ninja fix: bad merge in LegacyLayoutTest Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/287a960a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/287a960a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/287a960a Branch: refs/heads/trunk Commit: 287a960afb10332b3521399d8ae35f892b58c995 Parents: 5d55882 Author: Benedict Elliott Smith Authored: Mon Sep 17 11:58:27 2018 +0100 Committer: Benedict Elliott Smith Committed: Mon Sep 17 11:58:27 2018 +0100 -- test/unit/org/apache/cassandra/db/LegacyLayoutTest.java | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/287a960a/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java -- diff --git a/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java b/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java index ae52902..2395963 100644 --- a/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java +++ b/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java @@ -276,8 +276,8 @@ public class LegacyLayoutTest Row.Builder builder; builder = BTreeRow.unsortedBuilder(0); -builder.newRow(new Clustering(UTF8Serializer.instance.serialize("a"))); -builder.addCell(BufferCell.live(table, v, 0L, Int32Serializer.instance.serialize(1), null)); +builder.newRow(new BufferClustering(UTF8Serializer.instance.serialize("a"))); +builder.addCell(BufferCell.live(v, 0L, Int32Serializer.instance.serialize(1), null)); builder.addComplexDeletion(bug, new DeletionTime(1L, 1)); Row row = builder.build(); - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[3/3] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7a34477a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7a34477a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7a34477a Branch: refs/heads/trunk Commit: 7a34477a916ded2250368cbf995a38ed0242c6cf Parents: 261e75f 287a960 Author: Benedict Elliott Smith Authored: Mon Sep 17 11:58:50 2018 +0100 Committer: Benedict Elliott Smith Committed: Mon Sep 17 11:58:50 2018 +0100 -- -- - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[1/3] cassandra git commit: ninja fix: bad merge in LegacyLayoutTest
Repository: cassandra Updated Branches: refs/heads/cassandra-3.11 5d5588204 -> 287a960af refs/heads/trunk 261e75f19 -> 7a34477a9 ninja fix: bad merge in LegacyLayoutTest Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/287a960a Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/287a960a Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/287a960a Branch: refs/heads/cassandra-3.11 Commit: 287a960afb10332b3521399d8ae35f892b58c995 Parents: 5d55882 Author: Benedict Elliott Smith Authored: Mon Sep 17 11:58:27 2018 +0100 Committer: Benedict Elliott Smith Committed: Mon Sep 17 11:58:27 2018 +0100 -- test/unit/org/apache/cassandra/db/LegacyLayoutTest.java | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/287a960a/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java -- diff --git a/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java b/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java index ae52902..2395963 100644 --- a/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java +++ b/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java @@ -276,8 +276,8 @@ public class LegacyLayoutTest Row.Builder builder; builder = BTreeRow.unsortedBuilder(0); -builder.newRow(new Clustering(UTF8Serializer.instance.serialize("a"))); -builder.addCell(BufferCell.live(table, v, 0L, Int32Serializer.instance.serialize(1), null)); +builder.newRow(new BufferClustering(UTF8Serializer.instance.serialize("a"))); +builder.addCell(BufferCell.live(v, 0L, Int32Serializer.instance.serialize(1), null)); builder.addComplexDeletion(bug, new DeletionTime(1L, 1)); Row row = builder.build(); - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14724) Duration addition to Date/Timestamp is broken for leapseconds
[ https://issues.apache.org/jira/browse/CASSANDRA-14724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617362#comment-16617362 ] Benjamin Lerer commented on CASSANDRA-14724: Sorry, you are right the switch in {{Duration.add}} is at the wrong level. It should have been at the {{day}} level. > Duration addition to Date/Timestamp is broken for leapseconds > - > > Key: CASSANDRA-14724 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14724 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benjamin Lerer >Priority: Major > Labels: correctness > > Hours, Minutes and Seconds are not always of the same duration; it varies > when they cross a leap second (or DST boundary). When we add durations to > instants, we do not account for this (as we have by then lost the necessary > information). Duration should take (and store) all components of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14749) Collection Deletions for Dropped Columns in 2.1/3.0 mixed-mode can delete rows
[ https://issues.apache.org/jira/browse/CASSANDRA-14749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617364#comment-16617364 ] Benedict commented on CASSANDRA-14749: -- Hmm. Sorry about that. I ran {{ant clean && ant}} but clearly didn't look closely at the result. I'll ninja in a fix shortly, since only tests are affected. > Collection Deletions for Dropped Columns in 2.1/3.0 mixed-mode can delete rows > -- > > Key: CASSANDRA-14749 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14749 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benedict >Priority: Major > Fix For: 3.0.18 > > > Similar to CASSANDRA-14568, if a 2.1 node sends a response to a 3.0 node > containing a deletion for a dropped collection column, instead of deleting > the collection, we will delete the row containing the collection. > > This is an admittedly unlikely cluster state but, during such a state, a > great deal of data loss could happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14724) Duration addition to Date/Timestamp is broken for leapseconds
[ https://issues.apache.org/jira/browse/CASSANDRA-14724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-14724: --- Reproduced In: 4.0.x (was: 2.2.x, 3.0.x, 4.0.x) > Duration addition to Date/Timestamp is broken for leapseconds > - > > Key: CASSANDRA-14724 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14724 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benjamin Lerer >Priority: Major > Labels: correctness > > Hours, Minutes and Seconds are not always of the same duration; it varies > when they cross a leap second (or DST boundary). When we add durations to > instants, we do not account for this (as we have by then lost the necessary > information). Duration should take (and store) all components of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14724) Duration addition to Date/Timestamp is broken for leapseconds
[ https://issues.apache.org/jira/browse/CASSANDRA-14724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617334#comment-16617334 ] Benedict commented on CASSANDRA-14724: -- bq. If you want to fetch all the data from the last 2 days but do not care that the day lasted 23, 24 or 25 hours you should query the data using a query like {{SELECT * FROM myTable WHERE pk = `XXX` AND operationTime > ? - 2d}}. But this isn't the case - it will give them either 1 extra or 1 fewer hour? We equate -2d with precisely -2*24*60*60*10^9ns, whatever the TZ or instant. bq. In my opinion for DST it is the correct behavior because it allows you to query exactly the data that you want. But as shown above, it doesn't, and it seems like even you are confused by this? Unless I am mistaken, in which case I am confused by it. In either case, it doesn't bode well for our users. > Duration addition to Date/Timestamp is broken for leapseconds > - > > Key: CASSANDRA-14724 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14724 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benjamin Lerer >Priority: Major > Labels: correctness > > Hours, Minutes and Seconds are not always of the same duration; it varies > when they cross a leap second (or DST boundary). When we add durations to > instants, we do not account for this (as we have by then lost the necessary > information). Duration should take (and store) all components of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
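The arithmetic behind the objection above, as plain Python (no Cassandra code): if "2d" is a fixed duration, then querying "the last 2 days" across a DST change covers one hour more or less than 2 calendar days of wall-clock time.

```python
FIXED_2D_SECONDS = 2 * 24 * 60 * 60            # fixed-duration "2d"
assert FIXED_2D_SECONDS == 172_800

# Wall-clock length of 2 calendar days when one of them is a DST day:
spring_forward = 23 * 3600 + 24 * 3600         # one 23h day + one 24h day
fall_back = 25 * 3600 + 24 * 3600              # one 25h day + one 24h day

assert FIXED_2D_SECONDS - spring_forward == 3600   # one extra hour fetched
assert fall_back - FIXED_2D_SECONDS == 3600        # one hour missed
```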
[jira] [Comment Edited] (CASSANDRA-14749) Collection Deletions for Dropped Columns in 2.1/3.0 mixed-mode can delete rows
[ https://issues.apache.org/jira/browse/CASSANDRA-14749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617331#comment-16617331 ] Robert Stupp edited comment on CASSANDRA-14749 at 9/17/18 10:21 AM: Seems that this commit broke the build in cassandra-3.11: {code:java} build-test: [javac] Compiling 540 source files to /home/automaton/cassandra-src/build/test/classes [javac] /home/automaton/cassandra-src/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java:279: error: Clustering is abstract; cannot be instantiated [javac] builder.newRow(new Clustering(UTF8Serializer.instance.serialize("a"))); [javac] ^ [javac] /home/automaton/cassandra-src/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java:280: error: no suitable method found for live(CFMetaData,ColumnDefinition,long,ByteBuffer,) [javac] builder.addCell(BufferCell.live(table, v, 0L, Int32Serializer.instance.serialize(1), null)); [javac] ^ [javac] method BufferCell.live(ColumnDefinition,long,ByteBuffer) is not applicable [javac] (actual and formal argument lists differ in length) [javac] method BufferCell.live(ColumnDefinition,long,ByteBuffer,CellPath) is not applicable [javac] (actual and formal argument lists differ in length) [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. 
[javac] 2 errors {code} was (Author: snazy): Seems that this commit broke the build: {code:java} build-test: [javac] Compiling 540 source files to /home/automaton/cassandra-src/build/test/classes [javac] /home/automaton/cassandra-src/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java:279: error: Clustering is abstract; cannot be instantiated [javac] builder.newRow(new Clustering(UTF8Serializer.instance.serialize("a"))); [javac] ^ [javac] /home/automaton/cassandra-src/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java:280: error: no suitable method found for live(CFMetaData,ColumnDefinition,long,ByteBuffer,) [javac] builder.addCell(BufferCell.live(table, v, 0L, Int32Serializer.instance.serialize(1), null)); [javac] ^ [javac] method BufferCell.live(ColumnDefinition,long,ByteBuffer) is not applicable [javac] (actual and formal argument lists differ in length) [javac] method BufferCell.live(ColumnDefinition,long,ByteBuffer,CellPath) is not applicable [javac] (actual and formal argument lists differ in length) [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 2 errors {code} > Collection Deletions for Dropped Columns in 2.1/3.0 mixed-mode can delete rows > -- > > Key: CASSANDRA-14749 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14749 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benedict >Priority: Major > Fix For: 3.0.18 > > > Similar to CASSANDRA-14568, if a 2.1 node sends a response to a 3.0 node > containing a deletion for a dropped collection column, instead of deleting > the collection, we will delete the row containing the collection. > > This is an admittedly unlikely cluster state but, during such a state, a > great deal of data loss could happen. 
[jira] [Reopened] (CASSANDRA-14749) Collection Deletions for Dropped Columns in 2.1/3.0 mixed-mode can delete rows
[ https://issues.apache.org/jira/browse/CASSANDRA-14749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Stupp reopened CASSANDRA-14749: -- Seems that this commit broke the build: {code:java} build-test: [javac] Compiling 540 source files to /home/automaton/cassandra-src/build/test/classes [javac] /home/automaton/cassandra-src/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java:279: error: Clustering is abstract; cannot be instantiated [javac] builder.newRow(new Clustering(UTF8Serializer.instance.serialize("a"))); [javac] ^ [javac] /home/automaton/cassandra-src/test/unit/org/apache/cassandra/db/LegacyLayoutTest.java:280: error: no suitable method found for live(CFMetaData,ColumnDefinition,long,ByteBuffer,) [javac] builder.addCell(BufferCell.live(table, v, 0L, Int32Serializer.instance.serialize(1), null)); [javac] ^ [javac] method BufferCell.live(ColumnDefinition,long,ByteBuffer) is not applicable [javac] (actual and formal argument lists differ in length) [javac] method BufferCell.live(ColumnDefinition,long,ByteBuffer,CellPath) is not applicable [javac] (actual and formal argument lists differ in length) [javac] Note: Some input files use or override a deprecated API. [javac] Note: Recompile with -Xlint:deprecation for details. [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 2 errors {code} > Collection Deletions for Dropped Columns in 2.1/3.0 mixed-mode can delete rows > -- > > Key: CASSANDRA-14749 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14749 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benedict >Priority: Major > Fix For: 3.0.18 > > > Similar to CASSANDRA-14568, if a 2.1 node sends a response to a 3.0 node > containing a deletion for a dropped collection column, instead of deleting > the collection, we will delete the row containing the collection. 
> > This is an admittedly unlikely cluster state but, during such a state, a > great deal of data loss could happen. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14724) Duration addition to Date/Timestamp is broken for leapseconds
[ https://issues.apache.org/jira/browse/CASSANDRA-14724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617326#comment-16617326 ] Benjamin Lerer commented on CASSANDRA-14724: Let's first focus on the problem of {{DST boundaries}}. {{Durations}}, as explained [here|http://cassandra.apache.org/doc/latest/cql/types.html#working-with-durations], contain three integers: a number of months, a number of days and a number of nanoseconds. This is due to the fact that the number of days in a month can change, and a day can have 23 or 25 hours depending on daylight saving. If you want to fetch all the data from the last 2 days but do not care that the day lasted 23, 24 or 25 hours, you should query the data using a query like {{SELECT \* FROM myTable WHERE pk = `XXX` AND operationTime > ? - 2d}}. If instead you want to fetch all the data from the last 2 hours but do not care that during that time your clock changed due to daylight saving, you should use {{SELECT \* FROM myTable WHERE pk = `XXX` AND operationTime > ? - 2h}}. In my opinion, for DST this is the correct behavior because it allows you to query exactly the data that you want. Leap seconds are a much more complex problem because the Java libraries do not really handle leap seconds either. Consequently, there is not much that we can do here. > Duration addition to Date/Timestamp is broken for leapseconds > - > > Key: CASSANDRA-14724 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14724 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benjamin Lerer >Priority: Major > Labels: correctness > > Hours, Minutes and Seconds are not always of the same duration; it varies > when they cross a leap second (or DST boundary). When we add durations to > instants, we do not account for this (as we have by then lost the necessary > information). Duration should take (and store) all components of time.
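The distinction debated above — calendar-day arithmetic versus fixed 24-hour arithmetic — can be reproduced with any timezone-aware library. A minimal sketch using Python's stdlib {{zoneinfo}} (America/New_York and the 2018 spring-forward date are arbitrary illustrations; this is not Cassandra code):

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

NY = ZoneInfo("America/New_York")
# Noon on the day before the 2018 spring-forward (clocks jump 02:00 -> 03:00)
start = datetime(2018, 3, 10, 12, 0, tzinfo=NY)

# Calendar arithmetic: "same wall-clock time, next day"
next_day_wall = start + timedelta(days=1)

# Fixed arithmetic: exactly 24 * 3600 elapsed seconds
next_day_fixed = (start.astimezone(timezone.utc)
                  + timedelta(hours=24)).astimezone(NY)

# The wall-clock "day" spanned only 23 real hours...
elapsed = next_day_wall.astimezone(timezone.utc) - start.astimezone(timezone.utc)
print(elapsed)              # 23:00:00

# ...while a fixed 24h lands at a different local time
print(next_day_fixed.hour)  # 13
```

Whether a user who subtracts {{2d}} wants the calendar behaviour or the fixed one is exactly the disagreement in this thread.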
[jira] [Comment Edited] (CASSANDRA-14752) serializers/BooleanSerializer.java is using static bytebuffers which may cause problem for subsequent operations
[ https://issues.apache.org/jira/browse/CASSANDRA-14752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617276#comment-16617276 ] Varun Barala edited comment on CASSANDRA-14752 at 9/17/18 9:32 AM: --- I found there are too many usages of `AbstractCompositeType#fromString()`. One way to corrupt the data:- Table schema:- {code:java} CREATE TABLE ks1.table1 ( t_id boolean, id boolean, ck boolean, nk boolean, PRIMARY KEY ((t_id,id),ck) );{code} Insert statement:- {code:java} insert into ks1.table1 (t_id, ck, id, nk) VALUES (true, false, false, true); {code} Now run nodetool command to get the SSTable for given key:- {code:java} bin/nodetool getsstables ks1 table1 "false:true" {code} Basically, this operation will modify the positions. Insert again:- {code:java} insert into ks1.table1 (t_id, ck, id, nk) VALUES (true, true, false, true); {code} select data from this table:- {code:java} true,false,false,true null,null,null,null {code} So now all boolean type data will be written as null. was (Author: varuna): I found there are too many usages of `AbstractCompositeType#fromString()`. One way to corrupt the data:- Table schema:- {code:java} CREATE TABLE ks1.table1 ( t_id boolean, id boolean, ck boolean, nk boolean, PRIMARY KEY ((t_id,id),ck) );{code} Insert statement:- {code:java} insert into ks1.table1 (tenant_id, ck, id, nk) VALUES (true, false, false, true); {code} Now run nodetool command to get the SSTable for given key:- {code:java} bin/nodetool getsstables ks1 table1 "false:true" {code} Basically, this operation will modify the positions. Insert again:- {code:java} insert into ks1.table1 (tenant_id, ck, id, nk) VALUES (true, true, false, true); {code} select data from this table:- {code:java} true,false,false,true null,null,null,null {code} So now all boolean type data will be written as null. 
> serializers/BooleanSerializer.java is using static bytebuffers which may > cause problem for subsequent operations > > > Key: CASSANDRA-14752 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14752 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Varun Barala >Priority: Major > Attachments: patch, patch-modified > > > [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L26] > It has two static Bytebuffer variables:- > {code:java} > private static final ByteBuffer TRUE = ByteBuffer.wrap(new byte[]{1}); > private static final ByteBuffer FALSE = ByteBuffer.wrap(new byte[]{0});{code} > What will happen if the position of these Bytebuffers is being changed by > some other operations? It'll affect other subsequent operations. IMO Using > static is not a good idea here. > A potential place where it can become problematic: > [https://github.com/apache/cassandra/blob/cassandra-2.1.13/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java#L243] > Since we are calling *`.remaining()`* It may give wrong results _i.e 0_ if > these Bytebuffers have been used previously. > Solution: > > [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L42] > Every time we return new bytebuffer object. Please do let me know If there > is a better way. I'd like to contribute. Thanks!! > {code:java} > public ByteBuffer serialize(Boolean value) > { > return (value == null) ? ByteBufferUtil.EMPTY_BYTE_BUFFER > : value ? ByteBuffer.wrap(new byte[] {1}) : ByteBuffer.wrap(new byte[] {0}); > // false > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14752) serializers/BooleanSerializer.java is using static bytebuffers which may cause problem for subsequent operations
[ https://issues.apache.org/jira/browse/CASSANDRA-14752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617276#comment-16617276 ] Varun Barala commented on CASSANDRA-14752: -- I found there are too many usages of `AbstractCompositeType#fromString()`. One way to corrupt the data:- Table schema:- {code:java} CREATE TABLE ks1.table1 ( t_id boolean, id boolean, ck boolean, nk boolean, PRIMARY KEY ((t_id,id),ck) );{code} Insert statement:- {code:java} insert into ks1.table1 (tenant_id, ck, id, nk) VALUES (true, false, false, true); {code} Now run nodetool command to get the SSTable for given key:- {code:java} bin/nodetool getsstables ks1 table1 "false:true" {code} Basically, this operation will modify the positions. Insert again:- {code:java} insert into ks1.table1 (tenant_id, ck, id, nk) VALUES (true, true, false, true); {code} select data from this table:- {code:java} true,false,false,true null,null,null,null {code} So now all boolean type data will be written as null. > serializers/BooleanSerializer.java is using static bytebuffers which may > cause problem for subsequent operations > > > Key: CASSANDRA-14752 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14752 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Varun Barala >Priority: Major > Attachments: patch, patch-modified > > > [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L26] > It has two static Bytebuffer variables:- > {code:java} > private static final ByteBuffer TRUE = ByteBuffer.wrap(new byte[]{1}); > private static final ByteBuffer FALSE = ByteBuffer.wrap(new byte[]{0});{code} > What will happen if the position of these Bytebuffers is being changed by > some other operations? It'll affect other subsequent operations. IMO Using > static is not a good idea here. 
> A potential place where it can become problematic: > [https://github.com/apache/cassandra/blob/cassandra-2.1.13/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java#L243] > Since we are calling *`.remaining()`* It may give wrong results _i.e 0_ if > these Bytebuffers have been used previously. > Solution: > > [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L42] > Every time we return new bytebuffer object. Please do let me know If there > is a better way. I'd like to contribute. Thanks!! > {code:java} > public ByteBuffer serialize(Boolean value) > { > return (value == null) ? ByteBufferUtil.EMPTY_BYTE_BUFFER > : value ? ByteBuffer.wrap(new byte[] {1}) : ByteBuffer.wrap(new byte[] {0}); > // false > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
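The failure mode described in this ticket — a shared, positioned buffer drained by one caller and then empty for every later caller — is easy to demonstrate outside Cassandra. A rough Python stand-in for the static {{TRUE}} ByteBuffer, using {{io.BytesIO}} as the positioned buffer (illustrative only, not the actual serializer code):

```python
import io

# A single shared buffer handed to every caller, like the static
# BooleanSerializer.TRUE ByteBuffer
TRUE = io.BytesIO(b"\x01")

def remaining(buf):
    """Bytes left between the current position and the end."""
    here = buf.tell()
    end = buf.seek(0, io.SEEK_END)
    buf.seek(here)
    return end - here

before = remaining(TRUE)  # 1 -- looks fine the first time
TRUE.read()               # some caller consumes the shared buffer
after = remaining(TRUE)   # 0 -- every later caller now sees an "empty" value
```

This mirrors the report: once one code path advances the position, every subsequent `.remaining()` check on the shared object returns 0.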
[jira] [Commented] (CASSANDRA-14752) serializers/BooleanSerializer.java is using static bytebuffers which may cause problem for subsequent operations
[ https://issues.apache.org/jira/browse/CASSANDRA-14752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617225#comment-16617225 ] Varun Barala commented on CASSANDRA-14752: -- [~blerer] Thanks for your reply. In one of our tools, we use the code below to generate the DecoratedKey from a String, and in the case of the boolean type we are facing this issue. {code:java} DatabaseDescriptor.getPartitioner().decorateKey(getKeyValidator(row.getColumnFamily()) .fromString(stringKey)); {code} [https://github.com/apache/cassandra/blob/cassandra-2.1.13/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java#L255] `byteBuffer.put` changes the position, though it has a comment: *// it's ok to consume component as we won't use it anymore.* > serializers/BooleanSerializer.java is using static bytebuffers which may > cause problem for subsequent operations > > > Key: CASSANDRA-14752 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14752 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Varun Barala >Priority: Major > Attachments: patch, patch-modified > > > [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L26] > It has two static Bytebuffer variables:- > {code:java} > private static final ByteBuffer TRUE = ByteBuffer.wrap(new byte[]{1}); > private static final ByteBuffer FALSE = ByteBuffer.wrap(new byte[]{0});{code} > What will happen if the position of these Bytebuffers is being changed by > some other operations? It'll affect other subsequent operations. IMO Using > static is not a good idea here. > A potential place where it can become problematic: > [https://github.com/apache/cassandra/blob/cassandra-2.1.13/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java#L243] > Since we are calling *`.remaining()`* It may give wrong results _i.e 0_ if > these Bytebuffers have been used previously.
> Solution: > > [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L42] > Every time we return new bytebuffer object. Please do let me know If there > is a better way. I'd like to contribute. Thanks!! > {code:java} > public ByteBuffer serialize(Boolean value) > { > return (value == null) ? ByteBufferUtil.EMPTY_BYTE_BUFFER > : value ? ByteBuffer.wrap(new byte[] {1}) : ByteBuffer.wrap(new byte[] {0}); > // false > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14672) After deleting data in 3.11.3, reads fail with "open marker and close marker have different deletion times"
[ https://issues.apache.org/jira/browse/CASSANDRA-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617214#comment-16617214 ] Nikos Mitsis commented on CASSANDRA-14672: -- We did some testing on a replica environment. Specifically, we added some printf statements to check the timestamps on the open and close markers that Cassandra complains about. The replication factor is 3; nodes 6, 8 & 9 contain the data. Nodes 8 & 9 are throwing the exception; node 6, when queried with consistency ONE, does not fail (it returns proper data - see below why). On node8 & node9 the following timestamps are found:
Printf:
{noformat}
==> openMarkerDeletionTime: null
==> openMarkerDeletionTime: deletedAt=1537103654634113, localDeletion=1537103654
==> deletionTime: deletedAt=1530205388555918, localDeletion=1530205388
{noformat}
Exception:
{noformat}
WARN [ReadStage-1] 2018-09-16 14:40:44,252 AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread Thread[ReadStage-1,5,main]: {}
java.lang.IllegalStateException: ==> UnfilteredRowIterator for pvpms_mevents.measurement_events_dbl has an illegal RT bounds sequence: open marker and close marker have different deletion times [deletedAt=1537103654634113, localDeletion=1537103654 deletedAt=1530205388555918, localDeletion=1530205388]
{noformat}
Converted timestamps:
{noformat}
openDeletionTime: Sun Sep 16 13:14:14 UTC 2018
closeDeletionTime: Thu Jun 28 17:03:08 UTC 2018
{noformat}
The DELETE FROM command above was run on Sep 16. On node6:
Printf:
{noformat}
==> openMarkerDeletionTime: null
==> openMarkerDeletionTime: deletedAt=1537103654634113, localDeletion=1537103654
==> deletionTime: deletedAt=1537103654634113, localDeletion=1537103654
{noformat}
Converted timestamps:
{noformat}
Sun Sep 16 13:14:14 UTC 2018
{noformat}
No exception!
We did a json dump of the specified data from nodes 8 & 9 and found a range_tombstone_bound with a start/end marked_deleted timestamp of "2018-06-28T17:03:08Z" that contains data from Jul & Aug (see rows below). On node 6 the same data are not within a tombstone marker (it’s the same json without the range_tombstone_bound).
{noformat}
"partition" : {
  { "type" : "range_tombstone_bound",
    "start" : { "type" : "exclusive", "clustering" : [ "2018-06-27 04:55:00.000Z" ],
      "deletion_info" : { "marked_deleted" : "2018-06-28T17:03:08.555918Z", "local_delete_time" : "2018-06-28T17:03:08Z" } } },
  { "type" : "row", "position" : 83860313, "clustering" : [ "2018-06-27 05:00:00.000Z" ],
    "liveness_info" : { "tstamp" : "2018-06-28T19:45:30.803293Z" },
    "cells" : [ { "name" : "event_reception_time", "value" : "2018-06-28 19:45:30.784Z" },
                { "name" : "quality", "value" : 100.0 },
                { "name" : "value", "value" : 408307.66 } ] },
  …
  { "type" : "row", "position" : 83953463, "clustering" : [ "2018-07-19 03:45:00.000Z" ],
    "liveness_info" : { "tstamp" : "2018-07-19T03:46:29.195118Z" },
    "cells" : [ { "name" : "event_reception_time", "value" : "2018-07-19 03:46:29.193Z" },
                { "name" : "quality", "value" : 100.0 },
                { "name" : "value", "value" : 593846.06 } ] },
  …
  { "type" : "row", "position" : 84054985, "clustering" : [ "2018-08-11 04:00:00.000Z" ],
    "liveness_info" : { "tstamp" : "2018-08-11T04:01:15.708470Z" },
    "cells" : [ { "name" : "event_reception_time", "value" : "2018-08-11 04:01:15.703Z" },
                { "name" : "quality", "value" : 100.0 },
                { "name" : "value", "value" : 372654.53 } ] },
  { "type" : "range_tombstone_bound",
    "end" : { "type" : "inclusive",
      "deletion_info" : { "marked_deleted" : "2018-06-28T17:03:08.555918Z", "local_delete_time" : "2018-06-28T17:03:08Z" } } }
{noformat}
We have downgraded in the meantime to Cassandra 3.11.2. Shouldn't these inconsistencies at least have more graceful assertions?
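For reference, the {{deletedAt}} values in the markers above are microseconds since the epoch; the conversion the reporter performed can be sketched as follows (integer timedelta arithmetic avoids float rounding of the microsecond part):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def micros_to_utc(us: int) -> datetime:
    # deletedAt values are microseconds since the Unix epoch
    return EPOCH + timedelta(microseconds=us)

open_marker = micros_to_utc(1537103654634113)
close_marker = micros_to_utc(1530205388555918)
print(open_marker)   # 2018-09-16 13:14:14.634113+00:00
print(close_marker)  # 2018-06-28 17:03:08.555918+00:00
```

The two values decode to the Sep 16 delete and the earlier Jun 28 deletion, matching the converted timestamps quoted above.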
> After deleting data in 3.11.3, reads fail with "open marker and close marker > have different deletion times" > --- > > Key: CASSANDRA-14672 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14672 > Project: Cassandra > Issue Type: Bug > Environment: CentOS 7, GCE, 9 nodes, 4TB disk/~2TB full each, level > compaction, timeseries data >Reporter: Spiros Ioannou >Priority: Blocker > > We had 3.11.0, then we upgraded to 3.11.3 last week. We routinely perform > deletions as the one described below. After upgrading we run the following > deletion query: > > {code:java} > DELETE FROM measurement_eve
[jira] [Commented] (CASSANDRA-14702) Cassandra Write failed even when the required nodes to Ack(consistency) are up.
[ https://issues.apache.org/jira/browse/CASSANDRA-14702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617192#comment-16617192 ] Rui Pan commented on CASSANDRA-14702: - I think you could provide a more detailed description of how you reproduced the issue, such as the batch size, the load, the data model, and how you configured the 5-node cluster. Based on your previous comment, I guess the cause is that the batch statement you performed involved too many cross-partition mutations, which led to high load on the C* cluster. The latency this created could explain the WriteTimeoutException. > Cassandra Write failed even when the required nodes to Ack(consistency) are > up. > --- > > Key: CASSANDRA-14702 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14702 > Project: Cassandra > Issue Type: Bug >Reporter: Rohit Singh >Priority: Blocker > > Hi, > We have following configuration in our project for cassandra. > Total nodes in Cluster-5 > Replication Factor- 3 > Consistency- LOCAL_QUORUM > We get the writetimeout exception from cassandra even when 2 nodes are up and > why does stack trace says that 3 replica were required when consistency is 2?
> Below is the exception we got:- > com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout > during write query at consistency LOCAL_QUORUM (3 replica were required but > only 2 acknowledged the write) > at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:59) > at com.datastax.driver.core.Responses$Error$1.decode(Responses.java:37) > at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:289) > at com.datastax.driver.core.Message$ProtocolDecoder.decode(Message.java:269) > at > io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
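For context on the reporter's question: a quorum for replication factor N is floor(N/2) + 1, so LOCAL_QUORUM with RF=3 should indeed need only 2 acks, which is why the driver's "3 replica were required" message is the surprising part. The arithmetic, as a one-liner:

```python
def quorum(rf: int) -> int:
    # Quorum = a strict majority of the replicas: floor(rf/2) + 1
    return rf // 2 + 1

print(quorum(3))  # 2 -- LOCAL_QUORUM with RF=3 waits for 2 acks
print(quorum(5))  # 3
```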
[jira] [Assigned] (CASSANDRA-14724) Duration addition to Date/Timestamp is broken for leapseconds
[ https://issues.apache.org/jira/browse/CASSANDRA-14724?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer reassigned CASSANDRA-14724: -- Assignee: Benjamin Lerer > Duration addition to Date/Timestamp is broken for leapseconds > - > > Key: CASSANDRA-14724 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14724 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Benedict >Assignee: Benjamin Lerer >Priority: Major > Labels: correctness > > Hours, Minutes and Seconds are not always of the same duration; it varies > when they cross a leap second (or DST boundary). When we add durations to > instants, we do not account for this (as we have by then lost the necessary > information). Duration should take (and store) all components of time. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14752) serializers/BooleanSerializer.java is using static bytebuffers which may cause problem for subsequent operations
[ https://issues.apache.org/jira/browse/CASSANDRA-14752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16617180#comment-16617180 ] Benjamin Lerer commented on CASSANDRA-14752: bq. What will happen if the position of these Bytebuffers is being changed by some other operations? In C* you should not change the position unless you have duplicated your `ByteBuffer` first or you really know what you are doing. All the data stored in the memtables, for example, is shared, so if you change the position on a `ByteBuffer` coming from there you can corrupt the data in a much worse way. Personally, I would not try to change that code, as its only effect would be to create more garbage. > serializers/BooleanSerializer.java is using static bytebuffers which may > cause problem for subsequent operations > > > Key: CASSANDRA-14752 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14752 > Project: Cassandra > Issue Type: Bug > Components: Core >Reporter: Varun Barala >Priority: Major > Attachments: patch, patch-modified > > > [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L26] > It has two static Bytebuffer variables:- > {code:java} > private static final ByteBuffer TRUE = ByteBuffer.wrap(new byte[]{1}); > private static final ByteBuffer FALSE = ByteBuffer.wrap(new byte[]{0});{code} > What will happen if the position of these Bytebuffers is being changed by > some other operations? It'll affect other subsequent operations. IMO Using > static is not a good idea here. > A potential place where it can become problematic: > [https://github.com/apache/cassandra/blob/cassandra-2.1.13/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java#L243] > Since we are calling *`.remaining()`* It may give wrong results _i.e 0_ if > these Bytebuffers have been used previously.
> Solution: > > [https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/serializers/BooleanSerializer.java#L42] > Every time we return new bytebuffer object. Please do let me know If there > is a better way. I'd like to contribute. Thanks!! > {code:java} > public ByteBuffer serialize(Boolean value) > { > return (value == null) ? ByteBufferUtil.EMPTY_BYTE_BUFFER > : value ? ByteBuffer.wrap(new byte[] {1}) : ByteBuffer.wrap(new byte[] {0}); > // false > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
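The "duplicate first" discipline mentioned in this thread (Java's {{ByteBuffer.duplicate()}} gives an independent position over the same bytes) can be mimicked with Python's {{io.BytesIO}} as a rough stand-in for a positioned buffer. A hedged sketch, not Cassandra code; note that here the bytes are copied rather than shared, so it is only an analogy:

```python
import io

# Shared buffer, standing in for a static constant like BooleanSerializer.TRUE
TRUE = io.BytesIO(b"\x01")

def duplicate(buf):
    # Independent position over the same content -- loosely analogous
    # to Java's ByteBuffer.duplicate() (here the bytes are copied)
    dup = io.BytesIO(buf.getvalue())
    dup.seek(buf.tell())
    return dup

consumed = duplicate(TRUE).read()          # drains only the copy
left = len(TRUE.getvalue()) - TRUE.tell()  # shared buffer untouched
print(consumed, left)
```

Consuming the duplicate leaves the shared constant's position intact, which is why duplicating before any position-advancing operation avoids the corruption described in the ticket.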