[jira] [Updated] (CASSANDRA-14869) Range.subtractContained produces incorrect results when used on full ring
[ https://issues.apache.org/jira/browse/CASSANDRA-14869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Sorokoumov updated CASSANDRA-14869:
---------------------------------------------
    Description:
The bug is in the way {{Range.subtractContained}} works when the minuend range covers the full ring and the subtrahend range goes over 0 (see illustration). For example, {{(50, 50] - (10, 100]}} returns {{(50,10], (100,50]}} instead of {{(100,10]}}.

  (was: The bug is in the way {{Range.subtractContained}} works when the minuend range covers the full ring and the subtrahend range goes over 0 (see illustration). For example, {{(-50, -50] - (10, 100]}} returns {{(50,10], (100,50]}} instead of {{(100,10]}}.)

> Range.subtractContained produces incorrect results when used on full ring
> -------------------------------------------------------------------------
>
>         Key: CASSANDRA-14869
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-14869
>     Project: Cassandra
>  Issue Type: Bug
>    Reporter: Aleksandr Sorokoumov
>    Assignee: Aleksandr Sorokoumov
>    Priority: Major
>     Fix For: 3.0.x, 3.11.x, 4.0.x
> Attachments: range bug.jpg
>
> The bug is in the way {{Range.subtractContained}} works when the minuend range covers the full ring and the subtrahend range goes over 0 (see illustration). For example, {{(50, 50] - (10, 100]}} returns {{(50,10], (100,50]}} instead of {{(100,10]}}.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14869) Range.subtractContained produces incorrect results when used on full ring
[ https://issues.apache.org/jira/browse/CASSANDRA-14869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aleksandr Sorokoumov updated CASSANDRA-14869:
---------------------------------------------
    Description:
The bug is in the way {{Range.subtractContained}} works when the minuend range covers the full ring and the subtrahend range goes over 0 (see illustration). For example, {{(-50, -50] - (10, 100]}} returns {{(50,10], (100,50]}} instead of {{(100,10]}}.

  (was: The bug is in the way {{Range.subtractContained}} works when the minuend range covers the full ring and the subtrahend range goes over 0 (see illustration). For example, {{(-50, 50] - (10, 100]}} returns {{(50,10], (100,50]}} instead of {{(100,10]}}.)

> Range.subtractContained produces incorrect results when used on full ring
> -------------------------------------------------------------------------
>
>         Key: CASSANDRA-14869
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-14869
>     Project: Cassandra
>  Issue Type: Bug
>    Reporter: Aleksandr Sorokoumov
>    Assignee: Aleksandr Sorokoumov
>    Priority: Major
>     Fix For: 3.0.x, 3.11.x, 4.0.x
> Attachments: range bug.jpg
>
> The bug is in the way {{Range.subtractContained}} works when the minuend range covers the full ring and the subtrahend range goes over 0 (see illustration). For example, {{(-50, -50] - (10, 100]}} returns {{(50,10], (100,50]}} instead of {{(100,10]}}.
[jira] [Commented] (CASSANDRA-14869) Range.subtractContained produces incorrect results when used on full ring
[ https://issues.apache.org/jira/browse/CASSANDRA-14869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1663#comment-1663 ]

Aleksandr Sorokoumov commented on CASSANDRA-14869:
--------------------------------------------------
Patches:
* [3.0 | https://github.com/Ge/cassandra/tree/14869-3.0]
* [3.11 | https://github.com/Ge/cassandra/tree/14869-3.11]
* [4.0 | https://github.com/Ge/cassandra/tree/14869-4.0]

> Range.subtractContained produces incorrect results when used on full ring
> -------------------------------------------------------------------------
>
>         Key: CASSANDRA-14869
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-14869
>     Project: Cassandra
>  Issue Type: Bug
>    Reporter: Aleksandr Sorokoumov
>    Assignee: Aleksandr Sorokoumov
>    Priority: Major
>     Fix For: 3.0.x, 3.11.x, 4.0.x
> Attachments: range bug.jpg
>
> The bug is in the way {{Range.subtractContained}} works when the minuend range covers the full ring and the subtrahend range goes over 0 (see illustration). For example, {{(-50, 50] - (10, 100]}} returns {{(50,10], (100,50]}} instead of {{(100,10]}}.
[jira] [Created] (CASSANDRA-14869) Range.subtractContained produces incorrect results when used on full ring
Aleksandr Sorokoumov created CASSANDRA-14869:
---------------------------------------------

            Summary: Range.subtractContained produces incorrect results when used on full ring
                Key: CASSANDRA-14869
                URL: https://issues.apache.org/jira/browse/CASSANDRA-14869
            Project: Cassandra
         Issue Type: Bug
           Reporter: Aleksandr Sorokoumov
           Assignee: Aleksandr Sorokoumov
            Fix For: 3.0.x, 3.11.x, 4.0.x
        Attachments: range bug.jpg

The bug is in the way {{Range.subtractContained}} works when the minuend range covers the full ring and the subtrahend range goes over 0 (see illustration). For example, {{(-50, 50] - (10, 100]}} returns {{(50,10], (100,50]}} instead of {{(100,10]}}.
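The arithmetic the ticket describes is easy to check with a toy model. The sketch below is NOT Cassandra's actual {{Range}} class; {{RingRange}} and its methods are illustrative stand-ins showing only the expected result of subtracting a range from a range that covers the full ring: the complement of the subtrahend.

```java
// Toy model of token ranges on a ring. This is NOT Cassandra's Range
// implementation; RingRange and its methods are illustrative stand-ins.
public class RingRangeDemo {
    static final class RingRange {
        final long left, right;

        RingRange(long left, long right) {
            this.left = left;
            this.right = right;
        }

        // (x, x] is the conventional representation of a range covering the full ring.
        boolean isFullRing() {
            return left == right;
        }

        // Full ring minus 'other' is simply the complement of 'other':
        // everything from other's right bound around to other's left bound.
        RingRange subtractFromFullRing(RingRange other) {
            if (!isFullRing())
                throw new IllegalStateException("minuend must cover the full ring");
            return new RingRange(other.right, other.left);
        }

        @Override
        public String toString() {
            return "(" + left + "," + right + "]";
        }
    }

    public static void main(String[] args) {
        RingRange fullRing = new RingRange(50, 50);   // covers the whole ring
        RingRange subtrahend = new RingRange(10, 100);
        // (50, 50] - (10, 100] should be the single wrapping range (100,10],
        // not the pair {(50,10], (100,50]} that the buggy code produced.
        System.out.println(fullRing.subtractFromFullRing(subtrahend)); // prints (100,10]
    }
}
```

The key point is that the correct result is one wrapping range, whereas the reported bug split it into two non-wrapping pieces.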
[jira] [Commented] (CASSANDRA-14554) LifecycleTransaction encounters ConcurrentModificationException when used in multi-threaded context
[ https://issues.apache.org/jira/browse/CASSANDRA-14554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677739#comment-16677739 ]

Stefania commented on CASSANDRA-14554:
--------------------------------------
We had a related issue where one of our customers ended up with a corrupt txn log file during streaming, with an ADD record following an ABORT record. We couldn't look at the logs as they were no longer available, since the customer only noticed the problem when the node would not restart 22 days later. However, it's pretty obvious in my opinion that one thread aborted the streaming session whilst the receiving thread was adding a new sstable. So this seems to have the same root cause as reported in this ticket, which is that streaming is using the txn in a thread-unsafe way. In my opinion, the problem has existed since 3.0; however, it becomes significantly more likely with the Netty streaming refactoring. Our customer was on a branch based on 3.11.

We took a very conservative approach with the fix, in that we didn't want to fully synchronize abstract transactional and the lifecycle transaction on released branches. We could consider synchronizing these classes for 4.0, however, or reworking streaming. Here are the 3.11 changes; if there is interest in this approach I can create patches for 3.0 and trunk as well:

[https://github.com/apache/cassandra/compare/cassandra-3.11...stef1927:db-2633-3.11]

We simply extracted a new interface, the [sstable tracker|https://github.com/apache/cassandra/compare/cassandra-3.11...stef1927:db-2633-3.11#diff-9d71c7ad9ad16368bd0429d3b34e2b21R15], which is also [implemented|https://github.com/apache/cassandra/compare/cassandra-3.11...stef1927:db-2633-3.11#diff-1a464da4a62ac4a734c725059cbc918bR144] by {{StreamReceiveTask}} by synchronizing the access to the txn, just like it does for all its other accesses to the txn. Whilst it's not ideal to have an additional interface, the change should be quite safe for released branches.
> LifecycleTransaction encounters ConcurrentModificationException when used in
> multi-threaded context
> ----------------------------------------------------------------------------
>
>         Key: CASSANDRA-14554
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-14554
>     Project: Cassandra
>  Issue Type: Bug
>    Reporter: Dinesh Joshi
>    Assignee: Dinesh Joshi
>    Priority: Major
>
> When LifecycleTransaction is used in a multi-threaded context, we encounter this exception -
> {quote}java.util.ConcurrentModificationException: null
> at java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
> at java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:742)
> at java.lang.Iterable.forEach(Iterable.java:74)
> at org.apache.cassandra.db.lifecycle.LogReplicaSet.maybeCreateReplica(LogReplicaSet.java:78)
> at org.apache.cassandra.db.lifecycle.LogFile.makeRecord(LogFile.java:320)
> at org.apache.cassandra.db.lifecycle.LogFile.add(LogFile.java:285)
> at org.apache.cassandra.db.lifecycle.LogTransaction.trackNew(LogTransaction.java:136)
> at org.apache.cassandra.db.lifecycle.LifecycleTransaction.trackNew(LifecycleTransaction.java:529)
> {quote}
> During streaming we create a reference to a {{LifeCycleTransaction}} and share it between threads -
> [https://github.com/apache/cassandra/blob/5cc68a87359dd02412bdb70a52dfcd718d44a5ba/src/java/org/apache/cassandra/db/streaming/CassandraStreamReader.java#L156]
> This is used in a multi-threaded context inside {{CassandraIncomingFile}}, which is an {{IncomingStreamMessage}}. This is being deserialized in parallel.
> {{LifecycleTransaction}} is not meant to be used in a multi-threaded context and this leads to streaming failures due to object sharing. On trunk, this object is shared across all threads that transfer sstables in parallel for the given {{TableId}} in a {{StreamSession}}. There are two options to solve this: make {{LifecycleTransaction}} and the associated objects thread safe, or scope the transaction to a single {{CassandraIncomingFile}}. The consequence of the latter option is that if we experience a streaming failure we may have redundant SSTables on disk. This is OK, as compaction should clean this up. A third option is to synchronize access in the streaming infrastructure.
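The conservative "extract an interface and synchronize every access" approach described in the comment above can be sketched as follows. This is a hedged illustration only: {{SSTableTracker}}, {{trackNew}}, and the backing list are stand-ins, not the actual classes in the linked branch. The point is that once every txn access goes through one synchronized monitor, a concurrent abort can no longer interleave with a concurrent add and corrupt the underlying collection.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the "narrow interface + synchronized access" fix;
// the names here are stand-ins, not the classes from the linked patch.
public class SynchronizedTrackerDemo {
    // Narrow interface fronting the transaction, analogous to the patch's
    // "sstable tracker".
    interface SSTableTracker {
        void trackNew(String sstable);
        void abort();
    }

    // StreamReceiveTask-style implementation: every txn access synchronizes
    // on the same monitor, so abort() and trackNew() cannot interleave.
    static final class SynchronizedTracker implements SSTableTracker {
        private final List<String> txn = new ArrayList<>();
        private boolean aborted = false;

        @Override
        public synchronized void trackNew(String sstable) {
            if (aborted)
                throw new IllegalStateException("transaction already aborted");
            txn.add(sstable);
        }

        @Override
        public synchronized void abort() {
            aborted = true;
            txn.clear();
        }

        synchronized int size() {
            return txn.size();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SynchronizedTracker tracker = new SynchronizedTracker();
        Thread receiver = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                try { tracker.trackNew("sstable-" + i); }
                catch (IllegalStateException e) { break; } // session was aborted
            }
        });
        receiver.start();
        tracker.abort(); // concurrent abort: no ConcurrentModificationException
        receiver.join();
        System.out.println("tracked after abort: " + tracker.size());
    }
}
```

Whichever thread wins the race, the monitor guarantees the receiving thread either adds before the abort clears the list or observes the aborted flag and stops, which is exactly the interleaving that the unsynchronized txn could not rule out.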
[jira] [Created] (CASSANDRA-14868) Installation error (安装出错)
znn created CASSANDRA-14868:
----------------------------

        Summary: Installation error (安装出错)
            Key: CASSANDRA-14868
            URL: https://issues.apache.org/jira/browse/CASSANDRA-14868
        Project: Cassandra
     Issue Type: Bug
    Environment: HDP 2.5
       Reporter: znn

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.5/services/CASSANDRA/package/scripts/cassandra_master.py", line 60, in <module>
    Cassandra_Master().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 280, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.5/services/CASSANDRA/package/scripts/cassandra_master.py", line 27, in install
    import params
  File "/var/lib/ambari-agent/cache/stacks/HDP/2.5/services/CASSANDRA/package/scripts/params.py", line 16, in <module>
    from resource_management.libraries.functions.version import format_hdp_stack_version, compare_versions
ImportError: cannot import name format_hdp_stack_version
[jira] [Commented] (CASSANDRA-14867) Histogram overflows potentially leading to writes failing
[ https://issues.apache.org/jira/browse/CASSANDRA-14867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677468#comment-16677468 ]

Chris Lohfink commented on CASSANDRA-14867:
-------------------------------------------
That dropped-messages logging runs on a scheduled executor. It will not interrupt your writes, so it is mostly cosmetic. The fact that you had over 2^32 dropped messages is something that you should be concerned about (something to move to the mailing list).

The histogram throwing runtime exceptions has been a recurring issue and is not really the right approach. Some of the methods instead return the max value, and I think that should just be the result here as well. If these methods are going to throw exceptions, they should not be runtime exceptions, as the callers rarely handle the condition.

> Histogram overflows potentially leading to writes failing
> ---------------------------------------------------------
>
>         Key: CASSANDRA-14867
>         URL: https://issues.apache.org/jira/browse/CASSANDRA-14867
>     Project: Cassandra
>  Issue Type: Bug
>  Components: Streaming and Messaging
> Environment: cassandra 3.11.1 on ubuntu 16.04
>    Reporter: David
>    Priority: Major
>
> I observed the following in cassandra logs on 1 host of a 6-node cluster:
> ERROR [ScheduledTasks:1] 2018-11-01 17:26:41,277 CassandraDaemon.java:228 - Exception in thread Thread[ScheduledTasks:1,5,main]
> java.lang.IllegalStateException: Unable to compute when histogram overflowed
> at org.apache.cassandra.metrics.DecayingEstimatedHistogramReservoir$EstimatedHistogramReservoirSnapshot.getMean(DecayingEstimatedHistogramReservoir.java:472) ~[apache-cassandra-3.11.1.jar:3.11.1]
> at org.apache.cassandra.net.MessagingService.getDroppedMessagesLogs(MessagingService.java:1263) ~[apache-cassandra-3.11.1.jar:3.11.1]
> at org.apache.cassandra.net.MessagingService.logDroppedMessages(MessagingService.java:1236) ~[apache-cassandra-3.11.1.jar:3.11.1]
> at org.apache.cassandra.net.MessagingService.access$200(MessagingService.java:87) ~[apache-cassandra-3.11.1.jar:3.11.1]
> at
org.apache.cassandra.net.MessagingService$4.run(MessagingService.java:507) ~[apache-cassandra-3.11.1.jar:3.11.1]
> at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) ~[apache-cassandra-3.11.1.jar:3.11.1]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_172]
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_172]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_172]
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_172]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_172]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_172]
> at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.1.jar:3.11.1]
> at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_172]
>
> At the same time, this node was failing all writes issued to it. Restarting cassandra on the node brought the cluster into a good state and we stopped seeing the histogram overflow errors.
> Has this issue been observed before? Could the histogram overflows cause writes to fail?
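The alternative Chris suggests (fall back to the histogram's largest trackable value instead of throwing an unchecked exception from a background logging task) can be sketched like this. The bucket layout and names below are simplified stand-ins, not the actual {{DecayingEstimatedHistogramReservoir}} internals:

```java
// Simplified sketch: a fixed-bucket histogram whose last slot counts
// overflowed samples. Instead of throwing IllegalStateException from a
// scheduled task, getMeanOrMax() degrades to the largest trackable value.
public class OverflowHistogramDemo {
    final long[] bucketOffsets;  // upper bound of each bucket
    final long[] counts;         // counts[counts.length - 1] is the overflow slot

    OverflowHistogramDemo(long[] bucketOffsets) {
        this.bucketOffsets = bucketOffsets;
        this.counts = new long[bucketOffsets.length + 1];
    }

    void update(long value) {
        for (int i = 0; i < bucketOffsets.length; i++) {
            if (value <= bucketOffsets[i]) {
                counts[i]++;
                return;
            }
        }
        counts[counts.length - 1]++; // value beyond the largest bucket
    }

    boolean isOverflowed() {
        return counts[counts.length - 1] > 0;
    }

    // Returns the (bucket-upper-bound weighted) mean, or the max trackable
    // value once overflowed, rather than propagating a runtime exception.
    long getMeanOrMax() {
        if (isOverflowed())
            return bucketOffsets[bucketOffsets.length - 1];
        long total = 0, weighted = 0;
        for (int i = 0; i < bucketOffsets.length; i++) {
            total += counts[i];
            weighted += counts[i] * bucketOffsets[i];
        }
        return total == 0 ? 0 : weighted / total;
    }

    public static void main(String[] args) {
        OverflowHistogramDemo h = new OverflowHistogramDemo(new long[] { 10, 100, 1000 });
        h.update(5);
        h.update(50);
        h.update(1_000_000); // overflows the largest bucket
        System.out.println(h.getMeanOrMax()); // prints 1000 instead of throwing
    }
}
```

A clamped-but-valid metric keeps the scheduled logging task alive, whereas the current behavior turns a cosmetic statistics problem into a repeated background exception.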
[4/5] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/af600c79
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/af600c79
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/af600c79

Branch: refs/heads/cassandra-3.11
Commit: af600c7931e7e1e7d8cac960b3fc506d18c26243
Parents: 7eecf89 7bf6171
Author: Blake Eggleston
Authored: Tue Nov 6 15:56:25 2018 -0800
Committer: Blake Eggleston
Committed: Tue Nov 6 15:56:40 2018 -0800

 .../io/sstable/format/big/BigFormat.java      |  2 +-
 .../org/apache/cassandra/db/KeyspaceTest.java | 34 --
 .../apache/cassandra/db/filter/SliceTest.java | 42 -
 .../io/sstable/SSTableMetadataTest.java       | 49
 4 files changed, 1 insertion(+), 126 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/af600c79/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
http://git-wip-us.apache.org/repos/asf/cassandra/blob/af600c79/test/unit/org/apache/cassandra/db/KeyspaceTest.java
http://git-wip-us.apache.org/repos/asf/cassandra/blob/af600c79/test/unit/org/apache/cassandra/db/filter/SliceTest.java
[3/5] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/af600c79
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/af600c79
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/af600c79

Branch: refs/heads/trunk
Commit: af600c7931e7e1e7d8cac960b3fc506d18c26243
Parents: 7eecf89 7bf6171
Author: Blake Eggleston
Authored: Tue Nov 6 15:56:25 2018 -0800
Committer: Blake Eggleston
Committed: Tue Nov 6 15:56:40 2018 -0800

 .../io/sstable/format/big/BigFormat.java      |  2 +-
 .../org/apache/cassandra/db/KeyspaceTest.java | 34 --
 .../apache/cassandra/db/filter/SliceTest.java | 42 -
 .../io/sstable/SSTableMetadataTest.java       | 49
 4 files changed, 1 insertion(+), 126 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/af600c79/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
http://git-wip-us.apache.org/repos/asf/cassandra/blob/af600c79/test/unit/org/apache/cassandra/db/KeyspaceTest.java
http://git-wip-us.apache.org/repos/asf/cassandra/blob/af600c79/test/unit/org/apache/cassandra/db/filter/SliceTest.java
[2/5] cassandra git commit: ninja: fix out of date tests
ninja: fix out of date tests

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7bf61716
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7bf61716
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7bf61716

Branch: refs/heads/trunk
Commit: 7bf617165e910f08db81917742ab8036215ab300
Parents: e04efab
Author: Blake Eggleston
Authored: Tue Nov 6 14:18:09 2018 -0800
Committer: Blake Eggleston
Committed: Tue Nov 6 15:43:58 2018 -0800

 .../io/sstable/format/big/BigFormat.java      |  2 +-
 .../org/apache/cassandra/db/KeyspaceTest.java | 34 --
 .../apache/cassandra/db/filter/SliceTest.java | 42 -
 .../io/sstable/SSTableMetadataTest.java       | 49
 4 files changed, 1 insertion(+), 126 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7bf61716/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java

diff --git a/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java b/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
index d4549dd..ae93c5f 100644
--- a/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
+++ b/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
@@ -111,7 +111,7 @@ public class BigFormat implements SSTableFormat
     // we always incremented the major version.
     static class BigVersion extends Version
     {
-        public static final String current_version = "mc";
+        public static final String current_version = "md";
         public static final String earliest_supported_version = "jb";
         // jb (2.0.1): switch from crc32 to adler32 for compression checksums

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7bf61716/test/unit/org/apache/cassandra/db/KeyspaceTest.java

diff --git a/test/unit/org/apache/cassandra/db/KeyspaceTest.java b/test/unit/org/apache/cassandra/db/KeyspaceTest.java
index d864fa3..dd11c1c 100644
--- a/test/unit/org/apache/cassandra/db/KeyspaceTest.java
+++ b/test/unit/org/apache/cassandra/db/KeyspaceTest.java
@@ -433,40 +433,6 @@ public class KeyspaceTest extends CQLTester
         assertRowsInResult(cfs, command, expectedValues);
     }

-    @Test
-    public void testLimitSSTablesComposites() throws Throwable
-    {
-        // creates 10 sstables, composite columns like this:
-        // k |a0:0|a1:1|..|a9:9
-        // k |a0:10|a1:11|..|a9:19
-        // ...
-        // k |a0:90|a1:91|..|a9:99
-        // then we slice out col1 = a5 and col2 > 85 -> which should let us just check 2 sstables and get 2 columns
-        String tableName = createTable("CREATE TABLE %s (a text, b text, c int, d int, PRIMARY KEY (a, b, c))");
-        final ColumnFamilyStore cfs = Keyspace.open(KEYSPACE).getColumnFamilyStore(tableName);
-        cfs.disableAutoCompaction();
-
-        for (int j = 0; j < 10; j++)
-        {
-            for (int i = 0; i < 10; i++)
-                execute("INSERT INTO %s (a, b, c, d) VALUES (?, ?, ?, ?)", "0", "a" + i, j * 10 + i, 0);
-
-            cfs.forceBlockingFlush();
-        }
-
-        ((ClearableHistogram) cfs.metric.sstablesPerReadHistogram.cf).clear();
-        assertRows(execute("SELECT * FROM %s WHERE a = ? AND (b, c) >= (?, ?) AND (b) <= (?) LIMIT 1000", "0", "a5", 85, "a5"),
-                   row("0", "a5", 85, 0),
-                   row("0", "a5", 95, 0));
-        assertEquals(2, cfs.metric.sstablesPerReadHistogram.cf.getSnapshot().getMax(), 0.1);
-    }
-
     private void validateSliceLarge(ColumnFamilyStore cfs)
     {
         ClusteringIndexSliceFilter filter = slices(cfs, 1000, null, false);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7bf61716/test/unit/org/apache/cassandra/db/filter/SliceTest.java

diff --git a/test/unit/org/apache/cassandra/db/filter/SliceTest.java b/test/unit/org/apache/cassandra/db/filter/SliceTest.java
index 2f07a24..606395c 100644
--- a/test/unit/org/apache/cassandra/db/filter/SliceTest.java
+++ b/test/unit/org/apache/cassandra/db/filter/SliceTest.java
@@ -228,48 +228,6 @@ public class SliceTest
         slice = Slice.make(makeBound(sk, 0), makeBound(ek, 2, 0, 0));
         assertTrue(slice.intersects(cc, columnNames(1, 0, 0), columnNames(2, 0, 0)));

-        // the slice technically falls within the sstable range, but since the first component is restricted to
-        // a
[1/5] cassandra git commit: ninja: fix out of date tests
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.11 7eecf891f -> af600c793
  refs/heads/trunk 3ebeef6d2 -> 0ad056432

ninja: fix out of date tests

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7bf61716
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7bf61716
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7bf61716

Branch: refs/heads/cassandra-3.11
Commit: 7bf617165e910f08db81917742ab8036215ab300
Parents: e04efab
Author: Blake Eggleston
Authored: Tue Nov 6 14:18:09 2018 -0800
Committer: Blake Eggleston
Committed: Tue Nov 6 15:43:58 2018 -0800

 .../io/sstable/format/big/BigFormat.java      |  2 +-
 .../org/apache/cassandra/db/KeyspaceTest.java | 34 --
 .../apache/cassandra/db/filter/SliceTest.java | 42 -
 .../io/sstable/SSTableMetadataTest.java       | 49
 4 files changed, 1 insertion(+), 126 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7bf61716/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java

diff --git a/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java b/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
index d4549dd..ae93c5f 100644
--- a/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
+++ b/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
@@ -111,7 +111,7 @@ public class BigFormat implements SSTableFormat
     // we always incremented the major version.
     static class BigVersion extends Version
     {
-        public static final String current_version = "mc";
+        public static final String current_version = "md";
         public static final String earliest_supported_version = "jb";
         // jb (2.0.1): switch from crc32 to adler32 for compression checksums

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7bf61716/test/unit/org/apache/cassandra/db/KeyspaceTest.java

diff --git a/test/unit/org/apache/cassandra/db/KeyspaceTest.java b/test/unit/org/apache/cassandra/db/KeyspaceTest.java
index d864fa3..dd11c1c 100644
--- a/test/unit/org/apache/cassandra/db/KeyspaceTest.java
+++ b/test/unit/org/apache/cassandra/db/KeyspaceTest.java
@@ -433,40 +433,6 @@ public class KeyspaceTest extends CQLTester
         assertRowsInResult(cfs, command, expectedValues);
     }

-    @Test
-    public void testLimitSSTablesComposites() throws Throwable
-    {
-        // creates 10 sstables, composite columns like this:
-        // k |a0:0|a1:1|..|a9:9
-        // k |a0:10|a1:11|..|a9:19
-        // ...
-        // k |a0:90|a1:91|..|a9:99
-        // then we slice out col1 = a5 and col2 > 85 -> which should let us just check 2 sstables and get 2 columns
-        String tableName = createTable("CREATE TABLE %s (a text, b text, c int, d int, PRIMARY KEY (a, b, c))");
-        final ColumnFamilyStore cfs = Keyspace.open(KEYSPACE).getColumnFamilyStore(tableName);
-        cfs.disableAutoCompaction();
-
-        for (int j = 0; j < 10; j++)
-        {
-            for (int i = 0; i < 10; i++)
-                execute("INSERT INTO %s (a, b, c, d) VALUES (?, ?, ?, ?)", "0", "a" + i, j * 10 + i, 0);
-
-            cfs.forceBlockingFlush();
-        }
-
-        ((ClearableHistogram) cfs.metric.sstablesPerReadHistogram.cf).clear();
-        assertRows(execute("SELECT * FROM %s WHERE a = ? AND (b, c) >= (?, ?) AND (b) <= (?) LIMIT 1000", "0", "a5", 85, "a5"),
-                   row("0", "a5", 85, 0),
-                   row("0", "a5", 95, 0));
-        assertEquals(2, cfs.metric.sstablesPerReadHistogram.cf.getSnapshot().getMax(), 0.1);
-    }
-
     private void validateSliceLarge(ColumnFamilyStore cfs)
     {
         ClusteringIndexSliceFilter filter = slices(cfs, 1000, null, false);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7bf61716/test/unit/org/apache/cassandra/db/filter/SliceTest.java

diff --git a/test/unit/org/apache/cassandra/db/filter/SliceTest.java b/test/unit/org/apache/cassandra/db/filter/SliceTest.java
index 2f07a24..606395c 100644
--- a/test/unit/org/apache/cassandra/db/filter/SliceTest.java
+++ b/test/unit/org/apache/cassandra/db/filter/SliceTest.java
@@ -228,48 +228,6 @@ public class SliceTest
         slice = Slice.make(makeBound(sk, 0), makeBound(ek, 2, 0, 0));
         assertTrue(slice.intersects(cc, columnNames(1, 0, 0), columnNames(2,
[5/5] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/0ad05643
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/0ad05643
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/0ad05643

Branch: refs/heads/trunk
Commit: 0ad056432f357800c717b4474d15c15462c079c4
Parents: 3ebeef6 af600c7
Author: Blake Eggleston
Authored: Tue Nov 6 16:09:12 2018 -0800
Committer: Blake Eggleston
Committed: Tue Nov 6 16:09:12 2018 -0800
[2/3] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/4b2692fc
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/4b2692fc
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/4b2692fc

Branch: refs/heads/trunk
Commit: 4b2692fc978764d93209db27c13b0fbdb5896034
Parents: a6a9dce e04efab
Author: Blake Eggleston
Authored: Tue Nov 6 11:59:49 2018 -0800
Committer: Blake Eggleston
Committed: Tue Nov 6 15:48:06 2018 -0800

 .../io/sstable/format/big/BigFormat.java       |  2 +-
 .../io/sstable/metadata/MetadataCollector.java | 18 ---
 .../org/apache/cassandra/db/KeyspaceTest.java  | 34 --
 .../apache/cassandra/db/filter/SliceTest.java  | 42 -
 .../io/sstable/SSTableMetadataTest.java        | 49
 5 files changed, 12 insertions(+), 133 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4b2692fc/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java

diff --cc src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
index ea0214b,d4549dd..b62cb11
--- a/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
+++ b/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
@@@ -110,7 -111,7 +110,7 @@@ public class BigFormat implements SSTab
      // we always incremented the major version.
      static class BigVersion extends Version
      {
--        public static final String current_version = "mc";
++        public static final String current_version = "md";
          public static final String earliest_supported_version = "jb";
          // jb (2.0.1): switch from crc32 to adler32 for compression checksums

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4b2692fc/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java

diff --cc src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
index a618c96,437d80f..0ac5187
--- a/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
+++ b/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
@@@ -273,8 -272,11 +274,11 @@@ public class MetadataCollector implemen
      public Map finalizeMetadata(String partitioner, double bloomFilterFPChance, long repairedAt, SerializationHeader header)
      {
-         Preconditions.checkState(comparator.compare(maxClustering, minClustering) >= 0);
+         Preconditions.checkState((minClustering == null && maxClustering == null)
+                                  || comparator.compare(maxClustering, minClustering) >= 0);
+         ByteBuffer[] minValues = minClustering != null ? minClustering.getRawValues() : EMPTY_CLUSTERING;
+         ByteBuffer[] maxValues = maxClustering != null ? maxClustering.getRawValues() : EMPTY_CLUSTERING;
 -        Map components = Maps.newHashMap();
 +        Map components = new EnumMap<>(MetadataType.class);
          components.put(MetadataType.VALIDATION, new ValidationMetadata(partitioner, bloomFilterFPChance));
          components.put(MetadataType.STATS, new StatsMetadata(estimatedPartitionSize, estimatedCellPerPartitionCount,

http://git-wip-us.apache.org/repos/asf/cassandra/blob/4b2692fc/test/unit/org/apache/cassandra/db/KeyspaceTest.java

diff --cc test/unit/org/apache/cassandra/db/KeyspaceTest.java
index f2a9984,d864fa3..3c3b04b
--- a/test/unit/org/apache/cassandra/db/KeyspaceTest.java
+++ b/test/unit/org/apache/cassandra/db/KeyspaceTest.java
@@@ -452,40 -433,40 +452,6 @@@ public class KeyspaceTest extends CQLTe
          assertRowsInResult(cfs, command, expectedValues);
      }

--    @Test
--    public void testLimitSSTablesComposites() throws Throwable
--    {
--        // creates 10 sstables, composite columns like this:
--        // k |a0:0|a1:1|..|a9:9
--        // k |a0:10|a1:11|..|a9:19
--        // ...
--        // k |a0:90|a1:91|..|a9:99
--        // then we slice out col1 = a5 and col2 > 85 -> which should let us just check 2 sstables and get 2 columns
-         createTable("CREATE TABLE %s (a text, b text, c int, d int, PRIMARY KEY (a, b, c))");
-         final ColumnFamilyStore cfs = getCurrentColumnFamilyStore();
 -        String tableName = createTable("CREATE TABLE %s (a text, b text, c int, d int, PRIMARY KEY
[1/3] cassandra git commit: ninja: fix out of date tests
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 e04efab3f -> 7bf617165
  refs/heads/trunk 51c8387de -> 3ebeef6d2

ninja: fix out of date tests

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7bf61716
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7bf61716
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7bf61716

Branch: refs/heads/cassandra-3.0
Commit: 7bf617165e910f08db81917742ab8036215ab300
Parents: e04efab
Author: Blake Eggleston
Authored: Tue Nov 6 14:18:09 2018 -0800
Committer: Blake Eggleston
Committed: Tue Nov 6 15:43:58 2018 -0800

 .../io/sstable/format/big/BigFormat.java      |  2 +-
 .../org/apache/cassandra/db/KeyspaceTest.java | 34 --
 .../apache/cassandra/db/filter/SliceTest.java | 42 -
 .../io/sstable/SSTableMetadataTest.java       | 49
 4 files changed, 1 insertion(+), 126 deletions(-)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7bf61716/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java

diff --git a/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java b/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
index d4549dd..ae93c5f 100644
--- a/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
+++ b/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
@@ -111,7 +111,7 @@ public class BigFormat implements SSTableFormat
     // we always incremented the major version.
     static class BigVersion extends Version
     {
-        public static final String current_version = "mc";
+        public static final String current_version = "md";
         public static final String earliest_supported_version = "jb";
         // jb (2.0.1): switch from crc32 to adler32 for compression checksums

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7bf61716/test/unit/org/apache/cassandra/db/KeyspaceTest.java

diff --git a/test/unit/org/apache/cassandra/db/KeyspaceTest.java b/test/unit/org/apache/cassandra/db/KeyspaceTest.java
index d864fa3..dd11c1c 100644
--- a/test/unit/org/apache/cassandra/db/KeyspaceTest.java
+++ b/test/unit/org/apache/cassandra/db/KeyspaceTest.java
@@ -433,40 +433,6 @@ public class KeyspaceTest extends CQLTester
         assertRowsInResult(cfs, command, expectedValues);
     }

-    @Test
-    public void testLimitSSTablesComposites() throws Throwable
-    {
-        // creates 10 sstables, composite columns like this:
-        // k |a0:0|a1:1|..|a9:9
-        // k |a0:10|a1:11|..|a9:19
-        // ...
-        // k |a0:90|a1:91|..|a9:99
-        // then we slice out col1 = a5 and col2 > 85 -> which should let us just check 2 sstables and get 2 columns
-        String tableName = createTable("CREATE TABLE %s (a text, b text, c int, d int, PRIMARY KEY (a, b, c))");
-        final ColumnFamilyStore cfs = Keyspace.open(KEYSPACE).getColumnFamilyStore(tableName);
-        cfs.disableAutoCompaction();
-
-        for (int j = 0; j < 10; j++)
-        {
-            for (int i = 0; i < 10; i++)
-                execute("INSERT INTO %s (a, b, c, d) VALUES (?, ?, ?, ?)", "0", "a" + i, j * 10 + i, 0);
-
-            cfs.forceBlockingFlush();
-        }
-
-        ((ClearableHistogram) cfs.metric.sstablesPerReadHistogram.cf).clear();
-        assertRows(execute("SELECT * FROM %s WHERE a = ? AND (b, c) >= (?, ?) AND (b) <= (?) LIMIT 1000", "0", "a5", 85, "a5"),
-                   row("0", "a5", 85, 0),
-                   row("0", "a5", 95, 0));
-        assertEquals(2, cfs.metric.sstablesPerReadHistogram.cf.getSnapshot().getMax(), 0.1);
-    }
-
     private void validateSliceLarge(ColumnFamilyStore cfs)
     {
         ClusteringIndexSliceFilter filter = slices(cfs, 1000, null, false);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7bf61716/test/unit/org/apache/cassandra/db/filter/SliceTest.java

diff --git a/test/unit/org/apache/cassandra/db/filter/SliceTest.java b/test/unit/org/apache/cassandra/db/filter/SliceTest.java
index 2f07a24..606395c 100644
--- a/test/unit/org/apache/cassandra/db/filter/SliceTest.java
+++ b/test/unit/org/apache/cassandra/db/filter/SliceTest.java
@@ -228,48 +228,6 @@ public class SliceTest
         slice = Slice.make(makeBound(sk, 0), makeBound(ek, 2, 0, 0));
         assertTrue(slice.intersects(cc, columnNames(1, 0, 0), columnNames(2,
[3/3] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/3ebeef6d Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/3ebeef6d Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/3ebeef6d Branch: refs/heads/trunk Commit: 3ebeef6d21d5feeb5305335b8ec9a7a3b3ef6311 Parents: 51c8387 4b2692f Author: Blake Eggleston Authored: Tue Nov 6 15:50:05 2018 -0800 Committer: Blake Eggleston Committed: Tue Nov 6 15:50:33 2018 -0800 -- .../org/apache/cassandra/db/KeyspaceTest.java | 34 -- .../apache/cassandra/db/filter/SliceTest.java | 42 - .../cassandra/io/sstable/LegacySSTableTest.java | 2 +- .../io/sstable/SSTableMetadataTest.java | 49 4 files changed, 1 insertion(+), 126 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/3ebeef6d/test/unit/org/apache/cassandra/db/KeyspaceTest.java -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/3ebeef6d/test/unit/org/apache/cassandra/io/sstable/LegacySSTableTest.java -- diff --cc test/unit/org/apache/cassandra/io/sstable/LegacySSTableTest.java index 7c98e7e,bd51c0f..de5ac52 --- a/test/unit/org/apache/cassandra/io/sstable/LegacySSTableTest.java +++ b/test/unit/org/apache/cassandra/io/sstable/LegacySSTableTest.java @@@ -265,7 -293,7 +265,7 @@@ public class LegacySSTableTes public void testInaccurateSSTableMinMax() throws Exception { QueryProcessor.executeInternal("CREATE TABLE legacy_tables.legacy_mc_inaccurate_min_max (k int, c1 int, c2 int, c3 int, v int, primary key (k, c1, c2, c3))"); - loadLegacyTable("legacy_%s_inaccurate_min_max%s", "mc"); -loadLegacyTable("legacy_%s_inaccurate_min_max%s", "mc", ""); ++loadLegacyTable("legacy_%s_inaccurate_min_max", "mc"); /* sstable has the following mutations: http://git-wip-us.apache.org/repos/asf/cassandra/blob/3ebeef6d/test/unit/org/apache/cassandra/io/sstable/SSTableMetadataTest.java -- - To unsubscribe, e-mail: 
commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14765) Evaluate Recovery Time on Single Token Cluster Test
[ https://issues.apache.org/jira/browse/CASSANDRA-14765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677320#comment-16677320 ] Joseph Lynch edited comment on CASSANDRA-14765 at 11/6/18 9:36 PM: --- Some initial impressions: !3.0.17-4.0.x-Streaming.png! things are looking very good. was (Author: jolynch): Some initial impressions: !image-2018-11-06-13-34-33-108.png! things are looking very good. > Evaluate Recovery Time on Single Token Cluster Test > --- > > Key: CASSANDRA-14765 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14765 > Project: Cassandra > Issue Type: Sub-task >Reporter: Joseph Lynch >Assignee: Sumanth Pasupuleti >Priority: Major > Attachments: 3.0.17-4.0.x-Streaming.png, > image-2018-11-06-13-34-33-108.png > > > *Setup:* > * Cassandra: 6 (2*3 rack) node i3.8xlarge AWS instances (32 cpu cores, 240GB > ram) running cassandra trunk with Jason's 14503 changes vs the same footprint > running 3.0.17 > * One datacenter, single tokens > * No compression, encryption, or coalescing turned on > *Test #1:* > ndbench loaded ~150GB of data per node into an LCS table. Then we killed a > node and let a new node stream. With a single token this should be a worst > case recovery scenario (only a few peers to stream from). > *Result:* > As the table used LCS and we did not have encryption on, the zero copy > transfer was used via CASSANDRA-14556. We recovered *150GB in 5 minutes,* > going at a consistent rate of about 3 gigabit per second. Theoretically we > should be able to get 10 gigabit, but this is still something like an > estimated 16x improvement over 3.0.x. We're still running the 3.0.x test for > a hard comparison. > *Follow Ups:* > We need to get more rigorous measurements (over more terminations), as well > as finishing the 3.0.x test. [~sumanth.pasupuleti] and [~djoshi3] are driving > this. 
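(Editorial note: the throughput figure quoted above can be sanity-checked with simple arithmetic. The sketch below — not from the ticket — assumes decimal units, 1 GB = 8 Gb, and ignores protocol overhead; it gives an average of 4 Gbit/s, in the same ballpark as the reported ~3 Gbit/s steady rate, with the difference plausibly down to rounding of the "5 minutes".)

```java
// Sanity check of the quoted streaming figures: 150 GB transferred in
// 5 minutes, expressed as an average rate in gigabits per second.
// Assumes decimal units (1 GB = 8 Gb) and ignores protocol overhead.
public class StreamRateCheck {
    public static void main(String[] args) {
        double gigabytesStreamed = 150.0;
        double elapsedSeconds = 5 * 60;          // "150GB in 5 minutes"
        double gigabits = gigabytesStreamed * 8; // GB -> Gb
        double avgRateGbps = gigabits / elapsedSeconds;
        System.out.println(avgRateGbps);         // prints 4.0
    }
}
```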
[jira] [Updated] (CASSANDRA-14765) Evaluate Recovery Time on Single Token Cluster Test
[ https://issues.apache.org/jira/browse/CASSANDRA-14765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-14765: - Attachment: 3.0.17-4.0.x-Streaming.png > Evaluate Recovery Time on Single Token Cluster Test > --- > > Key: CASSANDRA-14765 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14765 > Project: Cassandra > Issue Type: Sub-task >Reporter: Joseph Lynch >Assignee: Sumanth Pasupuleti >Priority: Major > Attachments: 3.0.17-4.0.x-Streaming.png, > image-2018-11-06-13-34-33-108.png > > > *Setup:* > * Cassandra: 6 (2*3 rack) node i3.8xlarge AWS instances (32 cpu cores, 240GB > ram) running cassandra trunk with Jason's 14503 changes vs the same footprint > running 3.0.17 > * One datacenter, single tokens > * No compression, encryption, or coalescing turned on > *Test #1:* > ndbench loaded ~150GB of data per node into an LCS table. Then we killed a > node and let a new node stream. With a single token this should be a worst > case recovery scenario (only a few peers to stream from). > *Result:* > As the table used LCS and we did not have encryption on, the zero copy > transfer was used via CASSANDRA-14556. We recovered *150GB in 5 minutes,* > going at a consistent rate of about 3 gigabit per second. Theoretically we > should be able to get 10 gigabit, but this is still something like an > estimated 16x improvement over 3.0.x. We're still running the 3.0.x test for > a hard comparison. > *Follow Ups:* > We need to get more rigorous measurements (over more terminations), as well > as finishing the 3.0.x test. [~sumanth.pasupuleti] and [~djoshi3] are driving > this. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14765) Evaluate Recovery Time on Single Token Cluster Test
[ https://issues.apache.org/jira/browse/CASSANDRA-14765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677320#comment-16677320 ] Joseph Lynch edited comment on CASSANDRA-14765 at 11/6/18 9:35 PM: --- Some initial impressions: !image-2018-11-06-13-34-33-108.png! things are looking very good. was (Author: jolynch): Some initial impressions: !image-2018-11-06-13-34-33-108.png! things are looking very good. > Evaluate Recovery Time on Single Token Cluster Test > --- > > Key: CASSANDRA-14765 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14765 > Project: Cassandra > Issue Type: Sub-task >Reporter: Joseph Lynch >Assignee: Sumanth Pasupuleti >Priority: Major > Attachments: image-2018-11-06-13-34-33-108.png > > > *Setup:* > * Cassandra: 6 (2*3 rack) node i3.8xlarge AWS instances (32 cpu cores, 240GB > ram) running cassandra trunk with Jason's 14503 changes vs the same footprint > running 3.0.17 > * One datacenter, single tokens > * No compression, encryption, or coalescing turned on > *Test #1:* > ndbench loaded ~150GB of data per node into an LCS table. Then we killed a > node and let a new node stream. With a single token this should be a worst > case recovery scenario (only a few peers to stream from). > *Result:* > As the table used LCS and we did not have encryption on, the zero copy > transfer was used via CASSANDRA-14556. We recovered *150GB in 5 minutes,* > going at a consistent rate of about 3 gigabit per second. Theoretically we > should be able to get 10 gigabit, but this is still something like an > estimated 16x improvement over 3.0.x. We're still running the 3.0.x test for > a hard comparison. > *Follow Ups:* > We need to get more rigorous measurements (over more terminations), as well > as finishing the 3.0.x test. [~sumanth.pasupuleti] and [~djoshi3] are driving > this. 
[jira] [Commented] (CASSANDRA-14765) Evaluate Recovery Time on Single Token Cluster Test
[ https://issues.apache.org/jira/browse/CASSANDRA-14765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677320#comment-16677320 ] Joseph Lynch commented on CASSANDRA-14765: -- Some initial impressions: !image-2018-11-06-13-34-33-108.png! things are looking very good. > Evaluate Recovery Time on Single Token Cluster Test > --- > > Key: CASSANDRA-14765 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14765 > Project: Cassandra > Issue Type: Sub-task >Reporter: Joseph Lynch >Assignee: Sumanth Pasupuleti >Priority: Major > Attachments: image-2018-11-06-13-34-33-108.png > > > *Setup:* > * Cassandra: 6 (2*3 rack) node i3.8xlarge AWS instances (32 cpu cores, 240GB > ram) running cassandra trunk with Jason's 14503 changes vs the same footprint > running 3.0.17 > * One datacenter, single tokens > * No compression, encryption, or coalescing turned on > *Test #1:* > ndbench loaded ~150GB of data per node into an LCS table. Then we killed a > node and let a new node stream. With a single token this should be a worst > case recovery scenario (only a few peers to stream from). > *Result:* > As the table used LCS and we did not have encryption on, the zero copy > transfer was used via CASSANDRA-14556. We recovered *150GB in 5 minutes,* > going at a consistent rate of about 3 gigabit per second. Theoretically we > should be able to get 10 gigabit, but this is still something like an > estimated 16x improvement over 3.0.x. We're still running the 3.0.x test for > a hard comparison. > *Follow Ups:* > We need to get more rigorous measurements (over more terminations), as well > as finishing the 3.0.x test. [~sumanth.pasupuleti] and [~djoshi3] are driving > this. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14765) Evaluate Recovery Time on Single Token Cluster Test
[ https://issues.apache.org/jira/browse/CASSANDRA-14765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph Lynch updated CASSANDRA-14765: - Attachment: image-2018-11-06-13-34-33-108.png > Evaluate Recovery Time on Single Token Cluster Test > --- > > Key: CASSANDRA-14765 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14765 > Project: Cassandra > Issue Type: Sub-task >Reporter: Joseph Lynch >Assignee: Sumanth Pasupuleti >Priority: Major > Attachments: image-2018-11-06-13-34-33-108.png > > > *Setup:* > * Cassandra: 6 (2*3 rack) node i3.8xlarge AWS instances (32 cpu cores, 240GB > ram) running cassandra trunk with Jason's 14503 changes vs the same footprint > running 3.0.17 > * One datacenter, single tokens > * No compression, encryption, or coalescing turned on > *Test #1:* > ndbench loaded ~150GB of data per node into an LCS table. Then we killed a > node and let a new node stream. With a single token this should be a worst > case recovery scenario (only a few peers to stream from). > *Result:* > As the table used LCS and we did not have encryption on, the zero copy > transfer was used via CASSANDRA-14556. We recovered *150GB in 5 minutes,* > going at a consistent rate of about 3 gigabit per second. Theoretically we > should be able to get 10 gigabit, but this is still something like an > estimated 16x improvement over 3.0.x. We're still running the 3.0.x test for > a hard comparison. > *Follow Ups:* > We need to get more rigorous measurements (over more terminations), as well > as finishing the 3.0.x test. [~sumanth.pasupuleti] and [~djoshi3] are driving > this. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14747) Evaluate 200 node, compression=none, encryption=none, coalescing=off
[ https://issues.apache.org/jira/browse/CASSANDRA-14747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinay Chella updated CASSANDRA-14747: - Attachment: trunk_14503_v2_cpuflamegraph.svg > Evaluate 200 node, compression=none, encryption=none, coalescing=off > - > > Key: CASSANDRA-14747 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14747 > Project: Cassandra > Issue Type: Sub-task >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Major > Attachments: 3.0.17-QPS.png, 4.0.1-QPS.png, > 4.0.11-after-jolynch-tweaks.svg, 4.0.12-after-unconditional-flush.svg, > 4.0.15-after-sndbuf-fix.svg, 4.0.7-before-my-changes.svg, > 4.0_errors_showing_heap_pressure.txt, > 4.0_heap_histogram_showing_many_MessageOuts.txt, > i-0ed2acd2dfacab7c1-after-looping-fixes.svg, > trunk_14503_v2_cpuflamegraph.svg, trunk_vs_3.0.17_latency_under_load.png, > ttop_NettyOutbound-Thread_spinning.txt, > useast1c-i-0e1ddfe8b2f769060-mutation-flame.svg, > useast1e-i-08635fa1631601538_flamegraph_96node.svg, > useast1e-i-08635fa1631601538_ttop_netty_outbound_threads_96nodes, > useast1e-i-08635fa1631601538_uninlinedcpuflamegraph.0_96node_60sec_profile.svg > > > Tracks evaluating a 200 node cluster with all internode settings off (no > compression, no encryption, no coalescing). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14747) Evaluate 200 node, compression=none, encryption=none, coalescing=off
[ https://issues.apache.org/jira/browse/CASSANDRA-14747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677293#comment-16677293 ] Vinay Chella commented on CASSANDRA-14747: -- Thank you [~jasobrown] for the patch on CASSANDRA-14503. [~jolynch] and I benchmarked Jason's 14503-v2 branch; our benchmark results show [trunk-Jason's branch|https://github.com/jasobrown/cassandra/tree/14503-v2] significantly out-performing 3.0.17 in terms of mean, 99th, and 95th percentile latency during a pure write benchmark. When systems are under heavy load, we have seen coordinator mean latencies that are ~14x better, 99th latencies ~4x better, and 95th latencies ~6x better on trunk. When both trunk and 3.0.17 had 67k write QPS applied, throughput was steady on trunk while 3.0.17 fell over. Note that we have only tested writes in this benchmark. However, trunk is accumulating more hints and dropping more messages than 3.0.17; these issues have yet to be investigated. For a detailed analysis of this benchmarking, see the attached document [Cassandra 4.0 testing with CASSANDRA-14503 fixes] > Evaluate 200 node, compression=none, encryption=none, coalescing=off > - > > Key: CASSANDRA-14747 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14747 > Project: Cassandra > Issue Type: Sub-task >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Major > Attachments: 3.0.17-QPS.png, 4.0.1-QPS.png, > 4.0.11-after-jolynch-tweaks.svg, 4.0.12-after-unconditional-flush.svg, > 4.0.15-after-sndbuf-fix.svg, 4.0.7-before-my-changes.svg, > 4.0_errors_showing_heap_pressure.txt, > 4.0_heap_histogram_showing_many_MessageOuts.txt, > i-0ed2acd2dfacab7c1-after-looping-fixes.svg, > trunk_14503_v2_cpuflamegraph.svg, trunk_vs_3.0.17_latency_under_load.png, > ttop_NettyOutbound-Thread_spinning.txt, > useast1c-i-0e1ddfe8b2f769060-mutation-flame.svg, > useast1e-i-08635fa1631601538_flamegraph_96node.svg, > useast1e-i-08635fa1631601538_ttop_netty_outbound_threads_96nodes, 
> useast1e-i-08635fa1631601538_uninlinedcpuflamegraph.0_96node_60sec_profile.svg > > > Tracks evaluating a 200 node cluster with all internode settings off (no > compression, no encryption, no coalescing). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14503) Internode connection management is race-prone
[ https://issues.apache.org/jira/browse/CASSANDRA-14503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677289#comment-16677289 ] Vinay Chella commented on CASSANDRA-14503: -- Thank you [~jasobrown] for the patch. [~jolynch] and I benchmarked Jason's 14503-v2 branch; our benchmark results show [trunk-Jason's branch|https://github.com/jasobrown/cassandra/tree/14503-v2] significantly out-performing 3.0.17 in terms of mean, 99th, and 95th percentile latency during a pure write benchmark. When systems are under heavy load, we have seen coordinator mean latencies that are ~14x better, 99th latencies ~4x better, and 95th latencies ~6x better on trunk. When both trunk and 3.0.17 had 67k write QPS applied, throughput was steady on trunk while 3.0.17 fell over. Note that we have only tested writes in this benchmark. However, trunk is accumulating more hints and dropping more messages than 3.0.17; these issues have yet to be investigated. For a detailed analysis of this benchmarking, see the attached document [Cassandra 4.0 testing with CASSANDRA-14503 fixes] > Internode connection management is race-prone > - > > Key: CASSANDRA-14503 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14503 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging >Reporter: Sergio Bossa >Assignee: Jason Brown >Priority: Major > Labels: pull-request-available > Fix For: 4.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Following CASSANDRA-8457, internode connection management has been rewritten > to rely on Netty, but the new implementation in > {{OutboundMessagingConnection}} seems quite race prone to me, in particular > in these two cases: > * {{#finishHandshake()}} racing with {{#close()}}: i.e. in such a case the > former could run into an NPE if the latter nulls the {{channelWriter}} (but > this is just an example, other conflicts might happen). 
> * Connection timeout and retry racing with state changing methods: > {{connectionRetryFuture}} and {{connectionTimeoutFuture}} are cancelled when > handshaking or closing, but there's no guarantee those will actually be > cancelled (as they might already be running), so they might end up changing > the connection state concurrently with other methods (i.e. by unexpectedly > closing the channel or clearing the backlog). > Overall, the thread safety of {{OutboundMessagingConnection}} is very > difficult to assess given the current implementation: I would suggest > refactoring it into a single-threaded model, where all connection state changing > actions are enqueued on a single threaded scheduler, so that state > transitions can be clearly defined and checked. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14841) Don't write to system_distributed.repair_history, system_traces.sessions, system_traces.events in mixed version 3.X/4.0 clusters
[ https://issues.apache.org/jira/browse/CASSANDRA-14841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677266#comment-16677266 ] Tommy Stendahl commented on CASSANDRA-14841: We don't use nodetool for repair; we have a repair scheduler that triggers repair jobs via JMX. The scheduling is based on repair history: checking for failed repairs that need to be retried, how long ago something was repaired, and whether something is missing entirely and has never been repaired. There is no big issue with shutting this down for a while during an upgrade, but I don't want to do that for longer than I need to; if I can do it per DC it would help. Yes, you can alter the schema of the system distributed tables. Previously you could just do {{ALTER TABLE}}, but since a few versions back it's a little more complicated: now you have to {{INSERT}} into the system schema tables and use {{nodetool reloadlocalschema}}. > Don't write to system_distributed.repair_history, system_traces.sessions, > system_traces.events in mixed version 3.X/4.0 clusters > > > Key: CASSANDRA-14841 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14841 > Project: Cassandra > Issue Type: Bug >Reporter: Tommy Stendahl >Assignee: Ariel Weisberg >Priority: Major > Fix For: 4.0 > > > When upgrading from 3.x to 4.0 I get exceptions in the old nodes once the > first 4.0 node starts up. I have tested upgrading from both 3.0.15 and > 3.11.3 and get the same problem. 
> > {noformat} > 2018-10-22T11:12:05.060+0200 ERROR > [MessagingService-Incoming-/10.216.193.244] CassandraDaemon.java:228 > Exception in thread Thread[MessagingService-Incoming-/10.216.193.244,5,main] > java.lang.RuntimeException: Unknown column coordinator_port during > deserialization > at org.apache.cassandra.db.Columns$Serializer.deserialize(Columns.java:452) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at > org.apache.cassandra.db.filter.ColumnFilter$Serializer.deserialize(ColumnFilter.java:482) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at > org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:760) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at > org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:697) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at > org.apache.cassandra.io.ForwardingVersionedSerializer.deserialize(ForwardingVersionedSerializer.java:50) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at > org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:192) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at > org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:180) > ~[apache-cassandra-3.11.3.jar:3.11.3] > at > org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94) > ~[apache-cassandra-3.11.3.jar:3.11.3]{noformat} > I think it was introduced by CASSANDRA-7544. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[2/6] cassandra git commit: ninja: fix precondition for unclustered tables
ninja: fix precondition for unclustered tables Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e04efab3 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e04efab3 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e04efab3 Branch: refs/heads/cassandra-3.11 Commit: e04efab3f9a60e5e8c34c845548b6ab6d0570376 Parents: d60c783 Author: Blake Eggleston Authored: Tue Nov 6 11:57:45 2018 -0800 Committer: Blake Eggleston Committed: Tue Nov 6 11:57:45 2018 -0800 -- .../io/sstable/metadata/MetadataCollector.java| 18 +++--- 1 file changed, 11 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/e04efab3/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java -- diff --git a/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java b/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java index f48d0a6..437d80f 100644 --- a/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java +++ b/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java @@ -45,6 +45,7 @@ import org.apache.cassandra.utils.StreamingHistogram; public class MetadataCollector implements PartitionStatisticsCollector { public static final double NO_COMPRESSION_RATIO = -1.0; +private static final ByteBuffer[] EMPTY_CLUSTERING = new ByteBuffer[0]; static EstimatedHistogram defaultCellPerPartitionCountHistogram() { @@ -95,8 +96,8 @@ public class MetadataCollector implements PartitionStatisticsCollector protected double compressionRatio = NO_COMPRESSION_RATIO; protected StreamingHistogram.StreamingHistogramBuilder estimatedTombstoneDropTime = defaultTombstoneDropTimeHistogramBuilder(); protected int sstableLevel; -private ClusteringPrefix minClustering = Slice.Bound.TOP; -private ClusteringPrefix maxClustering = Slice.Bound.BOTTOM; +private ClusteringPrefix minClustering = null; +private ClusteringPrefix 
maxClustering = null; protected boolean hasLegacyCounterShards = false; protected long totalColumnsSet; protected long totalRows; @@ -228,8 +229,8 @@ public class MetadataCollector implements PartitionStatisticsCollector public MetadataCollector updateClusteringValues(ClusteringPrefix clustering) { -minClustering = comparator.compare(clustering, minClustering) < 0 ? clustering : minClustering; -maxClustering = comparator.compare(clustering, maxClustering) > 0 ? clustering : maxClustering; +minClustering = minClustering == null || comparator.compare(clustering, minClustering) < 0 ? clustering : minClustering; +maxClustering = maxClustering == null || comparator.compare(clustering, maxClustering) > 0 ? clustering : maxClustering; return this; } @@ -271,7 +272,10 @@ public class MetadataCollector implements PartitionStatisticsCollector public Map finalizeMetadata(String partitioner, double bloomFilterFPChance, long repairedAt, SerializationHeader header) { -Preconditions.checkState(comparator.compare(maxClustering, minClustering) >= 0); +Preconditions.checkState((minClustering == null && maxClustering == null) + || comparator.compare(maxClustering, minClustering) >= 0); +ByteBuffer[] minValues = minClustering != null ? minClustering.getRawValues() : EMPTY_CLUSTERING; +ByteBuffer[] maxValues = maxClustering != null ? maxClustering.getRawValues() : EMPTY_CLUSTERING; Map components = Maps.newHashMap(); components.put(MetadataType.VALIDATION, new ValidationMetadata(partitioner, bloomFilterFPChance)); components.put(MetadataType.STATS, new StatsMetadata(estimatedPartitionSize, @@ -286,8 +290,8 @@ public class MetadataCollector implements PartitionStatisticsCollector compressionRatio, estimatedTombstoneDropTime.build(), sstableLevel, - makeList(minClustering.getRawValues()), - makeList(maxClustering.getRawValues()), + makeList(minValues), + makeList(maxValues), hasLegacyCounterShards, repairedAt,
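(Editorial note: the patch above replaces the TOP/BOTTOM sentinels with null-initialized min/max clusterings, because an sstable of an unclustered table never calls updateClusteringValues, leaving min seeded at TOP and max at BOTTOM so the precondition compare(max, min) >= 0 failed. A minimal, self-contained Java sketch of the same pattern — hypothetical class, not the actual MetadataCollector API:)

```java
import java.util.Comparator;

// Simplified illustration of the fix: min/max start as null ("no clustering
// values seen yet") instead of TOP/BOTTOM sentinels, and the finalize-time
// precondition accepts the all-null case an unclustered table produces.
class MinMaxTracker<T> {
    private final Comparator<T> comparator;
    private T min; // null until the first update
    private T max;

    MinMaxTracker(Comparator<T> comparator) {
        this.comparator = comparator;
    }

    void update(T value) {
        min = (min == null || comparator.compare(value, min) < 0) ? value : min;
        max = (max == null || comparator.compare(value, max) > 0) ? value : max;
    }

    // Mirrors the fixed Preconditions.checkState: either nothing was ever
    // collected (unclustered table), or max >= min must hold.
    void checkInvariant() {
        if (!((min == null && max == null) || comparator.compare(max, min) >= 0))
            throw new IllegalStateException("max clustering < min clustering");
    }

    T min() { return min; }
    T max() { return max; }

    public static void main(String[] args) {
        MinMaxTracker<Integer> t = new MinMaxTracker<>(Comparator.naturalOrder());
        t.checkInvariant(); // unclustered case: no values collected, no longer fails
        t.update(5);
        t.update(2);
        t.update(9);
        t.checkInvariant();
        System.out.println(t.min() + " " + t.max()); // prints "2 9"
    }
}
```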
[6/6] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/51c8387d Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/51c8387d Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/51c8387d Branch: refs/heads/trunk Commit: 51c8387deac074ee404eba0070016867253d90b1 Parents: 6ec4452 7eecf89 Author: Blake Eggleston Authored: Tue Nov 6 12:00:06 2018 -0800 Committer: Blake Eggleston Committed: Tue Nov 6 12:00:06 2018 -0800 -- .../io/sstable/metadata/MetadataCollector.java| 18 +++--- 1 file changed, 11 insertions(+), 7 deletions(-) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/51c8387d/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java -- diff --cc src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java index d7c6b61,0ac5187..19fa20c mode 100755,100644..100755 --- a/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java +++ b/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java @@@ -99,10 -96,10 +100,10 @@@ public class MetadataCollector implemen protected final MinMaxIntTracker localDeletionTimeTracker = new MinMaxIntTracker(Cell.NO_DELETION_TIME, Cell.NO_DELETION_TIME); protected final MinMaxIntTracker ttlTracker = new MinMaxIntTracker(Cell.NO_TTL, Cell.NO_TTL); protected double compressionRatio = NO_COMPRESSION_RATIO; -protected StreamingHistogram.StreamingHistogramBuilder estimatedTombstoneDropTime = defaultTombstoneDropTimeHistogramBuilder(); +protected StreamingTombstoneHistogramBuilder estimatedTombstoneDropTime = new StreamingTombstoneHistogramBuilder(SSTable.TOMBSTONE_HISTOGRAM_BIN_SIZE, SSTable.TOMBSTONE_HISTOGRAM_SPOOL_SIZE, SSTable.TOMBSTONE_HISTOGRAM_TTL_ROUND_SECONDS); protected int sstableLevel; - private ClusteringPrefix minClustering = ClusteringBound.TOP; - private ClusteringPrefix maxClustering = ClusteringBound.BOTTOM; + private 
ClusteringPrefix minClustering = null; + private ClusteringPrefix maxClustering = null; protected boolean hasLegacyCounterShards = false; protected long totalColumnsSet; protected long totalRows; @@@ -269,9 -272,12 +270,12 @@@ this.hasLegacyCounterShards = this.hasLegacyCounterShards || hasLegacyCounterShards; } -public Map finalizeMetadata(String partitioner, double bloomFilterFPChance, long repairedAt, SerializationHeader header) +public Map finalizeMetadata(String partitioner, double bloomFilterFPChance, long repairedAt, UUID pendingRepair, boolean isTransient, SerializationHeader header) { - Preconditions.checkState(comparator.compare(maxClustering, minClustering) >= 0); + Preconditions.checkState((minClustering == null && maxClustering == null) + || comparator.compare(maxClustering, minClustering) >= 0); + ByteBuffer[] minValues = minClustering != null ? minClustering.getRawValues() : EMPTY_CLUSTERING; + ByteBuffer[] maxValues = maxClustering != null ? maxClustering.getRawValues() : EMPTY_CLUSTERING; Map components = new EnumMap<>(MetadataType.class); components.put(MetadataType.VALIDATION, new ValidationMetadata(partitioner, bloomFilterFPChance)); components.put(MetadataType.STATS, new StatsMetadata(estimatedPartitionSize, - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[4/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/7eecf891
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/7eecf891
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/7eecf891

Branch: refs/heads/trunk
Commit: 7eecf891f5a12a0f74b69a9aa60a91f584235d0c
Parents: a6a9dce e04efab
Author: Blake Eggleston
Authored: Tue Nov 6 11:59:49 2018 -0800
Committer: Blake Eggleston
Committed: Tue Nov 6 11:59:49 2018 -0800

----------------------------------------------------------------------
 .../io/sstable/metadata/MetadataCollector.java | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/7eecf891/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java

diff --cc src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
index a618c96,437d80f..0ac5187
--- a/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
+++ b/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
@@@ -273,8 -272,11 +274,11 @@@ public class MetadataCollector implemen
      public Map finalizeMetadata(String partitioner, double bloomFilterFPChance, long repairedAt, SerializationHeader header)
      {
-         Preconditions.checkState(comparator.compare(maxClustering, minClustering) >= 0);
+         Preconditions.checkState((minClustering == null && maxClustering == null)
+                                  || comparator.compare(maxClustering, minClustering) >= 0);
+         ByteBuffer[] minValues = minClustering != null ? minClustering.getRawValues() : EMPTY_CLUSTERING;
+         ByteBuffer[] maxValues = maxClustering != null ? maxClustering.getRawValues() : EMPTY_CLUSTERING;
 -        Map components = Maps.newHashMap();
 +        Map components = new EnumMap<>(MetadataType.class);
          components.put(MetadataType.VALIDATION, new ValidationMetadata(partitioner, bloomFilterFPChance));
          components.put(MetadataType.STATS, new StatsMetadata(estimatedPartitionSize,
                         estimatedCellPerPartitionCount,
[1/6] cassandra git commit: ninja: fix precondition for unclustered tables
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 d60c78358 -> e04efab3f
  refs/heads/cassandra-3.11 a6a9dce15 -> 7eecf891f
  refs/heads/trunk 6ec445282 -> 51c8387de

ninja: fix precondition for unclustered tables

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/e04efab3
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/e04efab3
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/e04efab3

Branch: refs/heads/cassandra-3.0
Commit: e04efab3f9a60e5e8c34c845548b6ab6d0570376
Parents: d60c783
Author: Blake Eggleston
Authored: Tue Nov 6 11:57:45 2018 -0800
Committer: Blake Eggleston
Committed: Tue Nov 6 11:57:45 2018 -0800

----------------------------------------------------------------------
 .../io/sstable/metadata/MetadataCollector.java | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/e04efab3/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java

diff --git a/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java b/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
index f48d0a6..437d80f 100644
--- a/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
+++ b/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
@@ -45,6 +45,7 @@ import org.apache.cassandra.utils.StreamingHistogram;
 public class MetadataCollector implements PartitionStatisticsCollector
 {
     public static final double NO_COMPRESSION_RATIO = -1.0;
+    private static final ByteBuffer[] EMPTY_CLUSTERING = new ByteBuffer[0];

     static EstimatedHistogram defaultCellPerPartitionCountHistogram()
     {
@@ -95,8 +96,8 @@ public class MetadataCollector implements PartitionStatisticsCollector
     protected double compressionRatio = NO_COMPRESSION_RATIO;
     protected StreamingHistogram.StreamingHistogramBuilder estimatedTombstoneDropTime = defaultTombstoneDropTimeHistogramBuilder();
     protected int sstableLevel;
-    private ClusteringPrefix minClustering = Slice.Bound.TOP;
-    private ClusteringPrefix maxClustering = Slice.Bound.BOTTOM;
+    private ClusteringPrefix minClustering = null;
+    private ClusteringPrefix maxClustering = null;
     protected boolean hasLegacyCounterShards = false;
     protected long totalColumnsSet;
     protected long totalRows;
@@ -228,8 +229,8 @@ public class MetadataCollector implements PartitionStatisticsCollector
     public MetadataCollector updateClusteringValues(ClusteringPrefix clustering)
     {
-        minClustering = comparator.compare(clustering, minClustering) < 0 ? clustering : minClustering;
-        maxClustering = comparator.compare(clustering, maxClustering) > 0 ? clustering : maxClustering;
+        minClustering = minClustering == null || comparator.compare(clustering, minClustering) < 0 ? clustering : minClustering;
+        maxClustering = maxClustering == null || comparator.compare(clustering, maxClustering) > 0 ? clustering : maxClustering;
         return this;
     }
@@ -271,7 +272,10 @@ public class MetadataCollector implements PartitionStatisticsCollector
     public Map finalizeMetadata(String partitioner, double bloomFilterFPChance, long repairedAt, SerializationHeader header)
     {
-        Preconditions.checkState(comparator.compare(maxClustering, minClustering) >= 0);
+        Preconditions.checkState((minClustering == null && maxClustering == null)
+                                 || comparator.compare(maxClustering, minClustering) >= 0);
+        ByteBuffer[] minValues = minClustering != null ? minClustering.getRawValues() : EMPTY_CLUSTERING;
+        ByteBuffer[] maxValues = maxClustering != null ? maxClustering.getRawValues() : EMPTY_CLUSTERING;
         Map components = Maps.newHashMap();
         components.put(MetadataType.VALIDATION, new ValidationMetadata(partitioner, bloomFilterFPChance));
         components.put(MetadataType.STATS, new StatsMetadata(estimatedPartitionSize,
@@ -286,8 +290,8 @@ public class MetadataCollector implements PartitionStatisticsCollector
                      compressionRatio,
                      estimatedTombstoneDropTime.build(),
                      sstableLevel,
-                     makeList(minClustering.getRawValues()),
-                     makeList(maxClustering.getRawValues()),
+                     makeList(minValues),
+                     makeList(maxValues),
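The null-initialized min/max tracking in the patch above can be sketched in isolation. The following is a minimal illustration, not the actual Cassandra types: `MinMaxTracker` is a hypothetical stand-in for the `minClustering`/`maxClustering` fields, with a plain `Comparator` in place of the clustering comparator. It shows why the `null` seed plus the relaxed precondition handles the unclustered (no-values-seen) case that the old `TOP`/`BOTTOM` sentinels tripped over.

```java
import java.util.Comparator;

// Sketch of the fixed min/max tracking: null means "no value seen yet",
// the first observed value seeds both bounds, and an empty tracker
// passes the invariant check instead of comparing sentinel bounds.
class MinMaxTracker<T>
{
    private final Comparator<T> comparator;
    private T min = null;
    private T max = null;

    MinMaxTracker(Comparator<T> comparator) { this.comparator = comparator; }

    // Mirrors updateClusteringValues: null check first, then the comparison.
    void update(T value)
    {
        min = (min == null || comparator.compare(value, min) < 0) ? value : min;
        max = (max == null || comparator.compare(value, max) > 0) ? value : max;
    }

    // Mirrors the Preconditions.checkState in finalizeMetadata:
    // either nothing was collected, or max >= min.
    void checkInvariant()
    {
        if (!((min == null && max == null) || comparator.compare(max, min) >= 0))
            throw new IllegalStateException("max < min");
    }

    T min() { return min; }
    T max() { return max; }

    public static void main(String[] args)
    {
        MinMaxTracker<Integer> t = new MinMaxTracker<>(Comparator.naturalOrder());
        t.checkInvariant();            // empty tracker no longer fails
        t.update(5); t.update(3); t.update(9);
        t.checkInvariant();
        System.out.println(t.min() + " " + t.max());  // prints "3 9"
    }
}
```

With the old sentinel initialization, a table with no clustering columns never updated the bounds, so the precondition compared `BOTTOM` against `TOP` and failed; the null/null state makes "nothing collected" explicit.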
[jira] [Updated] (CASSANDRA-14861) sstable min/max metadata can cause data loss
[ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Blake Eggleston updated CASSANDRA-14861:
    Resolution: Fixed
    Status: Resolved  (was: Ready to Commit)

committed as {{d60c78358b6f599a83f3c112bfd6ce72c1129c9f}}, thanks

> sstable min/max metadata can cause data loss
>
> Key: CASSANDRA-14861
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14861
> Project: Cassandra
> Issue Type: Bug
> Components: Local Write-Read Paths
> Reporter: Blake Eggleston
> Assignee: Blake Eggleston
> Priority: Major
> Fix For: 3.0.18, 3.11.4, 4.0
>
> There's a bug in the way we filter sstables in the read path that can cause sstables containing relevant range tombstones to be excluded from reads. This can cause data resurrection for an individual read and, if compaction timing is right, permanent resurrection via read repair.
> We track the min and max clustering values when writing an sstable so we can avoid reading from sstables that don't contain the clustering values we're looking for in a given read. The min/max for each clustering column are updated for each row / RT marker we write. In the case of range tombstone markers, though, we only update the min/max for the clustering values they contain, which is almost never the full set of clustering values. This leaves a min/max that are above/below (respectively) the real ranges covered by the range tombstone contained in the sstable.
> For instance, assume we're writing an sstable for a table with 3 clustering columns. The current min clustering is 5:6:7. We write an RT marker for a range tombstone that deletes any row with the value 4 in the first clustering column, so the open marker is [4:]. This would make the new min clustering 4:6:7 when it should really be 4:. If we do a read for clustering values of 4:5 and lower, we'll exclude this sstable and its range tombstone, resurrecting any data that this tombstone would have deleted.
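The 5:6:7 / [4:] example above can be made concrete. This is a sketch only, with `Integer` lists standing in for clustering prefixes and a simplified lexicographic comparison in which a shorter prefix that ties on its components sorts first (a rough analogue of an open bound; the real `ClusteringComparator` bound handling is more involved):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the min/max filtering bug: per-component min tracking on a
// range-tombstone open marker [4:] yields 4:6:7, while the real minimum
// the sstable covers is the whole prefix 4:.
class MinMaxBugSketch
{
    // Lexicographic comparison; shorter prefix sorts first on ties.
    static int compare(List<Integer> a, List<Integer> b)
    {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++)
        {
            int c = Integer.compare(a.get(i), b.get(i));
            if (c != 0) return c;
        }
        return Integer.compare(a.size(), b.size());
    }

    public static void main(String[] args)
    {
        List<Integer> sstableMin = Arrays.asList(5, 6, 7); // current min clustering 5:6:7

        // Buggy per-component update: [4:] only contributes its first
        // component, so the tracked min becomes 4:6:7 ...
        List<Integer> buggyMin = Arrays.asList(4, 6, 7);
        // ... while comparing whole prefixes says the real min is 4: itself.
        List<Integer> openMarker = Arrays.asList(4);
        List<Integer> correctMin = compare(openMarker, sstableMin) < 0 ? openMarker : sstableMin;

        // A read for 4:5 is wrongly excluded by the buggy min (4:5 < 4:6:7)
        // but correctly included by the real min (4:5 >= 4:).
        List<Integer> query = Arrays.asList(4, 5);
        System.out.println(compare(query, buggyMin) < 0);    // true  -> sstable skipped, tombstone missed
        System.out.println(compare(query, correctMin) < 0);  // false -> sstable read
    }
}
```

The skipped sstable is exactly the failure mode the ticket describes: the read misses the tombstone and resurrects deleted data.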
[4/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a6a9dce1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a6a9dce1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a6a9dce1

Branch: refs/heads/trunk
Commit: a6a9dce157a4ed14e7d08e854c504423dd199daa
Parents: 69f8cc7 d60c783
Author: Blake Eggleston
Authored: Tue Nov 6 11:17:47 2018 -0800
Committer: Blake Eggleston
Committed: Tue Nov 6 11:19:04 2018 -0800

----------------------------------------------------------------------
 CHANGES.txt | 1 +
 src/java/org/apache/cassandra/db/Slice.java | 25 +-
 .../cassandra/io/sstable/format/Version.java | 2 +
 .../io/sstable/format/big/BigFormat.java | 9 ++
 .../io/sstable/metadata/MetadataCollector.java | 25 +++---
 .../io/sstable/metadata/StatsMetadata.java | 14 ++-
 .../mc-1-big-CompressionInfo.db | Bin 0 -> 43 bytes
 .../mc-1-big-Data.db | Bin 0 -> 65 bytes
 .../mc-1-big-Digest.crc32 | 1 +
 .../mc-1-big-Filter.db | Bin 0 -> 16 bytes
 .../mc-1-big-Index.db | Bin 0 -> 8 bytes
 .../mc-1-big-Statistics.db | Bin 0 -> 4789 bytes
 .../mc-1-big-Summary.db | Bin 0 -> 56 bytes
 .../mc-1-big-TOC.txt | 8 ++
 .../db/SinglePartitionSliceCommandTest.java | 87 +++
 .../cassandra/io/sstable/LegacySSTableTest.java | 34 +++-
 16 files changed, 166 insertions(+), 40 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a6a9dce1/CHANGES.txt

diff --cc CHANGES.txt
index 03abb5b,0fb1b86..f923fa0
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,5 -1,5 +1,6 @@@
 -3.0.18
 +3.11.4
 +Merged from 3.0:
+  * Sstable min/max metadata can cause data loss (CASSANDRA-14861)
   * Dropped columns can cause reverse sstable iteration to return prematurely (CASSANDRA-14838)
   * Legacy sstables with multi block range tombstones create invalid bound sequences (CASSANDRA-14823)
   * Expand range tombstone validation checks to multiple interim request stages (CASSANDRA-14824)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a6a9dce1/src/java/org/apache/cassandra/db/Slice.java
http://git-wip-us.apache.org/repos/asf/cassandra/blob/a6a9dce1/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
http://git-wip-us.apache.org/repos/asf/cassandra/blob/a6a9dce1/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java

diff --cc src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
index e61f4b3,f48d0a6..a618c96
--- a/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
+++ b/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
@@@ -19,8 -19,8 +19,9 @@@ package org.apache.cassandra.io.sstable
  import java.nio.ByteBuffer;
  import java.util.ArrayList;
+  import java.util.Arrays;
  import java.util.Collections;
 +import java.util.EnumMap;
  import java.util.List;
  import java.util.Map;
@@@ -93,8 -95,8 +97,8 @@@ public class MetadataCollector implemen
      protected double compressionRatio = NO_COMPRESSION_RATIO;
      protected StreamingHistogram.StreamingHistogramBuilder estimatedTombstoneDropTime = defaultTombstoneDropTimeHistogramBuilder();
      protected int sstableLevel;
-     protected ByteBuffer[] minClusteringValues;
-     protected ByteBuffer[] maxClusteringValues;
 -    private ClusteringPrefix minClustering = Slice.Bound.TOP;
 -    private ClusteringPrefix maxClustering = Slice.Bound.BOTTOM;
++    private ClusteringPrefix minClustering = ClusteringBound.TOP;
++    private ClusteringPrefix maxClustering = ClusteringBound.BOTTOM;
      protected boolean hasLegacyCounterShards = false;
      protected long totalColumnsSet;
      protected long totalRows;
@@@ -277,7 -271,8 +273,8 @@@
      public Map finalizeMetadata(String partitioner, double bloomFilterFPChance, long repairedAt, SerializationHeader header)
      {
+         Preconditions.checkState(comparator.compare(maxClustering, minClustering) >= 0);
 -        Map components = Maps.newHashMap();
 +        Map components = new EnumMap<>(MetadataType.class);
          components.put(MetadataType.VALIDATION, new ValidationMetadata(partitioner, bloomFilterFPChance));
          components.put(MetadataType.STATS, new StatsMetadata(estimatedPartitionSize,
[6/6] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/6ec44528
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/6ec44528
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/6ec44528

Branch: refs/heads/trunk
Commit: 6ec445282ad2ed0620d7fd16f23fab256123e7cf
Parents: bfbc527 a6a9dce
Author: Blake Eggleston
Authored: Tue Nov 6 11:19:55 2018 -0800
Committer: Blake Eggleston
Committed: Tue Nov 6 11:20:52 2018 -0800

----------------------------------------------------------------------
 CHANGES.txt | 1 +
 src/java/org/apache/cassandra/db/Slice.java | 25 +-
 .../cassandra/io/sstable/format/Version.java | 2 +
 .../io/sstable/format/big/BigFormat.java | 10 +++
 .../io/sstable/metadata/MetadataCollector.java | 25 +++---
 .../io/sstable/metadata/StatsMetadata.java | 14 ++-
 .../mc-1-big-CompressionInfo.db | Bin 0 -> 43 bytes
 .../mc-1-big-Data.db | Bin 0 -> 65 bytes
 .../mc-1-big-Digest.crc32 | 1 +
 .../mc-1-big-Filter.db | Bin 0 -> 16 bytes
 .../mc-1-big-Index.db | Bin 0 -> 8 bytes
 .../mc-1-big-Statistics.db | Bin 0 -> 4789 bytes
 .../mc-1-big-Summary.db | Bin 0 -> 56 bytes
 .../mc-1-big-TOC.txt | 8 ++
 .../db/SinglePartitionSliceCommandTest.java | 89 +++
 .../cassandra/io/sstable/LegacySSTableTest.java | 35 +++-
 16 files changed, 170 insertions(+), 40 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/6ec44528/CHANGES.txt

diff --cc CHANGES.txt
index 41b3da9,f923fa0..2373cb2
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,330 -1,6 +1,331 @@@
 +4.0
 + * Partitioned outbound internode TCP connections can occur when nodes restart (CASSANDRA-14358)
 + * Don't write to system_distributed.repair_history, system_traces.sessions, system_traces.events in mixed version 3.X/4.0 clusters (CASSANDRA-14841)
 + * Avoid running query to self through messaging service (CASSANDRA-14807)
 + * Allow using custom script for chronicle queue BinLog archival (CASSANDRA-14373)
 + * Transient->Full range movements mishandle consistency level upgrade (CASSANDRA-14759)
 + * ReplicaCollection follow-up (CASSANDRA-14726)
 + * Transient node receives full data requests (CASSANDRA-14762)
 + * Enable snapshot artifacts publish (CASSANDRA-12704)
 + * Introduce RangesAtEndpoint.unwrap to simplify StreamSession.addTransferRanges (CASSANDRA-14770)
 + * LOCAL_QUORUM may speculate to non-local nodes, resulting in Timeout instead of Unavailable (CASSANDRA-14735)
 + * Avoid creating empty compaction tasks after truncate (CASSANDRA-14780)
 + * Fail incremental repair prepare phase if it encounters sstables from un-finalized sessions (CASSANDRA-14763)
 + * Add a check for receiving digest response from transient node (CASSANDRA-14750)
 + * Fail query on transient replica if coordinator only expects full data (CASSANDRA-14704)
 + * Remove mentions of transient replication from repair path (CASSANDRA-14698)
 + * Fix handleRepairStatusChangedNotification to remove first then add (CASSANDRA-14720)
 + * Allow transient node to serve as a repair coordinator (CASSANDRA-14693)
 + * DecayingEstimatedHistogramReservoir.EstimatedHistogramReservoirSnapshot returns wrong value for size() and incorrectly calculates count (CASSANDRA-14696)
 + * AbstractReplicaCollection equals and hash code should throw due to conflict between order sensitive/insensitive uses (CASSANDRA-14700)
 + * Detect inconsistencies in repaired data on the read path (CASSANDRA-14145)
 + * Add checksumming to the native protocol (CASSANDRA-13304)
 + * Make AuthCache more easily extendable (CASSANDRA-14662)
 + * Extend RolesCache to include detailed role info (CASSANDRA-14497)
 + * Add fqltool compare (CASSANDRA-14619)
 + * Add fqltool replay (CASSANDRA-14618)
 + * Log keyspace in full query log (CASSANDRA-14656)
 + * Transient Replication and Cheap Quorums (CASSANDRA-14404)
 + * Log server-generated timestamp and nowInSeconds used by queries in FQL (CASSANDRA-14675)
 + * Add diagnostic events for read repairs (CASSANDRA-14668)
 + * Use consistent nowInSeconds and timestamps values within a request (CASSANDRA-14671)
 + * Add sampler for query time and expose with nodetool (CASSANDRA-14436)
 + * Clean up Message.Request implementations (CASSANDRA-14677)
 + * Disable old native protocol versions on demand (CASANDRA-14659)
 + * Allow specifying now-in-seconds in native protocol (CASSANDRA-14664)
 + * Improve BTree build performance by avoiding data copy (CASSANDRA-9989)
 + * Make monotonic read / read repair configurable
[2/6] cassandra git commit: Sstable min/max metadata can cause data loss
Sstable min/max metadata can cause data loss

Patch by Blake Eggleston; Reviewed by Sam Tunnicliffe for CASSANDRA-14861

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d60c7835
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d60c7835
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d60c7835

Branch: refs/heads/cassandra-3.11
Commit: d60c78358b6f599a83f3c112bfd6ce72c1129c9f
Parents: e4bac44
Author: Blake Eggleston
Authored: Wed Oct 31 15:55:48 2018 -0700
Committer: Blake Eggleston
Committed: Tue Nov 6 11:17:06 2018 -0800

----------------------------------------------------------------------
 CHANGES.txt | 1 +
 src/java/org/apache/cassandra/db/Slice.java | 25 +-
 .../cassandra/io/sstable/format/Version.java | 2 +
 .../io/sstable/format/big/BigFormat.java | 9 ++
 .../io/sstable/metadata/MetadataCollector.java | 26 ++
 .../io/sstable/metadata/StatsMetadata.java | 14 ++-
 .../mc-1-big-CompressionInfo.db | Bin 0 -> 43 bytes
 .../mc-1-big-Data.db | Bin 0 -> 65 bytes
 .../mc-1-big-Digest.crc32 | 1 +
 .../mc-1-big-Filter.db | Bin 0 -> 16 bytes
 .../mc-1-big-Index.db | Bin 0 -> 8 bytes
 .../mc-1-big-Statistics.db | Bin 0 -> 4789 bytes
 .../mc-1-big-Summary.db | Bin 0 -> 56 bytes
 .../mc-1-big-TOC.txt | 8 ++
 .../db/SinglePartitionSliceCommandTest.java | 87 +++
 .../cassandra/io/sstable/LegacySSTableTest.java | 35 +++-
 16 files changed, 165 insertions(+), 43 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d60c7835/CHANGES.txt

diff --git a/CHANGES.txt b/CHANGES.txt
index cc8e348..0fb1b86 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.18
+ * Sstable min/max metadata can cause data loss (CASSANDRA-14861)
  * Dropped columns can cause reverse sstable iteration to return prematurely (CASSANDRA-14838)
  * Legacy sstables with multi block range tombstones create invalid bound sequences (CASSANDRA-14823)
  * Expand range tombstone validation checks to multiple interim request stages (CASSANDRA-14824)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d60c7835/src/java/org/apache/cassandra/db/Slice.java

diff --git a/src/java/org/apache/cassandra/db/Slice.java b/src/java/org/apache/cassandra/db/Slice.java
index 3c645dc..f90c195 100644
--- a/src/java/org/apache/cassandra/db/Slice.java
+++ b/src/java/org/apache/cassandra/db/Slice.java
@@ -248,29 +248,8 @@ public class Slice
      */
     public boolean intersects(ClusteringComparator comparator, List minClusteringValues, List maxClusteringValues)
     {
-        // If this slice start after max or end before min, it can't intersect
-        if (start.compareTo(comparator, maxClusteringValues) > 0 || end.compareTo(comparator, minClusteringValues) < 0)
-            return false;
-
-        // We could safely return true here, but there's a minor optimization: if the first component
-        // of the slice is restricted to a single value (typically the slice is [4:5, 4:7]), we can
-        // check that the second component falls within the min/max for that component (and repeat for
-        // all components).
-        for (int j = 0; j < minClusteringValues.size() && j < maxClusteringValues.size(); j++)
-        {
-            ByteBuffer s = j < start.size() ? start.get(j) : null;
-            ByteBuffer f = j < end.size() ? end.get(j) : null;
-
-            // we already know the first component falls within its min/max range (otherwise we wouldn't get here)
-            if (j > 0 && (j < end.size() && comparator.compareComponent(j, f, minClusteringValues.get(j)) < 0 ||
-                        j < start.size() && comparator.compareComponent(j, s, maxClusteringValues.get(j)) > 0))
-                return false;
-
-            // if this component isn't equal in the start and finish, we don't need to check any more
-            if (j >= start.size() || j >= end.size() || comparator.compareComponent(j, s, f) != 0)
-                break;
-        }
-        return true;
+        // If this slice starts after max clustering or ends before min clustering, it can't intersect
+        return start.compareTo(comparator, maxClusteringValues) <= 0 && end.compareTo(comparator, minClusteringValues) >= 0;
     }

     public String toString(CFMetaData metadata)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d60c7835/src/java/org/apache/cassandra/io/sstable/format/Version.java
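The simplified `Slice.intersects` in the patch above reduces to a single conservative predicate: a slice can only be skipped when it lies entirely outside the sstable's min/max clustering range. A standalone sketch of that predicate, with `Integer` lists standing in for clustering prefixes and a simplified comparison that treats a shorter prefix as matching on its missing components (a rough analogue of the real bound semantics, not the actual Cassandra API):

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the simplified intersects check: slice [start, end] may
// intersect the sstable's [min, max] clustering range iff
// start <= max and end >= min.
class SliceIntersectsSketch
{
    // Component-wise comparison; a prefix ties with anything it is a prefix of,
    // which keeps the check conservative (ties count as "might intersect").
    static int compare(List<Integer> a, List<Integer> b)
    {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++)
        {
            int c = Integer.compare(a.get(i), b.get(i));
            if (c != 0) return c;
        }
        return 0;
    }

    static boolean intersects(List<Integer> start, List<Integer> end,
                              List<Integer> min, List<Integer> max)
    {
        // If the slice starts after max clustering or ends before min
        // clustering, it can't intersect; otherwise read the sstable.
        return compare(start, max) <= 0 && compare(end, min) >= 0;
    }

    public static void main(String[] args)
    {
        List<Integer> min = Arrays.asList(4);     // the RT-aware min from the fix: 4:
        List<Integer> max = Arrays.asList(9, 9);
        System.out.println(intersects(Arrays.asList(4, 5), Arrays.asList(4, 5), min, max)); // true
        System.out.println(intersects(Arrays.asList(1), Arrays.asList(3), min, max));       // false
    }
}
```

Dropping the removed per-component "optimization" trades a few extra sstable reads for correctness: with range tombstones in play, component-wise exclusion could skip an sstable whose tombstone still covered the queried rows.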
[5/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/a6a9dce1
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/a6a9dce1
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/a6a9dce1

Branch: refs/heads/cassandra-3.11
Commit: a6a9dce157a4ed14e7d08e854c504423dd199daa
Parents: 69f8cc7 d60c783
Author: Blake Eggleston
Authored: Tue Nov 6 11:17:47 2018 -0800
Committer: Blake Eggleston
Committed: Tue Nov 6 11:19:04 2018 -0800

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 src/java/org/apache/cassandra/db/Slice.java     |  25 +-
 .../cassandra/io/sstable/format/Version.java    |   2 +
 .../io/sstable/format/big/BigFormat.java        |   9 ++
 .../io/sstable/metadata/MetadataCollector.java  |  25 +++---
 .../io/sstable/metadata/StatsMetadata.java      |  14 ++-
 .../mc-1-big-CompressionInfo.db                 | Bin 0 -> 43 bytes
 .../mc-1-big-Data.db                            | Bin 0 -> 65 bytes
 .../mc-1-big-Digest.crc32                       |   1 +
 .../mc-1-big-Filter.db                          | Bin 0 -> 16 bytes
 .../mc-1-big-Index.db                           | Bin 0 -> 8 bytes
 .../mc-1-big-Statistics.db                      | Bin 0 -> 4789 bytes
 .../mc-1-big-Summary.db                         | Bin 0 -> 56 bytes
 .../mc-1-big-TOC.txt                            |   8 ++
 .../db/SinglePartitionSliceCommandTest.java     |  87 +++
 .../cassandra/io/sstable/LegacySSTableTest.java |  34 +++-
 16 files changed, 166 insertions(+), 40 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a6a9dce1/CHANGES.txt
----------------------------------------------------------------------
diff --cc CHANGES.txt
index 03abb5b,0fb1b86..f923fa0
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,5 -1,5 +1,6 @@@
 -3.0.18
 +3.11.4
 +Merged from 3.0:
+  * Sstable min/max metadata can cause data loss (CASSANDRA-14861)
  * Dropped columns can cause reverse sstable iteration to return prematurely (CASSANDRA-14838)
  * Legacy sstables with multi block range tombstones create invalid bound sequences (CASSANDRA-14823)
  * Expand range tombstone validation checks to multiple interim request stages (CASSANDRA-14824)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/a6a9dce1/src/java/org/apache/cassandra/db/Slice.java
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/cassandra/blob/a6a9dce1/src/java/org/apache/cassandra/io/sstable/format/big/BigFormat.java
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/cassandra/blob/a6a9dce1/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
----------------------------------------------------------------------
diff --cc src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
index e61f4b3,f48d0a6..a618c96
--- a/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
+++ b/src/java/org/apache/cassandra/io/sstable/metadata/MetadataCollector.java
@@@ -19,8 -19,8 +19,9 @@@ package org.apache.cassandra.io.sstable
  import java.nio.ByteBuffer;
  import java.util.ArrayList;
+ import java.util.Arrays;
  import java.util.Collections;
 +import java.util.EnumMap;
  import java.util.List;
  import java.util.Map;
@@@ -93,8 -95,8 +97,8 @@@ public class MetadataCollector implemen
      protected double compressionRatio = NO_COMPRESSION_RATIO;
      protected StreamingHistogram.StreamingHistogramBuilder estimatedTombstoneDropTime = defaultTombstoneDropTimeHistogramBuilder();
      protected int sstableLevel;
-     protected ByteBuffer[] minClusteringValues;
-     protected ByteBuffer[] maxClusteringValues;
 -    private ClusteringPrefix minClustering = Slice.Bound.TOP;
 -    private ClusteringPrefix maxClustering = Slice.Bound.BOTTOM;
++    private ClusteringPrefix minClustering = ClusteringBound.TOP;
++    private ClusteringPrefix maxClustering = ClusteringBound.BOTTOM;
      protected boolean hasLegacyCounterShards = false;
      protected long totalColumnsSet;
      protected long totalRows;
@@@ -277,7 -271,8 +273,8 @@@
      public Map<MetadataType, MetadataComponent> finalizeMetadata(String partitioner, double bloomFilterFPChance, long repairedAt, SerializationHeader header)
      {
+         Preconditions.checkState(comparator.compare(maxClustering, minClustering) >= 0);
 -        Map<MetadataType, MetadataComponent> components = Maps.newHashMap();
 +        Map<MetadataType, MetadataComponent> components = new EnumMap<>(MetadataType.class);
          components.put(MetadataType.VALIDATION, new ValidationMetadata(partitioner, bloomFilterFPChance));
          components.put(MetadataType.STATS, new StatsMetadata(estimatedPartitionSize,
[1/6] cassandra git commit: Sstable min/max metadata can cause data loss
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 e4bac44a0 -> d60c78358
  refs/heads/cassandra-3.11 69f8cc7d2 -> a6a9dce15
  refs/heads/trunk bfbc5274f -> 6ec445282

Sstable min/max metadata can cause data loss

Patch by Blake Eggleston; Reviewed by Sam Tunnicliffe for CASSANDRA-14861

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/d60c7835
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/d60c7835
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/d60c7835

Branch: refs/heads/cassandra-3.0
Commit: d60c78358b6f599a83f3c112bfd6ce72c1129c9f
Parents: e4bac44
Author: Blake Eggleston
Authored: Wed Oct 31 15:55:48 2018 -0700
Committer: Blake Eggleston
Committed: Tue Nov 6 11:17:06 2018 -0800

----------------------------------------------------------------------
 CHANGES.txt                                     |   1 +
 src/java/org/apache/cassandra/db/Slice.java     |  25 +-
 .../cassandra/io/sstable/format/Version.java    |   2 +
 .../io/sstable/format/big/BigFormat.java        |   9 ++
 .../io/sstable/metadata/MetadataCollector.java  |  26 ++
 .../io/sstable/metadata/StatsMetadata.java      |  14 ++-
 .../mc-1-big-CompressionInfo.db                 | Bin 0 -> 43 bytes
 .../mc-1-big-Data.db                            | Bin 0 -> 65 bytes
 .../mc-1-big-Digest.crc32                       |   1 +
 .../mc-1-big-Filter.db                          | Bin 0 -> 16 bytes
 .../mc-1-big-Index.db                           | Bin 0 -> 8 bytes
 .../mc-1-big-Statistics.db                      | Bin 0 -> 4789 bytes
 .../mc-1-big-Summary.db                         | Bin 0 -> 56 bytes
 .../mc-1-big-TOC.txt                            |   8 ++
 .../db/SinglePartitionSliceCommandTest.java     |  87 +++
 .../cassandra/io/sstable/LegacySSTableTest.java |  35 +++-
 16 files changed, 165 insertions(+), 43 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d60c7835/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index cc8e348..0fb1b86 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.18
+ * Sstable min/max metadata can cause data loss (CASSANDRA-14861)
 * Dropped columns can cause reverse sstable iteration to return prematurely (CASSANDRA-14838)
 * Legacy sstables with multi block range tombstones create invalid bound sequences (CASSANDRA-14823)
 * Expand range tombstone validation checks to multiple interim request stages (CASSANDRA-14824)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/d60c7835/src/java/org/apache/cassandra/db/Slice.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/db/Slice.java b/src/java/org/apache/cassandra/db/Slice.java
index 3c645dc..f90c195 100644
--- a/src/java/org/apache/cassandra/db/Slice.java
+++ b/src/java/org/apache/cassandra/db/Slice.java
@@ -248,29 +248,8 @@ public class Slice
      */
     public boolean intersects(ClusteringComparator comparator, List<ByteBuffer> minClusteringValues, List<ByteBuffer> maxClusteringValues)
     {
-        // If this slice start after max or end before min, it can't intersect
-        if (start.compareTo(comparator, maxClusteringValues) > 0 || end.compareTo(comparator, minClusteringValues) < 0)
-            return false;
-
-        // We could safely return true here, but there's a minor optimization: if the first component
-        // of the slice is restricted to a single value (typically the slice is [4:5, 4:7]), we can
-        // check that the second component falls within the min/max for that component (and repeat for
-        // all components).
-        for (int j = 0; j < minClusteringValues.size() && j < maxClusteringValues.size(); j++)
-        {
-            ByteBuffer s = j < start.size() ? start.get(j) : null;
-            ByteBuffer f = j < end.size() ? end.get(j) : null;
-
-            // we already know the first component falls within its min/max range (otherwise we wouldn't get here)
-            if (j > 0 && (j < end.size() && comparator.compareComponent(j, f, minClusteringValues.get(j)) < 0 ||
-                          j < start.size() && comparator.compareComponent(j, s, maxClusteringValues.get(j)) > 0))
-                return false;
-
-            // if this component isn't equal in the start and finish, we don't need to check any more
-            if (j >= start.size() || j >= end.size() || comparator.compareComponent(j, s, f) != 0)
-                break;
-        }
-        return true;
+        // If this slice starts after max clustering or ends before min clustering, it can't intersect
+        return start.compareTo(comparator, maxClusteringValues) <= 0 && end.compareTo(comparator, minClusteringValues) >= 0;
     }

     public String
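For readers following the patch: the fix replaces the old component-by-component loop with a plain interval-overlap test — a slice can intersect an sstable only if its start is not past the sstable's max clustering and its end is not before the min. A minimal standalone sketch of that predicate (plain ints stand in for clustering prefixes; the real Slice API takes a ClusteringComparator and bound objects):

```java
// Interval-overlap check mirroring the simplified Slice.intersects:
// a slice [start, end] can intersect an sstable's [min, max] clustering
// bounds only when start <= max AND end >= min.
public class SliceOverlapSketch
{
    static boolean intersects(int sliceStart, int sliceEnd, int sstableMin, int sstableMax)
    {
        // starts after max, or ends before min => cannot intersect
        return sliceStart <= sstableMax && sliceEnd >= sstableMin;
    }

    public static void main(String[] args)
    {
        // slice [4, 7] vs sstable covering [5, 10]: overlaps
        System.out.println(intersects(4, 7, 5, 10));   // true
        // slice [11, 12] starts after the sstable max of 10: no overlap
        System.out.println(intersects(11, 12, 5, 10)); // false
        // slice [1, 4] ends before the sstable min of 5: no overlap
        System.out.println(intersects(1, 4, 5, 10));   // false
    }
}
```

The point of dropping the per-component "optimization" is that, with range tombstone bounds now stored as whole prefixes in the metadata, the simple overlap check is both sufficient and safe.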
[jira] [Commented] (CASSANDRA-14358) Partitioned outbound internode TCP connections can occur when nodes restart
[ https://issues.apache.org/jira/browse/CASSANDRA-14358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677198#comment-16677198 ] Ariel Weisberg commented on CASSANDRA-14358: I guess I misinterpreted the scolding I got long ago WRT to commit messages. > Partitioned outbound internode TCP connections can occur when nodes restart > --- > > Key: CASSANDRA-14358 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14358 > Project: Cassandra > Issue Type: Bug > Components: Streaming and Messaging > Environment: Cassandra 2.1.19 (also reproduced on 3.0.15), running > with {{internode_encryption: all}} and the EC2 multi region snitch on Linux > 4.13 within the same AWS region. Smallest cluster I've seen the problem on is > 12 nodes, reproduces more reliably on 40+ and 300 node clusters consistently > reproduce on at least one node in the cluster. > So all the connections are SSL and we're connecting on the internal ip > addresses (not the public endpoint ones). > Potentially relevant sysctls: > {noformat} > /proc/sys/net/ipv4/tcp_syn_retries = 2 > /proc/sys/net/ipv4/tcp_synack_retries = 5 > /proc/sys/net/ipv4/tcp_keepalive_time = 7200 > /proc/sys/net/ipv4/tcp_keepalive_probes = 9 > /proc/sys/net/ipv4/tcp_keepalive_intvl = 75 > /proc/sys/net/ipv4/tcp_retries2 = 15 > {noformat} >Reporter: Joseph Lynch >Assignee: Joseph Lynch >Priority: Major > Labels: 4.0-feature-freeze-review-requested > Fix For: 4.0 > > Attachments: 10 Minute Partition.pdf > > > edit summary: This primarily impacts networks with stateful firewalls such as > AWS. I'm working on a proper patch for trunk but unfortunately it relies on > the Netty refactor in 4.0 so it will be hard to backport to previous > versions. A workaround for earlier versions is to set the > {{net.ipv4.tcp_retries2}} sysctl to ~5. 
This can be done with the following: > {code:java} > $ cat /etc/sysctl.d/20-cassandra-tuning.conf > net.ipv4.tcp_retries2=5 > $ # Reload all sysctls > $ sysctl --system{code} > Original Bug Report: > I've been trying to debug nodes not being able to see each other during > longer (~5 minute+) Cassandra restarts in 3.0.x and 2.1.x which can > contribute to {{UnavailableExceptions}} during rolling restarts of 3.0.x and > 2.1.x clusters for us. I think I finally have a lead. It appears that prior > to trunk (with the awesome Netty refactor) we do not set socket connect > timeouts on SSL connections (in 2.1.x, 3.0.x, or 3.11.x) nor do we set > {{SO_TIMEOUT}} as far as I can tell on outbound connections either. I believe > that this means that we could potentially block forever on {{connect}} or > {{recv}} syscalls, and we could block forever on the SSL Handshake as well. I > think that the OS will protect us somewhat (and that may be what's causing > the eventual timeout) but I think that given the right network conditions our > {{OutboundTCPConnection}} threads can just be stuck never making any progress > until the OS intervenes. > I have attached some logs of such a network partition during a rolling > restart where an old node in the cluster has a completely foobarred > {{OutboundTcpConnection}} for ~10 minutes before finally getting a > {{java.net.SocketException: Connection timed out (Write failed)}} and > immediately successfully reconnecting. I conclude that the old node is the > problem because the new node (the one that restarted) is sending ECHOs to the > old node, and the old node is sending ECHOs and REQUEST_RESPONSES to the new > node's ECHOs, but the new node is never getting the ECHO's. This appears, to > me, to indicate that the old node's {{OutboundTcpConnection}} thread is just > stuck and can't make any forward progress. 
By the time we could notice this > and slap TRACE logging on, the only thing we see is ~10 minutes later a > {{SocketException}} inside {{writeConnected}}'s flush and an immediate > recovery. It is interesting to me that the exception happens in > {{writeConnected}} and it's a _connection timeout_ (and since we see {{Write > failure}} I believe that this can't be a connection reset), because my > understanding is that we should have a fully handshaked SSL connection at > that point in the code. > Current theory: > # "New" node restarts, "Old" node calls > [newSocket|https://github.com/apache/cassandra/blob/6f30677b28dcbf82bcd0a291f3294ddf87dafaac/src/java/org/apache/cassandra/net/OutboundTcpConnection.java#L433] > # Old node starts [creating a > new|https://github.com/apache/cassandra/blob/6f30677b28dcbf82bcd0a291f3294ddf87dafaac/src/java/org/apache/cassandra/net/OutboundTcpConnectionPool.java#L141] > SSL socket > # SSLSocket calls >
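The missing-timeout claim above maps onto two standard java.net.Socket knobs. A hedged sketch of configuring bounded connects and reads — this is not the actual OutboundTcpConnection code, and the loopback address/port are placeholders:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ConnectTimeoutSketch
{
    public static void main(String[] args) throws IOException
    {
        Socket socket = new Socket();
        // SO_TIMEOUT bounds each subsequent blocking read (including reads
        // performed during an SSL handshake layered on top of this socket).
        socket.setSoTimeout(5000);
        System.out.println("read timeout (ms): " + socket.getSoTimeout());
        try
        {
            // The two-argument connect bounds the connect syscall itself; the
            // no-timeout overload can block until the kernel's SYN retry
            // budget (tcp_syn_retries) is exhausted. 127.0.0.1:9 is just a
            // placeholder endpoint that is normally closed.
            socket.connect(new InetSocketAddress("127.0.0.1", 9), 2000);
        }
        catch (IOException e)
        {
            // Refused or timed out -- either way the wait was bounded in-app.
            System.out.println("connect failed promptly: " + e.getClass().getSimpleName());
        }
        finally
        {
            socket.close();
        }
    }
}
```

Without both knobs, the only backstops are kernel sysctls like tcp_retries2, which is exactly why the workaround above tunes them.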
[jira] [Commented] (CASSANDRA-14841) Don't write to system_distributed.repair_history, system_traces.sessions, system_traces.events in mixed version 3.X/4.0 clusters
[ https://issues.apache.org/jira/browse/CASSANDRA-14841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677195#comment-16677195 ] Ariel Weisberg commented on CASSANDRA-14841: Why would this block repairs and why would it stop you from monitoring repairs? Wouldn't you use nodetool repair admin to monitor repairs? For repair it is just a history table. It's only populated after repair has already occurred. I'm not sure users can alter the schema of the distributed system tables. @imaleksey is that actually possible? > Don't write to system_distributed.repair_history, system_traces.sessions, > system_traces.events in mixed version 3.X/4.0 clusters > > > Key: CASSANDRA-14841 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14841 > Project: Cassandra > Issue Type: Bug >Reporter: Tommy Stendahl >Assignee: Ariel Weisberg >Priority: Major > Fix For: 4.0 > > > When upgrading from 3.x to 4.0 I get exceptions in the old nodes once the > first 4.0 node starts up. I have tested to upgrade from both 3.0.15 and > 3.11.3 and get the same problem. 
>
> {noformat}
> 2018-10-22T11:12:05.060+0200 ERROR [MessagingService-Incoming-/10.216.193.244] CassandraDaemon.java:228 Exception in thread Thread[MessagingService-Incoming-/10.216.193.244,5,main]
> java.lang.RuntimeException: Unknown column coordinator_port during deserialization
> at org.apache.cassandra.db.Columns$Serializer.deserialize(Columns.java:452) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.filter.ColumnFilter$Serializer.deserialize(ColumnFilter.java:482) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:760) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:697) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.io.ForwardingVersionedSerializer.deserialize(ForwardingVersionedSerializer.java:50) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:192) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:180) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94) ~[apache-cassandra-3.11.3.jar:3.11.3]
> {noformat}
> I think it was introduced by CASSANDRA-7544.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-14861) sstable min/max metadata can cause data loss
[ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

C. Scott Andreas updated CASSANDRA-14861:
-----------------------------------------
    Component/s: Local Write-Read Paths

> sstable min/max metadata can cause data loss
> --------------------------------------------
>
>                 Key: CASSANDRA-14861
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14861
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Blake Eggleston
>            Assignee: Blake Eggleston
>            Priority: Major
>             Fix For: 3.0.18, 3.11.4, 4.0
>
> There’s a bug in the way we filter sstables in the read path that can cause sstables containing relevant range tombstones to be excluded from reads. This can cause data resurrection for an individual read, and if compaction timing is right, permanent resurrection via read repair.
> We track the min and max clustering values when writing an sstable so we can avoid reading from sstables that don’t contain the clustering values we’re looking for in a given read. The min/max for each clustering column are updated for each row / RT marker we write. In the case of range tombstone markers though, we only update the min/max for the clustering values they contain, which is almost never the full set of clustering values. This leaves a min/max that are above/below (respectively) the real ranges covered by the range tombstone contained in the sstable.
> For instance, assume we’re writing an sstable for a table with 3 clustering values. The current min clustering is 5:6:7. We write an RT marker for a range tombstone that deletes any row with the value 4 in the first clustering value, so the open marker is [4:]. This would make the new min clustering 4:6:7 when it should really be 4:. If we do a read for clustering values of 4:5 and lower, we’ll exclude this sstable and its range tombstone, resurrecting any data there that this tombstone would have deleted.
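The failure mode in the description can be made concrete. Below is a hedged, self-contained model (not Cassandra's actual MetadataCollector code; int arrays stand in for clustering prefixes) of the two update strategies: the buggy per-component update, which merges the open RT bound [4:] into a previous min of 5:6:7 as 4:6:7, and the fixed whole-prefix comparison, which correctly yields 4:.

```java
import java.util.Arrays;

public class MinClusteringSketch
{
    // Buggy strategy: update the min component-by-component, keeping the old
    // component wherever the new prefix has none. {5,6,7} + {4} -> {4,6,7}.
    static int[] perComponentMin(int[] current, int[] candidate)
    {
        int[] result = current.clone();
        for (int i = 0; i < candidate.length; i++)
            result[i] = Math.min(result[i], candidate[i]);
        return result;
    }

    // Fixed strategy: compare the candidate prefix as a whole against the
    // current min; an open start bound like {4} sorts before {4, anything}.
    static int[] wholePrefixMin(int[] current, int[] candidate)
    {
        return compareAsLowerBounds(candidate, current) < 0 ? candidate : current;
    }

    // Lexicographic compare where a shorter prefix is the lower bound
    static int compareAsLowerBounds(int[] a, int[] b)
    {
        for (int i = 0; i < Math.min(a.length, b.length); i++)
            if (a[i] != b[i])
                return Integer.compare(a[i], b[i]);
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args)
    {
        int[] currentMin = {5, 6, 7};   // min clustering seen so far
        int[] rtOpenBound = {4};        // open RT marker [4:]

        // Buggy: {4,6,7} -- the metadata claims nothing below 4:6:7 exists,
        // so a read for 4:5 skips this sstable and misses the tombstone.
        System.out.println(Arrays.toString(perComponentMin(currentMin, rtOpenBound)));
        // Fixed: {4} -- the metadata now covers the whole [4:] range.
        System.out.println(Arrays.toString(wholePrefixMin(currentMin, rtOpenBound)));
    }
}
```

This is the essence of the patch's move from per-component ByteBuffer[] min/max values to tracking whole ClusteringPrefix bounds.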
[jira] [Commented] (CASSANDRA-14861) sstable min/max metadata can cause data loss
[ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677189#comment-16677189 ] Benedict commented on CASSANDRA-14861: -- +1 > sstable min/max metadata can cause data loss > > > Key: CASSANDRA-14861 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14861 > Project: Cassandra > Issue Type: Bug >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Major > Fix For: 3.0.18, 3.11.4, 4.0 > > > There’s a bug in the way we filter sstables in the read path that can cause > sstables containing relevant range tombstones to be excluded from reads. This > can cause data resurrection for an individual read, and if compaction timing > is right, permanent resurrection via read repair. > We track the min and max clustering values when writing an sstable so we can > avoid reading from sstables that don’t contain the clustering values we’re > looking for in a given read. The min max for each clustering column are > updated for each row / RT marker we write. In the case of range tombstones > markers though, we only update the min max for the clustering values they > contain, which is almost never the full set of clustering values. This leaves > a min/max that are above/below (respectively) the real ranges covered by the > range tombstone contained in the sstable. > For instance, assume we’re writing an sstable for a table with 3 clustering > values. The current min clustering is 5:6:7. We write an RT marker for a > range tombstone that deletes any row with the value 4 in the first clustering > value so the open marker is [4:]. This would make the new min clustering > 4:6:7 when it should really be 4:. If we do a read for clustering values of > 4:5 and lower, we’ll exclude this sstable and it’s range tombstone, > resurrecting any data there that this tombstone would have deleted. 
[jira] [Comment Edited] (CASSANDRA-14765) Evaluate Recovery Time on Single Token Cluster Test
[ https://issues.apache.org/jira/browse/CASSANDRA-14765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16648053#comment-16648053 ] Sumanth Pasupuleti edited comment on CASSANDRA-14765 at 11/6/18 6:48 PM: - Document with comparisons on streaming upon termination between 3.0.x and 4.0 clusters. https://docs.google.com/document/d/1nB7_rvO14GHao-oqywjQs7lIDSodYZTjoxbo8rRpxv0/edit# was (Author: sumanth.pasupuleti): Live document with comparisons on streaming upon termination between 3.0.x and 4.0 clusters. https://docs.google.com/document/d/1nB7_rvO14GHao-oqywjQs7lIDSodYZTjoxbo8rRpxv0/edit# > Evaluate Recovery Time on Single Token Cluster Test > --- > > Key: CASSANDRA-14765 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14765 > Project: Cassandra > Issue Type: Sub-task >Reporter: Joseph Lynch >Assignee: Sumanth Pasupuleti >Priority: Major > > *Setup:* > * Cassandra: 6 (2*3 rack) node i3.8xlarge AWS instance (32 cpu cores, 240GB > ram) running cassandra trunk with Jason's 14503 changes vs the same footprint > running 3.0.17 > * One datacenter, single tokens > * No compression, encryption, or coalescing turned on > *Test #1:* > ndbench loaded ~150GB of data per node into a LCS table. Then we killed a > node and let a new node stream. With a single token this should be a worst > case recovery scenario (only a few peers to stream from). > *Result:* > As the table used LCS and we didn't not have encryption on, the zero copy > transfer was used via CASSANDRA-14556. We recovered *150GB in 5 minutes,* > going at a consistent rate of about 3 gigabit per second. Theoretically we > should be able to get 10 gigabit, but this is still something like an > estimated 16x improvement over 3.0.x. We're still running the 3.0.x test for > a hard comparison. > *Follow Ups:* > We need to get more rigorous measurements (over more terminations), as well > as finishing the 3.0.x test. [~sumanth.pasupuleti] and [~djoshi3] are driving > this. 
[jira] [Updated] (CASSANDRA-14861) sstable min/max metadata can cause data loss
[ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe updated CASSANDRA-14861: Status: Ready to Commit (was: Patch Available) > sstable min/max metadata can cause data loss > > > Key: CASSANDRA-14861 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14861 > Project: Cassandra > Issue Type: Bug >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Major > Fix For: 3.0.18, 3.11.4, 4.0 > > > There’s a bug in the way we filter sstables in the read path that can cause > sstables containing relevant range tombstones to be excluded from reads. This > can cause data resurrection for an individual read, and if compaction timing > is right, permanent resurrection via read repair. > We track the min and max clustering values when writing an sstable so we can > avoid reading from sstables that don’t contain the clustering values we’re > looking for in a given read. The min max for each clustering column are > updated for each row / RT marker we write. In the case of range tombstones > markers though, we only update the min max for the clustering values they > contain, which is almost never the full set of clustering values. This leaves > a min/max that are above/below (respectively) the real ranges covered by the > range tombstone contained in the sstable. > For instance, assume we’re writing an sstable for a table with 3 clustering > values. The current min clustering is 5:6:7. We write an RT marker for a > range tombstone that deletes any row with the value 4 in the first clustering > value so the open marker is [4:]. This would make the new min clustering > 4:6:7 when it should really be 4:. If we do a read for clustering values of > 4:5 and lower, we’ll exclude this sstable and it’s range tombstone, > resurrecting any data there that this tombstone would have deleted. 
[jira] [Commented] (CASSANDRA-14861) sstable min/max metadata can cause data loss
[ https://issues.apache.org/jira/browse/CASSANDRA-14861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16677114#comment-16677114 ] Sam Tunnicliffe commented on CASSANDRA-14861: - +1 > sstable min/max metadata can cause data loss > > > Key: CASSANDRA-14861 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14861 > Project: Cassandra > Issue Type: Bug >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Major > Fix For: 3.0.18, 3.11.4, 4.0 > > > There’s a bug in the way we filter sstables in the read path that can cause > sstables containing relevant range tombstones to be excluded from reads. This > can cause data resurrection for an individual read, and if compaction timing > is right, permanent resurrection via read repair. > We track the min and max clustering values when writing an sstable so we can > avoid reading from sstables that don’t contain the clustering values we’re > looking for in a given read. The min max for each clustering column are > updated for each row / RT marker we write. In the case of range tombstones > markers though, we only update the min max for the clustering values they > contain, which is almost never the full set of clustering values. This leaves > a min/max that are above/below (respectively) the real ranges covered by the > range tombstone contained in the sstable. > For instance, assume we’re writing an sstable for a table with 3 clustering > values. The current min clustering is 5:6:7. We write an RT marker for a > range tombstone that deletes any row with the value 4 in the first clustering > value so the open marker is [4:]. This would make the new min clustering > 4:6:7 when it should really be 4:. If we do a read for clustering values of > 4:5 and lower, we’ll exclude this sstable and it’s range tombstone, > resurrecting any data there that this tombstone would have deleted. 
[jira] [Created] (CASSANDRA-14867) Histogram overflows potentially leading to writes failing
David created CASSANDRA-14867:
----------------------------------

             Summary: Histogram overflows potentially leading to writes failing
                 Key: CASSANDRA-14867
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14867
             Project: Cassandra
          Issue Type: Bug
          Components: Streaming and Messaging
         Environment: cassandra 3.11.1 on ubuntu 16.04
            Reporter: David

I observed the following in cassandra logs on 1 host of a 6-node cluster:

ERROR [ScheduledTasks:1] 2018-11-01 17:26:41,277 CassandraDaemon.java:228 - Exception in thread Thread[ScheduledTasks:1,5,main]
java.lang.IllegalStateException: Unable to compute when histogram overflowed
    at org.apache.cassandra.metrics.DecayingEstimatedHistogramReservoir$EstimatedHistogramReservoirSnapshot.getMean(DecayingEstimatedHistogramReservoir.java:472) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.net.MessagingService.getDroppedMessagesLogs(MessagingService.java:1263) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.net.MessagingService.logDroppedMessages(MessagingService.java:1236) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.net.MessagingService.access$200(MessagingService.java:87) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.net.MessagingService$4.run(MessagingService.java:507) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.concurrent.DebuggableScheduledThreadPoolExecutor$UncomplainingRunnable.run(DebuggableScheduledThreadPoolExecutor.java:118) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_172]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_172]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_172]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_172]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_172]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_172]
    at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81) [apache-cassandra-3.11.1.jar:3.11.1]
    at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_172]

At the same time, this node was failing all writes issued to it. Restarting cassandra on the node brought the cluster into a good state and we stopped seeing the histogram overflow errors. Has this issue been observed before? Could the histogram overflows cause writes to fail?
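For context on the error itself: the reservoir buckets observed values up to a fixed maximum, and once a value lands past the largest bucket the mean is no longer computable from bucket counts alone, so the snapshot throws rather than guess. A toy model of that behavior (assumed semantics, not the actual DecayingEstimatedHistogramReservoir code):

```java
// Minimal bucketed histogram with an overflow bucket: once any sample
// exceeds the largest representable bound, mean() refuses to answer.
public class OverflowHistogramSketch
{
    final long[] buckets;      // counts per bucket; last slot is the overflow bucket
    final long[] upperBounds;  // inclusive upper bound of each regular bucket

    OverflowHistogramSketch(long[] upperBounds)
    {
        this.upperBounds = upperBounds;
        this.buckets = new long[upperBounds.length + 1];
    }

    void update(long value)
    {
        for (int i = 0; i < upperBounds.length; i++)
        {
            if (value <= upperBounds[i]) { buckets[i]++; return; }
        }
        buckets[buckets.length - 1]++; // too large to represent: overflow
    }

    double mean()
    {
        if (buckets[buckets.length - 1] > 0)
            throw new IllegalStateException("Unable to compute when histogram overflowed");
        long count = 0, sum = 0;
        for (int i = 0; i < upperBounds.length; i++)
        {
            count += buckets[i];
            sum += buckets[i] * upperBounds[i]; // approximate each sample by its bucket bound
        }
        return count == 0 ? 0 : (double) sum / count;
    }

    public static void main(String[] args)
    {
        OverflowHistogramSketch h = new OverflowHistogramSketch(new long[]{10, 100, 1000});
        h.update(5);
        h.update(50);
        System.out.println(h.mean()); // both samples fit, mean is computable
        h.update(5000);               // exceeds the largest bucket bound
        try { h.mean(); }
        catch (IllegalStateException e) { System.out.println(e.getMessage()); }
    }
}
```

In the report above the exception escapes a ScheduledTasks runnable; whether that alone explains the failed writes is exactly the open question the reporter is asking.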
[jira] [Commented] (CASSANDRA-14841) Don't write to system_distributed.repair_history, system_traces.sessions, system_traces.events in mixed version 3.X/4.0 clusters
[ https://issues.apache.org/jira/browse/CASSANDRA-14841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16676671#comment-16676671 ]

Tommy Stendahl commented on CASSANDRA-14841:
--------------------------------------------

I think there is another scenario we have to consider: if we have several DCs, it's possible to do repair within one DC, and we would only need the nodes within that DC to be on the same version; a different version in another DC should not be a problem. It's also possible to create the new columns manually before starting the upgrade, in which case both reads and writes will work. So simply blocking the writes while there are mixed versions might not be so good: it's only in the DC I'm currently upgrading that I can't run repairs, yet I would not be able to monitor the repair status in DCs with the new version. I think we should either make it possible to override the blocking with some configuration, so you can decide to write repair history despite having mixed versions, or remove the blocking of writes and document in NEWS.txt how to create the new columns before starting to upgrade.

> Don't write to system_distributed.repair_history, system_traces.sessions, system_traces.events in mixed version 3.X/4.0 clusters
> --------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-14841
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14841
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Tommy Stendahl
>            Assignee: Ariel Weisberg
>            Priority: Major
>             Fix For: 4.0
>
> When upgrading from 3.x to 4.0 I get exceptions in the old nodes once the first 4.0 node starts up. I have tested to upgrade from both 3.0.15 and 3.11.3 and get the same problem.
>
> {noformat}
> 2018-10-22T11:12:05.060+0200 ERROR [MessagingService-Incoming-/10.216.193.244] CassandraDaemon.java:228 Exception in thread Thread[MessagingService-Incoming-/10.216.193.244,5,main]
> java.lang.RuntimeException: Unknown column coordinator_port during deserialization
> at org.apache.cassandra.db.Columns$Serializer.deserialize(Columns.java:452) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.filter.ColumnFilter$Serializer.deserialize(ColumnFilter.java:482) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:760) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.ReadCommand$Serializer.deserialize(ReadCommand.java:697) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.io.ForwardingVersionedSerializer.deserialize(ForwardingVersionedSerializer.java:50) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:123) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:192) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:180) ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:94) ~[apache-cassandra-3.11.3.jar:3.11.3]
> {noformat}
> I think it was introduced by CASSANDRA-7544.
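For reference, the manual workaround discussed in this thread — pre-creating the 4.0 columns on the 3.x schema so that old nodes can deserialize messages referencing them — would look roughly like the following DDL. These column names and types are inferred from the exception above and the CASSANDRA-7544 port-column changes; they are assumptions and should be verified against the actual 4.0 schema before use:

```sql
-- Hypothetical pre-upgrade schema additions (verify against the 4.0 schema)
ALTER TABLE system_traces.sessions ADD coordinator_port int;
ALTER TABLE system_traces.events ADD source_port int;
ALTER TABLE system_distributed.repair_history ADD coordinator_port int;
```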