[jira] [Commented] (CASSANDRA-14801) calculatePendingRanges no longer safe for multiple adjacent range movements

2020-04-03 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074336#comment-17074336
 ] 

Michael Semb Wever commented on CASSANDRA-14801:


||branch||circleci||jenkins||
|[trunk_14801|https://github.com/apache/cassandra/compare/trunk...Ge:14801-4.0]|[circleci|https://circleci.com/gh/Ge/workflows/cassandra/tree/14801-4.0]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/13/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/13]|

> calculatePendingRanges no longer safe for multiple adjacent range movements
> ---
>
> Key: CASSANDRA-14801
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14801
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Coordination, Legacy/Distributed Metadata
>Reporter: Benedict Elliott Smith
>Assignee: Aleksandr Sorokoumov
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Correctness depended upon the narrowing to a {{Set}}, 
> which we no longer do - we maintain a collection of all {{Replica}}.  Our 
> {{RangesAtEndpoint}} collection built by {{getPendingRanges}} can as a result 
> contain the same endpoint multiple times; and our {{EndpointsForToken}} 
> obtained by {{TokenMetadata.pendingEndpointsFor}} may fail to be constructed, 
> resulting in cluster-wide failures for writes to the affected token ranges 
> for the duration of the range movement.
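The duplicate-endpoint hazard described above can be sketched with stand-in types — a hypothetical {{Replica}} record and plain strings for endpoints, not Cassandra's real classes. Once pending ranges keep every {{Replica}} rather than narrowing to a {{Set}} of endpoints, two adjacent range movements can yield the same endpoint twice, and a collection that enforces endpoint uniqueness (as {{EndpointsForToken}} does) can no longer be built:

```java
import java.util.*;

// Illustrative sketch only: stand-in types, not Cassandra's real classes.
public class PendingRangesSketch
{
    public record Replica(String endpoint, String range) {}

    // Old behaviour: narrowing to a Set silently de-duplicates endpoints.
    public static Set<String> narrowToEndpointSet(List<Replica> replicas)
    {
        Set<String> endpoints = new HashSet<>();
        for (Replica r : replicas)
            endpoints.add(r.endpoint());
        return endpoints;
    }

    // New behaviour: every Replica is kept, so endpoint uniqueness must be
    // enforced explicitly -- and fails when adjacent movements overlap.
    public static List<Replica> buildEndpointsForToken(List<Replica> replicas)
    {
        Set<String> seen = new HashSet<>();
        for (Replica r : replicas)
            if (!seen.add(r.endpoint()))
                throw new IllegalStateException("duplicate endpoint: " + r.endpoint());
        return replicas;
    }

    public static void main(String[] args)
    {
        // Two adjacent range movements can both mark 10.0.0.1 as pending.
        List<Replica> pending = List.of(new Replica("10.0.0.1", "(a,b]"),
                                        new Replica("10.0.0.1", "(b,c]"));
        if (narrowToEndpointSet(pending).size() != 1)
            throw new AssertionError("Set narrowing should de-duplicate");
        boolean rejected = false;
        try { buildEndpointsForToken(pending); }
        catch (IllegalStateException e) { rejected = true; }
        if (!rejected)
            throw new AssertionError("duplicate endpoints should be rejected");
        System.out.println("ok");
    }
}
```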



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15671) Testcase: testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest): FAILED

2020-04-03 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074356#comment-17074356
 ] 

Ekaterina Dimitrova commented on CASSANDRA-15671:
-

LGTM, but I want to take a look at the CI tomorrow.


> Testcase: 
> testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest):
>FAILED
> --
>
> Key: CASSANDRA-15671
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15671
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Francisco Fernandez
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The following test failure was observed:
> [junit-timeout] Testcase: 
> testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest):
>FAILED
> [junit-timeout] expected:<4> but was:<5>
> [junit-timeout] junit.framework.AssertionFailedError: expected:<4> but was:<5>
> [junit-timeout]   at 
> org.apache.cassandra.db.compaction.CancelCompactionsTest.testSubrangeCompaction(CancelCompactionsTest.java:190)
> Java 8






[jira] [Comment Edited] (CASSANDRA-15657) Improve zero-copy-streaming containment check by using file sections

2020-04-03 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074303#comment-17074303
 ] 

ZhaoYang edited comment on CASSANDRA-15657 at 4/3/20, 8:25 AM:
---

bq. However after consulting Marcus Eriksson and Aleksey Yeschenko, I 
determined that the only way to correctly implement the containment check is 
to enumerate all tokens

Can you elaborate on which cases? Thanks.


was (Author: jasonstack):
bq. However after consulting Marcus Eriksson and Aleksey Yeschenko, I 
determined that the only way to correctly to implement the containment check is 
to enumerate all tokens

can you elaborate what are cases? thanks

> Improve zero-copy-streaming containment check by using file sections
> 
>
> Key: CASSANDRA-15657
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15657
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Streaming and Messaging
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0
>
>
> Currently zero copy streaming is only enabled for leveled-compaction strategy 
> and it checks if all keys in the sstables are included in the transferred 
> ranges.
> This is very inefficient. The containment check can be improved by checking 
> if the transferred sections (the transferred file positions) cover the entire 
> sstable.
> I also enabled ZCS for all compaction strategies, since the new containment 
> check is very fast.
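The section-based check described above amounts to a simple interval-coverage test. A minimal sketch, with a hypothetical {{Section}} type and plain longs for file positions (Cassandra's real transfer metadata is richer):

```java
import java.util.*;

// Illustrative sketch: does a set of transferred file sections cover the
// whole sstable file, i.e. [0, fileLength)?
public class SectionCoverageSketch
{
    public record Section(long start, long end) {} // [start, end) positions

    // Sort sections by start and sweep: any gap means the sstable is not
    // fully covered -- a cheap stand-in for enumerating every key.
    public static boolean coversWholeFile(List<Section> sections, long fileLength)
    {
        List<Section> sorted = new ArrayList<>(sections);
        sorted.sort(Comparator.comparingLong(Section::start));
        long covered = 0;
        for (Section s : sorted)
        {
            if (s.start() > covered)
                return false; // gap before this section
            covered = Math.max(covered, s.end());
        }
        return covered >= fileLength;
    }

    public static void main(String[] args)
    {
        List<Section> full = List.of(new Section(0, 100), new Section(100, 250));
        List<Section> gap  = List.of(new Section(0, 100), new Section(150, 250));
        if (!coversWholeFile(full, 250)) throw new AssertionError();
        if (coversWholeFile(gap, 250)) throw new AssertionError();
        System.out.println("ok");
    }
}
```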






[jira] [Commented] (CASSANDRA-15406) Add command to show the progress of data streaming and index build

2020-04-03 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074400#comment-17074400
 ] 

Stefan Miklosovic commented on CASSANDRA-15406:
---

[~djoshi] yes, I can do that, but without fixing the underlying issue that 
occurs when an entire sstable is streamed, it does not make sense to continue 
with computing and rendering the progress percentage: the computed figure 
would be meaningless and we would have to add further workarounds, which is 
imho not desirable.

 

Hence, the course of action will be: 1) create a new ticket, as you suggested, 
describing the problem of figures not being updated properly when an sstable 
is streamed in its entirety; 2) temporarily abandon this ticket, mark it as 
dependent on the newly created one, and return to it once the former is 
resolved.

> Add command to show the progress of data streaming and index build 
> ---
>
> Key: CASSANDRA-15406
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15406
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Tool/nodetool
>Reporter: maxwellguo
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0, 4.x
>
>
> We should supply a command to show the progress of streaming when we perform 
> a bootstrap/move/decommission/removenode operation. While data is streaming, 
> nobody knows which step the program is at, so I think a command to show the 
> joining/leaving node's progress is needed.






[jira] [Commented] (CASSANDRA-15406) Add command to show the progress of data streaming and index build

2020-04-03 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074404#comment-17074404
 ] 

Stefan Miklosovic commented on CASSANDRA-15406:
---

For reference, this is the branch where it is fixed, with a test checking that 
the figures match:

 

[https://github.com/smiklosovic/cassandra/commit/e98338082a73e5f360c6d9eb234710810eba5cea]

> Add command to show the progress of data streaming and index build 
> ---
>
> Key: CASSANDRA-15406
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15406
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Tool/nodetool
>Reporter: maxwellguo
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0, 4.x
>
>
> We should supply a command to show the progress of streaming when we perform 
> a bootstrap/move/decommission/removenode operation. While data is streaming, 
> nobody knows which step the program is at, so I think a command to show the 
> joining/leaving node's progress is needed.






[jira] [Commented] (CASSANDRA-15666) Race condition when completing stream sessions

2020-04-03 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074405#comment-17074405
 ] 

ZhaoYang commented on CASSANDRA-15666:
--

bq. Regarding the changes to the CompleteMessage exchange, I still think that'd 
be a win regardless if the race is fixed in a different way

Let's see what [~blerer] has to say.

> Race condition when completing stream sessions
> --
>
> Key: CASSANDRA-15666
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15666
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0
>
>
> {{StreamSession#prepareAsync()}} executes, as the name implies, 
> asynchronously from the IO thread: this opens up for race conditions between 
> the sending of the {{PrepareSynAckMessage}} and the call to 
> {{StreamSession#maybeCompleted()}}. I.e., the following could happen:
> 1) Node A sends {{PrepareSynAckMessage}} from the {{prepareAsync()}} thread.
> 2) Node B receives it and starts streaming.
> 3) Node A receives the streamed file and sends {{ReceivedMessage}}.
> 4) At this point, if this was the only file to stream, both nodes are ready 
> to close the session via {{maybeCompleted()}}, but:
> a) Node A will call it twice from both the IO thread and the thread at #1, 
> closing the session and its channels.
> b) Node B will attempt to send a {{CompleteMessage}}, but will fail because 
> the session has been closed in the meantime.
> There are other subtle variations of the pattern above, depending on the 
> order of concurrently sent/received messages.
> I believe the best fix would be to modify the message exchange so that:
> 1) Only the "follower" is allowed to send the {{CompleteMessage}}.
> 2) Only the "initiator" is allowed to close the session and its channels 
> after receiving the {{CompleteMessage}}.
> By doing so, the message exchange logic would be easier to reason about, 
> which is overall a win anyway.
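The proposed exchange can be sketched as a tiny state machine — hypothetical {{Session}} and {{Role}} types, not Cassandra's real {{StreamSession}} — in which only the follower may send {{CompleteMessage}} and only the initiator may close the session, doing so on receipt:

```java
// Illustrative sketch of the proposed CompleteMessage exchange rules.
public class CompleteExchangeSketch
{
    enum Role { INITIATOR, FOLLOWER }

    static final class Session
    {
        final Role role;
        boolean completeSent, closed;
        Session(Role role) { this.role = role; }

        // Proposed rule 1: only the follower may send CompleteMessage.
        void sendComplete()
        {
            if (role != Role.FOLLOWER)
                throw new IllegalStateException("only the follower sends CompleteMessage");
            completeSent = true;
        }

        // Proposed rule 2: only the initiator closes the session and its
        // channels, and only after receiving the CompleteMessage.
        void onCompleteReceived()
        {
            if (role != Role.INITIATOR)
                throw new IllegalStateException("only the initiator closes the session");
            closed = true;
        }
    }

    public static void main(String[] args)
    {
        Session initiator = new Session(Role.INITIATOR);
        Session follower = new Session(Role.FOLLOWER);
        follower.sendComplete();        // follower finishes its side
        initiator.onCompleteReceived(); // initiator closes on receipt
        if (!initiator.closed) throw new AssertionError();

        // The race in the ticket: the initiator must not be the one closing
        // preemptively, so sending CompleteMessage from it is rejected.
        boolean rejected = false;
        try { initiator.sendComplete(); }
        catch (IllegalStateException e) { rejected = true; }
        if (!rejected) throw new AssertionError();
        System.out.println("ok");
    }
}
```

With a single allowed sender and a single allowed closer, there is no ordering in which one side closes channels while the other still has a {{CompleteMessage}} in flight.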






[jira] [Commented] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project

2020-04-03 Thread Benjamin Lerer (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074415#comment-17074415
 ] 

Benjamin Lerer commented on CASSANDRA-15684:


The patches look fine to me.

> CASSANDRA-15650 was merged after dtest refactor and modified classes no 
> longer in the project
> -
>
> Key: CASSANDRA-15684
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15684
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 was based off commits before CASSANDRA-15539, which removed 
> some of the files modified in CASSANDRA-15650.  The tests were passing 
> pre-merge, but against earlier commits.  On commit they started failing, 
> since the dtest API no longer matched, producing the following exception:
> {code}
> [junit-timeout] 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] java.lang.NoSuchMethodError: 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] at 
> org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout] at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [junit-timeout] at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout] at java.lang.Thread.run(Thread.java:748)
> {code}
> Root cause was that 4 files existed which should have been deleted in 
> CASSANDRA-15539.  Since they were not, when CASSANDRA-15650 modified one it 
> didn't cause a merge conflict, but when the tests run they conflict and fail.
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684]






[jira] [Commented] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project

2020-04-03 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074468#comment-17074468
 ] 

Michael Semb Wever commented on CASSANDRA-15684:


For just the Cassandra patch…
||branch||circleci||jenkins||
|[trunk_15684|https://github.com/apache/cassandra/compare/trunk...dcapwell:bug/CASSANDRA-15684]|[circleci|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684]|[!https://ci-cassandra.apache.org/job/Cassandra-devbranch/14/badge/icon!|https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/14]|

> CASSANDRA-15650 was merged after dtest refactor and modified classes no 
> longer in the project
> -
>
> Key: CASSANDRA-15684
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15684
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 was based off commits before CASSANDRA-15539, which removed 
> some of the files modified in CASSANDRA-15650.  The tests were passing 
> pre-merge, but against earlier commits.  On commit they started failing, 
> since the dtest API no longer matched, producing the following exception:
> {code}
> [junit-timeout] 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] java.lang.NoSuchMethodError: 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] at 
> org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout] at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [junit-timeout] at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout] at java.lang.Thread.run(Thread.java:748)
> {code}
> Root cause was that 4 files existed which should have been deleted in 
> CASSANDRA-15539.  Since they were not, when CASSANDRA-15650 modified one it 
> didn't cause a merge conflict, but when the tests run they conflict and fail.
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684]






[cassandra] 03/04: Expose repair streaming metrics

2020-04-03 Thread marcuse
This is an automated email from the ASF dual-hosted git repository.

marcuse pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit caa3bd83dcb92a3145b7ef0fd73bbd3708b255bf
Author: Sankalp Kohli 
AuthorDate: Mon Mar 23 10:59:27 2020 +0100

Expose repair streaming metrics

Patch by Sankalp Kohli; reviewed by Ekaterina Dimitrova for CASSANDRA-15656
---
 CHANGES.txt |  1 +
 src/java/org/apache/cassandra/metrics/StreamingMetrics.java |  2 ++
 src/java/org/apache/cassandra/streaming/StreamSession.java  | 13 +
 3 files changed, 16 insertions(+)

diff --git a/CHANGES.txt b/CHANGES.txt
index 77d69ca..55243b8 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0-alpha4
+ * Expose repair streaming metrics (CASSANDRA-15656)
  * Set now in seconds in the future for validation repairs (CASSANDRA-15655)
  * Emit metric on preview repair failure (CASSANDRA-15654)
  * Use more appropriate logging levels (CASSANDRA-15661)
diff --git a/src/java/org/apache/cassandra/metrics/StreamingMetrics.java 
b/src/java/org/apache/cassandra/metrics/StreamingMetrics.java
index 793a8c0..80a5e13 100644
--- a/src/java/org/apache/cassandra/metrics/StreamingMetrics.java
+++ b/src/java/org/apache/cassandra/metrics/StreamingMetrics.java
@@ -39,6 +39,8 @@ public class StreamingMetrics
 public static final Counter activeStreamsOutbound = 
Metrics.counter(DefaultNameFactory.createMetricName(TYPE_NAME, 
"ActiveOutboundStreams", null));
 public static final Counter totalIncomingBytes = 
Metrics.counter(DefaultNameFactory.createMetricName(TYPE_NAME, 
"TotalIncomingBytes", null));
 public static final Counter totalOutgoingBytes = 
Metrics.counter(DefaultNameFactory.createMetricName(TYPE_NAME, 
"TotalOutgoingBytes", null));
+public static final Counter totalOutgoingRepairBytes = 
Metrics.counter(DefaultNameFactory.createMetricName(TYPE_NAME, 
"TotalOutgoingRepairBytes", null));
+public static final Counter totalOutgoingRepairSSTables = 
Metrics.counter(DefaultNameFactory.createMetricName(TYPE_NAME, 
"TotalOutgoingRepairSSTables", null));
 public final Counter incomingBytes;
 public final Counter outgoingBytes;
 
diff --git a/src/java/org/apache/cassandra/streaming/StreamSession.java 
b/src/java/org/apache/cassandra/streaming/StreamSession.java
index 95d3755..05bb5ff 100644
--- a/src/java/org/apache/cassandra/streaming/StreamSession.java
+++ b/src/java/org/apache/cassandra/streaming/StreamSession.java
@@ -510,8 +510,21 @@ public class StreamSession implements 
IEndpointStateChangeSubscriber
 state(State.PREPARING);
 PrepareSynMessage prepare = new PrepareSynMessage();
 prepare.requests.addAll(requests);
+long totalBytesToStream = 0;
+long totalSSTablesStreamed = 0;
 for (StreamTransferTask task : transfers.values())
+{
+totalBytesToStream += task.getTotalSize();
+totalSSTablesStreamed += task.getTotalNumberOfFiles();
 prepare.summaries.add(task.getSummary());
+}
+
+if(StreamOperation.REPAIR == getStreamOperation())
+{
+StreamingMetrics.totalOutgoingRepairBytes.inc(totalBytesToStream);
+
StreamingMetrics.totalOutgoingRepairSSTables.inc(totalSSTablesStreamed);
+}
+
 messageSender.sendMessage(prepare);
 }
 





[cassandra] 02/04: Set now in seconds in the future for validation repairs

2020-04-03 Thread marcuse
This is an automated email from the ASF dual-hosted git repository.

marcuse pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 6a302f734a1dede4b4bdee5e5f67c5ada6f0115d
Author: Blake Eggleston 
AuthorDate: Tue Mar 17 16:47:50 2020 +0100

Set now in seconds in the future for validation repairs

Patch by Blake Eggleston; reviewed by Ekaterina Dimitrova for 
CASSANDRA-15655
---
 CHANGES.txt   |  1 +
 src/java/org/apache/cassandra/config/Config.java  |  7 +++
 .../apache/cassandra/config/DatabaseDescriptor.java   |  7 +++
 src/java/org/apache/cassandra/repair/RepairJob.java   | 19 ---
 4 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 9896272..77d69ca 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0-alpha4
+ * Set now in seconds in the future for validation repairs (CASSANDRA-15655)
  * Emit metric on preview repair failure (CASSANDRA-15654)
  * Use more appropriate logging levels (CASSANDRA-15661)
  * Added production recommendations and improved compaction doc organization
diff --git a/src/java/org/apache/cassandra/config/Config.java 
b/src/java/org/apache/cassandra/config/Config.java
index 5a24410..3fc314f 100644
--- a/src/java/org/apache/cassandra/config/Config.java
+++ b/src/java/org/apache/cassandra/config/Config.java
@@ -471,6 +471,13 @@ public class Config
 public volatile boolean report_unconfirmed_repaired_data_mismatches = 
false;
 
 /**
+ * number of seconds to set nowInSec into the future when performing 
validation previews against repaired data
+ * this (attempts) to prevent a race where validations on different 
machines are started on different sides of
+ * a tombstone being compacted away
+ */
+public volatile int validation_preview_purge_head_start_in_sec = 60 * 60;
+
+/**
  * @deprecated migrate to {@link DatabaseDescriptor#isClientInitialized()}
  */
 @Deprecated
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 14db023..7af310e 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -3061,4 +3061,11 @@ public class DatabaseDescriptor
 throw new ConfigurationException(String.format("%s must be 
positive value < %d, but was %d",
name, 
unit.overflowThreshold(), val), false);
 }
+
+public static int getValidationPreviewPurgeHeadStartInSec()
+{
+int seconds = conf.validation_preview_purge_head_start_in_sec;
+return Math.max(seconds, 0);
+}
+
 }
diff --git a/src/java/org/apache/cassandra/repair/RepairJob.java 
b/src/java/org/apache/cassandra/repair/RepairJob.java
index 3740070..e609f0d 100644
--- a/src/java/org/apache/cassandra/repair/RepairJob.java
+++ b/src/java/org/apache/cassandra/repair/RepairJob.java
@@ -71,6 +71,19 @@ public class RepairJob extends AbstractFuture 
implements Runnable
 this.parallelismDegree = session.parallelismDegree;
 }
 
+public int getNowInSeconds()
+{
+int nowInSeconds = FBUtilities.nowInSeconds();
+if (session.previewKind == PreviewKind.REPAIRED)
+{
+return nowInSeconds + 
DatabaseDescriptor.getValidationPreviewPurgeHeadStartInSec();
+}
+else
+{
+return nowInSeconds;
+}
+}
+
 /**
  * Runs repair job.
  *
@@ -345,7 +358,7 @@ public class RepairJob extends AbstractFuture 
implements Runnable
 String message = String.format("Requesting merkle trees for %s (to 
%s)", desc.columnFamily, endpoints);
 logger.info("{} {}", session.previewKind.logPrefix(desc.sessionId), 
message);
 Tracing.traceRepair(message);
-int nowInSec = FBUtilities.nowInSeconds();
+int nowInSec = getNowInSeconds();
 List> tasks = new 
ArrayList<>(endpoints.size());
 for (InetAddressAndPort endpoint : endpoints)
 {
@@ -365,7 +378,7 @@ public class RepairJob extends AbstractFuture 
implements Runnable
 String message = String.format("Requesting merkle trees for %s (to 
%s)", desc.columnFamily, endpoints);
 logger.info("{} {}", session.previewKind.logPrefix(desc.sessionId), 
message);
 Tracing.traceRepair(message);
-int nowInSec = FBUtilities.nowInSeconds();
+int nowInSec = getNowInSeconds();
 List> tasks = new 
ArrayList<>(endpoints.size());
 
 Queue requests = new LinkedList<>(endpoints);
@@ -407,7 +420,7 @@ public class RepairJob extends AbstractFuture 
implements Runnable
 String message = String.format("Requesting merkle trees for %s (to 
%s)", desc.columnFamily, endpoints);
 logger.info("{} {}", session.previewKind.logPrefix(desc.sessi

[cassandra] 04/04: Fix force compaction of wrapping ranges

2020-04-03 Thread marcuse
This is an automated email from the ASF dual-hosted git repository.

marcuse pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 49abedc2c30c6274339bc203e0ddd10f128dae58
Author: Jeff Jirsa 
AuthorDate: Mon Mar 23 10:16:26 2020 +0100

Fix force compaction of wrapping ranges

Patch by Jeff Jirsa; reviewed by Benjamin Lerer for CASSANDRA-15664
---
 CHANGES.txt|  1 +
 .../cassandra/db/compaction/CompactionManager.java | 18 +++-
 .../compaction/LeveledCompactionStrategyTest.java  | 52 ++
 3 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 55243b8..65111d0 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0-alpha4
+ * Fix force compaction of wrapping ranges (CASSANDRA-15664)
  * Expose repair streaming metrics (CASSANDRA-15656)
  * Set now in seconds in the future for validation repairs (CASSANDRA-15655)
  * Emit metric on preview repair failure (CASSANDRA-15654)
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java 
b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 28db027..7924a1f 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -33,6 +33,7 @@ import com.google.common.base.Preconditions;
 import com.google.common.collect.*;
 import com.google.common.util.concurrent.*;
 
+import org.apache.cassandra.dht.AbstractBounds;
 import org.apache.cassandra.locator.RangesAtEndpoint;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
@@ -929,8 +930,21 @@ public class CompactionManager implements 
CompactionManagerMBean
 
 for (Range tokenRange : tokenRangeCollection)
 {
-Iterable ssTableReaders = 
View.sstablesInBounds(tokenRange.left.minKeyBound(), 
tokenRange.right.maxKeyBound(), tree);
-Iterables.addAll(sstables, ssTableReaders);
+if (!AbstractBounds.strictlyWrapsAround(tokenRange.left, 
tokenRange.right))
+{
+Iterable ssTableReaders = 
View.sstablesInBounds(tokenRange.left.minKeyBound(), 
tokenRange.right.maxKeyBound(), tree);
+Iterables.addAll(sstables, ssTableReaders);
+}
+else
+{
+// Searching an interval tree will not return the correct 
results for a wrapping range
+// so we have to unwrap it first
+for (Range unwrappedRange : tokenRange.unwrap())
+{
+Iterable ssTableReaders = 
View.sstablesInBounds(unwrappedRange.left.minKeyBound(), 
unwrappedRange.right.maxKeyBound(), tree);
+Iterables.addAll(sstables, ssTableReaders);
+}
+}
 }
 return sstables;
 }
diff --git 
a/test/unit/org/apache/cassandra/db/compaction/LeveledCompactionStrategyTest.java
 
b/test/unit/org/apache/cassandra/db/compaction/LeveledCompactionStrategyTest.java
index b925bab..8a8ed13 100644
--- 
a/test/unit/org/apache/cassandra/db/compaction/LeveledCompactionStrategyTest.java
+++ 
b/test/unit/org/apache/cassandra/db/compaction/LeveledCompactionStrategyTest.java
@@ -460,6 +460,58 @@ public class LeveledCompactionStrategyTest
 
 // the 11 tables containing key1 should all compact to 1 table
 assertEquals(1, cfs.getLiveSSTables().size());
+// Set it up again
+cfs.truncateBlocking();
+
+// create 10 sstables that contain data for both key1 and key2
+for (int i = 0; i < numIterations; i++)
+{
+for (DecoratedKey key : keys)
+{
+UpdateBuilder update = UpdateBuilder.create(cfs.metadata(), 
key);
+for (int c = 0; c < columns; c++)
+update.newRow("column" + c).add("val", value);
+update.applyUnsafe();
+}
+cfs.forceBlockingFlush();
+}
+
+// create 20 more sstables with 10 containing data for key1 and other 
10 containing data for key2
+for (int i = 0; i < numIterations; i++)
+{
+for (DecoratedKey key : keys)
+{
+UpdateBuilder update = UpdateBuilder.create(cfs.metadata(), 
key);
+for (int c = 0; c < columns; c++)
+update.newRow("column" + c).add("val", value);
+update.applyUnsafe();
+cfs.forceBlockingFlush();
+}
+}
+
+// We should have a total of 30 sstables again
+assertEquals(30, cfs.getLiveSSTables().size());
+
+// This time, we're going to make sure the token range wraps around, 
to cover the full range
+Range wrappingRange;
+if (key1.getToken().compareTo(key2.getToken()) < 0)
+{
+wrappingRange = new Range<>(key2.getToken(), key1.getToken());
+  

[cassandra] branch trunk updated (6ae6596 -> 49abedc)

2020-04-03 Thread marcuse
This is an automated email from the ASF dual-hosted git repository.

marcuse pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from 6ae6596  Fix tests expecting exceptions wrapped in RuntimeException.
 new dcf6fe8  Emit metric on preview repair failure
 new 6a302f7  Set now in seconds in the future for validation repairs
 new caa3bd8  Expose repair streaming metrics
 new 49abedc  Fix force compaction of wrapping ranges

The 4 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt|  4 ++
 src/java/org/apache/cassandra/config/Config.java   |  7 +++
 .../cassandra/config/DatabaseDescriptor.java   |  7 +++
 .../cassandra/db/compaction/CompactionManager.java | 18 +++-
 .../{MetricNameFactory.java => RepairMetrics.java} | 20 +
 .../apache/cassandra/metrics/StreamingMetrics.java |  2 +
 .../org/apache/cassandra/repair/RepairJob.java | 19 ++--
 .../apache/cassandra/repair/RepairRunnable.java|  2 +
 .../repair/consistent/SyncStatSummary.java | 12 +
 .../cassandra/service/ActiveRepairService.java |  2 +
 .../apache/cassandra/streaming/StreamSession.java  | 13 ++
 .../compaction/LeveledCompactionStrategyTest.java  | 52 ++
 12 files changed, 145 insertions(+), 13 deletions(-)
 copy src/java/org/apache/cassandra/metrics/{MetricNameFactory.java => 
RepairMetrics.java} (68%)





[cassandra] 01/04: Emit metric on preview repair failure

2020-04-03 Thread marcuse
This is an automated email from the ASF dual-hosted git repository.

marcuse pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit dcf6fe8162f00353dae497a6adcfec2ac88f8e0a
Author: Blake Eggleston 
AuthorDate: Tue Mar 17 15:51:36 2020 +0100

Emit metric on preview repair failure

Patch by Blake Eggleston; reviewed by Ekaterina Dimitrova for 
CASSANDRA-15654
---
 CHANGES.txt|  1 +
 .../apache/cassandra/metrics/RepairMetrics.java| 34 ++
 .../apache/cassandra/repair/RepairRunnable.java|  2 ++
 .../repair/consistent/SyncStatSummary.java | 12 
 .../cassandra/service/ActiveRepairService.java |  2 ++
 5 files changed, 51 insertions(+)

diff --git a/CHANGES.txt b/CHANGES.txt
index b71d8da..9896272 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0-alpha4
+ * Emit metric on preview repair failure (CASSANDRA-15654)
  * Use more appropriate logging levels (CASSANDRA-15661)
  * Added production recommendations and improved compaction doc organization
  * Document usage of EC2Snitch with intra-region VPC peering (CASSANDRA-15337)
diff --git a/src/java/org/apache/cassandra/metrics/RepairMetrics.java b/src/java/org/apache/cassandra/metrics/RepairMetrics.java
new file mode 100644
index 000..5b4f67e
--- /dev/null
+++ b/src/java/org/apache/cassandra/metrics/RepairMetrics.java
@@ -0,0 +1,34 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.cassandra.metrics;
+
+import com.codahale.metrics.Counter;
+
+import static org.apache.cassandra.metrics.CassandraMetricsRegistry.Metrics;
+
+public class RepairMetrics
+{
+    public static final String TYPE_NAME = "Repair";
+    public static final Counter previewFailures = Metrics.counter(DefaultNameFactory.createMetricName(TYPE_NAME, "PreviewFailures", null));
+
+    public static void init()
+    {
+        // noop
+    }
+}
diff --git a/src/java/org/apache/cassandra/repair/RepairRunnable.java b/src/java/org/apache/cassandra/repair/RepairRunnable.java
index c673a6c..0ac34a3 100644
--- a/src/java/org/apache/cassandra/repair/RepairRunnable.java
+++ b/src/java/org/apache/cassandra/repair/RepairRunnable.java
@@ -53,6 +53,7 @@ import com.codahale.metrics.Timer;
 import org.apache.cassandra.concurrent.JMXEnabledThreadPoolExecutor;
 import org.apache.cassandra.concurrent.NamedThreadFactory;
 import org.apache.cassandra.config.DatabaseDescriptor;
+import org.apache.cassandra.metrics.RepairMetrics;
 import org.apache.cassandra.cql3.QueryOptions;
 import org.apache.cassandra.cql3.QueryProcessor;
 import org.apache.cassandra.cql3.UntypedResultSet;
@@ -527,6 +528,7 @@ public class RepairRunnable implements Runnable, ProgressEventNotifier
         else
         {
             message = (previewKind == PreviewKind.REPAIRED ? "Repaired data is inconsistent\n" : "Preview complete\n") + summary.toString();
+            RepairMetrics.previewFailures.inc();
         }
         notification(message);
 
diff --git a/src/java/org/apache/cassandra/repair/consistent/SyncStatSummary.java b/src/java/org/apache/cassandra/repair/consistent/SyncStatSummary.java
index 156fde7..f8e1bfb 100644
--- a/src/java/org/apache/cassandra/repair/consistent/SyncStatSummary.java
+++ b/src/java/org/apache/cassandra/repair/consistent/SyncStatSummary.java
@@ -30,6 +30,8 @@ import org.apache.cassandra.locator.InetAddressAndPort;
 import org.apache.cassandra.repair.RepairResult;
 import org.apache.cassandra.repair.RepairSessionResult;
 import org.apache.cassandra.repair.SyncStat;
+import org.apache.cassandra.schema.Schema;
+import org.apache.cassandra.schema.TableMetadata;
 import org.apache.cassandra.streaming.SessionSummary;
 import org.apache.cassandra.streaming.StreamSummary;
 import org.apache.cassandra.utils.FBUtilities;
@@ -130,6 +132,12 @@ public class SyncStatSummary
 totalsCalculated = true;
 }
 
+    boolean isCounter()
+    {
+        TableMetadata tmd = Schema.instance.getTableMetadata(keyspace, table);
+        return tmd != null && tmd.isCounter();
+

[jira] [Updated] (CASSANDRA-15664) Handle wrapping ranges in token range compaction

2020-04-03 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-15664:

  Since Version: 4.0-alpha
Source Control Link: 
https://github.com/apache/cassandra/commit/49abedc2c30c6274339bc203e0ddd10f128dae58
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

and committed, thanks!

> Handle wrapping ranges in token range compaction
> 
>
> Key: CASSANDRA-15664
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15664
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> We currently don't handle wrapping ranges in token range compaction



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Updated] (CASSANDRA-15654) Track preview repair failures

2020-04-03 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-15654:

Source Control Link: 
https://github.com/apache/cassandra/commit/dcf6fe8162f00353dae497a6adcfec2ac88f8e0a
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

committed, thanks!

> Track preview repair failures
> -
>
> Key: CASSANDRA-15654
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15654
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Repair, Observability/Metrics
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Low
> Fix For: 4.0-alpha
>
> Attachments: Screenshot 2020-04-02 at 09.10.06.png, Screenshot 
> 2020-04-02 at 09.10.34.png
>
>
> We should expose a metric for when preview repair fails






[jira] [Updated] (CASSANDRA-15655) Optionally set nowInSec in the future for validation repairs

2020-04-03 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-15655:

Source Control Link: 
https://github.com/apache/cassandra/commit/6a302f734a1dede4b4bdee5e5f67c5ada6f0115d
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

committed, thanks!

> Optionally set nowInSec in the future for validation repairs
> 
>
> Key: CASSANDRA-15655
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15655
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Repair
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 4.0-alpha
>
> Attachments: Screenshot 2020-04-02 at 09.07.41.png, Screenshot 
> 2020-04-02 at 09.07.54.png, Screenshot 2020-04-02 at 09.08.27.png
>
>
> There is a race when running validation repairs where we might build merkle 
> trees on one node, have gcgs expire, tombstones get compacted out and then 
> build merkle trees on other nodes, causing false positive preview repair 
> mismatch






[jira] [Updated] (CASSANDRA-15656) Expose repair streaming metric

2020-04-03 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-15656:

Source Control Link: 
https://github.com/apache/cassandra/commit/caa3bd83dcb92a3145b7ef0fd73bbd3708b255bf
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

committed, thanks!

> Expose repair streaming metric
> --
>
> Key: CASSANDRA-15656
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15656
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Repair, Consistency/Streaming
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 4.0-alpha
>
> Attachments: Screenshot 2020-04-02 at 09.04.41.png, Screenshot 
> 2020-04-02 at 09.05.03.png, Screenshot 2020-04-02 at 09.05.19.png
>
>
> We should expose a metric for how much data is streamed during repair






[jira] [Commented] (CASSANDRA-12701) Repair history tables should have TTL and TWCS

2020-04-03 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074537#comment-17074537
 ] 

Marcus Eriksson commented on CASSANDRA-12701:
-

test failures are handled in CASSANDRA-15684 - will rebase and rerun tests once 
that has been committed

> Repair history tables should have TTL and TWCS
> --
>
> Key: CASSANDRA-12701
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12701
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core
>Reporter: Chris Lohfink
>Assignee: Marcus Eriksson
>Priority: Normal
>  Labels: lhf
> Attachments: CASSANDRA-12701.txt
>
>
> Some tools schedule a lot of small subrange repairs which can lead to a lot 
> of repairs constantly being run. These partitions can grow pretty big in 
> theory. I dont think much reads from them which might help but its still 
> kinda wasted disk space. I think a month TTL (longer than gc grace) and maybe 
> a 1 day twcs window makes sense to me.
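
The suggestion above translates directly into CQL. A minimal sketch, assuming the existing {{system_distributed.repair_history}} table; the 30-day TTL (2592000 seconds) and the 1-day window are the values proposed in this ticket, not committed defaults:

```sql
-- Illustrative only: values are the ticket's proposal, not shipped defaults.
ALTER TABLE system_distributed.repair_history
  WITH default_time_to_live = 2592000   -- 30 days, longer than default gc_grace_seconds
  AND compaction = {'class': 'TimeWindowCompactionStrategy',
                    'compaction_window_unit': 'DAYS',
                    'compaction_window_size': 1};
```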






[jira] [Updated] (CASSANDRA-15672) Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec

2020-04-03 Thread Stefania Alborghetti (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania Alborghetti updated CASSANDRA-15672:
-
Status: Review In Progress  (was: Changes Suggested)

>  Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest 
> Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec
> -
>
> Key: CASSANDRA-15672
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15672
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Aleksandr Sorokoumov
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The following failure was observed:
>  Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest 
> Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec
> [junit-timeout] 
> [junit-timeout] Testcase: 
> testMockedMessagingPrepareFailureP1(org.apache.cassandra.repair.consistent.CoordinatorMessagingTest):
>FAILED
> [junit-timeout] null
> [junit-timeout] junit.framework.AssertionFailedError
> [junit-timeout]   at 
> org.apache.cassandra.repair.consistent.CoordinatorMessagingTest.testMockedMessagingPrepareFailure(CoordinatorMessagingTest.java:206)
> [junit-timeout]   at 
> org.apache.cassandra.repair.consistent.CoordinatorMessagingTest.testMockedMessagingPrepareFailureP1(CoordinatorMessagingTest.java:154)
> [junit-timeout] 
> [junit-timeout] 
> Seen on Java8






[jira] [Assigned] (CASSANDRA-15667) StreamResultFuture check for completeness is inconsistent, leading to races

2020-04-03 Thread Massimiliano Tomassi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Massimiliano Tomassi reassigned CASSANDRA-15667:


Assignee: Massimiliano Tomassi  (was: Benjamin Lerer)

> StreamResultFuture check for completeness is inconsistent, leading to races
> ---
>
> Key: CASSANDRA-15667
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15667
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: Massimiliano Tomassi
>Priority: Normal
> Fix For: 4.0
>
>
> {{StreamResultFuture#maybeComplete()}} uses 
> {{StreamCoordinator#hasActiveSessions()}} to determine if all sessions are 
> completed, but then accesses each session state via 
> {{StreamCoordinator#getAllSessionInfo()}}: this is inconsistent, as the 
> former relies on the actual {{StreamSession}} state, while the latter on the 
> {{SessionInfo}} state, and the two are concurrently updated with no 
> coordination whatsoever.
> This leads to races, i.e. apparent in some dtest spurious failures, such as 
> {{TestBootstrap.resumable_bootstrap_test}} in CASSANDRA-15614 cc 
> [~e.dimitrova].
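
The window described above can be made concrete with a small sketch. This is not Cassandra's code: the class and method names below are hypothetical, and the interleaving is forced by hand rather than by real threads, purely to show how judging completion from one view while reading state from a second, uncoordinated view admits an inconsistent observation.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration (not Cassandra's classes): completion is judged
// from one view (activeSessions) while per-session state is read from a
// second view (sessionInfos). The two are updated as separate steps with no
// coordination, so a checker running between the steps sees "no active
// sessions" while the info view still says RUNNING.
class TwoViewRace
{
    final AtomicInteger activeSessions = new AtomicInteger(1);      // analogue of StreamSession state
    final List<String> sessionInfos =
        new CopyOnWriteArrayList<>(List.of("RUNNING"));             // analogue of SessionInfo state

    void finishStepOne() { activeSessions.decrementAndGet(); }      // first view updated...
    void finishStepTwo() { sessionInfos.set(0, "COMPLETE"); }       // ...second view updated later

    // Analogue of maybeComplete(): check view one, then read view two.
    boolean seesInconsistentState()
    {
        return activeSessions.get() == 0 && sessionInfos.get(0).equals("RUNNING");
    }
}
```

Running the checker between the two update steps reproduces the spurious "complete but still running" observation deterministically.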






[jira] [Commented] (CASSANDRA-15582) 4.0 quality testing: metrics

2020-04-03 Thread Stephen Mallette (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074619#comment-17074619
 ] 

Stephen Mallette commented on CASSANDRA-15582:
--

Further up in the comments there was mention of a desire to:

> 1) diff metrics names (did we remove something? Change name is a remove)
> 2) diff metric signature (Used to return long but now string, used to take no 
> arguments but now takes string, etc)

I've experimented a bit with a Groovy script (because it has nice syntax for 
interacting with JMX) to connect to Cassandra's JMX port and pull all the MBeans. 
I could extend this script to do some automated analysis covering items 1 
and 2 above, as mentioned in an earlier comment: 

> we should have some automated way of pulling all the metrics out so we can 
> detect changes in the list of published metrics in an automated way. Since we 
> publish all metrics via JMX, I think it is doable. 

Analysis of that fashion through my experimental script is more of a one-time 
test, however, and I wasn't sure whether the intent behind these comments was to 
more fully automate this sort of testing going forward, to help avoid breaking 
metric changes as part of the standard, ongoing testing process. 
I'd be curious what the direction is in this area, and I'm happy to help.
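
The "diff metric names" idea (item 1 above) reduces to a set difference over two snapshots. A minimal sketch, with hypothetical sample names: in a real run both sets would be populated over JMX ({{MBeanServerConnection.queryNames}} against the node's JMX port) rather than from literals.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of item 1: diff two snapshots of metric names.
// A renamed metric shows up as one removal plus one addition, so reporting
// removals from the old snapshot is enough to flag breaking changes.
public class MetricDiff
{
    static Set<String> removed(Set<String> before, Set<String> after)
    {
        Set<String> gone = new TreeSet<>(before); // sorted for stable output
        gone.removeAll(after);
        return gone;
    }

    public static void main(String[] args)
    {
        // Sample ObjectName-style strings for illustration only.
        Set<String> oldNames = Set.of(
            "org.apache.cassandra.metrics:type=Repair,name=PreviewFailures",
            "org.apache.cassandra.metrics:type=Example,name=Removed");
        Set<String> newNames = Set.of(
            "org.apache.cassandra.metrics:type=Repair,name=PreviewFailures");
        System.out.println(removed(oldNames, newNames));
    }
}
```

Signature changes (item 2) would need the MBeanInfo attribute types as well, but the same diff-of-snapshots shape applies.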


> 4.0 quality testing: metrics
> 
>
> Key: CASSANDRA-15582
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15582
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest
>Reporter: Josh McKenzie
>Assignee: Romain Hardouin
>Priority: Normal
> Fix For: 4.0-rc
>
>
> In past releases we've unknowingly broken metrics integrations and introduced 
> performance regressions in metrics collection and reporting. We strive in 4.0 
> to not do that. Metrics should work well!






[jira] [Updated] (CASSANDRA-15667) StreamResultFuture check for completeness is inconsistent, leading to races

2020-04-03 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-15667:
---
 Bug Category: Parent values: Correctness(12982)
   Complexity: Normal
Discovered By: Adhoc Test
 Severity: Normal
   Status: Open  (was: Triage Needed)

> StreamResultFuture check for completeness is inconsistent, leading to races
> ---
>
> Key: CASSANDRA-15667
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15667
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: Massimiliano Tomassi
>Priority: Normal
> Fix For: 4.0
>
>
> {{StreamResultFuture#maybeComplete()}} uses 
> {{StreamCoordinator#hasActiveSessions()}} to determine if all sessions are 
> completed, but then accesses each session state via 
> {{StreamCoordinator#getAllSessionInfo()}}: this is inconsistent, as the 
> former relies on the actual {{StreamSession}} state, while the latter on the 
> {{SessionInfo}} state, and the two are concurrently updated with no 
> coordination whatsoever.
> This leads to races, i.e. apparent in some dtest spurious failures, such as 
> {{TestBootstrap.resumable_bootstrap_test}} in CASSANDRA-15614 cc 
> [~e.dimitrova].






[jira] [Commented] (CASSANDRA-15685) flaky testWithMismatchingPending - org.apache.cassandra.distributed.test.PreviewRepairTest

2020-04-03 Thread Kevin Gallardo (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074648#comment-17074648
 ] 

Kevin Gallardo commented on CASSANDRA-15685:


Agreed

> flaky testWithMismatchingPending - 
> org.apache.cassandra.distributed.test.PreviewRepairTest
> --
>
> Key: CASSANDRA-15685
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15685
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Kevin Gallardo
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Observed in: 
> https://app.circleci.com/pipelines/github/newkek/cassandra/34/workflows/1c6b157d-13c3-48a9-85fb-9fe8c153256b/jobs/191/tests
> Failure:
> {noformat}
> testWithMismatchingPending - 
> org.apache.cassandra.distributed.test.PreviewRepairTest
> junit.framework.AssertionFailedError
>   at 
> org.apache.cassandra.distributed.test.PreviewRepairTest.testWithMismatchingPending(PreviewRepairTest.java:97)
> {noformat}






[jira] [Updated] (CASSANDRA-15685) flaky testWithMismatchingPending - org.apache.cassandra.distributed.test.PreviewRepairTest

2020-04-03 Thread Kevin Gallardo (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Gallardo updated CASSANDRA-15685:
---
Resolution: Duplicate
Status: Resolved  (was: Open)

> flaky testWithMismatchingPending - 
> org.apache.cassandra.distributed.test.PreviewRepairTest
> --
>
> Key: CASSANDRA-15685
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15685
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: Kevin Gallardo
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Observed in: 
> https://app.circleci.com/pipelines/github/newkek/cassandra/34/workflows/1c6b157d-13c3-48a9-85fb-9fe8c153256b/jobs/191/tests
> Failure:
> {noformat}
> testWithMismatchingPending - 
> org.apache.cassandra.distributed.test.PreviewRepairTest
> junit.framework.AssertionFailedError
>   at 
> org.apache.cassandra.distributed.test.PreviewRepairTest.testWithMismatchingPending(PreviewRepairTest.java:97)
> {noformat}






[jira] [Commented] (CASSANDRA-15229) BufferPool Regression

2020-04-03 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074656#comment-17074656
 ] 

Benedict Elliott Smith commented on CASSANDRA-15229:


bq. a bump-the-pointer slab approach for the transient pool, not too dissimilar 
from the current implementation. We then exploit our thread-per-core 
architecture: core threads get a dedicated slab each, other threads share a 
global slab.

The current implementation isn't really a bump-the-pointer allocator?  It's 
bitmap based, though with a very tiny bitmap.  Could you elaborate on how these 
work, as my intuition is that anything designed for a thread-per-core 
architecture probably won't translate so well to the present state of the 
world.  Though, either way, I suppose this is probably orthogonal to this 
ticket, as we only need to address the {{ChunkCache}} part.

bq. We also optimized the chunk cache to store memory addresses rather than 
byte buffers, which significantly reduced heap usage. The byte buffers are 
materialized on the fly.

This would be a huge improvement, and a welcome backport if it is easy - though 
it might (I would guess) depend on {{Unsafe}}, which may be going away soon.  
It's orthogonal to this ticket, though, I think.

bq. We changed the chunk cache to always store buffers of the same size.
bq. We have global lists of these slabs, sorted by buffer size where each size 
is a power-of-two.

How do these two statements reconcile?

Is it your opinion that your entire {{ChunkCache}} implementation can be 
dropped wholesale into 4.0?  I would assume it is still primarily 
multi-threaded.  If so, it might be preferable to trying to fix the existing 
{{ChunkCache}}.
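
For readers unfamiliar with the distinction being drawn, here is a minimal, hypothetical sketch of a bump-the-pointer slab (this is not the BufferPool's actual implementation, which as noted above is bitmap based): allocation just advances an offset, individual frees are not tracked, and the whole slab is recycled at once.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only: a bump-the-pointer slab hands out buffers by
// advancing an offset. Individual frees are never tracked; the entire slab
// is recycled in one step. A bitmap allocator, by contrast, marks each unit
// free/used individually and can reclaim holes.
final class BumpSlab
{
    private final ByteBuffer slab;
    private int offset;

    BumpSlab(int capacity)
    {
        slab = ByteBuffer.allocateDirect(capacity);
    }

    /** Returns a view of the next {@code size} bytes, or null if exhausted. */
    ByteBuffer allocate(int size)
    {
        if (offset + size > slab.capacity())
            return null; // caller falls back to a fresh slab

        ByteBuffer view = slab.duplicate();
        view.position(offset);
        view.limit(offset + size);
        offset += size;
        return view.slice();
    }

    /** Recycle everything at once; outstanding views become invalid. */
    void reset()
    {
        offset = 0;
    }
}
```

With correlated lifetimes (a slab per core thread, recycled between tasks) this is very cheap; the cost shows up exactly when lifetimes are uncorrelated, which is the scenario this ticket is about.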

> BufferPool Regression
> -
>
> Key: CASSANDRA-15229
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15229
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Caching
>Reporter: Benedict Elliott Smith
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0, 4.0-beta
>
>
> The BufferPool was never intended to be used for a {{ChunkCache}}, and we 
> need to either change our behaviour to handle uncorrelated lifetimes or use 
> something else.  This is particularly important with the default chunk size 
> for compressed sstables being reduced.  If we address the problem, we should 
> also utilise the BufferPool for native transport connections like we do for 
> internode messaging, and reduce the number of pooling solutions we employ.
> Probably the best thing to do is to improve BufferPool’s behaviour when used 
> for things with uncorrelated lifetimes, which essentially boils down to 
> tracking those chunks that have not been freed and re-circulating them when 
> we run out of completely free blocks.  We should probably also permit 
> instantiating separate {{BufferPool}}, so that we can insulate internode 
> messaging from the {{ChunkCache}}, or at least have separate memory bounds 
> for each, and only share fully-freed chunks.
> With these improvements we can also safely increase the {{BufferPool}} chunk 
> size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce 
> the amount of global coordination and per-allocation overhead.  We don’t need 
> 1KiB granularity for allocations, nor 16 byte granularity for tiny 
> allocations.






[jira] [Commented] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project

2020-04-03 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074685#comment-17074685
 ] 

David Capwell commented on CASSANDRA-15684:
---

Thanks Michael, the direct link to the jvm-dtests [is 
here|https://ci-cassandra.apache.org/job/Cassandra-devbranch-jvm-dtest/6/testReport/]

{code}
All Failed Tests

Test Name                                                                               Duration  Age
org.apache.cassandra.distributed.test.CasWriteTest.casWriteContentionTimeoutTest
  (Failed 1 times in the last 6 runs. Flakiness: 20%, Stability: 83%)                   0 ms      1
org.apache.cassandra.distributed.test.SimpleReadWriteTest.readRepairTimeoutTest
  (Failed 1 times in the last 6 runs. Flakiness: 20%, Stability: 83%)                   11 sec    1
org.apache.cassandra.distributed.test.SimpleReadWriteTest.writeWithSchemaDisagreement
  (Failed 1 times in the last 6 runs. Flakiness: 20%, Stability: 83%)                   9.8 sec   1
org.apache.cassandra.distributed.test.SimpleReadWriteTest.readWithSchemaDisagreement
  (Failed 1 times in the last 6 runs. Flakiness: 20%, Stability: 83%)                   9.7 sec   1
{code}

> CASSANDRA-15650 was merged after dtest refactor and modified classes no 
> longer in the project
> -
>
> Key: CASSANDRA-15684
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15684
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 was based off commits before CASSANDRA-15539, which removed 
> some of the files modified in CASSANDRA-15650.  The tests were passing 
> pre-merge, but off earlier commits.  On commit they started failing, since the 
> dtest APIs no longer matched, producing the following exception
> {code}
> [junit-timeout] 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] java.lang.NoSuchMethodError: 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] at 
> org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout] at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [junit-timeout] at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout] at java.lang.Thread.run(Thread.java:748)
> {code}
> Root cause was that 4 files existed which should have been deleted in 
> CASSANDRA-15539.  Since they were not, when CASSANDRA-15650 modified one it 
> didn't cause a merge conflict, but when the tests run they conflict and fail.
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684]






[jira] [Commented] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project

2020-04-03 Thread Alex Petrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074695#comment-17074695
 ] 

Alex Petrov commented on CASSANDRA-15684:
-

+1 for a Cassandra change. 

I'll have to take a closer look at the dtest patch on Monday.

> CASSANDRA-15650 was merged after dtest refactor and modified classes no 
> longer in the project
> -
>
> Key: CASSANDRA-15684
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15684
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 was based off commits before CASSANDRA-15539, which removed 
> some of the files modified in CASSANDRA-15650.  The tests were passing 
> pre-merge, but off earlier commits.  On commit they started failing, since the 
> dtest APIs no longer matched, producing the following exception
> {code}
> [junit-timeout] 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] java.lang.NoSuchMethodError: 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] at 
> org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout] at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [junit-timeout] at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout] at java.lang.Thread.run(Thread.java:748)
> {code}
> Root cause was that 4 files existed which should have been deleted in 
> CASSANDRA-15539.  Since they were not, when CASSANDRA-15650 modified one it 
> didn't cause a merge conflict, but when the tests run they conflict and fail.
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684]






[jira] [Comment Edited] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project

2020-04-03 Thread Alex Petrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074695#comment-17074695
 ] 

Alex Petrov edited comment on CASSANDRA-15684 at 4/3/20, 4:11 PM:
--

+1 for a Cassandra change. 

[~mck] if I understand the nature of the failure correctly, the main problem 
was the merge. It was unintentional, and the patch itself was good, just based 
off an earlier commit. 

I'll have to take a closer look at the dtest patch on Monday.


was (Author: ifesdjeen):
+1 for a Cassandra change. 

I'll have to take a closer look at the dtest patch on Monday.

> CASSANDRA-15650 was merged after dtest refactor and modified classes no 
> longer in the project
> -
>
> Key: CASSANDRA-15684
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15684
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 was based off commits before CASSANDRA-15539, which removed 
> some of the files modified in CASSANDRA-15650.  The tests were passing 
> pre-merge, but off earlier commits.  On commit they started failing, since the 
> dtest APIs no longer matched, producing the following exception
> {code}
> [junit-timeout] 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] java.lang.NoSuchMethodError: 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] at 
> org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout] at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [junit-timeout] at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout] at java.lang.Thread.run(Thread.java:748)
> {code}
> Root cause was that 4 files existed which should have been deleted in 
> CASSANDRA-15539.  Since they were not, when CASSANDRA-15650 modified one it 
> didn't cause a merge conflict, but when the test runs it conflicts and fails.
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684]






[jira] [Updated] (CASSANDRA-15642) Inconsistent failure messages on distributed queries

2020-04-03 Thread Kevin Gallardo (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Gallardo updated CASSANDRA-15642:
---
Description: 
As a follow-up to some exploration I have done for CASSANDRA-15543, I noticed 
the following behavior in both {{ReadCallback}} and {{AbstractWriteHandler}}:
 - await responses
 - when the required number of responses has come back: unblock the wait
 - when a single failure happens: unblock the wait
 - when unblocked, check whether the counter of failures is > 1 and, if so, 
return an error message based on the {{failures}} map that has been filled

The error messages that result from this behavior can be a ReadTimeout, a 
ReadFailure, a WriteTimeout, or a WriteFailure.
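As a rough illustration of the unblock-on-first-failure behavior described above, here is a minimal, hypothetical handler; `SimplifiedResponseHandler` and its methods are made-up names for this sketch, not Cassandra's actual classes:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified handler (not Cassandra's real class): the waiter is
// released either when `required` successes arrive OR on the very first
// failure, so the counters read at unblock time may not reflect responses
// that are still in flight.
class SimplifiedResponseHandler
{
    private final int required;                        // e.g. QUORUM = 3
    private final AtomicInteger successes = new AtomicInteger();
    private final AtomicInteger failures  = new AtomicInteger();
    private final CountDownLatch unblocked = new CountDownLatch(1);

    SimplifiedResponseHandler(int required) { this.required = required; }

    void onResponse()                                  // a replica replied OK
    {
        if (successes.incrementAndGet() >= required)
            unblocked.countDown();
    }

    void onFailure()                                   // a replica reported an error
    {
        failures.incrementAndGet();
        unblocked.countDown();                         // a single failure unblocks the wait
    }

    String awaitAndReport() throws InterruptedException
    {
        unblocked.await();
        // Snapshot taken at unblock time: a success arriving after the first
        // failure is not reflected in the message the client sees.
        return "Received " + successes.get() + " responses, and " + failures.get() + " failures";
    }
}
```

In this model, if the first failure arrives before any success, the report reads "Received 0 responses, and 1 failures" even though successful responses may land immediately afterwards, which is the misleading message discussed here.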

In case of a Write/ReadFailure, the user will get back an error looking like 
the following:

"Failure: Received X responses, and Y failures"

(if this behavior I describe is incorrect, please correct me)

This causes a usability problem. Since the handler will fail and throw an 
exception as soon as 1 failure happens, the error message that is returned to 
the user may not be accurate.

(note: I am not entirely sure of the behavior in case of timeouts for now)

For example, for a request at CL = QUORUM = 3, a failed response may arrive 
first, then a successful one, and then another failure. If the exception is 
thrown fast enough, the error message could say 
 "Failure: Received 0 response, and 1 failure at CL = 3"

Which:
1. doesn't make a lot of sense because the CL doesn't match the number of 
results in the message, so you end up thinking "what happened with the rest of 
the required CL?"
2. the information is incorrect. We did receive a successful response, only it 
came after the initial failure.

From that logic, I think it is safe to assume that the information returned in 
the error message cannot be trusted in case of a failure. The only information 
users should extract from it is that at least 1 node has failed.

For a big improvement in usability, the {{ReadCallback}} and 
{{AbstractWriteResponseHandler}} could instead wait for all responses to come 
back before unblocking the wait, or let it time out. This way, users will 
be able to have some trust around the information returned to them.

Additionally, an error that happens first prevents a timeout from happening 
because it fails immediately, and so potentially hides problems with other 
replicas. If we were to wait for all responses, we might get a timeout; in that 
case we'd also be able to tell whether failures happened *before* that timeout, 
and have a more complete diagnostic; currently you can't detect both errors at 
the same time.

  was:
As a follow up to some exploration I have done for CASSANDRA-15543, I realized 
the following behavior in both {{ReadCallback}} and {{AbstractWriteHandler}}:
 - await for responses
 - when all required number of responses have come back: unblock the wait
 - when a single failure happens: unblock the wait
 - when unblocked, look to see if the counter of failures is > 1 and if so 
return an error message based on the {{failures}} map that's been filled

Error messages that can result from this behavior can be a ReadTimeout, a 
ReadFailure, a WriteTimeout or a WriteFailure.

In case of a Write/ReadFailure, the user will get back an error looking like 
the following:

"Failure: Received X responses, and Y failures"

(if this behavior I describe is incorrect, please correct me)

This causes a usability problem. Since the handler will fail and throw an 
exception as soon as 1 failure happens, the error message that is returned to 
the user may not be accurate.

(note: I am not entirely sure of the behavior in case of timeouts for now)

At, say, CL = QUORUM = 3, the failed request may complete first, then a 
successful one completes, and another fails. If the exception is thrown fast 
enough, the error message could say 
 "Failure: Received 0 response, and 1 failure at CL = 3"

Which 1. doesn't make a lot of sense because the CL doesn't match the previous 
information, but 2. the information is incorrect. We received a successful 
response, only it came after the initial failure.

From that logic, I think it is safe to assume that the information returned in 
the error message cannot be trusted in case of a failure. We can only know 
that at least 1 node has failed, or not if the response is successful.

I am suggesting that for a big improvement in usability, the ReadCallback and 
AbstractWriteResponseHandler wait for all responses to come back before 
unblocking the wait, or let it timeout. This is way, the users will be able to 
have some trust around the numbers returned to them. Also we would be able to 
return more information this way.

Right now, an error that happens first prevents a timeout from happening 
because it fails immediately, and so potentially it hides problems with other 
replicas.

[jira] [Commented] (CASSANDRA-15676) flaky test testWriteUnknownResult- org.apache.cassandra.distributed.test.CasWriteTest

2020-04-03 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074723#comment-17074723
 ] 

David Capwell commented on CASSANDRA-15676:
---

I will file a different but related JIRA: CASSANDRA-15650 broke this test, 
causing it to hang consistently.  I checked the original build and saw it 
didn't have CASSANDRA-15650, so the timeout happened before CASSANDRA-15650, 
but the test now fails consistently with CASSANDRA-15650.

> flaky test testWriteUnknownResult- 
> org.apache.cassandra.distributed.test.CasWriteTest
> -
>
> Key: CASSANDRA-15676
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15676
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest
>Reporter: Kevin Gallardo
>Assignee: Eduard Tudenhoefner
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Failure observed in: 
> https://app.circleci.com/pipelines/github/newkek/cassandra/33/workflows/54007cf7-4424-4ec1-9655-665f6044e6d1/jobs/187/tests
> {noformat}
> testWriteUnknownResult - org.apache.cassandra.distributed.test.CasWriteTest
> junit.framework.AssertionFailedError: Expecting cause to be 
> CasWriteUncertainException
>   at 
> org.apache.cassandra.distributed.test.CasWriteTest.testWriteUnknownResult(CasWriteTest.java:257)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {noformat}






[jira] [Commented] (CASSANDRA-15642) Inconsistent failure messages on distributed queries

2020-04-03 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074747#comment-17074747
 ] 

Benedict Elliott Smith commented on CASSANDRA-15642:


It is possible to simply report consistent information, and that would be my 
preference, so as not to further burden the cluster with longer object 
lifetimes during cluster outages.  There is no real significant increase in 
information from waiting, and it delays how quickly a client may act on this 
information.

These classes need to be rewritten anyway.  This is post-4.0 work, but I have a 
partial rewrite lying around somewhere from a year or so ago that I will 
perhaps dust off later.
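One way to read "report consistent information" without waiting longer is to make the snapshot of the two counters atomic. The sketch below is only an illustration under that assumption ({{PackedCounters}} is a made-up name, not code from any Cassandra patch):

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch (not Cassandra code): successes and failures share one
// AtomicLong, so a single read yields a coherent (successes, failures) pair
// instead of two independently racing values.
class PackedCounters
{
    // high 32 bits: successes, low 32 bits: failures
    private final AtomicLong packed = new AtomicLong();

    void onSuccess() { packed.addAndGet(1L << 32); }
    void onFailure() { packed.addAndGet(1L); }

    // one atomic read -> {successes, failures} observed at the same instant
    int[] snapshot()
    {
        long v = packed.get();
        return new int[] { (int) (v >>> 32), (int) v };
    }
}
```

The pair can still be stale by the time the client sees it, but it can never be internally inconsistent, which is the property the comment above is pointing at.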

> Inconsistent failure messages on distributed queries
> 
>
> Key: CASSANDRA-15642
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15642
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination
>Reporter: Kevin Gallardo
>Priority: Normal
>
> As a follow up to some exploration I have done for CASSANDRA-15543, I 
> realized the following behavior in both {{ReadCallback}} and 
> {{AbstractWriteHandler}}:
>  - await for responses
>  - when all required number of responses have come back: unblock the wait
>  - when a single failure happens: unblock the wait
>  - when unblocked, look to see if the counter of failures is > 1 and if so 
> return an error message based on the {{failures}} map that's been filled
> Error messages that can result from this behavior can be a ReadTimeout, a 
> ReadFailure, a WriteTimeout or a WriteFailure.
> In case of a Write/ReadFailure, the user will get back an error looking like 
> the following:
> "Failure: Received X responses, and Y failures"
> (if this behavior I describe is incorrect, please correct me)
> This causes a usability problem. Since the handler will fail and throw an 
> exception as soon as 1 failure happens, the error message that is returned to 
> the user may not be accurate.
> (note: I am not entirely sure of the behavior in case of timeouts for now)
> For example, say a request at CL = QUORUM = 3, a failed request may complete 
> first, then a successful one completes, and another fails. If the exception 
> is thrown fast enough, the error message could say 
>  "Failure: Received 0 response, and 1 failure at CL = 3"
> Which:
> 1. doesn't make a lot of sense because the CL doesn't match the number of 
> results in the message, so you end up thinking "what happened with the rest 
> of the required CL?"
> 2. the information is incorrect. We did receive a successful response, only 
> it came after the initial failure.
> From that logic, I think it is safe to assume that the information returned 
> in the error message cannot be trusted in case of a failure. Only information 
> users should extract out of it is that at least 1 node has failed.
> For a big improvement in usability, the {{ReadCallback}} and 
> {{AbstractWriteResponseHandler}} could instead wait for all responses to come 
> back before unblocking the wait, or let it time out. This way, users will 
> be able to have some trust around the information returned to them.
> Additionally, an error that happens first prevents a timeout from happening 
> because it fails immediately, and so potentially it hides problems with other 
> replicas. If we were to wait for all responses, we might get a timeout; in 
> that case we'd also be able to tell whether failures happened *before* that 
> timeout, and have a more complete diagnostic; currently you can't detect both 
> errors at the same time.






[jira] [Created] (CASSANDRA-15689) CASSANDRA-15650 broke CasWriteTest causing it to fail and hang

2020-04-03 Thread David Capwell (Jira)
David Capwell created CASSANDRA-15689:
-

 Summary: CASSANDRA-15650 broke CasWriteTest causing it to fail and 
hang
 Key: CASSANDRA-15689
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15689
 Project: Cassandra
  Issue Type: Bug
  Components: Test/dtest
Reporter: David Capwell
Assignee: David Capwell


CASSANDRA-15650 changed IsolatedExecutor to rethrow runtime exceptions rather 
than wrap them; CasWriteTest assumes they are wrapped, which causes tests to 
fail and casWriteContentionTimeoutTest to time out.
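The wrap-vs-rethrow mismatch can be illustrated with a small sketch; the helper names below ({{WrapVsRethrow}}, {{runWrapped}}, {{runRethrown}}) are hypothetical stand-ins, not the real IsolatedExecutor API:

```java
import java.util.concurrent.ExecutionException;

// Hypothetical illustration: a test that inspects e.getCause() works while
// runtime exceptions are wrapped, and breaks once they are rethrown as-is.
public class WrapVsRethrow
{
    // old behavior: wrap the runtime exception in an ExecutionException
    static void runWrapped(Runnable task) throws ExecutionException
    {
        try { task.run(); }
        catch (RuntimeException e) { throw new ExecutionException(e); }
    }

    // new behavior: rethrow the original runtime exception unchanged
    static void runRethrown(Runnable task)
    {
        task.run();
    }

    public static void main(String[] args) throws Exception
    {
        Runnable failing = () -> { throw new IllegalStateException("boom"); };

        try { runWrapped(failing); }
        catch (ExecutionException e)
        {
            // a cause-based assertion passes under the old behavior
            if (!(e.getCause() instanceof IllegalStateException))
                throw new AssertionError("expected wrapped cause");
        }

        try { runRethrown(failing); }
        catch (RuntimeException e)
        {
            // no wrapper any more: getCause() is null, so the same
            // cause-based assertion now fails
            if (e.getCause() != null)
                throw new AssertionError("expected no wrapper");
        }
    }
}
```

A test written against the old behavior unwraps the cause and finds nothing under the new behavior, which is how the assertion failures (and the hang, when a timeout path expects the wrapper) arise.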






[jira] [Updated] (CASSANDRA-15689) CASSANDRA-15650 broke CasWriteTest causing it to fail and hang

2020-04-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CASSANDRA-15689:
---
Labels: pull-request-available  (was: )

> CASSANDRA-15650 broke CasWriteTest causing it to fail and hang
> --
>
> Key: CASSANDRA-15689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15689
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>
> CASSANDRA-15650 changed IsolatedExecutor to rethrow runtime exceptions rather 
> than wrap them; CasWriteTest assumes they are wrapped, which causes tests to 
> fail and casWriteContentionTimeoutTest to time out.






[jira] [Commented] (CASSANDRA-15676) flaky test testWriteUnknownResult- org.apache.cassandra.distributed.test.CasWriteTest

2020-04-03 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074763#comment-17074763
 ] 

David Capwell commented on CASSANDRA-15676:
---

Created CASSANDRA-15689 to fix the regression. This shouldn't fix the timeout, 
as that appears to predate CASSANDRA-15650.

> flaky test testWriteUnknownResult- 
> org.apache.cassandra.distributed.test.CasWriteTest
> -
>
> Key: CASSANDRA-15676
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15676
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest
>Reporter: Kevin Gallardo
>Assignee: Eduard Tudenhoefner
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Failure observed in: 
> https://app.circleci.com/pipelines/github/newkek/cassandra/33/workflows/54007cf7-4424-4ec1-9655-665f6044e6d1/jobs/187/tests
> {noformat}
> testWriteUnknownResult - org.apache.cassandra.distributed.test.CasWriteTest
> junit.framework.AssertionFailedError: Expecting cause to be 
> CasWriteUncertainException
>   at 
> org.apache.cassandra.distributed.test.CasWriteTest.testWriteUnknownResult(CasWriteTest.java:257)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {noformat}






[jira] [Updated] (CASSANDRA-15689) CASSANDRA-15650 broke CasWriteTest causing it to fail and hang

2020-04-03 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-15689:
--
 Bug Category: Parent values: Code(13163)
   Complexity: Low Hanging Fruit
Discovered By: Unit Test
 Severity: Low
   Status: Open  (was: Triage Needed)

> CASSANDRA-15650 broke CasWriteTest causing it to fail and hang
> --
>
> Key: CASSANDRA-15689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15689
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 changed IsolatedExecutor to rethrow runtime exceptions rather 
> than wrap them; CasWriteTest assumes they are wrapped, which causes tests to 
> fail and casWriteContentionTimeoutTest to time out.






[jira] [Created] (CASSANDRA-15690) Single partition queries can mistakenly omit partition deletions and resurrect data

2020-04-03 Thread Aleksey Yeschenko (Jira)
Aleksey Yeschenko created CASSANDRA-15690:
-

 Summary: Single partition queries can mistakenly omit partition 
deletions and resurrect data
 Key: CASSANDRA-15690
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15690
 Project: Cassandra
  Issue Type: Bug
Reporter: Aleksey Yeschenko
Assignee: Sam Tunnicliffe


We have logic that allows us to exclude sstables with partition deletions that 
are older than the minimum collected timestamp in a local request. However, 
it’s possible that another node could have rows that aren’t known to the local 
node that are in turn older than the excluded partition deletion. In such a 
scenario, those will be mistakenly resurrected, which is a correctness issue.
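A heavily simplified, hypothetical model of the exclusion logic described above ({{SSTableSkipModel}} and its fields are illustrative stand-ins, not Cassandra's actual read path):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only: an sstable whose newest data is older than the
// minimum timestamp already collected locally looks safe to skip, yet a
// partition deletion inside it may still be needed to shadow even older rows
// held only by other nodes.
class SSTableSkipModel
{
    static class SSTable
    {
        final long maxTimestamp;
        final boolean hasPartitionDeletion;
        SSTable(long maxTimestamp, boolean hasPartitionDeletion)
        {
            this.maxTimestamp = maxTimestamp;
            this.hasPartitionDeletion = hasPartitionDeletion;
        }
    }

    // Local-only heuristic: drop sstables entirely older than what we already read.
    static List<SSTable> selectForRead(List<SSTable> sstables, long minCollectedTimestamp)
    {
        List<SSTable> selected = new ArrayList<>();
        for (SSTable s : sstables)
            if (s.maxTimestamp >= minCollectedTimestamp)
                selected.add(s);
        return selected;
    }

    public static void main(String[] args)
    {
        SSTable withDeletion = new SSTable(1, true);   // partition delete at ts 1
        SSTable newerRow     = new SSTable(2, false);  // unshadowed row at ts 2

        // After reading the newer sstable, minCollectedTimestamp == 2, so the
        // deletion-bearing sstable is excluded locally...
        List<SSTable> selected = selectForRead(List.of(withDeletion, newerRow), 2);
        if (selected.contains(withDeletion))
            throw new AssertionError("expected the older sstable to be skipped");
        // ...but another node may hold a row at ts 0 that only this deletion
        // shadows, so dropping it from the merge can resurrect that row.
    }
}
```

The decision is valid for purely local reads but not for distributed merges, since the coordinator cannot know what data or tombstones other replicas hold.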






[jira] [Commented] (CASSANDRA-15689) CASSANDRA-15650 broke CasWriteTest causing it to fail and hang

2020-04-03 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074764#comment-17074764
 ] 

David Capwell commented on CASSANDRA-15689:
---

[~blerer] [~e.dimitrova] can you review?

> CASSANDRA-15650 broke CasWriteTest causing it to fail and hang
> --
>
> Key: CASSANDRA-15689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15689
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 changed IsolatedExecutor to rethrow runtime exceptions rather 
> than wrap them; CasWriteTest assumes they are wrapped, which causes tests to 
> fail and casWriteContentionTimeoutTest to time out.






[jira] [Commented] (CASSANDRA-15690) Single partition queries can mistakenly omit partition deletions and resurrect data

2020-04-03 Thread Aleksey Yeschenko (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074765#comment-17074765
 ] 

Aleksey Yeschenko commented on CASSANDRA-15690:
---

A simple in-JVM repro illustration provided by [~samt]:

{code}
@Test
public void skippedSSTableWithPartitionDeletionShadowingDataOnAnotherNode() throws Throwable
{
    try (Cluster cluster = init(Cluster.create(2)))
    {
        cluster.schemaChange("CREATE TABLE " + KEYSPACE + ".tbl (pk int, ck int, v int, PRIMARY KEY(pk, ck))");
        // insert a partition tombstone on node 1, the deletion timestamp should
        // end up being the sstable's minTimestamp
        cluster.get(1).executeInternal("DELETE FROM " + KEYSPACE + ".tbl USING TIMESTAMP 1 WHERE pk = 0");
        // and a row from a different partition, to provide the sstable's min/max clustering
        cluster.get(1).executeInternal("INSERT INTO " + KEYSPACE + ".tbl (pk, ck, v) VALUES (1, 1, 1) USING TIMESTAMP 1");
        cluster.get(1).flush(KEYSPACE);
        // sstable 1 has minTimestamp == maxTimestamp == 1 and is skipped due to
        // its min/max clusterings. Now we insert a row which is not shadowed by
        // the partition delete and flush to a second sstable. Importantly, this
        // sstable's minTimestamp is greater than the maxTimestamp of the first
        // sstable. This would cause the first sstable not to be re-included in
        // the merge input, but we can't really make that decision as we don't
        // know what data and/or tombstones are present on other nodes
        cluster.get(1).executeInternal("INSERT INTO " + KEYSPACE + ".tbl (pk, ck, v) VALUES (0, 6, 6) USING TIMESTAMP 2");
        cluster.get(1).flush(KEYSPACE);

        // on node 2, add a row for the deleted partition with an older timestamp
        // than the deletion so it should be shadowed
        cluster.get(2).executeInternal("INSERT INTO " + KEYSPACE + ".tbl (pk, ck, v) VALUES (0, 10, 10) USING TIMESTAMP 0");

        Object[][] rows = cluster.coordinator(1)
                                 .execute("SELECT * FROM " + KEYSPACE + ".tbl WHERE pk=0 AND ck > 5",
                                          ConsistencyLevel.ALL);
        // we expect that the row from node 2 (0, 10, 10) was shadowed by the
        // partition delete, but the row from node 1 (0, 6, 6) was not.
        assertRows(rows, new Object[] {0, 6, 6});
    }
}
{code}

> Single partition queries can mistakenly omit partition deletions and 
> resurrect data
> ---
>
> Key: CASSANDRA-15690
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15690
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksey Yeschenko
>Assignee: Sam Tunnicliffe
>Priority: Normal
>
> We have logic that allows us to exclude sstables with partition deletions 
> that are older than the minimum collected timestamp in a local request. 
> However, it’s possible that another node could have rows that aren’t known to 
> the local node that are in turn older than the excluded partition deletion. 
> In such a scenario, those will be mistakenly resurrected, which is a 
> correctness issue.






[jira] [Updated] (CASSANDRA-15689) CASSANDRA-15650 broke CasWriteTest causing it to fail and hang

2020-04-03 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-15689:
--
Description: 
CASSANDRA-15650 changed IsolatedExecutor to rethrow runtime exceptions rather 
than wrap them; CasWriteTest assumes they are wrapped, which causes tests to 
fail and casWriteContentionTimeoutTest to time out.

[Circle 
CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15689]

  was:CasWriteTest changed IsolatedExecutor to rethrow runtime exceptions 
rather than wrap, this test assumes they are wrapped which causes tests to fail 
and casWriteContentionTimeoutTest to timeout.


> CASSANDRA-15650 broke CasWriteTest causing it to fail and hang
> --
>
> Key: CASSANDRA-15689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15689
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 changed IsolatedExecutor to rethrow runtime exceptions rather 
> than wrap them; CasWriteTest assumes they are wrapped, which causes tests to 
> fail and casWriteContentionTimeoutTest to time out.
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15689]






[jira] [Updated] (CASSANDRA-15690) Single partition queries can mistakenly omit partition deletions and resurrect data

2020-04-03 Thread Aleksey Yeschenko (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-15690:
--
 Bug Category: Parent values: Correctness(12982)Level 1 values: Transient 
Incorrect Response(12987)
   Complexity: Normal
  Component/s: Consistency/Coordination
Discovered By: Code Inspection
Fix Version/s: 4.0-alpha
   3.11.x
   3.0.x
Reviewers: Aleksey Yeschenko
 Severity: Critical
   Status: Open  (was: Triage Needed)

> Single partition queries can mistakenly omit partition deletions and 
> resurrect data
> ---
>
> Key: CASSANDRA-15690
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15690
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination
>Reporter: Aleksey Yeschenko
>Assignee: Sam Tunnicliffe
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-alpha
>
>
> We have logic that allows us to exclude sstables with partition deletions 
> that are older than the minimum collected timestamp in a local request. 
> However, it’s possible that another node could have rows that aren’t known to 
> the local node that are in turn older than the excluded partition deletion. 
> In such a scenario, those will be mistakenly resurrected, which is a 
> correctness issue.






[jira] [Commented] (CASSANDRA-15642) Inconsistent failure messages on distributed queries

2020-04-03 Thread Kevin Gallardo (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074827#comment-17074827
 ] 

Kevin Gallardo commented on CASSANDRA-15642:


[~benedict] thanks for having a look at the ticket!

bq. There is no real significant increase in information by waiting, and it 
delays how quickly a client may action this information

I would argue that right now a user wouldn't even be aware that the 
information is not reliable; it's only by digging into server code that I 
realized this is happening. The error message tells you "received X errors 
and X responses" and there is no external indication to the user/client that 
the information returned is not reliable.

Additionally, given these findings, I don't know how the client could properly 
act on the information, quickly or not, if the info is not reliable. I am not 
sure that giving partial info quickly is better than giving cohesive info.

bq. It is possible to simply report consistent information

I am not sure how that would be possible without waiting for all the responses 
to come back or timing out, but I'm happy to be corrected if I am missing 
something.

> Inconsistent failure messages on distributed queries
> 
>
> Key: CASSANDRA-15642
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15642
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination
>Reporter: Kevin Gallardo
>Priority: Normal
>
> As a follow up to some exploration I have done for CASSANDRA-15543, I 
> realized the following behavior in both {{ReadCallback}} and 
> {{AbstractWriteHandler}}:
>  - await for responses
>  - when all required number of responses have come back: unblock the wait
>  - when a single failure happens: unblock the wait
>  - when unblocked, look to see if the counter of failures is > 1 and if so 
> return an error message based on the {{failures}} map that's been filled
> Error messages that can result from this behavior can be a ReadTimeout, a 
> ReadFailure, a WriteTimeout or a WriteFailure.
> In case of a Write/ReadFailure, the user will get back an error looking like 
> the following:
> "Failure: Received X responses, and Y failures"
> (if this behavior I describe is incorrect, please correct me)
> This causes a usability problem. Since the handler will fail and throw an 
> exception as soon as 1 failure happens, the error message that is returned to 
> the user may not be accurate.
> (note: I am not entirely sure of the behavior in case of timeouts for now)
> For example, say a request at CL = QUORUM = 3, a failed request may complete 
> first, then a successful one completes, and another fails. If the exception 
> is thrown fast enough, the error message could say 
>  "Failure: Received 0 response, and 1 failure at CL = 3"
> Which:
> 1. doesn't make a lot of sense because the CL doesn't match the number of 
> results in the message, so you end up thinking "what happened with the rest 
> of the required CL?"
> 2. the information is incorrect. We did receive a successful response, only 
> it came after the initial failure.
> From that logic, I think it is safe to assume that the information returned 
> in the error message cannot be trusted in case of a failure. Only information 
> users should extract out of it is that at least 1 node has failed.
> For a big improvement in usability, the {{ReadCallback}} and 
> {{AbstractWriteResponseHandler}} could instead wait for all responses to come 
> back before unblocking the wait, or let it time out. This way, the users 
> will be able to have some trust around the information returned to them.
> Additionally, an error that happens first prevents a timeout from happening 
> because it fails immediately, and so potentially it hides problems with other 
> replicas. If we were to wait for all responses, we might get a timeout; in 
> that case we'd also be able to tell whether failures happened *before* that 
> timeout, and have a more complete diagnostic; currently you can't detect both 
> errors at the same time.






[jira] [Comment Edited] (CASSANDRA-15642) Inconsistent failure messages on distributed queries

2020-04-03 Thread Kevin Gallardo (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074827#comment-17074827
 ] 

Kevin Gallardo edited comment on CASSANDRA-15642 at 4/3/20, 7:10 PM:
-

[~benedict] thanks for having a look at the ticket!

bq. There is no real significant increase in information by waiting, and it 
delays how quickly a client may action this information

I would argue that right now a user wouldn't even be aware that the 
information is not reliable; it's only by digging into server code that I 
realized this is happening. The error message tells you "received X errors 
and X responses" and there is no external indication to the user/client that 
the information returned is not complete/reliable.

Additionally, given these findings, I don't know how the client could properly 
act on the information, quickly or not, if the info is not reliable. I am not 
sure that giving partial info quickly is better than giving cohesive info.

bq. It is possible to simply report consistent information

I am not sure how that would be possible without waiting for all the responses 
to come back or for a timeout, but happy to have it explained if I am missing something.


was (Author: newkek):
[~benedict] thanks for having a look at the ticket!

bq. There is no real significant increase in information by waiting, and it 
delays how quickly a client may action this information

I would argue that right now a user wouldn't even be aware that the 
information is not reliable, as it's only by digging into the server code that I 
realized this is happening. The error message tells you "received X errors 
and X responses", and there is no external indication to the user/client that 
the information returned is not reliable.

Additionally, given these findings, I don't know how the client would be able to 
properly act on the information, quickly or not, if the info is not 
reliable. I am not sure that giving partial info quickly is better than giving 
cohesive info.

bq. It is possible to simply report consistent information

I am not sure how that would be possible without waiting for all the responses 
to come back or for a timeout, but happy to have it explained if I am missing something.

> Inconsistent failure messages on distributed queries
> 
>
> Key: CASSANDRA-15642
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15642
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination
>Reporter: Kevin Gallardo
>Priority: Normal
>
> As a follow up to some exploration I have done for CASSANDRA-15543, I 
> realized the following behavior in both {{ReadCallback}} and 
> {{AbstractWriteResponseHandler}}:
>  - await for responses
>  - when all required number of responses have come back: unblock the wait
>  - when a single failure happens: unblock the wait
>  - when unblocked, look to see if the counter of failures is > 1 and if so 
> return an error message based on the {{failures}} map that's been filled
> Error messages that can result from this behavior can be a ReadTimeout, a 
> ReadFailure, a WriteTimeout or a WriteFailure.
> In case of a Write/ReadFailure, the user will get back an error looking like 
> the following:
> "Failure: Received X responses, and Y failures"
> (if this behavior I describe is incorrect, please correct me)
> This causes a usability problem. Since the handler will fail and throw an 
> exception as soon as 1 failure happens, the error message that is returned to 
> the user may not be accurate.
> (note: I am not entirely sure of the behavior in case of timeouts for now)
> For example, say a request at CL = QUORUM = 3, a failed request may complete 
> first, then a successful one completes, and another fails. If the exception 
> is thrown fast enough, the error message could say 
>  "Failure: Received 0 response, and 1 failure at CL = 3"
> Which:
> 1. doesn't make a lot of sense because the CL doesn't match the number of 
> results in the message, so you end up thinking "what happened with the rest 
> of the required CL?"
> 2. the information is incorrect. We did receive a successful response, only 
> it came after the initial failure.
> From that logic, I think it is safe to assume that the information returned 
> in the error message cannot be trusted in case of a failure. Only information 
> users should extract out of it is that at least 1 node has failed.
> For a big improvement in usability, the {{ReadCallback}} and 
> {{AbstractWriteResponseHandler}} could instead wait for all responses to come 
> back before unblocking the wait, or let it time out. This way, users will be 
> able to have some trust in the information returned to them.
> Additionally, an error that happens first prevents a timeout from happening, 
> because it fails immediately, and so it potentially hides problems with other 
> replicas. If we were to wait for all responses, we might get a timeout; in 
> that case we'd also be able to tell whether failures happened *before* that 
> timeout, and have a more complete diagnostic, whereas today you can't detect 
> both errors at the same time.
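The unblock-on-first-failure flow described in the ticket can be sketched as follows. This is a simplified, single-threaded model with hypothetical names, not the real {{ReadCallback}}; it only illustrates how a success arriving after the first failure never reaches the error message:

```java
// Simplified sketch of the reported behavior: the waiter is released as
// soon as EITHER `required` responses arrive OR a single failure arrives,
// so the message is frozen at whatever the counters were at that instant.
class MiniCallback
{
    final int required;        // e.g. CL = QUORUM = 3
    int responses, failures;
    boolean unblocked;
    String message;

    MiniCallback(int required) { this.required = required; }

    void onResponse() { responses++; maybeUnblock(); }
    void onFailure()  { failures++;  maybeUnblock(); }

    private void maybeUnblock()
    {
        if (unblocked)
            return; // events arriving after the unblock are never reported
        if (responses >= required || failures > 0)
        {
            unblocked = true;
            if (failures > 0)
                message = "Received " + responses + " responses and "
                        + failures + " failures at CL = " + required;
        }
    }
}
```

With CL = 3 and the arrival order failure, success, failure, the message is frozen at the first event as "Received 0 responses and 1 failures at CL = 3", even though a success did arrive: exactly the mismatch described above.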

[jira] [Commented] (CASSANDRA-14050) Many cqlsh_copy_tests are busted

2020-04-03 Thread Stefania Alborghetti (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074829#comment-17074829
 ] 

Stefania Alborghetti commented on CASSANDRA-14050:
--

[~sasrira] are you still working on this? If not, [~Ge] and I would like to 
take over. We would like to fix these tests before merging CASSANDRA-15679.

The reason for the failures is that _cqlsh_copy_tests.py_ links to 
_cqlshlib/formatting.py_. It needs this in order to apply the identical 
formatting used by cqlsh and determine whether the data obtained via 
{{self.session.execute("SELECT * FROM testtuple")}} matches the data in the csv 
files.

Since cqlshlib on trunk supports both Python 3 and Python 2, the cqlsh 
copy tests work for trunk. But for older branches, whose cqlshlib only supports 
Python 2, the tests no longer work.

So to fix the tests we would need to make cqlshlib support both Python 2 and 
Python 3, at least as far as _formatting.py_ is concerned. There is a problem 
with this approach though: this code is mostly tested via dtests, which only 
support Python 3 (I assume this is the case because of the dependency on Python 
3), and therefore how would we know if we broke anything for Python 2? Maybe we 
could run the dtests from before the migration to Python 3, hoping that they 
still work.

Another approach would be to copy _formatting.py_ into the dtests repo for the 
older branches, but this is quite ugly.

Lastly, there is the option of removing the dependency on _formatting.py_. I 
think we could try replacing {{self.session.execute("SELECT * FROM 
testtuple")}} with the equivalent cqlsh command and see if that works.



> Many cqlsh_copy_tests are busted
> 
>
> Key: CASSANDRA-14050
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14050
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Testing
>Reporter: Michael Kjellman
>Assignee: Sam Sriramadhesikan
>Priority: Normal
>
> Many cqlsh_copy_tests are busted. We should disable the entire suite until 
> this is resolved as these tests are currently nothing but a waste of time.
> test_bulk_round_trip_blogposts - cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest
> test_bulk_round_trip_blogposts_with_max_connections - 
> cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest
> test_bulk_round_trip_default - cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest
> Error starting node3.
>  >> begin captured logging << 
> dtest: DEBUG: cluster ccm directory: /tmp/dtest-S9NfIH
> dtest: DEBUG: Done setting configuration options:
> {   'initial_token': None,
> 'memtable_allocation_type': 'offheap_objects',
> 'num_tokens': '256',
> 'phi_convict_threshold': 5,
> 'range_request_timeout_in_ms': 1,
> 'read_request_timeout_in_ms': 1,
> 'request_timeout_in_ms': 1,
> 'truncate_request_timeout_in_ms': 1,
> 'write_request_timeout_in_ms': 1}
> - >> end captured logging << -
>   File "/usr/lib/python2.7/unittest/case.py", line 329, in run
> testMethod()
>   File "/home/cassandra/cassandra-dtest/cqlsh_tests/cqlsh_copy_tests.py", 
> line 2546, in test_bulk_round_trip_blogposts
> stress_table='stresscql.blogposts')
>   File "/home/cassandra/cassandra-dtest/cqlsh_tests/cqlsh_copy_tests.py", 
> line 2451, in _test_bulk_round_trip
> self.prepare(nodes=nodes, partitioner=partitioner, 
> configuration_options=configuration_options)
>   File "/home/cassandra/cassandra-dtest/cqlsh_tests/cqlsh_copy_tests.py", 
> line 115, in prepare
> self.cluster.populate(nodes, 
> tokens=tokens).start(wait_for_binary_proto=True)
>   File 
> "/home/cassandra/env/local/lib/python2.7/site-packages/ccmlib/cluster.py", 
> line 423, in start
> raise NodeError("Error starting {0}.".format(node.name), p)
> "Error starting node3.\n >> begin captured logging << 
> \ndtest: DEBUG: cluster ccm directory: 
> /tmp/dtest-S9NfIH\ndtest: DEBUG: Done setting configuration options:\n{   
> 'initial_token': None,\n'memtable_allocation_type': 'offheap_objects',\n  
>   'num_tokens': '256',\n'phi_convict_threshold': 5,\n
> 'range_request_timeout_in_ms': 1,\n'read_request_timeout_in_ms': 
> 1,\n'request_timeout_in_ms': 1,\n
> 'truncate_request_timeout_in_ms': 1,\n'write_request_timeout_in_ms': 
> 1}\n- >> end captured logging << 
> -"






[jira] [Commented] (CASSANDRA-15642) Inconsistent failure messages on distributed queries

2020-04-03 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074850#comment-17074850
 ] 

Benedict Elliott Smith commented on CASSANDRA-15642:


bq. there is no indication externally to the user/client that the information 
returned is not complete/reliable

What is your definition of complete/reliable?

bq. I am not sure how that would be possible without waiting for all the responses 
to come back or for a timeout, but happy to have it explained if I am missing something

At the point of failure you know whether you are failing because of a failure or 
a timeout.  The problem is only that we produce a nonsense error message that is 
inconsistent.  We are of course able to produce an error message whose 
information is internally consistent with the situation.

For instance, we tend to have a pattern of:

{code}
if (isFailed())
    reportFailure()
{code}

However the state changes between testing and reporting.  We should instead 
have:
{code}
state = state()
if (isFailed(state))
    reportFailure(state)
{code}

We also now have a situation where the state is a tuple of {{(triggerPrimitive, 
detailMap)}}, and we report a combination of {{triggerPrimitive}} and 
{{detailMap}}, despite them not being consistent.  Our decision and reporting 
should rest solely on {{detailMap}}, with {{triggerPrimitive}} serving only for 
scheduling purposes (waking up the waiting thread).
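The snapshot pattern can be sketched as follows. Class and method names here are hypothetical, not Cassandra's actual API; the point is only that the decision and the report both read one immutable view of the state:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: decide and report from ONE immutable snapshot,
// instead of re-reading mutable counters that may change in between.
final class Snapshot
{
    final int responses;
    final Map<String, String> failures; // replica -> failure reason

    Snapshot(int responses, Map<String, String> failures)
    {
        this.responses = responses;
        this.failures = Collections.unmodifiableMap(new HashMap<>(failures));
    }
}

class Handler
{
    private int responses;
    private final Map<String, String> failures = new HashMap<>();

    synchronized void onResponse() { responses++; }
    synchronized void onFailure(String replica, String reason) { failures.put(replica, reason); }

    // Capture the state atomically; callers pass the same snapshot to both
    // isFailed() and reportFailure(), so the message cannot contradict itself.
    synchronized Snapshot snapshot() { return new Snapshot(responses, failures); }

    static boolean isFailed(Snapshot state) { return !state.failures.isEmpty(); }

    static String reportFailure(Snapshot state)
    {
        return "Received " + state.responses + " responses and "
             + state.failures.size() + " failures: " + state.failures.keySet();
    }
}
```

Even if more responses arrive after the snapshot is taken, the reported counts remain internally consistent with the decision that was made.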


> Inconsistent failure messages on distributed queries
> 
>
> Key: CASSANDRA-15642
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15642
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination
>Reporter: Kevin Gallardo
>Priority: Normal
>
> As a follow up to some exploration I have done for CASSANDRA-15543, I 
> realized the following behavior in both {{ReadCallback}} and 
> {{AbstractWriteResponseHandler}}:
>  - await for responses
>  - when all required number of responses have come back: unblock the wait
>  - when a single failure happens: unblock the wait
>  - when unblocked, look to see if the counter of failures is > 1 and if so 
> return an error message based on the {{failures}} map that's been filled
> Error messages that can result from this behavior can be a ReadTimeout, a 
> ReadFailure, a WriteTimeout or a WriteFailure.
> In case of a Write/ReadFailure, the user will get back an error looking like 
> the following:
> "Failure: Received X responses, and Y failures"
> (if this behavior I describe is incorrect, please correct me)
> This causes a usability problem. Since the handler will fail and throw an 
> exception as soon as 1 failure happens, the error message that is returned to 
> the user may not be accurate.
> (note: I am not entirely sure of the behavior in case of timeouts for now)
> For example, say a request at CL = QUORUM = 3, a failed request may complete 
> first, then a successful one completes, and another fails. If the exception 
> is thrown fast enough, the error message could say 
>  "Failure: Received 0 response, and 1 failure at CL = 3"
> Which:
> 1. doesn't make a lot of sense because the CL doesn't match the number of 
> results in the message, so you end up thinking "what happened with the rest 
> of the required CL?"
> 2. the information is incorrect. We did receive a successful response, only 
> it came after the initial failure.
> From that logic, I think it is safe to assume that the information returned 
> in the error message cannot be trusted in case of a failure. Only information 
> users should extract out of it is that at least 1 node has failed.
> For a big improvement in usability, the {{ReadCallback}} and 
> {{AbstractWriteResponseHandler}} could instead wait for all responses to come 
> back before unblocking the wait, or let it time out. This way, users will be 
> able to have some trust in the information returned to them.
> Additionally, an error that happens first prevents a timeout from happening, 
> because it fails immediately, and so it potentially hides problems with other 
> replicas. If we were to wait for all responses, we might get a timeout; in 
> that case we'd also be able to tell whether failures happened *before* that 
> timeout, and have a more complete diagnostic, whereas today you can't detect 
> both errors at the same time.






[jira] [Commented] (CASSANDRA-15657) Improve zero-copy-streaming containment check by using file sections

2020-04-03 Thread Dinesh Joshi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074933#comment-17074933
 ] 

Dinesh Joshi commented on CASSANDRA-15657:
--

Sure, IIRC the SSTable metadata only contains the beginning and ending tokens. 
However, in the wrap-around case you'd need to unwrap the ranges and do the 
check. I don't think your check works in the wrap-around case.
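The unwrap step can be illustrated with a simplified sketch over integer tokens. This is a hypothetical helper, not Cassandra's actual {{Range}}/{{Bounds}} classes:

```java
import java.util.List;

// Simplified (left, right] token range on a ring of long tokens.
// A range "wraps around" when left >= right, e.g. (80, 10] covers
// (80, MAX] plus (MIN, 10].
final class TokenRange
{
    final long left, right;

    TokenRange(long left, long right) { this.left = left; this.right = right; }

    boolean wraps() { return left >= right; }

    // Unwrap into at most two non-wrapping ranges before any containment check.
    List<TokenRange> unwrap()
    {
        if (!wraps())
            return List.of(this);
        return List.of(new TokenRange(left, Long.MAX_VALUE),
                       new TokenRange(Long.MIN_VALUE, right));
    }

    // Does this non-wrapping range contain an sstable's [first, last] token span?
    boolean containsSpan(long first, long last)
    {
        return left < first && last <= right;
    }

    // A wrap-around range contains the span iff one of its unwrapped halves
    // does; comparing against left/right directly gives the wrong answer.
    static boolean covers(TokenRange r, long first, long last)
    {
        return r.unwrap().stream().anyMatch(u -> u.containsSpan(first, last));
    }
}
```

Checking {{(80, 10]}} against a span like {{[90, 100]}} only succeeds after unwrapping; a naive {{left < first && last <= right}} on the wrapping range itself would reject it.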

> Improve zero-copy-streaming containment check by using file sections
> 
>
> Key: CASSANDRA-15657
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15657
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Streaming and Messaging
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0
>
>
> Currently zero-copy streaming is only enabled for the leveled compaction 
> strategy, and it checks whether all keys in the sstables are included in the 
> transferred ranges.
> This is very inefficient. The containment check can be improved by checking 
> whether the transferred sections (the transferred file positions) cover the 
> entire sstable.
> I also enabled ZCS for all compaction strategies, since the new containment 
> check is very fast.
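A minimal sketch of the section-based check (a hypothetical helper, not the actual patch): merge the transferred byte spans and verify they cover the whole data file:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch: decide ZCS eligibility from file sections instead of iterating keys.
// Each section is a [start, end) byte span of the sstable data file; if the
// merged sections cover [0, fileLength), every key is being transferred.
final class Sections
{
    static boolean coverWholeFile(List<long[]> sections, long fileLength)
    {
        List<long[]> sorted = new ArrayList<>(sections);
        sorted.sort(Comparator.comparingLong(s -> s[0]));
        long covered = 0; // end of the contiguous prefix covered so far
        for (long[] s : sorted)
        {
            if (s[0] > covered)
                return false; // gap: some keys would not be transferred
            covered = Math.max(covered, s[1]);
        }
        return covered >= fileLength;
    }
}
```

This is O(n log n) in the number of sections, versus scanning every key in the sstable, which is why enabling ZCS for all compaction strategies becomes cheap.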






[jira] [Updated] (CASSANDRA-15657) Improve zero-copy-streaming containment check by using file sections

2020-04-03 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-15657:
-
Reviewers: Dinesh Joshi, T Jake Luciani  (was: Dinesh Joshi)

> Improve zero-copy-streaming containment check by using file sections
> 
>
> Key: CASSANDRA-15657
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15657
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Streaming and Messaging
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0
>
>
> Currently zero-copy streaming is only enabled for the leveled compaction 
> strategy, and it checks whether all keys in the sstables are included in the 
> transferred ranges.
> This is very inefficient. The containment check can be improved by checking 
> whether the transferred sections (the transferred file positions) cover the 
> entire sstable.
> I also enabled ZCS for all compaction strategies, since the new containment 
> check is very fast.






[jira] [Comment Edited] (CASSANDRA-15642) Inconsistent failure messages on distributed queries

2020-04-03 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17074850#comment-17074850
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-15642 at 4/3/20, 11:12 PM:
--

bq. there is no indication externally to the user/client that the information 
returned is not complete/reliable

What is your definition of complete/reliable?

bq. I am not sure how that would be possible without waiting for all the responses 
to come back or for a timeout, but happy to have it explained if I am missing something

At the point of failure you know whether you are failing because of a failure or 
a timeout.  The problem is only that we produce a nonsense error message that is 
inconsistent.  We are of course able to produce an error message whose 
information is internally consistent with the situation.

For instance, we tend to have a pattern of:

{code}
if (isFailed())
    reportFailure()
{code}

However the state changes between testing and reporting.  We should instead 
have:
{code}
state = state()
if (isFailed(state))
    reportFailure(state)
{code}

We also now have a situation where the state is a tuple of {{(triggerPrimitive, 
detailMap)}}, and we report a combination of {{triggerPrimitive}} and 
{{detailMap}}, despite them not being consistent.  Our decision and reporting 
should rest solely on {{detailMap}}, with {{triggerPrimitive}} serving only for 
scheduling purposes (waking up the waiting thread).



was (Author: benedict):
bq. there is no indication externally to the user/client that the information 
returned is not complete/reliable

What is your definition of complete/reliable?

bq. I am not sure how that would be possible without waiting for all the responses 
to come back or for a timeout, but happy to have it explained if I am missing something

At the point of failure you know whether you are failing because of a failure or 
a timeout.  The problem is only that we produce a nonsense error message that is 
inconsistent.  We are of course able to produce an error message whose 
information is internally consistent with the situation.

For instance, we tend to have a pattern of:

{code}
if (isFailed())
    reportFailure()
{code}

However the state changes between testing and reporting.  We should instead 
have:
{code}
state = state()
if (isFailed(state))
    reportFailure(state)
{code}

We also now have a situation where the state is a tuple of {{(triggerPrimitive, 
detailMap)}}, and we report a combination of {{triggerPrimitive}} and 
{{detailMap}}, despite them not being consistent.  Our decision and reporting 
should rest solely on {{detailMap}}, with {{triggerPrimitive}} serving only for 
scheduling purposes (waking up the waiting thread).


> Inconsistent failure messages on distributed queries
> 
>
> Key: CASSANDRA-15642
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15642
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination
>Reporter: Kevin Gallardo
>Priority: Normal
>
> As a follow up to some exploration I have done for CASSANDRA-15543, I 
> realized the following behavior in both {{ReadCallback}} and 
> {{AbstractWriteResponseHandler}}:
>  - await for responses
>  - when all required number of responses have come back: unblock the wait
>  - when a single failure happens: unblock the wait
>  - when unblocked, look to see if the counter of failures is > 1 and if so 
> return an error message based on the {{failures}} map that's been filled
> Error messages that can result from this behavior can be a ReadTimeout, a 
> ReadFailure, a WriteTimeout or a WriteFailure.
> In case of a Write/ReadFailure, the user will get back an error looking like 
> the following:
> "Failure: Received X responses, and Y failures"
> (if this behavior I describe is incorrect, please correct me)
> This causes a usability problem. Since the handler will fail and throw an 
> exception as soon as 1 failure happens, the error message that is returned to 
> the user may not be accurate.
> (note: I am not entirely sure of the behavior in case of timeouts for now)
> For example, say a request at CL = QUORUM = 3, a failed request may complete 
> first, then a successful one completes, and another fails. If the exception 
> is thrown fast enough, the error message could say 
>  "Failure: Received 0 response, and 1 failure at CL = 3"
> Which:
> 1. doesn't make a lot of sense because the CL doesn't match the number of 
> results in the message, so you end up thinking "what happened with the rest 
> of the required CL?"
> 2. the information is incorrect. We did receive a successful response, only 
> it came after the initial failure.
> From that logic, I think it is safe to assume that the information returned 
> in the error message cannot be trusted in case of a failu