[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread Jon Haddad (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Haddad updated CASSANDRA-15696:
---
  Fix Version/s: 4.0-alpha
  Since Version: 4.0-alpha
Source Control Link: 
https://github.com/apache/cassandra/commit/0e0d288ab7e87e7d4a7542c955dd06701798bd06
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Only track ideal CL failure when request CL is met
> --
>
> Key: CASSANDRA-15696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0-alpha
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment 
> a counter if a request doesn’t meet the consistency level specified in the 
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was 
> successful. I think it would be slightly better if we only incremented the 
> counter if the ideal CL wasn’t achieved but the query’s CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM, how often are those writes 
> failing to achieve EACH_QUORUM at other data centers? If you failed your 
> application over to one of those data centers, roughly how inconsistent might 
> it be given the number of writes that didn't propagate since the last 
> incremental repair?
> {quote}
> The main benefit of the JIRA was the ability to set a CL higher than the CL 
> being used, and to track how often we weren’t able to hit that CL despite 
> hitting the underlying CL. We should only increment the counter in cases 
> where we were able to meet the query-provided consistency but were unable to 
> meet the ideal consistency level.
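
To make the counting rule concrete, here is a minimal sketch of the intended logic. The method and boolean parameters are hypothetical; the committed patch expresses this with a requestedCLAchieved flag inside AbstractWriteResponseHandler, and writeFailedIdealCL is the real metric name.

{code:java}
// Hedged sketch, not the committed implementation.
import com.codahale.metrics.Counter;

class IdealCLTrackingSketch
{
    private final Counter writeFailedIdealCL = new Counter();

    // Hypothetical hook invoked once per write, with both outcomes known.
    void onWriteComplete(boolean requestedCLAchieved, boolean idealCLAchieved)
    {
        // Count an ideal-CL miss only when the request's own CL succeeded;
        // if the requested CL failed too, the miss tells us nothing new.
        if (requestedCLAchieved && !idealCLAchieved)
            writeFailedIdealCL.inc();
    }
}
{code}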



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread Jon Haddad (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Haddad updated CASSANDRA-15696:
---
Status: Ready to Commit  (was: Changes Suggested)

> Only track ideal CL failure when request CL is met
> --
>
> Key: CASSANDRA-15696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment 
> a counter if a request doesn’t meet the consistency level specified in the 
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was 
> successful. I think it would be slightly better if we only incremented the 
> counter if the ideal CL wasn’t achieved but the query’s CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM, how often are those writes 
> failing to achieve EACH_QUORUM at other data centers? If you failed your 
> application over to one of those data centers, roughly how inconsistent might 
> it be given the number of writes that didn't propagate since the last 
> incremental repair?
> {quote}
> The main benefit of the JIRA was the ability to set a CL higher than the CL 
> being used, and to track how often we weren’t able to hit that CL despite 
> hitting the underlying CL. We should only increment the counter in cases 
> where we were able to meet the query-provided consistency but were unable to 
> meet the ideal consistency level.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[cassandra] branch trunk updated: Only track ideal CL failure when request CL met

2020-04-06 Thread rustyrazorblade
This is an automated email from the ASF dual-hosted git repository.

rustyrazorblade pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 0e0d288  Only track ideal CL failure when request CL met
0e0d288 is described below

commit 0e0d288ab7e87e7d4a7542c955dd06701798bd06
Author: Jon Haddad j...@jonhaddad.com 
AuthorDate: Mon Apr 6 12:53:27 2020 -0700

Only track ideal CL failure when request CL met

Ideal consistency level tracking should not report a failure when the
requested CL was not met either.

Patch by Jon Haddad; Reviewed by Dinesh Joshi for CASSANDRA-15696.
---
 CHANGES.txt                                        |  1 +
 .../service/AbstractWriteResponseHandler.java      | 17 +++++++++++++++--
 .../service/WriteResponseHandlerTest.java          | 26 ++++++++++++++++++++++
 3 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index fb881de..95a6802 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0-alpha4
+ * Only track ideal CL failure when request CL met (CASSANDRA-15696)
  * Fix flaky CoordinatorMessagingTest and docstring in OutboundSink and ConsistentSession (CASSANDRA-15672)
  * Fix force compaction of wrapping ranges (CASSANDRA-15664)
  * Expose repair streaming metrics (CASSANDRA-15656)
diff --git a/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java b/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java
index 1889c79..b1eb5b3 100644
--- a/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java
+++ b/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java
@@ -74,6 +74,11 @@ public abstract class AbstractWriteResponseHandler implements RequestCallback
     private AbstractWriteResponseHandler idealCLDelegate;
 
     /**
+     * We don't want to increment the writeFailedIdealCL if we didn't achieve the original requested CL
+     */
+    private boolean requestedCLAchieved = false;
+
+    /**
      * @param callback   A callback to be called when the write is successful.
      * @param queryStartNanoTime
      */
@@ -232,6 +237,13 @@ public abstract class AbstractWriteResponseHandler implements RequestCallback
 
     protected void signal()
     {
+        //The ideal CL should only count as a strike if the requested CL was achieved.
+        //If the requested CL is not achieved it's fine for the ideal CL to also not be achieved.
+        if (idealCLDelegate != null)
+        {
+            idealCLDelegate.requestedCLAchieved = true;
+        }
+
         condition.signalAll();
         if (callback != null)
             callback.run();
@@ -279,8 +291,9 @@ public abstract class AbstractWriteResponseHandler implements RequestCallback
         int decrementedValue = responsesAndExpirations.decrementAndGet();
         if (decrementedValue == 0)
         {
-            //The condition being signaled is a valid proxy for the CL being achieved
-            if (!condition.isSignaled())
+            // The condition being signaled is a valid proxy for the CL being achieved
+            // Only mark it as failed if the requested CL was achieved.
+            if (!condition.isSignaled() && requestedCLAchieved)
             {
                 replicaPlan.keyspace().metric.writeFailedIdealCL.inc();
             }
diff --git a/test/unit/org/apache/cassandra/service/WriteResponseHandlerTest.java b/test/unit/org/apache/cassandra/service/WriteResponseHandlerTest.java
index f06b706..5d8d191 100644
--- a/test/unit/org/apache/cassandra/service/WriteResponseHandlerTest.java
+++ b/test/unit/org/apache/cassandra/service/WriteResponseHandlerTest.java
@@ -232,6 +232,32 @@ public class WriteResponseHandlerTest
         assertEquals(0, ks.metric.idealCLWriteLatency.totalLatency.getCount());
     }
 
+    /**
+     * Validate that failing to achieve ideal CL doesn't increase the failure counter when not meeting CL
+     * @throws Throwable
+     */
+    @Test
+    public void failedIdealCLDoesNotIncrementsStatOnQueryFailure() throws Throwable
+    {
+        AbstractWriteResponseHandler awr = createWriteResponseHandler(ConsistencyLevel.LOCAL_QUORUM, ConsistencyLevel.EACH_QUORUM);
+
+        long startingCount = ks.metric.writeFailedIdealCL.getCount();
+
+        // Failure in local DC
+        awr.onResponse(createDummyMessage(0));
+
+        awr.expired();
+        awr.expired();
+
+        //Fail in remote DC
+        awr.expired();
+        awr.expired();
+        awr.expired();
+
+        assertEquals(startingCount, ks.metric.writeFailedIdealCL.getCount());
+    }
+
+
     private static AbstractWriteResponseHandler createWriteResponseHandler(ConsistencyLevel cl, ConsistencyLevel ideal)
     {
         return createWriteResponseHandler(cl, ideal, System.nanoTime());



[jira] [Commented] (CASSANDRA-15660) Unable to specify -e/--execute flag in cqlsh

2020-04-06 Thread ZhaoYang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076803#comment-17076803
 ] 

ZhaoYang commented on CASSANDRA-15660:
--

[~djoshi] there is already a regression dtest.

> Unable to specify -e/--execute flag in cqlsh
> 
>
> Key: CASSANDRA-15660
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15660
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/cqlsh
>Reporter: Stefan Miklosovic
>Assignee: ZhaoYang
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0-alpha
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> From mailing list:
> [https://lists.apache.org/thread.html/r377099b632c62b641e4feef5b738084fc5369b0c7157fae867853597%40%3Cdev.cassandra.apache.org%3E]
> The bug looks like this:
> {code:java}
> $ /usr/bin/cqlsh -e 'describe keyspaces' -u cassandra -p cassandra 127.0.0.1
> Usage: cqlsh.py [options] [host [port]]
> cqlsh.py: error: '127.0.0.1' is not a valid port number.
> {code}
> This works just fine in 3.x releases but fails on 4.0.
> The workaround for the 4.x code as of today is to put these statements into a 
> file and use the "-f" flag.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread Jon Haddad (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076804#comment-17076804
 ] 

Jon Haddad commented on CASSANDRA-15696:


Ah yes - that was a typo.  Fixed.

> Only track ideal CL failure when request CL is met
> --
>
> Key: CASSANDRA-15696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment 
> a counter if a request doesn’t meet the consistency level specified in the 
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was 
> successful. I think it would be slightly better if we only incremented the 
> counter if the ideal CL wasn’t achieved but the query’s CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM, how often are those writes 
> failing to achieve EACH_QUORUM at other data centers? If you failed your 
> application over to one of those data centers, roughly how inconsistent might 
> it be given the number of writes that didn't propagate since the last 
> incremental repair?
> {quote}
> The main benefit of the JIRA was the ability to set a CL higher than the CL 
> being used, and to track how often we weren’t able to hit that CL despite 
> hitting the underlying CL. We should only increment the counter in cases 
> where we were able to meet the query-provided consistency but were unable to 
> meet the ideal consistency level.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-15696:
-
Status: Changes Suggested  (was: Review In Progress)

Thanks for your patch, [~rustyrazorblade]. It looks OK overall. I think there 
is just one minor issue where you're using a bitwise operator instead of a 
logical operator. Although the results are the same right now, I'd prefer that 
you change it to the logical operator on commit.
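
For readers following along, the difference in a generic Java snippet (unrelated to the patch's actual expression): {{&}} always evaluates both operands, while {{&&}} short-circuits.

{code:java}
// Generic illustration only; not the patch's code.
public class OperatorDemo
{
    public static void main(String[] args)
    {
        int[] responses = null;
        boolean haveResponses = false;

        // '&&' short-circuits: 'responses.length' is never evaluated here, so no NPE.
        boolean ok = haveResponses && responses.length > 0;
        System.out.println(ok);

        // '&' would evaluate both operands and throw a NullPointerException:
        // boolean bad = haveResponses & responses.length > 0;
    }
}
{code}

In the patch both operands are plain booleans, so the two operators produce the same result; the preference is stylistic.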

> Only track ideal CL failure when request CL is met
> --
>
> Key: CASSANDRA-15696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment 
> a counter if a request doesn’t meet the consistency level specified in the 
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was 
> successful. I think it would be slightly better if we only incremented the 
> counter if the ideal CL wasn’t achieved but the query’s CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM, how often are those writes 
> failing to achieve EACH_QUORUM at other data centers? If you failed your 
> application over to one of those data centers, roughly how inconsistent might 
> it be given the number of writes that didn't propagate since the last 
> incremental repair?
> {quote}
> The main benefit of the JIRA was the ability to set a CL higher than the CL 
> being used, and to track how often we weren’t able to hit that CL despite 
> hitting the underlying CL. We should only increment the counter in cases 
> where we were able to meet the query-provided consistency but were unable to 
> meet the ideal consistency level.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-15696:
-
Reviewers: Dinesh Joshi
   Status: Review In Progress  (was: Patch Available)

> Only track ideal CL failure when request CL is met
> --
>
> Key: CASSANDRA-15696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment 
> a counter if a request doesn’t meet the consistency level specified in the 
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was 
> successful. I think it would be slightly better if we only incremented the 
> counter if the ideal CL wasn’t achieved but the query’s CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM, how often are those writes 
> failing to achieve EACH_QUORUM at other data centers? If you failed your 
> application over to one of those data centers, roughly how inconsistent might 
> it be given the number of writes that didn't propagate since the last 
> incremental repair?
> {quote}
> The main benefit of the JIRA was the ability to set a CL higher than the CL 
> being used, and to track how often we weren’t able to hit that CL despite 
> hitting the underlying CL. We should only increment the counter in cases 
> where we were able to meet the query-provided consistency but were unable to 
> meet the ideal consistency level.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread Dinesh Joshi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dinesh Joshi updated CASSANDRA-15696:
-
Reviewers: Dinesh Joshi

> Only track ideal CL failure when request CL is met
> --
>
> Key: CASSANDRA-15696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment 
> a counter if a request doesn’t meet the consistency level specified in the 
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was 
> successful. I think it would be slightly better if we only incremented the 
> counter if the ideal CL wasn’t achieved but the query’s CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM, how often are those writes 
> failing to achieve EACH_QUORUM at other data centers? If you failed your 
> application over to one of those data centers, roughly how inconsistent might 
> it be given the number of writes that didn't propagate since the last 
> incremental repair?
> {quote}
> The main benefit of the JIRA was the ability to set a CL higher than the CL 
> being used, and to track how often we weren’t able to hit that CL despite 
> hitting the underlying CL. We should only increment the counter in cases 
> where we were able to meet the query-provided consistency but were unable to 
> meet the ideal consistency level.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread Jon Haddad (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Haddad updated CASSANDRA-15696:
---
Test and Documentation Plan: Test links in comments  (was: |[Unit 
Test|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/20/workflows/62377765-3e08-4d6f-b4b5-aec609a197e6/jobs/254]|
|[DTest|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/20/workflows/62377765-3e08-4d6f-b4b5-aec609a197e6/jobs/256]|
)
 Status: Patch Available  (was: In Progress)

> Only track ideal CL failure when request CL is met
> --
>
> Key: CASSANDRA-15696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment 
> a counter if a request doesn’t meet the consistency level specified in the 
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was 
> successful. I think it would be slightly better if we only incremented the 
> counter if the ideal CL wasn’t achieved but the query’s CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM, how often are those writes 
> failing to achieve EACH_QUORUM at other data centers? If you failed your 
> application over to one of those data centers, roughly how inconsistent might 
> it be given the number of writes that didn't propagate since the last 
> incremental repair?
> {quote}
> The main benefit of the JIRA was the ability to set a CL higher than the CL 
> being used, and to track how often we weren’t able to hit that CL despite 
> hitting the underlying CL. We should only increment the counter in cases 
> where we were able to meet the query-provided consistency but were unable to 
> meet the ideal consistency level.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread Jon Haddad (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076778#comment-17076778
 ] 

Jon Haddad edited comment on CASSANDRA-15696 at 4/7/20, 12:26 AM:
--

[Unit 
Tests|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/273]
[JVM DTests|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/272]
[Python DTests|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/279]


was (Author: rustyrazorblade):
Unit Tests: 
https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/273
JVM DTests: 
https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/272
Python DTests: 
https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/279

> Only track ideal CL failure when request CL is met
> --
>
> Key: CASSANDRA-15696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment 
> a counter if a request doesn’t meet the consistency level specified in the 
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was 
> successful. I think it would be slightly better if we only incremented the 
> counter if the ideal CL wasn’t achieved but the query’s CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM, how often are those writes 
> failing to achieve EACH_QUORUM at other data centers? If you failed your 
> application over to one of those data centers, roughly how inconsistent might 
> it be given the number of writes that didn't propagate since the last 
> incremental repair?
> {quote}
> The main benefit of the JIRA was the ability to set a CL higher than the CL 
> being used, and to track how often we weren’t able to hit that CL despite 
> hitting the underlying CL. We should only increment the counter in cases 
> where we were able to meet the query-provided consistency but were unable to 
> meet the ideal consistency level.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread Jon Haddad (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076778#comment-17076778
 ] 

Jon Haddad commented on CASSANDRA-15696:


Unit Tests: 
https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/273
JVM DTests: 
https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/272
Python DTests: 
https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/279

> Only track ideal CL failure when request CL is met
> --
>
> Key: CASSANDRA-15696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment 
> a counter if a request doesn’t meet the consistency level specified in the 
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was 
> successful. I think it would be slightly better if we only incremented the 
> counter if the ideal CL wasn’t achieved but the query’s CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM, how often are those writes 
> failing to achieve EACH_QUORUM at other data centers? If you failed your 
> application over to one of those data centers, roughly how inconsistent might 
> it be given the number of writes that didn't propagate since the last 
> incremental repair?
> {quote}
> The main benefit of the JIRA was the ability to set a CL higher than the CL 
> being used, and to track how often we weren’t able to hit that CL despite 
> hitting the underlying CL. We should only increment the counter in cases 
> where we were able to meet the query-provided consistency but were unable to 
> meet the ideal consistency level.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15660) Unable to specify -e/--execute flag in cqlsh

2020-04-06 Thread Dinesh Joshi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076754#comment-17076754
 ] 

Dinesh Joshi commented on CASSANDRA-15660:
--

thank you for the patch. Could we please add a test to detect a future 
regression?

> Unable to specify -e/--execute flag in cqlsh
> 
>
> Key: CASSANDRA-15660
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15660
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/cqlsh
>Reporter: Stefan Miklosovic
>Assignee: ZhaoYang
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0-alpha
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> From mailing list:
> [https://lists.apache.org/thread.html/r377099b632c62b641e4feef5b738084fc5369b0c7157fae867853597%40%3Cdev.cassandra.apache.org%3E]
> The bug looks like this:
> {code:java}
> $ /usr/bin/cqlsh -e 'describe keyspaces' -u cassandra -p cassandra 127.0.0.1
> Usage: cqlsh.py [options] [host [port]]
> cqlsh.py: error: '127.0.0.1' is not a valid port number.
> {code}
> This works just fine in 3.x releases but fails on 4.0.
> The workaround for the 4.x code as of today is to put these statements into a 
> file and use the "-f" flag.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread Jon Haddad (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Haddad updated CASSANDRA-15696:
---
Status: In Progress  (was: Patch Available)

> Only track ideal CL failure when request CL is met
> --
>
> Key: CASSANDRA-15696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment 
> a counter if a request doesn’t meet the consistency level specified in the 
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was 
> successful. I think it would be slightly better if we only incremented the 
> counter if the ideal CL wasn’t achieved but the query’s CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM, how often are those writes 
> failing to achieve EACH_QUORUM at other data centers? If you failed your 
> application over to one of those data centers, roughly how inconsistent might 
> it be given the number of writes that didn't propagate since the last 
> incremental repair?
> {quote}
> The main benefit of the JIRA was the ability to set a CL higher than the CL 
> being used, and to track how often we weren’t able to hit that CL despite 
> hitting the underlying CL. We should only increment the counter in cases 
> where we were able to meet the query-provided consistency but were unable to 
> meet the ideal consistency level.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread Jon Haddad (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Haddad updated CASSANDRA-15696:
---
Test and Documentation Plan: 
|[Unit 
Test|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/20/workflows/62377765-3e08-4d6f-b4b5-aec609a197e6/jobs/254]|
|[DTest|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/20/workflows/62377765-3e08-4d6f-b4b5-aec609a197e6/jobs/256]|

 Status: Patch Available  (was: Open)

There's a single failing DTest unrelated to this patch.  

> Only track ideal CL failure when request CL is met
> --
>
> Key: CASSANDRA-15696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment 
> a counter if a request doesn’t meet the consistency level specified in the 
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was 
> successful. I think it would be slightly better if we only incremented the 
> counter if the ideal CL wasn’t achieved but the query’s CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM, how often are those writes 
> failing to achieve EACH_QUORUM at other data centers? If you failed your 
> application over to one of those data centers, roughly how inconsistent might 
> it be given the number of writes that didn't propagate since the last 
> incremental repair?
> {quote}
> The main benefit of the JIRA was the ability to set a CL higher than the CL 
> being used, and to track how often we weren’t able to hit that CL despite 
> hitting the underlying CL. We should only increment the counter in cases 
> where we were able to meet the query-provided consistency but were unable to 
> meet the ideal consistency level.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest

2020-04-06 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076727#comment-17076727
 ] 

David Capwell commented on CASSANDRA-15338:
---

I have a few things on my plate, but I should be able to look at this by the 
end of the week.

> Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
> ---
>
> Key: CASSANDRA-15338
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15338
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Yifan Cai
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0-alpha
>
> Attachments: CASS-15338-Docker.zip
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Example failure: 
> [https://circleci.com/gh/dcapwell/cassandra/11#artifacts/containers/1]
>   
> {code:java}
> Testcase: testMessagePurging(org.apache.cassandra.net.ConnectionTest):  FAILED
>  expected:<0> but was:<1>
>  junit.framework.AssertionFailedError: expected:<0> but was:<1>
>    at 
> org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$38(ConnectionTest.java:625)
>    at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:258)
>    at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:231)
>    at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:584){code}
>   
>  Looking closer at 
> org.apache.cassandra.net.OutboundConnection.Delivery#stopAndRun, it seems that 
> the run method is called before 
> org.apache.cassandra.net.OutboundConnection.Delivery#doRun, which may lead to 
> a test race condition where the CountDownLatch completes before the work 
> executes.
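
A minimal sketch of the suspected interleaving (a hypothetical reduction; stopAndRun/doRun name the real methods, but this is not the test's code):

{code:java}
// Hedged sketch: the latch can complete before the queued work has executed.
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class StopAndRunRaceSketch
{
    public static void main(String[] args) throws InterruptedException
    {
        ExecutorService delivery = Executors.newSingleThreadExecutor();
        CountDownLatch done = new CountDownLatch(1);
        final int[] purged = { 0 };

        // stopAndRun-style callback: releases the latch first...
        delivery.execute(done::countDown);
        // ...while the doRun-style work is still queued behind it.
        delivery.execute(() -> purged[0] = 1);

        done.await();
        // This read races with the queued work: sometimes 0, sometimes 1,
        // which is exactly the shape of a flaky assertion.
        System.out.println("purged = " + purged[0]);
        delivery.shutdown();
    }
}
{code}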



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CASSANDRA-15696:
---
Labels: pull-request-available  (was: )

> Only track ideal CL failure when request CL is met
> --
>
> Key: CASSANDRA-15696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Normal
>  Labels: pull-request-available
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment 
> a counter if a request doesn’t meet the consistency level specified in the 
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was 
> successful. I think it would be slightly better if we only incremented the 
> counter if the ideal CL wasn’t achieved but the query’s CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM, how often are those writes 
> failing to achieve EACH_QUORUM at other data centers? If you failed your 
> application over to one of those data centers, roughly how inconsistent might 
> it be given the number of writes that didn't propagate since the last 
> incremental repair?
> {quote}
> The main benefit of the JIRA was the ability to set a CL higher than the CL 
> being used, and to track how often we weren’t able to hit that CL despite 
> hitting the underlying CL. We should only increment the counter in cases 
> where we were able to meet the query-provided consistency but were unable to 
> meet the ideal consistency level.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15642) Inconsistent failure messages on distributed queries

2020-04-06 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076718#comment-17076718
 ] 

Benedict Elliott Smith commented on CASSANDRA-15642:


So what do you propose?  Some questions to consider: 

* How long until you're sure you've received all the responses you might ever 
receive?  
* Can you guarantee to respond within the timeout specified by the operation?
* If not, can you as a result ever guarantee "complete" information?
* If not, is it a coherent concept for a distributed system?
* How would you balance the delayed responses with user requirements to take 
corrective action promptly in response to failures?



> Inconsistent failure messages on distributed queries
> 
>
> Key: CASSANDRA-15642
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15642
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination
>Reporter: Kevin Gallardo
>Priority: Normal
>
> As a follow-up to some exploration I did for CASSANDRA-15543, I noticed the 
> following behavior in both {{ReadCallback}} and 
> {{AbstractWriteHandler}}:
>  - await for responses
>  - when the required number of responses has come back: unblock the wait
>  - when a single failure happens: unblock the wait
>  - when unblocked, look to see if the counter of failures is > 1 and if so 
> return an error message based on the {{failures}} map that's been filled
> Error messages that can result from this behavior can be a ReadTimeout, a 
> ReadFailure, a WriteTimeout or a WriteFailure.
> In case of a Write/ReadFailure, the user will get back an error looking like 
> the following:
> "Failure: Received X responses, and Y failures"
> (if this behavior I describe is incorrect, please correct me)
> This causes a usability problem. Since the handler will fail and throw an 
> exception as soon as 1 failure happens, the error message that is returned to 
> the user may not be accurate.
> (note: I am not entirely sure of the behavior in case of timeouts for now)
> For example, take a request at CL = QUORUM = 3: a failed response may arrive 
> first, then a successful one, and then another failure. If the exception is 
> thrown fast enough, the error message could say 
>  "Failure: Received 0 response, and 1 failure at CL = 3"
> Which:
> 1. doesn't make a lot of sense because the CL doesn't match the number of 
> results in the message, so you end up thinking "what happened with the rest 
> of the required CL?"
> 2. the information is incorrect. We did receive a successful response, only 
> it came after the initial failure.
> From that logic, I think it is safe to assume that the information returned 
> in the error message cannot be trusted in case of a failure. The only 
> information users should extract from it is that at least one node has failed.
> For a big improvement in usability, the {{ReadCallback}} and 
> {{AbstractWriteResponseHandler}} could instead wait for all responses to come 
> back before unblocking the wait, or let it time out. This way, users will be 
> able to trust the information returned to them.
> Additionally, an error that happens first prevents a timeout from happening, 
> because the request fails immediately, so it can potentially hide problems 
> with other replicas. If we were to wait for all responses, we might get a 
> timeout; in that case we'd also be able to tell whether failures happened 
> *before* that timeout, and have a more complete diagnostic in cases where 
> today you can't detect both errors at the same time.
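
A condensed sketch of the unblocking behavior being described (a hypothetical shape, not the actual ReadCallback/AbstractWriteResponseHandler code; spurious wakeups and generics are ignored for brevity):

{code:java}
// Hedged sketch of "unblock on enough responses OR on first failure".
class ResponseHandlerSketch
{
    private final int blockFor = 3;   // e.g. QUORUM of 3
    private int received;
    private int failures;

    synchronized void onResponse()
    {
        received++;
        if (received >= blockFor)
            notifyAll();              // enough responses: unblock the wait
    }

    synchronized void onFailure()
    {
        failures++;
        notifyAll();                  // a single failure also unblocks,
    }                                 // possibly before late successes arrive

    synchronized void get(long timeoutMillis) throws InterruptedException
    {
        wait(timeoutMillis);
        if (failures > 0)
            // Built from whatever has arrived so far, which is how
            // "Received 0 responses, and 1 failure" can appear at CL 3.
            throw new IllegalStateException(
                "Received " + received + " responses, and " + failures + " failures");
    }
}
{code}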



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15229) BufferPool Regression

2020-04-06 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076712#comment-17076712
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-15229 at 4/6/20, 10:07 PM:
--

bq. memory wasted due to fragmentation is perhaps not an issue with a cache as 
little as 512 MB

My view is that having a significant proportion of memory wasted to 
fragmentation is a serious bug, regardless of the total amount of memory that 
is wasted.

bq. The point is that it is not suitable for long lived buffers, similarly to 
our bump the pointer strategy.

It's not poorly suited to long-lived buffers, is it?  Only to buffers with 
widely divergent lifetimes.  If the lifetimes are loosely correlated, then the 
length of the lifetime is mostly irrelevant, I think.

bq. The changes to the buffer pool can be dropped in 4.0 if you think that

If you mean introducing a new pool specifically for {{ChunkCache}}, I'm fine 
with it as an alternative to permitting {{BufferPool}} to mitigate worst-case 
behaviour for the {{ChunkCache}}.  But verifying a replacement for 
{{BufferPool}} is a lot more work, and we use the {{BufferPool}} extensively in 
networking now, which requires non-uniform buffer sizes.

Honestly, given chunks are normally the same size, simply re-using the evicted 
buffer if possible, and if not allocating new system memory, seems probably 
sufficient to me.

bq. I'll try to share some code so you can have a clearer picture.

Thanks, that sounds great.  I may not get to it immediately, but look forward 
to taking a look hopefully soon.


was (Author: benedict):
bq. memory wasted due to fragmentation is perhaps not an issue with a cache as 
little as 512 MB

My view is that having a significant proportion of memory wasted to 
fragmentation is a serious bug, regardless of the total amount of memory that 
is wasted.

bq. The point is that it is not suitable for long lived buffers, similarly to 
our bump the pointer strategy.

It's not poorly suited to long-lived buffers, is it?  Only to buffers with 
widely divergent lifetimes.  If the lifetimes are loosely correlated, then the 
length of the lifetime is mostly irrelevant, I think.

bq. The changes to the buffer pool can be dropped in 4.0 if you think that

If you mean introducing a new pool specifically for {{ChunkCache}}, I'm fine 
with it as an alternative to permitting {{BufferPool}} to mitigate worst-case 
behaviour for the {{ChunkCache}}.  But verifying a replacement for 
{{BufferPool}} is a lot more work, and we use the {{BufferPool}} extensively in 
networking now, which requires non-uniform buffer sizes.

Honestly, given chunks are normally the same size, simply re-using the evicted 
buffer if possible, and if not allocating new system memory, seems probably 
sufficient to me.

> BufferPool Regression
> -
>
> Key: CASSANDRA-15229
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15229
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Caching
>Reporter: Benedict Elliott Smith
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0, 4.0-beta
>
>
> The BufferPool was never intended to be used for a {{ChunkCache}}, and we 
> need to either change our behaviour to handle uncorrelated lifetimes or use 
> something else.  This is particularly important with the default chunk size 
> for compressed sstables being reduced.  If we address the problem, we should 
> also utilise the BufferPool for native transport connections like we do for 
> internode messaging, and reduce the number of pooling solutions we employ.
> Probably the best thing to do is to improve BufferPool’s behaviour when used 
> for things with uncorrelated lifetimes, which essentially boils down to 
> tracking those chunks that have not been freed and re-circulating them when 
> we run out of completely free blocks.  We should probably also permit 
> instantiating separate {{BufferPool}}, so that we can insulate internode 
> messaging from the {{ChunkCache}}, or at least have separate memory bounds 
> for each, and only share fully-freed chunks.
> With these improvements we can also safely increase the {{BufferPool}} chunk 
> size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce 
> the amount of global coordination and per-allocation overhead.  We don’t need 
> 1KiB granularity for allocations, nor 16 byte granularity for tiny 
> allocations.
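
One way to picture the recirculation idea is the hedged sketch below, written under assumed names; BufferPool's real internals differ, and this is not its API.

{code:java}
// Hedged sketch of recirculating partially freed chunks.
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

class RecirculatingPoolSketch
{
    private final Deque<ByteBuffer> fullyFree = new ArrayDeque<>();
    private final Deque<ByteBuffer> partiallyFreed = new ArrayDeque<>();
    private final int chunkSize;

    RecirculatingPoolSketch(int chunkSize)
    {
        this.chunkSize = chunkSize;
    }

    ByteBuffer takeChunk()
    {
        // Prefer completely free chunks...
        ByteBuffer chunk = fullyFree.poll();
        if (chunk == null)
            // ...but when none remain, recirculate a tracked, partially
            // freed chunk instead of stranding its memory until every
            // outstanding slice is returned.
            chunk = partiallyFreed.poll();
        return chunk != null ? chunk : ByteBuffer.allocateDirect(chunkSize);
    }

    void onFree(ByteBuffer chunk, boolean fullyFreed)
    {
        (fullyFreed ? fullyFree : partiallyFreed).add(chunk);
    }
}
{code}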



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15229) BufferPool Regression

2020-04-06 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076712#comment-17076712
 ] 

Benedict Elliott Smith commented on CASSANDRA-15229:


bq. memory wasted due to fragmentation is perhaps not an issue with a cache as 
little as 512 MB

My view is that having a significant proportion of memory wasted to 
fragmentation is a serious bug, regardless of the total amount of memory that 
is wasted.

bq. The point is that it is not suitable for long lived buffers, similarly to 
our bump the pointer strategy.

It's not poorly suited to long-lived buffers, is it?  Only to buffers with 
widely divergent lifetimes.  If the lifetimes are loosely correlated, then the 
length of the lifetime is mostly irrelevant, I think.

bq. The changes to the buffer pool can be dropped in 4.0 if you think that

If you mean introducing a new pool specifically for {{ChunkCache}}, I'm fine 
with it as an alternative to permitting {{BufferPool}} to mitigate worst-case 
behaviour for the {{ChunkCache}}.  But verifying a replacement for 
{{BufferPool}} is a lot more work, and we use the {{BufferPool}} extensively in 
networking now, which requires non-uniform buffer sizes.

Honestly, given chunks are normally the same size, simply re-using the evicted 
buffer if possible, and if not allocating new system memory, seems probably 
sufficient to me.

> BufferPool Regression
> -
>
> Key: CASSANDRA-15229
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15229
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Caching
>Reporter: Benedict Elliott Smith
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0, 4.0-beta
>
>
> The BufferPool was never intended to be used for a {{ChunkCache}}, and we 
> need to either change our behaviour to handle uncorrelated lifetimes or use 
> something else.  This is particularly important with the default chunk size 
> for compressed sstables being reduced.  If we address the problem, we should 
> also utilise the BufferPool for native transport connections like we do for 
> internode messaging, and reduce the number of pooling solutions we employ.
> Probably the best thing to do is to improve BufferPool’s behaviour when used 
> for things with uncorrelated lifetimes, which essentially boils down to 
> tracking those chunks that have not been freed and re-circulating them when 
> we run out of completely free blocks.  We should probably also permit 
> instantiating separate {{BufferPool}}, so that we can insulate internode 
> messaging from the {{ChunkCache}}, or at least have separate memory bounds 
> for each, and only share fully-freed chunks.
> With these improvements we can also safely increase the {{BufferPool}} chunk 
> size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce 
> the amount of global coordination and per-allocation overhead.  We don’t need 
> 1KiB granularity for allocations, nor 16 byte granularity for tiny 
> allocations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15697) cqlsh -e parsing bug

2020-04-06 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076707#comment-17076707
 ] 

Stefan Miklosovic commented on CASSANDRA-15697:
---

[~jrwest] I have already hit this and it is solved in 
https://issues.apache.org/jira/browse/CASSANDRA-15660; this should be closed / 
resolved as a duplicate.

> cqlsh -e parsing bug
> 
>
> Key: CASSANDRA-15697
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15697
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/cqlsh
>Reporter: Jordan West
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> {{cqlsh -e}} no longer works on trunk after the introduction of python 3 
> support (CASSANDRA-10190). Examples below. 
> {code}
> $ ./bin/cqlsh -e 'select * from foo;'
> Usage: cqlsh.py [options] [host [port]]
> cqlsh.py: error: 'CHANGES.txt' is not a valid port number.
> $ ./bin/cqlsh -e 'select id from foo;'
> Usage: cqlsh.py [options] [host [port]]
> cqlsh.py: error: 'from' is not a valid port number.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15671) Testcase: testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest): FAILED

2020-04-06 Thread Ekaterina Dimitrova (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ekaterina Dimitrova updated CASSANDRA-15671:

Status: Ready to Commit  (was: Review In Progress)

> Testcase: 
> testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest):
>FAILED
> --
>
> Key: CASSANDRA-15671
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15671
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Francisco Fernandez
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The following test failure was observed:
> [junit-timeout] Testcase: 
> testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest):
>FAILED
> [junit-timeout] expected:<4> but was:<5>
> [junit-timeout] junit.framework.AssertionFailedError: expected:<4> but was:<5>
> [junit-timeout]   at 
> org.apache.cassandra.db.compaction.CancelCompactionsTest.testSubrangeCompaction(CancelCompactionsTest.java:190)
> Java 8



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15671) Testcase: testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest): FAILED

2020-04-06 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076692#comment-17076692
 ] 

Ekaterina Dimitrova commented on CASSANDRA-15671:
-

Hi, now it is fine, thank you!
The patch LGTM. Thanks!
The only thing is that I am not a committer.
[~brandon.williams], can you please check and commit this one? Thanks in 
advance!

> Testcase: 
> testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest):
>FAILED
> --
>
> Key: CASSANDRA-15671
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15671
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Francisco Fernandez
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The following test failure was observed:
> [junit-timeout] Testcase: 
> testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest):
>FAILED
> [junit-timeout] expected:<4> but was:<5>
> [junit-timeout] junit.framework.AssertionFailedError: expected:<4> but was:<5>
> [junit-timeout]   at 
> org.apache.cassandra.db.compaction.CancelCompactionsTest.testSubrangeCompaction(CancelCompactionsTest.java:190)
> Java 8



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15597) Correct Visibility and Improve Safety of Methods in LatencyMetrics

2020-04-06 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076663#comment-17076663
 ] 

Jordan West edited comment on CASSANDRA-15597 at 4/6/20, 9:03 PM:
--

+1

Tests: https://circleci.com/gh/jrwest/cassandra/tree/15597-4.0


was (Author: jrwest):
+1

> Correct Visibility and Improve Safety of Methods in LatencyMetrics
> --
>
> Key: CASSANDRA-15597
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15597
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Jordan West
>Assignee: Jeff
>Priority: Normal
> Fix For: 4.0
>
>
> * add/removeChildren does not need to be public (and exposing addChildren is 
> unsafe since no lock is used). 
> * casting in the constructor is safer than casting each time in removeChildren
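
The second bullet, sketched generically (hypothetical classes, not the LatencyMetrics code):

{code:java}
// Hedged sketch: validate/cast once at construction instead of on every removal.
import java.util.ArrayList;
import java.util.List;

class ParentMetricSketch
{
    private final List<ChildMetricSketch> children = new ArrayList<>();

    // The unchecked cast happens once, here, and fails fast in one place...
    ParentMetricSketch(Object... maybeChildren)
    {
        for (Object o : maybeChildren)
            children.add((ChildMetricSketch) o);
    }

    // ...so removal never needs to repeat the cast.
    synchronized void removeChild(ChildMetricSketch child)
    {
        children.remove(child);
    }
}

class ChildMetricSketch {}
{code}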



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15597) Correct Visibility and Improve Safety of Methods in LatencyMetrics

2020-04-06 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076663#comment-17076663
 ] 

Jordan West commented on CASSANDRA-15597:
-

+1

> Correct Visibility and Improve Safety of Methods in LatencyMetrics
> --
>
> Key: CASSANDRA-15597
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15597
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Jordan West
>Assignee: Jeff
>Priority: Normal
> Fix For: 4.0
>
>
> * add/removeChildren does not need to be public (and exposing addChildren is 
> unsafe since no lock is used). 
> * casting in the constructor is safer than casting each time in removeChildren



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15697) cqlsh -e parsing bug

2020-04-06 Thread Jordan West (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan West updated CASSANDRA-15697:

 Bug Category: Parent values: Correctness(12982)Level 1 values: API / 
Semantic Implementation(12988)
   Complexity: Normal
  Component/s: Tool/cqlsh
Discovered By: User Report
Fix Version/s: 4.0-alpha
 Severity: Normal
   Status: Open  (was: Triage Needed)

> cqlsh -e parsing bug
> 
>
> Key: CASSANDRA-15697
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15697
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/cqlsh
>Reporter: Jordan West
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> {{cqlsh -e}} no longer works on trunk after the introduction of python 3 
> support (CASSANDRA-10190). Examples below. 
> {code}
> $ ./bin/cqlsh -e 'select * from foo;'
> Usage: cqlsh.py [options] [host [port]]
> cqlsh.py: error: 'CHANGES.txt' is not a valid port number.
> $ ./bin/cqlsh -e 'select id from foo;'
> Usage: cqlsh.py [options] [host [port]]
> cqlsh.py: error: 'from' is not a valid port number.
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-15697) cqlsh -e parsing bug

2020-04-06 Thread Jordan West (Jira)
Jordan West created CASSANDRA-15697:
---

 Summary: cqlsh -e parsing bug
 Key: CASSANDRA-15697
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15697
 Project: Cassandra
  Issue Type: Bug
Reporter: Jordan West


{{cqlsh -e}} no longer works on trunk after the introduction of Python 3 
support (CASSANDRA-10190). Examples below. 

{code}
$ ./bin/cqlsh -e 'select * from foo;'
Usage: cqlsh.py [options] [host [port]]

cqlsh.py: error: 'CHANGES.txt' is not a valid port number.

$ ./bin/cqlsh -e 'select id from foo;'
Usage: cqlsh.py [options] [host [port]]

cqlsh.py: error: 'from' is not a valid port number.

{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest

2020-04-06 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076633#comment-17076633
 ] 

Yifan Cai commented on CASSANDRA-15338:
---

[~benedict][~dcapwell], do you want to take a look?

> Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
> ---
>
> Key: CASSANDRA-15338
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15338
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: David Capwell
>Assignee: Yifan Cai
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0-alpha
>
> Attachments: CASS-15338-Docker.zip
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Example failure: 
> [https://circleci.com/gh/dcapwell/cassandra/11#artifacts/containers/1]
>   
> {code:java}
> Testcase: testMessagePurging(org.apache.cassandra.net.ConnectionTest):  FAILED
>  expected:<0> but was:<1>
>  junit.framework.AssertionFailedError: expected:<0> but was:<1>
>    at 
> org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$38(ConnectionTest.java:625)
>    at 
> org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:258)
>    at 
> org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:231)
>    at 
> org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:584){code}
>   
>  Looking closer at 
> org.apache.cassandra.net.OutboundConnection.Delivery#stopAndRun it seems that 
> the run method is called before 
> org.apache.cassandra.net.OutboundConnection.Delivery#doRun which may lead to 
> a test race condition where the CountDownLatch completes before executing
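A contrived sketch of that kind of race, not the actual test code: if the latch can fire before the queued work executes, an assertion made right after await() may observe a stale value.

{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class StopAndRunRaceSketch
{
    public static void main(String[] args) throws InterruptedException
    {
        AtomicInteger pendingMessages = new AtomicInteger(1);
        CountDownLatch done = new CountDownLatch(1);

        Thread delivery = new Thread(() -> {
            // The race: the latch is counted down before the work runs,
            // analogous to stopAndRun's runnable firing before doRun.
            done.countDown();
            pendingMessages.decrementAndGet();
        });
        delivery.start();

        done.await();
        // Depending on timing this may print 1 (stale) or 0 -- the same
        // shape as the expected:<0> but was:<1> failure above.
        System.out.println("pending = " + pendingMessages.get());
    }
}
{code}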



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15623) When running CQLSH with STDIN input, exit with error status code if script fails

2020-04-06 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076628#comment-17076628
 ] 

Jordan West commented on CASSANDRA-15623:
-

[~plastikat] thanks for reworking the patch for trunk. The changes LGTM. I 
verified the new behavior works as expected and the old behavior remains 
unchanged: 

 

```

Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo 'select * from 
foo;' | ./bin/cqlsh
:2:InvalidRequest: Error from server: code=2200 [Invalid query] 
message="No keyspace has been specified. USE a keyspace, or explicitly specify 
keyspace.tablename"
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $?
2
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo 'select * from;' 
| ./bin/cqlsh
:2:SyntaxException: line 1:13 no viable alternative at input ';' (select 
* from[;])
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $?
2
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ ./bin/cqlsh -e 
'select;'
:1:SyntaxException: line 1:6 no viable alternative at input ';' 
(select[;])
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $?
2
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ ./bin/cqlsh -f testcql
testcql:2:SyntaxException: line 1:6 no viable alternative at input ';' 
(select[;])
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $?
2

```

 

I did, however, notice a new bug while testing that also exists on trunk 
(unrelated to this change). More to come on that. 

> When running CQLSH with STDIN input, exit with error status code if script 
> fails
> 
>
> Key: CASSANDRA-15623
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15623
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Tools
>Reporter: Jacob Becker
>Assignee: Jacob Becker
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Assuming CASSANDRA-6344 is in place for years and considering that scripts 
> submitted with the `-e` option behave in a similar fashion, it is very 
> surprising that scripts submitted to STDIN (i.e. piped in) always exit with a 
> zero code, regardless of errors. I believe this should be fixed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15623) When running CQLSH with STDIN input, exit with error status code if script fails

2020-04-06 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076628#comment-17076628
 ] 

Jordan West edited comment on CASSANDRA-15623 at 4/6/20, 8:15 PM:
--

[~plastikat] thanks for reworking the patch for trunk. The changes LGTM. I 
verified the new behavior works as expected and the old behavior remains 
unchanged: 

 

{code}
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo 'select * from 
foo;' | ./bin/cqlsh
:2:InvalidRequest: Error from server: code=2200 [Invalid query] 
message="No keyspace has been specified. USE a keyspace, or explicitly specify 
keyspace.tablename"
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $?
2
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo 'select * from;' 
| ./bin/cqlsh
:2:SyntaxException: line 1:13 no viable alternative at input ';' (select 
* from[;])
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $?
2
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ ./bin/cqlsh -e 
'select;'
:1:SyntaxException: line 1:6 no viable alternative at input ';' 
(select[;])
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $?
2
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ ./bin/cqlsh -f testcql
testcql:2:SyntaxException: line 1:6 no viable alternative at input ';' 
(select[;])
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $?
2
{code}

 

I did, however, notice a new bug while testing that also exists on trunk 
(unrelated to this change). More to come on that. 


was (Author: jrwest):
[~plastikat] thanks for reworking the patch for trunk. The changes LGTM. I 
verified the new behavior works as expected and the old behavior remains 
unchanged: 

 

```

Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo 'select * from 
foo;' | ./bin/cqlsh
:2:InvalidRequest: Error from server: code=2200 [Invalid query] 
message="No keyspace has been specified. USE a keyspace, or explicitly specify 
keyspace.tablename"
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $?
2
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo 'select * from;' 
| ./bin/cqlsh
:2:SyntaxException: line 1:13 no viable alternative at input ';' (select 
* from[;])
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $?
2
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ ./bin/cqlsh -e 
'select;'
:1:SyntaxException: line 1:6 no viable alternative at input ';' 
(select[;])
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $?
2
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ ./bin/cqlsh -f testcql
testcql:2:SyntaxException: line 1:6 no viable alternative at input ';' 
(select[;])
Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $?
2

```

 

I did, however, notice a new bug while testing that also exists on trunk 
(unrelated to this change). More to come on that. 

> When running CQLSH with STDIN input, exit with error status code if script 
> fails
> 
>
> Key: CASSANDRA-15623
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15623
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Tools
>Reporter: Jacob Becker
>Assignee: Jacob Becker
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Assuming CASSANDRA-6344 is in place for years and considering that scripts 
> submitted with the `-e` option behave in a similar fashion, it is very 
> surprising that scripts submitted to STDIN (i.e. piped in) always exit with a 
> zero code, regardless of errors. I believe this should be fixed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread Jon Haddad (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Haddad updated CASSANDRA-15696:
---
 Bug Category: Parent values: Code(13163), Level 1 values: Bug - Unclear 
Impact(13164)
   Complexity: Low Hanging Fruit
Discovered By: Code Inspection
 Severity: Low
   Status: Open  (was: Triage Needed)

> Only track ideal CL failure when request CL is met
> --
>
> Key: CASSANDRA-15696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Normal
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment 
> a counter if a request doesn’t meet the consistency level specified in the 
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was 
> successful. I think it would be slightly better if we only incremented the 
> counter if the ideal CL wasn’t achieved but the query’s CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM how often are those writes 
> failing to achieve EACH_QUORUM at other data centers. If you failed your 
> application over to one of those data centers roughly how inconsistent might 
> it be given the number of writes that didn't propagate since the last 
> incremental repair?
> {quote}
> The main benefit to the JIRA was to set a CL higher than the CL being used, 
> and to track how often we weren’t able to hit that CL despite hitting the 
> underlying CL. We should only increment the counter in a case where we were 
> able to meet the query-provided consistency but were unable to meet the ideal 
> consistency level.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread Jon Haddad (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Haddad updated CASSANDRA-15696:
---
Description: 
When ideal_consistency_level is set (CASSANDRA-13289), we currently increment a 
counter if a request doesn’t meet the consistency level specified in the 
configuration (or through JMX).

At the moment, we increment the counter whether or not the query was 
successful. I think it would be slightly better if we only incremented the 
counter if the ideal CL wasn’t achieved but the query’s CL was met.

The original JIRA stated the following as an objective:
{quote}If your application writes at LOCAL_QUORUM how often are those writes 
failing to achieve EACH_QUORUM at other data centers. If you failed your 
application over to one of those data centers roughly how inconsistent might it 
be given the number of writes that didn't propagate since the last incremental 
repair?
{quote}
The main benefit to the JIRA was to set a CL higher than the CL being used, and 
to track how often we weren’t able to hit that CL despite hitting the 
underlying CL. We should only increment the counter in a case where we were 
able to meet the query-provided consistency but were unable to meet the ideal 
consistency level.

  was:
When ideal_consistency_level is set (CASSANDRA-13289), we currently increment a 
counter if a request doesn’t use the consistency level specified in the 
configuration (or through JMX).  

At the moment, we increment the counter whether or not the query was 
successful. I think it would be slightly better if we only incremented the 
counter if the ideal CL wasn’t achieved but the query’s CL was met.

The original JIRA stated the following as an objective:

{quote}
If your application writes at LOCAL_QUORUM how often are those writes failing 
to achieve EACH_QUORUM at other data centers. If you failed your application 
over to one of those data centers roughly how inconsistent might it be given 
the number of writes that didn't propagate since the last incremental repair?
{quote}

The main benefit to the JIRA was to set a CL higher than the CL being used, and 
to track how often we weren’t able to hit that CL despite hitting the 
underlying CL. We should only increment the counter in a case where we were 
able to meet the query-provided consistency but were unable to meet the ideal 
consistency level.


> Only track ideal CL failure when request CL is met
> --
>
> Key: CASSANDRA-15696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Normal
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment 
> a counter if a request doesn’t meet the consistency level specified in the 
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was 
> successful. I think it would be slightly better if we only incremented the 
> counter if the ideal CL wasn’t achieved but the query’s CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM how often are those writes 
> failing to achieve EACH_QUORUM at other data centers. If you failed your 
> application over to one of those data centers roughly how inconsistent might 
> it be given the number of writes that didn't propagate since the last 
> incremental repair?
> {quote}
> The main benefit to the JIRA was to set a CL higher than the CL being used, 
> and to track how often we weren’t able to hit that CL despite hitting the 
> underlying CL. We should only increment the counter in a case where we were 
> able to meet the query-provided consistency but were unable to meet the ideal 
> consistency level.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-15696) Only track ideal CL failure when request CL is met

2020-04-06 Thread Jon Haddad (Jira)
Jon Haddad created CASSANDRA-15696:
--

 Summary: Only track ideal CL failure when request CL is met
 Key: CASSANDRA-15696
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
 Project: Cassandra
  Issue Type: Bug
  Components: Observability/Metrics
Reporter: Jon Haddad
Assignee: Jon Haddad


When ideal_consistency_level is set (CASSANDRA-13289), we currently increment a 
counter if a request doesn’t use the consistency level specified in the 
configuration (or through JMX).  

At the moment, we increment the counter whether or not the query was 
successful. I think it would be slightly better if we only incremented the 
counter if the ideal CL wasn’t achieved but the query’s CL was met.

The original JIRA stated the following as an objective:

{quote}
If your application writes at LOCAL_QUORUM how often are those writes failing 
to achieve EACH_QUORUM at other data centers. If you failed your application 
over to one of those data centers roughly how inconsistent might it be given 
the number of writes that didn't propagate since the last incremental repair?
{quote}

The main benefit to the JIRA was to set a CL higher than the CL being used, and 
to track how often we weren’t able to hit that CL despite hitting the 
underlying CL. We should only increment the counter in a case where we were 
able to meet the query-provided consistency but were unable to meet the ideal 
consistency level.
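
A minimal sketch of the proposed check (handler and counter names are hypothetical, not the actual write-response code):

{code:java}
import java.util.concurrent.atomic.AtomicLong;

public class IdealCLTrackerSketch
{
    // Hypothetical counter standing in for the real ideal-CL metric.
    private final AtomicLong writeFailedIdealCL = new AtomicLong();

    // Only count an ideal-CL miss when the request's own CL succeeded.
    void onWriteComplete(boolean requestClMet, boolean idealClMet)
    {
        if (requestClMet && !idealClMet)
            writeFailedIdealCL.incrementAndGet();
    }
}
{code}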



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15686) Improvements in circle CI default config

2020-04-06 Thread Kevin Gallardo (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076590#comment-17076590
 ] 

Kevin Gallardo commented on CASSANDRA-15686:


bq. I am only speculating but I thought there were tests that spin up network 
and there are many tests which share disk, and I don't know if we isolate the 
paths or not. If this is not true then it should be a simple change to bump the 
number of runners for unit tests (definitely not jvm dtests).

Hm, AFAICT running with multiple runners is fine for the unit tests. I have 
tested different configurations and it didn't seem to cause problems as long 
as # runners <= # CPUs.
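
For the auto-detection idea raised in the description below, the JVM-side core-count lookup is a one-liner; a hypothetical sketch:

{code:java}
// Hypothetical helper: one JUnit runner per available CPU, matching the
// "# runners <= # CPUs" observation above.
public final class RunnerCount
{
    public static int defaultRunnerCount()
    {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args)
    {
        System.out.println("suggested -Dtest.runners=" + defaultRunnerCount());
    }
}
{code}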

> Improvements in circle CI default config
> 
>
> Key: CASSANDRA-15686
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15686
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build
>Reporter: Kevin Gallardo
>Priority: Normal
>
> I have been looking at and played around with the [default CircleCI 
> config|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml], 
> a few comments/questions regarding the following topics:
>  * Python dtests do not run successfully (200-300 failures) on {{medium}} 
> instances, they seem to only run with small flaky failures on {{large}} 
> instances or higher
>  * Python Upgrade tests:
>  ** Do not seem to run without many failures on any instance types / any 
> parallelism setting
>  ** Do not seem to parallelize well, it seems each container is going to 
> download multiple C* versions
>  ** Additionally it seems the configuration is not up to date, as currently 
> we get errors because {{JAVA8_HOME}} is not set
>  * Unit tests do not seem to parallelize optimally; the number of test 
> runners does not reflect the available CPUs on the container. Ideally, if # of 
> runners == # of CPUs, build time improves on any instance type.
>  ** For instance when using the current configuration, running on medium 
> instances, build will use 1 junit test runner, but 2 CPUs are available. If 
> using 2 runners, the build time is reduced from 19min (at the current main 
> config of parallelism=4) to 12min.
>  * There are some typos in the file, some dtests say "Run Unit Tests" but 
> they are JVM dtests (see 
> [here|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml#L1077],
>  
> [here|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml#L1386])
> So some ways to process these would be:
>  * Do the Python dtests run successfully for anyone on {{medium}} instances? 
> If not, would it make sense to bump them to {{large}} so that they can be run 
> successfully?
>  * Does anybody ever run the python upgrade tests on CircleCI and what is the 
> configuration that makes it work?
>  * Would it make sense to either hardcode the number of test runners in the 
> unit tests with `-Dtest.runners` in the config file to reflect the number of 
> CPUs on the instances, or change the build so that it is able to detect the 
> appropriate number of cores available automatically?
> Additionally, it seems this default config file (config.yml) is not as well 
> maintained as the 
> [{{config-2_1.yml}}|https://github.com/apache/cassandra/blob/trunk/.circleci/config-2_1.yml]
>  (+its lowres/highres) version in the same folder (from CASSANDRA-14806). 
> What is the reasoning for maintaining these 2 versions of the build? Could 
> the better maintained version be used as the default? We could generate a 
> lowres version of the new config-2_1.yml, and rename it {{config.yml}} so 
> that it gets picked up by CircleCI automatically instead of the current 
> default.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15568) Message filtering should apply on the inboundSink in In-JVM dtest

2020-04-06 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076588#comment-17076588
 ] 

Benedict Elliott Smith commented on CASSANDRA-15568:


Neat!

> Message filtering should apply on the inboundSink in In-JVM dtest
> -
>
> Key: CASSANDRA-15568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15568
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> The message filtering mechanism in the in-jvm dtest helps to simulate network 
> partition/delay. 
> The problem with the current approach, which adds all filters to the 
> {{MessagingService#outboundSink}}, is that a blocking filter blocks the 
> following filters from being evaluated, since there is only a single thread 
> that evaluates them. It further blocks the other outgoing messages. The typical 
> internode messaging pattern is that the coordinator node sends out multiple 
> messages to other nodes upon receiving a query. The described blocking 
> messages can happen quite often.
> The problem can be solved by moving the message filtering to the 
> {{MessagingService#inboundSink}}, so that each inbound message is 
> naturally filtered in parallel.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15568) Message filtering should apply on the inboundSink in In-JVM dtest

2020-04-06 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076497#comment-17076497
 ] 

David Capwell edited comment on CASSANDRA-15568 at 4/6/20, 5:38 PM:


[~benedict]

bq. Hmm, when did this happen? 

~2 weeks ago?

bq. Have we eliminated outbound filtering? There is value in being able to stop 
progress on the outbound thread, as it permits you to specify a sequence of 
events by controlling the flow of events on the coordinator.

The default is inbound, but you can define inbound or outbound when you define 
the filter

{code}
cluster.filters().outbound().allVerbs().drop(); // runs in the Instance doing 
the message sending
cluster.filters().inbound().allVerbs().drop(); // runs in the instance 
receiving the message
cluster.filters().allVerbs().drop(); // same as the above, inbound is the 
default.
{code}

PreviewRepairTest uses outbound filters to block the sending until IR has 
completed.


was (Author: dcapwell):
bq. Hmm, when did this happen? 

~2 weeks ago?

bq. Have we eliminated outbound filtering? There is value in being able to stop 
progress on the outbound thread, as it permits you to specify a sequence of 
events by controlling the flow of events on the coordinator.

The default is inbound, but you can define inbound or outbound when you define 
the filter

{code}
cluster.filters().outbound().allVerbs().drop(); // runs in the Instance doing 
the message sending
cluster.filters().inbound().allVerbs().drop(); // runs in the instance 
receiving the message
cluster.filters().allVerbs().drop(); // same as the above, inbound is the 
default.
{code}

PreviewRepairTest uses outbound filters to block the sending until IR has 
completed.

> Message filtering should apply on the inboundSink in In-JVM dtest
> -
>
> Key: CASSANDRA-15568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15568
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> The message filtering mechanism in the in-jvm dtest helps to simulate network 
> partition/delay. 
> The problem with the current approach, which adds all filters to the 
> {{MessagingService#outboundSink}}, is that a blocking filter blocks the 
> following filters from being evaluated, since there is only a single thread 
> that evaluates them. It further blocks the other outgoing messages. The typical 
> internode messaging pattern is that the coordinator node sends out multiple 
> messages to other nodes upon receiving a query. The described blocking 
> messages can happen quite often.
> The problem can be solved by moving the message filtering to the 
> {{MessagingService#inboundSink}}, so that each inbound message is 
> naturally filtered in parallel.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-15568) Message filtering should apply on the inboundSink in In-JVM dtest

2020-04-06 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076497#comment-17076497
 ] 

David Capwell edited comment on CASSANDRA-15568 at 4/6/20, 5:38 PM:


bq. Hmm, when did this happen? 

~2 weeks ago?

bq. Have we eliminated outbound filtering? There is value in being able to stop 
progress on the outbound thread, as it permits you to specify a sequence of 
events by controlling the flow of events on the coordinator.

The default is inbound, but you can define inbound or outbound when you define 
the filter

{code}
cluster.filters().outbound().allVerbs().drop(); // runs in the Instance doing 
the message sending
cluster.filters().inbound().allVerbs().drop(); // runs in the instance 
receiving the message
cluster.filters().allVerbs().drop(); // same as the above, inbound is the 
default.
{code}

PreviewRepairTest uses outbound filters to block the sending until IR has 
completed.


was (Author: dcapwell):
bq. Hmm, when did this happen? 

~2 weeks ago?

bq. Have we eliminated outbound filtering? There is value in being able to stop 
progress on the outbound thread, as it permits you to specify a sequence of 
events by controlling the flow of events on the coordinator.

The default is inbound, but you can define inbound or outbound when you define 
the filter

{code}
cluster.filters().outbound().allVerbs().drop();
cluster.filters().inbound().allVerbs().drop();
{code}

PreviewRepairTest uses outbound filters to block the sending until IR has 
completed.

> Message filtering should apply on the inboundSink in In-JVM dtest
> -
>
> Key: CASSANDRA-15568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15568
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> The message filtering mechanism in the in-jvm dtest helps to simulate network 
> partition/delay. 
> The problem with the current approach, which adds all filters to the 
> {{MessagingService#outboundSink}}, is that a blocking filter blocks the 
> following filters from being evaluated, since there is only a single thread 
> that evaluates them. It further blocks the other outgoing messages. The typical 
> internode messaging pattern is that the coordinator node sends out multiple 
> messages to other nodes upon receiving a query. The described blocking 
> messages can happen quite often.
> The problem can be solved by moving the message filtering to the 
> {{MessagingService#inboundSink}}, so that each inbound message is 
> naturally filtered in parallel.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-15640) digest may not match when single partition named queries skip older sstables

2020-04-06 Thread Sam Tunnicliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sam Tunnicliffe reassigned CASSANDRA-15640:
---

Assignee: Sam Tunnicliffe

> digest may not match when single partition named queries skip older sstables
> 
>
> Key: CASSANDRA-15640
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15640
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: ZhaoYang
>Assignee: Sam Tunnicliffe
>Priority: Normal
>
> Name queries (aka single-partition queries with full clustering keys) query 
> sstables sequentially in recency order, in the hope that the most recent 
> sstables will contain the most recent data, so that they can avoid reading 
> older sstables in {{SinglePartitionReadCommand#reduceFilter}}.
> Unfortunately, this optimization may cause digest mismatch if older sstables 
> contain range tombstone or row deletion with lower timestamp. [Test 
> Code|https://github.com/jasonstack/cassandra/commit/3dfa29bb34bc237ab2b68f849906c09569c5cc94]
> {code:java}
> Table with (pk, ck1, ck2)
> Node1:
> * delete row (pk=1, ck1=1) with ts=10
> * insert row (pk=1, ck1=1, ck2=1) with ts=11
> Node2:
> * delete row (pk=1, ck1=1) with ts=10
> * flush into sstable1
> * insert row (pk=1, ck1=1, ck2=1) with ts=11
> * flush into sstable2
> Query with pk=1 and ck1=1 and ck2=1
> * node1 returns: RT open marker, row, RT close marker
> * node2 returns: row  (because sstable1 is skipped)
> Note: similar mismatch can happen with row deletion as well.
> {code}
> In the above example: is it safe for a named query on node1 to ignore the RT 
> or row deletion when the row's liveness has a higher timestamp?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15568) Message filtering should apply on the inboundSink in In-JVM dtest

2020-04-06 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076497#comment-17076497
 ] 

David Capwell commented on CASSANDRA-15568:
---

bq. Hmm, when did this happen? 

~2 weeks ago?

bq. Have we eliminated outbound filtering? There is value in being able to stop 
progress on the outbound thread, as it permits you to specify a sequence of 
events by controlling the flow of events on the coordinator.

The default is inbound, but you can define inbound or outbound when you define 
the filter

{code}
cluster.filters().outbound().allVerbs().drop();
cluster.filters().inbound().allVerbs().drop();
{code}

PreviewRepairTest uses outbound filters to block the sending until IR has 
completed.

> Message filtering should apply on the inboundSink in In-JVM dtest
> -
>
> Key: CASSANDRA-15568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15568
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> The message filtering mechanism in the in-jvm dtest helps to simulate network 
> partition/delay. 
> The problem with the current approach, which adds all filters to the 
> {{MessagingService#outboundSink}}, is that a blocking filter blocks the 
> following filters from being evaluated, since there is only a single thread 
> that evaluates them. It further blocks the other outgoing messages. The typical 
> internode messaging pattern is that the coordinator node sends out multiple 
> messages to other nodes upon receiving a query. The described blocking 
> messages can happen quite often.
> The problem can be solved by moving the message filtering to the 
> {{MessagingService#inboundSink}}, so that each inbound message is 
> naturally filtered in parallel.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project

2020-04-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-15684:
--
Reviewers: Alex Petrov, Benjamin Lerer, David Capwell  (was: Alex Petrov, 
Benjamin Lerer)
   Status: Review In Progress  (was: Patch Available)

> CASSANDRA-15650 was merged after dtest refactor and modified classes no 
> longer in the project
> -
>
> Key: CASSANDRA-15684
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15684
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 was based on commits from before CASSANDRA-15539, which 
> removed some of the files modified in CASSANDRA-15650. The tests were passing 
> pre-merge, but against earlier commits. On commit they started failing, since 
> the dtest API no longer matched, producing the following exception
> {code}
> [junit-timeout] 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] java.lang.NoSuchMethodError: 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] at 
> org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout] at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [junit-timeout] at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout] at java.lang.Thread.run(Thread.java:748)
> {code}
> Root cause was that 4 files existed which should have been deleted in 
> CASSANDRA-15539. Since they were not, when CASSANDRA-15650 modified one it 
> didn't cause a merge conflict, but when the tests run it conflicts and fails.
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project

2020-04-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-15684:
--
Status: Ready to Commit  (was: Review In Progress)

> CASSANDRA-15650 was merged after dtest refactor and modified classes no 
> longer in the project
> -
>
> Key: CASSANDRA-15684
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15684
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 was based on commits from before CASSANDRA-15539, which 
> removed some of the files modified in CASSANDRA-15650. The tests were passing 
> pre-merge, but against earlier commits. On commit they started failing, since 
> the dtest API no longer matched, producing the following exception
> {code}
> [junit-timeout] 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] java.lang.NoSuchMethodError: 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] at 
> org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout] at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [junit-timeout] at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout] at java.lang.Thread.run(Thread.java:748)
> {code}
> Root cause was that 4 files existed which should have been deleted in 
> CASSANDRA-15539. Since they were not, when CASSANDRA-15650 modified one it 
> didn't cause a merge conflict, but when the tests run it conflicts and fails.
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project

2020-04-06 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-15684:
--
Fix Version/s: 4.0-alpha
   Resolution: Fixed
   Status: Resolved  (was: Ready to Commit)

> CASSANDRA-15650 was merged after dtest refactor and modified classes no 
> longer in the project
> -
>
> Key: CASSANDRA-15684
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15684
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0-alpha
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 was based on commits from before CASSANDRA-15539, which 
> removed some of the files modified in CASSANDRA-15650. The tests were passing 
> pre-merge, but against earlier commits. On commit they started failing, since 
> the dtest API no longer matched, producing the following exception
> {code}
> [junit-timeout] 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] java.lang.NoSuchMethodError: 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] at 
> org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout] at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [junit-timeout] at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout] at java.lang.Thread.run(Thread.java:748)
> {code}
> Root cause was that 4 files existed which should have been deleted in 
> CASSANDRA-15539. Since they were not, when CASSANDRA-15650 modified one it 
> didn't cause a merge conflict, but when the tests run it conflicts and fails.
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15686) Improvements in circle CI default config

2020-04-06 Thread Kevin Gallardo (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076459#comment-17076459
 ] 

Kevin Gallardo commented on CASSANDRA-15686:


Confirmed the build runs as normal with {{config-2_1.yml}} directly renamed as 
{{config.yml}}; see build #42 here: 
https://app.circleci.com/pipelines/github/newkek/cassandra?branch=chg

> Improvements in circle CI default config
> 
>
> Key: CASSANDRA-15686
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15686
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build
>Reporter: Kevin Gallardo
>Priority: Normal
>
> I have been looking at and played around with the [default CircleCI 
> config|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml], 
> a few comments/questions regarding the following topics:
>  * Python dtests do not run successfully (200-300 failures) on {{medium}} 
> instances, they seem to only run with small flaky failures on {{large}} 
> instances or higher
>  * Python Upgrade tests:
>  ** Do not seem to run without many failures on any instance types / any 
> parallelism setting
>  ** Do not seem to parallelize well, it seems each container is going to 
> download multiple C* versions
>  ** Additionally it seems the configuration is not up to date, as currently 
> we get errors because {{JAVA8_HOME}} is not set
>  * Unit tests do not seem to parallelize optimally; the number of test 
> runners does not reflect the available CPUs on the container. Ideally, if # of 
> runners == # of CPUs, build time improves on any instance type.
>  ** For instance when using the current configuration, running on medium 
> instances, build will use 1 junit test runner, but 2 CPUs are available. If 
> using 2 runners, the build time is reduced from 19min (at the current main 
> config of parallelism=4) to 12min.
>  * There are some typos in the file, some dtests say "Run Unit Tests" but 
> they are JVM dtests (see 
> [here|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml#L1077],
>  
> [here|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml#L1386])
> So some ways to process these would be:
>  * Do the Python dtests run successfully for anyone on {{medium}} instances? 
> If not, would it make sense to bump them to {{large}} so that they can be run 
> successfully?
>  * Does anybody ever run the python upgrade tests on CircleCI and what is the 
> configuration that makes it work?
>  * Would it make sense to either hardcode the number of test runners in the 
> unit tests with `-Dtest.runners` in the config file to reflect the number of 
> CPUs on the instances, or change the build so that it is able to detect the 
> appropriate number of cores available automatically?
> Additionally, it seems this default config file (config.yml) is not as well 
> maintained as the 
> [{{config-2_1.yml}}|https://github.com/apache/cassandra/blob/trunk/.circleci/config-2_1.yml]
>  (+its lowres/highres) version in the same folder (from CASSANDRA-14806). 
> What is the reasoning for maintaining these 2 versions of the build? Could 
> the better maintained version be used as the default? We could generate a 
> lowres version of the new config-2_1.yml, and rename it {{config.yml}} so 
> that it gets picked up by CircleCI automatically instead of the current 
> default.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project

2020-04-06 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076452#comment-17076452
 ] 

Michael Semb Wever commented on CASSANDRA-15684:


bq.  Michael Semb Wever if I understand the nature of the failure correctly, 
the main problem was merge.

That's my understanding too.

Since both you [~ifesdjeen] and [~blerer] have +1 on the main patch I've gone 
ahead and committed it.

Committed as 
[a104b06d4aea2f2cd3d48bdbe38410284f236428|https://github.com/apache/cassandra/commit/a104b06d4aea2f2cd3d48bdbe38410284f236428]

> CASSANDRA-15650 was merged after dtest refactor and modified classes no 
> longer in the project
> -
>
> Key: CASSANDRA-15684
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15684
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 was based on commits from before CASSANDRA-15539, which 
> removed some of the files modified in CASSANDRA-15650. The tests were passing 
> pre-merge, but against earlier commits. On commit they started failing, since 
> the dtest API no longer matched, producing the following exception
> {code}
> [junit-timeout] 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] java.lang.NoSuchMethodError: 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] at 
> org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout] at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [junit-timeout] at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout] at java.lang.Thread.run(Thread.java:748)
> {code}
> Root cause was that 4 files existed which should have been deleted in 
> CASSANDRA-15539. Since they were not, when CASSANDRA-15650 modified one it 
> didn't cause a merge conflict, but when the tests run it conflicts and fails.
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project

2020-04-06 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-15684:
---
  Since Version: 4.0-alpha
Source Control Link: 
https://github.com/apache/cassandra/commit/a104b06d4aea2f2cd3d48bdbe38410284f236428

> CASSANDRA-15650 was merged after dtest refactor and modified classes no 
> longer in the project
> -
>
> Key: CASSANDRA-15684
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15684
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 was based on commits from before CASSANDRA-15539, which 
> removed some of the files modified in CASSANDRA-15650. The tests were passing 
> pre-merge, but against earlier commits. On commit they started failing, since 
> the dtest API no longer matched, producing the following exception
> {code}
> [junit-timeout] 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] java.lang.NoSuchMethodError: 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] at 
> org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout] at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [junit-timeout] at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout] at java.lang.Thread.run(Thread.java:748)
> {code}
> Root cause was that 4 files existed which should have been deleted in 
> CASSANDRA-15539. Since they were not, when CASSANDRA-15650 modified one it 
> didn't cause a merge conflict, but when the tests run it conflicts and fails.
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[cassandra] branch trunk updated: Fix RepairCoordinator test failures, after clobbering jvm-dtest refactoring (CASSANDRA-15650) and modifying classes no longer in the project

2020-04-06 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new a104b06  Fix RepairCoordinator test failures, after clobbering 
jvm-dtest refactoring (CASSANDRA-15650) and modifying classes no longer in the 
project
a104b06 is described below

commit a104b06d4aea2f2cd3d48bdbe38410284f236428
Author: David Capwell 
AuthorDate: Thu Apr 2 10:58:43 2020 -0700

Fix RepairCoordinator test failures, after clobbering jvm-dtest refactoring 
(CASSANDRA-15650) and modifying classes no longer in the project

 patch by David Capwell; reviewed by Benjamin Lerer, Alex Petrov for 
CASSANDRA-15684
---
 .../cassandra/distributed/api/LongTokenRange.java  |  38 
 .../cassandra/distributed/api/NodeToolResult.java  | 218 -
 .../cassandra/distributed/api/QueryResult.java | 139 -
 .../org/apache/cassandra/distributed/api/Row.java  | 119 ---
 .../distributed/test/RepairCoordinatorFast.java|   8 +-
 5 files changed, 6 insertions(+), 516 deletions(-)

diff --git 
a/test/distributed/org/apache/cassandra/distributed/api/LongTokenRange.java 
b/test/distributed/org/apache/cassandra/distributed/api/LongTokenRange.java
deleted file mode 100644
index 06327e8..000
--- a/test/distributed/org/apache/cassandra/distributed/api/LongTokenRange.java
+++ /dev/null
@@ -1,38 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.cassandra.distributed.api;
-
-import java.io.Serializable;
-
-public final class LongTokenRange implements Serializable
-{
-public final long minExclusive;
-public final long maxInclusive;
-
-public LongTokenRange(long minExclusive, long maxInclusive)
-{
-this.minExclusive = minExclusive;
-this.maxInclusive = maxInclusive;
-}
-
-public String toString()
-{
-return "(" + minExclusive + "," + maxInclusive + "]";
-}
-}
diff --git 
a/test/distributed/org/apache/cassandra/distributed/api/NodeToolResult.java 
b/test/distributed/org/apache/cassandra/distributed/api/NodeToolResult.java
deleted file mode 100644
index 8f33ae5..000
--- a/test/distributed/org/apache/cassandra/distributed/api/NodeToolResult.java
+++ /dev/null
@@ -1,218 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.cassandra.distributed.api;
-
-import java.util.Arrays;
-import java.util.Collection;
-import java.util.List;
-import java.util.Map;
-import java.util.stream.Collectors;
-import java.util.stream.Stream;
-import javax.management.Notification;
-
-import com.google.common.base.Throwables;
-import org.junit.Assert;
-
-public class NodeToolResult
-{
-private final String[] commandAndArgs;
-private final int rc;
-private final List<Notification> notifications;
-private final Throwable error;
-
-public NodeToolResult(String[] commandAndArgs, int rc, List<Notification> 
notifications, Throwable error)
-{
-this.commandAndArgs = commandAndArgs;
-this.rc = rc;
-this.notifications = notifications;
-this.error = error;
-}
-
-public String[] getCommandAndArgs()
-{
-return commandAndArgs;
-}
-
-public int getRc()
-{
-return rc;
-}
-
-public List<Notification> getNotifications()
-{
-return notifications;
-}
-
-public 

[jira] [Updated] (CASSANDRA-15662) cqlsh tests won't run on jdk1.8 (regression from CASSANDRA-10190)

2020-04-06 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-15662:
---
  Since Version: 4.0-alpha
Source Control Link: 
https://github.com/apache/cassandra/commit/bb8ec1fc1066e604b5695c0f8057e2b3adfa5cb2
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> cqlsh tests won't run on jdk1.8 (regression from CASSANDRA-10190) 
> --
>
> Key: CASSANDRA-15662
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15662
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> Running the cqlsh tests on jdk1.8 no longer works.
> The commit {{bf9a1d487b}} for CASSANDRA-10190 broke this, by defaulting 
> {{CASSANDRA_USE_JDK11}} to true. See 
> https://github.com/apache/cassandra/commit/bf9a1d487b9ba469e8d740cf7d1cd419535a7e79#diff-90e40e02845884b66e9006b25250ea5cR36-R38
> The following three work…
> {code}
> jenv shell 1.8
> export CASSANDRA_USE_JDK11=false
> ./pylib/cassandra-cqlsh-tests.sh `pwd`
> {code}
> {code}
> jenv shell 11.0
> export CASSANDRA_USE_JDK11=true
> ./pylib/cassandra-cqlsh-tests.sh `pwd`
> {code}
> {code}
> jenv shell 1.8
> unset CASSANDRA_USE_JDK11
> ./pylib/cassandra-cqlsh-tests.sh `pwd`
> {code}
> The following does not…
> {code}
> jenv shell 1.8
> ./pylib/cassandra-cqlsh-tests.sh `pwd`
> {code}
> {noformat}
> BUILD FAILED
> /Users/mick/src/apache/casSANDRA/build.xml:292: -Duse.jdk11=true or 
> $CASSANDRA_USE_JDK11=true cannot be set when building from java 8
> {noformat}
> JDK 1.8 is expected to be the default, with {{CASSANDRA_USE_JDK11}} being 
> defined if/when JDK 11 is used.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15662) cqlsh tests won't run on jdk1.8 (regression from CASSANDRA-10190)

2020-04-06 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076426#comment-17076426
 ] 

Michael Semb Wever commented on CASSANDRA-15662:


Thanks [~yukim].

Committed as bb8ec1fc1066e604b5695c0f8057e2b3adfa5cb2

> cqlsh tests won't run on jdk1.8 (regression from CASSANDRA-10190) 
> --
>
> Key: CASSANDRA-15662
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15662
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> Running the cqlsh tests on jdk1.8 no longer works.
> The commit {{bf9a1d487b}} for CASSANDRA-10190 broke this, by defaulting 
> {{CASSANDRA_USE_JDK11}} to true. See 
> https://github.com/apache/cassandra/commit/bf9a1d487b9ba469e8d740cf7d1cd419535a7e79#diff-90e40e02845884b66e9006b25250ea5cR36-R38
> The following three work…
> {code}
> jenv shell 1.8
> export CASSANDRA_USE_JDK11=false
> ./pylib/cassandra-cqlsh-tests.sh `pwd`
> {code}
> {code}
> jenv shell 11.0
> export CASSANDRA_USE_JDK11=true
> ./pylib/cassandra-cqlsh-tests.sh `pwd`
> {code}
> {code}
> jenv shell 1.8
> unset CASSANDRA_USE_JDK11
> ./pylib/cassandra-cqlsh-tests.sh `pwd`
> {code}
> The following does not…
> {code}
> jenv shell 1.8
> ./pylib/cassandra-cqlsh-tests.sh `pwd`
> {code}
> {noformat}
> BUILD FAILED
> /Users/mick/src/apache/casSANDRA/build.xml:292: -Duse.jdk11=true or 
> $CASSANDRA_USE_JDK11=true cannot be set when building from java 8
> {noformat}
> JDK 1.8 is expected to be the default, with {{CASSANDRA_USE_JDK11}} being 
> defined if/when JDK 11 is used.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[cassandra] branch trunk updated: Fix cqlsh tests running on jdk1.8 (regression from CASSANDRA-10190)

2020-04-06 Thread mck
This is an automated email from the ASF dual-hosted git repository.

mck pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new bb8ec1f  Fix cqlsh tests running on jdk1.8 (regression from 
CASSANDRA-10190)
bb8ec1f is described below

commit bb8ec1fc1066e604b5695c0f8057e2b3adfa5cb2
Author: Mick Semb Wever 
AuthorDate: Sun Apr 5 10:50:26 2020 +0200

Fix cqlsh tests running on jdk1.8 (regression from CASSANDRA-10190)

 patched by Mick Semb Wever; reviewed by Yuki Morishita for CASSANDRA-15662
---
 pylib/cassandra-cqlsh-tests.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pylib/cassandra-cqlsh-tests.sh b/pylib/cassandra-cqlsh-tests.sh
index d305759..56672f1 100755
--- a/pylib/cassandra-cqlsh-tests.sh
+++ b/pylib/cassandra-cqlsh-tests.sh
@@ -34,7 +34,7 @@ export NUM_TOKENS="32"
 export CASSANDRA_DIR=${WORKSPACE}
 
 if [ -z "$CASSANDRA_USE_JDK11" ]; then
-export CASSANDRA_USE_JDK11=true
+export CASSANDRA_USE_JDK11=false
 fi
 
 # Loop to prevent failure due to maven-ant-tasks not downloading a jar..
@@ -53,7 +53,7 @@ fi
 
 # Set up venv with dtest dependencies
 set -e # enable immediate exit if venv setup fails
-virtualenv --python=$PYTHON_VERSION --no-site-packages venv
+virtualenv --python=$PYTHON_VERSION venv
 source venv/bin/activate
 
 pip install -r ${CASSANDRA_DIR}/pylib/requirements.txt


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15573) Python 3.8 fails to execute cqlsh

2020-04-06 Thread Eduard Tudenhoefner (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076424#comment-17076424
 ] 

Eduard Tudenhoefner commented on CASSANDRA-15573:
-

The issue itself can also be manually verified with the steps below, which are 
from [the pylib 
readme|https://github.com/nastra/cassandra/blob/e840b458dd8fda021871ceee1efb5187ad94aad3/pylib/README.asc]
 and require CASSANDRA-15659.
{code}
docker build . --file Dockerfile.ubuntu.py38 -t ubuntu-lts-py3
docker run -v $CASSANDRA_DIR:/code -it ubuntu-lts-py3:latest /code/bin/cqlsh 
host.docker.internal
{code}

> Python 3.8 fails to execute cqlsh
> -
>
> Key: CASSANDRA-15573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15573
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Tool/cqlsh
>Reporter: Yuki Morishita
>Assignee: Eduard Tudenhoefner
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0-alpha
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Python 3.8 renamed sre_parse.Pattern to sre_parse.State (see 
> https://bugs.python.org/issue34681 and corresponding pull request 
> https://github.com/python/cpython/pull/9310)
> So when executing cqlsh with Python 3.8, it throws an error:
> {code}
> Traceback (most recent call last):
>   File ".\bin\cqlsh.py", line 175, in 
> from cqlshlib import cql3handling, cqlhandling, pylexotron, sslhandling, 
> cqlshhandling
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cql3handling.py", line 19, 
> in 
> from cqlshlib.cqlhandling import CqlParsingRuleSet, Hint
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cqlhandling.py", line 23, 
> in 
> from cqlshlib import pylexotron, util
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 342, 
> in 
> class ParsingRuleSet:
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 343, 
> in ParsingRuleSet
> RuleSpecScanner = SaferScanner([
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\saferscanner.py", line 74, 
> in __init__
> s = re.sre_parse.Pattern()
> AttributeError: module 'sre_parse' has no attribute 'Pattern'
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15573) Python 3.8 fails to execute cqlsh

2020-04-06 Thread Eduard Tudenhoefner (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eduard Tudenhoefner updated CASSANDRA-15573:

Change Category: Operability
 Complexity: Low Hanging Fruit
  Fix Version/s: 4.0-alpha
 Status: Open  (was: Triage Needed)

> Python 3.8 fails to execute cqlsh
> -
>
> Key: CASSANDRA-15573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15573
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Tool/cqlsh
>Reporter: Yuki Morishita
>Assignee: Eduard Tudenhoefner
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0-alpha
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Python 3.8 renamed sre_parse.Pattern to sre_parse.State (see 
> https://bugs.python.org/issue34681 and corresponding pull request 
> https://github.com/python/cpython/pull/9310)
> So when executing cqlsh with Python 3.8, it throws an error:
> {code}
> Traceback (most recent call last):
>   File ".\bin\cqlsh.py", line 175, in 
> from cqlshlib import cql3handling, cqlhandling, pylexotron, sslhandling, 
> cqlshhandling
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cql3handling.py", line 19, 
> in 
> from cqlshlib.cqlhandling import CqlParsingRuleSet, Hint
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cqlhandling.py", line 23, 
> in 
> from cqlshlib import pylexotron, util
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 342, 
> in 
> class ParsingRuleSet:
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 343, 
> in ParsingRuleSet
> RuleSpecScanner = SaferScanner([
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\saferscanner.py", line 74, 
> in __init__
> s = re.sre_parse.Pattern()
> AttributeError: module 'sre_parse' has no attribute 'Pattern'
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15568) Message filtering should apply on the inboundSink in In-JVM dtest

2020-04-06 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076420#comment-17076420
 ] 

Yifan Cai commented on CASSANDRA-15568:
---

Right. I think [~dcapwell] already has it in CASSANDRA-15564. Since that 
ticket is resolved already, this one can be closed. 

> Message filtering should apply on the inboundSink in In-JVM dtest
> -
>
> Key: CASSANDRA-15568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15568
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> The message filtering mechanism in the in-jvm dtest helps to simulate network 
> partition/delay. 
> The problem with the current approach, which adds all filters to the 
> {{MessagingService#outboundSink}}, is that a blocking filter blocks the 
> following filters from being evaluated, since there is only a single thread 
> that evaluates them. It further blocks the other outgoing messages. The 
> typical internode messaging pattern is that the coordinator node sends out 
> multiple messages to other nodes upon receiving a query, so the described 
> blocking can happen quite often.
> The problem can be solved by moving the message filtering to the 
> {{MessagingService#inboundSink}}, so that each inbound message is naturally 
> filtered in parallel, as the sketch below illustrates.
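To illustrate why evaluating all filters on a single outbound thread stalls unrelated messages while inbound filtering does not, here is a toy Python sketch (hypothetical code, not the dtest API):
{code}
import threading
import time

def blocking_filter(msg):
    time.sleep(1.0)  # a filter that blocks, e.g. to simulate network delay
    return msg

messages = ["m1", "m2", "m3"]

# Outbound-style: a single thread evaluates the filters for every outgoing
# message, so one blocking filter delays all messages queued behind it.
start = time.time()
for m in messages:
    blocking_filter(m)
print("serial filtering:   %.1fs" % (time.time() - start))  # ~3.0s

# Inbound-style: each receiving node filters its own message, so the
# blocking filters naturally run in parallel.
threads = [threading.Thread(target=blocking_filter, args=(m,)) for m in messages]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
print("parallel filtering: %.1fs" % (time.time() - start))  # ~1.0s
{code}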



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15573) Python 3.8 fails to execute cqlsh

2020-04-06 Thread Eduard Tudenhoefner (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076421#comment-17076421
 ] 

Eduard Tudenhoefner commented on CASSANDRA-15573:
-

I added a Python 3.8-compatible SaferScanner in 
https://github.com/apache/cassandra/pull/518. Note that the PR currently 
contains 2 commits from CASSANDRA-15659, because those are required for 
testing things with newer Python versions.
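
For illustration, a minimal sketch of the kind of compatibility shim such a fix needs, assuming only the {{sre_parse.Pattern}} -> {{sre_parse.State}} rename has to be papered over (the names below are illustrative, not the actual patch):
{code}
import re

try:
    # Python <= 3.7 exposes the regex parser state as sre_parse.Pattern
    from sre_parse import Pattern as SreParseState
except ImportError:
    # Python 3.8 renamed it to sre_parse.State (bpo-34681)
    from sre_parse import State as SreParseState

# A scanner can then construct its parser state independently of the version:
state = SreParseState()
state.flags = re.UNICODE
print(type(state).__name__)
{code}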

> Python 3.8 fails to execute cqlsh
> -
>
> Key: CASSANDRA-15573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15573
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Tool/cqlsh
>Reporter: Yuki Morishita
>Assignee: Eduard Tudenhoefner
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Python 3.8 renamed sre_parse.Pattern to sre_parse.State (see 
> https://bugs.python.org/issue34681 and corresponding pull request 
> https://github.com/python/cpython/pull/9310)
> So when executing cqlsh with Python 3.8, it throws an error:
> {code}
> Traceback (most recent call last):
>   File ".\bin\cqlsh.py", line 175, in 
> from cqlshlib import cql3handling, cqlhandling, pylexotron, sslhandling, 
> cqlshhandling
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cql3handling.py", line 19, 
> in 
> from cqlshlib.cqlhandling import CqlParsingRuleSet, Hint
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cqlhandling.py", line 23, 
> in 
> from cqlshlib import pylexotron, util
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 342, 
> in 
> class ParsingRuleSet:
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 343, 
> in ParsingRuleSet
> RuleSpecScanner = SaferScanner([
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\saferscanner.py", line 74, 
> in __init__
> s = re.sre_parse.Pattern()
> AttributeError: module 'sre_parse' has no attribute 'Pattern'
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15573) Python 3.8 fails to execute cqlsh

2020-04-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CASSANDRA-15573:
---
Labels: pull-request-available  (was: )

> Python 3.8 fails to execute cqlsh
> -
>
> Key: CASSANDRA-15573
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15573
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Tool/cqlsh
>Reporter: Yuki Morishita
>Assignee: Eduard Tudenhoefner
>Priority: Normal
>  Labels: pull-request-available
>
> Python 3.8 renamed sre_parse.Pattern to sre_parse.State (see 
> https://bugs.python.org/issue34681 and corresponding pull request 
> https://github.com/python/cpython/pull/9310)
> So when executing cqlsh with Python 3.8, it throws an error:
> {code}
> Traceback (most recent call last):
>   File ".\bin\cqlsh.py", line 175, in 
> from cqlshlib import cql3handling, cqlhandling, pylexotron, sslhandling, 
> cqlshhandling
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cql3handling.py", line 19, 
> in 
> from cqlshlib.cqlhandling import CqlParsingRuleSet, Hint
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cqlhandling.py", line 23, 
> in 
> from cqlshlib import pylexotron, util
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 342, 
> in 
> class ParsingRuleSet:
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 343, 
> in ParsingRuleSet
> RuleSpecScanner = SaferScanner([
>   File "C:\Users\Yuki 
> Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\saferscanner.py", line 74, 
> in __init__
> s = re.sre_parse.Pattern()
> AttributeError: module 'sre_parse' has no attribute 'Pattern'
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15657) Improve zero-copy-streaming containment check by using file sections

2020-04-06 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076413#comment-17076413
 ] 

Marcus Eriksson commented on CASSANDRA-15657:
-

The approach we discussed was trying to use the sstable's first + last tokens 
to figure out if the ranges covered the whole sstable.

From a quick look, this seems to sum up the total size of the sections to be 
transferred and check whether that size matches the whole sstable size; seems 
like a good idea to me.
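
For what it's worth, a rough sketch of that size-based containment check (illustrative Python, not the actual Cassandra code; it assumes the transferred sections are given as (start, end) byte positions within the sstable file):
{code}
def covers_entire_sstable(sections, sstable_on_disk_length):
    # sections: list of (start, end) file positions to be transferred
    transferred = sum(end - start for start, end in sections)
    # If the summed section sizes equal the file length, the ranges cover
    # the whole sstable and it is eligible for zero-copy streaming.
    return transferred == sstable_on_disk_length

assert covers_entire_sstable([(0, 100), (100, 250)], 250)
assert not covers_entire_sstable([(0, 100)], 250)
{code}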

> Improve zero-copy-streaming containment check by using file sections
> 
>
> Key: CASSANDRA-15657
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15657
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Streaming and Messaging
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0
>
>
> Currently zero-copy streaming is only enabled for the leveled-compaction 
> strategy, and it checks if all keys in the sstables are included in the 
> transferred ranges.
> This is very inefficient. The containment check can be improved by checking 
> if the transferred sections (the transferred file positions) cover the entire 
> sstable.
> I also enabled ZCS for all compaction strategies since the new containment 
> check is very fast.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15694) Statistics upon streaming of entire SSTables in Netstats is wrong

2020-04-06 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-15694:
--
Description: 
There is a bug in the current code (trunk on 6th April 2020): if we are 
streaming entire SSTables via CassandraEntireSSTableStreamWriter and 
CassandraOutgoingFile respectively, the progress of the individual components 
of an SSTable is never reported, because only the "db" file is counted. That 
introduces this bug:

 
{code:java}
Mode: NORMAL
Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736
/127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 files, 
27664559 bytes total

/tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3...

{code}
Basically, the number of files to be sent is lower than the number of files 
already sent.

The straightforward fix here is to distinguish when we are streaming entire 
sstables and, in that case, include all manifest files in the computation.

This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 
because the decision whether we stream entirely or not comes from a method 
which is performance-sensitive and computed every time. Once CASSANDRA-15657 
(hence CASSANDRA-14586) is done, this ticket can be worked on.

The branch with the fix is here: 
[https://github.com/smiklosovic/cassandra/tree/CASSANDRA-15694]

  was:
There is a bug in the current code: if we are streaming entire SSTables via 
CassandraEntireSSTableStreamWriter and CassandraOutgoingFile respectively, the 
progress of the individual components of an SSTable is never reported, because 
only the "db" file is counted. That introduces this bug:

 
{code:java}
Mode: NORMAL
Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736
/127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 files, 
27664559 bytes total

/tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3...

{code}
Basically, the number of files to be sent is lower than the number of files 
already sent.

The straightforward fix here is to distinguish when we are streaming entire 
sstables and, in that case, include all manifest files in the computation.

This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 
because the decision whether we stream entirely or not comes from a method 
which is performance-sensitive and computed every time. Once CASSANDRA-15657 
(hence CASSANDRA-14586) is done, this ticket can be worked on.

The branch with the fix is here: 
[https://github.com/smiklosovic/cassandra/tree/CASSANDRA-15694]


> Statistics upon streaming of entire SSTables in Netstats is wrong
> -
>
> Key: CASSANDRA-15694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> There is a bug in the current code (trunk on 6th April 2020): if we are 
> streaming entire SSTables via CassandraEntireSSTableStreamWriter and 
> CassandraOutgoingFile respectively, the progress of the individual components 
> of an SSTable is never reported, because only the "db" file is counted. That 
> introduces this bug:
>  
> {code:java}
> Mode: NORMAL
> Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736
> /127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 
> files, 27664559 bytes total
> 
> /tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3...
> 
> {code}
> Basically, the number of files to be sent is lower than the number of files 
> already sent.
>  
> The straightforward fix here is to distinguish when we are streaming entire 
> sstables and, in that case, include all manifest files in the computation.
>  
> This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 
> because the decision whether we stream entirely or not comes from a method 
> which is performance-sensitive and computed every time. Once CASSANDRA-15657 
> (hence CASSANDRA-14586) is done, this ticket can be worked on.
>  
> The branch with the fix is here: 
> [https://github.com/smiklosovic/cassandra/tree/CASSANDRA-15694]
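
A toy sketch of the counting fix being described (illustrative Python, assuming an sstable is represented by its component file names; this is not the actual streaming code):
{code}
COMPONENTS = ["na-1-big-Data.db", "na-1-big-Index.db", "na-1-big-Summary.db",
              "na-1-big-Filter.db", "na-1-big-Statistics.db", "na-1-big-TOC.txt"]

def files_to_send(components, entire_sstable):
    # When streaming the entire sstable every component file is sent, so all
    # of them must be counted up front; counting only the Data.db file makes
    # "files already sent" overtake "files to send" in netstats.
    if entire_sstable:
        return list(components)
    return [c for c in components if c.endswith("Data.db")]

assert len(files_to_send(COMPONENTS, entire_sstable=True)) == 6
assert len(files_to_send(COMPONENTS, entire_sstable=False)) == 1
{code}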



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15682) Missing commas between endpoints in nodetool describering

2020-04-06 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076401#comment-17076401
 ] 

Stefan Miklosovic commented on CASSANDRA-15682:
---

PR with the fix is here: [https://github.com/apache/cassandra/pull/517]

> Missing commas between endpoints in nodetool describering
> -
>
> Key: CASSANDRA-15682
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15682
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool
>Reporter: Aleksandr Sorokoumov
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0
>
>
> *Setup*
> 3-node cluster created with ccm
> {noformat}
> cqlsh> create keyspace ks with replication = {'class': 'SimpleStrategy', 
> 'replication_factor': 2};
> {noformat}
> *trunk*:
> {noformat}
> $ bin/nodetool describering ks --port 7100
> Schema Version:295e8142-fc9f-3f76-b6db-24430ff572e5
> TokenRange:
> TokenRange(start_token:-9223372036854775808, 
> end_token:-3074457345618258603endpoints:[127.0.0.2, 
> 127.0.0.3]rpc_endpoints:[127.0.0.2, 
> 127.0.0.3]endpoint_details:[EndpointDetails(host:127.0.0.2, datacenter:d
> atacenter1, rack:rack1), EndpointDetails(host:127.0.0.3, 
> datacenter:datacenter1, rack:rack1)])
> TokenRange(start_token:-3074457345618258603, 
> end_token:3074457345618258602endpoints:[127.0.0.3, 
> 127.0.0.1]rpc_endpoints:[127.0.0.3, 
> /127.0.0.1]endpoint_details:[EndpointDetails(host:127.0.0.3, datacenter:d
> atacenter1, rack:rack1), EndpointDetails(host:127.0.0.1, 
> datacenter:datacenter1, rack:rack1)])
> TokenRange(start_token:3074457345618258602, 
> end_token:-9223372036854775808endpoints:[127.0.0.1, 
> 127.0.0.2]rpc_endpoints:[/127.0.0.1, 
> 127.0.0.2]endpoint_details:[EndpointDetails(host:127.0.0.1, datacenter:d
> atacenter1, rack:rack1), EndpointDetails(host:127.0.0.2, 
> datacenter:datacenter1, rack:rack1)])
> {noformat}
> *3.11* (correct output)
> {noformat}
> bin/nodetool describering ks --port 7100
> Schema Version:c8fd35ea-6f49-3e77-85e7-a92e79df8696
> TokenRange:
> TokenRange(start_token:-9223372036854775808, 
> end_token:-3074457345618258603, endpoints:[127.0.0.2, 127.0.0.3], 
> rpc_endpoints:[127.0.0.2, 127.0.0.3], 
> endpoint_details:[EndpointDetails(host:127.0.0.2, datace
> nter:datacenter1, rack:rack1), EndpointDetails(host:127.0.0.3, 
> datacenter:datacenter1, rack:rack1)])
> TokenRange(start_token:-3074457345618258603, 
> end_token:3074457345618258602, endpoints:[127.0.0.3, 127.0.0.1], 
> rpc_endpoints:[127.0.0.3, 127.0.0.1], 
> endpoint_details:[EndpointDetails(host:127.0.0.3, datacen
> ter:datacenter1, rack:rack1), EndpointDetails(host:127.0.0.1, 
> datacenter:datacenter1, rack:rack1)])
> TokenRange(start_token:3074457345618258602, 
> end_token:-9223372036854775808, endpoints:[127.0.0.1, 127.0.0.2], 
> rpc_endpoints:[127.0.0.1, 127.0.0.2], 
> endpoint_details:[EndpointDetails(host:127.0.0.1, datacen
> ter:datacenter1, rack:rack1), EndpointDetails(host:127.0.0.2, 
> datacenter:datacenter1, rack:rack1)])
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15686) Improvements in circle CI default config

2020-04-06 Thread Kevin Gallardo (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076386#comment-17076386
 ] 

Kevin Gallardo commented on CASSANDRA-15686:


bq. I was under the impression the larger instances were not in the free tier;

Correct, it is not. I was told that if you have a build that requests > medium 
on a free tier, CircleCI will downgrade it to medium automatically, but I have 
not confirmed that and need to double-check. I have seen [other OSS 
projects|https://github.com/envoyproxy/envoy/blob/master/.circleci/config.yml#L9]
 using {{xlarge}} resources in their default build config 🤷‍♂️. The problem is 
that the tests do not run well at all on medium: contributors get confused by 
seeing the python dtest build available, launch it, and then get confused by 
the resulting 200-300+ failures. If these don't build successfully at all by 
default, I am wondering if it is worth keeping them in the "lowres" build at 
all, given the confusion they cause?

bq. Sorry I don't follow. [...] So the only file we should be maintaining is 
config-2_1.yml.

So as we discussed offline, it would be possible, instead of using the 
generated version of the {{config-2_1.yml}} file, to use {{config-2_1.yml}} 
itself by default. This would cause less confusion for contributors too, and 
we could also keep a HIGHRES version as discussed so that it doesn't change 
people's workflows.

> Improvements in circle CI default config
> 
>
> Key: CASSANDRA-15686
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15686
> Project: Cassandra
>  Issue Type: Bug
>  Components: Build
>Reporter: Kevin Gallardo
>Priority: Normal
>
> I have been looking at and played around with the [default CircleCI 
> config|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml], 
> a few comments/questions regarding the following topics:
>  * Python dtests do not run successfully (200-300 failures) on {{medium}} 
> instances, they seem to only run with small flaky failures on {{large}} 
> instances or higher
>  * Python Upgrade tests:
>  ** Do not seem to run without many failures on any instance types / any 
> parallelism setting
>  ** Do not seem to parallelize well; it seems each container is going to 
> download multiple C* versions
>  ** Additionally it seems the configuration is not up to date, as currently 
> we get errors because {{JAVA8_HOME}} is not set
>  * Unit tests do not seem to parallelize optimally; the number of test 
> runners does not reflect the available CPUs on the container. Ideally, if # 
> of runners == # of CPUs, build time is improved on any type of instance.
>  ** For instance when using the current configuration, running on medium 
> instances, build will use 1 junit test runner, but 2 CPUs are available. If 
> using 2 runners, the build time is reduced from 19min (at the current main 
> config of parallelism=4) to 12min.
>  * There are some typos in the file; some dtests say "Run Unit Tests" but 
> they are JVM dtests (see 
> [here|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml#L1077],
>  
> [here|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml#L1386])
> So some ways to process these would be:
>  * Do the Python dtests run successfully for anyone on {{medium}} instances? 
> If not, would it make sense to bump them to {{large}} so that they can be run 
> successfully?
>  * Does anybody ever run the python upgrade tests on CircleCI and what is the 
> configuration that makes it work?
>  * Would it make sense to either hardcode the number of test runners in the 
> unit tests with `-Dtest.runners` in the config file to reflect the number of 
> CPUs on the instances, or change the build so that it is able to detect the 
> appropriate number of core available automatically?
> Additionally, it seems this default config file (config.yml) is not as well 
> maintained as the 
> [{{config-2_1.yml}}|https://github.com/apache/cassandra/blob/trunk/.circleci/config-2_1.yml]
>  (+its lowres/highres) version in the same folder (from CASSANDRA-14806). 
> What is the reasoning for maintaining these 2 versions of the build? Could 
> the better maintained version be used as the default? We could generate a 
> lowres version of the new config-2_1.yml, and rename it {{config.yml}} so 
> that it gets picked up by CircleCI automatically instead of the current 
> default.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15642) Inconsistent failure messages on distributed queries

2020-04-06 Thread Kevin Gallardo (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076374#comment-17076374
 ] 

Kevin Gallardo commented on CASSANDRA-15642:


bq. What is your definition of complete/reliable?

To me it would mean that the user gets a complete view of the states of the 
other requests necessary to complete a request at a certain CL.

Instead, in the current state of things:

* say for CL=3
* first response that comes back is a failure
* then later the 2 other responses are successful
* the error message may say "1 failure, 0 successful response"

The "0 successful response" cannot be trusted because some successful responses 
actually came back, but after the failure. And the "1 failure" cannot be 
trusted either, because there may have been more failures that would not be 
reported because of the current fail-fast behavior.

The alternate solution you mention, to save the state first, doesn't provide a 
complete view of the situation either, as far as I understand. The state is 
saved and things are less inconsistent, but the errors returned to the user 
may still be misleading, as explained above.

> Inconsistent failure messages on distributed queries
> 
>
> Key: CASSANDRA-15642
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15642
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Coordination
>Reporter: Kevin Gallardo
>Priority: Normal
>
> As a follow up to some exploration I have done for CASSANDRA-15543, I 
> realized the following behavior in both {{ReadCallback}} and 
> {{AbstractWriteHandler}}:
>  - await for responses
>  - when all required number of responses have come back: unblock the wait
>  - when a single failure happens: unblock the wait
>  - when unblocked, look to see if the counter of failures is > 1 and if so 
> return an error message based on the {{failures}} map that's been filled
> Error messages that can result from this behavior can be a ReadTimeout, a 
> ReadFailure, a WriteTimeout or a WriteFailure.
> In case of a Write/ReadFailure, the user will get back an error looking like 
> the following:
> "Failure: Received X responses, and Y failures"
> (if this behavior I describe is incorrect, please correct me)
> This causes a usability problem. Since the handler will fail and throw an 
> exception as soon as 1 failure happens, the error message that is returned to 
> the user may not be accurate.
> (note: I am not entirely sure of the behavior in case of timeouts for now)
> For example, say a request at CL = QUORUM = 3, a failed request may complete 
> first, then a successful one completes, and another fails. If the exception 
> is thrown fast enough, the error message could say 
>  "Failure: Received 0 response, and 1 failure at CL = 3"
> Which:
> 1. doesn't make a lot of sense because the CL doesn't match the number of 
> results in the message, so you end up thinking "what happened with the rest 
> of the required CL?"
> 2. the information is incorrect. We did receive a successful response, only 
> it came after the initial failure.
> From that logic, I think it is safe to assume that the information returned 
> in the error message cannot be trusted in case of a failure. Only information 
> users should extract out of it is that at least 1 node has failed.
> For a big improvement in usability, the {{ReadCallback}} and 
> {{AbstractWriteResponseHandler}} could instead wait for all responses to come 
> back before unblocking the wait, or let it timeout. This is way, the users 
> will be able to have some trust around the information returned to them.
> Additionally, an error that happens first prevents a timeout to happen 
> because it fails immediately, and so potentially it hides problems with other 
> replicas. If we were to wait for all responses, we might get a timeout, in 
> that case we'd also be able to tell whether failures have happened *before* 
> that timeout, and have a more complete diagnostic where you can't detect both 
> errors at the same time.
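
For illustration, a toy sketch of the proposed wait-for-all behaviour (Python, hypothetical names; the real handlers are {{ReadCallback}} and {{AbstractWriteResponseHandler}}):
{code}
import threading

class WaitForAllHandler:
    """Unblocks only when every replica has answered (or on timeout), so the
    success/failure counts reported to the user are trustworthy."""

    def __init__(self, replica_count):
        self.replica_count = replica_count
        self.successes = 0
        self.failures = 0
        self.lock = threading.Lock()
        self.done = threading.Event()

    def on_response(self, ok):
        with self.lock:
            if ok:
                self.successes += 1
            else:
                self.failures += 1
            if self.successes + self.failures == self.replica_count:
                self.done.set()  # unblock only once every replica answered

    def await_result(self, timeout):
        if not self.done.wait(timeout):
            return "Timeout: %d ok, %d failed so far" % (self.successes, self.failures)
        if self.failures:
            return "Failure: %d ok, %d failed" % (self.successes, self.failures)
        return "Success"
{code}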



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15657) Improve zero-copy-streaming containment check by using file sections

2020-04-06 Thread T Jake Luciani (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076366#comment-17076366
 ] 

T Jake Luciani commented on CASSANDRA-15657:


[~aleksey] Can you comment here? We are not understanding the potential problem 
here?

> Improve zero-copy-streaming containment check by using file sections
> 
>
> Key: CASSANDRA-15657
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15657
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Streaming and Messaging
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0
>
>
> Currently zero-copy streaming is only enabled for the leveled-compaction 
> strategy, and it checks if all keys in the sstables are included in the 
> transferred ranges.
> This is very inefficient. The containment check can be improved by checking 
> if the transferred sections (the transferred file positions) cover the entire 
> sstable.
> I also enabled ZCS for all compaction strategies since the new containment 
> check is very fast.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15659) Better support of Python 3 for cqlsh

2020-04-06 Thread Eduard Tudenhoefner (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eduard Tudenhoefner updated CASSANDRA-15659:

Description: 
h2. From mailing list:

[https://lists.apache.org/thread.html/r377099b632c62b641e4feef5b738084fc5369b0c7157fae867853597%40%3Cdev.cassandra.apache.org%3E]

 

As of today (24/3/2020) and current trunk, Python 3.6 is supported (1), but 
there is no 3.6 version available out of the box in Debian, for example. E.g. 
Buster has Python 3.7 and other (recent) releases have version 2.7. This means 
that if one wants to use Python 3 on Debian, they have to use 3.6, but it is 
not in the repository, so they have to download / compile / install it on 
their own.

There should be some sane Python 3 version supported which is also present in 
the Debian repository (or the requirement to run with 3.6 should be relaxed).

(1) 
[https://github.com/apache/cassandra/blob/bf9a1d487b9ba469e8d740cf7d1cd419535a7e79/bin/cqlsh#L57-L65]

h2. Summary of work that was done:

I relaxed the requirement of *cqlsh* only working with Python 2.7 & 3.6 by 
allowing Python 3.6+.
Note that I left the constraint for Python 3.6 being the minimum Python3 
version. 
As [~ptbannister] pointed out, we could remove the Python 3.6 min version once 
we remove Python 2.7 support, as otherwise testing with lots of different 
Python versions will get costly.

2 Dockerfiles were added in *pylib* for minimal local testing of *cqlsh* 
starting up with Python 3.7 & 3.8, and both revealed CASSANDRA-15572 and 
CASSANDRA-15573. CASSANDRA-15572 was fixed here as it was a one-liner, and I'm 
going to tackle CASSANDRA-15573 later.

Python 3.8 testing was added to the CircleCI config so that we can actually see 
what else breaks with newer Python versions.

A new Docker image with Ubuntu 19.10 was required 
(https://github.com/apache/cassandra-builds/pull/17). This docker image sets up 
Python 2.7/3.6/3.7/3.8 with their respective virtual environments, which are 
then used by the CircleCI yaml.

The image *spod/cassandra-testing-ubuntu1810-java11-w-dependencies:20190306* 
couldn't be updated unfortunately because it can't be built anymore, due to 
Ubuntu 18.10 being EOL. 



  was:
>From mailing list:

[https://lists.apache.org/thread.html/r377099b632c62b641e4feef5b738084fc5369b0c7157fae867853597%40%3Cdev.cassandra.apache.org%3E]

 

As of today (24/3/2020) and current trunk, Python 3.6 is supported (1), but 
there is no 3.6 version available out of the box in Debian, for example. E.g. 
Buster has Python 3.7 and other (recent) releases have version 2.7. This means 
that if one wants to use Python 3 on Debian, they have to use 3.6, but it is 
not in the repository, so they have to download / compile / install it on 
their own.

There should be some sane Python 3 version supported which is also present in 
the Debian repository (or the requirement to run with 3.6 should be relaxed).

(1) 
[https://github.com/apache/cassandra/blob/bf9a1d487b9ba469e8d740cf7d1cd419535a7e79/bin/cqlsh#L57-L65]


> Better support of Python 3 for cqlsh
> 
>
> Key: CASSANDRA-15659
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15659
> Project: Cassandra
>  Issue Type: Task
>  Components: Tool/cqlsh
>Reporter: Stefan Miklosovic
>Assignee: Eduard Tudenhoefner
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0-alpha
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> h2. From mailing list:
> [https://lists.apache.org/thread.html/r377099b632c62b641e4feef5b738084fc5369b0c7157fae867853597%40%3Cdev.cassandra.apache.org%3E]
>  
> As of today (24/3/2020) and current trunk, Python 3.6 is supported (1), but 
> there is no 3.6 version available out of the box in Debian, for example. E.g. 
> Buster has Python 3.7 and other (recent) releases have version 2.7. This 
> means that if one wants to use Python 3 on Debian, they have to use 3.6, but 
> it is not in the repository, so they have to download / compile / install it 
> on their own.
> There should be some sane Python 3 version supported which is also present in 
> the Debian repository (or the requirement to run with 3.6 should be relaxed).
> (1) 
> [https://github.com/apache/cassandra/blob/bf9a1d487b9ba469e8d740cf7d1cd419535a7e79/bin/cqlsh#L57-L65]
> h2. Summary of work that was done:
> I relaxed the requirement of *cqlsh* only working with Python 2.7 & 3.6 by 
> allowing Python 3.6+.
> Note that I left the constraint for Python 3.6 being the minimum Python3 
> version. 
> As [~ptbannister] pointed out, we could remove the Python 3.6 min version 
> once we remove Python 2.7 support, as otherwise testing with lots of 
> different Python versions will get costly.
> 2 Dockerfiles were added in *pylib* for minimal local testing of *cqlsh* 
> starting up with Python 3.7 & 3.8, and both revealed

[jira] [Updated] (CASSANDRA-15659) Better support of Python 3 for cqlsh

2020-04-06 Thread Eduard Tudenhoefner (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eduard Tudenhoefner updated CASSANDRA-15659:

Description: 
h2. From mailing list:

[https://lists.apache.org/thread.html/r377099b632c62b641e4feef5b738084fc5369b0c7157fae867853597%40%3Cdev.cassandra.apache.org%3E]

 

As of today (24/3/2020) and current trunk, Python 3.6 is supported (1), but 
there is no 3.6 version available out of the box in Debian, for example. E.g. 
Buster has Python 3.7 and other (recent) releases have version 2.7. This means 
that if one wants to use Python 3 on Debian, they have to use 3.6, but it is 
not in the repository, so they have to download / compile / install it on 
their own.

There should be some sane Python 3 version supported which is also present in 
the Debian repository (or the requirement to run with 3.6 should be relaxed).

(1) 
[https://github.com/apache/cassandra/blob/bf9a1d487b9ba469e8d740cf7d1cd419535a7e79/bin/cqlsh#L57-L65]
h2. Summary of work that was done:

I relaxed the requirement of *cqlsh* only working with Python 2.7 & 3.6 by 
allowing Python 3.6+.
 Note that I left the constraint for Python 3.6 being the minimum Python3 
version. 
 As [~ptbannister] pointed out, we could remove the Python 3.6 min version once 
we remove Python 2.7 support, as otherwise testing with lots of different 
Python versions will get costly.

2 Dockerfiles were added in *pylib* for minimal local testing of *cqlsh* 
starting up with Python 3.7 & 3.8, and both revealed CASSANDRA-15572 and 
CASSANDRA-15573. CASSANDRA-15572 was fixed here as it was a one-liner, and I'm 
going to tackle CASSANDRA-15573 later.

Python 3.8 testing was added to the CircleCI config so that we can actually see 
what else breaks with newer Python versions.

A new Docker image with Ubuntu 19.10 was required for testing 
([https://github.com/apache/cassandra-builds/pull/17]). This docker image sets 
up Python 2.7/3.6/3.7/3.8 with their respective virtual environments, which 
are then used by the CircleCI yaml.

The image *spod/cassandra-testing-ubuntu1810-java11-w-dependencies:20190306* 
couldn't be updated unfortunately because it can't be built anymore, due to 
Ubuntu 18.10 being EOL.

  was:
h2. From mailing list:

[https://lists.apache.org/thread.html/r377099b632c62b641e4feef5b738084fc5369b0c7157fae867853597%40%3Cdev.cassandra.apache.org%3E]

 

As of today (24/3/2020) and current trunk, Python 3.6 is supported (1), but 
there is no 3.6 version available out of the box in Debian, for example. E.g. 
Buster has Python 3.7 and other (recent) releases have version 2.7. This means 
that if one wants to use Python 3 on Debian, they have to use 3.6, but it is 
not in the repository, so they have to download / compile / install it on 
their own.

There should be some sane Python 3 version supported which is also present in 
the Debian repository (or the requirement to run with 3.6 should be relaxed).

(1) 
[https://github.com/apache/cassandra/blob/bf9a1d487b9ba469e8d740cf7d1cd419535a7e79/bin/cqlsh#L57-L65]

h2. Summary of work that was done:

I relaxed the requirement of *cqlsh* only working with Python 2.7 & 3.6 by 
allowing Python 3.6+.
Note that I left the constraint for Python 3.6 being the minimum Python3 
version. 
As [~ptbannister] pointed out, we could remove the Python 3.6 min version once 
we remove Python 2.7 support, as otherwise testing with lots of different 
Python versions will get costly.

2 Dockerfiles were added in *pylib* for minimal local testing of *cqlsh* 
starting up with Python 3.7 & 3.8, and both revealed CASSANDRA-15572 and 
CASSANDRA-15573. CASSANDRA-15572 was fixed here as it was a one-liner, and I'm 
going to tackle CASSANDRA-15573 later.

Python 3.8 testing was added to the CircleCI config so that we can actually see 
what else breaks with newer Python versions.

A new Docker image with Ubuntu 19.10 was required 
(https://github.com/apache/cassandra-builds/pull/17). This docker image sets up 
Python 2.7/3.6/3.7/3.8 with their respective virtual environments, which are 
then used by the CircleCI yaml.

The image *spod/cassandra-testing-ubuntu1810-java11-w-dependencies:20190306* 
couldn't be updated unfortunately because it can't be built anymore, due to 
Ubuntu 18.10 being EOL. 




> Better support of Python 3 for cqlsh
> 
>
> Key: CASSANDRA-15659
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15659
> Project: Cassandra
>  Issue Type: Task
>  Components: Tool/cqlsh
>Reporter: Stefan Miklosovic
>Assignee: Eduard Tudenhoefner
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0-alpha
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> h2. From mailing list:
> 

[jira] [Commented] (CASSANDRA-15229) BufferPool Regression

2020-04-06 Thread Stefania Alborghetti (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076347#comment-17076347
 ] 

Stefania Alborghetti commented on CASSANDRA-15229:
--

bq. The current implementation isn't really a bump the pointer allocator? It's 
bitmap based, though with a very tiny bitmap. 

Sorry, it's been a while. Of course the current implementation is also bitmap 
based. The point is that it is not suitable for long-lived buffers, similarly 
to our bump-the-pointer strategy. The transient case is easy to solve; either 
approach would work.

bq. Could you elaborate on how these work, as my intuition is that anything 
designed for a thread-per-core architecture probably won't translate so well to 
the present state of the world. Though, either way, I suppose this is probably 
orthogonal to this ticket as we only need to address the {{ChunkCache}} part.

The thread-per-core architecture makes it easy to identify threads that do most 
of the work and cause most of the contention. However, thread identification 
can also be achieved with thread pools, or we can simply give all threads a 
local stash of buffers, provided that we return it when the thread dies. I 
don't think there is any other dependency on TPC beyond this. 

The design choice was mostly dictated by the size of the cache: with AIO reads 
the OS page cache is bypassed, and the chunk cache needs therefore to be very 
large, which is not the case if we use Java NIO reads or if we eventually 
implement asynchronous reads with the new uring API, bypassing AIO completely 
(which I do recommend). 

bq. We also optimized the chunk cache to store memory addresses rather than 
byte buffers, which significantly reduced heap usage. The byte buffers are 
materialized on the fly.

bq. This would be a huge improvement, and a welcome backport if it is easy - 
though it might (I would guess) depend on Unsafe, which may be going away soon. 
It's orthogonal to this ticket, though, I think

Yes it's based on the Unsafe. The addresses come from the slabs, and then we 
use the Unsafe to create hollow buffers and to set the address. This is an 
optimization and it clearly belongs to a separate ticket.

{quote}
We changed the chunk cache to always store buffers of the same size.

We have global lists of these slabs, sorted by buffer size where each size 
is a power-of-two.

How do these two statements reconcile?
{quote}

So let's assume the current workload is mostly on a table with 4k chunks, which 
translate to 4k buffers in the cache. Let's also assume that the workload is 
shifting towards another table, with 8k chunks. Alternatively, let's assume 
compression is ON, and an ALTER TABLE changes the chunk size. So now the chunk 
cache is slowly evicting 4k buffers and retaining 8k buffers. These buffers 
come from two different lists: the list of slabs serving 4k and the list 
serving 8k. Even if we collect all unused 4k slabs, until each slab has every 
single buffer returned, there will be wasted memory and we do not control how 
long that will take. To be fair, it's an extreme case, and we were perhaps 
over-cautious in addressing this possibility by fixing the size of buffers in the 
cache. So it's possible that the redesigned buffer pool may work even with the 
current chunk cache implementation. 

bq. Is it your opinion that your entire ChunkCache implementation can be 
dropped wholesale into 4.0? I would assume it is still primarily 
multi-threaded. If so, it might be preferable to trying to fix the existing 
ChunkCache

The changes to the chunk cache are not trivial and should be left as a follow 
up for 4.x or later in my opinion. 

The changes to the buffer pool can be dropped in 4.0 if you think that:

- they are safe even in the presence of the case described above. 
- they are justified: memory wasted due to fragmentation is perhaps not an 
issue with a cache as small as 512 MB

I'll try to share some code so you can have a clearer picture. 


> BufferPool Regression
> -
>
> Key: CASSANDRA-15229
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15229
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Caching
>Reporter: Benedict Elliott Smith
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0, 4.0-beta
>
>
> The BufferPool was never intended to be used for a {{ChunkCache}}, and we 
> need to either change our behaviour to handle uncorrelated lifetimes or use 
> something else.  This is particularly important with the default chunk size 
> for compressed sstables being reduced.  If we address the problem, we should 
> also utilise the BufferPool for native transport connections like we do for 
> internode messaging, and reduce the number of pooling solutions we employ.
> Probably the best thing to do 

[jira] [Updated] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project

2020-04-06 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-15684:
---
Reviewers: Alex Petrov, Benjamin Lerer

> CASSANDRA-15650 was merged after dtest refactor and modified classes no 
> longer in the project
> -
>
> Key: CASSANDRA-15684
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15684
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 was based off commits before CASSANDRA-15539, which removed 
> some of the files modified in CASSANDRA-15650.  The tests were passing 
> pre-merge, but off earlier commits.  On commit they started failing, since 
> the dtest API no longer matches, producing the following exception:
> {code}
> [junit-timeout] 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] java.lang.NoSuchMethodError: 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
> [junit-timeout] at 
> org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout] at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout] at 
> java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [junit-timeout] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [junit-timeout] at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout] at java.lang.Thread.run(Thread.java:748)
> {code}
> Root cause was that 4 files existed which should have been deleted in 
> CASSANDRA-15539.  Since they were not, when CASSANDRA-15650 modified one it 
> didn't cause a merge conflict, but when the test runs it conflicts and fails.
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15689) CASSANDRA-15650 broke CasWriteTest causing it to fail and hang

2020-04-06 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-15689:
---
  Fix Version/s: 4.0-alpha
  Since Version: 4.0-alpha
Source Control Link: 
https://github.com/apache/cassandra/commit/c133385986db9fb1333b37739528f66ad45de916
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Committed into trunk at c133385986db9fb1333b37739528f66ad45de916

> CASSANDRA-15650 broke CasWriteTest causing it to fail and hang
> --
>
> Key: CASSANDRA-15689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15689
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0-alpha
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 changed IsolatedExecutor to rethrow runtime exceptions rather 
> than wrap them; CasWriteTest assumes they are wrapped, which causes tests to 
> fail and casWriteContentionTimeoutTest to time out.
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15689]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15689) CASSANDRA-15650 broke CasWriteTest causing it to fail and hang

2020-04-06 Thread Benjamin Lerer (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Lerer updated CASSANDRA-15689:
---
Status: Ready to Commit  (was: Review In Progress)

> CASSANDRA-15650 broke CasWriteTest causing it to fail and hang
> --
>
> Key: CASSANDRA-15689
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15689
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> CASSANDRA-15650 changed IsolatedExecutor to rethrow runtime exceptions rather 
> than wrap them; CasWriteTest assumes they are wrapped, which causes tests to 
> fail and casWriteContentionTimeoutTest to time out.
> [Circle 
> CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15689]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[cassandra] branch trunk updated: Fix CasWriterTest after CASSANDRA-15689

2020-04-06 Thread blerer
This is an automated email from the ASF dual-hosted git repository.

blerer pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new c133385  Fix CasWriterTest after CASSANDRA-15689
c133385 is described below

commit c133385986db9fb1333b37739528f66ad45de916
Author: David Capwell 
AuthorDate: Fri Apr 3 10:56:20 2020 -0700

Fix CasWriterTest after CASSANDRA-15689

patch by David Capwell; reviewed by Benjamin Lerer for CASSANDRA-15689
---
 .../cassandra/distributed/test/CasWriteTest.java   | 24 +++---
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git 
a/test/distributed/org/apache/cassandra/distributed/test/CasWriteTest.java 
b/test/distributed/org/apache/cassandra/distributed/test/CasWriteTest.java
index a5d7e72..1d886cf 100644
--- a/test/distributed/org/apache/cassandra/distributed/test/CasWriteTest.java
+++ b/test/distributed/org/apache/cassandra/distributed/test/CasWriteTest.java
@@ -167,8 +167,8 @@ public class CasWriteTest extends TestBaseImpl
failure ->
failure.get() != null &&
failure.get()
-  .getMessage()
-  .contains(CasWriteTimeoutException.class.getCanonicalName()),
+  .getClass().getCanonicalName()
+  .equals(CasWriteTimeoutException.class.getCanonicalName()),
"Expecting cause to be CasWriteTimeoutException");
 }
 
@@ -217,8 +217,7 @@ public class CasWriteTest extends TestBaseImpl
 
 private void expectCasWriteTimeout()
 {
-thrown.expect(RuntimeException.class);
-thrown.expectCause(new BaseMatcher()
+thrown.expect(new BaseMatcher()
 {
 public boolean matches(Object item)
 {
@@ -232,7 +231,18 @@ public class CasWriteTest extends TestBaseImpl
 });
 // unable to assert on class because the exception thrown was loaded by a different classloader, InstanceClassLoader
 // therefore asserts the FQCN name present in the message as a workaround
-thrown.expectMessage(containsString(CasWriteTimeoutException.class.getCanonicalName()));
+thrown.expect(new BaseMatcher()
+{
+public boolean matches(Object item)
+{
+return item.getClass().getCanonicalName().equals(CasWriteTimeoutException.class.getCanonicalName());
+}
+
+public void describeTo(Description description)
+{
+description.appendText("Class was expected to be " + CasWriteTimeoutException.class.getCanonicalName() + " but was not");
+}
+});
 thrown.expectMessage(containsString("CAS operation timed out"));
 }
 
@@ -256,8 +266,8 @@ public class CasWriteTest extends TestBaseImpl
 }
 catch (Throwable t)
 {
-Assert.assertTrue("Expecting cause to be CasWriteUncertainException",
-  t.getMessage().contains(CasWriteUnknownResultException.class.getCanonicalName()));
+Assert.assertEquals("Expecting cause to be CasWriteUncertainException",
+  CasWriteUnknownResultException.class.getCanonicalName(), t.getClass().getCanonicalName());
 return;
 }
 }
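
For readers skimming the diff above: the fix replaces assertions on the exception message with assertions on the exception's fully-qualified class name. A self-contained sketch of that idea (assuming JUnit 4's Hamcrest {{BaseMatcher}}, as the test uses; this matcher is illustrative, not the committed code):

{code:java}
import org.hamcrest.BaseMatcher;
import org.hamcrest.Description;

// Illustrative matcher: compares exceptions by fully-qualified class name
// rather than by Class identity. In the in-JVM dtests the exception class is
// loaded by a separate InstanceClassLoader, so an instanceof check against
// the test's own Class object fails even for the "same" class; comparing
// canonical names sidesteps that.
public class FqcnMatcher extends BaseMatcher<Object>
{
    private final String expectedFqcn;

    public FqcnMatcher(Class<?> expected)
    {
        this.expectedFqcn = expected.getCanonicalName();
    }

    public boolean matches(Object item)
    {
        return item != null && expectedFqcn.equals(item.getClass().getCanonicalName());
    }

    public void describeTo(Description description)
    {
        description.appendText("an exception whose class is " + expectedFqcn);
    }
}
{code}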


-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15672) Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec

2020-04-06 Thread Stefania Alborghetti (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania Alborghetti updated CASSANDRA-15672:
-
  Since Version: 4.0-alpha
Source Control Link: 
https://gitbox.apache.org/repos/asf?p=cassandra.git;a=commit;h=b4e640a96e76f8d4a45937b1312b64ddc1aeb8ac
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

CI link: https://jenkins-cm4.apache.org/view/patches/job/Cassandra-devbranch/16/

Committed as 
[b4e640a96e76f8d4a45937b1312b64ddc1aeb8ac|https://gitbox.apache.org/repos/asf?p=cassandra.git;a=commit;h=b4e640a96e76f8d4a45937b1312b64ddc1aeb8ac]

>  Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest 
> Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec
> -
>
> Key: CASSANDRA-15672
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15672
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Aleksandr Sorokoumov
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The following failure was observed:
>  Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest 
> Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec
> [junit-timeout] 
> [junit-timeout] Testcase: 
> testMockedMessagingPrepareFailureP1(org.apache.cassandra.repair.consistent.CoordinatorMessagingTest):
>FAILED
> [junit-timeout] null
> [junit-timeout] junit.framework.AssertionFailedError
> [junit-timeout]   at 
> org.apache.cassandra.repair.consistent.CoordinatorMessagingTest.testMockedMessagingPrepareFailure(CoordinatorMessagingTest.java:206)
> [junit-timeout]   at 
> org.apache.cassandra.repair.consistent.CoordinatorMessagingTest.testMockedMessagingPrepareFailureP1(CoordinatorMessagingTest.java:154)
> [junit-timeout] 
> [junit-timeout] 
> Seen on Java8



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15672) Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec

2020-04-06 Thread Stefania Alborghetti (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania Alborghetti updated CASSANDRA-15672:
-
Status: Ready to Commit  (was: Review In Progress)

>  Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest 
> Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec
> -
>
> Key: CASSANDRA-15672
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15672
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Aleksandr Sorokoumov
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The following failure was observed:
>  Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest 
> Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec
> [junit-timeout] 
> [junit-timeout] Testcase: 
> testMockedMessagingPrepareFailureP1(org.apache.cassandra.repair.consistent.CoordinatorMessagingTest):
>FAILED
> [junit-timeout] null
> [junit-timeout] junit.framework.AssertionFailedError
> [junit-timeout]   at 
> org.apache.cassandra.repair.consistent.CoordinatorMessagingTest.testMockedMessagingPrepareFailure(CoordinatorMessagingTest.java:206)
> [junit-timeout]   at 
> org.apache.cassandra.repair.consistent.CoordinatorMessagingTest.testMockedMessagingPrepareFailureP1(CoordinatorMessagingTest.java:154)
> [junit-timeout] 
> [junit-timeout] 
> Seen on Java8



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[cassandra] branch trunk updated: Fix flaky CoordinatorMessagingTest and docstring in OutboundSink and ConsistentSession

2020-04-06 Thread stefania
This is an automated email from the ASF dual-hosted git repository.

stefania pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new b4e640a  Fix flaky CoordinatorMessagingTest and docstring in 
OutboundSink and ConsistentSession
b4e640a is described below

commit b4e640a96e76f8d4a45937b1312b64ddc1aeb8ac
Author: Aleksandr Sorokoumov 
AuthorDate: Tue Mar 31 15:53:51 2020 +0200

Fix flaky CoordinatorMessagingTest and docstring in OutboundSink and 
ConsistentSession

patch by Aleksandr Sorokoumov; reviewed by Stefania Alborghetti for 
CASSANDRA-15672
---
 CHANGES.txt|  1 +
 .../org/apache/cassandra/net/OutboundSink.java |  2 +-
 .../repair/consistent/ConsistentSession.java   |  8 +--
 .../consistent/CoordinatorMessagingTest.java   | 70 +++---
 4 files changed, 53 insertions(+), 28 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 65111d0..fb881de 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0-alpha4
+ * Fix flaky CoordinatorMessagingTest and docstring in OutboundSink and 
ConsistentSession (CASSANDRA-15672)
  * Fix force compaction of wrapping ranges (CASSANDRA-15664)
  * Expose repair streaming metrics (CASSANDRA-15656)
  * Set now in seconds in the future for validation repairs (CASSANDRA-15655)
diff --git a/src/java/org/apache/cassandra/net/OutboundSink.java 
b/src/java/org/apache/cassandra/net/OutboundSink.java
index d19b3e2..34c72db 100644
--- a/src/java/org/apache/cassandra/net/OutboundSink.java
+++ b/src/java/org/apache/cassandra/net/OutboundSink.java
@@ -25,7 +25,7 @@ import org.apache.cassandra.locator.InetAddressAndPort;
 /**
  * A message sink that all outbound messages go through.
  *
- * Default sink {@link Sink} used by {@link MessagingService} is 
MessagingService#doSend(), which proceeds to
+ * Default sink {@link Sink} used by {@link MessagingService} is {@link 
MessagingService#doSend(Message, InetAddressAndPort, ConnectionType)}, which 
proceeds to
 * send messages over the network, but it can be overridden to filter out certain messages, record the fact
 * of attempted delivery, or delay their delivery.
  *
diff --git 
a/src/java/org/apache/cassandra/repair/consistent/ConsistentSession.java 
b/src/java/org/apache/cassandra/repair/consistent/ConsistentSession.java
index 03de157..d9ac927 100644
--- a/src/java/org/apache/cassandra/repair/consistent/ConsistentSession.java
+++ b/src/java/org/apache/cassandra/repair/consistent/ConsistentSession.java
@@ -56,13 +56,13 @@ import org.apache.cassandra.tools.nodetool.RepairAdmin;
  * There are 4 stages to a consistent incremental repair.
  *
  * Repair prepare
- *  First, the normal {@link ActiveRepairService#prepareForRepair(UUID, 
InetAddressAndPort, Set, RepairOption, List)} stuff
+ *  First, the normal {@link ActiveRepairService#prepareForRepair(UUID, 
InetAddressAndPort, Set, RepairOption, boolean, List)} stuff
  *  happens, which sends out {@link PrepareMessage} and creates a {@link 
ActiveRepairService.ParentRepairSession}
  *  on the coordinator and each of the neighbors.
  *
  * Consistent prepare
  *  The consistent prepare step promotes the parent repair session to a 
consistent session, and isolates the sstables
- *  being repaired other sstables. First, the coordinator sends a {@link 
PrepareConsistentRequest} message to each repair
+ *  being repaired from  other sstables. First, the coordinator sends a {@link 
PrepareConsistentRequest} message to each repair
 *  participant (including itself). When received, the node creates a {@link LocalSession} instance, sets its state to
 *  {@code PREPARING}, persists it, and begins preparing the tables for incremental repair, which segregates the data
 *  being repaired from the rest of the table data. When the preparation completes, the session state is set to
@@ -74,7 +74,7 @@ import org.apache.cassandra.tools.nodetool.RepairAdmin;
 *  Once the coordinator receives positive {@code PrepareConsistentResponse} messages from all the participants, the
  *  coordinator begins the normal repair process.
  *  
- *  (see {@link CoordinatorSession#handlePrepareResponse(InetAddress, boolean)}
+ *  (see {@link CoordinatorSession#handlePrepareResponse(InetAddressAndPort, 
boolean)}
  *
  * Repair
  *  The coordinator runs the normal data repair process against the sstables 
segregated in the previous step. When a
@@ -96,7 +96,7 @@ import org.apache.cassandra.tools.nodetool.RepairAdmin;
  *  conflicts with in progress compactions. The sstables will be marked 
repaired as part of the normal compaction process.
  *  
  *
- *  On the coordinator side, see {@link CoordinatorSession#finalizePropose()}, 
{@link CoordinatorSession#handleFinalizePromise(InetAddress, boolean)},
+ *  On the coordinator side, see {@link CoordinatorSession#finalizePropose()}, 
{@link 

[jira] [Updated] (CASSANDRA-15369) Fake row deletions and range tombstones, causing digest mismatch and sstable growth

2020-04-06 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-15369:

Reviewers: Andres de la Peña, Marcus Eriksson  (was: Andres de la Peña)

> Fake row deletions and range tombstones, causing digest mismatch and sstable 
> growth
> ---
>
> Key: CASSANDRA-15369
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15369
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination, Local/Memtable, Local/SSTable
>Reporter: Benedict Elliott Smith
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0, 3.0.x, 3.11.x
>
>
> As assessed in CASSANDRA-15363, we generate fake row deletions and fake 
> tombstone markers under various circumstances:
>  * If we perform a clustering key query (or select a compact column):
>  * Serving from a {{Memtable}}, we will generate fake row deletions
>  * Serving from an sstable, we will generate fake row tombstone markers
>  * If we perform a slice query, we will generate only fake row tombstone 
> markers for any range tombstone that begins or ends outside of the limit of 
> the requested slice
>  * If we perform a multi-slice or IN query, this will occur for each 
> slice/clustering
> Unfortunately, these different behaviours can lead to very different data 
> stored in sstables until a full repair is run.  When we read-repair, we only 
> send these fake deletions or range tombstones.  A fake row deletion, 
> clustering RT and slice RT, each produces a different digest.  So for each 
> single point lookup we can produce a digest mismatch twice, and until a full 
> repair is run we can encounter an unlimited number of digest mismatches 
> across different overlapping queries.
> Relatedly, this seems a more problematic variant of our atomicity failures 
> caused by our monotonic reads, since RTs can have an atomic effect across (up 
> to) the entire partition, whereas the propagation may happen on an 
> arbitrarily small portion.  If the RT exists on only one node, this could 
> plausibly lead to a fairly problematic scenario if that node fails before the 
> range can be repaired. 
> At the very least, this behaviour can lead to an almost unlimited amount of 
> extraneous data being stored until the range is repaired and compaction 
> happens to overwrite the sub-range RTs and row deletions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15369) Fake row deletions and range tombstones, causing digest mismatch and sstable growth

2020-04-06 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-15369:
--
Reviewers: Andres de la Peña

> Fake row deletions and range tombstones, causing digest mismatch and sstable 
> growth
> ---
>
> Key: CASSANDRA-15369
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15369
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination, Local/Memtable, Local/SSTable
>Reporter: Benedict Elliott Smith
>Assignee: ZhaoYang
>Priority: Normal
> Fix For: 4.0, 3.0.x, 3.11.x
>
>
> As assessed in CASSANDRA-15363, we generate fake row deletions and fake 
> tombstone markers under various circumstances:
>  * If we perform a clustering key query (or select a compact column):
>  * Serving from a {{Memtable}}, we will generate fake row deletions
>  * Serving from an sstable, we will generate fake row tombstone markers
>  * If we perform a slice query, we will generate only fake row tombstone 
> markers for any range tombstone that begins or ends outside of the limit of 
> the requested slice
>  * If we perform a multi-slice or IN query, this will occur for each 
> slice/clustering
> Unfortunately, these different behaviours can lead to very different data 
> stored in sstables until a full repair is run.  When we read-repair, we only 
> send these fake deletions or range tombstones.  A fake row deletion, 
> clustering RT and slice RT, each produces a different digest.  So for each 
> single point lookup we can produce a digest mismatch twice, and until a full 
> repair is run we can encounter an unlimited number of digest mismatches 
> across different overlapping queries.
> Relatedly, this seems a more problematic variant of our atomicity failures 
> caused by our monotonic reads, since RTs can have an atomic effect across (up 
> to) the entire partition, whereas the propagation may happen on an 
> arbitrarily small portion.  If the RT exists on only one node, this could 
> plausibly lead to a fairly problematic scenario if that node fails before the 
> range can be repaired. 
> At the very least, this behaviour can lead to an almost unlimited amount of 
> extraneous data being stored until the range is repaired and compaction 
> happens to overwrite the sub-range RTs and row deletions.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15695) Fix NoSuchMethodError: 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts

2020-04-06 Thread Eduard Tudenhoefner (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eduard Tudenhoefner updated CASSANDRA-15695:

Resolution: Duplicate
Status: Resolved  (was: Open)

> Fix NoSuchMethodError: 
> 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> 
>
> Key: CASSANDRA-15695
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15695
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Eduard Tudenhoefner
>Assignee: Eduard Tudenhoefner
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> It seems that there was a regression introduced with CASSANDRA-15650 as the 
> first failures of that kind started to happen 
> [here.|https://ci-cassandra.apache.org/view/branches/job/Cassandra-trunk/30/#showFailuresLink]
>  {code}
> [junit-timeout] Testcase: 
> prepareRPCTimeout[DATACENTER_AWARE/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest):
> Caused an ERROR
> [junit-timeout] 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])'
> [junit-timeout] java.lang.NoSuchMethodError: 
> 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])'
> [junit-timeout]   at 
> org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45)
> [junit-timeout]   at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout]   at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout]   at 
> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> [junit-timeout]   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> [junit-timeout]   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> [junit-timeout]   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout]   at java.base/java.lang.Thread.run(Thread.java:834)
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15695) Fix NoSuchMethodError: 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts

2020-04-06 Thread Eduard Tudenhoefner (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eduard Tudenhoefner updated CASSANDRA-15695:

 Bug Category: Parent values: Code(13163)
   Complexity: Normal
Discovered By: Unit Test
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Fix NoSuchMethodError: 
> 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> 
>
> Key: CASSANDRA-15695
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15695
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Eduard Tudenhoefner
>Assignee: Eduard Tudenhoefner
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> It seems that there was a regression introduced with CASSANDRA-15650 as the 
> first failures of that kind started to happen 
> [here.|https://ci-cassandra.apache.org/view/branches/job/Cassandra-trunk/30/#showFailuresLink]
>  {code}
> [junit-timeout] Testcase: 
> prepareRPCTimeout[DATACENTER_AWARE/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest):
> Caused an ERROR
> [junit-timeout] 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])'
> [junit-timeout] java.lang.NoSuchMethodError: 
> 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])'
> [junit-timeout]   at 
> org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45)
> [junit-timeout]   at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout]   at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout]   at 
> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> [junit-timeout]   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> [junit-timeout]   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> [junit-timeout]   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout]   at java.base/java.lang.Thread.run(Thread.java:834)
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15695) Fix NoSuchMethodError: 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts

2020-04-06 Thread Eduard Tudenhoefner (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eduard Tudenhoefner updated CASSANDRA-15695:

Since Version: 4.0-alpha

> Fix NoSuchMethodError: 
> 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> 
>
> Key: CASSANDRA-15695
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15695
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Eduard Tudenhoefner
>Assignee: Eduard Tudenhoefner
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> It seems that there was a regression introduced with CASSANDRA-15650 as the 
> first failures of that kind started to happen 
> [here.|https://ci-cassandra.apache.org/view/branches/job/Cassandra-trunk/30/#showFailuresLink]
>  {code}
> [junit-timeout] Testcase: 
> prepareRPCTimeout[DATACENTER_AWARE/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest):
> Caused an ERROR
> [junit-timeout] 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])'
> [junit-timeout] java.lang.NoSuchMethodError: 
> 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])'
> [junit-timeout]   at 
> org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45)
> [junit-timeout]   at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout]   at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout]   at 
> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> [junit-timeout]   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> [junit-timeout]   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> [junit-timeout]   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout]   at java.base/java.lang.Thread.run(Thread.java:834)
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15695) Fix NoSuchMethodError: 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts

2020-04-06 Thread Eduard Tudenhoefner (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eduard Tudenhoefner updated CASSANDRA-15695:

Fix Version/s: 4.0-alpha

> Fix NoSuchMethodError: 
> 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> 
>
> Key: CASSANDRA-15695
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15695
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Eduard Tudenhoefner
>Assignee: Eduard Tudenhoefner
>Priority: Normal
> Fix For: 4.0-alpha
>
>
> It seems that there was a regression introduced with CASSANDRA-15650 as the 
> first failures of that kind started to happen 
> [here.|https://ci-cassandra.apache.org/view/branches/job/Cassandra-trunk/30/#showFailuresLink]
>  {code}
> [junit-timeout] Testcase: 
> prepareRPCTimeout[DATACENTER_AWARE/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest):
> Caused an ERROR
> [junit-timeout] 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])'
> [junit-timeout] java.lang.NoSuchMethodError: 
> 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])'
> [junit-timeout]   at 
> org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45)
> [junit-timeout]   at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout]   at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout]   at 
> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> [junit-timeout]   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> [junit-timeout]   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> [junit-timeout]   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout]   at java.base/java.lang.Thread.run(Thread.java:834)
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-15695) Fix NoSuchMethodError: 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts

2020-04-06 Thread Eduard Tudenhoefner (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eduard Tudenhoefner reassigned CASSANDRA-15695:
---

Assignee: Eduard Tudenhoefner

> Fix NoSuchMethodError: 
> 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> 
>
> Key: CASSANDRA-15695
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15695
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Eduard Tudenhoefner
>Assignee: Eduard Tudenhoefner
>Priority: Normal
>
> It seems that there was a regression introduced with CASSANDRA-15650 as the 
> first failures of that kind started to happen 
> [here.|https://ci-cassandra.apache.org/view/branches/job/Cassandra-trunk/30/#showFailuresLink]
>  {code}
> [junit-timeout] Testcase: 
> prepareRPCTimeout[DATACENTER_AWARE/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest):
> Caused an ERROR
> [junit-timeout] 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])'
> [junit-timeout] java.lang.NoSuchMethodError: 
> 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
> org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])'
> [junit-timeout]   at 
> org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45)
> [junit-timeout]   at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
> [junit-timeout]   at 
> org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
> [junit-timeout]   at 
> java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
> [junit-timeout]   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> [junit-timeout]   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> [junit-timeout]   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> [junit-timeout]   at java.base/java.lang.Thread.run(Thread.java:834)
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-15695) Fix NoSuchMethodError: 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts

2020-04-06 Thread Eduard Tudenhoefner (Jira)
Eduard Tudenhoefner created CASSANDRA-15695:
---

 Summary: Fix NoSuchMethodError: 
'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
 Key: CASSANDRA-15695
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15695
 Project: Cassandra
  Issue Type: Bug
  Components: Test/unit
Reporter: Eduard Tudenhoefner


It seems that there was a regression introduced with CASSANDRA-15650 as the 
first failures of that kind started to happen 
[here.|https://ci-cassandra.apache.org/view/branches/job/Cassandra-trunk/30/#showFailuresLink]

 {code}
[junit-timeout] Testcase: 
prepareRPCTimeout[DATACENTER_AWARE/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest):
  Caused an ERROR
[junit-timeout] 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])'
[junit-timeout] java.lang.NoSuchMethodError: 
'org.apache.cassandra.distributed.api.NodeToolResult$Asserts 
org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])'
[junit-timeout] at 
org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45)
[junit-timeout] at 
org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
[junit-timeout] at 
org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
[junit-timeout] at 
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[junit-timeout] at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
[junit-timeout] at 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
[junit-timeout] at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
[junit-timeout] at java.base/java.lang.Thread.run(Thread.java:834)
{code}

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15667) StreamResultFuture check for completeness is inconsistent, leading to races

2020-04-06 Thread Massimiliano Tomassi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Massimiliano Tomassi updated CASSANDRA-15667:
-
Test and Documentation Plan: 
[https://app.circleci.com/pipelines/github/maxtomassi/cassandra?branch=15667-4.0]

It seems like JVM dtests fail to run properly. Lots of logs like this:
{code:java}
[junit-timeout] Testcase: 
prepareRPCTimeout[PARALLEL/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest):
  Caused an ERROR
[junit-timeout] 
org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
[junit-timeout] java.lang.NoSuchMethodError: 
org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts;
[junit-timeout] at 
org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45)
[junit-timeout] at 
org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39)
[junit-timeout] at 
org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67)
[junit-timeout] at 
java.util.concurrent.FutureTask.run(FutureTask.java:266)
[junit-timeout] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[junit-timeout] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[junit-timeout] at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
[junit-timeout] at java.lang.Thread.run(Thread.java:748)
{code}
 Status: Patch Available  (was: In Progress)

> StreamResultFuture check for completeness is inconsistent, leading to races
> ---
>
> Key: CASSANDRA-15667
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15667
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: Massimiliano Tomassi
>Priority: Normal
> Fix For: 4.0
>
>
> {{StreamResultFuture#maybeComplete()}} uses 
> {{StreamCoordinator#hasActiveSessions()}} to determine if all sessions are 
> completed, but then accesses each session state via 
> {{StreamCoordinator#getAllSessionInfo()}}: this is inconsistent, as the 
> former relies on the actual {{StreamSession}} state, while the latter on the 
> {{SessionInfo}} state, and the two are concurrently updated with no 
> coordination whatsoever.
> This leads to races, e.g. apparent in some spurious dtest failures, such as 
> {{TestBootstrap.resumable_bootstrap_test}} in CASSANDRA-15614 cc 
> [~e.dimitrova].
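
To make the hazard concrete: a minimal, self-contained sketch (hypothetical field names standing in for the {{StreamSession}} and {{SessionInfo}} states; not the real classes), in which completeness is judged from one flag while the outcome is read from another that is written in a separate step, so a checker can run between the two writes.

{code:java}
// Sketch only: 'active' stands in for StreamSession state and 'succeeded'
// for SessionInfo state. They are written in two uncoordinated steps, so
// maybeComplete() can observe active == false before succeeded is updated
// and report a spurious failure.
public class CompletionRaceSketch
{
    static volatile boolean active = true;     // ~ StreamSession state
    static volatile boolean succeeded = false; // ~ SessionInfo state

    static void sessionThread()
    {
        active = false;   // step 1: session leaves the active set
        // <-- maybeComplete() may run here and see an inconsistent pair
        succeeded = true; // step 2: per-session info updated later
    }

    static void maybeComplete()
    {
        if (!active) // the "no active sessions" check
            System.out.println(succeeded ? "stream succeeded" : "stream failed (spurious?)");
    }

    public static void main(String[] args) throws InterruptedException
    {
        Thread t = new Thread(CompletionRaceSketch::sessionThread);
        t.start();
        maybeComplete(); // racy: may print the spurious failure line
        t.join();
    }
}
{code}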



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15568) Message filtering should apply on the inboundSink in In-JVM dtest

2020-04-06 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076235#comment-17076235
 ] 

Benedict Elliott Smith commented on CASSANDRA-15568:


bq. inbound message filtering / sinks

Hmm, when did this happen?  Have we eliminated outbound filtering?  There is 
value in being able to stop progress on the outbound thread, as it permits you 
to specify a sequence of events by controlling the flow of events on the 
coordinator.
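
For illustration, a self-contained sketch of that technique (hypothetical types, not the real in-JVM dtest filter API): a filter evaluated on the outbound path holds a message until the test releases a latch, pinning the coordinator's message flow so the test controls the order of events.

{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.function.Predicate;

// Sketch only (hypothetical types): a blocking outbound filter lets the test
// decide exactly when the held message is delivered, so other events can be
// forced to happen first.
public class BlockingOutboundFilterSketch
{
    static final CountDownLatch RELEASE = new CountDownLatch(1);

    // Filter run on the outbound delivery path; returning true means deliver.
    static final Predicate<String> HOLD_PREPARE = message -> {
        if (message.startsWith("PREPARE"))
        {
            try
            {
                RELEASE.await(); // the coordinator's send stalls right here
            }
            catch (InterruptedException e)
            {
                Thread.currentThread().interrupt();
            }
        }
        return true; // delay, don't drop
    };

    public static void main(String[] args) throws InterruptedException
    {
        Thread outbound = new Thread(() -> {
            if (HOLD_PREPARE.test("PREPARE session-1"))
                System.out.println("PREPARE delivered");
        });
        outbound.start();
        System.out.println("competing event happens before PREPARE is delivered");
        RELEASE.countDown(); // now let the held message through
        outbound.join();
    }
}
{code}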

> Message filtering should apply on the inboundSink in In-JVM dtest
> -
>
> Key: CASSANDRA-15568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15568
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> The message filtering mechanism in the in-jvm dtest helps to simulate network 
> partition/delay. 
> The problem with the current approach, which adds all filters to the 
> {{MessagingService#outboundSink}}, is that a blocking filter prevents the 
> following filters from being evaluated, since there is only a single thread that 
> evaluates them. It further blocks the other outgoing messages. The typical 
> internode messaging pattern is that the coordinator node sends out multiple 
> messages to other nodes upon receiving a query, so the described blocking 
> can happen quite often.
> The problem can be solved by moving the message filtering to the 
> {{MessagingService#inboundSink}}, so that each inbound message is 
> naturally filtered in parallel.
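
A minimal, self-contained sketch of the contrast (hypothetical types; the real machinery lives in {{MessagingService}}): with a single outbound thread, a filter that blocks on message "A" also stalls unrelated message "B", while inbound placement evaluates the filters on each receiving thread, isolating the delay.

{code:java}
import java.util.List;
import java.util.function.Predicate;

// Sketch only (hypothetical types): contrasts filter placement. Outbound:
// one thread runs every filter, so a filter blocking on "A" delays "B" too.
// Inbound: each receiving thread evaluates the filters for its own message,
// so blocking is isolated to that message.
public class SinkPlacementSketch
{
    static boolean accept(List<Predicate<String>> filters, String message)
    {
        return filters.stream().allMatch(f -> f.test(message));
    }

    public static void main(String[] args)
    {
        List<Predicate<String>> filters = List.of(m -> {
            if (m.equals("A"))
                sleep(1000); // simulates a blocking filter
            return true;
        });

        // Outbound placement: single thread, "B" waits behind "A".
        for (String m : new String[]{ "A", "B" })
            if (accept(filters, m))
                System.out.println("outbound delivered " + m);

        // Inbound placement: one thread per receiving node, "B" isn't delayed.
        for (String m : new String[]{ "A", "B" })
            new Thread(() -> {
                if (accept(filters, m))
                    System.out.println("inbound delivered " + m);
            }).start();
    }

    static void sleep(long millis)
    {
        try { Thread.sleep(millis); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
{code}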



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15667) StreamResultFuture check for completeness is inconsistent, leading to races

2020-04-06 Thread Massimiliano Tomassi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076236#comment-17076236
 ] 

Massimiliano Tomassi commented on CASSANDRA-15667:
--

PR: [https://github.com/maxtomassi/cassandra/pull/1]
CircleCI: 
[https://app.circleci.com/pipelines/github/maxtomassi/cassandra?branch=15667-4.0]
 (JVM dtests failed running)

> StreamResultFuture check for completeness is inconsistent, leading to races
> ---
>
> Key: CASSANDRA-15667
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15667
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Streaming and Messaging
>Reporter: Sergio Bossa
>Assignee: Massimiliano Tomassi
>Priority: Normal
> Fix For: 4.0
>
>
> {{StreamResultFuture#maybeComplete()}} uses 
> {{StreamCoordinator#hasActiveSessions()}} to determine if all sessions are 
> completed, but then accesses each session state via 
> {{StreamCoordinator#getAllSessionInfo()}}: this is inconsistent, as the 
> former relies on the actual {{StreamSession}} state, while the latter on the 
> {{SessionInfo}} state, and the two are concurrently updated with no 
> coordination whatsoever.
> This leads to races, e.g. apparent in some spurious dtest failures, such as 
> {{TestBootstrap.resumable_bootstrap_test}} in CASSANDRA-15614 cc 
> [~e.dimitrova].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15671) Testcase: testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest): FAILED

2020-04-06 Thread Francisco Fernandez (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076223#comment-17076223
 ] 

Francisco Fernandez commented on CASSANDRA-15671:
-

Can you see the results using this 
[link|https://app.circleci.com/pipelines/github/fcofdez/cassandra?branch=CASSANDRA-15671]?

> Testcase: 
> testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest):
>FAILED
> --
>
> Key: CASSANDRA-15671
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15671
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Francisco Fernandez
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The following test failure was observed:
> [junit-timeout] Testcase: 
> testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest):
>FAILED
> [junit-timeout] expected:<4> but was:<5>
> [junit-timeout] junit.framework.AssertionFailedError: expected:<4> but was:<5>
> [junit-timeout]   at 
> org.apache.cassandra.db.compaction.CancelCompactionsTest.testSubrangeCompaction(CancelCompactionsTest.java:190)
> Java 8



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15586) 4.0 quality testing: Cluster Setup and Maintenance

2020-04-06 Thread Angelo Polo (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076210#comment-17076210
 ] 

Angelo Polo commented on CASSANDRA-15586:
-

Glad to see FreeBSD is back on the map for the broader Cassandra community :). 
[~e.dimitrova] I will let you know as I investigate bugs or need any insight. 
Some things can get patched in the FreeBSD build and don't need immediate (i.e. 
prior to the 4.0 release) support from a core project committer, such as 
CASSANDRA-15693, which I opened recently and can be upstreamed sometime later.

> 4.0 quality testing: Cluster Setup and Maintenance
> --
>
> Key: CASSANDRA-15586
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15586
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest
>Reporter: Josh McKenzie
>Assignee: Ekaterina Dimitrova
>Priority: Normal
>  Labels: 4.0-QA
> Fix For: 4.0-rc
>
>
> We want 4.0 to be easy for users to setup out of the box and just work. This 
> means having low friction when users download the Cassandra package and start 
> running it. For example, users should be able to easily configure and start 
> new 4.0 clusters and have tokens distributed evenly. Another example is 
> packaging, it should be easy to install Cassandra on all supported platforms 
> (e.g. packaging) and have Cassandra use standard platform integrations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15568) Message filtering should apply on the inboundSink in In-JVM dtest

2020-04-06 Thread Alex Petrov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076194#comment-17076194
 ] 

Alex Petrov commented on CASSANDRA-15568:
-

If I understand the problem correctly, [~dcapwell] has already implemented 
inbound message filtering / sinks. Should we close this one as duplicate?

> Message filtering should apply on the inboundSink in In-JVM dtest
> -
>
> Key: CASSANDRA-15568
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15568
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
>
> The message filtering mechanism in the in-jvm dtest helps to simulate network 
> partition/delay. 
> The problem with the current approach, which adds all filters to the 
> {{MessagingService#outboundSink}}, is that a blocking filter prevents the 
> following filters from being evaluated, since there is only a single thread that 
> evaluates them. It further blocks the other outgoing messages. The typical 
> internode messaging pattern is that the coordinator node sends out multiple 
> messages to other nodes upon receiving a query, so the described blocking 
> can happen quite often.
> The problem can be solved by moving the message filtering to the 
> {{MessagingService#inboundSink}}, so that each inbound message is 
> naturally filtered in parallel.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13403) nodetool repair breaks SASI index

2020-04-06 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-13403:

Reviewers:   (was: Alex Petrov)

> nodetool repair breaks SASI index
> -
>
> Key: CASSANDRA-13403
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13403
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
> Environment: 3.10
>Reporter: Igor Novgorodov
>Priority: Normal
>  Labels: patch
> Attachments: 3_nodes_compaction.log, 4_nodes_compaction.log, 
> CASSANDRA-13403.patch, testSASIRepair.patch
>
>
> I've got table:
> {code}
> CREATE TABLE cservice.bulks_recipients (
> recipient text,
> bulk_id uuid,
> datetime_final timestamp,
> datetime_sent timestamp,
> request_id uuid,
> status int,
> PRIMARY KEY (recipient, bulk_id)
> ) WITH CLUSTERING ORDER BY (bulk_id ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> CREATE CUSTOM INDEX bulk_recipients_bulk_id ON cservice.bulks_recipients 
> (bulk_id) USING 'org.apache.cassandra.index.sasi.SASIIndex';
> {code}
> There are 11 rows in it:
> {code}
> > select * from bulks_recipients;
> ...
> (11 rows)
> {code}
> Let's query by index (all rows have the same *bulk_id*):
> {code}
> > select * from bulks_recipients where bulk_id = 
> > baa94815-e276-4ca4-adda-5b9734e6c4a5;   
> >   
> ...
> (11 rows)
> {code}
> Ok, everything is fine.
> Now I'm doing *nodetool repair --partitioner-range --job-threads 4 --full* on 
> each node in the cluster sequentially.
> After it finished:
> {code}
> > select * from bulks_recipients where bulk_id = 
> > baa94815-e276-4ca4-adda-5b9734e6c4a5;
> ...
> (2 rows)
> {code}
> Only two rows.
> While the rows are actually there:
> {code}
> > select * from bulks_recipients;
> ...
> (11 rows)
> {code}
> If I issue an incremental repair on a random node, I can get about 7 rows 
> from the index query.
> Dropping the index and recreating it fixes the issue. Is it a bug or am I doing 
> the repair the wrong way?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13403) nodetool repair breaks SASI index

2020-04-06 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-13403:

Authors:   (was: Alex Petrov)

> nodetool repair breaks SASI index
> -
>
> Key: CASSANDRA-13403
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13403
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
> Environment: 3.10
>Reporter: Igor Novgorodov
>Assignee: Alex Petrov
>Priority: Normal
>  Labels: patch
> Attachments: 3_nodes_compaction.log, 4_nodes_compaction.log, 
> CASSANDRA-13403.patch, testSASIRepair.patch
>
>
> I've got table:
> {code}
> CREATE TABLE cservice.bulks_recipients (
> recipient text,
> bulk_id uuid,
> datetime_final timestamp,
> datetime_sent timestamp,
> request_id uuid,
> status int,
> PRIMARY KEY (recipient, bulk_id)
> ) WITH CLUSTERING ORDER BY (bulk_id ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> CREATE CUSTOM INDEX bulk_recipients_bulk_id ON cservice.bulks_recipients 
> (bulk_id) USING 'org.apache.cassandra.index.sasi.SASIIndex';
> {code}
> There are 11 rows in it:
> {code}
> > select * from bulks_recipients;
> ...
> (11 rows)
> {code}
> Let's query by index (all rows have the same *bulk_id*):
> {code}
> > select * from bulks_recipients where bulk_id = 
> > baa94815-e276-4ca4-adda-5b9734e6c4a5;   
> >   
> ...
> (11 rows)
> {code}
> Ok, everything is fine.
> Now I'm doing *nodetool repair --partitioner-range --job-threads 4 --full* on 
> each node in the cluster sequentially.
> After it finished:
> {code}
> > select * from bulks_recipients where bulk_id = 
> > baa94815-e276-4ca4-adda-5b9734e6c4a5;
> ...
> (2 rows)
> {code}
> Only two rows.
> While the rows are actually there:
> {code}
> > select * from bulks_recipients;
> ...
> (11 rows)
> {code}
> If I issue an incremental repair on a random node, I can get about 7 rows 
> from the index query.
> Dropping the index and recreating it fixes the issue. Is it a bug or am I doing 
> the repair the wrong way?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-13403) nodetool repair breaks SASI index

2020-04-06 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov reassigned CASSANDRA-13403:
---

Assignee: (was: Alex Petrov)

> nodetool repair breaks SASI index
> -
>
> Key: CASSANDRA-13403
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13403
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SASI
> Environment: 3.10
>Reporter: Igor Novgorodov
>Priority: Normal
>  Labels: patch
> Attachments: 3_nodes_compaction.log, 4_nodes_compaction.log, 
> CASSANDRA-13403.patch, testSASIRepair.patch
>
>
> I've got table:
> {code}
> CREATE TABLE cservice.bulks_recipients (
> recipient text,
> bulk_id uuid,
> datetime_final timestamp,
> datetime_sent timestamp,
> request_id uuid,
> status int,
> PRIMARY KEY (recipient, bulk_id)
> ) WITH CLUSTERING ORDER BY (bulk_id ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
> AND comment = ''
> AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
> AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';
> CREATE CUSTOM INDEX bulk_recipients_bulk_id ON cservice.bulks_recipients 
> (bulk_id) USING 'org.apache.cassandra.index.sasi.SASIIndex';
> {code}
> There are 11 rows in it:
> {code}
> > select * from bulks_recipients;
> ...
> (11 rows)
> {code}
> Let's query by index (all rows have the same *bulk_id*):
> {code}
> > select * from bulks_recipients where bulk_id = 
> > baa94815-e276-4ca4-adda-5b9734e6c4a5;   
> >   
> ...
> (11 rows)
> {code}
> Ok, everything is fine.
> Now I'm doing *nodetool repair --partitioner-range --job-threads 4 --full* on 
> each node in the cluster sequentially.
> After it finished:
> {code}
> > select * from bulks_recipients where bulk_id = 
> > baa94815-e276-4ca4-adda-5b9734e6c4a5;
> ...
> (2 rows)
> {code}
> Only two rows.
> While the rows are actually there:
> {code}
> > select * from bulks_recipients;
> ...
> (11 rows)
> {code}
> If I issue an incremental repair on a random node, I can get about 7 rows 
> from the index query.
> Dropping the index and recreating it fixes the issue. Is it a bug or am I doing 
> the repair the wrong way?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13917) COMPACT STORAGE queries on dense static tables accept hidden column1 and value columns

2020-04-06 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-13917:

Status: Patch Available  (was: In Progress)

> COMPACT STORAGE queries on dense static tables accept hidden column1 and 
> value columns
> --
>
> Key: CASSANDRA-13917
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13917
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Alex Petrov
>Assignee: Aleksandr Sorokoumov
>Priority: Low
>  Labels: lhf
> Fix For: 3.0.x, 3.11.x
>
> Attachments: 13917-3.0-testall-13.12.2019, 
> 13917-3.0-testall-16.01.2020, 13917-3.0-testall-2.png, 
> 13917-3.0-testall-20.11.2019.png, 13917-3.0-upgrade-16.01.2020, 
> 13917-3.0.png, 13917-3.11-testall-13.12.2019, 
> 13917-3.11-testall-16.01.2020.png, 13917-3.11-testall-2.png, 
> 13917-3.11-testall-20.11.2019.png, 13917-3.11-upgrade-16.01.2020.png, 
> 13917-3.11.png
>
>
> Test for the issue:
> {code}
> @Test
> public void testCompactStorage() throws Throwable
> {
> createTable("CREATE TABLE %s (a int PRIMARY KEY, b int, c int) WITH 
> COMPACT STORAGE");
> assertInvalid("INSERT INTO %s (a, b, c, column1) VALUES (?, ?, ?, 
> ?)", 1, 1, 1, ByteBufferUtil.bytes('a'));
> // This one fails with Some clustering keys are missing: column1, 
> which is still wrong
> assertInvalid("INSERT INTO %s (a, b, c, value) VALUES (?, ?, ?, ?)", 
> 1, 1, 1, ByteBufferUtil.bytes('a'));   
> assertInvalid("INSERT INTO %s (a, b, c, column1, value) VALUES (?, ?, 
> ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'), ByteBufferUtil.bytes('b'));
> assertEmpty(execute("SELECT * FROM %s"));
> }
> {code}
> Thankfully, these writes are no-ops, even though they succeed.
> {{value}} and {{column1}} should be completely hidden. Fixing this one should 
> be as easy as just adding validations.
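
For illustration, a hedged sketch of what "just adding validations" could look like (the names below are hypothetical, not Cassandra's actual internals): reject statements that reference the internally generated compact-storage columns instead of silently accepting a no-op write.

{code:java}
import java.util.Set;

// Sketch only (hypothetical names, not Cassandra's real internals): a guard
// run during statement preparation that rejects references to columns which
// exist only for the COMPACT STORAGE layout and should stay hidden from CQL.
final class HiddenColumnValidation
{
    private static final Set<String> HIDDEN_COMPACT_COLUMNS = Set.of("column1", "value");

    static void validateInsertedColumns(Iterable<String> columnNames)
    {
        for (String name : columnNames)
        {
            if (HIDDEN_COMPACT_COLUMNS.contains(name))
                throw new IllegalArgumentException("Undefined column name " + name);
        }
    }
}
{code}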



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13917) COMPACT STORAGE queries on dense static tables accept hidden column1 and value columns

2020-04-06 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-13917:

Reviewers: Alex Petrov  (was: Alex Petrov)
   Status: Review In Progress  (was: Patch Available)

> COMPACT STORAGE queries on dense static tables accept hidden column1 and 
> value columns
> --
>
> Key: CASSANDRA-13917
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13917
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Alex Petrov
>Assignee: Aleksandr Sorokoumov
>Priority: Low
>  Labels: lhf
> Fix For: 3.0.x, 3.11.x
>
> Attachments: 13917-3.0-testall-13.12.2019, 
> 13917-3.0-testall-16.01.2020, 13917-3.0-testall-2.png, 
> 13917-3.0-testall-20.11.2019.png, 13917-3.0-upgrade-16.01.2020, 
> 13917-3.0.png, 13917-3.11-testall-13.12.2019, 
> 13917-3.11-testall-16.01.2020.png, 13917-3.11-testall-2.png, 
> 13917-3.11-testall-20.11.2019.png, 13917-3.11-upgrade-16.01.2020.png, 
> 13917-3.11.png
>
>
> Test for the issue:
> {code}
> @Test
> public void testCompactStorage() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int PRIMARY KEY, b int, c int) WITH COMPACT STORAGE");
>     assertInvalid("INSERT INTO %s (a, b, c, column1) VALUES (?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'));
>     // This one fails with "Some clustering keys are missing: column1", which is still wrong
>     assertInvalid("INSERT INTO %s (a, b, c, value) VALUES (?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'));
>     assertInvalid("INSERT INTO %s (a, b, c, column1, value) VALUES (?, ?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'), ByteBufferUtil.bytes('b'));
>     assertEmpty(execute("SELECT * FROM %s"));
> }
> {code}
> Fortunately, these writes are no-ops, even though they appear to succeed.
> {{value}} and {{column1}} should be completely hidden. Fixing this should be 
> as easy as adding validations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13917) COMPACT STORAGE queries on dense static tables accept hidden column1 and value columns

2020-04-06 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-13917:

Status: Ready to Commit  (was: Review In Progress)

> COMPACT STORAGE queries on dense static tables accept hidden column1 and 
> value columns
> --
>
> Key: CASSANDRA-13917
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13917
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Alex Petrov
>Assignee: Aleksandr Sorokoumov
>Priority: Low
>  Labels: lhf
> Fix For: 3.0.x, 3.11.x
>
> Attachments: 13917-3.0-testall-13.12.2019, 
> 13917-3.0-testall-16.01.2020, 13917-3.0-testall-2.png, 
> 13917-3.0-testall-20.11.2019.png, 13917-3.0-upgrade-16.01.2020, 
> 13917-3.0.png, 13917-3.11-testall-13.12.2019, 
> 13917-3.11-testall-16.01.2020.png, 13917-3.11-testall-2.png, 
> 13917-3.11-testall-20.11.2019.png, 13917-3.11-upgrade-16.01.2020.png, 
> 13917-3.11.png
>
>
> Test for the issue:
> {code}
> @Test
> public void testCompactStorage() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int PRIMARY KEY, b int, c int) WITH COMPACT STORAGE");
>     assertInvalid("INSERT INTO %s (a, b, c, column1) VALUES (?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'));
>     // This one fails with "Some clustering keys are missing: column1", which is still wrong
>     assertInvalid("INSERT INTO %s (a, b, c, value) VALUES (?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'));
>     assertInvalid("INSERT INTO %s (a, b, c, column1, value) VALUES (?, ?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'), ByteBufferUtil.bytes('b'));
>     assertEmpty(execute("SELECT * FROM %s"));
> }
> {code}
> Fortunately, these writes are no-ops, even though they appear to succeed.
> {{value}} and {{column1}} should be completely hidden. Fixing this should be 
> as easy as adding validations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13917) COMPACT STORAGE queries on dense static tables accept hidden column1 and value columns

2020-04-06 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-13917:

Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

> COMPACT STORAGE queries on dense static tables accept hidden column1 and 
> value columns
> --
>
> Key: CASSANDRA-13917
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13917
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Alex Petrov
>Assignee: Aleksandr Sorokoumov
>Priority: Low
>  Labels: lhf
> Fix For: 3.0.x, 3.11.x
>
> Attachments: 13917-3.0-testall-13.12.2019, 
> 13917-3.0-testall-16.01.2020, 13917-3.0-testall-2.png, 
> 13917-3.0-testall-20.11.2019.png, 13917-3.0-upgrade-16.01.2020, 
> 13917-3.0.png, 13917-3.11-testall-13.12.2019, 
> 13917-3.11-testall-16.01.2020.png, 13917-3.11-testall-2.png, 
> 13917-3.11-testall-20.11.2019.png, 13917-3.11-upgrade-16.01.2020.png, 
> 13917-3.11.png
>
>
> Test for the issue:
> {code}
> @Test
> public void testCompactStorage() throws Throwable
> {
>     createTable("CREATE TABLE %s (a int PRIMARY KEY, b int, c int) WITH COMPACT STORAGE");
>     assertInvalid("INSERT INTO %s (a, b, c, column1) VALUES (?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'));
>     // This one fails with "Some clustering keys are missing: column1", which is still wrong
>     assertInvalid("INSERT INTO %s (a, b, c, value) VALUES (?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'));
>     assertInvalid("INSERT INTO %s (a, b, c, column1, value) VALUES (?, ?, ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'), ByteBufferUtil.bytes('b'));
>     assertEmpty(execute("SELECT * FROM %s"));
> }
> {code}
> Fortunately, these writes are no-ops, even though they appear to succeed.
> {{value}} and {{column1}} should be completely hidden. Fixing this should be 
> as easy as adding validations.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-13994) Remove COMPACT STORAGE internals before 4.0 release

2020-04-06 Thread Alex Petrov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Petrov updated CASSANDRA-13994:

Reviewers: Dinesh Joshi  (was: Alex Petrov, Dinesh Joshi)

> Remove COMPACT STORAGE internals before 4.0 release
> ---
>
> Key: CASSANDRA-13994
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13994
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Local Write-Read Paths
>Reporter: Alex Petrov
>Assignee: Ekaterina Dimitrova
>Priority: Low
> Fix For: 4.0, 4.0-rc
>
>
> 4.0 comes without Thrift (after [CASSANDRA-5]) and COMPACT STORAGE (after 
> [CASSANDRA-10857]), and since the Compact Storage flags are now disabled, all 
> of the related functionality is dead code.
> There are still some things to consider:
> 1. One of the system tables (built indexes) was compact. For now, we just 
> added a {{value}} column to it to keep it backwards-compatible, but we might 
> want to make it a "normal" table without redundant columns.
> 2. Compact tables built indexes in {{KEYS}} mode. Removing this is trivial, 
> but it would make all built indexes defunct. We could log a warning for now, 
> ask users to migrate off those indexes, and remove the code completely in a 
> future release. It's just a couple of classes, though.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14773) Overflow of 32-bit integer during compaction.

2020-04-06 Thread Francisco Fernandez (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076163#comment-17076163
 ] 

Francisco Fernandez commented on CASSANDRA-14773:
-

I've updated the patch including more test coverage.

> Overflow of 32-bit integer during compaction.
> -
>
> Key: CASSANDRA-14773
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14773
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Vladimir Bukhtoyarov
>Assignee: Francisco Fernandez
>Priority: Urgent
>  Labels: pull-request-available
> Fix For: 4.0, 4.0-beta
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the scope of CASSANDRA-13444, compaction was significantly improved from a 
> CPU and memory perspective. However, this improvement introduced a rounding 
> bug. When rounding an expiration time close to *Cell.MAX_DELETION_TIME* 
> (which is just *Integer.MAX_VALUE*), a math overflow happens (because in 
> CASSANDRA-13444 the data type for a point was changed from Long to Integer to 
> reduce the memory footprint). As a result, the point becomes negative and 
> acts as a silent poison for the internal structures of 
> StreamingTombstoneHistogramBuilder such as *DistanceHolder* and *DataHolder*. 
> Then, depending on the point intervals:
>  * The TombstoneHistogram produces wrong values when the interval between 
> points is less than binSize; this is not critical.
>  * Compaction crashes with an ArrayIndexOutOfBoundsException if the number of 
> point intervals is greater than binSize; this case is very critical.
>  
> This pull request [https://github.com/apache/cassandra/pull/273] reproduces 
> the issue and provides the fix (a minimal sketch of the wrap-around follows 
> the stack trace below).
>  
> The stack trace when running *testMathOverflowDuringRoundingOfLargeTimestamp* 
> (on the codebase without the fix) without the -ea JVM flag:
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException
> at java.lang.System.arraycopy(Native Method)
> at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$DistanceHolder.add(StreamingTombstoneHistogramBuilder.java:208)
> at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.flushValue(StreamingTombstoneHistogramBuilder.java:140)
> at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$$Lambda$1/1967205423.consume(Unknown Source)
> at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$Spool.forEach(StreamingTombstoneHistogramBuilder.java:574)
> at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.flushHistogram(StreamingTombstoneHistogramBuilder.java:124)
> at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.build(StreamingTombstoneHistogramBuilder.java:184)
> at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilderTest.testMathOverflowDuringRoundingOfLargeTimestamp(StreamingTombstoneHistogramBuilderTest.java:183)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:44)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:180)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:41)
> at org.junit.runners.ParentRunner$1.evaluate(ParentRunner.java:173)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:159)
> at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
> at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
> at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
> at 
> {noformat}
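
The overflow the description refers to can be shown in isolation. Below is a 
minimal, self-contained sketch (not Cassandra's actual rounding code) of how 
rounding a 32-bit point near Integer.MAX_VALUE wraps around to a negative 
value:

{code:java}
public class OverflowDemo
{
    // Round "point" up to the next multiple of roundSeconds. The addition
    // can exceed Integer.MAX_VALUE and silently wrap to a negative int.
    static int roundUp(int point, int roundSeconds)
    {
        int remainder = point % roundSeconds;
        return remainder == 0 ? point : point - remainder + roundSeconds;
    }

    public static void main(String[] args)
    {
        int nearMax = Integer.MAX_VALUE - 1;  // close to Cell.MAX_DELETION_TIME
        int rounded = roundUp(nearMax, 60);
        System.out.println(rounded);          // -2147483596: the poisoned, negative point
    }
}
{code}

Capping the rounded value at the maximum, or widening the intermediate math to 
long, avoids the wrap-around.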

[jira] [Assigned] (CASSANDRA-15406) Add command to show the progress of data streaming and index build

2020-04-06 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic reassigned CASSANDRA-15406:
-

Assignee: (was: Stefan Miklosovic)

> Add command to show the progress of data streaming and index build 
> ---
>
> Key: CASSANDRA-15406
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15406
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Tool/nodetool
>Reporter: maxwellguo
>Priority: Normal
> Fix For: 4.0, 4.x
>
>
> I think we should supply a command to show the progress of streaming when we 
> do bootstrap/move/decommission/removenode operations. While data streaming is 
> in progress, nobody knows which step the process is at, so a command showing 
> the joining/leaving node's progress is needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15694) Statistics upon streaming of entire SSTables in Netstats is wrong

2020-04-06 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-15694:
--
Description: 
There is a bug in the current code: when we stream entire SSTables via 
CassandraEntireSSTableStreamWriter and CassandraOutgoingFile, the progress is 
not updated for the individual components of an SSTable, as only the "db" file 
is counted. That introduces this bug:

 
{code:java}
Mode: NORMAL
Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736
/127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 files, 27664559 bytes total

/tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3...

{code}
Basically, the number of files to be sent is lower than the number of files 
already sent.

 

The straightforward fix is to detect when we are streaming entire SSTables 
and, in that case, include all component files in the computation.

 

This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 
because the decision whether we stream entire SSTables comes from a 
performance-sensitive method and is computed every time. Once CASSANDRA-15657 
(and hence CASSANDRA-14586) is done, this ticket can be worked on.

 

The branch with the fix is here: 
[https://github.com/smiklosovic/cassandra/tree/CASSANDRA-15694]

  was:
There is a bug in the current code: when we stream entire SSTables via 
CassandraEntireSSTableStreamWriter and CassandraOutgoingFile, the progress is 
not updated for the individual components of an SSTable, as only the "db" file 
is counted. That introduces this bug:

 
{code:java}
Mode: NORMAL
Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736
/127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 files, 27664559 bytes total

/tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3...

{code}
Basically, the number of files to be sent is lower than the number of files 
already sent.

 

The straightforward fix is to detect when we are streaming entire SSTables 
and, in that case, include all component files in the computation.

 

This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 
because the decision whether we stream entire SSTables comes from a 
performance-sensitive method and is computed every time. Once CASSANDRA-15657 
(and hence CASSANDRA-14586) is done, this ticket can be worked on.


> Statistics upon streaming of entire SSTables in Netstats is wrong
> -
>
> Key: CASSANDRA-15694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> There is a bug in the current code: when we stream entire SSTables via 
> CassandraEntireSSTableStreamWriter and CassandraOutgoingFile, the progress is 
> not updated for the individual components of an SSTable, as only the "db" 
> file is counted. That introduces this bug:
>  
> {code:java}
> Mode: NORMAL
> Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736
> /127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 files, 27664559 bytes total
> 
> /tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3...
> 
> {code}
> Basically, the number of files to be sent is lower than the number of files 
> already sent.
>  
> The straightforward fix is to detect when we are streaming entire SSTables 
> and, in that case, include all component files in the computation.
>  
> This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 
> because the decision whether we stream entire SSTables comes from a 
> performance-sensitive method and is computed every time. Once CASSANDRA-15657 
> (and hence CASSANDRA-14586) is done, this ticket can be worked on.
>  
> The branch with the fix is here: 
> [https://github.com/smiklosovic/cassandra/tree/CASSANDRA-15694]
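
A minimal, self-contained sketch (with invented names, not the actual fix in 
the branch above) of the counting mismatch: if the total only counts the 
Data.db file while progress increments once per component streamed, "files 
sent" overtakes "files to send". Counting every component of an entire-SSTable 
stream keeps the two figures consistent.

{code:java}
import java.util.List;

public class NetstatsCountSketch
{
    // Hypothetical model of an outgoing stream: entire-SSTable streams ship
    // every component file, partial streams ship only the Data.db file.
    record Outgoing(boolean entireSSTable, int componentCount) {}

    static int totalFiles(List<Outgoing> streams)
    {
        int total = 0;
        for (Outgoing s : streams)
            total += s.entireSSTable() ? s.componentCount() : 1; // the buggy code always adds 1
        return total;
    }

    public static void main(String[] args)
    {
        // Two SSTables streamed entirely, with 7 component files each.
        List<Outgoing> streams = List.of(new Outgoing(true, 7), new Outgoing(true, 7));
        System.out.println(totalFiles(streams)); // 14 files to send, matching 14 files sent
    }
}
{code}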



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15406) Add command to show the progress of data streaming and index build

2020-04-06 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076115#comment-17076115
 ] 

Stefan Miklosovic commented on CASSANDRA-15406:
---

Work on this issue should be blocked until the underlying bug in streaming of 
entire SSTables is resolved, as the figures used to compute the progress would 
not make sense otherwise.

> Add command to show the progress of data streaming and index build 
> ---
>
> Key: CASSANDRA-15406
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15406
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Tool/nodetool
>Reporter: maxwellguo
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 4.0, 4.x
>
>
> I think we should supply a command to show the progress of streaming when we 
> do bootstrap/move/decommission/removenode operations. While data streaming is 
> in progress, nobody knows which step the process is at, so a command showing 
> the joining/leaving node's progress is needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-15694) Statistics upon streaming of entire SSTables in Netstats is wrong

2020-04-06 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic reassigned CASSANDRA-15694:
-

Assignee: (was: Stefan Miklosovic)

> Statistics upon streaming of entire SSTables in Netstats is wrong
> -
>
> Key: CASSANDRA-15694
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15694
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/nodetool
>Reporter: Stefan Miklosovic
>Priority: Normal
>
> There is a bug in the current code: when we stream entire SSTables via 
> CassandraEntireSSTableStreamWriter and CassandraOutgoingFile, the progress is 
> not updated for the individual components of an SSTable, as only the "db" 
> file is counted. That introduces this bug:
>  
> {code:java}
> Mode: NORMAL
> Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736
> /127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 files, 27664559 bytes total
> 
> /tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3...
> 
> {code}
> Basically, the number of files to be sent is lower than the number of files 
> already sent.
>  
> The straightforward fix is to detect when we are streaming entire SSTables 
> and, in that case, include all component files in the computation.
>  
> This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 
> because the decision whether we stream entire SSTables comes from a 
> performance-sensitive method and is computed every time. Once CASSANDRA-15657 
> (and hence CASSANDRA-14586) is done, this ticket can be worked on.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-15694) Statistics upon streaming of entire SSTables in Netstats is wrong

2020-04-06 Thread Stefan Miklosovic (Jira)
Stefan Miklosovic created CASSANDRA-15694:
-

 Summary: Statistics upon streaming of entire SSTables in Netstats 
is wrong
 Key: CASSANDRA-15694
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15694
 Project: Cassandra
  Issue Type: Bug
  Components: Tool/nodetool
Reporter: Stefan Miklosovic
Assignee: Stefan Miklosovic


There is a bug in the current code: when we stream entire SSTables via 
CassandraEntireSSTableStreamWriter and CassandraOutgoingFile, the progress is 
not updated for the individual components of an SSTable, as only the "db" file 
is counted. That introduces this bug:

 
{code:java}
Mode: NORMAL
Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736
/127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 files, 27664559 bytes total

/tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3...

{code}
Basically, the number of files to be sent is lower than the number of files 
already sent.

 

The straightforward fix is to detect when we are streaming entire SSTables 
and, in that case, include all component files in the computation.

 

This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 
because the decision whether we stream entire SSTables comes from a 
performance-sensitive method and is computed every time. Once CASSANDRA-15657 
(and hence CASSANDRA-14586) is done, this ticket can be worked on.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org