[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
[ https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Haddad updated CASSANDRA-15696:
-----------------------------------
          Fix Version/s: 4.0-alpha
          Since Version: 4.0-alpha
    Source Control Link: https://github.com/apache/cassandra/commit/0e0d288ab7e87e7d4a7542c955dd06701798bd06
             Resolution: Fixed
                 Status: Resolved  (was: Ready to Commit)

> Only track ideal CL failure when request CL is met
> --------------------------------------------------
>
>                 Key: CASSANDRA-15696
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15696
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Observability/Metrics
>            Reporter: Jon Haddad
>            Assignee: Jon Haddad
>            Priority: Normal
>              Labels: pull-request-available
>             Fix For: 4.0-alpha
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> When ideal_consistency_level is set (CASSANDRA-13289), we currently increment
> a counter if a request doesn't meet the consistency level specified in the
> configuration (or through JMX).
> At the moment, we increment the counter whether or not the query was successful.
> It would be slightly better to increment the counter only when the ideal CL
> wasn't achieved but the query's CL was met.
> The original JIRA stated the following as an objective:
> {quote}If your application writes at LOCAL_QUORUM, how often are those writes
> failing to achieve EACH_QUORUM at other data centers? If you failed your
> application over to one of those data centers, roughly how inconsistent might
> it be given the number of writes that didn't propagate since the last
> incremental repair?
> {quote}
> The main benefit of that JIRA was to let you set a CL higher than the CL being
> used, and to track how often we weren't able to hit that CL despite hitting the
> underlying CL. We should only increment the counter in the case where we were
> able to meet the query-provided consistency but were unable to meet the ideal
> consistency level.
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
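The counting rule proposed above reduces to a small decision: bump the failure counter only when the requested CL was met and the ideal CL was not. A minimal sketch (hypothetical names, not the actual Cassandra API):

```java
// Hypothetical sketch of the proposed counting rule: writeFailedIdealCL
// moves only when the requested CL succeeded but the ideal CL did not.
public class IdealClTracker
{
    private long writeFailedIdealCL = 0;

    /** Apply the proposed rule for one completed write. */
    public void onWriteFinished(boolean requestedClMet, boolean idealClMet)
    {
        if (requestedClMet && !idealClMet)
            writeFailedIdealCL++;
    }

    public long failures()
    {
        return writeFailedIdealCL;
    }

    public static void main(String[] args)
    {
        IdealClTracker t = new IdealClTracker();
        t.onWriteFinished(true, true);   // both CLs met: no increment
        t.onWriteFinished(false, false); // query itself failed: no increment (the fix)
        t.onWriteFinished(true, false);  // requested met, ideal missed: increment
        System.out.println(t.failures()); // prints 1
    }
}
```

Before the patch, the second case (query failed outright) would also have been counted, inflating the metric with failures that say nothing about cross-DC propagation.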
[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
[ https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Haddad updated CASSANDRA-15696:
-----------------------------------
    Status: Ready to Commit  (was: Changes Suggested)
[cassandra] branch trunk updated: Only track ideal CL failure when request CL met
This is an automated email from the ASF dual-hosted git repository.

rustyrazorblade pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

The following commit(s) were added to refs/heads/trunk by this push:
     new 0e0d288  Only track ideal CL failure when request CL met
0e0d288 is described below

commit 0e0d288ab7e87e7d4a7542c955dd06701798bd06
Author: Jon Haddad <j...@jonhaddad.com>
AuthorDate: Mon Apr 6 12:53:27 2020 -0700

    Only track ideal CL failure when request CL met

    Ideal consistency level tracking should not report a failure when
    requested CL was also not met.

    Patch by Jon Haddad; Reviewed by Dinesh Joshi for CASSANDRA-15696.
---
 CHANGES.txt                                        |  1 +
 .../service/AbstractWriteResponseHandler.java      | 17 +++++++++++---
 .../service/WriteResponseHandlerTest.java          | 26 ++++++++++++++++++
 3 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index fb881de..95a6802 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0-alpha4
+ * Only track ideal CL failure when request CL met (CASSANDRA-15696)
  * Fix flaky CoordinatorMessagingTest and docstring in OutboundSink and ConsistentSession (CASSANDRA-15672)
  * Fix force compaction of wrapping ranges (CASSANDRA-15664)
  * Expose repair streaming metrics (CASSANDRA-15656)

diff --git a/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java b/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java
index 1889c79..b1eb5b3 100644
--- a/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java
+++ b/src/java/org/apache/cassandra/service/AbstractWriteResponseHandler.java
@@ -74,6 +74,11 @@ public abstract class AbstractWriteResponseHandler implements RequestCallback
     private AbstractWriteResponseHandler idealCLDelegate;
 
     /**
+     * We don't want to increment the writeFailedIdealCL if we didn't achieve the original requested CL
+     */
+    private boolean requestedCLAchieved = false;
+
+    /**
      * @param callback A callback to be called when the write is successful.
      * @param queryStartNanoTime
      */
@@ -232,6 +237,13 @@ public abstract class AbstractWriteResponseHandler implements RequestCallback
     protected void signal()
     {
+        // The ideal CL should only count as a strike if the requested CL was achieved.
+        // If the requested CL is not achieved it's fine for the ideal CL to also not be achieved.
+        if (idealCLDelegate != null)
+        {
+            idealCLDelegate.requestedCLAchieved = true;
+        }
+
         condition.signalAll();
         if (callback != null)
             callback.run();
@@ -279,8 +291,9 @@ public abstract class AbstractWriteResponseHandler implements RequestCallback
         int decrementedValue = responsesAndExpirations.decrementAndGet();
         if (decrementedValue == 0)
         {
-            // The condition being signaled is a valid proxy for the CL being achieved
-            if (!condition.isSignaled())
+            // The condition being signaled is a valid proxy for the CL being achieved
+            // Only mark it as failed if the requested CL was achieved.
+            if (!condition.isSignaled() && requestedCLAchieved)
             {
                 replicaPlan.keyspace().metric.writeFailedIdealCL.inc();
             }

diff --git a/test/unit/org/apache/cassandra/service/WriteResponseHandlerTest.java b/test/unit/org/apache/cassandra/service/WriteResponseHandlerTest.java
index f06b706..5d8d191 100644
--- a/test/unit/org/apache/cassandra/service/WriteResponseHandlerTest.java
+++ b/test/unit/org/apache/cassandra/service/WriteResponseHandlerTest.java
@@ -232,6 +232,32 @@ public class WriteResponseHandlerTest
         assertEquals(0, ks.metric.idealCLWriteLatency.totalLatency.getCount());
     }
 
+    /**
+     * Validate that failing to achieve ideal CL doesn't increase the failure counter when not meeting CL
+     * @throws Throwable
+     */
+    @Test
+    public void failedIdealCLDoesNotIncrementsStatOnQueryFailure() throws Throwable
+    {
+        AbstractWriteResponseHandler awr = createWriteResponseHandler(ConsistencyLevel.LOCAL_QUORUM, ConsistencyLevel.EACH_QUORUM);
+
+        long startingCount = ks.metric.writeFailedIdealCL.getCount();
+
+        // Failure in local DC
+        awr.onResponse(createDummyMessage(0));
+
+        awr.expired();
+        awr.expired();
+
+        // Fail in remote DC
+        awr.expired();
+        awr.expired();
+        awr.expired();
+
+        assertEquals(startingCount, ks.metric.writeFailedIdealCL.getCount());
+    }
+
     private static AbstractWriteResponseHandler createWriteResponseHandler(ConsistencyLevel cl, ConsistencyLevel ideal)
     {
         return createWriteResponseHandler(cl, ideal, System.nanoTime());
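The patch wires the new flag through two handlers: the request-CL handler's signal() marks its ideal-CL delegate, and the delegate increments the metric on its final response/expiry only when the ideal CL was missed *and* that mark was set. A minimal standalone reduction of that interplay (hypothetical class, not the Cassandra implementation):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical reduction of the two-handler pattern in the patch: the
// request-CL handler marks its ideal-CL delegate on success, and the
// delegate counts a failure only when that mark is present.
class Handler
{
    static final AtomicInteger writeFailedIdealCL = new AtomicInteger();

    Handler idealCLDelegate;     // set on the request-CL handler
    boolean requestedCLAchieved; // set on the ideal-CL delegate by signal()
    boolean signaled;            // stands in for condition.isSignaled()
    int outstanding;             // stands in for responsesAndExpirations

    Handler(int outstanding)
    {
        this.outstanding = outstanding;
    }

    // Called when this handler's CL is achieved (cf. signal()).
    void signal()
    {
        if (idealCLDelegate != null)
            idealCLDelegate.requestedCLAchieved = true;
        signaled = true;
    }

    // Called per ideal-CL response or expiry (cf. decrementResponseOrExpired()).
    void decrementResponseOrExpired()
    {
        if (--outstanding == 0 && !signaled && requestedCLAchieved)
            writeFailedIdealCL.incrementAndGet();
    }

    public static void main(String[] args)
    {
        Handler ideal = new Handler(2);
        Handler request = new Handler(1);
        request.idealCLDelegate = ideal;

        request.signal();                   // requested CL achieved: marks the delegate
        ideal.decrementResponseOrExpired(); // ideal-CL responses expire...
        ideal.decrementResponseOrExpired(); // ...last one: ideal missed, mark set
        System.out.println(writeFailedIdealCL.get()); // prints 1
    }
}
```

If request.signal() is never called (the query's own CL failed), the mark stays false and the delegate never increments the counter, which is exactly the behavior the new unit test asserts.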
[jira] [Commented] (CASSANDRA-15660) Unable to specify -e/--execute flag in cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076803#comment-17076803 ]

ZhaoYang commented on CASSANDRA-15660:
--------------------------------------

[~djoshi] there is a regression dtest..

> Unable to specify -e/--execute flag in cqlsh
> --------------------------------------------
>
>                 Key: CASSANDRA-15660
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15660
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tool/cqlsh
>            Reporter: Stefan Miklosovic
>            Assignee: ZhaoYang
>            Priority: Normal
>              Labels: pull-request-available
>             Fix For: 4.0-alpha
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> From the mailing list:
> [https://lists.apache.org/thread.html/r377099b632c62b641e4feef5b738084fc5369b0c7157fae867853597%40%3Cdev.cassandra.apache.org%3E]
> The bug looks like this:
> {code:java}
> $ /usr/bin/cqlsh -e 'describe keyspaces' -u cassandra -p cassandra 127.0.0.1
> Usage: cqlsh.py [options] [host [port]]
> cqlsh.py: error: '127.0.0.1' is not a valid port number.
> {code}
> This works just fine in 3.x releases but fails on 4.0.
> The workaround in the 4.x code as of today is to put the statements into a file
> and use the "-f" flag.
[jira] [Commented] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
[ https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076804#comment-17076804 ]

Jon Haddad commented on CASSANDRA-15696:
----------------------------------------

Ah yes - that was a typo. Fixed.
[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
[ https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-15696:
-------------------------------------
    Status: Changes Suggested  (was: Review In Progress)

Thanks for your patch [~rustyrazorblade]. Looks OK overall. I think there is just one minor issue where you're using a bitwise operator instead of a logical operator. Although the results are the same right now, I'd prefer that you change it to a logical operator on commit.
[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
[ https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-15696:
-------------------------------------
    Reviewers: Dinesh Joshi
       Status: Review In Progress  (was: Patch Available)
[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
[ https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dinesh Joshi updated CASSANDRA-15696:
-------------------------------------
    Reviewers: Dinesh Joshi
[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
[ https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Haddad updated CASSANDRA-15696:
-----------------------------------
    Test and Documentation Plan: Test links in comments  (was: |[Unit Test|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/20/workflows/62377765-3e08-4d6f-b4b5-aec609a197e6/jobs/254]| |[DTest|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/20/workflows/62377765-3e08-4d6f-b4b5-aec609a197e6/jobs/256]|)
                         Status: Patch Available  (was: In Progress)
[jira] [Comment Edited] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
[ https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076778#comment-17076778 ]

Jon Haddad edited comment on CASSANDRA-15696 at 4/7/20, 12:26 AM:
------------------------------------------------------------------

[Unit Tests|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/273]
[JVM DTests|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/272]
[Python DTests|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/279]

was (Author: rustyrazorblade):
Unit Tests: https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/273
JVM DTests: https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/272
Python DTests: https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/279
[jira] [Commented] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
[ https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076778#comment-17076778 ]

Jon Haddad commented on CASSANDRA-15696:
----------------------------------------

Unit Tests: https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/273
JVM DTests: https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/272
Python DTests: https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/23/workflows/61d1e8fd-6e9d-44c2-99d4-1c62e3d7cacc/jobs/279
[jira] [Commented] (CASSANDRA-15660) Unable to specify -e/--execute flag in cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-15660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076754#comment-17076754 ]

Dinesh Joshi commented on CASSANDRA-15660:
------------------------------------------

Thank you for the patch. Could we please add a test to detect a future regression?
[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
[ https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Haddad updated CASSANDRA-15696:
-----------------------------------
    Status: In Progress  (was: Patch Available)
[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
[ https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Haddad updated CASSANDRA-15696:
-----------------------------------
    Test and Documentation Plan: |[Unit Test|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/20/workflows/62377765-3e08-4d6f-b4b5-aec609a197e6/jobs/254]| |[DTest|https://app.circleci.com/pipelines/github/rustyrazorblade/cassandra/20/workflows/62377765-3e08-4d6f-b4b5-aec609a197e6/jobs/256]|
                         Status: Patch Available  (was: Open)

There's a single failing DTest unrelated to this patch.
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076727#comment-17076727 ]

David Capwell commented on CASSANDRA-15338:
-------------------------------------------

I have a few things on my plate, I should be able to look end of the week?

> Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
> ------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15338
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15338
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/unit
>            Reporter: David Capwell
>            Assignee: Yifan Cai
>            Priority: Normal
>              Labels: pull-request-available
>             Fix For: 4.0-alpha
>
>        Attachments: CASS-15338-Docker.zip
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Example failure:
> [https://circleci.com/gh/dcapwell/cassandra/11#artifacts/containers/1]
>
> {code:java}
> Testcase: testMessagePurging(org.apache.cassandra.net.ConnectionTest): FAILED
> expected:<0> but was:<1>
> junit.framework.AssertionFailedError: expected:<0> but was:<1>
>     at org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$38(ConnectionTest.java:625)
>     at org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:258)
>     at org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:231)
>     at org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:584)
> {code}
>
> Looking closer at org.apache.cassandra.net.OutboundConnection.Delivery#stopAndRun, it seems that
> the run method is called before org.apache.cassandra.net.OutboundConnection.Delivery#doRun, which may lead to
> a test race condition where the CountDownLatch completes before executing
[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
[ https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated CASSANDRA-15696: --- Labels: pull-request-available (was: ) > Only track ideal CL failure when request CL is met > -- > > Key: CASSANDRA-15696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15696 > Project: Cassandra > Issue Type: Bug > Components: Observability/Metrics >Reporter: Jon Haddad >Assignee: Jon Haddad >Priority: Normal > Labels: pull-request-available > > When ideal_consistency_level is set (CASSANDRA-13289), we currently increment > a counter if a request doesn’t meet the consistency level specified in the > configuration (or through JMX). > At the moment, we increment the counter whether or not the query was successful. I > think it would be slightly better if we only incremented the counter if the > ideal CL wasn’t achieved but the query’s CL was met. > The original JIRA stated the following as an objective: > {quote}If your application writes at LOCAL_QUORUM how often are those writes > failing to achieve EACH_QUORUM at other data centers. If you failed your > application over to one of those data centers roughly how inconsistent might > it be given the number of writes that didn't propagate since the last > incremental repair? > {quote} > The main benefit of the JIRA was to set a CL higher than the CL being used, > and to track how often we weren’t able to hit that CL despite hitting the > underlying CL. We should only increment the counter in a case where we were > able to meet the query-provided consistency but were unable to meet the ideal > consistency level.
[jira] [Commented] (CASSANDRA-15642) Inconsistent failure messages on distributed queries
[ https://issues.apache.org/jira/browse/CASSANDRA-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076718#comment-17076718 ] Benedict Elliott Smith commented on CASSANDRA-15642: So what do you propose? Some questions to consider: * How long until you're sure you've received all the responses you might ever receive? * Can you guarantee to respond within the timeout specified by the operation? * If not, can you as a result ever guarantee "complete" information? * If not, is it a coherent concept for a distributed system? * How would you balance the delayed responses with user requirements to take corrective action promptly in response to failures? > Inconsistent failure messages on distributed queries > > > Key: CASSANDRA-15642 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15642 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination >Reporter: Kevin Gallardo >Priority: Normal > > As a follow up to some exploration I have done for CASSANDRA-15543, I > realized the following behavior in both {{ReadCallback}} and > {{AbstractWriteHandler}}: > - await for responses > - when all required number of responses have come back: unblock the wait > - when a single failure happens: unblock the wait > - when unblocked, look to see if the counter of failures is > 1 and if so > return an error message based on the {{failures}} map that's been filled > Error messages that can result from this behavior can be a ReadTimeout, a > ReadFailure, a WriteTimeout or a WriteFailure. > In case of a Write/ReadFailure, the user will get back an error looking like > the following: > "Failure: Received X responses, and Y failures" > (if this behavior I describe is incorrect, please correct me) > This causes a usability problem. Since the handler will fail and throw an > exception as soon as 1 failure happens, the error message that is returned to > the user may not be accurate. 
> (note: I am not entirely sure of the behavior in case of timeouts for now) > For example, say a request is at CL = QUORUM = 3: a failed response may complete > first, then a successful one completes, and another fails. If the exception > is thrown fast enough, the error message could say > "Failure: Received 0 response, and 1 failure at CL = 3" > Which: > 1. doesn't make a lot of sense because the CL doesn't match the number of > results in the message, so you end up thinking "what happened with the rest > of the required CL?" > 2. the information is incorrect. We did receive a successful response, only > it came after the initial failure. > From that logic, I think it is safe to assume that the information returned > in the error message cannot be trusted in case of a failure. The only information > users should extract out of it is that at least 1 node has failed. > For a big improvement in usability, the {{ReadCallback}} and > {{AbstractWriteResponseHandler}} could instead wait for all responses to come > back before unblocking the wait, or let it time out. This way, users > will be able to have some trust in the information returned to them. > Additionally, an error that happens first prevents a timeout from happening > because it fails immediately, and so potentially it hides problems with other > replicas. If we were to wait for all responses, we might get a timeout; in > that case we'd also be able to tell whether failures happened *before* > that timeout, and have a more complete diagnostic than today, where you can't detect both > errors at the same time.
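The suggestion above can be sketched as follows: collect every replica outcome (or let the wait time out) before composing the message, so the reported counts always add up against the CL. This is illustrative code with hypothetical names, not the actual ReadCallback/AbstractWriteResponseHandler logic:

```python
def compose_result(required_cl, outcomes):
    """Summarize replica outcomes only after all of them are in, so a
    success that arrives after an early failure is still counted."""
    successes = outcomes.count("ok")
    failures = outcomes.count("fail")
    if successes >= required_cl:
        return "success"
    return (f"Failure: received {successes} responses and "
            f"{failures} failures at CL = {required_cl}")
```

In the scenario from the comment (fail, ok, fail at QUORUM = 3), fail-fast behaviour could report "0 responses, 1 failure"; tallying everything first reports 1 response and 2 failures, which at least accounts for every replica.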
[jira] [Comment Edited] (CASSANDRA-15229) BufferPool Regression
[ https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076712#comment-17076712 ] Benedict Elliott Smith edited comment on CASSANDRA-15229 at 4/6/20, 10:07 PM: -- bq. memory wasted due to fragmentation is perhaps not an issue with a cache as little as 512 MB My view is that having a significant proportion of memory wasted to fragmentation is a serious bug, regardless of the total amount of memory that is wasted. bq. The point is that it is not suitable for long lived buffers, similarly to our bump the pointer strategy. It's not poorly suited to long lived buffers, is it? Only to buffers with widely divergent lifetimes. If the lifetimes are loosely correlated then the length of the lifetime is mostly irrelevant, I think. bq. The changes to the buffer pool can be dropped in 4.0 if you think that If you mean introducing a new pool specifically for {{ChunkCache}}, I'm fine with it as an alternative to permitting {{BufferPool}} to mitigate worst case behaviour for the {{ChunkCache}}. But verifying a replacement for {{BufferPool}} is a lot more work, and we use the {{BufferPool}} extensively in networking now, which requires non-uniform buffer sizes. Honestly, given chunks are normally the same size, simply re-using the evicted buffer if possible, and if not allocating new system memory, seems probably sufficient to me. bq. I'll try to share some code so you can have a clearer picture. Thanks, that sounds great. I may not get to it immediately, but look forward to taking a look hopefully soon. was (Author: benedict): bq. memory wasted due to fragmentation is perhaps not an issue with a cache as little as 512 MB My view is that having a significant proportion of memory wasted to fragmentation is a serious bug, regardless of the total amount of memory that is wasted. bq. The point is that it is not suitable for long lived buffers, similarly to our bump the pointer strategy.
It's not poorly suited to long lived buffers, is it? Only to buffers with widely divergent lifetimes. If the lifetimes are loosely correlated then the length of the lifetime is mostly irrelevant, I think. bq. The changes to the buffer pool can be dropped in 4.0 if you think that If you mean introducing a new pool specifically for {{ChunkCache}}, I'm fine with it as an alternative to permitting {{BufferPool}} to mitigate worst case behaviour for the {{ChunkCache}}. But verifying a replacement for {{BufferPool}} is a lot more work, and we use the {{BufferPool}} extensively in networking now, which requires non-uniform buffer sizes. Honestly, given chunks are normally the same size, simply re-using the evicted buffer if possible, and if not allocating new system memory, seems probably sufficient to me. > BufferPool Regression > - > > Key: CASSANDRA-15229 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15229 > Project: Cassandra > Issue Type: Bug > Components: Local/Caching >Reporter: Benedict Elliott Smith >Assignee: ZhaoYang >Priority: Normal > Fix For: 4.0, 4.0-beta > > > The BufferPool was never intended to be used for a {{ChunkCache}}, and we > need to either change our behaviour to handle uncorrelated lifetimes or use > something else. This is particularly important with the default chunk size > for compressed sstables being reduced.
We should probably also permit > instantiating separate {{BufferPool}}, so that we can insulate internode > messaging from the {{ChunkCache}}, or at least have separate memory bounds > for each, and only share fully-freed chunks. > With these improvements we can also safely increase the {{BufferPool}} chunk > size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce > the amount of global coordination and per-allocation overhead. We don’t need > 1KiB granularity for allocations, nor 16 byte granularity for tiny > allocations.
[jira] [Commented] (CASSANDRA-15229) BufferPool Regression
[ https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076712#comment-17076712 ] Benedict Elliott Smith commented on CASSANDRA-15229: bq. memory wasted due to fragmentation is perhaps not an issue with a cache as little as 512 MB My view is that having a significant proportion of memory wasted to fragmentation is a serious bug, regardless of the total amount of memory that is wasted. bq. The point is that it is not suitable for long lived buffers, similarly to our bump the pointer strategy. It's not poorly suited to long lived buffers, is it? Only to buffers with widely divergent lifetimes. If the lifetimes are loosely correlated then the length of the lifetime is mostly irrelevant, I think. bq. The changes to the buffer pool can be dropped in 4.0 if you think that If you mean introducing a new pool specifically for {{ChunkCache}}, I'm fine with it as an alternative to permitting {{BufferPool}} to mitigate worst case behaviour for the {{ChunkCache}}. But verifying a replacement for {{BufferPool}} is a lot more work, and we use the {{BufferPool}} extensively in networking now, which requires non-uniform buffer sizes. Honestly, given chunks are normally the same size, simply re-using the evicted buffer if possible, and if not allocating new system memory, seems probably sufficient to me. > BufferPool Regression > - > > Key: CASSANDRA-15229 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15229 > Project: Cassandra > Issue Type: Bug > Components: Local/Caching >Reporter: Benedict Elliott Smith >Assignee: ZhaoYang >Priority: Normal > Fix For: 4.0, 4.0-beta > > > The BufferPool was never intended to be used for a {{ChunkCache}}, and we > need to either change our behaviour to handle uncorrelated lifetimes or use > something else. This is particularly important with the default chunk size > for compressed sstables being reduced.
If we address the problem, we should > also utilise the BufferPool for native transport connections like we do for > internode messaging, and reduce the number of pooling solutions we employ. > Probably the best thing to do is to improve BufferPool’s behaviour when used > for things with uncorrelated lifetimes, which essentially boils down to > tracking those chunks that have not been freed and re-circulating them when > we run out of completely free blocks. We should probably also permit > instantiating separate {{BufferPool}}, so that we can insulate internode > messaging from the {{ChunkCache}}, or at least have separate memory bounds > for each, and only share fully-freed chunks. > With these improvements we can also safely increase the {{BufferPool}} chunk > size to 128KiB or 256KiB, to guarantee we can fit compressed pages and reduce > the amount of global coordination and per-allocation overhead. We don’t need > 1KiB granularity for allocations, nor 16 byte granularity for tiny > allocations.
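The "simply re-use the evicted buffer if possible" idea from the discussion above is easy to sketch. The following toy cache is hypothetical Python for illustration only, not the actual ChunkCache: it reuses an evicted buffer when the requested size matches, and falls back to a fresh allocation otherwise.

```python
class ReuseOnEvictCache:
    """Toy model: fixed capacity of two entries, FIFO eviction."""

    def __init__(self, allocate):
        self.allocate = allocate  # e.g. bytearray for a plain heap buffer
        self.entries = {}         # insertion-ordered in Python 3.7+

    def load(self, key, size):
        if key in self.entries:
            return self.entries[key]
        evicted = None
        if len(self.entries) >= 2:
            oldest = next(iter(self.entries))
            evicted = self.entries.pop(oldest)
        # Reuse the evicted buffer only when sizes match; chunks are
        # normally uniform, so this should be the common case.
        if evicted is not None and len(evicted) == size:
            buf = evicted
        else:
            buf = self.allocate(size)
        self.entries[key] = buf
        return buf
```

Because the reused buffer never returns to a shared pool, a cache built this way cannot fragment the pool with long-lived allocations, which is the failure mode discussed in the thread.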
[jira] [Commented] (CASSANDRA-15697) cqlsh -e parsing bug
[ https://issues.apache.org/jira/browse/CASSANDRA-15697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076707#comment-17076707 ] Stefan Miklosovic commented on CASSANDRA-15697: --- [~jrwest] I have already hit this and it is solved here https://issues.apache.org/jira/browse/CASSANDRA-15660; this should be closed / resolved as a duplicate. > cqlsh -e parsing bug > > > Key: CASSANDRA-15697 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15697 > Project: Cassandra > Issue Type: Bug > Components: Tool/cqlsh >Reporter: Jordan West >Priority: Normal > Fix For: 4.0-alpha > > > {{cqlsh -e}} no longer works on trunk after the introduction of python 3 > support (CASSANDRA-10190). Examples below. > {code} > $ ./bin/cqlsh -e 'select * from foo;' > Usage: cqlsh.py [options] [host [port]] > cqlsh.py: error: ‘CHANGES.txt' is not a valid port number. > $ ./bin/cqlsh -e 'select id from foo;' > Usage: cqlsh.py [options] [host [port]] > cqlsh.py: error: 'from' is not a valid port number. > {code}
[jira] [Updated] (CASSANDRA-15671) Testcase: testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest): FAILED
[ https://issues.apache.org/jira/browse/CASSANDRA-15671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ekaterina Dimitrova updated CASSANDRA-15671: Status: Ready to Commit (was: Review In Progress) > Testcase: > testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest): >FAILED > -- > > Key: CASSANDRA-15671 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15671 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Francisco Fernandez >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > The following test failure was observed: > [junit-timeout] Testcase: > testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest): >FAILED > [junit-timeout] expected:<4> but was:<5> > [junit-timeout] junit.framework.AssertionFailedError: expected:<4> but was:<5> > [junit-timeout] at > org.apache.cassandra.db.compaction.CancelCompactionsTest.testSubrangeCompaction(CancelCompactionsTest.java:190) > Java 8
[jira] [Commented] (CASSANDRA-15671) Testcase: testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest): FAILED
[ https://issues.apache.org/jira/browse/CASSANDRA-15671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076692#comment-17076692 ] Ekaterina Dimitrova commented on CASSANDRA-15671: - Hi, now it is fine, thank you! The patch LGTM. Thanks! The only thing is, I am not a committer. [~brandon.williams] can you please check and commit this one? Thanks in advance! > Testcase: > testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest): >FAILED > -- > > Key: CASSANDRA-15671 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15671 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Francisco Fernandez >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > The following test failure was observed: > [junit-timeout] Testcase: > testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest): >FAILED > [junit-timeout] expected:<4> but was:<5> > [junit-timeout] junit.framework.AssertionFailedError: expected:<4> but was:<5> > [junit-timeout] at > org.apache.cassandra.db.compaction.CancelCompactionsTest.testSubrangeCompaction(CancelCompactionsTest.java:190) > Java 8
[jira] [Comment Edited] (CASSANDRA-15597) Correct Visibility and Improve Safety of Methods in LatencyMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-15597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076663#comment-17076663 ] Jordan West edited comment on CASSANDRA-15597 at 4/6/20, 9:03 PM: -- +1 Tests: https://circleci.com/gh/jrwest/cassandra/tree/15597-4.0 was (Author: jrwest): +1 > Correct Visibility and Improve Safety of Methods in LatencyMetrics > -- > > Key: CASSANDRA-15597 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15597 > Project: Cassandra > Issue Type: Improvement > Components: Observability/Metrics >Reporter: Jordan West >Assignee: Jeff >Priority: Normal > Fix For: 4.0 > > > * add/removeChildren does not need to be public (and exposing addChildren is > unsafe since no lock is used). > * casting in the constructor is safer than casting each time in removeChildren
[jira] [Commented] (CASSANDRA-15597) Correct Visibility and Improve Safety of Methods in LatencyMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-15597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076663#comment-17076663 ] Jordan West commented on CASSANDRA-15597: - +1 > Correct Visibility and Improve Safety of Methods in LatencyMetrics > -- > > Key: CASSANDRA-15597 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15597 > Project: Cassandra > Issue Type: Improvement > Components: Observability/Metrics >Reporter: Jordan West >Assignee: Jeff >Priority: Normal > Fix For: 4.0 > > > * add/removeChildren does not need to be public (and exposing addChildren is > unsafe since no lock is used). > * casting in the constructor is safer than casting each time in removeChildren
[jira] [Updated] (CASSANDRA-15697) cqlsh -e parsing bug
[ https://issues.apache.org/jira/browse/CASSANDRA-15697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordan West updated CASSANDRA-15697: Bug Category: Parent values: Correctness(12982)Level 1 values: API / Semantic Implementation(12988) Complexity: Normal Component/s: Tool/cqlsh Discovered By: User Report Fix Version/s: 4.0-alpha Severity: Normal Status: Open (was: Triage Needed) > cqlsh -e parsing bug > > > Key: CASSANDRA-15697 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15697 > Project: Cassandra > Issue Type: Bug > Components: Tool/cqlsh >Reporter: Jordan West >Priority: Normal > Fix For: 4.0-alpha > > > {{cqlsh -e}} no longer works on trunk after the introduction of python 3 > support (CASSANDRA-10190). Examples below. > {code} > $ ./bin/cqlsh -e 'select * from foo;' > Usage: cqlsh.py [options] [host [port]] > cqlsh.py: error: ‘CHANGES.txt' is not a valid port number. > $ ./bin/cqlsh -e 'select id from foo;' > Usage: cqlsh.py [options] [host [port]] > cqlsh.py: error: 'from' is not a valid port number. > {code}
[jira] [Created] (CASSANDRA-15697) cqlsh -e parsing bug
Jordan West created CASSANDRA-15697: --- Summary: cqlsh -e parsing bug Key: CASSANDRA-15697 URL: https://issues.apache.org/jira/browse/CASSANDRA-15697 Project: Cassandra Issue Type: Bug Reporter: Jordan West {{cqlsh -e}} no longer works on trunk after the introduction of python 3 support (CASSANDRA-10190). Examples below. {code} $ ./bin/cqlsh -e 'select * from foo;' Usage: cqlsh.py [options] [host [port]] cqlsh.py: error: ‘CHANGES.txt' is not a valid port number. $ ./bin/cqlsh -e 'select id from foo;' Usage: cqlsh.py [options] [host [port]] cqlsh.py: error: 'from' is not a valid port number. {code}
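The error strings are consistent with the quoted {{-e}} argument being re-split (and, for the {{*}} case, glob-expanded into file names such as CHANGES.txt) somewhere between the wrapper script and the option parser. That is only a guess at the mechanism, not a confirmed diagnosis; the effect of re-splitting an already-quoted command line is easy to demonstrate generically:

```python
import shlex

# A correctly quoted invocation keeps the CQL statement as one argument.
cmdline = "cqlsh -e 'select id from foo;'"

respecting_quotes = shlex.split(cmdline)  # how a POSIX shell tokenizes it
naive_resplit = cmdline.split()           # what a later whitespace split does

# One argument for -e, versus four stray tokens whose trailing words a
# positional-argument parser would then try to read as host and port.
assert respecting_quotes == ["cqlsh", "-e", "select id from foo;"]
assert naive_resplit == ["cqlsh", "-e", "'select", "id", "from", "foo;'"]
```

That would match the second example exactly: after re-splitting, "from" lands in the position cqlsh.py parses as the port, producing "'from' is not a valid port number".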
[jira] [Commented] (CASSANDRA-15338) Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest
[ https://issues.apache.org/jira/browse/CASSANDRA-15338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076633#comment-17076633 ] Yifan Cai commented on CASSANDRA-15338: --- [~benedict][~dcapwell], do you want to take a look? > Fix flakey testMessagePurging - org.apache.cassandra.net.ConnectionTest > --- > > Key: CASSANDRA-15338 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15338 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: David Capwell >Assignee: Yifan Cai >Priority: Normal > Labels: pull-request-available > Fix For: 4.0-alpha > > Attachments: CASS-15338-Docker.zip > > Time Spent: 10m > Remaining Estimate: 0h > > Example failure: > [https://circleci.com/gh/dcapwell/cassandra/11#artifacts/containers/1] > > {code:java} > Testcase: testMessagePurging(org.apache.cassandra.net.ConnectionTest): FAILED > expected:<0> but was:<1> > junit.framework.AssertionFailedError: expected:<0> but was:<1> > at > org.apache.cassandra.net.ConnectionTest.lambda$testMessagePurging$38(ConnectionTest.java:625) > at > org.apache.cassandra.net.ConnectionTest.doTestManual(ConnectionTest.java:258) > at > org.apache.cassandra.net.ConnectionTest.testManual(ConnectionTest.java:231) > at > org.apache.cassandra.net.ConnectionTest.testMessagePurging(ConnectionTest.java:584){code} > > Looking closer at > org.apache.cassandra.net.OutboundConnection.Delivery#stopAndRun it seems that > the run method is called before > org.apache.cassandra.net.OutboundConnection.Delivery#doRun which may lead to > a test race condition where the CountDownLatch completes before executing
[jira] [Commented] (CASSANDRA-15623) When running CQLSH with STDIN input, exit with error status code if script fails
[ https://issues.apache.org/jira/browse/CASSANDRA-15623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076628#comment-17076628 ] Jordan West commented on CASSANDRA-15623: - [~plastikat] thanks for reworking the patch for trunk. The changes LGTM. I verified the new behavior works as expected and the old behavior remains unchanged: ``` Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo 'select * from foo;' | ./bin/cqlsh :2:InvalidRequest: Error from server: code=2200 [Invalid query] message="No keyspace has been specified. USE a keyspace, or explicitly specify keyspace.tablename" Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $? 2 Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo 'select * from;' | ./bin/cqlsh :2:SyntaxException: line 1:13 no viable alternative at input ';' (select * from[;]) Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $? 2 Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ ./bin/cqlsh -e 'select;' :1:SyntaxException: line 1:6 no viable alternative at input ';' (select[;]) Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $? 2 Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ ./bin/cqlsh -f testcql testcql:2:SyntaxException: line 1:6 no viable alternative at input ';' (select[;]) Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $? 2 ``` I did however notice a new bug that also exists on trunk (unrelated to this change) while testing. More to come on that. 
> When running CQLSH with STDIN input, exit with error status code if script > fails > > > Key: CASSANDRA-15623 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15623 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Tools >Reporter: Jacob Becker >Assignee: Jacob Becker >Priority: Normal > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Assuming CASSANDRA-6344 is in place for years and considering that scripts > submitted with the `-e` option behave in a similar fashion, it is very > surprising that scripts submitted to STDIN (i.e. piped in) always exit with a > zero code, regardless of errors. I believe this should be fixed.
[jira] [Comment Edited] (CASSANDRA-15623) When running CQLSH with STDIN input, exit with error status code if script fails
[ https://issues.apache.org/jira/browse/CASSANDRA-15623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076628#comment-17076628 ] Jordan West edited comment on CASSANDRA-15623 at 4/6/20, 8:15 PM: -- [~plastikat] thanks for reworking the patch for trunk. The changes LGTM. I verified the new behavior works as expected and the old behavior remains unchanged: {code} Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo 'select * from foo;' | ./bin/cqlsh :2:InvalidRequest: Error from server: code=2200 [Invalid query] message="No keyspace has been specified. USE a keyspace, or explicitly specify keyspace.tablename" Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $? 2 Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo 'select * from;' | ./bin/cqlsh :2:SyntaxException: line 1:13 no viable alternative at input ';' (select * from[;]) Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $? 2 Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ ./bin/cqlsh -e 'select;' :1:SyntaxException: line 1:6 no viable alternative at input ';' (select[;]) Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $? 2 Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ ./bin/cqlsh -f testcql testcql:2:SyntaxException: line 1:6 no viable alternative at input ';' (select[;]) Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $? 2 {code} I did however notice a new bug that also exists on trunk (unrelated to this change) while testing. More to come on that. was (Author: jrwest): [~plastikat] thanks for reworking the patch for trunk. The changes LGTM. I verified the new behavior works as expected and the old behavior remains unchanged: ``` Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo 'select * from foo;' | ./bin/cqlsh :2:InvalidRequest: Error from server: code=2200 [Invalid query] message="No keyspace has been specified. 
USE a keyspace, or explicitly specify keyspace.tablename" Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $? 2 Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo 'select * from;' | ./bin/cqlsh :2:SyntaxException: line 1:13 no viable alternative at input ';' (select * from[;]) Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $? 2 Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ ./bin/cqlsh -e 'select;' :1:SyntaxException: line 1:6 no viable alternative at input ';' (select[;]) Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $? 2 Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ ./bin/cqlsh -f testcql testcql:2:SyntaxException: line 1:6 no viable alternative at input ';' (select[;]) Jordans-MacBook-Pro-2:cassandra-15623-review jordanwest$ echo $? 2 ``` I did however notice a new bug that also exists on trunk (unrelated to this change) while testing. More to come on that. > When running CQLSH with STDIN input, exit with error status code if script > fails > > > Key: CASSANDRA-15623 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15623 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Tools >Reporter: Jacob Becker >Assignee: Jacob Becker >Priority: Normal > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Assuming CASSANDRA-6344 is in place for years and considering that scripts > submitted with the `-e` option behave in a similar fashion, it is very > surprising that scripts submitted to STDIN (i.e. piped in) always exit with a > zero code, regardless of errors. I believe this should be fixed.
[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
[ https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Haddad updated CASSANDRA-15696: --- Bug Category: Parent values: Code(13163)Level 1 values: Bug - Unclear Impact(13164) Complexity: Low Hanging Fruit Discovered By: Code Inspection Severity: Low Status: Open (was: Triage Needed) > Only track ideal CL failure when request CL is met > -- > > Key: CASSANDRA-15696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15696 > Project: Cassandra > Issue Type: Bug > Components: Observability/Metrics >Reporter: Jon Haddad >Assignee: Jon Haddad >Priority: Normal > > When ideal_consistency_level is set (CASSANDRA-13289), we currently increment > a counter if a request doesn’t meet the consistency level specified in the > configuration (or through JMX). > At the moment, we increment the counter whether or not the query was successful. I > think it would be slightly better if we only incremented the counter if the > ideal CL wasn’t achieved but the query’s CL was met. > The original JIRA stated the following as an objective: > {quote}If your application writes at LOCAL_QUORUM how often are those writes > failing to achieve EACH_QUORUM at other data centers. If you failed your > application over to one of those data centers roughly how inconsistent might > it be given the number of writes that didn't propagate since the last > incremental repair? > {quote} > The main benefit of the JIRA was to set a CL higher than the CL being used, > and to track how often we weren’t able to hit that CL despite hitting the > underlying CL. We should only increment the counter in a case where we were > able to meet the query-provided consistency but were unable to meet the ideal > consistency level.
[jira] [Updated] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
[ https://issues.apache.org/jira/browse/CASSANDRA-15696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Haddad updated CASSANDRA-15696: --- Description: When ideal_consistency_level is set (CASSANDRA-13289), we currently increment a counter if a request doesn’t meet the consistency level specified in the configuration (or through JMX). At the moment, we increment the counter whether the query was successful or not. I think it would be slightly better if we only incremented the counter if the ideal CL wasn’t achieved but the query’s CL was met. The original JIRA stated the following as an objective: {quote}If your application writes at LOCAL_QUORUM, how often are those writes failing to achieve EACH_QUORUM at other data centers? If you failed your application over to one of those data centers, roughly how inconsistent might it be given the number of writes that didn't propagate since the last incremental repair? {quote} The main benefit of the JIRA was to set a CL higher than the CL being used, and to track how often we weren’t able to hit that CL despite hitting the underlying CL. We should only increment the counter in a case where we were able to meet the query-provided consistency but were unable to meet the ideal consistency level. was: When ideal_consistency_level is set (CASSANDRA-13289), we currently increment a counter if a request doesn’t use the consistency level specified in the configuration (or through JMX). At the moment, we increment the counter whether the query was successful or not. I think it would be slightly better if we only incremented the counter if the ideal CL wasn’t achieved but the query’s CL was met. The original JIRA stated the following as an objective: {quote} If your application writes at LOCAL_QUORUM, how often are those writes failing to achieve EACH_QUORUM at other data centers?
If you failed your application over to one of those data centers, roughly how inconsistent might it be given the number of writes that didn't propagate since the last incremental repair? {quote} The main benefit of the JIRA was to set a CL higher than the CL being used, and to track how often we weren’t able to hit that CL despite hitting the underlying CL. We should only increment the counter in a case where we were able to meet the query-provided consistency but were unable to meet the ideal consistency level. > Only track ideal CL failure when request CL is met > -- > > Key: CASSANDRA-15696 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15696 > Project: Cassandra > Issue Type: Bug > Components: Observability/Metrics > Reporter: Jon Haddad > Assignee: Jon Haddad > Priority: Normal > > When ideal_consistency_level is set (CASSANDRA-13289), we currently increment > a counter if a request doesn’t meet the consistency level specified in the > configuration (or through JMX). > At the moment, we increment the counter whether the query was successful or not. I > think it would be slightly better if we only incremented the counter if the > ideal CL wasn’t achieved but the query’s CL was met. > The original JIRA stated the following as an objective: > {quote}If your application writes at LOCAL_QUORUM, how often are those writes > failing to achieve EACH_QUORUM at other data centers? If you failed your > application over to one of those data centers, roughly how inconsistent might > it be given the number of writes that didn't propagate since the last > incremental repair? > {quote} > The main benefit of the JIRA was to set a CL higher than the CL being used, > and to track how often we weren’t able to hit that CL despite hitting the > underlying CL. We should only increment the counter in a case where we were > able to meet the query-provided consistency but were unable to meet the ideal > consistency level. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15696) Only track ideal CL failure when request CL is met
Jon Haddad created CASSANDRA-15696: -- Summary: Only track ideal CL failure when request CL is met Key: CASSANDRA-15696 URL: https://issues.apache.org/jira/browse/CASSANDRA-15696 Project: Cassandra Issue Type: Bug Components: Observability/Metrics Reporter: Jon Haddad Assignee: Jon Haddad When ideal_consistency_level is set (CASSANDRA-13289), we currently increment a counter if a request doesn’t use the consistency level specified in the configuration (or through JMX). At the moment, we increment the counter whether the query was successful or not. I think it would be slightly better if we only incremented the counter if the ideal CL wasn’t achieved but the query’s CL was met. The original JIRA stated the following as an objective: {quote} If your application writes at LOCAL_QUORUM, how often are those writes failing to achieve EACH_QUORUM at other data centers? If you failed your application over to one of those data centers, roughly how inconsistent might it be given the number of writes that didn't propagate since the last incremental repair? {quote} The main benefit of the JIRA was to set a CL higher than the CL being used, and to track how often we weren’t able to hit that CL despite hitting the underlying CL. We should only increment the counter in a case where we were able to meet the query-provided consistency but were unable to meet the ideal consistency level. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
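The gating proposed in this ticket can be sketched as a small Java snippet. This is purely illustrative: the class and method names below (`IdealClTracker`, `onWriteComplete`) are assumptions for the sketch, not the actual Cassandra metrics code.

```java
// Illustrative sketch: only count an ideal-CL miss when the request's own CL was met.
import java.util.concurrent.atomic.AtomicLong;

public class IdealClTracker {
    // Counter analogous to the ideal-CL write metric added by CASSANDRA-13289 (name hypothetical).
    private final AtomicLong idealClFailures = new AtomicLong();

    /**
     * @param requestClMet whether the query's own consistency level was achieved
     * @param idealClMet   whether the (typically higher) ideal_consistency_level was achieved
     */
    public void onWriteComplete(boolean requestClMet, boolean idealClMet) {
        // Proposed rule: a write that failed its own CL should not also count as an ideal-CL miss.
        if (requestClMet && !idealClMet)
            idealClFailures.incrementAndGet();
    }

    public long idealClFailures() {
        return idealClFailures.get();
    }

    public static void main(String[] args) {
        IdealClTracker tracker = new IdealClTracker();
        tracker.onWriteComplete(true, true);   // both met: no increment
        tracker.onWriteComplete(false, false); // query itself failed: no increment (the fix)
        tracker.onWriteComplete(true, false);  // query ok, ideal CL missed: increment
        System.out.println(tracker.idealClFailures()); // prints 1
    }
}
```

Under the pre-fix behavior the second call would also increment, conflating query failures with ideal-CL misses.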
[jira] [Commented] (CASSANDRA-15686) Improvements in circle CI default config
[ https://issues.apache.org/jira/browse/CASSANDRA-15686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076590#comment-17076590 ] Kevin Gallardo commented on CASSANDRA-15686: bq. I am only speculating but I thought there were tests that spin up network and there are many tests which share disk, and I don't know if we isolate the paths or not. If this is not true then it should be a simple change to bump the number of runners for unit tests (definitely not jvm dtests). Hm afaict running with multiple runners is fine for the unit tests, I have tested on different configurations and it didn't seem to cause problems as long as # runners <= # CPUs > Improvements in circle CI default config > > > Key: CASSANDRA-15686 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15686 > Project: Cassandra > Issue Type: Bug > Components: Build > Reporter: Kevin Gallardo > Priority: Normal > > I have been looking at and playing around with the [default CircleCI > config|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml]; > a few comments/questions regarding the following topics: > * Python dtests do not run successfully (200-300 failures) on {{medium}} > instances; they seem to run with only small flaky failures on {{large}} > instances or higher > * Python upgrade tests: > ** Do not seem to run without many failures on any instance type / any > parallelism setting > ** Do not seem to parallelize well; it seems each container is going to > download multiple C* versions > ** Additionally, it seems the configuration is not up to date, as currently > we get errors because {{JAVA8_HOME}} is not set > * Unit tests do not seem to parallelize optimally; the number of test runners does > not reflect the available CPUs on the container. Ideally, if # of runners == # > of CPUs, build time is improved on any type of instance.
> ** For instance, when using the current configuration running on medium > instances, the build will use 1 JUnit test runner, but 2 CPUs are available. If > using 2 runners, the build time is reduced from 19min (at the current main > config of parallelism=4) to 12min. > * There are some typos in the file; some dtests say "Run Unit Tests" but > they are JVM dtests (see > [here|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml#L1077], > > [here|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml#L1386]) > Some ways to address these would be: > * Do the Python dtests run successfully for anyone on {{medium}} instances? > If not, would it make sense to bump them to {{large}} so that they can be run > successfully? > * Does anybody ever run the Python upgrade tests on CircleCI, and what is the > configuration that makes them work? > * Would it make sense to either hardcode the number of test runners for the > unit tests with `-Dtest.runners` in the config file to reflect the number of > CPUs on the instances, or change the build so that it is able to detect the > appropriate number of cores available automatically? > Additionally, it seems this default config file (config.yml) is not as well > maintained as the > [{{config-2_1.yml}}|https://github.com/apache/cassandra/blob/trunk/.circleci/config-2_1.yml] > (+its lowres/highres) version in the same folder (from CASSANDRA-14806). > What is the reasoning for maintaining these 2 versions of the build? Could > the better-maintained version be used as the default? We could generate a > lowres version of the new config-2_1.yml and rename it {{config.yml}} so > that it gets picked up by CircleCI automatically instead of the current > default. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15568) Message filtering should apply on the inboundSink in In-JVM dtest
[ https://issues.apache.org/jira/browse/CASSANDRA-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076588#comment-17076588 ] Benedict Elliott Smith commented on CASSANDRA-15568: Neat! > Message filtering should apply on the inboundSink in In-JVM dtest > - > > Key: CASSANDRA-15568 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15568 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest > Reporter: Yifan Cai > Assignee: Yifan Cai > Priority: Normal > > The message filtering mechanism in the in-jvm dtest helps to simulate network > partition/delay. > The problem with the current approach, which adds all filters to the > {{MessagingService#outboundSink}}, is that a blocking filter blocks the > following filters from being evaluated, since there is only a single thread that > evaluates them. It further blocks the other outgoing messages. The typical > internode messaging pattern is that the coordinator node sends out multiple > messages to other nodes upon receiving a query. The described blocking of > messages can happen quite often. > The problem can be solved by moving the message filtering to the > {{MessagingService#inboundSink}}, so that each inbound message is > naturally filtered in parallel. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
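The single-threaded outbound bottleneck described in this ticket can be illustrated with a toy Java model. This is not the dtest code; the single-thread executor below merely stands in for the one thread evaluating outbound filters, and the latch stands in for a blocking filter.

```java
// Toy model: when all filters run on one outbound thread, a single blocking
// filter stalls every message queued behind it.
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class OutboundSinkSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService outbound = Executors.newSingleThreadExecutor(); // the single sink thread
        CountDownLatch gate = new CountDownLatch(1);

        // The filter for message A blocks until the gate opens...
        outbound.submit(() -> { try { gate.await(); } catch (InterruptedException e) { } });
        // ...so message B, queued behind it, cannot be delivered yet.
        Future<?> messageB = outbound.submit(() -> System.out.println("B delivered"));

        System.out.println("B done early? " + messageB.isDone()); // prints false: stuck behind A
        gate.countDown();                                          // unblock A's filter
        messageB.get(5, TimeUnit.SECONDS);                         // now B gets through
        outbound.shutdown();
    }
}
```

Evaluating filters on the inbound side instead means each receiving instance applies its filters independently, so one blocked message no longer stalls the coordinator's other outbound messages.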
[jira] [Comment Edited] (CASSANDRA-15568) Message filtering should apply on the inboundSink in In-JVM dtest
[ https://issues.apache.org/jira/browse/CASSANDRA-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076497#comment-17076497 ] David Capwell edited comment on CASSANDRA-15568 at 4/6/20, 5:38 PM: [~benedict] bq. Hmm, when did this happen? ~2 weeks ago? bq. Have we eliminated outbound filtering? There is value in being able to stop progress on the outbound thread, as it permits you to specify a sequence of events by controlling the flow of events on the coordinator. The default is inbound, but you can define inbound or outbound when you define the filter {code} cluster.filters().outbound().allVerbs().drop(); // runs in the Instance doing the message sending cluster.filters().inbound().allVerbs().drop(); // runs in the instance receiving the message cluster.filters().allVerbs().drop(); // same as the above, inbound is the default. {code} PreviewRepairTest uses outbound filters to block the sending until IR has completed. was (Author: dcapwell): bq. Hmm, when did this happen? ~2 weeks ago? bq. Have we eliminated outbound filtering? There is value in being able to stop progress on the outbound thread, as it permits you to specify a sequence of events by controlling the flow of events on the coordinator. The default is inbound, but you can define inbound or outbound when you define the filter {code} cluster.filters().outbound().allVerbs().drop(); // runs in the Instance doing the message sending cluster.filters().inbound().allVerbs().drop(); // runs in the instance receiving the message cluster.filters().allVerbs().drop(); // same as the above, inbound is the default. {code} PreviewRepairTest uses outbound filters to block the sending until IR has completed. 
> Message filtering should apply on the inboundSink in In-JVM dtest > - > > Key: CASSANDRA-15568 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15568 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest > Reporter: Yifan Cai > Assignee: Yifan Cai > Priority: Normal > > The message filtering mechanism in the in-jvm dtest helps to simulate network > partition/delay. > The problem with the current approach, which adds all filters to the > {{MessagingService#outboundSink}}, is that a blocking filter blocks the > following filters from being evaluated, since there is only a single thread that > evaluates them. It further blocks the other outgoing messages. The typical > internode messaging pattern is that the coordinator node sends out multiple > messages to other nodes upon receiving a query. The described blocking of > messages can happen quite often. > The problem can be solved by moving the message filtering to the > {{MessagingService#inboundSink}}, so that each inbound message is > naturally filtered in parallel. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-15568) Message filtering should apply on the inboundSink in In-JVM dtest
[ https://issues.apache.org/jira/browse/CASSANDRA-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076497#comment-17076497 ] David Capwell edited comment on CASSANDRA-15568 at 4/6/20, 5:38 PM: bq. Hmm, when did this happen? ~2 weeks ago? bq. Have we eliminated outbound filtering? There is value in being able to stop progress on the outbound thread, as it permits you to specify a sequence of events by controlling the flow of events on the coordinator. The default is inbound, but you can define inbound or outbound when you define the filter {code} cluster.filters().outbound().allVerbs().drop(); // runs in the Instance doing the message sending cluster.filters().inbound().allVerbs().drop(); // runs in the instance receiving the message cluster.filters().allVerbs().drop(); // same as the above, inbound is the default. {code} PreviewRepairTest uses outbound filters to block the sending until IR has completed. was (Author: dcapwell): bq. Hmm, when did this happen? ~2 weeks ago? bq. Have we eliminated outbound filtering? There is value in being able to stop progress on the outbound thread, as it permits you to specify a sequence of events by controlling the flow of events on the coordinator. The default is inbound, but you can define inbound or outbound when you define the filter {code} cluster.filters().outbound().allVerbs().drop(); cluster.filters().inbound().allVerbs().drop(); {code} PreviewRepairTest uses outbound filters to block the sending until IR has completed. > Message filtering should apply on the inboundSink in In-JVM dtest > - > > Key: CASSANDRA-15568 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15568 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > > The message filtering mechanism in the in-jvm dtest helps to simulate network > partition/delay. 
> The problem with the current approach, which adds all filters to the > {{MessagingService#outboundSink}}, is that a blocking filter blocks the > following filters from being evaluated, since there is only a single thread that > evaluates them. It further blocks the other outgoing messages. The typical > internode messaging pattern is that the coordinator node sends out multiple > messages to other nodes upon receiving a query. The described blocking of > messages can happen quite often. > The problem can be solved by moving the message filtering to the > {{MessagingService#inboundSink}}, so that each inbound message is > naturally filtered in parallel. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15640) digest may not match when single partition named queries skip older sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-15640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sam Tunnicliffe reassigned CASSANDRA-15640: --- Assignee: Sam Tunnicliffe > digest may not match when single partition named queries skip older sstables > > > Key: CASSANDRA-15640 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15640 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths > Reporter: ZhaoYang > Assignee: Sam Tunnicliffe > Priority: Normal > > Named queries (i.e. single-partition queries with full clustering keys) read > sstables sequentially in recency order, in the hope that the most recent sstables > will contain the most recent data, so that reading older sstables can be avoided > in {{SinglePartitionReadCommand#reduceFilter}}. > Unfortunately, this optimization may cause a digest mismatch if older sstables > contain a range tombstone or row deletion with a lower timestamp. [Test > Code|https://github.com/jasonstack/cassandra/commit/3dfa29bb34bc237ab2b68f849906c09569c5cc94] > {code:java} > Table with (pk, ck1, ck2) > Node1: > * delete row (pk=1, ck1=1) with ts=10 > * insert row (pk=1, ck1=1, ck2=1) with ts=11 > Node2: > * delete row (pk=1, ck1=1) with ts=10 > * flush into sstable1 > * insert row (pk=1, ck1=1, ck2=1) with ts=11 > * flush into sstable2 > Query with pk=1 and ck1=1 and ck2=1 > * node1 returns: RT open marker, row, RT close marker > * node2 returns: row (because sstable1 is skipped) > Note: similar mismatch can happen with row deletion as well. > {code} > In the above example: is it safe for node1 to ignore the RT or row deletion for > named queries, given that the row liveness has a higher timestamp? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
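The mismatch in the scenario above can be sketched with a small Java snippet. This is purely illustrative: the strings stand in for the serialized unfiltereds in each replica's response, and MD5 stands in for the digest computation used by digest reads; none of this is the actual Cassandra read path.

```java
// Illustrative sketch: the two replicas' responses digest differently because
// node2 skipped the sstable holding the (shadowed) range tombstone.
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Arrays;
import java.util.List;

public class DigestMismatchSketch {
    // Stand-in for hashing a serialized partition response.
    static byte[] digest(List<String> unfiltereds) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        for (String u : unfiltereds)
            md5.update(u.getBytes(StandardCharsets.UTF_8));
        return md5.digest();
    }

    public static void main(String[] args) throws Exception {
        // node1 read every sstable, so the shadowed RT still appears in its response
        List<String> node1 = Arrays.asList("RT_open ts=10", "row ck2=1 ts=11", "RT_close ts=10");
        // node2 skipped sstable1, so the RT never makes it into its response
        List<String> node2 = Arrays.asList("row ck2=1 ts=11");
        System.out.println(Arrays.equals(digest(node1), digest(node2))); // prints false
    }
}
```

Both replicas would return the same live row to the client, yet the digests differ, which is exactly what triggers the spurious digest mismatch on the read path.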
[jira] [Commented] (CASSANDRA-15568) Message filtering should apply on the inboundSink in In-JVM dtest
[ https://issues.apache.org/jira/browse/CASSANDRA-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076497#comment-17076497 ] David Capwell commented on CASSANDRA-15568: --- bq. Hmm, when did this happen? ~2 weeks ago? bq. Have we eliminated outbound filtering? There is value in being able to stop progress on the outbound thread, as it permits you to specify a sequence of events by controlling the flow of events on the coordinator. The default is inbound, but you can define inbound or outbound when you define the filter {code} cluster.filters().outbound().allVerbs().drop(); cluster.filters().inbound().allVerbs().drop(); {code} PreviewRepairTest uses outbound filters to block the sending until IR has completed. > Message filtering should apply on the inboundSink in In-JVM dtest > - > > Key: CASSANDRA-15568 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15568 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest > Reporter: Yifan Cai > Assignee: Yifan Cai > Priority: Normal > > The message filtering mechanism in the in-jvm dtest helps to simulate network > partition/delay. > The problem with the current approach, which adds all filters to the > {{MessagingService#outboundSink}}, is that a blocking filter blocks the > following filters from being evaluated, since there is only a single thread that > evaluates them. It further blocks the other outgoing messages. The typical > internode messaging pattern is that the coordinator node sends out multiple > messages to other nodes upon receiving a query. The described blocking of > messages can happen quite often. > The problem can be solved by moving the message filtering to the > {{MessagingService#inboundSink}}, so that each inbound message is > naturally filtered in parallel. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project
[ https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15684: -- Reviewers: Alex Petrov, Benjamin Lerer, David Capwell (was: Alex Petrov, Benjamin Lerer) Status: Review In Progress (was: Patch Available) > CASSANDRA-15650 was merged after dtest refactor and modified classes no > longer in the project > - > > Key: CASSANDRA-15684 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15684 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest > Reporter: David Capwell > Assignee: David Capwell > Priority: Normal > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > CASSANDRA-15650 was based off commits before CASSANDRA-15539, which removed > some of the files modified in CASSANDRA-15650. The tests were passing > pre-merge but off earlier commits. On commit they started failing since the > dtest API no longer matches, producing the following exception > {code} > [junit-timeout] > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts; > [junit-timeout] java.lang.NoSuchMethodError: > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts; > [junit-timeout] at > org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67) > [junit-timeout] at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > [junit-timeout] at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [junit-timeout] at >
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [junit-timeout] at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > [junit-timeout] at java.lang.Thread.run(Thread.java:748) > {code} > The root cause was that 4 files existed which should have been deleted in > CASSANDRA-15539. Since they were not, when CASSANDRA-15650 modified one of them it > didn't cause a merge conflict, but when the tests run they conflict and fail. > [Circle > CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project
[ https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15684: -- Status: Ready to Commit (was: Review In Progress) > CASSANDRA-15650 was merged after dtest refactor and modified classes no > longer in the project > - > > Key: CASSANDRA-15684 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15684 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest > Reporter: David Capwell > Assignee: David Capwell > Priority: Normal > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > CASSANDRA-15650 was based off commits before CASSANDRA-15539, which removed > some of the files modified in CASSANDRA-15650. The tests were passing > pre-merge but off earlier commits. On commit they started failing since the > dtest API no longer matches, producing the following exception > {code} > [junit-timeout] > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts; > [junit-timeout] java.lang.NoSuchMethodError: > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts; > [junit-timeout] at > org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67) > [junit-timeout] at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > [junit-timeout] at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [junit-timeout] at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [junit-timeout] at >
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > [junit-timeout] at java.lang.Thread.run(Thread.java:748) > {code} > The root cause was that 4 files existed which should have been deleted in > CASSANDRA-15539. Since they were not, when CASSANDRA-15650 modified one of them it > didn't cause a merge conflict, but when the tests run they conflict and fail. > [Circle > CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project
[ https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15684: -- Fix Version/s: 4.0-alpha Resolution: Fixed Status: Resolved (was: Ready to Commit) > CASSANDRA-15650 was merged after dtest refactor and modified classes no > longer in the project > - > > Key: CASSANDRA-15684 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15684 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest > Reporter: David Capwell > Assignee: David Capwell > Priority: Normal > Labels: pull-request-available > Fix For: 4.0-alpha > > Time Spent: 20m > Remaining Estimate: 0h > > CASSANDRA-15650 was based off commits before CASSANDRA-15539, which removed > some of the files modified in CASSANDRA-15650. The tests were passing > pre-merge but off earlier commits. On commit they started failing since the > dtest API no longer matches, producing the following exception > {code} > [junit-timeout] > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts; > [junit-timeout] java.lang.NoSuchMethodError: > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts; > [junit-timeout] at > org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67) > [junit-timeout] at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > [junit-timeout] at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [junit-timeout] at >
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [junit-timeout] at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > [junit-timeout] at java.lang.Thread.run(Thread.java:748) > {code} > The root cause was that 4 files existed which should have been deleted in > CASSANDRA-15539. Since they were not, when CASSANDRA-15650 modified one of them it > didn't cause a merge conflict, but when the tests run they conflict and fail. > [Circle > CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15686) Improvements in circle CI default config
[ https://issues.apache.org/jira/browse/CASSANDRA-15686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076459#comment-17076459 ] Kevin Gallardo commented on CASSANDRA-15686: Confirmed the build runs as normal when using {{config-2_1.yml}} directly, renamed as {{config.yml}}; see build #42 here: https://app.circleci.com/pipelines/github/newkek/cassandra?branch=chg > Improvements in circle CI default config > > > Key: CASSANDRA-15686 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15686 > Project: Cassandra > Issue Type: Bug > Components: Build > Reporter: Kevin Gallardo > Priority: Normal > > I have been looking at and playing around with the [default CircleCI > config|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml]; > a few comments/questions regarding the following topics: > * Python dtests do not run successfully (200-300 failures) on {{medium}} > instances; they seem to run with only small flaky failures on {{large}} > instances or higher > * Python upgrade tests: > ** Do not seem to run without many failures on any instance type / any > parallelism setting > ** Do not seem to parallelize well; it seems each container is going to > download multiple C* versions > ** Additionally, it seems the configuration is not up to date, as currently > we get errors because {{JAVA8_HOME}} is not set > * Unit tests do not seem to parallelize optimally; the number of test runners does > not reflect the available CPUs on the container. Ideally, if # of runners == # > of CPUs, build time is improved on any type of instance. > ** For instance, when using the current configuration running on medium > instances, the build will use 1 JUnit test runner, but 2 CPUs are available. If > using 2 runners, the build time is reduced from 19min (at the current main > config of parallelism=4) to 12min.
> * There are some typos in the file; some dtests say "Run Unit Tests" but > they are JVM dtests (see > [here|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml#L1077], > > [here|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml#L1386]) > Some ways to address these would be: > * Do the Python dtests run successfully for anyone on {{medium}} instances? > If not, would it make sense to bump them to {{large}} so that they can be run > successfully? > * Does anybody ever run the Python upgrade tests on CircleCI, and what is the > configuration that makes them work? > * Would it make sense to either hardcode the number of test runners for the > unit tests with `-Dtest.runners` in the config file to reflect the number of > CPUs on the instances, or change the build so that it is able to detect the > appropriate number of cores available automatically? > Additionally, it seems this default config file (config.yml) is not as well > maintained as the > [{{config-2_1.yml}}|https://github.com/apache/cassandra/blob/trunk/.circleci/config-2_1.yml] > (+its lowres/highres) version in the same folder (from CASSANDRA-14806). > What is the reasoning for maintaining these 2 versions of the build? Could > the better-maintained version be used as the default? We could generate a > lowres version of the new config-2_1.yml and rename it {{config.yml}} so > that it gets picked up by CircleCI automatically instead of the current > default. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project
[ https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076452#comment-17076452 ] Michael Semb Wever commented on CASSANDRA-15684: bq. Michael Semb Wever if I understand the nature of the failure correctly, the main problem was merge. That's my understanding too. Since both you [~ifesdjeen] and [~blerer] have +1 on the main patch I've gone ahead and committed it. Committed as [a104b06d4aea2f2cd3d48bdbe38410284f236428|https://github.com/apache/cassandra/commit/a104b06d4aea2f2cd3d48bdbe38410284f236428] > CASSANDRA-15650 was merged after dtest refactor and modified classes no > longer in the project > - > > Key: CASSANDRA-15684 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15684 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > CASSANDRA-15650 was based off commits before CASSANDRA-15539 which removed > some of the files modified in CASSANDRA-15650. The tests were passing > pre-merge but off earlier commits. 
On commit they started failing since the > dtest API no longer matched, producing the following exception > {code} > [junit-timeout] > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts; > [junit-timeout] java.lang.NoSuchMethodError: > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts; > [junit-timeout] at > org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67) > [junit-timeout] at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > [junit-timeout] at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [junit-timeout] at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [junit-timeout] at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > [junit-timeout] at java.lang.Thread.run(Thread.java:748) > {code} > Root cause was that 4 files existed which should have been deleted in > CASSANDRA-15539. Since they were not, when CASSANDRA-15650 modified one it > didn't cause a merge conflict, but when the test runs it conflicts and fails. > [Circle > CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project
[ https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Semb Wever updated CASSANDRA-15684: --- Since Version: 4.0-alpha Source Control Link: https://github.com/apache/cassandra/commit/a104b06d4aea2f2cd3d48bdbe38410284f236428 > CASSANDRA-15650 was merged after dtest refactor and modified classes no > longer in the project > - > > Key: CASSANDRA-15684 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15684 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > CASSANDRA-15650 was based off commits before CASSANDRA-15539 which removed > some of the files modified in CASSANDRA-15650. The tests were passing > pre-merge but off earlier commits. On commit they started failing since the > dtest API no longer matched, producing the following exception > {code} > [junit-timeout] > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts; > [junit-timeout] java.lang.NoSuchMethodError: > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts; > [junit-timeout] at > org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67) > [junit-timeout] at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > [junit-timeout] at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [junit-timeout] at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [junit-timeout] at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > [junit-timeout] at java.lang.Thread.run(Thread.java:748) > {code} > Root cause was that 4 files existed which should have been deleted in > CASSANDRA-15539. Since they were not, when CASSANDRA-15650 modified one it > didn't cause a merge conflict, but when the test runs it conflicts and fails. > [Circle > CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] branch trunk updated: Fix RepairCoordinator test failures, after clobbering jvm-dtest refactoring (CASSANDRA-15650) and modifying classes no longer in the project
This is an automated email from the ASF dual-hosted git repository. mck pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git The following commit(s) were added to refs/heads/trunk by this push: new a104b06 Fix RepairCoordinator test failures, after clobbering jvm-dtest refactoring (CASSANDRA-15650) and modifying classes no longer in the project a104b06 is described below commit a104b06d4aea2f2cd3d48bdbe38410284f236428 Author: David Capwell AuthorDate: Thu Apr 2 10:58:43 2020 -0700 Fix RepairCoordinator test failures, after clobbering jvm-dtest refactoring (CASSANDRA-15650) and modifying classes no longer in the project patch by David Capwell; reviewed by Benjamin Lerer, Alex Petrov for CASSANDRA-15684 --- .../cassandra/distributed/api/LongTokenRange.java | 38 .../cassandra/distributed/api/NodeToolResult.java | 218 - .../cassandra/distributed/api/QueryResult.java | 139 - .../org/apache/cassandra/distributed/api/Row.java | 119 --- .../distributed/test/RepairCoordinatorFast.java| 8 +- 5 files changed, 6 insertions(+), 516 deletions(-) diff --git a/test/distributed/org/apache/cassandra/distributed/api/LongTokenRange.java b/test/distributed/org/apache/cassandra/distributed/api/LongTokenRange.java deleted file mode 100644 index 06327e8..000 --- a/test/distributed/org/apache/cassandra/distributed/api/LongTokenRange.java +++ /dev/null @@ -1,38 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. 
You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. - */ - -package org.apache.cassandra.distributed.api; - -import java.io.Serializable; - -public final class LongTokenRange implements Serializable -{ -public final long minExclusive; -public final long maxInclusive; - -public LongTokenRange(long minExclusive, long maxInclusive) -{ -this.minExclusive = minExclusive; -this.maxInclusive = maxInclusive; -} - -public String toString() -{ -return "(" + minExclusive + "," + maxInclusive + "]"; -} -} diff --git a/test/distributed/org/apache/cassandra/distributed/api/NodeToolResult.java b/test/distributed/org/apache/cassandra/distributed/api/NodeToolResult.java deleted file mode 100644 index 8f33ae5..000 --- a/test/distributed/org/apache/cassandra/distributed/api/NodeToolResult.java +++ /dev/null @@ -1,218 +0,0 @@ -/* - * Licensed to the Apache Software Foundation (ASF) under one - * or more contributor license agreements. See the NOTICE file - * distributed with this work for additional information - * regarding copyright ownership. The ASF licenses this file - * to you under the Apache License, Version 2.0 (the - * "License"); you may not use this file except in compliance - * with the License. You may obtain a copy of the License at - * - * http://www.apache.org/licenses/LICENSE-2.0 - * - * Unless required by applicable law or agreed to in writing, software - * distributed under the License is distributed on an "AS IS" BASIS, - * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - * See the License for the specific language governing permissions and - * limitations under the License. 
- */ - -package org.apache.cassandra.distributed.api; - -import java.util.Arrays; -import java.util.Collection; -import java.util.List; -import java.util.Map; -import java.util.stream.Collectors; -import java.util.stream.Stream; -import javax.management.Notification; - -import com.google.common.base.Throwables; -import org.junit.Assert; - -public class NodeToolResult -{ -private final String[] commandAndArgs; -private final int rc; -private final List notifications; -private final Throwable error; - -public NodeToolResult(String[] commandAndArgs, int rc, List notifications, Throwable error) -{ -this.commandAndArgs = commandAndArgs; -this.rc = rc; -this.notifications = notifications; -this.error = error; -} - -public String[] getCommandAndArgs() -{ -return commandAndArgs; -} - -public int getRc() -{ -return rc; -} - -public List getNotifications() -{ -return notifications; -} - -public
[jira] [Updated] (CASSANDRA-15662) cqlsh tests won't run on jdk1.8 (regression from CASSANDRA-10190)
[ https://issues.apache.org/jira/browse/CASSANDRA-15662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Semb Wever updated CASSANDRA-15662: --- Since Version: 4.0-alpha Source Control Link: https://github.com/apache/cassandra/commit/bb8ec1fc1066e604b5695c0f8057e2b3adfa5cb2 Resolution: Fixed Status: Resolved (was: Ready to Commit) > cqlsh tests won't run on jdk1.8 (regression from CASSANDRA-10190) > -- > > Key: CASSANDRA-15662 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15662 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Michael Semb Wever >Assignee: Michael Semb Wever >Priority: Normal > Fix For: 4.0-alpha > > > Running the cqlsh tests on jdk1.8 no longer works. > The commit {{bf9a1d487b}} for CASSANDRA-10190 broke this by defaulting > {{CASSANDRA_USE_JDK11}} to true. See > https://github.com/apache/cassandra/commit/bf9a1d487b9ba469e8d740cf7d1cd419535a7e79#diff-90e40e02845884b66e9006b25250ea5cR36-R38 > The following three work… > {code} > jenv shell 1.8 > export CASSANDRA_USE_JDK11=false > ./pylib/cassandra-cqlsh-tests.sh `pwd` > {code} > {code} > jenv shell 11.0 > export CASSANDRA_USE_JDK11=true > ./pylib/cassandra-cqlsh-tests.sh `pwd` > {code} > {code} > jenv shell 1.8 > unset CASSANDRA_USE_JDK11 > ./pylib/cassandra-cqlsh-tests.sh `pwd` > {code} > The following does not… > {code} > jenv shell 1.8 > ./pylib/cassandra-cqlsh-tests.sh `pwd` > {code} > {noformat} > BUILD FAILED > /Users/mick/src/apache/cassandra/build.xml:292: -Duse.jdk11=true or > $CASSANDRA_USE_JDK11=true cannot be set when building from java 8 > {noformat} > JDK 1.8 is expected to be the default, with {{CASSANDRA_USE_JDK11}} being > defined if/when JDK 11 is used. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15662) cqlsh tests won't run on jdk1.8 (regression from CASSANDRA-10190)
[ https://issues.apache.org/jira/browse/CASSANDRA-15662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076426#comment-17076426 ] Michael Semb Wever commented on CASSANDRA-15662: Thanks [~yukim]. Committed as bb8ec1fc1066e604b5695c0f8057e2b3adfa5cb2 > cqlsh tests won't run on jdk1.8 (regression from CASSANDRA-10190) > -- > > Key: CASSANDRA-15662 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15662 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Michael Semb Wever >Assignee: Michael Semb Wever >Priority: Normal > Fix For: 4.0-alpha > > > Running the cqlsh tests on jdk1.8 no longer works. > The commit {{bf9a1d487b}} for CASSANDRA-10190 broke this by defaulting > {{CASSANDRA_USE_JDK11}} to true. See > https://github.com/apache/cassandra/commit/bf9a1d487b9ba469e8d740cf7d1cd419535a7e79#diff-90e40e02845884b66e9006b25250ea5cR36-R38 > The following three work… > {code} > jenv shell 1.8 > export CASSANDRA_USE_JDK11=false > ./pylib/cassandra-cqlsh-tests.sh `pwd` > {code} > {code} > jenv shell 11.0 > export CASSANDRA_USE_JDK11=true > ./pylib/cassandra-cqlsh-tests.sh `pwd` > {code} > {code} > jenv shell 1.8 > unset CASSANDRA_USE_JDK11 > ./pylib/cassandra-cqlsh-tests.sh `pwd` > {code} > The following does not… > {code} > jenv shell 1.8 > ./pylib/cassandra-cqlsh-tests.sh `pwd` > {code} > {noformat} > BUILD FAILED > /Users/mick/src/apache/cassandra/build.xml:292: -Duse.jdk11=true or > $CASSANDRA_USE_JDK11=true cannot be set when building from java 8 > {noformat} > JDK 1.8 is expected to be the default, with {{CASSANDRA_USE_JDK11}} being > defined if/when JDK 11 is used. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] branch trunk updated: Fix cqlsh tests running on jdk1.8 (regression from CASSANDRA-10190)
This is an automated email from the ASF dual-hosted git repository. mck pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git The following commit(s) were added to refs/heads/trunk by this push: new bb8ec1f Fix cqlsh tests running on jdk1.8 (regression from CASSANDRA-10190) bb8ec1f is described below commit bb8ec1fc1066e604b5695c0f8057e2b3adfa5cb2 Author: Mick Semb Wever AuthorDate: Sun Apr 5 10:50:26 2020 +0200 Fix cqlsh tests running on jdk1.8 (regression from CASSANDRA-10190) patched by Mick Semb Wever; reviewed by Yuki Morishita for CASSANDRA-15662 --- pylib/cassandra-cqlsh-tests.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/pylib/cassandra-cqlsh-tests.sh b/pylib/cassandra-cqlsh-tests.sh index d305759..56672f1 100755 --- a/pylib/cassandra-cqlsh-tests.sh +++ b/pylib/cassandra-cqlsh-tests.sh @@ -34,7 +34,7 @@ export NUM_TOKENS="32" export CASSANDRA_DIR=${WORKSPACE} if [ -z "$CASSANDRA_USE_JDK11" ]; then -export CASSANDRA_USE_JDK11=true +export CASSANDRA_USE_JDK11=false fi # Loop to prevent failure due to maven-ant-tasks not downloading a jar.. @@ -53,7 +53,7 @@ fi # Set up venv with dtest dependencies set -e # enable immediate exit if venv setup fails -virtualenv --python=$PYTHON_VERSION --no-site-packages venv +virtualenv --python=$PYTHON_VERSION venv source venv/bin/activate pip install -r ${CASSANDRA_DIR}/pylib/requirements.txt - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15573) Python 3.8 fails to execute cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076424#comment-17076424 ] Eduard Tudenhoefner commented on CASSANDRA-15573: - The issue itself can also be manually verified with the below steps (which are from [the pylib readme|https://github.com/nastra/cassandra/blob/e840b458dd8fda021871ceee1efb5187ad94aad3/pylib/README.asc] and require CASSANDRA-15659) {code} docker build . --file Dockerfile.ubuntu.py38 -t ubuntu-lts-py3 docker run -v $CASSANDRA_DIR:/code -it ubuntu-lts-py3:latest /code/bin/cqlsh host.docker.internal {code} > Python 3.8 fails to execute cqlsh > - > > Key: CASSANDRA-15573 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15573 > Project: Cassandra > Issue Type: Sub-task > Components: Tool/cqlsh >Reporter: Yuki Morishita >Assignee: Eduard Tudenhoefner >Priority: Normal > Labels: pull-request-available > Fix For: 4.0-alpha > > Time Spent: 10m > Remaining Estimate: 0h > > Python 3.8 renamed sre_parse.Pattern to sre_parse.State (see > https://bugs.python.org/issue34681 and corresponding pull request > https://github.com/python/cpython/pull/9310) > So when executing cqlsh with Python 3.8, it throws error: > {code} > Traceback (most recent call last): > File ".\bin\cqlsh.py", line 175, in > from cqlshlib import cql3handling, cqlhandling, pylexotron, sslhandling, > cqlshhandling > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cql3handling.py", line 19, > in > from cqlshlib.cqlhandling import CqlParsingRuleSet, Hint > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cqlhandling.py", line 23, > in > from cqlshlib import pylexotron, util > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 342, > in > class ParsingRuleSet: > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 343, > in ParsingRuleSet > RuleSpecScanner = SaferScanner([ > 
File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\saferscanner.py", line 74, > in __init__ > s = re.sre_parse.Pattern() > AttributeError: module 'sre_parse' has no attribute 'Pattern' > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
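The rename behind this traceback ({{sre_parse.Pattern}} becoming {{sre_parse.State}} in Python 3.8, per bpo-34681) can be bridged by looking up whichever name the running interpreter provides. This is only a sketch of the compatibility idea, not the actual patch from the linked PR:

```python
import sre_parse  # an internal module; a deprecated alias on newer Pythons

# Python 3.8 renamed sre_parse.Pattern to sre_parse.State
# (https://bugs.python.org/issue34681); pick whichever exists.
try:
    PatternState = sre_parse.Pattern   # Python < 3.8
except AttributeError:
    PatternState = sre_parse.State     # Python >= 3.8

state = PatternState()  # usable where the old code constructed a Pattern
```

A shim like this lets the same SaferScanner-style code run on interpreters before and after the rename.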
[jira] [Updated] (CASSANDRA-15573) Python 3.8 fails to execute cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eduard Tudenhoefner updated CASSANDRA-15573: Change Category: Operability Complexity: Low Hanging Fruit Fix Version/s: 4.0-alpha Status: Open (was: Triage Needed) > Python 3.8 fails to execute cqlsh > - > > Key: CASSANDRA-15573 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15573 > Project: Cassandra > Issue Type: Sub-task > Components: Tool/cqlsh >Reporter: Yuki Morishita >Assignee: Eduard Tudenhoefner >Priority: Normal > Labels: pull-request-available > Fix For: 4.0-alpha > > Time Spent: 10m > Remaining Estimate: 0h > > Python 3.8 renamed sre_parse.Pattern to sre_parse.State (see > https://bugs.python.org/issue34681 and corresponding pull request > https://github.com/python/cpython/pull/9310) > So when executing cqlsh with Python 3.8, it throws error: > {code} > Traceback (most recent call last): > File ".\bin\cqlsh.py", line 175, in > from cqlshlib import cql3handling, cqlhandling, pylexotron, sslhandling, > cqlshhandling > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cql3handling.py", line 19, > in > from cqlshlib.cqlhandling import CqlParsingRuleSet, Hint > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cqlhandling.py", line 23, > in > from cqlshlib import pylexotron, util > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 342, > in > class ParsingRuleSet: > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 343, > in ParsingRuleSet > RuleSpecScanner = SaferScanner([ > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\saferscanner.py", line 74, > in __init__ > s = re.sre_parse.Pattern() > AttributeError: module 'sre_parse' has no attribute 'Pattern' > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: 
commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15568) Message filtering should apply on the inboundSink in In-JVM dtest
[ https://issues.apache.org/jira/browse/CASSANDRA-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076420#comment-17076420 ] Yifan Cai commented on CASSANDRA-15568: --- Right. I think [~dcapwell] already has it in CASSANDRA-15564. Since that ticket is resolved already, this one can be closed. > Message filtering should apply on the inboundSink in In-JVM dtest > - > > Key: CASSANDRA-15568 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15568 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > > The message filtering mechanism in the in-jvm dtest helps to simulate network > partition/delay. > The problem with the current approach, which adds all filters to the > {{MessagingService#outboundSink}}, is that a blocking filter blocks the > following filters from being evaluated, since there is only a single thread that > evaluates them. It further blocks other outgoing messages. The typical > internode messaging pattern is that the coordinator node sends out multiple > messages to other nodes upon receiving a query, so the described blocking > of messages can happen quite often. > The problem can be solved by moving the message filtering to the > {{MessagingService#inboundSink}}, so that each inbound message is > naturally filtered in parallel. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
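The scheduling argument in the ticket can be illustrated abstractly: with a single evaluation thread (the outbound-sink situation), slow filters stall every queued message, while per-message evaluation (the inbound-sink situation) delays only the filtered messages. The toy model below is not the dtest API, just the concurrency idea:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def message_filter(msg: str) -> str:
    # A filter that simulates a network delay for matching messages.
    if msg.startswith("delayed"):
        time.sleep(0.2)
    return msg

messages = ["delayed-1", "delayed-2", "fast"]

# Outbound-style: one thread evaluates all filters in sequence, so
# the delays add up and stall the unrelated "fast" message too.
start = time.monotonic()
with ThreadPoolExecutor(max_workers=1) as pool:
    list(pool.map(message_filter, messages))
serial = time.monotonic() - start

# Inbound-style: each message is filtered on its own thread, so only
# the delayed messages wait and "fast" passes through immediately.
start = time.monotonic()
with ThreadPoolExecutor(max_workers=len(messages)) as pool:
    list(pool.map(message_filter, messages))
parallel = time.monotonic() - start
```

Here `serial` is at least the sum of both delays, while `parallel` is roughly one delay, which is the improvement moving filtering to the inbound sink buys.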
[jira] [Commented] (CASSANDRA-15573) Python 3.8 fails to execute cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076421#comment-17076421 ] Eduard Tudenhoefner commented on CASSANDRA-15573: - I added a Python3.8 compatible SaferScanner in https://github.com/apache/cassandra/pull/518. Note that the PR currently contains 2 commits from CASSANDRA-15659 because those are required for testing things with newer Python versions. > Python 3.8 fails to execute cqlsh > - > > Key: CASSANDRA-15573 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15573 > Project: Cassandra > Issue Type: Sub-task > Components: Tool/cqlsh >Reporter: Yuki Morishita >Assignee: Eduard Tudenhoefner >Priority: Normal > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Python 3.8 renamed sre_parse.Pattern to sre_parse.State (see > https://bugs.python.org/issue34681 and corresponding pull request > https://github.com/python/cpython/pull/9310) > So when executing cqlsh with Python 3.8, it throws error: > {code} > Traceback (most recent call last): > File ".\bin\cqlsh.py", line 175, in > from cqlshlib import cql3handling, cqlhandling, pylexotron, sslhandling, > cqlshhandling > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cql3handling.py", line 19, > in > from cqlshlib.cqlhandling import CqlParsingRuleSet, Hint > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cqlhandling.py", line 23, > in > from cqlshlib import pylexotron, util > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 342, > in > class ParsingRuleSet: > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 343, > in ParsingRuleSet > RuleSpecScanner = SaferScanner([ > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\saferscanner.py", line 74, > in __init__ > s = re.sre_parse.Pattern() > AttributeError: module 'sre_parse' has no attribute 
'Pattern' > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15573) Python 3.8 fails to execute cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated CASSANDRA-15573: --- Labels: pull-request-available (was: ) > Python 3.8 fails to execute cqlsh > - > > Key: CASSANDRA-15573 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15573 > Project: Cassandra > Issue Type: Sub-task > Components: Tool/cqlsh >Reporter: Yuki Morishita >Assignee: Eduard Tudenhoefner >Priority: Normal > Labels: pull-request-available > > Python 3.8 renamed sre_parse.Pattern to sre_parse.State (see > https://bugs.python.org/issue34681 and corresponding pull request > https://github.com/python/cpython/pull/9310) > So when executing cqlsh with Python 3.8, it throws error: > {code} > Traceback (most recent call last): > File ".\bin\cqlsh.py", line 175, in > from cqlshlib import cql3handling, cqlhandling, pylexotron, sslhandling, > cqlshhandling > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cql3handling.py", line 19, > in > from cqlshlib.cqlhandling import CqlParsingRuleSet, Hint > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\cqlhandling.py", line 23, > in > from cqlshlib import pylexotron, util > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 342, > in > class ParsingRuleSet: > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\pylexotron.py", line 343, > in ParsingRuleSet > RuleSpecScanner = SaferScanner([ > File "C:\Users\Yuki > Morishita\Projects\cassandra\bin\..\pylib\cqlshlib\saferscanner.py", line 74, > in __init__ > s = re.sre_parse.Pattern() > AttributeError: module 'sre_parse' has no attribute 'Pattern' > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15657) Improve zero-copy-streaming containment check by using file sections
[ https://issues.apache.org/jira/browse/CASSANDRA-15657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076413#comment-17076413 ] Marcus Eriksson commented on CASSANDRA-15657: - The approach we discussed was trying to use sstable first + last tokens to figure out if the ranges covered the whole sstable. From a quick look, this seems to sum up the total size of the sections to be transferred and check if that size matches the whole sstable size; seems like a good idea to me > Improve zero-copy-streaming containment check by using file sections > > > Key: CASSANDRA-15657 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15657 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Streaming and Messaging >Reporter: ZhaoYang >Assignee: ZhaoYang >Priority: Normal > Fix For: 4.0 > > > Currently zero copy streaming is only enabled for leveled-compaction strategy > and it checks whether all keys in the sstables are included in the transferred > ranges. > This is very inefficient. The containment check can be improved by checking > whether the transferred sections (the transferred file positions) cover the entire sstable. > I also enabled ZCS for all compaction strategies since the new containment > check is very fast. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
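The proposed check is cheap because it compares byte counts instead of iterating keys: if the summed size of the transferred sections equals the sstable's on-disk size, the ranges cover the whole file. A minimal sketch of that containment test (hypothetical function, not the Cassandra API):

```python
def covers_entire_sstable(sections, sstable_size):
    """sections: iterable of (start, end) file positions to transfer.

    The sections cover the whole sstable exactly when the number of
    bytes they span equals the file's total on-disk size, so zero-copy
    streaming can send the file as-is.
    """
    transferred = sum(end - start for start, end in sections)
    return transferred == sstable_size
```

For example, `covers_entire_sstable([(0, 100), (100, 250)], 250)` returns `True`, while dropping either section makes it `False`.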
[jira] [Updated] (CASSANDRA-15694) Statistics upon streaming of entire SSTables in Netstats is wrong
[ https://issues.apache.org/jira/browse/CASSANDRA-15694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-15694: -- Description: There is a bug in the current code (trunk on 6th April 2020): if we are streaming entire SSTables via CassandraEntireSSTableStreamWriter and CassandraOutgoingFile respectively, there is no update for the particular components of an SSTable, as only the "db" file is counted. That introduces this bug: {code:java} Mode: NORMAL Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736 /127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 files, 27664559 bytes total /tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3... {code} Basically, the number of files to be sent is lower than the number of files sent. The straightforward fix here is to distinguish when we are streaming entire sstables and in that case include all manifest files in the computation. This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 because the resolution of whether we stream entirely or not is obtained from a method which is performance sensitive and computed every time. Once CASSANDRA-15657 (hence CASSANDRA-14586) is done, this ticket can be worked on. branch with fix is here: [https://github.com/smiklosovic/cassandra/tree/CASSANDRA-15694] was: There is a bug in the current code: if we are streaming entire SSTables via CassandraEntireSSTableStreamWriter and CassandraOutgoingFile respectively, there is no update for the particular components of an SSTable, as only the "db" file is counted. That introduces this bug: {code:java} Mode: NORMAL Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736 /127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 files, 27664559 bytes total /tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3... {code} Basically, the number of files to be sent is lower than the number of files sent. 
The straightforward fix here is to distinguish when we are streaming entire sstables and in that case include all manifest files in the computation. This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 because the resolution of whether we stream entirely or not is obtained from a method which is performance sensitive and computed every time. Once CASSANDRA-15657 (hence CASSANDRA-14586) is done, this ticket can be worked on. branch with fix is here: [https://github.com/smiklosovic/cassandra/tree/CASSANDRA-15694] > Statistics upon streaming of entire SSTables in Netstats is wrong > - > > Key: CASSANDRA-15694 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15694 > Project: Cassandra > Issue Type: Bug > Components: Tool/nodetool >Reporter: Stefan Miklosovic >Priority: Normal > > There is a bug in the current code (trunk on 6th April 2020): if we are > streaming entire SSTables via CassandraEntireSSTableStreamWriter and > CassandraOutgoingFile respectively, there is no update for the particular > components of an SSTable, as only the "db" file is counted. That > introduces this bug: > > {code:java} > Mode: NORMAL > Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736 > /127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 > files, 27664559 bytes total > > /tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3... > > {code} > Basically, the number of files to be sent is lower than the number of files sent. > > The straightforward fix here is to distinguish when we are streaming entire > sstables and in that case include all manifest files in the computation. > > This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 > because the resolution of whether we stream entirely or not is obtained from a method > which is performance sensitive and computed every time. Once CASSANDRA-15657 > (hence CASSANDRA-14586) is done, this ticket can be worked on. 
> > branch with fix is here: > [https://github.com/smiklosovic/cassandra/tree/CASSANDRA-15694] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
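The accounting mismatch above can be modelled in a few lines of Python (a hypothetical sketch, not Cassandra's actual classes; the component list is illustrative):

```python
# Hypothetical model of the netstats accounting bug: the planned total counts
# only the Data.db component per SSTable, while entire-sstable streaming
# actually transfers every component, so "already sent" overtakes the plan.
COMPONENTS = ["Data.db", "Index.db", "Summary.db", "Filter.db",
              "CompressionInfo.db", "Statistics.db", "Digest.crc32", "TOC.txt"]

def planned_files(sstables, entire_sstable_streaming):
    # Buggy behaviour: one file (the "db" file) per sstable, regardless of mode.
    return len(sstables)

def sent_files(sstables, entire_sstable_streaming):
    # What is actually transferred: all components when streaming entire sstables.
    per_sstable = len(COMPONENTS) if entire_sstable_streaming else 1
    return len(sstables) * per_sstable

def planned_files_fixed(sstables, entire_sstable_streaming):
    # The straightforward fix: plan with all component files included.
    per_sstable = len(COMPONENTS) if entire_sstable_streaming else 1
    return len(sstables) * per_sstable

sstables = ["cf-1", "cf-2", "cf-3"]
print(planned_files(sstables, True), sent_files(sstables, True))        # 3 24
print(planned_files_fixed(sstables, True), sent_files(sstables, True))  # 24 24
```

With the buggy plan, 24 files are sent against a plan of 3, which is exactly the "Sending 19 files ... Already sent 133 files" inversion shown in the netstats output.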
[jira] [Commented] (CASSANDRA-15682) Missing commas between endpoints in nodetool describering
[ https://issues.apache.org/jira/browse/CASSANDRA-15682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076401#comment-17076401 ] Stefan Miklosovic commented on CASSANDRA-15682: --- PR with the fix is here: [https://github.com/apache/cassandra/pull/517] > Missing commas between endpoints in nodetool describering > - > > Key: CASSANDRA-15682 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15682 > Project: Cassandra > Issue Type: Bug > Components: Tool/nodetool >Reporter: Aleksandr Sorokoumov >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.0 > > > *Setup* > 3-node cluster created with ccm > {noformat} > cqlsh> create keyspace ks with replication = {'class': 'SimpleStrategy', > 'replication_factor': 2}; > {noformat} > *trunk*: > {noformat} > $ bin/nodetool describering ks --port 7100 > Schema Version:295e8142-fc9f-3f76-b6db-24430ff572e5 > TokenRange: > TokenRange(start_token:-9223372036854775808, > end_token:-3074457345618258603endpoints:[127.0.0.2, > 127.0.0.3]rpc_endpoints:[127.0.0.2, > 127.0.0.3]endpoint_details:[EndpointDetails(host:127.0.0.2, > datacenter:datacenter1, rack:rack1), EndpointDetails(host:127.0.0.3, > datacenter:datacenter1, rack:rack1)]) > TokenRange(start_token:-3074457345618258603, > end_token:3074457345618258602endpoints:[127.0.0.3, > 127.0.0.1]rpc_endpoints:[127.0.0.3, > /127.0.0.1]endpoint_details:[EndpointDetails(host:127.0.0.3, > datacenter:datacenter1, rack:rack1), EndpointDetails(host:127.0.0.1, > datacenter:datacenter1, rack:rack1)]) > TokenRange(start_token:3074457345618258602, > end_token:-9223372036854775808endpoints:[127.0.0.1, > 127.0.0.2]rpc_endpoints:[/127.0.0.1, > 127.0.0.2]endpoint_details:[EndpointDetails(host:127.0.0.1, > datacenter:datacenter1, rack:rack1), EndpointDetails(host:127.0.0.2, > datacenter:datacenter1, rack:rack1)]) > {noformat} > *3.11* (correct output) > {noformat} > bin/nodetool describering ks --port 7100 > Schema Version:c8fd35ea-6f49-3e77-85e7-a92e79df8696 > 
TokenRange: > TokenRange(start_token:-9223372036854775808, > end_token:-3074457345618258603, endpoints:[127.0.0.2, 127.0.0.3], > rpc_endpoints:[127.0.0.2, 127.0.0.3], > endpoint_details:[EndpointDetails(host:127.0.0.2, datacenter:datacenter1, > rack:rack1), EndpointDetails(host:127.0.0.3, > datacenter:datacenter1, rack:rack1)]) > TokenRange(start_token:-3074457345618258603, > end_token:3074457345618258602, endpoints:[127.0.0.3, 127.0.0.1], > rpc_endpoints:[127.0.0.3, 127.0.0.1], > endpoint_details:[EndpointDetails(host:127.0.0.3, datacenter:datacenter1, > rack:rack1), EndpointDetails(host:127.0.0.1, > datacenter:datacenter1, rack:rack1)]) > TokenRange(start_token:3074457345618258602, > end_token:-9223372036854775808, endpoints:[127.0.0.1, 127.0.0.2], > rpc_endpoints:[127.0.0.1, 127.0.0.2], > endpoint_details:[EndpointDetails(host:127.0.0.1, datacenter:datacenter1, > rack:rack1), EndpointDetails(host:127.0.0.2, > datacenter:datacenter1, rack:rack1)]) > {noformat}
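The missing separators come down to how the fields are joined when the TokenRange is rendered. A minimal Python sketch of the difference (field names mirror the output above; this is not Cassandra's actual TokenRange code):

```python
# Sketch of the formatting difference: the broken output concatenates the
# fields directly, while the correct output separates them with ", ".
fields = [
    ("start_token", "-9223372036854775808"),
    ("end_token", "-3074457345618258603"),
    ("endpoints", "[127.0.0.2, 127.0.0.3]"),
]

broken = "TokenRange(" + "".join(f"{k}:{v}" for k, v in fields) + ")"
fixed = "TokenRange(" + ", ".join(f"{k}:{v}" for k, v in fields) + ")"

print(broken)  # ...-3074457345618258603endpoints:[...]  (fields run together)
print(fixed)   # ...-3074457345618258603, endpoints:[...] (fields separated)
```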
[jira] [Commented] (CASSANDRA-15686) Improvements in circle CI default config
[ https://issues.apache.org/jira/browse/CASSANDRA-15686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076386#comment-17076386 ] Kevin Gallardo commented on CASSANDRA-15686: bq. I was under the impression the larger instances were not in the free tier; Correct, they are not. I was told that if you have a build that requests > medium on the free tier, CircleCI would downgrade it to medium automatically, but I have not confirmed that and need to double check. I have seen [other OSS projects|https://github.com/envoyproxy/envoy/blob/master/.circleci/config.yml#L9] using {{xlarge}} resources in their default build config. The problem is that the tests do not run well at all on medium: contributors see the python dtest build available, launch it, and then get confused by the resulting 200-300+ failures. If these don't build successfully at all by default, I wonder whether it is worth keeping them in the "lowres" build at all, given the confusion they cause. bq. Sorry I don't follow. [...] So the only file we should be maintaining is config-2_1.yml. So as we discussed offline, instead of the current default config, we could use the generated version of {{config-2_1.yml}} by default. This would cause less confusion for contributors too; we could also keep a HIGHRES version, as discussed, so that it doesn't change people's workflows. 
> Improvements in circle CI default config > > > Key: CASSANDRA-15686 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15686 > Project: Cassandra > Issue Type: Bug > Components: Build >Reporter: Kevin Gallardo >Priority: Normal > > I have been looking at and playing around with the [default CircleCI > config|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml]; > a few comments/questions regarding the following topics: > * Python dtests do not run successfully (200-300 failures) on {{medium}} > instances; they seem to run with only small flaky failures on {{large}} > instances or higher > * Python upgrade tests: > ** Do not seem to run without many failures on any instance type / any > parallelism setting > ** Do not seem to parallelize well; it seems each container is going to > download multiple C* versions > ** Additionally, it seems the configuration is not up to date, as currently > we get errors because {{JAVA8_HOME}} is not set > * Unit tests do not seem to parallelize optimally: the number of test runners does > not reflect the available CPUs on the container. Ideally, if # of runners == # > of CPUs, build time improves on any type of instance. > ** For instance, with the current configuration running on medium > instances, the build will use 1 junit test runner, but 2 CPUs are available. If > using 2 runners, the build time is reduced from 19min (at the current main > config of parallelism=4) to 12min. > * There are some typos in the file: some dtests say "Run Unit Tests" but > they are JVM dtests (see > [here|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml#L1077], > > [here|https://github.com/apache/cassandra/blob/trunk/.circleci/config.yml#L1386]) > So some ways to process these would be: > * Do the Python dtests run successfully for anyone on {{medium}} instances? > If not, would it make sense to bump them to {{large}} so that they can be run > successfully? 
> * Does anybody ever run the python upgrade tests on CircleCI, and what is the > configuration that makes them work? > * Would it make sense to either hardcode the number of test runners in the > unit tests with {{-Dtest.runners}} in the config file to reflect the number of > CPUs on the instances, or change the build so that it is able to detect the > appropriate number of available cores automatically? > Additionally, it seems this default config file (config.yml) is not as well > maintained as the > [{{config-2_1.yml}}|https://github.com/apache/cassandra/blob/trunk/.circleci/config-2_1.yml] > (+ its lowres/highres versions) in the same folder (from CASSANDRA-14806). > What is the reasoning for maintaining these 2 versions of the build? Could > the better-maintained version be used as the default? We could generate a > lowres version of the new config-2_1.yml and rename it {{config.yml}} so > that it gets picked up by CircleCI automatically instead of the current > default.
[jira] [Commented] (CASSANDRA-15642) Inconsistent failure messages on distributed queries
[ https://issues.apache.org/jira/browse/CASSANDRA-15642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076374#comment-17076374 ] Kevin Gallardo commented on CASSANDRA-15642: bq. What is your definition of complete/reliable? To me it would mean that the user gets a complete view of the states of the other requests necessary to complete a request at a certain CL. Instead, in the current state of things: * say CL=3 * the first response that comes back is a failure * the 2 other responses later come back successful * the error message may say "1 failure, 0 successful responses" The "0 successful responses" cannot be trusted because some successful responses actually did come back, just after the failure. And the "1 failure" cannot be trusted either, because there may have been more failures that would not be reported due to the current fail-fast behavior. As far as I understand, the alternate solution you mention, saving the state first, doesn't provide a complete view of the situation either. 
The state is saved and things are less inconsistent, but the errors returned to the user may still be misleading, as explained above. > Inconsistent failure messages on distributed queries > > > Key: CASSANDRA-15642 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15642 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Coordination >Reporter: Kevin Gallardo >Priority: Normal > > As a follow-up to some exploration I have done for CASSANDRA-15543, I > realized the following behavior in both {{ReadCallback}} and > {{AbstractWriteHandler}}: > - await responses > - when the required number of responses has come back: unblock the wait > - when a single failure happens: unblock the wait > - when unblocked, look to see if the counter of failures is > 1 and if so > return an error message based on the {{failures}} map that has been filled > Error messages that can result from this behavior are ReadTimeout, > ReadFailure, WriteTimeout or WriteFailure. > In case of a Write/ReadFailure, the user will get back an error looking like > the following: > "Failure: Received X responses, and Y failures" > (if the behavior I describe is incorrect, please correct me) > This causes a usability problem. Since the handler will fail and throw an > exception as soon as 1 failure happens, the error message that is returned to > the user may not be accurate. > (note: I am not entirely sure of the behavior in case of timeouts for now) > For example, for a request at CL = QUORUM = 3, a failed request may complete > first, then a successful one completes, and another fails. If the exception > is thrown fast enough, the error message could say > "Failure: Received 0 response, and 1 failure at CL = 3" > Which: > 1. doesn't make a lot of sense, because the CL doesn't match the number of > results in the message, so you end up thinking "what happened with the rest > of the required CL?" > 2. the information is incorrect. 
We did receive a successful response, only > it came after the initial failure. > From that logic, I think it is safe to assume that the information returned > in the error message cannot be trusted in case of a failure. The only information > users should extract from it is that at least 1 node has failed. > For a big improvement in usability, {{ReadCallback}} and > {{AbstractWriteResponseHandler}} could instead wait for all responses to come > back before unblocking the wait, or let it time out. This way, users > will be able to have some trust in the information returned to them. > Additionally, an error that happens first prevents a timeout from happening > because it fails immediately, and so potentially hides problems with other > replicas. If we were to wait for all responses, we might get a timeout; in > that case we'd also be able to tell whether failures happened *before* > that timeout, and have a more complete diagnostic, whereas today you can't detect both > errors at the same time.
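The fail-fast accounting described above can be contrasted with a wait-for-all variant in a small sketch (hypothetical handler logic in Python, not Cassandra's actual ReadCallback/AbstractWriteResponseHandler):

```python
# Sketch of why fail-fast error counts mislead: the handler unblocks on the
# first failure, so responses arriving afterwards never make it into the
# error message, while a wait-for-all handler counts everything.
def fail_fast_report(events, required):
    received = 0
    for kind in events:            # events arrive in order
        if kind == "ok":
            received += 1
            if received >= required:
                return "success"
        else:
            # unblocks immediately: remaining events are never counted
            return f"Received {received} responses, and 1 failures"
    return "timeout"

def wait_for_all_report(events, required):
    # waits for every reply (or a timeout) before reporting
    received = events.count("ok")
    failures = events.count("fail")
    if received >= required:
        return "success"
    return f"Received {received} responses, and {failures} failures"

events = ["fail", "ok", "ok"]          # CL = 3: the failure arrives first
print(fail_fast_report(events, 3))     # Received 0 responses, and 1 failures
print(wait_for_all_report(events, 3))  # Received 2 responses, and 1 failures
```

Both handlers correctly fail the request (2 < 3), but only the wait-for-all variant reports counts the user can trust.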
[jira] [Commented] (CASSANDRA-15657) Improve zero-copy-streaming containment check by using file sections
[ https://issues.apache.org/jira/browse/CASSANDRA-15657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076366#comment-17076366 ] T Jake Luciani commented on CASSANDRA-15657: [~aleksey] Can you comment here? We are not sure we understand the potential problem here. > Improve zero-copy-streaming containment check by using file sections > > > Key: CASSANDRA-15657 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15657 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Streaming and Messaging >Reporter: ZhaoYang >Assignee: ZhaoYang >Priority: Normal > Fix For: 4.0 > > > Currently zero-copy streaming is only enabled for leveled compaction strategy, > and it checks that all keys in the sstables are included in the transferred > ranges. > This is very inefficient. The containment check can be improved by checking > whether the transferred sections (the transferred file positions) cover the entire sstable. > I also enabled ZCS for all compaction strategies, since the new containment > check is very fast.
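Assuming the proposed check works by comparing the transferred file positions against the on-disk file size, the idea can be sketched as follows (the function and its exact semantics are an assumption for illustration, not the actual patch):

```python
# Sketch of a sections-based containment check: instead of verifying that
# every key falls inside the transferred ranges, check whether the
# transferred file sections (start, end) cover the whole file.
def covers_entire_sstable(sections, file_length):
    # Sections are assumed sorted and non-overlapping, so summing their
    # lengths is enough; full coverage means the sums match the file size.
    covered = sum(end - start for start, end in sections)
    return covered == file_length

print(covers_entire_sstable([(0, 4096), (4096, 10000)], 10000))  # True
print(covers_entire_sstable([(0, 4096)], 10000))                 # False
```

A check like this is O(number of sections) rather than O(number of keys), which is why it could be enabled for all compaction strategies.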
[jira] [Updated] (CASSANDRA-15659) Better support of Python 3 for cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-15659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eduard Tudenhoefner updated CASSANDRA-15659: Description: h2. From mailing list: [https://lists.apache.org/thread.html/r377099b632c62b641e4feef5b738084fc5369b0c7157fae867853597%40%3Cdev.cassandra.apache.org%3E] As of today (24/3/2020) and current trunk, Python 3.6 is supported (1), but no 3.6 version is available out of the box in Debian, for example. Buster ships Python 3.7, and other (recent) releases ship version 2.7. This means that if one wants to use Python 3 in Debian, they have to use 3.6, but it is not in the repository, so they have to download / compile / install it on their own. Some sane Python 3 version that is also present in the Debian repositories should be supported (or the requirement to run with 3.6 should be relaxed). (1) [https://github.com/apache/cassandra/blob/bf9a1d487b9ba469e8d740cf7d1cd419535a7e79/bin/cqlsh#L57-L65] h2. Summary of work that was done: I relaxed the requirement of *cqlsh* only working with Python 2.7 & 3.6 by allowing Python 3.6+. Note that I left the constraint of Python 3.6 being the minimum Python 3 version. As [~ptbannister] pointed out, we could remove the Python 3.6 minimum version once we remove Python 2.7 support, as otherwise testing with lots of different Python versions will get costly. 2 Dockerfiles were added in *pylib* for minimal local testing of *cqlsh* starting up with Python 3.7 & 3.8, which revealed both CASSANDRA-15572 and CASSANDRA-15573. CASSANDRA-15572 was fixed here, as it was a one-liner, and I'm going to tackle CASSANDRA-15573 later. Python 3.8 testing was added to the CircleCI config so that we can actually see what else breaks with newer Python versions. A new Docker image with Ubuntu 19.10 was required (https://github.com/apache/cassandra-builds/pull/17). 
This docker image sets up Python 2.7/3.6/3.7/3.8 with their respective virtual environments, which are then used by the CircleCI yaml. Unfortunately, the image *spod/cassandra-testing-ubuntu1810-java11-w-dependencies:20190306* couldn't be updated because it can't be built anymore, due to Ubuntu 18.10 being EOL. was: From mailing list: [https://lists.apache.org/thread.html/r377099b632c62b641e4feef5b738084fc5369b0c7157fae867853597%40%3Cdev.cassandra.apache.org%3E] As of today (24/3/2020) and current trunk, Python 3.6 is supported (1), but no 3.6 version is available out of the box in Debian, for example. Buster ships Python 3.7, and other (recent) releases ship version 2.7. This means that if one wants to use Python 3 in Debian, they have to use 3.6, but it is not in the repository, so they have to download / compile / install it on their own. Some sane Python 3 version that is also present in the Debian repositories should be supported (or the requirement to run with 3.6 should be relaxed). (1) [https://github.com/apache/cassandra/blob/bf9a1d487b9ba469e8d740cf7d1cd419535a7e79/bin/cqlsh#L57-L65] > Better support of Python 3 for cqlsh > > > Key: CASSANDRA-15659 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15659 > Project: Cassandra > Issue Type: Task > Components: Tool/cqlsh >Reporter: Stefan Miklosovic >Assignee: Eduard Tudenhoefner >Priority: Normal > Labels: pull-request-available > Fix For: 4.0-alpha > > Time Spent: 40m > Remaining Estimate: 0h > > h2. From mailing list: > [https://lists.apache.org/thread.html/r377099b632c62b641e4feef5b738084fc5369b0c7157fae867853597%40%3Cdev.cassandra.apache.org%3E] > > As of today (24/3/2020) and current trunk, Python 3.6 is supported (1), > but no 3.6 version is available out of the box in Debian, for example. Buster ships > Python 3.7, and other (recent) releases ship version 2.7. 
This means that if > one wants to use Python 3 in Debian, they have to use 3.6, but it is not in the > repository, so they have to download / compile / install it on their own. > Some sane Python 3 version that is also present in the Debian repositories > should be supported (or the requirement to run with 3.6 should be relaxed). > (1) > [https://github.com/apache/cassandra/blob/bf9a1d487b9ba469e8d740cf7d1cd419535a7e79/bin/cqlsh#L57-L65] > h2. Summary of work that was done: > I relaxed the requirement of *cqlsh* only working with Python 2.7 & 3.6 by > allowing Python 3.6+. > Note that I left the constraint of Python 3.6 being the minimum Python 3 > version. > As [~ptbannister] pointed out, we could remove the Python 3.6 minimum version > once we remove Python 2.7 support, as otherwise testing with lots of > different Python versions will get costly. > 2 Dockerfiles were added in *pylib* for minimal local testing of *cqlsh* > starting up with Python 3.7 & 3.8, which revealed
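The relaxed version gate described in the summary, accepting 2.7 or any Python 3.6+ instead of exactly 2.7/3.6, can be sketched like this (illustrative only, not the exact check in bin/cqlsh):

```python
# Sketch of the relaxed interpreter check: accept Python 2.7 or any 3.6+,
# instead of requiring exactly 2.7 or 3.6.
import sys

def supported(version_info):
    # Tuple comparison: (3, 7) >= (3, 6) is True, (3, 5) >= (3, 6) is False.
    return version_info[:2] == (2, 7) or version_info[:2] >= (3, 6)

print(supported((3, 7, 4)))   # True: Debian Buster's Python 3.7 now works
print(supported((3, 5, 9)))   # False: below the 3.6 minimum
print(supported((2, 7, 16)))  # True: Python 2.7 is still accepted
print(supported(sys.version_info))
```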
[jira] [Commented] (CASSANDRA-15229) BufferPool Regression
[ https://issues.apache.org/jira/browse/CASSANDRA-15229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076347#comment-17076347 ] Stefania Alborghetti commented on CASSANDRA-15229: -- bq. The current implementation isn't really a bump the pointer allocator? It's bitmap based, though with a very tiny bitmap. Sorry, it's been a while. Of course the current implementation is also bitmap based. The point is that it is not suitable for long-lived buffers, similarly to our bump-the-pointer strategy. The transient case is easy to solve; either approach would work. bq. Could you elaborate on how these work, as my intuition is that anything designed for a thread-per-core architecture probably won't translate so well to the present state of the world. Though, either way, I suppose this is probably orthogonal to this ticket as we only need to address the {{ChunkCache}} part. The thread-per-core architecture makes it easy to identify threads that do most of the work and cause most of the contention. However, thread identification can also be achieved with thread pools, or we can simply give all threads a local stash of buffers, provided that it is returned when the thread dies. I don't think there is any other dependency on TPC beyond this. The design choice was mostly dictated by the size of the cache: with AIO reads the OS page cache is bypassed, so the chunk cache needs to be very large; that is not the case if we use Java NIO reads or if we eventually implement asynchronous reads with the new io_uring API, bypassing AIO completely (which I do recommend). bq. We also optimized the chunk cache to store memory addresses rather than byte buffers, which significantly reduced heap usage. The byte buffers are materialized on the fly. bq. This would be a huge improvement, and a welcome backport if it is easy - though it might (I would guess) depend on Unsafe, which may be going away soon. 
It's orthogonal to this ticket, though, I think. Yes, it's based on the Unsafe: the addresses come from the slabs, and then we use the Unsafe to create hollow buffers and to set the address. This is an optimization and it clearly belongs in a separate ticket. {quote} We changed the chunk cache to always store buffers of the same size. We have global lists of these slabs, sorted by buffer size where each size is a power-of-two. How do these two statements reconcile? {quote} So let's assume the current workload is mostly on a table with 4k chunks, which translate to 4k buffers in the cache. Let's also assume that the workload is shifting towards another table with 8k chunks. Alternatively, let's assume compression is ON and an ALTER TABLE changes the chunk size. Now the chunk cache is slowly evicting 4k buffers and retaining 8k buffers. These buffers come from two different lists: the list of slabs serving 4k and the list serving 8k. Even if we collect all unused 4k slabs, until a slab has every single buffer returned there will be wasted memory, and we do not control how long that will take. To be fair, it's an extreme case, and we were perhaps over-cautious in addressing this possibility by fixing the size of buffers in the cache. So it's possible that the redesigned buffer pool may work even with the current chunk cache implementation. bq. Is it your opinion that your entire ChunkCache implementation can be dropped wholesale into 4.0? I would assume it is still primarily multi-threaded. If so, it might be preferable to trying to fix the existing ChunkCache The changes to the chunk cache are not trivial and should be left as a follow-up for 4.x or later, in my opinion. The changes to the buffer pool can be dropped into 4.0 if you think that: - they are safe even in the presence of the case described above. 
- they are justified: memory wasted due to fragmentation is perhaps not an issue with a cache as small as 512 MB. I'll try to share some code so you can have a clearer picture. > BufferPool Regression > - > > Key: CASSANDRA-15229 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15229 > Project: Cassandra > Issue Type: Bug > Components: Local/Caching >Reporter: Benedict Elliott Smith >Assignee: ZhaoYang >Priority: Normal > Fix For: 4.0, 4.0-beta > > > The BufferPool was never intended to be used for a {{ChunkCache}}, and we > need to either change our behaviour to handle uncorrelated lifetimes or use > something else. This is particularly important with the default chunk size > for compressed sstables being reduced. If we address the problem, we should > also utilise the BufferPool for native transport connections like we do for > internode messaging, and reduce the number of pooling solutions we employ. > Probably the best thing to do
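The power-of-two size classes mentioned in the quote can be sketched as follows (an assumed rounding scheme for illustration, not the actual buffer pool code):

```python
# Sketch of power-of-two size classes: each slab list serves buffers of one
# power-of-two size, and a request is rounded up to the smallest class that
# fits. A 4k chunk and a 5000-byte chunk therefore draw from different lists.
def size_class(requested):
    # Smallest power of two >= requested.
    size = 1
    while size < requested:
        size <<= 1
    return size

print(size_class(4096))  # 4096: a 4k chunk maps to the 4k slab list
print(size_class(5000))  # 8192: rounded up to the next power of two
```

This is also why the 4k-to-8k workload shift in the comment matters: the two chunk sizes land in different slab lists, so a shifting workload can leave partially used 4k slabs stranded until every buffer in them is returned.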
[jira] [Updated] (CASSANDRA-15684) CASSANDRA-15650 was merged after dtest refactor and modified classes no longer in the project
[ https://issues.apache.org/jira/browse/CASSANDRA-15684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-15684: --- Reviewers: Alex Petrov, Benjamin Lerer > CASSANDRA-15650 was merged after dtest refactor and modified classes no > longer in the project > - > > Key: CASSANDRA-15684 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15684 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > CASSANDRA-15650 was based off commits before CASSANDRA-15539 which removed > some of the files modified in CASSANDRA-15650. The tests were passing > pre-merge but off earlier commits. On commit they started failing since the > dtest API no longer match so produces the following exception > {code} > [junit-timeout] > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts; > [junit-timeout] java.lang.NoSuchMethodError: > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts; > [junit-timeout] at > org.apache.cassandra.distributed.test.RepairCoordinatorFast.lambda$unknownHost$5(RepairCoordinatorFast.java:216) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67) > [junit-timeout] at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > [junit-timeout] at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > [junit-timeout] at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > [junit-timeout] at > 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > [junit-timeout] at java.lang.Thread.run(Thread.java:748) > {code} > The root cause was that 4 files existed which should have been deleted in > CASSANDRA-15539. Since they were not deleted, CASSANDRA-15650's modification of one of > them didn't cause a merge conflict, but when the tests run, the stale classes conflict and the tests fail. > [Circle > CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15684]
[jira] [Updated] (CASSANDRA-15689) CASSANDRA-15650 broke CasWriteTest causing it to fail and hang
[ https://issues.apache.org/jira/browse/CASSANDRA-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-15689: --- Fix Version/s: 4.0-alpha Since Version: 4.0-alpha Source Control Link: https://github.com/apache/cassandra/commit/c133385986db9fb1333b37739528f66ad45de916 Resolution: Fixed Status: Resolved (was: Ready to Commit) Committed into trunk at c133385986db9fb1333b37739528f66ad45de916 > CASSANDRA-15650 broke CasWriteTest causing it to fail and hang > -- > > Key: CASSANDRA-15689 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15689 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Labels: pull-request-available > Fix For: 4.0-alpha > > Time Spent: 10m > Remaining Estimate: 0h > > CASSANDRA-15650 changed IsolatedExecutor to rethrow runtime exceptions rather > than wrap them; CasWriteTest assumes they are wrapped, which causes tests to fail and > casWriteContentionTimeoutTest to time out. > [Circle > CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15689]
[jira] [Updated] (CASSANDRA-15689) CASSANDRA-15650 broke CasWriteTest causing it to fail and hang
[ https://issues.apache.org/jira/browse/CASSANDRA-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-15689: --- Status: Ready to Commit (was: Review In Progress) > CASSANDRA-15650 broke CasWriteTest causing it to fail and hang > -- > > Key: CASSANDRA-15689 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15689 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > CASSANDRA-15650 changed IsolatedExecutor to rethrow runtime exceptions rather > than wrap them; CasWriteTest assumes they are wrapped, which causes tests to fail > and casWriteContentionTimeoutTest to time out. > [Circle > CI|https://circleci.com/gh/dcapwell/cassandra/tree/bug%2FCASSANDRA-15689]
[cassandra] branch trunk updated: Fix CasWriterTest after CASSANDRA-15689
This is an automated email from the ASF dual-hosted git repository. blerer pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git The following commit(s) were added to refs/heads/trunk by this push: new c133385 Fix CasWriterTest after CASSANDRA-15689 c133385 is described below commit c133385986db9fb1333b37739528f66ad45de916 Author: David Capwell AuthorDate: Fri Apr 3 10:56:20 2020 -0700 Fix CasWriterTest after CASSANDRA-15689 patch by David Capwell; reviewed by Benjamin Lerer for CASSANDRA-15689 --- .../cassandra/distributed/test/CasWriteTest.java | 24 +++--- 1 file changed, 17 insertions(+), 7 deletions(-) diff --git a/test/distributed/org/apache/cassandra/distributed/test/CasWriteTest.java b/test/distributed/org/apache/cassandra/distributed/test/CasWriteTest.java index a5d7e72..1d886cf 100644 --- a/test/distributed/org/apache/cassandra/distributed/test/CasWriteTest.java +++ b/test/distributed/org/apache/cassandra/distributed/test/CasWriteTest.java @@ -167,8 +167,8 @@ public class CasWriteTest extends TestBaseImpl failure -> failure.get() != null && failure.get() - .getMessage() - .contains(CasWriteTimeoutException.class.getCanonicalName()), + .getClass().getCanonicalName() + .equals(CasWriteTimeoutException.class.getCanonicalName()), "Expecting cause to be CasWriteTimeoutException"); } @@ -217,8 +217,7 @@ public class CasWriteTest extends TestBaseImpl private void expectCasWriteTimeout() { -thrown.expect(RuntimeException.class); -thrown.expectCause(new BaseMatcher() +thrown.expect(new BaseMatcher() { public boolean matches(Object item) { @@ -232,7 +231,18 @@ public class CasWriteTest extends TestBaseImpl }); // unable to assert on class becuase the exception thrown was loaded by a differnet classloader, InstanceClassLoader // therefor asserts the FQCN name present in the message as a workaround - thrown.expectMessage(containsString(CasWriteTimeoutException.class.getCanonicalName())); +thrown.expect(new BaseMatcher() +{ 
+public boolean matches(Object item) +{ +return item.getClass().getCanonicalName().equals(CasWriteTimeoutException.class.getCanonicalName()); +} + +public void describeTo(Description description) +{ +description.appendText("Class was expected to be " + CasWriteTimeoutException.class.getCanonicalName() + " but was not"); +} +}); thrown.expectMessage(containsString("CAS operation timed out")); } @@ -256,8 +266,8 @@ public class CasWriteTest extends TestBaseImpl } catch (Throwable t) { -Assert.assertTrue("Expecting cause to be CasWriteUncertainException", - t.getMessage().contains(CasWriteUnknownResultException.class.getCanonicalName())); +Assert.assertEquals("Expecting cause to be CasWriteUncertainException", + CasWriteUnknownResultException.class.getCanonicalName(), t.getClass().getCanonicalName()); return; } } - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
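The workaround this commit settles on — matching the exception by its fully-qualified class name rather than by type — can be sketched in isolation (illustrative names, standalone; the real test wires this predicate into a Hamcrest `BaseMatcher`). In the in-JVM dtests the exception class is loaded by a separate classloader (`InstanceClassLoader`), so `instanceof` and `Class` equality both fail even when the class names are identical; comparing the canonical names as plain strings still works.

```java
// Sketch of name-based exception matching across classloaders: two Class
// objects from different loaders are never equal, but their fully-qualified
// names can still be compared as strings.
public class FqcnMatch {
    static boolean matchesByName(Object item, Class<?> expected) {
        return item != null
            && item.getClass().getCanonicalName().equals(expected.getCanonicalName());
    }

    public static void main(String[] args) {
        RuntimeException e = new IllegalStateException("boom");
        System.out.println(matchesByName(e, IllegalStateException.class)); // true
        System.out.println(matchesByName(e, NullPointerException.class));  // false
    }
}
```

The trade-off is that a name match is weaker than a type check (it cannot follow the class hierarchy), which is acceptable here because the test expects one exact exception type.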
[jira] [Updated] (CASSANDRA-15672) Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec
[ https://issues.apache.org/jira/browse/CASSANDRA-15672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania Alborghetti updated CASSANDRA-15672: - Since Version: 4.0-alpha Source Control Link: https://gitbox.apache.org/repos/asf?p=cassandra.git;a=commit;h=b4e640a96e76f8d4a45937b1312b64ddc1aeb8ac Resolution: Fixed Status: Resolved (was: Ready to Commit) CI link: https://jenkins-cm4.apache.org/view/patches/job/Cassandra-devbranch/16/ Committed as [b4e640a96e76f8d4a45937b1312b64ddc1aeb8ac|https://gitbox.apache.org/repos/asf?p=cassandra.git;a=commit;h=b4e640a96e76f8d4a45937b1312b64ddc1aeb8ac] > Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest > Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec > - > > Key: CASSANDRA-15672 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15672 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Aleksandr Sorokoumov >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > The following failure was observed: > Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest > Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec > [junit-timeout] > [junit-timeout] Testcase: > testMockedMessagingPrepareFailureP1(org.apache.cassandra.repair.consistent.CoordinatorMessagingTest): >FAILED > [junit-timeout] null > [junit-timeout] junit.framework.AssertionFailedError > [junit-timeout] at > org.apache.cassandra.repair.consistent.CoordinatorMessagingTest.testMockedMessagingPrepareFailure(CoordinatorMessagingTest.java:206) > [junit-timeout] at > org.apache.cassandra.repair.consistent.CoordinatorMessagingTest.testMockedMessagingPrepareFailureP1(CoordinatorMessagingTest.java:154) > [junit-timeout] > [junit-timeout] > Seen on Java8 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: 
commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15672) Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec
[ https://issues.apache.org/jira/browse/CASSANDRA-15672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania Alborghetti updated CASSANDRA-15672: - Status: Ready to Commit (was: Review In Progress) > Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest > Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec > - > > Key: CASSANDRA-15672 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15672 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Aleksandr Sorokoumov >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > The following failure was observed: > Testsuite: org.apache.cassandra.repair.consistent.CoordinatorMessagingTest > Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 7.878 sec > [junit-timeout] > [junit-timeout] Testcase: > testMockedMessagingPrepareFailureP1(org.apache.cassandra.repair.consistent.CoordinatorMessagingTest): >FAILED > [junit-timeout] null > [junit-timeout] junit.framework.AssertionFailedError > [junit-timeout] at > org.apache.cassandra.repair.consistent.CoordinatorMessagingTest.testMockedMessagingPrepareFailure(CoordinatorMessagingTest.java:206) > [junit-timeout] at > org.apache.cassandra.repair.consistent.CoordinatorMessagingTest.testMockedMessagingPrepareFailureP1(CoordinatorMessagingTest.java:154) > [junit-timeout] > [junit-timeout] > Seen on Java8 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] branch trunk updated: Fix flaky CoordinatorMessagingTest and docstring in OutboundSink and ConsistentSession
This is an automated email from the ASF dual-hosted git repository. stefania pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git The following commit(s) were added to refs/heads/trunk by this push: new b4e640a Fix flaky CoordinatorMessagingTest and docstring in OutboundSink and ConsistentSession b4e640a is described below commit b4e640a96e76f8d4a45937b1312b64ddc1aeb8ac Author: Aleksandr Sorokoumov AuthorDate: Tue Mar 31 15:53:51 2020 +0200 Fix flaky CoordinatorMessagingTest and docstring in OutboundSink and ConsistentSession patch by Aleksandr Sorokoumov; reviewed by Stefania Alborghetti for CASSANDRA-15672 --- CHANGES.txt| 1 + .../org/apache/cassandra/net/OutboundSink.java | 2 +- .../repair/consistent/ConsistentSession.java | 8 +-- .../consistent/CoordinatorMessagingTest.java | 70 +++--- 4 files changed, 53 insertions(+), 28 deletions(-) diff --git a/CHANGES.txt b/CHANGES.txt index 65111d0..fb881de 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.0-alpha4 + * Fix flaky CoordinatorMessagingTest and docstring in OutboundSink and ConsistentSession (CASSANDRA-15672) * Fix force compaction of wrapping ranges (CASSANDRA-15664) * Expose repair streaming metrics (CASSANDRA-15656) * Set now in seconds in the future for validation repairs (CASSANDRA-15655) diff --git a/src/java/org/apache/cassandra/net/OutboundSink.java b/src/java/org/apache/cassandra/net/OutboundSink.java index d19b3e2..34c72db 100644 --- a/src/java/org/apache/cassandra/net/OutboundSink.java +++ b/src/java/org/apache/cassandra/net/OutboundSink.java @@ -25,7 +25,7 @@ import org.apache.cassandra.locator.InetAddressAndPort; /** * A message sink that all outbound messages go through. 
* - * Default sink {@link Sink} used by {@link MessagingService} is MessagingService#doSend(), which proceeds to + * Default sink {@link Sink} used by {@link MessagingService} is {@link MessagingService#doSend(Message, InetAddressAndPort, ConnectionType)}, which proceeds to * send messages over the network, but it can be overridden to filter out certain messages, record the fact * of attempted delivery, or delay they delivery. * diff --git a/src/java/org/apache/cassandra/repair/consistent/ConsistentSession.java b/src/java/org/apache/cassandra/repair/consistent/ConsistentSession.java index 03de157..d9ac927 100644 --- a/src/java/org/apache/cassandra/repair/consistent/ConsistentSession.java +++ b/src/java/org/apache/cassandra/repair/consistent/ConsistentSession.java @@ -56,13 +56,13 @@ import org.apache.cassandra.tools.nodetool.RepairAdmin; * There are 4 stages to a consistent incremental repair. * * Repair prepare - * First, the normal {@link ActiveRepairService#prepareForRepair(UUID, InetAddressAndPort, Set, RepairOption, List)} stuff + * First, the normal {@link ActiveRepairService#prepareForRepair(UUID, InetAddressAndPort, Set, RepairOption, boolean, List)} stuff * happens, which sends out {@link PrepareMessage} and creates a {@link ActiveRepairService.ParentRepairSession} * on the coordinator and each of the neighbors. * * Consistent prepare * The consistent prepare step promotes the parent repair session to a consistent session, and isolates the sstables - * being repaired other sstables. First, the coordinator sends a {@link PrepareConsistentRequest} message to each repair + * being repaired from other sstables. First, the coordinator sends a {@link PrepareConsistentRequest} message to each repair * participant (including itself). 
When received, the node creates a {@link LocalSession} instance, sets it's state to * {@code PREPARING}, persists it, and begins a preparing the tables for incremental repair, which segregates the data * being repaired from the rest of the table data. When the preparation completes, the session state is set to @@ -74,7 +74,7 @@ import org.apache.cassandra.tools.nodetool.RepairAdmin; * Once the coordinator recieves positive {@code PrepareConsistentResponse} messages from all the participants, the * coordinator begins the normal repair process. * - * (see {@link CoordinatorSession#handlePrepareResponse(InetAddress, boolean)} + * (see {@link CoordinatorSession#handlePrepareResponse(InetAddressAndPort, boolean)} * * Repair * The coordinator runs the normal data repair process against the sstables segregated in the previous step. When a @@ -96,7 +96,7 @@ import org.apache.cassandra.tools.nodetool.RepairAdmin; * conflicts with in progress compactions. The sstables will be marked repaired as part of the normal compaction process. * * - * On the coordinator side, see {@link CoordinatorSession#finalizePropose()}, {@link CoordinatorSession#handleFinalizePromise(InetAddress, boolean)}, + * On the coordinator side, see {@link CoordinatorSession#finalizePropose()}, {@link
[jira] [Updated] (CASSANDRA-15369) Fake row deletions and range tombstones, causing digest mismatch and sstable growth
[ https://issues.apache.org/jira/browse/CASSANDRA-15369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-15369: Reviewers: Andres de la Peña, Marcus Eriksson (was: Andres de la Peña) > Fake row deletions and range tombstones, causing digest mismatch and sstable > growth > --- > > Key: CASSANDRA-15369 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15369 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Coordination, Local/Memtable, Local/SSTable >Reporter: Benedict Elliott Smith >Assignee: ZhaoYang >Priority: Normal > Fix For: 4.0, 3.0.x, 3.11.x > > > As assessed in CASSANDRA-15363, we generate fake row deletions and fake > tombstone markers under various circumstances: > * If we perform a clustering key query (or select a compact column): > * Serving from a {{Memtable}}, we will generate fake row deletions > * Serving from an sstable, we will generate fake row tombstone markers > * If we perform a slice query, we will generate only fake row tombstone > markers for any range tombstone that begins or ends outside of the limit of > the requested slice > * If we perform a multi-slice or IN query, this will occur for each > slice/clustering > Unfortunately, these different behaviours can lead to very different data > stored in sstables until a full repair is run. When we read-repair, we only > send these fake deletions or range tombstones. A fake row deletion, > clustering RT and slice RT, each produces a different digest. So for each > single point lookup we can produce a digest mismatch twice, and until a full > repair is run we can encounter an unlimited number of digest mismatches > across different overlapping queries. > Relatedly, this seems a more problematic variant of our atomicity failures > caused by our monotonic reads, since RTs can have an atomic effect across (up > to) the entire partition, whereas the propagation may happen on an > arbitrarily small portion. 
If the RT exists on only one node, this could > plausibly lead to a fairly problematic scenario if that node fails before the > range can be repaired. > At the very least, this behaviour can lead to an almost unlimited amount of > extraneous data being stored until the range is repaired and compaction > happens to overwrite the sub-range RTs and row deletions.
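The digest-mismatch mechanism described in this report can be modeled with a toy example (the string encodings below are made up for illustration and are not Cassandra's actual digest format): three different representations of the same logical deletion hash to three different values, so replicas holding different forms can never agree on a digest.

```java
import java.security.MessageDigest;
import java.util.Arrays;

// Toy model: a fake row deletion, a clustering range tombstone, and a slice
// range tombstone covering the same logical deletion produce three distinct
// digests, so any pair of replicas storing different forms mismatches.
public class DigestMismatch {
    static byte[] digest(String repr) throws Exception {
        return MessageDigest.getInstance("MD5").digest(repr.getBytes("UTF-8"));
    }

    public static void main(String[] args) throws Exception {
        byte[] rowDeletion  = digest("ROW_DELETE ck=5");   // fake row deletion
        byte[] clusteringRT = digest("RT [5,5]");          // clustering range tombstone
        byte[] sliceRT      = digest("RT [5,+inf)");       // slice range tombstone

        System.out.println(Arrays.equals(rowDeletion, clusteringRT)); // false
        System.out.println(Arrays.equals(clusteringRT, sliceRT));     // false
    }
}
```

This is why each overlapping query shape can trigger a fresh mismatch until a full repair converges all replicas onto one representation.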
[jira] [Updated] (CASSANDRA-15369) Fake row deletions and range tombstones, causing digest mismatch and sstable growth
[ https://issues.apache.org/jira/browse/CASSANDRA-15369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres de la Peña updated CASSANDRA-15369: -- Reviewers: Andres de la Peña > Fake row deletions and range tombstones, causing digest mismatch and sstable > growth > --- > > Key: CASSANDRA-15369 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15369 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Coordination, Local/Memtable, Local/SSTable >Reporter: Benedict Elliott Smith >Assignee: ZhaoYang >Priority: Normal > Fix For: 4.0, 3.0.x, 3.11.x > > > As assessed in CASSANDRA-15363, we generate fake row deletions and fake > tombstone markers under various circumstances: > * If we perform a clustering key query (or select a compact column): > * Serving from a {{Memtable}}, we will generate fake row deletions > * Serving from an sstable, we will generate fake row tombstone markers > * If we perform a slice query, we will generate only fake row tombstone > markers for any range tombstone that begins or ends outside of the limit of > the requested slice > * If we perform a multi-slice or IN query, this will occur for each > slice/clustering > Unfortunately, these different behaviours can lead to very different data > stored in sstables until a full repair is run. When we read-repair, we only > send these fake deletions or range tombstones. A fake row deletion, > clustering RT and slice RT, each produces a different digest. So for each > single point lookup we can produce a digest mismatch twice, and until a full > repair is run we can encounter an unlimited number of digest mismatches > across different overlapping queries. > Relatedly, this seems a more problematic variant of our atomicity failures > caused by our monotonic reads, since RTs can have an atomic effect across (up > to) the entire partition, whereas the propagation may happen on an > arbitrarily small portion. 
If the RT exists on only one node, this could > plausibly lead to a fairly problematic scenario if that node fails before the > range can be repaired. > At the very least, this behaviour can lead to an almost unlimited amount of > extraneous data being stored until the range is repaired and compaction > happens to overwrite the sub-range RTs and row deletions.
[jira] [Updated] (CASSANDRA-15695) Fix NoSuchMethodError: 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts
[ https://issues.apache.org/jira/browse/CASSANDRA-15695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eduard Tudenhoefner updated CASSANDRA-15695: Resolution: Duplicate Status: Resolved (was: Open) > Fix NoSuchMethodError: > 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > > > Key: CASSANDRA-15695 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15695 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Eduard Tudenhoefner >Assignee: Eduard Tudenhoefner >Priority: Normal > Fix For: 4.0-alpha > > > It seems that there was a regression introduced with CASSANDRA-15650 as the > first failures of that kind started to happen > [here.|https://ci-cassandra.apache.org/view/branches/job/Cassandra-trunk/30/#showFailuresLink] > {code} > [junit-timeout] Testcase: > prepareRPCTimeout[DATACENTER_AWARE/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest): > Caused an ERROR > [junit-timeout] 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])' > [junit-timeout] java.lang.NoSuchMethodError: > 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])' > [junit-timeout] at > org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67) > [junit-timeout] at > java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > [junit-timeout] at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > [junit-timeout] at > 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > [junit-timeout] at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > [junit-timeout] at java.base/java.lang.Thread.run(Thread.java:834) > {code} >
[jira] [Updated] (CASSANDRA-15695) Fix NoSuchMethodError: 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts
[ https://issues.apache.org/jira/browse/CASSANDRA-15695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eduard Tudenhoefner updated CASSANDRA-15695: Bug Category: Parent values: Code(13163) Complexity: Normal Discovered By: Unit Test Severity: Normal Status: Open (was: Triage Needed) > Fix NoSuchMethodError: > 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > > > Key: CASSANDRA-15695 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15695 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Eduard Tudenhoefner >Assignee: Eduard Tudenhoefner >Priority: Normal > Fix For: 4.0-alpha > > > It seems that there was a regression introduced with CASSANDRA-15650 as the > first failures of that kind started to happen > [here.|https://ci-cassandra.apache.org/view/branches/job/Cassandra-trunk/30/#showFailuresLink] > {code} > [junit-timeout] Testcase: > prepareRPCTimeout[DATACENTER_AWARE/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest): > Caused an ERROR > [junit-timeout] 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])' > [junit-timeout] java.lang.NoSuchMethodError: > 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])' > [junit-timeout] at > org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67) > [junit-timeout] at > java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > [junit-timeout] at > 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > [junit-timeout] at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > [junit-timeout] at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > [junit-timeout] at java.base/java.lang.Thread.run(Thread.java:834) > {code} >
[jira] [Updated] (CASSANDRA-15695) Fix NoSuchMethodError: 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts
[ https://issues.apache.org/jira/browse/CASSANDRA-15695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eduard Tudenhoefner updated CASSANDRA-15695: Since Version: 4.0-alpha > Fix NoSuchMethodError: > 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > > > Key: CASSANDRA-15695 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15695 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Eduard Tudenhoefner >Assignee: Eduard Tudenhoefner >Priority: Normal > Fix For: 4.0-alpha > > > It seems that there was a regression introduced with CASSANDRA-15650 as the > first failures of that kind started to happen > [here.|https://ci-cassandra.apache.org/view/branches/job/Cassandra-trunk/30/#showFailuresLink] > {code} > [junit-timeout] Testcase: > prepareRPCTimeout[DATACENTER_AWARE/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest): > Caused an ERROR > [junit-timeout] 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])' > [junit-timeout] java.lang.NoSuchMethodError: > 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])' > [junit-timeout] at > org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67) > [junit-timeout] at > java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > [junit-timeout] at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > [junit-timeout] at > 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > [junit-timeout] at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > [junit-timeout] at java.base/java.lang.Thread.run(Thread.java:834) > {code} >
[jira] [Updated] (CASSANDRA-15695) Fix NoSuchMethodError: 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts
[ https://issues.apache.org/jira/browse/CASSANDRA-15695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eduard Tudenhoefner updated CASSANDRA-15695: Fix Version/s: 4.0-alpha > Fix NoSuchMethodError: > 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > > > Key: CASSANDRA-15695 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15695 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Eduard Tudenhoefner >Assignee: Eduard Tudenhoefner >Priority: Normal > Fix For: 4.0-alpha > > > It seems that there was a regression introduced with CASSANDRA-15650 as the > first failures of that kind started to happen > [here.|https://ci-cassandra.apache.org/view/branches/job/Cassandra-trunk/30/#showFailuresLink] > {code} > [junit-timeout] Testcase: > prepareRPCTimeout[DATACENTER_AWARE/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest): > Caused an ERROR > [junit-timeout] 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])' > [junit-timeout] java.lang.NoSuchMethodError: > 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])' > [junit-timeout] at > org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67) > [junit-timeout] at > java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > [junit-timeout] at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > [junit-timeout] at > 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > [junit-timeout] at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > [junit-timeout] at java.base/java.lang.Thread.run(Thread.java:834) > {code} >
[jira] [Assigned] (CASSANDRA-15695) Fix NoSuchMethodError: 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts
[ https://issues.apache.org/jira/browse/CASSANDRA-15695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eduard Tudenhoefner reassigned CASSANDRA-15695: --- Assignee: Eduard Tudenhoefner > Fix NoSuchMethodError: > 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > > > Key: CASSANDRA-15695 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15695 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Eduard Tudenhoefner >Assignee: Eduard Tudenhoefner >Priority: Normal > > It seems that there was a regression introduced with CASSANDRA-15650 as the > first failures of that kind started to happen > [here.|https://ci-cassandra.apache.org/view/branches/job/Cassandra-trunk/30/#showFailuresLink] > {code} > [junit-timeout] Testcase: > prepareRPCTimeout[DATACENTER_AWARE/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest): > Caused an ERROR > [junit-timeout] 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])' > [junit-timeout] java.lang.NoSuchMethodError: > 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts > org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])' > [junit-timeout] at > org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39) > [junit-timeout] at > org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67) > [junit-timeout] at > java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > [junit-timeout] at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > [junit-timeout] at > 
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > [junit-timeout] at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > [junit-timeout] at java.base/java.lang.Thread.run(Thread.java:834) > {code} >
[jira] [Created] (CASSANDRA-15695) Fix NoSuchMethodError: 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts
Eduard Tudenhoefner created CASSANDRA-15695: --- Summary: Fix NoSuchMethodError: 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts Key: CASSANDRA-15695 URL: https://issues.apache.org/jira/browse/CASSANDRA-15695 Project: Cassandra Issue Type: Bug Components: Test/unit Reporter: Eduard Tudenhoefner It seems that there was a regression introduced with CASSANDRA-15650 as the first failures of that kind started to happen [here.|https://ci-cassandra.apache.org/view/branches/job/Cassandra-trunk/30/#showFailuresLink] {code} [junit-timeout] Testcase: prepareRPCTimeout[DATACENTER_AWARE/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest): Caused an ERROR [junit-timeout] 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])' [junit-timeout] java.lang.NoSuchMethodError: 'org.apache.cassandra.distributed.api.NodeToolResult$Asserts org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains(java.lang.String[])' [junit-timeout] at org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45) [junit-timeout] at org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39) [junit-timeout] at org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67) [junit-timeout] at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) [junit-timeout] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [junit-timeout] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [junit-timeout] at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [junit-timeout] at java.base/java.lang.Thread.run(Thread.java:834) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: 
commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
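A {{NoSuchMethodError}} like the one above is a link-time failure: the test classes were compiled against a version of {{NodeToolResult$Asserts}} whose {{errorContains}} descriptor no longer matches the class on the runtime classpath. As a hedged illustration (the {{Asserts}} class below is a stand-in, not the real dtest API), reflection performs the same exact-descriptor matching, so looking up the varargs variant against a class that only declares a single-{{String}} variant fails the same way:

```java
// Illustrative stand-in for the dtest Asserts API -- NOT the real class.
public class LinkageDemo
{
    static class Asserts
    {
        // The runtime classpath only offers the single-String variant...
        Asserts errorContains(String s) { return this; }
        // ...while callers were compiled against errorContains(String...).
    }

    public static void main(String[] args)
    {
        try
        {
            // The old caller's descriptor takes String[] (varargs).
            Asserts.class.getDeclaredMethod("errorContains", String[].class);
            System.out.println("varargs variant present");
        }
        catch (NoSuchMethodException e)
        {
            // Same mismatch that the JVM reports as NoSuchMethodError at link time.
            System.out.println("varargs variant missing");
        }
    }
}
```

The usual remedy for such regressions is rebuilding the callers against the matching API version; that is stated here as a general observation, since the ticket itself does not record the final resolution.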
[jira] [Updated] (CASSANDRA-15667) StreamResultFuture check for completeness is inconsistent, leading to races
[ https://issues.apache.org/jira/browse/CASSANDRA-15667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Massimiliano Tomassi updated CASSANDRA-15667: - Test and Documentation Plan: [https://app.circleci.com/pipelines/github/maxtomassi/cassandra?branch=15667-4.0] It seems like JVM dtests fail to run properly. Lots of logs like this: {code:java} [junit-timeout] Testcase: prepareRPCTimeout[PARALLEL/true](org.apache.cassandra.distributed.test.PreviewRepairCoordinatorTimeoutTest): Caused an ERROR [junit-timeout] org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts; [junit-timeout] java.lang.NoSuchMethodError: org.apache.cassandra.distributed.api.NodeToolResult$Asserts.errorContains([Ljava/lang/String;)Lorg/apache/cassandra/distributed/api/NodeToolResult$Asserts; [junit-timeout] at org.apache.cassandra.distributed.test.RepairCoordinatorTimeout.lambda$prepareRPCTimeout$0(RepairCoordinatorTimeout.java:45) [junit-timeout] at org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$0(AssertUtil.java:39) [junit-timeout] at org.apache.cassandra.utils.AssertUtil.lambda$assertTimeoutPreemptively$1(AssertUtil.java:67) [junit-timeout] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [junit-timeout] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [junit-timeout] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [junit-timeout] at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [junit-timeout] at java.lang.Thread.run(Thread.java:748) {code} Status: Patch Available (was: In Progress) > StreamResultFuture check for completeness is inconsistent, leading to races > --- > > Key: CASSANDRA-15667 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15667 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Streaming and Messaging 
>Reporter: Sergio Bossa >Assignee: Massimiliano Tomassi >Priority: Normal > Fix For: 4.0 > > > {{StreamResultFuture#maybeComplete()}} uses > {{StreamCoordinator#hasActiveSessions()}} to determine if all sessions are > completed, but then accesses each session's state via > {{StreamCoordinator#getAllSessionInfo()}}: this is inconsistent, as the > former relies on the actual {{StreamSession}} state, while the latter on the > {{SessionInfo}} state, and the two are concurrently updated with no > coordination whatsoever. > This leads to races, e.g. apparent as spurious failures in some dtests, such as > {{TestBootstrap.resumable_bootstrap_test}} in CASSANDRA-15614 cc > [~e.dimitrova]. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
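The race described in the ticket is a check-then-act over two views of session state that are updated independently. A minimal sketch, with illustrative names rather than Cassandra's actual {{StreamCoordinator}} internals: the completeness check consults one map, while the reported outcome is derived from another that may lag behind.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch of the maybeComplete() inconsistency -- not Cassandra's code.
public class CompletenessRace
{
    enum State { STREAMING, COMPLETE }

    // Two views of the same sessions, updated with no coordination.
    static final Map<String, State> sessionState = new ConcurrentHashMap<>();
    static final Map<String, State> sessionInfo = new ConcurrentHashMap<>();

    static boolean maybeComplete()
    {
        // The "are we done?" check reads one view...
        boolean noActive = sessionState.values().stream().noneMatch(s -> s == State.STREAMING);
        // ...but the outcome is derived from the other, which may be stale.
        return noActive && sessionInfo.values().stream().allMatch(s -> s == State.COMPLETE);
    }

    public static void main(String[] args)
    {
        sessionState.put("s1", State.COMPLETE); // the session has finished...
        sessionInfo.put("s1", State.STREAMING); // ...but its info view lags behind
        System.out.println(maybeComplete());    // prints false: the two views disagree
    }
}
```

A consistent fix derives both the check and the result from a single source of truth, or updates both views atomically.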
[jira] [Commented] (CASSANDRA-15568) Message filtering should apply on the inboundSink in In-JVM dtest
[ https://issues.apache.org/jira/browse/CASSANDRA-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076235#comment-17076235 ] Benedict Elliott Smith commented on CASSANDRA-15568: bq. inbound message filtering / sinks Hmm, when did this happen? Have we eliminated outbound filtering? There is value in being able to stop progress on the outbound thread, as it permits you to specify a sequence of events by controlling the flow of events on the coordinator. > Message filtering should apply on the inboundSink in In-JVM dtest > - > > Key: CASSANDRA-15568 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15568 > Project: Cassandra > Issue Type: Improvement > Components: Test/dtest >Reporter: Yifan Cai >Assignee: Yifan Cai >Priority: Normal > > The message filtering mechanism in the in-jvm dtest helps to simulate network > partition/delay. > The problem with the current approach, which adds all filters to the > {{MessagingService#outboundSink}}, is that a blocking filter blocks the > following filters from being evaluated, since there is only a single thread that > evaluates them. It further blocks the other outgoing messages. The typical > internode messaging pattern is that the coordinator node sends out multiple > messages to other nodes upon receiving a query. The described message blocking > can happen quite often. > The problem can be solved by moving the message filtering to the > {{MessagingService#inboundSink}}, so that each inbound message is > naturally filtered in parallel. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
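The serial-blocking problem in the description can be sketched with a toy filter chain (names are illustrative; this is not the actual {{MessagingService}} sink API): because every message passes through the filters on a single thread, one filter that blocks, simulated here with a sleep, delays every message queued behind it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Toy single-threaded outbound sink -- NOT the real MessagingService API.
public class FilterChain
{
    static final List<Predicate<String>> filters = new ArrayList<>();
    static final List<String> delivered = new ArrayList<>();

    // Every message is evaluated against every filter, in order, on one thread.
    static void outboundSink(List<String> outgoing)
    {
        for (String msg : outgoing)
        {
            boolean dropped = filters.stream().anyMatch(f -> !f.test(msg));
            if (!dropped)
                delivered.add(msg);
        }
    }

    public static void main(String[] args)
    {
        // A "blocking" filter: it stalls on PREPARE messages before passing them.
        filters.add(msg -> {
            if (msg.startsWith("PREPARE"))
                sleep(200);
            return true;
        });
        long start = System.nanoTime();
        outboundSink(List.of("PREPARE#1", "MUTATION#2", "MUTATION#3"));
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        // Both MUTATION messages waited behind the blocked PREPARE evaluation.
        System.out.println(delivered + " delivered after ~" + elapsedMs + "ms");
    }

    static void sleep(long ms)
    {
        try { Thread.sleep(ms); }
        catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}
```

Applying the same filters on the inbound side means each receiving node evaluates its own messages, so a blocking filter no longer stalls unrelated traffic.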
[jira] [Commented] (CASSANDRA-15667) StreamResultFuture check for completeness is inconsistent, leading to races
[ https://issues.apache.org/jira/browse/CASSANDRA-15667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076236#comment-17076236 ] Massimiliano Tomassi commented on CASSANDRA-15667: -- PR: [https://github.com/maxtomassi/cassandra/pull/1] CircleCI: [https://app.circleci.com/pipelines/github/maxtomassi/cassandra?branch=15667-4.0] (JVM dtests failed running) > StreamResultFuture check for completeness is inconsistent, leading to races > --- > > Key: CASSANDRA-15667 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15667 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15671) Testcase: testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest): FAILED
[ https://issues.apache.org/jira/browse/CASSANDRA-15671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076223#comment-17076223 ] Francisco Fernandez commented on CASSANDRA-15671: - Can you see the results using this [link|https://app.circleci.com/pipelines/github/fcofdez/cassandra?branch=CASSANDRA-15671]? > Testcase: > testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest): >FAILED > -- > > Key: CASSANDRA-15671 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15671 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Francisco Fernandez >Priority: Normal > Labels: pull-request-available > Fix For: 4.0, 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > The following test failure was observed: > [junit-timeout] Testcase: > testSubrangeCompaction(org.apache.cassandra.db.compaction.CancelCompactionsTest): >FAILED > [junit-timeout] expected:<4> but was:<5> > [junit-timeout] junit.framework.AssertionFailedError: expected:<4> but was:<5> > [junit-timeout] at > org.apache.cassandra.db.compaction.CancelCompactionsTest.testSubrangeCompaction(CancelCompactionsTest.java:190) > Java 8 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15586) 4.0 quality testing: Cluster Setup and Maintenance
[ https://issues.apache.org/jira/browse/CASSANDRA-15586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076210#comment-17076210 ] Angelo Polo commented on CASSANDRA-15586: - Glad to see FreeBSD is back on the map for the broader Cassandra community :). [~e.dimitrova] I will let you know as I investigate bugs or need any insight. Some things can get patched in the FreeBSD build and don't need immediate (i.e. prior to the 4.0 release) support from a core project committer, such as CASSANDRA-15693, which I opened recently and can be upstreamed sometime later. > 4.0 quality testing: Cluster Setup and Maintenance > -- > > Key: CASSANDRA-15586 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15586 > Project: Cassandra > Issue Type: Task > Components: Test/dtest >Reporter: Josh McKenzie >Assignee: Ekaterina Dimitrova >Priority: Normal > Labels: 4.0-QA > Fix For: 4.0-rc > > > We want 4.0 to be easy for users to set up out of the box and just work. This > means having low friction when users download the Cassandra package and start > running it. For example, users should be able to easily configure and start > new 4.0 clusters and have tokens distributed evenly. Another example is > packaging: it should be easy to install Cassandra on all supported platforms > and have Cassandra use standard platform integrations. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15568) Message filtering should apply on the inboundSink in In-JVM dtest
[ https://issues.apache.org/jira/browse/CASSANDRA-15568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076194#comment-17076194 ] Alex Petrov commented on CASSANDRA-15568: - If I understand the problem correctly, [~dcapwell] has already implemented inbound message filtering / sinks. Should we close this one as duplicate? > Message filtering should apply on the inboundSink in In-JVM dtest > - > > Key: CASSANDRA-15568 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15568 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13403) nodetool repair breaks SASI index
[ https://issues.apache.org/jira/browse/CASSANDRA-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-13403: Reviewers: (was: Alex Petrov) > nodetool repair breaks SASI index > - > > Key: CASSANDRA-13403 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13403 > Project: Cassandra > Issue Type: Bug > Components: Feature/SASI > Environment: 3.10 >Reporter: Igor Novgorodov >Priority: Normal > Labels: patch > Attachments: 3_nodes_compaction.log, 4_nodes_compaction.log, > CASSANDRA-13403.patch, testSASIRepair.patch > > > I've got table: > {code} > CREATE TABLE cservice.bulks_recipients ( > recipient text, > bulk_id uuid, > datetime_final timestamp, > datetime_sent timestamp, > request_id uuid, > status int, > PRIMARY KEY (recipient, bulk_id) > ) WITH CLUSTERING ORDER BY (bulk_id ASC) > AND bloom_filter_fp_chance = 0.01 > AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'} > AND comment = '' > AND compaction = {'class': > 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', > 'max_threshold': '32', 'min_threshold': '4'} > AND compression = {'chunk_length_in_kb': '64', 'class': > 'org.apache.cassandra.io.compress.LZ4Compressor'} > AND crc_check_chance = 1.0 > AND dclocal_read_repair_chance = 0.1 > AND default_time_to_live = 0 > AND gc_grace_seconds = 864000 > AND max_index_interval = 2048 > AND memtable_flush_period_in_ms = 0 > AND min_index_interval = 128 > AND read_repair_chance = 0.0 > AND speculative_retry = '99PERCENTILE'; > CREATE CUSTOM INDEX bulk_recipients_bulk_id ON cservice.bulks_recipients > (bulk_id) USING 'org.apache.cassandra.index.sasi.SASIIndex'; > {code} > There are 11 rows in it: > {code} > > select * from bulks_recipients; > ... > (11 rows) > {code} > Let's query by index (all rows have the same *bulk_id*): > {code} > > select * from bulks_recipients where bulk_id = > > baa94815-e276-4ca4-adda-5b9734e6c4a5; > > > ... > (11 rows) > {code} > Ok, everything is fine. 
> Now I'm doing *nodetool repair --partitioner-range --job-threads 4 --full* on > each node in the cluster sequentially. > After it finished: > {code} > > select * from bulks_recipients where bulk_id = > > baa94815-e276-4ca4-adda-5b9734e6c4a5; > ... > (2 rows) > {code} > Only two rows. > While the rows are actually there: > {code} > > select * from bulks_recipients; > ... > (11 rows) > {code} > If I issue an incremental repair on a random node, I can get something like 7 rows > after the index query. > Dropping the index and recreating it fixes the issue. Is it a bug or am I doing > the repair the wrong way? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13403) nodetool repair breaks SASI index
[ https://issues.apache.org/jira/browse/CASSANDRA-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-13403: Authors: (was: Alex Petrov) > nodetool repair breaks SASI index > - > > Key: CASSANDRA-13403 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13403 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-13403) nodetool repair breaks SASI index
[ https://issues.apache.org/jira/browse/CASSANDRA-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov reassigned CASSANDRA-13403: --- Assignee: (was: Alex Petrov) > nodetool repair breaks SASI index > - > > Key: CASSANDRA-13403 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13403 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13917) COMPACT STORAGE queries on dense static tables accept hidden column1 and value columns
[ https://issues.apache.org/jira/browse/CASSANDRA-13917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-13917: Status: Patch Available (was: In Progress) > COMPACT STORAGE queries on dense static tables accept hidden column1 and > value columns > -- > > Key: CASSANDRA-13917 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13917 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Core >Reporter: Alex Petrov >Assignee: Aleksandr Sorokoumov >Priority: Low > Labels: lhf > Fix For: 3.0.x, 3.11.x > > Attachments: 13917-3.0-testall-13.12.2019, > 13917-3.0-testall-16.01.2020, 13917-3.0-testall-2.png, > 13917-3.0-testall-20.11.2019.png, 13917-3.0-upgrade-16.01.2020, > 13917-3.0.png, 13917-3.11-testall-13.12.2019, > 13917-3.11-testall-16.01.2020.png, 13917-3.11-testall-2.png, > 13917-3.11-testall-20.11.2019.png, 13917-3.11-upgrade-16.01.2020.png, > 13917-3.11.png > > > Test for the issue: > {code} > @Test > public void testCompactStorage() throws Throwable > { > createTable("CREATE TABLE %s (a int PRIMARY KEY, b int, c int) WITH > COMPACT STORAGE"); > assertInvalid("INSERT INTO %s (a, b, c, column1) VALUES (?, ?, ?, > ?)", 1, 1, 1, ByteBufferUtil.bytes('a')); > // This one fails with Some clustering keys are missing: column1, > which is still wrong > assertInvalid("INSERT INTO %s (a, b, c, value) VALUES (?, ?, ?, ?)", > 1, 1, 1, ByteBufferUtil.bytes('a')); > assertInvalid("INSERT INTO %s (a, b, c, column1, value) VALUES (?, ?, > ?, ?, ?)", 1, 1, 1, ByteBufferUtil.bytes('a'), ByteBufferUtil.bytes('b')); > assertEmpty(execute("SELECT * FROM %s")); > } > {code} > Gladly, these writes are no-op, even though they succeed. > {{value}} and {{column1}} should be completely hidden. Fixing this one should > be as easy as just adding validations. 
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13917) COMPACT STORAGE queries on dense static tables accept hidden column1 and value columns
[ https://issues.apache.org/jira/browse/CASSANDRA-13917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-13917: Reviewers: Alex Petrov (was: Alex Petrov) Status: Review In Progress (was: Patch Available) > COMPACT STORAGE queries on dense static tables accept hidden column1 and > value columns > -- > > Key: CASSANDRA-13917 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13917 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13917) COMPACT STORAGE queries on dense static tables accept hidden column1 and value columns
[ https://issues.apache.org/jira/browse/CASSANDRA-13917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-13917: Status: Ready to Commit (was: Review In Progress) > COMPACT STORAGE queries on dense static tables accept hidden column1 and > value columns > -- > > Key: CASSANDRA-13917 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13917 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13917) COMPACT STORAGE queries on dense static tables accept hidden column1 and value columns
[ https://issues.apache.org/jira/browse/CASSANDRA-13917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-13917: Resolution: Fixed Status: Resolved (was: Ready to Commit) > COMPACT STORAGE queries on dense static tables accept hidden column1 and > value columns > -- > > Key: CASSANDRA-13917 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13917 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-13994) Remove COMPACT STORAGE internals before 4.0 release
[ https://issues.apache.org/jira/browse/CASSANDRA-13994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Petrov updated CASSANDRA-13994: Reviewers: Dinesh Joshi (was: Alex Petrov, Dinesh Joshi) > Remove COMPACT STORAGE internals before 4.0 release > --- > > Key: CASSANDRA-13994 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13994 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Local Write-Read Paths >Reporter: Alex Petrov >Assignee: Ekaterina Dimitrova >Priority: Low > Fix For: 4.0, 4.0-rc > > > 4.0 comes without thrift (after [CASSANDRA-5]) and COMPACT STORAGE (after > [CASSANDRA-10857]), and since Compact Storage flags are now disabled, all of > the related functionality is useless. > There are still some things to consider: > 1. One of the system tables (built indexes) was compact. For now, we just > added {{value}} column to it to make sure it's backwards-compatible, but we > might want to make sure it's just a "normal" table and doesn't have redundant > columns. > 2. Compact Tables were building indexes in {{KEYS}} mode. Removing it is > trivial, but this would mean that all built indexes will be defunct. We could > log a warning for now and ask users to migrate off those for now and > completely remove it from future releases. It's just a couple of classes > though. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14773) Overflow of 32-bit integer during compaction.
[ https://issues.apache.org/jira/browse/CASSANDRA-14773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076163#comment-17076163 ] Francisco Fernandez commented on CASSANDRA-14773: - I've updated the patch including more test coverage. > Overflow of 32-bit integer during compaction. > - > > Key: CASSANDRA-14773 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14773 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Vladimir Bukhtoyarov >Assignee: Francisco Fernandez >Priority: Urgent > Labels: pull-request-available > Fix For: 4.0, 4.0-beta > > Time Spent: 10m > Remaining Estimate: 0h > > In the scope of CASSANDRA-13444 compaction was significantly improved from a > CPU and memory perspective. However, this improvement introduced a bug in > rounding. When rounding an expiration time which is close to > *Cell.MAX_DELETION_TIME* (which is just *Integer.MAX_VALUE*), a math overflow > happens (because in the scope of CASSANDRA-13444 the data type for points was > changed from Long to Integer in order to reduce the memory footprint); as a result > the point becomes negative and acts as silent poison for internal structures of > StreamingTombstoneHistogramBuilder like *DistanceHolder* and *DataHolder*. > Then, depending on the point intervals: > * The TombstoneHistogram produces wrong values when the interval of points is > less than binSize; this is not critical. > * Compaction crashes with ArrayIndexOutOfBoundsException if the number of point > intervals is greater than binSize; this case is very critical. > > This is the pull request [https://github.com/apache/cassandra/pull/273] that > reproduces the issue and provides the fix. 
> > The stacktrace when running (on a codebase without the fix)
> *testMathOverflowDuringRoundingOfLargeTimestamp* without the -ea JVM flag:
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException
> at java.lang.System.arraycopy(Native Method)
> at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$DistanceHolder.add(StreamingTombstoneHistogramBuilder.java:208)
> at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.flushValue(StreamingTombstoneHistogramBuilder.java:140)
> at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$$Lambda$1/1967205423.consume(Unknown Source)
> at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$Spool.forEach(StreamingTombstoneHistogramBuilder.java:574)
> at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.flushHistogram(StreamingTombstoneHistogramBuilder.java:124)
> at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.build(StreamingTombstoneHistogramBuilder.java:184)
> at org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilderTest.testMathOverflowDuringRoundingOfLargeTimestamp(StreamingTombstoneHistogramBuilderTest.java:183)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
> at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
> at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:44)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:180)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:41)
> at org.junit.runners.ParentRunner$1.evaluate(ParentRunner.java:173)
> at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:159)
> at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
> at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
> at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
> at
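The overflow described in CASSANDRA-14773 can be reproduced with a minimal standalone sketch. This is an illustration only, not Cassandra's actual code: the `roundUpNaive` and `roundUpSafe` helpers and the bin size of 60 are hypothetical, chosen to show how 32-bit arithmetic near Integer.MAX_VALUE produces a negative point.

```java
public class OverflowDemo
{
    // Naive rounding up to the next multiple of binSize: the intermediate
    // sum (value + binSize - 1) overflows a 32-bit int when value is close
    // to Integer.MAX_VALUE, so the "rounded" point comes out negative.
    static int roundUpNaive(int value, int binSize)
    {
        return ((value + binSize - 1) / binSize) * binSize;
    }

    // Safe variant in the spirit of the fix: do the intermediate math in
    // 64 bits, then clamp the result to Integer.MAX_VALUE.
    static int roundUpSafe(int value, int binSize)
    {
        long rounded = (((long) value + binSize - 1) / binSize) * (long) binSize;
        return (int) Math.min(rounded, Integer.MAX_VALUE);
    }

    public static void main(String[] args)
    {
        // An expiration time just below Cell.MAX_DELETION_TIME (== Integer.MAX_VALUE).
        int nearMax = Integer.MAX_VALUE - 1;
        System.out.println(roundUpNaive(nearMax, 60)); // negative due to overflow
        System.out.println(roundUpSafe(nearMax, 60));  // clamped, non-negative
    }
}
```

A negative point like the one produced by `roundUpNaive` is what silently corrupts the DistanceHolder and DataHolder structures described above.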
[jira] [Assigned] (CASSANDRA-15406) Add command to show the progress of data streaming and index build
[ https://issues.apache.org/jira/browse/CASSANDRA-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic reassigned CASSANDRA-15406: - Assignee: (was: Stefan Miklosovic) > Add command to show the progress of data streaming and index build > --- > > Key: CASSANDRA-15406 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15406 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Tool/nodetool >Reporter: maxwellguo >Priority: Normal > Fix For: 4.0, 4.x > > > I think we should supply a command to show the progress of streaming when we > perform a bootstrap/move/decommission/removenode operation. During data > streaming, nobody knows which step the program is in, so a command to show > the joining/leaving node's progress is needed.
[jira] [Updated] (CASSANDRA-15694) Statistics upon streaming of entire SSTables in Netstats is wrong
[ https://issues.apache.org/jira/browse/CASSANDRA-15694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-15694: -- Description: There is a bug in the current code: when we stream entire SSTables via CassandraEntireSSTableStreamWriter and CassandraOutgoingFile respectively, the individual components of an SSTable are not tracked, because only the "db" file is counted. That introduces this bug: {code:java} Mode: NORMAL Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736 /127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 files, 27664559 bytes total /tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3... {code} Basically, the number of files to be sent is lower than the number of files already sent. The straightforward fix is to detect when we are streaming entire sstables and, in that case, include all manifest files in the computation. This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 because the decision whether we stream entirely or not comes from a method that is performance-sensitive and computed every time. Once CASSANDRA-15657 (hence CASSANDRA-14586) is done, this ticket can be worked on. A branch with the fix is here: [https://github.com/smiklosovic/cassandra/tree/CASSANDRA-15694] was: There is a bug in the current code: when we stream entire SSTables via CassandraEntireSSTableStreamWriter and CassandraOutgoingFile respectively, the individual components of an SSTable are not tracked, because only the "db" file is counted. That introduces this bug: {code:java} Mode: NORMAL Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736 /127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 files, 27664559 bytes total /tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3... {code} Basically, the number of files to be sent is lower than the number of files already sent.
The straightforward fix is to detect when we are streaming entire sstables and, in that case, include all manifest files in the computation. This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 because the decision whether we stream entirely or not comes from a method that is performance-sensitive and computed every time. Once CASSANDRA-15657 (hence CASSANDRA-14586) is done, this ticket can be worked on. > Statistics upon streaming of entire SSTables in Netstats is wrong > - > > Key: CASSANDRA-15694 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15694 > Project: Cassandra > Issue Type: Bug > Components: Tool/nodetool >Reporter: Stefan Miklosovic >Priority: Normal > > There is a bug in the current code: when we stream entire SSTables via > CassandraEntireSSTableStreamWriter and CassandraOutgoingFile respectively, > the individual components of an SSTable are not tracked, because only the > "db" file is counted. That introduces this bug: > > {code:java} > Mode: NORMAL > Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736 > /127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 > files, 27664559 bytes total > > /tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3... > > {code} > Basically, the number of files to be sent is lower than the number of files > already sent. > > The straightforward fix is to detect when we are streaming entire sstables > and, in that case, include all manifest files in the computation. > > This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 > because the decision whether we stream entirely or not comes from a method > that is performance-sensitive and computed every time. Once CASSANDRA-15657 > (hence CASSANDRA-14586) is done, this ticket can be worked on.
> > A branch with the fix is here: > [https://github.com/smiklosovic/cassandra/tree/CASSANDRA-15694]
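The counting fix proposed above can be sketched with a small illustrative helper. This is not Cassandra's actual code: the `NetstatsFileCount` class, the `filesToSend` method, and the component list are hypothetical, used only to show why partial and entire-sstable streaming must count files differently.

```java
import java.util.List;

public class NetstatsFileCount
{
    // Hypothetical helper: how many files one SSTable transfer contributes
    // to the "Sending N files" figure that nodetool netstats reports.
    static int filesToSend(List<String> components, boolean entireSSTable)
    {
        // Partial streaming serializes only the data component, so it counts
        // as a single file. Entire-sstable streaming ships every component
        // on disk, so all of them must be included in the "to send" total,
        // or "already sent" overtakes "to send" as in the output quoted above.
        return entireSSTable ? components.size() : 1;
    }

    public static void main(String[] args)
    {
        // A typical set of SSTable component files (illustrative).
        List<String> components = List.of(
                "Data.db", "Index.db", "Summary.db", "Filter.db",
                "Statistics.db", "TOC.txt", "Digest.crc32");
        System.out.println(filesToSend(components, false)); // partial: 1
        System.out.println(filesToSend(components, true));  // entire: 7
    }
}
```

With this distinction, 19 entire SSTables streamed with 7 components each would be reported as 133 files up front, matching the "already sent" count instead of contradicting it.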
[jira] [Commented] (CASSANDRA-15406) Add command to show the progress of data streaming and index build
[ https://issues.apache.org/jira/browse/CASSANDRA-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17076115#comment-17076115 ] Stefan Miklosovic commented on CASSANDRA-15406: --- Work on this issue should be blocked until the underlying bug in streaming of entire sstables is resolved, as the figures used to compute progress would not make sense otherwise. > Add command to show the progress of data streaming and index build > --- > > Key: CASSANDRA-15406 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15406 > Project: Cassandra > Issue Type: Improvement > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Tool/nodetool >Reporter: maxwellguo >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 4.0, 4.x > > > I think we should supply a command to show the progress of streaming when we > perform a bootstrap/move/decommission/removenode operation. During data > streaming, nobody knows which step the program is in, so a command to show > the joining/leaving node's progress is needed.
[jira] [Assigned] (CASSANDRA-15694) Statistics upon streaming of entire SSTables in Netstats is wrong
[ https://issues.apache.org/jira/browse/CASSANDRA-15694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic reassigned CASSANDRA-15694: - Assignee: (was: Stefan Miklosovic) > Statistics upon streaming of entire SSTables in Netstats is wrong > - > > Key: CASSANDRA-15694 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15694 > Project: Cassandra > Issue Type: Bug > Components: Tool/nodetool >Reporter: Stefan Miklosovic >Priority: Normal > > There is a bug in the current code: when we stream entire SSTables via > CassandraEntireSSTableStreamWriter and CassandraOutgoingFile respectively, > the individual components of an SSTable are not tracked, because only the > "db" file is counted. That introduces this bug: > > {code:java} > Mode: NORMAL > Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736 > /127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 > files, 27664559 bytes total > > /tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3... > > {code} > Basically, the number of files to be sent is lower than the number of files > already sent. > > The straightforward fix is to detect when we are streaming entire sstables > and, in that case, include all manifest files in the computation. > > This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 > because the decision whether we stream entirely or not comes from a method > that is performance-sensitive and computed every time. Once CASSANDRA-15657 > (hence CASSANDRA-14586) is done, this ticket can be worked on.
[jira] [Created] (CASSANDRA-15694) Statistics upon streaming of entire SSTables in Netstats is wrong
Stefan Miklosovic created CASSANDRA-15694: - Summary: Statistics upon streaming of entire SSTables in Netstats is wrong Key: CASSANDRA-15694 URL: https://issues.apache.org/jira/browse/CASSANDRA-15694 Project: Cassandra Issue Type: Bug Components: Tool/nodetool Reporter: Stefan Miklosovic Assignee: Stefan Miklosovic There is a bug in the current code: when we stream entire SSTables via CassandraEntireSSTableStreamWriter and CassandraOutgoingFile respectively, the individual components of an SSTable are not tracked, because only the "db" file is counted. That introduces this bug: {code:java} Mode: NORMAL Rebuild 2c0b43f0-735d-11ea-9346-fb0ffe238736 /127.0.0.2 Sending 19 files, 27664559 bytes total. Already sent 133 files, 27664559 bytes total /tmp/dtests15682026295742741219/node2/data/distributed_test_keyspace/cf-196b3... {code} Basically, the number of files to be sent is lower than the number of files already sent. The straightforward fix is to detect when we are streaming entire sstables and, in that case, include all manifest files in the computation. This issue relates to https://issues.apache.org/jira/browse/CASSANDRA-15657 because the decision whether we stream entirely or not comes from a method that is performance-sensitive and computed every time. Once CASSANDRA-15657 (hence CASSANDRA-14586) is done, this ticket can be worked on.