[jira] [Updated] (CASSANDRA-16150) Upgrade to snakeyaml >= 1.26 version for CVE-2017-18640 fix
[ https://issues.apache.org/jira/browse/CASSANDRA-16150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rahul Nandi updated CASSANDRA-16150: Description: There have been critical level CVE (CVE-2017-18640) discovered in snakeyaml version earlier to 1.26. This has been patched into snakeyaml version 1.26. Reference: [https://nvd.nist.gov/vuln/detail/CVE-2017-18640] This card is expected to upgrade the snakeyaml version to 1.26. was: There have been critical level CVE ( [CVE-2017-18640 | [https://nvd.nist.gov/vuln/detail/CVE-2017-18640]] ) discovered in snakeyaml version earlier to 1.26. This has been patched into snakeyaml version 1.26. This card is expected to upgrade the snakeyaml version to 1.26. > Upgrade to snakeyaml >= 1.26 version for CVE-2017-18640 fix > --- > > Key: CASSANDRA-16150 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16150 > Project: Cassandra > Issue Type: Bug > Components: Dependencies >Reporter: Rahul Nandi >Assignee: Rahul Nandi >Priority: Normal > > There have been critical level CVE (CVE-2017-18640) discovered in snakeyaml > version earlier to 1.26. This has been patched into snakeyaml version 1.26. > Reference: [https://nvd.nist.gov/vuln/detail/CVE-2017-18640] > This card is expected to upgrade the snakeyaml version to 1.26. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16150) Upgrade to snakeyaml >= 1.26 version for CVE-2017-18640 fix
[ https://issues.apache.org/jira/browse/CASSANDRA-16150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rahul Nandi updated CASSANDRA-16150: Description: There have been critical level CVE ( [CVE-2017-18640 | [https://nvd.nist.gov/vuln/detail/CVE-2017-18640]] ) discovered in snakeyaml version earlier to 1.26. This has been patched into snakeyaml version 1.26. This card is expected to upgrade the snakeyaml version to 1.26. was: There have been critical level CVE ([CVE-2017-18640|[https://nvd.nist.gov/vuln/detail/CVE-2017-18640]]) discovered in snakeyaml version earlier to 1.26. This has been patched into snakeyaml version 1.26. This card is expected to upgrade the snakeyaml version to 1.26. > Upgrade to snakeyaml >= 1.26 version for CVE-2017-18640 fix > --- > > Key: CASSANDRA-16150 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16150 > Project: Cassandra > Issue Type: Bug > Components: Dependencies >Reporter: Rahul Nandi >Assignee: Rahul Nandi >Priority: Normal > > There have been critical level CVE ( [CVE-2017-18640 | > [https://nvd.nist.gov/vuln/detail/CVE-2017-18640]] ) discovered in snakeyaml > version earlier to 1.26. This has been patched into snakeyaml version 1.26. > This card is expected to upgrade the snakeyaml version to 1.26. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-16150) Upgrade to snakeyaml >= 1.26 version for CVE-2017-18640 fix
Rahul Nandi created CASSANDRA-16150: --- Summary: Upgrade to snakeyaml >= 1.26 version for CVE-2017-18640 fix Key: CASSANDRA-16150 URL: https://issues.apache.org/jira/browse/CASSANDRA-16150 Project: Cassandra Issue Type: Bug Components: Dependencies Reporter: Rahul Nandi Assignee: Rahul Nandi There have been critical level CVE ([CVE-2017-18640|[https://nvd.nist.gov/vuln/detail/CVE-2017-18640]]) discovered in snakeyaml version earlier to 1.26. This has been patched into snakeyaml version 1.26. This card is expected to upgrade the snakeyaml version to 1.26. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16128) Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o instead of archiving
[ https://issues.apache.org/jira/browse/CASSANDRA-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204449#comment-17204449 ] Berenguer Blasi commented on CASSANDRA-16128: - lgtm. > Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o > instead of archiving > --- > > Key: CASSANDRA-16128 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16128 > Project: Cassandra > Issue Type: Task > Components: CI >Reporter: Michael Semb Wever >Assignee: Michael Semb Wever >Priority: Normal > Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0-beta > > > Jenkins improvements > 1. Add the cassandra-website job into cassandra_job_dsl.seed.groovy (so we > don't lose it next time the Jenkins master is corrupted) > 2. Print the SHAs of the different git repos used during the build process. > Also store them in the .head files (so the pipeline can print them out too). > 3. Instead of archiving artefacts, ssh them to > https://nightlies.apache.org/cassandra/ > (Disk usage on agents is largely under control, but disk usage on master was > the new problem. The suspicion here is the Cassandra-*-artifact's artefacts > was the disk usage culprit, though we have to evidence to support it.) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling
[ https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204423#comment-17204423 ] Berenguer Blasi commented on CASSANDRA-15991: - M weird that thing passed for me locally. I was waiting for PR thumbs up to trigger all the CI jobs to spare you that [~dcapwell]. Anyway pushed a fix and checked in [circle|https://app.circleci.com/pipelines/github/bereng/cassandra/133/workflows/bf77a2f4-3243-4051-884a-2b2a83d777be/jobs/1133] as well. > 15583 - Add UX tests to intree LHF tooling > -- > > Key: CASSANDRA-15991 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15991 > Project: Cassandra > Issue Type: Improvement > Components: Test/unit >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0-beta > > > As per CASSANDRA-15583 many in tree tools lack proper UX tooling: mandatory > params are indeed mandatory, 'help' produces an actual help, return codes etc > This ticket is an attempt to add it to those tools that classify as LHF. > Other tools such as nodetool, with many sub-commands, deserve a separate > ticket of their own -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default
[ https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204331#comment-17204331 ] David Capwell edited comment on CASSANDRA-16036 at 9/30/20, 12:08 AM: -- Updated CI results Circle: https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16036-trunk-2EBAD3E9-4394-4D42-9213-69A6590F37E2 (expected test failures caused by other JIRA, and 1 flaky test in no-vnode case but not in vnode case) Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/52/ trunk baseline: https://app.circleci.com/pipelines/github/dcapwell/cassandra/574/workflows/19f38f3c-9da3-42d5-ba5f-269f0285b791 was (Author: dcapwell): Updated CI results (pending) Circle: https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16036-trunk-2EBAD3E9-4394-4D42-9213-69A6590F37E2 Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/52/ trunk baseline: https://app.circleci.com/pipelines/github/dcapwell/cassandra/574/workflows/19f38f3c-9da3-42d5-ba5f-269f0285b791 > Add flag to disable chunk cache and disable by default > -- > > Key: CASSANDRA-16036 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16036 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 4.0-beta3 > > Attachments: 15229_128mb.png, 16036_128mb.png, > async-profile.collapsed.svg, > clustering-in-clause_latency_selects_baseline.png, > clustering-in-clause_latency_selects_baseline_attempt3.png, > clustering-in-clause_latency_under90_selects_baseline.png, > clustering-in-clause_latency_under90_selects_baseline_attempt3.png, > clustering-slice_latency_selects_baseline.png, > clustering-slice_latency_under90_selects_baseline.png, > medium-blobs_latency_selects_baseline.png, > medium-blobs_latency_under90_selects_baseline.png, > 
partition-single-row-read_latency_selects_baseline.png, > partition-single-row-read_latency_under90_selects_baseline.png > > > Chunk cache is enabled by default and doesn’t have a flag to disable without > impacting networking. In performance testing 4.0 against 3.0 I found that > reads were slower in 4.0 and after profiling found that the ChunkCache was > partially to blame; after disabling the chunk cache, read performance had > improved. > {code} > 40_w_cc-selects.hdr > #[Mean= 11.50063, StdDeviation = 13.44014] > #[Max =482.41254, Total count= 316477] > #[Buckets = 25, SubBuckets = 262144] > 40_wo_cc-selects.hdr > #[Mean= 9.82115, StdDeviation = 10.14270] > #[Max =522.36493, Total count= 317444] > #[Buckets = 25, SubBuckets = 262144] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default
[ https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-16036: -- Fix Version/s: (was: 4.0-beta) 4.0-beta3 Since Version: 3.11.0 Source Control Link: https://github.com/apache/cassandra/commit/d4f501892d882cb1bf62529f0e72cf7d9c61e323 Resolution: Fixed Status: Resolved (was: Ready to Commit) > Add flag to disable chunk cache and disable by default > -- > > Key: CASSANDRA-16036 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16036 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 4.0-beta3 > > Attachments: 15229_128mb.png, 16036_128mb.png, > async-profile.collapsed.svg, > clustering-in-clause_latency_selects_baseline.png, > clustering-in-clause_latency_selects_baseline_attempt3.png, > clustering-in-clause_latency_under90_selects_baseline.png, > clustering-in-clause_latency_under90_selects_baseline_attempt3.png, > clustering-slice_latency_selects_baseline.png, > clustering-slice_latency_under90_selects_baseline.png, > medium-blobs_latency_selects_baseline.png, > medium-blobs_latency_under90_selects_baseline.png, > partition-single-row-read_latency_selects_baseline.png, > partition-single-row-read_latency_under90_selects_baseline.png > > > Chunk cache is enabled by default and doesn’t have a flag to disable without > impacting networking. In performance testing 4.0 against 3.0 I found that > reads were slower in 4.0 and after profiling found that the ChunkCache was > partially to blame; after disabling the chunk cache, read performance had > improved. 
> {code} > 40_w_cc-selects.hdr > #[Mean= 11.50063, StdDeviation = 13.44014] > #[Max =482.41254, Total count= 316477] > #[Buckets = 25, SubBuckets = 262144] > 40_wo_cc-selects.hdr > #[Mean= 9.82115, StdDeviation = 10.14270] > #[Max =522.36493, Total count= 317444] > #[Buckets = 25, SubBuckets = 262144] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] branch trunk updated: Add flag to disable chunk cache and disable by default
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

The following commit(s) were added to refs/heads/trunk by this push:
     new 79e693e  Add flag to disable chunk cache and disable by default
79e693e is described below

commit 79e693e16e2152097c5b27d2d7aaa1763e34f594
Author: David Capwell
AuthorDate: Tue Sep 29 15:26:37 2020 -0700

    Add flag to disable chunk cache and disable by default

    patch by David Capwell; reviewed by Jon Meredith, Zhao Yang for CASSANDRA-16036
---
 CHANGES.txt                                                  | 1 +
 conf/cassandra.yaml                                          | 4 ++++
 src/java/org/apache/cassandra/cache/ChunkCache.java          | 2 +-
 src/java/org/apache/cassandra/config/Config.java             | 2 ++
 src/java/org/apache/cassandra/config/DatabaseDescriptor.java | 5 +++++
 test/conf/cassandra.yaml                                     | 1 +
 6 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 190eebc..d1fa00e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -16,6 +16,7 @@
  * Mutating sstable component may race with entire-sstable-streaming(ZCS) causing checksum validation failure (CASSANDRA-15861)
  * NPE thrown while updating speculative execution time if keyspace is removed during task execution (CASSANDRA-15949)
  * Show the progress of data streaming and index build (CASSANDRA-15406)
+ * Add flag to disable chunk cache and disable by default (CASSANDRA-16036)
 Merged from 3.11:
  * Don't attempt value skipping with mixed version cluster (CASSANDRA-15833)
  * Use IF NOT EXISTS for index and UDT create statements in snapshot schema files (CASSANDRA-13935)
diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index fcd2ffa..ff414ed 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -469,6 +469,10 @@ concurrent_counter_writes: 32
 # be limited by the less of concurrent reads or concurrent writes.
 concurrent_materialized_view_writes: 32
 
+# Enable the sstable chunk cache. The chunk cache will store recently accessed
+# sections of the sstable in-memory as uncompressed buffers.
+# file_cache_enabled: false
+
 # Maximum memory to use for sstable chunk cache and buffer pooling.
 # 32MB of this are reserved for pooling buffers, the rest is used as an
 # cache that holds uncompressed sstable chunks.
diff --git a/src/java/org/apache/cassandra/cache/ChunkCache.java b/src/java/org/apache/cassandra/cache/ChunkCache.java
index e370206..ae38015 100644
--- a/src/java/org/apache/cassandra/cache/ChunkCache.java
+++ b/src/java/org/apache/cassandra/cache/ChunkCache.java
@@ -42,7 +42,7 @@ public class ChunkCache
     public static final long cacheSize = 1024L * 1024L * Math.max(0, DatabaseDescriptor.getFileCacheSizeInMB() - RESERVED_POOL_SPACE_IN_MB);
     public static final boolean roundUp = DatabaseDescriptor.getFileCacheRoundUp();
 
-    private static boolean enabled = cacheSize > 0;
+    private static boolean enabled = DatabaseDescriptor.getFileCacheEnabled() && cacheSize > 0;
     public static final ChunkCache instance = enabled ? new ChunkCache() : null;
 
     private final LoadingCache cache;
diff --git a/src/java/org/apache/cassandra/config/Config.java b/src/java/org/apache/cassandra/config/Config.java
index 6abdfba..da410155 100644
--- a/src/java/org/apache/cassandra/config/Config.java
+++ b/src/java/org/apache/cassandra/config/Config.java
@@ -304,6 +304,8 @@ public class Config
 
     public Integer file_cache_size_in_mb;
 
+    public boolean file_cache_enabled = Boolean.getBoolean("cassandra.file_cache_enabled");
+
     /**
      * Because of the current {@link org.apache.cassandra.utils.memory.BufferPool} slab sizes of 64 kb, we
      * store in the file cache buffers that divide 64 kb, so we need to round the buffer sizes to powers of two.
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 3b5fdfb..e8e66fa 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -2432,6 +2432,11 @@ public class DatabaseDescriptor
         conf.incremental_backups = value;
     }
 
+    public static boolean getFileCacheEnabled()
+    {
+        return conf.file_cache_enabled;
+    }
+
     public static int getFileCacheSizeInMB()
     {
         if (conf.file_cache_size_in_mb == null)
diff --git a/test/conf/cassandra.yaml b/test/conf/cassandra.yaml
index 89b7ff1..38e012f 100644
--- a/test/conf/cassandra.yaml
+++ b/test/conf/cassandra.yaml
@@ -50,3 +50,4 @@ stream_entire_sstables: true
 stream_throughput_outbound_megabits_per_sec: 2
 enable_sasi_indexes: true
 enable_materialized_views: true
+file_cache_enabled: true
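The gating in the commit above can be illustrated with a small standalone sketch. It is not the Cassandra code: the class names FileCacheFlagSketch and SketchConfig are hypothetical, and the 512 MB cache size is an assumed value chosen only for the demonstration. The parts taken from the commit are the system property name cassandra.file_cache_enabled, the 32 MB reserved for pooling buffers, and the rule that the cache is enabled only when the flag is set and the remaining size is positive.

```java
// Hypothetical sketch of the flag gating; class and field names are illustrative,
// not the real Cassandra Config/DatabaseDescriptor/ChunkCache classes.
final class FileCacheFlagSketch
{
    // Per the cassandra.yaml comment, 32MB of the file cache is reserved for pooling buffers.
    static final int RESERVED_POOL_SPACE_IN_MB = 32;

    static final class SketchConfig
    {
        // Boolean.getBoolean returns false when the system property is unset,
        // so the cache is disabled by default.
        boolean fileCacheEnabled = Boolean.getBoolean("cassandra.file_cache_enabled");

        // Assumed size for this example only.
        int fileCacheSizeInMb = 512;
    }

    // Mirrors the shape of the new check: the flag AND a positive cache size are both required.
    static boolean chunkCacheEnabled(SketchConfig conf)
    {
        long cacheSize = 1024L * 1024L * Math.max(0, conf.fileCacheSizeInMb - RESERVED_POOL_SPACE_IN_MB);
        return conf.fileCacheEnabled && cacheSize > 0;
    }

    public static void main(String[] args)
    {
        SketchConfig conf = new SketchConfig();
        System.out.println("enabled from system property: " + chunkCacheEnabled(conf));

        conf.fileCacheEnabled = true; // what "file_cache_enabled: true" in the yaml would set
        System.out.println("enabled after setting flag: " + chunkCacheEnabled(conf));
    }
}
```

Running the sketch with -Dcassandra.file_cache_enabled=true flips the first result, which matches how the default in Config.java is read from a JVM system property.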
[jira] [Created] (CASSANDRA-16149) Record the expiration time for hints files to avoid loading expired ones
Yifan Cai created CASSANDRA-16149: - Summary: Record the expiration time for hints files to avoid loading expired ones Key: CASSANDRA-16149 URL: https://issues.apache.org/jira/browse/CASSANDRA-16149 Project: Cassandra Issue Type: Improvement Components: Local/Other Reporter: Yifan Cai The expiration time of a hints file is considered to be the latest expiration time among all the hints in the file. If the current time exceeds the file expiration time, the file can be safely deleted. The expiration time can be determined when finishing writing to the hints file. The tricky part is that each hints file keeps the metadata at the header of the file, but the expiration time is only known at the end. So we may want to save the metadata in a companion file of the hints. This approach is also future-proof, in case we want to add more metadata.
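The rule described in this ticket can be sketched as follows. This is only an illustration under stated assumptions: HintsExpirationSketch and its methods are hypothetical names, expiration times are assumed to be epoch milliseconds, and the companion metadata file is left out entirely.

```java
import java.util.List;

// Hypothetical sketch; not the Cassandra hints implementation.
final class HintsExpirationSketch
{
    // The file expires when its longest-lived hint expires, i.e. at the latest
    // expiration time among all hints written to the file.
    static long fileExpirationMillis(List<Long> hintExpirationsMillis)
    {
        if (hintExpirationsMillis.isEmpty())
            throw new IllegalArgumentException("a hints file needs at least one hint");
        long latest = Long.MIN_VALUE;
        for (long expiration : hintExpirationsMillis)
            latest = Math.max(latest, expiration);
        return latest;
    }

    // The file can be deleted once the current time exceeds the file expiration time.
    static boolean isExpired(long fileExpirationMillis, long nowMillis)
    {
        return nowMillis > fileExpirationMillis;
    }

    public static void main(String[] args)
    {
        long expiration = fileExpirationMillis(List.of(100L, 300L, 200L));
        System.out.println("file expiration: " + expiration);
        System.out.println("expired at 301: " + isExpired(expiration, 301L));
    }
}
```

Because the maximum is only known after the last hint is written, a writer would track it incrementally and persist it when the file is closed, which is why the ticket suggests a companion file rather than the header.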
[jira] [Commented] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength
[ https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204375#comment-17204375 ] David Capwell commented on CASSANDRA-16147: --- test LGTM thanks! +1 > ValueAccessor is using signed shorts in sliceWithShortLength > > > Key: CASSANDRA-16147 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16147 > Project: Cassandra > Issue Type: Bug > Components: Local/Other >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0-beta > > > ValueAccessor is using a signed short when interpreting byte lengths, causing > exceptions when reading blobs over 32767 bytes in length -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15234) Standardise config and JVM parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204374#comment-17204374 ] Jon Meredith commented on CASSANDRA-15234: -- The improvements are definitely very valuable and make configuration much cleaner and more flexible, but I'm also concerned it's too late in the cycle. Although the patch goes to great lengths to be backward compatible, people that have been working towards getting ready for production deployments would need to re-test all the configurations they've worked through so far, which would certainly cause rework to validate the release. > Standardise config and JVM parameters > - > > Key: CASSANDRA-15234 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15234 > Project: Cassandra > Issue Type: Bug > Components: Local/Config >Reporter: Benedict Elliott Smith >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0-alpha > > Attachments: CASSANDRA-15234-3-DTests-JAVA8.txt > > > We have a bunch of inconsistent names and config patterns in the codebase, > both from the yaml and JVM properties. It would be nice to standardise the > naming (such as otc_ vs internode_) as well as the provision of values with > units - while maintaining perpetual backwards compatibility with the old > parameter names, of course. > For temporal units, I would propose parsing strings with suffixes of: > {{code}} > u|micros(econds?)? > ms|millis(econds?)? > s(econds?)? > m(inutes?)? > h(ours?)? > d(ays?)? > mo(nths?)? > {{code}} > For rate units, I would propose parsing any of the standard {{B/s, KiB/s, > MiB/s, GiB/s, TiB/s}}. > Perhaps for avoiding ambiguity we could not accept bauds {{bs, Mbps}} or > powers of 1000 such as {{KB/s}}, given these are regularly used for either > their old or new definition e.g. {{KiB/s}}, or we could support them and > simply log the value in bytes/s.
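The temporal suffixes proposed in the CASSANDRA-15234 description can be tried out with a small parser. This is a sketch, not the patch under review: DurationSuffixSketch and toMicros are hypothetical names, the number-then-suffix input format is an assumption, and the months suffix is deliberately rejected here because a month has no fixed length.

```java
import java.util.Locale;
import java.util.concurrent.TimeUnit;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of suffix-based duration parsing; not Cassandra code.
final class DurationSuffixSketch
{
    // Assumed input shape: an integer, optional whitespace, then a unit suffix.
    private static final Pattern VALUE = Pattern.compile("(\\d+)\\s*([a-z]+)");

    static long toMicros(String text)
    {
        Matcher m = VALUE.matcher(text.trim().toLowerCase(Locale.ROOT));
        if (!m.matches())
            throw new IllegalArgumentException("not a duration: " + text);

        long amount = Long.parseLong(m.group(1));
        String unit = m.group(2);

        // The suffix patterns below are the ones listed in the ticket description.
        if (unit.matches("u|micros(econds?)?")) return amount;
        if (unit.matches("ms|millis(econds?)?")) return TimeUnit.MILLISECONDS.toMicros(amount);
        if (unit.matches("s(econds?)?")) return TimeUnit.SECONDS.toMicros(amount);
        if (unit.matches("m(inutes?)?")) return TimeUnit.MINUTES.toMicros(amount);
        if (unit.matches("h(ours?)?")) return TimeUnit.HOURS.toMicros(amount);
        if (unit.matches("d(ays?)?")) return TimeUnit.DAYS.toMicros(amount);

        // "mo(nths?)?" is not handled in this sketch.
        throw new IllegalArgumentException("unsupported unit: " + unit);
    }

    public static void main(String[] args)
    {
        System.out.println(toMicros("10ms"));
        System.out.println(toMicros("2 minutes"));
    }
}
```

Because String.matches requires the whole suffix to match, "ms" is never mistaken for minutes and "mo" falls through to the unsupported-unit error.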
[jira] [Commented] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength
[ https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204373#comment-17204373 ] David Capwell commented on CASSANDRA-16147: --- looking now. > ValueAccessor is using signed shorts in sliceWithShortLength > > > Key: CASSANDRA-16147 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16147 > Project: Cassandra > Issue Type: Bug > Components: Local/Other >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0-beta > > > ValueAccessor is using a signed short when interpreting byte lengths, causing > exceptions when reading blobs over 32767 bytes in length -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength
[ https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204371#comment-17204371 ] David Capwell commented on CASSANDRA-16147: --- I modified split to trigger this logic
{code}
l[i++] = ByteBufferAccessor.instance.sliceWithShortLength(bb, bb.position());
bb.position(bb.position() + 2 + l[i - 1].remaining());
{code}
the test causes the size to be -2 and since the offset is 2 for the header, the returned buffer is 0 > ValueAccessor is using signed shorts in sliceWithShortLength > > > Key: CASSANDRA-16147 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16147 > Project: Cassandra > Issue Type: Bug > Components: Local/Other >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0-beta > > > ValueAccessor is using a signed short when interpreting byte lengths, causing > exceptions when reading blobs over 32767 bytes in length
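The size of -2 mentioned in this comment is what a two-byte length prefix of 65534 looks like when it is read as a signed short. The following sketch shows the difference between the signed and unsigned interpretation using only java.nio.ByteBuffer; ShortLengthSketch and its method names are hypothetical and do not correspond to the ValueAccessor API.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of signed vs unsigned short length prefixes; not Cassandra code.
final class ShortLengthSketch
{
    // Buggy interpretation: getShort returns a signed 16-bit value,
    // so any length above 32767 comes back negative.
    static int signedShortLength(ByteBuffer bb, int position)
    {
        return bb.getShort(position);
    }

    // Unsigned interpretation: masking with 0xFFFF keeps the low 16 bits
    // and yields a value in the range 0..65535.
    static int unsignedShortLength(ByteBuffer bb, int position)
    {
        return bb.getShort(position) & 0xFFFF;
    }

    public static void main(String[] args)
    {
        ByteBuffer bb = ByteBuffer.allocate(2);
        bb.putShort(0, (short) 65534); // length prefix 0xFFFE

        System.out.println(signedShortLength(bb, 0));   // prints -2
        System.out.println(unsignedShortLength(bb, 0)); // prints 65534
    }
}
```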
[jira] [Commented] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength
[ https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204369#comment-17204369 ] Blake Eggleston commented on CASSANDRA-16147: - yep, build and split both do. I've updated the test to use to/from string. > ValueAccessor is using signed shorts in sliceWithShortLength > > > Key: CASSANDRA-16147 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16147 > Project: Cassandra > Issue Type: Bug > Components: Local/Other >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0-beta > > > ValueAccessor is using a signed short when interpreting byte lengths, causing > exceptions when reading blobs over 32767 bytes in length -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength
[ https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204365#comment-17204365 ] David Capwell commented on CASSANDRA-16147: --- the test passes on trunk, and one of the reasons is that org.apache.cassandra.db.marshal.CompositeType#build(org.apache.cassandra.db.marshal.ValueAccessor, boolean, V...) uses ByteBuffer directly
{code}
@SafeVarargs
public static V build(ValueAccessor accessor, boolean isStatic, V... values)
{
    ..
    ByteBuffer out = ByteBuffer.allocate(totalLength);
    ...
    for (V v : values)
    {
        ByteBufferUtil.writeShortLength(out, accessor.size(v));
        ...
    }
{code}
And org.apache.cassandra.db.marshal.CompositeType#split also does the same
{code}
while (bb.remaining() > 0)
{
    l[i++] = ByteBufferUtil.readBytesWithShortLength(bb);
    bb.get(); // skip end-of-component
}
{code}
> ValueAccessor is using signed shorts in sliceWithShortLength > > > Key: CASSANDRA-16147 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16147 > Project: Cassandra > Issue Type: Bug > Components: Local/Other >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0-beta > > > ValueAccessor is using a signed short when interpreting byte lengths, causing > exceptions when reading blobs over 32767 bytes in length
[jira] [Commented] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength
[ https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204362#comment-17204362 ] Blake Eggleston commented on CASSANDRA-16147: - Variable length data types can have values > 0x, but composite types can't, so I added a test around composite types with large values. > ValueAccessor is using signed shorts in sliceWithShortLength > > > Key: CASSANDRA-16147 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16147 > Project: Cassandra > Issue Type: Bug > Components: Local/Other >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0-beta > > > ValueAccessor is using a signed short when interpreting byte lengths, causing > exceptions when reading blobs over 32767 bytes in length -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16148) GossiperTest#testHaveVersion3Nodes is failing on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordan West updated CASSANDRA-16148: Test and Documentation Plan: Make sure the test that is being fixed passes and no other tests were broken as a result Status: Patch Available (was: Open) [branch | https://github.com/jrwest/cassandra/tree/jwest/16148] [tests | https://app.circleci.com/pipelines/github/jrwest/cassandra?branch=jwest%2F16148] > GossiperTest#testHaveVersion3Nodes is failing on trunk > -- > > Key: CASSANDRA-16148 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16148 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip >Reporter: Jordan West >Assignee: Jordan West >Priority: Normal > > https://app.circleci.com/pipelines/github/jrwest/cassandra/53/workflows/95f9f401-1ef8-4b8d-9c64-3703d9669d95/jobs/771 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15537) 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
[ https://issues.apache.org/jira/browse/CASSANDRA-15537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204353#comment-17204353 ] Yifan Cai commented on CASSANDRA-15537: --- Thank you [~pauloricardomg] for correlating the tickets! I should have done it when filing. :| > 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test > - > > Key: CASSANDRA-15537 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15537 > Project: Cassandra > Issue Type: Task > Components: Test/dtest/java, Test/dtest/python >Reporter: Josh McKenzie >Assignee: Yifan Cai >Priority: Normal > Fix For: 4.0-beta > > > Reference [doc from > NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#] > for context. > Execution of upgrade and diff tests via cassandra-diff have proven to be one > of the most effective approaches toward identifying issues with the local > read/write path. These include instances of data loss, data corruption, data > resurrection, incorrect responses to queries, incomplete responses, and > others. Upgrade and diff tests can be executed concurrent with fault > injection (such as host or network failure); as well as during mixed-version > scenarios (such as upgrading half of the instances in a cluster, and running > upgradesstables on only half of the upgraded instances). > Upgrade and diff tests are expected to continue through the release cycle, > and are a great way for contributors to gain confidence in the correctness of > the database under their own workloads. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-16148) GossiperTest#testHaveVersion3Nodes is failing on trunk
Jordan West created CASSANDRA-16148: --- Summary: GossiperTest#testHaveVersion3Nodes is failing on trunk Key: CASSANDRA-16148 URL: https://issues.apache.org/jira/browse/CASSANDRA-16148 Project: Cassandra Issue Type: Bug Components: Cluster/Gossip Reporter: Jordan West Assignee: Jordan West https://app.circleci.com/pipelines/github/jrwest/cassandra/53/workflows/95f9f401-1ef8-4b8d-9c64-3703d9669d95/jobs/771 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16148) GossiperTest#testHaveVersion3Nodes is failing on trunk
[ https://issues.apache.org/jira/browse/CASSANDRA-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jordan West updated CASSANDRA-16148: Bug Category: Parent values: Correctness(12982)Level 1 values: Test Failure(12990) Complexity: Normal Discovered By: Unit Test Severity: Normal Status: Open (was: Triage Needed) > GossiperTest#testHaveVersion3Nodes is failing on trunk > -- > > Key: CASSANDRA-16148 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16148 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip >Reporter: Jordan West >Assignee: Jordan West >Priority: Normal > > https://app.circleci.com/pipelines/github/jrwest/cassandra/53/workflows/95f9f401-1ef8-4b8d-9c64-3703d9669d95/jobs/771 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15214) OOMs caught and not rethrown
[ https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yifan Cai updated CASSANDRA-15214: -- Test and Documentation Plan: ci Status: Patch Available (was: Open) > OOMs caught and not rethrown > > > Key: CASSANDRA-15214 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15214 > Project: Cassandra > Issue Type: Bug > Components: Messaging/Client, Messaging/Internode >Reporter: Benedict Elliott Smith >Assignee: Yifan Cai >Priority: Normal > Fix For: 4.0, 4.0-rc > > Attachments: oom-experiments.zip > > Time Spent: 10m > Remaining Estimate: 0h > > Netty (at least, and perhaps elsewhere in Executors) catches all exceptions, > so presently there is no way to ensure that an OOM reaches the JVM handler to > trigger a crash/heapdump. > It may be that the simplest most consistent way to do this would be to have a > single thread spawned at startup that waits for any exceptions we must > propagate to the Runtime. > We could probably submit a patch upstream to Netty, but for a guaranteed > future proof approach, it may be worth paying the cost of a single thread. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15214) OOMs caught and not rethrown
[ https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204348#comment-17204348 ] Yifan Cai commented on CASSANDRA-15214: --- Talked with Benedict on Slack and cleared up my confusion. The {{JVMStabilityInspector}} is able to inspect the OOM error, but after it re-throws, Netty catches all throwables and simply logs them. It happens [here|https://github.com/netty/netty/blob/4.1/transport/src/main/java/io/netty/channel/AbstractChannelHandlerContext.java#L303-L316]. That is why the {{propagateOutOfMemory}} parameter was added. I submitted a PR that forcefully produces a heap space OOM error when a direct buffer OOM is caught. The PR also removes the {{propagateOutOfMemory}} parameter from the {{JVMStabilityInspector}}, because the change makes sure the instance can crash/exit properly on OOM (see the gist below). PR: https://github.com/apache/cassandra/pull/761 CI: https://app.circleci.com/pipelines/github/yifan-c/cassandra/112/workflows/293a4334-d2df-43f9-b532-1d79876701c1 I have also created a separate demo to show that the JVM invokes the OOM handler even if such an OOM error (not including the direct buffer one) is swallowed by a catch block. The code and the output can be found in the gist: https://gist.github.com/yifan-c/82ff4fd7fbe83fe41113f6f14cba4907. > OOMs caught and not rethrown > > > Key: CASSANDRA-15214 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15214 > Project: Cassandra > Issue Type: Bug > Components: Messaging/Client, Messaging/Internode >Reporter: Benedict Elliott Smith >Assignee: Yifan Cai >Priority: Normal > Fix For: 4.0, 4.0-rc > > Attachments: oom-experiments.zip > > Time Spent: 10m > Remaining Estimate: 0h > > Netty (at least, and perhaps elsewhere in Executors) catches all exceptions, > so presently there is no way to ensure that an OOM reaches the JVM handler to > trigger a crash/heapdump. 
> It may be that the simplest most consistent way to do this would be to have a > single thread spawned at startup that waits for any exceptions we must > propagate to the Runtime. > We could probably submit a patch upstream to Netty, but for a guaranteed > future proof approach, it may be worth paying the cost of a single thread. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
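The ticket description above suggests a single thread, spawned at startup, that waits for any error that must be propagated. A minimal sketch of that idea follows. It is not the approach taken in the PR mentioned in the comment, and the class name {{FatalErrorPropagator}}, the {{report}} method, and the injected handler are all hypothetical names chosen for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/**
 * Hypothetical sketch: one thread, started once, that receives fatal errors
 * from code paths that would otherwise swallow them (for example a framework
 * that catches Throwable and only logs it).
 */
final class FatalErrorPropagator {
    private final BlockingQueue<Throwable> pending = new LinkedBlockingQueue<>();

    /** onFatal is an assumption: a real node might halt the JVM here. */
    FatalErrorPropagator(Consumer<Throwable> onFatal) {
        Thread worker = new Thread(() -> {
            try {
                // Block until some other thread reports a fatal error.
                Throwable fatal = pending.take();
                onFatal.accept(fatal);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and let the thread end quietly.
                Thread.currentThread().interrupt();
            }
        }, "fatal-error-propagator");
        worker.setDaemon(true);
        worker.start();
    }

    /** Called from a catch block that must not let the error disappear. */
    void report(Throwable fatal) {
        pending.offer(fatal);
    }
}
```

A caller would construct the propagator once at startup and call {{report(t)}} from any catch block that sees an {{OutOfMemoryError}}. The handler is injected so that the sketch can be exercised without terminating the process; what the handler should actually do on a real node is left open here.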
[jira] [Commented] (CASSANDRA-15994) Fix flaky python dtest test_simple_rebuild - rebuild_test.TestRebuild
[ https://issues.apache.org/jira/browse/CASSANDRA-15994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204346#comment-17204346 ] David Capwell commented on CASSANDRA-15994: --- works for me. > Fix flaky python dtest test_simple_rebuild - rebuild_test.TestRebuild > - > > Key: CASSANDRA-15994 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15994 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: David Capwell >Priority: Normal > Fix For: 3.0.x > > > https://app.circleci.com/pipelines/github/dcapwell/cassandra/360/workflows/8e93a655-b66e-4bf2-8866-5f9a46487763/jobs/1847 > {code} > > assert self.rebuild_errors == 1, \ > 'rebuild errors should be 1, but found {}. Concurrent rebuild > should not be allowed, but one rebuild command should have > succeeded.'.format(self.rebuild_errors) > E AssertionError: rebuild errors should be 1, but found 0. Concurrent > rebuild should not be allowed, but one rebuild command should have succeeded. > E assert 0 == 1 > E+ where 0 = 0x7f29fe243518>.rebuild_errors > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
[ https://issues.apache.org/jira/browse/CASSANDRA-15833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204339#comment-17204339 ] Jordan West commented on CASSANDRA-15833: - The issue only affects trunk. My bad. Will open a JIRA to follow-up. The test is likely failing because we changed the logic to make the method actually work as expected. > Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657 > > > Key: CASSANDRA-15833 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15833 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 3.11.x, 4.0-beta > > Attachments: CASSANDRA-15833-3.11.patch, CASSANDRA-15833-4.0.patch > > > CASSANDRA-10657 introduced changes in how the ColumnFilter is interpreted. > This results in digest mismatch when querying incomplete set of columns from > a table with consistency that requires reaching instances running pre > CASSANDRA-10657 from nodes that include CASSANDRA-10657 (it was introduced in > Cassandra 3.4). > The fix is to bring back the previous behaviour until there are no instances > running pre CASSANDRA-10657 version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-14746) Ensure Netty Internode Messaging Refactor is Solid
[ https://issues.apache.org/jira/browse/CASSANDRA-14746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204328#comment-17204328 ] Vinay Chella edited comment on CASSANDRA-14746 at 9/29/20, 10:40 PM: - Thank you for following up on this [~pauloricardomg] {quote}a) Is work on this issue still active? {quote} Yes, it was active until I took a long break from work for personal reasons, if you see CASSANDRA-15181 and CASSANDRA-14764, I started some of this work but had to put it on hold, I am starting to get back in motion, should be able to make progress in coming weeks. {quote}b) Can we complete this issue once all subtasks are completed or are there more subtasks to be added? {quote} quoting from the description "The goal is that 4.0 should have better latency, more throughput, fewer threads, fewer context switches, less GC allocation, and faster recovery time" - I would say it is all about building the confidence in 4.0, we can add more tasks as we make progress and findings based on CASSANDRA-14747, CASSANDRA-15181, and CASSANDRA-14764. was (Author: vinaykumarcse): Thank you for following up on this [~pauloricardomg] {quote}a) Is work on this issue still active? {quote} Yes, it was active until I took a long break from work for personal reasons, if you see CASSANDRA-15181 and CASSANDRA-14764, I started some of this work but had to put it on hold, I am starting to get back in motion, should be able to make progress in coming weeks. {quote} b) Can we complete this issue once all subtasks are completed or are there more subtasks to be added? {quote} quoting from the description "The goal is that 4.0 should have better latency, more throughput, fewer threads, fewer context switches, less GC allocation, and faster recovery time" - I would say it is all about building the confidence in 4.0, I can sign up to add more tasks as we make progress and findings based on CASSANDRA-14747, CASSANDRA-15181, and CASSANDRA-14764. 
> Ensure Netty Internode Messaging Refactor is Solid > -- > > Key: CASSANDRA-14746 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14746 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Streaming and Messaging >Reporter: Joey Lynch >Assignee: Joey Lynch >Priority: Normal > Labels: 4.0-QA > Fix For: 4.0-beta > > > Before we release 4.0 let's ensure that the internode messaging refactor is > 100% solid. As internode messaging is naturally used in many code paths and > widely configurable we have a large number of cluster configurations and test > configurations that must be vetted. > We plan to vary the following: > * Version of Cassandra 3.0.17 vs 4.0-alpha > * Cluster sizes with *multi-dc* deployments ranging from 6 - 100 nodes > * Client request rates varying between 1k QPS and 100k QPS of varying sizes > and shapes (BATCH, INSERT, SELECT point, SELECT range, etc ...) > * Internode compression > * Internode SSL (as well as openssl vs jdk) > * Internode Coalescing options > We are looking to measure the following as appropriate: > * Latency distributions of reads and writes (lower is better) > * Scaling limit, aka maximum throughput before violating p99 latency > deadline of 10ms @ LOCAL_QUORUM, on a fixed hardware deployment for 100% > writes, 100% reads and 50-50 writes+reads (higher is better) > * Thread counts (lower is better) > * Context switches (lower is better) > * On-CPU time of tasks (higher periods without context switch is better) > * GC allocation rates / throughput for a fixed size heap (lower allocation > better) > * Streaming recovery time for a single node failure, i.e. can Cassandra > saturate the NIC > > The goal is that 4.0 should have better latency, more throughput, fewer > threads, fewer context switches, less GC allocation, and faster recovery > time. I'm putting Jason Brown as the reviewer since he implemented most of > the internode refactor. 
> Current collaborators driving this QA task: Dinesh Joshi, Jordan West, Joey > Lynch (Netflix), Vinay Chella (Netflix) > Owning committer(s): Jason Brown -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16127) NullPointerException when calling nodetool enablethrift
[ https://issues.apache.org/jira/browse/CASSANDRA-16127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204337#comment-17204337 ] David Capwell commented on CASSANDRA-16127: --- 3.0 and 3.11 bootstrap failed but were working before https://app.circleci.com/pipelines/github/dcapwell/cassandra/550/workflows/ca6c6551-01d4-4438-bd4d-c14e27fa9bfc/jobs/3035, looks like a change I made caused a regression; looking into it. > NullPointerException when calling nodetool enablethrift > --- > > Key: CASSANDRA-16127 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16127 > Project: Cassandra > Issue Type: Bug > Components: Messaging/Thrift >Reporter: Tibor Repasi >Assignee: David Capwell >Priority: Normal > Fix For: 2.2.x, 3.0.x, 3.11.x > > > Having thrift disabled, it's impossible to enable it again without restarting > the node: > {code} > $ nodetool statusthrift > not running > $ nodetool enablethrift > error: null > -- StackTrace -- > java.lang.NullPointerException > at > org.apache.cassandra.service.StorageService.startRPCServer(StorageService.java:392) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) > at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275) > at > com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) > at > com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) > at > 
com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) > at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138) > at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819) > at > com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801) > at > javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468) > at > javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76) > at > javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309) > at > javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1401) > at > javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829) > at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357) > at sun.rmi.transport.Transport$1.run(Transport.java:200) > at sun.rmi.transport.Transport$1.run(Transport.java:197) > at java.security.AccessController.doPrivileged(Native Method) > at sun.rmi.transport.Transport.serviceCall(Transport.java:196) > at > sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688) > at java.security.AccessController.doPrivileged(Native Method) > at > sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15996) Fix flaky python dtest test_expiration_overflow_policy_capnowarn - ttl_test.TestTTL
[ https://issues.apache.org/jira/browse/CASSANDRA-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204333#comment-17204333 ] David Capwell commented on CASSANDRA-15996: --- works for me. > Fix flaky python dtest test_expiration_overflow_policy_capnowarn - > ttl_test.TestTTL > --- > > Key: CASSANDRA-15996 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15996 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: David Capwell >Priority: Normal > Fix For: 3.11.x > > > https://app.circleci.com/pipelines/github/dcapwell/cassandra/361/workflows/3a42fa45-1f60-4c95-86a4-15a6773e384e/jobs/1860 > {code} > > assert warning, 'Log message should be print for CAP and > > CAP_NOWARN policy' > E AssertionError: Log message should be print for CAP and > CAP_NOWARN policy > E assert [] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default
[ https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204331#comment-17204331 ] David Capwell commented on CASSANDRA-16036: --- Updated CI results (pending) Circle: https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16036-trunk-2EBAD3E9-4394-4D42-9213-69A6590F37E2 Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/52/ trunk baseline: https://app.circleci.com/pipelines/github/dcapwell/cassandra/574/workflows/19f38f3c-9da3-42d5-ba5f-269f0285b791 > Add flag to disable chunk cache and disable by default > -- > > Key: CASSANDRA-16036 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16036 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 4.0-beta > > Attachments: 15229_128mb.png, 16036_128mb.png, > async-profile.collapsed.svg, > clustering-in-clause_latency_selects_baseline.png, > clustering-in-clause_latency_selects_baseline_attempt3.png, > clustering-in-clause_latency_under90_selects_baseline.png, > clustering-in-clause_latency_under90_selects_baseline_attempt3.png, > clustering-slice_latency_selects_baseline.png, > clustering-slice_latency_under90_selects_baseline.png, > medium-blobs_latency_selects_baseline.png, > medium-blobs_latency_under90_selects_baseline.png, > partition-single-row-read_latency_selects_baseline.png, > partition-single-row-read_latency_under90_selects_baseline.png > > > Chunk cache is enabled by default and doesn’t have a flag to disable without > impacting networking. In performance testing 4.0 against 3.0 I found that > reads were slower in 4.0 and after profiling found that the ChunkCache was > partially to blame; after disabling the chunk cache, read performance had > improved. 
> {code} > 40_w_cc-selects.hdr > #[Mean= 11.50063, StdDeviation = 13.44014] > #[Max =482.41254, Total count= 316477] > #[Buckets = 25, SubBuckets = 262144] > 40_wo_cc-selects.hdr > #[Mean= 9.82115, StdDeviation = 10.14270] > #[Max =522.36493, Total count= 317444] > #[Buckets = 25, SubBuckets = 262144] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
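To put the two histogram summaries above side by side, the relative change can be worked out directly from the reported figures. The sketch below uses only the numbers quoted in the ticket; the class and method names are illustrative, and the unit of the values is not stated in the ticket, so none is assumed.

```java
/** Illustrative arithmetic on the figures quoted in the ticket description. */
final class ChunkCacheComparison {
    /** Relative change from baseline to candidate, in percent (negative means lower). */
    static double percentChange(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // 40_w_cc-selects.hdr (chunk cache enabled) vs 40_wo_cc-selects.hdr (disabled)
        double meanChange = percentChange(11.50063, 9.82115);
        double stdDevChange = percentChange(13.44014, 10.14270);
        double maxChange = percentChange(482.41254, 522.36493);

        System.out.printf("mean:   %+.2f%%%n", meanChange);
        System.out.printf("stddev: %+.2f%%%n", stdDevChange);
        System.out.printf("max:    %+.2f%%%n", maxChange);
    }
}
```

With these inputs the mean is roughly 14.6% lower without the chunk cache and the standard deviation is lower as well, while the single reported maximum is higher. The two runs have slightly different sample counts (316477 vs 317444), so this is a rough comparison of summaries rather than a statistical test.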
[jira] [Commented] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default
[ https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204330#comment-17204330 ] David Capwell commented on CASSANDRA-16036: --- OK, so it looks like the read_repair tests and the gossiper test were broken by https://issues.apache.org/jira/browse/CASSANDRA-15833, so those failures can be ignored in these results. Will rerun the tests with the commit to enable the cache in tests. > Add flag to disable chunk cache and disable by default > -- > > Key: CASSANDRA-16036 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16036 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 4.0-beta > > Attachments: 15229_128mb.png, 16036_128mb.png, > async-profile.collapsed.svg, > clustering-in-clause_latency_selects_baseline.png, > clustering-in-clause_latency_selects_baseline_attempt3.png, > clustering-in-clause_latency_under90_selects_baseline.png, > clustering-in-clause_latency_under90_selects_baseline_attempt3.png, > clustering-slice_latency_selects_baseline.png, > clustering-slice_latency_under90_selects_baseline.png, > medium-blobs_latency_selects_baseline.png, > medium-blobs_latency_under90_selects_baseline.png, > partition-single-row-read_latency_selects_baseline.png, > partition-single-row-read_latency_under90_selects_baseline.png > > > Chunk cache is enabled by default and doesn’t have a flag to disable without > impacting networking. In performance testing 4.0 against 3.0 I found that > reads were slower in 4.0 and after profiling found that the ChunkCache was > partially to blame; after disabling the chunk cache, read performance had > improved. 
> {code} > 40_w_cc-selects.hdr > #[Mean= 11.50063, StdDeviation = 13.44014] > #[Max =482.41254, Total count= 316477] > #[Buckets = 25, SubBuckets = 262144] > 40_wo_cc-selects.hdr > #[Mean= 9.82115, StdDeviation = 10.14270] > #[Max =522.36493, Total count= 317444] > #[Buckets = 25, SubBuckets = 262144] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling
[ https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204329#comment-17204329 ] David Capwell commented on CASSANDRA-15991: --- OK, so it looks like the read_repair tests and the gossiper test were broken by https://issues.apache.org/jira/browse/CASSANDRA-15833, so those failures can be ignored in these results. [~Bereng] can you look into the `org.apache.cassandra.tools.SSTableRepairedAtSetterTest#testFilesArg` test? > 15583 - Add UX tests to intree LHF tooling > -- > > Key: CASSANDRA-15991 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15991 > Project: Cassandra > Issue Type: Improvement > Components: Test/unit >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0-beta > > > As per CASSANDRA-15583 many in tree tools lack proper UX tooling: mandatory > params are indeed mandatory, 'help' produces an actual help, return codes etc > This ticket is an attempt to add it to those tools that classify as LHF. > Other tools such as nodetool, with many sub-commands, deserve a separate > ticket of their own -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
[ https://issues.apache.org/jira/browse/CASSANDRA-15833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204326#comment-17204326 ] David Capwell edited comment on CASSANDRA-15833 at 9/29/20, 10:24 PM: -- Looks like this broke a unit test (https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/org.apache.cassandra.gms/GossiperTest/testHaveVersion3Nodes/history/) and read repair python dtest (https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/dtest.read_repair_test/TestReadRepair/test_alter_rf_and_run_read_repair/history/ and https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/dtest-offheap.read_repair_test/TestReadRepairGuarantees/test_atomic_writes_blocking_/history/). Didn't check 3.11 builds, only trunk. was (Author: dcapwell): Looks like this broke a unit test (https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/org.apache.cassandra.gms/GossiperTest/testHaveVersion3Nodes/history/) and read repair python dtest (https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/dtest.read_repair_test/TestReadRepair/test_alter_rf_and_run_read_repair/history/ and https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/dtest-offheap.read_repair_test/TestReadRepairGuarantees/test_atomic_writes_blocking_/history/). > Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657 > > > Key: CASSANDRA-15833 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15833 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 3.11.x, 4.0-beta > > Attachments: CASSANDRA-15833-3.11.patch, CASSANDRA-15833-4.0.patch > > > CASSANDRA-10657 introduced changes in how the ColumnFilter is interpreted. 
> This results in digest mismatch when querying incomplete set of columns from > a table with consistency that requires reaching instances running pre > CASSANDRA-10657 from nodes that include CASSANDRA-10657 (it was introduced in > Cassandra 3.4). > The fix is to bring back the previous behaviour until there are no instances > running pre CASSANDRA-10657 version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-14746) Ensure Netty Internode Messaging Refactor is Solid
[ https://issues.apache.org/jira/browse/CASSANDRA-14746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204328#comment-17204328 ] Vinay Chella commented on CASSANDRA-14746: -- Thank you for following up on this [~pauloricardomg] {quote}a) Is work on this issue still active? {quote} Yes, it was active until I took a long break from work for personal reasons, if you see CASSANDRA-15181 and CASSANDRA-14764, I started some of this work but had to put it on hold, I am starting to get back in motion, should be able to make progress in coming weeks. {quote} b) Can we complete this issue once all subtasks are completed or are there more subtasks to be added? {quote} quoting from the description "The goal is that 4.0 should have better latency, more throughput, fewer threads, fewer context switches, less GC allocation, and faster recovery time" - I would say it is all about building the confidence in 4.0, I can sign up to add more tasks as we make progress and findings based on CASSANDRA-14747, CASSANDRA-15181, and CASSANDRA-14764. > Ensure Netty Internode Messaging Refactor is Solid > -- > > Key: CASSANDRA-14746 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14746 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/Streaming and Messaging >Reporter: Joey Lynch >Assignee: Joey Lynch >Priority: Normal > Labels: 4.0-QA > Fix For: 4.0-beta > > > Before we release 4.0 let's ensure that the internode messaging refactor is > 100% solid. As internode messaging is naturally used in many code paths and > widely configurable we have a large number of cluster configurations and test > configurations that must be vetted. > We plan to vary the following: > * Version of Cassandra 3.0.17 vs 4.0-alpha > * Cluster sizes with *multi-dc* deployments ranging from 6 - 100 nodes > * Client request rates varying between 1k QPS and 100k QPS of varying sizes > and shapes (BATCH, INSERT, SELECT point, SELECT range, etc ...) 
> * Internode compression > * Internode SSL (as well as openssl vs jdk) > * Internode Coalescing options > We are looking to measure the following as appropriate: > * Latency distributions of reads and writes (lower is better) > * Scaling limit, aka maximum throughput before violating p99 latency > deadline of 10ms @ LOCAL_QUORUM, on a fixed hardware deployment for 100% > writes, 100% reads and 50-50 writes+reads (higher is better) > * Thread counts (lower is better) > * Context switches (lower is better) > * On-CPU time of tasks (higher periods without context switch is better) > * GC allocation rates / throughput for a fixed size heap (lower allocation > better) > * Streaming recovery time for a single node failure, i.e. can Cassandra > saturate the NIC > > The goal is that 4.0 should have better latency, more throughput, fewer > threads, fewer context switches, less GC allocation, and faster recovery > time. I'm putting Jason Brown as the reviewer since he implemented most of > the internode refactor. > Current collaborators driving this QA task: Dinesh Joshi, Jordan West, Joey > Lynch (Netflix), Vinay Chella (Netflix) > Owning committer(s): Jason Brown -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
[ https://issues.apache.org/jira/browse/CASSANDRA-15833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204326#comment-17204326 ] David Capwell commented on CASSANDRA-15833: --- Looks like this broke a unit test (https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/org.apache.cassandra.gms/GossiperTest/testHaveVersion3Nodes/history/) and read repair python dtest (https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/dtest.read_repair_test/TestReadRepair/test_alter_rf_and_run_read_repair/history/ and https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/dtest-offheap.read_repair_test/TestReadRepairGuarantees/test_atomic_writes_blocking_/history/). > Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657 > > > Key: CASSANDRA-15833 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15833 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Jacek Lewandowski >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 3.11.x, 4.0-beta > > Attachments: CASSANDRA-15833-3.11.patch, CASSANDRA-15833-4.0.patch > > > CASSANDRA-10657 introduced changes in how the ColumnFilter is interpreted. > This results in digest mismatch when querying incomplete set of columns from > a table with consistency that requires reaching instances running pre > CASSANDRA-10657 from nodes that include CASSANDRA-10657 (it was introduced in > Cassandra 3.4). > The fix is to bring back the previous behaviour until there are no instances > running pre CASSANDRA-10657 version. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15581) 4.0 quality testing: Compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-15581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204322#comment-17204322 ] Paulo Motta commented on CASSANDRA-15581: - Hey [~blerer], did you have the chance to lay out a plan for this? For context, I'm asking this to check the status of the 4.0 quality epic as part of [this discussion|https://www.mail-archive.com/dev@cassandra.apache.org/msg15881.html] on the mailing list. > 4.0 quality testing: Compaction > --- > > Key: CASSANDRA-15581 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15581 > Project: Cassandra > Issue Type: Task > Components: Test/dtest/python >Reporter: Josh McKenzie >Assignee: Benjamin Lerer >Priority: Normal > Fix For: 4.0-beta > > > Reference [doc from > NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#] > for context. > *Shepherd: Marcus Eriksson* > Alongside the local and distributed read/write paths, we'll also want to > validate compaction. CASSANDRA-6696 introduced substantial > changes/improvements that require testing (esp. JBOD). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15537) 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
[ https://issues.apache.org/jira/browse/CASSANDRA-15537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204321#comment-17204321 ] Paulo Motta commented on CASSANDRA-15537: - {quote}I don't know where exactly to do that, but any workflow changes require the assistance of infra and I strongly suspect adding a new relationship between tickets will as well. {quote} hmm OK, can't be bothered right now, guess it's not a big deal to live with that. :P Thanks! > 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test > - > > Key: CASSANDRA-15537 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15537 > Project: Cassandra > Issue Type: Task > Components: Test/dtest/java, Test/dtest/python >Reporter: Josh McKenzie >Assignee: Yifan Cai >Priority: Normal > Fix For: 4.0-beta > > > Reference [doc from > NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#] > for context. > Execution of upgrade and diff tests via cassandra-diff have proven to be one > of the most effective approaches toward identifying issues with the local > read/write path. These include instances of data loss, data corruption, data > resurrection, incorrect responses to queries, incomplete responses, and > others. Upgrade and diff tests can be executed concurrent with fault > injection (such as host or network failure); as well as during mixed-version > scenarios (such as upgrading half of the instances in a cluster, and running > upgradesstables on only half of the upgraded instances). > Upgrade and diff tests are expected to continue through the release cycle, > and are a great way for contributors to gain confidence in the correctness of > the database under their own workloads. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default
[ https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204320#comment-17204320 ] David Capwell commented on CASSANDRA-16036: --- Rebased, which broke JMX (since the chunk cache isn't enabled), so I enabled the chunk cache in testing and the tests are passing. I am also running CI against trunk: the failing tests fail consistently and also fail on other branches, so I am isolating the changes to make sure those tests are not broken here. > Add flag to disable chunk cache and disable by default > -- > > Key: CASSANDRA-16036 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16036 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 4.0-beta > > Attachments: 15229_128mb.png, 16036_128mb.png, > async-profile.collapsed.svg, > clustering-in-clause_latency_selects_baseline.png, > clustering-in-clause_latency_selects_baseline_attempt3.png, > clustering-in-clause_latency_under90_selects_baseline.png, > clustering-in-clause_latency_under90_selects_baseline_attempt3.png, > clustering-slice_latency_selects_baseline.png, > clustering-slice_latency_under90_selects_baseline.png, > medium-blobs_latency_selects_baseline.png, > medium-blobs_latency_under90_selects_baseline.png, > partition-single-row-read_latency_selects_baseline.png, > partition-single-row-read_latency_under90_selects_baseline.png > > > Chunk cache is enabled by default and doesn’t have a flag to disable without > impacting networking. In performance testing 4.0 against 3.0 I found that > reads were slower in 4.0 and after profiling found that the ChunkCache was > partially to blame; after disabling the chunk cache, read performance had > improved. 
> {code} > 40_w_cc-selects.hdr > #[Mean= 11.50063, StdDeviation = 13.44014] > #[Max =482.41254, Total count= 316477] > #[Buckets = 25, SubBuckets = 262144] > 40_wo_cc-selects.hdr > #[Mean= 9.82115, StdDeviation = 10.14270] > #[Max =522.36493, Total count= 317444] > #[Buckets = 25, SubBuckets = 262144] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
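For a rough sense of scale, the gap between the two Mean values quoted above can be expressed as a relative change. The sketch below is only arithmetic on those two reported numbers; the class and method names are made up for illustration, and the units are whatever the .hdr files use (the message does not say).

```java
import java.util.Locale;

public class MeanLatencyDelta {
    // Fraction by which `candidate` is lower than `baseline` (0.10 means 10% lower).
    static double relativeReduction(double baseline, double candidate) {
        return (baseline - candidate) / baseline;
    }

    public static void main(String[] args) {
        double meanWithChunkCache = 11.50063;    // Mean from 40_w_cc-selects.hdr
        double meanWithoutChunkCache = 9.82115;  // Mean from 40_wo_cc-selects.hdr

        double reduction = relativeReduction(meanWithChunkCache, meanWithoutChunkCache);
        // Locale.ROOT keeps the decimal separator stable across environments.
        System.out.println(String.format(Locale.ROOT, "Mean is lower by %.1f%%", reduction * 100));
    }
}
```

This covers the mean only. The reported Max is higher in the run without the chunk cache (522.36493 vs 482.41254), so the calculation says nothing about tail latency.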
[jira] [Commented] (CASSANDRA-15537) 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
[ https://issues.apache.org/jira/browse/CASSANDRA-15537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204318#comment-17204318 ] Brandon Williams commented on CASSANDRA-15537: -- I don't know where exactly to do that, but any workflow changes require the assistance of infra and I strongly suspect adding a new relationship between tickets will as well. > 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test > - > > Key: CASSANDRA-15537 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15537 > Project: Cassandra > Issue Type: Task > Components: Test/dtest/java, Test/dtest/python >Reporter: Josh McKenzie >Assignee: Yifan Cai >Priority: Normal > Fix For: 4.0-beta > > > Reference [doc from > NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#] > for context. > Execution of upgrade and diff tests via cassandra-diff have proven to be one > of the most effective approaches toward identifying issues with the local > read/write path. These include instances of data loss, data corruption, data > resurrection, incorrect responses to queries, incomplete responses, and > others. Upgrade and diff tests can be executed concurrent with fault > injection (such as host or network failure); as well as during mixed-version > scenarios (such as upgrading half of the instances in a cluster, and running > upgradesstables on only half of the upgraded instances). > Upgrade and diff tests are expected to continue through the release cycle, > and are a great way for contributors to gain confidence in the correctness of > the database under their own workloads. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15537) 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
[ https://issues.apache.org/jira/browse/CASSANDRA-15537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204316#comment-17204316 ] Paulo Motta commented on CASSANDRA-15537: - Awesome, thanks for the update [~yifanc]. I've set the ticket to "In progress" to reflect its status. I also added the tickets you mentioned as "related", though the proper relationship type would be "Found while testing". Do you know how easy it is to add a new JIRA relationship state, [~brandon.williams]? (Pinging you because I recall you fixing JIRA workflow issues before, but I can't remember how to do it.) > 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test > - > > Key: CASSANDRA-15537 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15537 > Project: Cassandra > Issue Type: Task > Components: Test/dtest/java, Test/dtest/python >Reporter: Josh McKenzie >Assignee: Yifan Cai >Priority: Normal > Fix For: 4.0-beta > > > Reference [doc from > NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#] > for context. > Execution of upgrade and diff tests via cassandra-diff have proven to be one > of the most effective approaches toward identifying issues with the local > read/write path. These include instances of data loss, data corruption, data > resurrection, incorrect responses to queries, incomplete responses, and > others. Upgrade and diff tests can be executed concurrent with fault > injection (such as host or network failure); as well as during mixed-version > scenarios (such as upgrading half of the instances in a cluster, and running > upgradesstables on only half of the upgraded instances). > Upgrade and diff tests are expected to continue through the release cycle, > and are a great way for contributors to gain confidence in the correctness of > the database under their own workloads. 
[jira] [Commented] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength
[ https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204313#comment-17204313 ] David Capwell commented on CASSANDRA-16147: --- overall +1 from me. I do think it would be good to add a test which reads/writes SSTables that would have hit the issue as it seems that we are missing larger data in java and python testing. > ValueAccessor is using signed shorts in sliceWithShortLength > > > Key: CASSANDRA-16147 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16147 > Project: Cassandra > Issue Type: Bug > Components: Local/Other >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0-beta > > > ValueAccessor is using a signed short when interpreting byte lengths, causing > exceptions when reading blobs over 32767 bytes in length -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
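The failure mode described in this ticket can be illustrated with a minimal sketch. This is not the actual ValueAccessor code: the class and method names are hypothetical, and it assumes the length prefix is a 2-byte value read from a java.nio.ByteBuffer. It shows why a length above 32767 turns negative when the two bytes are read as a signed short, and how masking with 0xFFFF recovers the intended value.

```java
import java.nio.ByteBuffer;

public class ShortLengthDemo {
    // Interprets the 2-byte length prefix as a signed short: range [-32768, 32767].
    static int readSignedShortLength(ByteBuffer buf, int offset) {
        return buf.getShort(offset);
    }

    // Interprets the same 2 bytes as an unsigned value: range [0, 65535].
    static int readUnsignedShortLength(ByteBuffer buf, int offset) {
        return buf.getShort(offset) & 0xFFFF;
    }

    public static void main(String[] args) {
        int blobLength = 40000; // larger than Short.MAX_VALUE (32767)
        ByteBuffer buf = ByteBuffer.allocate(2);
        buf.putShort(0, (short) blobLength); // only the low 16 bits are stored

        System.out.println(readSignedShortLength(buf, 0));   // prints -25536
        System.out.println(readUnsignedShortLength(buf, 0)); // prints 40000
    }
}
```

A negative length handed to a slice or array allocation would typically surface as an exception, which is consistent with the symptom reported for blobs over 32767 bytes.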
[jira] [Updated] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength
[ https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-16147: -- Reviewers: David Capwell, David Capwell (was: David Capwell) David Capwell, David Capwell (was: David Capwell) Status: Review In Progress (was: Patch Available) > ValueAccessor is using signed shorts in sliceWithShortLength > > > Key: CASSANDRA-16147 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16147 > Project: Cassandra > Issue Type: Bug > Components: Local/Other >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0-beta > > > ValueAccessor is using a signed short when interpreting byte lengths, causing > exceptions when reading blobs over 32767 bytes in length -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength
[ https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-16147: Test and Documentation Plan: unit tests / circle Status: Patch Available (was: In Progress) [trunk |https://github.com/bdeggleston/cassandra/tree/16147-trunk] > ValueAccessor is using signed shorts in sliceWithShortLength > > > Key: CASSANDRA-16147 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16147 > Project: Cassandra > Issue Type: Bug > Components: Local/Other >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0-beta > > > ValueAccessor is using a signed short when interpreting byte lengths, causing > exceptions when reading blobs over 32767 bytes in length -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength
[ https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-16147: Bug Category: Parent values: Availability(12983)Level 1 values: Response Crash(12991) Complexity: Low Hanging Fruit Discovered By: Workload Replay Reviewers: David Capwell Severity: Critical Status: Open (was: Triage Needed) > ValueAccessor is using signed shorts in sliceWithShortLength > > > Key: CASSANDRA-16147 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16147 > Project: Cassandra > Issue Type: Bug > Components: Local/Other >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0-beta > > > ValueAccessor is using a signed short when interpreting byte lengths, causing > exceptions when reading blobs over 32767 bytes in length -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength
Blake Eggleston created CASSANDRA-16147: --- Summary: ValueAccessor is using signed shorts in sliceWithShortLength Key: CASSANDRA-16147 URL: https://issues.apache.org/jira/browse/CASSANDRA-16147 Project: Cassandra Issue Type: Bug Components: Local/Other Reporter: Blake Eggleston Assignee: Blake Eggleston ValueAccessor is using a signed short when interpreting byte lengths, causing exceptions when reading blobs over 32767 bytes in length -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength
[ https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-16147: Fix Version/s: 4.0-beta > ValueAccessor is using signed shorts in sliceWithShortLength > > > Key: CASSANDRA-16147 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16147 > Project: Cassandra > Issue Type: Bug > Components: Local/Other >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0-beta > > > ValueAccessor is using a signed short when interpreting byte lengths, causing > exceptions when reading blobs over 32767 bytes in length -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Issue Comment Deleted] (CASSANDRA-15799) CorruptSSTableException when compacting a 3.0 format sstable that was originally created as an outcome of 2.1 sstable upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15799: -- Comment: was deleted (was: CI results (pending): Circle: https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16036-trunk-B4951B6C-9967-4B3D-A93A-5C5539DDE804 Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/51/) > CorruptSSTableException when compacting a 3.0 format sstable that was > originally created as an outcome of 2.1 sstable upgrade > - > > Key: CASSANDRA-15799 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15799 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction, Local/SSTable >Reporter: Sumanth Pasupuleti >Assignee: David Capwell >Priority: Normal > Fix For: 3.0.x > > Attachments: fake-deletedcell-if-bad.patch > > > Below is the exception with stack trace. This issue is reproduce-able. > {code:java} > DEBUG [CompactionExecutor:10] 2020-05-07 19:33:34,268 CompactionTask.java:158 > - Compacting (a3ea9fc0-9099-11ea-933f-c5e852f71338) > [/mnt/data/cassandra/data/ks/cf/md-10802-big-Data.db:level=0, ] > ERROR [CompactionExecutor:10] 2020-05-07 19:33:34,275 > CassandraDaemon.java:208 - Exception in thread > Thread[CompactionExecutor:10,1,RMI Runtime] > org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: > /mnt/data/cassandra/data/ks/cf/md-10802-big-Data.db > at > org.apache.cassandra.io.sstable.SSTableIdentityIterator.computeNext(SSTableIdentityIterator.java:105) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.io.sstable.SSTableIdentityIterator.computeNext(SSTableIdentityIterator.java:30) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > 
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.utils.MergeIterator$TrivialOneToOne.computeNext(MergeIterator.java:460) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:534) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:394) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:111) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.ColumnIndex.writeAndBuildIndex(ColumnIndex.java:52) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:165) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:126) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:57) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > 
org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:109) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:195) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:89) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61) > ~[nf-cassan
[jira] [Commented] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default
[ https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204290#comment-17204290 ] David Capwell commented on CASSANDRA-16036: --- CI results (pending): Circle: https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16036-trunk-B4951B6C-9967-4B3D-A93A-5C5539DDE804 Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/51/ > Add flag to disable chunk cache and disable by default > -- > > Key: CASSANDRA-16036 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16036 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 4.0-beta > > Attachments: 15229_128mb.png, 16036_128mb.png, > async-profile.collapsed.svg, > clustering-in-clause_latency_selects_baseline.png, > clustering-in-clause_latency_selects_baseline_attempt3.png, > clustering-in-clause_latency_under90_selects_baseline.png, > clustering-in-clause_latency_under90_selects_baseline_attempt3.png, > clustering-slice_latency_selects_baseline.png, > clustering-slice_latency_under90_selects_baseline.png, > medium-blobs_latency_selects_baseline.png, > medium-blobs_latency_under90_selects_baseline.png, > partition-single-row-read_latency_selects_baseline.png, > partition-single-row-read_latency_under90_selects_baseline.png > > > Chunk cache is enabled by default and doesn’t have a flag to disable without > impacting networking. In performance testing 4.0 against 3.0 I found that > reads were slower in 4.0 and after profiling found that the ChunkCache was > partially to blame; after disabling the chunk cache, read performance had > improved. 
> {code} > 40_w_cc-selects.hdr > #[Mean= 11.50063, StdDeviation = 13.44014] > #[Max =482.41254, Total count= 316477] > #[Buckets = 25, SubBuckets = 262144] > 40_wo_cc-selects.hdr > #[Mean= 9.82115, StdDeviation = 10.14270] > #[Max =522.36493, Total count= 317444] > #[Buckets = 25, SubBuckets = 262144] > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15799) CorruptSSTableException when compacting a 3.0 format sstable that was originally created as an outcome of 2.1 sstable upgrade
[ https://issues.apache.org/jira/browse/CASSANDRA-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204289#comment-17204289 ] David Capwell commented on CASSANDRA-15799: --- CI results (pending): Circle: https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16036-trunk-B4951B6C-9967-4B3D-A93A-5C5539DDE804 Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/51/ > CorruptSSTableException when compacting a 3.0 format sstable that was > originally created as an outcome of 2.1 sstable upgrade > - > > Key: CASSANDRA-15799 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15799 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction, Local/SSTable >Reporter: Sumanth Pasupuleti >Assignee: David Capwell >Priority: Normal > Fix For: 3.0.x > > Attachments: fake-deletedcell-if-bad.patch > > > Below is the exception with stack trace. This issue is reproduce-able. > {code:java} > DEBUG [CompactionExecutor:10] 2020-05-07 19:33:34,268 CompactionTask.java:158 > - Compacting (a3ea9fc0-9099-11ea-933f-c5e852f71338) > [/mnt/data/cassandra/data/ks/cf/md-10802-big-Data.db:level=0, ] > ERROR [CompactionExecutor:10] 2020-05-07 19:33:34,275 > CassandraDaemon.java:208 - Exception in thread > Thread[CompactionExecutor:10,1,RMI Runtime] > org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: > /mnt/data/cassandra/data/ks/cf/md-10802-big-Data.db > at > org.apache.cassandra.io.sstable.SSTableIdentityIterator.computeNext(SSTableIdentityIterator.java:105) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.io.sstable.SSTableIdentityIterator.computeNext(SSTableIdentityIterator.java:30) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > 
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.utils.MergeIterator$TrivialOneToOne.computeNext(MergeIterator.java:460) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:534) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:394) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:111) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.ColumnIndex.writeAndBuildIndex(ColumnIndex.java:52) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:165) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:126) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:57) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > 
org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:109) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:195) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:89) > ~[nf-cassandra-3.0.19.8.jar:3.0.19.8] > at > org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTas
[jira] [Commented] (CASSANDRA-15993) Fix flaky python dtest test_view_metadata_cleanup - materialized_views_test.TestMaterializedViews
[ https://issues.apache.org/jira/browse/CASSANDRA-15993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204280#comment-17204280 ] Adam Holmberg commented on CASSANDRA-15993: --- I'm not seeing failures in ci-cassandra right now, but I think I have a local setup that produces this failure intermittently. I'm going to look into it. > Fix flaky python dtest test_view_metadata_cleanup - > materialized_views_test.TestMaterializedViews > - > > Key: CASSANDRA-15993 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15993 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: David Capwell >Assignee: Adam Holmberg >Priority: Normal > Fix For: 4.0-beta > > > https://app.circleci.com/pipelines/github/dcapwell/cassandra/355/workflows/7b8df61d-706f-4094-a206-7cdc6b4e0451/jobs/1818 > {code} > E cassandra.OperationTimedOut: errors={'127.0.0.2': 'Client request > timeout. See Session.execute[_async](timeout)'}, last_host=127.0.0.2 > cassandra/cluster.py:4026: OperationTimedOut > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Assigned] (CASSANDRA-15993) Fix flaky python dtest test_view_metadata_cleanup - materialized_views_test.TestMaterializedViews
[ https://issues.apache.org/jira/browse/CASSANDRA-15993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Holmberg reassigned CASSANDRA-15993: - Assignee: Adam Holmberg > Fix flaky python dtest test_view_metadata_cleanup - > materialized_views_test.TestMaterializedViews > - > > Key: CASSANDRA-15993 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15993 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: David Capwell >Assignee: Adam Holmberg >Priority: Normal > Fix For: 4.0-beta > > > https://app.circleci.com/pipelines/github/dcapwell/cassandra/355/workflows/7b8df61d-706f-4094-a206-7cdc6b4e0451/jobs/1818 > {code} > E cassandra.OperationTimedOut: errors={'127.0.0.2': 'Client request > timeout. See Session.execute[_async](timeout)'}, last_host=127.0.0.2 > cassandra/cluster.py:4026: OperationTimedOut > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling
[ https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204238#comment-17204238 ] David Capwell commented on CASSANDRA-15991: --- I also see that read_repair_test.TestReadRepairGuarantees is failing both with and without vnodes, so it does not look flaky. Given the code in this patch, I feel that it is unrelated, so I will try to rerun the tests on trunk. > 15583 - Add UX tests to intree LHF tooling > -- > > Key: CASSANDRA-15991 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15991 > Project: Cassandra > Issue Type: Improvement > Components: Test/unit >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0-beta > > > As per CASSANDRA-15583 many in tree tools lack proper UX tooling: mandatory > params are indeed mandatory, 'help' produces an actual help, return codes etc > This ticket is an attempt to add it to those tools that classify as LHF. > Other tools such as nodetool, with many sub-commands, deserve a separate > ticket of their own
[jira] [Comment Edited] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling
[ https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204237#comment-17204237 ]

David Capwell edited comment on CASSANDRA-15991 at 9/29/20, 7:37 PM:
-

Looks like a new test is failing in Circle CI - https://app.circleci.com/pipelines/github/dcapwell/cassandra/572/workflows/5bbb328d-9497-4291-8a48-ca9e04019908/jobs/3141

testFilesArg - org.apache.cassandra.tools.SSTableRepairedAtSetterTest
{code}
[org.apache.cassandra.tools.SSTableRepairedAtSetter, --really-set, --is-repaired, -f, /tmp/cassandra/sstablelist.txt] exited with code -1
stderr:
java.lang.RuntimeException: java.nio.file.NoSuchFileException: /tmp/cassandra/sstablelist.txt
    at org.apache.cassandra.tools.ToolRunner.runClassAsTool(ToolRunner.java:99)
    at org.apache.cassandra.tools.ToolRunner.invokeClass(ToolRunner.java:256)
    at org.apache.cassandra.tools.ToolRunner.invokeClass(ToolRunner.java:232)
    at org.apache.cassandra.tools.SSTableRepairedAtSetterTest.testFilesArg(SSTableRepairedAtSetterTest.java:117)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:38)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:534)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1196)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:1041)
Caused by: java.nio.file.NoSuchFileException: /tmp/cassandra/sstablelist.txt
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
    at java.nio.file.Files.newByteChannel(Files.java:361)
    at java.nio.file.Files.newByteChannel(Files.java:407)
    at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
    at java.nio.file.Files.newInputStream(Files.java:152)
    at java.nio.file.Files.newBufferedReader(Files.java:2784)
    at java.nio.file.Files.readAllLines(Files.java:3202)
    at org.apache.cassandra.tools.SSTableRepairedAtSetter.main(SSTableRepairedAtSetter.java:72)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.cassandra.tools.ToolRunner.runClassAsTool(ToolRunner.java:79)
    ... 25 more
stdout:

junit.framework.AssertionFailedError: [org.apache.cassandra.tools.SSTableRepairedAtSetter, --really-set, --is-repaired, -f, /tmp/cassandra/sstablelist.txt] exited with code -1
stderr:
java.lang.RuntimeException: java.nio.file.NoSuchFileException: /tmp/cassandra/sstablelist.txt
    at org.apache.cassandra.tools.ToolRunner.runClassAsTool(ToolRunner.java:99)
    at org.apache.cassandra.tools.ToolRunner.invokeClass(ToolRunner.java:256)
    at org.apache.cassandra.tools.ToolRunner.invokeClass(ToolRunner.java:232)
    at
[jira] [Commented] (CASSANDRA-16074) Add metric for client concurrent byte throttle
[ https://issues.apache.org/jira/browse/CASSANDRA-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204217#comment-17204217 ] David Capwell commented on CASSANDRA-16074: --- thanks for the fix [~clohfink] +1 from me. > Add metric for client concurrent byte throttle > -- > > Key: CASSANDRA-16074 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16074 > Project: Cassandra > Issue Type: New Feature > Components: Messaging/Client, Observability/Metrics >Reporter: Chris Lohfink >Assignee: Chris Lohfink >Priority: Normal > Fix For: 4.0-beta > > > Add a metric to expose the current bytes and bytes per ip used that is used > in the existing throttle so its possible to determine what to set it to. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default
[ https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-16036: -- Status: Ready to Commit (was: Review In Progress) > Add flag to disable chunk cache and disable by default > -- > > Key: CASSANDRA-16036 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16036 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 4.0-beta > > Attachments: 15229_128mb.png, 16036_128mb.png, > async-profile.collapsed.svg, > clustering-in-clause_latency_selects_baseline.png, > clustering-in-clause_latency_selects_baseline_attempt3.png, > clustering-in-clause_latency_under90_selects_baseline.png, > clustering-in-clause_latency_under90_selects_baseline_attempt3.png, > clustering-slice_latency_selects_baseline.png, > clustering-slice_latency_under90_selects_baseline.png, > medium-blobs_latency_selects_baseline.png, > medium-blobs_latency_under90_selects_baseline.png, > partition-single-row-read_latency_selects_baseline.png, > partition-single-row-read_latency_under90_selects_baseline.png > > > Chunk cache is enabled by default and doesn’t have a flag to disable without > impacting networking. In performance testing 4.0 against 3.0 I found that > reads were slower in 4.0 and after profiling found that the ChunkCache was > partially to blame; after disabling the chunk cache, read performance had > improved. 
> {code}
> 40_w_cc-selects.hdr
> #[Mean= 11.50063, StdDeviation = 13.44014]
> #[Max =482.41254, Total count= 316477]
> #[Buckets = 25, SubBuckets = 262144]
> 40_wo_cc-selects.hdr
> #[Mean= 9.82115, StdDeviation = 10.14270]
> #[Max =522.36493, Total count= 317444]
> #[Buckets = 25, SubBuckets = 262144]
> {code}
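To make the size of the difference between the two histogram summaries quoted in the description easier to read, here is a minimal sketch that computes the relative change of each statistic when the chunk cache is disabled. The numbers are copied from the `40_w_cc` and `40_wo_cc` summaries; the dictionary layout and the `relative_change` helper are illustrative assumptions, and the unit of the latencies is not stated in the ticket.

```python
# Values copied from the two HDR summaries quoted in the ticket description.
# The unit is not stated in the ticket, so none is assumed here.
with_chunk_cache = {"mean": 11.50063, "stddev": 13.44014, "max": 482.41254}
without_chunk_cache = {"mean": 9.82115, "stddev": 10.14270, "max": 522.36493}


def relative_change(before, after):
    """Return (after - before) / before; negative means the value went down."""
    return (after - before) / before


if __name__ == "__main__":
    for stat in with_chunk_cache:
        change = relative_change(with_chunk_cache[stat], without_chunk_cache[stat])
        print(f"{stat}: {change:+.1%}")
```

Under these numbers the mean and standard deviation drop while the single maximum rises, which is consistent with the description's claim that read performance improved on average.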
[jira] [Commented] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default
[ https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204208#comment-17204208 ] David Capwell commented on CASSANDRA-16036: --- Now that [~jasonstack] is a committer (congrats!) I will move forward to merge this later today (after rerunning CI on it). > Add flag to disable chunk cache and disable by default > -- > > Key: CASSANDRA-16036 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16036 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Local Write-Read Paths >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Fix For: 4.0-beta > > Attachments: 15229_128mb.png, 16036_128mb.png, > async-profile.collapsed.svg, > clustering-in-clause_latency_selects_baseline.png, > clustering-in-clause_latency_selects_baseline_attempt3.png, > clustering-in-clause_latency_under90_selects_baseline.png, > clustering-in-clause_latency_under90_selects_baseline_attempt3.png, > clustering-slice_latency_selects_baseline.png, > clustering-slice_latency_under90_selects_baseline.png, > medium-blobs_latency_selects_baseline.png, > medium-blobs_latency_under90_selects_baseline.png, > partition-single-row-read_latency_selects_baseline.png, > partition-single-row-read_latency_under90_selects_baseline.png > > > Chunk cache is enabled by default and doesn’t have a flag to disable without > impacting networking. In performance testing 4.0 against 3.0 I found that > reads were slower in 4.0 and after profiling found that the ChunkCache was > partially to blame; after disabling the chunk cache, read performance had > improved. 
> {code}
> 40_w_cc-selects.hdr
> #[Mean= 11.50063, StdDeviation = 13.44014]
> #[Max =482.41254, Total count= 316477]
> #[Buckets = 25, SubBuckets = 262144]
> 40_wo_cc-selects.hdr
> #[Mean= 9.82115, StdDeviation = 10.14270]
> #[Max =522.36493, Total count= 317444]
> #[Buckets = 25, SubBuckets = 262144]
> {code}
[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling
[ https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204207#comment-17204207 ] David Capwell commented on CASSANDRA-15991: --- CI results (still running) Circle: https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-15991-trunk-DE96E193-E8A4-4882-BC04-4169A8E4AAB5 Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/50/ > 15583 - Add UX tests to intree LHF tooling > -- > > Key: CASSANDRA-15991 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15991 > Project: Cassandra > Issue Type: Improvement > Components: Test/unit >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0-beta > > > As per CASSANDRA-15583 many in tree tools lack proper UX tooling: mandatory > params are indeed mandatory, 'help' produces an actual help, return codes etc > This ticket is an attempt to add it to those tools that classify as LHF. > Other tools such as nodetool, with many sub-commands, deserve a separate > ticket of their own -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling
[ https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204206#comment-17204206 ]

David Capwell commented on CASSANDRA-15991:
---

reviewed latest commit so +1 from me. I'll start the commit process and link the test results before merging.

> 15583 - Add UX tests to intree LHF tooling
> --
>
> Key: CASSANDRA-15991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15991
> Project: Cassandra
> Issue Type: Improvement
> Components: Test/unit
> Reporter: Berenguer Blasi
> Assignee: Berenguer Blasi
> Priority: Normal
> Fix For: 4.0-beta
>
>
> As per CASSANDRA-15583 many in tree tools lack proper UX tooling: mandatory
> params are indeed mandatory, 'help' produces an actual help, return codes etc
> This ticket is an attempt to add it to those tools that classify as LHF.
> Other tools such as nodetool, with many sub-commands, deserve a separate
> ticket of their own
[jira] [Updated] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling
[ https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated CASSANDRA-15991: -- Status: Ready to Commit (was: Review In Progress) > 15583 - Add UX tests to intree LHF tooling > -- > > Key: CASSANDRA-15991 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15991 > Project: Cassandra > Issue Type: Improvement > Components: Test/unit >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0-beta > > > As per CASSANDRA-15583 many in tree tools lack proper UX tooling: mandatory > params are indeed mandatory, 'help' produces an actual help, return codes etc > This ticket is an attempt to add it to those tools that classify as LHF. > Other tools such as nodetool, with many sub-commands, deserve a separate > ticket of their own -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16089) Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs
[ https://issues.apache.org/jira/browse/CASSANDRA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-16089: - Fix Version/s: (was: 4.0-beta) 4.0-beta3 Since Version: NA Source Control Link: https://github.com/apache/cassandra-dtest/commit/e4e8d94ba540743f0b0ccfdd5b8ce3cefc7a6a68 Resolution: Fixed Status: Resolved (was: Ready to Commit) Committed, thanks. > Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs > > > Key: CASSANDRA-16089 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16089 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Caleb Rackliffe >Assignee: Adam Holmberg >Priority: Normal > Labels: dtest > Fix For: 4.0-beta3 > > > See > https://app.circleci.com/pipelines/github/maedhroz/cassandra/99/workflows/72c69ea8-f347-4b00-aed8-bd465f3549ff/jobs/498 > After bootstrapping a second node into the cluster, the sizes of the SSTables > (per directory) on the first node no longer fall within the 10% margin of > error. We don’t have any assertion in the test that they were balanced before > bootstrap, however. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16089) Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs
[ https://issues.apache.org/jira/browse/CASSANDRA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-16089: - Status: Ready to Commit (was: Review In Progress) > Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs > > > Key: CASSANDRA-16089 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16089 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Caleb Rackliffe >Assignee: Adam Holmberg >Priority: Normal > Labels: dtest > Fix For: 4.0-beta > > > See > https://app.circleci.com/pipelines/github/maedhroz/cassandra/99/workflows/72c69ea8-f347-4b00-aed8-bd465f3549ff/jobs/498 > After bootstrapping a second node into the cluster, the sizes of the SSTables > (per directory) on the first node no longer fall within the 10% margin of > error. We don’t have any assertion in the test that they were balanced before > bootstrap, however. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16089) Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs
[ https://issues.apache.org/jira/browse/CASSANDRA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-16089: - Reviewers: Brandon Williams, Brandon Williams (was: Brandon Williams) Brandon Williams, Brandon Williams Status: Review In Progress (was: Patch Available) > Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs > > > Key: CASSANDRA-16089 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16089 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Caleb Rackliffe >Assignee: Adam Holmberg >Priority: Normal > Labels: dtest > Fix For: 4.0-beta > > > See > https://app.circleci.com/pipelines/github/maedhroz/cassandra/99/workflows/72c69ea8-f347-4b00-aed8-bd465f3549ff/jobs/498 > After bootstrapping a second node into the cluster, the sizes of the SSTables > (per directory) on the first node no longer fall within the 10% margin of > error. We don’t have any assertion in the test that they were balanced before > bootstrap, however. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra-dtest] branch master updated: fix flakiness in TestDiskBalance caused by random token generation
This is an automated email from the ASF dual-hosted git repository.

brandonwilliams pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-dtest.git

The following commit(s) were added to refs/heads/master by this push:
     new e4e8d94  fix flakiness in TestDiskBalance caused by random token generation
e4e8d94 is described below

commit e4e8d94ba540743f0b0ccfdd5b8ce3cefc7a6a68
Author: Adam Holmberg
AuthorDate: Tue Sep 29 12:55:48 2020 -0500

    fix flakiness in TestDiskBalance caused by random token generation

    patch by Adam Holberg, reviewed by brandonwilliams for CASSANDRA-16089
---
 disk_balance_test.py | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/disk_balance_test.py b/disk_balance_test.py
index 3d02ac1..91ba848 100644
--- a/disk_balance_test.py
+++ b/disk_balance_test.py
@@ -234,7 +234,10 @@ class TestDiskBalance(Tester):
         # Add a new node, so disk boundaries will change
         logger.debug("Bootstrap node2 and flush")
-        node2 = new_node(cluster, bootstrap=True)
+        # Fixed initial token to bisect the ring and make sure the nodes are balanced (otherwise a random token is generated).
+        balanced_tokens = cluster.balanced_tokens(2)
+        assert balanced_tokens[0] == node1.initial_token  # make sure cluster population still works as assumed
+        node2 = new_node(cluster, token=balanced_tokens[1], bootstrap=True)
         node2.start(wait_for_binary_proto=True, jvm_args=["-Dcassandra.migration_task_wait_in_seconds=10"], set_migration_task=False)
         node2.flush()

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-16089) Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs
[ https://issues.apache.org/jira/browse/CASSANDRA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam Holmberg updated CASSANDRA-16089: -- Test and Documentation Plan: Manually reproduced the problem. Fixed and looped test many iterations locally. CI looks good. Status: Patch Available (was: In Progress) > Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs > > > Key: CASSANDRA-16089 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16089 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/python >Reporter: Caleb Rackliffe >Assignee: Adam Holmberg >Priority: Normal > Labels: dtest > Fix For: 4.0-beta > > > See > https://app.circleci.com/pipelines/github/maedhroz/cassandra/99/workflows/72c69ea8-f347-4b00-aed8-bd465f3549ff/jobs/498 > After bootstrapping a second node into the cluster, the sizes of the SSTables > (per directory) on the first node no longer fall within the 10% margin of > error. We don’t have any assertion in the test that they were balanced before > bootstrap, however. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-15537) 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
[ https://issues.apache.org/jira/browse/CASSANDRA-15537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204178#comment-17204178 ]

Yifan Cai commented on CASSANDRA-15537:
---

Hi Paulo,

Dozens of clusters so far have passed the diff test, which compares each cluster's current build against the latest 4.0 build. The number of clusters being tested is increasing each week. The tested clusters range in data size from gigabytes to tens of TB. The success criterion for a diff test is that 100% of the data in user tables matches between the 2 testing clusters.

Several issues have been resolved, and tool improvements have been made, during this ongoing diff exercise. The tickets are:

Issues
* CASSANDRA-15945
* CASSANDRA-15905
* CASSANDRA-15857
* CASSANDRA-15514

Tool improvements
* CASSANDRA-16125
* CASSANDRA-16065
* CASSANDRA-15953
* CASSANDRA-15807
* CASSANDRA-15722
* CASSANDRA-15658

> 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
> -
>
> Key: CASSANDRA-15537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15537
> Project: Cassandra
> Issue Type: Task
> Components: Test/dtest/java, Test/dtest/python
> Reporter: Josh McKenzie
> Assignee: Yifan Cai
> Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
> for context.
> Execution of upgrade and diff tests via cassandra-diff have proven to be one
> of the most effective approaches toward identifying issues with the local
> read/write path. These include instances of data loss, data corruption, data
> resurrection, incorrect responses to queries, incomplete responses, and
> others. Upgrade and diff tests can be executed concurrent with fault
> injection (such as host or network failure); as well as during mixed-version
> scenarios (such as upgrading half of the instances in a cluster, and running
> upgradesstables on only half of the upgraded instances).
> Upgrade and diff tests are expected to continue through the release cycle,
> and are a great way for contributors to gain confidence in the correctness of
> the database under their own workloads.
[jira] [Commented] (CASSANDRA-16089) Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs
[ https://issues.apache.org/jira/browse/CASSANDRA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204176#comment-17204176 ]

Adam Holmberg commented on CASSANDRA-16089:
---

The test fails occasionally due to random token generation for the second node. If the random token is too close to the node1 token, very little data is left on node1 after cleanup, and the disk balance variation goes up due to "noise". It can be made to fail in this manner consistently by configuring an intentionally bad token.

For example one unfortunate token selection:
{noformat}
Datacenter: datacenter1
==========
Address    Rack   Status  State   Load       Owns    Token
                                                     9143583083429189474
127.0.0.1  rack1  Up      Normal  23.67 MiB  0.43%   -9223372036854775808
127.0.0.2  rack1  Up      Normal  68.62 KiB  99.57%  9143583083429189474
{noformat}
Leaves only kilobytes of data on each disk after cleanup and compaction.

The dtest change just makes for fixed token selection so we can avoid the noise in small files.

[patch|https://github.com/apache/cassandra-dtest/commit/32a31742bda41b09872b7820e9fb7fffda1addd9]
[ci|https://app.circleci.com/pipelines/github/aholmberg/cassandra?branch=CASSANDRA-16089]

> Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs
> 
>
> Key: CASSANDRA-16089
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16089
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest/python
> Reporter: Caleb Rackliffe
> Assignee: Adam Holmberg
> Priority: Normal
> Labels: dtest
> Fix For: 4.0-beta
>
>
> See
> https://app.circleci.com/pipelines/github/maedhroz/cassandra/99/workflows/72c69ea8-f347-4b00-aed8-bd465f3549ff/jobs/498
> After bootstrapping a second node into the cluster, the sizes of the SSTables
> (per directory) on the first node no longer fall within the 10% margin of
> error. We don’t have any assertion in the test that they were balanced before
> bootstrap, however.
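The effect of token placement described in the comment above can be sketched numerically. The following is a rough illustration only, assuming a signed 64-bit token ring (consistent with the `-9223372036854775808` token in the ring output) and a single token per node. The functions `balanced_tokens` and `ownership` here are hypothetical stand-ins written for this example; they are not the ccm `cluster.balanced_tokens` implementation used by the dtest.

```python
# Assumption: tokens live on a signed 64-bit ring, one token per node.
RING_SIZE = 2 ** 64
MIN_TOKEN = -(2 ** 63)


def balanced_tokens(node_count):
    """Hypothetical helper: evenly spaced tokens starting at the minimum token."""
    step = RING_SIZE // node_count
    return [MIN_TOKEN + i * step for i in range(node_count)]


def ownership(tokens):
    """Fraction of the ring owned by each token.

    Each token owns the range (previous token, token], wrapping around the ring.
    """
    ordered = sorted(tokens)
    owned = {}
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # index -1 wraps to the last token
        span = (token - previous) % RING_SIZE or RING_SIZE  # single node owns all
        owned[token] = span / RING_SIZE
    return owned


if __name__ == "__main__":
    unlucky = [MIN_TOKEN, 9143583083429189474]  # tokens from the ring output
    for token, share in ownership(unlucky).items():
        print(f"{token:>21}: {share:.2%}")
    for token, share in ownership(balanced_tokens(2)).items():
        print(f"{token:>21}: {share:.2%}")
```

With the unlucky pair, the computed shares match the 0.43% / 99.57% split shown in the ring output, while the evenly spaced pair splits the ring in half, which is the situation the fixed-token change aims for.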
[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling
[ https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204158#comment-17204158 ] Brandon Williams commented on CASSANDRA-15991: -- LGTM with latest changes. > 15583 - Add UX tests to intree LHF tooling > -- > > Key: CASSANDRA-15991 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15991 > Project: Cassandra > Issue Type: Improvement > Components: Test/unit >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0-beta > > > As per CASSANDRA-15583 many in tree tools lack proper UX tooling: mandatory > params are indeed mandatory, 'help' produces an actual help, return codes etc > This ticket is an attempt to add it to those tools that classify as LHF. > Other tools such as nodetool, with many sub-commands, deserve a separate > ticket of their own -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap
[ https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204050#comment-17204050 ]

Brandon Williams commented on CASSANDRA-16146:
-

Indeed, that looks related to CASSANDRA-16127.

> Node state incorrectly set to NORMAL after nodetool disablegossip and
> enablegossip during bootstrap
> ---
>
> Key: CASSANDRA-16146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip
> Reporter: Yifan Cai
> Assignee: Yifan Cai
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta3
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> At a high level, {{StorageService#setGossipTokens}} sets the gossip state to
> {{NORMAL}} blindly. Therefore, re-enabling gossip (stop and start gossip)
> overrides the actual gossip state.
>
> It could happen in the below scenario.
> # Bootstrap failed. The gossip state remains in {{BOOT}} / {{JOINING}} and
> code execution exits StorageService#initServer.
> # Operator runs nodetool to stop and re-start gossip. The gossip state gets
> flipped to {{NORMAL}}
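The two-step scenario in the description can be modelled with a toy state holder. This is only a sketch for illustration: `ToyNode` and its methods are hypothetical names, none of this is Cassandra code, and the guarded variant is an assumed shape for a fix rather than the patch attached to the ticket.

```python
class ToyNode:
    """Toy stand-in for a node's gossip state; not Cassandra code."""

    def __init__(self):
        self.gossip_state = "BOOT"
        self.bootstrap_succeeded = False

    def restart_gossip_blind(self):
        # Mirrors the reported behaviour: the state is overwritten
        # with NORMAL regardless of whether bootstrap completed.
        self.gossip_state = "NORMAL"

    def restart_gossip_guarded(self):
        # Hypothetical guard: only advertise NORMAL once bootstrap finished,
        # otherwise keep whatever state the node was already in.
        if self.bootstrap_succeeded:
            self.gossip_state = "NORMAL"


if __name__ == "__main__":
    # Step 1: bootstrap failed, so the node is still in BOOT.
    blind, guarded = ToyNode(), ToyNode()
    # Step 2: the operator stops and restarts gossip.
    blind.restart_gossip_blind()
    guarded.restart_gossip_guarded()
    print(blind.gossip_state, guarded.gossip_state)
```

In the blind variant the node that never finished bootstrapping ends up advertising NORMAL, which is the incorrect state the ticket title describes.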
[jira] [Commented] (CASSANDRA-15538) 4.0 quality testing: Local Read/Write Path: Other Areas
[ https://issues.apache.org/jira/browse/CASSANDRA-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204048#comment-17204048 ]

Sylvain Lebresne commented on CASSANDRA-15538:
----------------------------------------------

No, I haven't really started anything on this issue, and I don't plan to in the near term, so I unassigned myself. I should have done it sooner, my bad.

I did spend a few cycles some time ago thinking about what could be done concretely here, and I'll share my "reflections" in case that's useful. That said, in general, the scope here was a bit fuzzy to me.

First, if you look at (true) unit testing for the classes that constitute the read/write path, there isn't much. So I suppose one could try to cover that somewhat, but the work to make a dent there is huge, and I'm not sure the value is that great since those paths are mostly covered, but by "integration/functional" tests. But this doesn't make it super clear to me whether specific areas are more in need of additional testing than others.

Then the description mentions "numerous bugs and issues with the 3.0 storage engine rewrite", so I looked at the list of "serious bugs" that was shared on the mailing list (by [~kohlisankalp] I believe; too lazy to dig the link right now). From looking at that, the biggest bucket I saw for "storage engine rewrite" related bugs was with 'legacy layout conversions/handling'. And that was clearly under-tested, but it's also gone in 4.0. From memory, there were also 2-3 read-repair related bugs, but we have CASSANDRA-15977. Nothing else struck me as pointing to a specific area to focus on.

Those aside and fwiw, I've a feeling that things like reverse queries and range tombstones may be 2 features that aren't as well tested as they could be, but it's more an impression of mine than hard data.

Short of focusing on some specific area, the "read/write path" is a big place and the space to explore is kinda big. So I feel the biggest value would be to start exploring more of that space through randomized testing, specifically randomizing queries and/or schema. Presumably, that's what [Harry|https://issues.apache.org/jira/browse/CASSANDRA-15348] is for (though I haven't really checked it as of yet, so I don't know how capable it is for this). So if it was me, I'd look in this direction. But again, I don't have plans to at the moment due to other priorities.

> 4.0 quality testing: Local Read/Write Path: Other Areas
> -------------------------------------------------------
>
> Key: CASSANDRA-15538
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15538
> Project: Cassandra
> Issue Type: Task
> Components: Test/dtest/java, Test/dtest/python
> Reporter: Josh McKenzie
> Priority: Normal
> Fix For: 4.0-beta
>
> Reference [doc from NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#] for context.
> *Shepherd: Aleksey Yeschenko*
> Testing in this area refers to the local read/write path (StorageProxy, ColumnFamilyStore, Memtable, SSTable reading/writing, etc). We are still finding numerous bugs and issues with the 3.0 storage engine rewrite (CASSANDRA-8099). For 4.0 we want to ensure that we thoroughly cover the local read/write path with techniques such as property-based testing, fuzzing ([example|http://cassandra.apache.org/blog/2018/10/17/finding_bugs_with_property_based_testing.html]), and a source audit.
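The randomized-testing idea in the comment above can be illustrated with a small model-based sketch. This is not Harry and uses no Cassandra code: `ToyStore`, `run_random_test`, and every parameter are hypothetical names made up for illustration. The toy store reconciles cells and range tombstones by timestamp at read time, and a plain dict serves as the reference model; random writes, range deletes, and forward/reverse range queries are then checked against that model.

```python
import random


class ToyStore:
    """Hypothetical stand-in for a storage engine: cells plus range
    tombstones, reconciled by timestamp when a query is executed."""

    def __init__(self):
        self.cells = {}       # key -> (timestamp, value)
        self.tombstones = []  # (start, end, timestamp), end is exclusive

    def write(self, key, value, ts):
        # Last write wins, decided by timestamp.
        if key not in self.cells or self.cells[key][0] < ts:
            self.cells[key] = (ts, value)

    def delete_range(self, start, end, ts):
        self.tombstones.append((start, end, ts))

    def select(self, start, end, reverse=False):
        rows = []
        for key in sorted(self.cells, reverse=reverse):
            if not (start <= key < end):
                continue
            ts, value = self.cells[key]
            # A cell is hidden by any newer tombstone that covers its key.
            shadowed = any(s <= key < e and t > ts for s, e, t in self.tombstones)
            if not shadowed:
                rows.append((key, value))
        return rows


def run_random_test(seed, steps=200, key_space=20):
    """Apply random operations to the store and to a dict model,
    comparing every forward and reverse range query against the model."""
    rng = random.Random(seed)
    store, model = ToyStore(), {}
    for ts in range(1, steps + 1):  # the step number doubles as the timestamp
        lo, hi = sorted(rng.sample(range(key_space + 1), 2))
        op = rng.choice(["write", "delete_range", "select"])
        if op == "write":
            store.write(lo, ts, ts)
            model[lo] = ts
        elif op == "delete_range":
            store.delete_range(lo, hi, ts)
            for key in [k for k in model if lo <= k < hi]:
                del model[key]
        else:
            expected = sorted((k, v) for k, v in model.items() if lo <= k < hi)
            assert store.select(lo, hi) == expected, (seed, ts)
            assert store.select(lo, hi, reverse=True) == expected[::-1], (seed, ts)
    return True
```

Printing the seed and step on failure, as the assertion messages do here, is what makes a randomized failure reproducible. A real harness would also randomize the schema, which this sketch does not attempt.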
[jira] [Assigned] (CASSANDRA-15538) 4.0 quality testing: Local Read/Write Path: Other Areas
[ https://issues.apache.org/jira/browse/CASSANDRA-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne reassigned CASSANDRA-15538: Assignee: (was: Sylvain Lebresne) > 4.0 quality testing: Local Read/Write Path: Other Areas > --- > > Key: CASSANDRA-15538 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15538 > Project: Cassandra > Issue Type: Task > Components: Test/dtest/java, Test/dtest/python >Reporter: Josh McKenzie >Priority: Normal > Fix For: 4.0-beta > > > Reference [doc from > NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#] > for context. > *Shepherd: Aleksey Yeschenko* > Testing in this area refers to the local read/write path (StorageProxy, > ColumnFamilyStore, Memtable, SSTable reading/writing, etc). We are still > finding numerous bugs and issues with the 3.0 storage engine rewrite > (CASSANDRA-8099). For 4.0 we want to ensure that we thoroughly cover the > local read/write path with techniques such as property-based testing, fuzzing > ([example|http://cassandra.apache.org/blog/2018/10/17/finding_bugs_with_property_based_testing.html]), > and a source audit. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables
[ https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203976#comment-17203976 ]

Sylvain Lebresne commented on CASSANDRA-16063:
----------------------------------------------

I don't have the time to test the patch thoroughly right now, but from a code review point of view, this lgtm.

> Fix user experience when upgrading to 4.0 with compact tables
> -------------------------------------------------------------
>
> Key: CASSANDRA-16063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
> Project: Cassandra
> Issue Type: Bug
> Components: Legacy/CQL
> Reporter: Sylvain Lebresne
> Assignee: Ekaterina Dimitrova
> Priority: Normal
> Fix For: 4.0-beta
> Attachments: Compact_storage_upgrade_tests.txt
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The code to handle compact tables has been removed from 4.0, and the intended upgrade path to 4.0 for users having compact tables on 3.x is that they must execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or miss a table) and may try upgrading despite still having compact tables. If they do so, the intent is that the node will _not_ start, with a message clearly indicating the pre-upgrade step the user has missed. The user will then downgrade back the node(s) to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and then upgrade again.
> But while 4.0 does currently fail startup when finding any compact tables with a decent message, I believe the check is done too late during startup. Namely, that check is done as we read the tables schema, so within [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
> But by then, we've _at least_ called {{SystemKeyspace.persistLocalMetadata()}} and {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log, and even possibly flush new {{na}} format sstables. As a result, a user might not be able to seamlessly restart the node on 3.x (to drop compact storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0 startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0 with compact storage, you can downgrade back with no intervention whatsoever).
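The ordering problem described in the issue above (a check that only fires after the node has already written to disk) can be sketched in a few lines. This is a toy model, not Cassandra's startup code: `StartupError`, `check_no_compact_tables`, `start_node`, and the step names are all hypothetical. The only point it makes is that read-only checks run to completion before any step that mutates on-disk state, so a failed check leaves nothing behind.

```python
class StartupError(Exception):
    """Raised by a check to abort startup before anything is written."""


def check_no_compact_tables(tables):
    # `tables` maps a table name to True when it still uses compact storage.
    compact = sorted(name for name, is_compact in tables.items() if is_compact)
    if compact:
        raise StartupError(
            "Compact tables found: %s. Restart on 3.x and run "
            "ALTER ... DROP COMPACT STORAGE before upgrading." % ", ".join(compact)
        )


def start_node(tables, checks, mutating_steps, disk_log):
    # Phase 1: read-only checks. Any failure aborts with disk_log untouched.
    for check in checks:
        check(tables)
    # Phase 2: only now run steps that write to disk (recorded in disk_log).
    for step in mutating_steps:
        step(disk_log)


def persist_local_metadata(disk_log):
    disk_log.append("persist_local_metadata")


def migrate_system_keyspace(disk_log):
    disk_log.append("migrate_system_keyspace")
```

A usage example: calling `start_node({"ks.legacy": True}, [check_no_compact_tables], [persist_local_metadata, migrate_system_keyspace], log)` raises `StartupError` and leaves `log` empty, which is the property the issue asks a downgrade test to verify.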
[jira] [Commented] (CASSANDRA-15538) 4.0 quality testing: Local Read/Write Path: Other Areas
[ https://issues.apache.org/jira/browse/CASSANDRA-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203974#comment-17203974 ]

Paulo Motta commented on CASSANDRA-15538:
-----------------------------------------

Hi [~slebresne], did you have the chance to look into this issue?

For context, I'm asking this to check the status of the 4.0 quality epic as part of [this discussion|https://www.mail-archive.com/dev@cassandra.apache.org/msg15881.html] on the mailing list.

> 4.0 quality testing: Local Read/Write Path: Other Areas
> -------------------------------------------------------
>
> Key: CASSANDRA-15538
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15538
> Project: Cassandra
> Issue Type: Task
> Components: Test/dtest/java, Test/dtest/python
> Reporter: Josh McKenzie
> Assignee: Sylvain Lebresne
> Priority: Normal
> Fix For: 4.0-beta
>
> Reference [doc from NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#] for context.
> *Shepherd: Aleksey Yeschenko*
> Testing in this area refers to the local read/write path (StorageProxy, ColumnFamilyStore, Memtable, SSTable reading/writing, etc). We are still finding numerous bugs and issues with the 3.0 storage engine rewrite (CASSANDRA-8099). For 4.0 we want to ensure that we thoroughly cover the local read/write path with techniques such as property-based testing, fuzzing ([example|http://cassandra.apache.org/blog/2018/10/17/finding_bugs_with_property_based_testing.html]), and a source audit.
[cassandra-dtest] branch CASSANDRA-14793 created (now b9baecd)
This is an automated email from the ASF dual-hosted git repository. blerer pushed a change to branch CASSANDRA-14793 in repository https://gitbox.apache.org/repos/asf/cassandra-dtest.git. at b9baecd Update tests for CASSANDRA-14793 No new revisions were added by this update. - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] 01/02: Allow to use a different directory for storing system tables.
This is an automated email from the ASF dual-hosted git repository. blerer pushed a commit to branch CASSANDRA-14793 in repository https://gitbox.apache.org/repos/asf/cassandra.git commit ea3ee373f670b4eddcde0f94d4f5f6221166761b Author: Benjamin Lerer AuthorDate: Thu Mar 19 12:57:28 2020 +0100 Allow to use a different directory for storing system tables. --- .circleci/config.yml | 97 +++ .circleci/config.yml.HIGHRES | 98 +++ .circleci/config.yml.LOWRES| 97 +++ .circleci/config.yml.MIDRES| 97 +++ NEWS.txt | 9 ++ build.xml | 38 + conf/cassandra.yaml| 6 + src/java/org/apache/cassandra/config/Config.java | 6 + .../cassandra/config/DatabaseDescriptor.java | 97 +-- .../org/apache/cassandra/db/ColumnFamilyStore.java | 101 ++-- src/java/org/apache/cassandra/db/Directories.java | 180 - .../apache/cassandra/db/DiskBoundaryManager.java | 1 - .../org/apache/cassandra/db/SystemKeyspace.java| 5 + .../apache/cassandra/io/FSDiskFullWriteError.java | 12 +- ...or.java => FSNoDiskAvailableForWriteError.java} | 16 +- .../org/apache/cassandra/io/util/FileUtils.java| 67 .../apache/cassandra/service/CassandraDaemon.java | 94 ++- .../cassandra/service/DefaultFSErrorHandler.java | 17 +- .../apache/cassandra/service/StartupChecks.java| 1 + .../apache/cassandra/service/StorageService.java | 25 ++- .../cassandra/service/StorageServiceMBean.java | 14 ++ test/conf/system_keyspaces_directory.yaml | 1 + .../cassandra/OffsetAwareConfigurationLoader.java | 3 + .../org/apache/cassandra/db/DirectoriesTest.java | 42 ++--- .../apache/cassandra/io/util/FileUtilsTest.java| 69 .../apache/cassandra/tools/ClearSnapshotTest.java | 2 +- 26 files changed, 1091 insertions(+), 104 deletions(-) diff --git a/.circleci/config.yml b/.circleci/config.yml index 8ba8949..6c177a4 100644 --- a/.circleci/config.yml +++ b/.circleci/config.yml @@ -2151,6 +2151,97 @@ jobs: - CCM_HEAP_NEWSIZE: 256M - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64 - JDK_HOME: /usr/lib/jvm/java-8-openjdk-amd64 + utests_system_keyspace_directory: 
+docker: +- image: nastra/cassandra-testing-ubuntu1910-java11-w-dependencies:20200603 +resource_class: medium +working_directory: ~/ +shell: /bin/bash -eo pipefail -l +parallelism: 4 +steps: +- attach_workspace: +at: /home/cassandra +- run: +name: Determine unit Tests to Run +command: | + # reminder: this code (along with all the steps) is independently executed on every circle container + # so the goal here is to get the circleci script to return the tests *this* container will run + # which we do via the `circleci` cli tool. + + rm -fr ~/cassandra-dtest/upgrade_tests + echo "***java tests***" + + # get all of our unit test filenames + set -eo pipefail && circleci tests glob "$HOME/cassandra/test/unit/**/*.java" > /tmp/all_java_unit_tests.txt + + # split up the unit tests into groups based on the number of containers we have + set -eo pipefail && circleci tests split --split-by=timings --timings-type=filename --index=${CIRCLE_NODE_INDEX} --total=${CIRCLE_NODE_TOTAL} /tmp/all_java_unit_tests.txt > /tmp/java_tests_${CIRCLE_NODE_INDEX}.txt + set -eo pipefail && cat /tmp/java_tests_${CIRCLE_NODE_INDEX}.txt | sed "s;^/home/cassandra/cassandra/test/unit/;;g" | grep "Test\.java$" > /tmp/java_tests_${CIRCLE_NODE_INDEX}_final.txt + echo "** /tmp/java_tests_${CIRCLE_NODE_INDEX}_final.txt" + cat /tmp/java_tests_${CIRCLE_NODE_INDEX}_final.txt +no_output_timeout: 15m +- run: +name: Log Environment Information +command: | + echo '*** id ***' + id + echo '*** cat /proc/cpuinfo ***' + cat /proc/cpuinfo + echo '*** free -m ***' + free -m + echo '*** df -m ***' + df -m + echo '*** ifconfig -a ***' + ifconfig -a + echo '*** uname -a ***' + uname -a + echo '*** mount ***' + mount + echo '*** env ***' + env + echo '*** java ***' + which java + java -version +- run: +name: Run Unit Tests (testclasslist-system-keyspace-directory) +command: | + set -x + export PATH=$JAVA_HOME/bin:$PATH + time mv ~/cassandra /tmp + cd /tmp/cassandra + if [ -d ~/dtest_jars ]; then +cp ~/dtest_jars/dtest* 
/tmp/cassandra/build/ + fi + test_timeout=$(grep 'name="test.unit.timeout"' build.
[cassandra] branch CASSANDRA-14793 created (now 51366a6)
This is an automated email from the ASF dual-hosted git repository. blerer pushed a change to branch CASSANDRA-14793 in repository https://gitbox.apache.org/repos/asf/cassandra.git. at 51366a6 Change Circle-CI DO NOT MERGE This branch includes the following new commits: new ea3ee37 Allow to use a different directory for storing system tables. new 51366a6 Change Circle-CI DO NOT MERGE The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] 02/02: Change Circle-CI DO NOT MERGE
This is an automated email from the ASF dual-hosted git repository. blerer pushed a commit to branch CASSANDRA-14793 in repository https://gitbox.apache.org/repos/asf/cassandra.git commit 51366a6bae614b14fe81d4fd2f43c0e4c8b2425b Author: Benjamin Lerer AuthorDate: Fri Sep 18 16:59:39 2020 +0200 Change Circle-CI DO NOT MERGE --- .circleci/config.yml | 204 +-- 1 file changed, 102 insertions(+), 102 deletions(-) diff --git a/.circleci/config.yml b/.circleci/config.yml index 6c177a4..a41fcf3 100644 --- a/.circleci/config.yml +++ b/.circleci/config.yml @@ -3,10 +3,10 @@ jobs: j8_jvm_upgrade_dtests: docker: - image: nastra/cassandra-testing-ubuntu1910-java11-w-dependencies:20200603 -resource_class: medium +resource_class: large working_directory: ~/ shell: /bin/bash -eo pipefail -l -parallelism: 1 +parallelism: 10 steps: - attach_workspace: at: /home/cassandra @@ -85,8 +85,8 @@ jobs: - CASS_DRIVER_NO_EXTENSIONS: true - CASS_DRIVER_NO_CYTHON: true - CASSANDRA_SKIP_SYNC: true -- DTEST_REPO: git://github.com/apache/cassandra-dtest.git -- DTEST_BRANCH: master +- DTEST_REPO: git://github.com/blerer/cassandra-dtest.git +- DTEST_BRANCH: CASSANDRA-14793 - CCM_MAX_HEAP_SIZE: 1024M - CCM_HEAP_NEWSIZE: 256M - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64 @@ -94,10 +94,10 @@ jobs: j8_cqlsh-dtests-py2-with-vnodes: docker: - image: nastra/cassandra-testing-ubuntu1910-java11-w-dependencies:20200603 -resource_class: medium +resource_class: large working_directory: ~/ shell: /bin/bash -eo pipefail -l -parallelism: 4 +parallelism: 50 steps: - attach_workspace: at: /home/cassandra @@ -162,8 +162,8 @@ jobs: - CASS_DRIVER_NO_EXTENSIONS: true - CASS_DRIVER_NO_CYTHON: true - CASSANDRA_SKIP_SYNC: true -- DTEST_REPO: git://github.com/apache/cassandra-dtest.git -- DTEST_BRANCH: master +- DTEST_REPO: git://github.com/blerer/cassandra-dtest.git +- DTEST_BRANCH: CASSANDRA-14793 - CCM_MAX_HEAP_SIZE: 1024M - CCM_HEAP_NEWSIZE: 256M - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64 @@ -174,7 +174,7 @@ jobs: 
resource_class: medium working_directory: ~/ shell: /bin/bash -eo pipefail -l -parallelism: 4 +parallelism: 25 steps: - attach_workspace: at: /home/cassandra @@ -253,8 +253,8 @@ jobs: - CASS_DRIVER_NO_EXTENSIONS: true - CASS_DRIVER_NO_CYTHON: true - CASSANDRA_SKIP_SYNC: true -- DTEST_REPO: git://github.com/apache/cassandra-dtest.git -- DTEST_BRANCH: master +- DTEST_REPO: git://github.com/blerer/cassandra-dtest.git +- DTEST_BRANCH: CASSANDRA-14793 - CCM_MAX_HEAP_SIZE: 1024M - CCM_HEAP_NEWSIZE: 256M - JAVA_HOME: /usr/lib/jvm/java-11-openjdk-amd64 @@ -263,10 +263,10 @@ jobs: j8_cqlsh-dtests-py38-no-vnodes: docker: - image: nastra/cassandra-testing-ubuntu1910-java11-w-dependencies:20200603 -resource_class: medium +resource_class: large working_directory: ~/ shell: /bin/bash -eo pipefail -l -parallelism: 4 +parallelism: 50 steps: - attach_workspace: at: /home/cassandra @@ -331,8 +331,8 @@ jobs: - CASS_DRIVER_NO_EXTENSIONS: true - CASS_DRIVER_NO_CYTHON: true - CASSANDRA_SKIP_SYNC: true -- DTEST_REPO: git://github.com/apache/cassandra-dtest.git -- DTEST_BRANCH: master +- DTEST_REPO: git://github.com/blerer/cassandra-dtest.git +- DTEST_BRANCH: CASSANDRA-14793 - CCM_MAX_HEAP_SIZE: 1024M - CCM_HEAP_NEWSIZE: 256M - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64 @@ -340,10 +340,10 @@ jobs: j11_cqlsh-dtests-py3-with-vnodes: docker: - image: nastra/cassandra-testing-ubuntu1910-java11:20200603 -resource_class: medium +resource_class: large working_directory: ~/ shell: /bin/bash -eo pipefail -l -parallelism: 4 +parallelism: 50 steps: - attach_workspace: at: /home/cassandra @@ -408,8 +408,8 @@ jobs: - CASS_DRIVER_NO_EXTENSIONS: true - CASS_DRIVER_NO_CYTHON: true - CASSANDRA_SKIP_SYNC: true -- DTEST_REPO: git://github.com/apache/cassandra-dtest.git -- DTEST_BRANCH: master +- DTEST_REPO: git://github.com/blerer/cassandra-dtest.git +- DTEST_BRANCH: CASSANDRA-14793 - CCM_MAX_HEAP_SIZE: 1024M - CCM_HEAP_NEWSIZE: 256M - JAVA_HOME: /usr/lib/jvm/java-11-openjdk-amd64 @@ -418,10 +418,10 @@ 
jobs: j11_cqlsh-dtests-py3-no-vnodes: docker: - image: nastra/cassandra-testing-ubuntu1910-java11:20200603 -resource_class: medium +resource_class: large working_directory: ~/ shell: /bin/bash -eo pipefail -l -parallelism: 4 +parallelism: 50 steps: - attach_workspace: at: /home/cassandra @@ -486,8 +486,8 @@ jobs: - CASS_DRIVER_NO_EXTENSIONS: true - CASS_DRIVER_NO_CYTHON
[jira] [Commented] (CASSANDRA-15537) 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
[ https://issues.apache.org/jira/browse/CASSANDRA-15537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203946#comment-17203946 ]

Paulo Motta commented on CASSANDRA-15537:
-----------------------------------------

Hi [~yifanc], do you have any update on the diff tests? Is the work on this task composed solely of the tests you are running, or can it be split into subtasks so others can maybe help?

For context, I'm asking this to check the status of the 4.0 quality epic as part of [this discussion|https://www.mail-archive.com/dev@cassandra.apache.org/msg15881.html] on the mailing list.

> 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
> -----------------------------------------------------------------
>
> Key: CASSANDRA-15537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15537
> Project: Cassandra
> Issue Type: Task
> Components: Test/dtest/java, Test/dtest/python
> Reporter: Josh McKenzie
> Assignee: Yifan Cai
> Priority: Normal
> Fix For: 4.0-beta
>
> Reference [doc from NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#] for context.
> Execution of upgrade and diff tests via cassandra-diff has proven to be one of the most effective approaches toward identifying issues with the local read/write path. These include instances of data loss, data corruption, data resurrection, incorrect responses to queries, incomplete responses, and others. Upgrade and diff tests can be executed concurrently with fault injection (such as host or network failure), as well as during mixed-version scenarios (such as upgrading half of the instances in a cluster, and running upgradesstables on only half of the upgraded instances).
> Upgrade and diff tests are expected to continue through the release cycle, and are a great way for contributors to gain confidence in the correctness of the database under their own workloads.
[jira] [Commented] (CASSANDRA-14746) Ensure Netty Internode Messaging Refactor is Solid
[ https://issues.apache.org/jira/browse/CASSANDRA-14746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203940#comment-17203940 ]

Paulo Motta commented on CASSANDRA-14746:
-----------------------------------------

[~jolynch] [~vinaykumarcse] As part of [this discussion|https://www.mail-archive.com/dev@cassandra.apache.org/msg15881.html] on the mailing list, I'm checking the status of the 4.0 quality epic issues and would appreciate it if you could answer the following questions:

a) Is work on this issue still active?
b) Can we complete this issue once all subtasks are completed, or are there more subtasks to be added?

> Ensure Netty Internode Messaging Refactor is Solid
> --------------------------------------------------
>
> Key: CASSANDRA-14746
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14746
> Project: Cassandra
> Issue Type: Improvement
> Components: Legacy/Streaming and Messaging
> Reporter: Joey Lynch
> Assignee: Joey Lynch
> Priority: Normal
> Labels: 4.0-QA
> Fix For: 4.0-beta
>
> Before we release 4.0 let's ensure that the internode messaging refactor is 100% solid. As internode messaging is naturally used in many code paths and widely configurable we have a large number of cluster configurations and test configurations that must be vetted.
> We plan to vary the following:
> * Version of Cassandra 3.0.17 vs 4.0-alpha
> * Cluster sizes with *multi-dc* deployments ranging from 6 - 100 nodes
> * Client request rates varying between 1k QPS and 100k QPS of varying sizes and shapes (BATCH, INSERT, SELECT point, SELECT range, etc ...)
> * Internode compression
> * Internode SSL (as well as openssl vs jdk)
> * Internode Coalescing options
> We are looking to measure the following as appropriate:
> * Latency distributions of reads and writes (lower is better)
> * Scaling limit, aka maximum throughput before violating p99 latency deadline of 10ms @ LOCAL_QUORUM, on a fixed hardware deployment for 100% writes, 100% reads and 50-50 writes+reads (higher is better)
> * Thread counts (lower is better)
> * Context switches (lower is better)
> * On-CPU time of tasks (higher periods without context switch is better)
> * GC allocation rates / throughput for a fixed size heap (lower allocation better)
> * Streaming recovery time for a single node failure, i.e. can Cassandra saturate the NIC
>
> The goal is that 4.0 should have better latency, more throughput, fewer threads, fewer context switches, less GC allocation, and faster recovery time. I'm putting Jason Brown as the reviewer since he implemented most of the internode refactor.
> Current collaborators driving this QA task: Dinesh Joshi, Jordan West, Joey Lynch (Netflix), Vinay Chella (Netflix)
> Owning committer(s): Jason Brown
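The "scaling limit" metric in the list above (maximum throughput before the p99 latency deadline of 10ms is violated) can be computed from per-rate latency samples roughly as follows. This is a sketch under assumptions: the function names, the nearest-rank percentile definition, and the input shape (a mapping from tested request rate to a list of latencies in milliseconds) are illustrative choices, not something the ticket specifies.

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list; pct is an integer 1..100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct/100 * n, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def scaling_limit(runs, deadline_ms=10.0, pct=99):
    """Return the highest tested rate whose latency percentile, and that of
    every lower tested rate, stays within the deadline. None if even the
    lowest rate violates it.

    `runs` maps a request rate (e.g. QPS) to latency samples in ms.
    """
    limit = None
    for qps in sorted(runs):
        if percentile(runs[qps], pct) > deadline_ms:
            break
        limit = qps
    return limit
```

Stopping at the first violating rate, rather than taking the maximum over all passing rates, is a deliberate choice in this sketch: a rate that passes only after a lower rate has already failed is more likely to be measurement noise than real headroom.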
[jira] [Commented] (CASSANDRA-15234) Standardise config and JVM parameters
[ https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203910#comment-17203910 ]

Paulo Motta commented on CASSANDRA-15234:
-----------------------------------------

Despite the awesome work (thanks for leading it [~e.dimitrova]) and productive discussion that went into this issue, we didn't seem to reach a strong agreement here and it seems to me it's a bit late in the 4.0 release cycle to land this?

In the spirit of expediting 4.0RC release I propose we postpone this to 4.X, and resume this with high priority earlier in the next release cycle. What do you think?

> Standardise config and JVM parameters
> -------------------------------------
>
> Key: CASSANDRA-15234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15234
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Config
> Reporter: Benedict Elliott Smith
> Assignee: Ekaterina Dimitrova
> Priority: Normal
> Fix For: 4.0-alpha
> Attachments: CASSANDRA-15234-3-DTests-JAVA8.txt
>
> We have a bunch of inconsistent names and config patterns in the codebase, both from the yaml and JVM properties. It would be nice to standardise the naming (such as otc_ vs internode_) as well as the provision of values with units - while maintaining perpetual backwards compatibility with the old parameter names, of course.
> For temporal units, I would propose parsing strings with suffixes of:
> {{code}}
> u|micros(econds?)?
> ms|millis(econds?)?
> s(econds?)?
> m(inutes?)?
> h(ours?)?
> d(ays?)?
> mo(nths?)?
> {{code}}
> For rate units, I would propose parsing any of the standard {{B/s, KiB/s, MiB/s, GiB/s, TiB/s}}.
> Perhaps for avoiding ambiguity we could not accept bauds {{bs, Mbps}} or powers of 1000 such as {{KB/s}}, given these are regularly used for either their old or new definition e.g. {{KiB/s}}, or we could support them and simply log the value in bytes/s.
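The unit suffixes proposed in the CASSANDRA-15234 description above lend themselves to a short parsing sketch. The suffix patterns and the rate unit names are copied from the description; everything else is an assumption made for illustration: the function names, returning microseconds and bytes per second as integers, accepting only whole-number values, treating a month as 30 days, and taking the stricter of the two options the description mentions by rejecting ambiguous units such as `KB/s`. It is not the parser that Cassandra ships.

```python
import re

# Suffix patterns from the proposal, each paired with its size in microseconds.
_MICROS_PER_UNIT = [
    (r"u|micros(econds?)?", 1),
    (r"ms|millis(econds?)?", 1_000),
    (r"s(econds?)?", 1_000_000),
    (r"m(inutes?)?", 60 * 1_000_000),
    (r"h(ours?)?", 3_600 * 1_000_000),
    (r"d(ays?)?", 86_400 * 1_000_000),
    (r"mo(nths?)?", 30 * 86_400 * 1_000_000),  # assumption: 1 month = 30 days
]

# Binary rate units from the proposal, in bytes per second.
_BYTES_PER_SECOND = {
    "B/s": 1,
    "KiB/s": 1024,
    "MiB/s": 1024 ** 2,
    "GiB/s": 1024 ** 3,
    "TiB/s": 1024 ** 4,
}


def _split(text):
    """Split '10ms' or '10 ms' into (10, 'ms'); whole numbers only."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z/]+)\s*", text)
    if match is None:
        raise ValueError("expected <number><unit>, got %r" % text)
    return int(match.group(1)), match.group(2)


def parse_duration_micros(text):
    value, unit = _split(text)
    for pattern, micros in _MICROS_PER_UNIT:
        # fullmatch keeps 'm' (minutes), 'ms' and 'mo' from shadowing each other.
        if re.fullmatch(pattern, unit):
            return value * micros
    raise ValueError("unknown duration unit %r" % unit)


def parse_rate_bytes_per_second(text):
    value, unit = _split(text)
    if unit not in _BYTES_PER_SECOND:
        # Ambiguous units such as KB/s or Mbps are rejected, not guessed.
        raise ValueError("unknown or ambiguous rate unit %r" % unit)
    return value * _BYTES_PER_SECOND[unit]
```

For example, `parse_duration_micros("10ms")` yields `10000` and `parse_rate_bytes_per_second("5MiB/s")` yields `5242880`, while `"5KB/s"` raises `ValueError`. Matching the whole unit string is what lets the overlapping suffixes `m`, `ms`, and `mo` coexist without depending on the order of the table.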
[jira] [Commented] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables
[ https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203885#comment-17203885 ]

Ekaterina Dimitrova commented on CASSANDRA-16063:
-------------------------------------------------

Hi [~adelapena]

I just rebased all branches. PRs as follows:

[trunk|https://github.com/ekaterinadimitrova2/cassandra/pull/54] | [3.0|https://github.com/ekaterinadimitrova2/cassandra/pull/56] | [dtests|https://github.com/ekaterinadimitrova2/cassandra-dtest/pull/4]

No PR for 3.11 as it is a merge from 3.0

> Fix user experience when upgrading to 4.0 with compact tables
> -------------------------------------------------------------
>
> Key: CASSANDRA-16063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
> Project: Cassandra
> Issue Type: Bug
> Components: Legacy/CQL
> Reporter: Sylvain Lebresne
> Assignee: Ekaterina Dimitrova
> Priority: Normal
> Fix For: 4.0-beta
> Attachments: Compact_storage_upgrade_tests.txt
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The code to handle compact tables has been removed from 4.0, and the intended upgrade path to 4.0 for users having compact tables on 3.x is that they must execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or miss a table) and may try upgrading despite still having compact tables. If they do so, the intent is that the node will _not_ start, with a message clearly indicating the pre-upgrade step the user has missed. The user will then downgrade back the node(s) to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and then upgrade again.
> But while 4.0 does currently fail startup when finding any compact tables with a decent message, I believe the check is done too late during startup. Namely, that check is done as we read the tables schema, so within [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
> But by then, we've _at least_ called {{SystemKeyspace.persistLocalMetadata()}} and {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log, and even possibly flush new {{na}} format sstables. As a result, a user might not be able to seamlessly restart the node on 3.x (to drop compact storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0 startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0 with compact storage, you can downgrade back with no intervention whatsoever).
[jira] [Commented] (CASSANDRA-16121) Circleci should run cqlshlib tests as well
[ https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203877#comment-17203877 ] Berenguer Blasi commented on CASSANDRA-16121: - [~e.dimitrova] executors are now in the original config-2.1 file: - Low: [j11|https://app.circleci.com/pipelines/github/bereng/cassandra/129/workflows/42ede6ff-d809-42f3-b143-3945003539a6] & [j8|https://app.circleci.com/pipelines/github/bereng/cassandra/129/workflows/10497110-d938-4500-8ef3-eb3d0e815b6e] - Medium: [j11|https://app.circleci.com/pipelines/github/bereng/cassandra/130/workflows/ee08a837-0710-40c8-bb26-cad7b2e20891] & [j8|https://app.circleci.com/pipelines/github/bereng/cassandra/130/workflows/94de5698-26b5-467d-afe4-c8b284d52d50] - High: [j11|https://app.circleci.com/pipelines/github/bereng/cassandra/131/workflows/5d7b1a6e-fd7b-47d9-932c-cabc8194a644] & [j8|https://app.circleci.com/pipelines/github/bereng/cassandra/131/workflows/893ae53b-0744-4568-ae41-993a9c1fdcd5] > Circleci should run cqlshlib tests as well > -- > > Key: CASSANDRA-16121 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16121 > Project: Cassandra > Issue Type: Bug > Components: CI, Test/unit >Reporter: Berenguer Blasi >Assignee: Berenguer Blasi >Priority: Normal > Fix For: 4.0-beta > > > Currently circleci is not running cqlshlib tests. This resulted in some bugs > not being caught before committing. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables
[ https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203861#comment-17203861 ] Andres de la Peña commented on CASSANDRA-16063: --- [~e.dimitrova] are there PRs for those branches? I only see [this one|https://github.com/ekaterinadimitrova2/cassandra/pull/54] for trunk. > Fix user experience when upgrading to 4.0 with compact tables > - > > Key: CASSANDRA-16063 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16063 > Project: Cassandra > Issue Type: Bug > Components: Legacy/CQL >Reporter: Sylvain Lebresne >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.0-beta > > Attachments: Compact_storage_upgrade_tests.txt > > > The code to handle compact tables has been removed from 4.0, and the intended > upgrade path to 4.0 for users having compact tables on 3.x is that they must > execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables > *before* attempting the upgrade. > Obviously, some users won't read the upgrade instructions (or miss a table) > and may try upgrading despite still having compact tables. If they do so, the > intent is that the node will _not_ start, with a message clearly indicating > the pre-upgrade step the user has missed. The user will then downgrade back > the node(s) to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and > then upgrade again. > But while 4.0 does currently fail startup when finding any compact tables > with a decent message, I believe the check is done too late during startup. > Namely, that check is done as we read the tables schema, so within > [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241]. 
> But by then, we've _at least_ called > {{SystemKeyspace.persistLocalMetadata()}} and > {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log, > and even possibly flush new {{na}} format sstables. As a result, a user > might not be able to seamlessly restart the node on 3.x (to drop compact > storage on the appropriate tables). > Basically, we should make sure the check for compact tables done at 4.0 > startup is done as a {{StartupCheck}}, before the node does anything. > We should also add a test for this (checking that if you try upgrading to 4.0 > with compact storage, you can downgrade back with no intervention whatsoever).
[jira] [Commented] (CASSANDRA-16128) Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o instead of archiving
[ https://issues.apache.org/jira/browse/CASSANDRA-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203860#comment-17203860 ]

Michael Semb Wever commented on CASSANDRA-16128:

In-tree patches
- [2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...thelastpickle:mck/cassandra-2.2_jenkinsfile_2020-08]
- [3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...thelastpickle:mck/cassandra-3.0_jenkinsfile_2020-08]
- [3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:mck/cassandra-3.11_jenkinsfile_2020-08]
- [trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/jenkinsfile_2020-08]

> Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o
> instead of archiving
> ---
>
> Key: CASSANDRA-16128
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16128
> Project: Cassandra
> Issue Type: Task
> Components: CI
> Reporter: Michael Semb Wever
> Assignee: Michael Semb Wever
> Priority: Normal
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0-beta
>
>
> Jenkins improvements
> 1. Add the cassandra-website job into cassandra_job_dsl.seed.groovy (so we
> don't lose it next time the Jenkins master is corrupted)
> 2. Print the SHAs of the different git repos used during the build process.
> Also store them in the .head files (so the pipeline can print them out too).
> 3. Instead of archiving artefacts, ssh them to
> https://nightlies.apache.org/cassandra/
> (Disk usage on agents is largely under control, but disk usage on master was
> the new problem. The suspicion here is that the Cassandra-*-artifact's artefacts
> were the disk usage culprit, though we have no evidence to support it.)
[jira] [Updated] (CASSANDRA-16128) Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o instead of archiving
[ https://issues.apache.org/jira/browse/CASSANDRA-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Semb Wever updated CASSANDRA-16128:
---
Status: Patch Available (was: In Progress)

> Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o
> instead of archiving
> ---
>
> Key: CASSANDRA-16128
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16128
> Project: Cassandra
> Issue Type: Task
> Components: CI
> Reporter: Michael Semb Wever
> Assignee: Michael Semb Wever
> Priority: Normal
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0-beta
>
>
> Jenkins improvements
> 1. Add the cassandra-website job into cassandra_job_dsl.seed.groovy (so we
> don't lose it next time the Jenkins master is corrupted)
> 2. Print the SHAs of the different git repos used during the build process.
> Also store them in the .head files (so the pipeline can print them out too).
> 3. Instead of archiving artefacts, ssh them to
> https://nightlies.apache.org/cassandra/
> (Disk usage on agents is largely under control, but disk usage on master was
> the new problem. The suspicion here is that the Cassandra-*-artifact's artefacts
> were the disk usage culprit, though we have no evidence to support it.)
[jira] [Updated] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables
[ https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andres de la Peña updated CASSANDRA-16063:
---
Reviewers: Andres de la Peña, Sylvain Lebresne, Andres de la Peña (was: Andres de la Peña, Sylvain Lebresne)
Status: Review In Progress (was: Patch Available)

> Fix user experience when upgrading to 4.0 with compact tables
> ---
>
> Key: CASSANDRA-16063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
> Project: Cassandra
> Issue Type: Bug
> Components: Legacy/CQL
> Reporter: Sylvain Lebresne
> Assignee: Ekaterina Dimitrova
> Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: Compact_storage_upgrade_tests.txt
>
>
> The code to handle compact tables has been removed from 4.0, and the intended
> upgrade path to 4.0 for users having compact tables on 3.x is that they must
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables
> *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or miss a table)
> and may try upgrading despite still having compact tables. If they do so, the
> intent is that the node will _not_ start, with a message clearly indicating
> the pre-upgrade step the user has missed. The user will then downgrade
> the node(s) back to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and
> then upgrade again.
> But while 4.0 does currently fail startup when finding any compact tables
> with a decent message, I believe the check is done too late during startup.
> Namely, that check is done as we read the tables schema, so within
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
> But by then, we've _at least_ called
> {{SystemKeyspace.persistLocalMetadata()}} and
> {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log,
> and even possibly flush new {{na}} format sstables. As a result, a user
> might not be able to seamlessly restart the node on 3.x (to drop compact
> storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0
> startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0
> with compact storage, you can downgrade back with no intervention whatsoever).