[jira] [Updated] (CASSANDRA-16150) Upgrade to snakeyaml >= 1.26 version for CVE-2017-18640 fix

2020-09-29 Thread Rahul Nandi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Nandi updated CASSANDRA-16150:

Description: 
A critical-level CVE (CVE-2017-18640) has been discovered in snakeyaml 
versions earlier than 1.26. It has been patched in snakeyaml version 1.26.

Reference: [https://nvd.nist.gov/vuln/detail/CVE-2017-18640]

This card is expected to upgrade the snakeyaml version to 1.26.

  was:
A critical-level CVE ( [CVE-2017-18640 | 
[https://nvd.nist.gov/vuln/detail/CVE-2017-18640]] ) has been discovered in snakeyaml 
versions earlier than 1.26. It has been patched in snakeyaml version 1.26.

This card is expected to upgrade the snakeyaml version to 1.26.


> Upgrade to snakeyaml >= 1.26 version for CVE-2017-18640 fix
> ---
>
> Key: CASSANDRA-16150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Dependencies
>Reporter: Rahul Nandi
>Assignee: Rahul Nandi
>Priority: Normal
>
> A critical-level CVE (CVE-2017-18640) has been discovered in snakeyaml 
> versions earlier than 1.26. It has been patched in snakeyaml version 1.26.
> Reference: [https://nvd.nist.gov/vuln/detail/CVE-2017-18640]
> This card is expected to upgrade the snakeyaml version to 1.26.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16150) Upgrade to snakeyaml >= 1.26 version for CVE-2017-18640 fix

2020-09-29 Thread Rahul Nandi (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Nandi updated CASSANDRA-16150:

Description: 
A critical-level CVE ( [CVE-2017-18640 | 
[https://nvd.nist.gov/vuln/detail/CVE-2017-18640]] ) has been discovered in snakeyaml 
versions earlier than 1.26. It has been patched in snakeyaml version 1.26.

This card is expected to upgrade the snakeyaml version to 1.26.

  was:
A critical-level CVE 
([CVE-2017-18640|[https://nvd.nist.gov/vuln/detail/CVE-2017-18640]]) has been discovered 
in snakeyaml versions earlier than 1.26. It has been patched in snakeyaml 
version 1.26.

This card is expected to upgrade the snakeyaml version to 1.26.


> Upgrade to snakeyaml >= 1.26 version for CVE-2017-18640 fix
> ---
>
> Key: CASSANDRA-16150
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16150
> Project: Cassandra
>  Issue Type: Bug
>  Components: Dependencies
>Reporter: Rahul Nandi
>Assignee: Rahul Nandi
>Priority: Normal
>
> A critical-level CVE ( [CVE-2017-18640 | 
> [https://nvd.nist.gov/vuln/detail/CVE-2017-18640]] ) has been discovered in snakeyaml 
> versions earlier than 1.26. It has been patched in snakeyaml version 1.26.
> This card is expected to upgrade the snakeyaml version to 1.26.






[jira] [Created] (CASSANDRA-16150) Upgrade to snakeyaml >= 1.26 version for CVE-2017-18640 fix

2020-09-29 Thread Rahul Nandi (Jira)
Rahul Nandi created CASSANDRA-16150:
---

 Summary: Upgrade to snakeyaml >= 1.26 version for CVE-2017-18640 
fix
 Key: CASSANDRA-16150
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16150
 Project: Cassandra
  Issue Type: Bug
  Components: Dependencies
Reporter: Rahul Nandi
Assignee: Rahul Nandi


A critical-level CVE 
([CVE-2017-18640|[https://nvd.nist.gov/vuln/detail/CVE-2017-18640]]) has been discovered 
in snakeyaml versions earlier than 1.26. It has been patched in snakeyaml 
version 1.26.

This card is expected to upgrade the snakeyaml version to 1.26.






[jira] [Commented] (CASSANDRA-16128) Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o instead of archiving

2020-09-29 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204449#comment-17204449
 ] 

Berenguer Blasi commented on CASSANDRA-16128:
-

lgtm.

> Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o 
> instead of archiving
> ---
>
> Key: CASSANDRA-16128
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16128
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0-beta
>
>
> Jenkins improvements
> 1. Add the cassandra-website job into cassandra_job_dsl.seed.groovy (so we 
> don't lose it next time the Jenkins master is corrupted)
> 2. Print the SHAs of the different git repos used during the build process. 
> Also store them in the .head files (so the pipeline can print them out too).
> 3. Instead of archiving artefacts, ssh them to 
> https://nightlies.apache.org/cassandra/
> (Disk usage on agents is largely under control, but disk usage on master was 
> the new problem. The suspicion here is that the Cassandra-*-artifact's artefacts 
> were the disk usage culprit, though we have no evidence to support it.)






[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling

2020-09-29 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204423#comment-17204423
 ] 

Berenguer Blasi commented on CASSANDRA-15991:
-

Hmm, weird that it passed for me locally. I was waiting for a thumbs up on the PR 
before triggering all the CI jobs, to spare you that [~dcapwell]. Anyway, I pushed a fix and 
checked it in 
[circle|https://app.circleci.com/pipelines/github/bereng/cassandra/133/workflows/bf77a2f4-3243-4051-884a-2b2a83d777be/jobs/1133]
 as well.

> 15583 - Add UX tests to intree LHF tooling
> --
>
> Key: CASSANDRA-15991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15991
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As per CASSANDRA-15583, many in-tree tools lack proper UX tooling: mandatory 
> params that are actually mandatory, a 'help' that produces actual help text, return codes, etc.
> This ticket is an attempt to add it to those tools that qualify as LHF 
> (low-hanging fruit). Other tools such as nodetool, with many sub-commands, 
> deserve a separate ticket of their own.






[jira] [Comment Edited] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204331#comment-17204331
 ] 

David Capwell edited comment on CASSANDRA-16036 at 9/30/20, 12:08 AM:
--

Updated CI results

Circle: 
https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16036-trunk-2EBAD3E9-4394-4D42-9213-69A6590F37E2
 (expected test failures caused by other JIRA, and 1 flaky test in no-vnode 
case but not in vnode case)
Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/52/

trunk baseline: 
https://app.circleci.com/pipelines/github/dcapwell/cassandra/574/workflows/19f38f3c-9da3-42d5-ba5f-269f0285b791


was (Author: dcapwell):
Updated CI results (pending)

Circle: 
https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16036-trunk-2EBAD3E9-4394-4D42-9213-69A6590F37E2
Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/52/

trunk baseline: 
https://app.circleci.com/pipelines/github/dcapwell/cassandra/574/workflows/19f38f3c-9da3-42d5-ba5f-269f0285b791

> Add flag to disable chunk cache and disable by default
> --
>
> Key: CASSANDRA-16036
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16036
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-beta3
>
> Attachments: 15229_128mb.png, 16036_128mb.png, 
> async-profile.collapsed.svg, 
> clustering-in-clause_latency_selects_baseline.png, 
> clustering-in-clause_latency_selects_baseline_attempt3.png, 
> clustering-in-clause_latency_under90_selects_baseline.png, 
> clustering-in-clause_latency_under90_selects_baseline_attempt3.png, 
> clustering-slice_latency_selects_baseline.png, 
> clustering-slice_latency_under90_selects_baseline.png, 
> medium-blobs_latency_selects_baseline.png, 
> medium-blobs_latency_under90_selects_baseline.png, 
> partition-single-row-read_latency_selects_baseline.png, 
> partition-single-row-read_latency_under90_selects_baseline.png
>
>
> Chunk cache is enabled by default and doesn’t have a flag to disable without 
> impacting networking.  In performance testing 4.0 against 3.0 I found that 
> reads were slower in 4.0 and after profiling found that the ChunkCache was 
> partially to blame; after disabling the chunk cache, read performance had 
> improved.
> {code}
> 40_w_cc-selects.hdr
> #[Mean= 11.50063, StdDeviation   = 13.44014]
> #[Max =482.41254, Total count=   316477]
> #[Buckets =   25, SubBuckets =   262144]
> 40_wo_cc-selects.hdr
> #[Mean=  9.82115, StdDeviation   = 10.14270]
> #[Max =522.36493, Total count=   317444]
> #[Buckets =   25, SubBuckets =   262144]
> {code}






[jira] [Updated] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-09-29 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16036:
--
  Fix Version/s: (was: 4.0-beta)
 4.0-beta3
  Since Version: 3.11.0
Source Control Link: 
https://github.com/apache/cassandra/commit/d4f501892d882cb1bf62529f0e72cf7d9c61e323
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Add flag to disable chunk cache and disable by default
> --
>
> Key: CASSANDRA-16036
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16036
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-beta3
>
> Attachments: 15229_128mb.png, 16036_128mb.png, 
> async-profile.collapsed.svg, 
> clustering-in-clause_latency_selects_baseline.png, 
> clustering-in-clause_latency_selects_baseline_attempt3.png, 
> clustering-in-clause_latency_under90_selects_baseline.png, 
> clustering-in-clause_latency_under90_selects_baseline_attempt3.png, 
> clustering-slice_latency_selects_baseline.png, 
> clustering-slice_latency_under90_selects_baseline.png, 
> medium-blobs_latency_selects_baseline.png, 
> medium-blobs_latency_under90_selects_baseline.png, 
> partition-single-row-read_latency_selects_baseline.png, 
> partition-single-row-read_latency_under90_selects_baseline.png
>
>
> Chunk cache is enabled by default and doesn’t have a flag to disable without 
> impacting networking.  In performance testing 4.0 against 3.0 I found that 
> reads were slower in 4.0 and after profiling found that the ChunkCache was 
> partially to blame; after disabling the chunk cache, read performance had 
> improved.
> {code}
> 40_w_cc-selects.hdr
> #[Mean= 11.50063, StdDeviation   = 13.44014]
> #[Max =482.41254, Total count=   316477]
> #[Buckets =   25, SubBuckets =   262144]
> 40_wo_cc-selects.hdr
> #[Mean=  9.82115, StdDeviation   = 10.14270]
> #[Max =522.36493, Total count=   317444]
> #[Buckets =   25, SubBuckets =   262144]
> {code}






[cassandra] branch trunk updated: Add flag to disable chunk cache and disable by default

2020-09-29 Thread dcapwell
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 79e693e  Add flag to disable chunk cache and disable by default
79e693e is described below

commit 79e693e16e2152097c5b27d2d7aaa1763e34f594
Author: David Capwell 
AuthorDate: Tue Sep 29 15:26:37 2020 -0700

Add flag to disable chunk cache and disable by default

patch by David Capwell; reviewed by Jon Meredith, Zhao Yang for 
CASSANDRA-16036
---
 CHANGES.txt  | 1 +
 conf/cassandra.yaml  | 4 
 src/java/org/apache/cassandra/cache/ChunkCache.java  | 2 +-
 src/java/org/apache/cassandra/config/Config.java | 2 ++
 src/java/org/apache/cassandra/config/DatabaseDescriptor.java | 5 +
 test/conf/cassandra.yaml | 1 +
 6 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 190eebc..d1fa00e 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -16,6 +16,7 @@
  * Mutating sstable component may race with entire-sstable-streaming(ZCS) 
causing checksum validation failure (CASSANDRA-15861)
  * NPE thrown while updating speculative execution time if keyspace is removed 
during task execution (CASSANDRA-15949)
  * Show the progress of data streaming and index build (CASSANDRA-15406)
+ * Add flag to disable chunk cache and disable by default (CASSANDRA-16036)
 Merged from 3.11:
  * Don't attempt value skipping with mixed version cluster (CASSANDRA-15833)
  * Use IF NOT EXISTS for index and UDT create statements in snapshot schema 
files (CASSANDRA-13935)
diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index fcd2ffa..ff414ed 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -469,6 +469,10 @@ concurrent_counter_writes: 32
 # be limited by the less of concurrent reads or concurrent writes.
 concurrent_materialized_view_writes: 32
 
+# Enable the sstable chunk cache.  The chunk cache will store recently accessed
+# sections of the sstable in-memory as uncompressed buffers.
+# file_cache_enabled: false
+
 # Maximum memory to use for sstable chunk cache and buffer pooling.
 # 32MB of this are reserved for pooling buffers, the rest is used as an
 # cache that holds uncompressed sstable chunks.
diff --git a/src/java/org/apache/cassandra/cache/ChunkCache.java 
b/src/java/org/apache/cassandra/cache/ChunkCache.java
index e370206..ae38015 100644
--- a/src/java/org/apache/cassandra/cache/ChunkCache.java
+++ b/src/java/org/apache/cassandra/cache/ChunkCache.java
@@ -42,7 +42,7 @@ public class ChunkCache
 public static final long cacheSize = 1024L * 1024L * Math.max(0, 
DatabaseDescriptor.getFileCacheSizeInMB() - RESERVED_POOL_SPACE_IN_MB);
 public static final boolean roundUp = 
DatabaseDescriptor.getFileCacheRoundUp();
 
-private static boolean enabled = cacheSize > 0;
+private static boolean enabled = DatabaseDescriptor.getFileCacheEnabled() 
&& cacheSize > 0;
 public static final ChunkCache instance = enabled ? new ChunkCache() : 
null;
 
 private final LoadingCache cache;
diff --git a/src/java/org/apache/cassandra/config/Config.java 
b/src/java/org/apache/cassandra/config/Config.java
index 6abdfba..da410155 100644
--- a/src/java/org/apache/cassandra/config/Config.java
+++ b/src/java/org/apache/cassandra/config/Config.java
@@ -304,6 +304,8 @@ public class Config
 
 public Integer file_cache_size_in_mb;
 
+public boolean file_cache_enabled = 
Boolean.getBoolean("cassandra.file_cache_enabled");
+
 /**
  * Because of the current {@link 
org.apache.cassandra.utils.memory.BufferPool} slab sizes of 64 kb, we
  * store in the file cache buffers that divide 64 kb, so we need to round 
the buffer sizes to powers of two.
diff --git a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java 
b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
index 3b5fdfb..e8e66fa 100644
--- a/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
+++ b/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
@@ -2432,6 +2432,11 @@ public class DatabaseDescriptor
 conf.incremental_backups = value;
 }
 
+public static boolean getFileCacheEnabled()
+{
+return conf.file_cache_enabled;
+}
+
 public static int getFileCacheSizeInMB()
 {
 if (conf.file_cache_size_in_mb == null)
diff --git a/test/conf/cassandra.yaml b/test/conf/cassandra.yaml
index 89b7ff1..38e012f 100644
--- a/test/conf/cassandra.yaml
+++ b/test/conf/cassandra.yaml
@@ -50,3 +50,4 @@ stream_entire_sstables: true
 stream_throughput_outbound_megabits_per_sec: 2
 enable_sasi_indexes: true
 enable_materialized_views: true
+file_cache_enabled: true
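
As a rough illustration of how the new flag gates the cache, the sketch below mirrors the `enabled` expression from the patch using stand-in code. `ChunkCacheFlagSketch`, `resolveFlag`, and `isEnabled` are hypothetical names; only `Boolean.getBoolean`, the property name `cassandra.file_cache_enabled`, and the shape of the `cacheSize` expression come from the diff above. The yaml-overrides-default behaviour in `resolveFlag` is an assumption about how the config loader populates fields, not something the diff shows.

```java
final class ChunkCacheFlagSketch
{
    // Stand-in for the Config.file_cache_enabled default: read from a JVM system property.
    // Boolean.getBoolean returns false when the property is unset, hence "disabled by default".
    static boolean defaultFromSystemProperty()
    {
        return Boolean.getBoolean("cassandra.file_cache_enabled");
    }

    // Assumption: a value present in cassandra.yaml replaces the field default.
    // A null yamlValue models "file_cache_enabled is not set in the yaml".
    static boolean resolveFlag(Boolean yamlValue)
    {
        return yamlValue != null ? yamlValue : defaultFromSystemProperty();
    }

    // Mirrors the patched expression: enabled = getFileCacheEnabled() && cacheSize > 0.
    static boolean isEnabled(boolean fileCacheEnabled, long fileCacheSizeInMB, long reservedPoolSpaceInMB)
    {
        long cacheSize = 1024L * 1024L * Math.max(0, fileCacheSizeInMB - reservedPoolSpaceInMB);
        return fileCacheEnabled && cacheSize > 0;
    }
}
```

Under these assumptions, the cache stays off unless the flag is set to true and the configured size exceeds the reserved pool space.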



[jira] [Created] (CASSANDRA-16149) Record the expiration time for hints files to avoid loading expired ones

2020-09-29 Thread Yifan Cai (Jira)
Yifan Cai created CASSANDRA-16149:
-

 Summary: Record the expiration time for hints files to avoid 
loading expired ones
 Key: CASSANDRA-16149
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16149
 Project: Cassandra
  Issue Type: Improvement
  Components: Local/Other
Reporter: Yifan Cai


The expiration time of a hints file is considered to be the latest expiration 
time among all the hints in the file. If the current time exceeds the file 
expiration time, the file can be safely deleted. 

The expiration time can be determined when finishing writing to the hints file. 

The tricky part is that each hints file keeps the metadata at the header of the 
file, but the expiration time is only known at the end. So we may want to save 
the metadata in a companion file of the hints. This approach is also 
future-proof, in case we want to add more metadata. 
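
The idea above could be sketched roughly as follows. This is not taken from any patch: `HintsExpirationSketch`, its methods, and the `.expiry` companion suffix are all hypothetical, and expirations are assumed to be epoch milliseconds.

```java
import java.nio.file.Path;
import java.util.List;

final class HintsExpirationSketch
{
    // The file expires when its longest-lived hint expires: take the maximum.
    // An empty list yields Long.MIN_VALUE, i.e. a file with no hints is already expired.
    static long fileExpirationMillis(List<Long> hintExpirationsMillis)
    {
        long latest = Long.MIN_VALUE;
        for (long expiration : hintExpirationsMillis)
            latest = Math.max(latest, expiration);
        return latest;
    }

    // The file can be deleted once the current time is past the file expiration.
    static boolean canDelete(long fileExpirationMillis, long nowMillis)
    {
        return nowMillis > fileExpirationMillis;
    }

    // Hypothetical naming scheme for the companion metadata file, written once
    // the hints file is finished and the expiration is finally known.
    static Path companionFor(Path hintsFile)
    {
        return hintsFile.resolveSibling(hintsFile.getFileName() + ".expiry");
    }
}
```

Keeping the expiration in a sibling file avoids rewriting the header of a hints file that has already been written sequentially.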






[jira] [Commented] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204375#comment-17204375
 ] 

David Capwell commented on CASSANDRA-16147:
---

test LGTM thanks!  +1

> ValueAccessor is using signed shorts in sliceWithShortLength
> 
>
> Key: CASSANDRA-16147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16147
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0-beta
>
>
> ValueAccessor is using a signed short when interpreting byte lengths, causing 
> exceptions when reading blobs over 32767 bytes in length






[jira] [Commented] (CASSANDRA-15234) Standardise config and JVM parameters

2020-09-29 Thread Jon Meredith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204374#comment-17204374
 ] 

Jon Meredith commented on CASSANDRA-15234:
--

The improvements are definitely very valuable and make configuration much 
cleaner and more flexible, but I'm also concerned it's too late in the cycle.

Although the patch goes to great lengths to be backward compatible, people that 
have been working towards getting ready for production deployments would need 
to re-test all the configurations they've worked through so far which would 
certainly cause rework to validate the release.

> Standardise config and JVM parameters
> -
>
> Key: CASSANDRA-15234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15234
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Benedict Elliott Smith
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-alpha
>
> Attachments: CASSANDRA-15234-3-DTests-JAVA8.txt
>
>
> We have a bunch of inconsistent names and config patterns in the codebase, 
> both in the yaml and JVM properties.  It would be nice to standardise the 
> naming (such as otc_ vs internode_) as well as the provision of values with 
> units - while maintaining perpetual backwards compatibility with the old 
> parameter names, of course.
> For temporal units, I would propose parsing strings with suffixes of:
> {{code}}
> u|micros(econds?)?
> ms|millis(econds?)?
> s(econds?)?
> m(inutes?)?
> h(ours?)?
> d(ays?)?
> mo(nths?)?
> {{code}}
> For rate units, I would propose parsing any of the standard {{B/s, KiB/s, 
> MiB/s, GiB/s, TiB/s}}.
> Perhaps for avoiding ambiguity we could not accept bauds {{bs, Mbps}} or 
> powers of 1000 such as {{KB/s}}, given these are regularly used for either 
> their old or new definition e.g. {{KiB/s}}, or we could support them and 
> simply log the value in bytes/s.
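
A minimal sketch of how the temporal suffixes proposed above might be parsed, assuming a non-negative integer quantity followed by one of the listed suffixes. The suffix patterns are copied from the proposal; `DurationSuffixSketch`, `Parsed`, and `parse` are hypothetical names, and this is not the parser from the actual patch.

```java
import java.time.temporal.ChronoUnit;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DurationSuffixSketch
{
    // Simple holder for a parsed quantity and its unit.
    static final class Parsed
    {
        final long quantity;
        final ChronoUnit unit;

        Parsed(long quantity, ChronoUnit unit)
        {
            this.quantity = quantity;
            this.unit = unit;
        }
    }

    // Suffix patterns from the proposal, each mapped to a java.time unit.
    private static final Map<Pattern, ChronoUnit> UNITS = new LinkedHashMap<>();
    static
    {
        UNITS.put(Pattern.compile("u|micros(econds?)?"), ChronoUnit.MICROS);
        UNITS.put(Pattern.compile("ms|millis(econds?)?"), ChronoUnit.MILLIS);
        UNITS.put(Pattern.compile("s(econds?)?"), ChronoUnit.SECONDS);
        UNITS.put(Pattern.compile("m(inutes?)?"), ChronoUnit.MINUTES);
        UNITS.put(Pattern.compile("h(ours?)?"), ChronoUnit.HOURS);
        UNITS.put(Pattern.compile("d(ays?)?"), ChronoUnit.DAYS);
        UNITS.put(Pattern.compile("mo(nths?)?"), ChronoUnit.MONTHS);
    }

    // Digits, optional whitespace, then a lowercase suffix.
    private static final Pattern VALUE = Pattern.compile("(\\d+)\\s*([a-z]+)");

    static Parsed parse(String text)
    {
        Matcher m = VALUE.matcher(text.trim());
        if (!m.matches())
            throw new IllegalArgumentException("Not a duration: " + text);
        long quantity = Long.parseLong(m.group(1));
        String suffix = m.group(2);
        // matches() requires the whole suffix to match, so "mo" cannot be mistaken for "m".
        for (Map.Entry<Pattern, ChronoUnit> e : UNITS.entrySet())
            if (e.getKey().matcher(suffix).matches())
                return new Parsed(quantity, e.getValue());
        throw new IllegalArgumentException("Unknown unit suffix: " + suffix);
    }
}
```

With this sketch, `parse("10ms")` yields 10 `MILLIS` and `parse("2 months")` yields 2 `MONTHS`; an unrecognised suffix is rejected rather than guessed.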






[jira] [Commented] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204373#comment-17204373
 ] 

David Capwell commented on CASSANDRA-16147:
---

looking now.

> ValueAccessor is using signed shorts in sliceWithShortLength
> 
>
> Key: CASSANDRA-16147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16147
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0-beta
>
>
> ValueAccessor is using a signed short when interpreting byte lengths, causing 
> exceptions when reading blobs over 32767 bytes in length






[jira] [Commented] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204371#comment-17204371
 ] 

David Capwell commented on CASSANDRA-16147:
---

I modified split to trigger this logic

{code}
l[i++] = ByteBufferAccessor.instance.sliceWithShortLength(bb, bb.position());
bb.position(bb.position() + 2 + l[i - 1].remaining());
{code}

The test causes the size to be read as -2, and since the offset is 2 for the header, the 
returned buffer has length 0.
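
A small standalone sketch of the sign issue, using only `java.nio.ByteBuffer`. The class and method names are hypothetical and this is not the Cassandra code; it only shows how a two-byte length prefix of 0xFFFE reads as -2 when treated as a signed short and as 65534 when masked to an unsigned value.

```java
import java.nio.ByteBuffer;

final class ShortLengthSketch
{
    // Reads the 2-byte length prefix as a signed short (the problematic interpretation).
    static int signedLength(ByteBuffer bb, int offset)
    {
        return bb.getShort(offset);
    }

    // Reads the same 2 bytes as an unsigned 16-bit value in the range 0..65535.
    static int unsignedLength(ByteBuffer bb, int offset)
    {
        return bb.getShort(offset) & 0xFFFF;
    }

    public static void main(String[] args)
    {
        ByteBuffer bb = ByteBuffer.allocate(2);
        bb.putShort(0, (short) 0xFFFE); // length prefix for a 65534-byte value
        System.out.println(signedLength(bb, 0));   // prints -2
        System.out.println(unsignedLength(bb, 0)); // prints 65534
    }
}
```

Any length above 32767 sets the high bit of the prefix, so the signed read goes negative exactly for the blob sizes mentioned in the ticket.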


> ValueAccessor is using signed shorts in sliceWithShortLength
> 
>
> Key: CASSANDRA-16147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16147
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0-beta
>
>
> ValueAccessor is using a signed short when interpreting byte lengths, causing 
> exceptions when reading blobs over 32767 bytes in length






[jira] [Commented] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength

2020-09-29 Thread Blake Eggleston (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204369#comment-17204369
 ] 

Blake Eggleston commented on CASSANDRA-16147:
-

yep, build and split both do. I've updated the test to use to/from string.

> ValueAccessor is using signed shorts in sliceWithShortLength
> 
>
> Key: CASSANDRA-16147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16147
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0-beta
>
>
> ValueAccessor is using a signed short when interpreting byte lengths, causing 
> exceptions when reading blobs over 32767 bytes in length






[jira] [Commented] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204365#comment-17204365
 ] 

David Capwell commented on CASSANDRA-16147:
---

the test passes on trunk, and one of the reasons is that 
org.apache.cassandra.db.marshal.CompositeType#build(org.apache.cassandra.db.marshal.ValueAccessor,
 boolean, V...) uses ByteBuffer directly

{code}
@SafeVarargs
public static <V> V build(ValueAccessor<V> accessor, boolean isStatic, V... values)
{
..

ByteBuffer out = ByteBuffer.allocate(totalLength);

...
for (V v : values)
{
ByteBufferUtil.writeShortLength(out, accessor.size(v));
...
}
{code}

And org.apache.cassandra.db.marshal.CompositeType#split also does the same

{code}
while (bb.remaining() > 0)
{
l[i++] = ByteBufferUtil.readBytesWithShortLength(bb);
bb.get(); // skip end-of-component
}
{code}

> ValueAccessor is using signed shorts in sliceWithShortLength
> 
>
> Key: CASSANDRA-16147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16147
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0-beta
>
>
> ValueAccessor is using a signed short when interpreting byte lengths, causing 
> exceptions when reading blobs over 32767 bytes in length



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength

2020-09-29 Thread Blake Eggleston (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204362#comment-17204362
 ] 

Blake Eggleston commented on CASSANDRA-16147:
-

Variable length data types can have values > 0x, but composite types can't, 
so I added a test around composite types with large values.

> ValueAccessor is using signed shorts in sliceWithShortLength
> 
>
> Key: CASSANDRA-16147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16147
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0-beta
>
>
> ValueAccessor is using a signed short when interpreting byte lengths, causing 
> exceptions when reading blobs over 32767 bytes in length






[jira] [Updated] (CASSANDRA-16148) GossiperTest#testHaveVersion3Nodes is failing on trunk

2020-09-29 Thread Jordan West (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan West updated CASSANDRA-16148:

Test and Documentation Plan: Make sure the test that is being fixed passes 
and no other tests were broken as a result
 Status: Patch Available  (was: Open)

[branch | https://github.com/jrwest/cassandra/tree/jwest/16148] [tests | 
https://app.circleci.com/pipelines/github/jrwest/cassandra?branch=jwest%2F16148]

> GossiperTest#testHaveVersion3Nodes is failing on trunk
> --
>
> Key: CASSANDRA-16148
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16148
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
>
> https://app.circleci.com/pipelines/github/jrwest/cassandra/53/workflows/95f9f401-1ef8-4b8d-9c64-3703d9669d95/jobs/771






[jira] [Commented] (CASSANDRA-15537) 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test

2020-09-29 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204353#comment-17204353
 ] 

Yifan Cai commented on CASSANDRA-15537:
---

Thank you [~pauloricardomg] for correlating the tickets! I should have done it 
when filing. :|

> 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
> -
>
> Key: CASSANDRA-15537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15537
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> Execution of upgrade and diff tests via cassandra-diff has proven to be one 
> of the most effective approaches toward identifying issues with the local 
> read/write path. These include instances of data loss, data corruption, data 
> resurrection, incorrect responses to queries, incomplete responses, and 
> others. Upgrade and diff tests can be executed concurrently with fault 
> injection (such as host or network failure), as well as during mixed-version 
> scenarios (such as upgrading half of the instances in a cluster, and running 
> upgradesstables on only half of the upgraded instances).
> Upgrade and diff tests are expected to continue through the release cycle, 
> and are a great way for contributors to gain confidence in the correctness of 
> the database under their own workloads.






[jira] [Created] (CASSANDRA-16148) GossiperTest#testHaveVersion3Nodes is failing on trunk

2020-09-29 Thread Jordan West (Jira)
Jordan West created CASSANDRA-16148:
---

 Summary: GossiperTest#testHaveVersion3Nodes is failing on trunk
 Key: CASSANDRA-16148
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16148
 Project: Cassandra
  Issue Type: Bug
  Components: Cluster/Gossip
Reporter: Jordan West
Assignee: Jordan West


https://app.circleci.com/pipelines/github/jrwest/cassandra/53/workflows/95f9f401-1ef8-4b8d-9c64-3703d9669d95/jobs/771



[jira] [Updated] (CASSANDRA-16148) GossiperTest#testHaveVersion3Nodes is failing on trunk

2020-09-29 Thread Jordan West (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jordan West updated CASSANDRA-16148:

 Bug Category: Parent values: Correctness(12982)Level 1 values: Test 
Failure(12990)
   Complexity: Normal
Discovered By: Unit Test
 Severity: Normal
   Status: Open  (was: Triage Needed)

> GossiperTest#testHaveVersion3Nodes is failing on trunk
> --
>
> Key: CASSANDRA-16148
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16148
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Jordan West
>Assignee: Jordan West
>Priority: Normal
>
> https://app.circleci.com/pipelines/github/jrwest/cassandra/53/workflows/95f9f401-1ef8-4b8d-9c64-3703d9669d95/jobs/771



[jira] [Updated] (CASSANDRA-15214) OOMs caught and not rethrown

2020-09-29 Thread Yifan Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yifan Cai updated CASSANDRA-15214:
--
Test and Documentation Plan: ci
 Status: Patch Available  (was: Open)

> OOMs caught and not rethrown
> 
>
> Key: CASSANDRA-15214
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15214
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Client, Messaging/Internode
>Reporter: Benedict Elliott Smith
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0, 4.0-rc
>
> Attachments: oom-experiments.zip
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Netty (at least, and perhaps elsewhere in Executors) catches all exceptions, 
> so presently there is no way to ensure that an OOM reaches the JVM handler to 
> trigger a crash/heapdump.
> It may be that the simplest most consistent way to do this would be to have a 
> single thread spawned at startup that waits for any exceptions we must 
> propagate to the Runtime.
> We could probably submit a patch upstream to Netty, but for a guaranteed 
> future proof approach, it may be worth paying the cost of a single thread.
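
The "single thread spawned at startup" idea from the description can be sketched 
roughly as follows. This is only a Python illustration of the pattern, not 
Cassandra or Netty code: {{FatalErrorPropagator}}, {{report}} and 
{{swallowing_framework}} are hypothetical names, and Python's {{MemoryError}} 
stands in for a JVM OOM.

```python
import queue
import threading


class FatalErrorPropagator:
    """Dedicated thread that waits for errors which must not be swallowed.

    Hypothetical helper for illustration; not part of Cassandra or Netty.
    """

    def __init__(self, on_fatal):
        self._errors = queue.Queue()
        self._on_fatal = on_fatal
        # Spawned once at startup; it blocks until an error is reported.
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def report(self, error):
        """Hand an error to the dedicated thread from any other thread."""
        self._errors.put(error)

    def _run(self):
        error = self._errors.get()  # blocks until the first report
        self._on_fatal(error)       # e.g. dump state and exit the process

    def join(self, timeout=None):
        self._thread.join(timeout)


def swallowing_framework(task, propagator):
    """Stand-in for a framework that catches everything and only logs."""
    try:
        task()
    except BaseException as error:
        if isinstance(error, MemoryError):
            propagator.report(error)  # forwarded before being swallowed
        # The framework swallows the error here; nothing is re-raised.


def failing_task():
    raise MemoryError("simulated out-of-memory")


seen = []
propagator = FatalErrorPropagator(on_fatal=seen.append)
swallowing_framework(failing_task, propagator)
propagator.join(timeout=5)
print(type(seen[0]).__name__)
```

In a real process the {{on_fatal}} callback would presumably terminate the 
process rather than append to a list; the list is used here only so the sketch 
can be run safely.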



[jira] [Commented] (CASSANDRA-15214) OOMs caught and not rethrown

2020-09-29 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204348#comment-17204348
 ] 

Yifan Cai commented on CASSANDRA-15214:
---

Talked with Benedict on Slack and cleared up my confusion. The 
{{JVMStabilityInspector}} is able to inspect the OOM error, but after it 
re-throws, Netty catches all throwables and simply logs them. It happens 
[here|https://github.com/netty/netty/blob/4.1/transport/src/main/java/io/netty/channel/AbstractChannelHandlerContext.java#L303-L316].
 That is why the {{propagateOutOfMemory}} parameter was added. 

I submitted a PR that forcefully produces a heap space OOM error when 
catching a direct buffer OOM. 
The PR also removes the {{propagateOutOfMemory}} parameter from the 
{{JVMStabilityInspector}}, because the PR makes sure the instance can 
crash/exit properly on OOM (see the gist below).

PR: https://github.com/apache/cassandra/pull/761
CI: 
https://app.circleci.com/pipelines/github/yifan-c/cassandra/112/workflows/293a4334-d2df-43f9-b532-1d79876701c1

I have also created a separate demo to show that the JVM invokes the OOM 
handler even if the OOM error (other than the direct buffer one) is swallowed 
by a catch block. 
The code and the output can be found in the gist: 
https://gist.github.com/yifan-c/82ff4fd7fbe83fe41113f6f14cba4907.

> OOMs caught and not rethrown
> 
>
> Key: CASSANDRA-15214
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15214
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Client, Messaging/Internode
>Reporter: Benedict Elliott Smith
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0, 4.0-rc
>
> Attachments: oom-experiments.zip
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Netty (at least, and perhaps elsewhere in Executors) catches all exceptions, 
> so presently there is no way to ensure that an OOM reaches the JVM handler to 
> trigger a crash/heapdump.
> It may be that the simplest most consistent way to do this would be to have a 
> single thread spawned at startup that waits for any exceptions we must 
> propagate to the Runtime.
> We could probably submit a patch upstream to Netty, but for a guaranteed 
> future proof approach, it may be worth paying the cost of a single thread.



[jira] [Commented] (CASSANDRA-15994) Fix flaky python dtest test_simple_rebuild - rebuild_test.TestRebuild

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204346#comment-17204346
 ] 

David Capwell commented on CASSANDRA-15994:
---

works for me.

> Fix flaky python dtest test_simple_rebuild - rebuild_test.TestRebuild
> -
>
> Key: CASSANDRA-15994
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15994
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Priority: Normal
> Fix For: 3.0.x
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/360/workflows/8e93a655-b66e-4bf2-8866-5f9a46487763/jobs/1847
> {code}
> >   assert self.rebuild_errors == 1, \
> 'rebuild errors should be 1, but found {}. Concurrent rebuild 
> should not be allowed, but one rebuild command should have 
> succeeded.'.format(self.rebuild_errors)
> E   AssertionError: rebuild errors should be 1, but found 0. Concurrent 
> rebuild should not be allowed, but one rebuild command should have succeeded.
> E   assert 0 == 1
> E+  where 0 =  0x7f29fe243518>.rebuild_errors
> {code}



[jira] [Commented] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657

2020-09-29 Thread Jordan West (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204339#comment-17204339
 ] 

Jordan West commented on CASSANDRA-15833:
-

The issue only affects trunk. My bad. Will open a JIRA to follow up. The test 
is likely failing because we changed the logic to make the method actually work 
as expected. 

> Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
> 
>
> Key: CASSANDRA-15833
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15833
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Jacek Lewandowski
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
> Attachments: CASSANDRA-15833-3.11.patch, CASSANDRA-15833-4.0.patch
>
>
> CASSANDRA-10657 introduced changes in how the ColumnFilter is interpreted. 
> This results in digest mismatch when querying incomplete set of columns from 
> a table with consistency that requires reaching instances running pre 
> CASSANDRA-10657 from nodes that include CASSANDRA-10657 (it was introduced in 
> Cassandra 3.4). 
> The fix is to bring back the previous behaviour until there are no instances 
> running pre CASSANDRA-10657 version. 
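
A rough illustration of why two interpretations of the same column filter can 
produce a digest mismatch even though the underlying data is identical. This is 
a Python sketch under simplifying assumptions: the row, the column names, the 
{{digest}} helper and both "interpretations" are hypothetical, and MD5 over a 
string encoding is only a stand-in for however replicas actually compute 
digests.

```python
import hashlib


def digest(row, columns):
    """Hash the given columns of a row in a stable (sorted) order."""
    hasher = hashlib.md5()
    for name in sorted(columns):
        hasher.update(f"{name}={row[name]};".encode("utf-8"))
    return hasher.hexdigest()


# Identical data on both replicas.
row = {"a": "x", "b": "y", "c": "z"}

# The query asks for an incomplete set of columns.
requested = {"a", "b"}

# Hypothetical interpretation A: digest exactly the requested columns.
columns_a = requested
# Hypothetical interpretation B: the same filter is read as "all columns".
columns_b = {"a", "b", "c"}

digest_a = digest(row, columns_a)
digest_b = digest(row, columns_b)

# Same data, different digests: the mismatch cannot be repaired away,
# because there is nothing wrong with the data itself.
print(digest_a == digest_b)
```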



[jira] [Comment Edited] (CASSANDRA-14746) Ensure Netty Internode Messaging Refactor is Solid

2020-09-29 Thread Vinay Chella (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204328#comment-17204328
 ] 

Vinay Chella edited comment on CASSANDRA-14746 at 9/29/20, 10:40 PM:
-

Thank you for following up on this [~pauloricardomg]
{quote}a) Is work on this issue still active?
{quote}
Yes, it was active until I took a long break from work for personal reasons. 
As you can see in CASSANDRA-15181 and CASSANDRA-14764, I started some of this 
work but had to put it on hold. I am starting to get back in motion and should 
be able to make progress in the coming weeks.
{quote}b) Can we complete this issue once all subtasks are completed or are 
there more subtasks to be added?
{quote}
Quoting from the description: "The goal is that 4.0 should have better latency, 
more throughput, fewer threads, fewer context switches, less GC allocation, and 
faster recovery time". I would say it is all about building confidence in 4.0; 
we can add more tasks as we make progress and have findings based on 
CASSANDRA-14747, CASSANDRA-15181, and CASSANDRA-14764.


was (Author: vinaykumarcse):
Thank you for following up on this [~pauloricardomg]
{quote}a) Is work on this issue still active?
{quote}
Yes, it was active until I took a long break from work for personal reasons,  
if you see CASSANDRA-15181 and CASSANDRA-14764, I started some of this work but 
had to put it on hold, I am starting to get back in motion, should be able to 
make progress in coming weeks. 

{quote}
b) Can we complete this issue once all subtasks are completed or are there more 
subtasks to be added?
{quote}
quoting from the description "The goal is that 4.0 should have better latency, 
more throughput, fewer threads, fewer context switches, less GC allocation, and 
faster recovery time" - I would say it is all about building the confidence in 
4.0, I can sign up to add more tasks as we make progress and findings based on 
CASSANDRA-14747, CASSANDRA-15181, and CASSANDRA-14764. 

> Ensure Netty Internode Messaging Refactor is Solid
> --
>
> Key: CASSANDRA-14746
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14746
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Streaming and Messaging
>Reporter: Joey Lynch
>Assignee: Joey Lynch
>Priority: Normal
>  Labels: 4.0-QA
> Fix For: 4.0-beta
>
>
> Before we release 4.0 let's ensure that the internode messaging refactor is 
> 100% solid. As internode messaging is naturally used in many code paths and 
> widely configurable we have a large number of cluster configurations and test 
> configurations that must be vetted.
> We plan to vary the following:
>  * Version of Cassandra 3.0.17 vs 4.0-alpha
>  * Cluster sizes with *multi-dc* deployments ranging from 6 - 100 nodes
>  * Client request rates varying between 1k QPS and 100k QPS of varying sizes 
> and shapes (BATCH, INSERT, SELECT point, SELECT range, etc ...)
>  * Internode compression
>  * Internode SSL (as well as openssl vs jdk)
>  * Internode Coalescing options
> We are looking to measure the following as appropriate:
>  * Latency distributions of reads and writes (lower is better)
>  * Scaling limit, aka maximum throughput before violating p99 latency 
> deadline of 10ms @ LOCAL_QUORUM, on a fixed hardware deployment for 100% 
> writes, 100% reads and 50-50 writes+reads (higher is better)
>  * Thread counts (lower is better)
>  * Context switches (lower is better)
>  * On-CPU time of tasks (higher periods without context switch is better)
>  * GC allocation rates / throughput for a fixed size heap (lower allocation 
> better)
>  * Streaming recovery time for a single node failure, i.e. can Cassandra 
> saturate the NIC
>  
> The goal is that 4.0 should have better latency, more throughput, fewer 
> threads, fewer context switches, less GC allocation, and faster recovery 
> time. I'm putting Jason Brown as the reviewer since he implemented most of 
> the internode refactor.
> Current collaborators driving this QA task: Dinesh Joshi, Jordan West, Joey 
> Lynch (Netflix), Vinay Chella (Netflix)
> Owning committer(s): Jason Brown
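
The "scaling limit" measurement above (maximum throughput before the p99 
latency deadline of 10ms is violated) could be computed along these lines. This 
is a minimal Python sketch, assuming latency samples have already been 
collected per request rate; the helper names and all sample values are made up 
for illustration and are not measurements from this ticket.

```python
def p99(samples_ms):
    """Nearest-rank 99th percentile of a list of latency samples (ms)."""
    ordered = sorted(samples_ms)
    rank = -(-99 * len(ordered) // 100)  # ceil(0.99 * n) with integer math
    return ordered[rank - 1]


def scaling_limit(runs, deadline_ms=10.0):
    """Highest request rate reached before the p99 deadline is first violated.

    `runs` maps a request rate (QPS) to the latency samples observed at
    that rate. Returns None if even the lowest rate misses the deadline.
    """
    limit = None
    for qps in sorted(runs):
        if p99(runs[qps]) > deadline_ms:
            break  # first violation: stop, higher rates do not count
        limit = qps
    return limit


# Synthetic samples, for illustration only.
runs = {
    1_000: [2.0] * 99 + [5.0],
    10_000: [4.0] * 98 + [9.0] * 2,
    100_000: [6.0] * 98 + [25.0] * 2,
}
print(scaling_limit(runs))
```

Stopping at the first violation, rather than taking the largest passing rate, 
matches the "before violating" wording and avoids counting a lucky run at a 
higher rate.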



[jira] [Commented] (CASSANDRA-16127) NullPointerException when calling nodetool enablethrift

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204337#comment-17204337
 ] 

David Capwell commented on CASSANDRA-16127:
---

3.0 and 3.11 bootstrap failed but were working before 
https://app.circleci.com/pipelines/github/dcapwell/cassandra/550/workflows/ca6c6551-01d4-4438-bd4d-c14e27fa9bfc/jobs/3035,
 looks like a change I made caused a regression; looking into it.

> NullPointerException when calling nodetool enablethrift
> ---
>
> Key: CASSANDRA-16127
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16127
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Thrift
>Reporter: Tibor Repasi
>Assignee: David Capwell
>Priority: Normal
> Fix For: 2.2.x, 3.0.x, 3.11.x
>
>
> Having thrift disabled, it's impossible to enable it again without restarting 
> the node:
> {code}
> $ nodetool statusthrift
> not running
> $ nodetool enablethrift
> error: null
> -- StackTrace --
> java.lang.NullPointerException
>   at 
> org.apache.cassandra.service.StorageService.startRPCServer(StorageService.java:392)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
>   at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
>   at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
>   at 
> com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
>   at 
> com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
>   at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:138)
>   at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:252)
>   at 
> com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:819)
>   at 
> com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:801)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1468)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:76)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1309)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1401)
>   at 
> javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:829)
>   at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
>   at sun.rmi.transport.Transport$1.run(Transport.java:200)
>   at sun.rmi.transport.Transport$1.run(Transport.java:197)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
>   at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:573)
>   at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:834)
>   at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:688)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:687)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}



[jira] [Commented] (CASSANDRA-15996) Fix flaky python dtest test_expiration_overflow_policy_capnowarn - ttl_test.TestTTL

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204333#comment-17204333
 ] 

David Capwell commented on CASSANDRA-15996:
---

works for me.

> Fix flaky python dtest test_expiration_overflow_policy_capnowarn - 
> ttl_test.TestTTL
> ---
>
> Key: CASSANDRA-15996
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15996
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Priority: Normal
> Fix For: 3.11.x
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/361/workflows/3a42fa45-1f60-4c95-86a4-15a6773e384e/jobs/1860
> {code}
> >   assert warning, 'Log message should be print for CAP and 
> > CAP_NOWARN policy'
> E   AssertionError: Log message should be print for CAP and 
> CAP_NOWARN policy
> E   assert []
> {code}



[jira] [Commented] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204331#comment-17204331
 ] 

David Capwell commented on CASSANDRA-16036:
---

Updated CI results (pending)

Circle: 
https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16036-trunk-2EBAD3E9-4394-4D42-9213-69A6590F37E2
Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/52/

trunk baseline: 
https://app.circleci.com/pipelines/github/dcapwell/cassandra/574/workflows/19f38f3c-9da3-42d5-ba5f-269f0285b791

> Add flag to disable chunk cache and disable by default
> --
>
> Key: CASSANDRA-16036
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16036
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: 15229_128mb.png, 16036_128mb.png, 
> async-profile.collapsed.svg, 
> clustering-in-clause_latency_selects_baseline.png, 
> clustering-in-clause_latency_selects_baseline_attempt3.png, 
> clustering-in-clause_latency_under90_selects_baseline.png, 
> clustering-in-clause_latency_under90_selects_baseline_attempt3.png, 
> clustering-slice_latency_selects_baseline.png, 
> clustering-slice_latency_under90_selects_baseline.png, 
> medium-blobs_latency_selects_baseline.png, 
> medium-blobs_latency_under90_selects_baseline.png, 
> partition-single-row-read_latency_selects_baseline.png, 
> partition-single-row-read_latency_under90_selects_baseline.png
>
>
> Chunk cache is enabled by default and doesn’t have a flag to disable without 
> impacting networking.  In performance testing 4.0 against 3.0 I found that 
> reads were slower in 4.0 and after profiling found that the ChunkCache was 
> partially to blame; after disabling the chunk cache, read performance had 
> improved.
> {code}
> 40_w_cc-selects.hdr
> #[Mean= 11.50063, StdDeviation   = 13.44014]
> #[Max =482.41254, Total count=   316477]
> #[Buckets =   25, SubBuckets =   262144]
> 40_wo_cc-selects.hdr
> #[Mean=  9.82115, StdDeviation   = 10.14270]
> #[Max =522.36493, Total count=   317444]
> #[Buckets =   25, SubBuckets =   262144]
> {code}
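
For a quick read of the two histogram summaries above, the relative change can 
be worked out as below. The numbers are copied from the summaries; the 
dictionary layout and the {{relative_change}} helper are just illustrative, and 
the unit of the values is not stated in the output.

```python
# Summary values copied from the two .hdr outputs in the description.
with_cache = {"mean": 11.50063, "stddev": 13.44014, "max": 482.41254, "count": 316477}
without_cache = {"mean": 9.82115, "stddev": 10.14270, "max": 522.36493, "count": 317444}


def relative_change(before, after):
    """Fractional change going from `before` to `after` (negative = lower)."""
    return (after - before) / before


mean_change = relative_change(with_cache["mean"], without_cache["mean"])
max_change = relative_change(with_cache["max"], without_cache["max"])

print(f"mean: {mean_change:+.1%}")
print(f"max:  {max_change:+.1%}")
```

On these figures the mean is roughly 14.6% lower without the chunk cache, while 
the observed max is somewhat higher, so the improvement shows in the average 
rather than in the single worst sample.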



[jira] [Commented] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204330#comment-17204330
 ] 

David Capwell commented on CASSANDRA-16036:
---

OK, so it looks like the read_repair tests and the gossiper test were broken by 
https://issues.apache.org/jira/browse/CASSANDRA-15833, so they can be ignored in 
these results.  Will rerun the tests with the commit to enable the cache in tests.

> Add flag to disable chunk cache and disable by default
> --
>
> Key: CASSANDRA-16036
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16036
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: 15229_128mb.png, 16036_128mb.png, 
> async-profile.collapsed.svg, 
> clustering-in-clause_latency_selects_baseline.png, 
> clustering-in-clause_latency_selects_baseline_attempt3.png, 
> clustering-in-clause_latency_under90_selects_baseline.png, 
> clustering-in-clause_latency_under90_selects_baseline_attempt3.png, 
> clustering-slice_latency_selects_baseline.png, 
> clustering-slice_latency_under90_selects_baseline.png, 
> medium-blobs_latency_selects_baseline.png, 
> medium-blobs_latency_under90_selects_baseline.png, 
> partition-single-row-read_latency_selects_baseline.png, 
> partition-single-row-read_latency_under90_selects_baseline.png
>
>
> Chunk cache is enabled by default and doesn’t have a flag to disable without 
> impacting networking.  In performance testing 4.0 against 3.0 I found that 
> reads were slower in 4.0 and after profiling found that the ChunkCache was 
> partially to blame; after disabling the chunk cache, read performance had 
> improved.
> {code}
> 40_w_cc-selects.hdr
> #[Mean= 11.50063, StdDeviation   = 13.44014]
> #[Max =482.41254, Total count=   316477]
> #[Buckets =   25, SubBuckets =   262144]
> 40_wo_cc-selects.hdr
> #[Mean=  9.82115, StdDeviation   = 10.14270]
> #[Max =522.36493, Total count=   317444]
> #[Buckets =   25, SubBuckets =   262144]
> {code}



[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204329#comment-17204329
 ] 

David Capwell commented on CASSANDRA-15991:
---

OK, so it looks like the read_repair tests and the gossiper test were broken by 
https://issues.apache.org/jira/browse/CASSANDRA-15833, so they can be ignored in 
these results.  [~Bereng] can you look into the 
`org.apache.cassandra.tools.SSTableRepairedAtSetterTest#testFilesArg` test?

> 15583 - Add UX tests to intree LHF tooling
> --
>
> Key: CASSANDRA-15991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15991
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As per CASSANDRA-15583 many in tree tools lack proper UX tooling: mandatory 
> params are indeed mandatory, 'help' produces an actual help, return codes etc
> This ticket is an attempt to add it to those tools that classify as LHF. 
> Other tools such as nodetool, with many sub-commands, deserve a separate 
> ticket of their own



[jira] [Comment Edited] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204326#comment-17204326
 ] 

David Capwell edited comment on CASSANDRA-15833 at 9/29/20, 10:24 PM:
--

Looks like this broke a unit test 
(https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/org.apache.cassandra.gms/GossiperTest/testHaveVersion3Nodes/history/)
  and read repair python dtest 
(https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/dtest.read_repair_test/TestReadRepair/test_alter_rf_and_run_read_repair/history/
 and 
https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/dtest-offheap.read_repair_test/TestReadRepairGuarantees/test_atomic_writes_blocking_/history/).

Didn't check 3.11 builds, only trunk.


was (Author: dcapwell):
Looks like this broke a unit test 
(https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/org.apache.cassandra.gms/GossiperTest/testHaveVersion3Nodes/history/)
  and read repair python dtest 
(https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/dtest.read_repair_test/TestReadRepair/test_alter_rf_and_run_read_repair/history/
 and 
https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/dtest-offheap.read_repair_test/TestReadRepairGuarantees/test_atomic_writes_blocking_/history/).

> Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
> 
>
> Key: CASSANDRA-15833
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15833
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Jacek Lewandowski
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
> Attachments: CASSANDRA-15833-3.11.patch, CASSANDRA-15833-4.0.patch
>
>
> CASSANDRA-10657 introduced changes in how the ColumnFilter is interpreted. 
> This results in digest mismatch when querying incomplete set of columns from 
> a table with consistency that requires reaching instances running pre 
> CASSANDRA-10657 from nodes that include CASSANDRA-10657 (it was introduced in 
> Cassandra 3.4). 
> The fix is to bring back the previous behaviour until there are no instances 
> running pre CASSANDRA-10657 version. 



[jira] [Commented] (CASSANDRA-14746) Ensure Netty Internode Messaging Refactor is Solid

2020-09-29 Thread Vinay Chella (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204328#comment-17204328
 ] 

Vinay Chella commented on CASSANDRA-14746:
--

Thank you for following up on this [~pauloricardomg]
{quote}a) Is work on this issue still active?
{quote}
Yes, it was active until I took a long break from work for personal reasons. 
As you can see in CASSANDRA-15181 and CASSANDRA-14764, I started some of this 
work but had to put it on hold. I am starting to get back in motion and should 
be able to make progress in the coming weeks. 

{quote}
b) Can we complete this issue once all subtasks are completed or are there more 
subtasks to be added?
{quote}
Quoting from the description: "The goal is that 4.0 should have better latency, 
more throughput, fewer threads, fewer context switches, less GC allocation, and 
faster recovery time". I would say it is all about building confidence in 4.0; 
I can sign up to add more tasks as we make progress and have findings based on 
CASSANDRA-14747, CASSANDRA-15181, and CASSANDRA-14764. 

> Ensure Netty Internode Messaging Refactor is Solid
> --
>
> Key: CASSANDRA-14746
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14746
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Streaming and Messaging
>Reporter: Joey Lynch
>Assignee: Joey Lynch
>Priority: Normal
>  Labels: 4.0-QA
> Fix For: 4.0-beta
>
>
> Before we release 4.0 let's ensure that the internode messaging refactor is 
> 100% solid. As internode messaging is naturally used in many code paths and 
> widely configurable we have a large number of cluster configurations and test 
> configurations that must be vetted.
> We plan to vary the following:
>  * Version of Cassandra 3.0.17 vs 4.0-alpha
>  * Cluster sizes with *multi-dc* deployments ranging from 6 - 100 nodes
>  * Client request rates varying between 1k QPS and 100k QPS of varying sizes 
> and shapes (BATCH, INSERT, SELECT point, SELECT range, etc ...)
>  * Internode compression
>  * Internode SSL (as well as openssl vs jdk)
>  * Internode Coalescing options
> We are looking to measure the following as appropriate:
>  * Latency distributions of reads and writes (lower is better)
>  * Scaling limit, aka maximum throughput before violating p99 latency 
> deadline of 10ms @ LOCAL_QUORUM, on a fixed hardware deployment for 100% 
> writes, 100% reads and 50-50 writes+reads (higher is better)
>  * Thread counts (lower is better)
>  * Context switches (lower is better)
>  * On-CPU time of tasks (higher periods without context switch is better)
>  * GC allocation rates / throughput for a fixed size heap (lower allocation 
> better)
>  * Streaming recovery time for a single node failure, i.e. can Cassandra 
> saturate the NIC
>  
> The goal is that 4.0 should have better latency, more throughput, fewer 
> threads, fewer context switches, less GC allocation, and faster recovery 
> time. I'm putting Jason Brown as the reviewer since he implemented most of 
> the internode refactor.
> Current collaborators driving this QA task: Dinesh Joshi, Jordan West, Joey 
> Lynch (Netflix), Vinay Chella (Netflix)
> Owning committer(s): Jason Brown



[jira] [Commented] (CASSANDRA-15833) Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204326#comment-17204326
 ] 

David Capwell commented on CASSANDRA-15833:
---

Looks like this broke a unit test 
(https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/org.apache.cassandra.gms/GossiperTest/testHaveVersion3Nodes/history/)
  and read repair python dtest 
(https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/dtest.read_repair_test/TestReadRepair/test_alter_rf_and_run_read_repair/history/
 and 
https://ci-cassandra.apache.org/job/Cassandra-trunk/38/testReport/dtest-offheap.read_repair_test/TestReadRepairGuarantees/test_atomic_writes_blocking_/history/).

> Unresolvable false digest mismatch during upgrade due to CASSANDRA-10657
> 
>
> Key: CASSANDRA-15833
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15833
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Jacek Lewandowski
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 3.11.x, 4.0-beta
>
> Attachments: CASSANDRA-15833-3.11.patch, CASSANDRA-15833-4.0.patch
>
>
> CASSANDRA-10657 (introduced in Cassandra 3.4) changed how the ColumnFilter is 
> interpreted. This results in a digest mismatch when a query for an incomplete 
> set of columns runs at a consistency level that requires nodes that include 
> CASSANDRA-10657 to reach instances running a pre-CASSANDRA-10657 version. 
> The fix is to bring back the previous behaviour until there are no instances 
> running a pre-CASSANDRA-10657 version. 
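
To illustrate why differing interpretations of a column filter can yield a "false" digest mismatch, here is a deliberately simplified sketch. It is not Cassandra's ColumnFilter or digest code: the class, the method names, and the row model (a map of column name to value) are all assumptions made for illustration. It only shows that two replicas holding identical data produce different digests if they include different column sets in the hash.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model: a row is a map of column name to value. */
final class DigestMismatchSketch {

    /** Hashes only the columns that this "replica" believes the filter selects. */
    static byte[] digest(Map<String, String> row, List<String> includedColumns) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            for (String column : includedColumns) {
                String value = row.get(column);
                if (value == null) {
                    continue; // column absent from the row
                }
                md.update(column.getBytes(StandardCharsets.UTF_8));
                md.update((byte) 0); // separator between name and value
                md.update(value.getBytes(StandardCharsets.UTF_8));
                md.update((byte) 0); // separator between cells
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is required on every Java platform
        }
    }

    /** Compares the digests two replicas would return for their view of the row. */
    static boolean digestsMatch(Map<String, String> rowA, List<String> columnsA,
                                Map<String, String> rowB, List<String> columnsB) {
        return Arrays.equals(digest(rowA, columnsA), digest(rowB, columnsB));
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        row.put("a", "1");
        row.put("b", "2");
        row.put("c", "3");
        // Same data on both replicas, but they disagree on which columns to hash.
        List<String> narrow = Arrays.asList("a");
        List<String> wide = Arrays.asList("a", "b", "c");
        System.out.println("same interpretation:      " + digestsMatch(row, narrow, row, narrow));
        System.out.println("different interpretation: " + digestsMatch(row, narrow, row, wide));
    }
}
```

Because the data is identical, a repair cannot make the digests agree, which is consistent with the "unresolvable" wording in the ticket title.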



[jira] [Commented] (CASSANDRA-15581) 4.0 quality testing: Compaction

2020-09-29 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204322#comment-17204322
 ] 

Paulo Motta commented on CASSANDRA-15581:
-

Hey [~blerer], did you have the chance to lay out a plan for this?

For context, I'm asking this to check the status of the 4.0 quality epic as 
part of [this 
discussion|https://www.mail-archive.com/dev@cassandra.apache.org/msg15881.html] 
on the mailing list.

> 4.0 quality testing: Compaction
> ---
>
> Key: CASSANDRA-15581
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15581
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: Benjamin Lerer
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Marcus Eriksson*
> Alongside the local and distributed read/write paths, we'll also want to 
> validate compaction. CASSANDRA-6696 introduced substantial 
> changes/improvements that require testing (esp. JBOD).



[jira] [Commented] (CASSANDRA-15537) 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test

2020-09-29 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204321#comment-17204321
 ] 

Paulo Motta commented on CASSANDRA-15537:
-

{quote}I don't know where exactly to do that, but any workflow changes require 
the assistance of infra and I strongly suspect adding a new relationship 
between tickets will as well.
{quote}
hmm OK, can't be bothered right now, guess it's not a big deal to live with 
that. :P  Thanks!

> 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
> -
>
> Key: CASSANDRA-15537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15537
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> Execution of upgrade and diff tests via cassandra-diff has proven to be one 
> of the most effective approaches toward identifying issues with the local 
> read/write path. These include instances of data loss, data corruption, data 
> resurrection, incorrect responses to queries, incomplete responses, and 
> others. Upgrade and diff tests can be executed concurrently with fault 
> injection (such as host or network failure), as well as during mixed-version 
> scenarios (such as upgrading half of the instances in a cluster, and running 
> upgradesstables on only half of the upgraded instances).
> Upgrade and diff tests are expected to continue through the release cycle, 
> and are a great way for contributors to gain confidence in the correctness of 
> the database under their own workloads.
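
As a rough sketch of the idea behind a diff test (not cassandra-diff's actual implementation or API), the comparison boils down to reading the same keys before and after an upgrade and classifying every disagreement. The class name, the finding labels, and the map-based snapshot model below are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeSet;

/** Hypothetical model: a snapshot maps a partition key to a serialized row. */
final class SnapshotDiffSketch {

    /** Returns one finding per key that differs between the two snapshots, in key order. */
    static List<String> diff(Map<String, String> before, Map<String, String> after) {
        List<String> findings = new ArrayList<>();
        TreeSet<String> keys = new TreeSet<>(before.keySet());
        keys.addAll(after.keySet());
        for (String key : keys) {
            boolean inBefore = before.containsKey(key);
            boolean inAfter = after.containsKey(key);
            if (inBefore && !inAfter) {
                findings.add("MISSING " + key);      // possible data loss
            } else if (!inBefore && inAfter) {
                findings.add("UNEXPECTED " + key);   // possible data resurrection
            } else if (!Objects.equals(before.get(key), after.get(key))) {
                findings.add("MISMATCH " + key);     // possible corruption or wrong response
            }
        }
        return findings;
    }

    public static void main(String[] args) {
        Map<String, String> before = new HashMap<>();
        before.put("k1", "a");
        before.put("k2", "b");
        before.put("k3", "c");
        Map<String, String> after = new HashMap<>();
        after.put("k1", "a");
        after.put("k2", "x");
        after.put("k4", "d");
        diff(before, after).forEach(System.out::println);
    }
}
```

A real tool also has to handle token ranges, paging, and writes that land during the comparison; the sketch covers only the classification step.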



[jira] [Commented] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204320#comment-17204320
 ] 

David Capwell commented on CASSANDRA-16036:
---

Rebased and that broke JMX (since it isn't enabled), so I enabled the chunk 
cache in testing and tests are passing.  I am running CI against trunk as the 
failing tests are consistent and also fail on other branches, so I am isolating 
the changes to make sure those tests are not broken here.

> Add flag to disable chunk cache and disable by default
> --
>
> Key: CASSANDRA-16036
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16036
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: 15229_128mb.png, 16036_128mb.png, 
> async-profile.collapsed.svg, 
> clustering-in-clause_latency_selects_baseline.png, 
> clustering-in-clause_latency_selects_baseline_attempt3.png, 
> clustering-in-clause_latency_under90_selects_baseline.png, 
> clustering-in-clause_latency_under90_selects_baseline_attempt3.png, 
> clustering-slice_latency_selects_baseline.png, 
> clustering-slice_latency_under90_selects_baseline.png, 
> medium-blobs_latency_selects_baseline.png, 
> medium-blobs_latency_under90_selects_baseline.png, 
> partition-single-row-read_latency_selects_baseline.png, 
> partition-single-row-read_latency_under90_selects_baseline.png
>
>
> Chunk cache is enabled by default and doesn’t have a flag to disable it without 
> impacting networking.  In performance testing 4.0 against 3.0 I found that 
> reads were slower in 4.0 and after profiling found that the ChunkCache was 
> partially to blame; after disabling the chunk cache, read performance 
> improved.
> {code}
> 40_w_cc-selects.hdr
> #[Mean= 11.50063, StdDeviation   = 13.44014]
> #[Max =482.41254, Total count=   316477]
> #[Buckets =   25, SubBuckets =   262144]
> 40_wo_cc-selects.hdr
> #[Mean=  9.82115, StdDeviation   = 10.14270]
> #[Max =522.36493, Total count=   317444]
> #[Buckets =   25, SubBuckets =   262144]
> {code}
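
For a quick read of the two histograms above, the relative change in mean latency can be computed directly from the reported values. The helper name is hypothetical, and the source does not state the unit of the figures, so the sketch treats them as unit-agnostic.

```java
/** Illustrative arithmetic on the two Mean values reported in the histograms above. */
final class MeanReductionSketch {

    /** Relative reduction of candidate versus baseline, in percent. */
    static double relativeReductionPercent(double baseline, double candidate) {
        return (baseline - candidate) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double withChunkCache = 11.50063;    // Mean from 40_w_cc-selects.hdr
        double withoutChunkCache = 9.82115;  // Mean from 40_wo_cc-selects.hdr
        System.out.printf("mean reduction: %.1f%%%n",
                relativeReductionPercent(withChunkCache, withoutChunkCache));
    }
}
```

This works out to roughly a 14.6% lower mean without the chunk cache. Note that the reported Max is higher without the cache (522.36493 versus 482.41254), so the mean alone does not describe the tail.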



[jira] [Commented] (CASSANDRA-15537) 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test

2020-09-29 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204318#comment-17204318
 ] 

Brandon Williams commented on CASSANDRA-15537:
--

I don't know where exactly to do that, but any workflow changes require the 
assistance of infra and I strongly suspect adding a new relationship between 
tickets will as well.

> 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
> -
>
> Key: CASSANDRA-15537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15537
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> Execution of upgrade and diff tests via cassandra-diff has proven to be one 
> of the most effective approaches toward identifying issues with the local 
> read/write path. These include instances of data loss, data corruption, data 
> resurrection, incorrect responses to queries, incomplete responses, and 
> others. Upgrade and diff tests can be executed concurrently with fault 
> injection (such as host or network failure), as well as during mixed-version 
> scenarios (such as upgrading half of the instances in a cluster, and running 
> upgradesstables on only half of the upgraded instances).
> Upgrade and diff tests are expected to continue through the release cycle, 
> and are a great way for contributors to gain confidence in the correctness of 
> the database under their own workloads.



[jira] [Commented] (CASSANDRA-15537) 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test

2020-09-29 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204316#comment-17204316
 ] 

Paulo Motta commented on CASSANDRA-15537:
-

Awesome, thanks for the update [~yifanc]. I've set the ticket to "In progress" 
to reflect its status.

 

I also added the tickets you mentioned as "related". Though the proper 
relationship type would be "Found while testing", do you know how easy it is to 
add a new JIRA relationship state [~brandon.williams] (pinging you because I 
recall you fixing JIRA workflow issues before but I can't remember how to do 
it)?

> 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
> -
>
> Key: CASSANDRA-15537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15537
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> Execution of upgrade and diff tests via cassandra-diff has proven to be one 
> of the most effective approaches toward identifying issues with the local 
> read/write path. These include instances of data loss, data corruption, data 
> resurrection, incorrect responses to queries, incomplete responses, and 
> others. Upgrade and diff tests can be executed concurrently with fault 
> injection (such as host or network failure), as well as during mixed-version 
> scenarios (such as upgrading half of the instances in a cluster, and running 
> upgradesstables on only half of the upgraded instances).
> Upgrade and diff tests are expected to continue through the release cycle, 
> and are a great way for contributors to gain confidence in the correctness of 
> the database under their own workloads.



[jira] [Commented] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204313#comment-17204313
 ] 

David Capwell commented on CASSANDRA-16147:
---

Overall +1 from me.  I do think it would be good to add a test which 
reads/writes SSTables that would have hit the issue, as it seems that we are 
missing larger data in Java and Python testing.

> ValueAccessor is using signed shorts in sliceWithShortLength
> 
>
> Key: CASSANDRA-16147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16147
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0-beta
>
>
> ValueAccessor is using a signed short when interpreting byte lengths, causing 
> exceptions when reading blobs over 32767 bytes in length
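
A minimal sketch of the failure mode described above, not the actual ValueAccessor code: Java's short is signed, so a 2-byte length above 32767 reads back as a negative number unless it is masked to an unsigned value. The class and method names here are hypothetical.

```java
import java.nio.ByteBuffer;

/** Illustrative only: reading a 2-byte length prefix as signed versus unsigned. */
final class ShortLengthSketch {

    /** Buggy variant: lengths above 32767 come back negative. */
    static int signedLength(ByteBuffer buf, int offset) {
        return buf.getShort(offset);
    }

    /** Fixed variant: mask to the 0..65535 range (same result as Short.toUnsignedInt). */
    static int unsignedLength(ByteBuffer buf, int offset) {
        return buf.getShort(offset) & 0xFFFF;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(2);
        buf.putShort(0, (short) 40000); // length prefix of a 40000-byte blob
        System.out.println("signed:   " + signedLength(buf, 0));
        System.out.println("unsigned: " + unsignedLength(buf, 0));
    }
}
```

A negative length passed on to a slice or array allocation would typically throw, which matches the exceptions reported for blobs over 32767 bytes.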



[jira] [Updated] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength

2020-09-29 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16147:
--
Reviewers: David Capwell, David Capwell  (was: David Capwell)
   David Capwell, David Capwell  (was: David Capwell)
   Status: Review In Progress  (was: Patch Available)

> ValueAccessor is using signed shorts in sliceWithShortLength
> 
>
> Key: CASSANDRA-16147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16147
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0-beta
>
>
> ValueAccessor is using a signed short when interpreting byte lengths, causing 
> exceptions when reading blobs over 32767 bytes in length



[jira] [Updated] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength

2020-09-29 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-16147:

Test and Documentation Plan: unit tests / circle
 Status: Patch Available  (was: In Progress)

[trunk |https://github.com/bdeggleston/cassandra/tree/16147-trunk]

> ValueAccessor is using signed shorts in sliceWithShortLength
> 
>
> Key: CASSANDRA-16147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16147
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0-beta
>
>
> ValueAccessor is using a signed short when interpreting byte lengths, causing 
> exceptions when reading blobs over 32767 bytes in length



[jira] [Updated] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength

2020-09-29 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-16147:

 Bug Category: Parent values: Availability(12983) Level 1 values: Response 
Crash(12991)
   Complexity: Low Hanging Fruit
Discovered By: Workload Replay
Reviewers: David Capwell
 Severity: Critical
   Status: Open  (was: Triage Needed)

> ValueAccessor is using signed shorts in sliceWithShortLength
> 
>
> Key: CASSANDRA-16147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16147
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0-beta
>
>
> ValueAccessor is using a signed short when interpreting byte lengths, causing 
> exceptions when reading blobs over 32767 bytes in length



[jira] [Created] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength

2020-09-29 Thread Blake Eggleston (Jira)
Blake Eggleston created CASSANDRA-16147:
---

 Summary: ValueAccessor is using signed shorts in 
sliceWithShortLength
 Key: CASSANDRA-16147
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16147
 Project: Cassandra
  Issue Type: Bug
  Components: Local/Other
Reporter: Blake Eggleston
Assignee: Blake Eggleston


ValueAccessor is using a signed short when interpreting byte lengths, causing 
exceptions when reading blobs over 32767 bytes in length



[jira] [Updated] (CASSANDRA-16147) ValueAccessor is using signed shorts in sliceWithShortLength

2020-09-29 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-16147:

Fix Version/s: 4.0-beta

> ValueAccessor is using signed shorts in sliceWithShortLength
> 
>
> Key: CASSANDRA-16147
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16147
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Other
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0-beta
>
>
> ValueAccessor is using a signed short when interpreting byte lengths, causing 
> exceptions when reading blobs over 32767 bytes in length



[jira] [Issue Comment Deleted] (CASSANDRA-15799) CorruptSSTableException when compacting a 3.0 format sstable that was originally created as an outcome of 2.1 sstable upgrade

2020-09-29 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-15799:
--
Comment: was deleted

(was: CI results (pending):

Circle: 
https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16036-trunk-B4951B6C-9967-4B3D-A93A-5C5539DDE804
Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/51/)

> CorruptSSTableException when compacting a 3.0 format sstable that was 
> originally created as an outcome of 2.1 sstable upgrade
> -
>
> Key: CASSANDRA-15799
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15799
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Local/SSTable
>Reporter: Sumanth Pasupuleti
>Assignee: David Capwell
>Priority: Normal
> Fix For: 3.0.x
>
> Attachments: fake-deletedcell-if-bad.patch
>
>
> Below is the exception with its stack trace. This issue is reproducible.
> {code:java}
> DEBUG [CompactionExecutor:10] 2020-05-07 19:33:34,268 CompactionTask.java:158 
> - Compacting (a3ea9fc0-9099-11ea-933f-c5e852f71338) 
> [/mnt/data/cassandra/data/ks/cf/md-10802-big-Data.db:level=0, ]
> ERROR [CompactionExecutor:10] 2020-05-07 19:33:34,275 
> CassandraDaemon.java:208 - Exception in thread 
> Thread[CompactionExecutor:10,1,RMI Runtime]
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /mnt/data/cassandra/data/ks/cf/md-10802-big-Data.db
>   at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.computeNext(SSTableIdentityIterator.java:105)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.computeNext(SSTableIdentityIterator.java:30)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.utils.MergeIterator$TrivialOneToOne.computeNext(MergeIterator.java:460)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:534)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:394)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:111) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.ColumnIndex.writeAndBuildIndex(ColumnIndex.java:52) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:165)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:126)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:57)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:109)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:195)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:89)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:61)
>  ~[nf-cassan

[jira] [Commented] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204290#comment-17204290
 ] 

David Capwell commented on CASSANDRA-16036:
---

CI results (pending):

Circle: 
https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16036-trunk-B4951B6C-9967-4B3D-A93A-5C5539DDE804
Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/51/

> Add flag to disable chunk cache and disable by default
> --
>
> Key: CASSANDRA-16036
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16036
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: 15229_128mb.png, 16036_128mb.png, 
> async-profile.collapsed.svg, 
> clustering-in-clause_latency_selects_baseline.png, 
> clustering-in-clause_latency_selects_baseline_attempt3.png, 
> clustering-in-clause_latency_under90_selects_baseline.png, 
> clustering-in-clause_latency_under90_selects_baseline_attempt3.png, 
> clustering-slice_latency_selects_baseline.png, 
> clustering-slice_latency_under90_selects_baseline.png, 
> medium-blobs_latency_selects_baseline.png, 
> medium-blobs_latency_under90_selects_baseline.png, 
> partition-single-row-read_latency_selects_baseline.png, 
> partition-single-row-read_latency_under90_selects_baseline.png
>
>
> Chunk cache is enabled by default and doesn’t have a flag to disable it without 
> impacting networking.  In performance testing 4.0 against 3.0 I found that 
> reads were slower in 4.0 and after profiling found that the ChunkCache was 
> partially to blame; after disabling the chunk cache, read performance 
> improved.
> {code}
> 40_w_cc-selects.hdr
> #[Mean= 11.50063, StdDeviation   = 13.44014]
> #[Max =482.41254, Total count=   316477]
> #[Buckets =   25, SubBuckets =   262144]
> 40_wo_cc-selects.hdr
> #[Mean=  9.82115, StdDeviation   = 10.14270]
> #[Max =522.36493, Total count=   317444]
> #[Buckets =   25, SubBuckets =   262144]
> {code}



[jira] [Commented] (CASSANDRA-15799) CorruptSSTableException when compacting a 3.0 format sstable that was originally created as an outcome of 2.1 sstable upgrade

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204289#comment-17204289
 ] 

David Capwell commented on CASSANDRA-15799:
---

CI results (pending):

Circle: 
https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-16036-trunk-B4951B6C-9967-4B3D-A93A-5C5539DDE804
Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/51/

> CorruptSSTableException when compacting a 3.0 format sstable that was 
> originally created as an outcome of 2.1 sstable upgrade
> -
>
> Key: CASSANDRA-15799
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15799
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction, Local/SSTable
>Reporter: Sumanth Pasupuleti
>Assignee: David Capwell
>Priority: Normal
> Fix For: 3.0.x
>
> Attachments: fake-deletedcell-if-bad.patch
>
>
> Below is the exception with its stack trace. This issue is reproducible.
> {code:java}
> DEBUG [CompactionExecutor:10] 2020-05-07 19:33:34,268 CompactionTask.java:158 
> - Compacting (a3ea9fc0-9099-11ea-933f-c5e852f71338) 
> [/mnt/data/cassandra/data/ks/cf/md-10802-big-Data.db:level=0, ]
> ERROR [CompactionExecutor:10] 2020-05-07 19:33:34,275 
> CassandraDaemon.java:208 - Exception in thread 
> Thread[CompactionExecutor:10,1,RMI Runtime]
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: 
> /mnt/data/cassandra/data/ks/cf/md-10802-big-Data.db
>   at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.computeNext(SSTableIdentityIterator.java:105)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.computeNext(SSTableIdentityIterator.java:30)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.utils.MergeIterator$TrivialOneToOne.computeNext(MergeIterator.java:460)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:534)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:394)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:111) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.ColumnIndex.writeAndBuildIndex(ColumnIndex.java:52) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:165)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:126)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.compaction.writers.DefaultCompactionWriter.realAppend(DefaultCompactionWriter.java:57)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.compaction.writers.CompactionAwareWriter.append(CompactionAwareWriter.java:109)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:195)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) 
> ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:89)
>  ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>   at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTas

[jira] [Commented] (CASSANDRA-15993) Fix flaky python dtest test_view_metadata_cleanup - materialized_views_test.TestMaterializedViews

2020-09-29 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204280#comment-17204280
 ] 

Adam Holmberg commented on CASSANDRA-15993:
---

I'm not seeing failures in ci-cassandra right now, but I think I have a local 
setup that produces this failure intermittently. I'm going to look into it.

> Fix flaky python dtest test_view_metadata_cleanup - 
> materialized_views_test.TestMaterializedViews
> -
>
> Key: CASSANDRA-15993
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15993
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/355/workflows/7b8df61d-706f-4094-a206-7cdc6b4e0451/jobs/1818
> {code}
> E   cassandra.OperationTimedOut: errors={'127.0.0.2': 'Client request 
> timeout. See Session.execute[_async](timeout)'}, last_host=127.0.0.2
> cassandra/cluster.py:4026: OperationTimedOut
> {code}



[jira] [Assigned] (CASSANDRA-15993) Fix flaky python dtest test_view_metadata_cleanup - materialized_views_test.TestMaterializedViews

2020-09-29 Thread Adam Holmberg (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Holmberg reassigned CASSANDRA-15993:
-

Assignee: Adam Holmberg

> Fix flaky python dtest test_view_metadata_cleanup - 
> materialized_views_test.TestMaterializedViews
> -
>
> Key: CASSANDRA-15993
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15993
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: David Capwell
>Assignee: Adam Holmberg
>Priority: Normal
> Fix For: 4.0-beta
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/355/workflows/7b8df61d-706f-4094-a206-7cdc6b4e0451/jobs/1818
> {code}
> E   cassandra.OperationTimedOut: errors={'127.0.0.2': 'Client request 
> timeout. See Session.execute[_async](timeout)'}, last_host=127.0.0.2
> cassandra/cluster.py:4026: OperationTimedOut
> {code}






[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204238#comment-17204238
 ] 

David Capwell commented on CASSANDRA-15991:
---

I also see read_repair_test.TestReadRepairGuarantees failing both with and 
without vnodes, so it does not look flaky. Given the code in this patch I feel 
that it is unrelated, so I will try to rerun the tests on trunk.

> 15583 - Add UX tests to intree LHF tooling
> --
>
> Key: CASSANDRA-15991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15991
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As per CASSANDRA-15583 many in tree tools lack proper UX tooling: mandatory 
> params are indeed mandatory, 'help' produces an actual help, return codes etc
> This ticket is an attempt to add it to those tools that classify as LHF. 
> Other tools such as nodetool, with many sub-commands, deserve a separate 
> ticket of their own






[jira] [Comment Edited] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204237#comment-17204237
 ] 

David Capwell edited comment on CASSANDRA-15991 at 9/29/20, 7:37 PM:
-

Looks like a new test is failing in Circle CI - 
https://app.circleci.com/pipelines/github/dcapwell/cassandra/572/workflows/5bbb328d-9497-4291-8a48-ca9e04019908/jobs/3141

testFilesArg - org.apache.cassandra.tools.SSTableRepairedAtSetterTest
{code}
[org.apache.cassandra.tools.SSTableRepairedAtSetter,
--really-set,
--is-repaired,
-f,
/tmp/cassandra/sstablelist.txt]
exited with code -1
stderr:

java.lang.RuntimeException: java.nio.file.NoSuchFileException: /tmp/cassandra/sstablelist.txt
    at org.apache.cassandra.tools.ToolRunner.runClassAsTool(ToolRunner.java:99)
    at org.apache.cassandra.tools.ToolRunner.invokeClass(ToolRunner.java:256)
    at org.apache.cassandra.tools.ToolRunner.invokeClass(ToolRunner.java:232)
    at org.apache.cassandra.tools.SSTableRepairedAtSetterTest.testFilesArg(SSTableRepairedAtSetterTest.java:117)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:38)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:534)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1196)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:1041)
Caused by: java.nio.file.NoSuchFileException: /tmp/cassandra/sstablelist.txt
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
    at java.nio.file.Files.newByteChannel(Files.java:361)
    at java.nio.file.Files.newByteChannel(Files.java:407)
    at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
    at java.nio.file.Files.newInputStream(Files.java:152)
    at java.nio.file.Files.newBufferedReader(Files.java:2784)
    at java.nio.file.Files.readAllLines(Files.java:3202)
    at org.apache.cassandra.tools.SSTableRepairedAtSetter.main(SSTableRepairedAtSetter.java:72)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.cassandra.tools.ToolRunner.runClassAsTool(ToolRunner.java:79)
    ... 25 more

stdout:

junit.framework.AssertionFailedError: 
[org.apache.cassandra.tools.SSTableRepairedAtSetter,
--really-set,
--is-repaired,
-f,
/tmp/cassandra/sstablelist.txt]
exited with code -1
stderr:

java.lang.RuntimeException: java.nio.file.NoSuchFileException: /tmp/cassandra/sstablelist.txt
    at org.apache.cassandra.tools.ToolRunner.runClassAsTool(ToolRunner.java:99)
    at org.apache.cassandra.tools.ToolRunner.invokeClass(ToolRunner.java:256)
    at org.apache.cassandra.tools.ToolRunner.invokeClass(ToolRunner.java:232)
at 

[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204237#comment-17204237
 ] 

David Capwell commented on CASSANDRA-15991:
---

Looks like a new test is failing in Circle CI

testFilesArg - org.apache.cassandra.tools.SSTableRepairedAtSetterTest
{code}
[org.apache.cassandra.tools.SSTableRepairedAtSetter,
--really-set,
--is-repaired,
-f,
/tmp/cassandra/sstablelist.txt]
exited with code -1
stderr:

java.lang.RuntimeException: java.nio.file.NoSuchFileException: /tmp/cassandra/sstablelist.txt
    at org.apache.cassandra.tools.ToolRunner.runClassAsTool(ToolRunner.java:99)
    at org.apache.cassandra.tools.ToolRunner.invokeClass(ToolRunner.java:256)
    at org.apache.cassandra.tools.ToolRunner.invokeClass(ToolRunner.java:232)
    at org.apache.cassandra.tools.SSTableRepairedAtSetterTest.testFilesArg(SSTableRepairedAtSetterTest.java:117)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:38)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:534)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1196)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:1041)
Caused by: java.nio.file.NoSuchFileException: /tmp/cassandra/sstablelist.txt
    at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
    at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
    at sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
    at java.nio.file.Files.newByteChannel(Files.java:361)
    at java.nio.file.Files.newByteChannel(Files.java:407)
    at java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
    at java.nio.file.Files.newInputStream(Files.java:152)
    at java.nio.file.Files.newBufferedReader(Files.java:2784)
    at java.nio.file.Files.readAllLines(Files.java:3202)
    at org.apache.cassandra.tools.SSTableRepairedAtSetter.main(SSTableRepairedAtSetter.java:72)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.cassandra.tools.ToolRunner.runClassAsTool(ToolRunner.java:79)
    ... 25 more

stdout:

junit.framework.AssertionFailedError: 
[org.apache.cassandra.tools.SSTableRepairedAtSetter,
--really-set,
--is-repaired,
-f,
/tmp/cassandra/sstablelist.txt]
exited with code -1
stderr:

java.lang.RuntimeException: java.nio.file.NoSuchFileException: /tmp/cassandra/sstablelist.txt
    at org.apache.cassandra.tools.ToolRunner.runClassAsTool(ToolRunner.java:99)
    at org.apache.cassandra.tools.ToolRunner.invokeClass(ToolRunner.java:256)
    at org.apache.cassandra.tools.ToolRunner.invokeClass(ToolRunner.java:232)
    at org.apache.cassandra.tools.SSTableRepairedAtSetterTest.testFilesArg(SSTableRepairedAtSetterTest.java:117)
Caused by: java.nio.file.NoSuchFileException: /tmp/cassandra/sstablel

[jira] [Commented] (CASSANDRA-16074) Add metric for client concurrent byte throttle

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204217#comment-17204217
 ] 

David Capwell commented on CASSANDRA-16074:
---

Thanks for the fix, [~clohfink]; +1 from me.

> Add metric for client concurrent byte throttle
> --
>
> Key: CASSANDRA-16074
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16074
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Messaging/Client, Observability/Metrics
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Add a metric to expose the current bytes and bytes per ip used that is used 
> in the existing throttle so its possible to determine what to set it to.






[jira] [Updated] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-09-29 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-16036:
--
Status: Ready to Commit  (was: Review In Progress)

> Add flag to disable chunk cache and disable by default
> --
>
> Key: CASSANDRA-16036
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16036
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: 15229_128mb.png, 16036_128mb.png, 
> async-profile.collapsed.svg, 
> clustering-in-clause_latency_selects_baseline.png, 
> clustering-in-clause_latency_selects_baseline_attempt3.png, 
> clustering-in-clause_latency_under90_selects_baseline.png, 
> clustering-in-clause_latency_under90_selects_baseline_attempt3.png, 
> clustering-slice_latency_selects_baseline.png, 
> clustering-slice_latency_under90_selects_baseline.png, 
> medium-blobs_latency_selects_baseline.png, 
> medium-blobs_latency_under90_selects_baseline.png, 
> partition-single-row-read_latency_selects_baseline.png, 
> partition-single-row-read_latency_under90_selects_baseline.png
>
>
> Chunk cache is enabled by default and doesn’t have a flag to disable without 
> impacting networking.  In performance testing 4.0 against 3.0 I found that 
> reads were slower in 4.0 and after profiling found that the ChunkCache was 
> partially to blame; after disabling the chunk cache, read performance had 
> improved.
> {code}
> 40_w_cc-selects.hdr
> #[Mean= 11.50063, StdDeviation   = 13.44014]
> #[Max =482.41254, Total count=   316477]
> #[Buckets =   25, SubBuckets =   262144]
> 40_wo_cc-selects.hdr
> #[Mean=  9.82115, StdDeviation   = 10.14270]
> #[Max =522.36493, Total count=   317444]
> #[Buckets =   25, SubBuckets =   262144]
> {code}
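
As a rough reading of the two histogram summaries above, the relative change in mean and max latency can be computed as follows. This is only an arithmetic sketch: the four numbers are copied from the ticket, the function name `relative_change` is made up for illustration, and the latency unit is not stated in the ticket, so only the ratios are meaningful.

```python
def relative_change(before, after):
    """Fractional change going from `before` to `after` (negative means lower)."""
    return (after - before) / before


# Values copied from the 40_w_cc (with chunk cache) and
# 40_wo_cc (without chunk cache) summaries in the ticket.
mean_with_cc = 11.50063
mean_without_cc = 9.82115
max_with_cc = 482.41254
max_without_cc = 522.36493

mean_change = relative_change(mean_with_cc, mean_without_cc)
max_change = relative_change(max_with_cc, max_without_cc)

print(f"mean: {mean_change:+.1%}, max: {max_change:+.1%}")
```

The mean drops when the chunk cache is disabled while the recorded max rises slightly, which is consistent with the ticket's wording that the ChunkCache was "partially to blame" rather than the only factor.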






[jira] [Commented] (CASSANDRA-16036) Add flag to disable chunk cache and disable by default

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204208#comment-17204208
 ] 

David Capwell commented on CASSANDRA-16036:
---

Now that [~jasonstack] is a committer (congrats!) I will move forward to merge 
this later today (after rerunning CI on it).

> Add flag to disable chunk cache and disable by default
> --
>
> Key: CASSANDRA-16036
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16036
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: 15229_128mb.png, 16036_128mb.png, 
> async-profile.collapsed.svg, 
> clustering-in-clause_latency_selects_baseline.png, 
> clustering-in-clause_latency_selects_baseline_attempt3.png, 
> clustering-in-clause_latency_under90_selects_baseline.png, 
> clustering-in-clause_latency_under90_selects_baseline_attempt3.png, 
> clustering-slice_latency_selects_baseline.png, 
> clustering-slice_latency_under90_selects_baseline.png, 
> medium-blobs_latency_selects_baseline.png, 
> medium-blobs_latency_under90_selects_baseline.png, 
> partition-single-row-read_latency_selects_baseline.png, 
> partition-single-row-read_latency_under90_selects_baseline.png
>
>
> Chunk cache is enabled by default and doesn’t have a flag to disable without 
> impacting networking.  In performance testing 4.0 against 3.0 I found that 
> reads were slower in 4.0 and after profiling found that the ChunkCache was 
> partially to blame; after disabling the chunk cache, read performance had 
> improved.
> {code}
> 40_w_cc-selects.hdr
> #[Mean= 11.50063, StdDeviation   = 13.44014]
> #[Max =482.41254, Total count=   316477]
> #[Buckets =   25, SubBuckets =   262144]
> 40_wo_cc-selects.hdr
> #[Mean=  9.82115, StdDeviation   = 10.14270]
> #[Max =522.36493, Total count=   317444]
> #[Buckets =   25, SubBuckets =   262144]
> {code}






[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204207#comment-17204207
 ] 

David Capwell commented on CASSANDRA-15991:
---

CI results (still running)

Circle: 
https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-15991-trunk-DE96E193-E8A4-4882-BC04-4169A8E4AAB5
Jenkins: https://ci-cassandra.apache.org/job/Cassandra-devbranch/50/

> 15583 - Add UX tests to intree LHF tooling
> --
>
> Key: CASSANDRA-15991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15991
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As per CASSANDRA-15583 many in tree tools lack proper UX tooling: mandatory 
> params are indeed mandatory, 'help' produces an actual help, return codes etc
> This ticket is an attempt to add it to those tools that classify as LHF. 
> Other tools such as nodetool, with many sub-commands, deserve a separate 
> ticket of their own






[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling

2020-09-29 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204206#comment-17204206
 ] 

David Capwell commented on CASSANDRA-15991:
---

Reviewed the latest commit, so +1 from me.  I'll start the commit process and 
link the test results before merging.

> 15583 - Add UX tests to intree LHF tooling
> --
>
> Key: CASSANDRA-15991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15991
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As per CASSANDRA-15583 many in tree tools lack proper UX tooling: mandatory 
> params are indeed mandatory, 'help' produces an actual help, return codes etc
> This ticket is an attempt to add it to those tools that classify as LHF. 
> Other tools such as nodetool, with many sub-commands, deserve a separate 
> ticket of their own






[jira] [Updated] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling

2020-09-29 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-15991:
--
Status: Ready to Commit  (was: Review In Progress)

> 15583 - Add UX tests to intree LHF tooling
> --
>
> Key: CASSANDRA-15991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15991
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As per CASSANDRA-15583 many in tree tools lack proper UX tooling: mandatory 
> params are indeed mandatory, 'help' produces an actual help, return codes etc
> This ticket is an attempt to add it to those tools that classify as LHF. 
> Other tools such as nodetool, with many sub-commands, deserve a separate 
> ticket of their own






[jira] [Updated] (CASSANDRA-16089) Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs

2020-09-29 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-16089:
-
  Fix Version/s: (was: 4.0-beta)
 4.0-beta3
  Since Version: NA
Source Control Link: 
https://github.com/apache/cassandra-dtest/commit/e4e8d94ba540743f0b0ccfdd5b8ce3cefc7a6a68
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Committed, thanks.

> Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs
> 
>
> Key: CASSANDRA-16089
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16089
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Caleb Rackliffe
>Assignee: Adam Holmberg
>Priority: Normal
>  Labels: dtest
> Fix For: 4.0-beta3
>
>
> See 
> https://app.circleci.com/pipelines/github/maedhroz/cassandra/99/workflows/72c69ea8-f347-4b00-aed8-bd465f3549ff/jobs/498
> After bootstrapping a second node into the cluster, the sizes of the SSTables 
> (per directory) on the first node no longer fall within the 10% margin of 
> error. We don’t have any assertion in the test that they were balanced before 
> bootstrap, however.
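
The "10% margin of error" check described above can be sketched as below. This is a hypothetical helper, not the dtest's actual assertion: the name `within_margin`, the exact definition of the margin (smallest value within 10% of the largest), and the sample sizes are all assumptions made for illustration.

```python
def within_margin(sizes, error=0.1):
    """Return True if the smallest size is within `error` of the largest.

    One possible definition of "balanced"; the real test may use another.
    """
    largest = max(sizes)
    smallest = min(sizes)
    return smallest == largest or smallest > largest * (1.0 - error)


# Illustrative per-directory SSTable sizes in bytes (made-up numbers).
balanced_dirs = [100_000_000, 95_000_000, 104_000_000]
unbalanced_dirs = [100_000_000, 60_000_000, 104_000_000]

print(within_margin(balanced_dirs))
print(within_margin(unbalanced_dirs))
```

Running the same check before the second node is bootstrapped would cover the gap the description points out, namely that nothing asserts the disks were balanced to begin with.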






[jira] [Updated] (CASSANDRA-16089) Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs

2020-09-29 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-16089:
-
Status: Ready to Commit  (was: Review In Progress)

> Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs
> 
>
> Key: CASSANDRA-16089
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16089
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Caleb Rackliffe
>Assignee: Adam Holmberg
>Priority: Normal
>  Labels: dtest
> Fix For: 4.0-beta
>
>
> See 
> https://app.circleci.com/pipelines/github/maedhroz/cassandra/99/workflows/72c69ea8-f347-4b00-aed8-bd465f3549ff/jobs/498
> After bootstrapping a second node into the cluster, the sizes of the SSTables 
> (per directory) on the first node no longer fall within the 10% margin of 
> error. We don’t have any assertion in the test that they were balanced before 
> bootstrap, however.






[jira] [Updated] (CASSANDRA-16089) Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs

2020-09-29 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-16089:
-
Reviewers: Brandon Williams, Brandon Williams  (was: Brandon Williams)
   Brandon Williams, Brandon Williams
   Status: Review In Progress  (was: Patch Available)

> Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs
> 
>
> Key: CASSANDRA-16089
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16089
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Caleb Rackliffe
>Assignee: Adam Holmberg
>Priority: Normal
>  Labels: dtest
> Fix For: 4.0-beta
>
>
> See 
> https://app.circleci.com/pipelines/github/maedhroz/cassandra/99/workflows/72c69ea8-f347-4b00-aed8-bd465f3549ff/jobs/498
> After bootstrapping a second node into the cluster, the sizes of the SSTables 
> (per directory) on the first node no longer fall within the 10% margin of 
> error. We don’t have any assertion in the test that they were balanced before 
> bootstrap, however.






[cassandra-dtest] branch master updated: fix flakiness in TestDiskBalance caused by random token generation

2020-09-29 Thread brandonwilliams
This is an automated email from the ASF dual-hosted git repository.

brandonwilliams pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-dtest.git


The following commit(s) were added to refs/heads/master by this push:
 new e4e8d94  fix flakiness in TestDiskBalance caused by random token 
generation
e4e8d94 is described below

commit e4e8d94ba540743f0b0ccfdd5b8ce3cefc7a6a68
Author: Adam Holmberg 
AuthorDate: Tue Sep 29 12:55:48 2020 -0500

fix flakiness in TestDiskBalance caused by random token generation

patch by Adam Holberg, reviewed by brandonwilliams for CASSANDRA-16089
---
 disk_balance_test.py | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/disk_balance_test.py b/disk_balance_test.py
index 3d02ac1..91ba848 100644
--- a/disk_balance_test.py
+++ b/disk_balance_test.py
@@ -234,7 +234,10 @@ class TestDiskBalance(Tester):
 
         # Add a new node, so disk boundaries will change
         logger.debug("Bootstrap node2 and flush")
-        node2 = new_node(cluster, bootstrap=True)
+        # Fixed initial token to bisect the ring and make sure the nodes are balanced (otherwise a random token is generated).
+        balanced_tokens = cluster.balanced_tokens(2)
+        assert balanced_tokens[0] == node1.initial_token  # make sure cluster population still works as assumed
+        node2 = new_node(cluster, token=balanced_tokens[1], bootstrap=True)
         node2.start(wait_for_binary_proto=True, jvm_args=["-Dcassandra.migration_task_wait_in_seconds=10"], set_migration_task=False)
         node2.flush()
 





[jira] [Updated] (CASSANDRA-16089) Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs

2020-09-29 Thread Adam Holmberg (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Holmberg updated CASSANDRA-16089:
--
Test and Documentation Plan: 
Manually reproduced the problem.
Fixed and looped test many iterations locally.
CI looks good.
 Status: Patch Available  (was: In Progress)

> Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs
> 
>
> Key: CASSANDRA-16089
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16089
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Caleb Rackliffe
>Assignee: Adam Holmberg
>Priority: Normal
>  Labels: dtest
> Fix For: 4.0-beta
>
>
> See 
> https://app.circleci.com/pipelines/github/maedhroz/cassandra/99/workflows/72c69ea8-f347-4b00-aed8-bd465f3549ff/jobs/498
> After bootstrapping a second node into the cluster, the sizes of the SSTables 
> (per directory) on the first node no longer fall within the 10% margin of 
> error. We don’t have any assertion in the test that they were balanced before 
> bootstrap, however.






[jira] [Commented] (CASSANDRA-15537) 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test

2020-09-29 Thread Yifan Cai (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204178#comment-17204178
 ] 

Yifan Cai commented on CASSANDRA-15537:
---

Hi Paulo,

Dozens of clusters have so far passed the diff test, which compares each 
cluster's current build against the latest 4.0 build. The number of clusters 
being tested is increasing each week. The tested clusters range in data size 
from gigabytes to tens of terabytes.

The success criterion for a diff test is that 100% of the data in user tables 
matches between the two testing clusters.
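
To make the 100% criterion concrete, here is a minimal in-memory sketch. It is not how cassandra-diff is implemented; the function names, the `{partition_key: row}` representation, and the sample rows are all hypothetical and only illustrate what "every row must match" means.

```python
def diff_tables(source_rows, target_rows):
    """Compare two {partition_key: row} mappings and summarise the differences."""
    source_keys = source_rows.keys()
    target_keys = target_rows.keys()
    only_in_source = sorted(source_keys - target_keys)
    only_in_target = sorted(target_keys - source_keys)
    mismatched = sorted(
        key for key in source_keys & target_keys
        if source_rows[key] != target_rows[key]
    )
    total = len(source_keys | target_keys)
    matched = total - len(only_in_source) - len(only_in_target) - len(mismatched)
    return {
        "total": total,
        "matched": matched,
        "only_in_source": only_in_source,
        "only_in_target": only_in_target,
        "mismatched": mismatched,
    }


def passes(report):
    """The run succeeds only if every partition matched."""
    return report["matched"] == report["total"]


# Made-up rows standing in for the same table read from two clusters.
current_build = {1: ("alice", 30), 2: ("bob", 41), 3: ("carol", 27)}
latest_4_0 = {1: ("alice", 30), 2: ("bob", 42), 3: ("carol", 27)}

report = diff_tables(current_build, latest_4_0)
print(report, passes(report))
```

A single mismatched or missing partition is enough to fail the run, which is why the criterion is stated as 100% rather than as a threshold.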

Several issues have been resolved, and tool improvements have been made, during 
this ongoing diff exercise. The tickets are:

Issues
 * CASSANDRA-15945
 * CASSANDRA-15905
 * CASSANDRA-15857
 * CASSANDRA-15514

Tool improvements
 * CASSANDRA-16125
 * CASSANDRA-16065
 * CASSANDRA-15953
 * CASSANDRA-15807
 * CASSANDRA-15722
 * CASSANDRA-15658

> 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
> -
>
> Key: CASSANDRA-15537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15537
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> Execution of upgrade and diff tests via cassandra-diff have proven to be one 
> of the most effective approaches toward identifying issues with the local 
> read/write path. These include instances of data loss, data corruption, data 
> resurrection, incorrect responses to queries, incomplete responses, and 
> others. Upgrade and diff tests can be executed concurrent with fault 
> injection (such as host or network failure); as well as during mixed-version 
> scenarios (such as upgrading half of the instances in a cluster, and running 
> upgradesstables on only half of the upgraded instances).
> Upgrade and diff tests are expected to continue through the release cycle, 
> and are a great way for contributors to gain confidence in the correctness of 
> the database under their own workloads.






[jira] [Commented] (CASSANDRA-16089) Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs

2020-09-29 Thread Adam Holmberg (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204176#comment-17204176
 ] 

Adam Holmberg commented on CASSANDRA-16089:
---

The test fails occasionally due to random token generation for the second node. 
If the random token is too close to the node1 token, very little data is left 
on node1 after cleanup, and the disk balance variation goes up due to "noise".

It can be made to fail in this manner consistently by configuring an 
intentionally bad token. For example, one unfortunate token selection:

{noformat}
Datacenter: datacenter1
==========
Address    Rack   Status  State   Load       Owns    Token
                                                     9143583083429189474
127.0.0.1  rack1  Up      Normal  23.67 MiB  0.43%   -9223372036854775808
127.0.0.2  rack1  Up      Normal  68.62 KiB  99.57%  9143583083429189474
{noformat}

This selection leaves only kilobytes of data on each disk after cleanup and 
compaction.

The dtest change simply uses fixed token selection so we can avoid the noise 
from small files.
[patch|https://github.com/apache/cassandra-dtest/commit/32a31742bda41b09872b7820e9fb7fffda1addd9]
[ci|https://app.circleci.com/pipelines/github/aholmberg/cassandra?branch=CASSANDRA-16089]
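
The ownership figures in the ring output can be reproduced with a few lines of Python. This is a minimal sketch, assuming the Murmur3Partitioner token range of -2^63 to 2^63-1, one token per node, and that each node owns the range from the previous token (exclusive) up to its own token. The function name `ownership` is made up for illustration and is not part of the dtest.

```python
RING_SIZE = 2 ** 64  # number of tokens on the ring (assumed Murmur3 range)


def ownership(tokens):
    """Return {token: fraction of the ring owned}, assuming one token per node."""
    ordered = sorted(tokens)
    owned = {}
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # index -1 wraps around to the last token
        width = (token - previous) % RING_SIZE
        if width == 0:
            width = RING_SIZE  # a single node owns the whole ring
        owned[token] = width / RING_SIZE
    return owned


if __name__ == "__main__":
    tokens = [-2 ** 63, 9143583083429189474]
    for token, fraction in ownership(tokens).items():
        print(f"{token}: {fraction * 100:.2f}%")
```

With the two tokens from the output above, node1 ends up with roughly 0.43% of the ring and node2 with roughly 99.57%, which matches the Owns column.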

> Flaky Test: TestDiskBalance.test_disk_balance_after_boundary_change_stcs
> 
>
> Key: CASSANDRA-16089
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16089
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/dtest/python
>Reporter: Caleb Rackliffe
>Assignee: Adam Holmberg
>Priority: Normal
>  Labels: dtest
> Fix For: 4.0-beta
>
>
> See 
> https://app.circleci.com/pipelines/github/maedhroz/cassandra/99/workflows/72c69ea8-f347-4b00-aed8-bd465f3549ff/jobs/498
> After bootstrapping a second node into the cluster, the sizes of the SSTables 
> (per directory) on the first node no longer fall within the 10% margin of 
> error. We don’t have any assertion in the test that they were balanced before 
> bootstrap, however.






[jira] [Commented] (CASSANDRA-15991) 15583 - Add UX tests to intree LHF tooling

2020-09-29 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204158#comment-17204158
 ] 

Brandon Williams commented on CASSANDRA-15991:
--

LGTM with latest changes.

> 15583 - Add UX tests to intree LHF tooling
> --
>
> Key: CASSANDRA-15991
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15991
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> As per CASSANDRA-15583, many in-tree tools lack proper UX behaviour: mandatory 
> params that are indeed mandatory, a 'help' that produces actual help, return codes, etc.
> This ticket is an attempt to add it to those tools that classify as LHF. 
> Other tools such as nodetool, with many sub-commands, deserve a separate 
> ticket of their own.






[jira] [Commented] (CASSANDRA-16146) Node state incorrectly set to NORMAL after nodetool disablegossip and enablegossip during bootstrap

2020-09-29 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204050#comment-17204050
 ] 

Brandon Williams commented on CASSANDRA-16146:
--

Indeed, that looks related to CASSANDRA-16127.

> Node state incorrectly set to NORMAL after nodetool disablegossip and 
> enablegossip during bootstrap
> ---
>
> Key: CASSANDRA-16146
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16146
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Gossip
>Reporter: Yifan Cai
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-beta3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> At a high level, {{StorageService#setGossipTokens}} sets the gossip state to 
> {{NORMAL}} blindly. Therefore, re-enabling gossip (stopping and starting gossip) 
> overrides the actual gossip state.
>   
> This can happen in the following scenario:
> # Bootstrap fails. The gossip state remains in {{BOOT}} / {{JOINING}} and 
> code execution exits StorageService#initServer.
> # The operator runs nodetool to stop and restart gossip. The gossip state gets 
> flipped to {{NORMAL}}.
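
The scenario in the description can be modelled in a few lines. This is only a toy sketch in Python, not Cassandra code: `ToyNode`, `start_gossip_blindly`, and `start_gossip_guarded` are hypothetical names, and the guard shown is one possible way to avoid the problem, not necessarily what the actual patch does.

```python
from enum import Enum


class GossipState(Enum):
    BOOT = "BOOT"
    NORMAL = "NORMAL"


class ToyNode:
    """Hypothetical model of a node whose bootstrap has failed."""

    def __init__(self):
        self.state = GossipState.BOOT
        self.bootstrap_completed = False
        self.gossip_running = True

    def stop_gossip(self):
        self.gossip_running = False

    def start_gossip_blindly(self):
        # Mirrors the reported behaviour: the state is set to NORMAL
        # no matter what the node was doing before.
        self.gossip_running = True
        self.state = GossipState.NORMAL

    def start_gossip_guarded(self):
        # Assumed fix: only advertise NORMAL once bootstrap has finished.
        self.gossip_running = True
        if self.bootstrap_completed:
            self.state = GossipState.NORMAL


if __name__ == "__main__":
    node = ToyNode()
    node.stop_gossip()
    node.start_gossip_blindly()
    print("blind restart:", node.state.value)

    node = ToyNode()
    node.stop_gossip()
    node.start_gossip_guarded()
    print("guarded restart:", node.state.value)
```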






[jira] [Commented] (CASSANDRA-15538) 4.0 quality testing: Local Read/Write Path: Other Areas

2020-09-29 Thread Sylvain Lebresne (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17204048#comment-17204048
 ] 

Sylvain Lebresne commented on CASSANDRA-15538:
--

No, I haven't really started anything on this issue, and I don't plan to in the 
near term, so I unassigned myself. I should have done it sooner, my bad.

I did spend a few cycles some time ago thinking about what could be done 
concretely here and I'll share my "reflections" in case that's useful. That 
said, in general, the scope here was a bit fuzzy to me.

First, if you look at (true) unit testing for the classes that constitute the 
read/write path, there isn't much. So I suppose one could try to cover that 
somewhat, but the work to make a dent there is huge, and I'm not sure the value 
is that great since those paths are mostly covered, but by 
"integration/functional" tests. But this doesn't make it super clear to me if 
specific areas are more in need of additional testing than others.

Then the description mentions "numerous bugs and issues with the 3.0 storage 
engine rewrite", so I looked at the list of "serious bugs" that was shared on 
the mailing list (by [~kohlisankalp] I believe; too lazy to dig the link right 
now). From looking at that, the biggest bucket I saw for "storage engine 
rewrite" related bugs was with 'legacy layout conversions/handling'.  And that 
was clearly under-tested, but it's also gone in 4.0. From memory, there were 
also 2-3 read-repair related bugs, but we have CASSANDRA-15977.  Nothing else 
struck me as pointing to a specific area to focus on.

Those aside and fwiw, I've a feeling that things like reverse queries and range 
tombstones may be 2 features that aren't as well tested as they could be, but it's 
more an impression of mine than hard data.

Short of focusing on some specific area, the "read/write path" is a big place 
and the space to explore is kinda big. So I feel the biggest value would be to 
start exploring more of that space through randomized testing, specifically 
randomizing queries and/or schema. Presumably, that's what 
[Harry|https://issues.apache.org/jira/browse/CASSANDRA-15348] is for (though I 
haven't really checked it as of yet, so I don't know how capable it is for 
this).  So if it was me, I'd look in this direction. But again, I don't have 
plans to at the moment due to other priorities.
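
To make the idea of randomized testing concrete, here is a minimal sketch in Python. It does not use Harry or any Cassandra API: `ToyPartition` is a hypothetical stand-in for the system under test, and a plain `set` serves as the model. Random inserts and range deletions (a loose analogue of range tombstones) are applied to both, and forward and reverse reads are compared after every step.

```python
import bisect
import random


class ToyPartition:
    """Hypothetical system under test: rows kept sorted by clustering key."""

    def __init__(self):
        self._keys = []

    def insert(self, key):
        i = bisect.bisect_left(self._keys, key)
        if i == len(self._keys) or self._keys[i] != key:
            self._keys.insert(i, key)

    def delete_range(self, start, end):
        # Drop every key in [start, end), similar in spirit to a range tombstone.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        del self._keys[lo:hi]

    def select(self, reverse=False):
        return self._keys[::-1] if reverse else list(self._keys)


def run_random_workload(seed, steps=200):
    """Apply random operations to the toy store and a simple model, comparing reads."""
    rng = random.Random(seed)
    sut = ToyPartition()
    model = set()
    for _ in range(steps):
        if rng.random() < 0.7:
            key = rng.randint(0, 50)
            sut.insert(key)
            model.add(key)
        else:
            start = rng.randint(0, 50)
            end = rng.randint(start, 51)
            sut.delete_range(start, end)
            model = {k for k in model if not (start <= k < end)}
        expected = sorted(model)
        assert sut.select() == expected, f"forward read diverged (seed={seed})"
        assert sut.select(reverse=True) == expected[::-1], f"reverse read diverged (seed={seed})"
    return len(model)


if __name__ == "__main__":
    for seed in range(10):
        run_random_workload(seed)
    print("no divergence found")
```

Printing the seed in the assertion message is the important design choice here: a failing random run can then be replayed exactly.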


> 4.0 quality testing: Local Read/Write Path: Other Areas
> ---
>
> Key: CASSANDRA-15538
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15538
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Aleksey Yeschenko*
> Testing in this area refers to the local read/write path (StorageProxy, 
> ColumnFamilyStore, Memtable, SSTable reading/writing, etc). We are still 
> finding numerous bugs and issues with the 3.0 storage engine rewrite 
> (CASSANDRA-8099). For 4.0 we want to ensure that we thoroughly cover the 
> local read/write path with techniques such as property-based testing, fuzzing 
> ([example|http://cassandra.apache.org/blog/2018/10/17/finding_bugs_with_property_based_testing.html]),
>  and a source audit.






[jira] [Assigned] (CASSANDRA-15538) 4.0 quality testing: Local Read/Write Path: Other Areas

2020-09-29 Thread Sylvain Lebresne (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne reassigned CASSANDRA-15538:


Assignee: (was: Sylvain Lebresne)

> 4.0 quality testing: Local Read/Write Path: Other Areas
> ---
>
> Key: CASSANDRA-15538
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15538
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Aleksey Yeschenko*
> Testing in this area refers to the local read/write path (StorageProxy, 
> ColumnFamilyStore, Memtable, SSTable reading/writing, etc). We are still 
> finding numerous bugs and issues with the 3.0 storage engine rewrite 
> (CASSANDRA-8099). For 4.0 we want to ensure that we thoroughly cover the 
> local read/write path with techniques such as property-based testing, fuzzing 
> ([example|http://cassandra.apache.org/blog/2018/10/17/finding_bugs_with_property_based_testing.html]),
>  and a source audit.






[jira] [Commented] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables

2020-09-29 Thread Sylvain Lebresne (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203976#comment-17203976
 ] 

Sylvain Lebresne commented on CASSANDRA-16063:
--

I don't have the time to test the patch thoroughly right now, but from a code review 
point of view, this lgtm.

> Fix user experience when upgrading to 4.0 with compact tables
> -
>
> Key: CASSANDRA-16063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/CQL
>Reporter: Sylvain Lebresne
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: Compact_storage_upgrade_tests.txt
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The code to handle compact tables has been removed from 4.0, and the intended 
> upgrade path to 4.0 for users having compact tables on 3.x is that they must 
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables 
> *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or miss a table) 
> and may try upgrading despite still having compact tables. If they do so, the 
> intent is that the node will _not_ start, with a message clearly indicating 
> the pre-upgrade step the user has missed. The user will then downgrade back 
> the node(s) to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and 
> then upgrade again.
> But while 4.0 does currently fail startup when finding any compact tables 
> with a decent message, I believe the check is done too late during startup.
> Namely, that check is done as we read the tables schema, so within 
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
>   But by then, we've _at least_ called 
> {{SystemKeyspace.persistLocalMetadata()}} and 
> {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log, 
> and even possibly flush new {{na}} format sstables. As a result, a user 
> might not be able to seamlessly restart the node on 3.x (to drop compact 
> storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0 
> startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0 
> with compact storage, you can downgrade back with no intervention whatsoever).
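
The ordering argument in the description can be illustrated with a toy model. This is a sketch in Python, not the Cassandra startup code: `StartupCheckFailed`, `check_no_compact_tables`, and `start_node` are hypothetical names, and the list `writes_log` merely stands in for anything that touches the commit log or sstables.

```python
class StartupCheckFailed(Exception):
    """Raised when a pre-start check refuses to let the node start."""


def check_no_compact_tables(tables):
    """tables maps a table name to True if it still uses compact storage."""
    compact = sorted(name for name, is_compact in tables.items() if is_compact)
    if compact:
        raise StartupCheckFailed(
            "Compact tables found: " + ", ".join(compact)
            + ". Run ALTER ... DROP COMPACT STORAGE on 3.x before upgrading."
        )


def start_node(tables, writes_log):
    # The check runs first, so a failure leaves nothing behind on disk.
    check_no_compact_tables(tables)
    writes_log.append("persist local metadata")
    writes_log.append("migrate system keyspace")
    writes_log.append("load schema")


if __name__ == "__main__":
    log = []
    try:
        start_node({"ks.legacy": True, "ks.modern": False}, log)
    except StartupCheckFailed as error:
        print("startup refused:", error)
    print("writes performed:", log)
```

Because the check precedes every write, a refused startup leaves `writes_log` empty, which is the property that lets the node restart on 3.x without intervention.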






[jira] [Commented] (CASSANDRA-15538) 4.0 quality testing: Local Read/Write Path: Other Areas

2020-09-29 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203974#comment-17203974
 ] 

Paulo Motta commented on CASSANDRA-15538:
-

Hi [~slebresne], did you have the chance to look into this issue?

For context, I'm asking this to check the status of the 4.0 quality epic as 
part of [this 
discussion|https://www.mail-archive.com/dev@cassandra.apache.org/msg15881.html] 
on the mailing list.

> 4.0 quality testing: Local Read/Write Path: Other Areas
> ---
>
> Key: CASSANDRA-15538
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15538
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: Sylvain Lebresne
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> *Shepherd: Aleksey Yeschenko*
> Testing in this area refers to the local read/write path (StorageProxy, 
> ColumnFamilyStore, Memtable, SSTable reading/writing, etc). We are still 
> finding numerous bugs and issues with the 3.0 storage engine rewrite 
> (CASSANDRA-8099). For 4.0 we want to ensure that we thoroughly cover the 
> local read/write path with techniques such as property-based testing, fuzzing 
> ([example|http://cassandra.apache.org/blog/2018/10/17/finding_bugs_with_property_based_testing.html]),
>  and a source audit.






[cassandra-dtest] branch CASSANDRA-14793 created (now b9baecd)

2020-09-29 Thread blerer
This is an automated email from the ASF dual-hosted git repository.

blerer pushed a change to branch CASSANDRA-14793
in repository https://gitbox.apache.org/repos/asf/cassandra-dtest.git.


  at b9baecd  Update tests for CASSANDRA-14793

No new revisions were added by this update.





[cassandra] 01/02: Allow to use a different directory for storing system tables.

2020-09-29 Thread blerer
This is an automated email from the ASF dual-hosted git repository.

blerer pushed a commit to branch CASSANDRA-14793
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit ea3ee373f670b4eddcde0f94d4f5f6221166761b
Author: Benjamin Lerer 
AuthorDate: Thu Mar 19 12:57:28 2020 +0100

Allow to use a different directory for storing system tables.
---
 .circleci/config.yml   |  97 +++
 .circleci/config.yml.HIGHRES   |  98 +++
 .circleci/config.yml.LOWRES|  97 +++
 .circleci/config.yml.MIDRES|  97 +++
 NEWS.txt   |   9 ++
 build.xml  |  38 +
 conf/cassandra.yaml|   6 +
 src/java/org/apache/cassandra/config/Config.java   |   6 +
 .../cassandra/config/DatabaseDescriptor.java   |  97 +--
 .../org/apache/cassandra/db/ColumnFamilyStore.java | 101 ++--
 src/java/org/apache/cassandra/db/Directories.java  | 180 -
 .../apache/cassandra/db/DiskBoundaryManager.java   |   1 -
 .../org/apache/cassandra/db/SystemKeyspace.java|   5 +
 .../apache/cassandra/io/FSDiskFullWriteError.java  |  12 +-
 ...or.java => FSNoDiskAvailableForWriteError.java} |  16 +-
 .../org/apache/cassandra/io/util/FileUtils.java|  67 
 .../apache/cassandra/service/CassandraDaemon.java  |  94 ++-
 .../cassandra/service/DefaultFSErrorHandler.java   |  17 +-
 .../apache/cassandra/service/StartupChecks.java|   1 +
 .../apache/cassandra/service/StorageService.java   |  25 ++-
 .../cassandra/service/StorageServiceMBean.java |  14 ++
 test/conf/system_keyspaces_directory.yaml  |   1 +
 .../cassandra/OffsetAwareConfigurationLoader.java  |   3 +
 .../org/apache/cassandra/db/DirectoriesTest.java   |  42 ++---
 .../apache/cassandra/io/util/FileUtilsTest.java|  69 
 .../apache/cassandra/tools/ClearSnapshotTest.java  |   2 +-
 26 files changed, 1091 insertions(+), 104 deletions(-)

diff --git a/.circleci/config.yml b/.circleci/config.yml
index 8ba8949..6c177a4 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -2151,6 +2151,97 @@ jobs:
 - CCM_HEAP_NEWSIZE: 256M
 - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64
 - JDK_HOME: /usr/lib/jvm/java-8-openjdk-amd64
+  utests_system_keyspace_directory:
+docker:
+- image: nastra/cassandra-testing-ubuntu1910-java11-w-dependencies:20200603
+resource_class: medium
+working_directory: ~/
+shell: /bin/bash -eo pipefail -l
+parallelism: 4
+steps:
+- attach_workspace:
+at: /home/cassandra
+- run:
+name: Determine unit Tests to Run
+command: |
+  # reminder: this code (along with all the steps) is independently executed on every circle container
+  # so the goal here is to get the circleci script to return the tests *this* container will run
+  # which we do via the `circleci` cli tool.
+
+  rm -fr ~/cassandra-dtest/upgrade_tests
+  echo "***java tests***"
+
+  # get all of our unit test filenames
+  set -eo pipefail && circleci tests glob "$HOME/cassandra/test/unit/**/*.java" > /tmp/all_java_unit_tests.txt
+
+  # split up the unit tests into groups based on the number of containers we have
+  set -eo pipefail && circleci tests split --split-by=timings --timings-type=filename --index=${CIRCLE_NODE_INDEX} --total=${CIRCLE_NODE_TOTAL} /tmp/all_java_unit_tests.txt > /tmp/java_tests_${CIRCLE_NODE_INDEX}.txt
+  set -eo pipefail && cat /tmp/java_tests_${CIRCLE_NODE_INDEX}.txt | sed "s;^/home/cassandra/cassandra/test/unit/;;g" | grep "Test\.java$"  > /tmp/java_tests_${CIRCLE_NODE_INDEX}_final.txt
+  echo "** /tmp/java_tests_${CIRCLE_NODE_INDEX}_final.txt"
+  cat /tmp/java_tests_${CIRCLE_NODE_INDEX}_final.txt
+no_output_timeout: 15m
+- run:
+name: Log Environment Information
+command: |
+  echo '*** id ***'
+  id
+  echo '*** cat /proc/cpuinfo ***'
+  cat /proc/cpuinfo
+  echo '*** free -m ***'
+  free -m
+  echo '*** df -m ***'
+  df -m
+  echo '*** ifconfig -a ***'
+  ifconfig -a
+  echo '*** uname -a ***'
+  uname -a
+  echo '*** mount ***'
+  mount
+  echo '*** env ***'
+  env
+  echo '*** java ***'
+  which java
+  java -version
+- run:
+name: Run Unit Tests (testclasslist-system-keyspace-directory)
+command: |
+  set -x
+  export PATH=$JAVA_HOME/bin:$PATH
+  time mv ~/cassandra /tmp
+  cd /tmp/cassandra
+  if [ -d ~/dtest_jars ]; then
+cp ~/dtest_jars/dtest* /tmp/cassandra/build/
+  fi
+  test_timeout=$(grep 'name="test.unit.timeout"' build.

[cassandra] branch CASSANDRA-14793 created (now 51366a6)

2020-09-29 Thread blerer
This is an automated email from the ASF dual-hosted git repository.

blerer pushed a change to branch CASSANDRA-14793
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


  at 51366a6  Change Circle-CI DO NOT MERGE

This branch includes the following new commits:

 new ea3ee37  Allow to use a different directory for storing system tables.
 new 51366a6  Change Circle-CI DO NOT MERGE

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.






[cassandra] 02/02: Change Circle-CI DO NOT MERGE

2020-09-29 Thread blerer
This is an automated email from the ASF dual-hosted git repository.

blerer pushed a commit to branch CASSANDRA-14793
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 51366a6bae614b14fe81d4fd2f43c0e4c8b2425b
Author: Benjamin Lerer 
AuthorDate: Fri Sep 18 16:59:39 2020 +0200

Change Circle-CI DO NOT MERGE
---
 .circleci/config.yml | 204 +--
 1 file changed, 102 insertions(+), 102 deletions(-)

diff --git a/.circleci/config.yml b/.circleci/config.yml
index 6c177a4..a41fcf3 100644
--- a/.circleci/config.yml
+++ b/.circleci/config.yml
@@ -3,10 +3,10 @@ jobs:
   j8_jvm_upgrade_dtests:
 docker:
 - image: nastra/cassandra-testing-ubuntu1910-java11-w-dependencies:20200603
-resource_class: medium
+resource_class: large
 working_directory: ~/
 shell: /bin/bash -eo pipefail -l
-parallelism: 1
+parallelism: 10
 steps:
 - attach_workspace:
 at: /home/cassandra
@@ -85,8 +85,8 @@ jobs:
 - CASS_DRIVER_NO_EXTENSIONS: true
 - CASS_DRIVER_NO_CYTHON: true
 - CASSANDRA_SKIP_SYNC: true
-- DTEST_REPO: git://github.com/apache/cassandra-dtest.git
-- DTEST_BRANCH: master
+- DTEST_REPO: git://github.com/blerer/cassandra-dtest.git
+- DTEST_BRANCH: CASSANDRA-14793
 - CCM_MAX_HEAP_SIZE: 1024M
 - CCM_HEAP_NEWSIZE: 256M
 - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64
@@ -94,10 +94,10 @@ jobs:
   j8_cqlsh-dtests-py2-with-vnodes:
 docker:
 - image: nastra/cassandra-testing-ubuntu1910-java11-w-dependencies:20200603
-resource_class: medium
+resource_class: large
 working_directory: ~/
 shell: /bin/bash -eo pipefail -l
-parallelism: 4
+parallelism: 50
 steps:
 - attach_workspace:
 at: /home/cassandra
@@ -162,8 +162,8 @@ jobs:
 - CASS_DRIVER_NO_EXTENSIONS: true
 - CASS_DRIVER_NO_CYTHON: true
 - CASSANDRA_SKIP_SYNC: true
-- DTEST_REPO: git://github.com/apache/cassandra-dtest.git
-- DTEST_BRANCH: master
+- DTEST_REPO: git://github.com/blerer/cassandra-dtest.git
+- DTEST_BRANCH: CASSANDRA-14793
 - CCM_MAX_HEAP_SIZE: 1024M
 - CCM_HEAP_NEWSIZE: 256M
 - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64
@@ -174,7 +174,7 @@ jobs:
 resource_class: medium
 working_directory: ~/
 shell: /bin/bash -eo pipefail -l
-parallelism: 4
+parallelism: 25
 steps:
 - attach_workspace:
 at: /home/cassandra
@@ -253,8 +253,8 @@ jobs:
 - CASS_DRIVER_NO_EXTENSIONS: true
 - CASS_DRIVER_NO_CYTHON: true
 - CASSANDRA_SKIP_SYNC: true
-- DTEST_REPO: git://github.com/apache/cassandra-dtest.git
-- DTEST_BRANCH: master
+- DTEST_REPO: git://github.com/blerer/cassandra-dtest.git
+- DTEST_BRANCH: CASSANDRA-14793
 - CCM_MAX_HEAP_SIZE: 1024M
 - CCM_HEAP_NEWSIZE: 256M
 - JAVA_HOME: /usr/lib/jvm/java-11-openjdk-amd64
@@ -263,10 +263,10 @@ jobs:
   j8_cqlsh-dtests-py38-no-vnodes:
 docker:
 - image: nastra/cassandra-testing-ubuntu1910-java11-w-dependencies:20200603
-resource_class: medium
+resource_class: large
 working_directory: ~/
 shell: /bin/bash -eo pipefail -l
-parallelism: 4
+parallelism: 50
 steps:
 - attach_workspace:
 at: /home/cassandra
@@ -331,8 +331,8 @@ jobs:
 - CASS_DRIVER_NO_EXTENSIONS: true
 - CASS_DRIVER_NO_CYTHON: true
 - CASSANDRA_SKIP_SYNC: true
-- DTEST_REPO: git://github.com/apache/cassandra-dtest.git
-- DTEST_BRANCH: master
+- DTEST_REPO: git://github.com/blerer/cassandra-dtest.git
+- DTEST_BRANCH: CASSANDRA-14793
 - CCM_MAX_HEAP_SIZE: 1024M
 - CCM_HEAP_NEWSIZE: 256M
 - JAVA_HOME: /usr/lib/jvm/java-8-openjdk-amd64
@@ -340,10 +340,10 @@ jobs:
   j11_cqlsh-dtests-py3-with-vnodes:
 docker:
 - image: nastra/cassandra-testing-ubuntu1910-java11:20200603
-resource_class: medium
+resource_class: large
 working_directory: ~/
 shell: /bin/bash -eo pipefail -l
-parallelism: 4
+parallelism: 50
 steps:
 - attach_workspace:
 at: /home/cassandra
@@ -408,8 +408,8 @@ jobs:
 - CASS_DRIVER_NO_EXTENSIONS: true
 - CASS_DRIVER_NO_CYTHON: true
 - CASSANDRA_SKIP_SYNC: true
-- DTEST_REPO: git://github.com/apache/cassandra-dtest.git
-- DTEST_BRANCH: master
+- DTEST_REPO: git://github.com/blerer/cassandra-dtest.git
+- DTEST_BRANCH: CASSANDRA-14793
 - CCM_MAX_HEAP_SIZE: 1024M
 - CCM_HEAP_NEWSIZE: 256M
 - JAVA_HOME: /usr/lib/jvm/java-11-openjdk-amd64
@@ -418,10 +418,10 @@ jobs:
   j11_cqlsh-dtests-py3-no-vnodes:
 docker:
 - image: nastra/cassandra-testing-ubuntu1910-java11:20200603
-resource_class: medium
+resource_class: large
 working_directory: ~/
 shell: /bin/bash -eo pipefail -l
-parallelism: 4
+parallelism: 50
 steps:
 - attach_workspace:
 at: /home/cassandra
@@ -486,8 +486,8 @@ jobs:
 - CASS_DRIVER_NO_EXTENSIONS: true
 - CASS_DRIVER_NO_CYTHON

[jira] [Commented] (CASSANDRA-15537) 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test

2020-09-29 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203946#comment-17203946
 ] 

Paulo Motta commented on CASSANDRA-15537:
-

Hi [~yifanc], do you have any update on the diff tests? Does the work for this 
task consist solely of the tests you are running, or can it be split into 
subtasks so others can help?

For context, I'm asking this to check the status of the 4.0 quality epic as 
part of [this 
discussion|https://www.mail-archive.com/dev@cassandra.apache.org/msg15881.html] 
on the mailing list.

> 4.0 quality testing: Local Read/Write Path: Upgrade and Diff Test
> -
>
> Key: CASSANDRA-15537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15537
> Project: Cassandra
>  Issue Type: Task
>  Components: Test/dtest/java, Test/dtest/python
>Reporter: Josh McKenzie
>Assignee: Yifan Cai
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Reference [doc from 
> NGCC|https://docs.google.com/document/d/1uhUOp7wpE9ZXNDgxoCZHejHt5SO4Qw1dArZqqsJccyQ/edit#]
>  for context.
> Execution of upgrade and diff tests via cassandra-diff has proven to be one 
> of the most effective approaches toward identifying issues with the local 
> read/write path. These include instances of data loss, data corruption, data 
> resurrection, incorrect responses to queries, incomplete responses, and 
> others. Upgrade and diff tests can be executed concurrent with fault 
> injection (such as host or network failure); as well as during mixed-version 
> scenarios (such as upgrading half of the instances in a cluster, and running 
> upgradesstables on only half of the upgraded instances).
> Upgrade and diff tests are expected to continue through the release cycle, 
> and are a great way for contributors to gain confidence in the correctness of 
> the database under their own workloads.
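
As a rough illustration of what a diff test looks for, the sketch below compares two snapshots of the same data, for example one read before an upgrade and one after. It is written in Python under loud assumptions: it is not how cassandra-diff is implemented, the function name `diff_snapshots` is hypothetical, and each snapshot is simplified to a dict from primary key to row value.

```python
def diff_snapshots(source, target):
    """Compare two {primary_key: row} mappings and classify the differences."""
    missing = sorted(k for k in source if k not in target)        # possible data loss
    unexpected = sorted(k for k in target if k not in source)     # possible resurrection
    mismatched = sorted(
        k for k in source if k in target and source[k] != target[k]
    )                                                             # possible corruption
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    before = {1: "a", 2: "b", 3: "c"}
    after = {1: "a", 2: "B", 4: "d"}
    print(diff_snapshots(before, after))
```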






[jira] [Commented] (CASSANDRA-14746) Ensure Netty Internode Messaging Refactor is Solid

2020-09-29 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203940#comment-17203940
 ] 

Paulo Motta commented on CASSANDRA-14746:
-

[~jolynch] [~vinaykumarcse] As part of [this 
discussion|https://www.mail-archive.com/dev@cassandra.apache.org/msg15881.html] 
on the mailing list, I'm checking the status of the 4.0 quality epic issues and 
would appreciate it if you could answer the following questions:

a) Is work on this issue still active?
 b) Can we complete this issue once all subtasks are completed or are there 
more subtasks to be added?

> Ensure Netty Internode Messaging Refactor is Solid
> --
>
> Key: CASSANDRA-14746
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14746
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Streaming and Messaging
>Reporter: Joey Lynch
>Assignee: Joey Lynch
>Priority: Normal
>  Labels: 4.0-QA
> Fix For: 4.0-beta
>
>
> Before we release 4.0 let's ensure that the internode messaging refactor is 
> 100% solid. As internode messaging is naturally used in many code paths and 
> widely configurable we have a large number of cluster configurations and test 
> configurations that must be vetted.
> We plan to vary the following:
>  * Version of Cassandra 3.0.17 vs 4.0-alpha
>  * Cluster sizes with *multi-dc* deployments ranging from 6 - 100 nodes
>  * Client request rates varying between 1k QPS and 100k QPS of varying sizes 
> and shapes (BATCH, INSERT, SELECT point, SELECT range, etc ...)
>  * Internode compression
>  * Internode SSL (as well as openssl vs jdk)
>  * Internode Coalescing options
> We are looking to measure the following as appropriate:
>  * Latency distributions of reads and writes (lower is better)
>  * Scaling limit, aka maximum throughput before violating p99 latency 
> deadline of 10ms @ LOCAL_QUORUM, on a fixed hardware deployment for 100% 
> writes, 100% reads and 50-50 writes+reads (higher is better)
>  * Thread counts (lower is better)
>  * Context switches (lower is better)
>  * On-CPU time of tasks (higher periods without context switch is better)
>  * GC allocation rates / throughput for a fixed size heap (lower allocation 
> better)
>  * Streaming recovery time for a single node failure, i.e. can Cassandra 
> saturate the NIC
>  
> The goal is that 4.0 should have better latency, more throughput, fewer 
> threads, fewer context switches, less GC allocation, and faster recovery 
> time. I'm putting Jason Brown as the reviewer since he implemented most of 
> the internode refactor.
> Current collaborators driving this QA task: Dinesh Joshi, Jordan West, Joey 
> Lynch (Netflix), Vinay Chella (Netflix)
> Owning committer(s): Jason Brown
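
The "scaling limit" metric in the list above can be expressed as a small calculation. This is a sketch in Python under stated assumptions: p99 is computed with the nearest-rank method, latencies are in milliseconds, and the function names and the sample numbers are invented for illustration rather than taken from any real test run.

```python
def p99(latencies_ms):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = (99 * len(ordered) + 99) // 100  # integer ceil(0.99 * n)
    return ordered[rank - 1]


def scaling_limit(runs, deadline_ms=10.0):
    """Highest request rate reached before the first p99 deadline violation.

    runs maps a request rate (QPS) to the latencies observed at that rate.
    Returns None if even the lowest rate violates the deadline.
    """
    best = None
    for qps in sorted(runs):
        if p99(runs[qps]) > deadline_ms:
            break
        best = qps
    return best


if __name__ == "__main__":
    # Made-up samples, for illustration only.
    runs = {
        1_000: [1.0] * 100,
        10_000: [2.0] * 99 + [9.0],
        100_000: [5.0] * 98 + [50.0] * 2,
    }
    print("scaling limit (QPS):", scaling_limit(runs))
```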






[jira] [Commented] (CASSANDRA-15234) Standardise config and JVM parameters

2020-09-29 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203910#comment-17203910
 ] 

Paulo Motta commented on CASSANDRA-15234:
-

Despite the awesome work (thanks for leading it [~e.dimitrova]) and productive 
discussion that went into this issue, we didn't seem to reach a strong 
agreement here and it seems to me it's a bit late in the 4.0 release cycle to 
land this?

In the spirit of expediting 4.0RC release I propose we postpone this to 4.X, 
and resume this with high priority earlier in the next release cycle. What do 
you think?

> Standardise config and JVM parameters
> -
>
> Key: CASSANDRA-15234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15234
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Config
>Reporter: Benedict Elliott Smith
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-alpha
>
> Attachments: CASSANDRA-15234-3-DTests-JAVA8.txt
>
>
> We have a bunch of inconsistent names and config patterns in the codebase, 
> both from the yaml and JVM properties.  It would be nice to standardise the 
> naming (such as otc_ vs internode_) as well as the provision of values with 
> units - while maintaining perpetual backwards compatibility with the old 
> parameter names, of course.
> For temporal units, I would propose parsing strings with suffixes of:
> {{code}}
> u|micros(econds?)?
> ms|millis(econds?)?
> s(econds?)?
> m(inutes?)?
> h(ours?)?
> d(ays?)?
> mo(nths?)?
> {{code}}
> For rate units, I would propose parsing any of the standard {{B/s, KiB/s, 
> MiB/s, GiB/s, TiB/s}}.
> Perhaps for avoiding ambiguity we could not accept bauds {{bs, Mbps}} or 
> powers of 1000 such as {{KB/s}}, given these are regularly used for either 
> their old or new definition e.g. {{KiB/s}}, or we could support them and 
> simply log the value in bytes/s.
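
The suffix patterns proposed in the description can be tried out directly. The following is a minimal sketch in Python, not the parser that ended up in Cassandra: `parse_duration` and the canonical unit names are assumptions made for illustration, and only the seven patterns listed above are recognised.

```python
import re

# Canonical unit name paired with the suffix pattern from the proposal.
_UNIT_PATTERNS = [
    ("microseconds", r"u|micros(econds?)?"),
    ("milliseconds", r"ms|millis(econds?)?"),
    ("seconds", r"s(econds?)?"),
    ("minutes", r"m(inutes?)?"),
    ("hours", r"h(ours?)?"),
    ("days", r"d(ays?)?"),
    ("months", r"mo(nths?)?"),
]


def parse_duration(text):
    """Parse strings such as '10ms' or '3 hours' into (amount, unit)."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"not a duration: {text!r}")
    amount, suffix = int(match.group(1)), match.group(2)
    for unit, pattern in _UNIT_PATTERNS:
        # fullmatch keeps 'm', 'ms' and 'mo' from being confused with each other.
        if re.fullmatch(pattern, suffix):
            return amount, unit
    raise ValueError(f"unknown unit suffix: {suffix!r}")


if __name__ == "__main__":
    for sample in ("10ms", "5m", "2mo", "30 seconds", "7u"):
        print(sample, "->", parse_duration(sample))
```

Matching the whole suffix rather than a prefix is what keeps the short forms unambiguous; a value with an unlisted suffix such as `10KB` is rejected with a `ValueError`.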






[jira] [Commented] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables

2020-09-29 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203885#comment-17203885
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16063:
-

Hi [~adelapena]

I just rebased all branches.

PRs as follows:

[trunk|https://github.com/ekaterinadimitrova2/cassandra/pull/54] | 
[3.0|https://github.com/ekaterinadimitrova2/cassandra/pull/56] | 
[dtests|https://github.com/ekaterinadimitrova2/cassandra-dtest/pull/4]
No PR for 3.11, as it is a merge from 3.0.

> Fix user experience when upgrading to 4.0 with compact tables
> -
>
> Key: CASSANDRA-16063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/CQL
>Reporter: Sylvain Lebresne
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: Compact_storage_upgrade_tests.txt
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The code to handle compact tables has been removed from 4.0, and the intended 
> upgrade path to 4.0 for users having compact tables on 3.x is that they must 
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables 
> *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or miss a table) 
> and may try upgrading despite still having compact tables. If they do so, the 
> intent is that the node will _not_ start, with a message clearly indicating 
> the pre-upgrade step the user has missed. The user will then downgrade back 
> the node(s) to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and 
> then upgrade again.
> But while 4.0 does currently fail startup when finding any compact tables 
> with a decent message, I believe the check is done too late during startup.
> Namely, that check is done as we read the tables schema, so within 
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
>   But by then, we've _at least_ called 
> {{SystemKeyspace.persistLocalMetadata()}} and 
> {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log, 
> and even possibly flush new {{na}} format sstables. As a result, a user 
> might not be able to seamlessly restart the node on 3.x (to drop compact 
> storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0 
> startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0 
> with compact storage, you can downgrade back with no intervention whatsoever).
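
The ordering requested in the description (run the compact-table check before anything writes to disk) can be illustrated with a toy model. This is a sketch in Python with hypothetical names ({{StartupError}}, {{check_no_compact_tables}}, {{start_node}}); it does not reflect Cassandra's actual {{StartupCheck}} interface, which is Java.

```python
class StartupError(Exception):
    """Raised when a pre-flight check refuses to let the node start."""


def check_no_compact_tables(tables):
    # `tables` maps a table name to True if it still uses compact storage.
    compact = sorted(name for name, is_compact in tables.items() if is_compact)
    if compact:
        raise StartupError(
            "Compact tables found: " + ", ".join(compact)
            + ". Restart on 3.x and run ALTER ... DROP COMPACT STORAGE first."
        )


def start_node(tables, checks, mutating_steps):
    """Run every check before any step that writes to disk."""
    for check in checks:
        check(tables)  # may raise; nothing has been written at this point
    performed = []
    for step in mutating_steps:
        performed.append(step())  # stand-ins for metadata persistence, migration
    return performed
```

Because a failing check raises before the first mutating step runs, the on-disk state is untouched and the node can be restarted on the old version without manual cleanup, which is the property the proposed test is meant to verify.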






[jira] [Commented] (CASSANDRA-16121) Circleci should run cqlshlib tests as well

2020-09-29 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203877#comment-17203877
 ] 

Berenguer Blasi commented on CASSANDRA-16121:
-

[~e.dimitrova] executors are now in the original config-2.1 file:

- Low: 
[j11|https://app.circleci.com/pipelines/github/bereng/cassandra/129/workflows/42ede6ff-d809-42f3-b143-3945003539a6]
 & 
[j8|https://app.circleci.com/pipelines/github/bereng/cassandra/129/workflows/10497110-d938-4500-8ef3-eb3d0e815b6e]
- Medium: 
[j11|https://app.circleci.com/pipelines/github/bereng/cassandra/130/workflows/ee08a837-0710-40c8-bb26-cad7b2e20891]
 & 
[j8|https://app.circleci.com/pipelines/github/bereng/cassandra/130/workflows/94de5698-26b5-467d-afe4-c8b284d52d50]
- High: 
[j11|https://app.circleci.com/pipelines/github/bereng/cassandra/131/workflows/5d7b1a6e-fd7b-47d9-932c-cabc8194a644]
 & 
[j8|https://app.circleci.com/pipelines/github/bereng/cassandra/131/workflows/893ae53b-0744-4568-ae41-993a9c1fdcd5]


> Circleci should run cqlshlib tests as well
> --
>
> Key: CASSANDRA-16121
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16121
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI, Test/unit
>Reporter: Berenguer Blasi
>Assignee: Berenguer Blasi
>Priority: Normal
> Fix For: 4.0-beta
>
>
> Currently circleci is not running cqlshlib tests. This resulted in some bugs 
> not being caught before committing.






[jira] [Commented] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables

2020-09-29 Thread Jira


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203861#comment-17203861
 ] 

Andres de la Peña commented on CASSANDRA-16063:
---

[~e.dimitrova] are there PRs for those branches? I only see [this 
one|https://github.com/ekaterinadimitrova2/cassandra/pull/54] for trunk.

> Fix user experience when upgrading to 4.0 with compact tables
> -
>
> Key: CASSANDRA-16063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/CQL
>Reporter: Sylvain Lebresne
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: Compact_storage_upgrade_tests.txt
>
>
> The code to handle compact tables has been removed from 4.0, and the intended 
> upgrade path to 4.0 for users having compact tables on 3.x is that they must 
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables 
> *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or miss a table) 
> and may try upgrading despite still having compact tables. If they do so, the 
> intent is that the node will _not_ start, with a message clearly indicating 
> the pre-upgrade step the user has missed. The user will then downgrade back 
> the node(s) to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and 
> then upgrade again.
> But while 4.0 does currently fail startup when finding any compact tables 
> with a decent message, I believe the check is done too late during startup.
> Namely, that check is done as we read the tables schema, so within 
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
>   But by then, we've _at least_ called 
> {{SystemKeyspace.persistLocalMetadata()}} and 
> {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log, 
> and even possibly flush new {{na}} format sstables. As a result, a user 
> might not be able to seamlessly restart the node on 3.x (to drop compact 
> storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0 
> startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0 
> with compact storage, you can downgrade back with no intervention whatsoever).






[jira] [Commented] (CASSANDRA-16128) Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o instead of archiving

2020-09-29 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17203860#comment-17203860
 ] 

Michael Semb Wever commented on CASSANDRA-16128:


In-tree patches
- 
[2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...thelastpickle:mck/cassandra-2.2_jenkinsfile_2020-08]
- 
[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...thelastpickle:mck/cassandra-3.0_jenkinsfile_2020-08]
- 
[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:mck/cassandra-3.11_jenkinsfile_2020-08]
- 
[trunk|https://github.com/apache/cassandra/compare/trunk...thelastpickle:mck/jenkinsfile_2020-08]
 

> Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o 
> instead of archiving
> ---
>
> Key: CASSANDRA-16128
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16128
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0-beta
>
>
> Jenkins improvements
> 1. Add the cassandra-website job into cassandra_job_dsl.seed.groovy (so we 
> don't lose it next time the Jenkins master is corrupted)
> 2. Print the SHAs of the different git repos used during the build process. 
> Also store them in the .head files (so the pipeline can print them out too).
> 3. Instead of archiving artefacts, ssh them to 
> https://nightlies.apache.org/cassandra/
> (Disk usage on agents is largely under control, but disk usage on master was 
> the new problem. The suspicion here is that the Cassandra-*-artifact's 
> artefacts were the disk usage culprit, though we have no evidence to support it.)






[jira] [Updated] (CASSANDRA-16128) Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o instead of archiving

2020-09-29 Thread Michael Semb Wever (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Semb Wever updated CASSANDRA-16128:
---
Status: Patch Available  (was: In Progress)

> Jenkins: dsl for website build, logging repo SHAs, and using nightlies.a.o 
> instead of archiving
> ---
>
> Key: CASSANDRA-16128
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16128
> Project: Cassandra
>  Issue Type: Task
>  Components: CI
>Reporter: Michael Semb Wever
>Assignee: Michael Semb Wever
>Priority: Normal
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.0-beta
>
>
> Jenkins improvements
> 1. Add the cassandra-website job into cassandra_job_dsl.seed.groovy (so we 
> don't lose it next time the Jenkins master is corrupted)
> 2. Print the SHAs of the different git repos used during the build process. 
> Also store them in the .head files (so the pipeline can print them out too).
> 3. Instead of archiving artefacts, ssh them to 
> https://nightlies.apache.org/cassandra/
> (Disk usage on agents is largely under control, but disk usage on master was 
> the new problem. The suspicion here is that the Cassandra-*-artifact's 
> artefacts were the disk usage culprit, though we have no evidence to support it.)






[jira] [Updated] (CASSANDRA-16063) Fix user experience when upgrading to 4.0 with compact tables

2020-09-29 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andres de la Peña updated CASSANDRA-16063:
--
Reviewers: Andres de la Peña, Sylvain Lebresne, Andres de la Peña  (was: 
Andres de la Peña, Sylvain Lebresne)
Status: Review In Progress  (was: Patch Available)

> Fix user experience when upgrading to 4.0 with compact tables
> -
>
> Key: CASSANDRA-16063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/CQL
>Reporter: Sylvain Lebresne
>Assignee: Ekaterina Dimitrova
>Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: Compact_storage_upgrade_tests.txt
>
>
> The code to handle compact tables has been removed from 4.0, and the intended 
> upgrade path to 4.0 for users having compact tables on 3.x is that they must 
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables 
> *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or miss a table) 
> and may try upgrading despite still having compact tables. If they do so, the 
> intent is that the node will _not_ start, with a message clearly indicating 
> the pre-upgrade step the user has missed. The user will then downgrade back 
> the node(s) to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and 
> then upgrade again.
> But while 4.0 does currently fail startup when finding any compact tables 
> with a decent message, I believe the check is done too late during startup.
> Namely, that check is done as we read the tables schema, so within 
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
>   But by then, we've _at least_ called 
> {{SystemKeyspace.persistLocalMetadata()}} and 
> {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log, 
> and even possibly flush new {{na}} format sstables. As a result, a user 
> might not be able to seamlessly restart the node on 3.x (to drop compact 
> storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0 
> startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0 
> with compact storage, you can downgrade back with no intervention whatsoever).


