[jira] [Commented] (CASSANDRA-16456) Add Plugin Support for CQLSH
[ https://issues.apache.org/jira/browse/CASSANDRA-16456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526772#comment-17526772 ] Brian Houser commented on CASSANDRA-16456: -- Ok, cool I will implement what I described with the points you added. Will be done by the weekend pacific. > Add Plugin Support for CQLSH > > > Key: CASSANDRA-16456 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16456 > Project: Cassandra > Issue Type: New Feature > Components: Tool/cqlsh >Reporter: Brian Houser >Assignee: Brian Houser >Priority: Normal > Labels: gsoc2021, mentor > Time Spent: 2h 50m > Remaining Estimate: 0h > > Currently the Cassandra drivers offer a plugin authenticator architecture for > the support of different authentication methods. This has been leveraged to > provide support for LDAP, Kerberos, and Sigv4 authentication. Unfortunately, > cqlsh, the included CLI tool, does not offer such support. Switching to a new > enhanced authentication scheme thus means being cut off from using cqlsh in > normal operation. > We should have a means of using the same plugins and authentication providers > as the Python Cassandra driver. > Here's a link to an initial draft of > [CEP|https://docs.google.com/document/d/1_G-OZCAEmDyuQuAN2wQUYUtZBEJpMkHWnkYELLhqvKc/edit?usp=sharing]. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra-website] branch asf-staging updated (6cf9cac7 -> 815fe0cb)
This is an automated email from the ASF dual-hosted git repository. git-site-role pushed a change to branch asf-staging in repository https://gitbox.apache.org/repos/asf/cassandra-website.git discard 6cf9cac7 generate docs for 8fd077a6 new 815fe0cb generate docs for 8fd077a6 This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (6cf9cac7) \ N -- N -- N refs/heads/asf-staging (815fe0cb) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: site-ui/build/ui-bundle.zip | Bin 4740078 -> 4740078 bytes 1 file changed, 0 insertions(+), 0 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-17166) Enhance SnakeYAML properties to be reusable outside of YAML parsing, support camel case conversion to snake case, and add support to ignore properties
[ https://issues.apache.org/jira/browse/CASSANDRA-17166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Capwell updated CASSANDRA-17166:

    Fix Version/s: 4.1
                   (was: 4.x)
    Source Control Link: https://github.com/apache/cassandra/commit/9b7e50b29bd029fc2151789306dc28864e1fc689
    Resolution: Fixed
    Status: Resolved  (was: Ready to Commit)

> Enhance SnakeYAML properties to be reusable outside of YAML parsing, support camel case conversion to snake case, and add support to ignore properties
>
> Key: CASSANDRA-17166
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17166
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Config
> Reporter: David Capwell
> Assignee: David Capwell
> Priority: Normal
> Fix For: 4.1
>
> Time Spent: 15h 10m
> Remaining Estimate: 0h
>
> SnakeYaml is rather limited in the "object mapping" layer, which forces our internal code to match specific patterns (all fields public and camel case); we can remove this restriction by leveraging Jackson for property lookup, and leaving the YAML handling to SnakeYAML
[jira] [Commented] (CASSANDRA-17166) Enhance SnakeYAML properties to be reusable outside of YAML parsing, support camel case conversion to snake case, and add support to ignore properties
[ https://issues.apache.org/jira/browse/CASSANDRA-17166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526749#comment-17526749 ]

David Capwell commented on CASSANDRA-17166:

Starting commit

CI Results (pending):
||Branch||Source||Circle CI||Jenkins||
|trunk|[branch|https://github.com/dcapwell/cassandra/tree/commit_remote_branch/CASSANDRA-17166-trunk-B0041C5D-C3FD-41B0-8F73-BC6B8C01DCA0]|[build|https://app.circleci.com/pipelines/github/dcapwell/cassandra?branch=commit_remote_branch%2FCASSANDRA-17166-trunk-B0041C5D-C3FD-41B0-8F73-BC6B8C01DCA0]|[build|https://ci-cassandra.apache.org/job/Cassandra-devbranch/1627/]|
[jira] [Commented] (CASSANDRA-17425) Add new CQL function maxWritetime
[ https://issues.apache.org/jira/browse/CASSANDRA-17425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526743#comment-17526743 ]

Yifan Cai commented on CASSANDRA-17425:

I rebased my patch on trunk and created a new PR: [https://github.com/apache/cassandra/pull/1584]

CI: [https://app.circleci.com/pipelines/github/yifan-c/cassandra?branch=CASSANDRA-17425%2Ftrunk-new&filter=all]

There are 2 commits. The first commit is the rebased original implementation. The second, optional commit implements the function based on WritetimeOrTTL.

[~adelapena], can you review next week?

> Add new CQL function maxWritetime
>
> Key: CASSANDRA-17425
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17425
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL/Syntax
> Reporter: Yifan Cai
> Assignee: Yifan Cai
> Priority: Normal
> Time Spent: 20m
> Remaining Estimate: 0h
>
> The function "writetime" does not support multi-cell types, e.g. collections and UDT. It would be useful to enable querying the latest modified timestamp of a column value.
> I'd like to propose adding a new function named "maxWritetime", which returns the largest timestamp amongst the cells. When applied to single-cell types, it returns the same result as "writetime".
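The semantics proposed in the ticket can be illustrated with a minimal sketch. This is not the patch's implementation; the class and method names below are hypothetical, and a column is modelled simply as a list of per-cell write timestamps.

```java
import java.util.List;

// Hypothetical illustration of the proposed "maxWritetime" semantics:
// a multi-cell column (collection, UDT) is modelled as a list of
// per-cell write timestamps, and the function returns the largest one.
class MaxWritetimeSketch {
    static long maxWritetime(List<Long> cellTimestamps) {
        if (cellTimestamps.isEmpty()) {
            // Assumption: the sketch rejects empty input; the ticket does not
            // say how the real function treats a column with no cells.
            throw new IllegalArgumentException("column has no cells");
        }
        long max = Long.MIN_VALUE;
        for (long ts : cellTimestamps) {
            max = Math.max(max, ts);
        }
        return max;
    }

    public static void main(String[] args) {
        // Three cells written at different times: the latest one wins.
        System.out.println(maxWritetime(List.of(100L, 250L, 180L)));
        // A single-cell column degenerates to plain "writetime".
        System.out.println(maxWritetime(List.of(42L)));
    }
}
```

For a single-cell column the list has one element, so the result matches what "writetime" would return, as the description states.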
[cassandra-website] branch asf-staging updated (c636267c -> 6cf9cac7)
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git

 discard c636267c generate docs for 8fd077a6
     new 6cf9cac7 generate docs for 8fd077a6

Summary of changes:
 .../doc/4.1/cassandra/tools/nodetool/compact.html    | 6 +-
 .../latest/cassandra/tools/nodetool/compact.html     | 6 +-
 .../trunk/cassandra/tools/nodetool/compact.html      | 6 +-
 content/search-index.js                              | 2 +-
 site-ui/build/ui-bundle.zip                          | Bin 4740078 -> 4740078 bytes
 5 files changed, 16 insertions(+), 4 deletions(-)
[jira] [Updated] (CASSANDRA-17166) Enhance SnakeYAML properties to be reusable outside of YAML parsing, support camel case conversion to snake case, and add support to ignore properties
[ https://issues.apache.org/jira/browse/CASSANDRA-17166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Capwell updated CASSANDRA-17166:

    Status: Ready to Commit  (was: Review In Progress)

2 +1s
[jira] [Updated] (CASSANDRA-17537) nodetool compact should support using a key string to find the range to avoid operators having to manually do this
[ https://issues.apache.org/jira/browse/CASSANDRA-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Capwell updated CASSANDRA-17537:

    Fix Version/s: 4.1
    Source Control Link: https://github.com/apache/cassandra/commit/2b90ac1a1671b4071d9aa6f18e852021bc66702d
    Resolution: Fixed
    Status: Resolved  (was: Ready to Commit)

> nodetool compact should support using a key string to find the range to avoid operators having to manually do this
>
> Key: CASSANDRA-17537
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17537
> Project: Cassandra
> Issue Type: New Feature
> Components: Local/Compaction, Tool/nodetool
> Reporter: David Capwell
> Assignee: David Capwell
> Priority: Normal
> Fix For: 4.1
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> It's common that a single key needs to be compacted, and operators need to do the following:
> 1) go from key -> token
> 2) generate range
> 3) call nodetool compact with this range
> We can simplify this workflow by adding this to compact:
> nodetool compact ks.tbl -k "key1"
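The manual workflow that the ticket replaces can be sketched as follows. This is only an illustration under stated assumptions, not the code from the patch: it assumes the token for the key has already been looked up (step 1), that token ranges are start-exclusive and end-inclusive, and that the range is passed through nodetool compact's -st/-et (start-token/end-token) options. The class and method names are hypothetical.

```java
// Hypothetical sketch of steps 2 and 3 of the manual workflow:
// turn one token into the smallest range that contains it, then
// format the corresponding nodetool invocation.
class SingleKeyRangeSketch {
    // Returns {start, end} so that the range (start, end] contains only `token`.
    static long[] rangeForToken(long token) {
        if (token == Long.MIN_VALUE) {
            // token - 1 would overflow; this edge case is left out of the sketch.
            throw new IllegalArgumentException("cannot build a range below the minimum token");
        }
        return new long[] { token - 1, token };
    }

    // Builds the command line an operator would otherwise assemble by hand.
    static String compactCommand(String keyspace, String table, long token) {
        long[] range = rangeForToken(token);
        return "nodetool compact -st " + range[0] + " -et " + range[1]
                + " " + keyspace + " " + table;
    }

    public static void main(String[] args) {
        System.out.println(compactCommand("ks", "tbl", 42L));
    }
}
```

With the -k option described in the ticket, the operator supplies the key string and none of this arithmetic is needed.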
[cassandra] branch trunk updated: nodetool compact should support using a key string to find the range to avoid operators having to manually do this
This is an automated email from the ASF dual-hosted git repository.

dcapwell pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

The following commit(s) were added to refs/heads/trunk by this push:
     new 2b90ac1a16 nodetool compact should support using a key string to find the range to avoid operators having to manually do this
2b90ac1a16 is described below

commit 2b90ac1a1671b4071d9aa6f18e852021bc66702d
Author: David Capwell
AuthorDate: Thu Apr 21 14:37:59 2022 -0700

    nodetool compact should support using a key string to find the range to avoid operators having to manually do this

    patch by David Capwell; reviewed by Marcus Eriksson for CASSANDRA-17537
---
 CHANGES.txt                                        |   1 +
 .../org/apache/cassandra/db/ColumnFamilyStore.java |   5 +
 .../db/compaction/CompactionController.java        |   6 +-
 .../cassandra/db/compaction/CompactionManager.java |  30 +-
 .../apache/cassandra/dht/Murmur3Partitioner.java   |   5 +
 .../cassandra/io/sstable/format/SSTableReader.java |   7 ++
 .../apache/cassandra/service/StorageService.java   |  49 +-
 .../cassandra/service/StorageServiceMBean.java     |   7 ++
 src/java/org/apache/cassandra/tools/NodeProbe.java |   5 +
 .../apache/cassandra/tools/nodetool/Compact.java   |  13 ++-
 .../org/apache/cassandra/tools/ToolRunner.java     |  53 ++
 .../cassandra/tools/nodetool/CompactTest.java      | 107 +
 12 files changed, 274 insertions(+), 14 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 9e9e1ee2f1..972f760442 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.1
+ * nodetool compact should support using a key string to find the range to avoid operators having to manually do this (CASSANDRA-17537)
  * Add guardrail for data disk usage (CASSANDRA-17150)
  * Tool to list data paths of existing tables (CASSANDRA-17568)
  * Migrate track_warnings to more standard naming conventions and use latest configuration types rather than long (CASSANDRA-17560)
diff --git a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
index 35ca94214d..47dd66d7ae 100644
--- a/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
+++ b/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
@@ -2365,6 +2365,11 @@ public class ColumnFamilyStore implements ColumnFamilyStoreMBean
         return tokenRanges;
     }
 
+    public void forceCompactionForKey(DecoratedKey key)
+    {
+        CompactionManager.instance.forceCompactionForKey(this, key);
+    }
+
     public static Iterable<ColumnFamilyStore> all()
     {
         List<Iterable<ColumnFamilyStore>> stores = new ArrayList<>(Schema.instance.getKeyspaces().size());
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionController.java b/src/java/org/apache/cassandra/db/compaction/CompactionController.java
index e1b0f32583..814292f207 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionController.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionController.java
@@ -34,7 +34,6 @@ import org.apache.cassandra.io.sstable.format.SSTableReader;
 import org.apache.cassandra.io.util.FileDataInput;
 import org.apache.cassandra.io.util.FileUtils;
 import org.apache.cassandra.schema.CompactionParams.TombstoneOption;
-import org.apache.cassandra.utils.AlwaysPresentFilter;
 import org.apache.cassandra.utils.OverlapIterator;
 import org.apache.cassandra.utils.concurrent.Refs;
 
@@ -255,10 +254,7 @@ public class CompactionController extends AbstractCompactionController
 
         for (SSTableReader sstable: filteredSSTables)
         {
-            // if we don't have bloom filter(bf_fp_chance=1.0 or filter file is missing),
-            // we check index file instead.
-            if (sstable.getBloomFilter() instanceof AlwaysPresentFilter && sstable.getPosition(key, SSTableReader.Operator.EQ, false) != null
-                || sstable.getBloomFilter().isPresent(key))
+            if (sstable.maybePresent(key))
             {
                 minTimestampSeen = Math.min(minTimestampSeen, sstable.getMinTimestamp());
                 hasTimestamp = true;
diff --git a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
index 47ed3d5e11..165e1e02f3 100644
--- a/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
+++ b/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
@@ -27,6 +27,7 @@ import java.util.concurrent.TimeUnit;
 import java.util.concurrent.atomic.AtomicInteger;
 import java.util.function.BooleanSupplier;
 import java.util.function.Predicate;
+import java.util.function.Supplier;
 import java.util.stream.Collectors;
 import javax.management.openmbean.OpenDataException;
 import javax.management.openmbean.TabularData;
@@ -928,10 +929,10 @@ public class CompactionManager implements CompactionManagerMBean
[jira] [Commented] (CASSANDRA-17537) nodetool compact should support using a key string to find the range to avoid operators having to manually do this
[ https://issues.apache.org/jira/browse/CASSANDRA-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526702#comment-17526702 ]

Brandon Williams commented on CASSANDRA-17537:

That is an old known flaky: CASSANDRA-16677
[jira] [Commented] (CASSANDRA-17537) nodetool compact should support using a key string to find the range to avoid operators having to manually do this
[ https://issues.apache.org/jira/browse/CASSANDRA-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526701#comment-17526701 ]

David Capwell commented on CASSANDRA-17537:

CI was clean other than org.apache.cassandra.net.ConnectionTest; rerunning locally as I think it's just flaky... if it passes I will merge
[jira] [Updated] (CASSANDRA-16555) Add out-of-the-box snitch for Ec2 IMDSv2
[ https://issues.apache.org/jira/browse/CASSANDRA-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brandon Williams updated CASSANDRA-16555:

    Reviewers: Brandon Williams

> Add out-of-the-box snitch for Ec2 IMDSv2
>
> Key: CASSANDRA-16555
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16555
> Project: Cassandra
> Issue Type: New Feature
> Components: Consistency/Coordination
> Reporter: Paul Rütter (BlueConic)
> Assignee: fulco taen
> Priority: Normal
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In order to patch a vulnerability, Amazon came up with a new version of their metadata service.
> It's no longer unrestricted but now requires a token (in a header), in order to access the metadata service.
> See [https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html] for more information.
> Cassandra currently doesn't offer an out-of-the-box snitch class to support this.
> See [https://cassandra.apache.org/doc/latest/operating/snitch.html#snitch-classes]
> This issue asks to add support for this as a separate snitch class.
> We'll probably do a PR for this, as we are in the process of developing one.
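The token-in-a-header flow the ticket refers to can be sketched with the JDK's java.net.http API. This is a hedged illustration, not the snitch proposed in the ticket: the class and method names are hypothetical, the endpoint and header names are the ones given in the linked AWS documentation, and the requests are only constructed here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch of the two IMDSv2 requests a snitch would need:
// first a PUT to obtain a session token, then a GET that presents it.
class Imdsv2Sketch {
    static final String BASE = "http://169.254.169.254/latest";

    // Step 1: ask the metadata service for a session token with the given TTL.
    static HttpRequest tokenRequest(int ttlSeconds) {
        return HttpRequest.newBuilder(URI.create(BASE + "/api/token"))
                .header("X-aws-ec2-metadata-token-ttl-seconds", Integer.toString(ttlSeconds))
                .PUT(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // Step 2: read a metadata path, passing the token in a header.
    static HttpRequest metadataRequest(String token, String path) {
        return HttpRequest.newBuilder(URI.create(BASE + "/meta-data/" + path))
                .header("X-aws-ec2-metadata-token", token)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest put = tokenRequest(21600);
        // "example-token" stands in for the body returned by the PUT request.
        HttpRequest get = metadataRequest("example-token", "placement/availability-zone");
        System.out.println(put.method() + " " + put.uri());
        System.out.println(get.method() + " " + get.uri());
    }
}
```

A real snitch would send both requests with an HttpClient, cache the token until its TTL expires, and derive datacenter and rack from the returned availability zone; those parts are left out of the sketch.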
[cassandra-website] branch asf-staging updated (5ec74b34 -> c636267c)
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git

 discard 5ec74b34 generate docs for 8fd077a6
     new c636267c generate docs for 8fd077a6

Summary of changes:
 content/search-index.js     | 2 +-
 site-ui/build/ui-bundle.zip | Bin 4740078 -> 4740078 bytes
 2 files changed, 1 insertion(+), 1 deletion(-)
[jira] [Updated] (CASSANDRA-17576) Make GuardrailDiskUsageTest deterministic
[ https://issues.apache.org/jira/browse/CASSANDRA-17576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ekaterina Dimitrova updated CASSANDRA-17576:

    Fix Version/s: 4.1

> Make GuardrailDiskUsageTest deterministic
>
> Key: CASSANDRA-17576
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17576
> Project: Cassandra
> Issue Type: Bug
> Components: Feature/Guardrails
> Reporter: Ekaterina Dimitrova
> Priority: Normal
> Fix For: 4.1
>
> Jenkins is low on space so we should mock the amount of available disk space when testing the disk usage guardrails as otherwise the tests fail.
> The issue was not seen before commit as CircleCI doesn't have storage problems.
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1095/]
> Cc [~adelapena]
[jira] [Updated] (CASSANDRA-17576) Make GuardrailDiskUsageTest deterministic
[ https://issues.apache.org/jira/browse/CASSANDRA-17576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ekaterina Dimitrova updated CASSANDRA-17576:

    Bug Category: Parent values: Code(13163)
    Complexity: Normal
    Component/s: Feature/Guardrails
    Discovered By: Unit Test
    Severity: Low
    Status: Open  (was: Triage Needed)
[jira] [Created] (CASSANDRA-17576) Make GuardrailDiskUsageTest deterministic
Ekaterina Dimitrova created CASSANDRA-17576:

    Summary: Make GuardrailDiskUsageTest deterministic
    Key: CASSANDRA-17576
    URL: https://issues.apache.org/jira/browse/CASSANDRA-17576
    Project: Cassandra
    Issue Type: Bug
    Reporter: Ekaterina Dimitrova

Jenkins is low on space so we should mock the amount of available disk space when testing the disk usage guardrails as otherwise the tests fail.

The issue was not seen before commit as CircleCI doesn't have storage problems.

[https://ci-cassandra.apache.org/job/Cassandra-trunk/1095/]

Cc [~adelapena]
[jira] [Updated] (CASSANDRA-17575) forceCompactionForTokenRange when using a wrapped range may include sstables not within that range
[ https://issues.apache.org/jira/browse/CASSANDRA-17575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Capwell updated CASSANDRA-17575:

    Bug Category: Parent values: Correctness(12982) Level 1 values: API / Semantic Implementation(12988)
    Complexity: Normal
    Discovered By: Unit Test
    Fix Version/s: 4.x
    Severity: Normal
    Status: Open  (was: Triage Needed)

> forceCompactionForTokenRange when using a wrapped range may include sstables not within that range
>
> Key: CASSANDRA-17575
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17575
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Compaction
> Reporter: David Capwell
> Priority: Normal
> Fix For: 4.x
>
> This was found in CASSANDRA-17537
> When you compact the range (32, 31] this should include everything BUT 32, but in the test org.apache.cassandra.db.compaction.LeveledCompactionStrategyTest#testTokenRangeCompaction it found that SSTables with the bounds (32, 32) were getting included in the set of SSTables to compact
[jira] [Created] (CASSANDRA-17575) forceCompactionForTokenRange when using a wrapped range may include sstables not within that range
David Capwell created CASSANDRA-17575:

    Summary: forceCompactionForTokenRange when using a wrapped range may include sstables not within that range
    Key: CASSANDRA-17575
    URL: https://issues.apache.org/jira/browse/CASSANDRA-17575
    Project: Cassandra
    Issue Type: Bug
    Components: Local/Compaction
    Reporter: David Capwell

This was found in CASSANDRA-17537

When you compact the range (32, 31] this should include everything BUT 32, but in the test org.apache.cassandra.db.compaction.LeveledCompactionStrategyTest#testTokenRangeCompaction it found that SSTables with the bounds (32, 32) were getting included in the set of SSTables to compact
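The expected behaviour of the wrapped range can be checked with a small sketch. It is an illustration only, using plain long values as tokens and a hypothetical class name; it assumes the usual start-exclusive, end-inclusive convention implied by the (32, 31] notation, with a range wrapping around the ring when its left bound is not below its right bound.

```java
// Hypothetical sketch of containment for a token range (left, right]
// on a ring, where left >= right means the range wraps around.
class WrappedRangeSketch {
    static boolean contains(long left, long right, long token) {
        if (left < right) {
            // Ordinary range: strictly above left, at or below right.
            return token > left && token <= right;
        }
        // Wrapped range: everything above left plus everything up to right.
        return token > left || token <= right;
    }

    public static void main(String[] args) {
        // (32, 31] covers the whole ring except token 32.
        System.out.println(contains(32, 31, 32)); // false
        System.out.println(contains(32, 31, 31)); // true
        System.out.println(contains(32, 31, 33)); // true
    }
}
```

Under this definition an SSTable whose first and last tokens are both 32 holds no token inside (32, 31], which is why its inclusion in the compaction set is reported as a bug.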
[jira] [Commented] (CASSANDRA-17537) nodetool compact should support using a key string to find the range to avoid operators having to manually do this
[ https://issues.apache.org/jira/browse/CASSANDRA-17537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526671#comment-17526671 ]

David Capwell commented on CASSANDRA-17537:

spoke with [~marcuse] and looks like sstablesInBounds is returning SSTables not within the range, which causes the above to fail; filing a different ticket for this and removed the assert from this block
[cassandra-website] branch asf-staging updated (3cb0927f -> 5ec74b34)
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git

 discard 3cb0927f generate docs for 8fd077a6
     new 5ec74b34 generate docs for 8fd077a6

Summary of changes:
 site-ui/build/ui-bundle.zip | Bin 4740078 -> 4740078 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)
[jira] [Updated] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming
[ https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-15510: --- Fix Version/s: 4.0.5 4.1 (was: 4.x) (was: 4.0.x) Source Control Link: https://github.com/apache/cassandra/commit/018c8e0d5e8bc55fc51d3361fcb27c3c1fd189f6 Resolution: Fixed Status: Resolved (was: Ready to Commit) Committed into cassandra-4.0 at 018c8e0d5e8bc55fc51d3361fcb27c3c1fd189f6 and merged into trunk > BTree: Improve Building, Inserting and Transforming > --- > > Key: CASSANDRA-15510 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15510 > Project: Cassandra > Issue Type: Improvement > Components: Local/Other >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.0.5, 4.1 > > Time Spent: 10h > Remaining Estimate: 0h > > This work was originally undertaken as a follow-up to CASSANDRA-15367 to > ensure performance is strictly improved, but it may no longer be needed for > that purpose. It’s still hugely impactful, however. It remains to be > decided where this should land. > The current {{BTree}} implementation is suboptimal in a number of ways, with > very little focus having been given to its performance besides its > memory-occupancy. This patch aims to address that, specifically improving > the performance and allocations involved in: building, transforming and > inserting into a tree. > To facilitate this work, the {{BTree}} definition is modified slightly, so > that we can perform some simple arithmetic on tree sizes. Specifically, > trees of depth n are defined to have a maximum capacity of {{branchFactor^n - > 1}}, which translates into capping the number of leaf children at > {{branchFactor-1}}, as opposed to {{branchFactor}}. Since {{branchFactor}} > is a power of 2, this permits fast tree size arithmetic, enabling some of > these changes. > h2. 
Building > The static build method has been modified to utilise dedicated > {{buildPerfect}} methods that build either perfectly dense or perfectly > sparse sub-trees. These perfect trees all share their {{sizeMap}} with each > other, and can be built more efficiently than trees of arbitrary size. The > specifics are described in detail in the comments, but this building block > can be used to construct trees of any size, using at most one child at each > level that is not either perfectly sparse or perfectly dense. Bulk methods > are used where possible. > For large trees this can produce up to 30x throughput improvement and 30% > allocation reduction vs 3.0 (TBC, and to be tested vs 4.0). > {{FastBuilder}} is introduced for building a tree in-order (or in reverse) > without duplicate elements to resolve, without necessarily knowing the size > upfront. This meets the needs of most use cases. Data is built directly > into nodes, with up to one already-constructed node, and one partially > constructed node, on each level, being mutated to share their contents in the > event of insufficient data to populate the tree. These builders are > thread-locally shared. This leads to minimal copying, the same sharing of > {{sizeMap}} as above, zero wasted allocations, and results in minimal > difference in performance between utilising the less-ergonomic static build > and builder approach. > For large trees this leads to ~4.5x throughput improvement, and 70% reduction > in allocations vs a normal Builder. For small trees performance is > comparable, but allocations are similarly reduced. > h2. Inserting > It turns out that we only ever insert another tree into a tree, so we exploit > this to implement an efficient union of two trees, operating on them directly > via stacks in the transformer, instead of via a collection interface. 
A > builder-like object is introduced that shares functionality with > {{FastBuilder}}, and permits us to build the result of the union directly > into the final nodes, reusing as much of the original trees as possible. > Bulk methods are used where possible. > The result is not _uniformly_ faster, but is _significantly_ faster on > average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean > improvement of 10x. Worst reduction is 30%, and it may be that we can > isolate and alleviate that. Allocations are also reduced significantly, with > a median of 30% and mean of 42% for the tested workloads. As the trees get > larger the improvement drops, but remains uniformly lower. > h2. Transforming > Transformations garbage overhead is minimal, i.e. the main allocations are > those necessary to represent the new t
[cassandra] 01/01: Merge branch cassandra-4.0 into trunk
This is an automated email from the ASF dual-hosted git repository. blerer pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git commit 003a96b6a6f649f99138b94c52d28b73c2c3547a Merge: 2723c91878 018c8e0d5e Author: Benjamin Lerer AuthorDate: Fri Apr 22 19:28:32 2022 +0200 Merge branch cassandra-4.0 into trunk
[cassandra] branch trunk updated (2723c91878 -> 003a96b6a6)
This is an automated email from the ASF dual-hosted git repository. blerer pushed a change to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git from 2723c91878 Merge branch 'cassandra-4.0' into trunk add 018c8e0d5e Optimise BTree build, update and transform operations new 003a96b6a6 Merge branch cassandra-4.0 into trunk The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes:
[cassandra] branch cassandra-4.0 updated (2873c91269 -> 018c8e0d5e)
This is an automated email from the ASF dual-hosted git repository. blerer pushed a change to branch cassandra-4.0 in repository https://gitbox.apache.org/repos/asf/cassandra.git from 2873c91269 Split ReadRepairQueryTypesTest to avoid JUnit timeouts add 018c8e0d5e Optimise BTree build, update and transform operations No new revisions were added by this update. Summary of changes: CHANGES.txt|3 + build.xml |4 +- src/java/org/apache/cassandra/db/Columns.java |4 +- .../db/partitions/AtomicBTreePartition.java| 14 +- .../cassandra/db/partitions/PartitionUpdate.java |6 +- .../org/apache/cassandra/db/rows/BTreeRow.java |2 +- .../cassandra/db/rows/ComplexColumnData.java | 27 +- src/java/org/apache/cassandra/db/rows/Row.java |2 +- .../org/apache/cassandra/utils/BulkIterator.java | 112 + .../org/apache/cassandra/utils/btree/BTree.java| 3485 +--- .../apache/cassandra/utils/btree/BTreeRemoval.java | 12 +- .../org/apache/cassandra/utils/btree/BTreeSet.java | 46 +- .../apache/cassandra/utils/btree/NodeBuilder.java | 441 --- .../apache/cassandra/utils/btree/TreeBuilder.java | 121 - .../cassandra/utils/btree/UpdateFunction.java | 32 +- .../utils/caching/TinyThreadLocalPool.java | 85 + .../org/apache/cassandra/utils/LongBTreeTest.java | 587 ++-- .../BTreeBench.java} | 75 +- .../test/microbench/btree/BTreeBuildBench.java | 127 + .../test/microbench/btree/BTreeTransformBench.java | 194 ++ .../test/microbench/btree/BTreeUpdateBench.java| 324 ++ .../test/microbench/btree/IntVisitor.java | 85 + .../test/microbench/btree/Megamorphism.java| 169 + .../cassandra/utils/btree/BTreeRemovalTest.java| 17 +- .../utils/btree/BTreeSearchIteratorTest.java |6 +- .../apache/cassandra/utils/btree/BTreeTest.java| 239 +- 26 files changed, 4712 insertions(+), 1507 deletions(-) create mode 100644 src/java/org/apache/cassandra/utils/BulkIterator.java delete mode 100644 src/java/org/apache/cassandra/utils/btree/NodeBuilder.java delete mode 100644 src/java/org/apache/cassandra/utils/btree/TreeBuilder.java 
create mode 100644 src/java/org/apache/cassandra/utils/caching/TinyThreadLocalPool.java copy test/microbench/org/apache/cassandra/test/microbench/{BTreeBuildBench.java => btree/BTreeBench.java} (54%) create mode 100644 test/microbench/org/apache/cassandra/test/microbench/btree/BTreeBuildBench.java create mode 100644 test/microbench/org/apache/cassandra/test/microbench/btree/BTreeTransformBench.java create mode 100644 test/microbench/org/apache/cassandra/test/microbench/btree/BTreeUpdateBench.java create mode 100644 test/microbench/org/apache/cassandra/test/microbench/btree/IntVisitor.java create mode 100644 test/microbench/org/apache/cassandra/test/microbench/btree/Megamorphism.java
[jira] [Updated] (CASSANDRA-17543) ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE coordinator=1 flush=false paging=false] times out sporadically
[ https://issues.apache.org/jira/browse/CASSANDRA-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres de la Peña updated CASSANDRA-17543: -- Fix Version/s: 4.1 4.0.4 (was: 4.x) Since Version: 4.0.0 Source Control Link: https://github.com/apache/cassandra/commit/2873c9126979e21a8089e9a18d96af802745dbc2 Resolution: Fixed Status: Resolved (was: Ready to Commit) > ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE > coordinator=1 flush=false paging=false] times out sporadically > --- > > Key: CASSANDRA-17543 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17543 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Caleb Rackliffe >Assignee: Andres de la Peña >Priority: Normal > Fix For: 4.1, 4.0.4 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > org.apache.cassandra.distributed.test.ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: > strategy=NONE coordinator=1 flush=false paging=false] > {noformat} > Error Message > Timeout occurred. Please note the time in the report does not reflect the > time until the timeout. > Stacktrace > junit.framework.AssertionFailedError: Timeout occurred. Please note the time > in the report does not reflect the time until the timeout. 
> at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at org.apache.cassandra.anttasks.TestHelper.execute(TestHelper.java:53) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {noformat} > See > https://ci-cassandra.apache.org/job/Cassandra-trunk/1075/testReport/org.apache.cassandra.distributed.test/ReadRepairQueryTypesTest/testUnrestrictedQueryOnSkinnyTable_8__strategy_NONE_coordinator_1_flush_false_paging_false_/ -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra-website] branch asf-staging updated (eb4d1ab0 -> 3cb0927f)
This is an automated email from the ASF dual-hosted git repository. git-site-role pushed a change to branch asf-staging in repository https://gitbox.apache.org/repos/asf/cassandra-website.git discard eb4d1ab0 generate docs for 8fd077a6 new 3cb0927f generate docs for 8fd077a6 This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (eb4d1ab0) \ N -- N -- N refs/heads/asf-staging (3cb0927f) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. 
Summary of changes: .../cassandra/configuration/cass_yaml_file.html| 53 -- .../4.1/cassandra/tools/nodetool/bootstrap.html| 8 +-- .../nodetool/{refresh.html => datapaths.html} | 21 +++--- .../doc/4.1/cassandra/tools/nodetool/nodetool.html | 12 ++-- .../4.1/cassandra/tools/nodetool/repair_admin.html | 80 ++--- .../cassandra/troubleshooting/use_nodetool.html| 46 .../cassandra/configuration/cass_yaml_file.html| 53 -- .../latest/cassandra/tools/nodetool/bootstrap.html | 8 +-- .../cassandra/tools/nodetool/datapaths.html} | 21 +++--- .../latest/cassandra/tools/nodetool/nodetool.html | 12 ++-- .../cassandra/tools/nodetool/repair_admin.html | 80 ++--- .../cassandra/troubleshooting/use_nodetool.html| 46 .../cassandra/configuration/cass_yaml_file.html| 53 -- .../trunk/cassandra/tools/nodetool/bootstrap.html | 8 +-- .../cassandra/tools/nodetool/datapaths.html} | 21 +++--- .../trunk/cassandra/tools/nodetool/nodetool.html | 12 ++-- .../cassandra/tools/nodetool/repair_admin.html | 80 ++--- .../cassandra/troubleshooting/use_nodetool.html| 46 content/search-index.js| 2 +- site-ui/build/ui-bundle.zip| Bin 4740078 -> 4740078 bytes 20 files changed, 469 insertions(+), 193 deletions(-) copy content/doc/4.1/cassandra/tools/nodetool/{refresh.html => datapaths.html} (98%) copy content/doc/{4.1/cassandra/tools/nodetool/refresh.html => latest/cassandra/tools/nodetool/datapaths.html} (98%) copy content/doc/{4.1/cassandra/tools/nodetool/refresh.html => trunk/cassandra/tools/nodetool/datapaths.html} (98%) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-17543) ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE coordinator=1 flush=false paging=false] times out sporadically
[ https://issues.apache.org/jira/browse/CASSANDRA-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres de la Peña updated CASSANDRA-17543: -- Status: Ready to Commit (was: Review In Progress) > ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE > coordinator=1 flush=false paging=false] times out sporadically > --- > > Key: CASSANDRA-17543 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17543 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Caleb Rackliffe >Assignee: Andres de la Peña >Priority: Normal > Fix For: 4.x > > Time Spent: 1h 10m > Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-17543) ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE coordinator=1 flush=false paging=false] times out sporadically
[ https://issues.apache.org/jira/browse/CASSANDRA-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526547#comment-17526547 ] Andres de la Peña commented on CASSANDRA-17543: --- Thanks, committed to {{cassandra-4.0}} as [2873c9126979e21a8089e9a18d96af802745dbc2|https://github.com/apache/cassandra/commit/2873c9126979e21a8089e9a18d96af802745dbc2] and [merge into {{trunk}}|https://github.com/apache/cassandra/commit/2723c91878cfd7005a53f6118015c484dacc0f32] > ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE > coordinator=1 flush=false paging=false] times out sporadically > --- > > Key: CASSANDRA-17543 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17543 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Caleb Rackliffe >Assignee: Andres de la Peña >Priority: Normal > Fix For: 4.x > > Time Spent: 1h 10m > Remaining Estimate: 0h
[cassandra] branch cassandra-4.0 updated: Split ReadRepairQueryTypesTest to avoid JUnit timeouts
This is an automated email from the ASF dual-hosted git repository. adelapena pushed a commit to branch cassandra-4.0 in repository https://gitbox.apache.org/repos/asf/cassandra.git The following commit(s) were added to refs/heads/cassandra-4.0 by this push: new 2873c91269 Split ReadRepairQueryTypesTest to avoid JUnit timeouts 2873c91269 is described below commit 2873c9126979e21a8089e9a18d96af802745dbc2 Author: Andrés de la Peña AuthorDate: Wed Apr 13 12:09:17 2022 +0100 Split ReadRepairQueryTypesTest to avoid JUnit timeouts patch by Andrés de la Peña; reviewed by Caleb Rackliffe for CASSANDRA-17543 --- .../test/ReadRepairCollectionQueriesTest.java | 236 .../distributed/test/ReadRepairInQueriesTest.java | 247 .../test/ReadRepairPointQueriesTest.java | 79 ++ .../distributed/test/ReadRepairQueryTester.java| 280 + .../distributed/test/ReadRepairQueryTypesTest.java | 1192 .../test/ReadRepairRangeQueriesTest.java | 261 + .../test/ReadRepairSliceQueriesTest.java | 145 +++ .../test/ReadRepairUnrestrictedQueriesTest.java| 116 ++ 8 files changed, 1364 insertions(+), 1192 deletions(-) diff --git a/test/distributed/org/apache/cassandra/distributed/test/ReadRepairCollectionQueriesTest.java b/test/distributed/org/apache/cassandra/distributed/test/ReadRepairCollectionQueriesTest.java new file mode 100644 index 00..6149ffc93f --- /dev/null +++ b/test/distributed/org/apache/cassandra/distributed/test/ReadRepairCollectionQueriesTest.java @@ -0,0 +1,236 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.cassandra.distributed.test; + +import org.junit.Test; + +import static org.apache.cassandra.distributed.shared.AssertUtils.row; + +/** + * {@link ReadRepairQueryTester} for queries on collections. + */ +public class ReadRepairCollectionQueriesTest extends ReadRepairQueryTester +{ +/** + * Test unrestricted queries with frozen tuples. + */ +@Test +public void testTuple() +{ +tester("") +.createTable("CREATE TABLE %s (k int PRIMARY KEY, a tuple<int,int>, b tuple<int,int>)") +.mutate("INSERT INTO %s (k, a, b) VALUES (0, (1, 2), (3, 4))") +.queryColumns("a", 1, 1, + rows(row(tuple(1, 2))), + rows(row(0, tuple(1, 2), tuple(3, 4))), + rows(row(0, tuple(1, 2), null))) +.deleteColumn("DELETE a FROM %s WHERE k=0", "b", 0, 1, + rows(row(tuple(3, 4))), + rows(row(0, null, tuple(3, 4))), + rows(row(0, tuple(1, 2), tuple(3, 4)))) +.deleteRows("DELETE FROM %s WHERE k=0", 1, +rows(), +rows(row(0, null, tuple(3, 4)))) +.tearDown(); +} + +/** + * Test unrestricted queries with frozen sets. 
+ */ +@Test +public void testFrozenSet() +{ +tester("") +.createTable("CREATE TABLE %s (k int PRIMARY KEY, a frozen<set<int>>, b frozen<set<int>>)") +.mutate("INSERT INTO %s (k, a, b) VALUES (0, {1, 2}, {3, 4})") +.queryColumns("a[1]", 1, 1, + rows(row(1)), + rows(row(0, set(1, 2), set(3, 4))), + rows(row(0, set(1, 2), null))) +.deleteColumn("DELETE a FROM %s WHERE k=0", "b[4]", 0, 1, + rows(row(4)), + rows(row(0, null, set(3, 4))), + rows(row(0, set(1, 2), set(3, 4)))) +.deleteRows("DELETE FROM %s WHERE k=0", 1, +rows(), +rows(row(0, null, set(3, 4)))) +.tearDown(); +} + +/** + * Test unrestricted queries with frozen lists. + */ +@Test +public void testFrozenList() +{ +tester("") +.createTable("CREATE TABLE %s (k int PRIMARY KEY, a frozen<list<int>>, b frozen<list<int>>)") +.mutate("INSERT INTO %s (k, a, b) VALUES (0, [1, 2], [3, 4])") +.queryColumns("a", 1, 1, + rows(row(list(1, 2))), + rows(row(0, list(1, 2), list(3, 4))), + rows(row(0, list(1, 2), null))) +.deleteColumn("DE
[cassandra] branch trunk updated (b3842de5cf -> 2723c91878)
This is an automated email from the ASF dual-hosted git repository. adelapena pushed a change to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git from b3842de5cf Add guardrail for data disk usage new 2873c91269 Split ReadRepairQueryTypesTest to avoid JUnit timeouts new 2723c91878 Merge branch 'cassandra-4.0' into trunk The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: .../test/ReadRepairCollectionQueriesTest.java | 236 .../distributed/test/ReadRepairInQueriesTest.java | 247 .../test/ReadRepairPointQueriesTest.java | 79 ++ .../distributed/test/ReadRepairQueryTester.java| 279 + .../distributed/test/ReadRepairQueryTypesTest.java | 1191 .../test/ReadRepairRangeQueriesTest.java | 261 + .../test/ReadRepairSliceQueriesTest.java | 145 +++ .../test/ReadRepairUnrestrictedQueriesTest.java| 116 ++ 8 files changed, 1363 insertions(+), 1191 deletions(-) create mode 100644 test/distributed/org/apache/cassandra/distributed/test/ReadRepairCollectionQueriesTest.java create mode 100644 test/distributed/org/apache/cassandra/distributed/test/ReadRepairInQueriesTest.java create mode 100644 test/distributed/org/apache/cassandra/distributed/test/ReadRepairPointQueriesTest.java create mode 100644 test/distributed/org/apache/cassandra/distributed/test/ReadRepairQueryTester.java delete mode 100644 test/distributed/org/apache/cassandra/distributed/test/ReadRepairQueryTypesTest.java create mode 100644 test/distributed/org/apache/cassandra/distributed/test/ReadRepairRangeQueriesTest.java create mode 100644 test/distributed/org/apache/cassandra/distributed/test/ReadRepairSliceQueriesTest.java create mode 100644 test/distributed/org/apache/cassandra/distributed/test/ReadRepairUnrestrictedQueriesTest.java
[cassandra] 01/01: Merge branch 'cassandra-4.0' into trunk
This is an automated email from the ASF dual-hosted git repository. adelapena pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git commit 2723c91878cfd7005a53f6118015c484dacc0f32 Merge: b3842de5cf 2873c91269 Author: Andrés de la Peña AuthorDate: Fri Apr 22 17:30:22 2022 +0100 Merge branch 'cassandra-4.0' into trunk .../test/ReadRepairCollectionQueriesTest.java | 236 .../distributed/test/ReadRepairInQueriesTest.java | 247 .../test/ReadRepairPointQueriesTest.java | 79 ++ .../distributed/test/ReadRepairQueryTester.java| 279 + .../distributed/test/ReadRepairQueryTypesTest.java | 1191 .../test/ReadRepairRangeQueriesTest.java | 261 + .../test/ReadRepairSliceQueriesTest.java | 145 +++ .../test/ReadRepairUnrestrictedQueriesTest.java| 116 ++ 8 files changed, 1363 insertions(+), 1191 deletions(-) diff --cc test/distributed/org/apache/cassandra/distributed/test/ReadRepairQueryTester.java index 00,10bf05021b..26516104fb mode 00,100644..100644 --- a/test/distributed/org/apache/cassandra/distributed/test/ReadRepairQueryTester.java +++ b/test/distributed/org/apache/cassandra/distributed/test/ReadRepairQueryTester.java @@@ -1,0 -1,280 +1,279 @@@ + /* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + */ + + package org.apache.cassandra.distributed.test; + + import java.io.IOException; + import java.util.ArrayList; + import java.util.Collection; + import java.util.List; + + import org.junit.AfterClass; + import org.junit.BeforeClass; + import org.junit.runner.RunWith; + import org.junit.runners.Parameterized; + + import org.apache.cassandra.distributed.Cluster; + import org.apache.cassandra.service.reads.repair.ReadRepairStrategy; + -import static java.util.concurrent.TimeUnit.MINUTES; + import static org.apache.cassandra.distributed.shared.AssertUtils.assertEquals; + import static org.apache.cassandra.distributed.shared.AssertUtils.assertRows; + import static org.apache.cassandra.service.reads.repair.ReadRepairStrategy.NONE; + + /** + * Base class for tests around read repair functionality with different query types and schemas. + * + * Each test verifies that its tested query triggers read repair propagating mismatched rows/columns and row/column + * deletions. They also verify that the selected rows and columns are propagated through read repair on missmatch, + * and that unselected rows/columns are not repaired. 
+ * + * The tests are parameterized for: + * + * + * Data to be repaired residing on the query coordinator or a replica + * Data to be repaired residing on memtables or flushed to sstables + * + * + * All derived tests follow a similar pattern: + * + * Create a keyspace with RF=2 and a table + * Insert some data in only one of the nodes + * Run the tested read query selecting a subset of the inserted columns with CL=ALL + * Verify that the previous read has triggered read repair propagating only the queried columns + * Run the tested read query again but this time selecting all the columns + * Verify that the previous read has triggered read repair propagating the rest of the queried rows + * Delete one of the involved columns in just one node + * Run the tested read query again but this time selecting a column different to the deleted one + * Verify that the previous read hasn't propagated the column deletion + * Run the tested read query again selecting all the columns + * Verify that the previous read has triggered read repair propagating the column deletion + * Delete one of the involved rows in just one node + * Run the tested read query again selecting all the columns + * Verify that the previous read has triggered read repair propagating the row deletions + * Verify the final status of each node and drop the table + * + */ + @RunWith(Parameterized.class) + public abstract class ReadRepairQueryTester extends TestBaseImpl + { + private static final int NUM_NODES = 2; + + /** + * The read repair strategy to be used + */ +
[jira] [Commented] (CASSANDRA-15510) BTree: Improve Building, Inserting and Transforming
[ https://issues.apache.org/jira/browse/CASSANDRA-15510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526543#comment-17526543 ] Benjamin Lerer commented on CASSANDRA-15510: CI runs for [4.0|https://app.circleci.com/pipelines/github/blerer/cassandra/284/workflows/76db84a8-a6a1-4364-85ce-72ed5f12081f] and [trunk|https://app.circleci.com/pipelines/github/blerer/cassandra/287/workflows/f9ad1572-460f-4b99-bfbd-ac3edaac61da] > BTree: Improve Building, Inserting and Transforming > --- > > Key: CASSANDRA-15510 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15510 > Project: Cassandra > Issue Type: Improvement > Components: Local/Other >Reporter: Benedict Elliott Smith >Assignee: Benedict Elliott Smith >Priority: Normal > Fix For: 4.0.x, 4.x > > Time Spent: 10h > Remaining Estimate: 0h > > This work was originally undertaken as a follow-up to CASSANDRA-15367 to > ensure performance is strictly improved, but it may no longer be needed for > that purpose. It’s still hugely impactful, however. It remains to be > decided where this should land. > The current {{BTree}} implementation is suboptimal in a number of ways, with > very little focus having been given to its performance besides its > memory-occupancy. This patch aims to address that, specifically improving > the performance and allocations involved in: building, transforming and > inserting into a tree. > To facilitate this work, the {{BTree}} definition is modified slightly, so > that we can perform some simple arithmetic on tree sizes. Specifically, > trees of depth n are defined to have a maximum capacity of {{branchFactor^n - > 1}}, which translates into capping the number of leaf children at > {{branchFactor-1}}, as opposed to {{branchFactor}}. Since {{branchFactor}} > is a power of 2, this permits fast tree size arithmetic, enabling some of > these changes. > h2. 
Building > The static build method has been modified to utilise dedicated > {{buildPerfect}} methods that build either perfectly dense or perfectly > sparse sub-trees. These perfect trees all share their {{sizeMap}} with each > other, and can be built more efficiently than trees of arbitrary size. The > specifics are described in detail in the comments, but this building block > can be used to construct trees of any size, using at most one child at each > level that is not either perfectly sparse or perfectly dense. Bulk methods > are used where possible. > For large trees this can produce up to 30x throughput improvement and 30% > allocation reduction vs 3.0 (TBC, and to be tested vs 4.0). > {{FastBuilder}} is introduced for building a tree in-order (or in reverse) > without duplicate elements to resolve, without necessarily knowing the size > upfront. This meets the needs of most use cases. Data is built directly > into nodes, with up to one already-constructed node, and one partially > constructed node, on each level, being mutated to share their contents in the > event of insufficient data to populate the tree. These builders are > thread-locally shared. This leads to minimal copying, the same sharing of > {{sizeMap}} as above, zero wasted allocations, and results in minimal > difference in performance between utilising the less-ergonomic static build > and builder approach. > For large trees this leads to ~4.5x throughput improvement, and 70% reduction > in allocations vs a normal Builder. For small trees performance is > comparable, but allocations similarly reduced. > h2. Inserting > It turns out that we only ever insert another tree into a tree, so we exploit > this to implement an efficient union of two trees, operating on them directly > via stacks in the transformer, instead of via a collection interface.
A > builder-like object is introduced that shares functionality with > {{FastBuilder}}, and permits us to build the result of the union directly > into the final nodes, reusing as much of the original trees as possible. > Bulk methods are used where possible. > The result is not _uniformly_ faster, but is _significantly_ faster on > average: median _improvement_ of 1.4x (that is, 2.4x total throughput), mean > improvement of 10x. Worst reduction is 30%, and it may be that we can > isolate and alleviate that. Allocations are also reduced significantly, with > a median of 30% and mean of 42% for the tested workloads. As the trees get > larger the improvement drops, but remains uniformly lower. > h2. Transforming > The garbage overhead of transformations is minimal, i.e. the main allocations are > those necessary to represent the new tree. It is significantly faster and > particularly more efficient when removing elements, utilising the shared > functionali
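The size arithmetic mentioned in the CASSANDRA-15510 description (a tree of depth n holds at most {{branchFactor^n - 1}} elements, with {{branchFactor}} a power of 2) can be illustrated with a minimal sketch. This is not the patch's code: the class name, the method names and the branch factor of 32 are assumptions chosen for illustration only.

```java
// Illustrative sketch only; names and the branch factor are hypothetical,
// not taken from the actual BTree implementation.
final class BTreeSizeSketch
{
    // Assumption: branchFactor = 1 << BRANCH_SHIFT = 32.
    static final int BRANCH_SHIFT = 5;
    static final int BRANCH_FACTOR = 1 << BRANCH_SHIFT;

    /**
     * Maximum number of elements in a tree of the given depth: branchFactor^depth - 1.
     * Because branchFactor is a power of 2, the power reduces to a single shift.
     * Valid for depths 1..12 with a 64-bit long and a shift of 5.
     */
    static long maxCapacity(int depth)
    {
        if (depth < 1 || BRANCH_SHIFT * depth > 62)
            throw new IllegalArgumentException("unsupported depth: " + depth);
        return (1L << (BRANCH_SHIFT * depth)) - 1;
    }

    /** Smallest depth whose maximum capacity can hold the given number of elements. */
    static int minDepth(long size)
    {
        int depth = 1;
        while (maxCapacity(depth) < size)
            depth++;
        return depth;
    }
}
```

With these assumptions a leaf holds at most 31 elements, and a depth-2 tree holds 31 root keys plus 32 leaves of 31 elements each, which is 1023 = 32^2 - 1.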
[jira] [Commented] (CASSANDRA-17370) Add flag enabling operators to restrict use of ALLOW FILTERING in queries
[ https://issues.apache.org/jira/browse/CASSANDRA-17370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526541#comment-17526541 ] Josh McKenzie commented on CASSANDRA-17370: --- +1 here > Add flag enabling operators to restrict use of ALLOW FILTERING in queries > - > > Key: CASSANDRA-17370 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17370 > Project: Cassandra > Issue Type: Improvement > Components: CQL/Semantics, Feature/Guardrails >Reporter: Savni Nagarkar >Assignee: Savni Nagarkar >Priority: Normal > Fix For: 4.x > > Time Spent: 3.5h > Remaining Estimate: 0h > > This ticket adds the ability for operators to disallow use of ALLOW FILTERING > predicates in CQL SELECT statements. As queries that ALLOW FILTERING can > place additional load on the database, the flag enables operators to provide > tighter bounds on performance guarantees. The patch includes a new yaml > property, as well as a hot property enabling the value to be modified via JMX > at runtime.
[jira] [Commented] (CASSANDRA-17543) ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE coordinator=1 flush=false paging=false] times out sporadically
[ https://issues.apache.org/jira/browse/CASSANDRA-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526524#comment-17526524 ] Caleb Rackliffe commented on CASSANDRA-17543: - Go for it. The only failure I see is {{test_oversized_mutation}}, so it's "clean" :) > ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE > coordinator=1 flush=false paging=false] times out sporadically > --- > > Key: CASSANDRA-17543 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17543 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Caleb Rackliffe >Assignee: Andres de la Peña >Priority: Normal > Fix For: 4.x > > Time Spent: 1h 10m > Remaining Estimate: 0h > > org.apache.cassandra.distributed.test.ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: > strategy=NONE coordinator=1 flush=false paging=false] > {noformat} > Error Message > Timeout occurred. Please note the time in the report does not reflect the > time until the timeout. > Stacktrace > junit.framework.AssertionFailedError: Timeout occurred. Please note the time > in the report does not reflect the time until the timeout. 
> at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at org.apache.cassandra.anttasks.TestHelper.execute(TestHelper.java:53) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {noformat} > See > https://ci-cassandra.apache.org/job/Cassandra-trunk/1075/testReport/org.apache.cassandra.distributed.test/ReadRepairQueryTypesTest/testUnrestrictedQueryOnSkinnyTable_8__strategy_NONE_coordinator_1_flush_false_paging_false_/
[jira] [Updated] (CASSANDRA-16555) Add out-of-the-box snitch for Ec2 IMDSv2
[ https://issues.apache.org/jira/browse/CASSANDRA-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan updated CASSANDRA-16555: Test and Documentation Plan: New snitch should be added to the docs. Status: Patch Available (was: Open) Just noticed this PR today. Putting this to patch available, looks like it was never transitioned and slipped through. > Add out-of-the-box snitch for Ec2 IMDSv2 > > > Key: CASSANDRA-16555 > URL: https://issues.apache.org/jira/browse/CASSANDRA-16555 > Project: Cassandra > Issue Type: New Feature > Components: Consistency/Coordination >Reporter: Paul Rütter (BlueConic) >Assignee: fulco taen >Priority: Normal > Time Spent: 10m > Remaining Estimate: 0h > > In order to patch a vulnerability, Amazon came up with a new version of their > metadata service. > It's no longer unrestricted but now requires a token (in a header), in order > to access the metadata service. > See > [https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html] > for more information. > Cassandra currently doesn't offer an out-of-the-box snitch class to support > this. > See > [https://cassandra.apache.org/doc/latest/operating/snitch.html#snitch-classes] > This issue asks to add support for this as a separate snitch class. > We'll probably do a PR for this, as we are in the process of developing one.
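For readers unfamiliar with the token requirement described in CASSANDRA-16555, the two-step IMDSv2 exchange can be sketched as follows. This only builds the two HTTP requests with the JDK 11 `java.net.http` API and is not the snitch from the pending PR; the class and method names and the 2-second timeout are assumptions. The endpoint and header names are those documented by AWS for IMDSv2.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

// Illustrative sketch only; class and method names are hypothetical.
final class Imdsv2RequestSketch
{
    static final String BASE = "http://169.254.169.254";
    static final String TOKEN_TTL_HEADER = "X-aws-ec2-metadata-token-ttl-seconds";
    static final String TOKEN_HEADER = "X-aws-ec2-metadata-token";

    /** Step 1: a PUT that asks the metadata service for a session token. */
    static HttpRequest tokenRequest(int ttlSeconds)
    {
        return HttpRequest.newBuilder(URI.create(BASE + "/latest/api/token"))
                          .timeout(Duration.ofSeconds(2)) // assumed timeout
                          .header(TOKEN_TTL_HEADER, Integer.toString(ttlSeconds))
                          .PUT(HttpRequest.BodyPublishers.noBody())
                          .build();
    }

    /** Step 2: a GET for a metadata path, carrying the token in a header. */
    static HttpRequest metadataRequest(String path, String token)
    {
        return HttpRequest.newBuilder(URI.create(BASE + "/latest/meta-data/" + path))
                          .timeout(Duration.ofSeconds(2)) // assumed timeout
                          .header(TOKEN_HEADER, token)
                          .GET()
                          .build();
    }
}
```

A snitch would presumably send `tokenRequest(...)` with an `HttpClient`, read the response body as the token, and then call `metadataRequest("placement/availability-zone", token)` to learn its zone.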
[jira] [Assigned] (CASSANDRA-16555) Add out-of-the-box snitch for Ec2 IMDSv2
[ https://issues.apache.org/jira/browse/CASSANDRA-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremiah Jordan reassigned CASSANDRA-16555: --- Assignee: fulco taen
[jira] [Updated] (CASSANDRA-17150) Guardrails for disk usage
[ https://issues.apache.org/jira/browse/CASSANDRA-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres de la Peña updated CASSANDRA-17150: -- Fix Version/s: 4.1 (was: 4.x) Source Control Link: https://github.com/apache/cassandra/commit/b3842de5cf1fa1b81872effb4585fbc7e1873d59 Resolution: Fixed Status: Resolved (was: Ready to Commit) > Guardrails for disk usage > - > > Key: CASSANDRA-17150 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17150 > Project: Cassandra > Issue Type: New Feature > Components: Feature/Guardrails >Reporter: Andres de la Peña >Assignee: Andres de la Peña >Priority: Normal > Fix For: 4.1 > > Time Spent: 8h 20m > Remaining Estimate: 0h > > Add guardrails for disk usage establishing soft/hard limits on the percentage > of used disk space. For example: > {code} > # Warning threshold to warn when local disk usage exceeds threshold. Valid > values: (1, 100] > # Defaults to -1 to disable. > # disk_usage_percentage_warn_threshold: -1 > # Failure threshold to reject write requests if replica disk usage exceeds > threshold. Valid values: (1, 100] > # Defaults to -1 to disable. > # disk_usage_percentage_failure_threshold: -1 > {code}
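The soft/hard limit semantics quoted in CASSANDRA-17150 can be read as a three-state check. The sketch below is a hedged illustration, not the committed `DiskUsageMonitor` logic: the class, enum and method names are hypothetical, and it assumes "exceeds" means strictly greater than the threshold and that -1 disables a threshold, as the quoted comments state.

```java
// Illustrative sketch only; not the code committed for CASSANDRA-17150.
final class DiskUsageThresholdSketch
{
    enum State { OK, WARN, FAIL }

    // The quoted configuration uses -1 to disable a threshold.
    static final int DISABLED = -1;

    /**
     * Classifies a disk usage percentage against the warn and failure thresholds.
     * The failure threshold is checked first so that it wins when both are exceeded.
     */
    static State evaluate(double usedPercent, int warnThreshold, int failThreshold)
    {
        if (failThreshold != DISABLED && usedPercent > failThreshold)
            return State.FAIL; // hard limit: writes would be rejected
        if (warnThreshold != DISABLED && usedPercent > warnThreshold)
            return State.WARN; // soft limit: only a warning is raised
        return State.OK;
    }
}
```

For example, with a warn threshold of 80 and a failure threshold of 90, a usage of 85% maps to `WARN` and 95% maps to `FAIL`, while leaving both thresholds at -1 always yields `OK`.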
[jira] [Commented] (CASSANDRA-17150) Guardrails for disk usage
[ https://issues.apache.org/jira/browse/CASSANDRA-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526506#comment-17526506 ] Andres de la Peña commented on CASSANDRA-17150: --- Thanks, committed to {{trunk}} as [b3842de5cf1fa1b81872effb4585fbc7e1873d59|https://github.com/apache/cassandra/commit/b3842de5cf1fa1b81872effb4585fbc7e1873d59].
[cassandra] branch trunk updated: Add guardrail for data disk usage
This is an automated email from the ASF dual-hosted git repository. adelapena pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git The following commit(s) were added to refs/heads/trunk by this push: new b3842de5cf Add guardrail for data disk usage b3842de5cf is described below commit b3842de5cf1fa1b81872effb4585fbc7e1873d59 Author: Andrés de la Peña AuthorDate: Fri Apr 22 16:36:07 2022 +0100 Add guardrail for data disk usage patch by Andrés de la Peña; reviewed by Ekaterina Dimitrova and Stefan Miklosovic for CASSANDRA-17150 Co-authored-by: Andrés de la Peña Co-authored-by: Zhao Yang Co-authored-by: Eduard Tudenhoefner --- CHANGES.txt| 1 + NEWS.txt | 28 + conf/cassandra.yaml| 24 +- .../config/CassandraRelevantProperties.java| 8 + src/java/org/apache/cassandra/config/Config.java | 49 +- .../apache/cassandra/config/DataStorageSpec.java | 13 +- .../apache/cassandra/config/GuardrailsOptions.java | 121 +++- .../org/apache/cassandra/cql3/QueryOptions.java| 9 +- .../cassandra/cql3/selection/ResultSetBuilder.java | 5 +- .../cassandra/cql3/statements/BatchStatement.java | 8 +- .../cql3/statements/ModificationStatement.java | 25 + src/java/org/apache/cassandra/db/Directories.java | 5 + src/java/org/apache/cassandra/db/ReadCommand.java | 8 +- .../apache/cassandra/db/guardrails/Guardrail.java | 92 ++- .../apache/cassandra/db/guardrails/Guardrails.java | 112 +++- .../cassandra/db/guardrails/GuardrailsConfig.java | 25 +- .../cassandra/db/guardrails/GuardrailsMBean.java | 61 +- .../db/guardrails/PercentageThreshold.java | 56 ++ .../apache/cassandra/db/guardrails/Predicates.java | 93 .../apache/cassandra/db/guardrails/Threshold.java | 20 +- .../org/apache/cassandra/gms/ApplicationState.java | 1 + .../org/apache/cassandra/gms/VersionedValue.java | 5 + .../cassandra/io/sstable/format/SSTableWriter.java | 2 +- .../apache/cassandra/service/StorageService.java | 2 + .../service/disk/usage/DiskUsageBroadcaster.java | 181 ++ 
.../service/disk/usage/DiskUsageMonitor.java | 233 .../service/disk/usage/DiskUsageState.java | 70 +++ .../test/guardrails/GuardrailDiskUsageTest.java| 225 .../cassandra/config/DataStorageSpecTest.java | 29 +- .../db/guardrails/GuardrailCollectionSizeTest.java | 10 +- .../db/guardrails/GuardrailDiskUsageTest.java | 617 + .../cassandra/db/guardrails/GuardrailTester.java | 10 + .../cassandra/db/guardrails/GuardrailsTest.java| 46 ++ .../cassandra/db/guardrails/ThresholdTester.java | 28 +- .../cassandra/db/virtual/GossipInfoTableTest.java | 3 +- 35 files changed, 2125 insertions(+), 100 deletions(-) diff --git a/CHANGES.txt b/CHANGES.txt index a1213090e2..9e9e1ee2f1 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.1 + * Add guardrail for data disk usage (CASSANDRA-17150) * Tool to list data paths of existing tables (CASSANDRA-17568) * Migrate track_warnings to more standard naming conventions and use latest configuration types rather than long (CASSANDRA-17560) * Add support for CONTAINS and CONTAINS KEY in conditional UPDATE and DELETE statement (CASSANDRA-10537) diff --git a/NEWS.txt b/NEWS.txt index a891eb3a9a..fd31e06c93 100644 --- a/NEWS.txt +++ b/NEWS.txt @@ -56,6 +56,34 @@ using the provided 'sstableupgrade' tool. New features +- Added a new guardrails framework allowing to define soft/hard limits for different user actions, such as limiting + the number of tables, columns per table or the size of collections. These guardrails are only applied to regular + user queries, and superusers and internal queries are excluded. Reaching the soft limit raises a client warning, + whereas reaching the hard limit aborts the query. In both cases a log message and a diagnostic event are emitted. + Additionally, some guardrails are not linked to specific user queries due to techincal limitations, such as + detecting the size of large collections during compaction or periodically monitoring the disk usage. 
These + guardrails would only emit the proper logs and diagnostic events when triggered, without aborting any processes. + Guardrails config is defined through cassandra.yaml properties, and they can be dynamically updated through the + JMX MBean `org.apache.cassandra.db:type=Guardrails`. There are guardrails for: +- Number of user keyspaces. +- Number of user tables. +- Number of columns per table. +- Number of secondary indexes per table. +- Number of materialized tables per table. +- Number of fields per user-defined type. +- Number of items in a collection . +- Num
[jira] [Updated] (CASSANDRA-17557) Fix a few config parameters after the Paxos improvements commit
[ https://issues.apache.org/jira/browse/CASSANDRA-17557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ekaterina Dimitrova updated CASSANDRA-17557: Reviewers: Benedict Elliott Smith, Ekaterina Dimitrova (was: Benedict Elliott Smith) Status: Review In Progress (was: Patch Available) > Fix a few config parameters after the Paxos improvements commit > --- > > Key: CASSANDRA-17557 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17557 > Project: Cassandra > Issue Type: Bug > Components: Local/Config >Reporter: Ekaterina Dimitrova >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.x > > > After committing the Paxos improvements, it was identified that the following > configuration parameters need additional work: > * repair_request_timeout_in_ms - can be removed > * paxos_auto_repair_threshold_mb - I think it can be also removed; to be > confirmed with the author > Discussed a bit in Slack and on this PR - > https://github.com/apache/cassandra/commit/d2923275e360a1ee9db498e748c269f701bb3a8b
[jira] [Updated] (CASSANDRA-17557) Fix a few config parameters after the Paxos improvements commit
[ https://issues.apache.org/jira/browse/CASSANDRA-17557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ekaterina Dimitrova updated CASSANDRA-17557: Reviewers: Benedict Elliott Smith (was: Benedict Elliott Smith, Ekaterina Dimitrova)
[jira] [Commented] (CASSANDRA-17557) Fix a few config parameters after the Paxos improvements commit
[ https://issues.apache.org/jira/browse/CASSANDRA-17557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526504#comment-17526504 ] Ekaterina Dimitrova commented on CASSANDRA-17557: - Thanks, I rebased your branch and pushed a new CI run: [J8|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/1566/workflows/393c8917-1d44-41be-afec-8ab6a97b7ede], [J11|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/1566/workflows/23ff50bf-a946-4097-b6d6-d624983c932c] - pending results, just started it. From a config perspective it looks good. For the Verb class change, I guess someone more familiar with your latest work than me should say. [~barnie] or [~ifesdjeen] maybe?
[jira] [Commented] (CASSANDRA-17150) Guardrails for disk usage
[ https://issues.apache.org/jira/browse/CASSANDRA-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526485#comment-17526485 ] Ekaterina Dimitrova commented on CASSANDRA-17150: - Thanks [~adelapena], I see in CI only known old failures that have respective tickets. I think we are ready to commit it. :)
[jira] [Updated] (CASSANDRA-17150) Guardrails for disk usage
[ https://issues.apache.org/jira/browse/CASSANDRA-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ekaterina Dimitrova updated CASSANDRA-17150: Status: Ready to Commit (was: Review In Progress)
[jira] [Created] (CASSANDRA-17574) Throw exception on wrong config boundaries
Ekaterina Dimitrova created CASSANDRA-17574: --- Summary: Throw exception on wrong config boundaries Key: CASSANDRA-17574 URL: https://issues.apache.org/jira/browse/CASSANDRA-17574 Project: Cassandra Issue Type: Bug Reporter: Ekaterina Dimitrova While working on CASSANDRA-15234 we noticed negative values being used where they are not supposed to be. We fixed that for the parameters in scope (the duration, data storage and data rate types), but as [~brandon.williams] pointed out, there are other examples in the rest of the config that would be good to fix too. This ticket should handle: - checking the rest of the parameters and rejecting negative values wherever they shouldn't be allowed - ensuring that whatever validations we apply to parameters during startup (check the DatabaseDescriptor) are also applied in the respective setters for those parameters.
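The second point of CASSANDRA-17574 (the same validation at startup and in setters) can be sketched by routing both paths through one validator. This is an assumption-laden illustration rather than Cassandra's `DatabaseDescriptor` code: the class name, the `example_limit` parameter and the exception type are hypothetical.

```java
// Illustrative sketch only; the parameter and class names are hypothetical.
final class ConfigBoundsSketch
{
    private volatile int exampleLimit;

    ConfigBoundsSketch(int initialValue)
    {
        // Startup path: validate the value loaded from the configuration file.
        this.exampleLimit = requireNonNegative("example_limit", initialValue);
    }

    void setExampleLimit(int newValue)
    {
        // Runtime path (for instance a JMX setter): reuse the same validator,
        // so a value rejected at startup is also rejected here.
        this.exampleLimit = requireNonNegative("example_limit", newValue);
    }

    int getExampleLimit()
    {
        return exampleLimit;
    }

    static int requireNonNegative(String name, int value)
    {
        if (value < 0)
            throw new IllegalArgumentException(name + " must be non-negative, but was " + value);
        return value;
    }
}
```

Because the field is only assigned after validation succeeds, a rejected setter call leaves the previous value in place.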
[jira] [Updated] (CASSANDRA-17574) Throw exception on wrong config boundaries
[ https://issues.apache.org/jira/browse/CASSANDRA-17574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ekaterina Dimitrova updated CASSANDRA-17574: Bug Category: Parent values: Correctness(12982) Complexity: Low Hanging Fruit Component/s: Local/Config Discovered By: User Report Severity: Normal Status: Open (was: Triage Needed)
[jira] [Updated] (CASSANDRA-17574) Throw exception on wrong config boundaries
[ https://issues.apache.org/jira/browse/CASSANDRA-17574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ekaterina Dimitrova updated CASSANDRA-17574: Fix Version/s: 4.x
[jira] [Updated] (CASSANDRA-17329) Fix failing test - dtest-upgrade.upgrade_internal_auth_test.TestAuthUpgrade.test_upgrade_legacy_tabl
[ https://issues.apache.org/jira/browse/CASSANDRA-17329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-17329: - Fix Version/s: 3.11.x > Fix failing test - > dtest-upgrade.upgrade_internal_auth_test.TestAuthUpgrade.test_upgrade_legacy_tabl > > > Key: CASSANDRA-17329 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17329 > Project: Cassandra > Issue Type: Bug > Components: Feature/Authorization >Reporter: Brandon Williams >Assignee: Brandon Williams >Priority: Normal > Fix For: 3.0.x, 3.11.x > > > Failed 6 times in the last 16 runs. Flakiness: 60%, Stability: 62% > Error Message > ccmlib.node.TimeoutError: 26 Jan 2022 22:48:51 [node1] after 120.17/120 > seconds Missing: ['Listening for thrift clients...'] not found in system.log: > Head: INFO [main] 2022-01-26 22:46:51,840 YamlConfigura Tail: ...ing > legacy permissions data INFO [OptionalTasks:1] 2022-01-26 22:47:08,405 > CassandraAuthorizer.java:444 - Completed conversion of legacy permissions -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data
[ https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526470#comment-17526470 ] Brandon Williams commented on CASSANDRA-17180: -- Data dir in the system keyspace makes the most sense to me. The system ks is generally not backed up/restored since that's a bad idea, so if the file is there it won't accidentally be restored and cause a problem. > Implement startup check to prevent Cassandra start to spread zombie data > > > Key: CASSANDRA-17180 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17180 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Observability >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Time Spent: 9.5h > Remaining Estimate: 0h > > As already discussed on ML, it would be nice to have a service which would > periodically write timestamp to a file signalling it is up / running. > Then, on the startup, we would read this file and we would determine if there > is some table which gc grace is behind this time and we would fail the start > so we would prevent zombie data to be likely spread around a cluster. > https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw
[jira] [Assigned] (CASSANDRA-17566) Fix flaky test - org.apache.cassandra.distributed.test.repair.ForceRepairTest.force
[ https://issues.apache.org/jira/browse/CASSANDRA-17566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams reassigned CASSANDRA-17566: Assignee: Brandon Williams > Fix flaky test - > org.apache.cassandra.distributed.test.repair.ForceRepairTest.force > --- > > Key: CASSANDRA-17566 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17566 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Brandon Williams >Assignee: Brandon Williams >Priority: Normal > Fix For: 4.x > > > Seen on jenkins here: > [https://ci-cassandra.apache.org/job/Cassandra-trunk/1083/testReport/org.apache.cassandra.distributed.test.repair/ForceRepairTest/force_2/] > > and circle here: > https://app.circleci.com/pipelines/github/driftx/cassandra/440/workflows/42f936c7-2ede-4fbf-957c-5fb4e461dd90/jobs/5160/tests#failed-test-1 > {noformat} > junit.framework.AssertionFailedError: nodetool command [repair, > distributed_test_keyspace, --force, --full] was not successful > stdout: > [2022-04-20 15:11:01,402] Starting repair command #2 > (1701a090-c0bc-11ec-9898-07c796ce6a49), repairing keyspace > distributed_test_keyspace with repair options (parallelism: parallel, primary > range: false, incremental: false, job threads: 1, ColumnFamilies: [], > dataCenters: [], hosts: [], previewKind: NONE, # of ranges: 3, pull repair: > false, force repair: true, optimise streams: false, ignore unreplicated > keyspaces: false, repairPaxos: true, paxosOnly: false) > [2022-04-20 15:11:11,406] Repair command #2 failed with error Did not get > replies from all endpoints. > [2022-04-20 15:11:11,408] Repair command #2 finished with error > stderr: > error: Repair job has failed with the error message: Repair command #2 failed > with error Did not get replies from all endpoints.. 
Check the logs on the > repair participants for further details > -- StackTrace -- > java.lang.RuntimeException: Repair job has failed with the error message: > Repair command #2 failed with error Did not get replies from all endpoints.. > Check the logs on the repair participants for further details > at > org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:137) > at > org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77) > at > javax.management.NotificationBroadcasterSupport.handleNotification(NotificationBroadcasterSupport.java:275) > at > javax.management.NotificationBroadcasterSupport$SendNotifJob.run(NotificationBroadcasterSupport.java:352) > at > org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:124) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.lang.Thread.run(Thread.java:748) > {noformat} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data
[ https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526467#comment-17526467 ] Stefan Miklosovic commented on CASSANDRA-17180: --- I think we really need to place it in a data dir, because /tmp is not durable enough and other Cassandra dirs might not be writable. The only location that is both durable and writable is the data dir. > Implement startup check to prevent Cassandra start to spread zombie data > > > Key: CASSANDRA-17180 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17180 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Observability >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Time Spent: 9.5h > Remaining Estimate: 0h > > As already discussed on ML, it would be nice to have a service which would > periodically write timestamp to a file signalling it is up / running. > Then, on the startup, we would read this file and we would determine if there > is some table which gc grace is behind this time and we would fail the start > so we would prevent zombie data to be likely spread around a cluster. > https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw
[jira] [Commented] (CASSANDRA-17551) Allow 0 to be used in collection_size guardrails in order to prohibit collections
[ https://issues.apache.org/jira/browse/CASSANDRA-17551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526465#comment-17526465 ] Ekaterina Dimitrova commented on CASSANDRA-17551: - We had further discussion with [~adelapena] and he has a good point that we already have feature flags like "materialized_views_enabled" so disabling collections with 0 will diverge from our current approach. This will require broader discussion and consideration. Not doing this patch at the moment. We can start discussion and more work for the next release, now we have only a week until freeze so it will be a rush. Moving the ticket back to open and marking it 5.x > Allow 0 to be used in collection_size guardrails in order to prohibit > collections > - > > Key: CASSANDRA-17551 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17551 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails >Reporter: Ekaterina Dimitrova >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 4.x > > > Allow 0 to be used in collection_size guardrails in order to prohibit > collections -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-17551) Allow 0 to be used in collection_size guardrails in order to prohibit collections
[ https://issues.apache.org/jira/browse/CASSANDRA-17551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ekaterina Dimitrova updated CASSANDRA-17551: Fix Version/s: 5.x (was: 4.x) > Allow 0 to be used in collection_size guardrails in order to prohibit > collections > - > > Key: CASSANDRA-17551 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17551 > Project: Cassandra > Issue Type: Improvement > Components: Feature/Guardrails >Reporter: Ekaterina Dimitrova >Assignee: Ekaterina Dimitrova >Priority: Normal > Fix For: 5.x > > > Allow 0 to be used in collection_size guardrails in order to prohibit > collections -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-17456) Test Failures: write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation
[ https://issues.apache.org/jira/browse/CASSANDRA-17456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksandr Sorokoumov updated CASSANDRA-17456: - Test and Documentation Plan: I made the existing dtest applicable to C* versions until 4.0.x and added an in-jvm dtest to cover rejection of oversized mutations on insert. Status: Patch Available (was: In Progress) As Benedict suggested, I moved the mutation size check from CommitLog to the client and internode connections. Patches: * [17456-trunk|https://github.com/apache/cassandra/compare/trunk...Ge:17456-trunk?expand=1] * [dtest|https://github.com/apache/cassandra-dtest/pull/186] [Jenkins CI run|https://ci-cassandra.apache.org/job/Cassandra-devbranch/1626/#showFailuresLink] > Test Failures: > write_failures_test.TestMultiDCWriteFailures.test_oversized_mutation > --- > > Key: CASSANDRA-17456 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17456 > Project: Cassandra > Issue Type: Bug > Components: CI >Reporter: Ekaterina Dimitrova >Assignee: Aleksandr Sorokoumov >Priority: Normal > Fix For: 4.x > > > https://ci-cassandra.apache.org/job/Cassandra-trunk/1002/testReport/dtest-offheap.write_failures_test/TestMultiDCWriteFailures/test_oversized_mutation/ > {code:java} > Error Message > AssertionError: assert 0 == 8 + where 8 = JolokiaAgent.read_attribute of 0x7f1fca78dac0>>('org.apache.cassandra.metrics:type=Storage,name=TotalHints', > 'Count') +where > = > .read_attribute + > and 'org.apache.cassandra.metrics:type=Storage,name=TotalHints' = > make_mbean('metrics', type='Storage', name='TotalHints') > Stacktrace > self = > def test_oversized_mutation(self): > """ > Test that multi-DC write failures return operation failed rather > than a timeout. > @jira_ticket CASSANDRA-16334. 
> """ > > cluster = self.cluster > cluster.populate([2, 2]) > cluster.set_configuration_options(values={'max_mutation_size_in_kb': > 128}) > cluster.start() > > node1 = cluster.nodelist()[0] > session = self.patient_exclusive_cql_connection(node1) > > session.execute("CREATE KEYSPACE k WITH replication = {'class': > 'NetworkTopologyStrategy', 'dc1': 2, 'dc2': 2}") > session.execute("CREATE TABLE k.t (key int PRIMARY KEY, val blob)") > > payload = '1' * 1024 * 256 > query = "INSERT INTO k.t (key, val) VALUES (1, > textAsBlob('{}'))".format(payload) > > assert_write_failure(session, query, ConsistencyLevel.LOCAL_ONE) > assert_write_failure(session, query, ConsistencyLevel.ONE) > > # verify that no hints are created > with JolokiaAgent(node1) as jmx: > > assert 0 == jmx.read_attribute(make_mbean('metrics', > > type='Storage', name='TotalHints'), 'Count') > E AssertionError: assert 0 == 8 > E+ where 8 = 0x7f1fca78dac0>>('org.apache.cassandra.metrics:type=Storage,name=TotalHints', > 'Count') > E+where > = > .read_attribute > E+and > 'org.apache.cassandra.metrics:type=Storage,name=TotalHints' = > make_mbean('metrics', type='Storage', name='TotalHints') > write_failures_test.py:277: AssertionError > REST API > CloudBees CI Client Controller 2.319.3.4-rolling > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
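The sizes in the quoted test can be checked with a little arithmetic. The sketch below only reproduces the numbers from the test (a 128 KB limit and a payload of 1024 * 256 characters); the helper names are hypothetical, and a real mutation is somewhat larger than its payload because of serialization overhead, which this sketch ignores.

```python
MAX_MUTATION_SIZE_IN_KB = 128  # value the quoted test passes to set_configuration_options


def payload_size_bytes(payload: str) -> int:
    """Size of the payload once encoded; '1' is a single byte in UTF-8."""
    return len(payload.encode("utf-8"))


def exceeds_limit(size_bytes: int, max_size_in_kb: int) -> bool:
    """True when the payload alone is already larger than the configured limit."""
    return size_bytes > max_size_in_kb * 1024


# Same expression as in the quoted test: 262144 characters, i.e. 256 KiB.
payload = "1" * 1024 * 256
oversized = exceeds_limit(payload_size_bytes(payload), MAX_MUTATION_SIZE_IN_KB)
```

The payload is twice the configured limit, so the insert is expected to be rejected regardless of the exact per-mutation overhead.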
[jira] [Commented] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data
[ https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526463#comment-17526463 ] Stefan Miklosovic commented on CASSANDRA-17180: --- Yes, I realised that /tmp/ problem just now ... ahh. > Can't we parse it like any other JSON file? Ah right, I know what you mean. So the two outstanding questions are the default location of this file and whether it should be enabled by default. > Implement startup check to prevent Cassandra start to spread zombie data > > > Key: CASSANDRA-17180 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17180 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Observability >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Time Spent: 9.5h > Remaining Estimate: 0h > > As already discussed on ML, it would be nice to have a service which would > periodically write timestamp to a file signalling it is up / running. > Then, on the startup, we would read this file and we would determine if there > is some table which gc grace is behind this time and we would fail the start > so we would prevent zombie data to be likely spread around a cluster. > https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw
[jira] [Comment Edited] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data
[ https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526452#comment-17526452 ] Stefan Miklosovic edited comment on CASSANDRA-17180 at 4/22/22 2:18 PM: [~paulo] thanks for finally looking into it, I'll deal with it over the weekend to finally move this over the line. I had implemented something similar to your postActions idea but Brandon's opinion was that we are just inventing something else here. But I see you moved that "execute post actions loop" after all checks are verified in CassandraDaemon instead of having it in StartupChecks.verify directly. I am fine with your take on that, is Brandon too? Good to know this is going to check system_distributed and system_auth too. As for the default place of the heartbeat file, that's a good point. Maybe we should go a little bit wild here and we might save it to /tmp/ ? I think that has the best guarantee of being writable. I do not like the fact that there is suddenly some file in the area for sstables / tables. Other existing software might have a problem with this. For example, when you are backing up, would you need to exclude or include that file? It depends on how people look at these backups etc. For that reason I would place it somewhere else. But if we place it in /tmp and you have more than one node running on the same machine, there will be a clash, as two nodes would write to the same file {_}by default{_}. In that case we would have to make the file name unique, e.g. by including the node's id. What is your take on this? Yes, we can rename that class. I do not mind to start to write JSON into that file, but ... how do you want to parse that file? I still need to read it / check it and so on. What would you like to replace all that logic with? EDIT: I will think more about the consequences of making this enabled by default. That is a simple thing to change at the end of this work anyway and might be done whenever we want. EDIT 2: writing to /tmp/ is quite a bad idea because it tends to be wiped out on restarts.
> Implement startup check to prevent Cassandra start to spread zombie data > > > Key: CASSANDRA-17180 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17180 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Observability >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Time Spent: 9.5h > Remaining Estimate: 0h > > As already discussed on ML, it would be nice to have a service which would > periodically write timestamp to a file signalling it is up / running. > Then, on the startup, we would read this file and we would determine if there > is some table which gc grace is behind this time and we would fail the start > so we would prevent zombie data to be likely spread around a cluster. > https://lists.apache.org/thread/w4w5t2hlcrv
[jira] [Commented] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data
[ https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526459#comment-17526459 ] Brandon Williams commented on CASSANDRA-17180: -- bq. I am fine with your take on that, is Brandon too? I'm fine with the majority here. bq. Maybe we should go a little bit wild here and we might save it to /tmp/ ? That sounds unworkable to me. There's no guarantee of durability until the next startup, people override tmpdir, etc. bq. I do not mind to start to write JSON into that file, but ... how do you want to parse that file? Can't we parse it like any other JSON file? > Implement startup check to prevent Cassandra start to spread zombie data > > > Key: CASSANDRA-17180 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17180 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Observability >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Time Spent: 9.5h > Remaining Estimate: 0h > > As already discussed on ML, it would be nice to have a service which would > periodically write timestamp to a file signalling it is up / running. > Then, on the startup, we would read this file and we would determine if there > is some table which gc grace is behind this time and we would fail the start > so we would prevent zombie data to be likely spread around a cluster. > https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
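To make the discussion above concrete, here is a minimal sketch of a JSON heartbeat file and the startup check described in the ticket. It is an illustration under stated assumptions, not the patch under review: the field name `last_heartbeat_ms`, the function names and the file layout are all hypothetical, and Python is used only for brevity.

```python
import json
from pathlib import Path
from typing import Dict, List, Optional


def write_heartbeat(path: Path, now_ms: int) -> None:
    """Write the timestamp to a temp file, then rename it over the target.

    The rename keeps readers from ever seeing a half-written file.
    """
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps({"last_heartbeat_ms": now_ms}))
    tmp.replace(path)


def read_heartbeat(path: Path) -> Optional[int]:
    """Parse the file like any other JSON file; None means no heartbeat yet."""
    if not path.exists():
        return None
    return int(json.loads(path.read_text())["last_heartbeat_ms"])


def tables_past_gc_grace(
    last_heartbeat_ms: int, now_ms: int, gc_grace_seconds: Dict[str, int]
) -> List[str]:
    """Return the tables whose gc grace period is shorter than the downtime."""
    downtime_s = (now_ms - last_heartbeat_ms) / 1000
    return sorted(t for t, gc in gc_grace_seconds.items() if downtime_s > gc)
```

A startup check built on this would refuse to start when `tables_past_gc_grace` returns a non-empty list, since data deleted elsewhere during the downtime could otherwise be resurrected by this node.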
[jira] [Comment Edited] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data
[ https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526452#comment-17526452 ] Stefan Miklosovic edited comment on CASSANDRA-17180 at 4/22/22 2:11 PM: [~paulo] thanks for finally looking into it, I'll deal with it over the weekend to finally move this over the line. I had implemented something similar to your postActions idea but Brandon's opinion was that we are just inventing something else here. But I see you moved that "execute post actions loop" after all checks are verified in CassandraDaemon instead of having it in StartupChecks.verify directly. I am fine with your take on that, is Brandon too? Good to know this is going to check system_distributed and system_auth too. As for the default place of the heartbeat file, that's a good point. Maybe we should go a little bit wild here and we might save it to /tmp/ ? I think that has the best guarantee of being writable. I do not like the fact that there is suddenly some file in the area for sstables / tables. Other existing software might have a problem with this. For example, when you are backing up, would you need to exclude or include that file? It depends on how people look at these backups etc. For that reason I would place it somewhere else. But if we place it in /tmp and you have more than one node running on the same machine, there will be a clash, as two nodes would write to the same file {_}by default{_}. In that case we would have to make the file name unique, e.g. by including the node's id. What is your take on this? Yes, we can rename that class. I do not mind to start to write JSON into that file, but ... how do you want to parse that file? I still need to read it / check it and so on. What would you like to replace all that logic with? EDIT: I will think more about the consequences of making this enabled by default. That is a simple thing to change at the end of this work anyway and might be done whenever we want.
> Implement startup check to prevent Cassandra start to spread zombie data > > > Key: CASSANDRA-17180 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17180 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Observability >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Time Spent: 9.5h > Remaining Estimate: 0h > > As already discussed on ML, it would be nice to have a service which would > periodically write timestamp to a file signalling it is up / running. > Then, on the startup, we would read this file and we would determine if there > is some table which gc grace is behind this time and we would fail the start > so we would prevent zombie data to be likely spread around a cluster. > https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassa
[jira] [Commented] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data
[ https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526452#comment-17526452 ] Stefan Miklosovic commented on CASSANDRA-17180: --- [~paulo] thanks for finally looking into it, I'll deal with it over the weekend to finally move this over the line. I had implemented something similar to your postActions idea but Brandon's opinion was that we are just inventing something else here. But I see you moved that "execute post actions loop" after all checks are verified in CassandraDaemon instead of having it in StartupChecks.verify directly. I am fine with your take on that, is Brandon too? Good to know this is going to check system_distributed and system_auth too. As for the default place of the heartbeat file, that's a good point. Maybe we should go a little bit wild here and we might save it to /tmp/ ? I think that has the best guarantee of being writable. I do not like the fact that there is suddenly some file in the area for sstables / tables. Other existing software might have a problem with this. For example, when you are backing up, would you need to exclude or include that file? It depends on how people look at these backups etc. For that reason I would place it somewhere else. But if we place it in /tmp and you have more than one node running on the same machine, there will be a clash, as two nodes would write to the same file {_}by default{_}. In that case we would have to make the file name unique, e.g. by including the node's id. What is your take on this? Yes, we can rename that class. I do not mind to start to write JSON into that file, but ... how do you want to parse that file? I still need to read it / check it and so on. What would you like to replace all that logic with?
> Implement startup check to prevent Cassandra start to spread zombie data > > > Key: CASSANDRA-17180 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17180 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Observability >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Time Spent: 9.5h > Remaining Estimate: 0h > > As already discussed on ML, it would be nice to have a service which would > periodically write timestamp to a file signalling it is up / running. > Then, on the startup, we would read this file and we would determine if there > is some table which gc grace is behind this time and we would fail the start > so we would prevent zombie data to be likely spread around a cluster. > https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-17150) Guardrails for disk usage
[ https://issues.apache.org/jira/browse/CASSANDRA-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526317#comment-17526317 ] Andres de la Peña edited comment on CASSANDRA-17150 at 4/22/22 2:05 PM: [~e.dimitrova] thanks for the review. I think I have addressed the last bits. I'm running CI after rebase+squash: ||PR||CI|| |[trunk|https://github.com/apache/cassandra/pull/1546]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1494/workflows/d032178d-f8a9-4124-b36f-5bf6f47b3116] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1494/workflows/bc844580-6f3a-4bc3-a4d0-d85f082330f8]| Please note that during the rebase I have replaced a few references to the removed {{Config.DISABLED_GUARDRAIL}} constant by {{-1}}. Those references were recently added to track warnings during CASSANDRA-17560. As it's mentioned [here|https://github.com/apache/cassandra/pull/1572#discussion_r854251196], using {{-1}} as the disabled value is a global config convention and not a guardrails thing, so we should either use it directly or define a new constant with a more generic name. If we decide to do the latter, I'd prefer to do it in a separate ticket, so we can focus on locating all the usages around. > Guardrails for disk usage > - > > Key: CASSANDRA-17150 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17150 > Project: Cassandra > Issue Type: New Feature > Components: Feature/Guardrails >Reporter: Andres de la Peña >Assignee: Andres de la Peña >Priority: Normal > Fix For: 4.x > > Time Spent: 8h 20m > Remaining Estimate: 0h > > Add guardrails for disk usage establishing soft/hard limits on the percentage > of used disk space. For example: > {code} > # Warning threshold to warn when local disk usage exceeds threshold. Valid > values: (1, 100] > # Defaults to -1 to disable. > # disk_usage_percentage_warn_threshold: -1 > # Failure threshold to reject write requests if replica disk usage exceeds > threshold. Valid values: (1, 100] > # Defaults to -1 to disable. > # disk_usage_percentage_failure_threshold: -1 > {code}
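The soft/hard limits in the ticket description, together with the {{-1}} "disabled" convention mentioned in the comment, can be sketched as follows. This is an illustration only: the function names are hypothetical, and the accepted range is assumed here to be 1 to 100 inclusive, which may differ from the exact bounds the patch enforces.

```python
DISABLED = -1  # convention discussed above: -1 turns a threshold off


def validate_threshold(name: str, value: int) -> int:
    """Accept -1 (disabled) or a percentage; the 1..100 range is an assumption."""
    if value == DISABLED:
        return value
    if not 1 <= value <= 100:
        raise ValueError(f"{name} must be -1 or within 1..100, got {value}")
    return value


def check_disk_usage(used_percent: float, warn: int, fail: int) -> str:
    """Return 'fail', 'warn' or 'ok' for the given disk usage percentage."""
    if fail != DISABLED and used_percent > fail:
        return "fail"  # hard limit: writes would be rejected
    if warn != DISABLED and used_percent > warn:
        return "warn"  # soft limit: only a warning is emitted
    return "ok"
```

With both thresholds left at their default of -1, `check_disk_usage` always returns "ok", which matches the "Defaults to -1 to disable" comments in the quoted yaml.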
[jira] [Commented] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data
[ https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526440#comment-17526440 ] Brandon Williams commented on CASSANDRA-17180: -- That sounds like a great idea to me. > Implement startup check to prevent Cassandra start to spread zombie data > > > Key: CASSANDRA-17180 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17180 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Observability >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Time Spent: 9.5h > Remaining Estimate: 0h > > As already discussed on ML, it would be nice to have a service which would > periodically write timestamp to a file signalling it is up / running. > Then, on the startup, we would read this file and we would determine if there > is some table which gc grace is behind this time and we would fail the start > so we would prevent zombie data to be likely spread around a cluster. > https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-17370) Add flag enabling operators to restrict use of ALLOW FILTERING in queries
[ https://issues.apache.org/jira/browse/CASSANDRA-17370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526435#comment-17526435 ] Andres de la Peña commented on CASSANDRA-17370: --- [~jmckenzie] [~dcapwell] are we ready to commit this before the freeze? > Add flag enabling operators to restrict use of ALLOW FILTERING in queries > - > > Key: CASSANDRA-17370 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17370 > Project: Cassandra > Issue Type: Improvement > Components: CQL/Semantics, Feature/Guardrails >Reporter: Savni Nagarkar >Assignee: Savni Nagarkar >Priority: Normal > Fix For: 4.x > > Time Spent: 3.5h > Remaining Estimate: 0h > > This ticket adds the ability for operators to disallow use of ALLOW FILTERING > predicates in CQL SELECT statements. As queries that ALLOW FILTERING can > place additional load on the database, the flag enables operators to provide > tighter bounds on performance guarantees. The patch includes a new yaml > property, as well as a hot property enabling the value to be modified via JMX > at runtime. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-17180) Implement startup check to prevent Cassandra start to spread zombie data
[ https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526431#comment-17526431 ] Paulo Motta commented on CASSANDRA-17180: - {quote}Can we just use File.setLastModified and File.lastModified to read/write the heartbeat instead? {quote} Alternatively, we can just write a JSON file similar to the snapshot manifest, since we can use existing JSON utilities to read/write the heartbeat file without needing to implement a custom parser. Something like this: {noformat} {"last_heartbeat": "2022-04-22T13:33:41Z"} {noformat} We could later augment this JSON with more info if the need arises. WDYT? > Implement startup check to prevent Cassandra start to spread zombie data > > > Key: CASSANDRA-17180 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17180 > Project: Cassandra > Issue Type: New Feature > Components: Legacy/Observability >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Time Spent: 9.5h > Remaining Estimate: 0h > > As already discussed on ML, it would be nice to have a service which would > periodically write timestamp to a file signalling it is up / running. > Then, on the startup, we would read this file and we would determine if there > is some table which gc grace is behind this time and we would fail the start > so we would prevent zombie data to be likely spread around a cluster. > https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
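The heartbeat idea discussed for CASSANDRA-17180 could look roughly like the sketch below. It is written in Python for brevity and is not the project's implementation. It assumes the single-key JSON layout from the comment above; the function names and the table-to-gc-grace mapping are hypothetical.

```python
import json
from datetime import datetime, timezone

# Timestamp layout matching the example {"last_heartbeat": "2022-04-22T13:33:41Z"}
HEARTBEAT_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def write_heartbeat(path, now):
    """Write the heartbeat file; `now` must be a timezone-aware datetime."""
    stamp = now.astimezone(timezone.utc).strftime(HEARTBEAT_FORMAT)
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"last_heartbeat": stamp}, f)


def read_heartbeat(path):
    """Read the heartbeat file and return the timestamp as an aware UTC datetime."""
    with open(path, encoding="utf-8") as f:
        raw = json.load(f)["last_heartbeat"]
    return datetime.strptime(raw, HEARTBEAT_FORMAT).replace(tzinfo=timezone.utc)


def tables_past_gc_grace(last_heartbeat, now, gc_grace_seconds_by_table):
    """Return the tables whose gc grace period is shorter than the downtime.

    A non-empty result is the condition under which the proposed startup
    check would refuse to start the node.
    """
    downtime = (now - last_heartbeat).total_seconds()
    return sorted(
        table
        for table, gc_grace in gc_grace_seconds_by_table.items()
        if downtime > gc_grace
    )
```

Using a JSON document rather than the file's modification time keeps the format extensible, which is the point made in the comment: further keys can be added later without changing the reader.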
[jira] [Commented] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables
[ https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526413#comment-17526413 ] Tibor Repasi commented on CASSANDRA-17568: -- Thank you for the commitment, the reviews and the productive feedback. Glad to see that coming in 4.1. > Implement nodetool command to list data directories of existing tables > -- > > Key: CASSANDRA-17568 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17568 > Project: Cassandra > Issue Type: New Feature > Components: Tool/nodetool >Reporter: Tibor Repasi >Assignee: Tibor Repasi >Priority: Normal > Fix For: 4.1 > > Time Spent: 10h > Remaining Estimate: 0h > > When a table is created, dropped and re-created with the same name, > directories remain within data paths. Operators may be challenged finding out > which directories belong to existing tables and which may be subject to > removal. However, the information is available in CQL as well as in MBeans > via JMX, a convenient access to this information is still missing. > My proposal is a new nodetool subcommand allowing to list data paths of all > existing tables. > {code} > % bin/nodetool datapaths -- example > Keyspace : example > Table : test > Paths : > > /var/lib/cassandra/data/example/test-02f5b8d0c0e311ecb327ff24df5ab301 > > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
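As a rough illustration of how an operator might use the plain-text output of `nodetool datapaths` shown above, the following sketch parses it and lists the directories that are not reported as in use. The line layout ("Keyspace", "Table", "Paths" labels followed by one path per line) is assumed from the example output only, and both helper names are hypothetical. Directories it reports are only candidates for removal, since they may still hold snapshots.

```python
def parse_datapaths(text):
    """Parse `nodetool datapaths` text output into {keyspace: {table: [paths]}}.

    Assumes the layout of the example output: labelled "Keyspace", "Table"
    and "Paths" lines, then one path per line.
    """
    result = {}
    keyspace = table = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        label, sep, value = line.partition(":")
        label, value = label.strip(), value.strip()
        if sep and label == "Keyspace":
            keyspace, table = value, None
            result.setdefault(keyspace, {})
        elif sep and label == "Table" and keyspace is not None:
            table = value
            result[keyspace].setdefault(table, [])
        elif sep and label == "Paths":
            continue  # header line; the paths follow on their own lines
        elif table is not None:
            result[keyspace][table].append(line)
    return result


def unused_directories(on_disk, parsed):
    """Return directories found on disk that no existing table reports."""
    in_use = {
        path
        for tables in parsed.values()
        for paths in tables.values()
        for path in paths
    }
    return sorted(set(on_disk) - in_use)
```

The list of on-disk directories would have to be collected separately, for example by listing the subdirectories of each configured data directory.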
[jira] [Commented] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables
[ https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526410#comment-17526410 ] Stefan Miklosovic commented on CASSANDRA-17568: --- Thanks [~rtib] for the effort, it was very smooth cooperation at GitHub. Definitely keep this stuff coming if you have any other ideas to implement.
[jira] [Updated] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables
[ https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-17568: -- Fix Version/s: 4.1 (was: 4.x) Source Control Link: https://github.com/apache/cassandra/commit/c26dc06a28b0e150384474001ac23026ae76e6d5 Resolution: Fixed Status: Resolved (was: Ready to Commit)
[jira] [Commented] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables
[ https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526404#comment-17526404 ] Brandon Williams commented on CASSANDRA-17568: -- +1
[cassandra] branch trunk updated: add datapaths subcommand to nodetool
This is an automated email from the ASF dual-hosted git repository. smiklosovic pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git The following commit(s) were added to refs/heads/trunk by this push: new c26dc06a28 add datapaths subcommand to nodetool c26dc06a28 is described below commit c26dc06a28b0e150384474001ac23026ae76e6d5 Author: Tibor Répási AuthorDate: Wed Apr 20 22:10:13 2022 +0200 add datapaths subcommand to nodetool patch by Tibor Repasi; reviewed by Stefan Miklosovic and Brandon Williams for CASSANDRA-17568 --- CHANGES.txt| 1 + .../pages/troubleshooting/use_nodetool.adoc| 43 ++ src/java/org/apache/cassandra/tools/NodeTool.java | 1 + .../apache/cassandra/tools/nodetool/DataPaths.java | 53 +++ .../tools/nodetool/stats/DataPathsHolder.java | 84 ++ .../tools/nodetool/stats/DataPathsPrinter.java | 63 .../cassandra/tools/nodetool/DataPathsTest.java| 170 + 7 files changed, 415 insertions(+) diff --git a/CHANGES.txt b/CHANGES.txt index 5ab33a229b..a1213090e2 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 4.1 + * Tool to list data paths of existing tables (CASSANDRA-17568) * Migrate track_warnings to more standard naming conventions and use latest configuration types rather than long (CASSANDRA-17560) * Add support for CONTAINS and CONTAINS KEY in conditional UPDATE and DELETE statement (CASSANDRA-10537) * Migrate advanced config parameters to the new Config types (CASSANDRA-17431) diff --git a/doc/modules/cassandra/pages/troubleshooting/use_nodetool.adoc b/doc/modules/cassandra/pages/troubleshooting/use_nodetool.adoc index f80d039695..a313432cbb 100644 --- a/doc/modules/cassandra/pages/troubleshooting/use_nodetool.adoc +++ b/doc/modules/cassandra/pages/troubleshooting/use_nodetool.adoc @@ -240,3 +240,46 @@ concurrent compactions such that compactions complete quickly but don't take too many resources away from query threads is very important for performance. 
If you notice compaction unable to keep up, try tuning Cassandra's `concurrent_compactors` or `compaction_throughput` options. + +[[nodetool-datapaths]] +== Paths used for data files + +Cassandra is persisting data on disk within the configured directories. Data +files are distributed among the directories configured with `data_file_directories`. +Resembling the structure of keyspaces and tables, Cassandra is creating +subdirectories within `data_file_directories`. However, directories aren't removed +even if the tables and keyspaces are dropped. While these directories are kept with +the reason of holding snapshots, they are subject to removal. This is where operators +need to know which directories are still in use. Running the `nodetool datapaths` +command is an easy way to list in which directories Cassandra is actually storing +sstable data on disk. + +[source, bash] + +% nodetool datapaths -- system_auth +Keyspace: system_auth + Table: role_permissions + Paths: + /var/lib/cassandra/data/system_auth/role_permissions-3afbe79f219431a7add7f5ab90d8ec9c + + Table: network_permissions + Paths: + /var/lib/cassandra/data/system_auth/network_permissions-d46780c22f1c3db9b4c1b8d9fbc0cc23 + + Table: resource_role_permissons_index + Paths: + /var/lib/cassandra/data/system_auth/resource_role_permissons_index-5f2fbdad91f13946bd25d5da3a5c35ec + + Table: roles + Paths: + /var/lib/cassandra/data/system_auth/roles-5bc52802de2535edaeab188eecebb090 + + Table: role_members + Paths: + /var/lib/cassandra/data/system_auth/role_members-0ecdaa87f8fb3e6088d174fb36fe5c0d + + + +By default all keyspaces and tables are listed, however, a list of `keyspace` and +`keyspace.table` arguments can be given to query specific data paths. Using the `--format` +option the output can be formatted as YAML or JSON. 
diff --git a/src/java/org/apache/cassandra/tools/NodeTool.java b/src/java/org/apache/cassandra/tools/NodeTool.java index f9422bdbba..476353fee0 100644 --- a/src/java/org/apache/cassandra/tools/NodeTool.java +++ b/src/java/org/apache/cassandra/tools/NodeTool.java @@ -102,6 +102,7 @@ public class NodeTool Compact.class, CompactionHistory.class, CompactionStats.class, +DataPaths.class, Decommission.class, DescribeCluster.class, DescribeRing.class, diff --git a/src/java/org/apache/cassandra/tools/nodetool/DataPaths.java b/src/java/org/apache/cassandra/tools/nodetool/DataPaths.java new file mode 100644 index 00..10ae01e8da --- /dev/null +++ b/src/java/org/apache/cassandra/tools/nodetool/DataPaths.java @@ -0,0 +1,53 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or
[jira] [Updated] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables
[ https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-17568: -- Status: Ready to Commit (was: Review In Progress)
[jira] [Comment Edited] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables
[ https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526401#comment-17526401 ] Stefan Miklosovic edited comment on CASSANDRA-17568 at 4/22/22 12:44 PM: - https://app.circleci.com/pipelines/github/instaclustr/cassandra/935/workflows/5894b8ae-571c-4d95-8379-fcb894da34e9 one jvm dtest fails, not related at all, that's a known flaky test was (Author: smiklosovic): https://app.circleci.com/pipelines/github/instaclustr/cassandra/935/workflows/5894b8ae-571c-4d95-8379-fcb894da34e9
[jira] [Commented] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables
[ https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526401#comment-17526401 ] Stefan Miklosovic commented on CASSANDRA-17568: --- https://app.circleci.com/pipelines/github/instaclustr/cassandra/935/workflows/5894b8ae-571c-4d95-8379-fcb894da34e9
[cassandra-website] branch asf-staging updated (4cb38fd9 -> eb4d1ab0)
This is an automated email from the ASF dual-hosted git repository. git-site-role pushed a change to branch asf-staging in repository https://gitbox.apache.org/repos/asf/cassandra-website.git discard 4cb38fd9 generate docs for 8fd077a6 new eb4d1ab0 generate docs for 8fd077a6 This update added new revisions after undoing existing revisions. That is to say, some revisions that were in the old version of the branch are not in the new version. This situation occurs when a user --force pushes a change and generates a repository containing something like this: * -- * -- B -- O -- O -- O (4cb38fd9) \ N -- N -- N refs/heads/asf-staging (eb4d1ab0) You should already have received notification emails for all of the O revisions, and so the following emails describe only the N revisions from the common base, B. Any revisions marked "omit" are not gone; other references still refer to them. Any revisions marked "discard" are gone forever. The 1 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: site-ui/build/ui-bundle.zip | Bin 4740078 -> 4740078 bytes 1 file changed, 0 insertions(+), 0 deletions(-) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables
[ https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-17568: - Status: Review In Progress (was: Needs Committer)
[jira] [Updated] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables
[ https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-17568: - Reviewers: Brandon Williams, Stefan Miklosovic (was: Stefan Miklosovic)
[jira] [Commented] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables
[ https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526341#comment-17526341 ] Brandon Williams commented on CASSANDRA-17568: -- Please add a J11 CI run, and if that is clean I am +1.
[jira] [Commented] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables
[ https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526338#comment-17526338 ] Stefan Miklosovic commented on CASSANDRA-17568: --- Squashed commits branch: https://github.com/instaclustr/cassandra/commits/CASSANDRA-17568-squashed build: https://app.circleci.com/pipelines/github/instaclustr/cassandra/935/workflows/68fe55ef-851d-4c54-80af-668068e99abd I am +1 on this. Waiting for the second reviewer.
[jira] [Updated] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables
[ https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-17568: -- Status: Needs Committer (was: Review In Progress)
[jira] [Updated] (CASSANDRA-17568) Implement nodetool command to list data directories of existing tables
[ https://issues.apache.org/jira/browse/CASSANDRA-17568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Miklosovic updated CASSANDRA-17568: -- Status: Review In Progress (was: Changes Suggested)
[jira] [Comment Edited] (CASSANDRA-17543) ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE coordinator=1 flush=false paging=false] times out sporadically
[ https://issues.apache.org/jira/browse/CASSANDRA-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526329#comment-17526329 ] Andres de la Peña edited comment on CASSANDRA-17543 at 4/22/22 11:03 AM: - [~maedhroz] are we ready to commit this? I have just rebased without conflicts and I'm running CI one last time: ||PR||CI|| |[4.0|https://github.com/apache/cassandra/pull/1568]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1495/workflows/d8256249-6af4-425b-80c0-3b5109204530] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1495/workflows/264718b3-376c-4d85-b700-767dee99e3bd]| |[trunk|https://github.com/apache/cassandra/pull/1569]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1496/workflows/1699556f-c389-472d-b217-fd17e1007a41] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1496/workflows/fc449127-5ea4-4438-9aaa-163c78c62884]| was (Author: adelapena): [~maedhroz] are ready to commit this? 
I have just rebased without conflicts and I'm running CI one last time: ||PR||CI|| |[4.0|https://github.com/apache/cassandra/pull/1568] |[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1495/workflows/d8256249-6af4-425b-80c0-3b5109204530] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1495/workflows/264718b3-376c-4d85-b700-767dee99e3bd]| |[trunk|https://github.com/apache/cassandra/pull/1569]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1496/workflows/1699556f-c389-472d-b217-fd17e1007a41] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1496/workflows/fc449127-5ea4-4438-9aaa-163c78c62884]| > ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE > coordinator=1 flush=false paging=false] times out sporadically > --- > > Key: CASSANDRA-17543 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17543 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Caleb Rackliffe >Assignee: Andres de la Peña >Priority: Normal > Fix For: 4.x > > Time Spent: 1h 10m > Remaining Estimate: 0h > > org.apache.cassandra.distributed.test.ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: > strategy=NONE coordinator=1 flush=false paging=false] > {noformat} > Error Message > Timeout occurred. Please note the time in the report does not reflect the > time until the timeout. > Stacktrace > junit.framework.AssertionFailedError: Timeout occurred. Please note the time > in the report does not reflect the time until the timeout. 
> at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at org.apache.cassandra.anttasks.TestHelper.execute(TestHelper.java:53) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) >
[jira] [Commented] (CASSANDRA-17543) ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE coordinator=1 flush=false paging=false] times out sporadically
[ https://issues.apache.org/jira/browse/CASSANDRA-17543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526329#comment-17526329 ] Andres de la Peña commented on CASSANDRA-17543: --- [~maedhroz] are ready to commit this? I have just rebased without conflicts and I'm running CI one last time: ||PR||CI|| |[4.0|https://github.com/apache/cassandra/pull/1568] |[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1495/workflows/d8256249-6af4-425b-80c0-3b5109204530] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1495/workflows/264718b3-376c-4d85-b700-767dee99e3bd]| |[trunk|https://github.com/apache/cassandra/pull/1569]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1496/workflows/1699556f-c389-472d-b217-fd17e1007a41] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1496/workflows/fc449127-5ea4-4438-9aaa-163c78c62884]| > ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: strategy=NONE > coordinator=1 flush=false paging=false] times out sporadically > --- > > Key: CASSANDRA-17543 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17543 > Project: Cassandra > Issue Type: Bug > Components: Test/dtest/java >Reporter: Caleb Rackliffe >Assignee: Andres de la Peña >Priority: Normal > Fix For: 4.x > > Time Spent: 1h 10m > Remaining Estimate: 0h > > org.apache.cassandra.distributed.test.ReadRepairQueryTypesTest.testUnrestrictedQueryOnSkinnyTable[8: > strategy=NONE coordinator=1 flush=false paging=false] > {noformat} > Error Message > Timeout occurred. Please note the time in the report does not reflect the > time until the timeout. > Stacktrace > junit.framework.AssertionFailedError: Timeout occurred. Please note the time > in the report does not reflect the time until the timeout. 
> at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at org.apache.cassandra.anttasks.TestHelper.execute(TestHelper.java:53) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.base/java.util.Vector.forEach(Vector.java:1388) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at jdk.internal.reflect.GeneratedMethodAccessor4.invoke(Unknown Source) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {noformat} > See > https://ci-cassandra.apache.org/job/Cassandra-trunk/1075/testReport/org.apache.cassandra.distributed.test/ReadRepairQueryTypesTest/testUnrestrictedQueryOnSkinnyTable_8__strategy_NONE_coordinator_1_flush_false_paging_false_/
[jira] [Commented] (CASSANDRA-17150) Guardrails for disk usage
[ https://issues.apache.org/jira/browse/CASSANDRA-17150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526317#comment-17526317 ] Andres de la Peña commented on CASSANDRA-17150: --- [~e.dimitrova] thanks for the review. I think I have addressed the last bits. I'm running CI after rebase+squash: ||PR||CI|| |[trunk|https://github.com/apache/cassandra/pull/1546]|[j8|https://app.circleci.com/pipelines/github/adelapena/cassandra/1494/workflows/d032178d-f8a9-4124-b36f-5bf6f47b3116] [j11|https://app.circleci.com/pipelines/github/adelapena/cassandra/1494/workflows/bc844580-6f3a-4bc3-a4d0-d85f082330f8]| Please note that during the rebase I have replaced a few references to the removed `Config.DISABLED_GUARDRAIL` constant with {{-1}}. Those references were recently added to track warnings during CASSANDRA-17560. As mentioned [here|https://github.com/apache/cassandra/pull/1572#discussion_r854251196], using {{-1}} as the disabled value is a global config convention and not a guardrails thing, so we should either use it directly or define a new constant with a more generic name. If we decide to do the latter, I'd prefer to do it in a separate ticket, so we can focus on locating all the usages. > Guardrails for disk usage > - > > Key: CASSANDRA-17150 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17150 > Project: Cassandra > Issue Type: New Feature > Components: Feature/Guardrails >Reporter: Andres de la Peña >Assignee: Andres de la Peña >Priority: Normal > Fix For: 4.x > > Time Spent: 8h 20m > Remaining Estimate: 0h > > Add guardrails for disk usage establishing soft/hard limits on the percentage > of used disk space. For example: > {code} > # Warning threshold to warn when local disk usage exceeds threshold. Valid > values: (1, 100] > # Defaults to -1 to disable. > # disk_usage_percentage_warn_threshold: -1 > # Failure threshold to reject write requests if replica disk usage exceeds > threshold. 
Valid values: (1, 100] > # Defaults to -1 to disable. > # disk_usage_percentage_failure_threshold: -1 > {code}
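The soft/hard limit semantics in the CASSANDRA-17150 description can be sketched as follows. This is a minimal illustration, assuming that "exceeds" means strictly greater than the threshold and that -1 disables a threshold, as the commented configuration in the ticket states. The function name and return values are hypothetical and are not Cassandra's guardrail API.

```python
DISABLED = -1  # the ticket's example configuration uses -1 to disable a threshold


def check_disk_usage(used_percent, warn_threshold=DISABLED, fail_threshold=DISABLED):
    """Classify a disk usage percentage against optional warn and failure thresholds."""
    if fail_threshold != DISABLED and used_percent > fail_threshold:
        return "fail"  # hard limit: write requests would be rejected
    if warn_threshold != DISABLED and used_percent > warn_threshold:
        return "warn"  # soft limit: a warning would be emitted
    return "ok"


for usage in (50, 75, 95):
    print(usage, check_disk_usage(usage, warn_threshold=70, fail_threshold=90))
```

With both thresholds left at their default of -1 the function always returns "ok", which mirrors the "disabled by default" behaviour described in the configuration comments.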
[jira] [Updated] (CASSANDRA-17212) Migrate threshold for minimum keyspace replication factor to guardrails
[ https://issues.apache.org/jira/browse/CASSANDRA-17212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andres de la Peña updated CASSANDRA-17212: -- Reviewers: Andres de la Peña > Migrate threshold for minimum keyspace replication factor to guardrails > --- > > Key: CASSANDRA-17212 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17212 > Project: Cassandra > Issue Type: New Feature > Components: Feature/Guardrails >Reporter: Andres de la Peña >Assignee: Savni Nagarkar >Priority: Normal > Fix For: 4.x > > > The config property > [{{minimum_keyspace_rf}}|https://github.com/apache/cassandra/blob/5fdadb25f95099b8945d9d9ee11d3e380d3867f4/conf/cassandra.yaml] > that was added by CASSANDRA-14557 can be migrated to guardrails, for example: > {code} > guardrails: > ... > replication_factor: > warn_threshold: 2 > abort_threshold: 3 > {code}
[jira] [Updated] (CASSANDRA-11871) Allow to aggregate by time intervals
[ https://issues.apache.org/jira/browse/CASSANDRA-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-11871: --- Description: For time series data it can be usefull to aggregate by time intervals. The idea would be to add support for one or several functions in the {{GROUP BY}} clause. Regarding the implementation, even if in general I also prefer to follow the SQL syntax, I do not believe it will be a good fit for Cassandra. If we have a table like: {code} CREATE TABLE trades { symbol text, date date, time time, priceMantissa int, priceExponent tinyint, volume int, PRIMARY KEY ((symbol, date), time) }; {code} The trades will be inserted with an increasing time and sorted in the same order. As we can have to process a large amount of data, we want to try to limit ourself to the cases where we can build the groups on the flight (which is not a requirement in the SQL world). If we want to get the number of trades per minutes with the SQL syntax we will have to write: {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY hour(time), minute(time);}} which is fine. The problem is that if the user invert by mistake the functions like that: {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY minute(time), hour(time);}} the query will return weird results. The only way to prevent that would be to check the function order and make sure that we do not allow to skip functions (e.g. {{GROUP BY hour(time), second(time)}}). In my opinion a function like {{floor(, )}} will be much better as it does not allow for this type of mistakes and is much more flexible (you can create 5 minutes buckets if you want to). 
{code} SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY floor(time, m); {code} An important aspect to keep in mind with a function like {{floor}} is the starting point. For a query like: {{SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' AND time =< '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the result should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}. was: For time series data it can be usefull to aggregate by time intervals. The idea would be to add support for one or several functions in the {{GROUP BY}} clause. Regarding the implementation, even if in general I also prefer to follow the SQL syntax, I do not believe it will be a good fit for Cassandra. If we have a table like: {code} CREATE TABLE trades { symbol text, date date, time time, priceMantissa int, priceExponent tinyint, volume int, PRIMARY KEY ((symbol, date), time) }; {code} The trades will be inserted with an increasing time and sorted in the same order. As we can have to process a large amount of data, we want to try to limit ourself to the cases where we can build the groups on the flight (which is not a requirement in the SQL world). If we want to get the number of trades per minutes with the SQL syntax we will have to write: {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY hour(time), minute(time);}} which is fine. The problem is that if the user invert by mistake the functions like that: {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY minute(time), hour(time);}} the query will return weird results. The only way to prevent that would be to check the function order and make sure that we do not allow to skip functions (e.g. {{GROUP BY hour(time), second(time)}}). 
In my opinion a function like {{floor(, )}} will be much better as it does not allow for this type of mistakes and is much more flexible (you can create 5 minutes buckets if you want to). {code}SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY floor(time, m);{code} An important aspect to keep in mind with a function like {{floor}} is the starting point. For a query like: {{SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' AND time =< '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the result should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}. > Allow to aggregate by time intervals > > > Key: CASSANDRA-11871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11871 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/CQL >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer >Priority: Nor
[jira] [Updated] (CASSANDRA-11871) Allow to aggregate by time intervals
[ https://issues.apache.org/jira/browse/CASSANDRA-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-11871: --- Description: For time series data it can be usefull to aggregate by time intervals. The idea would be to add support for one or several functions in the {{GROUP BY}} clause. Regarding the implementation, even if in general I also prefer to follow the SQL syntax, I do not believe it will be a good fit for Cassandra. If we have a table like: {code} CREATE TABLE trades { symbol text, date date, time time, priceMantissa int, priceExponent tinyint, volume int, PRIMARY KEY ((symbol, date), time) }; {code} The trades will be inserted with an increasing time and sorted in the same order. As we can have to process a large amount of data, we want to try to limit ourself to the cases where we can build the groups on the flight (which is not a requirement in the SQL world). If we want to get the number of trades per minutes with the SQL syntax we will have to write: {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY hour(time), minute(time);}} which is fine. The problem is that if the user invert by mistake the functions like that: {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY minute(time), hour(time);}} the query will return weird results. The only way to prevent that would be to check the function order and make sure that we do not allow to skip functions (e.g. {{GROUP BY hour(time), second(time)}}). In my opinion a function like {{floor(, )}} will be much better as it does not allow for this type of mistakes and is much more flexible (you can create 5 minutes buckets if you want to). 
{code}SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY floor(time, m);{code} An important aspect to keep in mind with a function like {{floor}} is the starting point. For a query like: {{SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' AND time =< '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the result should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}. was: For time series data it can be usefull to aggregate by time intervals. The idea would be to add support for one or several functions in the {{GROUP BY}} clause. Regarding the implementation, even if in general I also prefer to follow the SQL syntax, I do not believe it will be a good fit for Cassandra. If we have a table like: {code} CREATE TABLE trades { symbol text, date date, time time, priceMantissa int, priceExponent tinyint, volume int, PRIMARY KEY ((symbol, date), time) }; {code} The trades will be inserted with an increasing time and sorted in the same order. As we can have to process a large amount of data, we want to try to limit ourself to the cases where we can build the groups on the flight (which is not a requirement in the SQL world). If we want to get the number of trades per minutes with the SQL syntax we will have to write: {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY hour(time), minute(time);}} which is fine. The problem is that if the user invert by mistake the functions like that: {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY minute(time), hour(time);}} the query will return weird results. The only way to prevent that would be to check the function order and make sure that we do not allow to skip functions (e.g. {{GROUP BY hour(time), second(time)}}). 
In my opinion a function like {{floor(, )}} will be much better as it does not allow for this type of mistakes and is much more flexible (you can create 5 minutes buckets if you want to). {code}SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY floor(time, m);{code} An important aspect to keep in mind with a function like {{floor}} is the starting point. For a query like: {{SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' AND time =< '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the result should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}. > Allow to aggregate by time intervals > > > Key: CASSANDRA-11871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11871 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/CQL >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer >Priority: Norm
[jira] [Updated] (CASSANDRA-11871) Allow to aggregate by time intervals
[ https://issues.apache.org/jira/browse/CASSANDRA-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-11871: --- Fix Version/s: 4.1 (was: 4.x) Source Control Link: https://github.com/apache/cassandra/commit/1ad8bf67a9c82cbb5ff38e5cf785f9fe2516d009 Resolution: Fixed Status: Resolved (was: Ready to Commit) Patch committed into trunk at 1ad8bf67a9c82cbb5ff38e5cf785f9fe2516d009 > Allow to aggregate by time intervals > > > Key: CASSANDRA-11871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11871 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/CQL >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer >Priority: Normal > Fix For: 4.1 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > For time series data it can be useful to aggregate by time intervals. > The idea would be to add support for one or several functions in the {{GROUP > BY}} clause. > Regarding the implementation, even if in general I also prefer to follow the > SQL syntax, I do not believe it will be a good fit for Cassandra. > If we have a table like: > {code} > CREATE TABLE trades > ( > symbol text, > date date, > time time, > priceMantissa int, > priceExponent tinyint, > volume int, > PRIMARY KEY ((symbol, date), time) > ); > {code} > The trades will be inserted with an increasing time and sorted in the same > order. As we may have to process a large amount of data, we want to try to > limit ourselves to the cases where we can build the groups on the fly (which > is not a requirement in the SQL world). > If we want to get the number of trades per minute with the SQL syntax we > will have to write: > {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' > AND date = '2016-01-11' GROUP BY hour(time), minute(time);}} > which is fine. 
The problem is that if the user inverts the functions by mistake, > like this: > {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' > AND date = '2016-01-11' GROUP BY minute(time), hour(time);}} > the query will return weird results. > The only way to prevent that would be to check the function order and make > sure that we do not allow skipping functions (e.g. {{GROUP BY hour(time), > second(time)}}). > In my opinion a function like {{floor(, )}} will be > much better as it does not allow this type of mistake and is much more > flexible (you can create 5-minute buckets if you want to). > {code}SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND > date = '2016-01-11' GROUP BY floor(time, m);{code} > An important aspect to keep in mind with a function like {{floor}} is the > starting point. For a query like: {{SELECT floor(time, m), count() FROM > Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' > AND time <= '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the > result should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}. >
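The point about the starting point of {{floor}} in CASSANDRA-11871 can be made concrete with a small sketch. It is written in Python for illustration only and is not the committed Cassandra implementation. The helper names, the `start` parameter and the trade times are assumptions; sub-second precision is ignored.

```python
from collections import Counter
from datetime import time, timedelta


def to_seconds(t):
    """Seconds since midnight, ignoring microseconds."""
    return t.hour * 3600 + t.minute * 60 + t.second


def floor_time(t, bucket, start=time(0, 0)):
    """Round t down to the start of its bucket, with buckets counted from `start`."""
    size = int(bucket.total_seconds())
    offset = to_seconds(t) - to_seconds(start)
    if offset < 0:
        raise ValueError("t is before the starting point")
    floored = to_seconds(start) + (offset // size) * size
    return time(floored // 3600, (floored % 3600) // 60, floored % 60)


# Made-up trade times between 01:30:00 and 07:30:00.
trades = [time(1, 30), time(2, 15), time(3, 29, 59), time(3, 30), time(5, 45), time(7, 29, 59)]
counts = Counter(floor_time(t, timedelta(hours=2), start=time(1, 30)) for t in trades)

for bucket_start in sorted(counts):
    print(bucket_start, counts[bucket_start])
```

With 2-hour buckets anchored at 01:30:00 the sample trades fall into the three groups named in the ticket: 01:30:00, 03:30:00 and 05:30:00. Anchoring at midnight instead would give buckets starting at 00:00:00, 02:00:00 and so on, which is why the starting point matters.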
[jira] [Updated] (CASSANDRA-11871) Allow to aggregate by time intervals
[ https://issues.apache.org/jira/browse/CASSANDRA-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-11871: --- Status: Ready to Commit (was: Review In Progress) > Allow to aggregate by time intervals > > > Key: CASSANDRA-11871 > URL: https://issues.apache.org/jira/browse/CASSANDRA-11871 > Project: Cassandra > Issue Type: Improvement > Components: Legacy/CQL >Reporter: Benjamin Lerer >Assignee: Benjamin Lerer >Priority: Normal > Fix For: 4.x > > Time Spent: 3h 20m > Remaining Estimate: 0h > > For time series data it can be useful to aggregate by time intervals. > The idea would be to add support for one or several functions in the {{GROUP > BY}} clause. > Regarding the implementation, even if in general I also prefer to follow the > SQL syntax, I do not believe it will be a good fit for Cassandra. > If we have a table like: > {code} > CREATE TABLE trades > ( > symbol text, > date date, > time time, > priceMantissa int, > priceExponent tinyint, > volume int, > PRIMARY KEY ((symbol, date), time) > ); > {code} > The trades will be inserted with an increasing time and sorted in the same > order. As we may have to process a large amount of data, we want to try to > limit ourselves to the cases where we can build the groups on the fly (which > is not a requirement in the SQL world). > If we want to get the number of trades per minute with the SQL syntax we > will have to write: > {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' > AND date = '2016-01-11' GROUP BY hour(time), minute(time);}} > which is fine. The problem is that if the user inverts the functions by mistake, > like this: > {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' > AND date = '2016-01-11' GROUP BY minute(time), hour(time);}} > the query will return weird results. > The only way to prevent that would be to check the function order and make > sure that we do not allow skipping functions (e.g. 
{{GROUP BY hour(time), > second(time)}}). > In my opinion a function like {{floor(, )}} will be > much better as it does not allow this type of mistake and is much more > flexible (you can create 5-minute buckets if you want to). > {code}SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND > date = '2016-01-11' GROUP BY floor(time, m);{code} > An important aspect to keep in mind with a function like {{floor}} is the > starting point. For a query like: {{SELECT floor(time, m), count() FROM > Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' > AND time <= '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the > result should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}. >
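The concern in CASSANDRA-11871 about building groups on the fly can be illustrated outside Cassandra. The sketch below assumes rows arrive sorted by time and that a group is closed as soon as the grouping key changes. Python's `itertools.groupby` behaves that way and is used here only as an analogy, not as a description of Cassandra's implementation; the trade times are made up.

```python
from datetime import time
from itertools import groupby

# Made-up trade times, already sorted as they would be within one partition.
trades = [time(9, 0, 5), time(9, 0, 30), time(9, 1, 5), time(10, 0, 5)]


def streaming_counts(rows, key):
    """Count rows per group in a single pass; a new group starts whenever the key changes."""
    return [(k, sum(1 for _ in group)) for k, group in groupby(rows, key=key)]


# Grouping by (hour, minute) follows the sort order, so every key appears once.
per_minute = streaming_counts(trades, lambda t: (t.hour, t.minute))

# Skipping minute and grouping by (hour, second) breaks the alignment with the sort order.
skipping_minute = streaming_counts(trades, lambda t: (t.hour, t.second))

print(per_minute)
print(skipping_minute)
```

In the second result the key `(9, 5)` shows up twice, because rows sharing that key are no longer adjacent in the sorted input. A single-argument function such as {{floor}} avoids this class of mistake because there is no function order to get wrong.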
[jira] [Updated] (CASSANDRA-11871) Allow to aggregate by time intervals
[ https://issues.apache.org/jira/browse/CASSANDRA-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Lerer updated CASSANDRA-11871: --- Description: For time series data it can be usefull to aggregate by time intervals. The idea would be to add support for one or several functions in the {{GROUP BY}} clause. Regarding the implementation, even if in general I also prefer to follow the SQL syntax, I do not believe it will be a good fit for Cassandra. If we have a table like: {code} CREATE TABLE trades { symbol text, date date, time time, priceMantissa int, priceExponent tinyint, volume int, PRIMARY KEY ((symbol, date), time) }; {code} The trades will be inserted with an increasing time and sorted in the same order. As we can have to process a large amount of data, we want to try to limit ourself to the cases where we can build the groups on the flight (which is not a requirement in the SQL world). If we want to get the number of trades per minutes with the SQL syntax we will have to write: {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY hour(time), minute(time);}} which is fine. The problem is that if the user invert by mistake the functions like that: {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY minute(time), hour(time);}} the query will return weird results. The only way to prevent that would be to check the function order and make sure that we do not allow to skip functions (e.g. {{GROUP BY hour(time), second(time)}}). In my opinion a function like {{floor(, )}} will be much better as it does not allow for this type of mistakes and is much more flexible (you can create 5 minutes buckets if you want to). 
{code}SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY floor(time, m);{code} An important aspect to keep in mind with a function like {{floor}} is the starting point. For a query like: {{SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' AND time =< '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the result should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}. was: For time series data it can be usefull to aggregate by time intervals. The idea would be to add support for one or several functions in the {{GROUP BY}} clause. Regarding the implementation, even if in general I also prefer to follow the SQL syntax, I do not believe it will be a good fit for Cassandra. If we have a table like: {code} CREATE TABLE trades { symbol text, date date, time time, priceMantissa int, priceExponent tinyint, volume int, PRIMARY KEY ((symbol, date), time) }; {code} The trades will be inserted with an increasing time and sorted in the same order. As we can have to process a large amount of data, we want to try to limit ourself to the cases where we can build the groups on the flight (which is not a requirement in the SQL world). If we want to get the number of trades per minutes with the SQL syntax we will have to write: {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY hour(time), minute(time);}} which is fine. The problem is that if the user invert by mistake the functions like that: {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY minute(time), hour(time);}} the query will return weird results. The only way to prevent that would be to check the function order and make sure that we do not allow to skip functions (e.g. {{GROUP BY hour(time), second(time)}}). 
In my opinion a function like {{floor(, )}} will be much better as it does not allow this type of mistake and is much more flexible (you can create 5-minute buckets if you want to).
{{SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' GROUP BY floor(time, m);}}

An important aspect to keep in mind with a function like {{floor}} is the starting point. For a query like:
{{SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' AND time <= '07:30:00' GROUP BY floor(time, 2h);}}
I think that ideally the result should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}.

> Allow to aggregate by time intervals
>
> Key: CASSANDRA-11871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11871
> Project: Cassandra
> Issue Type: Improvement
> Components: Legacy/CQL
> Reporter: Benjamin Lerer
> Assignee: Benjamin Lerer
> Priority: Normal
> Fix For:
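The starting-point question raised in the description above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not Cassandra's implementation: `floor_time`, its `origin` parameter, and `count_per_bucket` are hypothetical names, the sample times are made up, and anchoring the buckets at the lower bound of the time restriction is simply the behaviour the description calls ideal. The counting is done in a single pass, which works only because the rows arrive sorted by time.

```python
from datetime import time
from itertools import groupby


def to_seconds(t: time) -> int:
    """Seconds since midnight for a time-of-day value."""
    return t.hour * 3600 + t.minute * 60 + t.second


def from_seconds(s: int) -> time:
    """Inverse of to_seconds for values within a single day."""
    return time(s // 3600, (s % 3600) // 60, s % 60)


def floor_time(t: time, bucket_seconds: int, origin: time = time(0, 0)) -> time:
    """Round t down to the start of its bucket; buckets are anchored at origin.

    Hypothetical helper: the origin parameter models the "starting point".
    """
    offset = to_seconds(t) - to_seconds(origin)
    return from_seconds(to_seconds(origin) + (offset // bucket_seconds) * bucket_seconds)


def count_per_bucket(sorted_times, bucket_seconds, origin=time(0, 0)):
    """Count rows per bucket in one pass over input that is already sorted by time."""
    def bucket_of(t):
        return floor_time(t, bucket_seconds, origin)

    return [(bucket, sum(1 for _ in rows)) for bucket, rows in groupby(sorted_times, key=bucket_of)]


# Made-up sample rows, sorted the way the clustering column would keep them.
trades = [time(1, 30), time(2, 45), time(3, 29, 59), time(3, 30), time(5, 30), time(7, 29, 59)]
TWO_HOURS = 2 * 3600

# Buckets anchored at the lower bound of the restriction (01:30:00).
anchored = count_per_bucket(trades, TWO_HOURS, origin=time(1, 30))

# Buckets anchored at midnight, for comparison.
from_midnight = count_per_bucket(trades, TWO_HOURS)

print(anchored)
print(from_midnight)
```

With these sample rows, anchoring at 01:30:00 yields the three groups named in the description (01:30:00, 03:30:00 and 05:30:00), whereas anchoring at midnight yields four groups starting at 00:00:00, 02:00:00, 04:00:00 and 06:00:00.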
[jira] [Commented] (CASSANDRA-16456) Add Plugin Support for CQLSH
[ https://issues.apache.org/jira/browse/CASSANDRA-16456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17526259#comment-17526259 ]

Stefan Miklosovic commented on CASSANDRA-16456:
---
point 3) with the addition that it should warn you that username / password should be in credentials instead of cqlshrc. We have no control over any other credentials that might be located in cqlshrc, only over username and password, as these two are the best known.

point 4) same, we should emit a warning, as is done now, that this stuff should be located in credentials.

The reason for the warning is that we will then remove support for the authentication section in cqlshrc in the next release and everything will go to credentials only (or as flags on the command line).

point 5) if you meant override as in "applied on top of them" then yes, you are basically adding one set (as a mathematical construct) to the other one, with the detail that values in cqlshrc will be replaced by the values for the same key in the credentials file.

point 6) yes, that username flag on the console, then you ask for the password. Because out of the box you can log in without anything and it will assume you are logging in anonymously.

> Add Plugin Support for CQLSH
>
> Key: CASSANDRA-16456
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16456
> Project: Cassandra
> Issue Type: New Feature
> Components: Tool/cqlsh
> Reporter: Brian Houser
> Assignee: Brian Houser
> Priority: Normal
> Labels: gsoc2021, mentor
> Time Spent: 2h 50m
> Remaining Estimate: 0h
>
> Currently the Cassandra drivers offer a plugin authenticator architecture for
> the support of different authentication methods. This has been leveraged to
> provide support for LDAP, Kerberos, and Sigv4 authentication. Unfortunately,
> cqlsh, the included CLI tool, does not offer such support.
> Switching to a new
> enhanced authentication scheme thus means being cut off from using cqlsh in
> normal operation.
> We should have a means of using the same plugins and authentication providers
> as the Python Cassandra driver.
> Here's a link to an initial draft of
> [CEP|https://docs.google.com/document/d/1_G-OZCAEmDyuQuAN2wQUYUtZBEJpMkHWnkYELLhqvKc/edit?usp=sharing].
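The merge semantics from point 5 of the comment above, together with the warning from points 3 and 4, can be sketched in Python as follows. This is a minimal sketch under stated assumptions and not the cqlsh implementation: `read_section` and `merge_auth_settings` are hypothetical helpers, the section name used for the credentials sample and all sample values are made up, and the wording of the warning is invented for illustration.

```python
import configparser
import warnings

# The comment singles out only these two keys for the warning.
SENSITIVE_KEYS = ("username", "password")


def read_section(text: str, section: str) -> dict:
    """Parse INI text and return one section as a dict (empty if the section is absent)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser[section]) if parser.has_section(section) else {}


def merge_auth_settings(cqlshrc: dict, credentials: dict) -> dict:
    """Union of both key sets; for a shared key the credentials value wins.

    Hypothetical helper: warns when username or password still live in cqlshrc.
    """
    for key in SENSITIVE_KEYS:
        if key in cqlshrc:
            warnings.warn(f"'{key}' found in cqlshrc; it should be moved to the credentials file")
    return {**cqlshrc, **credentials}


# Made-up file contents for illustration only.
CQLSHRC_TEXT = """
[authentication]
username = alice
password = old-secret
keyspace = demo
"""

# Assumption: the credentials sample reuses the same section name.
CREDENTIALS_TEXT = """
[authentication]
password = new-secret
"""

merged = merge_auth_settings(
    read_section(CQLSHRC_TEXT, "authentication"),
    read_section(CREDENTIALS_TEXT, "authentication"),
)
print(merged)
```

Here the password from the credentials sample replaces the one from the cqlshrc sample, while keys that appear only in cqlshrc are carried over unchanged, and one warning is raised for each of the two sensitive keys found in cqlshrc.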