[jira] [Commented] (CASSANDRA-17367) sstableloader ignores streaming encryption settings

2022-02-22 Thread Berenguer Blasi (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496485#comment-17496485
 ] 

Berenguer Blasi commented on CASSANDRA-17367:
-

I _think_ (and [~brandon.williams] can correct me) there were problems in the 
past with connecting without SSL to a node configured with SSL. We had to 
support both ways, which may be why the original test doesn't connect to the 
SSL port. I would leave the original test as it was and add yours as a new test 
method. Otherwise we'd also need to fix the other test in the class, which 
doesn't use the SSL port either. On top of that, the original test uses 
legacy sstables, and removing that looks suspicious.

I think it might be safer to just add your test as a new one, but I may be just 
imagining things in my bad memory, so I'll defer to [~brandon.williams] to see 
if he knows better?

> sstableloader ignores streaming encryption settings
> ---
>
> Key: CASSANDRA-17367
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17367
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/bulk load
>Reporter: Dmitry Potepalov
>Assignee: Dmitry Potepalov
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
> Attachments: 17367-4.0.txt, 17367-trunk.txt
>
>
> Reproducible in Cassandra 4.x. If one configures encryption for streaming in 
> config yaml fed to sstableloader like this
> {{server_encryption_options:}}
> {{    internode_encryption: all}}
> {{    keystore: sstableloader.keystore.p12}}
> {{    keystore_password: changeit}}
> {{    truststore: sstableloader.truststore.jks}}
> {{    truststore_password: changeit}}
> then sstableloader should perform an SSL handshake on the streaming 
> connections and encrypt the payload. But this does not happen. Judging by a 
> tcpdump capture of the outgoing traffic on the internode port, sstableloader 
> sends plaintext traffic. This is the TCP payload of the first packet that 
> sstableloader sends after establishing the TCP connection:
> {{ca 55 2d fa 0c 0c 0c 08 06 0a f0 01 f9 1b 58 a8 32 f2 d0}}
> The first 4 bytes look like Cassandra protocol magic, not like a client hello.
> I discovered the issue while trying to migrate some data to a Cassandra 4 node 
> listening on the legacy SSL storage port (and therefore accepting only 
> encrypted connections on that port). The streaming phase of the migration 
> failed with a "connection closed" error, which hints that the connection was 
> closed server-side.
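
The observation about the first bytes can be checked mechanically. The sketch below is not part of the ticket: the function and constant names are made up for illustration. It relies on the fact that a TLS record carrying a ClientHello begins with content type 0x16 followed by major version 0x03, and it treats the four bytes quoted in the report as the plaintext marker.

```python
# Hypothetical helper, not from the ticket: classify the first bytes of a TCP payload.
TLS_HANDSHAKE_RECORD = 0x16  # TLS record content type "handshake"
TLS_MAJOR_VERSION = 0x03     # major version byte shared by SSL 3.0 and TLS 1.x records
# The four bytes the report identifies as Cassandra protocol magic.
REPORTED_MAGIC = bytes.fromhex("ca552dfa")


def classify_first_bytes(payload: bytes) -> str:
    """Return a rough label for the start of a captured TCP payload."""
    if (
        len(payload) >= 2
        and payload[0] == TLS_HANDSHAKE_RECORD
        and payload[1] == TLS_MAJOR_VERSION
    ):
        return "tls-handshake"
    if payload[:4] == REPORTED_MAGIC:
        return "cassandra-plaintext"
    return "unknown"


# Payload copied from the report above.
captured = bytes.fromhex("ca 55 2d fa 0c 0c 0c 08 06 0a f0 01 f9 1b 58 a8 32 f2 d0")
print(classify_first_bytes(captured))
```

Running the same check on a capture taken after a fix would be one way to confirm that the streaming connection now opens with a TLS handshake.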



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16092) Add Index Group Interface for Storage Attached Index

2022-02-22 Thread Zhao Yang (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496462#comment-17496462
 ] 

Zhao Yang commented on CASSANDRA-16092:
---

I have pushed a rebased version and am waiting for CI.

> Add Index Group Interface for Storage Attached Index
> 
>
> Key: CASSANDRA-16092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16092
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/SASI
>Reporter: Zhao Yang
>Assignee: Zhao Yang
>Priority: Normal
> Fix For: 5.x
>
>
> [Index 
> group|https://github.com/datastax/cassandra/blob/storage_attached_index/src/java/org/apache/cassandra/index/Index.java#L634]
>  interface allows:
> * indexes on the same table to receive centralized lifecycle events called 
> secondary index groups. Sharing of data between multiple column indexes on 
> the same table allows SAI disk usage to realise significant space savings 
> over other index implementations.
> * index-group to analyze user query and provide a query plan that leverages 
> all available indexes within the group.






[jira] [Commented] (CASSANDRA-17399) a new SSTable created when single SSTable tombstone compact occurred in TWCS

2022-02-22 Thread eason hao (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496440#comment-17496440
 ] 

eason hao commented on CASSANDRA-17399:
---

[~brandon.williams] sstableexpiredblockers prints no message when I run it. 
According to sstablemetadata, the new SSTable contains records with TTL=0; 
maybe that is the reason?

> a new SSTable created when single SSTable tombstone compact occurred in TWCS
> 
>
> Key: CASSANDRA-17399
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17399
> Project: Cassandra
>  Issue Type: Bug
>Reporter: eason hao
>Priority: Normal
>  Labels: 3.10
>
> We found that a new SSTable is created when a single-SSTable tombstone 
> compaction occurs. The Cassandra version is *cqlsh 5.0.1 | Cassandra 3.10 | 
> CQL spec 3.4.4*, and we use *TWCS*.
> The old SSTable, whose estimated droppable tombstones ratio is above 0.9, is 
> the oldest SSTable in this table. It stores the oldest records and contains 
> the same partitions as newer SSTables; nothing is reported as blocking its 
> deletion as an expired SSTable.
> After the old SSTable had existed for almost TTL+gc_grace_seconds, it was 
> deleted, but I later found that a new SSTable had been created. The log shows 
> that the new SSTable was produced from the old one: the old SSTable was 
> 42.920MiB and the new SSTable is 2.381MiB.
>  
> {code:java}
> DEBUG [CompactionExecutor:44581] 2022-02-21 11:11:15,429 CompactionTask.java:255 - Compacted (e99c1550-9306-11ec-8461-0bfbe41d7414) 1 sstables to [.../mc-317850-big,] to level=0. 42.920MiB to 2.381MiB (~5% of original) in 31,424ms. Read Throughput = 1.366MiB/s, Write Throughput = 77.602KiB/s, Row Throughput = ~4,311/s. 194 total partitions merged to 194. Partition merge counts were {1:194, }
> {code}
>  
> The new SSTable also contains odd data: every field contains only 
> deletion_info, while the partition/clustering/x/y are the same as in the old 
> SSTable.
>  
> {code:java}
> "cells" : [
>           { "name" : "x", "deletion_info" : { "local_delete_time" : "2022-02-12T10:55:15Z" } },
>           { "name" : "y", "deletion_info" : { "local_delete_time" : "2022-02-12T10:55:15Z" } },
> ...
> }{code}
> Also, the old SSTable only contains part of the data in the new SSTable; we 
> found 129426 rows in the old one and 94694 rows in the new one.
>  
> Also, sstablemetadata shows TTL min: 0, but when I dumped all data from the 
> old SSTable I could not find any record with ttl=0; all the data looks the 
> same as the deletion_info records.
>  
> {code:java}
> Minimum timestamp: 1644740070072443
> Maximum timestamp: 1644742695566429
> SSTable min local deletion time: 1644740070
> SSTable max local deletion time: 1645433895
> Compressor: org.apache.cassandra.io.compress.LZ4Compressor
> Compression ratio: 0.01234938023191464
> TTL min: 0
> TTL max: 691200
> Estimated droppable tombstones: 0.9057755011460312 {code}
>  
>  
> I guess this is not working as designed. As I understand it, when an SSTable 
> has lived longer than TTL+gc_grace_seconds, it should be deleted if its 
> estimated droppable tombstones exceed the threshold. So the behavior of 
> creating a new SSTable should be removed.
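
The expectation in the last paragraph can be written down as a small check. This is only a sketch of the reporter's stated rule, not Cassandra's compaction code: the function name, the `gc_grace_seconds` value, the threshold and the clock values are assumptions, and the real decision also depends on whether other SSTables overlap with this one, which the sketch ignores.

```python
def expected_to_be_dropped(
    max_local_deletion_time: int,
    gc_grace_seconds: int,
    now: int,
    droppable_ratio: float,
    tombstone_threshold: float,
) -> bool:
    """Reporter's rule: everything is past gc grace AND the droppable ratio is high."""
    past_gc_grace = max_local_deletion_time + gc_grace_seconds < now
    return past_gc_grace and droppable_ratio > tombstone_threshold


# Values copied from the sstablemetadata output above.
MAX_LOCAL_DELETION_TIME = 1645433895
DROPPABLE_RATIO = 0.9057755011460312

# Assumed values, not stated in the report.
GC_GRACE_SECONDS = 3600
TOMBSTONE_THRESHOLD = 0.2

# One second after the newest deletion time has passed gc grace.
now = MAX_LOCAL_DELETION_TIME + GC_GRACE_SECONDS + 1
print(expected_to_be_dropped(
    MAX_LOCAL_DELETION_TIME, GC_GRACE_SECONDS, now, DROPPABLE_RATIO, TOMBSTONE_THRESHOLD
))
```

If the compaction in the log ran before the SSTable's maximum local deletion time plus gc grace had passed, this rule alone would not predict a full drop, which may be worth checking against the actual timestamps.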
>  






[jira] [Commented] (CASSANDRA-17002) Cassandra Ring state transitions should be available through a Virtual Table

2022-02-22 Thread Francisco Guerrero (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496424#comment-17496424
 ] 

Francisco Guerrero commented on CASSANDRA-17002:


[~dcapwell] [~yifanc] [~skoppu] thanks for the review and the discussion.

I will repurpose this Jira for the gossipinfo virtual table. I had a longer 
conversation in slack with Stefan Podkowinski to create a generic virtual table 
to expose all DiagnosticEvents. The scope of that table is a little larger, but 
the idea would be to have a table where you can query DiagnosticEvents like 
this:

{{SELECT * FROM events WHERE eventType = 'GossipEvent' AND timestamp >= 1222}}

I will go ahead and create a new Jira for that work, and I will change the 
description and title of this Jira. I will be pushing a new commit for this 
work to support gossipinfo as a virtual table soon.
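
To make the intended semantics of that query concrete, here is a minimal in-memory sketch. It is purely illustrative: the `Event` fields and the `select_events` helper are assumptions, not the schema or API of any existing virtual table.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    # Hypothetical columns; the real table layout is not defined in this thread.
    event_type: str
    timestamp: int
    detail: str


def select_events(events: List[Event], event_type: str, min_timestamp: int) -> List[Event]:
    """Mimic: SELECT * FROM events WHERE eventType = ? AND timestamp >= ?"""
    return [
        e for e in events
        if e.event_type == event_type and e.timestamp >= min_timestamp
    ]


events = [
    Event("GossipEvent", 1000, "too old"),
    Event("GossipEvent", 1222, "matches lower bound"),
    Event("OtherEvent", 1500, "wrong type"),
    Event("GossipEvent", 2000, "matches"),
]
for row in select_events(events, "GossipEvent", 1222):
    print(row)
```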

> Cassandra Ring state transitions should be available through a Virtual Table
> 
>
> Key: CASSANDRA-17002
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17002
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Feature/Virtual Tables
>Reporter: Dinesh Joshi
>Assignee: Francisco Guerrero
>Priority: Normal
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> In many situations it is beneficial to see the last N Gossip state 
> transitions for debugging and other purposes. We should expose the last N 
> state transitions through a bounded Virtual Table.
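
One simple way to keep only the last N transitions is a fixed-size buffer that discards the oldest entry when full. The sketch below shows the idea with Python's `collections.deque`; the class name and the shape of a transition row are assumptions made for illustration, not the design chosen for this ticket.

```python
from collections import deque
from typing import List, Tuple

Transition = Tuple[str, str, str]  # (endpoint, old_state, new_state), hypothetical row shape


class BoundedTransitionLog:
    """Keeps at most `capacity` transitions; the oldest is evicted first."""

    def __init__(self, capacity: int) -> None:
        self._entries = deque(maxlen=capacity)

    def record(self, endpoint: str, old_state: str, new_state: str) -> None:
        # deque(maxlen=...) drops the leftmost (oldest) entry automatically.
        self._entries.append((endpoint, old_state, new_state))

    def rows(self) -> List[Transition]:
        """Return the retained transitions, oldest first."""
        return list(self._entries)


log = BoundedTransitionLog(capacity=2)
log.record("10.0.0.1", "BOOT", "NORMAL")
log.record("10.0.0.2", "NORMAL", "LEAVING")
log.record("10.0.0.2", "LEAVING", "LEFT")
print(log.rows())
```

Bounding the buffer keeps memory use constant regardless of how long the node runs, at the cost of losing older history.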






[cassandra-website] branch asf-staging updated (e31295c -> b097ece)

2022-02-22 Thread git-site-role
This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a change to branch asf-staging
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git.


omit e31295c  generate docs for c8135531
 add 123b47c  CASSANDRA-17398 February 2022 blog "Apache Cassandra and Java 
SE 11 support"
 new b097ece  generate docs for 123b47c3

This update added new revisions after undoing existing revisions.
That is to say, some revisions that were in the old version of the
branch are not in the new version.  This situation occurs
when a user --force pushes a change and generates a repository
containing something like this:

 * -- * -- B -- O -- O -- O   (e31295c)
            \
             N -- N -- N   refs/heads/asf-staging (b097ece)

You should already have received notification emails for all of the O
revisions, and so the following emails describe only the N revisions
from the common base, B.

Any revisions marked "omit" are not gone; other references still
refer to them.  Any revisions marked "discard" are gone forever.

The 1 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 ...java-se-11-support-unsplash-michiel-leunens.jpg | Bin 0 -> 116492 bytes
 content/_/blog.html|  25 +++
 ...> Apache-Cassandra-and-Java-SE-11-support.html} |  77 +++--
 content/search-index.js|   2 +-
 ...java-se-11-support-unsplash-michiel-leunens.jpg | Bin 0 -> 116492 bytes
 site-content/source/modules/ROOT/pages/blog.adoc   |  26 +++
 .../Apache-Cassandra-and-Java-SE-11-support.adoc   |  39 +++
 site-ui/build/ui-bundle.zip| Bin 4740084 -> 4740084 
bytes
 8 files changed, 116 insertions(+), 53 deletions(-)
 create mode 100644 
content/_/_images/blog/apache-cassandra-and-java-se-11-support-unsplash-michiel-leunens.jpg
 copy content/_/blog/{World-Party.html => 
Apache-Cassandra-and-Java-SE-11-support.html} (70%)
 create mode 100644 
site-content/source/modules/ROOT/images/blog/apache-cassandra-and-java-se-11-support-unsplash-michiel-leunens.jpg
 create mode 100644 
site-content/source/modules/ROOT/pages/blog/Apache-Cassandra-and-Java-SE-11-support.adoc




[jira] [Updated] (CASSANDRA-17398) WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"

2022-02-22 Thread Erick Ramirez (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Ramirez updated CASSANDRA-17398:
--
Source Control Link: 
https://github.com/apache/cassandra-website/commit/123b47c3a13402ee562eb0111defd74883369b25
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Committed as 
[{{123b47c}}|https://github.com/apache/cassandra-website/commit/123b47c3a13402ee562eb0111defd74883369b25].

> WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"
> --
>
> Key: CASSANDRA-17398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17398
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Blog
>Reporter: Diogenese Topper
>Assignee: Chris Thornett
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0.3
>
> Attachments: c17398-01-blog-index.png, c17398-02-blog-post.png
>
>
> This ticket is to capture the work associated with publishing the January 
> 2022 blog "Apache Cassandra and Java SE 11 support"
> If this blog cannot be published by the *February 24, 2022 publish date*, 
> please contact me on ASF Slack or suggest/make changes when possible in the 
> pull request for the appropriate time that the blog will go live (on 
> blog.adoc and in the blog post .adoc).






[jira] [Updated] (CASSANDRA-17398) WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"

2022-02-22 Thread Erick Ramirez (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Ramirez updated CASSANDRA-17398:
--
Fix Version/s: 4.0.3
   (was: NA)

> WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"
> --
>
> Key: CASSANDRA-17398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17398
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Blog
>Reporter: Diogenese Topper
>Assignee: Chris Thornett
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0.3
>
> Attachments: c17398-01-blog-index.png, c17398-02-blog-post.png
>
>
> This ticket is to capture the work associated with publishing the January 
> 2022 blog "Apache Cassandra and Java SE 11 support"
> If this blog cannot be published by the *February 24, 2022 publish date*, 
> please contact me on ASF Slack or suggest/make changes when possible in the 
> pull request for the appropriate time that the blog will go live (on 
> blog.adoc and in the blog post .adoc).






[jira] [Updated] (CASSANDRA-17398) WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"

2022-02-22 Thread Erick Ramirez (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Ramirez updated CASSANDRA-17398:
--
Status: Ready to Commit  (was: Review In Progress)

Looks great. Fantastic post, [~Calico] ! 👍

  !c17398-01-blog-index.png|width=300! 
 !c17398-02-blog-post.png|width=300! 

> WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"
> --
>
> Key: CASSANDRA-17398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17398
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Blog
>Reporter: Diogenese Topper
>Assignee: Chris Thornett
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
> Attachments: c17398-01-blog-index.png, c17398-02-blog-post.png
>
>
> This ticket is to capture the work associated with publishing the January 
> 2022 blog "Apache Cassandra and Java SE 11 support"
> If this blog cannot be published by the *February 24, 2022 publish date*, 
> please contact me on ASF Slack or suggest/make changes when possible in the 
> pull request for the appropriate time that the blog will go live (on 
> blog.adoc and in the blog post .adoc).






[cassandra-website] branch trunk updated: CASSANDRA-17398 February 2022 blog "Apache Cassandra and Java SE 11 support"

2022-02-22 Thread erickramirezau
This is an automated email from the ASF dual-hosted git repository.

erickramirezau pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


The following commit(s) were added to refs/heads/trunk by this push:
 new 123b47c  CASSANDRA-17398 February 2022 blog "Apache Cassandra and Java 
SE 11 support"
123b47c is described below

commit 123b47c3a13402ee562eb0111defd74883369b25
Author: Diogenese Topper 
AuthorDate: Tue Feb 22 13:15:32 2022 -0800

CASSANDRA-17398 February 2022 blog "Apache Cassandra and Java SE 11 support"

patch by Chris Thornett, Diogenese Topper; reviewed by Erick Ramirez for 
CASSANDRA-17398
---
 ...java-se-11-support-unsplash-michiel-leunens.jpg | Bin 0 -> 116492 bytes
 site-content/source/modules/ROOT/pages/blog.adoc   |  26 ++
 .../Apache-Cassandra-and-Java-SE-11-support.adoc   |  39 +
 3 files changed, 65 insertions(+)

diff --git 
a/site-content/source/modules/ROOT/images/blog/apache-cassandra-and-java-se-11-support-unsplash-michiel-leunens.jpg
 
b/site-content/source/modules/ROOT/images/blog/apache-cassandra-and-java-se-11-support-unsplash-michiel-leunens.jpg
new file mode 100644
index 000..e389629
Binary files /dev/null and 
b/site-content/source/modules/ROOT/images/blog/apache-cassandra-and-java-se-11-support-unsplash-michiel-leunens.jpg
 differ
diff --git a/site-content/source/modules/ROOT/pages/blog.adoc 
b/site-content/source/modules/ROOT/pages/blog.adoc
index 14e51cd..9aa11e9 100644
--- a/site-content/source/modules/ROOT/pages/blog.adoc
+++ b/site-content/source/modules/ROOT/pages/blog.adoc
@@ -14,6 +14,32 @@ NOTES FOR CONTENT CREATORS
 [openblock,card-header]
 --
 [discrete]
+=== Java SE 11 LTS and Apache Cassandra
+[discrete]
+ February 24, 2022
+--
+[openblock,card-content]
+--
+With the release of version 4.0.2, Cassandra's support
+for Java 11 will no longer be experimental and offers a number of features 
including better performance because of better garbage collection.
+
+[openblock,card-btn card-btn--blog]
+
+
+[.btn.btn--alt]
+xref:blog/Apache-Cassandra-and-Java-SE-11-support.adoc[Read More]
+
+
+--
+
+//end card
+
+//start card
+[openblock,card shadow relative test]
+
+[openblock,card-header]
+--
+[discrete]
 === Apache Cassandra Upgrade Advisory
 [discrete]
  February 18, 2022
diff --git 
a/site-content/source/modules/ROOT/pages/blog/Apache-Cassandra-and-Java-SE-11-support.adoc
 
b/site-content/source/modules/ROOT/pages/blog/Apache-Cassandra-and-Java-SE-11-support.adoc
new file mode 100644
index 000..faf07a0
--- /dev/null
+++ 
b/site-content/source/modules/ROOT/pages/blog/Apache-Cassandra-and-Java-SE-11-support.adoc
@@ -0,0 +1,39 @@
+= Apache Cassandra and Java SE 11 support
+:page-layout: single-post
+:page-role: blog-post
+:page-post-date: February, 24 2021
+:page-post-author: Chris Thornett
+:description: The Apache Cassandra Community
+:keywords: Java, Cassandra 4.0, garbage collection
+
+:!figure-caption:
+
+.Image credit: https://unsplash.com/@leunesmedia/[Michiel Leunens on Unsplash^]
+image::blog/apache-cassandra-and-java-se-11-support-unsplash-michiel-leunens.jpg[Java
 coffee and cake]
+
+In September 2021, 
https://www.oracle.com/java/technologies/java-se-support-roadmap.html[Oracle
+announced their new Java support roadmap^]. Part of this announcement included 
the designation of certain releases as Long-Term-Support (LTS). LTS releases 
are eligible for Oracle's premier and extended support, and a new LTS release 
will be announced approximately every two years. As of this announcement, Java 
SE 7, 8, 11, and 17 are designated as LTS releases.
+
+The LTS designation is excellent news for Apache Cassandra. As an open source 
project aiming to be enterprise-ready, LTS releases give us a target platform 
to develop atop. We can focus on utilizing and leveraging language features 
that we know will continue to be developed
+and supported, giving all of our users a feeling of stability, predictability, 
and confidence.
+
+While earlier versions of Apache Cassandra are built for the Java 8 platform, 
with the release of version 4.0.2, Cassandra's support for Java 11 will no 
longer be experimental. We will support Java 11 as our LTS release of choice.
+
+=== Better Performance with Better Garbage Collection
+
+Java 11 has many improvements over Java 8. One significant advantage is the 
choice of garbage collectors—the process Java uses to remove data that is no 
longer needed from memory—which can significantly impact microservice 
performance. Garbage collection can cause unpredictable pauses in Java 
applications, so any improvements are welcome. There are three promising 
garbage collectors in Java 11: G1GC, Shenandoah and ZGC, although we can’t 
expressly recommend ZGC or a LTS release as it is  [...]
+
+G1GC, or the
+https://docs.oracle.com/en/java/javase/11/gctuning/garbage-first-garbage-collector.ht

[jira] [Updated] (CASSANDRA-17398) WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"

2022-02-22 Thread Erick Ramirez (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Ramirez updated CASSANDRA-17398:
--
Attachment: c17398-01-blog-index.png

> WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"
> --
>
> Key: CASSANDRA-17398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17398
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Blog
>Reporter: Diogenese Topper
>Assignee: Chris Thornett
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
> Attachments: c17398-01-blog-index.png, c17398-02-blog-post.png
>
>
> This ticket is to capture the work associated with publishing the January 
> 2022 blog "Apache Cassandra and Java SE 11 support"
> If this blog cannot be published by the *February 24, 2022 publish date*, 
> please contact me on ASF Slack or suggest/make changes when possible in the 
> pull request for the appropriate time that the blog will go live (on 
> blog.adoc and in the blog post .adoc).






[jira] [Updated] (CASSANDRA-17398) WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"

2022-02-22 Thread Erick Ramirez (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Ramirez updated CASSANDRA-17398:
--
Attachment: c17398-02-blog-post.png

> WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"
> --
>
> Key: CASSANDRA-17398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17398
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Blog
>Reporter: Diogenese Topper
>Assignee: Chris Thornett
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
> Attachments: c17398-01-blog-index.png, c17398-02-blog-post.png
>
>
> This ticket is to capture the work associated with publishing the January 
> 2022 blog "Apache Cassandra and Java SE 11 support"
> If this blog cannot be published by the *February 24, 2022 publish date*, 
> please contact me on ASF Slack or suggest/make changes when possible in the 
> pull request for the appropriate time that the blog will go live (on 
> blog.adoc and in the blog post .adoc).






[jira] [Assigned] (CASSANDRA-17398) WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"

2022-02-22 Thread Erick Ramirez (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Ramirez reassigned CASSANDRA-17398:
-

Assignee: Chris Thornett

> WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"
> --
>
> Key: CASSANDRA-17398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17398
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Blog
>Reporter: Diogenese Topper
>Assignee: Chris Thornett
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>
> This ticket is to capture the work associated with publishing the January 
> 2022 blog "Apache Cassandra and Java SE 11 support"
> If this blog cannot be published by the *February 24, 2022 publish date*, 
> please contact me on ASF Slack or suggest/make changes when possible in the 
> pull request for the appropriate time that the blog will go live (on 
> blog.adoc and in the blog post .adoc).






[jira] [Updated] (CASSANDRA-17398) WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"

2022-02-22 Thread Erick Ramirez (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Ramirez updated CASSANDRA-17398:
--
Reviewers: Erick Ramirez
   Status: Review In Progress  (was: Patch Available)

> WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"
> --
>
> Key: CASSANDRA-17398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17398
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Blog
>Reporter: Diogenese Topper
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>
> This ticket is to capture the work associated with publishing the January 
> 2022 blog "Apache Cassandra and Java SE 11 support"
> If this blog cannot be published by the *February 24, 2022 publish date*, 
> please contact me on ASF Slack or suggest/make changes when possible in the 
> pull request for the appropriate time that the blog will go live (on 
> blog.adoc and in the blog post .adoc).






[jira] [Commented] (CASSANDRA-17332) Add support for vnodes in jvm-dtest

2022-02-22 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496387#comment-17496387
 ] 

David Capwell commented on CASSANDRA-17332:
---

Sorry for dropping this; several other things got unblocked at the same time. 
I will get back to this tomorrow to show the trunk work. My goal is to support 
this in trunk; I don't intend to support it in older branches unless the effort 
is trivial.

> Add support for vnodes in jvm-dtest
> ---
>
> Key: CASSANDRA-17332
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17332
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Test/dtest/java
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.x
>
>
> Right now Python dtests need to keep running after being ported to jvm-dtests 
> because vnode support is not present. To fully deprecate the Python dtests, 
> we need vnode support in jvm-dtest.
> Sadly, to add support we need to break binary compatibility, though we can 
> maintain source compatibility, so we will need to bump every jar across every 
> branch (mostly due to TokenSupplier).






[jira] [Commented] (CASSANDRA-17292) Move cassandra.yaml toward a nested structure around major database concepts

2022-02-22 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496352#comment-17496352
 ] 

Paulo Motta commented on CASSANDRA-17292:
-

{quote}I think the main point of contention then is incremental vs. 
non-incremental migration of existing configuration.
{quote}
I think we can support the new layout for new configurations added before 5.X. 
For existing (legacy) configurations I see the following options:
a) Non-incrementally migrate all legacy properties to the new layout on 5.X
b) Incrementally migrate on 4.x while allowing users to opt-in to the new 
configuration, and switch that to opt-out on 5.x.

I'm slightly in favor of b) due to splitting the work into bite-sized chunks 
and making the new layout incrementally available earlier, but I'm also OK with 
a).
{quote}I think the thought that's hard for me to escape around this is that we 
really want a coherent design for the whole configuration up-front, given the 
lack of one is at least partially to blame for the current mess.
{quote}
This is my main motivation for chiming in here with this feature-centric 
proposal, since it allows anyone to pretty easily decide where a particular 
configuration belongs using the following heuristic when adding a new 
configuration option:
 * Does this configuration belong to an existing {{FeatureConfiguration}}?
 ** If yes, add the new property to the existing {{FeatureConfiguration}}.
 ** If not, create a new {{FeatureConfiguration}} subclass for the particular 
feature that you're adding.

No prior knowledge of the "domain model" is needed to use the heuristic above 
when deciding where a configuration should go.
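
The heuristic above can be sketched in a few lines. This is an illustration only: the real proposal talks about {{FeatureConfiguration}} subclasses in the Cassandra codebase, whereas the sketch uses one generic class keyed by feature name, and the feature and property names in the example are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class FeatureConfiguration:
    """Groups all options that belong to a single feature."""
    feature: str
    properties: Dict[str, Any] = field(default_factory=dict)


def place_option(
    configs: Dict[str, FeatureConfiguration], feature: str, name: str, value: Any
) -> FeatureConfiguration:
    """Apply the heuristic: reuse the feature's group if it exists, else create it."""
    group = configs.get(feature)
    if group is None:
        # No existing group for this feature: create a new one.
        group = FeatureConfiguration(feature)
        configs[feature] = group
    group.properties[name] = value
    return group


configs: Dict[str, FeatureConfiguration] = {}
place_option(configs, "guardrails", "example_warn_threshold", 100)  # creates the group
place_option(configs, "guardrails", "example_fail_threshold", 200)  # reuses the group
print(sorted(configs["guardrails"].properties))
```

The only decision the person adding an option has to make is which feature it belongs to; everything else follows from the lookup.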
{quote}Then, if we have that, and we can work out whatever small 
inconsistencies exist, we can present operators with a clean v2 config file 
format in 5.0 (that requires us to do very little thinking about compatibility, 
outside checking the version element).
{quote}
The migration of "legacy configuration" to the new feature-centric layout is 
also straightforward using the same heuristics above, for whenever we decide to 
perform a "big bang" switch to the new configuration layout.

> Move cassandra.yaml toward a nested structure around major database concepts
> 
>
> Key: CASSANDRA-17292
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17292
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.x
>
>
> Recent mailing list conversation (see "[DISCUSS] Nested YAML configs for new 
> features") has made it clear we will gravitate toward appropriately nested 
> structures for new parameters in {{cassandra.yaml}}, but from the scattered 
> conversation across a few Guardrails tickets (see CASSANDRA-17212 and 
> CASSANDRA-17148) and CASSANDRA-15234, there is also a general desire to 
> eventually extend this to the rest of {{cassandra.yaml}}. The benefits of 
> this change include those we gain by doing it for new features (single point 
> of interest for feature documentation, typed configuration objects, logical 
> grouping for additional parameters added over time, discoverability, etc.), 
> but on a larger scale.
> This may overlap with ongoing work, including the Guardrails epic. Ideally, 
> even a rough cut of a design here would allow that to move forward in a 
> timely and coherent manner (with less long-term refactoring pain).
> Current proposals:
> From [~benedict] - 
> https://github.com/belliottsmith/cassandra/commits/CASSANDRA-15234-grouping-ideas
> From [~maedhroz] - 
> https://github.com/maedhroz/cassandra/commit/450b920e0ac072cec635e0ebcb63538ee7f1fc5a
> From [~paulo] - 
> https://gist.github.com/pauloricardomg/e9e23feea1b172b4f084cb01d7a89b05



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17166) Enhance SnakeYAML properties to be reusable outside of YAML parsing, support camel case conversion to snake case, and add support to ignore properties

2022-02-22 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496344#comment-17496344
 ] 

David Capwell commented on CASSANDRA-17166:
---

[~jjordan], [~e.dimitrova], [~stefan.miklosovic] [~franciscog] all feedback 
addressed, ready for review

> Enhance SnakeYAML properties to be reusable outside of YAML parsing, support 
> camel case conversion to snake case, and add support to ignore properties
> --
>
> Key: CASSANDRA-17166
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17166
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
> Fix For: 4.x
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> SnakeYaml is rather limited in the “object mapping” layer, which forces our 
> internal code to match specific patterns (all fields public and camel case); 
> we can remove this restriction by leveraging Jackson for property lookup, and 
> leaving the YAML handling to SnakeYAML
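
As a rough illustration of the camel-case-to-snake-case conversion named in the ticket title, here is a minimal Java sketch. It is not the patch's implementation: the class and method names are made up for this example, and it deliberately ignores acronyms (for instance, "URL" would come out as "u_r_l").

```java
// Hypothetical helper for illustration; not taken from the CASSANDRA-17166 patch.
final class CaseConversion
{
    // Converts a camelCase Java field name to the snake_case form used in YAML.
    static String camelToSnake(String name)
    {
        StringBuilder out = new StringBuilder(name.length() + 4);
        for (int i = 0; i < name.length(); i++)
        {
            char c = name.charAt(i);
            if (Character.isUpperCase(c))
            {
                // every upper-case letter starts a new word, except at position 0
                if (i > 0)
                    out.append('_');
                out.append(Character.toLowerCase(c));
            }
            else
            {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this sketch, a field named maxHintWindow would map to the YAML key max_hint_window, while names that are already lower case pass through unchanged.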






[jira] [Commented] (CASSANDRA-17293) Update python test framework from nose to pytest

2022-02-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496335#comment-17496335
 ] 

Stefan Miklosovic commented on CASSANDRA-17293:
---

[~mck] if you look here (1) (that is the "new test"), it says that there is a test 
called "testrun_cqlsh". However, if you look at what test_unicode looks like, it 
consists of four tests (2).

If you take a look at how one of these methods looks in the code, there is, 
for example:

{code}
def test_unicode_value_round_trip(self):
    with testrun_cqlsh(tty=True, env=self.default_env) as c:
        ...  # test body omitted in the original comment
{code}

Hence, to me, it looks like it will report that "testrun_cqlsh" is also a test 
which "passed".

I am not sure how to get rid of this. Maybe we could programmatically remove 
that entry from the XML; I am not sure how complicated that would be. 

(1) 
https://ci-cassandra.apache.org/job/Cassandra-trunk-cqlsh-tests/lastSuccessfulBuild/cython=yes,jdk=jdk_1.8_latest,label=cassandra/testReport/cqlshlib.python3.jdk8.no_cython.test/test_unicode/

(2) 
https://ci-cassandra.apache.org/job/Cassandra-trunk-cqlsh-tests/lastSuccessfulBuild/cython=yes,jdk=jdk_1.8_latest,label=cassandra/testReport/cqlshlib.python3.jdk8.no_cython.test.test_unicode/TestCqlshUnicode/
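
The idea of removing the spurious entry from the XML could be sketched roughly as below. This assumes the report is a JUnit-style XML document whose results are {{testcase}} elements with a {{name}} attribute, which has not been verified against the actual CI output; {{ReportFilter}} and {{dropTestcase}} are hypothetical names. The sketch also leaves any {{tests}} counter on the enclosing suite untouched, so a real version would likely need to adjust that as well.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical post-processing step; assumes a JUnit-style report layout.
final class ReportFilter
{
    // Returns the report with every <testcase name="..."> matching 'name' removed.
    static String dropTestcase(String xml, String name)
    {
        try
        {
            Document doc = DocumentBuilderFactory.newInstance()
                                                 .newDocumentBuilder()
                                                 .parse(new InputSource(new StringReader(xml)));
            NodeList cases = doc.getElementsByTagName("testcase");
            // Iterate backwards: the NodeList is live and shrinks as nodes are removed.
            for (int i = cases.getLength() - 1; i >= 0; i--)
            {
                Element testcase = (Element) cases.item(i);
                if (name.equals(testcase.getAttribute("name")))
                    testcase.getParentNode().removeChild(testcase);
            }
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance()
                              .newTransformer()
                              .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        }
        catch (Exception e)
        {
            throw new IllegalStateException("Could not rewrite test report", e);
        }
    }
}
```

Filtering the report only hides the symptom. Keeping the helper from being collected as a test in the first place may turn out to be the simpler fix.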

> Update python test framework from nose to pytest
> 
>
> Key: CASSANDRA-17293
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17293
> Project: Cassandra
>  Issue Type: Task
>  Components: CQL/Interpreter
>Reporter: Brad Schoening
>Assignee: Brad Schoening
>Priority: Normal
> Fix For: 4.1
>
>
> I had trouble trying to install and run the python nose test from pip 
> (nosetest not found).
> According to the homepage of nose at [https://nose.readthedocs.io/en/latest/]
> h1. _Note to Users_
> _Nose has been in maintenance mode for the past several years and will likely 
> cease without a new person/team to take over maintainership. New projects 
> should consider using [Nose2|https://github.com/nose-devs/nose2], 
> [py.test|http://pytest.org/], or just plain unittest/unittest2._
>  
> Upgrading to pytest is likely the least effort. 






[jira] [Updated] (CASSANDRA-17398) WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"

2022-02-22 Thread Diogenese Topper (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Diogenese Topper updated CASSANDRA-17398:
-
Test and Documentation Plan: 
* Add blog post titled "Apache Cassandra and Java SE 11 support"
* Update blog index
* Add image for blog: 
"apache-cassandra-and-java-se-11-support-unsplash-michiel-leunens.jpg"

  was:
* Add blog post titled "Apache Cassandra and Java SE 11 LTS support"
* Update blog index
* Add image for blog: 
"apache-cassandra-and-java-se-11-lts-support-unsplash-michiel-leunens.jpg"


> WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"
> --
>
> Key: CASSANDRA-17398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17398
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Blog
>Reporter: Diogenese Topper
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>
> This ticket is to capture the work associated with publishing the January 
> 2022 blog "Apache Cassandra and Java SE 11 LTS support"
> If this blog cannot be published by the *February 24, 2022 publish date*, 
> please contact me on ASF Slack or suggest/make changes when possible in the 
> pull request for the appropriate time that the blog will go live (on 
> blog.adoc and in the blog post .adoc).






[jira] [Updated] (CASSANDRA-17398) WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"

2022-02-22 Thread Diogenese Topper (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Diogenese Topper updated CASSANDRA-17398:
-
Description: 
This ticket is to capture the work associated with publishing the January 2022 
blog "Apache Cassandra and Java SE 11 support"

If this blog cannot be published by the *February 24, 2022 publish date*, 
please contact me on ASF Slack or suggest/make changes when possible in the 
pull request for the appropriate time that the blog will go live (on blog.adoc 
and in the blog post .adoc).

  was:
This ticket is to capture the work associated with publishing the January 2022 
blog "Apache Cassandra and Java SE 11 LTS support"

If this blog cannot be published by the *February 24, 2022 publish date*, 
please contact me on ASF Slack or suggest/make changes when possible in the 
pull request for the appropriate time that the blog will go live (on blog.adoc 
and in the blog post .adoc).


> WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"
> --
>
> Key: CASSANDRA-17398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17398
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Blog
>Reporter: Diogenese Topper
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>
> This ticket is to capture the work associated with publishing the January 
> 2022 blog "Apache Cassandra and Java SE 11 support"
> If this blog cannot be published by the *February 24, 2022 publish date*, 
> please contact me on ASF Slack or suggest/make changes when possible in the 
> pull request for the appropriate time that the blog will go live (on 
> blog.adoc and in the blog post .adoc).






[jira] [Updated] (CASSANDRA-17398) WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"

2022-02-22 Thread Diogenese Topper (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Diogenese Topper updated CASSANDRA-17398:
-
Summary: WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 
support"  (was: WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 
LTS support")

> WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 support"
> --
>
> Key: CASSANDRA-17398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17398
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Blog
>Reporter: Diogenese Topper
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>
> This ticket is to capture the work associated with publishing the January 
> 2022 blog "Apache Cassandra and Java SE 11 LTS support"
> If this blog cannot be published by the *February 24, 2022 publish date*, 
> please contact me on ASF Slack or suggest/make changes when possible in the 
> pull request for the appropriate time that the blog will go live (on 
> blog.adoc and in the blog post .adoc).






[jira] [Updated] (CASSANDRA-17398) WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 LTS support"

2022-02-22 Thread Diogenese Topper (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Diogenese Topper updated CASSANDRA-17398:
-
Status: Patch Available  (was: Open)

https://github.com/apache/cassandra-website/pull/104

> WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 LTS support"
> --
>
> Key: CASSANDRA-17398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17398
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Blog
>Reporter: Diogenese Topper
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>
> This ticket is to capture the work associated with publishing the January 
> 2022 blog "Apache Cassandra and Java SE 11 LTS support"
> If this blog cannot be published by the *February 24, 2022 publish date*, 
> please contact me on ASF Slack or suggest/make changes when possible in the 
> pull request for the appropriate time that the blog will go live (on 
> blog.adoc and in the blog post .adoc).






[jira] [Updated] (CASSANDRA-17398) WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 LTS support"

2022-02-22 Thread Diogenese Topper (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Diogenese Topper updated CASSANDRA-17398:
-
Status: Open  (was: Triage Needed)

> WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 LTS support"
> --
>
> Key: CASSANDRA-17398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17398
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Blog
>Reporter: Diogenese Topper
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>
> This ticket is to capture the work associated with publishing the January 
> 2022 blog "Apache Cassandra and Java SE 11 LTS support"
> If this blog cannot be published by the *February 24, 2022 publish date*, 
> please contact me on ASF Slack or suggest/make changes when possible in the 
> pull request for the appropriate time that the blog will go live (on 
> blog.adoc and in the blog post .adoc).






[jira] [Updated] (CASSANDRA-17398) WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 LTS support"

2022-02-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CASSANDRA-17398:
---
Labels: pull-request-available  (was: )

> WEBSITE - February 2022 blog "Apache Cassandra and Java SE 11 LTS support"
> --
>
> Key: CASSANDRA-17398
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17398
> Project: Cassandra
>  Issue Type: Task
>  Components: Documentation/Blog
>Reporter: Diogenese Topper
>Priority: Normal
>  Labels: pull-request-available
> Fix For: NA
>
>
> This ticket is to capture the work associated with publishing the January 
> 2022 blog "Apache Cassandra and Java SE 11 LTS support"
> If this blog cannot be published by the *February 24, 2022 publish date*, 
> please contact me on ASF Slack or suggest/make changes when possible in the 
> pull request for the appropriate time that the blog will go live (on 
> blog.adoc and in the blog post .adoc).






[jira] [Commented] (CASSANDRA-17292) Move cassandra.yaml toward a nested structure around major database concepts

2022-02-22 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496332#comment-17496332
 ] 

Caleb Rackliffe commented on CASSANDRA-17292:
-

bq. The basic construct to create new feature configurations is the following 
class:
bq. For example this is how "HintsConfiguration" would look like:
bq. And would be represented as following on cassandra.yaml:

Gotcha. I don't think we'd be too far apart on any of that once we get into 
implementation space.

I think the main point of contention then is incremental vs. non-incremental 
migration of _existing_ configuration. (I emphasize "existing", because we have 
the opportunity to do new things like CASSANDRA-17148 without having to change 
it later if we have a coherent design for it to ultimately fit into. Having a 
small section of the config in the same format between v1 and v2 isn't really a 
problem.) There was actually a Slack thread about this very recently 
[here|https://the-asf.slack.com/archives/CK23JSY2K/p1645049135928759]. I think 
the thought that's hard for me to escape around this is that we _really_ want a 
coherent design for the whole configuration up-front, given the lack of one is 
at least partially to blame for the current mess. Then, if we have that, and we 
can work out whatever small inconsistencies exist, we can present operators 
with a clean v2 config file format in 5.0 (that requires us to do very little 
thinking about compatibility, outside checking the {{version}} element).
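
The compatibility check described above could be as small as the following sketch. It assumes the file has already been parsed into a map, that the element is a top-level key literally named {{version}}, and that a missing key means the legacy flat layout; {{ConfigVersion}} is a hypothetical class, not an existing Cassandra API.

```java
import java.util.Map;

// Illustrative only: the class and the way "version" is interpreted are assumptions.
final class ConfigVersion
{
    static final int LEGACY = 1; // flat layout, written before the element existed
    static final int NESTED = 2; // proposed nested layout

    // Picks the layout from an already-parsed YAML document.
    static int detect(Map<String, ?> yaml)
    {
        Object version = yaml.get("version");
        if (version == null)
            return LEGACY;

        int parsed = Integer.parseInt(version.toString().trim());
        if (parsed != LEGACY && parsed != NESTED)
            throw new IllegalArgumentException("Unsupported config version: " + parsed);
        return parsed;
    }
}
```

A loader would branch once on the detected value and hand the document to the matching parser, so the two formats never need to be interpreted together.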

> Move cassandra.yaml toward a nested structure around major database concepts
> 
>
> Key: CASSANDRA-17292
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17292
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.x
>
>
> Recent mailing list conversation (see "[DISCUSS] Nested YAML configs for new 
> features") has made it clear we will gravitate toward appropriately nested 
> structures for new parameters in {{cassandra.yaml}}, but from the scattered 
> conversation across a few Guardrails tickets (see CASSANDRA-17212 and 
> CASSANDRA-17148) and CASSANDRA-15234, there is also a general desire to 
> eventually extend this to the rest of {{cassandra.yaml}}. The benefits of 
> this change include those we gain by doing it for new features (single point 
> of interest for feature documentation, typed configuration objects, logical 
> grouping for additional parameters added over time, discoverability, etc.), 
> but on a larger scale.
> This may overlap with ongoing work, including the Guardrails epic. Ideally, 
> even a rough cut of a design here would allow that to move forward in a 
> timely and coherent manner (with less long-term refactoring pain).
> Current proposals:
> From [~benedict] - 
> https://github.com/belliottsmith/cassandra/commits/CASSANDRA-15234-grouping-ideas
> From [~maedhroz] - 
> https://github.com/maedhroz/cassandra/commit/450b920e0ac072cec635e0ebcb63538ee7f1fc5a
> From [~paulo] - 
> https://gist.github.com/pauloricardomg/e9e23feea1b172b4f084cb01d7a89b05






[jira] [Commented] (CASSANDRA-17293) Update python test framework from nose to pytest

2022-02-22 Thread Stefan Miklosovic (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496331#comment-17496331
 ] 

Stefan Miklosovic commented on CASSANDRA-17293:
---

Wow, seems like it! I'll take a closer look tomorrow.

> Update python test framework from nose to pytest
> 
>
> Key: CASSANDRA-17293
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17293
> Project: Cassandra
>  Issue Type: Task
>  Components: CQL/Interpreter
>Reporter: Brad Schoening
>Assignee: Brad Schoening
>Priority: Normal
> Fix For: 4.1
>
>
> I had trouble trying to install and run the python nose test from pip 
> (nosetest not found).
> According to the homepage of nose at [https://nose.readthedocs.io/en/latest/]
> h1. _Note to Users_
> _Nose has been in maintenance mode for the past several years and will likely 
> cease without a new person/team to take over maintainership. New projects 
> should consider using [Nose2|https://github.com/nose-devs/nose2], 
> [py.test|http://pytest.org/], or just plain unittest/unittest2._
>  
> Upgrading to pytest is likely the least effort. 






[jira] [Commented] (CASSANDRA-17292) Move cassandra.yaml toward a nested structure around major database concepts

2022-02-22 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496330#comment-17496330
 ] 

Paulo Motta commented on CASSANDRA-17292:
-

Added an example of the new feature-centric layout mixed with legacy configuration 
in a single "cassandra.yaml" for illustration: 
https://gist.github.com/pauloricardomg/4369f4b0dd8b84421a11ae61bf2d2c7e

> Move cassandra.yaml toward a nested structure around major database concepts
> 
>
> Key: CASSANDRA-17292
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17292
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.x
>
>
> Recent mailing list conversation (see "[DISCUSS] Nested YAML configs for new 
> features") has made it clear we will gravitate toward appropriately nested 
> structures for new parameters in {{cassandra.yaml}}, but from the scattered 
> conversation across a few Guardrails tickets (see CASSANDRA-17212 and 
> CASSANDRA-17148) and CASSANDRA-15234, there is also a general desire to 
> eventually extend this to the rest of {{cassandra.yaml}}. The benefits of 
> this change include those we gain by doing it for new features (single point 
> of interest for feature documentation, typed configuration objects, logical 
> grouping for additional parameters added over time, discoverability, etc.), 
> but on a larger scale.
> This may overlap with ongoing work, including the Guardrails epic. Ideally, 
> even a rough cut of a design here would allow that to move forward in a 
> timely and coherent manner (with less long-term refactoring pain).
> Current proposals:
> From [~benedict] - 
> https://github.com/belliottsmith/cassandra/commits/CASSANDRA-15234-grouping-ideas
> From [~maedhroz] - 
> https://github.com/maedhroz/cassandra/commit/450b920e0ac072cec635e0ebcb63538ee7f1fc5a
> From [~paulo] - 
> https://gist.github.com/pauloricardomg/e9e23feea1b172b4f084cb01d7a89b05






[jira] [Commented] (CASSANDRA-17292) Move cassandra.yaml toward a nested structure around major database concepts

2022-02-22 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496307#comment-17496307
 ] 

Paulo Motta commented on CASSANDRA-17292:
-

One additional thing I would like to note is that my proposal consciously 
abstains from attempting to pre-define a full domain model upfront, in favor of 
an incremental feature-centric approach, where we migrate the properties from 
the legacy flat format to the new feature-centric format gradually - while new 
features can already start using the new format based on the 
{{FeatureConfiguration}} abstraction - as exemplified above in the migration of 
the "hints" configuration from the old to the new model.
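
To make the gradual migration concrete, here is a minimal sketch of how a loader might move already-migrated flat keys into their feature section while leaving everything else untouched. The class, the mapping table and the choice of keys are assumptions for illustration only; unit conversions and conflict handling are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: the class and the mapping table are assumptions made for this sketch.
final class LegacyKeyMigration
{
    // legacy flat key -> { feature section, key inside that section }
    private static final Map<String, String[]> MIGRATED = Map.of(
        "hinted_handoff_enabled", new String[]{ "hinted_handoff", "enabled" },
        "max_hints_delivery_threads", new String[]{ "hinted_handoff", "max_hints_delivery_threads" });

    // Moves migrated keys into their feature section; every other key stays flat.
    static Map<String, Object> migrate(Map<String, ?> flat)
    {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, ?> entry : flat.entrySet())
        {
            String[] target = MIGRATED.get(entry.getKey());
            if (target == null)
            {
                out.put(entry.getKey(), entry.getValue()); // not migrated yet
                continue;
            }
            @SuppressWarnings("unchecked")
            Map<String, Object> section =
                (Map<String, Object>) out.computeIfAbsent(target[0], k -> new LinkedHashMap<String, Object>());
            section.put(target[1], entry.getValue());
        }
        return out;
    }
}
```

Growing the mapping table one feature at a time is what keeps the approach incremental: keys that are not listed simply keep their legacy position.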

> Move cassandra.yaml toward a nested structure around major database concepts
> 
>
> Key: CASSANDRA-17292
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17292
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.x
>
>
> Recent mailing list conversation (see "[DISCUSS] Nested YAML configs for new 
> features") has made it clear we will gravitate toward appropriately nested 
> structures for new parameters in {{cassandra.yaml}}, but from the scattered 
> conversation across a few Guardrails tickets (see CASSANDRA-17212 and 
> CASSANDRA-17148) and CASSANDRA-15234, there is also a general desire to 
> eventually extend this to the rest of {{cassandra.yaml}}. The benefits of 
> this change include those we gain by doing it for new features (single point 
> of interest for feature documentation, typed configuration objects, logical 
> grouping for additional parameters added over time, discoverability, etc.), 
> but on a larger scale.
> This may overlap with ongoing work, including the Guardrails epic. Ideally, 
> even a rough cut of a design here would allow that to move forward in a 
> timely and coherent manner (with less long-term refactoring pain).
> Current proposals:
> From [~benedict] - 
> https://github.com/belliottsmith/cassandra/commits/CASSANDRA-15234-grouping-ideas
> From [~maedhroz] - 
> https://github.com/maedhroz/cassandra/commit/450b920e0ac072cec635e0ebcb63538ee7f1fc5a
> From [~paulo] - 
> https://gist.github.com/pauloricardomg/e9e23feea1b172b4f084cb01d7a89b05






[jira] [Comment Edited] (CASSANDRA-17292) Move cassandra.yaml toward a nested structure around major database concepts

2022-02-22 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496302#comment-17496302
 ] 

Paulo Motta edited comment on CASSANDRA-17292 at 2/22/22, 8:29 PM:
---

Thanks for the additional context [~maedhroz], that is very helpful to 
understand the reasoning behind the proposed nesting.

{quote}For a moment, let's ignore the fact that there's any kind of textual 
configuration file at all for the project, but we still have all the 
knobs/systems/etc. The very first thing I would do is create a "domain model" 
for C* configuration on the Java side, a hierarchy rooted in a Configuration 
container class, which would contain members w/ types like 
ClusterConfiguration, NetworkConfiguration, StorageConfiguration, etc. These 
would be easy to navigate, would provide reasonable points for inline 
documentation, could encapsulate validation logic for relationships between 
parameters within subsystems and features, and could be passed as little 
"kernels" of configuration around the codebase, allowing for better mocking, 
etc.
{quote}

I think we're not very far from what we want the end result to look like from 
the developer's perspective. My proposal is just a simplification of yours: 
instead of a multi-level hierarchy rooted on physical resources 
(cluster/network/storage), I'm proposing a feature-centric domain model 
hierarchy with a single level, where each feature defines its own configuration 
subtree.

The basic construct to create new feature configurations is the following class:
{code:java}
public abstract class FeatureConfiguration
{
    // is the feature enabled by default?
    boolean enabled = false;

    // the feature name to be used in the YAML/JSON
    public abstract String getFeatureName();

    // whether this feature can be disabled
    public boolean isOptional()
    {
        return true;
    }
}
{code}
This would make it easy to create a typed configuration for each feature:
 * CommitlogConfiguration
 * HintsConfiguration
 * MaterializedViewsConfiguration

For example, this is what "HintsConfiguration" would look like:
{code:java}
public class HintsConfiguration extends FeatureConfiguration
{
    public HintsConfiguration()
    {
        this.enabled = true;
    }

    @Override
    public String getFeatureName()
    {
        return "hinted_handoff";
    }

    // Defaults are written as their YAML literals for readability; real code
    // would need the Duration/Throttle/Size types to parse these strings.
    boolean auto_hints_cleanup = false;
    Duration max_hint_window = "3h";
    Throttle hinted_handoff_throttle = "1024KiB";
    int max_hints_delivery_threads = 2;
    Duration hints_flush_period = "1ms";
    Size max_hints_file_size = "128MiB";
}
{code}

It would be represented as follows in {{cassandra.yaml}}:

{code:yaml}
# Commit log (cannot be disabled because isOptional()=false)
commit_log:
  commitlog_sync: periodic
  commitlog_sync_period: 1ms
  commitlog_segment_size: 32MiB

# Hinted Handoff
hinted_handoff:
  enabled: true
  auto_hints_cleanup: false
  max_hint_window: 3h
  hinted_handoff_throttle: 1024KiB
  max_hints_delivery_threads: 2
  hints_flush_period: 1ms
  max_hints_file_size: 128MiB

# MVs are experimental and not recommended for production-use
materialized_views:
  enabled: false 
{code}

The approach above provides a very simple user experience while allowing typed 
configuration on the developer's side.
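
One thing the typed model makes easy is enforcing the "cannot be disabled" rule noted in the YAML comment. The sketch below mirrors the abstract class from this comment in reduced form. {{CommitLogConfiguration}}, {{MaterializedViewsConfiguration}} and {{FeatureValidator}} are hypothetical classes written for this illustration, not existing Cassandra code.

```java
// Reduced copy of the abstract class proposed above, just enough for the check.
abstract class FeatureConfiguration
{
    boolean enabled = false;

    abstract String getFeatureName();

    boolean isOptional()
    {
        return true;
    }
}

// Hypothetical core feature: always on, so it overrides isOptional().
final class CommitLogConfiguration extends FeatureConfiguration
{
    CommitLogConfiguration()
    {
        this.enabled = true;
    }

    @Override
    String getFeatureName()
    {
        return "commit_log";
    }

    @Override
    boolean isOptional()
    {
        return false;
    }
}

// Hypothetical optional feature that keeps the default of enabled = false.
final class MaterializedViewsConfiguration extends FeatureConfiguration
{
    @Override
    String getFeatureName()
    {
        return "materialized_views";
    }
}

// Hypothetical startup check, not an existing Cassandra class.
final class FeatureValidator
{
    // Rejects a configuration that disables a feature which is not optional.
    static void validate(FeatureConfiguration config)
    {
        if (!config.isOptional() && !config.enabled)
            throw new IllegalStateException(config.getFeatureName() + " cannot be disabled");
    }
}
```

Running the check once per loaded section would reject a file that sets {{enabled: false}} under {{commit_log}}, while the same setting under {{materialized_views}} passes.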

I think that we can easily fit most database configurations in this 
feature-centric view, but if there are some that we cannot fit into an existing 
feature, we could create a new type {{ResourceConfiguration}}, which would allow 
configuring a resource not tied to a particular feature.

{quote}I'm still pretty strongly in support of a versioned but intact single 
configuration file.
{quote}
Perhaps I should've made it clearer, but the split of configuration into multiple 
files is merely an optional convenience of my proposal, which also supports 
configuration in a single file for backward compatibility.

For instance, moving the configuration from the {{features.yaml}} to 
{{core.yaml}} would still render the same global configuration.

I think that the optional splitting of configuration into different files provides 
the organizational benefit of grouping together properties belonging to a 
similar category (i.e. core features which cannot be disabled, optional features 
and guardrails).
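
The claim that moving a section between files renders the same global configuration can be illustrated with a small merge sketch. It assumes each file has already been parsed into a map of top-level sections. {{ConfigMerge}} is a hypothetical name, and rejecting a section that appears in two files is a choice made for this example rather than something the proposal specifies.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: assumes every file was parsed into a map of top-level sections.
final class ConfigMerge
{
    // Combines the sections of all files into one global configuration.
    static Map<String, Object> merge(List<? extends Map<String, ?>> files)
    {
        Map<String, Object> global = new LinkedHashMap<>();
        for (Map<String, ?> file : files)
        {
            for (Map.Entry<String, ?> section : file.entrySet())
            {
                // putIfAbsent returns the earlier value when the section already exists
                if (global.putIfAbsent(section.getKey(), section.getValue()) != null)
                    throw new IllegalArgumentException("Section defined twice: " + section.getKey());
            }
        }
        return global;
    }
}
```

Because the result is keyed by section name only, it does not matter which file a section came from, which is what keeps the single-file layout backward compatible.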

My original proposal of starting with 3 initial categories 
(core.yaml/features.yaml/guardrails.yaml) is mostly to facilitate the 
transition to the new configuration model:
 - cassandra.yaml (previously core.yaml): all legacy configurations would 
initially go here separated by section headers
 - features.yaml: all configurations compatible with the new 
{{FeatureConfiguration}} model would go here (including new features and 
"migrated" legacy features)
 - guardrails.yaml: all guardrails are collocated in the same file for 
operational simplicity

For instance, the hints configuration is curren

[jira] [Comment Edited] (CASSANDRA-17292) Move cassandra.yaml toward a nested structure around major database concepts

2022-02-22 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496302#comment-17496302
 ] 

Paulo Motta edited comment on CASSANDRA-17292 at 2/22/22, 8:23 PM:
---

Thanks for the additional context [~maedhroz], that is very helpful to 
understand the reasoning behind the proposed nesting.
{quote}For a moment, let's ignore the fact that there's any kind of textual 
configuration file at all for the project, but we still have all the 
knobs/systems/etc. The very first thing I would do is create a "domain model" 
for C* configuration on the Java side, a hierarchy rooted in a Configuration 
container class, which would contain members w/ types like 
ClusterConfiguration, NetworkConfiguration, StorageConfiguration, etc. These 
would be easy to navigate, would provide reasonable points for inline 
documentation, could encapsulate validation logic for relationships between 
parameters within subsystems and features, and could be passed as little 
"kernels" of configuration around the codebase, allowing for better mocking, 
etc.
{quote}
I think we're not very far from what we want the end result to look like from 
the developer's perspective, my proposal is just a simplification of yours 
where instead of a multi-level hierarchy rooted on physical resources 
(cluster/network/storage), I'm proposing a feature-centric domain model 
hierachy with a single level - each feature define its own configuration 
subtree.

The basic construct to create new feature configurations is the following class:
{code:java}
public abstract class FeatureConfiguration
{
// is the feature enabled by default?
boolean enabled = false;

// the feature name to be used in the YAML/JSON
public abstract String getFeatureName();

// whether this feature can be disabled
public boolean isOptional()
{
return true;
}
}
{code}
This would allow to easily create typed configuration for each feature:
 * CommitlogConfiguration
 * HintsConfiguration
 * MaterializedViewsConfiguration

For example this is how "HintsConfiguration" would look like:
{code:java}
public class HintsConfiguration extends FeatureConfiguration
{
   public HintsConfiguration()
   {
 this.enabled = true;
   } 

   public String getFeatureName()
   {
 return "hinted_handoff";
   }

   boolean auto_hints_cleanup = false
   Duration max_hint_window = "3h"
   Throttle hinted_handoff_throttle = "1024KiB"
   int max_hints_delivery_threads = 2
   Duration hints_flush_period = "1ms"
   Size max_hints_file_size = "128MiB"
}
{code}
And would be represented as following on {{{}cassandra.yaml{}}}:
{code:yaml}
# Commit log (cannot be disabled because isOptional()=false)
commit_log:
  commitlog_sync: periodic
  commitlog_sync_period: 1ms
  commitlog_segment_size: 32MiB

# Hinted Handoff
hinted_handoff:
  enabled: true
  auto_hints_cleanup: false
  max_hint_window: 3h
  hinted_handoff_throttle: 1024KiB
  max_hints_delivery_threads: 2
  hints_flush_period: 1ms
  max_hints_file_size: 128MiB

# MVs are experimental and not recommended for production-use
materialized_views:
  enabled: false
{code}
The approach above provides a very simple user experience while allowing typed 
configuration on the developer's side.

I think that we can easily fit most database configurations into this 
feature-centric view, but if there are some that we cannot fit into an existing 
feature, we could create a new type {{ResourceConfiguration}}, which would allow 
configuring a resource not tied to a particular feature.
{quote}I'm still pretty strongly in support of a versioned but intact single 
configuration file.
{quote}
Perhaps I should have made this clearer, but splitting the configuration across 
multiple files is merely an optional convenience of my proposal, which also 
supports configuration in a single file for backward compatibility.

For instance, moving the configuration from the {{features.yaml}} to 
{{core.yaml}} would still render the same global configuration.

I think that the optional splitting of configuration into different files 
provides the organizational benefit of grouping together properties that belong 
to a similar category (i.e. core features which cannot be disabled, optional 
features, and guardrails).

My original proposal of starting with 3 initial categories 
(core.yaml/features.yaml/guardrails.yaml) is mostly to facilitate the 
transition to the new configuration model:
 - cassandra.yaml (previously core.yaml): all legacy configurations would 
initially go here separated by section headers
 - features.yaml: all configurations compatible with the new 
{{FeatureConfiguration}} model would go here (including new features and 
"migrated" legacy features)
 - guardrails.yaml: all guardrails are collocated in the same file for 
operational simplicity

For instance, the hints configuration is currentl

[jira] [Commented] (CASSANDRA-17292) Move cassandra.yaml toward a nested structure around major database concepts

2022-02-22 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496302#comment-17496302
 ] 

Paulo Motta commented on CASSANDRA-17292:
-

Thanks for the additional context [~maedhroz], that is very helpful to 
understand the reasoning behind the proposed nesting.
{quote}For a moment, let's ignore the fact that there's any kind of textual 
configuration file at all for the project, but we still have all the 
knobs/systems/etc. The very first thing I would do is create a "domain model" 
for C* configuration on the Java side, a hierarchy rooted in a Configuration 
container class, which would contain members w/ types like 
ClusterConfiguration, NetworkConfiguration, StorageConfiguration, etc. These 
would be easy to navigate, would provide reasonable points for inline 
documentation, could encapsulate validation logic for relationships between 
parameters within subsystems and features, and could be passed as little 
"kernels" of configuration around the codebase, allowing for better mocking, 
etc.
{quote}
I think we're not very far apart on what we want the end result to look like 
from the developer's perspective. My proposal is just a simplification of 
yours: instead of a multi-level hierarchy rooted in physical resources 
(cluster/network/storage), I'm proposing a feature-centric domain model 
hierarchy with a single level, where each feature defines its own 
configuration subtree.

The basic construct to create new feature configurations is the following class:
{code:java}
public abstract class FeatureConfiguration
{
    // is the feature enabled by default?
    boolean enabled = false;

    // the feature name to be used in the YAML/JSON
    public abstract String getFeatureName();

    // whether this feature can be disabled
    public boolean isOptional()
    {
        return true;
    }
}
{code}
This would make it easy to create a typed configuration for each feature:
 * CommitlogConfiguration
 * HintsConfiguration
 * MaterializedViewsConfiguration

For example, this is what "HintsConfiguration" would look like:
{code:java}
public class HintsConfiguration extends FeatureConfiguration
{
    public HintsConfiguration()
    {
        this.enabled = true;
    }

    public String getFeatureName()
    {
        return "hinted_handoff";
    }

    // Defaults; the quoted literals are shorthand for values parsed into the typed fields.
    boolean auto_hints_cleanup = false;
    Duration max_hint_window = "3h";
    Throttle hinted_handoff_throttle = "1024KiB";
    int max_hints_delivery_threads = 2;
    Duration hints_flush_period = "1ms";
    Size max_hints_file_size = "128MiB";
}
{code}
It would be represented as follows in {{cassandra.yaml}}:
{code:yaml}
# Commit log (cannot be disabled because isOptional()=false)
commit_log:
  commitlog_sync: periodic
  commitlog_sync_period: 1ms
  commitlog_segment_size: 32MiB

# Hinted Handoff
hinted_handoff:
  enabled: true
  auto_hints_cleanup: false
  max_hint_window: 3h
  hinted_handoff_throttle: 1024KiB
  max_hints_delivery_threads: 2
  hints_flush_period: 1ms
  max_hints_file_size: 128MiB

# MVs are experimental and not recommended for production-use
materialized_views:
  enabled: false
{code}
The approach above provides a very simple user experience while allowing typed 
configuration on the developer's side.
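
As a rough illustration of how the {{enabled}} flag and {{isOptional()}} could interact when a feature's subtree is loaded, here is a minimal sketch. The {{applyEnabled}} method and the two concrete subclasses are hypothetical and are not part of the proposal; the subtree is assumed to arrive as an already-parsed map.

```java
import java.util.Map;

// Hypothetical sketch: only FeatureConfiguration, enabled, getFeatureName()
// and isOptional() come from the proposal; everything else is an assumption.
abstract class FeatureConfiguration
{
    boolean enabled = false;

    public abstract String getFeatureName();

    public boolean isOptional()
    {
        return true;
    }

    // Applies the "enabled" key of this feature's parsed YAML subtree and
    // rejects attempts to disable a feature that is not optional.
    void applyEnabled(Map<String, Object> subtree)
    {
        Object value = subtree.get("enabled");
        if (value == null)
            return; // key absent: keep the default

        boolean requested = (Boolean) value;
        if (!requested && !isOptional())
            throw new IllegalArgumentException(getFeatureName() + " cannot be disabled");
        enabled = requested;
    }
}

// Core feature: enabled by default and not optional.
class CommitlogConfiguration extends FeatureConfiguration
{
    CommitlogConfiguration()
    {
        this.enabled = true;
    }

    public String getFeatureName()
    {
        return "commit_log";
    }

    @Override
    public boolean isOptional()
    {
        return false;
    }
}

// Optional feature: disabled unless the operator turns it on.
class MaterializedViewsConfiguration extends FeatureConfiguration
{
    public String getFeatureName()
    {
        return "materialized_views";
    }
}
```

With this shape, {{enabled: false}} under {{commit_log}} would fail fast at load time, while the same key under {{materialized_views}} would simply leave the feature off.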

I think that we can easily fit most database configurations into this 
feature-centric view, but if there are some that we cannot fit into an existing 
feature, we could create a new type {{ResourceConfiguration}}, which would allow 
configuring a resource not tied to a particular feature.
{quote}I'm still pretty strongly in support of a versioned but intact single 
configuration file.
{quote}
Perhaps I should have made this clearer, but splitting the configuration across 
multiple files is merely an optional convenience of my proposal, which also 
supports configuration in a single file for backward compatibility.

For instance, moving the configuration from the {{features.yaml}} to 
{{core.yaml}} would still render the same global configuration.
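
To make the "same global configuration" point concrete, here is a minimal sketch of merging per-file configuration into a single view. {{ConfigMerger}} is a hypothetical name, each file is assumed to be already parsed into a map from top-level feature name to its subtree, and rejecting a feature that appears in two files is an assumption rather than something the proposal specifies.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the proposal.
final class ConfigMerger
{
    // Merges parsed files (feature name -> subtree) into one global view.
    // The result depends only on which features are defined, not on which
    // file each of them lives in.
    static Map<String, Map<String, Object>> merge(List<Map<String, Map<String, Object>>> files)
    {
        Map<String, Map<String, Object>> global = new TreeMap<>();
        for (Map<String, Map<String, Object>> file : files)
        {
            for (Map.Entry<String, Map<String, Object>> entry : file.entrySet())
            {
                // Assumption: defining the same feature in two files is an error.
                if (global.containsKey(entry.getKey()))
                    throw new IllegalArgumentException("feature defined twice: " + entry.getKey());
                global.put(entry.getKey(), new TreeMap<>(entry.getValue()));
            }
        }
        return global;
    }
}
```

Under these assumptions, merging a {{core.yaml}} that holds {{commit_log}} with a {{features.yaml}} that holds {{hinted_handoff}} yields the same map as loading one file that contains both subtrees.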

I think that the optional splitting of configuration into different files 
provides the organizational benefit of grouping together properties that belong 
to a similar category (i.e. core features which cannot be disabled, optional 
features, and guardrails).

My original proposal of starting with 3 initial categories 
(core.yaml/features.yaml/guardrails.yaml) is mostly to facilitate the 
transition to the new configuration model:
 - cassandra.yaml (previously core.yaml): all legacy configurations would 
initially go here separated by section headers
 - features.yaml: all configurations compatible with the new 
{{FeatureConfiguration}} model would go here (including new features and 
"migrated" legacy features)
 - guardrails.yaml: all guardrails are collocated in the same file for 
operational simplicity

For instance, the hints configuration is currently flat so it would initially 
go in {{cassandra.yam

[jira] [Updated] (CASSANDRA-17287) Replace cqlshlib/wcwidth.py with pypi module 'wcwidth'

2022-02-22 Thread Stefan Miklosovic (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Miklosovic updated CASSANDRA-17287:
--
Reviewers: Brandon Williams, Stefan Miklosovic  (was: Brandon Williams)
   Status: Review In Progress  (was: Needs Committer)

> Replace cqlshlib/wcwidth.py with pypi module 'wcwidth'
> --
>
> Key: CASSANDRA-17287
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17287
> Project: Cassandra
>  Issue Type: Task
>  Components: CQL/Interpreter
>Reporter: Brad Schoening
>Assignee: Brad Schoening
>Priority: Normal
> Fix For: 4.x
>
> Attachments: CQLSH sample query.jpg
>
>
> The module wcwidth implements the same Markus Kuhn algorithm defined in 
> POSIX.1-2008 to return the number of cells a unicode string is expected to 
> occupy.
> The module wcwidth is used by hundreds of libraries including pytest and 
> prompt-toolkit (used in ipython).  It replaces 379 lines of bespoke code in 
> cqlshlib.
> {quote}from wcwidth import wcswidth   # at [https://pypi.org/project/wcwidth/]
> print(wcswidth('コンニチハ'))
> 10
> from cqlshlib.wcwidth import wcswidth as cql_wcswidth
> print(cql_wcswidth('コンニチハ'))
> 10
> {quote}
> wcwidth appears to be used only by one line in formatting.py:
>  return bval if colormap is NO_COLOR_MAP else color_text(bval, colormap, 
> wcwidth.wcswidth(bval))



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-16092) Add Index Group Interface for Storage Attached Index

2022-02-22 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16092:

Fix Version/s: 5.x
   (was: 4.x)

> Add Index Group Interface for Storage Attached Index
> 
>
> Key: CASSANDRA-16092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16092
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/SASI
>Reporter: Zhao Yang
>Assignee: Zhao Yang
>Priority: Normal
> Fix For: 5.x
>
>
> [Index 
> group|https://github.com/datastax/cassandra/blob/storage_attached_index/src/java/org/apache/cassandra/index/Index.java#L634]
>  interface allows:
> * indexes on the same table to receive centralized lifecycle events called 
> secondary index groups. Sharing of data between multiple column indexes on 
> the same table allows SAI disk usage to realise significant space savings 
> over other index implementations.
> * index-group to analyze user query and provide a query plan that leverages 
> all available indexes within the group.






[jira] [Updated] (CASSANDRA-16092) Add Index Group Interface for Storage Attached Index

2022-02-22 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16092:

Status: Ready to Commit  (was: Review In Progress)

[~jasonstack] I've finally created the SAI feature branch 
[here|https://github.com/apache/cassandra/pull/1466]. I'm assuming we might 
need a quick rebase, but otherwise, feel free to merge there and resolve.

> Add Index Group Interface for Storage Attached Index
> 
>
> Key: CASSANDRA-16092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16092
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/SASI
>Reporter: Zhao Yang
>Assignee: Zhao Yang
>Priority: Normal
> Fix For: 4.x
>
>
> [Index 
> group|https://github.com/datastax/cassandra/blob/storage_attached_index/src/java/org/apache/cassandra/index/Index.java#L634]
>  interface allows:
> * indexes on the same table to receive centralized lifecycle events called 
> secondary index groups. Sharing of data between multiple column indexes on 
> the same table allows SAI disk usage to realise significant space savings 
> over other index implementations.
> * index-group to analyze user query and provide a query plan that leverages 
> all available indexes within the group.






[jira] [Updated] (CASSANDRA-16092) Add Index Group Interface for Storage Attached Index

2022-02-22 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16092:

Status: Review In Progress  (was: Patch Available)

> Add Index Group Interface for Storage Attached Index
> 
>
> Key: CASSANDRA-16092
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16092
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/SASI
>Reporter: Zhao Yang
>Assignee: Zhao Yang
>Priority: Normal
> Fix For: 4.x
>
>
> [Index 
> group|https://github.com/datastax/cassandra/blob/storage_attached_index/src/java/org/apache/cassandra/index/Index.java#L634]
>  interface allows:
> * indexes on the same table to receive centralized lifecycle events called 
> secondary index groups. Sharing of data between multiple column indexes on 
> the same table allows SAI disk usage to realise significant space savings 
> over other index implementations.
> * index-group to analyze user query and provide a query plan that leverages 
> all available indexes within the group.






[jira] [Commented] (CASSANDRA-16052) CEP-7 Storage Attached Indexes

2022-02-22 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496261#comment-17496261
 ] 

Caleb Rackliffe commented on CASSANDRA-16052:
-

I've created what will probably be a relatively long-lived feature branch (that 
I will try to keep rebased to {{trunk}}) for us to merge SAI-related work.

https://github.com/apache/cassandra/pull/1466
https://app.circleci.com/pipelines/github/maedhroz/cassandra?branch=CASSANDRA-16052

> CEP-7 Storage Attached Indexes
> --
>
> Key: CASSANDRA-16052
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16052
> Project: Cassandra
>  Issue Type: Epic
>  Components: Feature/2i Index
>Reporter: Zhao Yang
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> [CEP|https://docs.google.com/document/d/1V830eAMmQAspjJdjviVZIaSolVGvZ1hVsqOLWyV0DS4/edit#heading=h.67ap6rr1mxr]
>  - A new index implementation, called Storage
>  Attached Index(SAI), based on the advancement made by SASI.
>  * disk usage by sharing of common data between multiple column indexes on 
> the same table and better compression of on-disk structures.
>  * numeric range query performance with modified KDTree and collection type 
> support.
>  * compaction performance and stability for larger data set.






[jira] [Updated] (CASSANDRA-16052) CEP-7 Storage Attached Indexes

2022-02-22 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16052:

Summary: CEP-7 Storage Attached Indexes  (was: CEP-7 Storage Attached Index 
for Apache Cassandra)

> CEP-7 Storage Attached Indexes
> --
>
> Key: CASSANDRA-16052
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16052
> Project: Cassandra
>  Issue Type: Epic
>  Components: Feature/2i Index
>Reporter: Zhao Yang
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.x
>
>
> [CEP|https://docs.google.com/document/d/1V830eAMmQAspjJdjviVZIaSolVGvZ1hVsqOLWyV0DS4/edit#heading=h.67ap6rr1mxr]
>  - A new index implementation, called Storage
>  Attached Index(SAI), based on the advancement made by SASI.
>  * disk usage by sharing of common data between multiple column indexes on 
> the same table and better compression of on-disk structures.
>  * numeric range query performance with modified KDTree and collection type 
> support.
>  * compaction performance and stability for larger data set.






[jira] [Commented] (CASSANDRA-17292) Move cassandra.yaml toward a nested structure around major database concepts

2022-02-22 Thread Caleb Rackliffe (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496259#comment-17496259
 ] 

Caleb Rackliffe commented on CASSANDRA-17292:
-

[~paulo] Thanks for the 
[proposal|https://gist.github.com/pauloricardomg/e9e23feea1b172b4f084cb01d7a89b05].
 I've been able to give it and your comments a couple reads through, but before 
I offer some feedback, a little diversion...

For a moment, let's ignore the fact that there's any kind of textual 
configuration file at all for the project, but we still have all the 
knobs/systems/etc. The very first thing I would do is create a "domain model" 
for C* configuration on the Java side, a hierarchy rooted in a 
{{Configuration}} container class, which would contain members w/ types like 
{{ClusterConfiguration}}, {{NetworkConfiguration}}, {{StorageConfiguration}}, 
etc. These would be easy to navigate, would provide reasonable points for 
inline documentation, could encapsulate validation logic for relationships 
between parameters within subsystems and features, and could be passed as 
little "kernels" of configuration around the codebase, allowing for better 
mocking, etc.
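
To sketch what such a domain model might look like, here is a minimal example with a {{Configuration}} container and two typed members that validate their own parameter relationships. Every field name and both validation rules are hypothetical and chosen only for illustration; they are not taken from the actual codebase or from any of the linked proposals.

```java
// Hypothetical sketch: field names and validation rules are illustrative only.
final class NetworkConfiguration
{
    final int nativeTransportPort;
    final int storagePort;

    NetworkConfiguration(int nativeTransportPort, int storagePort)
    {
        // Validation of a relationship between two parameters of one subsystem.
        if (nativeTransportPort == storagePort)
            throw new IllegalArgumentException("native transport and storage ports must differ");
        this.nativeTransportPort = nativeTransportPort;
        this.storagePort = storagePort;
    }
}

final class StorageConfiguration
{
    final int commitlogSegmentSizeMiB;
    final int maxMutationSizeKiB;

    StorageConfiguration(int commitlogSegmentSizeMiB, int maxMutationSizeKiB)
    {
        // Illustrative rule: a mutation may use at most half of a commit log segment.
        if (2L * maxMutationSizeKiB > 1024L * commitlogSegmentSizeMiB)
            throw new IllegalArgumentException("max mutation size exceeds half the segment size");
        this.commitlogSegmentSizeMiB = commitlogSegmentSizeMiB;
        this.maxMutationSizeKiB = maxMutationSizeKiB;
    }
}

// Root container; subsystems receive only the "kernel" they need.
final class Configuration
{
    final NetworkConfiguration network;
    final StorageConfiguration storage;

    Configuration(NetworkConfiguration network, StorageConfiguration storage)
    {
        this.network = network;
        this.storage = storage;
    }
}
```

A component that only cares about networking could then accept a {{NetworkConfiguration}} directly, so a test can build one by hand without constructing the whole tree.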

With that configuration model in hand, we could then deal w/ the problem of its 
mapping to and from some kind of human-readable format. In this case, something 
like a nested YAML file (or it could be JSON, etc.) seems to be the best 
option, in terms of its ease of use w/ tooling, its conceptual mapping, and 
with even minimal care around naming, its human navigability/readability.

Predictably then, I'm still pretty strongly in support of a versioned but 
intact single configuration file. I could imagine a synthesis of the two 
proposals that would minimize the amount of potential bouncing between files 
for operators trying to make sense of related configuration items, but simply 
having multiple files worries me. Within the structure of the individual files, 
I would also push for named hierarchies rather than relying on comments to 
denote sections of related parameters. (This has been one of the primary 
motivations behind moving toward a nested structure.)

bq. I think that the intermingling of feature/subsystem/resource in the yaml 
structure can get a little counterintuitive and does not provide a consistent 
framework for extending the properties.

This, however, is something I really want to dig into, because it echoes some 
of the concerns [~benedict] has had about the current single-file approach 
(although the most current iteration of it 
[here|https://github.com/maedhroz/cassandra/commit/450b920e0ac072cec635e0ebcb63538ee7f1fc5a]
 was specifically built to address some of those concerns and integrates even 
future parameters like those we'll introduce in CASSANDRA-17148). Are there any 
major inconsistencies you could expand on?

> Move cassandra.yaml toward a nested structure around major database concepts
> 
>
> Key: CASSANDRA-17292
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17292
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.x
>
>
> Recent mailing list conversation (see "[DISCUSS] Nested YAML configs for new 
> features") has made it clear we will gravitate toward appropriately nested 
> structures for new parameters in {{cassandra.yaml}}, but from the scattered 
> conversation across a few Guardrails tickets (see CASSANDRA-17212 and 
> CASSANDRA-17148) and CASSANDRA-15234, there is also a general desire to 
> eventually extend this to the rest of {{cassandra.yaml}}. The benefits of 
> this change include those we gain by doing it for new features (single point 
> of interest for feature documentation, typed configuration objects, logical 
> grouping for additional parameters added over time, discoverability, etc.), 
> but on a larger scale.
> This may overlap with ongoing work, including the Guardrails epic. Ideally, 
> even a rough cut of a design here would allow that to move forward in a 
> timely and coherent manner (with less long-term refactoring pain).
> Current proposals:
> From [~benedict] - 
> https://github.com/belliottsmith/cassandra/commits/CASSANDRA-15234-grouping-ideas
> From [~maedhroz] - 
> https://github.com/maedhroz/cassandra/commit/450b920e0ac072cec635e0ebcb63538ee7f1fc5a
> From [~paulo] - 
> https://gist.github.com/pauloricardomg/e9e23feea1b172b4f084cb01d7a89b05




[jira] [Commented] (CASSANDRA-17293) Update python test framework from nose to pytest

2022-02-22 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496256#comment-17496256
 ] 

Michael Semb Wever commented on CASSANDRA-17293:


Thank you [~smiklosovic] for the attention and quick fix.

The additional fix looks good to me.

One question though… there appear to be four additional tests now; is that 
intentional? 

Compare
- 
https://ci-cassandra.apache.org/job/Cassandra-trunk-cqlsh-tests/lastSuccessfulBuild/cython=yes,jdk=jdk_1.8_latest,label=cassandra/testReport/
 
- 
https://ci-cassandra.apache.org/job/Cassandra-trunk-cqlsh-tests/1137/cython=yes,jdk=jdk_1.8_latest,label=cassandra/testReport/
 

These tests appear, where they didn't before…
- 
https://ci-cassandra.apache.org/job/Cassandra-trunk-cqlsh-tests/lastSuccessfulBuild/cython=yes,jdk=jdk_1.8_latest,label=cassandra/testReport/cqlshlib.python3.jdk8.no_cython.test/

Those tests have been around for a while, so it looks like this has fixed 
something?



> Update python test framework from nose to pytest
> 
>
> Key: CASSANDRA-17293
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17293
> Project: Cassandra
>  Issue Type: Task
>  Components: CQL/Interpreter
>Reporter: Brad Schoening
>Assignee: Brad Schoening
>Priority: Normal
> Fix For: 4.1
>
>
> I had trouble trying to install and run the python nose test from pip 
> (nosetest not found).
> According to the homepage of nose at [https://nose.readthedocs.io/en/latest/]
> h1. _Note to Users_
> _Nose has been in maintenance mode for the past several years and will likely 
> cease without a new person/team to take over maintainership. New projects 
> should consider using [Nose2|https://github.com/nose-devs/nose2], 
> [py.test|http://pytest.org/], or just plain unittest/unittest2._
>  
> Upgrading to pytest is likely the least effort. 






[jira] [Updated] (CASSANDRA-17292) Move cassandra.yaml toward a nested structure around major database concepts

2022-02-22 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-17292:

Description: 
Recent mailing list conversation (see "[DISCUSS] Nested YAML configs for new 
features") has made it clear we will gravitate toward appropriately nested 
structures for new parameters in {{cassandra.yaml}}, but from the scattered 
conversation across a few Guardrails tickets (see CASSANDRA-17212 and 
CASSANDRA-17148) and CASSANDRA-15234, there is also a general desire to 
eventually extend this to the rest of {{cassandra.yaml}}. The benefits of this 
change include those we gain by doing it for new features (single point of 
interest for feature documentation, typed configuration objects, logical 
grouping for additional parameters added over time, discoverability, etc.), but 
on a larger scale.

This may overlap with ongoing work, including the Guardrails epic. Ideally, 
even a rough cut of a design here would allow that to move forward in a timely 
and coherent manner (with less long-term refactoring pain).

Current proposals:

From [~benedict] - 
https://github.com/belliottsmith/cassandra/commits/CASSANDRA-15234-grouping-ideas

From [~maedhroz] - 
https://github.com/maedhroz/cassandra/commit/450b920e0ac072cec635e0ebcb63538ee7f1fc5a

From [~paulo] - 
https://gist.github.com/pauloricardomg/e9e23feea1b172b4f084cb01d7a89b05

  was:
Recent mailing list conversation (see "[DISCUSS] Nested YAML configs for new 
features") has made it clear we will gravitate toward appropriately nested 
structures for new parameters in {{cassandra.yaml}}, but from the scattered 
conversation across a few Guardrails tickets (see CASSANDRA-17212 and 
CASSANDRA-17148) and CASSANDRA-15234, there is also a general desire to 
eventually extend this to the rest of {{cassandra.yaml}}. The benefits of this 
change include those we gain by doing it for new features (single point of 
interest for feature documentation, typed configuration objects, logical 
grouping for additional parameters added over time, discoverability, etc.), but 
on a larger scale.

This may overlap with ongoing work, including the Guardrails epic. Ideally, 
even a rough cut of a design here would allow that to move forward in a timely 
and coherent manner (with less long-term refactoring pain).

While these would have to be adjusted to CASSANDRA-15234 (probably after it 
merges), there have been two proposals floated already for what this might look 
like:

From [~benedict] - 
https://github.com/belliottsmith/cassandra/commits/CASSANDRA-15234-grouping-ideas

From [~maedhroz] - 
https://github.com/maedhroz/cassandra/commit/450b920e0ac072cec635e0ebcb63538ee7f1fc5a

From [~paulo] - 
https://gist.github.com/pauloricardomg/e9e23feea1b172b4f084cb01d7a89b05


> Move cassandra.yaml toward a nested structure around major database concepts
> 
>
> Key: CASSANDRA-17292
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17292
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.x
>
>
> Recent mailing list conversation (see "[DISCUSS] Nested YAML configs for new 
> features") has made it clear we will gravitate toward appropriately nested 
> structures for new parameters in {{cassandra.yaml}}, but from the scattered 
> conversation across a few Guardrails tickets (see CASSANDRA-17212 and 
> CASSANDRA-17148) and CASSANDRA-15234, there is also a general desire to 
> eventually extend this to the rest of {{cassandra.yaml}}. The benefits of 
> this change include those we gain by doing it for new features (single point 
> of interest for feature documentation, typed configuration objects, logical 
> grouping for additional parameters added over time, discoverability, etc.), 
> but on a larger scale.
> This may overlap with ongoing work, including the Guardrails epic. Ideally, 
> even a rough cut of a design here would allow that to move forward in a 
> timely and coherent manner (with less long-term refactoring pain).
> Current proposals:
> From [~benedict] - 
> https://github.com/belliottsmith/cassandra/commits/CASSANDRA-15234-grouping-ideas
> From [~maedhroz] - 
> https://github.com/maedhroz/cassandra/commit/450b920e0ac072cec635e0ebcb63538ee7f1fc5a
> From [~paulo] - 
> https://gist.github.com/pauloricardomg/e9e23feea1b172b4f084cb01d7a89b05






[jira] [Updated] (CASSANDRA-17292) Move cassandra.yaml toward a nested structure around major database concepts

2022-02-22 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-17292:

Description: 
Recent mailing list conversation (see "[DISCUSS] Nested YAML configs for new 
features") has made it clear we will gravitate toward appropriately nested 
structures for new parameters in {{cassandra.yaml}}, but from the scattered 
conversation across a few Guardrails tickets (see CASSANDRA-17212 and 
CASSANDRA-17148) and CASSANDRA-15234, there is also a general desire to 
eventually extend this to the rest of {{cassandra.yaml}}. The benefits of this 
change include those we gain by doing it for new features (single point of 
interest for feature documentation, typed configuration objects, logical 
grouping for additional parameters added over time, discoverability, etc.), but 
on a larger scale.

This may overlap with ongoing work, including the Guardrails epic. Ideally, 
even a rough cut of a design here would allow that to move forward in a timely 
and coherent manner (with less long-term refactoring pain).

While these would have to be adjusted to CASSANDRA-15234 (probably after it 
merges), there have been two proposals floated already for what this might look 
like:

From [~benedict] - 
https://github.com/belliottsmith/cassandra/commits/CASSANDRA-15234-grouping-ideas

From [~maedhroz] - 
https://github.com/maedhroz/cassandra/commit/450b920e0ac072cec635e0ebcb63538ee7f1fc5a

From [~paulo] - 
https://gist.github.com/pauloricardomg/e9e23feea1b172b4f084cb01d7a89b05

  was:
Recent mailing list conversation (see "[DISCUSS] Nested YAML configs for new 
features") has made it clear we will gravitate toward appropriately nested 
structures for new parameters in {{cassandra.yaml}}, but from the scattered 
conversation across a few Guardrails tickets (see CASSANDRA-17212 and 
CASSANDRA-17148) and CASSANDRA-15234, there is also a general desire to 
eventually extend this to the rest of {{cassandra.yaml}}. The benefits of this 
change include those we gain by doing it for new features (single point of 
interest for feature documentation, typed configuration objects, logical 
grouping for additional parameters added over time, discoverability, etc.), but 
on a larger scale.

This may overlap with ongoing work, including the Guardrails epic. Ideally, 
even a rough cut of a design here would allow that to move forward in a timely 
and coherent manner (with less long-term refactoring pain).

While these would have to be adjusted to CASSANDRA-15234 (probably after it 
merges), there have been two proposals floated already for what this might look 
like:

From [~maedhroz] - 
https://github.com/maedhroz/cassandra/commit/450b920e0ac072cec635e0ebcb63538ee7f1fc5a

From [~benedict] - 
https://github.com/belliottsmith/cassandra/commits/CASSANDRA-15234-grouping-ideas


> Move cassandra.yaml toward a nested structure around major database concepts
> 
>
> Key: CASSANDRA-17292
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17292
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Config
>Reporter: Caleb Rackliffe
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.x
>
>
> Recent mailing list conversation (see "[DISCUSS] Nested YAML configs for new 
> features") has made it clear we will gravitate toward appropriately nested 
> structures for new parameters in {{cassandra.yaml}}, but from the scattered 
> conversation across a few Guardrails tickets (see CASSANDRA-17212 and 
> CASSANDRA-17148) and CASSANDRA-15234, there is also a general desire to 
> eventually extend this to the rest of {{cassandra.yaml}}. The benefits of 
> this change include those we gain by doing it for new features (single point 
> of interest for feature documentation, typed configuration objects, logical 
> grouping for additional parameters added over time, discoverability, etc.), 
> but on a larger scale.
> This may overlap with ongoing work, including the Guardrails epic. Ideally, 
> even a rough cut of a design here would allow that to move forward in a 
> timely and coherent manner (with less long-term refactoring pain).
> While these would have to be adjusted to CASSANDRA-15234 (probably after it 
> merges), there have been two proposals floated already for what this might 
> look like:
> From [~benedict] - 
> https://github.com/belliottsmith/cassandra/commits/CASSANDRA-15234-grouping-ideas
> From [~maedhroz] - 
> https://github.com/maedhroz/cassandra/commit/450b920e0ac072cec635e0ebcb63538ee7f1fc5a
> From [~paulo] - 
> https://gist.github.com/pauloricardomg/e9e23feea1b172b4f084cb01d7a89b05



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

--

[jira] [Updated] (CASSANDRA-16052) CEP-7 Storage Attached Index for Apache Cassandra

2022-02-22 Thread Caleb Rackliffe (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-16052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caleb Rackliffe updated CASSANDRA-16052:

Change Category: Performance
 Complexity: Challenging
  Fix Version/s: 5.x
 Status: Open  (was: Triage Needed)

> CEP-7 Storage Attached Index for Apache Cassandra
> -
>
> Key: CASSANDRA-16052
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16052
> Project: Cassandra
>  Issue Type: Epic
>  Components: Feature/2i Index
>Reporter: Zhao Yang
>Assignee: Caleb Rackliffe
>Priority: Normal
> Fix For: 5.x
>
>
> [CEP|https://docs.google.com/document/d/1V830eAMmQAspjJdjviVZIaSolVGvZ1hVsqOLWyV0DS4/edit#heading=h.67ap6rr1mxr]
>  - A new index implementation, called Storage
>  Attached Index (SAI), based on the advancements made by SASI.
>  * disk usage by sharing of common data between multiple column indexes on 
> the same table and better compression of on-disk structures.
>  * numeric range query performance with modified KDTree and collection type 
> support.
>  * compaction performance and stability for larger data set.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-16349) SSTableLoader reports error when SSTable(s) do not have data for some nodes

2022-02-22 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496208#comment-17496208
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16349:
-

Jenkins CI runs submitted:

[4.0|https://jenkins-cm4.apache.org/job/Cassandra-devbranch/1443/], 
[trunk|https://jenkins-cm4.apache.org/job/Cassandra-devbranch/1444/]

> SSTableLoader reports error when SSTable(s) do not have data for some nodes
> ---
>
> Key: CASSANDRA-16349
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16349
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tool/sstable
>Reporter: Serban Teodorescu
>Assignee: Serban Teodorescu
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Running SSTableLoader in verbose mode will show error(s) if there are node(s) 
> that do not own any data from the SSTable(s). This can happen in at least 2 
> cases:
>  # SSTableLoader is used to stream backups while keeping the same token ranges
>  # SSTable(s) are created with CQLSSTableWriter to match token ranges (this 
> can bring better performance by using ZeroCopy streaming)
> Partial output of the SSTableLoader:
> {quote}ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] 
> Remote peer /127.0.0.4:7000 failed stream session.
> ERROR 02:47:47,842 [Stream #fa8e73b0-3da5-11eb-9c47-c5d27ae8fe47] Remote peer 
> /127.0.0.3:7000 failed stream session.
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% 
> [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% 
> 0.000KiB/s (avg: 1.611KiB/s)
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% 
> [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% 
> 0.000KiB/s (avg: 1.611KiB/s)
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% 
> [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% 
> 0.000KiB/s (avg: 1.515KiB/s)
> progress: [/127.0.0.4:7000]0:0/1 100% [/127.0.0.3:7000]0:0/1 100% 
> [/127.0.0.2:7000]0:7/7 100% [/127.0.0.1:7000]0:7/7 100% total: 100% 
> 0.000KiB/s (avg: 1.427KiB/s)
> {quote}
>  
> Stack trace:
> {quote}java.util.concurrent.ExecutionException: 
> org.apache.cassandra.streaming.StreamException: Stream failed
> at 
> com.google.common.util.concurrent.AbstractFuture.getDoneValue(AbstractFuture.java:552)
> at 
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:533)
> at org.apache.cassandra.tools.BulkLoader.load(BulkLoader.java:99)
> at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:49)
> Caused by: org.apache.cassandra.streaming.StreamException: Stream failed
> at 
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88)
> at 
> com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1056)
> at 
> com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
> at 
> com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1138)
> at 
> com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:958)
> at 
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:748)
> at 
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:220)
> at 
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:196)
> at 
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:505)
> at 
> org.apache.cassandra.streaming.StreamSession.complete(StreamSession.java:819)
> at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:595)
> at 
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:189)
> at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:844)
> {quote}
> To reproduce, create a cluster with ccm with more nodes than the RF, put some 
> data into it, copy an SSTable, and stream it.
>  
> The error originates on the nodes, the following stack trace is shown in the 
> logs:
> {quote}java.lang.IllegalStateException: Stream hasn't been read yet
>     at 
> com.google.common.base.Preconditions.checkState(Preconditions.java:507)
>     at 
> org.apache.cassandra.db.streaming.CassandraIncomingFile.getSize(CassandraIncomingFile.java:96)
>     at 
> org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:789)
>     at 
> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:587)
>     at 
> org.apache.cassandra.streaming.async

[jira] [Commented] (CASSANDRA-15399) Add ability to track state in repair

2022-02-22 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496177#comment-17496177
 ] 

David Capwell commented on CASSANDRA-15399:
---

[~paulo], [~jasonstack] CASSANDRA-17390 is ready for review.  [~djoshi] said he 
would review this week as well.

> Add ability to track state in repair
> 
>
> Key: CASSANDRA-15399
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15399
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Repair
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> To enhance the visibility in repair, we should expose internal state via 
> virtual tables; the state should include coordinator as well as participant 
> state (validation, sync, etc.)
> I propose the following tables:
> repairs - high level summary of the global state of repair; this should be 
> called on the coordinator.
> {code:sql}
> CREATE TABLE repairs (
>   id uuid,
>   keyspace_name text,
>   table_names frozen>,
>   ranges frozen>,
>   coordinator text,
>   participants frozen>,
>   state text,
>   progress_percentage float,
>   last_updated_at_millis bigint,
>   duration_micro bigint,
>   failure_cause text,
>   PRIMARY KEY ( (id) )
> )
> {code}
> repair_tasks - represents RepairJob and participants state.  This will show 
> if validations are running on participants and the progress they are making; 
> this should be called on the coordinator.
> {code:sql}
> CREATE TABLE repair_tasks (
>   id uuid,
>   session_id uuid,
>   keyspace_name text,
>   table_name text,
>   ranges frozen>,
>   coordinator text,
>   participant text,
>   state text,
>   state_description text,
>   progress_percentage float, -- between 0.0 and 100.0
>   last_updated_at_millis bigint,
>   duration_micro bigint,
>   failure_cause text,
>   PRIMARY KEY ( (id), session_id, table_name, participant )
> )
> {code}
> repair_validations - shows the state of the validation task and is updated 
> periodically while validation is running; this should be called on the 
> participants.
> {code:sql}
> CREATE TABLE repair_validations (
>   id uuid,
>   session_id uuid,
>   ranges frozen>,
>   keyspace_name text,
>   table_name text,
>   initiator text,
>   state text,
>   progress_percentage float,
>   queue_duration_ms bigint,
>   runtime_duration_ms bigint,
>   total_duration_ms bigint,
>   estimated_partitions bigint,
>   partitions_processed bigint,
>   estimated_total_bytes bigint,
>   failure_cause text,
>   PRIMARY KEY ( (id), session_id, table_name )
> )
> {code}
> The main reason for exposing virtual tables rather than exposing through 
> durable tables is to make sure what is exposed is accurate.  In cases of 
> write failures or node failures, the durable tables could become inaccurate 
> and could add edge cases where the repair is not running but the tables say 
> it is; by relying on repair's internal in-memory bookkeeping, these problems 
> go away.
> This jira does not try to solve the following:
> 1) repair resiliency - there are edge cases where repair hits an error and 
> runs forever (at least from nodetool's perspective).
> 2) repair stream tracking - I have not learned the streaming side yet, and 
> from what I see multiple implementations exist, so this seems like a large 
> scope.  My hope is to punt on it in this jira and tackle it separately.
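
The proposed tables carry both partition counters and a progress_percentage column (noted as between 0.0 and 100.0). The ticket does not say how the percentage is derived, so the following is only a sketch under the assumption that it is the ratio of processed to estimated partitions, clamped to that range. The class and method names are hypothetical.

```java
// Hypothetical sketch: the ticket does not specify this formula.
final class ValidationProgress {
    // Derives a percentage in [0.0, 100.0] from the two partition counters.
    static float progressPercentage(long estimatedPartitions, long partitionsProcessed) {
        if (estimatedPartitions <= 0) {
            return 0.0f; // no estimate to measure progress against
        }
        float pct = 100.0f * partitionsProcessed / estimatedPartitions;
        // The estimate can be off, so clamp rather than report more than 100%.
        return Math.max(0.0f, Math.min(100.0f, pct));
    }
}
```

Clamping is a design choice here: because estimated_partitions is only an estimate, partitions_processed may overshoot it.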






[jira] [Commented] (CASSANDRA-17338) Fix flaky test - test_cqlsh_completion.TestCqlshCompletion

2022-02-22 Thread Josh McKenzie (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496101#comment-17496101
 ] 

Josh McKenzie commented on CASSANDRA-17338:
---

That's "1 failure of the last 38 runs".

 

Maybe I should revise it to read "1 in 38" instead?

> Fix flaky test - test_cqlsh_completion.TestCqlshCompletion
> --
>
> Key: CASSANDRA-17338
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17338
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL/Interpreter
>Reporter: Brandon Williams
>Assignee: Aleksei Zotov
>Priority: Normal
> Fix For: 3.0.27, 3.11.13
>
>
>  Failed 4 times in the last 24 runs. Flakiness: 30%, Stability: 83%
> A bunch of the test_completion_* tests fail occasionally with 
> eyebleed-inducing mismatched output.






[jira] [Commented] (CASSANDRA-17267) Snapshot true size is miscalculated

2022-02-22 Thread Paulo Motta (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17496100#comment-17496100
 ] 

Paulo Motta commented on CASSANDRA-17267:
-

In the [previous test 
run|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1422/] 
{{org.apache.cassandra.index.sasi.SASIIndexTest.testSASIComponentsAddedToSnapshot}}
 was getting stuck when running within the suite (worked when executed 
individually).

I tracked down the reason to the {{ReadExecutionController}} not being closed 
properly in other tests, causing operations to block indefinitely on the 
{{OpOrder}}. Fixed [in this 
commit|https://github.com/apache/cassandra/commit/77f688e75ff403875755f34dc31ab75401bcaa3d]
 on all branches.

I created CASSANDRA-17400 to add a checker to verify resources are being 
properly closed to avoid stuck tests in the future.

Resubmitted CI:
|[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...pauloricardomg:CASSANDRA-17267-3.11]|[tests|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1440/]|
|[4.0|https://github.com/apache/cassandra/compare/cassandra-4.0...pauloricardomg:CASSANDRA-17267-4.0]|[tests|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1441/]|
|[trunk|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:CASSANDRA-17267-trunk]|[tests|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch/1442/]|

> Snapshot true size is miscalculated
> ---
>
> Key: CASSANDRA-17267
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17267
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Snapshots
>Reporter: Paulo Motta
>Assignee: Paulo Motta
>Priority: Normal
>
> As far as I understand, the snapshot "size on disk" is the total size of the 
> snapshot, while the "true size" is the (size_on_disk - size_of_live_sstables).
> I created a snapshot on a 3.11 node without traffic and I expected the "true 
> size" to be 0KB since the original sstables were still present, but this 
> didn't seem to be the case:
> {noformat}
> $ nodetool listsnapshots
> Snapshot Details:
> Snapshot name Keyspace name Column family name True size Size on disk
> test  ks1   tbl1   4.86 KiB  5.69 KiB
> Total TrueDiskSpaceUsed: 4.86 KiB
> {noformat}
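
A minimal sketch of the definition quoted above, assuming "true size" means the bytes of snapshot files whose originals are no longer among the live sstables. This is not Cassandra's implementation; `SnapshotSizes` and the file-name keyed maps are hypothetical and only illustrate the expected arithmetic.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the reporter's definition; not Cassandra's code.
final class SnapshotSizes {
    // Total bytes of every file in the snapshot.
    static long sizeOnDisk(Map<String, Long> snapshotFiles) {
        long total = 0;
        for (long size : snapshotFiles.values()) {
            total += size;
        }
        return total;
    }

    // Bytes of snapshot files that no longer have a live counterpart.
    static long trueSize(Map<String, Long> snapshotFiles, Set<String> liveSSTableFiles) {
        long total = 0;
        for (Map.Entry<String, Long> entry : snapshotFiles.entrySet()) {
            if (!liveSSTableFiles.contains(entry.getKey())) {
                total += entry.getValue();
            }
        }
        return total;
    }
}
```

Under this definition, a snapshot whose files all still exist as live sstables has a true size of 0, which is the result expected in the report.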






[jira] [Updated] (CASSANDRA-17399) a new SSTable created when single SSTable tombstone compact occurred in TWCS

2022-02-22 Thread Brandon Williams (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-17399:
-
Resolution: Not A Problem
Status: Resolved  (was: Triage Needed)

You should run sstableexpiredblockers on the table to find out why it can't be 
deleted.

> a new SSTable created when single SSTable tombstone compact occurred in TWCS
> 
>
> Key: CASSANDRA-17399
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17399
> Project: Cassandra
>  Issue Type: Bug
>Reporter: eason hao
>Priority: Normal
>  Labels: 3.10
>
> We found an issue where a new SSTable is created when a single-SSTable 
> tombstone compaction occurs. The Cassandra version is *cqlsh 5.0.1 | 
> Cassandra 3.10 | CQL spec 3.4.4*, and we use *TWCS*.
> The old SSTable, whose estimated droppable tombstones are above 0.9, is the 
> oldest SSTable in this table. It stores the oldest records and contains the 
> same partitions as newer SSTables; there is no expired-SSTable deletion 
> blocker for it.
> After the old SSTable had existed for almost TTL+gc_grace_seconds, it was 
> deleted, but I later found that a new SSTable had been created. The log shows 
> that the new SSTable was created from the old one: 42.920MiB is the size of 
> the old SSTable and 2.381MiB is the size of the new one.
>  
> {code:java}
> DEBUG [CompactionExecutor:44581] 
> 2022-02-21 11:11:15,429 CompactionTask.java:255 - Compacted 
> (e99c1550-9306-11ec-8461-0bfbe41d7414) 1 sstables to 
> [.../mc-317850-big,]
>  to level=0. 42.920MiB to 2.381MiB (~5% of original) in 31,424ms. Read 
> Throughput = 1.366MiB/s, Write Throughput = 77.602KiB/s, Row Throughput =
>  ~4,311/s. 194 total partitions merged to 194. Partition merge counts 
> were {1:194, } {code}
>  
> The new SSTable contains odd data: all the fields contain only 
> deletion_info, while the partition/clustering/x/y values are the same as in 
> the old SSTable.
>  
> {code:java}
> "cells" : [
>           { "name" : "x", "deletion_info" : { "local_delete_time" : 
> "2022-02-12T10:55:15Z" }
>           },
>           { "name" : "y", "deletion_info" : { "local_delete_time" : 
> "2022-02-12T10:55:15Z" }
>           },
> ...
> }{code}
> Also, the row counts differ: we found 129426 rows in the old SSTable and 
> 94694 rows in the new one.
>  
>  
> I also found TTL min: 0 in the sstablemetadata output, but when I dumped all 
> data from the old SSTable I could not find any record with ttl=0; all of the 
> data looks the same as the deletion_info records.
>  
> {code:java}
> Minimum timestamp: 1644740070072443
> Maximum timestamp: 1644742695566429
> SSTable min local deletion time: 1644740070
> SSTable max local deletion time: 1645433895
> Compressor: org.apache.cassandra.io.compress.LZ4Compressor
> Compression ratio: 0.01234938023191464
> TTL min: 0
> TTL max: 691200
> Estimated droppable tombstones: 0.9057755011460312 {code}
>  
>  
> I suspect this is not working as designed. My understanding is that once an 
> SSTable has lived longer than TTL+gc_grace_seconds, it should be deleted if 
> its estimated droppable tombstones exceed the threshold, so the behavior of 
> creating a new SSTable should be removed.
>  






[jira] [Updated] (CASSANDRA-17400) Fail build or warn when closeable reference is not closed in tests

2022-02-22 Thread Paulo Motta (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-17400:

Labels: lhf  (was: )

> Fail build or warn when closeable reference is not closed in tests
> --
>
> Key: CASSANDRA-17400
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17400
> Project: Cassandra
>  Issue Type: Task
>Reporter: Paulo Motta
>Priority: Normal
>  Labels: lhf
>
> I came across a recent stuck-test issue caused by an {{AutoCloseable}} 
> object not being closed properly, which leaked some references and 
> ultimately caused a deadlock.
> To prevent similar issues in the future we should add a check that fails or 
> warns when references are not closed during tests.
> If such a check already exists, we should look into fixing violations.






[jira] [Created] (CASSANDRA-17400) Fail build or warn when closeable reference is not closed in tests

2022-02-22 Thread Paulo Motta (Jira)
Paulo Motta created CASSANDRA-17400:
---

 Summary: Fail build or warn when closeable reference is not closed 
in tests
 Key: CASSANDRA-17400
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17400
 Project: Cassandra
  Issue Type: Task
Reporter: Paulo Motta


I came across a recent stuck-test issue caused by an {{AutoCloseable}} object 
not being closed properly, which leaked some references and ultimately caused 
a deadlock.

To prevent similar issues in the future we should add a check that fails or 
warns when references are not closed during tests.

If such a check already exists, we should look into fixing violations.
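
As a rough illustration of the kind of check described above, a test harness could register every closeable it opens and report whatever is still open at the end of a test. This is only a sketch: `LeakTracker`, `TrackedResource` and `LeakCheckExample` are hypothetical names, not existing Cassandra utilities, and the resources are identified by plain strings for simplicity.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: none of these classes exist in Cassandra.
final class LeakTracker {
    private static final Set<String> OPEN = Collections.synchronizedSet(new HashSet<>());

    static void opened(String name) { OPEN.add(name); }

    static void closed(String name) { OPEN.remove(name); }

    // Names of resources that were opened but never closed.
    static Set<String> leaked() {
        synchronized (OPEN) {
            return new HashSet<>(OPEN);
        }
    }

    static void reset() { OPEN.clear(); }
}

final class TrackedResource implements AutoCloseable {
    private final String name;

    TrackedResource(String name) {
        this.name = name;
        LeakTracker.opened(name);
    }

    @Override
    public void close() {
        LeakTracker.closed(name);
    }
}

final class LeakCheckExample {
    // try-with-resources closes the resource even if the body throws.
    static void closedProperly() {
        try (TrackedResource resource = new TrackedResource("controller-1")) {
            // work with the resource here
        }
    }

    // The problematic pattern: the reference is dropped without close().
    static void leakedReference() {
        new TrackedResource("controller-2");
    }
}
```

A test teardown could then fail (or log a warning) whenever `LeakTracker.leaked()` is non-empty.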






[jira] [Updated] (CASSANDRA-17399) a new SSTable created when single SSTable tombstone compact occurred in TWCS

2022-02-22 Thread eason hao (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

eason hao updated CASSANDRA-17399:
--
Labels: 3.10  (was: )







[jira] [Created] (CASSANDRA-17399) a new SSTable created when single SSTable tombstone compact occurred in TWCS

2022-02-22 Thread eason hao (Jira)
eason hao created CASSANDRA-17399:
-

 Summary: a new SSTable created when single SSTable tombstone 
compact occurred in TWCS
 Key: CASSANDRA-17399
 URL: https://issues.apache.org/jira/browse/CASSANDRA-17399
 Project: Cassandra
  Issue Type: Bug
Reporter: eason hao


We found an issue where a new SSTable is created when a single-SSTable 
tombstone compaction occurs. The Cassandra version is *cqlsh 5.0.1 | Cassandra 
3.10 | CQL spec 3.4.4*, and we use *TWCS*.

The old SSTable, whose estimated droppable tombstones are above 0.9, is the 
oldest SSTable in this table. It stores the oldest records and contains the 
same partitions as newer SSTables; there is no expired-SSTable deletion 
blocker for it.

After the old SSTable had existed for almost TTL+gc_grace_seconds, it was 
deleted, but I later found that a new SSTable had been created. The log shows 
that the new SSTable was created from the old one: 42.920MiB is the size of 
the old SSTable and 2.381MiB is the size of the new one.

 
{code:java}
DEBUG [CompactionExecutor:44581] 
2022-02-21 11:11:15,429 CompactionTask.java:255 - Compacted 
(e99c1550-9306-11ec-8461-0bfbe41d7414) 1 sstables to 
[.../mc-317850-big,]
 to level=0. 42.920MiB to 2.381MiB (~5% of original) in 31,424ms. Read 
Throughput = 1.366MiB/s, Write Throughput = 77.602KiB/s, Row Throughput =
 ~4,311/s. 194 total partitions merged to 194. Partition merge counts 
were {1:194, } {code}
 

The new SSTable contains odd data: all the fields contain only deletion_info, 
while the partition/clustering/x/y values are the same as in the old SSTable.

 
{code:java}
"cells" : [
          { "name" : "x", "deletion_info" : { "local_delete_time" : 
"2022-02-12T10:55:15Z" }
          },
          { "name" : "y", "deletion_info" : { "local_delete_time" : 
"2022-02-12T10:55:15Z" }
          },
...
}{code}
Also, the row counts differ: we found 129426 rows in the old SSTable and 94694 
rows in the new one.

 

 

I also found TTL min: 0 in the sstablemetadata output, but when I dumped all 
data from the old SSTable I could not find any record with ttl=0; all of the 
data looks the same as the deletion_info records.

 
{code:java}
Minimum timestamp: 1644740070072443
Maximum timestamp: 1644742695566429
SSTable min local deletion time: 1644740070
SSTable max local deletion time: 1645433895
Compressor: org.apache.cassandra.io.compress.LZ4Compressor
Compression ratio: 0.01234938023191464
TTL min: 0
TTL max: 691200
Estimated droppable tombstones: 0.9057755011460312 {code}
 

 

I suspect this is not working as designed. My understanding is that once an 
SSTable has lived longer than TTL+gc_grace_seconds, it should be deleted if 
its estimated droppable tombstones exceed the threshold, so the behavior of 
creating a new SSTable should be removed.
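
The expectation above can be written down as a small check. This is a simplified sketch, not Cassandra's compaction code: `TombstoneExpiry` and its method names are hypothetical, it assumes the SSTable's max local deletion time already accounts for the TTL, and it ignores the overlap checks against other SSTables that can prevent an SSTable from being dropped. The gc_grace_seconds value is not given in this report and has to be supplied by the caller.

```java
// Simplified, hypothetical sketch; real compaction also checks overlaps.
final class TombstoneExpiry {
    // Cut-off (in epoch seconds) before which tombstones may be purged.
    static long gcBefore(long nowSeconds, long gcGraceSeconds) {
        return nowSeconds - gcGraceSeconds;
    }

    // True when even the newest deletion time in the SSTable is older
    // than the gc_grace cut-off.
    static boolean allDataPastGcGrace(long maxLocalDeletionTime,
                                      long nowSeconds,
                                      long gcGraceSeconds) {
        return maxLocalDeletionTime < gcBefore(nowSeconds, gcGraceSeconds);
    }
}
```

For the metadata shown above, `maxLocalDeletionTime` would be 1645433895; whether the check passes then depends on the current time and the table's gc_grace_seconds.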

 


