[jira] [Updated] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables

2024-09-27 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19945:

   Authors: Branimir Lambov, Michael Marshall  (was: Branimir Lambov)
Resolution: Fixed
Status: Resolved  (was: Ready to Commit)

Committed as 
[377e6aa04fb67ea4220445988e85c9ebacb06db4|https://github.com/apache/cassandra/commit/377e6aa04fb67ea4220445988e85c9ebacb06db4].

> Reverse cursor and iteration support for Trie based memtables
> -
>
> Key: CASSANDRA-19945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19945
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Memtable, Local/SSTable
>Reporter: Ariel Weisberg
>Assignee: Branimir Lambov
>Priority: Normal
> Fix For: 5.x
>
>
> Cherry-pick 
> [https://github.com/datastax/cassandra/commit/196b931c677829d681406f14cf1da814ff5a6624]
> For Accord in particular, this is useful to avoid flushing memtables that 
> don't intersect with the range whose metadata is about to be GCed, so 
> we can flush less frequently or later.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables

2024-09-27 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19945:

Status: Needs Committer  (was: Patch Available)




[jira] [Updated] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables

2024-09-27 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19945:

Status: Ready to Commit  (was: Review In Progress)




[jira] [Updated] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables

2024-09-27 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19945:

Status: Review In Progress  (was: Needs Committer)




[jira] [Commented] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables

2024-09-27 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17885325#comment-17885325
 ] 

Branimir Lambov commented on CASSANDRA-19945:
-

{{ByteSourceComparisonTest}} checks this for selected examples (see 
[{{maybeAssertNotPrefix}}|https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/utils/bytecomparable/ByteSourceComparisonTest.java#L865]
 as well as {{maybeCheck41Properties}} (renamed to 50 now)). This is actually 
one of those things that you can't really check by tests; 
[{{ByteComparable.md}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/bytecomparable/ByteComparable.md]
 has sections that explain and prove the properties for every one of the types 
in use.




[jira] [Updated] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables

2024-09-25 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19945:

Source Control Link: https://github.com/apache/cassandra/pull/3571




[jira] [Updated] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables

2024-09-25 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19945:

Test and Documentation Plan: Unit tests
 Status: Patch Available  (was: Open)




[jira] [Updated] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables

2024-09-25 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19945:

Change Category: Performance
 Complexity: Normal
  Reviewers: Ariel Weisberg
 Status: Open  (was: Triage Needed)




[jira] [Commented] (CASSANDRA-19785) Possible memory leak in BTree.FastBuilder

2024-09-16 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882059#comment-17882059
 ] 

Branimir Lambov commented on CASSANDRA-19785:
-

The pull request already has my approval.

> Possible memory leak in BTree.FastBuilder 
> --
>
> Key: CASSANDRA-19785
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19785
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Paul Chandler
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Attachments: image-2024-07-19-08-44-56-714.png, 
> image-2024-07-19-08-45-17-289.png, image-2024-07-19-08-45-33-933.png, 
> image-2024-07-19-08-45-50-383.png, image-2024-07-19-08-46-06-919.png, 
> image-2024-07-19-08-46-42-979.png, image-2024-07-19-08-46-56-594.png, 
> image-2024-07-19-08-47-19-517.png, image-2024-07-19-08-47-34-582.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We are having a problem with the heap growing in size. This is a large 
> cluster (> 1,000 nodes) across a large number of DCs, running version 
> 4.0.11.
>  
> Each node has a 32GB heap, and the amount used continues to grow until it 
> reaches 30GB; it then struggles with multiple Full GC pauses, as can be seen 
> here:
> !image-2024-07-19-08-44-56-714.png!
> We took 2 heap dumps on one node a few days after it was restarted, and the 
> heap had grown by 2.7GB
>  
> 9{^}th{^} July
> !image-2024-07-19-08-45-17-289.png!
> 11{^}th{^} July
> !image-2024-07-19-08-45-33-933.png!
> This can be seen as mainly an increase of memory used by 
> FastThreadLocalThread, increasing from 5.92GB to 8.53GB
> !image-2024-07-19-08-45-50-383.png!
> !image-2024-07-19-08-46-06-919.png!
> Looking deeper into this, it can be seen that the growing heap is contained 
> within the threads for the MutationStage, Native-transport-Requests, 
> ReadStage etc. We would expect the memory used within these threads to be 
> short lived, and not to grow as time goes on.  We recently increased the size of 
> these thread pools, and that has increased the size of the problem.
>  
> Top memory usage for FastThreadLocalThread
> 9{^}th{^} July
> !image-2024-07-19-08-46-42-979.png!
> 11{^}th{^} July
> !image-2024-07-19-08-46-56-594.png!
> This has led us to investigate whether there could be a memory leak, and we 
> have found the following issues with retained references in 
> BTree.FastBuilder objects. The issue appears to stem from the reset() method, 
> which does not properly clear all buffers.  We are not really sure how 
> BTree.FastBuilder works, but this is our analysis of where a leak might 
> occur.
>  
> Specifically:
> Leaf Buffer Not Being Cleared:
> When leaf().count is 0, the statement Arrays.fill(leaf().buffer, 0, 
> leaf().count, null); does not clear the buffer because the end index is 0. 
> This leaves the buffer with references to potentially large objects, 
> preventing garbage collection and increasing heap usage.
> Branch inUse Property:
> If the inUse property of the branch is set to false elsewhere in the code, 
> the while loop while (branch != null && branch.inUse) does not execute, 
> resulting in uncleared branch buffers and retained references.
>  
> This is based on the following observations:
>     Heap Dumps: Analysis of heap dumps shows that leaf().count is often 0, 
> and as a result, the buffer is not being cleared, leading to high heap 
> utilization.
> !image-2024-07-19-08-47-19-517.png!
>     Remote Debugging: Debugging sessions indicate that the drain() method 
> sets count to 0, and the inUse flag for the parent branch is set to false, 
> preventing the while loop in reset() from clearing the branch buffers.
> !image-2024-07-19-08-47-34-582.png!
>  
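
The first observation in the description can be reproduced in isolation. The following is a minimal sketch, not the actual BTree.FastBuilder code: LeafSketch, its fields, and its drain/reset methods are hypothetical stand-ins. It only illustrates that Arrays.fill with an end index of 0 is a no-op, so a reset driven by an already-zeroed count leaves stale references in the buffer.

```java
import java.util.Arrays;

// Hypothetical stand-in for a reusable leaf buffer; NOT Cassandra's BTree.FastBuilder.
class LeafSketch
{
    final Object[] buffer = new Object[4];
    int count;

    void add(Object o) { buffer[count++] = o; }

    // Mimics a drain step that zeroes the count but leaves the buffer contents alone.
    void drain() { count = 0; }

    // Clears only the range [0, count): does nothing once count is already 0.
    void resetByCount()
    {
        Arrays.fill(buffer, 0, count, null);
        count = 0;
    }

    // Clears the whole buffer regardless of count.
    void resetAll()
    {
        Arrays.fill(buffer, null);
        count = 0;
    }

    // Number of non-null references still held by the buffer.
    long retained() { return Arrays.stream(buffer).filter(o -> o != null).count(); }

    public static void main(String[] args)
    {
        LeafSketch leaf = new LeafSketch();
        leaf.add("a");
        leaf.add("b");
        leaf.drain();
        leaf.resetByCount();
        System.out.println("retained after count-based reset: " + leaf.retained());
        leaf.resetAll();
        System.out.println("retained after full reset: " + leaf.retained());
    }
}
```

Whether the real builder reaches this state is exactly what the heap dumps and debugging sessions above are meant to establish; the sketch only shows the mechanism.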






[jira] [Updated] (CASSANDRA-17298) Test Failure: org.apache.cassandra.cql3.MemtableSizeTest

2024-08-30 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-17298:

Status: Needs Committer  (was: Review In Progress)

> Test Failure: org.apache.cassandra.cql3.MemtableSizeTest
> 
>
> Key: CASSANDRA-17298
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17298
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Josh McKenzie
>Assignee: Dmitry Konstantinov
>Priority: Normal
> Fix For: 4.0.x, 4.1.x
>
> Attachments: analyzed_objects.svg, structure_example.svg
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> [https://ci-cassandra.apache.org/job/Cassandra-4.0/313/testReport/org.apache.cassandra.cql3/MemtableSizeTest/testTruncationReleasesLogSpace_2/]
>  Failed 4 times in the last 30 runs. Flakiness: 27%, Stability: 86%
> Error Message
> Expected heap usage close to 49.930MiB, got 41.542MiB.
> {code}
> Stacktrace
> junit.framework.AssertionFailedError: Expected heap usage close to 49.930MiB, 
> got 41.542MiB.
>   at 
> org.apache.cassandra.cql3.MemtableSizeTest.testSize(MemtableSizeTest.java:130)
>   at org.apache.cassandra.Util.runCatchingAssertionError(Util.java:644)
>   at org.apache.cassandra.Util.flakyTest(Util.java:669)
>   at 
> org.apache.cassandra.cql3.MemtableSizeTest.testTruncationReleasesLogSpace(MemtableSizeTest.java:61)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  {code}
> *UPDATE:* It was discovered that unit tests were running with 
> memtable_allocation_type: offheap_objects when we ship C* with heap_buffers.
> So we changed that in CASSANDRA-19326, now we test with 
> memtable_allocation_type: heap_buffers. As a result, this test now fails all 
> the time on 4.0 and 4.1. 






[jira] [Updated] (CASSANDRA-17298) Test Failure: org.apache.cassandra.cql3.MemtableSizeTest

2024-08-30 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-17298:

Reviewers: Branimir Lambov, Branimir Lambov
   Branimir Lambov, Branimir Lambov  (was: Branimir Lambov)
   Status: Review In Progress  (was: Patch Available)




[jira] [Commented] (CASSANDRA-17298) Test Failure: org.apache.cassandra.cql3.MemtableSizeTest

2024-08-30 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17878003#comment-17878003
 ] 

Branimir Lambov commented on CASSANDRA-17298:
-

+1

Both patches look good to me.




[jira] [Commented] (CASSANDRA-17298) Test Failure: org.apache.cassandra.cql3.MemtableSizeTest

2024-08-28 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17877296#comment-17877296
 ] 

Branimir Lambov commented on CASSANDRA-17298:
-

Thank you for the very detailed investigation.

This has been a source of annoyance, with many attempts to fix since the test 
was first introduced, but it's necessary because before the test we had reached 
about 2x difference between the memtable's understanding of its on-heap size 
and what it actually used. One consideration we've previously had around this 
is that adjusting the memory usage reporting may cause memtables to flush 
earlier and change the behavior of existing clusters too much for a patch 
release. In this case it looks like the difference is on the order of 10%, 
which I personally would not see as a problem.

I wonder if we shouldn't backport most of that 5.0 patch so that we start 
testing all allocation strategies in 4.x as well. Also, am I understanding 
correctly that there is also something (EMPTY_LEAF?) that we are not tracking 
correctly in 5.0?

On the open question, there may be a reason to use both  {{BTree.sizeOnHeapOf}} 
and {{BTree.sizeOfStructureOnHeap}} (e.g. some BTree-building methods always 
share the size map, others never, and if the caller knows which one it is it 
could choose between the two). From a quick glance it looks like we use the 
latter version for {{Columns}}, and these are built from sorted using shared 
size maps, thus this appears to be the right thing to do. However, the names of 
the two methods should reflect this difference, and it should also be 
explained in the Javadoc.




[jira] [Commented] (CASSANDRA-19779) direct IO support is always evaluated to false upon the very first start of a node

2024-07-31 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869890#comment-17869890
 ] 

Branimir Lambov commented on CASSANDRA-19779:
-

Unsupported {{FileUtils.getBlockSize}} means unsupported direct I/O as well, 
doesn't it? If not, I wonder if it makes sense to use a default block size of 
4k instead of failing.

> direct IO support is always evaluated to false upon the very first start of a 
> node
> --
>
> Key: CASSANDRA-19779
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19779
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Tools
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-rc
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When I extract the distribution tarball and I want to use tools in tools/bin, 
> there is this warn log visible every time for tools when they are started 
> (does not happen on "help" command, obviously)
> {code:java}
> WARN  14:25:11,835 Unable to determine block size for commit log directory: 
> null {code}
> This is because we introduced this (1) in CASSANDRA-18464
> What that does is that it will go and try to create a temporary file in 
> commit log directory to get "block size" for a "file store" that file is in.
> The problem with that is that when we just extract a tarball and run the 
> tools - Cassandra was never started - then such commit log directory does not 
> exist yet, so it tries to create a temporary file in a non-existing 
> directory, which fails, hence the log message.
> The fix is to check if commitlog dir exists and return / skip the resolution 
> of block size if it does not.
> Another approach might be to check if this is executed in the context of a 
> tool and skip the resolution altogether. The problem with this is that 
> not all tools we have in bin/log call 
> DatabaseDescriptor.toolInitialization(), so we might combine these two.
> (1) 
> [https://github.com/apache/cassandra/blob/cassandra-5.0/src/java/org/apache/cassandra/config/DatabaseDescriptor.java#L1455-L1462]
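
The directory check proposed in the description and the 4k default raised in the comment above can be sketched together. This is a minimal illustration, not the DatabaseDescriptor or FileUtils code: BlockSizeSketch, blockSizeOrDefault, and DEFAULT_BLOCK_SIZE are hypothetical names, and it queries the JDK's FileStore.getBlockSize() (Java 10+) directly rather than creating a temporary file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; NOT the DatabaseDescriptor or FileUtils implementation.
class BlockSizeSketch
{
    // The "4k" fallback is an assumption taken from the suggestion in the comment.
    static final long DEFAULT_BLOCK_SIZE = 4096;

    static long blockSizeOrDefault(Path dir)
    {
        // Skip the resolution when the directory has not been created yet,
        // e.g. when tools run from a freshly extracted tarball.
        if (!Files.isDirectory(dir))
            return DEFAULT_BLOCK_SIZE;
        try
        {
            return Files.getFileStore(dir).getBlockSize();
        }
        catch (IOException | UnsupportedOperationException e)
        {
            // The file store cannot report a block size: fall back instead of failing.
            return DEFAULT_BLOCK_SIZE;
        }
    }

    public static void main(String[] args)
    {
        System.out.println(blockSizeOrDefault(Paths.get("no-such-commitlog-dir")));
        System.out.println(blockSizeOrDefault(Paths.get(System.getProperty("java.io.tmpdir"))));
    }
}
```

Whether a silent fallback is acceptable when direct I/O is requested is the open question in the comment; the sketch does not settle it.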






[jira] [Commented] (CASSANDRA-19764) Corruption can occur while a field is being added to UDT clustering key

2024-07-12 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865472#comment-17865472
 ] 

Branimir Lambov commented on CASSANDRA-19764:
-

{quote}the write will timeout at QUORUM consistency, as expected.
{quote}
In other words, TCM makes it practically impossible to run into the situation 
this test is meant to exercise?

Coming back to
{quote}it seems like a bad idea to allow altering UDTs once they are part of a 
primary key...
{quote}
Adding a new field to a UDT key is actually okay, we treat old values as 
shorter and can correctly order them, as long as we know all the types.

But there may be a short amount of time where a replica does not yet know the 
type of the added field (likely only really a thing before TCM). If it then 
accepts a write without knowing the types as it currently does, it can corrupt 
itself. It makes sense to just reject this write, even more so if TCM or 
something else prevents schemas from going out of sync altogether.
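
Both points can be sketched under simplifying assumptions. Here UDT values are modelled as lists of already-comparable strings, a value with fewer fields sorts before any longer value that shares its prefix, and UdtKeySketch, compare, and validate are hypothetical names rather than Cassandra APIs.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of UDT key handling; NOT Cassandra's UserType or comparator code.
class UdtKeySketch
{
    // Field-by-field comparison; a value that is a strict prefix of the other sorts first.
    static int compare(List<String> a, List<String> b)
    {
        int common = Math.min(a.size(), b.size());
        for (int i = 0; i < common; i++)
        {
            int c = a.get(i).compareTo(b.get(i));
            if (c != 0)
                return c;
        }
        return Integer.compare(a.size(), b.size());
    }

    // Rejects a value carrying more fields than this replica has types for.
    static void validate(List<String> value, int knownFieldCount)
    {
        if (value.size() > knownFieldCount)
            throw new IllegalArgumentException("UDT value has " + value.size()
                                               + " fields, but the schema knows only " + knownFieldCount);
    }

    public static void main(String[] args)
    {
        List<String> oldValue = Arrays.asList("1");
        List<String> newValue = Arrays.asList("1", "2");
        System.out.println("old sorts before new: " + (compare(oldValue, newValue) < 0));
        validate(oldValue, 1); // accepted: the replica knows this field
        try
        {
            validate(newValue, 1);
        }
        catch (IllegalArgumentException e)
        {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```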

> Corruption can occur while a field is being added to UDT clustering key
> ---
>
> Key: CASSANDRA-19764
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19764
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/UDT
>Reporter: Branimir Lambov
>Priority: Normal
>
> CASSANDRA-15938 made some improvements in how unknown components in UDTs are 
> treated. Unfortunately this can cause corruption as soon as more than one 
> value is inserted for a partition.
> The problem can be easily shown by modifying the 
> {{FrozenUDTTest.testDivergentSchema}} test to insert two entries in the wrong 
> order:
> {code:java}
> cluster.coordinator(1).execute("insert into " + KEYSPACE + ".x (id, ck, i) VALUES (?, " + json(1, 2) + ", ? )",
>                                ConsistencyLevel.ALL, 1, 2);
> cluster.coordinator(1).execute("insert into " + KEYSPACE + ".x (id, ck, i) VALUES (?, " + json(1, 1) + ", ? )",
>                                ConsistencyLevel.ALL, 1, 1);
> {code}
> after which we can get corrupted sstable state, shown as a
> {code:java}
> java.lang.AssertionError: Lower bound [SSTABLE_LOWER_BOUND(1) ]is bigger than 
> first returned value [Row: ck=1 | i=2]
> {code}
> exception, or results like {{[[1],[2],[2],[1]]}} or {{[[2],[1],[2]]}} for 
> {{select i from x WHERE id = 1}} depending on which node we use as 
> coordinator.
> Because we don't know the type of new fields and cannot properly order 
> entries, we need to outright reject UDT keys that are not compatible with a 
> replica's schema.






[jira] [Commented] (CASSANDRA-19764) Corruption can occur while a field is being added to UDT clustering key

2024-07-11 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865186#comment-17865186
 ] 

Branimir Lambov commented on CASSANDRA-19764:
-

I'm not sure the test is actually getting diverging schemas with 
{{cluster.coordinator(1).execute("alter type " + KEYSPACE + ".a add bar text", 
ConsistencyLevel.QUORUM)}}.

Using the original {{cluster.get(1).executeInternal("alter type " + KEYSPACE + 
".a add bar text")}} in
{code:java}
@Test
public void testDivergentSchemas() throws Throwable
{
    try (Cluster cluster = init(Cluster.create(2)))
    {
        cluster.schemaChange("create type " + KEYSPACE + ".a (foo text)");
        cluster.schemaChange("create table " + KEYSPACE + ".x (id int, ck frozen<a>, i int, primary key (id, ck))");
        // Alter the type on node 1 only, so that the two nodes' schemas diverge.
        cluster.get(1).executeInternal("alter type " + KEYSPACE + ".a add bar text");
        cluster.coordinator(1).execute("insert into " + KEYSPACE + ".x (id, ck, i) VALUES (?, " + json(1, 2) + ", ? )", ConsistencyLevel.ALL, 1, 2);
        cluster.coordinator(1).execute("insert into " + KEYSPACE + ".x (id, ck, i) VALUES (?, " + json(1, 1) + ", ? )", ConsistencyLevel.ALL, 1, 1);
        cluster.get(2).flush(KEYSPACE);

        Object[][] res1 = cluster.coordinator(1).execute("select i from " + KEYSPACE + ".x WHERE id = 1", ConsistencyLevel.ALL);
        Object[][] res2 = cluster.coordinator(2).execute("select i from " + KEYSPACE + ".x WHERE id = 1", ConsistencyLevel.ALL);

        assertArrayEquals(res1, res2);
    }
}
{code}
fails, at least on 5.0.

> Corruption can occur while a field is being added to UDT clustering key
> ---
>
> Key: CASSANDRA-19764
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19764
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/UDT
>Reporter: Branimir Lambov
>Priority: Normal
>
> CASSANDRA-15938 made some improvements in how unknown components in UDTs are 
> treated. Unfortunately this can cause corruption as soon as more than one 
> value is inserted for a partition.
> The problem can be easily shown by modifying the 
> {{FrozenUDTTest.testDivergentSchema}} test to insert two entries in the wrong 
> order:
> {code:java}
> cluster.coordinator(1).execute("insert into " + KEYSPACE + ".x (id, ck, i) 
> VALUES (?, " + json(1, 2) + ", ? )", ConsistencyLevel.ALL,
> 1, 2);
> cluster.coordinator(1).execute("insert into " + KEYSPACE + ".x (id, ck, i) 
> VALUES (?, " + json(1, 1) + ", ? )", ConsistencyLevel.ALL,
> 1, 1);
> {code}
> after which we can get corrupted sstable state, shown as a
> {code:java}
> java.lang.AssertionError: Lower bound [SSTABLE_LOWER_BOUND(1) ]is bigger than 
> first returned value [Row: ck=1 | i=2]
> {code}
> exception, or results like {{[[1],[2],[2],[1]]}} or {{[[2],[1],[2]]}} for 
> {{select i from x WHERE id = 1}} depending on which node we use as 
> coordinator.
> Because we don't know the type of new fields and cannot properly order 
> entries, we need to outright reject UDT keys that are not compatible with a 
> replica's schema.






[jira] [Created] (CASSANDRA-19764) Corruption can occur while a field is being added to UDT clustering key

2024-07-11 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19764:
---

 Summary: Corruption can occur while a field is being added to UDT 
clustering key
 Key: CASSANDRA-19764
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19764
 Project: Cassandra
  Issue Type: Bug
  Components: Feature/UDT
Reporter: Branimir Lambov


CASSANDRA-15938 made some improvements in how unknown components in UDTs are 
treated. Unfortunately this can cause corruption as soon as more than one value 
is inserted for a partition.

The problem can be easily shown by modifying the 
{{FrozenUDTTest.testDivergentSchema}} test to insert two entries in the wrong 
order:
{code:java}
cluster.coordinator(1).execute("insert into " + KEYSPACE + ".x (id, ck, i) 
VALUES (?, " + json(1, 2) + ", ? )", ConsistencyLevel.ALL,
1, 2);
cluster.coordinator(1).execute("insert into " + KEYSPACE + ".x (id, ck, i) 
VALUES (?, " + json(1, 1) + ", ? )", ConsistencyLevel.ALL,
1, 1);
{code}
after which we can get corrupted sstable state, shown as a
{code:java}
java.lang.AssertionError: Lower bound [SSTABLE_LOWER_BOUND(1) ]is bigger than 
first returned value [Row: ck=1 | i=2]
{code}
exception, or results like {{[[1],[2],[2],[1]]}} or {{[[2],[1],[2]]}} for 
{{select i from x WHERE id = 1}} depending on which node we use as coordinator.

Because we don't know the type of new fields and cannot properly order entries, 
we need to outright reject UDT keys that are not compatible with a replica's 
schema.
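
As a toy illustration of why entries cannot be ordered consistently (this is not Cassandra's actual comparator; the {{Key}} class and both comparators below are hypothetical stand-ins), consider one replica that can compare both UDT fields and another that only knows the first one. The second replica sees the two keys as ties and keeps them in insertion order, so the two replicas disagree on the order of the same data:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class DivergentOrderSketch
{
    // Hypothetical stand-in for a UDT clustering value with fields foo and bar.
    static final class Key
    {
        final String foo;
        final String bar;

        Key(String foo, String bar)
        {
            this.foo = foo;
            this.bar = bar;
        }

        @Override
        public String toString()
        {
            return "(" + foo + "," + bar + ")";
        }
    }

    // A replica that knows both fields orders by foo, then by bar.
    static final Comparator<Key> NEW_SCHEMA =
        Comparator.comparing((Key k) -> k.foo).thenComparing(k -> k.bar);

    // A replica that only knows foo cannot take bar into account (assumption for this sketch).
    static final Comparator<Key> OLD_SCHEMA = Comparator.comparing((Key k) -> k.foo);

    static List<Key> sorted(List<Key> inserted, Comparator<Key> order)
    {
        List<Key> copy = new ArrayList<>(inserted);
        copy.sort(order); // List.sort is stable, so ties keep their insertion order
        return copy;
    }

    public static void main(String[] args)
    {
        // Same foo, different bar, inserted in the "wrong" order.
        List<Key> inserted = List.of(new Key("1", "2"), new Key("1", "1"));
        System.out.println(sorted(inserted, NEW_SCHEMA)); // prints [(1,1), (1,2)]
        System.out.println(sorted(inserted, OLD_SCHEMA)); // prints [(1,2), (1,1)]
    }
}
```

Once data ordered one way is merged or read by code that assumes the other order, sortedness invariants no longer hold, which is consistent with the assertion failure and duplicated rows described above.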






[jira] [Commented] (CASSANDRA-19601) Test failure: test_change_durable_writes

2024-05-15 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17846557#comment-17846557
 ] 

Branimir Lambov commented on CASSANDRA-19601:
-

I don't know why such a flush would be necessary.

In terms of how to change the test, one thing we can try is to check that the 
commit log's dirty regions don't contain anything from that keyspace, but I 
don't know how we could access these from a python dtest. It might make sense 
to convert the test to an in-jvm one, where, AFAIU, such things are not hard to do.

> Test failure: test_change_durable_writes
> 
>
> Key: CASSANDRA-19601
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19601
> Project: Cassandra
>  Issue Type: Bug
>  Components: CI
>Reporter: Ekaterina Dimitrova
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
>
> Failing on trunk:
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1880/testReport/junit/dtest-latest.configuration_test/TestConfiguration/Tests___dtest_latest_jdk11_31_64___test_change_durable_writes/]
> [https://app.circleci.com/pipelines/github/blerer/cassandra/400/workflows/893a0edb-9181-4981-b542-77228c8bc975/jobs/10941/tests]
> {code:java}
> AssertionError: Commitlog was written with durable writes disabled
> assert 90112 == 86016
>   +90112
>   -86016
> self = 
> @pytest.mark.timeout(60*30)
> def test_change_durable_writes(self):
> """
> @jira_ticket CASSANDRA-9560
> 
> Test that changes to the DURABLE_WRITES option on keyspaces is
> respected in subsequent writes.
> 
> This test starts by writing a dataset to a cluster and asserting that
> the commitlogs have been written to. The subsequent test depends on
> the assumption that this dataset triggers an fsync.
> 
> After checking this assumption, the test destroys the cluster and
> creates a fresh one. Then it tests that DURABLE_WRITES is respected 
> by:
> 
> - creating a keyspace with DURABLE_WRITES set to false,
> - using ALTER KEYSPACE to set its DURABLE_WRITES option to true,
> - writing a dataset to this keyspace that is known to trigger a 
> commitlog fsync,
> - asserting that the commitlog has grown in size since the data was 
> written.
> """
> cluster = self.cluster
> cluster.set_batch_commitlog(enabled=True, use_batch_window = 
> cluster.version() < '5.0')
> 
> cluster.set_configuration_options(values={'commitlog_segment_size_in_mb': 1})
> 
> cluster.populate(1).start()
> durable_node = cluster.nodelist()[0]
> 
> durable_init_size = commitlog_size(durable_node)
> durable_session = self.patient_exclusive_cql_connection(durable_node)
> 
> # test assumption that write_to_trigger_fsync actually triggers a 
> commitlog fsync
> durable_session.execute("CREATE KEYSPACE ks WITH REPLICATION = 
> {'class': 'SimpleStrategy', 'replication_factor': 1} "
> "AND DURABLE_WRITES = true")
> durable_session.execute('CREATE TABLE ks.tab (key int PRIMARY KEY, a 
> int, b int, c int)')
> logger.debug('commitlog size diff = ' + 
> str(commitlog_size(durable_node) - durable_init_size))
> write_to_trigger_fsync(durable_session, 'ks', 'tab')
> logger.debug('commitlog size diff = ' + 
> str(commitlog_size(durable_node) - durable_init_size))
> 
> assert commitlog_size(durable_node) > durable_init_size, \
> "This test will not work in this environment; 
> write_to_trigger_fsync does not trigger fsync."
> 
> durable_session.shutdown()
> cluster.stop()
> cluster.clear()
> 
> cluster.set_batch_commitlog(enabled=True, use_batch_window = 
> cluster.version() < '5.0')
> 
> cluster.set_configuration_options(values={'commitlog_segment_size_in_mb': 1})
> cluster.start()
> node = cluster.nodelist()[0]
> session = self.patient_exclusive_cql_connection(node)
> 
> # set up a keyspace without durable writes, then alter it to use them
> session.execute("CREATE KEYSPACE ks WITH REPLICATION = {'class': 
> 'SimpleStrategy', 'replication_factor': 1} "
> "AND DURABLE_WRITES = false")
> session.execute('CREATE TABLE ks.tab (key int PRIMARY KEY, a int, b 
> int, c int)')
> init_size = commitlog_size(node)
> write_to_trigger_fsync(session, 'ks', 'tab')
> >   assert commitlog_size(node) == init_size, "Commitlog was written with 
> > durable writes disabled"
> E   AssertionError: Commitlog was written with durable writes disabled
> E   assert 90112 == 86016
> E +901

[jira] [Updated] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users

2024-04-05 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18753:

Source Control Link: https://github.com/apache/cassandra/pull/2896
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

> Add an optimized default configuration to tests and make it available for new 
> users
> ---
>
> Key: CASSANDRA-18753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18753
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Urgent
> Fix For: 5.0-rc, 5.x
>
> Attachments: 
> CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch,
>  
> DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch
>
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>
> We currently offer only one sample configuration file with Cassandra, and 
> that file is deliberately configured to disable all new functionality and 
> incompatible improvements. This works well for legacy users that want to have 
> a painless upgrade, but is a very bad choice for new users, or anyone wanting 
> to make comparisons between Cassandra versions or between Cassandra and other 
> databases.
> We offer very little indication, in the database packaging itself, that there 
> are well-tested configuration choices that can solve known problems and 
> dramatically improve performance. This is guaranteed to paint the database in 
> a worse light than it deserves, and will very likely hurt adoption.
> We should find a way to offer a very easy way of choosing between "optimized" 
> and "compatible" defaults. At a minimum, we could provide alternate yaml files. 
> Alternatively, we could build on the {{storage_compatibility_mode}} concept 
> to grow it into a setting that not only enables/disables certain settings, 
> but also changes their default values.






[jira] [Commented] (CASSANDRA-19471) Commitlog with direct io fails test_change_durable_writes

2024-03-22 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17829805#comment-17829805
 ] 

Branimir Lambov commented on CASSANDRA-19471:
-

They are only for the IAE, which is a serious issue and IMHO a blocker for 
5.0.

I have not investigated the commitlog being written with durable writes off 
which is a much more benign issue. It is likely caused by the preparation of 
the direct I/O segments writing and flushing the header and first sync marker 
in advance of any use of the segment.

> Commitlog with direct io fails test_change_durable_writes
> -
>
> Key: CASSANDRA-19471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19471
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Brandon Williams
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> With the commitlog_disk_access_mode set to direct, and the improved 
> configuration_test.py::TestConfiguration::test_change_durable_writes from 
> CASSANDRA-19465, this fails with either:
> {noformat}
>  AssertionError: Commitlog was written with durable writes disabled
> {noformat}
> Or what appears to be the original exception reported in CASSANDRA-19465:
> {noformat}
>   node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,465 
> StorageService.java:631 - Stopping native transport
>   node1: ERROR [MutationStage-5] 2024-03-14 17:16:08,465 
> StorageProxy.java:1670 - Failed to apply mutation locally :
>   java.lang.IllegalArgumentException: newPosition > limit: (1048634 > 1048576)
> at java.base/java.nio.Buffer.createPositionException(Buffer.java:341)
> at java.base/java.nio.Buffer.position(Buffer.java:316)
> at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1516)
> at 
> java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:321)
> at 
> java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:73)
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegment.allocate(CommitLogSegment.java:216)
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManagerStandard.allocate(CommitLogSegmentManagerStandard.java:52)
> at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:307)
> at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.addToCommitLog(CassandraKeyspaceWriteHandler.java:99)
> at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.beginWrite(CassandraKeyspaceWriteHandler.java:53)
> at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:612)
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:497)
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:244)
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:264)
> at 
> org.apache.cassandra.service.StorageProxy$4.runMayThrow(StorageProxy.java:1664)
> at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2624)
> at 
> org.apache.cassandra.concurrent.ExecutionFailure$2.run(ExecutionFailure.java:163)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
> at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:833)
>   node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,470 
> StorageService.java:636 - Stopping gossiper
> {noformat}






[jira] [Comment Edited] (CASSANDRA-19471) Commitlog with direct io fails test_change_durable_writes

2024-03-19 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828364#comment-17828364
 ] 

Branimir Lambov edited comment on CASSANDRA-19471 at 3/19/24 2:43 PM:
--

I believe the problem is that the buffer's limit (set 
[here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/DirectIOSegment.java#L208])
 is not the same as the buffer's capacity (from which {{endOfBuffer}} is set 
[here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java#L177]).

I guess what we want is to change the former to set the limit first and then 
apply {{{}slice{}}}. We probably also want the aligning path above it to go 
through this slicing to set the capacity appropriately. I'd also change the 
assertions that follow to make sure the limit and capacity of the prepared 
buffer match, and are equal to the segment size.
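
A minimal sketch of the limit-versus-capacity distinction, using only the standard {{java.nio.ByteBuffer}} API. The class name, the {{limitThenSlice}} helper and the sizes are hypothetical, and a heap buffer stands in for the aligned direct buffer; this is not the code in {{DirectIOSegment}}:

```java
import java.nio.ByteBuffer;

public class SliceLimitSketch
{
    // Hypothetical helper: returns a view whose capacity and limit both equal size.
    static ByteBuffer limitThenSlice(ByteBuffer original, int size)
    {
        ByteBuffer dup = original.duplicate(); // leave the caller's position/limit untouched
        dup.position(0).limit(size);           // set the limit first...
        return dup.slice();                    // ...then slice, so capacity == limit == size
    }

    public static void main(String[] args)
    {
        int segmentSize = 1024;
        // Assumption: the backing buffer is larger than the segment, e.g. to leave room for alignment.
        ByteBuffer oversized = ByteBuffer.allocate(segmentSize + 512);

        // Setting only the limit leaves the capacity at the allocated size.
        ByteBuffer limitedOnly = oversized.duplicate();
        limitedOnly.limit(segmentSize);
        System.out.println(limitedOnly.capacity() + " vs " + limitedOnly.limit()); // prints 1536 vs 1024

        // Limit first, then slice: capacity and limit agree.
        ByteBuffer sliced = limitThenSlice(oversized, segmentSize);
        System.out.println(sliced.capacity() + " vs " + sliced.limit()); // prints 1024 vs 1024
    }
}
```

Any code that derives an end offset from {{capacity()}} would run past the limit in the first case, which matches the shape of the {{newPosition > limit}} exception in the report.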


was (Author: blambov):
I believe the problem is that the buffer's limit (set 
[here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/DirectIOSegment.java#L208])
 is not the same as the buffer's capacity (from which `endOfBuffer` is set 
[here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java#L177]).

I guess what we want is to change the former to set the limit first and then 
apply `slice`.

> Commitlog with direct io fails test_change_durable_writes
> -
>
> Key: CASSANDRA-19471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19471
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Brandon Williams
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> With the commitlog_disk_access_mode set to direct, and the improved 
> configuration_test.py::TestConfiguration::test_change_durable_writes from 
> CASSANDRA-19465, this fails with either:
> {noformat}
>  AssertionError: Commitlog was written with durable writes disabled
> {noformat}
> Or what appears to be the original exception reported in CASSANDRA-19465:
> {noformat}
>   node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,465 
> StorageService.java:631 - Stopping native transport
>   node1: ERROR [MutationStage-5] 2024-03-14 17:16:08,465 
> StorageProxy.java:1670 - Failed to apply mutation locally :
>   java.lang.IllegalArgumentException: newPosition > limit: (1048634 > 1048576)
> at java.base/java.nio.Buffer.createPositionException(Buffer.java:341)
> at java.base/java.nio.Buffer.position(Buffer.java:316)
> at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1516)
> at 
> java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:321)
> at 
> java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:73)
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegment.allocate(CommitLogSegment.java:216)
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManagerStandard.allocate(CommitLogSegmentManagerStandard.java:52)
> at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:307)
> at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.addToCommitLog(CassandraKeyspaceWriteHandler.java:99)
> at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.beginWrite(CassandraKeyspaceWriteHandler.java:53)
> at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:612)
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:497)
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:244)
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:264)
> at 
> org.apache.cassandra.service.StorageProxy$4.runMayThrow(StorageProxy.java:1664)
> at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2624)
> at 
> org.apache.cassandra.concurrent.ExecutionFailure$2.run(ExecutionFailure.java:163)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
> at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:833)
>   node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,470 
> StorageService.java:636 - Stopping gossiper
> {noformat}






[jira] [Commented] (CASSANDRA-19471) Commitlog with direct io fails test_change_durable_writes

2024-03-19 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828364#comment-17828364
 ] 

Branimir Lambov commented on CASSANDRA-19471:
-

I believe the problem is that the buffer's limit (set 
[here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/DirectIOSegment.java#L208])
 is not the same as the buffer's capacity (from which `endOfBuffer` is set 
[here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java#L177]).

I guess what we want is to change the former to set the limit first and then 
apply `slice`.

> Commitlog with direct io fails test_change_durable_writes
> -
>
> Key: CASSANDRA-19471
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19471
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Brandon Williams
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> With the commitlog_disk_access_mode set to direct, and the improved 
> configuration_test.py::TestConfiguration::test_change_durable_writes from 
> CASSANDRA-19465, this fails with either:
> {noformat}
>  AssertionError: Commitlog was written with durable writes disabled
> {noformat}
> Or what appears to be the original exception reported in CASSANDRA-19465:
> {noformat}
>   node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,465 
> StorageService.java:631 - Stopping native transport
>   node1: ERROR [MutationStage-5] 2024-03-14 17:16:08,465 
> StorageProxy.java:1670 - Failed to apply mutation locally :
>   java.lang.IllegalArgumentException: newPosition > limit: (1048634 > 1048576)
> at java.base/java.nio.Buffer.createPositionException(Buffer.java:341)
> at java.base/java.nio.Buffer.position(Buffer.java:316)
> at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1516)
> at 
> java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:321)
> at 
> java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:73)
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegment.allocate(CommitLogSegment.java:216)
> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManagerStandard.allocate(CommitLogSegmentManagerStandard.java:52)
> at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:307)
> at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.addToCommitLog(CassandraKeyspaceWriteHandler.java:99)
> at 
> org.apache.cassandra.db.CassandraKeyspaceWriteHandler.beginWrite(CassandraKeyspaceWriteHandler.java:53)
> at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:612)
> at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:497)
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:244)
> at org.apache.cassandra.db.Mutation.apply(Mutation.java:264)
> at 
> org.apache.cassandra.service.StorageProxy$4.runMayThrow(StorageProxy.java:1664)
> at 
> org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2624)
> at 
> org.apache.cassandra.concurrent.ExecutionFailure$2.run(ExecutionFailure.java:163)
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
> at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:833)
>   node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,470 
> StorageService.java:636 - Stopping gossiper
> {noformat}






[jira] [Commented] (CASSANDRA-19460) Fix tests to work with ULID SSTable identifiers to enable uuid_sstable_identifiers_enabled in cassandra-latest.yaml

2024-03-08 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824756#comment-17824756
 ] 

Branimir Lambov commented on CASSANDRA-19460:
-

LGTM

> Fix tests to work with ULID SSTable identifiers to enable 
> uuid_sstable_identifiers_enabled in cassandra-latest.yaml
> ---
>
> Key: CASSANDRA-19460
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19460
> Project: Cassandra
>  Issue Type: Task
>  Components: CI, Test/dtest/java, Test/unit
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In CASSANDRA-18753 we identified that we want to also set 
> uuid_sstable_identifiers_enabled to true, while running a CI with it turned 
> on, it failed (1).
> Errors do not seem to be serious, it is just the test suite we have is not 
> prepared for the case when uuid_sstable_identifiers_enabled is set to true by 
> default.
> We need to fix all these tests so we can have cassandra-latest.yaml 
> containing that property.
> https://app.circleci.com/pipelines/github/blambov/cassandra/609/workflows/aef2b936-0551-4f3b-9d86-a49451c83947






[jira] [Updated] (CASSANDRA-19460) Fix tests to work with ULID SSTable identifiers to enable uuid_sstable_identifiers_enabled in cassandra-latest.yaml

2024-03-08 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19460:

Reviewers: Branimir Lambov
   Status: Review In Progress  (was: Needs Committer)

> Fix tests to work with ULID SSTable identifiers to enable 
> uuid_sstable_identifiers_enabled in cassandra-latest.yaml
> ---
>
> Key: CASSANDRA-19460
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19460
> Project: Cassandra
>  Issue Type: Task
>  Components: CI, Test/dtest/java, Test/unit
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In CASSANDRA-18753 we identified that we want to also set 
> uuid_sstable_identifiers_enabled to true, while running a CI with it turned 
> on, it failed (1).
> Errors do not seem to be serious, it is just the test suite we have is not 
> prepared for the case when uuid_sstable_identifiers_enabled is set to true by 
> default.
> We need to fix all these tests so we can have cassandra-latest.yaml 
> containing that property.
> https://app.circleci.com/pipelines/github/blambov/cassandra/609/workflows/aef2b936-0551-4f3b-9d86-a49451c83947






[jira] [Updated] (CASSANDRA-19460) Fix tests to work with ULID SSTable identifiers to enable uuid_sstable_identifiers_enabled in cassandra-latest.yaml

2024-03-08 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19460:

Status: Ready to Commit  (was: Review In Progress)

> Fix tests to work with ULID SSTable identifiers to enable 
> uuid_sstable_identifiers_enabled in cassandra-latest.yaml
> ---
>
> Key: CASSANDRA-19460
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19460
> Project: Cassandra
>  Issue Type: Task
>  Components: CI, Test/dtest/java, Test/unit
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In CASSANDRA-18753 we identified that we want to also set 
> uuid_sstable_identifiers_enabled to true, while running a CI with it turned 
> on, it failed (1).
> Errors do not seem to be serious, it is just the test suite we have is not 
> prepared for the case when uuid_sstable_identifiers_enabled is set to true by 
> default.
> We need to fix all these tests so we can have cassandra-latest.yaml 
> containing that property.
> https://app.circleci.com/pipelines/github/blambov/cassandra/609/workflows/aef2b936-0551-4f3b-9d86-a49451c83947






[jira] [Commented] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users

2024-03-07 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824394#comment-17824394
 ] 

Branimir Lambov commented on CASSANDRA-18753:
-

Committed to 5.0 as 
[06ed1afc34128523298020e7601dad148f2b2fb6|https://github.com/apache/cassandra/commit/06ed1afc34128523298020e7601dad148f2b2fb6]
 and trunk as 
[28efb63df52bafaf51cd458da021f6050900017a|https://github.com/apache/cassandra/commit/28efb63df52bafaf51cd458da021f6050900017a].

> Add an optimized default configuration to tests and make it available for new 
> users
> ---
>
> Key: CASSANDRA-18753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18753
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Urgent
> Fix For: 5.0-rc, 5.x
>
> Attachments: 
> CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch,
>  
> DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch
>
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> We currently offer only one sample configuration file with Cassandra, and 
> that file is deliberately configured to disable all new functionality and 
> incompatible improvements. This works well for legacy users that want to have 
> a painless upgrade, but is a very bad choice for new users, or anyone wanting 
> to make comparisons between Cassandra versions or between Cassandra and other 
> databases.
> We offer very little indication, in the database packaging itself, that there 
> are well-tested configuration choices that can solve known problems and 
> dramatically improve performance. This is guaranteed to paint the database in 
> a worse light than it deserves, and will very likely hurt adoption.
> We should find a way to offer a very easy way of choosing between "optimized" 
> and "compatible" defaults. At a minimum, we could provide alternate yaml files. 
> Alternatively, we could build on the {{storage_compatibility_mode}} concept 
> to grow it into a setting that not only enables/disables certain settings, 
> but also changes their default values.






[jira] [Commented] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users

2024-03-06 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823998#comment-17823998
 ] 

Branimir Lambov commented on CASSANDRA-18753:
-

That test is apparently already fixed. 

[Latest 
run|https://app.circleci.com/pipelines/github/blambov/cassandra/606/workflows/628459f1-f3fe-449c-a047-a784cc9711f5/jobs/24959/tests]
 had only a timeout in {{ActiveCompactionsTest}}; I reduced the number of 
iterations in the test to fix this.

Uploaded final version; I'm ready to commit it but I'd like one last review of 
the wording in {{NEWS.txt}} and {{cassandra(-latest).yaml}}.

> Add an optimized default configuration to tests and make it available for new 
> users
> ---
>
> Key: CASSANDRA-18753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18753
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Urgent
> Fix For: 5.0-rc, 5.x
>
> Attachments: 
> CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch,
>  
> DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch
>
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> We currently offer only one sample configuration file with Cassandra, and 
> that file is deliberately configured to disable all new functionality and 
> incompatible improvements. This works well for legacy users that want to have 
> a painless upgrade, but is a very bad choice for new users, or anyone wanting 
> to make comparisons between Cassandra versions or between Cassandra and other 
> databases.
> We offer very little indication, in the database packaging itself, that there 
> are well-tested configuration choices that can solve known problems and 
> dramatically improve performance. This is guaranteed to paint the database in 
> a worse light than it deserves, and will very likely hurt adoption.
> We should find a way to offer a very easy way of choosing between "optimized" 
> and "compatible" defaults. At a minimum, we could provide alternate yaml files. 
> Alternatively, we could build on the {{storage_compatibility_mode}} concept 
> to grow it into a setting that not only enables/disables certain settings, 
> but also changes their default values.






[jira] [Updated] (CASSANDRA-19459) test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions fails with SAI

2024-03-06 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19459:

Resolution: Fixed
Status: Resolved  (was: Triage Needed)

Fixed by CASSANDRA-19018.

> test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions
>  fails with SAI
> ---
>
> Key: CASSANDRA-19459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19459
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SAI
>Reporter: Branimir Lambov
>Priority: Normal
>
> The dtest 
> {{replica_side_filtering_test::TestSecondaryIndexes::test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions}}
>  fails when the default secondary index is switched to SAI with
> {code}
> test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions failed; it passed 0 out of the required 1 times.
>
>   Subprocess ['nodetool', '-h', 'localhost', '-p', '7200', 'flush'] exited with non-zero status; exit status: 2;
> stderr: error: null
> -- StackTrace --
> java.lang.NullPointerException
>   at java.base/java.util.Objects.requireNonNull(Objects.java:209)
>   at org.apache.cassandra.index.sai.disk.v1.segment.SegmentMetadata.<init>(SegmentMetadata.java:102)
>   at org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.flush(MemtableIndexWriter.java:166)
>   at org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.complete(MemtableIndexWriter.java:125)
>   at org.apache.cassandra.index.sai.disk.StorageAttachedIndexWriter.complete(StorageAttachedIndexWriter.java:185)
>   at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
>   at java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1092)
>   at org.apache.cassandra.io.sstable.format.SSTableWriter.commit(SSTableWriter.java:289)
>   at org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.commit(SimpleSSTableMultiWriter.java:90)
>   at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1354)
>   at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1253)
>   at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
>   at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>   at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>   at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.base/java.lang.Thread.run(Thread.java:840)
> {code}
> Discovered while testing CASSANDRA-18753.






[jira] [Comment Edited] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users

2024-03-06 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823946#comment-17823946
 ] 

Branimir Lambov edited comment on CASSANDRA-18753 at 3/6/24 10:07 AM:
--

Well, tests [look much better 
now|https://app.circleci.com/pipelines/github/blambov/cassandra/605/workflows/f567db7c-2231-4c22-8a60-7e43887880d7].

We have only one failure, 
{{replica_side_filtering_test.TestSecondaryIndexes:test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions}}
 with SAI. Opened CASSANDRA-19459 for this, and proceeding to merge this ticket.


was (Author: blambov):
Well, tests [look much better 
now|https://app.circleci.com/pipelines/github/blambov/cassandra/605/workflows/f567db7c-2231-4c22-8a60-7e43887880d7].

We have only one failure, 
{{replica_side_filtering_test.TestSecondaryIndexes:test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions}}
 with SAI. Opened CASSANDRA- 19459 for this, and proceeding to merge this 
ticket.

> Add an optimized default configuration to tests and make it available for new 
> users
> ---
>
> Key: CASSANDRA-18753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18753
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Urgent
> Fix For: 5.0-rc, 5.x
>
> Attachments: 
> CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch,
>  
> DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch
>
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> We currently offer only one sample configuration file with Cassandra, and 
> that file is deliberately configured to disable all new functionality and 
> incompatible improvements. This works well for legacy users that want to have 
> a painless upgrade, but is a very bad choice for new users, or anyone wanting 
> to make comparisons between Cassandra versions or between Cassandra and other 
> databases.
> We offer very little indication, in the database packaging itself, that there 
> are well-tested configuration choices that can solve known problems and 
> dramatically improve performance. This is guaranteed to paint the database in 
> a worse light than it deserves, and will very likely hurt adoption.
> We should find a way to offer a very easy way of choosing between "optimized" 
> and "compatible" defaults. At a minimum, we could provide alternate yaml files. 
> Alternatively, we could build on the {{storage_compatibility_mode}} concept 
> to grow it into a setting that not only enables/disables certain settings, 
> but also changes their default values.






[jira] [Created] (CASSANDRA-19459) test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions fails with SAI

2024-03-06 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19459:
---

 Summary: 
test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions
 fails with SAI
 Key: CASSANDRA-19459
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19459
 Project: Cassandra
  Issue Type: Bug
  Components: Feature/SAI
Reporter: Branimir Lambov


The dtest 
{{replica_side_filtering_test::TestSecondaryIndexes::test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions}}
 fails when the default secondary index is switched to SAI with
{code}
test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions failed; it passed 0 out of the required 1 times.

Subprocess ['nodetool', '-h', 'localhost', '-p', '7200', 'flush'] exited with non-zero status; exit status: 2;
stderr: error: null
-- StackTrace --
java.lang.NullPointerException
    at java.base/java.util.Objects.requireNonNull(Objects.java:209)
    at org.apache.cassandra.index.sai.disk.v1.segment.SegmentMetadata.<init>(SegmentMetadata.java:102)
    at org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.flush(MemtableIndexWriter.java:166)
    at org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.complete(MemtableIndexWriter.java:125)
    at org.apache.cassandra.index.sai.disk.StorageAttachedIndexWriter.complete(StorageAttachedIndexWriter.java:185)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
    at java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1092)
    at org.apache.cassandra.io.sstable.format.SSTableWriter.commit(SSTableWriter.java:289)
    at org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.commit(SimpleSSTableMultiWriter.java:90)
    at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1354)
    at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1253)
    at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:840)
{code}

Discovered while testing CASSANDRA-18753.
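
The "error: null" line in the output above is consistent with how {{Objects.requireNonNull}} behaves: its single-argument form throws a {{NullPointerException}} without a message, so anything that prints the exception message shows "null". The sketch below reproduces only that behaviour. The class {{SegmentMetadataSketch}} and its {{minTerm}} field are hypothetical stand-ins; the trace does not say which value was null in the real {{SegmentMetadata}} constructor.

```java
import java.util.Objects;

// Hypothetical stand-in for a metadata class whose constructor rejects nulls.
class SegmentMetadataSketch {
    final String minTerm; // hypothetical field, chosen only for illustration

    SegmentMetadataSketch(String minTerm) {
        // Throws NullPointerException with a null message when minTerm is null.
        this.minTerm = Objects.requireNonNull(minTerm);
    }

    // Mimics a caller that reports a failure as "error: " plus the exception message.
    static String tryBuild(String minTerm) {
        try {
            new SegmentMetadataSketch(minTerm);
            return "ok";
        } catch (NullPointerException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryBuild("a"));
        System.out.println(tryBuild(null));
    }
}
```

Passing a message to the two-argument form, {{Objects.requireNonNull(value, "minTerm")}}, would make a report of this kind name the missing value instead of printing "null".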






[jira] [Commented] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users

2024-02-29 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822034#comment-17822034
 ] 

Branimir Lambov commented on CASSANDRA-18753:
-

I don't mind removing it, especially if we have a plan for adding it back.

I'll remove it and re-run CI.

> Add an optimized default configuration to tests and make it available for new 
> users
> ---
>
> Key: CASSANDRA-18753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18753
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Urgent
> Fix For: 5.0-rc, 5.x
>
> Attachments: 
> CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch,
>  
> DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch
>
>  Time Spent: 11h
>  Remaining Estimate: 0h
>
> We currently offer only one sample configuration file with Cassandra, and 
> that file is deliberately configured to disable all new functionality and 
> incompatible improvements. This works well for legacy users that want to have 
> a painless upgrade, but is a very bad choice for new users, or anyone wanting 
> to make comparisons between Cassandra versions or between Cassandra and other 
> databases.
> We offer very little indication, in the database packaging itself, that there 
> are well-tested configuration choices that can solve known problems and 
> dramatically improve performance. This is guaranteed to paint the database in 
> a worse light than it deserves, and will very likely hurt adoption.
> We should find a way to offer a very easy way of choosing between "optimized" 
> and "compatible" defaults. At a minimum, we could provide alternate yaml files. 
> Alternatively, we could build on the {{storage_compatibility_mode}} concept 
> to grow it into a setting that not only enables/disables certain settings, 
> but also changes their default values.






[jira] [Comment Edited] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users

2024-01-16 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17799053#comment-17799053
 ] 

Branimir Lambov edited comment on CASSANDRA-18753 at 1/16/24 8:49 AM:
--

Merged CCM and DTest patches (they do not change anything unless the 
{{--configuration-yaml}} flag is used).

[The state of failing tests at the 
moment|https://app.circleci.com/pipelines/github/blambov/cassandra/595/workflows/ed598605-6af6-443e-9336-aaa47ae27e43]:
 - JUnit tests in compatible mode (which changes to use {{heap_buffers}}):
 -- {{CQLVectorTest}} (CASSANDRA-19167)
 -- {{VectorUpdateDeleteTest}} (CASSANDRA-19168)
 - JUnit tests in latest mode:
 -- repair fuzz tests {{ConcurrentIrWithPreviewFuzzTest}}, {{FailedAckTest}}, 
{{FailingRepairFuzzTest}}, {{HappyPathFuzzTest}}, {{SlowMessageFuzzTest}} 
(CASSANDRA-19042)
 -- {{RepairJobTest}} (CASSANDRA-19043)
 -- {{ClientRequestMetricsTest}} (CASSANDRA-19046)
 - JVM dtests in latest mode:
 -- {{RepairTest}} (CASSANDRA-19085)
 -- {{SSTableLoaderEncryptionOptionsTest}} (CASSANDRA-19126)
 -- {{QueriesTableTest}} (CASSANDRA-19046)
 - Python dtests in latest mode:
 -- {{TestWriteFailures.testPaxos}} (CASSANDRA-19145)
 -- {{TestReplaceAddress}} (CASSANDRA-19144)
 -- {{TestSnapshot}} (CASSANDRA-19126)
 -- {{TestClientRequestMetrics}} (CASSANDRA-19046)

Several {{TestBootstrap}} tests seem to be failing in all configurations, some 
already marked as flaky; this is likely not caused by this patch. There are 
also some timeouts (e.g. {{ActiveCompactionsTest}} times out when run 
repeatedly due to the longer 
{{testActiveCompactionTrackingRaceWithIndexBuilder}}).

Please review [the PR|https://github.com/apache/cassandra/pull/2896].


was (Author: blambov):
Merged CCM and DTest patches (they do not change anything unless the 
{{--configuration-yaml}} flag is used).

[The state of failing tests at the 
moment|https://app.circleci.com/pipelines/github/blambov/cassandra/595/workflows/ed598605-6af6-443e-9336-aaa47ae27e43]:
 - JUnit tests in compatible mode (which changes to use {{heap_buffers}}):
 -- {{CQLVectorTest}} (CASSANDRA-19167)
 -- {{VectorUpdateDeleteTest}} (CASSANDRA-19168)
 - JUnit tests in latest mode:
 -- repair fuzz tests {{ConcurrentIrWithPreviewFuzzTest}}, {{FailedAckTest}}, 
{{FailingRepairFuzzTest}}, {{HappyPathFuzzTest}}, {{SlowMessageFuzzTest}} 
(CASSANDRA-19042)
 -- {{RepairJobTest}} (CASSANDRA-19043)
 - JVM dtests in latest mode:
 -- {{RepairTest}} (CASSANDRA-19085)
 -- {{SSTableLoaderEncryptionOptionsTest}} (CASSANDRA-19126)
 -- {{QueriesTableTest}} (CASSANDRA-19046)
 - Python dtests in latest mode:
 -- {{TestWriteFailures.testPaxos}} (CASSANDRA-19145)
 -- {{TestReplaceAddress}} (CASSANDRA-19144)
 -- {{TestSnapshot}} (CASSANDRA-19126)
 -- {{TestClientRequestMetrics}} (CASSANDRA-19046)

Several {{TestBootstrap}} tests seem to be failing in all configurations, some 
already marked as flaky; this is likely not caused by this patch. There are 
also some timeouts (e.g. {{ActiveCompactionsTest}} times out when run 
repeatedly due to the longer 
{{testActiveCompactionTrackingRaceWithIndexBuilder}}).

Please review [the PR|https://github.com/apache/cassandra/pull/2896].

> Add an optimized default configuration to tests and make it available for new 
> users
> ---
>
> Key: CASSANDRA-18753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18753
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Urgent
> Fix For: 5.0-rc, 5.x
>
> Attachments: 
> CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch,
>  
> DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>
> We currently offer only one sample configuration file with Cassandra, and 
> that file is deliberately configured to disable all new functionality and 
> incompatible improvements. This works well for legacy users that want to have 
> a painless upgrade, but is a very bad choice for new users, or anyone wanting 
> to make comparisons between Cassandra versions or between Cassandra and other 
> databases.
> We offer very little indication, in the database packaging itself, that there 
> are well-tested configuration choices that can solve known problems and 
> dramatically improve performance. This is guaranteed to paint the database in 
> a worse light than it deserves, and will very likely hurt adoption.
> We should find a way to offer a very easy way of choosing between "optimized" 
> and "compatible" defaults. At a minimum, we could provi

[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2024-01-09 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804661#comment-17804661
 ] 

Branimir Lambov commented on CASSANDRA-19126:
-

It is to me.

> Streaming appears to be incompatible with different 
> storage_compatibility_mode settings
> ---
>
> Key: CASSANDRA-19126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Messaging/Internode, Tool/bulk load
>Reporter: Branimir Lambov
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> In particular, SSTableLoader appears to be incompatible with 
> storage_compatibility_mode: NONE, which manifests as a failure of 
> {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} 
> when the flag is turned on (found during CASSANDRA-18753 testing). Setting 
> {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not 
> help (according to the docs, this setting is not picked up).
> This is likely a bigger problem as the acceptable streaming version for C* 5 
> is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not 
> appear to be able to stream with each other if their setting for the 
> compatibility mode is different.






[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-12-19 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798565#comment-17798565
 ] 

Branimir Lambov commented on CASSANDRA-19126:
-

{quote}
{code}
private static final String MIXED_MODE_ERROR = "Some nodes involved in repair are on an incompatible major version. " +
                                               "Repair is not supported in mixed major version clusters.";
{code}
{quote}

_To me_ this message in the context of a 5.0 cluster where something is in the 
wrong compatibility mode would be quite confusing. At the very least we need to 
state very clearly that a 5.x node in compatibility mode is considered a 4.x 
node for all intents and purposes, including being a "same major version" for 
the message above. Also, does this not mean we can't ever drop 4.0 support 
because e.g. 6.0 must be compatible with 5.0, including in its compatibility 
mode?
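
As a rough illustration of the point above, the sketch below treats a 5.x node in legacy compatibility mode as a 4.x node when comparing major versions, and says so when describing the node. Everything in it is hypothetical (the {{Mode}} enum, {{effectiveMajor}}, {{sameEffectiveMajor}}, {{describe}}); it is not Cassandra's repair code, only one way an error message could name the compatibility mode explicitly.

```java
// Hypothetical sketch: none of these names come from the Cassandra code base.
class EffectiveVersionSketch {
    // Simplified stand-in for the storage compatibility setting.
    enum Mode { LEGACY, NONE }

    // Assumption: a 5.x node in legacy compatibility mode counts as 4.x.
    static int effectiveMajor(int releaseMajor, Mode mode) {
        return (releaseMajor == 5 && mode == Mode.LEGACY) ? 4 : releaseMajor;
    }

    // Two nodes are on the "same major version" only if their effective majors match.
    static boolean sameEffectiveMajor(int majorA, Mode modeA, int majorB, Mode modeB) {
        return effectiveMajor(majorA, modeA) == effectiveMajor(majorB, modeB);
    }

    // Describes a node so that an error message can state why it counts as 4.x.
    static String describe(int releaseMajor, Mode mode) {
        int effective = effectiveMajor(releaseMajor, mode);
        if (effective == releaseMajor)
            return releaseMajor + ".x";
        return releaseMajor + ".x in compatibility mode (treated as " + effective + ".x)";
    }

    public static void main(String[] args) {
        System.out.println(sameEffectiveMajor(5, Mode.LEGACY, 5, Mode.NONE));
        System.out.println(describe(5, Mode.LEGACY));
    }
}
```

Under this assumption, two 5.x nodes with different modes are reported as being on different major versions, which is exactly the case where the wording of the existing message would need the extra explanation.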

> Streaming appears to be incompatible with different 
> storage_compatibility_mode settings
> ---
>
> Key: CASSANDRA-19126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Messaging/Internode, Tool/bulk load
>Reporter: Branimir Lambov
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> In particular, SSTableLoader appears to be incompatible with 
> storage_compatibility_mode: NONE, which manifests as a failure of 
> {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} 
> when the flag is turned on (found during CASSANDRA-18753 testing). Setting 
> {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not 
> help (according to the docs, this setting is not picked up).
> This is likely a bigger problem as the acceptable streaming version for C* 5 
> is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not 
> appear to be able to stream with each other if their setting for the 
> compatibility mode is different.






[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-12-11 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17795406#comment-17795406
 ] 

Branimir Lambov commented on CASSANDRA-19126:
-

In other words, you both feel that it is okay for {{BulkLoader}} to not work if 
it is not the corresponding version or is not configured exactly like the 
database is?

Separately, that a node in e.g. {{UPGRADING}} mode should not be able to stream 
sstables to one in {{NONE}}?

> Streaming appears to be incompatible with different 
> storage_compatibility_mode settings
> ---
>
> Key: CASSANDRA-19126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Messaging/Internode, Tool/bulk load
>Reporter: Branimir Lambov
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> In particular, SSTableLoader appears to be incompatible with 
> storage_compatibility_mode: NONE, which manifests as a failure of 
> {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} 
> when the flag is turned on (found during CASSANDRA-18753 testing). Setting 
> {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not 
> help (according to the docs, this setting is not picked up).
> This is likely a bigger problem as the acceptable streaming version for C* 5 
> is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not 
> appear to be able to stream with each other if their setting for the 
> compatibility mode is different.






[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-12-10 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17795089#comment-17795089
 ] 

Branimir Lambov commented on CASSANDRA-19126:
-

> Precise fix for this would be to use the same compatibility mode for bulk 
> loader and the node.

While this would fix the test, it would not do anything about the underlying 
problem. C* 5 nodes in different compatibility modes should be able to stream 
with each other. One should at least be able to stream whole sstables from 
legacy mode to current.

Current to legacy mode when CASSANDRA-19012 is done also makes sense, but it 
might violate the downgradability promise while such data is not compacted. We 
probably need a warning if current-format data is streamed to a node in legacy 
mode (e.g. suggesting one does upgradesstables before downgrading below 5.0).
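
To make the two behaviours concrete, here is a minimal sketch that contrasts the strict rule described in the ticket (streaming version 12 in legacy mode, 13 in none, no overlap) with the relaxed behaviour suggested above, where streaming is allowed and a warning is raised when current-format data goes to a legacy-mode node. The class, the {{Mode}} enum and both methods are hypothetical; this is not how Cassandra implements streaming negotiation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and structure are illustrative only.
class StreamingCompatSketch {
    // Simplified stand-in for the storage compatibility setting.
    enum Mode { LEGACY, NONE }

    // Streaming versions as stated in the ticket: 12 in legacy mode, 13 in none.
    static int streamingVersion(Mode mode) {
        return mode == Mode.LEGACY ? 12 : 13;
    }

    // Strict rule: nodes can stream only if their streaming versions are equal.
    static boolean strictCanStream(Mode sender, Mode receiver) {
        return streamingVersion(sender) == streamingVersion(receiver);
    }

    // Relaxed rule (assumption): streaming is always allowed between C* 5 nodes,
    // but current-format data arriving at a legacy-mode node produces a warning.
    static List<String> relaxedStreamWarnings(Mode sender, Mode receiver) {
        List<String> warnings = new ArrayList<>();
        if (sender == Mode.NONE && receiver == Mode.LEGACY)
            warnings.add("Current-format data streamed to a node in legacy mode; "
                         + "run upgradesstables before downgrading below 5.0");
        return warnings;
    }

    public static void main(String[] args) {
        System.out.println(strictCanStream(Mode.LEGACY, Mode.NONE));
        System.out.println(relaxedStreamWarnings(Mode.NONE, Mode.LEGACY));
    }
}
```

The warning text only restates the suggestion in the comment; the right wording and the place to emit it would need to be decided in the actual patch.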

> Streaming appears to be incompatible with different 
> storage_compatibility_mode settings
> ---
>
> Key: CASSANDRA-19126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Messaging/Internode, Tool/bulk load
>Reporter: Branimir Lambov
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> In particular, SSTableLoader appears to be incompatible with 
> storage_compatibility_mode: NONE, which manifests as a failure of 
> {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} 
> when the flag is turned on (found during CASSANDRA-18753 testing). Setting 
> {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not 
> help (according to the docs, this setting is not picked up).
> This is likely a bigger problem as the acceptable streaming version for C* 5 
> is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not 
> appear to be able to stream with each other if their setting for the 
> compatibility mode is different.






[jira] [Comment Edited] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-12-10 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17795089#comment-17795089
 ] 

Branimir Lambov edited comment on CASSANDRA-19126 at 12/10/23 4:57 PM:
---

bq. Precise fix for this would be to use the same compatibility mode for bulk 
loader and the node.

While this would fix the test, it would not do anything about the underlying 
problem. C* 5 nodes in different compatibility modes should be able to stream 
with each other. One should at least be able to stream whole sstables from 
legacy mode to current.

Current to legacy mode when CASSANDRA-19012 is done also makes sense, but it 
might violate the downgradability promise while such data is not compacted. We 
probably need a warning if current-format data is streamed to a node in legacy 
mode (e.g. suggesting one does upgradesstables before downgrading below 5.0).


was (Author: blambov):
> Precise fix for this would be to use the same compatibility mode for bulk 
> loader and the node.

While this would fix the test, it would not do anything about the underlying 
problem. C* 5 nodes in different compatibility modes should be able to stream 
with each other. One should at least be able to stream whole sstables from 
legacy mode to current.

Current to legacy mode when CASSANDRA-19012 is done also makes sense, but it 
might violate the downgradability promise while such data is not compacted. We 
probably need a warning if current-format data is streamed to a node in legacy 
mode (e.g. suggesting one does upgradesstables before downgrading below 5.0).

> Streaming appears to be incompatible with different 
> storage_compatibility_mode settings
> ---
>
> Key: CASSANDRA-19126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Messaging/Internode, Tool/bulk load
>Reporter: Branimir Lambov
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> In particular, SSTableLoader appears to be incompatible with 
> storage_compatibility_mode: NONE, which manifests as a failure of 
> {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} 
> when the flag is turned on (found during CASSANDRA-18753 testing). Setting 
> {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not 
> help (according to the docs, this setting is not picked up).
> This is likely a bigger problem as the acceptable streaming version for C* 5 
> is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not 
> appear to be able to stream with each other if their setting for the 
> compatibility mode is different.






[jira] [Updated] (CASSANDRA-19168) VectorUpdateDeleteTest fails with heap_buffers

2023-12-05 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19168:

Fix Version/s: 5.0-rc

> VectorUpdateDeleteTest fails with heap_buffers
> --
>
> Key: CASSANDRA-19168
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19168
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Vector Search
>Reporter: Branimir Lambov
>Priority: Normal
> Fix For: 5.0-rc
>
>
> When {{memtable_allocation_type}} is set to {{heap_buffers}}, {{updateTest}} 
> fails with
> {code}
> junit.framework.AssertionFailedError: Result set does not contain a row with pk = 0
>   at org.apache.cassandra.index.sai.cql.VectorTypeTest.assertContainsInt(VectorTypeTest.java:133)
>   at org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest.updateTest(VectorUpdateDeleteTest.java:308)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
> {code}






[jira] [Created] (CASSANDRA-19168) VectorUpdateDeleteTest fails with heap_buffers

2023-12-05 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19168:
---

 Summary: VectorUpdateDeleteTest fails with heap_buffers
 Key: CASSANDRA-19168
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19168
 Project: Cassandra
  Issue Type: Bug
  Components: Feature/Vector Search
Reporter: Branimir Lambov


When {{memtable_allocation_type}} is set to {{heap_buffers}}, {{updateTest}} 
fails with
{code}
junit.framework.AssertionFailedError: Result set does not contain a row with pk = 0
    at org.apache.cassandra.index.sai.cql.VectorTypeTest.assertContainsInt(VectorTypeTest.java:133)
    at org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest.updateTest(VectorUpdateDeleteTest.java:308)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
{code}






[jira] [Updated] (CASSANDRA-19167) CQLVectorTest fails with heap_buffers

2023-12-05 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19167:

Fix Version/s: 5.0-rc

> CQLVectorTest fails with heap_buffers
> -
>
> Key: CASSANDRA-19167
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19167
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Vector Search
>Reporter: Branimir Lambov
>Priority: Normal
> Fix For: 5.0-rc
>
>
> When {{memtable_allocation_type}} is set to {{heap_buffers}}, the {{udf}} 
> test fails with
> {code}
> org.apache.cassandra.cql3.functions.types.exceptions.InvalidTypeException: 
> Invalid 32-bits integer value, expecting 4 bytes but got 6
>   at 
> org.apache.cassandra.cql3.functions.types.TypeCodec$IntCodec.deserializeNoBoxing(TypeCodec.java:1695)
>   at 
> org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:842)
>   at 
> org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:819)
>   at 
> org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:135)
>   at 
> org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:83)
>   at 
> org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2141)
>   at 
> org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2082)
>   at 
> org.apache.cassandra.cql3.functions.UDFDataType.compose(UDFDataType.java:180)
>   at 
> org.apache.cassandra.cql3.functions.FunctionArguments.set(FunctionArguments.java:142)
>   at 
> org.apache.cassandra.cql3.selection.AbstractFunctionSelector.setArg(AbstractFunctionSelector.java:277)
>   at 
> org.apache.cassandra.cql3.selection.ScalarFunctionSelector.getOutput(ScalarFunctionSelector.java:58)
>   at 
> org.apache.cassandra.cql3.selection.Selection$SelectionWithProcessing$1.getOutputRow(Selection.java:605)
>   at 
> org.apache.cassandra.cql3.selection.ResultSetBuilder.getOutputRow(ResultSetBuilder.java:175)
>   at 
> org.apache.cassandra.cql3.selection.ResultSetBuilder.build(ResultSetBuilder.java:162)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:999)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:564)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:600)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:570)
>   at 
> org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:108)
>   at 
> org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:445)
>   at 
> org.apache.cassandra.cql3.CQLTester.executeFormattedQuery(CQLTester.java:1597)
>   at org.apache.cassandra.cql3.CQLTester.execute(CQLTester.java:1576)
>   at 
> org.apache.cassandra.cql3.validation.operations.CQLVectorTest.udf(CQLVectorTest.java:427)
> {code}






[jira] [Created] (CASSANDRA-19167) CQLVectorTest fails with heap_buffers

2023-12-05 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19167:
---

 Summary: CQLVectorTest fails with heap_buffers
 Key: CASSANDRA-19167
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19167
 Project: Cassandra
  Issue Type: Bug
  Components: Feature/Vector Search
Reporter: Branimir Lambov


When {{memtable_allocation_type}} is set to {{heap_buffers}}, the {{udf}} test 
fails with
{code}
org.apache.cassandra.cql3.functions.types.exceptions.InvalidTypeException: 
Invalid 32-bits integer value, expecting 4 bytes but got 6
at 
org.apache.cassandra.cql3.functions.types.TypeCodec$IntCodec.deserializeNoBoxing(TypeCodec.java:1695)
at 
org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:842)
at 
org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:819)
at 
org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:135)
at 
org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:83)
at 
org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2141)
at 
org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2082)
at 
org.apache.cassandra.cql3.functions.UDFDataType.compose(UDFDataType.java:180)
at 
org.apache.cassandra.cql3.functions.FunctionArguments.set(FunctionArguments.java:142)
at 
org.apache.cassandra.cql3.selection.AbstractFunctionSelector.setArg(AbstractFunctionSelector.java:277)
at 
org.apache.cassandra.cql3.selection.ScalarFunctionSelector.getOutput(ScalarFunctionSelector.java:58)
at 
org.apache.cassandra.cql3.selection.Selection$SelectionWithProcessing$1.getOutputRow(Selection.java:605)
at 
org.apache.cassandra.cql3.selection.ResultSetBuilder.getOutputRow(ResultSetBuilder.java:175)
at 
org.apache.cassandra.cql3.selection.ResultSetBuilder.build(ResultSetBuilder.java:162)
at 
org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:999)
at 
org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:564)
at 
org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:600)
at 
org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:570)
at 
org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:108)
at 
org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:445)
at 
org.apache.cassandra.cql3.CQLTester.executeFormattedQuery(CQLTester.java:1597)
at org.apache.cassandra.cql3.CQLTester.execute(CQLTester.java:1576)
at 
org.apache.cassandra.cql3.validation.operations.CQLVectorTest.udf(CQLVectorTest.java:427)
{code}






[jira] [Created] (CASSANDRA-19145) Python dtest TestWriteFailures.test_paxos is failing with Paxos V2

2023-12-01 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19145:
---

 Summary: Python dtest TestWriteFailures.test_paxos is failing with 
Paxos V2
 Key: CASSANDRA-19145
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19145
 Project: Cassandra
  Issue Type: Bug
  Components: Feature/Lightweight Transactions
Reporter: Branimir Lambov


With configuration changed to engage Paxos V2 with repaired state purging, the 
dtest fails with:
{code}
test_paxos
write_failures_test.TestWriteFailures

self = 

def test_paxos(self):
"""
A light transaction receives a WriteFailure
"""
>   exc = self._perform_cql_statement("INSERT INTO mytable (key, value) 
> VALUES ('key1', 'Value 1') IF NOT EXISTS")

write_failures_test.py:202: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
write_failures_test.py:88: in _perform_cql_statement
session.execute(statement)
../env3.7/src/cassandra-driver/cassandra/cluster.py:2618: in execute
return self.execute_async(query, parameters, trace, custom_payload, 
timeout, execution_profile, paging_state, host, execute_as).result()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = 

def result(self):
"""
Return the final result or raise an Exception if errors were
encountered.  If the final result or error has not been set
yet, this method will block until it is set, or the timeout
set for the request expires.

Timeout is specified in the Session request execution functions.
If the timeout is exceeded, an :exc:`cassandra.OperationTimedOut` will 
be raised.
This is a client-side timeout. For more information
about server-side coordinator timeouts, see 
:class:`.policies.RetryPolicy`.

Example usage::

>>> future = session.execute_async("SELECT * FROM mycf")
>>> # do other stuff...

>>> try:
... rows = future.result()
... for row in rows:
... ... # process results
... except Exception:
... log.exception("Operation failed:")

"""
self._event.wait()
if self._final_result is not _NOT_SET:
return ResultSet(self, self._final_result)
else:
>   raise self._final_exception
E   cassandra.WriteTimeout: Error from server: code=1100 [Coordinator 
node timed out waiting for replica nodes' responses] message="CAS operation 
timed out: received 1 of 2 required responses after 0 contention retries" 
info={'consistency': 'SERIAL', 'required_responses': 2, 'received_responses': 
1, 'write_type': 'CAS'}

../env3.7/src/cassandra-driver/cassandra/cluster.py:4894: WriteTimeout
{code}






[jira] [Created] (CASSANDRA-19144) Python dtest replace_address_test.TestReplaceAddress is failing with Paxos V2

2023-12-01 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19144:
---

 Summary: Python dtest replace_address_test.TestReplaceAddress is 
failing with Paxos V2
 Key: CASSANDRA-19144
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19144
 Project: Cassandra
  Issue Type: Bug
  Components: Consistency/Bootstrap and Decommission, 
Feature/Lightweight Transactions
Reporter: Branimir Lambov


Paxos repair is causing an unexpected failure:
{code}
test_replace_with_insufficient_replicas
replace_address_test.TestReplaceAddress

failed on teardown with "Failed: Unexpected error found in node logs (see 
stdout for full details). Errors: [[replacement] 'ERROR [main] 2023-11-29 
10:23:08,752 CassandraDaemon.java:878 - Exception encountered during 
startup\njava.lang.UnsupportedOperationException: null\n\tat 
org.apache.cassandra.locator.AbstractReplicaCollection$ReplicaMap$AbstractImmutableSet.removeAll(AbstractReplicaCollection.java:298)\n\tat
 
org.apache.cassandra.service.ActiveRepairService.repairPaxosForTopologyChange(ActiveRepairService.java:1102)\n\tat
 
org.apache.cassandra.service.StorageService.startRepairPaxosForTopologyChange(StorageService.java:4829)\n\tat
 
org.apache.cassandra.service.StorageService.tryRepairPaxosForTopologyChange(StorageService.java:4760)\n\tat
 
org.apache.cassandra.service.StorageService.repairPaxosForTopologyChange(StorageService.java:4793)\n\tat
 
org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:2120)\n\tat
 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1240)\n\tat
 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1200)\n\tat
 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:979)\n\tat
 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:896)\n\tat
 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:377)\n\tat
 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:721)\n\tat
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:856)']"
Unexpected error found in node logs (see stdout for full details). Errors: 
[[replacement] 'ERROR [main] 2023-11-29 10:23:08,752 CassandraDaemon.java:878 - 
Exception encountered during startup\njava.lang.UnsupportedOperationException: 
null\n\tat 
org.apache.cassandra.locator.AbstractReplicaCollection$ReplicaMap$AbstractImmutableSet.removeAll(AbstractReplicaCollection.java:298)\n\tat
 
org.apache.cassandra.service.ActiveRepairService.repairPaxosForTopologyChange(ActiveRepairService.java:1102)\n\tat
 
org.apache.cassandra.service.StorageService.startRepairPaxosForTopologyChange(StorageService.java:4829)\n\tat
 
org.apache.cassandra.service.StorageService.tryRepairPaxosForTopologyChange(StorageService.java:4760)\n\tat
 
org.apache.cassandra.service.StorageService.repairPaxosForTopologyChange(StorageService.java:4793)\n\tat
 
org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:2120)\n\tat
 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1240)\n\tat
 
org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1200)\n\tat
 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:979)\n\tat
 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:896)\n\tat
 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:377)\n\tat
 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:721)\n\tat
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:856)']
{code}
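
The trace above shows {{removeAll}} being called on an immutable set and failing with {{UnsupportedOperationException}}. The following is a minimal sketch of that failure mode, assuming {{java.util.Collections.unmodifiableSet}} as a stand-in for Cassandra's {{AbstractImmutableSet}}. The class and method names ({{ImmutableRemoveAllSketch}}, {{removeInPlace}}, {{removeOnCopy}}) are hypothetical, and copying before removal is only one possible way to avoid the exception, not necessarily the fix chosen for this ticket.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; this is not Cassandra code.
class ImmutableRemoveAllSketch {
    // Calls removeAll directly on the given set. If the set is an unmodifiable
    // view, the JDK throws UnsupportedOperationException.
    static boolean removeInPlace(Set<String> endpoints, Set<String> toRemove) {
        return endpoints.removeAll(toRemove);
    }

    // Copies into a mutable HashSet first, so the immutable input is left untouched.
    static Set<String> removeOnCopy(Set<String> endpoints, Set<String> toRemove) {
        Set<String> copy = new HashSet<>(endpoints);
        copy.removeAll(toRemove);
        return copy;
    }

    public static void main(String[] args) {
        Set<String> replicas = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("node1", "node2", "node3")));
        Set<String> down = Collections.singleton("node2");

        try {
            removeInPlace(replicas, down);
        } catch (UnsupportedOperationException e) {
            System.out.println("in-place removal rejected: " + e);
        }

        System.out.println("remaining after copy: " + removeOnCopy(replicas, down).size());
    }
}
```

The copy costs one extra allocation, but it keeps the shared replica collection unchanged for any other caller holding a reference to it.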






[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-12-01 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17792095#comment-17792095
 ] 

Branimir Lambov commented on CASSANDRA-19126:
-

Python dtest {{snapshot_test}} is also failing because of this sstableloader 
problem:
{code:java}
Exception: sstableloader command '/home/cassandra/cassandra/bin/sstableloader 
-d 127.0.0.1 /tmp/tmpidg_8u3c/0/ks/cf' failed; exit status: 1'; stdout: 
Established connection to initial hosts
Opening sstables and calculating sections to stream
Streaming relevant part of /tmp/tmpidg_8u3c/0/ks/cf/da-1-bti-Data.db to 
[/127.0.0.1:7000]

progress: total: 100% 0.000B/s (avg: 0.000B/s)
; stderr: ERROR 10:16:01,391 [Stream #4bb85ff0-8ea0-11ee-94d3-3de6344de31d] 
Streaming error occurred on session with peer 127.0.0.1:7000
java.lang.ClassCastException: class 
org.apache.cassandra.net.OutboundConnectionInitiator$Result$Incompatible cannot 
be cast to class 
org.apache.cassandra.net.OutboundConnectionInitiator$Result$Success 
(org.apache.cassandra.net.OutboundConnectionInitiator$Result$Incompatible and 
org.apache.cassandra.net.OutboundConnectionInitiator$Result$Success are in 
unnamed module of loader 'app')
{code}
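
The {{ClassCastException}} above indicates that a handshake result of type {{Incompatible}} was cast to {{Success}} without a type check. As a rough sketch of that pattern, assuming a simplified result hierarchy: the types below ({{Result}}, {{Success}}, {{Incompatible}} and their fields) are hypothetical stand-ins and do not reproduce the real {{OutboundConnectionInitiator}} classes.

```java
// Hypothetical stand-ins for a handshake result hierarchy; not the real Cassandra classes.
class HandshakeResultSketch {
    static abstract class Result {}

    static final class Success extends Result {
        final int messagingVersion;
        Success(int messagingVersion) { this.messagingVersion = messagingVersion; }
    }

    static final class Incompatible extends Result {
        final int peerMaxVersion;
        Incompatible(int peerMaxVersion) { this.peerMaxVersion = peerMaxVersion; }
    }

    // Blind cast: throws ClassCastException when the handshake was not successful.
    static int versionUnchecked(Result result) {
        return ((Success) result).messagingVersion;
    }

    // Checks the runtime type first and reports incompatibility explicitly.
    static String describe(Result result) {
        if (result instanceof Success)
            return "connected at version " + ((Success) result).messagingVersion;
        if (result instanceof Incompatible)
            return "incompatible, peer supports up to version " + ((Incompatible) result).peerMaxVersion;
        return "unknown result";
    }

    public static void main(String[] args) {
        Result result = new Incompatible(12);
        System.out.println(describe(result));
        try {
            versionUnchecked(result);
        } catch (ClassCastException e) {
            System.out.println("unchecked cast failed");
        }
    }
}
```

Checking the type before casting would turn the opaque cast failure into a message that names the version mismatch, which is the underlying problem in this ticket.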

> Streaming appears to be incompatible with different 
> storage_compatibility_mode settings
> ---
>
> Key: CASSANDRA-19126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Messaging/Internode, Tool/bulk load
>Reporter: Branimir Lambov
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> In particular, SSTableLoader appears to be incompatible with 
> storage_compatibility_mode: NONE, which manifests as a failure of 
> {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} 
> when the flag is turned on (found during CASSANDRA-18753 testing). Setting 
> {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not 
> help (according to the docs, this setting is not picked up).
> This is likely a bigger problem as the acceptable streaming version for C* 5 
> is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not 
> appear to be able to stream with each other if their setting for the 
> compatibility mode is different.






[jira] [Commented] (CASSANDRA-19046) Paxos V2 does not update individual fields of readMetrics

2023-12-01 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17792090#comment-17792090
 ] 

Branimir Lambov commented on CASSANDRA-19046:
-

Python dtest failure related to this: 
{{client_request_metrics_test.TestClientRequestMetrics}}
{code:java}
 >   self.cas_read_contention()

client_request_metrics_test.py:103: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
client_request_metrics_test.py:355: in cas_read_contention
consistency_level=CL.SERIAL))
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = 
metric_factory = functools.partial(, 'CASRead')
statement = 

def cas_contention(self, metric_factory, statement):

query_count = 20
cassandra_version = self.dtest_config.cassandra_version_from_build

def sample():
baseline = metric_factory()
baseline.validate(cassandra_version)

execute_concurrent_with_args(self.session,
 statement,
 repeat([], query_count), 
raise_on_first_error=False)

updated = metric_factory()
updated.validate(cassandra_version)

return updated.diff(baseline)

for _ in range(10):
diff = sample()
if 'ContentionHistogram.Count' in diff:
break

assert diff['Latency.Count'] == query_count
assert diff['TotalLatency.Count'] > 0
>   assert 0 < diff['ContentionHistogram.Count'] <= query_count
E   KeyError: 'ContentionHistogram.Count'

client_request_metrics_test.py:382: KeyError
{code}

> Paxos V2 does not update individual fields of readMetrics
> -
>
> Key: CASSANDRA-19046
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19046
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Lightweight Transactions, Observability/Metrics
>Reporter: Branimir Lambov
>Priority: Normal
> Fix For: 5.0-rc
>
>
> As a result, {{ClientMetricsTest.testPaxosStatement}} is failing with 
> {{paxos_variant: v2}}.






[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-12-01 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17791958#comment-17791958
 ] 

Branimir Lambov commented on CASSANDRA-19126:
-

I believe what Brandon means is that we also need upgrade tests where only some 
nodes have changed {{storage_compatibility_mode}}.

[This 
line|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessagingService.java#L259]
 is what appears to be preventing {{BulkLoader}} from working. I don't have 
enough knowledge in the area and have not dug deep enough to understand all 
implications.

> Streaming appears to be incompatible with different 
> storage_compatibility_mode settings
> ---
>
> Key: CASSANDRA-19126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Messaging/Internode, Tool/bulk load
>Reporter: Branimir Lambov
>Priority: Normal
> Fix For: 5.0-rc, 5.x
>
>
> In particular, SSTableLoader appears to be incompatible with 
> storage_compatibility_mode: NONE, which manifests as a failure of 
> {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} 
> when the flag is turned on (found during CASSANDRA-18753 testing). Setting 
> {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not 
> help (according to the docs, this setting is not picked up).
> This is likely a bigger problem as the acceptable streaming version for C* 5 
> is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not 
> appear to be able to stream with each other if their setting for the 
> compatibility mode is different.






[jira] [Updated] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-11-30 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19126:

Description: 
In particular, SSTableLoader appears to be incompatible with 
storage_compatibility_mode: NONE, which manifests as a failure of 
{{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} 
when the flag is turned on (found during CASSANDRA-18753 testing). Setting 
{{storage_compatibility_mode: NONE}} in the tool configuration yaml does not 
help (according to the docs, this setting is not picked up).

This is likely a bigger problem as the acceptable streaming version for C* 5 is 
12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not appear 
to be able to stream with each other if their setting for the compatibility 
mode is different.

  was:
In particular, SSTableLoader appears to be incompatible with 
storage_compatibility_mode: NONE, which manifests as a failure of 
`org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest` when 
the flag is turned on (found during CASSANDRA-18753 testing). Setting 
`storage_compatibility_mode: NONE` in the tool configuration yaml does not help 
(according to the docs, this setting is not picked up).

This is likely a bigger problem as the acceptable streaming version for C* 5 is 
12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not appear 
to be able to stream with each other if their setting for the compatibility 
mode is different.


> Streaming appears to be incompatible with different 
> storage_compatibility_mode settings
> ---
>
> Key: CASSANDRA-19126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19126
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
> Messaging/Internode, Tool/bulk load
>Reporter: Branimir Lambov
>Priority: Normal
>
> In particular, SSTableLoader appears to be incompatible with 
> storage_compatibility_mode: NONE, which manifests as a failure of 
> {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} 
> when the flag is turned on (found during CASSANDRA-18753 testing). Setting 
> {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not 
> help (according to the docs, this setting is not picked up).
> This is likely a bigger problem as the acceptable streaming version for C* 5 
> is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not 
> appear to be able to stream with each other if their setting for the 
> compatibility mode is different.






[jira] [Created] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings

2023-11-30 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19126:
---

 Summary: Streaming appears to be incompatible with different 
storage_compatibility_mode settings
 Key: CASSANDRA-19126
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19126
 Project: Cassandra
  Issue Type: Bug
  Components: Consistency/Streaming, Legacy/Streaming and Messaging, 
Messaging/Internode, Tool/bulk load
Reporter: Branimir Lambov


In particular, SSTableLoader appears to be incompatible with 
storage_compatibility_mode: NONE, which manifests as a failure of 
`org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest` when 
the flag is turned on (found during CASSANDRA-18753 testing). Setting 
`storage_compatibility_mode: NONE` in the tool configuration yaml does not help 
(according to the docs, this setting is not picked up).

This is likely a bigger problem as the acceptable streaming version for C* 5 is 
12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not appear 
to be able to stream with each other if their setting for the compatibility 
mode is different.
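
The mismatch described above can be sketched as follows. This is an illustration under stated assumptions, not the actual negotiation code: the enum {{CompatibilityMode}} with values {{LEGACY}} and {{NONE}}, and the methods {{acceptedStreamingVersion}} and {{canStream}}, are hypothetical names; only the version numbers 12 and 13 come from the description.

```java
// Hypothetical model of the behaviour described in this ticket; not Cassandra code.
class StreamingVersionSketch {
    // Labels follow the wording of the description ("legacy mode" and "none").
    enum CompatibilityMode { LEGACY, NONE }

    // Per the description: only 12 is acceptable in legacy mode, only 13 in none.
    static int acceptedStreamingVersion(CompatibilityMode mode) {
        return mode == CompatibilityMode.LEGACY ? 12 : 13;
    }

    // Two nodes can stream only if they accept the same single version.
    static boolean canStream(CompatibilityMode local, CompatibilityMode peer) {
        return acceptedStreamingVersion(local) == acceptedStreamingVersion(peer);
    }

    public static void main(String[] args) {
        for (CompatibilityMode local : CompatibilityMode.values())
            for (CompatibilityMode peer : CompatibilityMode.values())
                System.out.println(local + " <-> " + peer + ": " + canStream(local, peer));
    }
}
```

Because each mode accepts exactly one version, there is no overlap between the two modes, so any pair of nodes with different settings fails to agree.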






[jira] [Updated] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE

2023-11-27 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19085:

Fix Version/s: 5.0-rc

> In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE
> ---
>
> Key: CASSANDRA-19085
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19085
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Branimir Lambov
>Priority: Normal
> Fix For: 5.0-rc
>
>
> More precisely, when the {{MessagingService}} version is set to {{VERSION_50}}, 
> the test fails with an exception that appears to be a genuine problem:
> {code:java}
> junit.framework.AssertionFailedError: Exception found expected null, but 
> was:   at 
> org.apache.cassandra.service.ActiveRepairService.lambda$prepareForRepair$2(ActiveRepairService.java:678)
>   at 
> org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
>   at 
> java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
>   at java.base/java.lang.Thread.run(Thread.java:833)
> >
>   at 
> org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129)
>   at 
> org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164)
>   at 
> org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124)
>   at 
> org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> org.apache.cassandra.distributed.shared.ShutdownException: Uncaught 
> exceptions were thrown during test
>   at 
> org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117)
>   at 
> org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103)
>   at 
> org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   Suppressed: java.lang.IllegalStateException: complete already: 
> (failure: java.lang.RuntimeException: Did not get replies from all endpoints.)
>   at 
> org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106)
>   at 
> org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721)
>   at 
> org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697)
>   at 
> org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187)
>   at 
> org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58)
>   at 
> org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
>   at 
> org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64)
>   at 
> org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50)
>   at 
> org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
>   at 
> org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
>   at 
> org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
>   at 
> org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
>   at 
> org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
>   at 
> io.netty.util.concurrent.FastThreadLocalRunnable

[jira] [Updated] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users

2023-11-27 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18753:

Fix Version/s: 5.0-rc
   (was: 5.0.x)

> Add an optimized default configuration to tests and make it available for new 
> users
> ---
>
> Key: CASSANDRA-18753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18753
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Urgent
> Fix For: 5.0-rc, 5.x
>
> Attachments: 
> CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch,
>  
> DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> We currently offer only one sample configuration file with Cassandra, and 
> that file is deliberately configured to disable all new functionality and 
> incompatible improvements. This works well for legacy users that want to have 
> a painless upgrade, but is a very bad choice for new users, or anyone wanting 
> to make comparisons between Cassandra versions or between Cassandra and other 
> databases.
> We offer very little indication, in the database packaging itself, that there 
> are well-tested configuration choices that can solve known problems and 
> dramatically improve performance. This is guaranteed to paint the database in 
> a worse light than it deserves, and will very likely hurt adoption.
> We should find a way to offer a very easy way of choosing between "optimized" 
> and "compatible" defaults. At a minimum, we could provide alternate yaml files. 
> Alternatively, we could build on the {{storage_compatibility_mode}} concept 
> to grow it into a setting that not only enables/disables certain settings, 
> but also changes their default values.






[jira] [Updated] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE

2023-11-24 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19085:

Description: 
More precisely, when the {{MessagingService}} version is set to {{{}VERSION_50{}}}, 
the test fails with an exception that appears to be a genuine problem:
{code:java}
junit.framework.AssertionFailedError: Exception found expected null, but 
was:
at 
org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129)
at 
org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164)
at 
org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124)
at 
org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)


org.apache.cassandra.distributed.shared.ShutdownException: Uncaught exceptions 
were thrown during test
at 
org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117)
at 
org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103)
at 
org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Suppressed: java.lang.IllegalStateException: complete already: 
(failure: java.lang.RuntimeException: Did not get replies from all endpoints.)
at 
org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106)
at 
org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721)
at 
org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697)
at 
org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187)
at 
org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58)
at 
org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
at 
org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64)
at 
org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50)
at 
org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
at 
org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
at 
org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
at 
org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
at 
org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833){code}
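For readers following the trace: the suppressed {{IllegalStateException}} shows {{setSuccess}} being called from the {{ack}} path on a promise that had already been failed with "Did not get replies from all endpoints". A minimal sketch of that failure mode, using a hypothetical {{OneShotResult}} holder rather than Cassandra's {{AsyncPromise}}, might look like this:

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical one-shot result holder used only for illustration.
// It is NOT Cassandra's AsyncPromise and does not mirror its API.
final class OneShotResult
{
    private final AtomicReference<String> outcome = new AtomicReference<>();

    // Strict setter: throws if another path has already completed the result.
    void set(String value)
    {
        if (!outcome.compareAndSet(null, value))
            throw new IllegalStateException("complete already: " + outcome.get());
    }

    // Lenient setter: reports losing the race by returning false instead of throwing.
    boolean trySet(String value)
    {
        return outcome.compareAndSet(null, value);
    }

    // Returns the stored outcome, or null if nothing has completed yet.
    String get()
    {
        return outcome.get();
    }
}
```

If a timeout path completes the holder first, a later strict {{set}} from a response path throws, while {{trySet}} simply returns false. Whether either variant is appropriate for the repair code is not something this sketch tries to answer.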
The updates to {{pending}} in ActiveRepairService are not concurrency-safe, but 
fixing them by doing e.g.
{code:java}
Index: src/java/org/apache/cassandra/service/ActiveRepairService.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===
diff --git a/src/java/org/apache/cassandra/service/ActiveRepairService.java 
b/src/java/org/apache/cassandra/service/ActiveRepairService.java
--- a/src/java/org/apache/cassandra/service/ActiveRepairService.java    
(revision 04552046f74f596e69e2d98c3f3e522fb5888c99)
+++ b/src/java/org/apache/cassandra/service/ActiveRepairService.java    (date 
1700839874092)
@@ -675,7 +675,7 @@
             if (promise.isDone())
                 return;
             String errorMsg = "Did not get replies from all endpoints.";
-            if (promise.tryFailure(new RuntimeException(errorMsg)))
+            if (pending.getAndSet(-1) > 0 && promise.tryFailure(new 
RuntimeException(errorMsg)))
                 participateFailed(parentRepairSession, errorMsg);
         }, timeoutMillis, MILLISECONDS);
 
@@ -703,8 +703,8 @@
                 failedNodes.add(from.toString());
                 if (failureReason == RequestFailureReason.TIMEOUT)
                 {
-                    pen

[jira] [Created] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE

2023-11-24 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19085:
---

 Summary: In-jvm dtest RepairTest fails with 
storage_compatibility_mode: NONE
 Key: CASSANDRA-19085
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19085
 Project: Cassandra
  Issue Type: Bug
  Components: Consistency/Repair
Reporter: Branimir Lambov


More precisely, when the {{MessagingService}} version is set to {{{}VERSION_50{}}}, 
the test fails with an exception that appears to be a genuine problem:
{code:java}
junit.framework.AssertionFailedError: Exception found expected null, but 
was:
at 
org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129)
at 
org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164)
at 
org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124)
at 
org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)


org.apache.cassandra.distributed.shared.ShutdownException: Uncaught exceptions 
were thrown during test
at 
org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117)
at 
org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103)
at 
org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
Suppressed: java.lang.IllegalStateException: complete already: 
(failure: java.lang.RuntimeException: Did not get replies from all endpoints.)
at 
org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106)
at 
org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721)
at 
org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697)
at 
org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187)
at 
org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58)
at 
org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78)
at 
org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64)
at 
org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50)
at 
org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97)
at 
org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45)
at 
org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430)
at 
org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133)
at 
org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143)
at 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:833){code}
The updates to {{pending}} in ActiveRepairService are not concurrency-safe, 
but fixing them by doing e.g.
{code:java}
Index: src/java/org/apache/cassandra/service/ActiveRepairService.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===
diff --git a/src/java/org/apache/cassandra/service/ActiveRepairService.java 
b/src/java/org/apache/cassandra/service/ActiveRepairService.java
--- a/src/java/org/apache/cassandra/service/ActiveRepairService.java    
(revision 04552046f74f596e69e2d98c3f3e522fb5888c99)
+++ b/src/java/org/apache/cassandra/service/ActiveRepairService.java    (date 
1700839874092)
@@ -675,7 +675,7 @@
             if (promise.isDone())
                 return;
             String errorMsg = "Did not get replies from all endpoints.";
-            if (promise.tryFailure(new RuntimeException(errorMsg)))
+            if (pending.getAndSet(-1) > 0 && promise.tryFailure(new 
RuntimeException(errorMsg)))
                 participateFailed(parentRepairSession, errorMsg);
         }, timeoutMillis, M

[jira] [Commented] (CASSANDRA-18757) UnifiedCompactionTask is incorrectly setting keepOriginals

2023-11-23 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789130#comment-17789130
 ] 

Branimir Lambov commented on CASSANDRA-18757:
-

Tests look good, repeated test completed with no failures: 
[https://app.circleci.com/pipelines/github/blambov/cassandra?branch=CASSANDRA-18757]

[~smiklosovic], would you give a second approval so that I can commit this?

> UnifiedCompactionTask is incorrectly setting keepOriginals
> --
>
> Key: CASSANDRA-18757
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18757
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.0-beta, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> super(cfs, txn, gcBefore, 
> strategy.getController().getIgnoreOverlapsInExpirationCheck());{code}
> in {{UnifiedCompactionTask}} is calling the base constructor
> {code:java}
>  public CompactionTask(ColumnFamilyStore cfs, LifecycleTransaction txn, long 
> gcBefore, boolean keepOriginals)
> {code}
> which can set {{keepOriginals}} to true when it should not be.
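
The description above comes down to a boolean passed positionally into the wrong constructor slot, which the compiler cannot catch. A minimal sketch with hypothetical stand-in classes (not the Cassandra ones), assuming for illustration that the intended value for {{keepOriginals}} is {{false}}:

```java
// Hypothetical stand-ins for illustration only; these are not the Cassandra classes.
class BaseTask
{
    final boolean keepOriginals;

    BaseTask(long gcBefore, boolean keepOriginals)
    {
        this.keepOriginals = keepOriginals;
    }
}

class MixedUpTask extends BaseTask
{
    // The flag describes overlap handling, yet it lands in the keepOriginals slot.
    // Both parameters are boolean, so this compiles without any warning.
    MixedUpTask(long gcBefore, boolean ignoreOverlaps)
    {
        super(gcBefore, ignoreOverlaps);
    }
}

class ExplicitTask extends BaseTask
{
    final boolean ignoreOverlaps;

    ExplicitTask(long gcBefore, boolean ignoreOverlaps)
    {
        // Assumption for this sketch: originals should not be kept.
        super(gcBefore, false);
        this.ignoreOverlaps = ignoreOverlaps;
    }
}
```

With these classes, {{new MixedUpTask(0L, true).keepOriginals}} is {{true}}, while {{new ExplicitTask(0L, true).keepOriginals}} stays {{false}} and the overlap flag is kept in its own field.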






[jira] [Comment Edited] (CASSANDRA-18753) We should offer an option for optimized default configuration

2023-11-23 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789052#comment-17789052
 ] 

Branimir Lambov edited comment on CASSANDRA-18753 at 11/23/23 10:20 AM:


DTest support has been added.

The python dtests require pull requests for 
[CCM|https://github.com/riptano/ccm/pull/760] and 
[cassandra-dtest|https://github.com/apache/cassandra-dtest/pull/243] to be 
merged. It works by passing an argument to ccm to make it read the 
configuration from "cassandra_latest.yaml". The new configuration replaces 
{{{}dtest_offheap{}}}, as the offheap setting for memtables is also turned on 
in the latest configuration.

I'm not happy at all with how the in-jvm dtests are configured at this point 
(directly including the settings in code), but I could not think of a quick way 
to get them to load a configuration file. The latest config is combined with 
vnodes to lighten the testing load.

Test results to appear 
[here|https://app.circleci.com/pipelines/github/blambov/cassandra/567/workflows/aa84b1f1-b138-42a8-8e81-dd149c87224e].


was (Author: blambov):
DTest support has been added.

The python dtests require pull requests for 
[CCM|https://github.com/riptano/ccm/pull/760] and 
[cassandra-dtest|https://github.com/apache/cassandra-dtest/pull/243] to be 
merged. It works by passing an argument to ccm to make it read the 
configuration from "cassandra_latest.yaml". The new configuration replaces 
{{{}dtest_offheap{}}}, as the offheap setting for memtables is also turned on 
in the latest configuration.

I'm not happy at all with how the in-jvm dtests are configured at this point 
(directly including the settings in code), but I could not think of a quick way 
to get them to load a configuration file.

Test results to appear 
[here|https://app.circleci.com/pipelines/github/blambov/cassandra/567/workflows/aa84b1f1-b138-42a8-8e81-dd149c87224e].

> We should offer an option for optimized default configuration
> -
>
> Key: CASSANDRA-18753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18753
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Branimir Lambov
>Assignee: Branimir Lambov
>Priority: Urgent
> Fix For: 5.0.x, 5.x
>
> Attachments: 
> CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch,
>  
> DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> We currently offer only one sample configuration file with Cassandra, and 
> that file is deliberately configured to disable all new functionality and 
> incompatible improvements. This works well for legacy users that want to have 
> a painless upgrade, but is a very bad choice for new users, or anyone wanting 
> to make comparisons between Cassandra versions or between Cassandra and other 
> databases.
> We offer very little indication, in the database packaging itself, that there 
> are well-tested configuration choices that can solve known problems and 
> dramatically improve performance. This is guaranteed to paint the database in 
> a worse light than it deserves, and will very likely hurt adoption.
> We should offer a very easy way of choosing between "optimized" 
> and "compatible" defaults. At a minimum, we could provide alternate yaml files. 
> Alternatively, we could build on the {{storage_compatibility_mode}} concept 
> to grow it into a setting that not only enables/disables certain settings, 
> but also changes their default values.






[jira] [Commented] (CASSANDRA-18757) UnifiedCompactionTask is incorrectly setting keepOriginals

2023-11-22 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788730#comment-17788730
 ] 

Branimir Lambov commented on CASSANDRA-18757:
-

How about splitting this into separate tests for the 4 cases? I.e. have the 
four calls in {{testIgnoreOverlaps}} run in separate {{@Test}}-annotated 
methods?

> UnifiedCompactionTask is incorrectly setting keepOriginals
> --
>
> Key: CASSANDRA-18757
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18757
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.0-beta, 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> super(cfs, txn, gcBefore, 
> strategy.getController().getIgnoreOverlapsInExpirationCheck());{code}
> in {{UnifiedCompactionTask}} is calling the base constructor
> {code:java}
>  public CompactionTask(ColumnFamilyStore cfs, LifecycleTransaction txn, long 
> gcBefore, boolean keepOriginals)
> {code}
> which can set {{keepOriginals}} to true when it should not be.






[jira] [Updated] (CASSANDRA-19046) Paxos V2 does not update individual fields of readMetrics

2023-11-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-19046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-19046:

Summary: Paxos V2 does not update individual fields of readMetrics  (was: 
Paxos V2 does not individual fields of readMetrics)

> Paxos V2 does not update individual fields of readMetrics
> -
>
> Key: CASSANDRA-19046
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19046
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/Lightweight Transactions, Observability/Metrics
>Reporter: Branimir Lambov
>Priority: Normal
>
> As a result, {{ClientMetricsTest.testPaxosStatement}} is failing with 
> {{paxos_variant: v2}}.






[jira] [Commented] (CASSANDRA-19034) SelectTest fails when run with SAI index

2023-11-17 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787323#comment-17787323
 ] 

Branimir Lambov commented on CASSANDRA-19034:
-

Yes, we have run the entire unit test suite (no dtests yet) with SAI as 
default, and these three are the only failures that aren't use cases that SAI 
can't support (ByteOrderedPartitioner and blobs).

With CASSANDRA-18753, we will have a test configuration run as part of the 
precommit tests that runs with SAI (plus tries, UCS, paxos v2...).

> SelectTest fails when run with SAI index
> 
>
> Key: CASSANDRA-19034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19034
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SAI
>Reporter: Branimir Lambov
>Priority: Normal
> Fix For: 5.0-beta
>
>
> When run with SAI index, the following two tests error out:
> {code}
> [junit-timeout] Testcase: 
> testContainsKeyAndContainsWithIndexOnMapValue(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11:
>FAILED
> [junit-timeout] Got less rows than expected. Expected 1 but got 0
> [junit-timeout] junit.framework.AssertionFailedError: Got less rows than 
> expected. Expected 1 but got 0
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testContainsKeyAndContainsWithIndexOnMapValue$9(SelectTest.java:625)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2238)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.testContainsKeyAndContainsWithIndexOnMapValue(SelectTest.java:618)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit-timeout] 
> [junit-timeout] 
> [junit-timeout] Testcase: 
> testFilterWithIndexForContains(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11:
>   FAILED
> [junit-timeout] Invalid value for row 1 column 0 (k1 of type int), expected 
> <1> but got <0>
> [junit-timeout] Invalid value for row 1 column 2 (v of type set), 
> expected <{4, 5, 6}> but got <{2, 3, 4}>
> [junit-timeout] 
> [junit-timeout] junit.framework.AssertionFailedError: Invalid value for row 1 
> column 0 (k1 of type int), expected <1> but got <0>
> [junit-timeout] Invalid value for row 1 column 2 (v of type set), 
> expected <{4, 5, 6}> but got <{2, 3, 4}>
> [junit-timeout] 
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1826)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testFilterWithIndexForContains$6(SelectTest.java:543)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2240)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.testFilterWithIndexForContains(SelectTest.java:542)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
> The latter seems to be giving the results in the wrong order, and the order 
> flips when the data is flushed.
> Caught during preparation of _latest config that would switch default to SAI 
> (CASSANDRA-18753).






[jira] [Commented] (CASSANDRA-19034) SelectTest fails when run with SAI index

2023-11-17 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-19034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787279#comment-17787279
 ] 

Branimir Lambov commented on CASSANDRA-19034:
-

A further failure of this kind:

{code}
[junit-timeout] Testcase: 
testStaticIndexAndNonStaticIndex(org.apache.cassandra.cql3.validation.entities.SecondaryIndexOnStaticColumnTest)-_jdk11:
  FAILED
[junit-timeout] Got less rows than expected. Expected 1 but got 0
[junit-timeout] junit.framework.AssertionFailedError: Got less rows than 
expected. Expected 1 but got 0
[junit-timeout] at 
org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849)
[junit-timeout] at 
org.apache.cassandra.cql3.validation.entities.SecondaryIndexOnStaticColumnTest.testStaticIndexAndNonStaticIndex(SecondaryIndexOnStaticColumnTest.java:191)
[junit-timeout] at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit-timeout] at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit-timeout] at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit-timeout] 
[junit-timeout] 
[junit-timeout] Test 
org.apache.cassandra.cql3.validation.entities.SecondaryIndexOnStaticColumnTest 
FAILED
{code} 

> SelectTest fails when run with SAI index
> 
>
> Key: CASSANDRA-19034
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19034
> Project: Cassandra
>  Issue Type: Bug
>  Components: Feature/SAI
>Reporter: Branimir Lambov
>Priority: Normal
>
> When run with SAI index, the following two tests error out:
> {code}
> [junit-timeout] Testcase: 
> testContainsKeyAndContainsWithIndexOnMapValue(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11:
>FAILED
> [junit-timeout] Got less rows than expected. Expected 1 but got 0
> [junit-timeout] junit.framework.AssertionFailedError: Got less rows than 
> expected. Expected 1 but got 0
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testContainsKeyAndContainsWithIndexOnMapValue$9(SelectTest.java:625)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2238)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.testContainsKeyAndContainsWithIndexOnMapValue(SelectTest.java:618)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit-timeout] 
> [junit-timeout] 
> [junit-timeout] Testcase: 
> testFilterWithIndexForContains(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11:
>   FAILED
> [junit-timeout] Invalid value for row 1 column 0 (k1 of type int), expected 
> <1> but got <0>
> [junit-timeout] Invalid value for row 1 column 2 (v of type set), 
> expected <{4, 5, 6}> but got <{2, 3, 4}>
> [junit-timeout] 
> [junit-timeout] junit.framework.AssertionFailedError: Invalid value for row 1 
> column 0 (k1 of type int), expected <1> but got <0>
> [junit-timeout] Invalid value for row 1 column 2 (v of type set), 
> expected <{4, 5, 6}> but got <{2, 3, 4}>
> [junit-timeout] 
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1826)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testFilterWithIndexForContains$6(SelectTest.java:543)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2240)
> [junit-timeout]   at 
> org.apache.cassandra.cql3.validation.operations.SelectTest.testFilterWithIndexForContains(SelectTest.java:542)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit-timeout]   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
> The latter seems to be giving the results in the wrong order, and the order 
> flips when the data is flushed.
> Caught during preparation of _latest config that would switch default to SAI 
> (CASSANDRA-18753).




[jira] [Created] (CASSANDRA-19034) SelectTest fails when run with SAI index

2023-11-17 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-19034:
---

 Summary: SelectTest fails when run with SAI index
 Key: CASSANDRA-19034
 URL: https://issues.apache.org/jira/browse/CASSANDRA-19034
 Project: Cassandra
  Issue Type: Bug
  Components: Feature/SAI
Reporter: Branimir Lambov


When run with SAI index, the following two tests error out:

{code}
[junit-timeout] Testcase: 
testContainsKeyAndContainsWithIndexOnMapValue(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11:
 FAILED
[junit-timeout] Got less rows than expected. Expected 1 but got 0
[junit-timeout] junit.framework.AssertionFailedError: Got less rows than 
expected. Expected 1 but got 0
[junit-timeout] at 
org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849)
[junit-timeout] at 
org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testContainsKeyAndContainsWithIndexOnMapValue$9(SelectTest.java:625)
[junit-timeout] at 
org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2238)
[junit-timeout] at 
org.apache.cassandra.cql3.validation.operations.SelectTest.testContainsKeyAndContainsWithIndexOnMapValue(SelectTest.java:618)
[junit-timeout] at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit-timeout] at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit-timeout] at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit-timeout] 
[junit-timeout] 
[junit-timeout] Testcase: 
testFilterWithIndexForContains(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11:
FAILED
[junit-timeout] Invalid value for row 1 column 0 (k1 of type int), expected <1> 
but got <0>
[junit-timeout] Invalid value for row 1 column 2 (v of type set), expected 
<{4, 5, 6}> but got <{2, 3, 4}>
[junit-timeout] 
[junit-timeout] junit.framework.AssertionFailedError: Invalid value for row 1 
column 0 (k1 of type int), expected <1> but got <0>
[junit-timeout] Invalid value for row 1 column 2 (v of type set), expected 
<{4, 5, 6}> but got <{2, 3, 4}>
[junit-timeout] 
[junit-timeout] at 
org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1826)
[junit-timeout] at 
org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testFilterWithIndexForContains$6(SelectTest.java:543)
[junit-timeout] at 
org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2240)
[junit-timeout] at 
org.apache.cassandra.cql3.validation.operations.SelectTest.testFilterWithIndexForContains(SelectTest.java:542)
[junit-timeout] at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit-timeout] at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit-timeout] at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
{code}

The latter seems to be giving the results in the wrong order, and the order 
flips when the data is flushed.

Caught during preparation of _latest config that would switch default to SAI 
(CASSANDRA-18753).






[jira] [Comment Edited] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)

2023-11-15 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786290#comment-17786290
 ] 

Branimir Lambov edited comment on CASSANDRA-18710 at 11/15/23 10:15 AM:


{quote}So perhaps the expected value should be calculated as a moving average 
by updating it with subsequent table sizes.
{quote}
This makes sense. Sorting the sstable files by name should give them in the 
correct order, so we can easily calculate the moving average from them.

Actually, that would solve the extra flush problem as well, wouldn't it?


was (Author: blambov):
{quote}So perhaps the expected value should be calculated as a moving average 
by updating it with subsequent table sizes.
{quote}
This makes sense. Sorting the sstable files by name should give them in the 
correct order, so we can easily calculate the moving average from them.

> Test failure: 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from 
> org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
> --
>
> Key: CASSANDRA-18710
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18710
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 4.1.x, 5.0-beta, 5.0.x, 5.x
>
> Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt
>
>
> Seen here:
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/]
> h3.  
> {code:java}
> Error Message
> expected:<7200.0> but was:<1367.83970468544>
> Stacktrace
> junit.framework.AssertionFailedError: expected:<7200.0> but was:<1367.83970468544>
>  at org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
>  






[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)

2023-11-15 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786290#comment-17786290
 ] 

Branimir Lambov commented on CASSANDRA-18710:
-

{quote}So perhaps the expected value should be calculated as a moving average 
by updating it with subsequent table sizes.
{quote}
This makes sense. Sorting the sstable files by name should give them in the 
correct order, so we can easily calculate the moving average from them.
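
A minimal sketch of that calculation. It assumes an exponential moving average (the ticket does not say which kind of average the metric uses), and the class name, file names and smoothing factor are all hypothetical. A TreeMap stands in for sorting the sstable files by name:

```java
import java.util.Map;
import java.util.TreeMap;

final class FlushSizeAverageSketch
{
    /**
     * Folds sstable sizes into a moving average, visiting files in name order.
     * Assumption: an exponential moving average with smoothing factor alpha;
     * the average used by the real metric may differ.
     */
    static double expectedFlushSize(TreeMap<String, Long> sizeByFileName, double alpha)
    {
        double average = Double.NaN; // NaN marks "no sstable seen yet"
        for (Map.Entry<String, Long> entry : sizeByFileName.entrySet())
        {
            long size = entry.getValue();
            // The first size seeds the average; every later size updates it.
            average = Double.isNaN(average) ? size : average + alpha * (size - average);
        }
        return average;
    }

    public static void main(String[] args)
    {
        // Hypothetical file names; TreeMap iterates them in name order.
        TreeMap<String, Long> sizes = new TreeMap<>();
        sizes.put("nb-2-big-Data.db", 200L);
        sizes.put("nb-1-big-Data.db", 100L);
        // Evaluates 100 + 0.5 * (200 - 100)
        System.out.println(expectedFlushSize(sizes, 0.5));
    }
}
```

Plain name ordering matches flush order only while the ids embedded in the names compare the same way as strings, which the sketch takes for granted.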

> Test failure: 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from 
> org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
> --
>
> Key: CASSANDRA-18710
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18710
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Jacek Lewandowski
>Priority: Normal
> Fix For: 4.1.x, 5.0-beta, 5.0.x, 5.x
>
> Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt
>
>
> Seen here:
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/]
> h3.  
> {code:java}
> Error Message
> expected:<7200.0> but was:<1367.83970468544>
> Stacktrace
> junit.framework.AssertionFailedError: expected:<7200.0> but 
> was:<1367.83970468544> at 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
>  






[jira] [Commented] (CASSANDRA-18757) UnifiedCompactionTask is incorrectly setting keepOriginals

2023-11-15 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786282#comment-17786282
 ] 

Branimir Lambov commented on CASSANDRA-18757:
-

I think it is a leftover from a refactoring that (among other things) fixed 
CASSANDRA-18756 in DSE.

Fix LGTM, but it's a shame that no test caught it.

> UnifiedCompactionTask is incorrectly setting keepOriginals
> --
>
> Key: CASSANDRA-18757
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18757
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> {code:java}
> super(cfs, txn, gcBefore, 
> strategy.getController().getIgnoreOverlapsInExpirationCheck());{code}
> in {{UnifiedCompactionTask}} is calling the base constructor
> {code:java}
>  public CompactionTask(ColumnFamilyStore cfs, LifecycleTransaction txn, long 
> gcBefore, boolean keepOriginals)
> {code}
> which can set {{keepOriginals}} to true when it should not be.
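
To illustrate the mechanism only, here is a reduced sketch with hypothetical stand-in classes (not the real CompactionTask or UnifiedCompactionTask, and not the committed fix). Because both flags are plain booleans, the compiler accepts the call even though the value lands in the wrong parameter:

```java
// Hypothetical stand-ins for CompactionTask and UnifiedCompactionTask,
// reduced to the two flags involved.
class BaseTaskSketch
{
    final boolean keepOriginals;

    BaseTaskSketch(long gcBefore, boolean keepOriginals)
    {
        this.keepOriginals = keepOriginals;
    }
}

final class UnifiedTaskSketch extends BaseTaskSketch
{
    UnifiedTaskSketch(long gcBefore, boolean ignoreOverlapsInExpirationCheck)
    {
        // Same shape as the reported call: the overlap flag is passed in the
        // position that the base constructor reads as keepOriginals.
        super(gcBefore, ignoreOverlapsInExpirationCheck);
    }

    public static void main(String[] args)
    {
        // keepOriginals simply mirrors whatever the overlap flag was.
        System.out.println(new UnifiedTaskSketch(0L, true).keepOriginals);
        System.out.println(new UnifiedTaskSketch(0L, false).keepOriginals);
    }
}
```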






[jira] [Comment Edited] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-11-03 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782692#comment-17782692
 ] 

Branimir Lambov edited comment on CASSANDRA-18945 at 11/3/23 6:15 PM:
--

{quote}+ assert !(count < 0); // Must be positive, 0 or NaN, which should 
translate to baseShardCount

Review Comment:
@ethan-brown2022 `count >= 0` is more natural to me
{quote}
I can't find this to reply to it directly. The comment at the end of the line 
says that {{count}} can be {{NaN}}, which will fail {{count >= 0}} but pass 
{{!(count < 0)}}. Perhaps we should change the bit after NaN to "(which 
would fail count >= 0, but is acceptable and should translate to 
baseShardCount)" or something similar?
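
For reference, a small self-contained sketch of the difference. The class and method names are hypothetical; the only thing it relies on is that every ordered comparison involving NaN evaluates to false in Java:

```java
final class NanCheckSketch
{
    // The form used in the patch: true for positive values, 0 and NaN.
    static boolean passesNegatedCheck(double count)
    {
        return !(count < 0);
    }

    // The form suggested in review: true for positive values and 0, false for NaN.
    static boolean passesDirectCheck(double count)
    {
        return count >= 0;
    }

    public static void main(String[] args)
    {
        for (double count : new double[]{ 4, 0, -1, Double.NaN })
            System.out.println(count + ": " + passesNegatedCheck(count) + " vs " + passesDirectCheck(count));
    }
}
```

The two checks agree for every ordinary number and differ only for NaN, which is the case the assertion comment is meant to allow.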


was (Author: blambov):
{quote}+ assert !(count < 0); // Must be positive, 0 or NaN, which should 
translate to baseShardCount

Review Comment:
@ethan-brown2022 `count >= 0` is more natural to me
{quote}
I can't find this to reply to it directly. The comment at the end of the line 
says that {{count}} can be {{{}NaN{}}}, which will fail {{count >= 0}} but pass 
{{{}!(count < 0){}}}. Perhaps we should change the bit after NaN to "(which 
would fail {{{}count >= 0,{}}}", but is acceptable and should translate to 
baseShardCount)" or something similar?

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.0-beta
>
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, 
> file_ucs_shenandoah_off_heap_memtable.html, 
> file_ucs_shenandoah_on_heap_memtable_2.html, 
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close 
> to the same size, defaulting to 1 GiB. Unfortunately tests show that 
> Cassandra starts to have performance problems when the number of sstables 
> grows to the order of a thousand, and in particular that even 1 TiB of data 
> with the default configuration is creating too many sstables for efficient 
> processing. This matters even more for SAI, where the number of sstables in 
> the system can have a proportional effect on the complexity of operations.
> It is quite easy to create a configuration option that allows sstables to 
> take some part of the data growth by adding a multiplier to [the shard count 
> calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
>  formula, replacing 
> {{2 ^ round(log2(d / (t * b))) * b}} 
> with 
> {{2 ^ round((1 - 𝜆) * log2(d / (t * b))) * b}}, 
> where 𝜆 is a parameter whose value is between 0 and 1.
> With this, a 𝜆 of 0.5 would mean that shard count and sstable size grow in 
> parallel at the square root of the data size growth. 0 would result in no 
> growth, and 1 in always using the same number of shards.
> It may also be valuable to introduce a threshold for engaging the base shard 
> count to avoid splitting lowest-level sstables into fragments that are too 
> small.
> Once both of these are in place, we can set defaults that better suit all 
> node densities, including 10 TiB and beyond, for example:
>  - target size of 1 GiB
>  - 𝜆 of 1/3
>  - base shard count of 4
>  - minimum size 100 MiB
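
A rough sketch of the modified formula from the description, to make the effect of 𝜆 concrete. The class and method names are hypothetical, the minimum-size threshold is left out, and clamping the exponent at 0 for data smaller than t * b is an assumption of the sketch rather than something the description states:

```java
final class ShardCountSketch
{
    /**
     * Evaluates 2 ^ round((1 - lambda) * log2(d / (t * b))) * b, where
     * d is the data size, t the target sstable size and b the base shard count.
     */
    static long shardCount(double dataSize, double targetSize, int baseShardCount, double lambda)
    {
        double log2 = Math.log(dataSize / (targetSize * baseShardCount)) / Math.log(2);
        // Assumption: never go below the base shard count when d < t * b.
        long exponent = Math.max(0, Math.round((1 - lambda) * log2));
        return (1L << exponent) * baseShardCount;
    }

    public static void main(String[] args)
    {
        double gib = 1L << 30;
        double tib = 1L << 40;
        // 1 TiB of data, 1 GiB target size, base shard count of 4.
        for (double lambda : new double[]{ 0, 1.0 / 3, 0.5, 1 })
            System.out.println("lambda " + lambda + " -> " + shardCount(tib, gib, 4, lambda) + " shards");
    }
}
```

With these inputs log2(d / (t * b)) is 8, so the shard count comes out as 1024 for a 𝜆 of 0, 128 for 1/3, 64 for 0.5 and 4 for 1.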






[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-11-03 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782692#comment-17782692
 ] 

Branimir Lambov commented on CASSANDRA-18945:
-

{quote}+ assert !(count < 0); // Must be positive, 0 or NaN, which should 
translate to baseShardCount

Review Comment:
@ethan-brown2022 `count >= 0` is more natural to me
{quote}
I can't find this to reply to it directly. The comment at the end of the line 
says that {{count}} can be {{NaN}}, which will fail {{count >= 0}} but pass 
{{!(count < 0)}}. Perhaps we should change the bit after NaN to "(which 
would fail {{count >= 0}}, but is acceptable and should translate to 
baseShardCount)" or something similar?

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.0-beta
>
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, 
> file_ucs_shenandoah_off_heap_memtable.html, 
> file_ucs_shenandoah_on_heap_memtable_2.html, 
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close 
> to the same size, defaulting to 1 GiB. Unfortunately tests show that 
> Cassandra starts to have performance problems when the number of sstables 
> grows to the order of a thousand, and in particular that even 1 TiB of data 
> with the default configuration is creating too many sstables for efficient 
> processing. This matters even more for SAI, where the number of sstables in 
> the system can have a proportional effect on the complexity of operations.
> It is quite easy to create a configuration option that allows sstables to 
> take some part of the data growth by adding a multiplier to [the shard count 
> calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
>  formula, replacing 
> {{2 ^ round(log2(d / (t * b))) * b}} 
> with 
> {{2 ^ round((1 - 𝜆) * log2(d / (t * b))) * b}}, 
> where 𝜆 is a parameter whose value is between 0 and 1.
> With this, a 𝜆 of 0.5 would mean that shard count and sstable size grow in 
> parallel at the square root of the data size growth. 0 would result in no 
> growth, and 1 in always using the same number of shards.
> It may also be valuable to introduce a threshold for engaging the base shard 
> count to avoid splitting lowest-level sstables into fragments that are too 
> small.
> Once both of these are in place, we can set defaults that better suit all 
> node densities, including 10 TiB and beyond, for example:
>  - target size of 1 GiB
>  - 𝜆 of 1/3
>  - base shard count of 4
>  - minimum size 100 MiB






[jira] [Commented] (CASSANDRA-18232) Write docs for CEP-26 Unified Compaction Strategy (UCS)

2023-11-03 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782640#comment-17782640
 ] 

Branimir Lambov commented on CASSANDRA-18232:
-

There are some additional options coming with CASSANDRA-18945. The details can 
be found in [the developer-side markdown 
doc|https://github.com/datastax/cassandra/blob/CASSANDRA-18945/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#full-sharding-scheme].

> Write docs for CEP-26 Unified Compaction Strategy (UCS)
> ---
>
> Key: CASSANDRA-18232
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18232
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Documentation
>Reporter: Lorina Poland
>Assignee: Lorina Poland
>Priority: High
> Fix For: 5.x
>
>







[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-11-03 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782638#comment-17782638
 ] 

Branimir Lambov commented on CASSANDRA-18945:
-

We will handle the docs in the documentation ticket, CASSANDRA-18232. I will 
reach out to Lorina to make her aware of the changes.

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.0-beta
>
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, 
> file_ucs_shenandoah_off_heap_memtable.html, 
> file_ucs_shenandoah_on_heap_memtable_2.html, 
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close 
> to the same size, defaulting to 1 GiB. Unfortunately tests show that 
> Cassandra starts to have performance problems when the number of sstables 
> grows to the order of a thousand, and in particular that even 1 TiB of data 
> with the default configuration is creating too many sstables for efficient 
> processing. This matters even more for SAI, where the number of sstables in 
> the system can have a proportional effect on the complexity of operations.
> It is quite easy to create a configuration option that allows sstables to 
> take some part of the data growth by adding a multiplier to [the shard count 
> calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
>  formula, replacing 
> {{2 ^ round(log2(d / (t * b))) * b}} 
> with 
> {{2 ^ round((1 - 𝜆) * log2(d / (t * b))) * b}}, 
> where 𝜆 is a parameter whose value is between 0 and 1.
> With this, a 𝜆 of 0.5 would mean that shard count and sstable size grow in 
> parallel at the square root of the data size growth. 0 would result in no 
> growth, and 1 in always using the same number of shards.
> It may also be valuable to introduce a threshold for engaging the base shard 
> count to avoid splitting lowest-level sstables into fragments that are too 
> small.
> Once both of these are in place, we can set defaults that better suit all 
> node densities, including 10 TiB and beyond, for example:
>  - target size of 1 GiB
>  - 𝜆 of 1/3
>  - base shard count of 4
>  - minimum size 100 MiB






[jira] [Created] (CASSANDRA-18997) Unified Compaction Strategy is missing documentation

2023-11-03 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-18997:
---

 Summary: Unified Compaction Strategy is missing documentation
 Key: CASSANDRA-18997
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18997
 Project: Cassandra
  Issue Type: Task
  Components: Documentation
Reporter: Branimir Lambov


UCS is missing from [the CQL documentation for 
5.0|https://cassandra.apache.org/doc/5.0/cassandra/developing/cql/ddl.html#cql-compaction-options]
 and [the compaction 
page|https://cassandra.apache.org/doc/5.0/cassandra/managing/operating/compaction/index.html#compaction-options].

We need to create a documentation page for UCS and link it from both.






[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)

2023-11-03 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782610#comment-17782610
 ] 

Branimir Lambov commented on CASSANDRA-18710:
-

Yes, this looks like a 4.1 regression that is affecting all tests that are 
sensitive to the number of sstables. Such tests usually run in a separate 
keyspace (using {{KEYSPACE_PER_TEST}}) to avoid the keyspace flush that 
dropping a table triggers, but this new commit log recycling is triggering 
another flush that is not restricted to the affected keyspace.

> Test failure: 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from 
> org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
> --
>
> Key: CASSANDRA-18710
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18710
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 5.0-beta, 5.0.x, 5.x
>
> Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt
>
>
> Seen here:
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/]
> h3.  
> {code:java}
> Error Message
> expected:<7200.0> but was:<1367.83970468544>
> Stacktrace
> junit.framework.AssertionFailedError: expected:<7200.0> but 
> was:<1367.83970468544> at 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
>  






[jira] [Commented] (CASSANDRA-18533) Move format-specific sstable options into the format configuration

2023-11-02 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782193#comment-17782193
 ] 

Branimir Lambov commented on CASSANDRA-18533:
-

I would keep it simple and not add a common settings entry under options. If 
necessary, the user can copy the value to both.

> Move format-specific sstable options into the format configuration
> --
>
> Key: CASSANDRA-18533
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18533
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Schema, Local/SSTable
>Reporter: Branimir Lambov
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 5.x
>
>
> This mainly concerns cassandra yaml settings:
> - {{column_index_size}}, which should also be renamed to 
> {{row_index_granularity}}
> - {{column_index_cache_size}}
> - {{index_summary_capacity}}
> - {{index_summary_resize_interval}}
> and possibly
> - {{key_cache_size}}, {{key_cache_keys_to_save}}, {{key_cache_save_period}}, 
> {{key_cache_migrate_during_compaction}}
> - {{sstable_preemptive_open_interval}}
> Existing settings should be deprecated but still picked up if defined.
> At this point we will not consider table-level options that make better sense 
> as format parameters ({{min/max_index_interval}}, {{bloom_filter_fp_chance}}, 
> {{crc_check_chance}} and possibly {{compression}}), because we do not yet 
> support per-table format selection/configuration.






[jira] [Commented] (CASSANDRA-18533) Move format-specific sstable options into the format configuration

2023-11-02 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782184#comment-17782184
 ] 

Branimir Lambov commented on CASSANDRA-18533:
-

1. Yes, precisely.
2. The key cache is constructed in a completely separate portion of the code, 
isn't it? Ignore the key cache settings (except migration); I don't think 
changing this is something we can do at the moment.
3. Although it is not at the moment, the row index granularity in particular 
should be a table-level property -- there's no real reason to use one setting 
for all tables, and there's an advantage to be had by making it configurable. 
However, things like the key cache size or index summary capacity are something 
to be shared, not just between tables but also potentially between formats; I 
don't want to get into a complicated solution for this, I would either ignore 
any table-level modification for these (with a warning) or check that the value 
is the same among all tables. This, along with format variations (e.g. 
"bti-fast"), is also out of scope for this ticket.

> Move format-specific sstable options into the format configuration
> --
>
> Key: CASSANDRA-18533
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18533
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Schema, Local/SSTable
>Reporter: Branimir Lambov
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 5.x
>
>
> This mainly concerns cassandra yaml settings:
> - {{column_index_size}}, which should also be renamed to 
> {{row_index_granularity}}
> - {{column_index_cache_size}}
> - {{index_summary_capacity}}
> - {{index_summary_resize_interval}}
> and possibly
> - {{key_cache_size}}, {{key_cache_keys_to_save}}, {{key_cache_save_period}}, 
> {{key_cache_migrate_during_compaction}}
> - {{sstable_preemptive_open_interval}}
> Existing settings should be deprecated but still picked up if defined.
> At this point we will not consider table-level options that make better sense 
> as format parameters ({{min/max_index_interval}}, {{bloom_filter_fp_chance}}, 
> {{crc_check_chance}} and possibly {{compression}}), because we do not yet 
> support per-table format selection/configuration.






[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-10-27 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17780350#comment-17780350
 ] 

Branimir Lambov commented on CASSANDRA-18945:
-

Yes, I intend to commit it to 5.0.

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.0, 5.x
>
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, 
> file_ucs_shenandoah_off_heap_memtable.html, 
> file_ucs_shenandoah_on_heap_memtable_2.html, 
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close 
> to the same size, defaulting to 1 GiB. Unfortunately tests show that 
> Cassandra starts to have performance problems when the number of sstables 
> grows to the order of a thousand, and in particular that even 1 TiB of data 
> with the default configuration is creating too many sstables for efficient 
> processing. This matters even more for SAI, where the number of sstables in 
> the system can have a proportional effect on the complexity of operations.
> It is quite easy to create a configuration option that allows sstables to 
> take some part of the data growth by adding a multiplier to [the shard count 
> calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
>  formula, replacing 
> {{2 ^ round(log2(d / (t * b))) * b}} 
> with 
> {{2 ^ round((1 - 𝜆) * log2(d / (t * b))) * b}}, 
> where 𝜆 is a parameter whose value is between 0 and 1.
> With this, a 𝜆 of 0.5 would mean that shard count and sstable size grow in 
> parallel at the square root of the data size growth. 0 would result in no 
> growth, and 1 in always using the same number of shards.
> It may also be valuable to introduce a threshold for engaging the base shard 
> count to avoid splitting lowest-level sstables into fragments that are too 
> small.
> Once both of these are in place, we can set defaults that better suit all 
> node densities, including 10 TiB and beyond, for example:
>  - target size of 1 GiB
>  - 𝜆 of 1/3
>  - base shard count of 4
>  - minimum size 100 MiB






[jira] [Updated] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-10-27 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18945:

Fix Version/s: 5.0

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 5.0, 5.x
>
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, 
> file_ucs_shenandoah_off_heap_memtable.html, 
> file_ucs_shenandoah_on_heap_memtable_2.html, 
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close 
> to the same size, defaulting to 1 GiB. Unfortunately tests show that 
> Cassandra starts to have performance problems when the number of sstables 
> grows to the order of a thousand, and in particular that even 1 TiB of data 
> with the default configuration is creating too many sstables for efficient 
> processing. This matters even more for SAI, where the number of sstables in 
> the system can have a proportional effect on the complexity of operations.
> It is quite easy to create a configuration option that allows sstables to 
> take some part of the data growth by adding a multiplier to [the shard count 
> calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
>  formula, replacing 
> {{2 ^ round(log2(d / (t * b))) * b}} 
> with 
> {{2 ^ round((1 - 𝜆) * log2(d / (t * b))) * b}}, 
> where 𝜆 is a parameter whose value is between 0 and 1.
> With this, a 𝜆 of 0.5 would mean that shard count and sstable size grow in 
> parallel at the square root of the data size growth. 0 would result in no 
> growth, and 1 in always using the same number of shards.
> It may also be valuable to introduce a threshold for engaging the base shard 
> count to avoid splitting lowest-level sstables into fragments that are too 
> small.
> Once both of these are in place, we can set defaults that better suit all 
> node densities, including 10 TiB and beyond, for example:
>  - target size of 1 GiB
>  - 𝜆 of 1/3
>  - base shard count of 4
>  - minimum size 100 MiB






[jira] [Updated] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-10-27 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18945:

 Bug Category: Parent values: Degradation(12984)Level 1 values: Performance 
Bug/Regression(12997)
   Complexity: Normal
Discovered By: Adhoc Test
Reviewers: Branimir Lambov
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, 
> file_ucs_shenandoah_off_heap_memtable.html, 
> file_ucs_shenandoah_on_heap_memtable_2.html, 
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close 
> to the same size, defaulting to 1 GiB. Unfortunately tests show that 
> Cassandra starts to have performance problems when the number of sstables 
> grows to the order of a thousand, and in particular that even 1 TiB of data 
> with the default configuration is creating too many sstables for efficient 
> processing. This matters even more for SAI, where the number of sstables in 
> the system can have a proportional effect on the complexity of operations.
> It is quite easy to create a configuration option that allows sstables to 
> take some part of the data growth by adding a multiplier to [the shard count 
> calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
>  formula, replacing 
> {{2 ^ round(log2(d / (t * b))) * b}} 
> with 
> {{2 ^ round((1 - 𝜆) * log2(d / (t * b))) * b}}, 
> where 𝜆 is a parameter whose value is between 0 and 1.
> With this, a 𝜆 of 0.5 would mean that shard count and sstable size grow in 
> parallel at the square root of the data size growth. 0 would result in no 
> growth, and 1 in always using the same number of shards.
> It may also be valuable to introduce a threshold for engaging the base shard 
> count to avoid splitting lowest-level sstables into fragments that are too 
> small.
> Once both of these are in place, we can set defaults that better suit all 
> node densities, including 10 TiB and beyond, for example:
>  - target size of 1 GiB
>  - 𝜆 of 1/3
>  - base shard count of 4
>  - minimum size 100 MiB






[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-10-27 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17780240#comment-17780240
 ] 

Branimir Lambov commented on CASSANDRA-18945:
-

[~smiklosovic], would you be willing to be the second reviewer?

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, 
> file_ucs_shenandoah_off_heap_memtable.html, 
> file_ucs_shenandoah_on_heap_memtable_2.html, 
> file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close 
> to the same size, defaulting to 1 GiB. Unfortunately tests show that 
> Cassandra starts to have performance problems when the number of sstables 
> grows to the order of a thousand, and in particular that even 1 TiB of data 
> with the default configuration is creating too many sstables for efficient 
> processing. This matters even more for SAI, where the number of sstables in 
> the system can have a proportional effect on the complexity of operations.
> It is quite easy to create a configuration option that allows sstables to 
> take some part of the data growth by adding a multiplier to [the shard count 
> calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
>  formula, replacing 
> {{2 ^ round(log2(d / (t * b))) * b}} 
> with 
> {{2 ^ round((1 - 𝜆) * log2(d / (t * b))) * b}}, 
> where 𝜆 is a parameter whose value is between 0 and 1.
> With this, a 𝜆 of 0.5 would mean that shard count and sstable size grow in 
> parallel at the square root of the data size growth. A 𝜆 of 0 would result in 
> no sstable size growth (the current behaviour), and 1 in always using the same 
> number of shards.
> It may also be valuable to introduce a threshold for engaging the base shard 
> count to avoid splitting lowest-level sstables into fragments that are too 
> small.
> Once both of these are in place, we can set defaults that better suit all 
> node densities, including 10 TiB and beyond, for example:
>  - target size of 1 GiB
>  - 𝜆 of 1/3
>  - base shard count of 4
>  - minimum size 100 MiB



[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)

2023-10-25 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779468#comment-17779468
 ] 

Branimir Lambov commented on CASSANDRA-18710:
-

So the {{KEYSPACE_PER_TEST}} fix for unexpected flushes no longer works after 
CASSANDRA-17071? All of the tests that use it will be having intermittent 
failures unless we find a way to block this.

> Test failure: 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from 
> org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
> --
>
> Key: CASSANDRA-18710
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18710
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
> Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt
>
>
> Seen here:
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/]
> h3.  
> {code:java}
> Error Message
> expected:<7200.0> but was:<1367.83970468544>
> Stacktrace
> junit.framework.AssertionFailedError: expected:<7200.0> but 
> was:<1367.83970468544> at 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
>  



[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-10-25 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779444#comment-17779444
 ] 

Branimir Lambov commented on CASSANDRA-18945:
-

Attached [the result of a recent 
benchmark|https://issues.apache.org/jira/secure/attachment/13063855/key-value-oss.html]
 comparing the UCS default (green) to STCS (blue) and an option with larger 
SSTable size (orange). The default UCS has worse results in the throughput 
stage, but more importantly it is unable to serve the 110k ops/s during the 1:1 
and read-only stages. I'm still investigating what causes these reads to be so 
slow, but switching to 10GiB target fully fixes the problem (the two other 
options the orange graph uses, 'base_shard_count': '1' and 
'max_sstables_to_compact': '32', help but are not as significant on their own).

Rather than ask users to choose a target size based on their expected data 
density, the database should be able to deal with this itself. Admitting some 
of the growth into the sstable size is a good way to achieve that.

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Attachments: key-value-oss.html
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close 
> to the same size, defaulting to 1 GiB. Unfortunately tests show that 
> Cassandra starts to have performance problems when the number of sstables 
> grows to the order of a thousand, and in particular that even 1 TiB of data 
> with the default configuration is creating too many sstables for efficient 
> processing. This matters even more for SAI, where the number of sstables in 
> the system can have a proportional effect on the complexity of operations.
> It is quite easy to create a configuration option that allows sstables to 
> take some part of the data growth by adding a multiplier to [the shard count 
> calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
>  formula, replacing 
> {{2 ^ round(log2(d / (t * b))) * b}} 
> with 
> {{2 ^ round((1 - 𝜆) * log2(d / (t * b))) * b}}, 
> where 𝜆 is a parameter whose value is between 0 and 1.
> With this, a 𝜆 of 0.5 would mean that shard count and sstable size grow in 
> parallel at the square root of the data size growth. A 𝜆 of 0 would result in 
> no sstable size growth (the current behaviour), and 1 in always using the same 
> number of shards.
> It may also be valuable to introduce a threshold for engaging the base shard 
> count to avoid splitting lowest-level sstables into fragments that are too 
> small.
> Once both of these are in place, we can set defaults that better suit all 
> node densities, including 10 TiB and beyond, for example:
>  - target size of 1 GiB
>  - 𝜆 of 1/3
>  - base shard count of 4
>  - minimum size 100 MiB



[jira] [Updated] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-10-25 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18945:

Attachment: key-value-oss.html

> Unified Compaction Strategy is creating too many sstables
> -
>
> Key: CASSANDRA-18945
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Attachments: key-value-oss.html
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The unified compaction strategy currently aims to create sstables with close 
> to the same size, defaulting to 1 GiB. Unfortunately tests show that 
> Cassandra starts to have performance problems when the number of sstables 
> grows to the order of a thousand, and in particular that even 1 TiB of data 
> with the default configuration is creating too many sstables for efficient 
> processing. This matters even more for SAI, where the number of sstables in 
> the system can have a proportional effect on the complexity of operations.
> It is quite easy to create a configuration option that allows sstables to 
> take some part of the data growth by adding a multiplier to [the shard count 
> calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
>  formula, replacing 
> {{2 ^ round(log2(d / (t * b))) * b}} 
> with 
> {{2 ^ round((1 - 𝜆) * log2(d / (t * b))) * b}}, 
> where 𝜆 is a parameter whose value is between 0 and 1.
> With this, a 𝜆 of 0.5 would mean that shard count and sstable size grow in 
> parallel at the square root of the data size growth. A 𝜆 of 0 would result in 
> no sstable size growth (the current behaviour), and 1 in always using the same 
> number of shards.
> It may also be valuable to introduce a threshold for engaging the base shard 
> count to avoid splitting lowest-level sstables into fragments that are too 
> small.
> Once both of these are in place, we can set defaults that better suit all 
> node densities, including 10 TiB and beyond, for example:
>  - target size of 1 GiB
>  - 𝜆 of 1/3
>  - base shard count of 4
>  - minimum size 100 MiB



[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)

2023-10-20 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1830#comment-1830
 ] 

Branimir Lambov commented on CASSANDRA-18710:
-

It looks like the reason for the unexpected flush is the commit log:
{code:java}
[junit-timeout] INFO  [OptionalTasks:1] 2023-10-12 21:55:11,095 
ColumnFamilyStore.java:1017 - Enqueuing flush of 
cql_test_keyspace_alt.table_01, Reason: COMMITLOG_DIRTY, Usage: 74.752KiB (0%) 
on-heap, 3.777KiB (0%) off-heap
[junit-timeout] INFO  [PerDiskMemtableFlushWriter_0:2] 2023-10-12 21:55:11,103 
Flushing.java:154 - Writing Memtable-table_01@1180822937(6.854KiB serialized 
bytes, 242 ops, 74.916KiB (0%) on-heap, 3.781KiB (0%) off-heap), flushed range 
= [null, null)
[junit-timeout] INFO  [PerDiskMemtableFlushWriter_0:2] 2023-10-12 21:55:11,128 
Flushing.java:180 - Completed flushing 
/tmp/cassandra/build/test/cassandra/data/cql_test_keyspace_alt/table_01-03e61210694a11eeb4091bdb4ac3170b/nc-1-big-Data.db
 (6.839KiB) ... {code}
which is flushing just 242 out of the 1000 ops that the test needs per table.

We need to understand what causes these {{COMMITLOG_DIRTY}} flushes, because 
there are quite a few tests that will fail if a flush happens at the wrong 
time. Or maybe somehow disable commitlog-driven flushing for tests (e.g. by 
setting a really large commit log space limit).

> Test failure: 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from 
> org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
> --
>
> Key: CASSANDRA-18710
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18710
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Ekaterina Dimitrova
>Assignee: Brandon Williams
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
> Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt
>
>
> Seen here:
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/]
> h3.  
> {code:java}
> Error Message
> expected:<7200.0> but was:<1367.83970468544>
> Stacktrace
> junit.framework.AssertionFailedError: expected:<7200.0> but 
> was:<1367.83970468544> at 
> org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> {code}
>  



[jira] [Created] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables

2023-10-20 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-18945:
---

 Summary: Unified Compaction Strategy is creating too many sstables
 Key: CASSANDRA-18945
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
 Project: Cassandra
  Issue Type: Bug
  Components: Local/Compaction
Reporter: Branimir Lambov


The unified compaction strategy currently aims to create sstables with close to 
the same size, defaulting to 1 GiB. Unfortunately tests show that Cassandra 
starts to have performance problems when the number of sstables grows to the 
order of a thousand, and in particular that even 1 TiB of data with the default 
configuration is creating too many sstables for efficient processing. This 
matters even more for SAI, where the number of sstables in the system can have 
a proportional effect on the complexity of operations.

It is quite easy to create a configuration option that allows sstables to take 
some part of the data growth by adding a multiplier to [the shard count 
calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding]
 formula, replacing 
{{2 ^ round(log2(d / (t * b))) * b}} 
with 
{{2 ^ round((1 - 𝜆) * log2(d / (t * b))) * b}}, 
where 𝜆 is a parameter whose value is between 0 and 1.

With this, a 𝜆 of 0.5 would mean that shard count and sstable size grow in 
parallel at the square root of the data size growth. A 𝜆 of 0 would result in 
no sstable size growth (the current behaviour), and 1 in always using the same 
number of shards.
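
To make the effect of 𝜆 concrete, here is a rough sketch of the proposed formula in Java. It is only an illustration: the class and method names are hypothetical and not taken from the UCS code, sizes are plain doubles in a common unit, and clamping the exponent at 0 (so the result never drops below the base shard count when d < t * b) is an assumption of this sketch.

```java
/** Illustrative only: class and method names are hypothetical, not Cassandra's. */
final class ShardCountSketch
{
    /**
     * Computes 2 ^ round((1 - lambda) * log2(d / (t * b))) * b.
     * d = data size, t = target sstable size (same unit as d), b = base shard count.
     * A lambda of 0 reproduces the current formula.
     */
    static long shardCount(double d, double t, int b, double lambda)
    {
        double log2 = Math.log(d / (t * b)) / Math.log(2);
        // Assumption of this sketch: never go below the base shard count when d < t * b.
        long exponent = Math.max(0L, Math.round((1 - lambda) * log2));
        return (1L << exponent) * b;
    }

    public static void main(String[] args)
    {
        double d = 1024, t = 1; // 1 TiB of data and a 1 GiB target, both expressed in GiB
        int b = 4;
        for (double lambda : new double[]{ 0, 1.0 / 3, 0.5, 1 })
        {
            long shards = shardCount(d, t, b, lambda);
            System.out.printf("lambda=%.2f shards=%d sstable size=%.0f GiB%n",
                              lambda, shards, d / shards);
        }
    }
}
```

For 1 TiB of data with t = 1 GiB and b = 4, this gives 1024 shards at 𝜆 = 0, 128 at 𝜆 = 1/3, 64 at 𝜆 = 0.5 and 4 at 𝜆 = 1.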

It may also be valuable to introduce a threshold for engaging the base shard 
count to avoid splitting lowest-level sstables into fragments that are too 
small.

Once both of these are in place, we can set defaults that better suit all node 
densities, including 10 TiB and beyond, for example:
 - target size of 1 GiB
 - 𝜆 of 1/3
 - base shard count of 4
 - minimum size 100 MiB



[jira] [Commented] (CASSANDRA-18872) Remove deprecated crc_check_chance in compression params

2023-10-19 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1333#comment-1333
 ] 

Branimir Lambov commented on CASSANDRA-18872:
-

The patch looks good to me; the changes are not too invasive and can be easily 
replaced with format configuration in CASSANDRA-18534.

Do we have a documentation ticket corresponding to this? AFAICS [the 
docs|https://cassandra.apache.org/doc/latest/cassandra/operating/compression.html]
 only mention the compression-level setting, even for 4.1. This documentation 
change also needs to explain that the chance only applies to compressed 
sstables.

> Remove deprecated crc_check_chance in compression params
> 
>
> Key: CASSANDRA-18872
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18872
> Project: Cassandra
>  Issue Type: Task
>  Components: Feature/Compression, Legacy/CQL
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> crc_check_chance was moved from compression parameters and it is a standalone 
> table parameter. This was done in times of 3.0 so it is now time to get rid 
> of that in 5.0.



[jira] [Updated] (CASSANDRA-18534) Make sstable format configurable per table

2023-10-10 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18534:

Fix Version/s: 5.0
   (was: 5.x)

> Make sstable format configurable per table
> --
>
> Key: CASSANDRA-18534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18534
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Schema, Local/SSTable
>Reporter: Branimir Lambov
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 5.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Some SSTable format settings need to be configurable per table for better 
> efficiency. This includes:
>  - {{row_index_granularity}}
>  - {{bloom_filter_fp_chance}}
>  - {{crc_check_chance}}
>  - {{min/max_index_interval}}
> Some of these are currently configurable using direct properties of tables. 
> Having them as format properties makes better sense and should also support 
> specifying useable combinations of settings, e.g.
> {code:java}
> CREATE TABLE ... WITH sstable_format = "bti-fast";
> CREATE TABLE ... WITH sstable_format = "bti-small";
> {code}
> where {{bti-fast}} and {{bti-small}} can be defined in {{cassandra.yaml}} 
> e.g. as
> {code:java}
> sstable.format.options:
>   - bti-fast:
>       row_index_granularity: 1kiB
>       bloom_filter_fp_chance: 0.01
>   - bti-small:
>       row_index_granularity: 32kiB
>       bloom_filter_fp_chance: 0.1
> {code}



[jira] [Commented] (CASSANDRA-18534) Make sstable format configurable per table

2023-10-10 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773696#comment-17773696
 ] 

Branimir Lambov commented on CASSANDRA-18534:
-

bq. Also, do you think it is possible and useful to make sstable_format contain 
custom parameters?

_All_ of the parameters to the SSTable format are custom, i.e. format-specific. 
This is also the qualifying condition for something to be moved into the format 
config: if you can imagine an SSTable format that does not need that flag, then 
it belongs to the format. E.g. bloom-filter-less formats do not need 
{{bloom_filter_fp_chance}}, and (even though they are not a feature of writing 
an SSTable) only {{BIG}} requires key cache options. Unless we are certain that 
CRC is the only way a format could defend against bit rot, {{crc_check_chance}} 
is also a format-specific property.

> Make sstable format configurable per table
> --
>
> Key: CASSANDRA-18534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18534
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Cluster/Schema, Local/SSTable
>Reporter: Branimir Lambov
>Assignee: Maxwell Guo
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Some SSTable format settings need to be configurable per table for better 
> efficiency. This includes:
>  - {{row_index_granularity}}
>  - {{bloom_filter_fp_chance}}
>  - {{crc_check_chance}}
>  - {{min/max_index_interval}}
> Some of these are currently configurable using direct properties of tables. 
> Having them as format properties makes better sense and should also support 
> specifying useable combinations of settings, e.g.
> {code:java}
> CREATE TABLE ... WITH sstable_format = "bti-fast";
> CREATE TABLE ... WITH sstable_format = "bti-small";
> {code}
> where {{bti-fast}} and {{bti-small}} can be defined in {{cassandra.yaml}} 
> e.g. as
> {code:java}
> sstable.format.options:
>   - bti-fast:
>       row_index_granularity: 1kiB
>       bloom_filter_fp_chance: 0.01
>   - bti-small:
>       row_index_granularity: 32kiB
>       bloom_filter_fp_chance: 0.1
> {code}



[jira] [Commented] (CASSANDRA-18872) Remove deprecated crc_check_chance in compression params

2023-10-09 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773438#comment-17773438
 ] 

Branimir Lambov commented on CASSANDRA-18872:
-

Have you looked at CASSANDRA-18534? Now that we have multiple SSTable formats, 
it makes a lot of sense to move properties like this into the format 
configuration, which in turn would mean passing a format configuration (instead 
of compression one) to the file handle builder.

> Remove deprecated crc_check_chance in compression params
> 
>
> Key: CASSANDRA-18872
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18872
> Project: Cassandra
>  Issue Type: Task
>  Components: Feature/Compression, Legacy/CQL
>Reporter: Stefan Miklosovic
>Assignee: Stefan Miklosovic
>Priority: Normal
> Fix For: 5.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> crc_check_chance was moved from compression parameters and it is a standalone 
> table parameter. This was done in times of 3.0 so it is now time to get rid 
> of that in 5.0.



[jira] [Commented] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files

2023-10-04 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771732#comment-17771732
 ] 

Branimir Lambov commented on CASSANDRA-18464:
-

To make the review easier, could you fork the {{apache/cassandra}} repository 
on github, push a branch with the changes to your fork on top of 
{{cassandra-5.0}}, and open a pull request against {{apache/cassandra-5.0}}?

My comments so far are these:

On [Config.java 
117|https://github.com/driftx/cassandra/commit/cb5bd169f0a9331f957f96a7318fee02a744e006#diff-e966f41bc2a418becfe687134ec8cf542eb051eead7fb4917e65a3a2e7c9bce3R117]:
{quote}
Using booleans makes it very unclear which options are actually valid, and what 
the alternative means. Please change the configuration to an enum, e.g. 
{{commit_log_access_mode}} with values {{direct_jna}}, {{direct}}, and {{mmap}}.
{quote}
{quote}
Actually, there should be only one direct option, and whether it uses NIO or 
JNA is an implementation detail that the users needn't care about.

The next question is whether or not non-direct should be supported at all, and 
I personally prefer to not support it as this adds configuration complexity for 
no expected benefit.

This also means that it makes sense to simply switch all other commit log 
segment types to be written direct, and this is simple enough to do in this 
ticket (especially since we dropped Java 8 and can use NIO's {{DIRECT}} option).
{quote}
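
To illustrate what the NIO route could look like, here is a minimal sketch, assuming Java 10 or later ({{ExtendedOpenOption.DIRECT}}, {{ByteBuffer.alignedSlice}} and {{FileStore.getBlockSize}} are JDK APIs). The class, enum and method names are hypothetical and not taken from the patch, and the single enum only mirrors the suggestion above of one user-facing option instead of several booleans.

```java
import com.sun.nio.file.ExtendedOpenOption;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustrative only: names are hypothetical and not taken from the patch. */
final class DirectWriteSketch
{
    /** One user-facing option instead of several booleans. */
    enum CommitLogAccessMode { direct, mmap }

    /** Rounds length up to the next multiple of blockSize, as direct I/O requires. */
    static int paddedLength(int length, int blockSize)
    {
        return Math.toIntExact(((long) length + blockSize - 1) / blockSize * blockSize);
    }

    /** Writes payload at offset 0 of path, zero-padded to a whole number of blocks. */
    static int writeDirect(Path path, byte[] payload) throws IOException
    {
        try (FileChannel channel = FileChannel.open(path,
                                                    StandardOpenOption.CREATE,
                                                    StandardOpenOption.WRITE,
                                                    ExtendedOpenOption.DIRECT))
        {
            // The file exists at this point, so its file store can be queried.
            int blockSize = Math.toIntExact(Files.getFileStore(path).getBlockSize());
            int padded = paddedLength(payload.length, blockSize);
            // Over-allocate so that an aligned slice of the needed size always exists.
            ByteBuffer buffer = ByteBuffer.allocateDirect(padded + blockSize - 1)
                                          .alignedSlice(blockSize);
            buffer.put(payload);              // the rest of the buffer stays zeroed
            buffer.position(0).limit(padded); // address, offset and length are all aligned
            return channel.write(buffer);
        }
    }
}
```

Not every file system accepts direct I/O, so opening or writing the file may throw an {{IOException}}; a real implementation would need to decide how to handle that.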

On [Config.java 
517|https://github.com/driftx/cassandra/commit/cb5bd169f0a9331f957f96a7318fee02a744e006#diff-e966f41bc2a418becfe687134ec8cf542eb051eead7fb4917e65a3a2e7c9bce3R517]:
{quote}
When would someone need to change this?
{quote}


> Enable Direct I/O For CommitLog Files
> -
>
> Key: CASSANDRA-18464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Commit Log
>Reporter: Josh McKenzie
>Assignee: Amit Pawar
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
> Attachments: CommitLogStressTest.patch, 
> EnableDirectIOForCommitLogUsingNativeAPI.patch, 
> PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, 
> UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png
>
>
> Relocating from [dev@ email 
> thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg]
>  
> I shared my investigation about Commitlog I/O issue on large core count 
> system in my previous email dated July-22 and link to the thread is given 
> below.
> [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n]
> Basically, two solutions looked possible to improve the CommitLog I/O.
>  # Multi-threaded syncing
>  # Using Direct-IO through JNA
> I worked on 2nd option considering the following benefit compared to the 
> first one
>  # Direct I/O read/write throughput is very high compared to non-Direct I/O. 
> Learnt through FIO benchmarking.
>  # Reduces kernel file cache uses which in-turn reduces kernel I/O activity 
> for Commitlog files only.
>  # Overall CPU usage reduced for flush activity. JVisualvm shows CPU usage < 
> 30% for Commitlog syncer thread with Direct I/O feature
>  # Direct I/O implementation is easier compared to multi-threaded
> As per the community suggestion, lower code complexity is preferable. Direct 
> I/O enablement looked promising, but there was one issue: 
> Java 8 does not have native support for Direct I/O, so using the JNA 
> library is a must. The same implementation should also work across other 
> versions of Java (11 and beyond).
> I have completed Direct I/O implementation and summary of the attached patch 
> changes are given below.
>  # This implementation is not using Java file channels and file is opened 
> through JNA to use Direct I/O feature.
>  # New segment types are defined: “DirectIOSegment” for Direct I/O and 
> “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is for test 
> purposes only).
>  # JNA write call is used to flush the changes.
>  # New helper functions are defined in NativeLibrary.java and platform 
> specific file. Currently tested on Linux only.
>  # Patch allows user to configure optimum block size  and alignment if 
> default values are not OK for CommitLog disk.
>  # Following configuration options are provided in Cassandra.yaml file
> a. use_jna_for_commitlog_io : to use jna feature
> b. use_direct_io_for_commitlog : to use Direct I/O feature.
> c. direct_io_minimum_block_alignment: 512 (default)
> d. nvme_disk_block_size: 32MiB (default and can be changed as per the 
> required size)
>  Test matrix is complex so CommitLog related testcases and TPCx-IOT benchmark 
> was tested. It works with both Java 8 and 11 versions. Compressed and 
> Encrypted based segments ar

[jira] [Commented] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files

2023-10-03 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771432#comment-17771432
 ] 

Branimir Lambov commented on CASSANDRA-18464:
-

There was a typo in my response above, I am in favour of having the patch land 
in 5.0.

Just the 512 vs 4k difference is not something I would personally consider a 
good reason to include the JNA writing; the sync segments are usually much 
larger than that. I would rather go with the simpler NIO option. 

I can't find my code comments with the link above any more. They are 
[here|https://github.com/driftx/cassandra/commit/cb5bd169f0a9331f957f96a7318fee02a744e006#r128716588].

> Enable Direct I/O For CommitLog Files
> -
>
> Key: CASSANDRA-18464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Commit Log
>Reporter: Josh McKenzie
>Assignee: Amit Pawar
>Priority: Normal
> Fix For: 5.0.x, 5.x
>
> Attachments: CommitLogStressTest.patch, 
> EnableDirectIOForCommitLogUsingNativeAPI.patch, 
> PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, 
> UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png
>
>
> Relocating from [dev@ email 
> thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg]
>  
> I shared my investigation about Commitlog I/O issue on large core count 
> system in my previous email dated July-22 and link to the thread is given 
> below.
> [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n]
> Basically, two solutions looked possible to improve the CommitLog I/O.
>  # Multi-threaded syncing
>  # Using Direct-IO through JNA
> I worked on 2nd option considering the following benefit compared to the 
> first one
>  # Direct I/O read/write throughput is very high compared to non-Direct I/O. 
> Learnt through FIO benchmarking.
>  # Reduces kernel file cache uses which in-turn reduces kernel I/O activity 
> for Commitlog files only.
>  # Overall CPU usage reduced for flush activity. JVisualvm shows CPU usage < 
> 30% for Commitlog syncer thread with Direct I/O feature
>  # Direct I/O implementation is easier compared to multi-threaded
> As per the community suggestion, lower code complexity is preferable. Direct 
> I/O enablement looked promising, but there was one issue: 
> Java 8 does not have native support for Direct I/O, so using the JNA 
> library is a must. The same implementation should also work across other 
> versions of Java (11 and beyond).
> I have completed Direct I/O implementation and summary of the attached patch 
> changes are given below.
>  # This implementation is not using Java file channels and file is opened 
> through JNA to use Direct I/O feature.
>  # New segment types are defined: “DirectIOSegment” for Direct I/O and 
> “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is for test 
> purposes only).
>  # JNA write call is used to flush the changes.
>  # New helper functions are defined in NativeLibrary.java and platform 
> specific file. Currently tested on Linux only.
>  # Patch allows user to configure optimum block size  and alignment if 
> default values are not OK for CommitLog disk.
>  # Following configuration options are provided in Cassandra.yaml file
> a. use_jna_for_commitlog_io : to use jna feature
> b. use_direct_io_for_commitlog : to use Direct I/O feature.
> c. direct_io_minimum_block_alignment: 512 (default)
> d. nvme_disk_block_size: 32MiB (default and can be changed as per the 
> required size)
>  The test matrix is complex, so only the CommitLog-related test cases and the 
> TPCx-IOT benchmark were run. It works with both Java 8 and 11. Compressed and 
> encrypted segments are not supported yet; they can be enabled later 
> based on community feedback.
>  The following improvements are seen with Direct I/O enabled:
>  # 32 cores >= ~15%
>  # 64 cores >= ~80%
>  Also, another observation would like to share here. Reading Commitlog files 
> with Direct I/O might help in reducing node bring-up time after the node 
> crash.
>  Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07
>  The attached patch enables Direct I/O feature for Commitlog files. Please 
> check and share your feedback.



[jira] [Created] (CASSANDRA-18894) Drop commitlog chain marker updates

2023-09-29 Thread Branimir Lambov (Jira)
Branimir Lambov created CASSANDRA-18894:
---

 Summary: Drop commitlog chain marker updates
 Key: CASSANDRA-18894
 URL: https://issues.apache.org/jira/browse/CASSANDRA-18894
 Project: Cassandra
  Issue Type: Improvement
  Components: Local/Commit Log
Reporter: Branimir Lambov


CASSANDRA-13987 added a periodic update of the last commit log chain marker in 
order to allow for data in memory-mapped segments to be recovered even if it 
was not part of a synced segment.

A much simpler way to do this is something in the vein of CASSANDRA-16482, i.e. 
ignoring an empty sync marker for the last entry in the commit log. We could do 
this by default if the commit log is uncompressed (and possibly only if using 
memory mapping after CASSANDRA-18464).






[jira] [Updated] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files

2023-09-29 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18464:

Reviewers: Branimir Lambov
   Status: Review In Progress  (was: Patch Available)

> Enable Direct I/O For CommitLog Files
> -
>
> Key: CASSANDRA-18464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Commit Log
>Reporter: Josh McKenzie
>Assignee: Amit Pawar
>Priority: Normal
> Fix For: 5.x
>
> Attachments: CommitLogStressTest.patch, 
> EnableDirectIOForCommitLogUsingNativeAPI.patch, 
> PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, 
> UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png
>
>
> Relocating from [dev@ email 
> thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg]
>  
> I shared my investigation about Commitlog I/O issue on large core count 
> system in my previous email dated July-22 and link to the thread is given 
> below.
> [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n]
> Basically, two solutions looked possible to improve the CommitLog I/O.
>  # Multi-threaded syncing
>  # Using Direct-IO through JNA
> I worked on 2nd option considering the following benefit compared to the 
> first one
>  # Direct I/O read/write throughput is very high compared to non-Direct I/O. 
> Learnt through FIO benchmarking.
>  # Reduces kernel file cache uses which in-turn reduces kernel I/O activity 
> for Commitlog files only.
>  # Overall CPU usage reduced for flush activity. JVisualvm shows CPU usage < 
> 30% for Commitlog syncer thread with Direct I/O feature
>  # Direct I/O implementation is easier compared to multi-threaded
> As per the community's suggestion, lower code complexity is preferable. Direct 
> I/O enablement looked promising, but there was one issue: 
> Java 8 does not have native support for Direct I/O, so using the JNA 
> library is a must. The same implementation should also work across other 
> versions of Java (like 11 and beyond).
> I have completed Direct I/O implementation and summary of the attached patch 
> changes are given below.
>  # This implementation is not using Java file channels and file is opened 
> through JNA to use Direct I/O feature.
>  # New Segment are defined named “DirectIOSegment”  for Direct I/O and 
> “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is test purpose 
> only).
>  # JNA write call is used to flush the changes.
>  # New helper functions are defined in NativeLibrary.java and platform 
> specific file. Currently tested on Linux only.
>  # Patch allows user to configure optimum block size  and alignment if 
> default values are not OK for CommitLog disk.
>  # Following configuration options are provided in Cassandra.yaml file
> a. use_jna_for_commitlog_io : to use jna feature
> b. use_direct_io_for_commitlog : to use Direct I/O feature.
> c. direct_io_minimum_block_alignment: 512 (default)
> d. nvme_disk_block_size: 32MiB (default and can be changed as per the 
> required size)
>  Test matrix is complex so CommitLog related testcases and TPCx-IOT benchmark 
> was tested. It works with both Java 8 and 11 versions. Compressed and 
> Encrypted based segments are not supported yet and it can be enabled later 
> based on the Community feedback.
>  Following improvement are seen with Direct I/O enablement.
>  # 32 cores >= ~15%
>  # 64 cores >= ~80%
>  Also, another observation would like to share here. Reading Commitlog files 
> with Direct I/O might help in reducing node bring-up time after the node 
> crash.
>  Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07
>  The attached patch enables Direct I/O feature for Commitlog files. Please 
> check and share your feedback.






[jira] [Commented] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files

2023-09-29 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770415#comment-17770415
 ] 

Branimir Lambov commented on CASSANDRA-18464:
-

This patch is very valuable, and I support it going into 5.0 as well as 5.1.

In separate tests we have often found a memory-mapped commit log to be a 
serious performance problem for a node with a lot of data. Even without DIRECT 
or JNA, not using `msync` is making a huge difference. Because of this most of 
the performance testing I personally do is done with compressed commit log.

I added comments to [the latest published 
branch|https://github.com/driftx/cassandra/tree/CASSANDRA-18464-trunk] with 
some suggested changes. I am curious, if the NIO option is constructed 
correctly (with aligned direct buffers, possibly also issuing the writes to be 
page-aligned and containing whole pages), is it still copying to internal 
buffers?
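
For reference on the NIO question, here is a minimal sketch of what "aligned direct buffers" holding whole pages could look like on Java 9 or later. The class and method names are hypothetical and are not taken from the patch, and the 4096-byte alignment in the usage note is an assumed value. The sketch only shows how to obtain a suitably aligned buffer; it does not answer whether the JDK still copies to an internal buffer.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not code from the CASSANDRA-18464 patch.
final class AlignedBuffers
{
    // Rounds size up to the next multiple of alignment (alignment must be a power of two).
    static int roundUp(int size, int alignment)
    {
        return (size + alignment - 1) & -alignment;
    }

    // Returns a direct buffer whose start address and capacity are both multiples of alignment.
    static ByteBuffer allocateAligned(int size, int alignment)
    {
        int padded = roundUp(size, alignment);
        // Over-allocate by one alignment unit so an aligned slice of the padded size always fits.
        ByteBuffer raw = ByteBuffer.allocateDirect(padded + alignment);
        ByteBuffer aligned = raw.alignedSlice(alignment);
        aligned.limit(padded);
        return aligned.slice();
    }
}
```

For example, `AlignedBuffers.allocateAligned(5000, 4096)` yields an 8192-byte buffer starting on a 4096-byte boundary. On JDK 10 or later such a buffer could then be written through a `FileChannel` opened with `ExtendedOpenOption.DIRECT`, provided the file position and length are also block-aligned.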

> Enable Direct I/O For CommitLog Files
> -
>
> Key: CASSANDRA-18464
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18464
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Local/Commit Log
>Reporter: Josh McKenzie
>Assignee: Amit Pawar
>Priority: Normal
> Fix For: 5.x
>
> Attachments: CommitLogStressTest.patch, 
> EnableDirectIOForCommitLogUsingNativeAPI.patch, 
> PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, 
> UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png
>
>
> Relocating from [dev@ email 
> thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg]
>  
> I shared my investigation about Commitlog I/O issue on large core count 
> system in my previous email dated July-22 and link to the thread is given 
> below.
> [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n]
> Basically, two solutions looked possible to improve the CommitLog I/O.
>  # Multi-threaded syncing
>  # Using Direct-IO through JNA
> I worked on 2nd option considering the following benefit compared to the 
> first one
>  # Direct I/O read/write throughput is very high compared to non-Direct I/O. 
> Learnt through FIO benchmarking.
>  # Reduces kernel file cache uses which in-turn reduces kernel I/O activity 
> for Commitlog files only.
>  # Overall CPU usage reduced for flush activity. JVisualvm shows CPU usage < 
> 30% for Commitlog syncer thread with Direct I/O feature
>  # Direct I/O implementation is easier compared to multi-threaded
> As per the community's suggestion, lower code complexity is preferable. Direct 
> I/O enablement looked promising, but there was one issue: 
> Java 8 does not have native support for Direct I/O, so using the JNA 
> library is a must. The same implementation should also work across other 
> versions of Java (like 11 and beyond).
> I have completed Direct I/O implementation and summary of the attached patch 
> changes are given below.
>  # This implementation is not using Java file channels and file is opened 
> through JNA to use Direct I/O feature.
>  # New Segment are defined named “DirectIOSegment”  for Direct I/O and 
> “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is test purpose 
> only).
>  # JNA write call is used to flush the changes.
>  # New helper functions are defined in NativeLibrary.java and platform 
> specific file. Currently tested on Linux only.
>  # Patch allows user to configure optimum block size  and alignment if 
> default values are not OK for CommitLog disk.
>  # Following configuration options are provided in Cassandra.yaml file
> a. use_jna_for_commitlog_io : to use jna feature
> b. use_direct_io_for_commitlog : to use Direct I/O feature.
> c. direct_io_minimum_block_alignment: 512 (default)
> d. nvme_disk_block_size: 32MiB (default and can be changed as per the 
> required size)
>  Test matrix is complex so CommitLog related testcases and TPCx-IOT benchmark 
> was tested. It works with both Java 8 and 11 versions. Compressed and 
> Encrypted based segments are not supported yet and it can be enabled later 
> based on the Community feedback.
>  Following improvement are seen with Direct I/O enablement.
>  # 32 cores >= ~15%
>  # 64 cores >= ~80%
>  Also, another observation would like to share here. Reading Commitlog files 
> with Direct I/O might help in reducing node bring-up time after the node 
> crash.
>  Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07
>  The attached patch enables Direct I/O feature for Commitlog files. Please 
> check and share your feedback.




[jira] [Commented] (CASSANDRA-18773) Compactions are slow

2023-09-26 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769063#comment-17769063
 ] 

Branimir Lambov commented on CASSANDRA-18773:
-

There's some leftover code in the trunk version; apart from that, the newer 
versions look good.

> Compactions are slow
> 
>
> Key: CASSANDRA-18773
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18773
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Cameron Zemek
>Assignee: Cameron Zemek
>Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Attachments: 18773.patch, compact-poc.patch, flamegraph.png, 
> stress.yaml
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> I have noticed that compactions involving a lot of sstables are very slow 
> (for example major compactions). I have attached a cassandra stress profile 
> that can generate such a dataset under ccm. In my local test I have 2567 
> sstables at 4Mb each.
> I added code to track the wall clock time of various parts of the code. One 
> problematic part is the ManyToOne constructor. Tracing through the code, a 
> ManyToOne is created over all the sstable iterators for every partition. In 
> my local test I get a measly 60Kb/sec read speed, bottlenecked on a single 
> CPU core (since this code is single threaded), with 85% of the wall clock 
> time spent in the ManyToOne constructor.
> As another data point showing that the merge iterator is the slow part: 
> the cfstats tool from [https://github.com/instaclustr/cassandra-sstable-tools/], 
> which reads all the sstables but does no merging, gets 26Mb/sec read speed.
> Tracking back from ManyToOne call I see this in 
> UnfilteredPartitionIterators::merge
> {code:java}
>                 for (int i = 0; i < toMerge.size(); i++)
>                 {
>                     if (toMerge.get(i) == null)
>                     {
>                         if (null == empty)
>                             empty = EmptyIterators.unfilteredRow(metadata, 
> partitionKey, isReverseOrder);
>                         toMerge.set(i, empty);
>                     }
>                 }
>  {code}
> I am not sure what the purpose of creating these empty rows is. But on a whim 
> I removed all these empty iterators before passing to ManyToOne and then all 
> the wall clock time shifted to CompactionIterator::hasNext() and read speed 
> increased to 1.5Mb/s.
> So it seems there are further bottlenecks in this code path, but the first is 
> this ManyToOne and having to build it for every partition read.
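
The experiment described above (dropping absent inputs before the merge is built) can be sketched roughly as follows. This is an illustration only: `MergeInputs` and `presentOnly` are hypothetical names, and the code uses plain `java.util` iterators rather than Cassandra's iterator types.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration; not Cassandra code.
final class MergeInputs
{
    // Keeps only the sources that actually have data for the current partition,
    // so the merge is built over far fewer inputs when most sources are absent.
    static <T> List<Iterator<T>> presentOnly(List<Iterator<T>> perSource)
    {
        List<Iterator<T>> present = new ArrayList<>();
        for (Iterator<T> source : perSource)
        {
            if (source != null)
                present.add(source);
        }
        return present;
    }
}
```

Note that filtering changes the position of each input in the list, so anything that relies on an input's index matching its source sstable (a merge listener, for instance, may do so) would need a different approach.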






[jira] [Commented] (CASSANDRA-18873) Fix broken JMH benchmarks

2023-09-25 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17768591#comment-17768591
 ] 

Branimir Lambov commented on CASSANDRA-18873:
-

{quote}
* ReadSmallPartitionsBench (assertion error)
* ReadWidePartitionsBench (assertion error)
{quote}
These two tests need larger memtable size allocation to produce useable output. 
One way to "fix" this is to replace {{INMEM}} with {{NO}} for the default 
{{flush}}, which will make it ignore the fact that part of the data is in an 
sstable; another is to reduce the default {{count}} by an order of magnitude.

Both of these changes would make the test less suitable for what it is 
primarily meant to measure (access time with a non-trivial data size in a 
single memtable/sstable).

> Fix broken JMH benchmarks
> -
>
> Key: CASSANDRA-18873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18873
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/benchmark
>Reporter: Jacek Lewandowski
>Priority: Normal
> Attachments: BenchTimeTest.java, 
> jmh-AtomicBtreePartitionUpdateBench.log, jmh-BloomFilterSerializerBench.log, 
> jmh-KeyLookupBench.log, jmh-ReadSmallPartitionsBench.log, 
> jmh-ReadWidePartitionsBench.log
>
>
> The following benchmarks are broken:
> * {{ZeroCopyStreamingBench}}
> * {{MutationBench}}
> * {{FastThreadLocalBench}}
> * {{AtomicBTreePartitionUpdateBench}} (OOM on Jenkins)
> * {{ReadSmallPartitionsBench}} (assertion error)
> * {{ReadWidePartitionsBench}} (assertion error)
> * {{BloomFilterSerializerBench}} (NPE)
> * {{KeyLookupBench}} (IAE)
> Additionally, those benchmarks take too much time to run:
> * {{BTreeUpdateBench}} ~ 58 hours
> * {{AtomicBTreePartitionUpdateBench}} ~ 5 hours
> * {{BTreeTransformBench}} ~ 2.5 hours
> Here the complete list of estimated benchmark times:
> {noformat}
> Estimated time for CacheLoaderBench: ~5 s
> Estimated time for LatencyTrackingBench: ~26 s
> Estimated time for SampleBench: ~30 s
> Estimated time for ReadWriteBench: ~30 s
> Estimated time for MutationBench: ~30 s
> Estimated time for CompactionBench: ~35 s
> Estimated time for DiagnosticEventPersistenceBench: ~40 s
> Estimated time for ZeroCopyStreamingBench: ~44 s
> Estimated time for BatchStatementBench: ~110 s
> Estimated time for DiagnosticEventServiceBench: ~120 s
> Estimated time for MessageOutBench: ~144 s
> Estimated time for BloomFilterSerializerBench: ~144 s
> Estimated time for FastThreadLocalBench: ~156 s
> Estimated time for HashingBench: ~156 s
> Estimated time for ChecksumBench: ~208 s
> Estimated time for StreamingTombstoneHistogramBuilderBench: ~208 s
> Estimated time for PendingRangesBench: ~ 5 m
> Estimated time for DirectorySizerBench: ~ 5 m
> Estimated time for instance.ReadSmallPartitionsBench: ~ 5 m
> Estimated time for PreaggregatedByteBufsBench: ~ 7 m
> Estimated time for AutoBoxingBench: ~ 8 m
> Estimated time for OutputStreamBench: ~ 13 m
> Estimated time for BTreeBuildBench: ~ 13 m
> Estimated time for StringsEncodeBench: ~ 20 m
> Estimated time for instance.ReadWidePartitionsBench: ~ 21 m
> Estimated time for btree.BTreeBuildBench: ~ 30 m
> Estimated time for BTreeSearchIteratorBench: ~ 31 m
> Estimated time for btree.BTreeTransformBench: ~ 138 m
> Estimated time for btree.AtomicBTreePartitionUpdateBench: ~ 288 m
> Estimated time for btree.BTreeUpdateBench: ~58 h
> Total estimated time: ~69 h
> {noformat}
> I'd like to add a test which estimates the benchmark times and fails if a 
> single benchmark estimated run time is longer than xxx minutes.
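
A rough sketch of how such an estimate and limit check might work is given below. It is not the attached BenchTimeTest.java: the class name, method names and the timing model are assumptions. The model multiplies forks by the per-fork warmup and measurement time, then by the number of benchmark methods and parameter combinations, and ignores setup and teardown costs.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; names and the timing model are assumptions.
final class BenchTimeEstimate
{
    // Estimated run time in seconds for one benchmark class.
    static long estimateSeconds(int methods, int paramCombinations, int forks,
                                int warmupIterations, long warmupSeconds,
                                int measurementIterations, long measurementSeconds)
    {
        long perFork = warmupIterations * warmupSeconds
                     + measurementIterations * measurementSeconds;
        return (long) methods * paramCombinations * forks * perFork;
    }

    // Fails if the estimate for a single benchmark exceeds the given limit.
    static void checkWithinLimit(String benchmark, long estimatedSeconds, long limitMinutes)
    {
        if (estimatedSeconds > TimeUnit.MINUTES.toSeconds(limitMinutes))
            throw new AssertionError(benchmark + " is estimated at ~" + estimatedSeconds
                                     + " s, above the limit of " + limitMinutes + " m");
    }
}
```

Under this model, a benchmark with 2 methods, 3 parameter combinations, 1 fork, 5 warmup iterations of 1 s and 5 measurement iterations of 2 s is estimated at 90 s.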






[jira] [Commented] (CASSANDRA-18533) Move format-specific sstable options into the format configuration

2023-09-21 Thread Branimir Lambov (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767594#comment-17767594
 ] 

Branimir Lambov commented on CASSANDRA-18533:
-

Absolutely.

> Move format-specific sstable options into the format configuration
> --
>
> Key: CASSANDRA-18533
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18533
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Branimir Lambov
>Priority: Normal
>
> This mainly concerns cassandra yaml settings:
> - {{column_index_size}}, which should also be renamed to 
> {{row_index_granularity}}
> - {{column_index_cache_size}}
> - {{index_summary_capacity}}
> - {{index_summary_resize_interval}}
> and possibly
> - {{key_cache_size}}, {{key_cache_keys_to_save}}, {{key_cache_save_period}}, 
> {{key_cache_migrate_during_compaction}}
> - {{sstable_preemptive_open_interval}}
> Existing settings should be deprecated but still picked up if defined.
> At this point we will not consider table-level options that make better sense 
> as format parameters ({{min/max_index_interval}}, {{bloom_filter_fp_chance}}, 
> {{crc_check_chance}} and possibly {{compression}}), because we do not yet 
> support per-table format selection/configuration.
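
The "deprecated but still picked up" behaviour could be sketched along these lines. This is only an illustration under stated assumptions: `FormatOptionResolver` is a hypothetical class, the settings are modelled as plain string maps, and the real configuration loading in Cassandra is structured differently.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; not the actual Cassandra configuration code.
final class FormatOptionResolver
{
    // Prefers the value from the format configuration and falls back to the
    // deprecated top-level yaml setting if only that one is defined.
    static Optional<String> resolve(Map<String, String> formatOptions, String newName,
                                    Map<String, String> legacyYaml, String legacyName)
    {
        String value = formatOptions.get(newName);
        if (value != null)
            return Optional.of(value);

        String legacy = legacyYaml.get(legacyName);
        if (legacy != null)
        {
            System.err.println("Setting " + legacyName + " is deprecated; use the format option "
                               + newName + " instead");
            return Optional.of(legacy);
        }
        return Optional.empty();
    }
}
```

With this shape, a yaml file that still defines only {{column_index_size}} keeps working, while a value given as {{row_index_granularity}} in the format configuration takes precedence.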






[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

  Fix Version/s: 3.11.17
 4.0.12
 4.1.4
 5.0-alpha2
 5.1
Source Control Link: https://github.com/apache/cassandra/pull/2656
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

Committed 
([3.11|https://github.com/apache/cassandra/commit/87c2af85c1305c130af7d66f83dec03a1c4a8bb2]
 
[4.0|https://github.com/apache/cassandra/commit/c6385ac3ddccabdc7cb650b090fa69c0523274e8]
 
[4.1|https://github.com/apache/cassandra/commit/db6641fbb6fd0c439e14f94caecdeee999311c62]
 
[5.0|https://github.com/apache/cassandra/commit/a23f4c0b15c684240ef0bcd55875610e8bd7179b]
 
[trunk|https://github.com/apache/cassandra/commit/970ec2d1db5770c13a42e1f2862ea398317d0f15])

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps 
> overlaping SSTable references
> --
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
> Fix For: 3.11.17, 4.0.12, 4.1.4, 5.0-alpha2, 5.1
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not 
> create or maintain an iterator of overlapping sstables. However, because 
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and 
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap 
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to lots of and 
> likely _all_ other SSTables on the node and thus delays the deletion of 
> obsolete ones by hours or even days.
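
The initialization-order problem described above can be reproduced in a few lines. The classes below are simplified stand-ins with hypothetical names, not the actual {{CompactionController}} code: the base constructor consults an overridable method before the subclass field has been assigned, so it sees the default value {{false}}.

```java
// Simplified, hypothetical stand-ins for illustration; not the Cassandra classes.
final class ConstructorOrderDemo
{
    static class BaseController
    {
        final boolean builtOverlapIterator;

        BaseController()
        {
            // Runs before any subclass field is assigned.
            builtOverlapIterator = !ignoreOverlaps();
        }

        boolean ignoreOverlaps()
        {
            return false;
        }
    }

    static class AggressiveController extends BaseController
    {
        private final boolean ignoreOverlaps;

        AggressiveController(boolean ignoreOverlaps)
        {
            super();
            // Too late: the base constructor has already read the default (false).
            this.ignoreOverlaps = ignoreOverlaps;
        }

        @Override
        boolean ignoreOverlaps()
        {
            return ignoreOverlaps;
        }
    }
}
```

Here `new ConstructorOrderDemo.AggressiveController(true)` ends up with `builtOverlapIterator == true` even though `ignoreOverlaps()` returns `true` afterwards. One common way to avoid this pattern is to pass the flag to the base constructor as an argument instead of reading it through an overridden method.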






[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Status: Review In Progress  (was: Needs Committer)

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps 
> overlaping SSTable references
> --
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not 
> create or maintain an iterator of overlapping sstables. However, because 
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and 
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap 
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to lots of and 
> likely _all_ other SSTables on the node and thus delays the deletion of 
> obsolete ones by hours or even days.






[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Status: Ready to Commit  (was: Review In Progress)

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps 
> overlaping SSTable references
> --
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not 
> create or maintain an iterator of overlapping sstables. However, because 
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and 
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap 
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to lots of and 
> likely _all_ other SSTables on the node and thus delays the deletion of 
> obsolete ones by hours or even days.






[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Status: Needs Committer  (was: Patch Available)

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps 
> overlaping SSTable references
> --
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not 
> create or maintain an iterator of overlapping sstables. However, because 
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and 
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap 
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to lots of and 
> likely _all_ other SSTables on the node and thus delays the deletion of 
> obsolete ones by hours or even days.






[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Status: Patch Available  (was: Requires Testing)

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps 
> overlaping SSTable references
> --
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not 
> create or maintain an iterator of overlapping sstables. However, because 
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and 
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap 
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to lots of and 
> likely _all_ other SSTables on the node and thus delays the deletion of 
> obsolete ones by hours or even days.






[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Status: Requires Testing  (was: Review In Progress)

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps 
> overlaping SSTable references
> --
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not 
> create or maintain an iterator of overlapping sstables. However, because 
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and 
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap 
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to lots of and 
> likely _all_ other SSTables on the node and thus delays the deletion of 
> obsolete ones by hours or even days.






[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Reviewers: Branimir Lambov, Michael Semb Wever  (was: Michael Semb Wever)

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps 
> overlaping SSTable references
> --
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not 
> create or maintain an iterator of overlapping sstables. However, because 
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and 
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap 
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to lots of and 
> likely _all_ other SSTables on the node and thus delays the deletion of 
> obsolete ones by hours or even days.






[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Reviewers: Branimir Lambov, Michael Semb Wever, Branimir Lambov  (was: 
Branimir Lambov, Michael Semb Wever)
   Status: Review In Progress  (was: Patch Available)

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps 
> overlaping SSTable references
> --
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not 
> create or maintain an iterator of overlapping sstables. However, because 
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and 
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap 
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to lots of and 
> likely _all_ other SSTables on the node and thus delays the deletion of 
> obsolete ones by hours or even days.






[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references

2023-09-21 Thread Branimir Lambov (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Branimir Lambov updated CASSANDRA-18756:

Test and Documentation Plan: CI
 Status: Patch Available  (was: In Progress)

> TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps 
> overlaping SSTable references
> --
>
> Key: CASSANDRA-18756
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18756
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Branimir Lambov
>Assignee: Ethan Brown
>Priority: Normal
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not 
> create or maintain an iterator of overlapping sstables. However, because 
> {{TimeWindowCompactionController}} inherits from {{CompactionController}} and 
> only sets {{ignoreOverlaps}} after the base class has constructed the overlap 
> iterator, it ends up making an overlap iterator and then never updating it.
> The end result is that such a compaction keeps references to lots of and 
> likely _all_ other SSTables on the node and thus delays the deletion of 
> obsolete ones by hours or even days.





