[jira] [Updated] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19945: Authors: Branimir Lambov, Michael Marshall (was: Branimir Lambov) Resolution: Fixed Status: Resolved (was: Ready to Commit) Committed as [377e6aa04fb67ea4220445988e85c9ebacb06db4|https://github.com/apache/cassandra/commit/377e6aa04fb67ea4220445988e85c9ebacb06db4]. > Reverse cursor and iteration support for Trie based memtables > - > > Key: CASSANDRA-19945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19945 > Project: Cassandra > Issue Type: Improvement > Components: Local/Memtable, Local/SSTable >Reporter: Ariel Weisberg >Assignee: Branimir Lambov >Priority: Normal > Fix For: 5.x > > > Cherry- pick > [https://github.com/datastax/cassandra/commit/196b931c677829d681406f14cf1da814ff5a6624] > For Accord in particular this is useful to avoid flushing memtables that > don't intersect with the range that is going to start having metadata GCed so > we can flush less frequently/later. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19945: Status: Needs Committer (was: Patch Available)
[jira] [Updated] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19945: Status: Ready to Commit (was: Review In Progress)
[jira] [Updated] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19945: Status: Review In Progress (was: Needs Committer)
[jira] [Commented] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17885325#comment-17885325 ] Branimir Lambov commented on CASSANDRA-19945: - {{ByteSourceComparisonTest}} checks this for selected examples (see [{{maybeAssertNotPrefix}}|https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/utils/bytecomparable/ByteSourceComparisonTest.java#L865] as well as {{maybeCheck41Properties}} (renamed to 50 now)). This is actually one of those things that you can't really check by tests; [{{ByteComparable.md}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/bytecomparable/ByteComparable.md] has sections that explain and prove the properties for every one of the types in use.
[jira] [Updated] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19945: Source Control Link: https://github.com/apache/cassandra/pull/3571
[jira] [Updated] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19945: Test and Documentation Plan: Unit tests Status: Patch Available (was: Open)
[jira] [Updated] (CASSANDRA-19945) Reverse cursor and iteration support for Trie based memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-19945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19945: Change Category: Performance Complexity: Normal Reviewers: Ariel Weisberg Status: Open (was: Triage Needed)
[jira] [Commented] (CASSANDRA-19785) Possible memory leak in BTree.FastBuilder
[ https://issues.apache.org/jira/browse/CASSANDRA-19785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882059#comment-17882059 ] Branimir Lambov commented on CASSANDRA-19785: - The pull request already has my approval. > Possible memory leak in BTree.FastBuilder > -- > > Key: CASSANDRA-19785 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19785 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Core >Reporter: Paul Chandler >Priority: Normal > Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x > > Attachments: image-2024-07-19-08-44-56-714.png, > image-2024-07-19-08-45-17-289.png, image-2024-07-19-08-45-33-933.png, > image-2024-07-19-08-45-50-383.png, image-2024-07-19-08-46-06-919.png, > image-2024-07-19-08-46-42-979.png, image-2024-07-19-08-46-56-594.png, > image-2024-07-19-08-47-19-517.png, image-2024-07-19-08-47-34-582.png > > Time Spent: 10m > Remaining Estimate: 0h > > We are having a problem with the heap growing in size, This is a large > cluster > 1,000 nodes across a large number of dc’s. This is running version > 4.0.11. > > Each node has a 32GB heap, and the amount used continues to grow until it > reaches 30GB, it then struggles with multiple Full GC pauses, as can be seen > here: > !image-2024-07-19-08-44-56-714.png! > We took 2 heap dumps on one node a few days after it was restarted, and the > heap had grown by 2.7GB > > 9{^}th{^} July > !image-2024-07-19-08-45-17-289.png! > 11{^}th{^} July > !image-2024-07-19-08-45-33-933.png! > This can be seen as mainly an increase of memory used by > FastThreadLocalThread, increasing from 5.92GB to 8.53GB > !image-2024-07-19-08-45-50-383.png! > !image-2024-07-19-08-46-06-919.png! > Looking deeper into this it can be seen that the growing heap is contained > within the threads for the MutationStage, Native-transport-Requests, > ReadStage etc. We would expect the memory used within these threads to be > short lived, and not grow as time goes on. 
We recently increased the size of > these thread pools, and that has increased the size of the problem. > > Top memory usage for FastThreadLocalThread > 9{^}th{^} July > !image-2024-07-19-08-46-42-979.png! > 11{^}th{^} July > !image-2024-07-19-08-46-56-594.png! > This has led us to investigate whether there could be a memory leak, and we > have found the following issues within the retained references in > BTree.FastBuilder objects. The issue appears to stem from the reset() method, > which does not properly clear all buffers. We are not really sure how the > BTree.FastBuilder works, but this is our analysis of where a leak might > occur. > > Specifically: > Leaf Buffer Not Being Cleared: > When leaf().count is 0, the statement Arrays.fill(leaf().buffer, 0, > leaf().count, null); does not clear the buffer because the end index is 0. > This leaves the buffer with references to potentially large objects, > preventing garbage collection and increasing heap usage. > Branch inUse Property: > If the inUse property of the branch is set to false elsewhere in the code, > the while loop while (branch != null && branch.inUse) does not execute, > resulting in uncleared branch buffers and retained references. > > This is based on the following observations: > Heap Dumps: Analysis of heap dumps shows that leaf().count is often 0, > and as a result, the buffer is not being cleared, leading to high heap > utilization. > !image-2024-07-19-08-47-19-517.png! > Remote Debugging: Debugging sessions indicate that the drain() method > sets count to 0, and the inUse flag for the parent branch is set to false, > preventing the while loop in reset() from clearing the branch buffers. > !image-2024-07-19-08-47-34-582.png! >
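The leaf-buffer observation in the report can be reproduced in isolation. The following is a minimal sketch, assuming a hypothetical {{Leaf}} class that only mimics the pattern described (a reusable buffer plus a {{count}} that a drain step sets to 0); it is not Cassandra's {{BTree.FastBuilder}}, and the method names {{drain}}, {{resetByCount}} and {{resetAll}} are invented for illustration. It shows that {{Arrays.fill(buffer, 0, count, null)}} clears nothing once {{count}} is 0, so earlier references stay reachable.

```java
import java.util.Arrays;

public class ResetSketch {
    /** Hypothetical stand-in for a reusable leaf buffer; not Cassandra's BTree.FastBuilder. */
    static final class Leaf {
        final Object[] buffer = new Object[4];
        int count;

        void add(Object o) { buffer[count++] = o; }

        /** Mimics a drain step that resets the count but leaves the slots populated. */
        void drain() { count = 0; }

        /** Clears only the range [0, count); does nothing when count is already 0. */
        void resetByCount() { Arrays.fill(buffer, 0, count, null); count = 0; }

        /** Clears every slot regardless of count. */
        void resetAll() { Arrays.fill(buffer, null); count = 0; }

        /** Number of slots that still hold a reference. */
        long retained() { return Arrays.stream(buffer).filter(o -> o != null).count(); }
    }

    public static void main(String[] args) {
        Leaf leaf = new Leaf();
        leaf.add(new byte[1024]);
        leaf.add(new byte[1024]);
        leaf.drain();          // count is now 0, slots still populated
        leaf.resetByCount();   // fills an empty range, so nothing is cleared
        System.out.println("retained after count-based reset: " + leaf.retained());
        leaf.resetAll();       // clears the whole buffer
        System.out.println("retained after full reset: " + leaf.retained());
    }
}
```

Whether clearing the whole buffer, or tracking the highest index ever written, is the right remedy for the real builder is a separate question that this sketch does not answer.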
[jira] [Updated] (CASSANDRA-17298) Test Failure: org.apache.cassandra.cql3.MemtableSizeTest
[ https://issues.apache.org/jira/browse/CASSANDRA-17298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-17298: Status: Needs Committer (was: Review In Progress) > Test Failure: org.apache.cassandra.cql3.MemtableSizeTest > > > Key: CASSANDRA-17298 > URL: https://issues.apache.org/jira/browse/CASSANDRA-17298 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Josh McKenzie >Assignee: Dmitry Konstantinov >Priority: Normal > Fix For: 4.0.x, 4.1.x > > Attachments: analyzed_objects.svg, structure_example.svg > > Time Spent: 20m > Remaining Estimate: 0h > > [https://ci-cassandra.apache.org/job/Cassandra-4.0/313/testReport/org.apache.cassandra.cql3/MemtableSizeTest/testTruncationReleasesLogSpace_2/] > Failed 4 times in the last 30 runs. Flakiness: 27%, Stability: 86% > Error Message > Expected heap usage close to 49.930MiB, got 41.542MiB. > {code} > Stacktrace > junit.framework.AssertionFailedError: Expected heap usage close to 49.930MiB, > got 41.542MiB. > at > org.apache.cassandra.cql3.MemtableSizeTest.testSize(MemtableSizeTest.java:130) > at org.apache.cassandra.Util.runCatchingAssertionError(Util.java:644) > at org.apache.cassandra.Util.flakyTest(Util.java:669) > at > org.apache.cassandra.cql3.MemtableSizeTest.testTruncationReleasesLogSpace(MemtableSizeTest.java:61) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code} > *UPDATE:* It was discovered that unit tests were running with > memtable_allocation_type: offheap_objects when we ship C* with heap_buffers. > So we changed that in CASSANDRA-19326, now we test with > memtable_allocation_type: heap_buffers. As a result, this test now fails all > the time on 4.0 and 4.1. 
[jira] [Updated] (CASSANDRA-17298) Test Failure: org.apache.cassandra.cql3.MemtableSizeTest
[ https://issues.apache.org/jira/browse/CASSANDRA-17298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-17298: Reviewers: Branimir Lambov, Branimir Lambov Branimir Lambov, Branimir Lambov (was: Branimir Lambov) Status: Review In Progress (was: Patch Available)
[jira] [Commented] (CASSANDRA-17298) Test Failure: org.apache.cassandra.cql3.MemtableSizeTest
[ https://issues.apache.org/jira/browse/CASSANDRA-17298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17878003#comment-17878003 ] Branimir Lambov commented on CASSANDRA-17298: - +1 Both patches look good to me.
[jira] [Commented] (CASSANDRA-17298) Test Failure: org.apache.cassandra.cql3.MemtableSizeTest
[ https://issues.apache.org/jira/browse/CASSANDRA-17298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17877296#comment-17877296 ] Branimir Lambov commented on CASSANDRA-17298: - Thank you for the very detailed investigation. This has been a source of annoyance, with many attempts to fix it since the test was first introduced, but the test is necessary: before it existed, we had reached about a 2x difference between the memtable's understanding of its on-heap size and what it actually used. One consideration we've previously had around this is that adjusting the memory usage reporting may cause memtables to flush earlier and change the behavior of existing clusters too much for a patch release. In this case it looks like the difference is on the order of 10%, which I personally would not see as a problem. I wonder if we shouldn't backport most of that 5.0 patch so that we start testing all allocation strategies in 4.x as well. Also, am I understanding correctly that there is also something (EMPTY_LEAF?) that we are not tracking correctly in 5.0? On the open question, there may be a reason to use both {{BTree.sizeOnHeapOf}} and {{BTree.sizeOfStructureOnHeap}} (e.g. some BTree-building methods always share the size map, others never, and if the caller knows which one it is, it could choose between the two). From a quick glance it looks like we use the latter version for {{Columns}}, and these are built from sorted using shared size maps, thus this appears to be the right thing to do. However, the names of the two methods should reflect this difference, and it should also be explained in the Javadoc.
[jira] [Commented] (CASSANDRA-19779) direct IO support is always evaluated to false upon the very first start of a node
[ https://issues.apache.org/jira/browse/CASSANDRA-19779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17869890#comment-17869890 ] Branimir Lambov commented on CASSANDRA-19779: - Unsupported {{FileUtils.getBlockSize}} means unsupported direct I/O as well, doesn't it? If not, I wonder if it makes sense to use a default block size of 4k instead of failing. > direct IO support is always evaluated to false upon the very first start of a > node > -- > > Key: CASSANDRA-19779 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19779 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Tools >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 5.0-rc > > Time Spent: 40m > Remaining Estimate: 0h > > When I extract the distribution tarball and I want to use tools in tools/bin, > there is this warn log visible every time for tools when they are started > (does not happen on "help" command, obviously) > {code:java} > WARN 14:25:11,835 Unable to determine block size for commit log directory: > null {code} > This is because we introduced this (1) in CASSANDRA-18464 > What that does is that it will go and try to create a temporary file in > commit log directory to get "block size" for a "file store" that file is in. > The problem with that is that when we just extract a tarball and run the > tools - Cassandra was never started - then such commit log directory does not > exist yet, so it tries to create a temporary file in a non-existing > directory, which fails, hence the log message. > The fix is to check if commitlog dir exists and return / skip the resolution > of block size if it does not. > Another approach might be to check if this is executed in the context of a > tool and skip it from resolution altogether. The problem with this is that > not all tools we have in bin/log call DatabaseDescriptor. > toolInitialization() so we might combine these two. 
> (1) > [https://github.com/apache/cassandra/blob/cassandra-5.0/src/java/org/apache/cassandra/config/DatabaseDescriptor.java#L1455-L1462]
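The two ideas discussed in this thread (skip the lookup when the commit log directory does not exist yet, and fall back to a 4k default instead of failing) can be sketched together. This is only an illustration under stated assumptions: the class and method names are hypothetical, the 4096 default is the value floated in the comment rather than an agreed choice, and it queries the JDK's {{FileStore.getBlockSize()}} (Java 10 and later) directly instead of going through Cassandra's {{FileUtils.getBlockSize}}, which the issue describes as working via a temporary file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BlockSizeSketch {
    /** Assumption: the 4k default suggested in the comment above. */
    static final long DEFAULT_BLOCK_SIZE = 4096;

    /**
     * Returns the block size of the file store holding dir, or the default
     * when the directory is missing or the file store cannot report one.
     */
    static long blockSizeOrDefault(Path dir) {
        // A freshly extracted tarball has no commit log directory yet: skip the lookup.
        if (dir == null || !Files.isDirectory(dir))
            return DEFAULT_BLOCK_SIZE;
        try {
            return Files.getFileStore(dir).getBlockSize();
        } catch (IOException | UnsupportedOperationException e) {
            // The file store does not expose a block size; use the default instead of failing.
            return DEFAULT_BLOCK_SIZE;
        }
    }

    public static void main(String[] args) {
        System.out.println(blockSizeOrDefault(Paths.get("no-such-commitlog-dir")));
        System.out.println(blockSizeOrDefault(Paths.get(".")));
    }
}
```

Note that a default block size does not by itself establish that direct I/O is usable, which is the question the comment raises.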
[jira] [Commented] (CASSANDRA-19764) Corruption can occur while a field is being added to UDT clustering key
[ https://issues.apache.org/jira/browse/CASSANDRA-19764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865472#comment-17865472 ] Branimir Lambov commented on CASSANDRA-19764: - {quote}the write will timeout at QUORUM consistency, as expected. {quote} In other words, TCM makes it practically impossible to run into the situation this test is meant to exercise? Coming back to {quote}it seems like a bad idea to allow altering UDTs once they are part of a primary key... {quote} Adding a new field to a UDT key is actually okay, we treat old values as shorter and can correctly order them, as long as we know all the types. But there may be a short amount of time where a replica does not yet know the type of the added field (likely only really a thing before TCM). If it then accepts a write without knowing the types as it currently does, it can corrupt itself. It makes sense to just reject this write, even more so if TCM or something else prevents schemas from going out of sync altogether. > Corruption can occur while a field is being added to UDT clustering key > --- > > Key: CASSANDRA-19764 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19764 > Project: Cassandra > Issue Type: Bug > Components: Feature/UDT >Reporter: Branimir Lambov >Priority: Normal > > CASSANDRA-15938 made some improvements in how unknown components in UDTs are > treated. Unfortunately this can cause corruption as soon as more than one > value is inserted for a partition. > The problem can be easily shown by modifying the > {{FrozenUDTTest.testDivergentSchema}} test to insert two entries in the wrong > order: > {code:java} > cluster.coordinator(1).execute("insert into " + KEYSPACE + ".x (id, ck, i) > VALUES (?, " + json(1, 2) + ", ? )", ConsistencyLevel.ALL, > 1, 2); > cluster.coordinator(1).execute("insert into " + KEYSPACE + ".x (id, ck, i) > VALUES (?, " + json(1, 1) + ", ? 
)", ConsistencyLevel.ALL, > 1, 1); > {code} > after which we can get corrupted sstable state, shown as a > {code:java} > java.lang.AssertionError: Lower bound [SSTABLE_LOWER_BOUND(1) ]is bigger than > first returned value [Row: ck=1 | i=2] > {code} > exception, or results like {{[[1],[2],[2],[1]]}} or {{[[2],[1],[2]]}} for > {{select i from x WHERE id = 1}} depending on which node we use as > coordinator. > Because we don't know the type of new fields and cannot properly order > entries, we need to outright reject UDT keys that are not compatible with a > replica's schema. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
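The ordering argument in the comment above (old values are treated as shorter and still order correctly, but a replica that does not know the added field's type should reject the write) can be illustrated with a toy comparator. This is a deliberately simplified sketch, not Cassandra's tuple or UDT comparison: every field is assumed to be non-null text, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class UdtOrderSketch {
    /**
     * Compares two UDT values field by field. When all shared fields are equal,
     * the value with fewer fields (written before the field was added) sorts first.
     */
    static int compare(List<String> a, List<String> b) {
        int shared = Math.min(a.size(), b.size());
        for (int i = 0; i < shared; i++) {
            int c = a.get(i).compareTo(b.get(i));
            if (c != 0)
                return c;
        }
        return Integer.compare(a.size(), b.size());
    }

    /**
     * Rejects a value that carries more fields than this replica's schema knows about,
     * because the replica would not know how to compare the extra fields.
     */
    static void validate(List<String> value, int knownFieldCount) {
        if (value.size() > knownFieldCount)
            throw new IllegalArgumentException(
                "UDT value has " + value.size() + " fields, schema knows " + knownFieldCount);
    }

    public static void main(String[] args) {
        List<String> oldValue = Arrays.asList("1");        // written before the field was added
        List<String> newValue = Arrays.asList("1", "x");   // written after the field was added
        System.out.println(compare(oldValue, newValue) < 0); // the shorter value sorts first
        validate(oldValue, 1);                               // accepted by a replica with the old schema
        try {
            validate(newValue, 1);                           // the old-schema replica must reject this
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```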
[jira] [Commented] (CASSANDRA-19764) Corruption can occur while a field is being added to UDT clustering key
[ https://issues.apache.org/jira/browse/CASSANDRA-19764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17865186#comment-17865186 ] Branimir Lambov commented on CASSANDRA-19764: - I'm not sure the test is actually getting diverging schemas with {{cluster.coordinator(1).execute("alter type " + KEYSPACE + ".a add bar text", ConsistencyLevel.QUORUM)}}. Using the original {{cluster.get(1).executeInternal("alter type " + KEYSPACE + ".a add bar text")}} in
{code:java}
@Test
public void testDivergentSchemas() throws Throwable
{
    try (Cluster cluster = init(Cluster.create(2)))
    {
        cluster.schemaChange("create type " + KEYSPACE + ".a (foo text)");
        cluster.schemaChange("create table " + KEYSPACE + ".x (id int, ck frozen<a>, i int, primary key (id, ck))");

        // Alter the type on node 1 only, so the two replicas disagree on the schema.
        cluster.get(1).executeInternal("alter type " + KEYSPACE + ".a add bar text");

        cluster.coordinator(1).execute("insert into " + KEYSPACE + ".x (id, ck, i) VALUES (?, " + json(1, 2) + ", ? )", ConsistencyLevel.ALL, 1, 2);
        cluster.coordinator(1).execute("insert into " + KEYSPACE + ".x (id, ck, i) VALUES (?, " + json(1, 1) + ", ? )", ConsistencyLevel.ALL, 1, 1);
        cluster.get(2).flush(KEYSPACE);

        Object[][] res1 = cluster.coordinator(1).execute("select i from " + KEYSPACE + ".x WHERE id = 1", ConsistencyLevel.ALL);
        Object[][] res2 = cluster.coordinator(2).execute("select i from " + KEYSPACE + ".x WHERE id = 1", ConsistencyLevel.ALL);
        assertArrayEquals(res1, res2);
    }
}
{code}
fails, at least on 5.0.
[jira] [Created] (CASSANDRA-19764) Corruption can occur while a field is being added to UDT clustering key
Branimir Lambov created CASSANDRA-19764: --- Summary: Corruption can occur while a field is being added to UDT clustering key Key: CASSANDRA-19764 URL: https://issues.apache.org/jira/browse/CASSANDRA-19764 Project: Cassandra Issue Type: Bug Components: Feature/UDT Reporter: Branimir Lambov CASSANDRA-15938 made some improvements in how unknown components in UDTs are treated. Unfortunately this can cause corruption as soon as more than one value is inserted for a partition. The problem can be easily shown by modifying the {{FrozenUDTTest.testDivergentSchema}} test to insert two entries in the wrong order: {code:java} cluster.coordinator(1).execute("insert into " + KEYSPACE + ".x (id, ck, i) VALUES (?, " + json(1, 2) + ", ? )", ConsistencyLevel.ALL, 1, 2); cluster.coordinator(1).execute("insert into " + KEYSPACE + ".x (id, ck, i) VALUES (?, " + json(1, 1) + ", ? )", ConsistencyLevel.ALL, 1, 1); {code} after which we can get corrupted sstable state, shown as a {code:java} java.lang.AssertionError: Lower bound [SSTABLE_LOWER_BOUND(1) ]is bigger than first returned value [Row: ck=1 | i=2] {code} exception, or results like {{[[1],[2],[2],[1]]}} or {{[[2],[1],[2]]}} for {{select i from x WHERE id = 1}} depending on which node we use as coordinator. Because we don't know the type of new fields and cannot properly order entries, we need to outright reject UDT keys that are not compatible with a replica's schema. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
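The "outright reject" idea from the description can be illustrated with a minimal sketch. This is not the actual patch: the class {{UdtKeyCheckSketch}} and the method {{validate}} are hypothetical names, and the sketch assumes that a replica can count the components of a serialized UDT key and compare that count with the fields its own schema defines for the type. It also assumes that a key with fewer components than the schema is still acceptable, with the missing trailing fields treated as null.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from Cassandra.
class UdtKeyCheckSketch
{
    /**
     * Rejects a UDT clustering key that carries more components than the
     * replica's schema knows about. The replica does not know the type of the
     * extra components, so it cannot order such a key consistently.
     */
    static void validate(List<String> knownFields, List<ByteBuffer> components)
    {
        if (components.size() > knownFields.size())
            throw new IllegalArgumentException("UDT key has " + components.size()
                                               + " components, but the local schema defines only "
                                               + knownFields.size() + " fields");
    }

    /** Returns true if the key is rejected; convenience wrapper for callers and tests. */
    static boolean isRejected(List<String> knownFields, List<ByteBuffer> components)
    {
        try
        {
            validate(knownFields, components);
            return false;
        }
        catch (IllegalArgumentException e)
        {
            return true;
        }
    }

    public static void main(String[] args)
    {
        ByteBuffer one = ByteBuffer.wrap(new byte[]{ 1 });
        ByteBuffer two = ByteBuffer.wrap(new byte[]{ 2 });
        // A replica that only knows "foo" receives a key written with "foo" and "bar".
        System.out.println(isRejected(Arrays.asList("foo"), Arrays.asList(one, two)));
        // A replica that knows both fields accepts the same key.
        System.out.println(isRejected(Arrays.asList("foo", "bar"), Arrays.asList(one, two)));
    }
}
```

In the scenario from the test, the replica that has not yet seen {{alter type ... add bar text}} would take the first branch and refuse the write instead of storing a key it cannot compare.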
[jira] [Commented] (CASSANDRA-19601) Test failure: test_change_durable_writes
[ https://issues.apache.org/jira/browse/CASSANDRA-19601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17846557#comment-17846557 ] Branimir Lambov commented on CASSANDRA-19601: - I don't know why such a flush would be necessary. In terms of how to change the test, one thing we can try is to check that the commit log's dirty regions don't contain anything from that keyspace, but I don't know how we could access these from a Python dtest. It might make sense to convert the test to an in-jvm one, where such things are, AFAIU, not hard to do. > Test failure: test_change_durable_writes > > > Key: CASSANDRA-19601 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19601 > Project: Cassandra > Issue Type: Bug > Components: CI >Reporter: Ekaterina Dimitrova >Assignee: Brandon Williams >Priority: Normal > Fix For: 5.0.x, 5.x > > > Failing on trunk: > [https://ci-cassandra.apache.org/job/Cassandra-trunk/1880/testReport/junit/dtest-latest.configuration_test/TestConfiguration/Tests___dtest_latest_jdk11_31_64___test_change_durable_writes/] > [https://app.circleci.com/pipelines/github/blerer/cassandra/400/workflows/893a0edb-9181-4981-b542-77228c8bc975/jobs/10941/tests] > {code:java} > AssertionError: Commitlog was written with durable writes disabled > assert 90112 == 86016 > +90112 > -86016 > self = > @pytest.mark.timeout(60*30) > def test_change_durable_writes(self): > """ > @jira_ticket CASSANDRA-9560 > > Test that changes to the DURABLE_WRITES option on keyspaces is > respected in subsequent writes. > > This test starts by writing a dataset to a cluster and asserting that > the commitlogs have been written to. The subsequent test depends on > the assumption that this dataset triggers an fsync. > > After checking this assumption, the test destroys the cluster and > creates a fresh one. 
Then it tests that DURABLE_WRITES is respected > by: > > - creating a keyspace with DURABLE_WRITES set to false, > - using ALTER KEYSPACE to set its DURABLE_WRITES option to true, > - writing a dataset to this keyspace that is known to trigger a > commitlog fsync, > - asserting that the commitlog has grown in size since the data was > written. > """ > cluster = self.cluster > cluster.set_batch_commitlog(enabled=True, use_batch_window = > cluster.version() < '5.0') > > cluster.set_configuration_options(values={'commitlog_segment_size_in_mb': 1}) > > cluster.populate(1).start() > durable_node = cluster.nodelist()[0] > > durable_init_size = commitlog_size(durable_node) > durable_session = self.patient_exclusive_cql_connection(durable_node) > > # test assumption that write_to_trigger_fsync actually triggers a > commitlog fsync > durable_session.execute("CREATE KEYSPACE ks WITH REPLICATION = > {'class': 'SimpleStrategy', 'replication_factor': 1} " > "AND DURABLE_WRITES = true") > durable_session.execute('CREATE TABLE ks.tab (key int PRIMARY KEY, a > int, b int, c int)') > logger.debug('commitlog size diff = ' + > str(commitlog_size(durable_node) - durable_init_size)) > write_to_trigger_fsync(durable_session, 'ks', 'tab') > logger.debug('commitlog size diff = ' + > str(commitlog_size(durable_node) - durable_init_size)) > > assert commitlog_size(durable_node) > durable_init_size, \ > "This test will not work in this environment; > write_to_trigger_fsync does not trigger fsync." 
> > durable_session.shutdown() > cluster.stop() > cluster.clear() > > cluster.set_batch_commitlog(enabled=True, use_batch_window = > cluster.version() < '5.0') > > cluster.set_configuration_options(values={'commitlog_segment_size_in_mb': 1}) > cluster.start() > node = cluster.nodelist()[0] > session = self.patient_exclusive_cql_connection(node) > > # set up a keyspace without durable writes, then alter it to use them > session.execute("CREATE KEYSPACE ks WITH REPLICATION = {'class': > 'SimpleStrategy', 'replication_factor': 1} " > "AND DURABLE_WRITES = false") > session.execute('CREATE TABLE ks.tab (key int PRIMARY KEY, a int, b > int, c int)') > init_size = commitlog_size(node) > write_to_trigger_fsync(session, 'ks', 'tab') > > assert commitlog_size(node) == init_size, "Commitlog was written with > > durable writes disabled" > E AssertionError: Commitlog was written with durable writes disabled > E assert 90112 == 86016 > E +901
[jira] [Updated] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18753: Source Control Link: https://github.com/apache/cassandra/pull/2896 Resolution: Fixed Status: Resolved (was: Ready to Commit) > Add an optimized default configuration to tests and make it available for new > users > --- > > Key: CASSANDRA-18753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18753 > Project: Cassandra > Issue Type: Improvement > Components: Packaging >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Urgent > Fix For: 5.0-rc, 5.x > > Attachments: > CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch, > > DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch > > Time Spent: 12h 20m > Remaining Estimate: 0h > > We currently offer only one sample configuration file with Cassandra, and > that file is deliberately configured to disable all new functionality and > incompatible improvements. This works well for legacy users that want to have > a painless upgrade, but is a very bad choice for new users, or anyone wanting > to make comparisons between Cassandra versions or between Cassandra and other > databases. > We offer very little indication, in the database packaging itself, that there > are well-tested configuration choices that can solve known problems and > dramatically improve performance. This is guaranteed to paint the database in > a worse light than it deserves, and will very likely hurt adoption. > We should find a way to offer a very easy way of choosing between "optimized" > and "compatible" defaults. At minimal, we could provide alternate yaml files. > Alternatively, we could build on the {{storage_compatibility_mode}} concept > to grow it into a setting that not only enables/disables certain settings, > but also changes their default values. 
[jira] [Commented] (CASSANDRA-19471) Commitlog with direct io fails test_change_durable_writes
[ https://issues.apache.org/jira/browse/CASSANDRA-19471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17829805#comment-17829805 ] Branimir Lambov commented on CASSANDRA-19471: - They are only for the IAE, which is a serious issue and IMHO a blocker for 5.0. I have not investigated the commitlog being written with durable writes off, which is a much more benign issue. It is likely caused by the preparation of the direct I/O segments writing and flushing the header and first sync marker in advance of any use of the segment. > Commitlog with direct io fails test_change_durable_writes > - > > Key: CASSANDRA-19471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19471 > Project: Cassandra > Issue Type: Bug > Components: Local/Commit Log >Reporter: Brandon Williams >Assignee: Brandon Williams >Priority: Normal > Fix For: 5.0-rc, 5.x > > > With the commitlog_disk_access_mode set to direct, and the improved > configuration_test.py::TestConfiguration::test_change_durable_writes from > CASSANDRA-19465, this fails with either: > {noformat} > AssertionError: Commitlog was written with durable writes disabled > {noformat} > Or what appears to be the original exception reported in CASSANDRA-19465: > {noformat} > node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,465 > StorageService.java:631 - Stopping native transport > node1: ERROR [MutationStage-5] 2024-03-14 17:16:08,465 > StorageProxy.java:1670 - Failed to apply mutation locally : > java.lang.IllegalArgumentException: newPosition > limit: (1048634 > 1048576) > at java.base/java.nio.Buffer.createPositionException(Buffer.java:341) > at java.base/java.nio.Buffer.position(Buffer.java:316) > at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1516) > at > java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:321) > at > java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:73) > at > 
org.apache.cassandra.db.commitlog.CommitLogSegment.allocate(CommitLogSegment.java:216) > at > org.apache.cassandra.db.commitlog.CommitLogSegmentManagerStandard.allocate(CommitLogSegmentManagerStandard.java:52) > at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:307) > at > org.apache.cassandra.db.CassandraKeyspaceWriteHandler.addToCommitLog(CassandraKeyspaceWriteHandler.java:99) > at > org.apache.cassandra.db.CassandraKeyspaceWriteHandler.beginWrite(CassandraKeyspaceWriteHandler.java:53) > at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:612) > at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:497) > at org.apache.cassandra.db.Mutation.apply(Mutation.java:244) > at org.apache.cassandra.db.Mutation.apply(Mutation.java:264) > at > org.apache.cassandra.service.StorageProxy$4.runMayThrow(StorageProxy.java:1664) > at > org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2624) > at > org.apache.cassandra.concurrent.ExecutionFailure$2.run(ExecutionFailure.java:163) > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:833) > node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,470 > StorageService.java:636 - Stopping gossiper > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-19471) Commitlog with direct io fails test_change_durable_writes
[ https://issues.apache.org/jira/browse/CASSANDRA-19471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828364#comment-17828364 ] Branimir Lambov edited comment on CASSANDRA-19471 at 3/19/24 2:43 PM: -- I believe the problem is that the buffer's limit (set [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/DirectIOSegment.java#L208]) is not the same as the buffer's capacity (from which {{endOfBuffer}} is set [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java#L177]). I guess what we want is to change the former to set the limit first and then apply {{slice}}. We probably also want the aligning path above it to go through this slicing to set the capacity appropriately. I'd also change the assertions that follow to make sure the limit and capacity of the prepared buffer match, and are equal to the segment size. was (Author: blambov): I believe the problem is that the buffer's limit (set [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/DirectIOSegment.java#L208]) is not the same as the buffer's capacity (from which `endOfBuffer` is set [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java#L177]). I guess what we want is to change the former to set the limit first and then apply `slice`. 
> Commitlog with direct io fails test_change_durable_writes > - > > Key: CASSANDRA-19471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19471 > Project: Cassandra > Issue Type: Bug > Components: Local/Commit Log >Reporter: Brandon Williams >Priority: Normal > Fix For: 5.0-rc, 5.x > > > With the commitlog_disk_access_mode set to direct, and the improved > configuration_test.py::TestConfiguration::test_change_durable_writes from > CASSANDRA-19465, this fails with either: > {noformat} > AssertionError: Commitlog was written with durable writes disabled > {noformat} > Or what appears to be the original exception reported in CASSANDRA-19465: > {noformat} > node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,465 > StorageService.java:631 - Stopping native transport > node1: ERROR [MutationStage-5] 2024-03-14 17:16:08,465 > StorageProxy.java:1670 - Failed to apply mutation locally : > java.lang.IllegalArgumentException: newPosition > limit: (1048634 > 1048576) > at java.base/java.nio.Buffer.createPositionException(Buffer.java:341) > at java.base/java.nio.Buffer.position(Buffer.java:316) > at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1516) > at > java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:321) > at > java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:73) > at > org.apache.cassandra.db.commitlog.CommitLogSegment.allocate(CommitLogSegment.java:216) > at > org.apache.cassandra.db.commitlog.CommitLogSegmentManagerStandard.allocate(CommitLogSegmentManagerStandard.java:52) > at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:307) > at > org.apache.cassandra.db.CassandraKeyspaceWriteHandler.addToCommitLog(CassandraKeyspaceWriteHandler.java:99) > at > org.apache.cassandra.db.CassandraKeyspaceWriteHandler.beginWrite(CassandraKeyspaceWriteHandler.java:53) > at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:612) > at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:497) > at 
org.apache.cassandra.db.Mutation.apply(Mutation.java:244) > at org.apache.cassandra.db.Mutation.apply(Mutation.java:264) > at > org.apache.cassandra.service.StorageProxy$4.runMayThrow(StorageProxy.java:1664) > at > org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2624) > at > org.apache.cassandra.concurrent.ExecutionFailure$2.run(ExecutionFailure.java:163) > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:833) > node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,470 > StorageService.java:636 - Stopping gossiper > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
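The limit/capacity mismatch described in the comment above can be reproduced with plain {{java.nio}} buffers. The sketch below is not the Cassandra code: {{SliceLimitSketch}}, its methods, and the 4096-byte block size are hypothetical, and only the 1048576-byte segment size and the offending position 1048634 are taken from the reported exception. It shows that setting only the limit leaves the capacity larger than the limit, so a bound derived from the capacity lets a position through that the buffer then rejects, whereas setting the limit and then calling {{slice()}} yields a buffer whose capacity and limit both equal the segment size.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not taken from DirectIOSegment or CommitLogSegment.
class SliceLimitSketch
{
    static final int SEGMENT_SIZE = 1 << 20; // 1048576, the limit in the reported exception
    static final int BLOCK_SIZE = 4096;      // assumed alignment slack, for illustration

    /** Sets only the limit: the capacity still includes the extra block. */
    static ByteBuffer limitOnly()
    {
        ByteBuffer raw = ByteBuffer.allocateDirect(SEGMENT_SIZE + BLOCK_SIZE);
        raw.limit(SEGMENT_SIZE);
        return raw;
    }

    /** Sets the limit first and then slices: the slice's capacity equals its limit. */
    static ByteBuffer limitThenSlice()
    {
        ByteBuffer raw = ByteBuffer.allocateDirect(SEGMENT_SIZE + BLOCK_SIZE);
        raw.limit(SEGMENT_SIZE);
        return raw.slice();
    }

    /** Returns true if the buffer refuses to move to newPosition. */
    static boolean positionRejected(ByteBuffer buffer, int newPosition)
    {
        try
        {
            buffer.duplicate().position(newPosition);
            return false;
        }
        catch (IllegalArgumentException e)
        {
            return true;
        }
    }

    public static void main(String[] args)
    {
        ByteBuffer unsliced = limitOnly();
        ByteBuffer sliced = limitThenSlice();
        int newPosition = SEGMENT_SIZE + 58; // 1048634, as in the stack trace

        // Below the capacity-derived bound, yet beyond the limit: position() throws.
        System.out.println(newPosition <= unsliced.capacity());
        System.out.println(positionRejected(unsliced, newPosition));

        // After slicing, a capacity-derived bound already excludes this position.
        System.out.println(newPosition <= sliced.capacity());
    }
}
```

With the sliced buffer, a caller that compares the requested position against the capacity would treat the segment as full rather than attempt the move, which is the behaviour the suggested assertions (limit equal to capacity equal to segment size) are meant to guarantee.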
[jira] [Commented] (CASSANDRA-19471) Commitlog with direct io fails test_change_durable_writes
[ https://issues.apache.org/jira/browse/CASSANDRA-19471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17828364#comment-17828364 ] Branimir Lambov commented on CASSANDRA-19471: - I believe the problem is that the buffer's limit (set [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/DirectIOSegment.java#L208]) is not the same as the buffer's capacity (from which `endOfBuffer` is set [here|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLogSegment.java#L177]). I guess what we want is to change the former to set the limit first and then apply `slice`. > Commitlog with direct io fails test_change_durable_writes > - > > Key: CASSANDRA-19471 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19471 > Project: Cassandra > Issue Type: Bug > Components: Local/Commit Log >Reporter: Brandon Williams >Priority: Normal > Fix For: 5.0-rc, 5.x > > > With the commitlog_disk_access_mode set to direct, and the improved > configuration_test.py::TestConfiguration::test_change_durable_writes from > CASSANDRA-19465, this fails with either: > {noformat} > AssertionError: Commitlog was written with durable writes disabled > {noformat} > Or what appears to be the original exception reported in CASSANDRA-19465: > {noformat} > node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,465 > StorageService.java:631 - Stopping native transport > node1: ERROR [MutationStage-5] 2024-03-14 17:16:08,465 > StorageProxy.java:1670 - Failed to apply mutation locally : > java.lang.IllegalArgumentException: newPosition > limit: (1048634 > 1048576) > at java.base/java.nio.Buffer.createPositionException(Buffer.java:341) > at java.base/java.nio.Buffer.position(Buffer.java:316) > at java.base/java.nio.ByteBuffer.position(ByteBuffer.java:1516) > at > java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:321) > at > 
java.base/java.nio.MappedByteBuffer.position(MappedByteBuffer.java:73) > at > org.apache.cassandra.db.commitlog.CommitLogSegment.allocate(CommitLogSegment.java:216) > at > org.apache.cassandra.db.commitlog.CommitLogSegmentManagerStandard.allocate(CommitLogSegmentManagerStandard.java:52) > at org.apache.cassandra.db.commitlog.CommitLog.add(CommitLog.java:307) > at > org.apache.cassandra.db.CassandraKeyspaceWriteHandler.addToCommitLog(CassandraKeyspaceWriteHandler.java:99) > at > org.apache.cassandra.db.CassandraKeyspaceWriteHandler.beginWrite(CassandraKeyspaceWriteHandler.java:53) > at org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:612) > at org.apache.cassandra.db.Keyspace.apply(Keyspace.java:497) > at org.apache.cassandra.db.Mutation.apply(Mutation.java:244) > at org.apache.cassandra.db.Mutation.apply(Mutation.java:264) > at > org.apache.cassandra.service.StorageProxy$4.runMayThrow(StorageProxy.java:1664) > at > org.apache.cassandra.service.StorageProxy$LocalMutationRunnable.run(StorageProxy.java:2624) > at > org.apache.cassandra.concurrent.ExecutionFailure$2.run(ExecutionFailure.java:163) > at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:833) > node1: ERROR [PERIODIC-COMMIT-LOG-SYNCER] 2024-03-14 17:16:08,470 > StorageService.java:636 - Stopping gossiper > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19460) Fix tests to work with ULID SSTable identifiers to enable uuid_sstable_identifiers_enabled in cassandra-latest.yaml
[ https://issues.apache.org/jira/browse/CASSANDRA-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824756#comment-17824756 ] Branimir Lambov commented on CASSANDRA-19460: - LGTM > Fix tests to work with ULID SSTable identifiers to enable > uuid_sstable_identifiers_enabled in cassandra-latest.yaml > --- > > Key: CASSANDRA-19460 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19460 > Project: Cassandra > Issue Type: Task > Components: CI, Test/dtest/java, Test/unit >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 5.0-rc, 5.x > > Time Spent: 1h > Remaining Estimate: 0h > > In CASSANDRA-18753 we identified that we want to also set > uuid_sstable_identifiers_enabled to true, while running a CI with it turned > on, it failed (1). > Errors do not seem to be serious, it is just the test suite we have is not > prepared for the case when uuid_sstable_identifiers_enabled is set to true by > default. > We need to fix all these tests so we can have cassandra-latest.yaml > containing that property. > https://app.circleci.com/pipelines/github/blambov/cassandra/609/workflows/aef2b936-0551-4f3b-9d86-a49451c83947 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19460) Fix tests to work with ULID SSTable identifiers to enable uuid_sstable_identifiers_enabled in cassandra-latest.yaml
[ https://issues.apache.org/jira/browse/CASSANDRA-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19460: Reviewers: Branimir Lambov Status: Review In Progress (was: Needs Committer) > Fix tests to work with ULID SSTable identifiers to enable > uuid_sstable_identifiers_enabled in cassandra-latest.yaml > --- > > Key: CASSANDRA-19460 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19460 > Project: Cassandra > Issue Type: Task > Components: CI, Test/dtest/java, Test/unit >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 5.0-rc, 5.x > > Time Spent: 1h > Remaining Estimate: 0h > > In CASSANDRA-18753 we identified that we want to also set > uuid_sstable_identifiers_enabled to true, while running a CI with it turned > on, it failed (1). > Errors do not seem to be serious, it is just the test suite we have is not > prepared for the case when uuid_sstable_identifiers_enabled is set to true by > default. > We need to fix all these tests so we can have cassandra-latest.yaml > containing that property. > https://app.circleci.com/pipelines/github/blambov/cassandra/609/workflows/aef2b936-0551-4f3b-9d86-a49451c83947 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19460) Fix tests to work with ULID SSTable identifiers to enable uuid_sstable_identifiers_enabled in cassandra-latest.yaml
[ https://issues.apache.org/jira/browse/CASSANDRA-19460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19460: Status: Ready to Commit (was: Review In Progress) > Fix tests to work with ULID SSTable identifiers to enable > uuid_sstable_identifiers_enabled in cassandra-latest.yaml > --- > > Key: CASSANDRA-19460 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19460 > Project: Cassandra > Issue Type: Task > Components: CI, Test/dtest/java, Test/unit >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 5.0-rc, 5.x > > Time Spent: 1h > Remaining Estimate: 0h > > In CASSANDRA-18753 we identified that we want to also set > uuid_sstable_identifiers_enabled to true, while running a CI with it turned > on, it failed (1). > Errors do not seem to be serious, it is just the test suite we have is not > prepared for the case when uuid_sstable_identifiers_enabled is set to true by > default. > We need to fix all these tests so we can have cassandra-latest.yaml > containing that property. > https://app.circleci.com/pipelines/github/blambov/cassandra/609/workflows/aef2b936-0551-4f3b-9d86-a49451c83947 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824394#comment-17824394 ] Branimir Lambov commented on CASSANDRA-18753: - Committed to 5.0 as [06ed1afc34128523298020e7601dad148f2b2fb6|https://github.com/apache/cassandra/commit/06ed1afc34128523298020e7601dad148f2b2fb6] and trunk as [28efb63df52bafaf51cd458da021f6050900017a|https://github.com/apache/cassandra/commit/28efb63df52bafaf51cd458da021f6050900017a]. > Add an optimized default configuration to tests and make it available for new > users > --- > > Key: CASSANDRA-18753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18753 > Project: Cassandra > Issue Type: Improvement > Components: Packaging >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Urgent > Fix For: 5.0-rc, 5.x > > Attachments: > CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch, > > DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch > > Time Spent: 11h > Remaining Estimate: 0h > > We currently offer only one sample configuration file with Cassandra, and > that file is deliberately configured to disable all new functionality and > incompatible improvements. This works well for legacy users that want to have > a painless upgrade, but is a very bad choice for new users, or anyone wanting > to make comparisons between Cassandra versions or between Cassandra and other > databases. > We offer very little indication, in the database packaging itself, that there > are well-tested configuration choices that can solve known problems and > dramatically improve performance. This is guaranteed to paint the database in > a worse light than it deserves, and will very likely hurt adoption. > We should find a way to offer a very easy way of choosing between "optimized" > and "compatible" defaults. At minimal, we could provide alternate yaml files. 
> Alternatively, we could build on the {{storage_compatibility_mode}} concept > to grow it into a setting that not only enables/disables certain settings, > but also changes their default values. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823998#comment-17823998 ] Branimir Lambov commented on CASSANDRA-18753: - That test is apparently already fixed. [Latest run|https://app.circleci.com/pipelines/github/blambov/cassandra/606/workflows/628459f1-f3fe-449c-a047-a784cc9711f5/jobs/24959/tests] had only a timeout of {{ActiveCompactionsTest}} -- reduced the number of iterations in the test to fix this. Uploaded final version; I'm ready to commit it but I'd like one last review of the wording in {{NEWS.txt}} and {{cassandra(-latest).yaml}}. > Add an optimized default configuration to tests and make it available for new > users > --- > > Key: CASSANDRA-18753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18753 > Project: Cassandra > Issue Type: Improvement > Components: Packaging >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Urgent > Fix For: 5.0-rc, 5.x > > Attachments: > CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch, > > DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch > > Time Spent: 11h > Remaining Estimate: 0h > > We currently offer only one sample configuration file with Cassandra, and > that file is deliberately configured to disable all new functionality and > incompatible improvements. This works well for legacy users that want to have > a painless upgrade, but is a very bad choice for new users, or anyone wanting > to make comparisons between Cassandra versions or between Cassandra and other > databases. > We offer very little indication, in the database packaging itself, that there > are well-tested configuration choices that can solve known problems and > dramatically improve performance. This is guaranteed to paint the database in > a worse light than it deserves, and will very likely hurt adoption. 
> We should find a way to offer a very easy way of choosing between "optimized" > and "compatible" defaults. At minimal, we could provide alternate yaml files. > Alternatively, we could build on the {{storage_compatibility_mode}} concept > to grow it into a setting that not only enables/disables certain settings, > but also changes their default values. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19459) test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions fails with SAI
[ https://issues.apache.org/jira/browse/CASSANDRA-19459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19459: Resolution: Fixed Status: Resolved (was: Triage Needed) Fixed by CASSANDRA-19018. > test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions > fails with SAI > --- > > Key: CASSANDRA-19459 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19459 > Project: Cassandra > Issue Type: Bug > Components: Feature/SAI >Reporter: Branimir Lambov >Priority: Normal > > The dtest > {{replica_side_filtering_test::TestSecondaryIndexes::test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions}} > fails when the default secondary index is switched to SAI with > {code} > test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions > failed; it passed 0 out of the required 1 times. > > Subprocess ['nodetool', '-h', 'localhost', '-p', '7200', 'flush'] > exited with non-zero status; exit status: 2; > stderr: error: null > -- StackTrace -- > java.lang.NullPointerException > at java.base/java.util.Objects.requireNonNull(Objects.java:209) > at > org.apache.cassandra.index.sai.disk.v1.segment.SegmentMetadata.(SegmentMetadata.java:102) > at > org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.flush(MemtableIndexWriter.java:166) > at > org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.complete(MemtableIndexWriter.java:125) > at > org.apache.cassandra.index.sai.disk.StorageAttachedIndexWriter.complete(StorageAttachedIndexWriter.java:185) > at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) > at > java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1092) > at > org.apache.cassandra.io.sstable.format.SSTableWriter.commit(SSTableWriter.java:289) > at > org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.commit(SimpleSSTableMultiWriter.java:90) > at > 
org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1354) > at > org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1253) > at > org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:840) > {code} > Discovered while testing CASSANDRA-18753. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
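The {{stderr: error: null}} line in the report above is consistent with a {{NullPointerException}} that carries no message, which is what the single-argument {{Objects.requireNonNull}} throws. The sketch below only illustrates that mechanism. {{Metadata}}, {{minKey}} and {{describeFailure}} are hypothetical names, not the actual {{SegmentMetadata}} or nodetool code, and the way the error string is built is an assumption.

```java
import java.util.Objects;

public class NullMessageDemo {
    // Hypothetical stand-in for a metadata holder that validates its constructor arguments.
    static final class Metadata {
        final Object minKey;

        Metadata(Object minKey) {
            // The single-argument overload throws a NullPointerException without a message.
            this.minKey = Objects.requireNonNull(minKey);
        }
    }

    // Assumed reporting style: "error: " followed by the exception message.
    static String describeFailure(Object minKey) {
        try {
            new Metadata(minKey);
            return "ok";
        } catch (NullPointerException e) {
            // getMessage() is null here, so the concatenation yields "error: null".
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeFailure(null));   // prints "error: null"
        System.out.println(describeFailure("key"));  // prints "ok"
    }
}
```

Using the two-argument form, {{Objects.requireNonNull(value, "minKey")}}, would make such a failure name the missing field instead of printing {{null}}.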
[jira] [Comment Edited] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17823946#comment-17823946 ] Branimir Lambov edited comment on CASSANDRA-18753 at 3/6/24 10:07 AM: -- Well, tests [look much better now|https://app.circleci.com/pipelines/github/blambov/cassandra/605/workflows/f567db7c-2231-4c22-8a60-7e43887880d7]. We have only one failure, {{replica_side_filtering_test.TestSecondaryIndexes:test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions}} with SAI. Opened CASSANDRA-19459 for this, and proceeding to merge this ticket. was (Author: blambov): Well, tests [look much better now|https://app.circleci.com/pipelines/github/blambov/cassandra/605/workflows/f567db7c-2231-4c22-8a60-7e43887880d7]. We have only one failure, {{replica_side_filtering_test.TestSecondaryIndexes:test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions}} with SAI. Opened CASSANDRA- 19459 for this, and proceeding to merge this ticket. > Add an optimized default configuration to tests and make it available for new > users > --- > > Key: CASSANDRA-18753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18753 > Project: Cassandra > Issue Type: Improvement > Components: Packaging >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Urgent > Fix For: 5.0-rc, 5.x > > Attachments: > CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch, > > DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch > > Time Spent: 11h > Remaining Estimate: 0h > > We currently offer only one sample configuration file with Cassandra, and > that file is deliberately configured to disable all new functionality and > incompatible improvements. 
This works well for legacy users that want to have > a painless upgrade, but is a very bad choice for new users, or anyone wanting > to make comparisons between Cassandra versions or between Cassandra and other > databases. > We offer very little indication, in the database packaging itself, that there > are well-tested configuration choices that can solve known problems and > dramatically improve performance. This is guaranteed to paint the database in > a worse light than it deserves, and will very likely hurt adoption. > We should find a way to offer a very easy way of choosing between "optimized" > and "compatible" defaults. At minimal, we could provide alternate yaml files. > Alternatively, we could build on the {{storage_compatibility_mode}} concept > to grow it into a setting that not only enables/disables certain settings, > but also changes their default values. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
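One way to read the second proposal in the description above (a single setting that also changes default values) is sketched here. Everything in it is an assumption made for illustration: the {{Profile}} enum, the {{resolve}} helper and the {{offheap_objects}} value for the optimized profile are not taken from the ticket. Only the use of {{heap_buffers}} in compatible mode is mentioned elsewhere in this thread.

```java
import java.util.Map;

public class DefaultsProfileSketch {
    // Hypothetical profile switch; the ticket only speaks of "optimized" and "compatible" defaults.
    enum Profile { COMPATIBLE, OPTIMIZED }

    // Illustrative defaults only; the optimized value is an assumption, not the project's choice.
    static final Map<Profile, Map<String, String>> DEFAULTS = Map.of(
        Profile.COMPATIBLE, Map.of("memtable_allocation_type", "heap_buffers"),
        Profile.OPTIMIZED, Map.of("memtable_allocation_type", "offheap_objects"));

    // An explicit user setting wins; otherwise the profile supplies the default.
    static String resolve(Profile profile, Map<String, String> userSettings, String key) {
        String explicit = userSettings.get(key);
        return explicit != null ? explicit : DEFAULTS.get(profile).get(key);
    }

    public static void main(String[] args) {
        System.out.println(resolve(Profile.COMPATIBLE, Map.of(), "memtable_allocation_type"));
        System.out.println(resolve(Profile.OPTIMIZED,
                                   Map.of("memtable_allocation_type", "heap_buffers"),
                                   "memtable_allocation_type"));
    }
}
```

The point of the design is that the profile never overrides a value the operator has set explicitly, so switching profiles only affects settings that were left at their defaults.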
[jira] [Created] (CASSANDRA-19459) test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions fails with SAI
Branimir Lambov created CASSANDRA-19459: --- Summary: test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions fails with SAI Key: CASSANDRA-19459 URL: https://issues.apache.org/jira/browse/CASSANDRA-19459 Project: Cassandra Issue Type: Bug Components: Feature/SAI Reporter: Branimir Lambov The dtest {{replica_side_filtering_test::TestSecondaryIndexes::test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions}} fails when the default secondary index is switched to SAI with {code} test_complementary_deletion_with_limit_on_partition_key_column_with_empty_partitions failed; it passed 0 out of the required 1 times. Subprocess ['nodetool', '-h', 'localhost', '-p', '7200', 'flush'] exited with non-zero status; exit status: 2; stderr: error: null -- StackTrace -- java.lang.NullPointerException at java.base/java.util.Objects.requireNonNull(Objects.java:209) at org.apache.cassandra.index.sai.disk.v1.segment.SegmentMetadata.(SegmentMetadata.java:102) at org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.flush(MemtableIndexWriter.java:166) at org.apache.cassandra.index.sai.disk.v1.MemtableIndexWriter.complete(MemtableIndexWriter.java:125) at org.apache.cassandra.index.sai.disk.StorageAttachedIndexWriter.complete(StorageAttachedIndexWriter.java:185) at java.base/java.util.ArrayList.forEach(ArrayList.java:1511) at java.base/java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1092) at org.apache.cassandra.io.sstable.format.SSTableWriter.commit(SSTableWriter.java:289) at org.apache.cassandra.io.sstable.SimpleSSTableMultiWriter.commit(SimpleSSTableMultiWriter.java:90) at org.apache.cassandra.db.ColumnFamilyStore$Flush.flushMemtable(ColumnFamilyStore.java:1354) at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1253) at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) at 
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:840) {code} Discovered while testing CASSANDRA-18753. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17822034#comment-17822034 ] Branimir Lambov commented on CASSANDRA-18753: - I don't mind removing it, especially if we have a plan for adding it back. I'll remove it and re-run CI. > Add an optimized default configuration to tests and make it available for new > users > --- > > Key: CASSANDRA-18753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18753 > Project: Cassandra > Issue Type: Improvement > Components: Packaging >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Urgent > Fix For: 5.0-rc, 5.x > > Attachments: > CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch, > > DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch > > Time Spent: 11h > Remaining Estimate: 0h > > We currently offer only one sample configuration file with Cassandra, and > that file is deliberately configured to disable all new functionality and > incompatible improvements. This works well for legacy users that want to have > a painless upgrade, but is a very bad choice for new users, or anyone wanting > to make comparisons between Cassandra versions or between Cassandra and other > databases. > We offer very little indication, in the database packaging itself, that there > are well-tested configuration choices that can solve known problems and > dramatically improve performance. This is guaranteed to paint the database in > a worse light than it deserves, and will very likely hurt adoption. > We should find a way to offer a very easy way of choosing between "optimized" > and "compatible" defaults. At minimal, we could provide alternate yaml files. > Alternatively, we could build on the {{storage_compatibility_mode}} concept > to grow it into a setting that not only enables/disables certain settings, > but also changes their default values. 
[jira] [Comment Edited] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17799053#comment-17799053 ] Branimir Lambov edited comment on CASSANDRA-18753 at 1/16/24 8:49 AM:

Merged CCM and DTest patches (they do not change anything unless the {{--configuration-yaml}} flag is used).

[The state of failing tests at the moment|https://app.circleci.com/pipelines/github/blambov/cassandra/595/workflows/ed598605-6af6-443e-9336-aaa47ae27e43]:
- JUnit tests in compatible mode (which switches to {{heap_buffers}}):
-- {{CQLVectorTest}} (CASSANDRA-19167)
-- {{VectorUpdateDeleteTest}} (CASSANDRA-19168)
- JUnit tests in latest mode:
-- repair fuzz tests {{ConcurrentIrWithPreviewFuzzTest}}, {{FailedAckTest}}, {{FailingRepairFuzzTest}}, {{HappyPathFuzzTest}}, {{SlowMessageFuzzTest}} (CASSANDRA-19042)
-- {{RepairJobTest}} (CASSANDRA-19043)
-- {{ClientRequestMetricsTest}} (CASSANDRA-19046)
- JVM dtests in latest mode:
-- {{RepairTest}} (CASSANDRA-19085)
-- {{SSTableLoaderEncryptionOptionsTest}} (CASSANDRA-19126)
-- {{QueriesTableTest}} (CASSANDRA-19046)
- Python dtests in latest mode:
-- {{TestWriteFailures.test_paxos}} (CASSANDRA-19145)
-- {{TestReplaceAddress}} (CASSANDRA-19144)
-- {{TestSnapshot}} (CASSANDRA-19126)
-- {{TestClientRequestMetrics}} (CASSANDRA-19046)

Several {{TestBootstrap}} tests seem to be failing in all configurations, some of them already marked as flaky; this is likely not caused by this patch. There are also some timeouts (e.g. {{ActiveCompactionsTest}} times out when run repeatedly because {{testActiveCompactionTrackingRaceWithIndexBuilder}} takes longer).

Please review [the PR|https://github.com/apache/cassandra/pull/2896].

was (Author: blambov):

Merged CCM and DTest patches (they do not change anything unless the {{--configuration-yaml}} flag is used).

[The state of failing tests at the moment|https://app.circleci.com/pipelines/github/blambov/cassandra/595/workflows/ed598605-6af6-443e-9336-aaa47ae27e43]:
- JUnit tests in compatible mode (which switches to {{heap_buffers}}):
-- {{CQLVectorTest}} (CASSANDRA-19167)
-- {{VectorUpdateDeleteTest}} (CASSANDRA-19168)
- JUnit tests in latest mode:
-- repair fuzz tests {{ConcurrentIrWithPreviewFuzzTest}}, {{FailedAckTest}}, {{FailingRepairFuzzTest}}, {{HappyPathFuzzTest}}, {{SlowMessageFuzzTest}} (CASSANDRA-19042)
-- {{RepairJobTest}} (CASSANDRA-19043)
- JVM dtests in latest mode:
-- {{RepairTest}} (CASSANDRA-19085)
-- {{SSTableLoaderEncryptionOptionsTest}} (CASSANDRA-19126)
-- {{QueriesTableTest}} (CASSANDRA-19046)
- Python dtests in latest mode:
-- {{TestWriteFailures.test_paxos}} (CASSANDRA-19145)
-- {{TestReplaceAddress}} (CASSANDRA-19144)
-- {{TestSnapshot}} (CASSANDRA-19126)
-- {{TestClientRequestMetrics}} (CASSANDRA-19046)

Several {{TestBootstrap}} tests seem to be failing in all configurations, some of them already marked as flaky; this is likely not caused by this patch. There are also some timeouts (e.g. {{ActiveCompactionsTest}} times out when run repeatedly because {{testActiveCompactionTrackingRaceWithIndexBuilder}} takes longer).

Please review [the PR|https://github.com/apache/cassandra/pull/2896].
> Add an optimized default configuration to tests and make it available for new > users > --- > > Key: CASSANDRA-18753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18753 > Project: Cassandra > Issue Type: Improvement > Components: Packaging >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Urgent > Fix For: 5.0-rc, 5.x > > Attachments: > CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch, > > DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch > > Time Spent: 9h 50m > Remaining Estimate: 0h > > We currently offer only one sample configuration file with Cassandra, and > that file is deliberately configured to disable all new functionality and > incompatible improvements. This works well for legacy users that want to have > a painless upgrade, but is a very bad choice for new users, or anyone wanting > to make comparisons between Cassandra versions or between Cassandra and other > databases. > We offer very little indication, in the database packaging itself, that there > are well-tested configuration choices that can solve known problems and > dramatically improve performance. This is guaranteed to paint the database in > a worse light than it deserves, and will very likely hurt adoption. > We should find a way to offer a very easy way of choosing between "optimized" > and "compatible" defaults. At minimal, we could provi
[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17804661#comment-17804661 ] Branimir Lambov commented on CASSANDRA-19126: - It is to me. > Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 5.0-rc, 5.x > > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798565#comment-17798565 ] Branimir Lambov commented on CASSANDRA-19126: - {quote} {code} private static final String MIXED_MODE_ERROR = "Some nodes involved in repair are on an incompatible major version. " + "Repair is not supported in mixed major version clusters."; {code} {quote} _To me_ this message in the context of a 5.0 cluster where something is in the wrong compatibility mode would be quite confusing. At the very least we need to state very clearly that a 5.x node in compatibility mode is considered a 4.x node for all intents and purposes, including being a "same major version" for the message above. Also, does this not mean we can't ever drop 4.0 support because e.g. 6.0 must be compatible with 5.0, including in its compatibility mode? > Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 5.0-rc, 5.x > > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. 
two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
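The mismatch described in the ticket can be modelled in a few lines. This is a minimal sketch under stated assumptions: the two-value {{CompatibilityMode}} enum and both method names are hypothetical simplifications, and only the version numbers (12 in legacy mode, 13 in {{NONE}}) come from the description above. The real setting may have additional modes that this model ignores.

```java
public class StreamingVersionSketch {
    // Hypothetical two-value model of storage_compatibility_mode.
    enum CompatibilityMode { LEGACY, NONE }

    // Per the ticket description: version 12 is accepted only in legacy mode, 13 only in NONE.
    static int acceptedStreamingVersion(CompatibilityMode mode) {
        return mode == CompatibilityMode.LEGACY ? 12 : 13;
    }

    // With exactly one accepted version per node, two nodes can stream only if those versions match.
    static boolean canStream(CompatibilityMode sender, CompatibilityMode receiver) {
        return acceptedStreamingVersion(sender) == acceptedStreamingVersion(receiver);
    }

    public static void main(String[] args) {
        System.out.println(canStream(CompatibilityMode.LEGACY, CompatibilityMode.LEGACY)); // true
        System.out.println(canStream(CompatibilityMode.LEGACY, CompatibilityMode.NONE));   // false
    }
}
```

Under this model, any pair of nodes with different settings fails the check, which matches the observation that two C* 5 nodes appear unable to stream with each other unless their compatibility modes agree.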
[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17795406#comment-17795406 ] Branimir Lambov commented on CASSANDRA-19126: - In other words, you both feel that it is okay for {{BulkLoader}} to not work if it is not the corresponding version or is not configured exactly like the database is? Separately, that a node in e.g. {{UPGRADING}} mode should not be able to stream sstables to one in {{NONE}}? > Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 5.0-rc, 5.x > > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17795089#comment-17795089 ] Branimir Lambov commented on CASSANDRA-19126: - > Precise fix for this would be to use the same compatibility mode for bulk > loader and the node. While this would fix the test, it would not do anything about the underlying problem. C* 5 nodes in different compatibility mode should be able to stream with each other. One should at least be able to stream whole sstables from legacy mode to current. Current to legacy mode when CASSANDRA-19012 is done also makes sense, but it might violate the downgradability promise while such data is not compacted. We probably need a warning if current-format data is streamed to a node in legacy mode (e.g. suggesting one does upgradesstables before downgrading below 5.0). > Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 5.0-rc, 5.x > > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. 
two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17795089#comment-17795089 ] Branimir Lambov edited comment on CASSANDRA-19126 at 12/10/23 4:57 PM: --- bq. Precise fix for this would be to use the same compatibility mode for bulk loader and the node. While this would fix the test, it would not do anything about the underlying problem. C* 5 nodes in different compatibility mode should be able to stream with each other. One should at least be able to stream whole sstables from legacy mode to current. Current to legacy mode when CASSANDRA-19012 is done also makes sense, but it might violate the downgradability promise while such data is not compacted. We probably need a warning if current-format data is streamed to a node in legacy mode (e.g. suggesting one does upgradesstables before downgrading below 5.0). was (Author: blambov): > Precise fix for this would be to use the same compatibility mode for bulk > loader and the node. While this would fix the test, it would not do anything about the underlying problem. C* 5 nodes in different compatibility mode should be able to stream with each other. One should at least be able to stream whole sstables from legacy mode to current. Current to legacy mode when CASSANDRA-19012 is done also makes sense, but it might violate the downgradability promise while such data is not compacted. We probably need a warning if current-format data is streamed to a node in legacy mode (e.g. suggesting one does upgradesstables before downgrading below 5.0). 
> Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 5.0-rc, 5.x > > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19168) VectorUpdateDeleteTest fails with heap_buffers
[ https://issues.apache.org/jira/browse/CASSANDRA-19168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19168: Fix Version/s: 5.0-rc > VectorUpdateDeleteTest fails with heap_buffers > -- > > Key: CASSANDRA-19168 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19168 > Project: Cassandra > Issue Type: Bug > Components: Feature/Vector Search >Reporter: Branimir Lambov >Priority: Normal > Fix For: 5.0-rc > > > When {{memtable_allocation_type}} is set to {{heap_buffers}}, {{updateTest}} > fails with > {code} > junit.framework.AssertionFailedError: Result set does not contain a row with > pk = 0 > at > org.apache.cassandra.index.sai.cql.VectorTypeTest.assertContainsInt(VectorTypeTest.java:133) > at > org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest.updateTest(VectorUpdateDeleteTest.java:308) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at > com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-19168) VectorUpdateDeleteTest fails with heap_buffers
Branimir Lambov created CASSANDRA-19168: --- Summary: VectorUpdateDeleteTest fails with heap_buffers Key: CASSANDRA-19168 URL: https://issues.apache.org/jira/browse/CASSANDRA-19168 Project: Cassandra Issue Type: Bug Components: Feature/Vector Search Reporter: Branimir Lambov When {{memtable_allocation_type}} is set to {{heap_buffers}}, {{updateTest}} fails with {code} junit.framework.AssertionFailedError: Result set does not contain a row with pk = 0 at org.apache.cassandra.index.sai.cql.VectorTypeTest.assertContainsInt(VectorTypeTest.java:133) at org.apache.cassandra.index.sai.cql.VectorUpdateDeleteTest.updateTest(VectorUpdateDeleteTest.java:308) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19167) CQLVectorTest fails with heap_buffers
[ https://issues.apache.org/jira/browse/CASSANDRA-19167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19167: Fix Version/s: 5.0-rc > CQLVectorTest fails with heap_buffers > - > > Key: CASSANDRA-19167 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19167 > Project: Cassandra > Issue Type: Bug > Components: Feature/Vector Search >Reporter: Branimir Lambov >Priority: Normal > Fix For: 5.0-rc > > > When {{memtable_allocation_type}} is set to {{heap_buffers}}, the {{udf}} > test fails with > {code} > org.apache.cassandra.cql3.functions.types.exceptions.InvalidTypeException: > Invalid 32-bits integer value, expecting 4 bytes but got 6 > at > org.apache.cassandra.cql3.functions.types.TypeCodec$IntCodec.deserializeNoBoxing(TypeCodec.java:1695) > at > org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:842) > at > org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:819) > at > org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:135) > at > org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:83) > at > org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2141) > at > org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2082) > at > org.apache.cassandra.cql3.functions.UDFDataType.compose(UDFDataType.java:180) > at > org.apache.cassandra.cql3.functions.FunctionArguments.set(FunctionArguments.java:142) > at > org.apache.cassandra.cql3.selection.AbstractFunctionSelector.setArg(AbstractFunctionSelector.java:277) > at > org.apache.cassandra.cql3.selection.ScalarFunctionSelector.getOutput(ScalarFunctionSelector.java:58) > at > org.apache.cassandra.cql3.selection.Selection$SelectionWithProcessing$1.getOutputRow(Selection.java:605) > at > 
org.apache.cassandra.cql3.selection.ResultSetBuilder.getOutputRow(ResultSetBuilder.java:175) > at > org.apache.cassandra.cql3.selection.ResultSetBuilder.build(ResultSetBuilder.java:162) > at > org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:999) > at > org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:564) > at > org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:600) > at > org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:570) > at > org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:108) > at > org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:445) > at > org.apache.cassandra.cql3.CQLTester.executeFormattedQuery(CQLTester.java:1597) > at org.apache.cassandra.cql3.CQLTester.execute(CQLTester.java:1576) > at > org.apache.cassandra.cql3.validation.operations.CQLVectorTest.udf(CQLVectorTest.java:427) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-19167) CQLVectorTest fails with heap_buffers
Branimir Lambov created CASSANDRA-19167: --- Summary: CQLVectorTest fails with heap_buffers Key: CASSANDRA-19167 URL: https://issues.apache.org/jira/browse/CASSANDRA-19167 Project: Cassandra Issue Type: Bug Components: Feature/Vector Search Reporter: Branimir Lambov When {{memtable_allocation_type}} is set to {{heap_buffers}}, the {{udf}} test fails with {code} org.apache.cassandra.cql3.functions.types.exceptions.InvalidTypeException: Invalid 32-bits integer value, expecting 4 bytes but got 6 at org.apache.cassandra.cql3.functions.types.TypeCodec$IntCodec.deserializeNoBoxing(TypeCodec.java:1695) at org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:842) at org.apache.cassandra.cql3.functions.types.TypeCodec$PrimitiveIntCodec.deserialize(TypeCodec.java:819) at org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:135) at org.apache.cassandra.cql3.functions.types.VectorCodec$FixedLength.deserialize(VectorCodec.java:83) at org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2141) at org.apache.cassandra.cql3.functions.types.TypeCodec$AbstractCollectionCodec.deserialize(TypeCodec.java:2082) at org.apache.cassandra.cql3.functions.UDFDataType.compose(UDFDataType.java:180) at org.apache.cassandra.cql3.functions.FunctionArguments.set(FunctionArguments.java:142) at org.apache.cassandra.cql3.selection.AbstractFunctionSelector.setArg(AbstractFunctionSelector.java:277) at org.apache.cassandra.cql3.selection.ScalarFunctionSelector.getOutput(ScalarFunctionSelector.java:58) at org.apache.cassandra.cql3.selection.Selection$SelectionWithProcessing$1.getOutputRow(Selection.java:605) at org.apache.cassandra.cql3.selection.ResultSetBuilder.getOutputRow(ResultSetBuilder.java:175) at org.apache.cassandra.cql3.selection.ResultSetBuilder.build(ResultSetBuilder.java:162) at 
org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:999) at org.apache.cassandra.cql3.statements.SelectStatement.processResults(SelectStatement.java:564) at org.apache.cassandra.cql3.statements.SelectStatement.executeInternal(SelectStatement.java:600) at org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:570) at org.apache.cassandra.cql3.statements.SelectStatement.executeLocally(SelectStatement.java:108) at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:445) at org.apache.cassandra.cql3.CQLTester.executeFormattedQuery(CQLTester.java:1597) at org.apache.cassandra.cql3.CQLTester.execute(CQLTester.java:1576) at org.apache.cassandra.cql3.validation.operations.CQLVectorTest.udf(CQLVectorTest.java:427) {code}
[jira] [Created] (CASSANDRA-19145) Python dtest TestWriteFailures.test_paxos is failing with Paxos V2
Branimir Lambov created CASSANDRA-19145: --- Summary: Python dtest TestWriteFailures.test_paxos is failing with Paxos V2 Key: CASSANDRA-19145 URL: https://issues.apache.org/jira/browse/CASSANDRA-19145 Project: Cassandra Issue Type: Bug Components: Feature/Lightweight Transactions Reporter: Branimir Lambov With configuration changed to engage Paxos V2 with repaired state purging, the dtest fails with: {code} test_paxos write_failures_test.TestWriteFailures self = def test_paxos(self): """ A light transaction receives a WriteFailure """ > exc = self._perform_cql_statement("INSERT INTO mytable (key, value) > VALUES ('key1', 'Value 1') IF NOT EXISTS") write_failures_test.py:202: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ write_failures_test.py:88: in _perform_cql_statement session.execute(statement) ../env3.7/src/cassandra-driver/cassandra/cluster.py:2618: in execute return self.execute_async(query, parameters, trace, custom_payload, timeout, execution_profile, paging_state, host, execute_as).result() _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = def result(self): """ Return the final result or raise an Exception if errors were encountered. If the final result or error has not been set yet, this method will block until it is set, or the timeout set for the request expires. Timeout is specified in the Session request execution functions. If the timeout is exceeded, an :exc:`cassandra.OperationTimedOut` will be raised. This is a client-side timeout. For more information about server-side coordinator timeouts, see :class:`.policies.RetryPolicy`. Example usage:: >>> future = session.execute_async("SELECT * FROM mycf") >>> # do other stuff... >>> try: ... rows = future.result() ... for row in rows: ... ... # process results ... except Exception: ... 
log.exception("Operation failed:") """ self._event.wait() if self._final_result is not _NOT_SET: return ResultSet(self, self._final_result) else: > raise self._final_exception E cassandra.WriteTimeout: Error from server: code=1100 [Coordinator node timed out waiting for replica nodes' responses] message="CAS operation timed out: received 1 of 2 required responses after 0 contention retries" info={'consistency': 'SERIAL', 'required_responses': 2, 'received_responses': 1, 'write_type': 'CAS'} ../env3.7/src/cassandra-driver/cassandra/cluster.py:4894: WriteTimeout {code}
[jira] [Created] (CASSANDRA-19144) Python dtest replace_address_test.TestReplaceAddress is failing with Paxos V2
Branimir Lambov created CASSANDRA-19144: --- Summary: Python dtest replace_address_test.TestReplaceAddress is failing with Paxos V2 Key: CASSANDRA-19144 URL: https://issues.apache.org/jira/browse/CASSANDRA-19144 Project: Cassandra Issue Type: Bug Components: Consistency/Bootstrap and Decommission, Feature/Lightweight Transactions Reporter: Branimir Lambov Paxos repair is causing an unexpected failure: {code} test_replace_with_insufficient_replicas replace_address_test.TestReplaceAddress failed on teardown with "Failed: Unexpected error found in node logs (see stdout for full details). Errors: [[replacement] 'ERROR [main] 2023-11-29 10:23:08,752 CassandraDaemon.java:878 - Exception encountered during startup\njava.lang.UnsupportedOperationException: null\n\tat org.apache.cassandra.locator.AbstractReplicaCollection$ReplicaMap$AbstractImmutableSet.removeAll(AbstractReplicaCollection.java:298)\n\tat org.apache.cassandra.service.ActiveRepairService.repairPaxosForTopologyChange(ActiveRepairService.java:1102)\n\tat org.apache.cassandra.service.StorageService.startRepairPaxosForTopologyChange(StorageService.java:4829)\n\tat org.apache.cassandra.service.StorageService.tryRepairPaxosForTopologyChange(StorageService.java:4760)\n\tat org.apache.cassandra.service.StorageService.repairPaxosForTopologyChange(StorageService.java:4793)\n\tat org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:2120)\n\tat org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1240)\n\tat org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1200)\n\tat org.apache.cassandra.service.StorageService.initServer(StorageService.java:979)\n\tat org.apache.cassandra.service.StorageService.initServer(StorageService.java:896)\n\tat org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:377)\n\tat org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:721)\n\tat 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:856)']" Unexpected error found in node logs (see stdout for full details). Errors: [[replacement] 'ERROR [main] 2023-11-29 10:23:08,752 CassandraDaemon.java:878 - Exception encountered during startup\njava.lang.UnsupportedOperationException: null\n\tat org.apache.cassandra.locator.AbstractReplicaCollection$ReplicaMap$AbstractImmutableSet.removeAll(AbstractReplicaCollection.java:298)\n\tat org.apache.cassandra.service.ActiveRepairService.repairPaxosForTopologyChange(ActiveRepairService.java:1102)\n\tat org.apache.cassandra.service.StorageService.startRepairPaxosForTopologyChange(StorageService.java:4829)\n\tat org.apache.cassandra.service.StorageService.tryRepairPaxosForTopologyChange(StorageService.java:4760)\n\tat org.apache.cassandra.service.StorageService.repairPaxosForTopologyChange(StorageService.java:4793)\n\tat org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:2120)\n\tat org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1240)\n\tat org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:1200)\n\tat org.apache.cassandra.service.StorageService.initServer(StorageService.java:979)\n\tat org.apache.cassandra.service.StorageService.initServer(StorageService.java:896)\n\tat org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:377)\n\tat org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:721)\n\tat org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:856)'] {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
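The trace above shows {{removeAll}} being called on an immutable set and throwing {{UnsupportedOperationException}}. The following is a minimal, self-contained sketch of that failure mode using only JDK collections. The class and method names are hypothetical and are not Cassandra code, and copying into a mutable set is just one plausible way to avoid the exception; the actual fix for this ticket is not shown in the message.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not taken from the Cassandra code base.
final class ImmutableRemoveAllSketch
{
    // Returns true if calling removeAll on the given set is rejected
    // with UnsupportedOperationException, as in the startup failure above.
    static boolean removeAllIsRejected(Set<String> endpoints, Set<String> toRemove)
    {
        try
        {
            endpoints.removeAll(toRemove);
            return false;
        }
        catch (UnsupportedOperationException e)
        {
            return true;
        }
    }

    // One possible workaround (an assumption): copy into a mutable set first,
    // then remove, leaving the immutable original untouched.
    static Set<String> withoutEndpoints(Set<String> endpoints, Set<String> toRemove)
    {
        Set<String> copy = new HashSet<>(endpoints);
        copy.removeAll(toRemove);
        return copy;
    }

    public static void main(String[] args)
    {
        Set<String> immutable = Collections.unmodifiableSet(new HashSet<>(Arrays.asList("n1", "n2")));
        Set<String> leaving = new HashSet<>(Arrays.asList("n2"));
        System.out.println("removeAll rejected: " + removeAllIsRejected(immutable, leaving));
        System.out.println("after copy-and-remove: " + withoutEndpoints(immutable, leaving));
    }
}
```

The copy-based variant trades an extra allocation for safety, which is usually acceptable on a startup path like the one in the trace.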
[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17792095#comment-17792095 ] Branimir Lambov commented on CASSANDRA-19126: - Python dtest {{snapshot_test}} is also failing because of this sstableloader problem: {code:java} Exception: sstableloader command '/home/cassandra/cassandra/bin/sstableloader -d 127.0.0.1 /tmp/tmpidg_8u3c/0/ks/cf' failed; exit status: 1'; stdout: Established connection to initial hosts Opening sstables and calculating sections to stream Streaming relevant part of /tmp/tmpidg_8u3c/0/ks/cf/da-1-bti-Data.db to [/127.0.0.1:7000] progress: total: 100% 0.000B/s (avg: 0.000B/s) ; stderr: ERROR 10:16:01,391 [Stream #4bb85ff0-8ea0-11ee-94d3-3de6344de31d] Streaming error occurred on session with peer 127.0.0.1:7000 java.lang.ClassCastException: class org.apache.cassandra.net.OutboundConnectionInitiator$Result$Incompatible cannot be cast to class org.apache.cassandra.net.OutboundConnectionInitiator$Result$Success (org.apache.cassandra.net.OutboundConnectionInitiator$Result$Incompatible and org.apache.cassandra.net.OutboundConnectionInitiator$Result$Success are in unnamed module of loader 'app') {code} > Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Priority: Normal > Fix For: 5.0-rc, 5.x > > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). 
Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different.
[jira] [Commented] (CASSANDRA-19046) Paxos V2 does not update individual fields of readMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17792090#comment-17792090 ] Branimir Lambov commented on CASSANDRA-19046: - Python dtest failure related to this: {{client_request_metrics_test.TestClientRequestMetrics}} {code:java} > self.cas_read_contention() client_request_metrics_test.py:103: _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ client_request_metrics_test.py:355: in cas_read_contention consistency_level=CL.SERIAL)) _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ self = metric_factory = functools.partial(, 'CASRead') statement = def cas_contention(self, metric_factory, statement): query_count = 20 cassandra_version = self.dtest_config.cassandra_version_from_build def sample(): baseline = metric_factory() baseline.validate(cassandra_version) execute_concurrent_with_args(self.session, statement, repeat([], query_count), raise_on_first_error=False) updated = metric_factory() updated.validate(cassandra_version) return updated.diff(baseline) for _ in range(10): diff = sample() if 'ContentionHistogram.Count' in diff: break assert diff['Latency.Count'] == query_count assert diff['TotalLatency.Count'] > 0 > assert 0 < diff['ContentionHistogram.Count'] <= query_count E KeyError: 'ContentionHistogram.Count' client_request_metrics_test.py:382: KeyError{code} > Paxos V2 does not update individual fields of readMetrics > - > > Key: CASSANDRA-19046 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19046 > Project: Cassandra > Issue Type: Bug > Components: Feature/Lightweight Transactions, Observability/Metrics >Reporter: Branimir Lambov >Priority: Normal > Fix For: 5.0-rc > > > As a result, {{ClientMetricsTest.testPaxosStatement}} is failing with > {{paxos_variant: v2}}. 
[jira] [Commented] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17791958#comment-17791958 ] Branimir Lambov commented on CASSANDRA-19126: - I believe what Brandon means is that we also need upgrade tests where only some nodes have changed {{storage_compatibility_mode}}. [This line|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessagingService.java#L259] is what appears to be preventing {{BulkLoader}} from working. I don't have enough knowledge in the area and have not dug deep enough to understand all implications. > Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Priority: Normal > Fix For: 5.0-rc, 5.x > > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
[ https://issues.apache.org/jira/browse/CASSANDRA-19126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19126: Description: In particular, SSTableLoader appears to be incompatible with storage_compatibility_mode: NONE, which manifests as a failure of {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} when the flag is turned on (found during CASSANDRA-18753 testing). Setting {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not help (according to the docs, this setting is not picked up). This is likely a bigger problem as the acceptable streaming version for C* 5 is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not appear to be able to stream with each other if their setting for the compatibility mode is different. was: In particular, SSTableLoader appears to be incompatible with storage_compatibility_mode: NONE, which manifests as a failure of `org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest` when the flag is turned on (found during CASSANDRA-18753 testing). Setting `storage_compatibility_mode: NONE` in the tool configuration yaml does not help (according to the docs, this setting is not picked up). This is likely a bigger problem as the acceptable streaming version for C* 5 is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not appear to be able to stream with each other if their setting for the compatibility mode is different. 
> Streaming appears to be incompatible with different > storage_compatibility_mode settings > --- > > Key: CASSANDRA-19126 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Streaming, Legacy/Streaming and Messaging, > Messaging/Internode, Tool/bulk load >Reporter: Branimir Lambov >Priority: Normal > > In particular, SSTableLoader appears to be incompatible with > storage_compatibility_mode: NONE, which manifests as a failure of > {{org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest}} > when the flag is turned on (found during CASSANDRA-18753 testing). Setting > {{storage_compatibility_mode: NONE}} in the tool configuration yaml does not > help (according to the docs, this setting is not picked up). > This is likely a bigger problem as the acceptable streaming version for C* 5 > is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not > appear to be able to stream with each other if their setting for the > compatibility mode is different. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-19126) Streaming appears to be incompatible with different storage_compatibility_mode settings
Branimir Lambov created CASSANDRA-19126: --- Summary: Streaming appears to be incompatible with different storage_compatibility_mode settings Key: CASSANDRA-19126 URL: https://issues.apache.org/jira/browse/CASSANDRA-19126 Project: Cassandra Issue Type: Bug Components: Consistency/Streaming, Legacy/Streaming and Messaging, Messaging/Internode, Tool/bulk load Reporter: Branimir Lambov In particular, SSTableLoader appears to be incompatible with storage_compatibility_mode: NONE, which manifests as a failure of `org.apache.cassandra.distributed.test.SSTableLoaderEncryptionOptionsTest` when the flag is turned on (found during CASSANDRA-18753 testing). Setting `storage_compatibility_mode: NONE` in the tool configuration yaml does not help (according to the docs, this setting is not picked up). This is likely a bigger problem as the acceptable streaming version for C* 5 is 12 only in legacy mode and 13 only in none, i.e. two C* 5 nodes do not appear to be able to stream with each other if their setting for the compatibility mode is different.
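The version mismatch described above can be modelled in a few lines. This is a simplified sketch of the report's claim only (streaming version 12 accepted in legacy mode, 13 in none); the enum, its two values, and the method names are hypothetical and do not reproduce Cassandra's actual {{MessagingService}} or streaming code.

```java
// Hypothetical model of the behaviour described in the report; not Cassandra code.
final class StreamingVersionSketch
{
    // Assumed two-valued simplification of storage_compatibility_mode.
    enum CompatibilityMode { LEGACY, NONE }

    // Per the report: only 12 is accepted in legacy mode, only 13 in none.
    static int acceptedStreamingVersion(CompatibilityMode mode)
    {
        return mode == CompatibilityMode.LEGACY ? 12 : 13;
    }

    // Two nodes can stream only if they accept the same single version.
    static boolean canStream(CompatibilityMode sender, CompatibilityMode receiver)
    {
        return acceptedStreamingVersion(sender) == acceptedStreamingVersion(receiver);
    }

    public static void main(String[] args)
    {
        for (CompatibilityMode a : CompatibilityMode.values())
            for (CompatibilityMode b : CompatibilityMode.values())
                System.out.println(a + " -> " + b + ": " + (canStream(a, b) ? "compatible" : "incompatible"));
    }
}
```

Because each mode accepts exactly one version in this model, any pair of nodes with different settings has no version in common, which matches the concern raised in the description.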
[jira] [Updated] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE
[ https://issues.apache.org/jira/browse/CASSANDRA-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19085: Fix Version/s: 5.0-rc > In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE > --- > > Key: CASSANDRA-19085 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19085 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair >Reporter: Branimir Lambov >Priority: Normal > Fix For: 5.0-rc > > > More precisely, when the {{MessagingService}} version to {{{}VERSION_50{}}}, > the test fails with an exception that appears to be a genuine problem: > {code:java} > junit.framework.AssertionFailedError: Exception found expected null, but > was: at > org.apache.cassandra.service.ActiveRepairService.lambda$prepareForRepair$2(ActiveRepairService.java:678) > at > org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) > at > java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) > at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) > at > java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > at java.base/java.lang.Thread.run(Thread.java:833) > > > at > org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129) > at > org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164) > at > org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124) > at > 
org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > org.apache.cassandra.distributed.shared.ShutdownException: Uncaught > exceptions were thrown during test > at > org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117) > at > org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103) > at > org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Suppressed: java.lang.IllegalStateException: complete already: > (failure: java.lang.RuntimeException: Did not get replies from all endpoints.) 
> at > org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106) > at > org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721) > at > org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697) > at > org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187) > at > org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58) > at > org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78) > at > org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64) > at > org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50) > at > org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97) > at > org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45) > at > org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430) > at > org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) > at > org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143) > at > io.netty.util.concurrent.FastThreadLocalRunnable
[jira] [Updated] (CASSANDRA-18753) Add an optimized default configuration to tests and make it available for new users
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18753: Fix Version/s: 5.0-rc (was: 5.0.x) > Add an optimized default configuration to tests and make it available for new > users > --- > > Key: CASSANDRA-18753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18753 > Project: Cassandra > Issue Type: Improvement > Components: Packaging >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Urgent > Fix For: 5.0-rc, 5.x > > Attachments: > CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch, > > DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch > > Time Spent: 3h 20m > Remaining Estimate: 0h > > We currently offer only one sample configuration file with Cassandra, and > that file is deliberately configured to disable all new functionality and > incompatible improvements. This works well for legacy users that want to have > a painless upgrade, but is a very bad choice for new users, or anyone wanting > to make comparisons between Cassandra versions or between Cassandra and other > databases. > We offer very little indication, in the database packaging itself, that there > are well-tested configuration choices that can solve known problems and > dramatically improve performance. This is guaranteed to paint the database in > a worse light than it deserves, and will very likely hurt adoption. > We should find a way to offer a very easy way of choosing between "optimized" > and "compatible" defaults. At minimal, we could provide alternate yaml files. > Alternatively, we could build on the {{storage_compatibility_mode}} concept > to grow it into a setting that not only enables/disables certain settings, > but also changes their default values. 
[jira] [Updated] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE
[ https://issues.apache.org/jira/browse/CASSANDRA-19085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19085: Description: More precisely, when the {{MessagingService}} version is set to {{VERSION_50}}, the test fails with an exception that appears to be a genuine problem: {code:java} junit.framework.AssertionFailedError: Exception found expected null, but was: at org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129) at org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164) at org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124) at org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) org.apache.cassandra.distributed.shared.ShutdownException: Uncaught exceptions were thrown during test at org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117) at org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103) at org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) Suppressed: java.lang.IllegalStateException: complete already: (failure: java.lang.RuntimeException: Did not get 
replies from all endpoints.) at org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106) at org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721) at org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697) at org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187) at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58) at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78) at org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64) at org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50) at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97) at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45) at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430) at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:833){code} The updates to {{pending}} in ActiveRepairService are not concurrency-safe, but fixing them by doing e.g. 
{code:java} Index: src/java/org/apache/cassandra/service/ActiveRepairService.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 === diff --git a/src/java/org/apache/cassandra/service/ActiveRepairService.java b/src/java/org/apache/cassandra/service/ActiveRepairService.java --- a/src/java/org/apache/cassandra/service/ActiveRepairService.java (revision 04552046f74f596e69e2d98c3f3e522fb5888c99) +++ b/src/java/org/apache/cassandra/service/ActiveRepairService.java (date 1700839874092) @@ -675,7 +675,7 @@ if (promise.isDone()) return; String errorMsg = "Did not get replies from all endpoints."; - if (promise.tryFailure(new RuntimeException(errorMsg))) + if (pending.getAndSet(-1) > 0 && promise.tryFailure(new RuntimeException(errorMsg))) participateFailed(parentRepairSession, errorMsg); }, timeoutMillis, MILLISECONDS); @@ -703,8 +703,8 @@ failedNodes.add(from.toString()); if (failureReason == RequestFailureReason.TIMEOUT) { - pen
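The message is cut off partway through the diff, so only the timeout-side change ({{pending.getAndSet(-1) > 0 && promise.tryFailure(...)}}) is visible. The sketch below illustrates the general idea of letting the counter decide which path owns the outcome, so that a late reply cannot try to complete an already-failed promise. It is an assumption-laden stand-in: it uses the JDK's {{CompletableFuture}} instead of Cassandra's {{AsyncPromise}}, all class and method names are hypothetical, and the ack-side logic is a guess, since that part of the patch is not present in the message.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of the counter-guard pattern; not the actual patch.
final class PendingAckSketch
{
    private final AtomicInteger pending;
    final CompletableFuture<Void> promise = new CompletableFuture<>();

    PendingAckSketch(int expectedReplies)
    {
        pending = new AtomicInteger(expectedReplies);
    }

    // Timeout path, mirroring the visible diff line: atomically swap in -1 and
    // fail the promise only if replies were still outstanding at that moment.
    boolean onTimeout()
    {
        return pending.getAndSet(-1) > 0
               && promise.completeExceptionally(new RuntimeException("Did not get replies from all endpoints."));
    }

    // Ack path (assumed): decrement only while the counter is positive, so a
    // reply arriving after the timeout is ignored instead of touching the promise.
    boolean onAck()
    {
        while (true)
        {
            int current = pending.get();
            if (current <= 0)
                return false; // already timed out or already complete
            if (pending.compareAndSet(current, current - 1))
            {
                if (current == 1)
                    promise.complete(null); // last expected reply
                return true;
            }
        }
    }

    public static void main(String[] args)
    {
        PendingAckSketch sketch = new PendingAckSketch(2);
        System.out.println("first ack counted: " + sketch.onAck());
        System.out.println("timeout failed the promise: " + sketch.onTimeout());
        System.out.println("late ack counted: " + sketch.onAck());
        System.out.println("promise failed: " + sketch.promise.isCompletedExceptionally());
    }
}
```

Note that {{CompletableFuture.complete}} silently returns false when the future is already done, whereas the {{setSuccess}} call in the trace throws; the guard on the counter is what keeps the late ack from reaching the promise at all in this sketch.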
[jira] [Created] (CASSANDRA-19085) In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE
Branimir Lambov created CASSANDRA-19085: --- Summary: In-jvm dtest RepairTest fails with storage_compatibility_mode: NONE Key: CASSANDRA-19085 URL: https://issues.apache.org/jira/browse/CASSANDRA-19085 Project: Cassandra Issue Type: Bug Components: Consistency/Repair Reporter: Branimir Lambov More precisely, when the {{MessagingService}} version is set to {{VERSION_50}}, the test fails with an exception that appears to be a genuine problem: {code:java} junit.framework.AssertionFailedError: Exception found expected null, but was: at org.apache.cassandra.distributed.test.DistributedRepairUtils.lambda$assertParentRepairSuccess$4(DistributedRepairUtils.java:129) at org.apache.cassandra.distributed.test.DistributedRepairUtils.validateExistingParentRepair(DistributedRepairUtils.java:164) at org.apache.cassandra.distributed.test.DistributedRepairUtils.assertParentRepairSuccess(DistributedRepairUtils.java:124) at org.apache.cassandra.distributed.test.RepairTest.testForcedNormalRepairWithOneNodeDown(RepairTest.java:211) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) org.apache.cassandra.distributed.shared.ShutdownException: Uncaught exceptions were thrown during test at org.apache.cassandra.distributed.impl.AbstractCluster.checkAndResetUncaughtExceptions(AbstractCluster.java:1117) at org.apache.cassandra.distributed.impl.AbstractCluster.close(AbstractCluster.java:1103) at org.apache.cassandra.distributed.test.RepairTest.closeCluster(RepairTest.java:160) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
Suppressed: java.lang.IllegalStateException: complete already: (failure: java.lang.RuntimeException: Did not get replies from all endpoints.) at org.apache.cassandra.utils.concurrent.AsyncPromise.setSuccess(AsyncPromise.java:106) at org.apache.cassandra.service.ActiveRepairService$2.ack(ActiveRepairService.java:721) at org.apache.cassandra.service.ActiveRepairService$2.onResponse(ActiveRepairService.java:697) at org.apache.cassandra.repair.messages.RepairMessage$2.onResponse(RepairMessage.java:187) at org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:58) at org.apache.cassandra.net.InboundSink.lambda$new$0(InboundSink.java:78) at org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:64) at org.apache.cassandra.net.InboundSink$Filtered.accept(InboundSink.java:50) at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:97) at org.apache.cassandra.net.InboundSink.accept(InboundSink.java:45) at org.apache.cassandra.net.InboundMessageHandler$ProcessMessage.run(InboundMessageHandler.java:430) at org.apache.cassandra.concurrent.ExecutionFailure$1.run(ExecutionFailure.java:133) at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:143) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:833){code} The updates to {{pending}} in ActiveRepairService are not concurrency-safe, but fixing them by doing e.g. 
{code:java} Index: src/java/org/apache/cassandra/service/ActiveRepairService.java IDEA additional info: Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP <+>UTF-8 === diff --git a/src/java/org/apache/cassandra/service/ActiveRepairService.java b/src/java/org/apache/cassandra/service/ActiveRepairService.java --- a/src/java/org/apache/cassandra/service/ActiveRepairService.java (revision 04552046f74f596e69e2d98c3f3e522fb5888c99) +++ b/src/java/org/apache/cassandra/service/ActiveRepairService.java (date 1700839874092) @@ -675,7 +675,7 @@ if (promise.isDone()) return; String errorMsg = "Did not get replies from all endpoints."; - if (promise.tryFailure(new RuntimeException(errorMsg))) + if (pending.getAndSet(-1) > 0 && promise.tryFailure(new RuntimeException(errorMsg))) participateFailed(parentRepairSession, errorMsg); }, timeoutMillis, M
[jira] [Commented] (CASSANDRA-18757) UnifiedCompactionTask is incorrectly setting keepOriginals
[ https://issues.apache.org/jira/browse/CASSANDRA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789130#comment-17789130 ] Branimir Lambov commented on CASSANDRA-18757: - Tests look good, repeated test completed with no failures: [https://app.circleci.com/pipelines/github/blambov/cassandra?branch=CASSANDRA-18757] [~smiklosovic], do you give a second approval so that I can commit this? > UnifiedCompactionTask is incorrectly setting keepOriginals > -- > > Key: CASSANDRA-18757 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18757 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 5.0-beta, 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > {code:java} > super(cfs, txn, gcBefore, > strategy.getController().getIgnoreOverlapsInExpirationCheck());{code} > in {{UnifiedCompactionTask}} is calling the base constructor > {code:java} > public CompactionTask(ColumnFamilyStore cfs, LifecycleTransaction txn, long > gcBefore, boolean keepOriginals) > {code} > which can set {{keepOriginals}} to true when it should not be. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
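For readers who want to see why the compiler does not catch this, here is a minimal sketch of the pattern described in the ticket. The classes are hypothetical stand-ins (they are not {{CompactionTask}} or {{UnifiedCompactionTask}}), only the shape of the constructor is borrowed from the description, and the "explicit" variant is just one possible correction, not necessarily what the committed patch does.

```java
// Hypothetical stand-ins; only the constructor shape from the ticket is reproduced.
class BaseTaskSketch
{
    final long gcBefore;
    final boolean keepOriginals;

    BaseTaskSketch(long gcBefore, boolean keepOriginals)
    {
        this.gcBefore = gcBefore;
        this.keepOriginals = keepOriginals;
    }
}

class BuggyTaskSketch extends BaseTaskSketch
{
    BuggyTaskSketch(long gcBefore, boolean ignoreOverlaps)
    {
        // Compiles without warning: the unrelated flag silently becomes keepOriginals.
        super(gcBefore, ignoreOverlaps);
    }
}

class ExplicitTaskSketch extends BaseTaskSketch
{
    final boolean ignoreOverlaps;

    ExplicitTaskSketch(long gcBefore, boolean ignoreOverlaps)
    {
        // Assumed correction for illustration: state keepOriginals explicitly
        // and keep the other flag in its own field.
        super(gcBefore, false);
        this.ignoreOverlaps = ignoreOverlaps;
    }
}
```

With {{ignoreOverlaps}} set to true, the first subclass ends up with {{keepOriginals}} true as well, while the second keeps the two settings independent.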
[jira] [Comment Edited] (CASSANDRA-18753) We should offer an option for optimized default configuration
[ https://issues.apache.org/jira/browse/CASSANDRA-18753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17789052#comment-17789052 ] Branimir Lambov edited comment on CASSANDRA-18753 at 11/23/23 10:20 AM: DTest support has been added. The python dtests require pull requests for [CCM|https://github.com/riptano/ccm/pull/760] and [cassandra-dtest|https://github.com/apache/cassandra-dtest/pull/243] to be merged. It works by passing an argument to ccm to make it read the configuration from "cassandra_latest.yaml". The new configuration replaces {{{}dtest_offheap{}}}, as the offheap setting for memtables is also turned on in the latest configuration. I'm not happy at all with how the in-jvm dtests are configured at this point (directly including the settings in code), but I could not think of a quick way to get them to load a configuration file. The latest config is combined with vnodes to lighten the testing load. Test results to appear [here|https://app.circleci.com/pipelines/github/blambov/cassandra/567/workflows/aa84b1f1-b138-42a8-8e81-dd149c87224e]. was (Author: blambov): DTest support has been added. The python dtests require pull requests for [CCM|https://github.com/riptano/ccm/pull/760] and [cassandra-dtest|https://github.com/apache/cassandra-dtest/pull/243] to be merged. It works by passing an argument to ccm to make it read the configuration from "cassandra_latest.yaml". The new configuration replaces {{{}dtest_offheap{}}}, as the offheap setting for memtables is also turned on in the latest configuration. I'm not happy at all with how the in-jvm dtests are configured at this point (directly including the settings in code), but I could not think of a quick way to get them to load a configuration file. Test results to appear [here|https://app.circleci.com/pipelines/github/blambov/cassandra/567/workflows/aa84b1f1-b138-42a8-8e81-dd149c87224e]. 
> We should offer an option for optimized default configuration > - > > Key: CASSANDRA-18753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18753 > Project: Cassandra > Issue Type: Improvement > Components: Packaging >Reporter: Branimir Lambov >Assignee: Branimir Lambov >Priority: Urgent > Fix For: 5.0.x, 5.x > > Attachments: > CCM_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch, > > DTEST_Add_support_for_specifying_the_name_of_the_file_to_use_as_cassandra_YAML_.patch > > Time Spent: 2.5h > Remaining Estimate: 0h > > We currently offer only one sample configuration file with Cassandra, and > that file is deliberately configured to disable all new functionality and > incompatible improvements. This works well for legacy users who want to have > a painless upgrade, but is a very bad choice for new users, or anyone wanting > to make comparisons between Cassandra versions or between Cassandra and other > databases. > We offer very little indication, in the database packaging itself, that there > are well-tested configuration choices that can solve known problems and > dramatically improve performance. This is guaranteed to paint the database in > a worse light than it deserves, and will very likely hurt adoption. > We should offer a very easy way of choosing between "optimized" > and "compatible" defaults. At a minimum, we could provide alternate yaml files. > Alternatively, we could build on the {{storage_compatibility_mode}} concept > to grow it into a setting that not only enables/disables certain settings, > but also changes their default values. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18757) UnifiedCompactionTask is incorrectly setting keepOriginals
[ https://issues.apache.org/jira/browse/CASSANDRA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17788730#comment-17788730 ] Branimir Lambov commented on CASSANDRA-18757: - How about splitting this into separate tests for the 4 cases? I.e. have the four calls in {{testIgnoreOverlaps}} run in separate {{@Test}}-annotated methods? > UnifiedCompactionTask is incorrectly setting keepOriginals > -- > > Key: CASSANDRA-18757 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18757 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 5.0-beta, 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > {code:java} > super(cfs, txn, gcBefore, > strategy.getController().getIgnoreOverlapsInExpirationCheck());{code} > in {{UnifiedCompactionTask}} is calling the base constructor > {code:java} > public CompactionTask(ColumnFamilyStore cfs, LifecycleTransaction txn, long > gcBefore, boolean keepOriginals) > {code} > which can set {{keepOriginals}} to true when it should not be. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-19046) Paxos V2 does not update individual fields of readMetrics
[ https://issues.apache.org/jira/browse/CASSANDRA-19046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-19046: Summary: Paxos V2 does not update individual fields of readMetrics (was: Paxos V2 does not individual fields of readMetrics) > Paxos V2 does not update individual fields of readMetrics > - > > Key: CASSANDRA-19046 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19046 > Project: Cassandra > Issue Type: Bug > Components: Feature/Lightweight Transactions, Observability/Metrics >Reporter: Branimir Lambov >Priority: Normal > > As a result, {{ClientMetricsTest.testPaxosStatement}} is failing with > {{paxos_variant: v2}}. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19034) SelectTest fails when run with SAI index
[ https://issues.apache.org/jira/browse/CASSANDRA-19034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787323#comment-17787323 ] Branimir Lambov commented on CASSANDRA-19034: - Yes, we have run the entire unit test suite (no dtests yet) with SAI as default, and these three are the only failures that are not caused by use cases SAI cannot support (ByteOrderedPartitioner and blobs). With CASSANDRA-18753, we will have a test configuration, run as part of the precommit tests, that runs with SAI (plus tries, UCS, paxos v2...). > SelectTest fails when run with SAI index > > > Key: CASSANDRA-19034 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19034 > Project: Cassandra > Issue Type: Bug > Components: Feature/SAI >Reporter: Branimir Lambov >Priority: Normal > Fix For: 5.0-beta > > > When run with SAI index, the following two tests error out: > {code} > [junit-timeout] Testcase: > testContainsKeyAndContainsWithIndexOnMapValue(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11: >FAILED > [junit-timeout] Got less rows than expected. Expected 1 but got 0 > [junit-timeout] junit.framework.AssertionFailedError: Got less rows than > expected. 
Expected 1 but got 0 > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testContainsKeyAndContainsWithIndexOnMapValue$9(SelectTest.java:625) > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2238) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.testContainsKeyAndContainsWithIndexOnMapValue(SelectTest.java:618) > [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > [junit-timeout] at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > [junit-timeout] > [junit-timeout] > [junit-timeout] Testcase: > testFilterWithIndexForContains(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11: > FAILED > [junit-timeout] Invalid value for row 1 column 0 (k1 of type int), expected > <1> but got <0> > [junit-timeout] Invalid value for row 1 column 2 (v of type set), > expected <{4, 5, 6}> but got <{2, 3, 4}> > [junit-timeout] > [junit-timeout] junit.framework.AssertionFailedError: Invalid value for row 1 > column 0 (k1 of type int), expected <1> but got <0> > [junit-timeout] Invalid value for row 1 column 2 (v of type set), > expected <{4, 5, 6}> but got <{2, 3, 4}> > [junit-timeout] > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1826) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testFilterWithIndexForContains$6(SelectTest.java:543) > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2240) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.testFilterWithIndexForContains(SelectTest.java:542) 
> [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > [junit-timeout] at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code} > The latter seems to be giving the results in the wrong order, and the order > flips when the data is flushed. > Caught during preparation of _latest config that would switch default to SAI > (CASSANDRA-18753). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-19034) SelectTest fails when run with SAI index
[ https://issues.apache.org/jira/browse/CASSANDRA-19034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17787279#comment-17787279 ] Branimir Lambov commented on CASSANDRA-19034: - A further failure of this kind: {code} [junit-timeout] Testcase: testStaticIndexAndNonStaticIndex(org.apache.cassandra.cql3.validation.entities.SecondaryIndexOnStaticColumnTest)-_jdk11: FAILED [junit-timeout] Got less rows than expected. Expected 1 but got 0 [junit-timeout] junit.framework.AssertionFailedError: Got less rows than expected. Expected 1 but got 0 [junit-timeout] at org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849) [junit-timeout] at org.apache.cassandra.cql3.validation.entities.SecondaryIndexOnStaticColumnTest.testStaticIndexAndNonStaticIndex(SecondaryIndexOnStaticColumnTest.java:191) [junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [junit-timeout] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [junit-timeout] [junit-timeout] [junit-timeout] Test org.apache.cassandra.cql3.validation.entities.SecondaryIndexOnStaticColumnTest FAILED {code} > SelectTest fails when run with SAI index > > > Key: CASSANDRA-19034 > URL: https://issues.apache.org/jira/browse/CASSANDRA-19034 > Project: Cassandra > Issue Type: Bug > Components: Feature/SAI >Reporter: Branimir Lambov >Priority: Normal > > When run with SAI index, the following two tests error out: > {code} > [junit-timeout] Testcase: > testContainsKeyAndContainsWithIndexOnMapValue(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11: >FAILED > [junit-timeout] Got less rows than expected. Expected 1 but got 0 > [junit-timeout] junit.framework.AssertionFailedError: Got less rows than > expected. 
Expected 1 but got 0 > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testContainsKeyAndContainsWithIndexOnMapValue$9(SelectTest.java:625) > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2238) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.testContainsKeyAndContainsWithIndexOnMapValue(SelectTest.java:618) > [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > [junit-timeout] at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > [junit-timeout] > [junit-timeout] > [junit-timeout] Testcase: > testFilterWithIndexForContains(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11: > FAILED > [junit-timeout] Invalid value for row 1 column 0 (k1 of type int), expected > <1> but got <0> > [junit-timeout] Invalid value for row 1 column 2 (v of type set), > expected <{4, 5, 6}> but got <{2, 3, 4}> > [junit-timeout] > [junit-timeout] junit.framework.AssertionFailedError: Invalid value for row 1 > column 0 (k1 of type int), expected <1> but got <0> > [junit-timeout] Invalid value for row 1 column 2 (v of type set), > expected <{4, 5, 6}> but got <{2, 3, 4}> > [junit-timeout] > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1826) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testFilterWithIndexForContains$6(SelectTest.java:543) > [junit-timeout] at > org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2240) > [junit-timeout] at > org.apache.cassandra.cql3.validation.operations.SelectTest.testFilterWithIndexForContains(SelectTest.java:542) 
> [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > [junit-timeout] at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > [junit-timeout] at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code} > The latter seems to be giving the results in the wrong order, and the order > flips when the data is flushed. > Caught during preparation of _latest config that would switch default to SAI > (CASSANDRA-18753). -- This message was sent by Atlassian Jira (v8.20.10#820010) -
[jira] [Created] (CASSANDRA-19034) SelectTest fails when run with SAI index
Branimir Lambov created CASSANDRA-19034: --- Summary: SelectTest fails when run with SAI index Key: CASSANDRA-19034 URL: https://issues.apache.org/jira/browse/CASSANDRA-19034 Project: Cassandra Issue Type: Bug Components: Feature/SAI Reporter: Branimir Lambov When run with SAI index, the following two tests error out: {code} [junit-timeout] Testcase: testContainsKeyAndContainsWithIndexOnMapValue(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11: FAILED [junit-timeout] Got less rows than expected. Expected 1 but got 0 [junit-timeout] junit.framework.AssertionFailedError: Got less rows than expected. Expected 1 but got 0 [junit-timeout] at org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1849) [junit-timeout] at org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testContainsKeyAndContainsWithIndexOnMapValue$9(SelectTest.java:625) [junit-timeout] at org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2238) [junit-timeout] at org.apache.cassandra.cql3.validation.operations.SelectTest.testContainsKeyAndContainsWithIndexOnMapValue(SelectTest.java:618) [junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [junit-timeout] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [junit-timeout] [junit-timeout] [junit-timeout] Testcase: testFilterWithIndexForContains(org.apache.cassandra.cql3.validation.operations.SelectTest)-_jdk11: FAILED [junit-timeout] Invalid value for row 1 column 0 (k1 of type int), expected <1> but got <0> [junit-timeout] Invalid value for row 1 column 2 (v of type set), expected <{4, 5, 6}> but got <{2, 3, 4}> [junit-timeout] [junit-timeout] junit.framework.AssertionFailedError: Invalid value for row 1 column 0 (k1 of type int), expected <1> but got <0> [junit-timeout] Invalid 
value for row 1 column 2 (v of type set), expected <{4, 5, 6}> but got <{2, 3, 4}> [junit-timeout] [junit-timeout] at org.apache.cassandra.cql3.CQLTester.assertRows(CQLTester.java:1826) [junit-timeout] at org.apache.cassandra.cql3.validation.operations.SelectTest.lambda$testFilterWithIndexForContains$6(SelectTest.java:543) [junit-timeout] at org.apache.cassandra.cql3.CQLTester.beforeAndAfterFlush(CQLTester.java:2240) [junit-timeout] at org.apache.cassandra.cql3.validation.operations.SelectTest.testFilterWithIndexForContains(SelectTest.java:542) [junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit-timeout] at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) [junit-timeout] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) {code} The latter seems to be giving the results in the wrong order, and the order flips when the data is flushed. Caught during preparation of _latest config that would switch default to SAI (CASSANDRA-18753). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
[ https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786290#comment-17786290 ] Branimir Lambov edited comment on CASSANDRA-18710 at 11/15/23 10:15 AM: {quote}So perhaps the expected value should be calculated as a moving average by updating it with subsequent table sizes. {quote} This makes sense. Sorting the sstable files by name should give them in the correct order, so we can easily calculate the moving average from them. Actually, that would solve the extra flush problem as well, wouldn't it? was (Author: blambov): {quote}So perhaps the expected value should be calculated as a moving average by updating it with subsequent table sizes. {quote} This makes sense. Sorting the sstable files by name should give them in the correct order, so we can easily calculate the moving average from them. > Test failure: > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from > org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17) > -- > > Key: CASSANDRA-18710 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18710 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Jacek Lewandowski >Priority: Normal > Fix For: 4.1.x, 5.0-beta, 5.0.x, 5.x > > Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt > > > Seen here: > [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/] > h3. 
> {code:java} > Error Message > expected:<7200.0> but was:<1367.83970468544> > Stacktrace > junit.framework.AssertionFailedError: expected:<7200.0> but > was:<1367.83970468544> at > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
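As a rough illustration of the idea in the comment above (sort the sstable files by name, then fold their sizes into a moving average), here is a minimal sketch. Everything in it is an assumption made for the example: the class and method names, the exponential form of the average, the smoothing factor {{alpha}}, and the file names. It does not claim to match how the flush-size metric or the test actually computes its value.

```java
import java.util.Map;
import java.util.TreeMap;

final class FlushSizeAverageSketch
{
    /**
     * Folds file sizes into an exponential moving average, visiting files in
     * name order. alpha is a hypothetical smoothing factor in (0, 1].
     * Returns NaN when there are no files.
     */
    static double movingAverage(Map<String, Long> sizesByFileName, double alpha)
    {
        double average = Double.NaN;
        // TreeMap iterates keys in lexicographic order. The sketch assumes the
        // names sort in flush order, which plain string sorting only gives
        // when the numeric ids have the same number of digits.
        for (long size : new TreeMap<>(sizesByFileName).values())
            average = Double.isNaN(average) ? size : average + alpha * (size - average);
        return average;
    }
}
```

The first file seeds the average and every later file pulls it towards its own size, so an extra flush simply becomes one more sample rather than invalidating a fixed expected value.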
[jira] [Commented] (CASSANDRA-18757) UnifiedCompactionTask is incorrectly setting keepOriginals
[ https://issues.apache.org/jira/browse/CASSANDRA-18757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786282#comment-17786282 ] Branimir Lambov commented on CASSANDRA-18757: - I think it is a leftover from a refactoring that (among other things) fixed CASSANDRA-18756 in DSE. Fix LGTM, but it's a shame that no test caught it. > UnifiedCompactionTask is incorrectly setting keepOriginals > -- > > Key: CASSANDRA-18757 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18757 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > {code:java} > super(cfs, txn, gcBefore, > strategy.getController().getIgnoreOverlapsInExpirationCheck());{code} > in {{UnifiedCompactionTask}} is calling the base constructor > {code:java} > public CompactionTask(ColumnFamilyStore cfs, LifecycleTransaction txn, long > gcBefore, boolean keepOriginals) > {code} > which can set {{keepOriginals}} to true when it should not be. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Comment Edited] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782692#comment-17782692 ] Branimir Lambov edited comment on CASSANDRA-18945 at 11/3/23 6:15 PM: -- {quote}+ assert !(count < 0); // Must be positive, 0 or NaN, which should translate to baseShardCount Review Comment: @ethan-brown2022 `count >= 0` is more natural to me {quote} I can't find this to reply to it directly. The comment at the end of the line says that {{count}} can be {{{}NaN{}}}, which will fail {{count >= 0}} but pass {{{}!(count < 0){}}}. Perhaps we should change the bit after NaN to "(which would fail count >= 0, but is acceptable and should translate to baseShardCount)" or something similar? was (Author: blambov): {quote}+ assert !(count < 0); // Must be positive, 0 or NaN, which should translate to baseShardCount Review Comment: @ethan-brown2022 `count >= 0` is more natural to me {quote} I can't find this to reply to it directly. The comment at the end of the line says that {{count}} can be {{{}NaN{}}}, which will fail {{count >= 0}} but pass {{{}!(count < 0){}}}. Perhaps we should change the bit after NaN to "(which would fail {{{}count >= 0,{}}}", but is acceptable and should translate to baseShardCount)" or something similar? 
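To make the difference between the two forms of the check concrete, here is a small standalone sketch. The class and method names are made up for illustration and are not part of the patch; the only thing it relies on is that every ordered comparison involving {{NaN}} evaluates to false in Java.

```java
final class NanAssertSketch
{
    // The form used in the patch: true for positive values, zero and NaN.
    static boolean notNegative(double count)
    {
        return !(count < 0);
    }

    // The form suggested in review: true for positive values and zero only.
    static boolean greaterOrEqualZero(double count)
    {
        return count >= 0;
    }

    public static void main(String[] args)
    {
        for (double count : new double[]{ 2.0, 0.0, -1.0, Double.NaN })
            System.out.println(count + ": !(count < 0) = " + notNegative(count)
                               + ", count >= 0 = " + greaterOrEqualZero(count));
    }
}
```

The two checks agree for every ordinary number and differ only for {{NaN}}, which is exactly the case the code comment is meant to call out.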
> Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 5.0-beta > > Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, > file_ucs_shenandoah_off_heap_memtable.html, > file_ucs_shenandoah_on_heap_memtable_2.html, > file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html > > Time Spent: 1h 50m > Remaining Estimate: 0h > > The unified compaction strategy currently aims to create sstables with close > to the same size, defaulting to 1 GiB. Unfortunately tests show that > Cassandra starts to have performance problems when the number of sstables > grows to the order of a thousand, and in particular that even 1 TiB of data > with the default configuration is creating too many sstables for efficient > processing. This matters even more for SAI, where the number of sstables in > the system can have a proportional effect on the complexity of operations. > It is quite easy to create a configuration option that allows sstables to > take some part of the data growth by adding a multiplier to [the shard count > calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding] > formula, replacing > {{2 ^ round(log2(d / (t * b))) * b}} > with > {{2 ^ round((1 - 𝜆) * log2(d / (t * b))) * b}}, > where 𝜆 is a parameter whose value is between 0 and 1. > With this, a 𝜆 of 0.5 would mean that shard count and sstable size grow in > parallel at the square root of the data size growth. 0 would result in no > growth, and 1 in always using the same number of shards. > It may also be valuable to introduce a threshold for engaging the base shard > count to avoid splitting lowest-level sstables into fragments that are too > small. 
> Once both of these are in place, we can set defaults that better suit all > node densities, including 10 TiB and beyond, for example: > - target size of 1 GiB > - 𝜆 of 1/3 > - base shard count of 4 > - minimum size 100 MiB -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
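The effect of 𝜆 in the formula quoted above can be checked with a few lines of code. This is a sketch under stated assumptions, not the strategy's implementation: the class and method names are invented, the exponent is clamped at zero as a simplification for data smaller than {{t * b}}, and the proposed minimum-size threshold is left out entirely.

```java
final class ShardCountSketch
{
    /**
     * Evaluates 2 ^ round((1 - lambda) * log2(d / (t * b))) * b.
     * d = data size, t = target sstable size, b = base shard count.
     */
    static long shardCount(double d, double t, int b, double lambda)
    {
        double log2 = Math.log(d / (t * b)) / Math.log(2);
        // Assumption of this sketch: never go below the base shard count.
        long exponent = Math.max(0, Math.round((1 - lambda) * log2));
        return (1L << exponent) * b;
    }

    public static void main(String[] args)
    {
        double gib = 1L << 30;
        double tib = 1L << 40;
        for (double lambda : new double[]{ 0.0, 1.0 / 3, 0.5, 1.0 })
            System.out.println("lambda = " + lambda + " -> " + shardCount(tib, gib, 4, lambda) + " shards");
    }
}
```

For 1 TiB of data, a 1 GiB target and a base shard count of 4, this gives 1024 shards for 𝜆 = 0, 128 for 𝜆 = 1/3, 64 for 𝜆 = 0.5 and 4 for 𝜆 = 1, i.e. sstables of roughly 1 GiB, 8 GiB, 16 GiB and 256 GiB respectively.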
[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782692#comment-17782692 ] Branimir Lambov commented on CASSANDRA-18945: - {quote}+ assert !(count < 0); // Must be positive, 0 or NaN, which should translate to baseShardCount Review Comment: @ethan-brown2022 `count >= 0` is more natural to me {quote} I can't find this to reply to it directly. The comment at the end of the line says that {{count}} can be {{{}NaN{}}}, which will fail {{count >= 0}} but pass {{{}!(count < 0){}}}. Perhaps we should change the bit after NaN to "(which would fail {{{}count >= 0,{}}}", but is acceptable and should translate to baseShardCount)" or something similar? > Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 5.0-beta > > Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, > file_ucs_shenandoah_off_heap_memtable.html, > file_ucs_shenandoah_on_heap_memtable_2.html, > file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html > > Time Spent: 1h 50m > Remaining Estimate: 0h > > The unified compaction strategy currently aims to create sstables with close > to the same size, defaulting to 1 GiB. Unfortunately tests show that > Cassandra starts to have performance problems when the number of sstables > grows to the order of a thousand, and in particular that even 1 TiB of data > with the default configuration is creating too many sstables for efficient > processing. This matters even more for SAI, where the number of sstables in > the system can have a proportional effect on the complexity of operations. 
> It is quite easy to create a configuration option that allows sstables to > take some part of the data growth by adding a multiplier to [the shard count > calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding] > formula, replacing > {{2 ^ round(log2(d / (t * b))) * b}} > with > {{2 ^ round((1 - 𝜆) * log2(d / (t * b))) * b}}, > where 𝜆 is a parameter whose value is between 0 and 1. > With this, a 𝜆 of 0.5 would mean that shard count and sstable size grow in > parallel at the square root of the data size growth. 0 would result in no > growth, and 1 in always using the same number of shards. > It may also be valuable to introduce a threshold for engaging the base shard > count to avoid splitting lowest-level sstables into fragments that are too > small. > Once both of these are in place, we can set defaults that better suit all > node densities, including 10 TiB and beyond, for example: > - target size of 1 GiB > - 𝜆 of 1/3 > - base shard count of 4 > - minimum size 100 MiB -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
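The NaN point in the comment above can be checked with a small standalone Java sketch. The class and method names are hypothetical and only mirror the two forms of the check under discussion. They are not taken from the patch.

```java
// Compares the two forms of the check discussed in the review comment.
// Hypothetical helper names; not code from the CASSANDRA-18945 patch.
final class NaNCheckSketch
{
    // Form used in the patch: accepts positive values, zero and NaN.
    static boolean negatedCheck(double count) { return !(count < 0); }

    // Form suggested in review: accepts positive values and zero, rejects NaN.
    static boolean directCheck(double count) { return count >= 0; }

    public static void main(String[] args)
    {
        double[] samples = { 4.0, 0.0, -1.0, Double.NaN };
        for (double count : samples)
            System.out.println(count + ": !(count < 0) = " + negatedCheck(count)
                               + ", count >= 0 = " + directCheck(count));
    }
}
```

Every ordered comparison involving NaN evaluates to false in Java, so the two forms agree for ordinary numbers and differ only for NaN. That is why the assert is written in the negated form.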
[jira] [Commented] (CASSANDRA-18232) Write docs for CEP-26 Unified Compaction Strategy (UCS)
[ https://issues.apache.org/jira/browse/CASSANDRA-18232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782640#comment-17782640 ] Branimir Lambov commented on CASSANDRA-18232: - There are some additional options coming with CASSANDRA-18945. The details can be found in [the developer-side markdown doc|https://github.com/datastax/cassandra/blob/CASSANDRA-18945/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#full-sharding-scheme]. > Write docs for CEP-26 Unified Compaction Strategy (UCS) > --- > > Key: CASSANDRA-18232 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18232 > Project: Cassandra > Issue Type: New Feature > Components: Documentation >Reporter: Lorina Poland >Assignee: Lorina Poland >Priority: High > Fix For: 5.x > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782638#comment-17782638 ] Branimir Lambov commented on CASSANDRA-18945: - We will handle the docs in the documentation ticket, CASSANDRA-18232. I will reach out to Lorina to make her aware of the changes. > Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 5.0-beta > > Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, > file_ucs_shenandoah_off_heap_memtable.html, > file_ucs_shenandoah_on_heap_memtable_2.html, > file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html > > Time Spent: 1h 40m > Remaining Estimate: 0h
[jira] [Created] (CASSANDRA-18997) Unified Compaction Strategy is missing documentation
Branimir Lambov created CASSANDRA-18997: --- Summary: Unified Compaction Strategy is missing documentation Key: CASSANDRA-18997 URL: https://issues.apache.org/jira/browse/CASSANDRA-18997 Project: Cassandra Issue Type: Task Components: Documentation Reporter: Branimir Lambov UCS is missing from [the CQL documentation for 5.0|https://cassandra.apache.org/doc/5.0/cassandra/developing/cql/ddl.html#cql-compaction-options] and [the compaction page|https://cassandra.apache.org/doc/5.0/cassandra/managing/operating/compaction/index.html#compaction-options]. We need to create a documentation page for UCS and link it from both. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
[ https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782610#comment-17782610 ] Branimir Lambov commented on CASSANDRA-18710: - Yes, this looks like a 4.1 regression that is affecting all tests that are sensitive to the number of sstables. Such tests usually run in a separate keyspace (using {{KEYSPACE_PER_TEST}}) to avoid the keyspace flush that dropping a table triggers, but this new commit log recycling is triggering another flush that is not restricted to the affected keyspace. > Test failure: > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from > org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17) > -- > > Key: CASSANDRA-18710 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18710 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Brandon Williams >Priority: Normal > Fix For: 5.0-beta, 5.0.x, 5.x > > Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt > > > Seen here: > [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/] > h3. > {code:java} > Error Message > expected:<7200.0> but was:<1367.83970468544> > Stacktrace > junit.framework.AssertionFailedError: expected:<7200.0> but > was:<1367.83970468544> at > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18533) Move format-specific sstable options into the format configuration
[ https://issues.apache.org/jira/browse/CASSANDRA-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782193#comment-17782193 ] Branimir Lambov commented on CASSANDRA-18533: - I would keep it simple and not add a common settings entry under options. If necessary, the user can copy the value to both. > Move format-specific sstable options into the format configuration > -- > > Key: CASSANDRA-18533 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18533 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Schema, Local/SSTable >Reporter: Branimir Lambov >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.x > > > This mainly concerns cassandra yaml settings: > - {{column_index_size}}, which should also be renamed to > {{row_index_granularity}} > - {{column_index_cache_size}} > - {{index_summary_capacity}} > - {{index_summary_resize_interval}} > and possibly > - {{key_cache_size}}, {{key_cache_keys_to_save}}, {{key_cache_save_period}}, > {{key_cache_migrate_during_compaction}} > - {{sstable_preemptive_open_interval}} > Existing settings should be deprecated but still picked up if defined. > At this point we will not consider table-level options that make better sense > as format parameters ({{min/max_index_interval}}, {{bloom_filter_fp_chance}}, > {{crc_check_chance}} and possibly {{compression}}), because we do not yet > support per-table format selection/configuration. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18533) Move format-specific sstable options into the format configuration
[ https://issues.apache.org/jira/browse/CASSANDRA-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17782184#comment-17782184 ] Branimir Lambov commented on CASSANDRA-18533: - 1. Yes, precisely. 2. The key cache is constructed in a completely separate portion of the code, isn't it? Ignore the key cache settings (except migration), I don't think changing this is something we can do at the moment. 3. Although it is not at the moment, the row index granularity in particular should be a table-level property -- there's no real reason to use one setting for all tables, and there's an advantage to be had by making it configurable. However, things like the key cache size or index summary capacity are something to be shared, not just between tables but also potentially between formats; I don't want to get into a complicated solution for this, I would either ignore any table-level modification for these (with a warning) or check that the value is the same among all tables. This, along with format variations (e.g. "bti-fast"), is also out of scope for this ticket. > Move format-specific sstable options into the format configuration > -- > > Key: CASSANDRA-18533 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18533 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Schema, Local/SSTable >Reporter: Branimir Lambov >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.x
[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17780350#comment-17780350 ] Branimir Lambov commented on CASSANDRA-18945: - Yes, I intend to commit it to 5.0. > Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 5.0, 5.x > > Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, > file_ucs_shenandoah_off_heap_memtable.html, > file_ucs_shenandoah_on_heap_memtable_2.html, > file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html > > Time Spent: 50m > Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18945: Fix Version/s: 5.0 > Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 5.0, 5.x > > Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, > file_ucs_shenandoah_off_heap_memtable.html, > file_ucs_shenandoah_on_heap_memtable_2.html, > file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html > > Time Spent: 50m > Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18945: Bug Category: Parent values: Degradation(12984)Level 1 values: Performance Bug/Regression(12997) Complexity: Normal Discovered By: Adhoc Test Reviewers: Branimir Lambov Severity: Normal Status: Open (was: Triage Needed) > Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, > file_ucs_shenandoah_off_heap_memtable.html, > file_ucs_shenandoah_on_heap_memtable_2.html, > file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html > > Time Spent: 50m > Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17780240#comment-17780240 ] Branimir Lambov commented on CASSANDRA-18945: - [~smiklosovic], would you be willing to be the second reviewer? > Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Attachments: file_ucs_shenandoah.html, file_ucs_shenandoah_3.html, > file_ucs_shenandoah_off_heap_memtable.html, > file_ucs_shenandoah_on_heap_memtable_2.html, > file_ucs_shenandoah_on_heap_memtable_3.html, key-value-oss.html > > Time Spent: 50m > Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
[ https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779468#comment-17779468 ] Branimir Lambov commented on CASSANDRA-18710: - So the {{KEYSPACE_PER_TEST}} fix for unexpected flushes no longer works after CASSANDRA-17071? All of the tests that use it will be having intermittent failures unless we find a way to block this. > Test failure: > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from > org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17) > -- > > Key: CASSANDRA-18710 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18710 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Brandon Williams >Priority: Normal > Fix For: 5.0.x, 5.x > > Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt > > > Seen here: > [https://ci-cassandra.apache.org/job/Cassandra-trunk/1644/testReport/org.apache.cassandra.io/DiskSpaceMetricsTest/testFlushSize__jdk17/] > h3. > {code:java} > Error Message > expected:<7200.0> but was:<1367.83970468544> > Stacktrace > junit.framework.AssertionFailedError: expected:<7200.0> but > was:<1367.83970468544> at > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize(DiskSpaceMetricsTest.java:119) > at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) at > java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77) > at > java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779444#comment-17779444 ] Branimir Lambov commented on CASSANDRA-18945: - Attached [the result of a recent benchmark|https://issues.apache.org/jira/secure/attachment/13063855/key-value-oss.html] comparing the UCS default (green) to STCS (blue) and an option with larger sstable size (orange). The default UCS has worse results in the throughput stage, but more importantly it is unable to serve the 110k ops/s during the 1:1 and read-only stages. I'm still investigating what causes these reads to be so slow, but switching to a 10 GiB target size fully fixes the problem (the two other options the orange graph uses, 'base_shard_count': '1' and 'max_sstables_to_compact': '32', help but are not as significant on their own). Rather than asking users to choose a target size based on their expected data density, the database should be able to deal with this itself. Admitting some of the growth into the sstable size is a good way to achieve that. > Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Attachments: key-value-oss.html > > Time Spent: 10m > Remaining Estimate: 0h
[jira] [Updated] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
[ https://issues.apache.org/jira/browse/CASSANDRA-18945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18945: Attachment: key-value-oss.html > Unified Compaction Strategy is creating too many sstables > - > > Key: CASSANDRA-18945 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18945 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Attachments: key-value-oss.html > > Time Spent: 10m > Remaining Estimate: 0h
[jira] [Commented] (CASSANDRA-18710) Test failure: org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17)
[ https://issues.apache.org/jira/browse/CASSANDRA-18710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1830#comment-1830 ] Branimir Lambov commented on CASSANDRA-18710: - It looks like the reason for the unexpected flush is the commit log:
{code:java}
[junit-timeout] INFO [OptionalTasks:1] 2023-10-12 21:55:11,095 ColumnFamilyStore.java:1017 - Enqueuing flush of cql_test_keyspace_alt.table_01, Reason: COMMITLOG_DIRTY, Usage: 74.752KiB (0%) on-heap, 3.777KiB (0%) off-heap
[junit-timeout] INFO [PerDiskMemtableFlushWriter_0:2] 2023-10-12 21:55:11,103 Flushing.java:154 - Writing Memtable-table_01@1180822937(6.854KiB serialized bytes, 242 ops, 74.916KiB (0%) on-heap, 3.781KiB (0%) off-heap), flushed range = [null, null)
[junit-timeout] INFO [PerDiskMemtableFlushWriter_0:2] 2023-10-12 21:55:11,128 Flushing.java:180 - Completed flushing /tmp/cassandra/build/test/cassandra/data/cql_test_keyspace_alt/table_01-03e61210694a11eeb4091bdb4ac3170b/nc-1-big-Data.db (6.839KiB)
...
{code}
This flush writes just 242 of the 1000 ops that the test needs per table. We need to understand what causes these {{COMMITLOG_DIRTY}} flushes, because there are quite a few tests that will fail if a flush happens at the wrong time. Alternatively, we could disable commitlog-driven flushing for tests (e.g. by setting a really large commit log space limit). > Test failure: > org.apache.cassandra.io.DiskSpaceMetricsTest.testFlushSize-.jdk17 (from > org.apache.cassandra.io.DiskSpaceMetricsTest-.jdk17) > -- > > Key: CASSANDRA-18710 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18710 > Project: Cassandra > Issue Type: Bug > Components: Test/unit >Reporter: Ekaterina Dimitrova >Assignee: Brandon Williams >Priority: Normal > Fix For: 5.0.x, 5.x > > Attachments: org.apache.cassandra.io.DiskSpaceMetricsTest.txt
[jira] [Created] (CASSANDRA-18945) Unified Compaction Strategy is creating too many sstables
Branimir Lambov created CASSANDRA-18945:

Summary: Unified Compaction Strategy is creating too many sstables
Key: CASSANDRA-18945
URL: https://issues.apache.org/jira/browse/CASSANDRA-18945
Project: Cassandra
Issue Type: Bug
Components: Local/Compaction
Reporter: Branimir Lambov

The unified compaction strategy currently aims to create sstables with close to the same size, defaulting to 1 GiB. Unfortunately tests show that Cassandra starts to have performance problems when the number of sstables grows to the order of a thousand, and in particular that even 1 TiB of data with the default configuration is creating too many sstables for efficient processing. This matters even more for SAI, where the number of sstables in the system can have a proportional effect on the complexity of operations.

It is quite easy to create a configuration option that allows sstables to take some part of the data growth by adding a multiplier to [the shard count calculation|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md#sharding] formula, replacing
{{2 ^ round(log2(d / (t * b))) * b}}
with
{{2 ^ round((1 - 𝜆) * log2(d / (t * b))) * b}},
where 𝜆 is a parameter whose value is between 0 and 1.

With this, a 𝜆 of 0.5 would mean that shard count and sstable size grow in parallel at the square root of the data size growth. 0 would result in no growth, and 1 in always using the same number of shards.

It may also be valuable to introduce a threshold for engaging the base shard count to avoid splitting lowest-level sstables into fragments that are too small.

Once both of these are in place, we can set defaults that better suit all node densities, including 10 TiB and beyond, for example:
- target size of 1 GiB
- 𝜆 of 1/3
- base shard count of 4
- minimum size 100 MiB
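To get a feel for how 𝜆 changes the outcome, the modified formula can be evaluated for a few values. The sketch below is only an illustration of the formula quoted in the description: the class and method names are made up, the clamp for small data sizes is an assumption, and the real strategy applies further conditions described in UnifiedCompactionStrategy.md. Here {{d}} is the data size, {{t}} the target sstable size and {{b}} the base shard count.

```java
public class ShardCountSketch
{
    /**
     * Evaluates 2 ^ round((1 - lambda) * log2(d / (t * b))) * b.
     * Hypothetical helper for illustration; not Cassandra's implementation.
     */
    public static long shardCount(double dataSize, double targetSize, int baseShards, double lambda)
    {
        double log2 = Math.log(dataSize / (targetSize * baseShards)) / Math.log(2);
        // Assumption: never go below the base shard count when the data is small.
        long exponent = Math.max(0, Math.round((1 - lambda) * log2));
        return (1L << exponent) * baseShards;
    }

    public static void main(String[] args)
    {
        double gib = 1L << 30;
        double tib = 1L << 40;
        double[] lambdas = { 0.0, 1.0 / 3, 0.5, 1.0 };
        for (double lambda : lambdas)
        {
            long shards = shardCount(tib, gib, 4, lambda);
            // Average sstable size is simply the data size divided by the shard count.
            System.out.printf("lambda=%.2f shards=%d sstable size=%.1f GiB%n",
                              lambda, shards, tib / shards / gib);
        }
    }
}
```

For 1 TiB of data with a 1 GiB target and a base shard count of 4, this gives 1024 shards at 𝜆 = 0, 128 at 𝜆 = 1/3, 64 at 𝜆 = 0.5 and 4 at 𝜆 = 1, which matches the description: at 0 the sstable size stays at the target, and at 1 the shard count never changes.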
[jira] [Commented] (CASSANDRA-18872) Remove deprecated crc_check_chance in compression params
[ https://issues.apache.org/jira/browse/CASSANDRA-18872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1333#comment-1333 ]

Branimir Lambov commented on CASSANDRA-18872:

The patch looks good to me. The changes are not too invasive and can easily be replaced with format configuration in CASSANDRA-18534.

Do we have a documentation ticket corresponding to this? AFAICS [the docs|https://cassandra.apache.org/doc/latest/cassandra/operating/compression.html] only mention the compression-level setting, even for 4.1. This documentation change also needs to explain that the chance only applies to compressed sstables.

> Remove deprecated crc_check_chance in compression params
>
> Key: CASSANDRA-18872
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18872
> Project: Cassandra
> Issue Type: Task
> Components: Feature/Compression, Legacy/CQL
> Reporter: Stefan Miklosovic
> Assignee: Stefan Miklosovic
> Priority: Normal
> Fix For: 5.x
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> crc_check_chance was moved from compression parameters and it is a standalone
> table parameter. This was done in times of 3.0 so it is now time to get rid
> of that in 5.0.
[jira] [Updated] (CASSANDRA-18534) Make sstable format configurable per table
[ https://issues.apache.org/jira/browse/CASSANDRA-18534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18534: Fix Version/s: 5.0 (was: 5.x) > Make sstable format configurable per table > -- > > Key: CASSANDRA-18534 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18534 > Project: Cassandra > Issue Type: Improvement > Components: Cluster/Schema, Local/SSTable >Reporter: Branimir Lambov >Assignee: Maxwell Guo >Priority: Normal > Fix For: 5.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Some SSTable format settings need to be configurable per table for better > efficiency. This includes: > - {{row_index_granularity}} > - {{bloom_filter_fp_chance}} > - {{crc_check_chance}} > - {{min/max_index_interval}} > Some of these are currently configurable using direct properties of tables. > Having them as format properties makes better sense and should also support > specifying useable combinations of settings, e.g. > {code:java} > CREATE TABLE ... WITH sstable_format = "bti-fast"; > CREATE TABLE ... WITH sstable_format = "bti-small"; > {code} > where {{bti-fast}} and {{bti-small}} can be defined in {{cassandra.yaml}} > e.g. as > {code:java} > sstable.format.options: > - bti-fast: > row_index_granularity: 1kiB > bloom_filter_fp_chance: 0.01 > - bti-small: > row_index_granularity: 32kiB > bloom_filter_fp_chance: 0.1 > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18534) Make sstable format configurable per table
[ https://issues.apache.org/jira/browse/CASSANDRA-18534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773696#comment-17773696 ]

Branimir Lambov commented on CASSANDRA-18534:

bq. Also, do you think it is possible and useful to make sstable_format contain custom parameters?

_All_ of the parameters to the SSTable format are custom, i.e. format-specific. This is also the qualifying condition for something to be moved into the format config: if you can imagine an SSTable format that does not need that flag, then it belongs to the format. E.g. bloom-filter-less formats do not need {{bloom_filter_fp_chance}}, and (even though they are not a feature of writing an SSTable) only {{BIG}} requires key cache options. Unless we are certain that CRC is the only way a format could defend against bit rot, {{crc_check_chance}} is also a format-specific property.

> Make sstable format configurable per table
>
> Key: CASSANDRA-18534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18534
> Project: Cassandra
> Issue Type: Improvement
> Components: Cluster/Schema, Local/SSTable
> Reporter: Branimir Lambov
> Assignee: Maxwell Guo
> Priority: Normal
> Fix For: 5.x
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Some SSTable format settings need to be configurable per table for better
> efficiency. This includes:
> - {{row_index_granularity}}
> - {{bloom_filter_fp_chance}}
> - {{crc_check_chance}}
> - {{min/max_index_interval}}
> Some of these are currently configurable using direct properties of tables.
> Having them as format properties makes better sense and should also support
> specifying useable combinations of settings, e.g.
> {code:java}
> CREATE TABLE ... WITH sstable_format = "bti-fast";
> CREATE TABLE ... WITH sstable_format = "bti-small";
> {code}
> where {{bti-fast}} and {{bti-small}} can be defined in {{cassandra.yaml}}
> e.g. as
> {code:java}
> sstable.format.options:
>   - bti-fast:
>       row_index_granularity: 1kiB
>       bloom_filter_fp_chance: 0.01
>   - bti-small:
>       row_index_granularity: 32kiB
>       bloom_filter_fp_chance: 0.1
> {code}
[jira] [Commented] (CASSANDRA-18872) Remove deprecated crc_check_chance in compression params
[ https://issues.apache.org/jira/browse/CASSANDRA-18872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17773438#comment-17773438 ] Branimir Lambov commented on CASSANDRA-18872: - Have you looked at CASSANDRA-18534? Now that we have multiple SSTable formats, it makes a lot of sense to move properties like this into the format configuration, which in turn would mean passing a format configuration (instead of compression one) to the file handle builder. > Remove deprecated crc_check_chance in compression params > > > Key: CASSANDRA-18872 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18872 > Project: Cassandra > Issue Type: Task > Components: Feature/Compression, Legacy/CQL >Reporter: Stefan Miklosovic >Assignee: Stefan Miklosovic >Priority: Normal > Fix For: 5.x > > Time Spent: 10m > Remaining Estimate: 0h > > crc_check_chance was moved from compression parameters and it is a standalone > table parameter. This was done in times of 3.0 so it is now time to get rid > of that in 5.0. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files
[ https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771732#comment-17771732 ] Branimir Lambov commented on CASSANDRA-18464: - To make the review easier, could you fork the {{apache/cassandra}} repository on github, push a branch with the changes to your fork on top of {{cassandra-5.0}}, and open a pull request against {{apache/cassandra-5.0}}? My comments so far are these: On [Config.java 117|https://github.com/driftx/cassandra/commit/cb5bd169f0a9331f957f96a7318fee02a744e006#diff-e966f41bc2a418becfe687134ec8cf542eb051eead7fb4917e65a3a2e7c9bce3R117]: {quote} Using booleans makes it very unclear which options are actually valid, and what the alternative means. Please change the configuration to an enum, e.g. {{commit_log_access_mode}} with values {{direct_jna}}, {{direct}}, and {{mmap}}. {quote} {quote} Actually, there should be only one direct option, and whether it uses nio or jni is an implementation detail that the users needn't care about. The next question is whether or not non-direct should be supported at all, and I personally prefer to not support it as this adds configuration complexity for no expected benefit. This also means that it makes sense to simply switch all other commit log segment types to be written direct, and this is simple enough to do in this ticket (especially since we dropped Java 8 and can use NIO's {{DIRECT}} option). {quote} On [Config.java 517|https://github.com/driftx/cassandra/commit/cb5bd169f0a9331f957f96a7318fee02a744e006#diff-e966f41bc2a418becfe687134ec8cf542eb051eead7fb4917e65a3a2e7c9bce3R517]: {quote} When would someone need to change this? 
{quote} > Enable Direct I/O For CommitLog Files > - > > Key: CASSANDRA-18464 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18464 > Project: Cassandra > Issue Type: New Feature > Components: Local/Commit Log >Reporter: Josh McKenzie >Assignee: Amit Pawar >Priority: Normal > Fix For: 5.0.x, 5.x > > Attachments: CommitLogStressTest.patch, > EnableDirectIOForCommitLogUsingNativeAPI.patch, > PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, > UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png > > > Relocating from [dev@ email > thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg] > > I shared my investigation about Commitlog I/O issue on large core count > system in my previous email dated July-22 and link to the thread is given > below. > [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n] > Basically, two solutions looked possible to improve the CommitLog I/O. > # Multi-threaded syncing > # Using Direct-IO through JNA > I worked on 2nd option considering the following benefit compared to the > first one > # Direct I/O read/write throughput is very high compared to non-Direct I/O. > Learnt through FIO benchmarking. > # Reduces kernel file cache uses which in-turn reduces kernel I/O activity > for Commitlog files only. > # Overall CPU usage reduced for flush activity. JVisualvm shows CPU usage < > 30% for Commitlog syncer thread with Direct I/O feature > # Direct I/O implementation is easier compared to multi-threaded > As per the community suggestion, less in code complex is good to have. Direct > I/O enablement looked promising but there was one issue. > Java version 8 does not have native support to enable Direct I/O. So, JNA > library usage is must. The same implementation should also work across other > versions of Java (like 11 and beyond). > I have completed Direct I/O implementation and summary of the attached patch > changes are given below. 
> # This implementation is not using Java file channels and file is opened > through JNA to use Direct I/O feature. > # New Segment are defined named “DirectIOSegment” for Direct I/O and > “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is test purpose > only). > # JNA write call is used to flush the changes. > # New helper functions are defined in NativeLibrary.java and platform > specific file. Currently tested on Linux only. > # Patch allows user to configure optimum block size and alignment if > default values are not OK for CommitLog disk. > # Following configuration options are provided in Cassandra.yaml file > a. use_jna_for_commitlog_io : to use jna feature > b. use_direct_io_for_commitlog : to use Direct I/O feature. > c. direct_io_minimum_block_alignment: 512 (default) > d. nvme_disk_block_size: 32MiB (default and can be changed as per the > required size) > Test matrix is complex so CommitLog related testcases and TPCx-IOT benchmark > was tested. It works with both Java 8 and 11 versions. Compressed and > Encrypted based segments ar
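The first review comment on CASSANDRA-18464 above asks for a single enum option in place of the two booleans, since two flags allow four combinations and it is unclear which of them are valid. A minimal sketch of that idea follows. The enum name, its constants and the parsing helper are assumptions modelled on the {{direct}} and {{mmap}} values named in the comment; they are not taken from the patch or from the configuration that was eventually committed.

```java
import java.util.Locale;

public class CommitLogAccessModeSketch
{
    /** Hypothetical replacement for the two boolean options; names are illustrative. */
    public enum AccessMode
    {
        MMAP,
        DIRECT;

        /** Parses a yaml value such as "direct"; unknown values are rejected outright. */
        public static AccessMode parse(String value)
        {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        }
    }

    public static void main(String[] args)
    {
        // Every accepted value maps to exactly one mode, so no invalid combination can be configured.
        System.out.println(AccessMode.parse("direct"));
        System.out.println(AccessMode.parse("mmap"));
    }
}
```

With an enum, a misspelled or unsupported value fails at parse time with an {{IllegalArgumentException}} instead of silently selecting some combination of flags.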
[jira] [Commented] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files
[ https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17771432#comment-17771432 ] Branimir Lambov commented on CASSANDRA-18464: - There was a typo in my response above, I am in favour of having the patch land in 5.0. Just the 512 vs 4k difference is not something I would personally consider a good reason to include the JNA writing; the sync segments are usually much larger than that. I would rather go with the simpler NIO option. I can't find my code comments with the link above any more. They are [here|https://github.com/driftx/cassandra/commit/cb5bd169f0a9331f957f96a7318fee02a744e006#r128716588]. > Enable Direct I/O For CommitLog Files > - > > Key: CASSANDRA-18464 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18464 > Project: Cassandra > Issue Type: New Feature > Components: Local/Commit Log >Reporter: Josh McKenzie >Assignee: Amit Pawar >Priority: Normal > Fix For: 5.0.x, 5.x > > Attachments: CommitLogStressTest.patch, > EnableDirectIOForCommitLogUsingNativeAPI.patch, > PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, > UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png > > > Relocating from [dev@ email > thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg] > > I shared my investigation about Commitlog I/O issue on large core count > system in my previous email dated July-22 and link to the thread is given > below. > [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n] > Basically, two solutions looked possible to improve the CommitLog I/O. > # Multi-threaded syncing > # Using Direct-IO through JNA > I worked on 2nd option considering the following benefit compared to the > first one > # Direct I/O read/write throughput is very high compared to non-Direct I/O. > Learnt through FIO benchmarking. 
> # Reduces kernel file cache uses which in-turn reduces kernel I/O activity > for Commitlog files only. > # Overall CPU usage reduced for flush activity. JVisualvm shows CPU usage < > 30% for Commitlog syncer thread with Direct I/O feature > # Direct I/O implementation is easier compared to multi-threaded > As per the community suggestion, less in code complex is good to have. Direct > I/O enablement looked promising but there was one issue. > Java version 8 does not have native support to enable Direct I/O. So, JNA > library usage is must. The same implementation should also work across other > versions of Java (like 11 and beyond). > I have completed Direct I/O implementation and summary of the attached patch > changes are given below. > # This implementation is not using Java file channels and file is opened > through JNA to use Direct I/O feature. > # New Segment are defined named “DirectIOSegment” for Direct I/O and > “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is test purpose > only). > # JNA write call is used to flush the changes. > # New helper functions are defined in NativeLibrary.java and platform > specific file. Currently tested on Linux only. > # Patch allows user to configure optimum block size and alignment if > default values are not OK for CommitLog disk. > # Following configuration options are provided in Cassandra.yaml file > a. use_jna_for_commitlog_io : to use jna feature > b. use_direct_io_for_commitlog : to use Direct I/O feature. > c. direct_io_minimum_block_alignment: 512 (default) > d. nvme_disk_block_size: 32MiB (default and can be changed as per the > required size) > Test matrix is complex so CommitLog related testcases and TPCx-IOT benchmark > was tested. It works with both Java 8 and 11 versions. Compressed and > Encrypted based segments are not supported yet and it can be enabled later > based on the Community feedback. > Following improvement are seen with Direct I/O enablement. 
> # 32 cores >= ~15% > # 64 cores >= ~80% > Also, another observation would like to share here. Reading Commitlog files > with Direct I/O might help in reducing node bring-up time after the node > crash. > Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07 > The attached patch enables Direct I/O feature for Commitlog files. Please > check and share your feedback. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-18894) Drop commitlog chain marker updates
Branimir Lambov created CASSANDRA-18894: --- Summary: Drop commitlog chain marker updates Key: CASSANDRA-18894 URL: https://issues.apache.org/jira/browse/CASSANDRA-18894 Project: Cassandra Issue Type: Improvement Components: Local/Commit Log Reporter: Branimir Lambov CASSANDRA-13987 added a periodic update of the last commit log chain marker in order to allow for data in memory-mapped segments to be recovered even if it was not part of a synced segment. A much simpler way to do this is something in the vein of CASSANDRA-16482, i.e. ignoring an empty sync marker for the last entry in the commit log. We could do this by default if the commit log is uncompressed (and possibly only if using memory mapping after CASSANDRA-18464). -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files
[ https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18464: Reviewers: Branimir Lambov Status: Review In Progress (was: Patch Available) > Enable Direct I/O For CommitLog Files > - > > Key: CASSANDRA-18464 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18464 > Project: Cassandra > Issue Type: New Feature > Components: Local/Commit Log >Reporter: Josh McKenzie >Assignee: Amit Pawar >Priority: Normal > Fix For: 5.x > > Attachments: CommitLogStressTest.patch, > EnableDirectIOForCommitLogUsingNativeAPI.patch, > PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, > UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png > > > Relocating from [dev@ email > thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg] > > I shared my investigation about Commitlog I/O issue on large core count > system in my previous email dated July-22 and link to the thread is given > below. > [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n] > Basically, two solutions looked possible to improve the CommitLog I/O. > # Multi-threaded syncing > # Using Direct-IO through JNA > I worked on 2nd option considering the following benefit compared to the > first one > # Direct I/O read/write throughput is very high compared to non-Direct I/O. > Learnt through FIO benchmarking. > # Reduces kernel file cache uses which in-turn reduces kernel I/O activity > for Commitlog files only. > # Overall CPU usage reduced for flush activity. JVisualvm shows CPU usage < > 30% for Commitlog syncer thread with Direct I/O feature > # Direct I/O implementation is easier compared to multi-threaded > As per the community suggestion, less in code complex is good to have. Direct > I/O enablement looked promising but there was one issue. > Java version 8 does not have native support to enable Direct I/O. So, JNA > library usage is must. 
The same implementation should also work across other > versions of Java (like 11 and beyond). > I have completed Direct I/O implementation and summary of the attached patch > changes are given below. > # This implementation is not using Java file channels and file is opened > through JNA to use Direct I/O feature. > # New Segment are defined named “DirectIOSegment” for Direct I/O and > “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is test purpose > only). > # JNA write call is used to flush the changes. > # New helper functions are defined in NativeLibrary.java and platform > specific file. Currently tested on Linux only. > # Patch allows user to configure optimum block size and alignment if > default values are not OK for CommitLog disk. > # Following configuration options are provided in Cassandra.yaml file > a. use_jna_for_commitlog_io : to use jna feature > b. use_direct_io_for_commitlog : to use Direct I/O feature. > c. direct_io_minimum_block_alignment: 512 (default) > d. nvme_disk_block_size: 32MiB (default and can be changed as per the > required size) > Test matrix is complex so CommitLog related testcases and TPCx-IOT benchmark > was tested. It works with both Java 8 and 11 versions. Compressed and > Encrypted based segments are not supported yet and it can be enabled later > based on the Community feedback. > Following improvement are seen with Direct I/O enablement. > # 32 cores >= ~15% > # 64 cores >= ~80% > Also, another observation would like to share here. Reading Commitlog files > with Direct I/O might help in reducing node bring-up time after the node > crash. > Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07 > The attached patch enables Direct I/O feature for Commitlog files. Please > check and share your feedback. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Commented] (CASSANDRA-18464) Enable Direct I/O For CommitLog Files
[ https://issues.apache.org/jira/browse/CASSANDRA-18464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770415#comment-17770415 ] Branimir Lambov commented on CASSANDRA-18464: - This patch is very valuable, and I support it going into 5.0 as well as 5.1. In separate tests we have often found a memory-mapped commit log to be a serious performance problem for a node with a lot of data. Even without DIRECT or JNA, not using `msync` is making a huge difference. Because of this, most of the performance testing I personally do is done with a compressed commit log. I added comments to [the latest published branch|https://github.com/driftx/cassandra/tree/CASSANDRA-18464-trunk] with some suggested changes. I am curious, if the NIO option is constructed correctly (with aligned direct buffers, possibly also issuing the writes to be page-aligned and containing whole pages), is it still copying to internal buffers? > Enable Direct I/O For CommitLog Files > - > > Key: CASSANDRA-18464 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18464 > Project: Cassandra > Issue Type: New Feature > Components: Local/Commit Log >Reporter: Josh McKenzie >Assignee: Amit Pawar >Priority: Normal > Fix For: 5.x > > Attachments: CommitLogStressTest.patch, > EnableDirectIOForCommitLogUsingNativeAPI.patch, > PeriodicCommitLogStressTest.tar.bz2, SetCommitLogFileSize.patch, > UseDirectIOFeatureForCommitLogFiles.patch, image-2023-06-29-01-12-49-382.png > > > Relocating from [dev@ email > thread.|https://lists.apache.org/thread/j6ny17q2rhkp7jxvwxm69dd6v1dozjrg] > > I shared my investigation about Commitlog I/O issue on large core count > system in my previous email dated July-22 and link to the thread is given > below. > [https://lists.apache.org/thread/xc5ocog2qz2v2gnj4xlw5hbthfqytx2n] > Basically, two solutions looked possible to improve the CommitLog I/O. 
> # Multi-threaded syncing > # Using Direct-IO through JNA > I worked on 2nd option considering the following benefit compared to the > first one > # Direct I/O read/write throughput is very high compared to non-Direct I/O. > Learnt through FIO benchmarking. > # Reduces kernel file cache uses which in-turn reduces kernel I/O activity > for Commitlog files only. > # Overall CPU usage reduced for flush activity. JVisualvm shows CPU usage < > 30% for Commitlog syncer thread with Direct I/O feature > # Direct I/O implementation is easier compared to multi-threaded > As per the community suggestion, less in code complex is good to have. Direct > I/O enablement looked promising but there was one issue. > Java version 8 does not have native support to enable Direct I/O. So, JNA > library usage is must. The same implementation should also work across other > versions of Java (like 11 and beyond). > I have completed Direct I/O implementation and summary of the attached patch > changes are given below. > # This implementation is not using Java file channels and file is opened > through JNA to use Direct I/O feature. > # New Segment are defined named “DirectIOSegment” for Direct I/O and > “NonDirectIOSegment” for non-direct I/O (NonDirectIOSegment is test purpose > only). > # JNA write call is used to flush the changes. > # New helper functions are defined in NativeLibrary.java and platform > specific file. Currently tested on Linux only. > # Patch allows user to configure optimum block size and alignment if > default values are not OK for CommitLog disk. > # Following configuration options are provided in Cassandra.yaml file > a. use_jna_for_commitlog_io : to use jna feature > b. use_direct_io_for_commitlog : to use Direct I/O feature. > c. direct_io_minimum_block_alignment: 512 (default) > d. nvme_disk_block_size: 32MiB (default and can be changed as per the > required size) > Test matrix is complex so CommitLog related testcases and TPCx-IOT benchmark > was tested. 
It works with both Java 8 and 11 versions. Compressed and > Encrypted based segments are not supported yet and it can be enabled later > based on the Community feedback. > Following improvement are seen with Direct I/O enablement. > # 32 cores >= ~15% > # 64 cores >= ~80% > Also, another observation would like to share here. Reading Commitlog files > with Direct I/O might help in reducing node bring-up time after the node > crash. > Tested with commit ID: 91f6a9aca8d3c22a03e68aa901a0b154d960ab07 > The attached patch enables Direct I/O feature for Commitlog files. Please > check and share your feedback. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail:
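The question in the comment above depends on what a correctly constructed NIO direct write looks like: an aligned direct buffer, and writes that cover whole blocks. As a rough sketch, assuming a 4096-byte block size (real code would query the file store instead), such a buffer can be obtained with {{ByteBuffer.alignedSlice}} and the length padded up to a block multiple. The class and helper names are illustrative and not Cassandra code, and the sketch deliberately stops short of opening a channel with {{ExtendedOpenOption.DIRECT}}, because whether that succeeds depends on the platform and file system.

```java
import java.nio.ByteBuffer;

public class AlignedBufferSketch
{
    /** Rounds length up to the next multiple of blockSize (which must be a power of two). */
    public static int roundUpToBlock(int length, int blockSize)
    {
        return (length + blockSize - 1) & -blockSize;
    }

    /** Allocates a direct buffer whose start address and capacity are multiples of blockSize. */
    public static ByteBuffer allocateAligned(int capacity, int blockSize)
    {
        int padded = roundUpToBlock(capacity, blockSize);
        // Over-allocate by one block so an aligned window of the full padded size always fits.
        ByteBuffer raw = ByteBuffer.allocateDirect(padded + blockSize);
        ByteBuffer aligned = raw.alignedSlice(blockSize);
        // Trim the window to exactly the padded size; slicing at position 0 keeps the alignment.
        aligned.limit(padded);
        return aligned.slice();
    }

    public static void main(String[] args)
    {
        int blockSize = 4096; // assumed block size for illustration
        ByteBuffer buffer = allocateAligned(5000, blockSize);
        System.out.println("capacity: " + buffer.capacity());
        System.out.println("alignment offset: " + buffer.alignmentOffset(0, blockSize));
    }
}
```

A buffer built this way satisfies the usual direct I/O requirements on address and length; whether the JDK still copies through an internal buffer in that case is exactly what the comment is asking, and the sketch does not answer it.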
[jira] [Commented] (CASSANDRA-18773) Compactions are slow
[ https://issues.apache.org/jira/browse/CASSANDRA-18773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17769063#comment-17769063 ]

Branimir Lambov commented on CASSANDRA-18773:

There's some leftover code in the trunk version; apart from that, the newer versions look good.

> Compactions are slow
>
> Key: CASSANDRA-18773
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18773
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Compaction
> Reporter: Cameron Zemek
> Assignee: Cameron Zemek
> Priority: Normal
> Fix For: 4.0.x, 4.1.x, 5.0.x, 5.x
>
> Attachments: 18773.patch, compact-poc.patch, flamegraph.png,
> stress.yaml
>
> Time Spent: 2h 50m
> Remaining Estimate: 0h
>
> I have noticed that compactions involving a lot of sstables are very slow
> (for example major compactions). I have attached a cassandra stress profile
> that can generate such a dataset under ccm. In my local test I have 2567
> sstables at 4Mb each.
> I added code to track wall clock time of various parts of the code. One
> problematic part is the ManyToOne constructor. Tracing through the code, a
> ManyToOne is created over all the sstable iterators for every partition. In
> my local test I get a measly 60Kb/sec read speed, bottlenecked on a single
> CPU core (since this code is single threaded), with 85% of the wall clock
> time spent in the ManyToOne constructor.
> As another datapoint to show it's the merge iterator part of the code: using
> the cfstats from [https://github.com/instaclustr/cassandra-sstable-tools/],
> which reads all the sstables but does no merging, gets 26Mb/sec read speed.
> Tracking back from the ManyToOne call I see this in
> UnfilteredPartitionIterators::merge
> {code:java}
> for (int i = 0; i < toMerge.size(); i++)
> {
>     if (toMerge.get(i) == null)
>     {
>         if (null == empty)
>             empty = EmptyIterators.unfilteredRow(metadata, partitionKey, isReverseOrder);
>         toMerge.set(i, empty);
>     }
> }
> {code}
> Not sure what the purpose of creating these empty rows is. But on a whim I
> removed all these empty iterators before passing to ManyToOne and then all
> the wall clock time shifted to CompactionIterator::hasNext() and read speed
> increased to 1.5Mb/s.
> So there are further bottlenecks in this code path it seems, but the first is
> this ManyToOne and having to build it for every partition read.
[jira] [Commented] (CASSANDRA-18873) Fix broken JMH benchmarks
[ https://issues.apache.org/jira/browse/CASSANDRA-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17768591#comment-17768591 ] Branimir Lambov commented on CASSANDRA-18873: - {quote} * ReadSmallPartitionsBench (assertion error) * ReadWidePartitionsBench (assertion error) {quote} These two tests need a larger memtable size allocation to produce usable output. One way to "fix" this is to replace {{INMEM}} with {{NO}} for the default {{flush}}, which will make it ignore the fact that part of the data is in an sstable; another is to reduce the default {{count}} by an order of magnitude. Both of these changes would make the test less suitable for what it is primarily meant to measure (access time with a non-trivial data size in a single memtable/sstable). > Fix broken JMH benchmarks > - > > Key: CASSANDRA-18873 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18873 > Project: Cassandra > Issue Type: Bug > Components: Test/benchmark >Reporter: Jacek Lewandowski >Priority: Normal > Attachments: BenchTimeTest.java, > jmh-AtomicBtreePartitionUpdateBench.log, jmh-BloomFilterSerializerBench.log, > jmh-KeyLookupBench.log, jmh-ReadSmallPartitionsBench.log, > jmh-ReadWidePartitionsBench.log > > > The following benchmarks are broken: > * {{ZeroCopyStreamingBench}} > * {{MutationBench}} > * {{FastThreadLocalBench}} > * {{AtomicBTreePartitionUpdateBench}} (OOM on Jenkins) > * {{ReadSmallPartitionsBench}} (assertion error) > * {{ReadWidePartitionsBench}} (assertion error) > * {{BloomFilterSerializerBench}} (NPE) > * {{KeyLookupBench}} (IAE) > Additionally, these benchmarks take too much time to run: > * {{BTreeUpdateBench}} ~ 58 hours > * {{AtomicBTreePartitionUpdateBench}} ~ 5 hours > * {{BTreeTransformBench}} ~ 2.5 hours > Here is the complete list of estimated benchmark times:
> {noformat}
> Estimated time for CacheLoaderBench: ~5 s
> Estimated time for LatencyTrackingBench: ~26 s
> Estimated time for SampleBench: ~30 s
> Estimated time for ReadWriteBench: ~30 s
> Estimated time for MutationBench: ~30 s
> Estimated time for CompactionBench: ~35 s
> Estimated time for DiagnosticEventPersistenceBench: ~40 s
> Estimated time for ZeroCopyStreamingBench: ~44 s
> Estimated time for BatchStatementBench: ~110 s
> Estimated time for DiagnosticEventServiceBench: ~120 s
> Estimated time for MessageOutBench: ~144 s
> Estimated time for BloomFilterSerializerBench: ~144 s
> Estimated time for FastThreadLocalBench: ~156 s
> Estimated time for HashingBench: ~156 s
> Estimated time for ChecksumBench: ~208 s
> Estimated time for StreamingTombstoneHistogramBuilderBench: ~208 s
> Estimated time for PendingRangesBench: ~ 5 m
> Estimated time for DirectorySizerBench: ~ 5 m
> Estimated time for instance.ReadSmallPartitionsBench: ~ 5 m
> Estimated time for PreaggregatedByteBufsBench: ~ 7 m
> Estimated time for AutoBoxingBench: ~ 8 m
> Estimated time for OutputStreamBench: ~ 13 m
> Estimated time for BTreeBuildBench: ~ 13 m
> Estimated time for StringsEncodeBench: ~ 20 m
> Estimated time for instance.ReadWidePartitionsBench: ~ 21 m
> Estimated time for btree.BTreeBuildBench: ~ 30 m
> Estimated time for BTreeSearchIteratorBench: ~ 31 m
> Estimated time for btree.BTreeTransformBench: ~ 138 m
> Estimated time for btree.AtomicBTreePartitionUpdateBench: ~ 288 m
> Estimated time for btree.BTreeUpdateBench: ~58 h
> Total estimated time: ~69 h
> {noformat}
> I'd like to add a test which estimates the benchmark times and fails if a > single benchmark estimated run time is longer than xxx minutes. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
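The proposed check (estimate each benchmark's run time and fail when one exceeds a limit) could look something like the sketch below. It is not the attached BenchTimeTest.java, whose contents are not shown here. The class name, the method names and the estimation formula (forks x parameter combinations x total iteration time) are assumptions made for illustration, and the limit is left as a parameter because the issue does not fix a value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; the formula below is an assumed rough model of a JMH run.
final class BenchTimeBudgetSketch
{
    // Assumed estimate: every fork repeats all warmup and measurement
    // iterations once per parameter combination.
    static long estimateSeconds(int forks, int paramCombinations,
                                int warmupIterations, long warmupSeconds,
                                int measurementIterations, long measurementSeconds)
    {
        long perRun = warmupIterations * warmupSeconds
                      + measurementIterations * measurementSeconds;
        return (long) forks * paramCombinations * perRun;
    }

    // Names of benchmarks whose estimate exceeds the limit; a test would
    // fail when this list is non-empty.
    static List<String> overBudget(Map<String, Long> estimatedSeconds, long limitSeconds)
    {
        List<String> offenders = new ArrayList<>();
        for (Map.Entry<String, Long> entry : estimatedSeconds.entrySet())
            if (entry.getValue() > limitSeconds)
                offenders.add(entry.getKey());
        return offenders;
    }
}
```

Fed with the figures from the list above and, say, a one-hour limit, `overBudget` would report the three btree benchmarks estimated at 138 m, 288 m and 58 h.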
[jira] [Commented] (CASSANDRA-18533) Move format-specific sstable options into the format configuration
[ https://issues.apache.org/jira/browse/CASSANDRA-18533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767594#comment-17767594 ] Branimir Lambov commented on CASSANDRA-18533: - Absolutely. > Move format-specific sstable options into the format configuration > -- > > Key: CASSANDRA-18533 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18533 > Project: Cassandra > Issue Type: Improvement >Reporter: Branimir Lambov >Priority: Normal > > This mainly concerns cassandra yaml settings: > - {{column_index_size}}, which should also be renamed to > {{row_index_granularity}} > - {{column_index_cache_size}} > - {{index_summary_capacity}} > - {{index_summary_resize_interval}} > and possibly > - {{key_cache_size}}, {{key_cache_keys_to_save}}, {{key_cache_save_period}}, > {{key_cache_migrate_during_compaction}} > - {{sstable_preemptive_open_interval}} > Existing settings should be deprecated but still picked up if defined. > At this point we will not consider table-level options that make better sense > as format parameters ({{min/max_index_interval}}, {{bloom_filter_fp_chance}}, > {{crc_check_chance}} and possibly {{compression}}), because we do not yet > support per-table format selection/configuration. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Fix Version/s: 3.11.17, 4.0.12, 4.1.4, 5.0-alpha2, 5.1 Source Control Link: https://github.com/apache/cassandra/pull/2656 Resolution: Fixed Status: Resolved (was: Ready to Commit) Committed ([3.11|https://github.com/apache/cassandra/commit/87c2af85c1305c130af7d66f83dec03a1c4a8bb2] [4.0|https://github.com/apache/cassandra/commit/c6385ac3ddccabdc7cb650b090fa69c0523274e8] [4.1|https://github.com/apache/cassandra/commit/db6641fbb6fd0c439e14f94caecdeee999311c62] [5.0|https://github.com/apache/cassandra/commit/a23f4c0b15c684240ef0bcd55875610e8bd7179b] [trunk|https://github.com/apache/cassandra/commit/970ec2d1db5770c13a42e1f2862ea398317d0f15]) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Fix For: 3.11.17, 4.0.12, 4.1.4, 5.0-alpha2, 5.1 > > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. 
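The initialization-order problem described in this issue can be reproduced in a few lines of plain Java. The classes below are hypothetical stand-ins, not the real {{CompactionController}} or {{TimeWindowCompactionController}}. They only model the pattern the description names: a base constructor that consults a flag before the subclass has assigned it. `EarlyFlagSketch` shows one possible way to avoid the problem and is not necessarily what the committed patch does.

```java
// Hypothetical base class: decides in its constructor whether to build
// the overlap iterator, by consulting an overridable hook.
class ControllerSketch
{
    final boolean builtOverlapIterator;

    ControllerSketch()
    {
        // Runs before any subclass field has been assigned.
        builtOverlapIterator = !ignoreOverlaps();
    }

    protected boolean ignoreOverlaps()
    {
        return false;
    }
}

// Models the problem: the flag is set only after super() has returned,
// so the base constructor still sees the default value (false).
class LateFlagSketch extends ControllerSketch
{
    private final boolean ignoreOverlaps;

    LateFlagSketch(boolean ignoreOverlaps)
    {
        super();
        this.ignoreOverlaps = ignoreOverlaps;
    }

    @Override
    protected boolean ignoreOverlaps()
    {
        return ignoreOverlaps;
    }
}

// One possible alternative (an assumption, not the committed fix):
// the flag is available before the decision is made.
class EarlyFlagSketch
{
    final boolean builtOverlapIterator;
    private final boolean ignoreOverlaps;

    EarlyFlagSketch(boolean ignoreOverlaps)
    {
        this.ignoreOverlaps = ignoreOverlaps;
        this.builtOverlapIterator = !ignoreOverlaps;
    }

    boolean ignoreOverlaps()
    {
        return ignoreOverlaps;
    }
}
```

Constructing `new LateFlagSketch(true)` leaves `builtOverlapIterator` set to true even though `ignoreOverlaps()` returns true afterwards, which mirrors the "makes an overlap iterator and then never updates it" behaviour in the description.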
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Status: Review In Progress (was: Needs Committer) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Status: Ready to Commit (was: Review In Progress) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Status: Needs Committer (was: Patch Available) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Status: Patch Available (was: Requires Testing) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Status: Requires Testing (was: Review In Progress) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Reviewers: Branimir Lambov, Michael Semb Wever (was: Michael Semb Wever) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Reviewers: Branimir Lambov, Michael Semb Wever, Branimir Lambov (was: Branimir Lambov, Michael Semb Wever) Status: Review In Progress (was: Patch Available) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-18756) TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps overlaping SSTable references
[ https://issues.apache.org/jira/browse/CASSANDRA-18756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Branimir Lambov updated CASSANDRA-18756: Test and Documentation Plan: CI Status: Patch Available (was: In Progress) > TimeWindowCompactionStrategy with unsafe_aggressive_sstable_expiration keeps > overlaping SSTable references > -- > > Key: CASSANDRA-18756 > URL: https://issues.apache.org/jira/browse/CASSANDRA-18756 > Project: Cassandra > Issue Type: Bug > Components: Local/Compaction >Reporter: Branimir Lambov >Assignee: Ethan Brown >Priority: Normal > Time Spent: 0.5h > Remaining Estimate: 0h > > When {{unsafe_aggressive_sstable_expiration}} is turned on, TWCS should not > create or maintain an iterator of overlapping sstables. However, because > {{TimeWindowCompactionController}} inherits from {{CompactionController}} and > only sets {{ignoreOverlaps}} after the base class has constructed the overlap > iterator, it ends up making an overlap iterator and then never updating it. > The end result is that such a compaction keeps references to lots of and > likely _all_ other SSTables on the node and thus delays the deletion of > obsolete ones by hours or even days. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org