[jira] [Updated] (CASSANDRA-15394) Remove list iterators
[ https://issues.apache.org/jira/browse/CASSANDRA-15394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15394: Change Category: Performance Complexity: Low Hanging Fruit Fix Version/s: 4.0 Status: Open (was: Triage Needed) > Remove list iterators > - > > Key: CASSANDRA-15394 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15394 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0 > > > We allocate list iterators in several places in hot paths. This patch converts those loops to get elements by index instead, which provides a ~4% improvement in relevant workloads. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15394) Remove list iterators
[ https://issues.apache.org/jira/browse/CASSANDRA-15394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15394: Test and Documentation Plan: circle Status: Patch Available (was: Open) [4.0|https://github.com/bdeggleston/cassandra/tree/15394] > Remove list iterators > - > > Key: CASSANDRA-15394 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15394 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0 > > > We allocate list iterators in several places in hot paths. This patch converts those loops to get elements by index instead, which provides a ~4% improvement in relevant workloads.
[jira] [Comment Edited] (CASSANDRA-15393) Add byte array backed cells
[ https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963497#comment-16963497 ] Blake Eggleston edited comment on CASSANDRA-15393 at 10/30/19 10:52 PM: [4.0|https://github.com/bdeggleston/cassandra/tree/15393] - this depends on the changes in CASSANDRA-15391 was (Author: bdeggleston): [4.0|https://github.com/bdeggleston/cassandra/tree/15393] > Add byte array backed cells > --- > > Key: CASSANDRA-15393 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15393 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0 > > > We currently materialize all values as on heap byte buffers. Byte buffers > have a fairly high overhead given how frequently they’re used, and on the > compaction and local read path we don’t do anything that needs them. Use of > byte buffer methods only happens on the coordinator. Using cells that are > backed by byte arrays instead in these situations reduces compaction and read > garbage up to 22% in many cases. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15394) Remove list iterators
Blake Eggleston created CASSANDRA-15394: --- Summary: Remove list iterators Key: CASSANDRA-15394 URL: https://issues.apache.org/jira/browse/CASSANDRA-15394 Project: Cassandra Issue Type: Sub-task Components: Local/Compaction Reporter: Blake Eggleston Assignee: Blake Eggleston We allocate list iterators in several places in hot paths. This patch converts those loops to get elements by index instead, which provides a ~4% improvement in relevant workloads.
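To illustrate the kind of change this ticket describes, here is a minimal sketch, not taken from the patch. The class and method names (`IndexedLoopSketch`, `sumWithIterator`, `sumByIndex`) are hypothetical. It contrasts an enhanced-for loop, which asks the list for an iterator, with an index-based loop over the same list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not code from the Cassandra patch.
final class IndexedLoopSketch
{
    // The enhanced-for loop calls values.iterator(), which allocates an
    // Iterator object unless the JIT manages to eliminate it.
    static long sumWithIterator(List<Long> values)
    {
        long sum = 0;
        for (Long v : values)
            sum += v;
        return sum;
    }

    // Index-based access needs no iterator object. This only makes sense for
    // RandomAccess lists such as ArrayList, where get(i) is constant time.
    static long sumByIndex(List<Long> values)
    {
        long sum = 0;
        for (int i = 0, size = values.size(); i < size; i++)
            sum += values.get(i);
        return sum;
    }

    public static void main(String[] args)
    {
        List<Long> values = new ArrayList<>();
        for (long i = 1; i <= 4; i++)
            values.add(i);
        System.out.println(sumWithIterator(values) + " " + sumByIndex(values));
    }
}
```

Both loops compute the same result; the difference is only in what gets allocated per call, which is why it matters mainly on hot paths.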
[jira] [Updated] (CASSANDRA-15393) Add byte array backed cells
[ https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15393: Test and Documentation Plan: circle Status: Patch Available (was: Open) [4.0|https://github.com/bdeggleston/cassandra/tree/15393] > Add byte array backed cells > --- > > Key: CASSANDRA-15393 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15393 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0 > > > We currently materialize all values as on heap byte buffers. Byte buffers > have a fairly high overhead given how frequently they’re used, and on the > compaction and local read path we don’t do anything that needs them. Use of > byte buffer methods only happens on the coordinator. Using cells that are > backed by byte arrays instead in these situations reduces compaction and read > garbage up to 22% in many cases. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15392) Pool Merge Iterators
[ https://issues.apache.org/jira/browse/CASSANDRA-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15392: Test and Documentation Plan: circle Status: Patch Available (was: Open) [4.0|https://github.com/bdeggleston/cassandra/tree/15392] > Pool Merge Iterators > > > Key: CASSANDRA-15392 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15392 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0 > > > By pooling merge iterators, instead of creating new ones each time we need > them, we can reduce garbage on the compaction and read paths under relevant > workloads by ~4% in many cases. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15391) Reduce heap footprint of commonly allocated objects
[ https://issues.apache.org/jira/browse/CASSANDRA-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15391: Test and Documentation Plan: circle Status: Patch Available (was: Open) [4.0|https://github.com/bdeggleston/cassandra/tree/15391] > Reduce heap footprint of commonly allocated objects > --- > > Key: CASSANDRA-15391 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15391 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0 > > > BufferCell, BTreeRow, and Clustering make up a significant share of allocations during reads/compactions, and many of the fields of these classes are often unused. For example, the CellPath reference in BufferCell is only ever used for collection columns. Since we know which fields will and won’t be used during cell creation, we can define specialized classes that only take up heap space for the data they’ll be using. This reduces compaction garbage by up to 4.5%.
[jira] [Updated] (CASSANDRA-15390) Avoid unnecessary collection/iterator allocations during btree construction
[ https://issues.apache.org/jira/browse/CASSANDRA-15390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15390: Test and Documentation Plan: circle Status: Patch Available (was: Open) [4.0|https://github.com/bdeggleston/cassandra/tree/15390] > Avoid unnecessary collection/iterator allocations during btree construction > --- > > Key: CASSANDRA-15390 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15390 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0 > > > A heavily used btree builder path does a lot of unnecessary conversions to > and from collections and iterators. Adding dedicated support for Object[] > reduces compaction garbage by up to 8.3% -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15389) Minimize BTree iterator allocations
[ https://issues.apache.org/jira/browse/CASSANDRA-15389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15389: Test and Documentation Plan: circle Status: Patch Available (was: Open) [4.0|https://github.com/bdeggleston/cassandra/tree/15389] > Minimize BTree iterator allocations > --- > > Key: CASSANDRA-15389 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15389 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0
>
> Allocations of BTree iterators contribute a large amount of garbage to the compaction and read paths. This patch removes most btree iterator allocations on hot paths by:
> • using Row#apply where appropriate on frequently called methods (Row#digest, Row#validateData)
> • adding a BTree accumulate method. Like the apply method, it walks the btree with a function, but the function takes and returns a long argument. This eliminates iterator allocations without adding helper object allocations (BTreeRow#hasComplex, BTreeRow#hasInvalidDeletions, BTreeRow#dataSize, BTreeRow#unsharedHeapSizeExcludingData, Rows#collectStats, UnfilteredSerializer#serializedRowBodySize), and also eliminates the allocation of helper objects in places where apply was used previously^[1]^.
> • creating a map of columns in SerializationHeader, which lets us avoid allocating a btree search iterator for each row we serialize.
> These optimizations reduce garbage created during compaction by up to 13.5%.
>
> [1] the memory test does measure memory allocated by lambdas capturing objects
[jira] [Updated] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.
[ https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15388: Test and Documentation Plan: circle Status: Patch Available (was: Open) [4.0|https://github.com/bdeggleston/cassandra/tree/15388] > Add compaction allocation measurement test to support compaction gc optimization. > -- > > Key: CASSANDRA-15388 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15388 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0
>
> This adds a test that can quickly and accurately measure the effect of potential gc optimizations against a wide range of (synthetic) compaction workloads. It measures allocation rates from 16 workloads in less than 2 minutes.
> This test uses Google’s {{java-allocation-instrumenter}} agent to measure the workloads. Measurements using this agent are very accurate and pretty repeatable from run to run, with most variance being negligible (1-2 bytes per partition), although workloads with larger but fewer partitions vary a bit more (still less than 0.03%).
> The thinking behind this patch is that with compaction, we’re generally interested in the memory allocated per partition, since garbage scales more or less linearly with the number of partitions compacted. So measuring allocation from a small number of partitions that otherwise represent real world use cases is a good enough approximation.
> In addition to helping with compaction optimizations, this test could be used as a template for future optimization work. This pattern could also be used to set allocation limits on workloads/operations and fail CI if the allocation behavior changes past some threshold.
[jira] [Updated] (CASSANDRA-15392) Pool Merge Iterators
[ https://issues.apache.org/jira/browse/CASSANDRA-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15392: Change Category: Performance Complexity: Normal Fix Version/s: 4.0 Status: Open (was: Triage Needed) > Pool Merge Iterators > > > Key: CASSANDRA-15392 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15392 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0 > > > By pooling merge iterators, instead of creating new ones each time we need > them, we can reduce garbage on the compaction and read paths under relevant > workloads by ~4% in many cases. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15393) Add byte array backed cells
[ https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15393: Change Category: Performance Complexity: Normal Fix Version/s: 4.0 Status: Open (was: Triage Needed) > Add byte array backed cells > --- > > Key: CASSANDRA-15393 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15393 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0 > > > We currently materialize all values as on heap byte buffers. Byte buffers > have a fairly high overhead given how frequently they’re used, and on the > compaction and local read path we don’t do anything that needs them. Use of > byte buffer methods only happens on the coordinator. Using cells that are > backed by byte arrays instead in these situations reduces compaction and read > garbage up to 22% in many cases. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15391) Reduce heap footprint of commonly allocated objects
[ https://issues.apache.org/jira/browse/CASSANDRA-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15391: Change Category: Performance Complexity: Normal Fix Version/s: 4.0 Status: Open (was: Triage Needed) > Reduce heap footprint of commonly allocated objects > --- > > Key: CASSANDRA-15391 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15391 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0 > > > BufferCell, BTreeRow, and Clustering make up a significant share of allocations during reads/compactions, and many of the fields of these classes are often unused. For example, the CellPath reference in BufferCell is only ever used for collection columns. Since we know which fields will and won’t be used during cell creation, we can define specialized classes that only take up heap space for the data they’ll be using. This reduces compaction garbage by up to 4.5%.
[jira] [Updated] (CASSANDRA-15389) Minimize BTree iterator allocations
[ https://issues.apache.org/jira/browse/CASSANDRA-15389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15389: Change Category: Performance Complexity: Normal Component/s: Local/Compaction Fix Version/s: 4.0 Status: Open (was: Triage Needed) > Minimize BTree iterator allocations > --- > > Key: CASSANDRA-15389 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15389 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0
>
> Allocations of BTree iterators contribute a large amount of garbage to the compaction and read paths. This patch removes most btree iterator allocations on hot paths by:
> • using Row#apply where appropriate on frequently called methods (Row#digest, Row#validateData)
> • adding a BTree accumulate method. Like the apply method, it walks the btree with a function, but the function takes and returns a long argument. This eliminates iterator allocations without adding helper object allocations (BTreeRow#hasComplex, BTreeRow#hasInvalidDeletions, BTreeRow#dataSize, BTreeRow#unsharedHeapSizeExcludingData, Rows#collectStats, UnfilteredSerializer#serializedRowBodySize), and also eliminates the allocation of helper objects in places where apply was used previously^[1]^.
> • creating a map of columns in SerializationHeader, which lets us avoid allocating a btree search iterator for each row we serialize.
> These optimizations reduce garbage created during compaction by up to 13.5%.
>
> [1] the memory test does measure memory allocated by lambdas capturing objects
[jira] [Updated] (CASSANDRA-15390) Avoid unnecessary collection/iterator allocations during btree construction
[ https://issues.apache.org/jira/browse/CASSANDRA-15390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15390: Change Category: Performance Complexity: Normal Fix Version/s: 4.0 Status: Open (was: Triage Needed) > Avoid unnecessary collection/iterator allocations during btree construction > --- > > Key: CASSANDRA-15390 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15390 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0 > > > A heavily used btree builder path does a lot of unnecessary conversions to > and from collections and iterators. Adding dedicated support for Object[] > reduces compaction garbage by up to 8.3% -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15393) Add byte array backed cells
Blake Eggleston created CASSANDRA-15393: --- Summary: Add byte array backed cells Key: CASSANDRA-15393 URL: https://issues.apache.org/jira/browse/CASSANDRA-15393 Project: Cassandra Issue Type: Sub-task Components: Local/Compaction Reporter: Blake Eggleston Assignee: Blake Eggleston We currently materialize all values as on heap byte buffers. Byte buffers have a fairly high overhead given how frequently they’re used, and on the compaction and local read path we don’t do anything that needs them. Use of byte buffer methods only happens on the coordinator. Using cells that are backed by byte arrays instead in these situations reduces compaction and read garbage up to 22% in many cases. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
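As a rough illustration of the idea in this ticket, and not the patch's actual classes, the sketch below keeps a value as a plain byte array and only creates a ByteBuffer wrapper when a caller asks for one. The name `ArrayBackedValue` and its methods are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; the real cell classes in the patch differ.
final class ArrayBackedValue
{
    private final byte[] bytes;

    ArrayBackedValue(byte[] bytes)
    {
        this.bytes = bytes;
    }

    // Size checks and similar local work can use the array directly,
    // without a ByteBuffer object per value.
    int size()
    {
        return bytes.length;
    }

    // A ByteBuffer view is allocated only on demand, e.g. for code that
    // really needs byte buffer methods.
    ByteBuffer asBuffer()
    {
        return ByteBuffer.wrap(bytes);
    }
}
```

The trade-off in this sketch is that every `asBuffer()` call allocates a fresh wrapper, so it only pays off where buffer access is rare, which is what the ticket says about the compaction and local read paths.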
[jira] [Created] (CASSANDRA-15392) Pool Merge Iterators
Blake Eggleston created CASSANDRA-15392: --- Summary: Pool Merge Iterators Key: CASSANDRA-15392 URL: https://issues.apache.org/jira/browse/CASSANDRA-15392 Project: Cassandra Issue Type: Sub-task Components: Local/Compaction Reporter: Blake Eggleston Assignee: Blake Eggleston By pooling merge iterators, instead of creating new ones each time we need them, we can reduce garbage on the compaction and read paths under relevant workloads by ~4% in many cases. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
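The pooling idea can be sketched generically as follows. This is an assumption-laden illustration rather than the patch itself: `SimplePool`, `acquire`, `release`, and the `maxIdle` limit are hypothetical names, the pool is not thread-safe, and callers are assumed to reset an instance's state before releasing it.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical, single-threaded object pool; not code from the Cassandra patch.
final class SimplePool<T>
{
    private final ArrayDeque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private final int maxIdle;

    SimplePool(Supplier<T> factory, int maxIdle)
    {
        this.factory = factory;
        this.maxIdle = maxIdle;
    }

    // Reuse an idle instance if one is available, otherwise allocate a new one.
    T acquire()
    {
        T pooled = idle.pollFirst();
        return pooled != null ? pooled : factory.get();
    }

    // Hand an instance back for reuse. The caller must have cleared its state.
    // Instances beyond maxIdle are simply dropped and left to the GC.
    void release(T instance)
    {
        if (idle.size() < maxIdle)
            idle.addFirst(instance);
    }
}
```

A caller would acquire an object at the start of an operation and release it at the end, for example `StringBuilder sb = pool.acquire(); ...; sb.setLength(0); pool.release(sb);`.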
[jira] [Created] (CASSANDRA-15391) Reduce heap footprint of commonly allocated objects
Blake Eggleston created CASSANDRA-15391: --- Summary: Reduce heap footprint of commonly allocated objects Key: CASSANDRA-15391 URL: https://issues.apache.org/jira/browse/CASSANDRA-15391 Project: Cassandra Issue Type: Sub-task Components: Local/Compaction Reporter: Blake Eggleston Assignee: Blake Eggleston BufferCell, BTreeRow, and Clustering make up a significant share of allocations during reads/compactions, and many of the fields of these classes are often unused. For example, the CellPath reference in BufferCell is only ever used for collection columns. Since we know which fields will and won’t be used during cell creation, we can define specialized classes that only take up heap space for the data they’ll be using. This reduces compaction garbage by up to 4.5%.
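A minimal sketch of the specialization described above, under stated assumptions: the names `CellSketch`, `SimpleCell`, `ComplexCell`, and `create` are hypothetical, and the path is modelled as a plain Object rather than Cassandra's CellPath. The point is only that the variant without a path carries no field for it.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the classes from the Cassandra patch.
final class CellSketch
{
    interface Cell
    {
        long timestamp();
        ByteBuffer value();
        Object path(); // null for cells of non-collection columns
    }

    // No path field, so instances do not pay for an unused reference.
    static final class SimpleCell implements Cell
    {
        private final long timestamp;
        private final ByteBuffer value;

        SimpleCell(long timestamp, ByteBuffer value)
        {
            this.timestamp = timestamp;
            this.value = value;
        }

        public long timestamp() { return timestamp; }
        public ByteBuffer value() { return value; }
        public Object path() { return null; }
    }

    // Carries the extra reference only when it is actually needed.
    static final class ComplexCell implements Cell
    {
        private final long timestamp;
        private final ByteBuffer value;
        private final Object path;

        ComplexCell(long timestamp, ByteBuffer value, Object path)
        {
            this.timestamp = timestamp;
            this.value = value;
            this.path = path;
        }

        public long timestamp() { return timestamp; }
        public ByteBuffer value() { return value; }
        public Object path() { return path; }
    }

    // The choice is made once, at creation time, when it is known
    // whether a path will be present.
    static Cell create(long timestamp, ByteBuffer value, Object path)
    {
        return path == null ? new SimpleCell(timestamp, value)
                            : new ComplexCell(timestamp, value, path);
    }
}
```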
[jira] [Created] (CASSANDRA-15390) Avoid unnecessary collection/iterator allocations during btree construction
Blake Eggleston created CASSANDRA-15390: --- Summary: Avoid unnecessary collection/iterator allocations during btree construction Key: CASSANDRA-15390 URL: https://issues.apache.org/jira/browse/CASSANDRA-15390 Project: Cassandra Issue Type: Sub-task Components: Local/Compaction Reporter: Blake Eggleston Assignee: Blake Eggleston A heavily used btree builder path does a lot of unnecessary conversions to and from collections and iterators. Adding dedicated support for Object[] reduces compaction garbage by up to 8.3% -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15389) Minimize BTree iterator allocations
Blake Eggleston created CASSANDRA-15389: --- Summary: Minimize BTree iterator allocations Key: CASSANDRA-15389 URL: https://issues.apache.org/jira/browse/CASSANDRA-15389 Project: Cassandra Issue Type: Sub-task Reporter: Blake Eggleston Assignee: Blake Eggleston
Allocations of BTree iterators contribute a large amount of garbage to the compaction and read paths. This patch removes most btree iterator allocations on hot paths by:
• using Row#apply where appropriate on frequently called methods (Row#digest, Row#validateData)
• adding a BTree accumulate method. Like the apply method, it walks the btree with a function, but the function takes and returns a long argument. This eliminates iterator allocations without adding helper object allocations (BTreeRow#hasComplex, BTreeRow#hasInvalidDeletions, BTreeRow#dataSize, BTreeRow#unsharedHeapSizeExcludingData, Rows#collectStats, UnfilteredSerializer#serializedRowBodySize), and also eliminates the allocation of helper objects in places where apply was used previously^[1]^.
• creating a map of columns in SerializationHeader, which lets us avoid allocating a btree search iterator for each row we serialize.
These optimizations reduce garbage created during compaction by up to 13.5%.
[1] the memory test does measure memory allocated by lambdas capturing objects
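The accumulate pattern from the second bullet can be sketched as below. This is an illustration under assumptions, not the BTree code: it walks a plain array instead of a btree, and the names `AccumulateSketch` and `LongAccumulator` are hypothetical. The running total is threaded through as a primitive long, so no iterator and no mutable holder object is needed.

```java
// Hypothetical illustration only; the real method walks a btree, not an array.
final class AccumulateSketch
{
    @FunctionalInterface
    interface LongAccumulator<T>
    {
        // Takes the current running value and returns the updated one.
        long apply(T element, long current);
    }

    static <T> long accumulate(T[] elements, LongAccumulator<T> accumulator, long initial)
    {
        long value = initial;
        for (int i = 0; i < elements.length; i++)
            value = accumulator.apply(elements[i], value);
        return value;
    }

    public static void main(String[] args)
    {
        String[] cells = { "ab", "cde", "" };
        // The lambda captures nothing, so it does not need per-call state.
        long dataSize = accumulate(cells, (cell, size) -> size + cell.length(), 0L);
        System.out.println(dataSize);
    }
}
```

Because the function receives and returns the accumulated long, callers do not have to capture a counter object in the lambda, which is the kind of helper allocation the footnote refers to.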
[jira] [Updated] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.
[ https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15388: Change Category: Performance Complexity: Normal Fix Version/s: 4.0 Status: Open (was: Triage Needed) > Add compaction allocation measurement test to support compaction gc optimization. > -- > > Key: CASSANDRA-15388 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15388 > Project: Cassandra > Issue Type: Sub-task > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0
>
> This adds a test that can quickly and accurately measure the effect of potential gc optimizations against a wide range of (synthetic) compaction workloads. It measures allocation rates from 16 workloads in less than 2 minutes.
> This test uses Google’s {{java-allocation-instrumenter}} agent to measure the workloads. Measurements using this agent are very accurate and pretty repeatable from run to run, with most variance being negligible (1-2 bytes per partition), although workloads with larger but fewer partitions vary a bit more (still less than 0.03%).
> The thinking behind this patch is that with compaction, we’re generally interested in the memory allocated per partition, since garbage scales more or less linearly with the number of partitions compacted. So measuring allocation from a small number of partitions that otherwise represent real world use cases is a good enough approximation.
> In addition to helping with compaction optimizations, this test could be used as a template for future optimization work. This pattern could also be used to set allocation limits on workloads/operations and fail CI if the allocation behavior changes past some threshold.
[jira] [Created] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.
Blake Eggleston created CASSANDRA-15388: --- Summary: Add compaction allocation measurement test to support compaction gc optimization. Key: CASSANDRA-15388 URL: https://issues.apache.org/jira/browse/CASSANDRA-15388 Project: Cassandra Issue Type: Sub-task Components: Local/Compaction Reporter: Blake Eggleston Assignee: Blake Eggleston
This adds a test that can quickly and accurately measure the effect of potential gc optimizations against a wide range of (synthetic) compaction workloads. It measures allocation rates from 16 workloads in less than 2 minutes.
This test uses Google’s {{java-allocation-instrumenter}} agent to measure the workloads. Measurements using this agent are very accurate and pretty repeatable from run to run, with most variance being negligible (1-2 bytes per partition), although workloads with larger but fewer partitions vary a bit more (still less than 0.03%).
The thinking behind this patch is that with compaction, we’re generally interested in the memory allocated per partition, since garbage scales more or less linearly with the number of partitions compacted. So measuring allocation from a small number of partitions that otherwise represent real world use cases is a good enough approximation.
In addition to helping with compaction optimizations, this test could be used as a template for future optimization work. This pattern could also be used to set allocation limits on workloads/operations and fail CI if the allocation behavior changes past some threshold.
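The last paragraph's idea of failing a build when allocation exceeds a limit could look roughly like the sketch below. It is a stand-in, not the test from the patch: instead of the java-allocation-instrumenter agent (which needs a -javaagent flag), it reads HotSpot's per-thread allocation counter through `com.sun.management.ThreadMXBean`, which is coarser and JVM-specific. The names `AllocationBudgetSketch`, `measureAllocatedBytes`, and `withinBudget`, and the per-partition budget, are hypothetical.

```java
import java.lang.management.ManagementFactory;

// Hypothetical illustration only; uses a HotSpot-specific counter rather than
// the instrumentation agent named in the ticket.
final class AllocationBudgetSketch
{
    // Bytes allocated by the current thread while the task runs. Work done on
    // other threads is not counted, and the counter may be unsupported on
    // non-HotSpot JVMs.
    static long measureAllocatedBytes(Runnable task)
    {
        com.sun.management.ThreadMXBean threads =
            (com.sun.management.ThreadMXBean) ManagementFactory.getThreadMXBean();
        long threadId = Thread.currentThread().getId();
        long before = threads.getThreadAllocatedBytes(threadId);
        task.run();
        return threads.getThreadAllocatedBytes(threadId) - before;
    }

    // Garbage is assumed to scale roughly linearly with partition count, so
    // the limit is expressed as bytes per partition.
    static boolean withinBudget(long allocatedBytes, long partitions, long maxBytesPerPartition)
    {
        return allocatedBytes <= partitions * maxBytesPerPartition;
    }
}
```

A CI check built on this would run a fixed workload, call `measureAllocatedBytes`, and fail if `withinBudget` returns false for the agreed threshold.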
[jira] [Updated] (CASSANDRA-15387) Reduce compaction & local read path garbage
[ https://issues.apache.org/jira/browse/CASSANDRA-15387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Blake Eggleston updated CASSANDRA-15387: Change Category: Performance Complexity: Normal Fix Version/s: 4.0 Status: Open (was: Triage Needed) > Reduce compaction & local read path garbage > --- > > Key: CASSANDRA-15387 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15387 > Project: Cassandra > Issue Type: Improvement > Components: Local/Compaction >Reporter: Blake Eggleston >Assignee: Blake Eggleston >Priority: Normal > Fix For: 4.0 > > > There are several opportunities to significantly reduce the amount of garbage > generated by compaction and the local read path. This will serve as a top > level jira for related changes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Created] (CASSANDRA-15387) Reduce compaction & local read path garbage
Blake Eggleston created CASSANDRA-15387: --- Summary: Reduce compaction & local read path garbage Key: CASSANDRA-15387 URL: https://issues.apache.org/jira/browse/CASSANDRA-15387 Project: Cassandra Issue Type: Improvement Components: Local/Compaction Reporter: Blake Eggleston Assignee: Blake Eggleston There are several opportunities to significantly reduce the amount of garbage generated by compaction and the local read path. This will serve as a top level jira for related changes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra-website] branch master updated: Fixed docker builds
This is an automated email from the ASF dual-hosted git repository. rustyrazorblade pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/cassandra-website.git The following commit(s) were added to refs/heads/master by this push: new 28cdbd7 Fixed docker builds 28cdbd7 is described below commit 28cdbd705ec2a3194f7a581dd2702dab15238043 Author: Jon Haddad AuthorDate: Wed Oct 30 13:09:10 2019 -0700 Fixed docker builds --- docker-compose.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docker-compose.yml b/docker-compose.yml index 261f99c..cfa944c 100644 --- a/docker-compose.yml +++ b/docker-compose.yml @@ -6,7 +6,7 @@ services: image: cassandra-website:latest volumes: - ./src:/usr/src/cassandra-site/src - - ./content:/usr/src/cassandra-site/content + - ./content:/usr/src/cassandra-site/publish cassandra-website-serve: build: . - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15385) Ensure that tracing doesn't break connections in 3.x/4.0 mixed mode by default
[ https://issues.apache.org/jira/browse/CASSANDRA-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-15385: -- Component/s: Observability/Tracing > Ensure that tracing doesn't break connections in 3.x/4.0 mixed mode by default > -- > > Key: CASSANDRA-15385 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15385 > Project: Cassandra > Issue Type: Bug > Components: Observability/Tracing >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko >Priority: Normal > Fix For: 3.0.x, 3.11.x > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[jira] [Updated] (CASSANDRA-15385) Ensure that tracing doesn't break connections in 3.x/4.0 mixed mode by default
[ https://issues.apache.org/jira/browse/CASSANDRA-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-15385: -- Fix Version/s: 3.11.x 3.0.x > Ensure that tracing doesn't break connections in 3.x/4.0 mixed mode by default > -- > > Key: CASSANDRA-15385 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15385 > Project: Cassandra > Issue Type: Bug >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko >Priority: Normal > Fix For: 3.0.x, 3.11.x > >
[jira] [Updated] (CASSANDRA-15385) Ensure that tracing doesn't break connections in 3.x/4.0 mixed mode by default
[ https://issues.apache.org/jira/browse/CASSANDRA-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aleksey Yeschenko updated CASSANDRA-15385: -- Summary: Ensure that tracing doesn't break connections in 3.x/4.0 mixed mode by default (was: TBD (minor; boring)) > Ensure that tracing doesn't break connections in 3.x/4.0 mixed mode by default > -- > > Key: CASSANDRA-15385 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15385 > Project: Cassandra > Issue Type: Bug >Reporter: Aleksey Yeschenko >Assignee: Aleksey Yeschenko >Priority: Normal >
[jira] [Commented] (CASSANDRA-15332) When repair is running with tracing, if a CorruptSSTableException is thrown while building Merkle Trees the DiskFailurePolicy does not get applied
[ https://issues.apache.org/jira/browse/CASSANDRA-15332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963240#comment-16963240 ] David Capwell commented on CASSANDRA-15332: --- [~marcuse] I pushed all changes based off your feedback. [~jrwest] can you re-review? The changes to src/java are completely different since your last review, so good to look at that again (tests are mostly the same, small changes) > When repair is running with tracing, if a CorruptSSTableException is thrown > while building Merkle Trees the DiskFailurePolicy does not get applied > -- > > Key: CASSANDRA-15332 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15332 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair, Observability/Tracing >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Labels: pull-request-available > Fix For: 4.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > When a repair is in the validation phase and is building MerkleTrees, if a > corrupt SSTable exception is thrown the disk failure policy does not get > applied
[jira] [Updated] (CASSANDRA-15386) Use multiple data directories in the in-jvm dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-15386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-15386: Test and Documentation Plan: circle ci run Status: Patch Available (was: Open) [patch|https://github.com/krummas/cassandra/commits/marcuse/15386] [tests|https://circleci.com/workflow-run/c002075c-d210-465c-b7fd-bf6abaf7dc27] > Use multiple data directories in the in-jvm dtests > -- > > Key: CASSANDRA-15386 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15386 > Project: Cassandra > Issue Type: Improvement > Components: Local/Compaction, Test/dtest >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 3.11.x, 4.x > > > We should default to using 3 data directories when running the in-jvm dtests.
[jira] [Updated] (CASSANDRA-15386) Use multiple data directories in the in-jvm dtests
[ https://issues.apache.org/jira/browse/CASSANDRA-15386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-15386: Change Category: Quality Assurance Complexity: Low Hanging Fruit Fix Version/s: 4.x 3.11.x Status: Open (was: Triage Needed) > Use multiple data directories in the in-jvm dtests > -- > > Key: CASSANDRA-15386 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15386 > Project: Cassandra > Issue Type: Improvement > Components: Local/Compaction, Test/dtest >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 3.11.x, 4.x > > > We should default to using 3 data directories when running the in-jvm dtests.
[jira] [Created] (CASSANDRA-15386) Use multiple data directories in the in-jvm dtests
Marcus Eriksson created CASSANDRA-15386: --- Summary: Use multiple data directories in the in-jvm dtests Key: CASSANDRA-15386 URL: https://issues.apache.org/jira/browse/CASSANDRA-15386 Project: Cassandra Issue Type: Improvement Components: Local/Compaction, Test/dtest Reporter: Marcus Eriksson Assignee: Marcus Eriksson We should default to using 3 data directories when running the in-jvm dtests.
[jira] [Updated] (CASSANDRA-13974) Bad prefix matching when figuring out data directory for an sstable
[ https://issues.apache.org/jira/browse/CASSANDRA-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-13974: Test and Documentation Plan: new tests, circleci runs Status: Patch Available (was: In Progress) Pushed a fix to the 3.11, trunk branches above which keeps a mapping of canonical data path to data directory - we then use this map when figuring out the data directory for an sstable. The directory in the descriptor is always canonical. > Bad prefix matching when figuring out data directory for an sstable > --- > > Key: CASSANDRA-13974 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13974 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Core >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.x > > > We do a "startsWith" check when getting data directory for an sstable, we > should match including File.separator
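The prefix problem described in CASSANDRA-13974 above can be sketched in a few lines. This is only an illustration under stated assumptions, not the committed patch (which, per the comment, keeps a map from canonical data path to data directory). The class name `DataDirectoryMatch`, both method names, and the example paths are hypothetical.

```java
import java.io.File;

public class DataDirectoryMatch
{
    // Naive check: a directory such as <sep>data<sep>disk1 also "matches"
    // an sstable stored under <sep>data<sep>disk10.
    static boolean naiveMatch(String sstablePath, String dataDir)
    {
        return sstablePath.startsWith(dataDir);
    }

    // Appending File.separator forces the prefix to end on a path boundary.
    static boolean separatorAwareMatch(String sstablePath, String dataDir)
    {
        String prefix = dataDir.endsWith(File.separator) ? dataDir : dataDir + File.separator;
        return sstablePath.equals(dataDir) || sstablePath.startsWith(prefix);
    }

    public static void main(String[] args)
    {
        String sep = File.separator;
        String disk1 = sep + "data" + sep + "disk1";
        String sstableDir = sep + "data" + sep + "disk10" + sep + "ks" + sep + "tbl";

        // The naive check wrongly attributes the sstable to disk1.
        System.out.println("naive:           " + naiveMatch(sstableDir, disk1));
        // The separator-aware check does not.
        System.out.println("separator-aware: " + separatorAwareMatch(sstableDir, disk1));
    }
}
```

With the hypothetical paths above, the naive check reports a match for the wrong directory while the separator-aware check does not.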
[jira] [Created] (CASSANDRA-15385) TBD (minor; boring)
Aleksey Yeschenko created CASSANDRA-15385: - Summary: TBD (minor; boring) Key: CASSANDRA-15385 URL: https://issues.apache.org/jira/browse/CASSANDRA-15385 Project: Cassandra Issue Type: Bug Reporter: Aleksey Yeschenko Assignee: Aleksey Yeschenko
[jira] [Updated] (CASSANDRA-13974) Bad prefix matching when figuring out data directory for an sstable
[ https://issues.apache.org/jira/browse/CASSANDRA-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-13974: Status: In Progress (was: Changes Suggested) > Bad prefix matching when figuring out data directory for an sstable > --- > > Key: CASSANDRA-13974 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13974 > Project: Cassandra > Issue Type: Bug > Components: Legacy/Core >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 3.0.x, 3.11.x, 4.x > > > We do a "startsWith" check when getting data directory for an sstable, we > should match including File.separator
[jira] [Updated] (CASSANDRA-15365) Add primary key liveness info when skipping illegal cells
[ https://issues.apache.org/jira/browse/CASSANDRA-15365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcus Eriksson updated CASSANDRA-15365: Fix Version/s: (was: 3.11.x) (was: 3.0.x) 3.11.6 3.0.20 Since Version: 3.0 alpha 1 Source Control Link: https://github.com/apache/cassandra/commit/767a68cd00050298abf7bbfd8b322e5663439c23 Resolution: Fixed Status: Resolved (was: Ready to Commit) and committed to 3.0 and merged to 3.11, thanks test runs: [3.0|https://circleci.com/workflow-run/0c910638-db5f-494b-ba47-cc728a4d2eb6] [3.11|https://circleci.com/workflow-run/da2cd137-ae9e-45bf-9c4d-1b98a006e08a] > Add primary key liveness info when skipping illegal cells > - > > Key: CASSANDRA-15365 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15365 > Project: Cassandra > Issue Type: Bug > Components: Local/SSTable >Reporter: Marcus Eriksson >Assignee: Marcus Eriksson >Priority: Normal > Fix For: 3.0.20, 3.11.6 > > > In CASSANDRA-15086/CASSANDRA-15178 we started skipping the illegal legacy > cells, problem is that if the row only contains illegal cells, we return a > totally empty row which breaks stats collection: > https://github.com/apache/cassandra/blob/93815db9853cb592edf13d82e91dc2e9d172f01f/src/java/org/apache/cassandra/db/rows/Rows.java#L70 > If the row only has these invalid cells, we should add a primary key liveness > info to it to match the 2.1 behaviour.
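The behaviour requested in CASSANDRA-15365 above (add a row marker when a row holds nothing but skipped illegal cells) can be sketched with a toy model. Everything here is a simplified assumption: `RowMarkerSketch`, `Grouper`, and `rowMarkerTimestamp` are hypothetical names, a bare timestamp stands in for Cassandra's `LivenessInfo`, and keeping the newest timestamp among invalid cells is a choice made for this sketch rather than something the ticket states.

```java
public class RowMarkerSketch
{
    // Hypothetical, heavily simplified stand-in for a per-row cell grouper.
    static final class Grouper
    {
        private boolean hasValidCells = false;
        private Long invalidCellTimestamp = null; // stand-in for liveness info

        void addCell(boolean valid, long timestamp)
        {
            if (valid)
            {
                hasValidCells = true;
            }
            else if (invalidCellTimestamp == null || timestamp > invalidCellTimestamp)
            {
                // Assumption: remember the newest timestamp among skipped cells.
                invalidCellTimestamp = timestamp;
            }
        }

        // Timestamp for a row marker, or null when no marker is needed:
        // a marker is only added if the row has no other data.
        Long rowMarkerTimestamp()
        {
            return (!hasValidCells && invalidCellTimestamp != null) ? invalidCellTimestamp : null;
        }
    }

    public static void main(String[] args)
    {
        Grouper onlyInvalid = new Grouper();
        onlyInvalid.addCell(false, 10L);
        onlyInvalid.addCell(false, 20L);
        System.out.println("only invalid cells -> marker: " + onlyInvalid.rowMarkerTimestamp());

        Grouper mixed = new Grouper();
        mixed.addCell(false, 10L);
        mixed.addCell(true, 5L);
        System.out.println("mixed cells        -> marker: " + mixed.rowMarkerTimestamp());
    }
}
```

The point of the sketch is only the decision rule: a row that would otherwise come back completely empty gets a marker, while a row with any valid cell is left untouched.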
[cassandra] 01/01: Merge branch 'cassandra-3.0' into cassandra-3.11
This is an automated email from the ASF dual-hosted git repository. marcuse pushed a commit to branch cassandra-3.11 in repository https://gitbox.apache.org/repos/asf/cassandra.git commit 2d90e3c2a443e4da3d04cb5701d30fb709406c96 Merge: fb50d82 767a68c Author: Marcus Eriksson AuthorDate: Wed Oct 30 15:10:41 2019 +0100 Merge branch 'cassandra-3.0' into cassandra-3.11 CHANGES.txt| 5 +- src/java/org/apache/cassandra/db/LegacyLayout.java | 56 --- ...with_illegal_cell_names-ka-2-CompressionInfo.db | Bin 0 -> 43 bytes ...-legacy_ka_with_illegal_cell_names-ka-2-Data.db | Bin 0 -> 59 bytes ...acy_ka_with_illegal_cell_names-ka-2-Digest.sha1 | 1 + ...egacy_ka_with_illegal_cell_names-ka-2-Filter.db | Bin 0 -> 16 bytes ...legacy_ka_with_illegal_cell_names-ka-2-Index.db | Bin 0 -> 18 bytes ...y_ka_with_illegal_cell_names-ka-2-Statistics.db | Bin 0 -> 4452 bytes ...-legacy_ka_with_illegal_cell_names-ka-2-TOC.txt | 8 +++ ...th_illegal_cell_names_2-ka-1-CompressionInfo.db | Bin 0 -> 43 bytes ...egacy_ka_with_illegal_cell_names_2-ka-1-Data.db | Bin 0 -> 67 bytes ...y_ka_with_illegal_cell_names_2-ka-1-Digest.sha1 | 1 + ...acy_ka_with_illegal_cell_names_2-ka-1-Filter.db | Bin 0 -> 16 bytes ...gacy_ka_with_illegal_cell_names_2-ka-1-Index.db | Bin 0 -> 18 bytes ...ka_with_illegal_cell_names_2-ka-1-Statistics.db | Bin 0 -> 4452 bytes ...cy_ka_with_illegal_cell_names_2-ka-1-Summary.db | Bin 0 -> 92 bytes ...egacy_ka_with_illegal_cell_names_2-ka-1-TOC.txt | 8 +++ .../cassandra/io/sstable/LegacySSTableTest.java| 80 +++-- 18 files changed, 144 insertions(+), 15 deletions(-) diff --cc CHANGES.txt index efd9ecb,d58a199..c21b2cf --- a/CHANGES.txt +++ b/CHANGES.txt @@@ -1,14 -1,10 +1,17 @@@ -3.0.20 +3.11.6 - Merged from 2.2 ++Merged from 3.0: + * Ensure legacy rows have primary key livenessinfo when they contain illegal cells (CASSANDRA-15365) -Merged from 2.2 ++Merged from 2.2: * In-JVM DTest: Set correct internode message version for upgrade test (CASSANDRA-15371) + -3.0.19 +3.11.5 + * Fix 
SASI non-literal string comparisons (range operators) (CASSANDRA-15169) + * Make sure user defined compaction transactions are always closed (CASSANDRA-15123) + * Fix cassandra-env.sh to use $CASSANDRA_CONF to find cassandra-jaas.config (CASSANDRA-14305) + * Fixed nodetool cfstats printing index name twice (CASSANDRA-14903) + * Add flag to disable SASI indexes, and warnings on creation (CASSANDRA-14866) +Merged from 3.0: * Add ability to cap max negotiable protocol version (CASSANDRA-15193) * Gossip tokens on startup if available (CASSANDRA-15335) * Fix resource leak in CompressedSequentialWriter (CASSANDRA-15340)
[cassandra] branch cassandra-3.11 updated (fb50d82 -> 2d90e3c)
This is an automated email from the ASF dual-hosted git repository. marcuse pushed a change to branch cassandra-3.11 in repository https://gitbox.apache.org/repos/asf/cassandra.git. from fb50d82 Increment version to 3.11.6 new 767a68c Ensure legacy rows have primary key livenessinfo when they contain illegal cells new 2d90e3c Merge branch 'cassandra-3.0' into cassandra-3.11 The 2 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes: CHANGES.txt| 5 +- src/java/org/apache/cassandra/db/LegacyLayout.java | 56 --- ...with_illegal_cell_names-ka-2-CompressionInfo.db | Bin 0 -> 43 bytes ...-legacy_ka_with_illegal_cell_names-ka-2-Data.db | Bin 0 -> 59 bytes ...acy_ka_with_illegal_cell_names-ka-2-Digest.sha1 | 1 + ...egacy_ka_with_illegal_cell_names-ka-2-Filter.db | Bin 0 -> 16 bytes ...legacy_ka_with_illegal_cell_names-ka-2-Index.db | Bin 0 -> 18 bytes ..._ka_with_illegal_cell_names-ka-2-Statistics.db} | Bin 4450 -> 4452 bytes ...legacy_ka_with_illegal_cell_names-ka-2-TOC.txt} | 10 +-- ...th_illegal_cell_names_2-ka-1-CompressionInfo.db | Bin 0 -> 43 bytes ...egacy_ka_with_illegal_cell_names_2-ka-1-Data.db | Bin 0 -> 67 bytes ...y_ka_with_illegal_cell_names_2-ka-1-Digest.sha1 | 1 + ...acy_ka_with_illegal_cell_names_2-ka-1-Filter.db | Bin 0 -> 16 bytes ...gacy_ka_with_illegal_cell_names_2-ka-1-Index.db | Bin 0 -> 18 bytes ...a_with_illegal_cell_names_2-ka-1-Statistics.db} | Bin 4450 -> 4452 bytes ...cy_ka_with_illegal_cell_names_2-ka-1-Summary.db | Bin 0 -> 92 bytes ...gacy_ka_with_illegal_cell_names_2-ka-1-TOC.txt} | 10 +-- .../cassandra/io/sstable/LegacySSTableTest.java| 80 +++-- 18 files changed, 138 insertions(+), 25 deletions(-) create mode 100644 
test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names/legacy_tables-legacy_ka_with_illegal_cell_names-ka-2-CompressionInfo.db create mode 100644 test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names/legacy_tables-legacy_ka_with_illegal_cell_names-ka-2-Data.db create mode 100644 test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names/legacy_tables-legacy_ka_with_illegal_cell_names-ka-2-Digest.sha1 create mode 100644 test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names/legacy_tables-legacy_ka_with_illegal_cell_names-ka-2-Filter.db create mode 100644 test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names/legacy_tables-legacy_ka_with_illegal_cell_names-ka-2-Index.db copy test/data/legacy-sstables/ka/legacy_tables/{legacy_ka_14766/legacy_tables-legacy_ka_14766-ka-1-Statistics.db => legacy_ka_with_illegal_cell_names/legacy_tables-legacy_ka_with_illegal_cell_names-ka-2-Statistics.db} (91%) copy test/data/{bloom-filter/ka/foo/foo-atable-ka-1-TOC.txt => legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names/legacy_tables-legacy_ka_with_illegal_cell_names-ka-2-TOC.txt} (100%) create mode 100644 test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-CompressionInfo.db create mode 100644 test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-Data.db create mode 100644 test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-Digest.sha1 create mode 100644 test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-Filter.db create mode 100644 
test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-Index.db copy test/data/legacy-sstables/ka/legacy_tables/{legacy_ka_14766/legacy_tables-legacy_ka_14766-ka-1-Statistics.db => legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-Statistics.db} (91%) create mode 100644 test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-Summary.db copy test/data/{bloom-filter/ka/foo/foo-atable-ka-1-TOC.txt => legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-TOC.txt} (100%) - To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org
[cassandra] branch cassandra-3.0 updated: Ensure legacy rows have primary key livenessinfo when they contain illegal cells
This is an automated email from the ASF dual-hosted git repository. marcuse pushed a commit to branch cassandra-3.0 in repository https://gitbox.apache.org/repos/asf/cassandra.git The following commit(s) were added to refs/heads/cassandra-3.0 by this push: new 767a68c Ensure legacy rows have primary key livenessinfo when they contain illegal cells 767a68c is described below commit 767a68cd00050298abf7bbfd8b322e5663439c23 Author: Marcus Eriksson AuthorDate: Thu Oct 17 10:24:57 2019 +0200 Ensure legacy rows have primary key livenessinfo when they contain illegal cells Patch by marcuse and Sam Tunnicliffe; reviewed by Benedict Elliott Smith for CASSANDRA-15365 --- CHANGES.txt| 1 + src/java/org/apache/cassandra/db/LegacyLayout.java | 56 --- ...with_illegal_cell_names-ka-2-CompressionInfo.db | Bin 0 -> 43 bytes ...-legacy_ka_with_illegal_cell_names-ka-2-Data.db | Bin 0 -> 59 bytes ...acy_ka_with_illegal_cell_names-ka-2-Digest.sha1 | 1 + ...egacy_ka_with_illegal_cell_names-ka-2-Filter.db | Bin 0 -> 16 bytes ...legacy_ka_with_illegal_cell_names-ka-2-Index.db | Bin 0 -> 18 bytes ...y_ka_with_illegal_cell_names-ka-2-Statistics.db | Bin 0 -> 4452 bytes ...-legacy_ka_with_illegal_cell_names-ka-2-TOC.txt | 8 +++ ...th_illegal_cell_names_2-ka-1-CompressionInfo.db | Bin 0 -> 43 bytes ...egacy_ka_with_illegal_cell_names_2-ka-1-Data.db | Bin 0 -> 67 bytes ...y_ka_with_illegal_cell_names_2-ka-1-Digest.sha1 | 1 + ...acy_ka_with_illegal_cell_names_2-ka-1-Filter.db | Bin 0 -> 16 bytes ...gacy_ka_with_illegal_cell_names_2-ka-1-Index.db | Bin 0 -> 18 bytes ...ka_with_illegal_cell_names_2-ka-1-Statistics.db | Bin 0 -> 4452 bytes ...cy_ka_with_illegal_cell_names_2-ka-1-Summary.db | Bin 0 -> 92 bytes ...egacy_ka_with_illegal_cell_names_2-ka-1-TOC.txt | 8 +++ .../cassandra/io/sstable/LegacySSTableTest.java| 80 +++-- 18 files changed, 141 insertions(+), 14 deletions(-) diff --git a/CHANGES.txt b/CHANGES.txt index bd12942..d58a199 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,4 +1,5 @@ 
3.0.20 + * Ensure legacy rows have primary key livenessinfo when they contain illegal cells (CASSANDRA-15365) Merged from 2.2 * In-JVM DTest: Set correct internode message version for upgrade test (CASSANDRA-15371) diff --git a/src/java/org/apache/cassandra/db/LegacyLayout.java b/src/java/org/apache/cassandra/db/LegacyLayout.java index 1a03c91..6e93d08 100644 --- a/src/java/org/apache/cassandra/db/LegacyLayout.java +++ b/src/java/org/apache/cassandra/db/LegacyLayout.java @@ -1291,6 +1291,22 @@ public abstract class LegacyLayout private LegacyRangeTombstone rowDeletion; private LegacyRangeTombstone collectionDeletion; +/** + * Used to track if we need to add pk liveness info (row marker) when removing invalid legacy cells. + * + * In 2.1 these invalid cells existed but were not queryable, in this case specifically because they + * represented values for clustering key columns that were written as data cells. + * + * However, the presence (or not) of such cells on an otherwise empty CQL row (or partition) would decide + * if an empty result row were returned for the CQL row (or partition). To maintain this behaviour we + * insert a row marker containing the liveness info of these invalid cells iff we have no other data + * on the row. 
+ * + * See also CASSANDRA-15365 + */ +private boolean hasValidCells = false; +private LivenessInfo invalidLivenessInfo = null; + public CellGrouper(CFMetaData metadata, SerializationHelper helper) { this(metadata, helper, false); @@ -1317,6 +1333,8 @@ public abstract class LegacyLayout this.clustering = null; this.rowDeletion = null; this.collectionDeletion = null; +this.invalidLivenessInfo = null; +this.hasValidCells = false; } public boolean addAtom(LegacyAtom atom) @@ -1326,7 +1344,7 @@ public abstract class LegacyLayout : addRangeTombstone(atom.asRangeTombstone()); } -public boolean addCell(LegacyCell cell) +private boolean addCell(LegacyCell cell) { if (clustering == null) { @@ -1359,21 +1377,38 @@ public abstract class LegacyLayout builder.addRowDeletion(Row.Deletion.regular(new DeletionTime(cell.timestamp, cell.localDeletionTime))); else builder.addPrimaryKeyLivenessInfo(LivenessInfo.create(cell.timestamp, FAKE_TTL, cell.localDeletionTime)); +hasValidCells = true; +} +else if (column.isPrimaryKeyColumn() && metadata.isCQLTable()) +{ +// SSTables generated offline and side-loaded may include invalid cells which
[cassandra] 01/01: Merge branch 'cassandra-3.11' into trunk
This is an automated email from the ASF dual-hosted git repository. marcuse pushed a commit to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git commit f6c1dead6600e5b15239b5e2fd050770c48c6aa3 Merge: 6d93f04 2d90e3c Author: Marcus Eriksson AuthorDate: Wed Oct 30 15:15:58 2019 +0100 Merge branch 'cassandra-3.11' into trunk
[cassandra] branch trunk updated (6d93f04 -> f6c1dea)
This is an automated email from the ASF dual-hosted git repository. marcuse pushed a change to branch trunk in repository https://gitbox.apache.org/repos/asf/cassandra.git. from 6d93f04 Increment version to 4.0-alpha3 new 767a68c Ensure legacy rows have primary key livenessinfo when they contain illegal cells new 2d90e3c Merge branch 'cassandra-3.0' into cassandra-3.11 new f6c1dea Merge branch 'cassandra-3.11' into trunk The 3 revisions listed above as "new" are entirely new to this repository and will be described in separate emails. The revisions listed as "add" were already present in the repository and have only been added to this reference. Summary of changes:
[jira] [Updated] (CASSANDRA-15292) Point-in-time recovery ignoring timestamp of static column updates
[ https://issues.apache.org/jira/browse/CASSANDRA-15292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict Elliott Smith updated CASSANDRA-15292: --- Bug Category: Parent values: Correctness(12982) Level 1 values: Recoverable Corruption / Loss(12986) (was: Parent values: Correctness(12982)) Complexity: Low Hanging Fruit (was: Normal) Status: Open (was: Triage Needed) > Point-in-time recovery ignoring timestamp of static column updates > -- > > Key: CASSANDRA-15292 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15292 > Project: Cassandra > Issue Type: Bug > Components: Local/Commit Log >Reporter: Vincent White >Priority: Normal > > During point-in-time recovery > org.apache.cassandra.db.partitions.PartitionUpdate#maxTimestamp is checked to > see if any write timestamps in the update exceed the recovery point. If any > of the timestamps do exceed this point, the commit log replay is stopped. > Currently maxTimestamp only iterates over the regular rows in the update and > doesn't check for any included updates to static columns. If a PartitionUpdate > only contains updates to static columns then maxTimestamp will return > Long.MIN_VALUE and always be replayed. > This generally isn't much of an issue, except for non-dense compact storage > tables which are implemented in the 3.x storage engine in large part with > static columns. In this case the commit log will always continue applying > updates to them past the recovery point until it hits an update to a > different table with regular columns or reaches the end of the commit logs. 
> > ||Patch|| > |[3.11|https://github.com/vincewhite/cassandra/commits/3_11_check_static_column_timestamps_commit_log_archive]| > |[Trunk|https://github.com/vincewhite/cassandra/commits/trunk_check_static_column_timestamps]|
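The gap described in CASSANDRA-15292 above can be illustrated with a toy model. This is a sketch under stated assumptions, not the linked patch: `MaxTimestampSketch`, `Update`, and both method names are hypothetical, and an update is reduced to two lists of write timestamps (static cells and regular-row cells).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class MaxTimestampSketch
{
    // Hypothetical stand-in for a partition update: write timestamps only.
    static final class Update
    {
        final List<Long> staticTimestamps;
        final List<Long> regularTimestamps;

        Update(List<Long> staticTimestamps, List<Long> regularTimestamps)
        {
            this.staticTimestamps = staticTimestamps;
            this.regularTimestamps = regularTimestamps;
        }
    }

    // Mirrors the reported behaviour: static cells are never inspected, so a
    // static-only update yields Long.MIN_VALUE.
    static long maxOverRegularRowsOnly(Update update)
    {
        long max = Long.MIN_VALUE;
        for (long ts : update.regularTimestamps)
            max = Math.max(max, ts);
        return max;
    }

    // Also folds in the static cells' timestamps.
    static long maxIncludingStatic(Update update)
    {
        long max = maxOverRegularRowsOnly(update);
        for (long ts : update.staticTimestamps)
            max = Math.max(max, ts);
        return max;
    }

    public static void main(String[] args)
    {
        long restorePoint = 1_000L;
        Update staticOnly = new Update(Arrays.asList(2_000L), Collections.<Long>emptyList());

        // Replay should stop once an update's max timestamp exceeds the restore point.
        System.out.println("regular rows only: " + (maxOverRegularRowsOnly(staticOnly) > restorePoint));
        System.out.println("including static:  " + (maxIncludingStatic(staticOnly) > restorePoint));
    }
}
```

In this model a static-only update written after the restore point is never detected by the first method, which corresponds to the "always be replayed" behaviour in the report.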
[jira] [Commented] (CASSANDRA-15332) When repair is running with tracing, if a CorruptSSTableException is thrown while building Merkle Trees the DiskFailurePolicy does not get applied
[ https://issues.apache.org/jira/browse/CASSANDRA-15332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962837#comment-16962837 ] Marcus Eriksson commented on CASSANDRA-15332: - this lgtm, left a few minor comments on the PR > When repair is running with tracing, if a CorruptSSTableException is thrown > while building Merkle Trees the DiskFailurePolicy does not get applied > -- > > Key: CASSANDRA-15332 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15332 > Project: Cassandra > Issue Type: Bug > Components: Consistency/Repair, Observability/Tracing >Reporter: David Capwell >Assignee: David Capwell >Priority: Normal > Labels: pull-request-available > Fix For: 4.0 > > Time Spent: 40m > Remaining Estimate: 0h > > When a repair is in the validation phase and is building MerkleTrees, if a > corrupt SSTable exception is thrown the disk failure policy does not get > applied
[jira] [Updated] (CASSANDRA-15322) Partition size virtual table
[ https://issues.apache.org/jira/browse/CASSANDRA-15322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict Elliott Smith updated CASSANDRA-15322: --- Component/s: Feature/Virtual Tables > Partition size virtual table > > > Key: CASSANDRA-15322 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15322 > Project: Cassandra > Issue Type: New Feature > Components: Feature/Virtual Tables >Reporter: Chris Lohfink >Assignee: Chris Lohfink >Priority: Normal > Labels: virtual-tables > > Virtual table to provide on disk size (local) of a given partition. Useful > for checking for or verifying issues with wide partitions. This is dependent > on the lazy virtual table ticket.
[jira] [Updated] (CASSANDRA-15322) Partition size virtual table
[ https://issues.apache.org/jira/browse/CASSANDRA-15322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benedict Elliott Smith updated CASSANDRA-15322: --- Labels: (was: virtual-tables) > Partition size virtual table > > > Key: CASSANDRA-15322 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15322 > Project: Cassandra > Issue Type: New Feature > Components: Feature/Virtual Tables >Reporter: Chris Lohfink >Assignee: Chris Lohfink >Priority: Normal > > Virtual table to provide on disk size (local) of a given partition. Useful > for checking for or verifying issues with wide partitions. This is dependent > on the lazy virtual table ticket.
[jira] [Updated] (CASSANDRA-15336) LegacyLayout RangeTombstoneList throws IndexOutOfBoundsException When Running Range Queries
[ https://issues.apache.org/jira/browse/CASSANDRA-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeremy Hanna updated CASSANDRA-15336: - Description: Hi All, This bug is similar to CASSANDRA-15172 but relates specifically to range queries running over range tombstones. *+Steps to Reproduce: +* CREATE KEYSPACE ks1 WITH replication = \{'class': 'NetworkTopologyStrategy', 'DC1': '3'} AND durable_writes = true; +*TABLE:*+ CREATE TABLE ks1.table1 ( col1 text, col2 text, col3 text, col4 text, col5 text, col6 timestamp, data text, PRIMARY KEY ((col1, col2, col3), col4, col5, col6) ); Inserted ~4 million rows and created range tombstones by deleting ~1 million rows. +*Create Data*+ _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES ( '1', '11', '21', '1', 'a', 1231231230, 'data');_ _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES ( '1', '11', '21', '2', 'a', 1231231230, 'data');_ _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES ( '1', '11', '21', '3', 'a', 1231231230, 'data');_ _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES ( '1', '11', '21', '4', 'a', 1231231230, 'data');_ _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES ( '1', '11', '21', '5', 'a', 1231231230, 'data');_ +*Create Range Tombstones*+ delete from ks1.table1 where col1='1' and col2='11' and col3='21' and col4='1'; +*Query Live Rows (no tombstones)*+ _select * from ks1.table1 where col1='1' and col2='201' and col3='21' and col4='1' and col5='a' and *col6>1231231230*;_ No issues found, everything is running properly. +*Query Range Tombstones*+ _select * from ks1.table1 where col1='1' and col2='11' and col3='21' and col4='1' and col5='a' and *col6=1231231230*;_ No issues found, everything is running properly. 
+BUT when running range queries:+ _select * from ks1.table1 where col1='1' and col2='11' and col3='21' and col4='1' and col5='a' and *col6>1231231220;*_ WARN [ReadStage-1] 2019-09-23 14:17:10,281 AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread Thread[ReadStage-1,5,main]: {} java.lang.ArrayIndexOutOfBoundsException: 2 at org.apache.cassandra.db.AbstractBufferClusteringPrefix.get(AbstractBufferClusteringPrefix.java:55) at org.apache.cassandra.db.LegacyLayout$LegacyRangeTombstoneList.serializedSizeCompound(LegacyLayout.java:2545) at org.apache.cassandra.db.LegacyLayout$LegacyRangeTombstoneList.serializedSize(LegacyLayout.java:2522) at org.apache.cassandra.db.LegacyLayout.serializedSizeAsLegacyPartition(LegacyLayout.java:565) at org.apache.cassandra.db.ReadResponse$Serializer.serializedSize(ReadResponse.java:446) at org.apache.cassandra.db.ReadResponse$Serializer.serializedSize(ReadResponse.java:352) at org.apache.cassandra.net.MessageOut.payloadSize(MessageOut.java:171) at org.apache.cassandra.net.OutboundTcpConnectionPool.getConnection(OutboundTcpConnectionPool.java:77) at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:802) at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:953) at org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:929) at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:62) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:114) at java.lang.Thread.run(Thread.java:745) This WARN is 
constantly generated until I stop the range queries script. Hope this helps.. Thanks! was: Hi All, This bug is similar to https://issues.apache.org/jira/browse/CASSANDRA-15172 but relates specifically to range queries running over range tombstones. *+Steps to Reproduce: +* CREATE KEYSPACE ks1 WITH replication = \{'class': 'NetworkTopologyStrategy', 'DC1': '3'} AND durable_writes = true; +*TABLE:*+ CREATE TABLE ks1.table1 ( col1 text, col2 text, col3 text, col4 text, col5 text, col6 timestamp, data text, PRIMARY KEY ((col1, col2, col3), col4, col5, col6) ); Inserted ~4 million rows and created range tombstones by deleting ~1 million rows. +*Create Data*+ _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES ( '1', '11', '21', '1', 'a', 1231231230, 'data');_ _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES ( '1', '11',