[jira] [Updated] (CASSANDRA-15394) Remove list iterators

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15394:

Change Category: Performance
 Complexity: Low Hanging Fruit
  Fix Version/s: 4.0
 Status: Open  (was: Triage Needed)

> Remove list iterators
> -
>
> Key: CASSANDRA-15394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15394
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> We allocate list iterators in several places in hot paths. This patch converts
> those loops to get-by-index access, which provides a ~4% improvement in relevant
> workloads.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15394) Remove list iterators

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15394:

Test and Documentation Plan: circle
 Status: Patch Available  (was: Open)

[4.0|https://github.com/bdeggleston/cassandra/tree/15394]

> Remove list iterators
> -
>
> Key: CASSANDRA-15394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15394
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> We allocate list iterators in several places in hot paths. This patch converts
> those loops to get-by-index access, which provides a ~4% improvement in relevant
> workloads.






[jira] [Comment Edited] (CASSANDRA-15393) Add byte array backed cells

2019-10-30 Thread Blake Eggleston (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963497#comment-16963497
 ] 

Blake Eggleston edited comment on CASSANDRA-15393 at 10/30/19 10:52 PM:


[4.0|https://github.com/bdeggleston/cassandra/tree/15393] - this depends on the 
changes in CASSANDRA-15391


was (Author: bdeggleston):
[4.0|https://github.com/bdeggleston/cassandra/tree/15393]

> Add byte array backed cells
> ---
>
> Key: CASSANDRA-15393
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15393
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> We currently materialize all values as on-heap byte buffers. Byte buffers
> have a fairly high overhead given how frequently they’re used, and on the
> compaction and local read path we don’t do anything that needs them. Use of
> byte buffer methods only happens on the coordinator. Using cells that are
> backed by byte arrays instead in these situations reduces compaction and read
> garbage by up to 22% in many cases.






[jira] [Created] (CASSANDRA-15394) Remove list iterators

2019-10-30 Thread Blake Eggleston (Jira)
Blake Eggleston created CASSANDRA-15394:
---

 Summary: Remove list iterators
 Key: CASSANDRA-15394
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15394
 Project: Cassandra
  Issue Type: Sub-task
  Components: Local/Compaction
Reporter: Blake Eggleston
Assignee: Blake Eggleston


We allocate list iterators in several places in hot paths. This patch converts
those loops to get-by-index access, which provides a ~4% improvement in relevant
workloads.
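
The change described above can be illustrated with a minimal sketch. This is not the code from the patch; the class and method names are hypothetical, and it assumes the list is a random-access list such as ArrayList (on a linked list, get by index would be slower, not faster).

```java
import java.util.ArrayList;
import java.util.List;

public class IndexedLoopSketch {
    // Enhanced for loop: calls values.iterator(), which creates an Iterator
    // object each time the loop runs (unless the JIT manages to remove it).
    static long sumWithIterator(List<Long> values) {
        long total = 0;
        for (long v : values) total += v;
        return total;
    }

    // Indexed loop: no Iterator object is created.
    static long sumByIndex(List<Long> values) {
        long total = 0;
        for (int i = 0, size = values.size(); i < size; i++) total += values.get(i);
        return total;
    }

    public static void main(String[] args) {
        List<Long> values = new ArrayList<>();
        for (long i = 1; i <= 4; i++) values.add(i);
        System.out.println(sumWithIterator(values) + " " + sumByIndex(values));
    }
}
```

Both methods return the same result; only the allocation behavior of the loop differs.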






[jira] [Updated] (CASSANDRA-15393) Add byte array backed cells

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15393:

Test and Documentation Plan: circle
 Status: Patch Available  (was: Open)

[4.0|https://github.com/bdeggleston/cassandra/tree/15393]

> Add byte array backed cells
> ---
>
> Key: CASSANDRA-15393
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15393
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> We currently materialize all values as on-heap byte buffers. Byte buffers
> have a fairly high overhead given how frequently they’re used, and on the
> compaction and local read path we don’t do anything that needs them. Use of
> byte buffer methods only happens on the coordinator. Using cells that are
> backed by byte arrays instead in these situations reduces compaction and read
> garbage by up to 22% in many cases.






[jira] [Updated] (CASSANDRA-15392) Pool Merge Iterators

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15392:

Test and Documentation Plan: circle
 Status: Patch Available  (was: Open)

[4.0|https://github.com/bdeggleston/cassandra/tree/15392]

> Pool Merge Iterators
> 
>
> Key: CASSANDRA-15392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15392
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> By pooling merge iterators, instead of creating new ones each time we need 
> them, we can reduce garbage on the compaction and read paths under relevant 
> workloads by ~4% in many cases.






[jira] [Updated] (CASSANDRA-15391) Reduce heap footprint of commonly allocated objects

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15391:

Test and Documentation Plan: circle
 Status: Patch Available  (was: Open)

[4.0|https://github.com/bdeggleston/cassandra/tree/15391]

> Reduce heap footprint of commonly allocated objects
> ---
>
> Key: CASSANDRA-15391
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15391
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> BufferCell, BTreeRow, and Clustering make up a significant amount of 
> allocations during reads/compactions, and many of the fields of these classes 
> are often unused. For example, the CellPath reference in BufferCell is only
> ever used for collection columns. Since we know which fields will and won’t
> be used during cell creation, we can define specialized classes that only 
> take up heap space for the data they’ll be using. This reduces compaction 
> garbage by up to 4.5%.






[jira] [Updated] (CASSANDRA-15390) Avoid unnecessary collection/iterator allocations during btree construction

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15390:

Test and Documentation Plan: circle
 Status: Patch Available  (was: Open)

[4.0|https://github.com/bdeggleston/cassandra/tree/15390]

> Avoid unnecessary collection/iterator allocations during btree construction
> ---
>
> Key: CASSANDRA-15390
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15390
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> A heavily used btree builder path does a lot of unnecessary conversions to 
> and from collections and iterators. Adding dedicated support for Object[] 
> reduces compaction garbage by up to 8.3%.






[jira] [Updated] (CASSANDRA-15389) Minimize BTree iterator allocations

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15389:

Test and Documentation Plan: circle
 Status: Patch Available  (was: Open)

[4.0|https://github.com/bdeggleston/cassandra/tree/15389]

> Minimize BTree iterator allocations
> ---
>
> Key: CASSANDRA-15389
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15389
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> Allocations of BTree iterators contribute a large amount of garbage to the
> compaction and read paths.
> This patch removes most btree iterator allocations on hot paths by:
>  • using Row#apply where appropriate on frequently called methods
> (Row#digest, Row#validateData)
>  • adding a BTree accumulate method. Like the apply method, this method walks
> the btree with a function that takes and returns a long argument. This
> eliminates iterator allocations without adding helper object allocations
> (BTreeRow#hasComplex, BTreeRow#hasInvalidDeletions, BTreeRow#dataSize,
> BTreeRow#unsharedHeapSizeExcludingData, Rows#collectStats,
> UnfilteredSerializer#serializedRowBodySize), and also eliminates the
> allocation of helper objects in places where apply was used previously^[1]^.
>  • creating a map of columns in SerializationHeader, which lets us avoid
> allocating a btree search iterator for each row we serialize.
> These optimizations reduce garbage created during compaction by up to 13.5%.
>  
> [1] The memory test does measure memory allocated by lambdas capturing objects.






[jira] [Updated] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15388:

Test and Documentation Plan: circle
 Status: Patch Available  (was: Open)

[4.0|https://github.com/bdeggleston/cassandra/tree/15388]

> Add compaction allocation measurement test to support compaction gc 
> optimization. 
> --
>
> Key: CASSANDRA-15388
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15388
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> This adds a test that is able to quickly and accurately measure the effect of 
> potential gc optimizations against a wide range of (synthetic) compaction 
> workloads. This test accurately measures allocation rates from 16 workloads
> in less than 2 minutes.
> This test uses Google’s {{java-allocation-instrumenter}} agent to measure the
> workloads. Measurements using this agent are very accurate and pretty
> repeatable from run to run, with most variance being negligible (1-2 bytes
> per partition), although workloads with larger but fewer partitions vary a
> bit more (still less than 0.03%).
> The thinking behind this patch is that with compaction, we’re generally 
> interested in the memory allocated per partition, since garbage scales more 
> or less linearly with the number of partitions compacted. So measuring 
> allocation from a small number of partitions that otherwise represent real 
> world use cases is a good enough approximation.
> In addition to helping with compaction optimizations, this test could be used 
> as a template for future optimization work. This pattern could also be used 
> to set allocation limits on workloads/operations and fail CI if the 
> allocation behavior changes past some threshold. 






[jira] [Updated] (CASSANDRA-15392) Pool Merge Iterators

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15392:

Change Category: Performance
 Complexity: Normal
  Fix Version/s: 4.0
 Status: Open  (was: Triage Needed)

> Pool Merge Iterators
> 
>
> Key: CASSANDRA-15392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15392
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> By pooling merge iterators, instead of creating new ones each time we need 
> them, we can reduce garbage on the compaction and read paths under relevant 
> workloads by ~4% in many cases.






[jira] [Updated] (CASSANDRA-15393) Add byte array backed cells

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15393:

Change Category: Performance
 Complexity: Normal
  Fix Version/s: 4.0
 Status: Open  (was: Triage Needed)

> Add byte array backed cells
> ---
>
> Key: CASSANDRA-15393
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15393
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> We currently materialize all values as on-heap byte buffers. Byte buffers
> have a fairly high overhead given how frequently they’re used, and on the
> compaction and local read path we don’t do anything that needs them. Use of
> byte buffer methods only happens on the coordinator. Using cells that are
> backed by byte arrays instead in these situations reduces compaction and read
> garbage by up to 22% in many cases.






[jira] [Updated] (CASSANDRA-15391) Reduce heap footprint of commonly allocated objects

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15391:

Change Category: Performance
 Complexity: Normal
  Fix Version/s: 4.0
 Status: Open  (was: Triage Needed)

> Reduce heap footprint of commonly allocated objects
> ---
>
> Key: CASSANDRA-15391
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15391
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> BufferCell, BTreeRow, and Clustering make up a significant amount of 
> allocations during reads/compactions, and many of the fields of these classes 
> are often unused. For example, the CellPath reference in BufferCell is only
> ever used for collection columns. Since we know which fields will and won’t
> be used during cell creation, we can define specialized classes that only 
> take up heap space for the data they’ll be using. This reduces compaction 
> garbage by up to 4.5%.






[jira] [Updated] (CASSANDRA-15389) Minimize BTree iterator allocations

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15389:

Change Category: Performance
 Complexity: Normal
Component/s: Local/Compaction
  Fix Version/s: 4.0
 Status: Open  (was: Triage Needed)

> Minimize BTree iterator allocations
> ---
>
> Key: CASSANDRA-15389
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15389
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> Allocations of BTree iterators contribute a large amount of garbage to the
> compaction and read paths.
> This patch removes most btree iterator allocations on hot paths by:
>  • using Row#apply where appropriate on frequently called methods
> (Row#digest, Row#validateData)
>  • adding a BTree accumulate method. Like the apply method, this method walks
> the btree with a function that takes and returns a long argument. This
> eliminates iterator allocations without adding helper object allocations
> (BTreeRow#hasComplex, BTreeRow#hasInvalidDeletions, BTreeRow#dataSize,
> BTreeRow#unsharedHeapSizeExcludingData, Rows#collectStats,
> UnfilteredSerializer#serializedRowBodySize), and also eliminates the
> allocation of helper objects in places where apply was used previously^[1]^.
>  • creating a map of columns in SerializationHeader, which lets us avoid
> allocating a btree search iterator for each row we serialize.
> These optimizations reduce garbage created during compaction by up to 13.5%.
>  
> [1] The memory test does measure memory allocated by lambdas capturing objects.






[jira] [Updated] (CASSANDRA-15390) Avoid unnecessary collection/iterator allocations during btree construction

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15390:

Change Category: Performance
 Complexity: Normal
  Fix Version/s: 4.0
 Status: Open  (was: Triage Needed)

> Avoid unnecessary collection/iterator allocations during btree construction
> ---
>
> Key: CASSANDRA-15390
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15390
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> A heavily used btree builder path does a lot of unnecessary conversions to 
> and from collections and iterators. Adding dedicated support for Object[] 
> reduces compaction garbage by up to 8.3%.






[jira] [Created] (CASSANDRA-15393) Add byte array backed cells

2019-10-30 Thread Blake Eggleston (Jira)
Blake Eggleston created CASSANDRA-15393:
---

 Summary: Add byte array backed cells
 Key: CASSANDRA-15393
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15393
 Project: Cassandra
  Issue Type: Sub-task
  Components: Local/Compaction
Reporter: Blake Eggleston
Assignee: Blake Eggleston


We currently materialize all values as on-heap byte buffers. Byte buffers have
a fairly high overhead given how frequently they’re used, and on the compaction
and local read path we don’t do anything that needs them. Use of byte buffer
methods only happens on the coordinator. Using cells that are backed by byte
arrays instead in these situations reduces compaction and read garbage by up to
22% in many cases.
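
As a rough illustration of the idea (not the classes from the patch), the sketch below puts a buffer-backed and an array-backed cell behind one interface. All names here are hypothetical. The array-backed variant holds the byte[] directly and only wraps it in a ByteBuffer when a caller actually asks for one.

```java
import java.nio.ByteBuffer;

public class ArrayCellSketch {
    // Hypothetical minimal cell interface, for illustration only.
    interface Cell {
        int valueSize();
        ByteBuffer buffer();
    }

    // Holds a ByteBuffer object in addition to the bytes it refers to.
    static final class BufferBackedCell implements Cell {
        private final ByteBuffer value;
        BufferBackedCell(ByteBuffer value) { this.value = value; }
        public int valueSize() { return value.remaining(); }
        public ByteBuffer buffer() { return value.duplicate(); }
    }

    // Holds only the array; no ByteBuffer exists unless buffer() is called.
    static final class ArrayBackedCell implements Cell {
        private final byte[] value;
        ArrayBackedCell(byte[] value) { this.value = value; }
        public int valueSize() { return value.length; }
        public ByteBuffer buffer() { return ByteBuffer.wrap(value); }
    }

    public static void main(String[] args) {
        Cell cell = new ArrayBackedCell(new byte[] {1, 2, 3});
        System.out.println("value size: " + cell.valueSize());
    }
}
```

Code paths that only need sizes or raw bytes never pay for the wrapper object in this arrangement.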






[jira] [Created] (CASSANDRA-15392) Pool Merge Iterators

2019-10-30 Thread Blake Eggleston (Jira)
Blake Eggleston created CASSANDRA-15392:
---

 Summary: Pool Merge Iterators
 Key: CASSANDRA-15392
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15392
 Project: Cassandra
  Issue Type: Sub-task
  Components: Local/Compaction
Reporter: Blake Eggleston
Assignee: Blake Eggleston


By pooling merge iterators, instead of creating new ones each time we need 
them, we can reduce garbage on the compaction and read paths under relevant 
workloads by ~4% in many cases.
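
The pooling idea can be sketched as follows. This is a generic single-threaded pool, not the one from the patch; the Pool class and the use of StringBuilder as a stand-in for a merge iterator are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

public class PoolSketch {
    // Single-threaded object pool; a real pool would need a thread-safety story.
    static final class Pool<T> {
        private final ArrayDeque<T> free = new ArrayDeque<>();
        private final Supplier<T> factory;
        int created; // how many objects the factory has produced so far

        Pool(Supplier<T> factory) { this.factory = factory; }

        T acquire() {
            T obj = free.pollFirst();
            if (obj == null) { obj = factory.get(); created++; }
            return obj;
        }

        void release(T obj) { free.addFirst(obj); }
    }

    // StringBuilder stands in for an expensive, resettable object.
    static String joinAll(Pool<StringBuilder> pool, String... parts) {
        StringBuilder sb = pool.acquire();
        try {
            sb.setLength(0); // reset state left by the previous user
            for (int i = 0; i < parts.length; i++) sb.append(parts[i]);
            return sb.toString();
        } finally {
            pool.release(sb);
        }
    }

    public static void main(String[] args) {
        Pool<StringBuilder> pool = new Pool<>(StringBuilder::new);
        joinAll(pool, "a", "b");
        joinAll(pool, "c", "d");
        System.out.println("objects created: " + pool.created);
    }
}
```

The important design point is the reset on acquire: a pooled object must not leak state from its previous use.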






[jira] [Created] (CASSANDRA-15391) Reduce heap footprint of commonly allocated objects

2019-10-30 Thread Blake Eggleston (Jira)
Blake Eggleston created CASSANDRA-15391:
---

 Summary: Reduce heap footprint of commonly allocated objects
 Key: CASSANDRA-15391
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15391
 Project: Cassandra
  Issue Type: Sub-task
  Components: Local/Compaction
Reporter: Blake Eggleston
Assignee: Blake Eggleston


BufferCell, BTreeRow, and Clustering make up a significant amount of 
allocations during reads/compactions, and many of the fields of these classes 
are often unused. For example, the CellPath reference in BufferCell is only
ever used for collection columns. Since we know which fields will and won’t be
used during cell creation, we can define specialized classes that only take up 
heap space for the data they’ll be using. This reduces compaction garbage by up 
to 4.5%.
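
A minimal sketch of the specialization idea, using hypothetical classes rather than the ones in the patch: a factory chooses a class without the path field when no path is needed. How many bytes this actually saves per object depends on the JVM's object layout and alignment, so the sketch makes no claim about sizes.

```java
public class SpecializedCellSketch {
    // Hypothetical cell interface, for illustration only.
    interface Cell {
        long timestamp();
        Object path(); // null for cells of non-collection columns
    }

    // Carries no path field at all.
    static final class SimpleCell implements Cell {
        private final long timestamp;
        SimpleCell(long timestamp) { this.timestamp = timestamp; }
        public long timestamp() { return timestamp; }
        public Object path() { return null; }
    }

    // Used only when a path is really present.
    static final class PathCell implements Cell {
        private final long timestamp;
        private final Object path;
        PathCell(long timestamp, Object path) {
            this.timestamp = timestamp;
            this.path = path;
        }
        public long timestamp() { return timestamp; }
        public Object path() { return path; }
    }

    // The factory knows at creation time which fields will be used.
    static Cell create(long timestamp, Object path) {
        return path == null ? new SimpleCell(timestamp) : new PathCell(timestamp, path);
    }

    public static void main(String[] args) {
        System.out.println(create(1L, null).getClass().getSimpleName());
        System.out.println(create(2L, "key").getClass().getSimpleName());
    }
}
```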






[jira] [Created] (CASSANDRA-15390) Avoid unnecessary collection/iterator allocations during btree construction

2019-10-30 Thread Blake Eggleston (Jira)
Blake Eggleston created CASSANDRA-15390:
---

 Summary: Avoid unnecessary collection/iterator allocations during 
btree construction
 Key: CASSANDRA-15390
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15390
 Project: Cassandra
  Issue Type: Sub-task
  Components: Local/Compaction
Reporter: Blake Eggleston
Assignee: Blake Eggleston


A heavily used btree builder path does a lot of unnecessary conversions to and 
from collections and iterators. Adding dedicated support for Object[] reduces 
compaction garbage by up to 8.3%.






[jira] [Created] (CASSANDRA-15389) Minimize BTree iterator allocations

2019-10-30 Thread Blake Eggleston (Jira)
Blake Eggleston created CASSANDRA-15389:
---

 Summary: Minimize BTree iterator allocations
 Key: CASSANDRA-15389
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15389
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Blake Eggleston
Assignee: Blake Eggleston


Allocations of BTree iterators contribute a large amount of garbage to the
compaction and read paths.

This patch removes most btree iterator allocations on hot paths by:
 • using Row#apply where appropriate on frequently called methods (Row#digest,
Row#validateData)
 • adding a BTree accumulate method. Like the apply method, this method walks the
btree with a function that takes and returns a long argument. This eliminates
iterator allocations without adding helper object allocations
(BTreeRow#hasComplex, BTreeRow#hasInvalidDeletions, BTreeRow#dataSize,
BTreeRow#unsharedHeapSizeExcludingData, Rows#collectStats,
UnfilteredSerializer#serializedRowBodySize), and also eliminates the
allocation of helper objects in places where apply was used previously^[1]^.
 • creating a map of columns in SerializationHeader, which lets us avoid
allocating a btree search iterator for each row we serialize.

These optimizations reduce garbage created during compaction by up to 13.5%.

 

[1] The memory test does measure memory allocated by lambdas capturing objects.
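
The accumulate approach can be sketched roughly as below. This is not the BTree code from the patch: a flat array stands in for the tree, and the LongFold interface and method names are hypothetical. The point is that the running value is passed in and out as a long, so the lambda does not need to capture a mutable helper object.

```java
public class AccumulateSketch {
    // Hypothetical functional interface: folds one element into a running long.
    interface LongFold<T> {
        long apply(T element, long accumulated);
    }

    // Walks the elements by index, threading the long value through each call.
    static <T> long accumulate(T[] elements, LongFold<T> fold, long initial) {
        long value = initial;
        for (int i = 0; i < elements.length; i++) value = fold.apply(elements[i], value);
        return value;
    }

    // Example use: the lambda captures nothing, so no per-call state is needed.
    static long totalLength(String[] values) {
        return accumulate(values, (s, sum) -> sum + s.length(), 0L);
    }

    public static void main(String[] args) {
        System.out.println(totalLength(new String[] {"a", "bb", "ccc"}));
    }
}
```

Compare this with an apply-style walk that returns nothing: computing a sum there would require the callback to capture a counter object, which is the helper allocation the description refers to.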






[jira] [Updated] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15388:

Change Category: Performance
 Complexity: Normal
  Fix Version/s: 4.0
 Status: Open  (was: Triage Needed)

> Add compaction allocation measurement test to support compaction gc 
> optimization. 
> --
>
> Key: CASSANDRA-15388
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15388
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> This adds a test that is able to quickly and accurately measure the effect of 
> potential gc optimizations against a wide range of (synthetic) compaction 
> workloads. This test accurately measures allocation rates from 16 workloads
> in less than 2 minutes.
> This test uses Google’s {{java-allocation-instrumenter}} agent to measure the
> workloads. Measurements using this agent are very accurate and pretty
> repeatable from run to run, with most variance being negligible (1-2 bytes
> per partition), although workloads with larger but fewer partitions vary a
> bit more (still less than 0.03%).
> The thinking behind this patch is that with compaction, we’re generally 
> interested in the memory allocated per partition, since garbage scales more 
> or less linearly with the number of partitions compacted. So measuring 
> allocation from a small number of partitions that otherwise represent real 
> world use cases is a good enough approximation.
> In addition to helping with compaction optimizations, this test could be used 
> as a template for future optimization work. This pattern could also be used 
> to set allocation limits on workloads/operations and fail CI if the 
> allocation behavior changes past some threshold. 






[jira] [Created] (CASSANDRA-15388) Add compaction allocation measurement test to support compaction gc optimization.

2019-10-30 Thread Blake Eggleston (Jira)
Blake Eggleston created CASSANDRA-15388:
---

 Summary: Add compaction allocation measurement test to support 
compaction gc optimization. 
 Key: CASSANDRA-15388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15388
 Project: Cassandra
  Issue Type: Sub-task
  Components: Local/Compaction
Reporter: Blake Eggleston
Assignee: Blake Eggleston


This adds a test that is able to quickly and accurately measure the effect of 
potential gc optimizations against a wide range of (synthetic) compaction 
workloads. This test accurately measures allocation rates from 16 workloads in
less than 2 minutes.

This test uses Google’s {{java-allocation-instrumenter}} agent to measure the
workloads. Measurements using this agent are very accurate and pretty
repeatable from run to run, with most variance being negligible (1-2 bytes per
partition), although workloads with larger but fewer partitions vary a bit more
(still less than 0.03%).

The thinking behind this patch is that with compaction, we’re generally 
interested in the memory allocated per partition, since garbage scales more or 
less linearly with the number of partitions compacted. So measuring allocation 
from a small number of partitions that otherwise represent real world use cases 
is a good enough approximation.

In addition to helping with compaction optimizations, this test could be used 
as a template for future optimization work. This pattern could also be used to 
set allocation limits on workloads/operations and fail CI if the allocation 
behavior changes past some threshold. 
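
The allocation-limit idea in the last paragraph could look something like the sketch below. Note that it swaps in a different measurement technique: instead of the java-allocation-instrumenter agent used by the test, it reads the per-thread allocation counter exposed by HotSpot's com.sun.management.ThreadMXBean, which is coarser and may not be available on every JVM. The class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;

public class AllocationBudgetSketch {
    // Written to by the measured task so the JIT cannot discard the allocation.
    static volatile Object sink;

    // Returns the bytes allocated by the current thread while running the task,
    // or -1 if this JVM does not support per-thread allocation accounting.
    static long measure(Runnable task) {
        java.lang.management.ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!(bean instanceof com.sun.management.ThreadMXBean)) return -1;
        com.sun.management.ThreadMXBean mx = (com.sun.management.ThreadMXBean) bean;
        if (!mx.isThreadAllocatedMemorySupported() || !mx.isThreadAllocatedMemoryEnabled()) return -1;
        long id = Thread.currentThread().getId();
        long before = mx.getThreadAllocatedBytes(id);
        task.run();
        return mx.getThreadAllocatedBytes(id) - before;
    }

    // Throws if the task allocates more than the given budget.
    static void assertWithinBudget(Runnable task, long budgetBytes) {
        long allocated = measure(task);
        if (allocated > budgetBytes)
            throw new AssertionError("allocated " + allocated + " bytes, budget " + budgetBytes);
    }

    public static void main(String[] args) {
        long bytes = measure(() -> sink = new byte[1 << 20]);
        System.out.println("allocated bytes: " + bytes);
    }
}
```

A CI check built this way would call assertWithinBudget around a fixed workload and fail the build when the budget is exceeded. The budget would need some slack, since this counter also includes incidental allocations made by the JVM on the measuring thread.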






[jira] [Updated] (CASSANDRA-15387) Reduce compaction & local read path garbage

2019-10-30 Thread Blake Eggleston (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Blake Eggleston updated CASSANDRA-15387:

Change Category: Performance
 Complexity: Normal
  Fix Version/s: 4.0
 Status: Open  (was: Triage Needed)

> Reduce compaction & local read path garbage
> ---
>
> Key: CASSANDRA-15387
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15387
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> There are several opportunities to significantly reduce the amount of garbage 
> generated by compaction and the local read path. This will serve as a top 
> level jira for related changes.






[jira] [Created] (CASSANDRA-15387) Reduce compaction & local read path garbage

2019-10-30 Thread Blake Eggleston (Jira)
Blake Eggleston created CASSANDRA-15387:
---

 Summary: Reduce compaction & local read path garbage
 Key: CASSANDRA-15387
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15387
 Project: Cassandra
  Issue Type: Improvement
  Components: Local/Compaction
Reporter: Blake Eggleston
Assignee: Blake Eggleston


There are several opportunities to significantly reduce the amount of garbage 
generated by compaction and the local read path. This will serve as a top level 
jira for related changes.






[cassandra-website] branch master updated: Fixed docker builds

2019-10-30 Thread rustyrazorblade
This is an automated email from the ASF dual-hosted git repository.

rustyrazorblade pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/cassandra-website.git


The following commit(s) were added to refs/heads/master by this push:
 new 28cdbd7  Fixed docker builds
28cdbd7 is described below

commit 28cdbd705ec2a3194f7a581dd2702dab15238043
Author: Jon Haddad 
AuthorDate: Wed Oct 30 13:09:10 2019 -0700

Fixed docker builds
---
 docker-compose.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docker-compose.yml b/docker-compose.yml
index 261f99c..cfa944c 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -6,7 +6,7 @@ services:
 image: cassandra-website:latest
 volumes:
   - ./src:/usr/src/cassandra-site/src
-  - ./content:/usr/src/cassandra-site/content
+  - ./content:/usr/src/cassandra-site/publish
 
   cassandra-website-serve:
 build: .





[jira] [Updated] (CASSANDRA-15385) Ensure that tracing doesn't break connections in 3.x/4.0 mixed mode by default

2019-10-30 Thread Aleksey Yeschenko (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-15385:
--
Component/s: Observability/Tracing

> Ensure that tracing doesn't break connections in 3.x/4.0 mixed mode by default
> --
>
> Key: CASSANDRA-15385
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15385
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Tracing
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>







[jira] [Updated] (CASSANDRA-15385) Ensure that tracing doesn't break connections in 3.x/4.0 mixed mode by default

2019-10-30 Thread Aleksey Yeschenko (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-15385:
--
Fix Version/s: 3.11.x
   3.0.x

> Ensure that tracing doesn't break connections in 3.x/4.0 mixed mode by default
> --
>
> Key: CASSANDRA-15385
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15385
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>







[jira] [Updated] (CASSANDRA-15385) Ensure that tracing doesn't break connections in 3.x/4.0 mixed mode by default

2019-10-30 Thread Aleksey Yeschenko (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-15385:
--
Summary: Ensure that tracing doesn't break connections in 3.x/4.0 mixed 
mode by default  (was: TBD (minor; boring))

> Ensure that tracing doesn't break connections in 3.x/4.0 mixed mode by default
> --
>
> Key: CASSANDRA-15385
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15385
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Aleksey Yeschenko
>Assignee: Aleksey Yeschenko
>Priority: Normal
>







[jira] [Commented] (CASSANDRA-15332) When repair is running with tracing, if a CorruptSSTableException is thrown while building Merkle Trees the DiskFailurePolicy does not get applied

2019-10-30 Thread David Capwell (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963240#comment-16963240
 ] 

David Capwell commented on CASSANDRA-15332:
---

[~marcuse] I pushed all changes based off your feedback.

[~jrwest] can you re-review?  The changes to src/java are completely different 
since your last review, so good to look at that again (tests are mostly the 
same, small changes)

> When repair is running with tracing, if a CorruptSSTableException is thrown 
> while building Merkle Trees the DiskFailurePolicy does not get applied
> --
>
> Key: CASSANDRA-15332
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15332
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Observability/Tracing
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When a repair is in the validation phase and is building MerkleTrees, if a 
> corrupt SSTable exception is thrown the disk failure policy does not get 
> applied






[jira] [Updated] (CASSANDRA-15386) Use multiple data directories in the in-jvm dtests

2019-10-30 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-15386:

Test and Documentation Plan: circle ci run
 Status: Patch Available  (was: Open)

[patch|https://github.com/krummas/cassandra/commits/marcuse/15386]
[tests|https://circleci.com/workflow-run/c002075c-d210-465c-b7fd-bf6abaf7dc27]

> Use multiple data directories in the in-jvm dtests
> --
>
> Key: CASSANDRA-15386
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15386
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction, Test/dtest
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 3.11.x, 4.x
>
>
> We should default to using 3 data directories when running the in-jvm dtests.






[jira] [Updated] (CASSANDRA-15386) Use multiple data directories in the in-jvm dtests

2019-10-30 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-15386:

Change Category: Quality Assurance
 Complexity: Low Hanging Fruit
  Fix Version/s: 4.x
 3.11.x
 Status: Open  (was: Triage Needed)

> Use multiple data directories in the in-jvm dtests
> --
>
> Key: CASSANDRA-15386
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15386
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/Compaction, Test/dtest
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 3.11.x, 4.x
>
>
> We should default to using 3 data directories when running the in-jvm dtests.






[jira] [Created] (CASSANDRA-15386) Use multiple data directories in the in-jvm dtests

2019-10-30 Thread Marcus Eriksson (Jira)
Marcus Eriksson created CASSANDRA-15386:
---

 Summary: Use multiple data directories in the in-jvm dtests
 Key: CASSANDRA-15386
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15386
 Project: Cassandra
  Issue Type: Improvement
  Components: Local/Compaction, Test/dtest
Reporter: Marcus Eriksson
Assignee: Marcus Eriksson


We should default to using 3 data directories when running the in-jvm dtests.






[jira] [Updated] (CASSANDRA-13974) Bad prefix matching when figuring out data directory for an sstable

2019-10-30 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-13974:

Test and Documentation Plan: new tests, circleci runs
 Status: Patch Available  (was: In Progress)

Pushed a fix to the 3.11 and trunk branches above that keeps a mapping of 
canonical data path to data directory. We then use this map when figuring out 
the data directory for an sstable. The directory in the descriptor is always 
canonical.
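
To illustrate the prefix problem this ticket describes, here is a minimal Java sketch. It is not the committed patch: the class `DataDirectoryLookup`, its method names, and the example paths are all hypothetical. It contrasts a bare `startsWith` check with one that matches only at a path boundary, and shows one way a map keyed by canonical path might be consulted.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; names and structure are assumptions, not Cassandra code.
final class DataDirectoryLookup
{
    // Bare prefix check: a data directory ".../disk1" also matches ".../disk10/ks/tbl".
    static boolean naiveMatch(String dataDir, String sstableDir)
    {
        return sstableDir.startsWith(dataDir);
    }

    // Match only on a whole path component by including File.separator in the prefix.
    static boolean separatorMatch(String dataDir, String sstableDir)
    {
        String prefix = dataDir.endsWith(File.separator) ? dataDir : dataDir + File.separator;
        return sstableDir.equals(dataDir) || sstableDir.startsWith(prefix);
    }

    // Map-based lookup: walk up from the sstable directory until a known data path is found.
    static String findDataDirectory(Map<Path, String> byCanonicalPath, Path sstableDir)
    {
        for (Path p = sstableDir; p != null; p = p.getParent())
        {
            String name = byCanonicalPath.get(p);
            if (name != null)
                return name;
        }
        return null; // no configured data directory contains this sstable
    }
}
```

With data directories named `disk1` and `disk10` under the same parent, `naiveMatch` attributes an sstable under `disk10` to `disk1`, while the other two methods do not.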

> Bad prefix matching when figuring out data directory for an sstable
> ---
>
> Key: CASSANDRA-13974
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13974
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We do a "startsWith" check when getting the data directory for an sstable; we 
> should match including File.separator.






[jira] [Created] (CASSANDRA-15385) TBD (minor; boring)

2019-10-30 Thread Aleksey Yeschenko (Jira)
Aleksey Yeschenko created CASSANDRA-15385:
-

 Summary: TBD (minor; boring)
 Key: CASSANDRA-15385
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15385
 Project: Cassandra
  Issue Type: Bug
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko









[jira] [Updated] (CASSANDRA-13974) Bad prefix matching when figuring out data directory for an sstable

2019-10-30 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-13974:

Status: In Progress  (was: Changes Suggested)

> Bad prefix matching when figuring out data directory for an sstable
> ---
>
> Key: CASSANDRA-13974
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13974
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Core
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.x
>
>
> We do a "startsWith" check when getting the data directory for an sstable; we 
> should match including File.separator.






[jira] [Updated] (CASSANDRA-15365) Add primary key liveness info when skipping illegal cells

2019-10-30 Thread Marcus Eriksson (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marcus Eriksson updated CASSANDRA-15365:

  Fix Version/s: (was: 3.11.x)
 (was: 3.0.x)
 3.11.6
 3.0.20
  Since Version: 3.0 alpha 1
Source Control Link: 
https://github.com/apache/cassandra/commit/767a68cd00050298abf7bbfd8b322e5663439c23
 Resolution: Fixed
 Status: Resolved  (was: Ready to Commit)

and committed to 3.0 and merged to 3.11, thanks

test runs: 
[3.0|https://circleci.com/workflow-run/0c910638-db5f-494b-ba47-cc728a4d2eb6] 
[3.11|https://circleci.com/workflow-run/da2cd137-ae9e-45bf-9c4d-1b98a006e08a]

> Add primary key liveness info when skipping illegal cells
> -
>
> Key: CASSANDRA-15365
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15365
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Marcus Eriksson
>Assignee: Marcus Eriksson
>Priority: Normal
> Fix For: 3.0.20, 3.11.6
>
>
> In CASSANDRA-15086/CASSANDRA-15178 we started skipping the illegal legacy 
> cells. The problem is that if the row only contains illegal cells, we return 
> a totally empty row, which breaks stats collection: 
> https://github.com/apache/cassandra/blob/93815db9853cb592edf13d82e91dc2e9d172f01f/src/java/org/apache/cassandra/db/rows/Rows.java#L70
> If the row only has these invalid cells, we should add a primary key liveness 
> info to it to match the 2.1 behaviour.
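
The behaviour described above can be sketched in Java as follows. This is a simplified model under stated assumptions, not the `LegacyLayout` code: the class `RowMarkerSketch` and its methods are hypothetical, cells are reduced to a validity flag and a timestamp, and keeping the newest timestamp among skipped cells is an assumption made for illustration.

```java
// Hypothetical, simplified model of the idea in this ticket; not the LegacyLayout code.
final class RowMarkerSketch
{
    private boolean hasValidCells = false;
    private Long invalidCellTimestamp = null; // assumption: remember the newest skipped cell

    // Record a cell; invalid (illegal legacy) cells are skipped but their timestamp is remembered.
    void addCell(boolean valid, long timestamp)
    {
        if (valid)
            hasValidCells = true;
        else if (invalidCellTimestamp == null || timestamp > invalidCellTimestamp)
            invalidCellTimestamp = timestamp;
    }

    // Timestamp to use for a row marker, or null when no marker is needed.
    Long rowMarkerTimestamp()
    {
        return hasValidCells ? null : invalidCellTimestamp;
    }
}
```

A row built only from invalid cells yields a marker timestamp, so it is not completely empty; a row with at least one valid cell yields none.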






[cassandra] 01/01: Merge branch 'cassandra-3.0' into cassandra-3.11

2019-10-30 Thread marcuse

marcuse pushed a commit to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit 2d90e3c2a443e4da3d04cb5701d30fb709406c96
Merge: fb50d82 767a68c
Author: Marcus Eriksson 
AuthorDate: Wed Oct 30 15:10:41 2019 +0100

Merge branch 'cassandra-3.0' into cassandra-3.11

 CHANGES.txt|   5 +-
 src/java/org/apache/cassandra/db/LegacyLayout.java |  56 ---
 ...with_illegal_cell_names-ka-2-CompressionInfo.db | Bin 0 -> 43 bytes
 ...-legacy_ka_with_illegal_cell_names-ka-2-Data.db | Bin 0 -> 59 bytes
 ...acy_ka_with_illegal_cell_names-ka-2-Digest.sha1 |   1 +
 ...egacy_ka_with_illegal_cell_names-ka-2-Filter.db | Bin 0 -> 16 bytes
 ...legacy_ka_with_illegal_cell_names-ka-2-Index.db | Bin 0 -> 18 bytes
 ...y_ka_with_illegal_cell_names-ka-2-Statistics.db | Bin 0 -> 4452 bytes
 ...-legacy_ka_with_illegal_cell_names-ka-2-TOC.txt |   8 +++
 ...th_illegal_cell_names_2-ka-1-CompressionInfo.db | Bin 0 -> 43 bytes
 ...egacy_ka_with_illegal_cell_names_2-ka-1-Data.db | Bin 0 -> 67 bytes
 ...y_ka_with_illegal_cell_names_2-ka-1-Digest.sha1 |   1 +
 ...acy_ka_with_illegal_cell_names_2-ka-1-Filter.db | Bin 0 -> 16 bytes
 ...gacy_ka_with_illegal_cell_names_2-ka-1-Index.db | Bin 0 -> 18 bytes
 ...ka_with_illegal_cell_names_2-ka-1-Statistics.db | Bin 0 -> 4452 bytes
 ...cy_ka_with_illegal_cell_names_2-ka-1-Summary.db | Bin 0 -> 92 bytes
 ...egacy_ka_with_illegal_cell_names_2-ka-1-TOC.txt |   8 +++
 .../cassandra/io/sstable/LegacySSTableTest.java|  80 +++--
 18 files changed, 144 insertions(+), 15 deletions(-)

diff --cc CHANGES.txt
index efd9ecb,d58a199..c21b2cf
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,14 -1,10 +1,17 @@@
 -3.0.20
 +3.11.6
- Merged from 2.2
++Merged from 3.0:
+  * Ensure legacy rows have primary key livenessinfo when they contain illegal 
cells (CASSANDRA-15365)
 -Merged from 2.2
++Merged from 2.2:
   * In-JVM DTest: Set correct internode message version for upgrade test 
(CASSANDRA-15371)
  
+ 
 -3.0.19
 +3.11.5
 + * Fix SASI non-literal string comparisons (range operators) (CASSANDRA-15169)
 + * Make sure user defined compaction transactions are always closed 
(CASSANDRA-15123)
 + * Fix cassandra-env.sh to use $CASSANDRA_CONF to find cassandra-jaas.config 
(CASSANDRA-14305)
 + * Fixed nodetool cfstats printing index name twice (CASSANDRA-14903)
 + * Add flag to disable SASI indexes, and warnings on creation 
(CASSANDRA-14866)
 +Merged from 3.0:
   * Add ability to cap max negotiable protocol version (CASSANDRA-15193)
   * Gossip tokens on startup if available (CASSANDRA-15335)
   * Fix resource leak in CompressedSequentialWriter (CASSANDRA-15340)





[cassandra] branch cassandra-3.11 updated (fb50d82 -> 2d90e3c)

2019-10-30 Thread marcuse

marcuse pushed a change to branch cassandra-3.11
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from fb50d82  Increment version to 3.11.6
 new 767a68c  Ensure legacy rows have primary key livenessinfo when they 
contain illegal cells
 new 2d90e3c  Merge branch 'cassandra-3.0' into cassandra-3.11

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt|   5 +-
 src/java/org/apache/cassandra/db/LegacyLayout.java |  56 ---
 ...with_illegal_cell_names-ka-2-CompressionInfo.db | Bin 0 -> 43 bytes
 ...-legacy_ka_with_illegal_cell_names-ka-2-Data.db | Bin 0 -> 59 bytes
 ...acy_ka_with_illegal_cell_names-ka-2-Digest.sha1 |   1 +
 ...egacy_ka_with_illegal_cell_names-ka-2-Filter.db | Bin 0 -> 16 bytes
 ...legacy_ka_with_illegal_cell_names-ka-2-Index.db | Bin 0 -> 18 bytes
 ..._ka_with_illegal_cell_names-ka-2-Statistics.db} | Bin 4450 -> 4452 bytes
 ...legacy_ka_with_illegal_cell_names-ka-2-TOC.txt} |  10 +--
 ...th_illegal_cell_names_2-ka-1-CompressionInfo.db | Bin 0 -> 43 bytes
 ...egacy_ka_with_illegal_cell_names_2-ka-1-Data.db | Bin 0 -> 67 bytes
 ...y_ka_with_illegal_cell_names_2-ka-1-Digest.sha1 |   1 +
 ...acy_ka_with_illegal_cell_names_2-ka-1-Filter.db | Bin 0 -> 16 bytes
 ...gacy_ka_with_illegal_cell_names_2-ka-1-Index.db | Bin 0 -> 18 bytes
 ...a_with_illegal_cell_names_2-ka-1-Statistics.db} | Bin 4450 -> 4452 bytes
 ...cy_ka_with_illegal_cell_names_2-ka-1-Summary.db | Bin 0 -> 92 bytes
 ...gacy_ka_with_illegal_cell_names_2-ka-1-TOC.txt} |  10 +--
 .../cassandra/io/sstable/LegacySSTableTest.java|  80 +++--
 18 files changed, 138 insertions(+), 25 deletions(-)
 create mode 100644 
test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names/legacy_tables-legacy_ka_with_illegal_cell_names-ka-2-CompressionInfo.db
 create mode 100644 
test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names/legacy_tables-legacy_ka_with_illegal_cell_names-ka-2-Data.db
 create mode 100644 
test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names/legacy_tables-legacy_ka_with_illegal_cell_names-ka-2-Digest.sha1
 create mode 100644 
test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names/legacy_tables-legacy_ka_with_illegal_cell_names-ka-2-Filter.db
 create mode 100644 
test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names/legacy_tables-legacy_ka_with_illegal_cell_names-ka-2-Index.db
 copy 
test/data/legacy-sstables/ka/legacy_tables/{legacy_ka_14766/legacy_tables-legacy_ka_14766-ka-1-Statistics.db
 => 
legacy_ka_with_illegal_cell_names/legacy_tables-legacy_ka_with_illegal_cell_names-ka-2-Statistics.db}
 (91%)
 copy test/data/{bloom-filter/ka/foo/foo-atable-ka-1-TOC.txt => 
legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names/legacy_tables-legacy_ka_with_illegal_cell_names-ka-2-TOC.txt}
 (100%)
 create mode 100644 
test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-CompressionInfo.db
 create mode 100644 
test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-Data.db
 create mode 100644 
test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-Digest.sha1
 create mode 100644 
test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-Filter.db
 create mode 100644 
test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-Index.db
 copy 
test/data/legacy-sstables/ka/legacy_tables/{legacy_ka_14766/legacy_tables-legacy_ka_14766-ka-1-Statistics.db
 => 
legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-Statistics.db}
 (91%)
 create mode 100644 
test/data/legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-Summary.db
 copy test/data/{bloom-filter/ka/foo/foo-atable-ka-1-TOC.txt => 
legacy-sstables/ka/legacy_tables/legacy_ka_with_illegal_cell_names_2/legacy_tables-legacy_ka_with_illegal_cell_names_2-ka-1-TOC.txt}
 (100%)





[cassandra] branch cassandra-3.0 updated: Ensure legacy rows have primary key livenessinfo when they contain illegal cells

2019-10-30 Thread marcuse

marcuse pushed a commit to branch cassandra-3.0
in repository https://gitbox.apache.org/repos/asf/cassandra.git


The following commit(s) were added to refs/heads/cassandra-3.0 by this push:
 new 767a68c  Ensure legacy rows have primary key livenessinfo when they 
contain illegal cells
767a68c is described below

commit 767a68cd00050298abf7bbfd8b322e5663439c23
Author: Marcus Eriksson 
AuthorDate: Thu Oct 17 10:24:57 2019 +0200

Ensure legacy rows have primary key livenessinfo when they contain illegal 
cells

Patch by marcuse and Sam Tunnicliffe; reviewed by Benedict Elliott Smith 
for CASSANDRA-15365
---
 CHANGES.txt|   1 +
 src/java/org/apache/cassandra/db/LegacyLayout.java |  56 ---
 ...with_illegal_cell_names-ka-2-CompressionInfo.db | Bin 0 -> 43 bytes
 ...-legacy_ka_with_illegal_cell_names-ka-2-Data.db | Bin 0 -> 59 bytes
 ...acy_ka_with_illegal_cell_names-ka-2-Digest.sha1 |   1 +
 ...egacy_ka_with_illegal_cell_names-ka-2-Filter.db | Bin 0 -> 16 bytes
 ...legacy_ka_with_illegal_cell_names-ka-2-Index.db | Bin 0 -> 18 bytes
 ...y_ka_with_illegal_cell_names-ka-2-Statistics.db | Bin 0 -> 4452 bytes
 ...-legacy_ka_with_illegal_cell_names-ka-2-TOC.txt |   8 +++
 ...th_illegal_cell_names_2-ka-1-CompressionInfo.db | Bin 0 -> 43 bytes
 ...egacy_ka_with_illegal_cell_names_2-ka-1-Data.db | Bin 0 -> 67 bytes
 ...y_ka_with_illegal_cell_names_2-ka-1-Digest.sha1 |   1 +
 ...acy_ka_with_illegal_cell_names_2-ka-1-Filter.db | Bin 0 -> 16 bytes
 ...gacy_ka_with_illegal_cell_names_2-ka-1-Index.db | Bin 0 -> 18 bytes
 ...ka_with_illegal_cell_names_2-ka-1-Statistics.db | Bin 0 -> 4452 bytes
 ...cy_ka_with_illegal_cell_names_2-ka-1-Summary.db | Bin 0 -> 92 bytes
 ...egacy_ka_with_illegal_cell_names_2-ka-1-TOC.txt |   8 +++
 .../cassandra/io/sstable/LegacySSTableTest.java|  80 +++--
 18 files changed, 141 insertions(+), 14 deletions(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index bd12942..d58a199 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 3.0.20
+ * Ensure legacy rows have primary key livenessinfo when they contain illegal 
cells (CASSANDRA-15365)
 Merged from 2.2
  * In-JVM DTest: Set correct internode message version for upgrade test 
(CASSANDRA-15371)
 
diff --git a/src/java/org/apache/cassandra/db/LegacyLayout.java 
b/src/java/org/apache/cassandra/db/LegacyLayout.java
index 1a03c91..6e93d08 100644
--- a/src/java/org/apache/cassandra/db/LegacyLayout.java
+++ b/src/java/org/apache/cassandra/db/LegacyLayout.java
@@ -1291,6 +1291,22 @@ public abstract class LegacyLayout
 private LegacyRangeTombstone rowDeletion;
 private LegacyRangeTombstone collectionDeletion;
 
+/**
+ * Used to track if we need to add pk liveness info (row marker) when 
removing invalid legacy cells.
+ *
+ * In 2.1 these invalid cells existed but were not queryable, in this 
case specifically because they
+ * represented values for clustering key columns that were written as 
data cells.
+ *
+ * However, the presence (or not) of such cells on an otherwise empty 
CQL row (or partition) would decide
+ * if an empty result row were returned for the CQL row (or 
partition).  To maintain this behaviour we
+ * insert a row marker containing the liveness info of these invalid 
cells iff we have no other data
+ * on the row.
+ *
+ * See also CASSANDRA-15365
+ */
+private boolean hasValidCells = false;
+private LivenessInfo invalidLivenessInfo = null;
+
 public CellGrouper(CFMetaData metadata, SerializationHelper helper)
 {
 this(metadata, helper, false);
@@ -1317,6 +1333,8 @@ public abstract class LegacyLayout
 this.clustering = null;
 this.rowDeletion = null;
 this.collectionDeletion = null;
+this.invalidLivenessInfo = null;
+this.hasValidCells = false;
 }
 
 public boolean addAtom(LegacyAtom atom)
@@ -1326,7 +1344,7 @@ public abstract class LegacyLayout
  : addRangeTombstone(atom.asRangeTombstone());
 }
 
-public boolean addCell(LegacyCell cell)
+private boolean addCell(LegacyCell cell)
 {
 if (clustering == null)
 {
@@ -1359,21 +1377,38 @@ public abstract class LegacyLayout
 builder.addRowDeletion(Row.Deletion.regular(new 
DeletionTime(cell.timestamp, cell.localDeletionTime)));
 else
 
builder.addPrimaryKeyLivenessInfo(LivenessInfo.create(cell.timestamp, FAKE_TTL, 
cell.localDeletionTime));
+hasValidCells = true;
+}
+else if (column.isPrimaryKeyColumn() && metadata.isCQLTable())
+{
+// SSTables generated offline and side-loaded may include 
invalid cells which 

[cassandra] 01/01: Merge branch 'cassandra-3.11' into trunk

2019-10-30 Thread marcuse

marcuse pushed a commit to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git

commit f6c1dead6600e5b15239b5e2fd050770c48c6aa3
Merge: 6d93f04 2d90e3c
Author: Marcus Eriksson 
AuthorDate: Wed Oct 30 15:15:58 2019 +0100

Merge branch 'cassandra-3.11' into trunk






[cassandra] branch trunk updated (6d93f04 -> f6c1dea)

2019-10-30 Thread marcuse

marcuse pushed a change to branch trunk
in repository https://gitbox.apache.org/repos/asf/cassandra.git.


from 6d93f04  Increment version to 4.0-alpha3
 new 767a68c  Ensure legacy rows have primary key livenessinfo when they 
contain illegal cells
 new 2d90e3c  Merge branch 'cassandra-3.0' into cassandra-3.11
 new f6c1dea  Merge branch 'cassandra-3.11' into trunk

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:





[jira] [Updated] (CASSANDRA-15292) Point-in-time recovery ignoring timestamp of static column updates

2019-10-30 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15292:
---
Bug Category: Parent values: Correctness(12982), Level 1 values: Recoverable 
Corruption / Loss(12986)  (was: Parent values: Correctness(12982))
  Complexity: Low Hanging Fruit  (was: Normal)
  Status: Open  (was: Triage Needed)

> Point-in-time recovery ignoring timestamp of static column updates
> --
>
> Key: CASSANDRA-15292
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15292
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Commit Log
>Reporter: Vincent White
>Priority: Normal
>
> During point-in-time recovery 
> org.apache.cassandra.db.partitions.PartitionUpdate#maxTimestamp is checked to 
> see if any write timestamps in the update exceed the recovery point. If any 
> of the timestamps do exceed this point, the commit log replay is stopped.
> Currently maxTimestamp only iterates over the regular rows in the update and 
> doesn't check for any included updates to static columns. If a PartitionUpdate 
> only contains updates to static columns then maxTimestamp will return 
> Long.MIN_VALUE and always be replayed. 
> This generally isn't much of an issue, except for non-dense compact storage 
> tables which are implemented in the 3.x storage engine in large part with 
> static columns. In this case the commit log will always continue applying 
> updates to them past the recovery point until it hits an update to a 
> different table with regular columns or reaches the end of the commit logs.
>  
> ||Patch||
> |[3.11|https://github.com/vincewhite/cassandra/commits/3_11_check_static_column_timestamps_commit_log_archive]|
> |[Trunk|https://github.com/vincewhite/cassandra/commits/trunk_check_static_column_timestamps]|
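
The issue described above can be modelled with a small Java sketch. It is a hypothetical stand-in, not Cassandra's `PartitionUpdate`: the class `MaxTimestampSketch`, its methods, and the use of plain `long[]` arrays for timestamps are assumptions made for illustration. It shows why an update containing only static column writes reports `Long.MIN_VALUE` and is therefore always replayed.

```java
// Hypothetical stand-in for the timestamp check; not Cassandra's PartitionUpdate.
final class MaxTimestampSketch
{
    // Behaviour described in the ticket: only regular rows are inspected.
    static long maxTimestampRegularOnly(long[] regularRowTimestamps)
    {
        long max = Long.MIN_VALUE;
        for (long ts : regularRowTimestamps)
            max = Math.max(max, ts);
        return max;
    }

    // Proposed direction: also fold in timestamps from static column updates.
    static long maxTimestampWithStatics(long[] regularRowTimestamps, long[] staticTimestamps)
    {
        long max = maxTimestampRegularOnly(regularRowTimestamps);
        for (long ts : staticTimestamps)
            max = Math.max(max, ts);
        return max;
    }

    // An update is replayed only if none of its timestamps exceed the recovery point.
    static boolean shouldReplay(long maxTimestamp, long recoveryPoint)
    {
        return maxTimestamp <= recoveryPoint;
    }
}
```

For an update with no regular rows and a static write at timestamp 200, a recovery point of 100 is respected only when the static timestamps are included in the maximum.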






[jira] [Commented] (CASSANDRA-15332) When repair is running with tracing, if a CorruptSSTableException is thrown while building Merkle Trees the DiskFailurePolicy does not get applied

2019-10-30 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962837#comment-16962837
 ] 

Marcus Eriksson commented on CASSANDRA-15332:
-

this lgtm, left a few minor comments on the PR

> When repair is running with tracing, if a CorruptSSTableException is thrown 
> while building Merkle Trees the DiskFailurePolicy does not get applied
> --
>
> Key: CASSANDRA-15332
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15332
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair, Observability/Tracing
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When a repair is in the validation phase and is building MerkleTrees, if a 
> corrupt SSTable exception is thrown the disk failure policy does not get 
> applied






[jira] [Updated] (CASSANDRA-15322) Partition size virtual table

2019-10-30 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15322:
---
Component/s: Feature/Virtual Tables

> Partition size virtual table
> 
>
> Key: CASSANDRA-15322
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15322
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Virtual Tables
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>  Labels: virtual-tables
>
> Virtual table to provide on disk size (local) of a given partition. Useful 
> for checking for or verifying issues with wide partitions. This is dependent 
> on the lazy virtual table ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15322) Partition size virtual table

2019-10-30 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15322:
---
Labels:   (was: virtual-tables)

> Partition size virtual table
> 
>
> Key: CASSANDRA-15322
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15322
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Virtual Tables
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>
> Virtual table that provides the local on-disk size of a given partition. 
> Useful for detecting or verifying issues with wide partitions. This depends 
> on the lazy virtual table ticket.






[jira] [Updated] (CASSANDRA-15336) LegacyLayout RangeTombstoneList throws IndexOutOfBoundsException When Running Range Queries

2019-10-30 Thread Jeremy Hanna (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeremy Hanna updated CASSANDRA-15336:
-
Description: 
Hi All, 

This bug is similar to CASSANDRA-15172 but relates specifically to range 
queries running over range tombstones. 

 

 

*+Steps to Reproduce:+*

CREATE KEYSPACE ks1 WITH replication = {'class': 'NetworkTopologyStrategy', 
'DC1': '3'} AND durable_writes = true;

+*TABLE:*+ 
 CREATE TABLE ks1.table1 (
 col1 text,
 col2 text,
 col3 text,
 col4 text,
 col5 text,
 col6 timestamp,
 data text,
 PRIMARY KEY ((col1, col2, col3), col4, col5, col6)
 );

 

Inserted ~4 million rows and created range tombstones by deleting ~1 million 
rows.

 

+*Create Data*+

_insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES 
( '1', '11', '21', '1', 'a', 1231231230, 'data');_
 _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) 
VALUES ( '1', '11', '21', '2', 'a', 1231231230, 'data');_
 _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) 
VALUES ( '1', '11', '21', '3', 'a', 1231231230, 'data');_
 _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) 
VALUES ( '1', '11', '21', '4', 'a', 1231231230, 'data');_
 _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) 
VALUES ( '1', '11', '21', '5', 'a', 1231231230, 'data');_

 

+*Create Range Tombstones*+

delete from ks1.table1 where col1='1' and col2='11' and col3='21' and col4='1';
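
The five inserts above differ only in col4, so a larger data set of the kind described (millions of rows, a fraction of them deleted) could be scripted rather than typed. Below is a minimal sketch in Python that only generates CQL text. The function names, the use of col4 as the row counter, and the choice to delete one col4 value out of every four are assumptions for illustration, not the reporter's actual script.

```python
from typing import Iterator

# Statement shapes copied from the reproduction steps; only col4 varies.
INSERT_TEMPLATE = (
    "insert into ks1.table1 (col1, col2, col3, col4, col5, col6, data) "
    "VALUES ('1', '11', '21', '{col4}', 'a', 1231231230, 'data');"
)
DELETE_TEMPLATE = (
    "delete from ks1.table1 where col1='1' and col2='11' and col3='21' "
    "and col4='{col4}';"
)


def insert_statements(n_rows: int) -> Iterator[str]:
    """Yield one INSERT per row, varying only col4 (assumed pattern)."""
    for i in range(1, n_rows + 1):
        yield INSERT_TEMPLATE.format(col4=i)


def delete_statements(n_rows: int, every: int = 4) -> Iterator[str]:
    """Yield a DELETE for every `every`-th col4 value (assumed ratio).

    Each DELETE restricts only col4, a prefix of the clustering key
    (col4, col5, col6), so it produces a range tombstone rather than
    a single-row tombstone.
    """
    for i in range(1, n_rows + 1, every):
        yield DELETE_TEMPLATE.format(col4=i)


if __name__ == "__main__":
    # Print a small batch; pipe the output into cqlsh to apply it.
    for stmt in insert_statements(5):
        print(stmt)
    for stmt in delete_statements(5):
        print(stmt)
```

With `n_rows` raised to about 4 million and `every=4`, this would yield roughly the 4:1 ratio of inserted to deleted rows mentioned above.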

 

+*Query Live Rows (no tombstones)*+

_select * from ks1.table1 where col1='1' and col2='201' and col3='21' and 
col4='1' and col5='a' and *col6>1231231230*;_

No issues found, everything is running properly.

 

+*Query Range Tombstones*+

_select * from ks1.table1 where col1='1' and col2='11' and col3='21' and 
col4='1' and col5='a' and *col6=1231231230*;_

No issues found, everything is running properly.

 

+BUT when running range queries:+

_select * from ks1.table1 where col1='1' and col2='11' and col3='21' and 
col4='1' and col5='a' and *col6>1231231220;*_

WARN [ReadStage-1] 2019-09-23 14:17:10,281 AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread Thread[ReadStage-1,5,main]: {}
 java.lang.ArrayIndexOutOfBoundsException: 2
 at org.apache.cassandra.db.AbstractBufferClusteringPrefix.get(AbstractBufferClusteringPrefix.java:55)
 at org.apache.cassandra.db.LegacyLayout$LegacyRangeTombstoneList.serializedSizeCompound(LegacyLayout.java:2545)
 at org.apache.cassandra.db.LegacyLayout$LegacyRangeTombstoneList.serializedSize(LegacyLayout.java:2522)
 at org.apache.cassandra.db.LegacyLayout.serializedSizeAsLegacyPartition(LegacyLayout.java:565)
 at org.apache.cassandra.db.ReadResponse$Serializer.serializedSize(ReadResponse.java:446)
 at org.apache.cassandra.db.ReadResponse$Serializer.serializedSize(ReadResponse.java:352)
 at org.apache.cassandra.net.MessageOut.payloadSize(MessageOut.java:171)
 at org.apache.cassandra.net.OutboundTcpConnectionPool.getConnection(OutboundTcpConnectionPool.java:77)
 at org.apache.cassandra.net.MessagingService.getConnection(MessagingService.java:802)
 at org.apache.cassandra.net.MessagingService.sendOneWay(MessagingService.java:953)
 at org.apache.cassandra.net.MessagingService.sendReply(MessagingService.java:929)
 at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:62)
 at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:66)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
 at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
 at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:114)
 at java.lang.Thread.run(Thread.java:745)

 

This WARN is logged continuously until I stop the range query script.

Hope this helps.

Thanks!

  was:
Hi All, 

This bug is similar to https://issues.apache.org/jira/browse/CASSANDRA-15172 
but relates specifically to range queries running over range tombstones. 

 

 

*+Steps to Reproduce:+*

CREATE KEYSPACE ks1 WITH replication = {'class': 'NetworkTopologyStrategy', 
'DC1': '3'} AND durable_writes = true;

+*TABLE:*+ 
 CREATE TABLE ks1.table1 (
 col1 text,
 col2 text,
 col3 text,
 col4 text,
 col5 text,
 col6 timestamp,
 data text,
 PRIMARY KEY ((col1, col2, col3), col4, col5, col6)
 );

 

Inserted ~4 million rows and created range tombstones by deleting ~1 million 
rows.

 

+*Create Data*+

_insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) VALUES 
( '1', '11', '21', '1', 'a', 1231231230, 'data');_
 _insert into ks1.table1 (col1, col2 , col3 , col4 , col5 , col6 , data ) 
VALUES ( '1', '11',