[jira] [Commented] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp

2021-10-25 Thread Brandon Williams (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17433739#comment-17433739
 ] 

Brandon Williams commented on CASSANDRA-14160:
--

I believe it is still valid, [~subkanthi]

> maxPurgeableTimestamp should traverse tables in order of minTimestamp
> -
>
> Key: CASSANDRA-14160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Josh Snyder
>Priority: Normal
>  Labels: performance
> Fix For: 4.x
>
>
> In maxPurgeableTimestamp, we iterate over the bloom filters of each 
> overlapping SSTable. Of the bloom filter hits, we take the SSTable with the 
> lowest minTimestamp. If we kept the SSTables in sorted order of minTimestamp, 
> then we could short-circuit the operation at the first bloom filter hit, 
> reducing cache pressure (or worse, I/O) and CPU time.
> I've written (but not yet benchmarked) [some 
> code|https://github.com/hashbrowncipher/cassandra/commit/29859a4a2e617f6775be49448858bc59fdafab44]
>  to demonstrate this possibility.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp

2021-10-23 Thread Kanthi Subramanian (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17433379#comment-17433379
 ] 

Kanthi Subramanian commented on CASSANDRA-14160:


[~e.dimitrova], [~brandon.williams] is this issue still valid, would be a good 
experience to work on SSTables.

> maxPurgeableTimestamp should traverse tables in order of minTimestamp
> -
>
> Key: CASSANDRA-14160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Josh Snyder
>Priority: Normal
>  Labels: performance
> Fix For: 4.x
>
>
> In maxPurgeableTimestamp, we iterate over the bloom filters of each 
> overlapping SSTable. Of the bloom filter hits, we take the SSTable with the 
> lowest minTimestamp. If we kept the SSTables in sorted order of minTimestamp, 
> then we could short-circuit the operation at the first bloom filter hit, 
> reducing cache pressure (or worse, I/O) and CPU time.
> I've written (but not yet benchmarked) [some 
> code|https://github.com/hashbrowncipher/cassandra/commit/29859a4a2e617f6775be49448858bc59fdafab44]
>  to demonstrate this possibility.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp

2021-08-02 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17391755#comment-17391755
 ] 

Ekaterina Dimitrova commented on CASSANDRA-14160:
-

I just had a quick check in ASF Slack with [~josnyder], he confirmed he won't 
be able to work on this and he will be happy if someone else take over on this 
work. I will un-assign him based on this conversation. 

> maxPurgeableTimestamp should traverse tables in order of minTimestamp
> -
>
> Key: CASSANDRA-14160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Josh Snyder
>Assignee: Josh Snyder
>Priority: Normal
>  Labels: performance
> Fix For: 4.x
>
>
> In maxPurgeableTimestamp, we iterate over the bloom filters of each 
> overlapping SSTable. Of the bloom filter hits, we take the SSTable with the 
> lowest minTimestamp. If we kept the SSTables in sorted order of minTimestamp, 
> then we could short-circuit the operation at the first bloom filter hit, 
> reducing cache pressure (or worse, I/O) and CPU time.
> I've written (but not yet benchmarked) [some 
> code|https://github.com/hashbrowncipher/cassandra/commit/29859a4a2e617f6775be49448858bc59fdafab44]
>  to demonstrate this possibility.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp

2021-07-13 Thread Ekaterina Dimitrova (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17380169#comment-17380169
 ] 

Ekaterina Dimitrova commented on CASSANDRA-14160:
-

Hi [~josnyder], [~marcuse], [~jjirsa],

I am trying to look at tickets which need a reviewer and I saw this one. 

Was there any further discussion? 

> maxPurgeableTimestamp should traverse tables in order of minTimestamp
> -
>
> Key: CASSANDRA-14160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Josh Snyder
>Assignee: Josh Snyder
>Priority: Normal
>  Labels: performance
> Fix For: 4.x
>
>
> In maxPurgeableTimestamp, we iterate over the bloom filters of each 
> overlapping SSTable. Of the bloom filter hits, we take the SSTable with the 
> lowest minTimestamp. If we kept the SSTables in sorted order of minTimestamp, 
> then we could short-circuit the operation at the first bloom filter hit, 
> reducing cache pressure (or worse, I/O) and CPU time.
> I've written (but not yet benchmarked) [some 
> code|https://github.com/hashbrowncipher/cassandra/commit/29859a4a2e617f6775be49448858bc59fdafab44]
>  to demonstrate this possibility.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp

2018-04-17 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441124#comment-16441124
 ] 

Jeff Jirsa commented on CASSANDRA-14160:


cc [~cnlwsu] as well (we've seen some meaningful perf improvements with 
NEVER_PURGE_TOMBSTONES=true, tagging Chris in case he remembers if the benefit 
was in the purge evaluator or getFullyExpiredSSTables() )



> maxPurgeableTimestamp should traverse tables in order of minTimestamp
> -
>
> Key: CASSANDRA-14160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Josh Snyder
>Assignee: Josh Snyder
>Priority: Major
>  Labels: performance
> Fix For: 4.x
>
>
> In maxPurgeableTimestamp, we iterate over the bloom filters of each 
> overlapping SSTable. Of the bloom filter hits, we take the SSTable with the 
> lowest minTimestamp. If we kept the SSTables in sorted order of minTimestamp, 
> then we could short-circuit the operation at the first bloom filter hit, 
> reducing cache pressure (or worse, I/O) and CPU time.
> I've written (but not yet benchmarked) [some 
> code|https://github.com/hashbrowncipher/cassandra/commit/29859a4a2e617f6775be49448858bc59fdafab44]
>  to demonstrate this possibility.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp

2018-04-17 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16440885#comment-16440885
 ] 

Marcus Eriksson commented on CASSANDRA-14160:
-

hey [~josnyder] the code looks good to me, but, as [~jjirsa] mentioned above, a 
unit test that makes sure the sstables are returned in the correct order should 
be added. You also mentioned a that you had not yet benchmarked the change, 
have you done that? If not, that would also be nice.

> maxPurgeableTimestamp should traverse tables in order of minTimestamp
> -
>
> Key: CASSANDRA-14160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Josh Snyder
>Assignee: Josh Snyder
>Priority: Major
>  Labels: performance
> Fix For: 4.x
>
>
> In maxPurgeableTimestamp, we iterate over the bloom filters of each 
> overlapping SSTable. Of the bloom filter hits, we take the SSTable with the 
> lowest minTimestamp. If we kept the SSTables in sorted order of minTimestamp, 
> then we could short-circuit the operation at the first bloom filter hit, 
> reducing cache pressure (or worse, I/O) and CPU time.
> I've written (but not yet benchmarked) [some 
> code|https://github.com/hashbrowncipher/cassandra/commit/29859a4a2e617f6775be49448858bc59fdafab44]
>  to demonstrate this possibility.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp

2018-04-11 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16434939#comment-16434939
 ] 

Jeff Jirsa commented on CASSANDRA-14160:


Re-pushed [here|https://github.com/jeffjirsa/cassandra/commits/14160] , tests 
running [here|https://circleci.com/gh/jeffjirsa/cassandra/tree/14160] (unit 
tests + dtests) 

> maxPurgeableTimestamp should traverse tables in order of minTimestamp
> -
>
> Key: CASSANDRA-14160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Josh Snyder
>Assignee: Josh Snyder
>Priority: Major
>  Labels: performance
> Fix For: 4.x
>
>
> In maxPurgeableTimestamp, we iterate over the bloom filters of each 
> overlapping SSTable. Of the bloom filter hits, we take the SSTable with the 
> lowest minTimestamp. If we kept the SSTables in sorted order of minTimestamp, 
> then we could short-circuit the operation at the first bloom filter hit, 
> reducing cache pressure (or worse, I/O) and CPU time.
> I've written (but not yet benchmarked) [some 
> code|https://github.com/hashbrowncipher/cassandra/commit/29859a4a2e617f6775be49448858bc59fdafab44]
>  to demonstrate this possibility.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp

2018-03-08 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16392018#comment-16392018
 ] 

Jeff Jirsa commented on CASSANDRA-14160:


[~krummas] can you review?

 

> maxPurgeableTimestamp should traverse tables in order of minTimestamp
> -
>
> Key: CASSANDRA-14160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Josh Snyder
>Assignee: Josh Snyder
>Priority: Major
>  Labels: performance
> Fix For: 4.x
>
>
> In maxPurgeableTimestamp, we iterate over the bloom filters of each 
> overlapping SSTable. Of the bloom filter hits, we take the SSTable with the 
> lowest minTimestamp. If we kept the SSTables in sorted order of minTimestamp, 
> then we could short-circuit the operation at the first bloom filter hit, 
> reducing cache pressure (or worse, I/O) and CPU time.
> I've written (but not yet benchmarked) [some 
> code|https://github.com/hashbrowncipher/cassandra/commit/29859a4a2e617f6775be49448858bc59fdafab44]
>  to demonstrate this possibility.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp

2018-01-13 Thread Josh Snyder (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16325397#comment-16325397
 ] 

Josh Snyder commented on CASSANDRA-14160:
-

I believe I've now fixed the issues. Take a look at:

https://github.com/hashbrowncipher/cassandra/tree/cassandra-14160
https://circleci.com/gh/hashbrowncipher/cassandra/2

> maxPurgeableTimestamp should traverse tables in order of minTimestamp
> -
>
> Key: CASSANDRA-14160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Josh Snyder
>Assignee: Josh Snyder
>  Labels: performance
> Fix For: 4.x
>
>
> In maxPurgeableTimestamp, we iterate over the bloom filters of each 
> overlapping SSTable. Of the bloom filter hits, we take the SSTable with the 
> lowest minTimestamp. If we kept the SSTables in sorted order of minTimestamp, 
> then we could short-circuit the operation at the first bloom filter hit, 
> reducing cache pressure (or worse, I/O) and CPU time.
> I've written (but not yet benchmarked) [some 
> code|https://github.com/hashbrowncipher/cassandra/commit/29859a4a2e617f6775be49448858bc59fdafab44]
>  to demonstrate this possibility.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp

2018-01-12 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324826#comment-16324826
 ] 

Jeff Jirsa commented on CASSANDRA-14160:


Hey [~josnyder], the code looks sane, but it also doesn't compile (or rather, 
the existing unit tests for the {{OverlapIterator}} no longer compile). Any 
chance you can address that? 

> maxPurgeableTimestamp should traverse tables in order of minTimestamp
> -
>
> Key: CASSANDRA-14160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Josh Snyder
>Assignee: Josh Snyder
>  Labels: performance
> Fix For: 4.x
>
>
> In maxPurgeableTimestamp, we iterate over the bloom filters of each 
> overlapping SSTable. Of the bloom filter hits, we take the SSTable with the 
> lowest minTimestamp. If we kept the SSTables in sorted order of minTimestamp, 
> then we could short-circuit the operation at the first bloom filter hit, 
> reducing cache pressure (or worse, I/O) and CPU time.
> I've written (but not yet benchmarked) [some 
> code|https://github.com/hashbrowncipher/cassandra/commit/29859a4a2e617f6775be49448858bc59fdafab44]
>  to demonstrate this possibility.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14160) maxPurgeableTimestamp should traverse tables in order of minTimestamp

2018-01-12 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324804#comment-16324804
 ] 

Jeff Jirsa commented on CASSANDRA-14160:


Pushed the patch to one of my github branches, running CI 
[here|https://circleci.com/gh/jeffjirsa/cassandra/tree/cassandra-14160]


> maxPurgeableTimestamp should traverse tables in order of minTimestamp
> -
>
> Key: CASSANDRA-14160
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14160
> Project: Cassandra
>  Issue Type: Bug
>  Components: Compaction
>Reporter: Josh Snyder
>Assignee: Josh Snyder
>  Labels: performance
> Fix For: 4.x
>
>
> In maxPurgeableTimestamp, we iterate over the bloom filters of each 
> overlapping SSTable. Of the bloom filter hits, we take the SSTable with the 
> lowest minTimestamp. If we kept the SSTables in sorted order of minTimestamp, 
> then we could short-circuit the operation at the first bloom filter hit, 
> reducing cache pressure (or worse, I/O) and CPU time.
> I've written (but not yet benchmarked) [some 
> code|https://github.com/hashbrowncipher/cassandra/commit/29859a4a2e617f6775be49448858bc59fdafab44]
>  to demonstrate this possibility.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org