[jira] [Assigned] (CASSANDRA-17477) -dc option for repair, prevents incremental repairs

2022-08-10 Thread Venkata Harikrishna Nukala (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-17477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala reassigned CASSANDRA-17477:
--

Assignee: Venkata Harikrishna Nukala

> -dc option for repair, prevents incremental repairs
> ---
>
> Key: CASSANDRA-17477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17477
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Repair
>Reporter: Pedro Gordo
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0.x, 4.x
>
>
> By default, running `{*}nodetool repair{*}` should trigger an incremental 
> repair, but this does not happen when you use the `{*}-dc{*}` flag, even 
> though the repair summary says `{*}incremental: true{*}`.
> You can replicate the issue with the following commands:
> {code:bash}
> ccm create test-incremental-repairs -v 4.0.1 -n 3 -s
> ccm node1 cqlsh -e "CREATE KEYSPACE keyspace1 WITH replication = {'class': 
> 'NetworkTopologyStrategy', 'datacenter1': 2 };"
> ccm stress write
> ccm node1 nodetool "repair keyspace1 standard1 -dc datacenter1"
> find ~/.ccm/test-incremental-repairs/*/data0/keyspace1/standard1* -name 
> *Data.db -exec /bin/sstablemetadata {} \; | grep Repaired
> ccm node1 nodetool "repair keyspace1 standard1"
> find ~/.ccm/test-incremental-repairs/*/data0/keyspace1/standard1* -name 
> *Data.db -exec /bin/sstablemetadata {} \; | grep Repaired
> {code}
> You'll notice that every line of output from the first `{*}find{*}` command is 
> `{*}Repaired at: 0{*}`, while the second `{*}find{*}` command gives you 
> results like `{*}Repaired at: 1648044754464 (03/23/2022 14:12:34){*}`. 
> At the same time, both `{*}nodetool repair{*}` commands produce output like the 
> following:
> {code:bash}
> [2022-03-23 15:15:52,500] Starting repair command #2 
> (20f12190-aabc-11ec-a3d4-e9e5a941ef6c), repairing keyspace keyspace1 with 
> repair options (parallelism: parallel, primary range: false, incremental: 
> true, job threads: 1, ColumnFamilies: [standard1], dataCenters: 
> [datacenter1], hosts: [], previewKind: NONE, # of ranges: 2, pull repair: 
> false, force repair: false, optimise streams: false, ignore unreplicated 
> keyspaces: false)
> {code}
> Indicating `{*}incremental: true{*}` for both of them.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-17477) -dc option for repair, prevents incremental repairs

2022-08-10 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17578110#comment-17578110
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-17477:


I can work on it. 







[jira] [Commented] (CASSANDRA-17222) Add new metrics to track the number of requests performed by GROUP BY and Aggregation queries

2022-05-18 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17539033#comment-17539033
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-17222:


Sorry for the very long delay. I spent some time on this and got stuck at the 
following point.

As per my understanding (please correct me if I am wrong): when there are enough 
results to fill a page, a response is sent to the client and the server does not 
retain any query state. The client then requests the next page, and the server 
resumes from the paging state it receives back from the client. If the server 
doesn't maintain any state for the query, then calculating the *total* number 
of internal requests happening for a group by or aggregate query is not 
possible, right (unless that state is maintained somewhere)? And if we update the 
histogram with the internal request count of each page separately, I feel it 
won't give the picture we expect.
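To make the concern concrete, here is a toy sketch (the class, numbers, and page sizes are invented for illustration and are not Cassandra code): a stateless server sees each external page in isolation, so a histogram updated per page would record per-page sub-page counts rather than the query-wide total.

```java
import java.util.List;

public class PagingCountSketch {
    // Number of internal sub-pages needed to serve one external page,
    // given the rows needed for the page and the rows fetched per sub-page.
    static int subPagesForPage(int rowsNeeded, int rowsPerSubPage) {
        return (rowsNeeded + rowsPerSubPage - 1) / rowsPerSubPage; // ceiling division
    }

    public static void main(String[] args) {
        // A query the driver fetches in three external pages of these row counts:
        List<Integer> externalPages = List.of(500, 500, 200);
        int rowsPerSubPage = 100;

        int total = 0;
        for (int rows : externalPages) {
            int subPages = subPagesForPage(rows, rowsPerSubPage);
            // A stateless server can only record 'subPages' here (5, 5, then 2)...
            total += subPages;
        }
        // ...so a per-page histogram never sees the query-wide total of 12.
        System.out.println(total); // prints 12
    }
}
```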

> Add new metrics to track the number of requests performed by GROUP BY and 
> Aggregation queries 
> --
>
> Key: CASSANDRA-17222
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17222
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Metrics
>Reporter: Benjamin Lerer
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
>  Labels: AdventCalendar2021, lhf
> Fix For: 4.x
>
>
> When a user performs a GROUP BY query or an aggregate query (e.g. {{SELECT 
> count\(\*) FROM my_table}}), C* will internally send multiple requests to 
> avoid running out of memory. The page size used for those internal queries 
> is the same as the external page size.
> Having some visibility into the number of internal requests happening for a 
> group by or an aggregate query is important, as it might help administrators 
> debug performance issues.
> We should add separate metrics for GROUP BY queries and Aggregate queries.
> +Additional information for newcomers:+
> * A new metric class called {{AggregationMetrics}} should be created with a 
> {{Histogram}} called {{internalPagesPerGroupByQuerie}} and another called 
> {{internalPagesPerAggregateQuerie}} (see {{BatchMetrics}} for an example).
> * High level query paging is managed by {{AggregationQueryPager}}. The 
> number of queries performed should be incremented within {{fetchSubPage}}, and 
> the metrics should be updated on close.
> * To test that the numbers are reliable, you need to create a new unit test, 
> {{AggregationMetricsTest}}. For an example of how to test group by 
> queries with paging, you can look into 
> {{SelectGroupByTest.testGroupByWithPaging()}}; to check how to clear the 
> histograms between tests, you can look into 
> {{BatchMetricsTests.clearHistogram()}}.
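The newcomer notes in the quoted description can be sketched as follows. This is a minimal, dependency-free illustration of the intended shape only: the names follow the ticket's suggestions, the real patch would use Cassandra's metrics {{Histogram}}, and the list-backed stand-in here simply records every value.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the suggested AggregationMetrics class: two "histograms"
// that just accumulate recorded values.
class AggregationMetricsSketch {
    static final List<Integer> internalPagesPerGroupByQuery = new ArrayList<>();
    static final List<Integer> internalPagesPerAggregateQuery = new ArrayList<>();
}

// Mirrors the suggested flow: count inside fetchSubPage, publish on close.
class AggregationPagerSketch implements AutoCloseable {
    private int subPages = 0;
    private final boolean groupBy;

    AggregationPagerSketch(boolean groupBy) { this.groupBy = groupBy; }

    void fetchSubPage() {
        subPages++; // one internal request per sub-page
    }

    @Override
    public void close() {
        if (groupBy)
            AggregationMetricsSketch.internalPagesPerGroupByQuery.add(subPages);
        else
            AggregationMetricsSketch.internalPagesPerAggregateQuery.add(subPages);
    }
}

public class Demo {
    public static void main(String[] args) {
        try (AggregationPagerSketch pager = new AggregationPagerSketch(true)) {
            for (int i = 0; i < 3; i++)
                pager.fetchSubPage(); // three internal requests for one query
        }
        System.out.println(AggregationMetricsSketch.internalPagesPerGroupByQuery); // prints [3]
    }
}
```

Publishing on close rather than per fetch is what makes the recorded value a per-query total instead of a per-page count.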






[jira] [Commented] (CASSANDRA-17222) Add new metrics to track the number of requests performed by GROUP BY and Aggregation queries

2022-01-23 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17480661#comment-17480661
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-17222:


Can I pick it?







[jira] [Commented] (CASSANDRA-17126) Remove use of deprecated Files in tests

2022-01-09 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17471318#comment-17471318
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-17126:


Made the changes.

PR: https://github.com/apache/cassandra/pull/1381
Circle CI: 
https://app.circleci.com/pipelines/github/nvharikrishna/cassandra/60/workflows/cb6858cb-6c0d-401f-b77c-64ee517c22af

> Remove use of deprecated Files in tests
> ---
>
> Key: CASSANDRA-17126
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17126
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Test/unit
>Reporter: Brandon Williams
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> From checkstyle:
> {noformat}
>   5  Illegal import - java.io.File. [IllegalImport]
>   3  Illegal import - java.io.FileInputStream. [IllegalImport]
>   2  Illegal import - java.io.FileOutputStream. [IllegalImport]
>   3  Illegal import - java.io.FileWriter. [IllegalImport]
>  16  Illegal import - java.io.RandomAccessFile. [IllegalImport]
> {noformat}






[jira] [Commented] (CASSANDRA-17126) Remove use of deprecated Files in tests

2021-12-15 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-17126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17459787#comment-17459787
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-17126:


Can I pick it?







[jira] [Commented] (CASSANDRA-14898) Key cache loading is very slow when there are many SSTables

2021-12-08 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17455938#comment-17455938
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14898:


[~jolynch]  [~marcuse]  Addressed the review comments and squashed the commits 
for each branch.

> Key cache loading is very slow when there are many SSTables
> ---
>
> Key: CASSANDRA-14898
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14898
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
> Environment: AWS i3.2xlarge, 4 physical cores (8 threads), 60GB of 
> RAM, loading about 8MB of KeyCache with 10k keys in it.
>Reporter: Joey Lynch
>Assignee: Venkata Harikrishna Nukala
>Priority: Low
>  Labels: Performance, low-hanging-fruit
> Attachments: key_cache_load_slow.svg
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> While dealing with a production issue today where some 3.0.17 nodes had close 
> to ~8k sstables on disk due to excessive write pressure, we had a few nodes 
> crash due to OOM and then they took close to 17 minutes to load the key cache 
> and recover. This excessive key cache load significantly increased the 
> duration of the outage (to mitigate we just removed the saved key cache 
> files). For example, here is one instance taking 17 minutes to load 10k keys, 
> or about 10 keys per second (which is ... very slow):
> {noformat}
> INFO  [pool-3-thread-1] 2018-11-15 21:50:21,885 AutoSavingCache.java:190 - 
> reading saved cache /mnt/data/cassandra/saved_caches/KeyCache-d.db
> INFO  [pool-3-thread-1] 2018-11-15 22:07:16,490 AutoSavingCache.java:166 - 
> Completed loading (1014606 ms; 10103 keys) KeyCache cache
> {noformat}
> I've witnessed similar behavior in the past with large LCS clusters, and 
> indeed it appears that any time the number of sstables is large, KeyCache 
> loading takes a _really_ long time. Today I got a flame graph and I believe 
> that I found the issue and I think it's reasonably easy to fix. From what I 
> can tell the {{KeyCacheSerializer::deserialize}} [method 
> |https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L445]
>  which is called for every key is linear in the number of sstables due to the 
> [call|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L459]
>  to {{ColumnFamilyStore::getSSTables}} which ends up calling {{View::select}} 
> [here|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/lifecycle/View.java#L139].
>  The {{View::select}} call is linear in the number of sstables and causes a 
> _lot_ of {{HashSet}} 
> [resizing|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/lifecycle/View.java#L139]
>  when the number of sstables is much greater than 16 (the default size of the 
> backing {{HashMap}}).
> As we see in the attached flamegraph we spend 50% of our CPU time in these 
> {{getSSTable}} calls, of which 36% is spent adding sstables to the HashSet in 
> {{View::select}} and 17% is spent just iterating the sstables in the first 
> place. A full 16% of CPU time is spent _just resizing the HashMap_. Then 
> another 4% is spent calling {{CacheService::findDesc}}, which does [a linear 
> search|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L475]
>  for the sstable generation.
> I believe that this affects at least Cassandra 3.0.17 and trunk, and could be 
> pretty easily fixed by either caching the getSSTables call or at the very 
> least pre-sizing the {{HashSet}} in {{View::select}} to be the size of the 
> sstables map.
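The pre-sizing fix the reporter suggests can be sketched as follows (illustrative only, not the actual {{View::select}} code). The key idea: a no-arg {{HashSet}} starts with 16 buckets and doubles repeatedly while thousands of sstables are added, so sizing it up front with the usual capacity-over-load-factor idiom avoids every one of those rehashes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PresizeSketch {
    static Set<String> select(List<String> sstables) {
        // Capacity = expected size / default 0.75 load factor, plus one,
        // so the backing table never needs to grow during insertion.
        Set<String> out = new HashSet<>((int) (sstables.size() / 0.75f) + 1);
        for (String s : sstables)
            out.add(s);
        return out;
    }

    public static void main(String[] args) {
        // Roughly the scale described in the report: ~8k sstables per node.
        List<String> sstables = new ArrayList<>();
        for (int i = 0; i < 8000; i++)
            sstables.add("sstable-" + i);
        System.out.println(select(sstables).size()); // prints 8000
    }
}
```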






[jira] [Commented] (CASSANDRA-14898) Key cache loading is very slow when there are many SSTables

2021-12-06 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17454080#comment-17454080
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14898:


Thanks for the review, [~marcuse]! Taking a look.







[jira] [Comment Edited] (CASSANDRA-14898) Key cache loading is very slow when there are many SSTables

2021-11-29 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434487#comment-17434487
 ] 

Venkata Harikrishna Nukala edited comment on CASSANDRA-14898 at 11/30/21, 4:22 
AM:
---

Here are the changes: 

|[trunk|https://github.com/apache/cassandra/pull/1287]|[CI|https://app.circleci.com/pipelines/github/nvharikrishna/cassandra/41/workflows/7fb33758-2a6d-464a-bcc5-8b5709d1d997]|

|[3.11|https://github.com/apache/cassandra/pull/1288]|[CI|https://app.circleci.com/pipelines/github/nvharikrishna/cassandra/42/workflows/e7cdd28d-9d2e-4445-9157-e2b68ca0ef47]|

|[3.0|https://github.com/apache/cassandra/pull/1290]|[CI|https://app.circleci.com/pipelines/github/nvharikrishna/cassandra/46/workflows/a2baf042-657d-44c2-8890-b52215f9f0cf]|

 


was (Author: n.v.harikrishna):
Here are the changes: 

|[trunk|https://github.com/apache/cassandra/pull/1287 
]|[CI|https://app.circleci.com/pipelines/github/nvharikrishna/cassandra/41/workflows/7fb33758-2a6d-464a-bcc5-8b5709d1d997]|

|[3.11|https://github.com/apache/cassandra/pull/1288 
]|[CI|https://app.circleci.com/pipelines/github/nvharikrishna/cassandra/42/workflows/e7cdd28d-9d2e-4445-9157-e2b68ca0ef47]|

|[3.0|https://github.com/apache/cassandra/pull/1290 
]|[CI|https://app.circleci.com/pipelines/github/nvharikrishna/cassandra/46/workflows/a2baf042-657d-44c2-8890-b52215f9f0cf]|

 


[jira] [Commented] (CASSANDRA-14898) Key cache loading is very slow when there are many SSTables

2021-10-26 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17434487#comment-17434487
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14898:


Here are the changes: 

|[trunk|https://github.com/apache/cassandra/pull/1287 
]|[CI|https://app.circleci.com/pipelines/github/nvharikrishna/cassandra/41/workflows/7fb33758-2a6d-464a-bcc5-8b5709d1d997]|

|[3.11|https://github.com/apache/cassandra/pull/1288 
]|[CI|https://app.circleci.com/pipelines/github/nvharikrishna/cassandra/42/workflows/e7cdd28d-9d2e-4445-9157-e2b68ca0ef47]|

|[3.0|https://github.com/apache/cassandra/pull/1290 
]|[CI|https://app.circleci.com/pipelines/github/nvharikrishna/cassandra/46/workflows/a2baf042-657d-44c2-8890-b52215f9f0cf]|

 







[jira] [Commented] (CASSANDRA-14898) Key cache loading is very slow when there are many SSTables

2021-10-22 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17433131#comment-17433131
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14898:


[~jolynch] I have updated the patch for trunk. Here is the PR: 
[https://github.com/apache/cassandra/pull/1287] and CI is here: 
https://app.circleci.com/pipelines/github/nvharikrishna/cassandra/41/workflows/7fb33758-2a6d-464a-bcc5-8b5709d1d997
 # I did not make the changes to implement AutoCloseable, as closing the 
resource generally gives the impression that the resource is no longer usable, 
but the CacheSerializer is later used for serialising too. I use a normal 
method to clear up the map instead.
 # Added a few tests to KeyCacheTest, since it has most of the tests related to 
the key cache.
 # Added a limit on cache load time, configurable through the 
max_cache_load_time param (defaults to 30 sec).

Sorry for the delay. I would appreciate your review of the code changes. Please 
give me a day or two to make the changes for the other branches.
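A minimal sketch of the deadline approach in item 3 above, assuming hypothetical names (MAX_CACHE_LOAD_TIME_MS, EntrySource, BoundedCacheLoader are illustrative, not Cassandra's actual classes); the real patch implements the check in the while loop of AutoSavingCache.loadSaved:

```java
import java.util.concurrent.TimeUnit;

public class BoundedCacheLoader {
    // Corresponds to the max_cache_load_time param (defaults to 30 sec)
    static final long MAX_CACHE_LOAD_TIME_MS = TimeUnit.SECONDS.toMillis(30);

    // Stand-in for the saved-cache input stream being deserialized entry by entry
    interface EntrySource { boolean hasNext(); void loadNext(); }

    /** Returns the number of entries loaded before the deadline expired. */
    static int loadSaved(EntrySource source) {
        long start = System.currentTimeMillis();
        int count = 0;
        while (source.hasNext()) {
            if (System.currentTimeMillis() - start > MAX_CACHE_LOAD_TIME_MS)
                break; // give up on the remaining entries rather than delay startup
            source.loadNext();
            count++;
        }
        return count;
    }
}
```

Checking the deadline inside the loop avoids cancelling the loading thread from outside, which is what caused the ClosedByInterruptException problem discussed earlier in this ticket.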

 

 

> Key cache loading is very slow when there are many SSTables
> ---
>
> Key: CASSANDRA-14898
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14898
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
> Environment: AWS i3.2xlarge, 4 physical cores (8 threads), 60GB of 
> RAM, loading about 8MB of KeyCache with 10k keys in it.
>Reporter: Joey Lynch
>Assignee: Venkata Harikrishna Nukala
>Priority: Low
>  Labels: Performance, low-hanging-fruit
> Attachments: key_cache_load_slow.svg
>
>
> While dealing with a production issue today where some 3.0.17 nodes had close 
> to ~8k sstables on disk due to excessive write pressure, we had a few nodes 
> crash due to OOM and then they took close to 17 minutes to load the key cache 
> and recover. This excessive key cache load significantly increased the 
> duration of the outage (to mitigate we just removed the saved key cache 
> files). For example here is one example taking 17 minutes to load 10k keys, 
> or about 10 keys per second (which is ... very slow):
> {noformat}
> INFO  [pool-3-thread-1] 2018-11-15 21:50:21,885 AutoSavingCache.java:190 - 
> reading saved cache /mnt/data/cassandra/saved_caches/KeyCache-d.db
> INFO  [pool-3-thread-1] 2018-11-15 22:07:16,490 AutoSavingCache.java:166 - 
> Completed loading (1014606 ms; 10103 keys) KeyCache cache
> {noformat}
> I've witnessed similar behavior in the past with large LCS clusters, and 
> indeed it appears that any time the number of sstables is large, KeyCache 
> loading takes a _really_ long time. Today I got a flame graph and I believe 
> that I found the issue and I think it's reasonably easy to fix. From what I 
> can tell the {{KeyCacheSerializer::deserialize}} [method 
> |https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L445]
>  which is called for every key is linear in the number of sstables due to the 
> [call|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L459]
>  to {{ColumnFamilyStore::getSSTables}} which ends up calling {{View::select}} 
> [here|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/lifecycle/View.java#L139].
>  The {{View::select}} call is linear in the number of sstables and causes a 
> _lot_ of {{HashSet}} 
> [resizing|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/lifecycle/View.java#L139]
>  when the number of sstables is much greater than 16 (the default size of the 
> backing {{HashMap}}).
> As we see in the attached flamegraph we spend 50% of our CPU time in these 
> {{getSSTable}} calls, of which 36% is spent adding sstables to the HashSet in 
> {{View::select}} and 17% is spent just iterating the sstables in the first 
> place. A full 16% of CPU time is spent _just resizing the HashMap_. Then 
> another 4% is spent calling {{CacheService::findDesc}} which does [a linear 
> search|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L475]
>  for the sstable generation.
> I believe that this affects at least Cassandra 3.0.17 and trunk, and could be 
> pretty easily fixed by either caching the getSSTables call or at the very 
> least pre-sizing the {{HashSet}} in {{View::select}} to be the size of the 
> sstables map.
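The pre-sizing fix suggested at the end of the report can be sketched as follows (a minimal illustration; PreSizedSelect and its parameters are stand-ins, not Cassandra's actual View fields):

```java
import java.util.*;
import java.util.function.Predicate;

public class PreSizedSelect {
    // Stand-in for View::select: filter the live sstables into a set.
    static <T> Set<T> select(Map<T, ?> sstablesMap, Predicate<T> filter) {
        // Pre-size the HashSet so its backing HashMap never resizes while we
        // add up to sstablesMap.size() elements (default load factor 0.75).
        Set<T> out = new HashSet<>((int) (sstablesMap.size() / 0.75f) + 1);
        for (T sstable : sstablesMap.keySet())
            if (filter.test(sstable))
                out.add(sstable);
        return out;
    }
}
```

With the default initial capacity of 16, adding thousands of sstables forces repeated rehashing; sizing the set up front eliminates that cost without changing behavior.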




[jira] [Commented] (CASSANDRA-14898) Key cache loading is very slow when there are many SSTables

2021-09-21 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17418254#comment-17418254
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14898:


[~jolynch] Thanks for the review!
{quote}Would you like to get this into 3.0/3.11/4.0 and trunk? 
{quote}
I will create three different patches.
{quote}Did you mean to remove RowIndexEntry.Serializer.skipForCache(input); on 
line 453 ?
{quote}
No, it looks like a mistake slipped in on my side before pushing the changes. I 
will revert it. Thanks for pointing it out.
{quote}If we want to be extra careful do you think we could use 
[AutoCloseable|https://docs.oracle.com/javase/tutorial/essential/exceptions/tryResourceClose.html]
 somehow to ensure the state is cleaned up?

Do you think we could improve the tests in AutoSavingCacheTest...

Regarding the deadline approach ... implement it in the while loop in 
AutoSavingCache.loadSaved ...
{quote}
I will make these changes and upload the patches as soon as I can.

 





[jira] [Assigned] (CASSANDRA-13998) Cassandra stress distribution does not affect the result

2021-07-10 Thread Venkata Harikrishna Nukala (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala reassigned CASSANDRA-13998:
--

Assignee: (was: Venkata Harikrishna Nukala)

> Cassandra stress distribution does not affect the result
> 
>
> Key: CASSANDRA-13998
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13998
> Project: Cassandra
>  Issue Type: Task
>  Components: Tool/stress
> Environment: Windows 10
>Reporter: Mikhail Pliskovsky
>Priority: Low
> Fix For: 3.11.x
>
> Attachments: 13998-trunk.txt, cqlstress-example.yaml
>
>
> When testing my schema on a single-node cluster, I get identical data for 
> each stress-test run.
> I specify the following table and column spec in my cassandra-stress.yaml 
> file:
> {code:java}
> table_definition: |
>   CREATE TABLE files (
> id uuid PRIMARY KEY,
> data blob
>   ) 
> columnspec:
>   - name: data
> size: UNIFORM(10..100)
> population: UNIFORM(1..100B)
> {code}
> But when I query the table rows after the test, each row contains an 
> identical string.
> Command to run the test:
> {code:java}
> cassandra-stress user profile=..\cqlstress-example.yaml n=20 ops(insert=5) 
> -rate threads=8
> {code}
> What am I doing wrong? 
> I want the data to have variable length.






[jira] [Commented] (CASSANDRA-14898) Key cache loading is very slow when there are many SSTables

2021-03-06 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17296524#comment-17296524
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14898:


[~jolynch] Sorry it took a long time to get back to this. I took your patch and 
tried it in my local env (with even more SSTables: around 11k+) and it 
performed better than my patch. It loaded 400k+ keys within a second, whereas 
my code changes took 8 to 9 seconds. That performance gain is worth maintaining 
state for, so I am fine with dropping my patch. I did make a small change on 
top of your patch, similar to mine: replacing the find method with the generic 
collect method so that copying SSTables into a new map can be avoided 
([https://github.com/nvharikrishna/cassandra/commit/b26b5f17877b5d89698840e42a3c77a6629594f5]).
 Splitting the CacheSerializer (which you mentioned earlier) is probably an 
alternative, if not this solution.

I have also tried to time-bound the key and row cache loading by calling the 
get method with a specific timeout and cancelling the task if it couldn't 
finish (which may leave key cache entries in an invalid state if it is not 
cancelled before compaction starts). Cancelling it led to a serious problem: 
cancelling the task interrupted the deserializer -> DataInputStream -> 
ChannelProxy, which throws 
_java.nio.channels.ClosedByInterruptException_ wrapped as 
_org.apache.cassandra.io.FSReadError_. FSReadError is treated as a disk 
failure, and the instance gets stopped as per the disk failure policy. Pasting 
the stack trace here for reference. Given the performance improvement already 
achieved, I am not sure it is worth going down the path of modifying 
ChannelProxy to handle this case (not treating 
java.nio.channels.ClosedByInterruptException as an error), so I reverted those 
changes.
{code:java}
WARN [main] 2021-03-06 12:58:29,926 CassandraDaemon.java:346 - Cache did not 
load in given time. Cancelled loading of key and row cache. isCancelled: true
ERROR [pool-3-thread-1] 2021-03-06 12:58:29,928 DefaultFSErrorHandler.java:104 
- Exiting forcefully due to file system exception on startup, disk failure 
policy "stop"
org.apache.cassandra.io.FSReadError: 
java.nio.channels.ClosedByInterruptException
 at org.apache.cassandra.io.util.ChannelProxy.read(ChannelProxy.java:143)
 at 
org.apache.cassandra.io.util.SimpleChunkReader.readChunk(SimpleChunkReader.java:41)
 at 
org.apache.cassandra.io.util.ChecksummedRebufferer.rebuffer(ChecksummedRebufferer.java:45)
 at 
org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
 at 
org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
 at 
org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:90)
 at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
 at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
 at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
 at 
org.apache.cassandra.io.util.LengthAvailableInputStream.read(LengthAvailableInputStream.java:57)
 at java.io.DataInputStream.readFully(DataInputStream.java:195)
 at java.io.DataInputStream.readFully(DataInputStream.java:169)
 at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:433)
 at 
org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(CacheService.java:453)
 at 
org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:227)
 at org.apache.cassandra.cache.AutoSavingCache$3.call(AutoSavingCache.java:168)
 at org.apache.cassandra.cache.AutoSavingCache$3.call(AutoSavingCache.java:164)
 at 
com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
 at 
com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57)
 at 
com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.channels.ClosedByInterruptException: null
 at 
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
 at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:740)
 at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:721)
 at org.apache.cassandra.io.util.ChannelProxy.read(ChannelProxy.java:139)
 ... 22 common frames omitted{code}
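For reference, the abandoned time-bounding approach looked roughly like the following sketch (hypothetical names, not the actual patch). Waiting on the loading future with a timeout and then calling cancel(true) interrupts the loader thread, which is exactly what surfaces as ClosedByInterruptException in the stack trace above:

```java
import java.util.concurrent.*;

public class TimeBoundedLoad {
    /**
     * Wait up to timeoutMs for the loading task; cancel it on timeout.
     * cancel(true) interrupts the loading thread, and an interrupt during a
     * blocking NIO file-channel read closes the channel with
     * ClosedByInterruptException -- the failure mode described above.
     */
    static boolean loadWithDeadline(ExecutorService pool, Runnable loader, long timeoutMs) {
        Future<?> f = pool.submit(loader);
        try {
            f.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;                 // finished within the deadline
        } catch (TimeoutException e) {
            f.cancel(true);              // interrupts the loader thread
            return false;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }
}
```
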




[jira] [Commented] (CASSANDRA-14898) Key cache loading is very slow when there are many SSTables

2021-02-02 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17277761#comment-17277761
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14898:


[~jolynch] Taking a step back and re-looking at my patch. Give me some time.







[jira] [Comment Edited] (CASSANDRA-14898) Key cache loading is very slow when there are many SSTables

2021-01-19 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268097#comment-17268097
 ] 

Venkata Harikrishna Nukala edited comment on CASSANDRA-14898 at 1/19/21, 6:04 
PM:
--

[~jolynch] 
{quote}is the gist of the patch that we skip copying the sstables into a new 
hash set (by just pushing the filter down to the View) but still perform a 
{{O(n)}} scan over those tables just within the View?
{quote}
Yes, I wanted to avoid copying lots of SSTables into a hash set when we need 
only one. The View's _find_ method doesn't do an O(n) scan every time: in the 
normal case it stops at the first entry (all SSTables' generations would be the 
same). I think it only needs an O(n) scan per entry in the upgrade case, when 
the SSTable with the required generation is at the end of the list. Added a 60s 
timeout for _get_ to avoid waiting forever.
{quote}I guess the tradeoff is making the CacheService implementations 
potentially stateful (introducing a new contract that AutoSavingCache will call 
a function at the end)?
{quote}
I thought of avoiding state across calls. The _CacheService deserialize_ method 
is called for each entry. Things would have been easier if the _deserialize_ 
method returned all keys instead of a single key-value pair (then there would 
be no need to maintain state across calls).




[jira] [Commented] (CASSANDRA-14898) Key cache loading is very slow when there are many SSTables

2021-01-19 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17268097#comment-17268097
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14898:


[~jolynch] 
{quote}is the gist of the patch that we skip copying the sstables into a new 
hash set (by just pushing the filter down to the View) but still perform a 
{{O(n)}} scan over those tables just within the View?
{quote}
Yes, I wanted to avoid copying lots of SSTables into a hash set when we need 
only one. The View's _find_ method doesn't do an O(n) scan every time: in the 
normal case it stops at the first entry (all SSTables' generations would be the 
same). I think it only needs an O(n) scan per entry in the upgrade case, when 
the SSTable with the required generation is at the end of the list. Added a 60s 
timeout for _get_ to avoid waiting forever.
{quote}I guess the tradeoff is making the CacheService implementations 
potentially stateful (introducing a new contract that AutoSavingCache will call 
a function at the end)?
{quote}
I thought of avoiding state across calls. The _CacheService deserialize_ method 
is called for each entry. Things would have been easier if the _deserialize_ 
method returned all keys instead of a single key-value pair (then there would 
be no need to maintain state across calls).


[jira] [Commented] (CASSANDRA-14898) Key cache loading is very slow when there are many SSTables

2021-01-18 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267071#comment-17267071
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14898:


[~jolynch] I took your micro bench test and added it to my branch: 
[https://github.com/nvharikrishna/cassandra/commits/14898-keycache-performance-fix]
 . I ran the test on my laptop without and with my changes, and here are the 
results:

 

Without fix:
{noformat}
[java] Result 
"org.apache.cassandra.test.microbench.CacheLoaderBench.keyCacheLoadTest":
 [java] N = 33
 [java] mean = 125.885 ±(99.9%) 5.067 ms/op
 [java]
 [java] Histogram, ms/op:
 [java] [110.000, 115.000) = 0
 [java] [115.000, 120.000) = 2
 [java] [120.000, 125.000) = 19
 [java] [125.000, 130.000) = 8
 [java] [130.000, 135.000) = 2
 [java] [135.000, 140.000) = 0
 [java] [140.000, 145.000) = 1
 [java] [145.000, 150.000) = 0
 [java] [150.000, 155.000) = 0
 [java] [155.000, 160.000) = 0
 [java] [160.000, 165.000) = 1
 [java]
 [java] Percentiles, ms/op:
 [java] p(0.0000) = 118.358 ms/op
 [java] p(50.0000) = 123.863 ms/op
 [java] p(90.0000) = 131.990 ms/op
 [java] p(95.0000) = 148.347 ms/op
 [java] p(99.0000) = 163.578 ms/op
 [java] p(99.9000) = 163.578 ms/op
 [java] p(99.9900) = 163.578 ms/op
 [java] p(99.9990) = 163.578 ms/op
 [java] p(99.9999) = 163.578 ms/op
 [java] p(100.0000) = 163.578 ms/op{noformat}
 

With fix:
{noformat}
[java] Result 
"org.apache.cassandra.test.microbench.CacheLoaderBench.keyCacheLoadTest":
 [java] N = 277
 [java] mean = 14.093 ±(99.9%) 0.530 ms/op
 [java]
 [java] Histogram, ms/op:
 [java] [10.000, 11.250) = 0
 [java] [11.250, 12.500) = 118
 [java] [12.500, 13.750) = 50
 [java] [13.750, 15.000) = 27
 [java] [15.000, 16.250) = 16
 [java] [16.250, 17.500) = 20
 [java] [17.500, 18.750) = 23
 [java] [18.750, 20.000) = 16
 [java] [20.000, 21.250) = 6
 [java] [21.250, 22.500) = 1
 [java] [22.500, 23.750) = 0
 [java] [23.750, 25.000) = 0
 [java] [25.000, 26.250) = 0
 [java] [26.250, 27.500) = 0
 [java] [27.500, 28.750) = 0
 [java]
 [java] Percentiles, ms/op:
 [java] p(0.0000) = 11.289 ms/op
 [java] p(50.0000) = 13.238 ms/op
 [java] p(90.0000) = 18.468 ms/op
 [java] p(95.0000) = 19.218 ms/op
 [java] p(99.0000) = 20.588 ms/op
 [java] p(99.9000) = 22.053 ms/op
 [java] p(99.9900) = 22.053 ms/op
 [java] p(99.9990) = 22.053 ms/op
 [java] p(99.9999) = 22.053 ms/op
 [java] p(100.0000) = 22.053 ms/op{noformat}
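Taken together, the two JMH runs above imply roughly a 9x improvement in mean time per operation. A quick sanity check of that arithmetic in plain Java (numbers copied from this comment; the class name is mine):

```java
public class KeyCacheBenchSpeedup {
    // Ratio of the two mean times per op (ms/op) reported by JMH above.
    static double speedup(double beforeMsPerOp, double afterMsPerOp) {
        return beforeMsPerOp / afterMsPerOp;
    }

    public static void main(String[] args) {
        // 125.885 ms/op without the fix vs 14.093 ms/op with it.
        System.out.printf("~%.1fx faster%n", speedup(125.885, 14.093)); // ~8.9x
    }
}
```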
 

> Key cache loading is very slow when there are many SSTables
> ---
>
> Key: CASSANDRA-14898
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14898
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
> Environment: AWS i3.2xlarge, 4 physical cores (8 threads), 60GB of 
> RAM, loading about 8MB of KeyCache with 10k keys in it.
>Reporter: Joey Lynch
>Assignee: Venkata Harikrishna Nukala
>Priority: Low
>  Labels: Performance, low-hanging-fruit
> Attachments: key_cache_load_slow.svg
>
>
> While dealing with a production issue today where some 3.0.17 nodes had close 
> to ~8k sstables on disk due to excessive write pressure, we had a few nodes 
> crash due to OOM and then they took close to 17 minutes to load the key cache 
> and recover. This excessive key cache load significantly increased the 
> duration of the outage (to mitigate we just removed the saved key cache 
> files). For example here is one example taking 17 minutes to load 10k keys, 
> or about 10 keys per second (which is ... very slow):
> {noformat}
> INFO  [pool-3-thread-1] 2018-11-15 21:50:21,885 AutoSavingCache.java:190 - 
> reading saved cache /mnt/data/cassandra/saved_caches/KeyCache-d.db
> INFO  [pool-3-thread-1] 2018-11-15 22:07:16,490 AutoSavingCache.java:166 - 
> Completed loading (1014606 ms; 10103 keys) KeyCache cache
> {noformat}
> I've witnessed similar behavior in the past with large LCS clusters, and 
> indeed it appears that any time the number of sstables is large, KeyCache 
> loading takes a _really_ long time. Today I got a flame graph and I believe 
> that I found the issue and I think it's reasonably easy to fix. From what I 
> can tell the {{KeyCacheSerializer::deserialize}} [method 
> |https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L445]
>  which is called for every key is linear in the number of sstables due to the 
> [call|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L459]
>  to {{ColumnFamilyStore::getSSTables}} which ends up calling {{View::select}} 
> [here|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/lifecycle/View.java#L139].
>  The {{View::select}} call is linear in the number of sstables and causes a 
> _lot_ of {{HashSet}} 
> [resizing|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/lifecycle/View.java#L139]
>  when the number of sstables is much greater than 16 (the default size of the 
> backing {{HashMap}}).
> As we see in the attached flamegraph we spend 50% of our CPU time in these 
> {{getSSTable}} calls, of which 36% is spent adding sstables to the HashSet in 
> {{View::select}} and 17% is spent just iterating the sstables in the first 
> place. A full 16% of CPU time is spent _just resizing the HashMap_. Then 
> another 4% is spent calling {{CacheService::findDesc}} which does [a linear 
> search|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L475]
>  for the sstable generation.
> I believe that this affects at least Cassandra 3.0.17 and trunk, and could be 
> pretty easily fixed by either caching the getSSTables call or at the very 
> least pre-sizing the {{HashSet}} in {{View::select}} to be the size of the 
> sstables map.
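The {{View::select}} call chain flagged above collects every sstable into a {{HashSet}} that starts at the default capacity, so with thousands of sstables the set rehashes many times per deserialized key. A plain-Java illustration of why pre-sizing the set helps (this is not Cassandra's code; the names are mine):

```java
import java.util.HashSet;
import java.util.Set;

// A HashSet created with the default capacity (16) rehashes repeatedly
// while collecting thousands of sstables; sizing it up front avoids that.
public class PreSizedSetDemo {

    // Smallest initial capacity that holds `expected` entries without a
    // resize, given HashMap's default load factor of 0.75.
    static int presizedCapacity(int expected) {
        return (int) Math.ceil(expected / 0.75);
    }

    public static void main(String[] args) {
        int sstableCount = 8000; // roughly what the affected nodes had on disk
        Set<Integer> sstables = new HashSet<>(presizedCapacity(sstableCount));
        for (int generation = 0; generation < sstableCount; generation++)
            sstables.add(generation); // no rehashing along the way
        System.out.println(sstables.size()); // 8000
    }
}
```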

[jira] [Assigned] (CASSANDRA-14898) Key cache loading is very slow when there are many SSTables

2021-01-17 Thread Venkata Harikrishna Nukala (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala reassigned CASSANDRA-14898:
--

Assignee: Venkata Harikrishna Nukala  (was: Joey Lynch)




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14898) Key cache loading is very slow when there are many SSTables

2021-01-17 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267041#comment-17267041
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14898:


Changes are here: 

[https://github.com/nvharikrishna/cassandra/commit/c090ed640440b284686131ac73b3a6295cea27c0]

Circle CI: 
[https://app.circleci.com/pipelines/github/nvharikrishna/cassandra?branch=14898-keycache-performance-fix]

 

Tested locally on my laptop with 5000+ SSTables; these are the results 
observed:

Before:

 
{code:java}
Test 1:
INFO [pool-3-thread-1] 2021-01-07 15:40:24,767 AutoSavingCache.java:177 - 
Completed loading (148997 ms; 589886 keys) KeyCache cache
Test 2:
INFO [pool-3-thread-1] 2021-01-07 16:55:23,327 AutoSavingCache.java:177 - 
Completed loading (99441 ms; 396405 keys) KeyCache cache
{code}
 

After:
{code:java}
INFO [pool-3-thread-1] 2021-01-08 17:54:11,402 AutoSavingCache.java:177 - 
Completed loading (8502 ms; 401358 keys) KeyCache cache{code}
 

 





[jira] [Commented] (CASSANDRA-14898) Key cache loading is very slow when there are many SSTables

2021-01-15 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266161#comment-17266161
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14898:


[~jolynch] can I pick this ticket? I’m working on preparing a patch for this.







[jira] [Commented] (CASSANDRA-16380) KeyCache load performance issue during startup

2021-01-12 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-16380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263329#comment-17263329
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-16380:


I am preparing a patch for this issue.

> KeyCache load performance issue during startup
> --
>
> Key: CASSANDRA-16380
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16380
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Venkata Harikrishna Nukala
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
>
> Cassandra startup is blocked while loading the key cache.
> From org.apache.cassandra.service.CassandraDaemon#setup method:
> {code:java}
> try
> {
>  loadRowAndKeyCacheAsync().get();
> }
> catch (Throwable t)
> {
>  JVMStabilityInspector.inspectThrowable(t);
>  logger.warn("Error loading key or row cache", t);
> }{code}
> The key cache {{deserialize}} method fetches all CANONICAL SSTables and picks 
> one of them for each entry: 
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CacheService.java#L447.
>  When the key cache is relatively big and there are lots of SSTables (in the 
> thousands), loading the key cache takes a long time.
> Key cache loading performance can be improved, and a timeout can be added for it.
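One way to avoid the per-entry sstable scan described above is to materialize a generation-to-sstable index once, before iterating the saved cache entries. The types and names in this sketch are illustrative stand-ins, not Cassandra's actual API:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KeyCacheLoadSketch {
    // Stand-in for an sstable reference keyed by its generation number.
    static class SSTable {
        final int generation;
        SSTable(int generation) { this.generation = generation; }
    }

    // Build the generation -> sstable index once, before the load loop,
    // instead of running a getSSTables()-style scan for every cache entry.
    static Map<Integer, SSTable> indexByGeneration(List<SSTable> sstables) {
        Map<Integer, SSTable> byGen = new HashMap<>((int) (sstables.size() / 0.75f) + 1);
        for (SSTable s : sstables)
            byGen.put(s.generation, s);
        return byGen;
    }

    public static void main(String[] args) {
        List<SSTable> all = List.of(new SSTable(1), new SSTable(2), new SSTable(3));
        Map<Integer, SSTable> byGen = indexByGeneration(all); // built once
        // Per cache entry: O(1) lookup instead of scanning all sstables.
        System.out.println(byGen.get(2).generation); // prints 2
    }
}
```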






[jira] [Created] (CASSANDRA-16380) KeyCache load performance issue during startup

2021-01-12 Thread Venkata Harikrishna Nukala (Jira)
Venkata Harikrishna Nukala created CASSANDRA-16380:
--

 Summary: KeyCache load performance issue during startup
 Key: CASSANDRA-16380
 URL: https://issues.apache.org/jira/browse/CASSANDRA-16380
 Project: Cassandra
  Issue Type: Improvement
Reporter: Venkata Harikrishna Nukala
Assignee: Venkata Harikrishna Nukala


Cassandra startup is blocked while loading the key cache.

From org.apache.cassandra.service.CassandraDaemon#setup method:
{code:java}
try
{
 loadRowAndKeyCacheAsync().get();
}
catch (Throwable t)
{
 JVMStabilityInspector.inspectThrowable(t);
 logger.warn("Error loading key or row cache", t);
}{code}

The key cache {{deserialize}} method fetches all CANONICAL SSTables and picks 
one of them for each entry: 
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CacheService.java#L447.
 When the key cache is relatively big and there are lots of SSTables (in the 
thousands), loading the key cache takes a long time.

Key cache loading performance can be improved, and a timeout can be added for it.






[jira] [Commented] (CASSANDRA-15890) Add token to tombstone warning and error log message

2020-06-23 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143220#comment-17143220
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-15890:


[~brandon.williams] Thanks for looking into it and for the links :) .

Unit tests and dtests in Circle CI passed for trunk and 3.0, but a SASI index-related 
test failed for 3.11 which is not related to these changes. Tests also failed in 
the ci-cassandra.apache.org Jenkins build (sorry, I have no idea about this 
environment and am not sure how to re-run/fix them). How should I proceed?

> Add token to tombstone warning and error log message
> 
>
> Key: CASSANDRA-15890
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15890
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability/Logging
>Reporter: Venkata Harikrishna Nukala
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 3.0.21, 3.11.7, 4.0
>
>
> If Cassandra scans too many tombstones while reading a partition, then it 
> prints log messages with query based on warning/failure thresholds. The token 
> is not printed in the log message. If tombstones are hurting the 
> instance/replica set, then running force compaction for the partition 
> ("nodetool compact" using start and end tokens i.e. token -/+ some delta) is 
> one of the actions taken to recover. In order to find out the token, someone 
> has to manually connect to cluster and run SELECT TOKEN query. Printing token 
> with the log message helps to avoid manual effort and execute force 
> compaction quickly.






[jira] [Commented] (CASSANDRA-15890) Add token to tombstone warning and error log message

2020-06-22 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17142349#comment-17142349
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-15890:


Here are the patches:

*Trunk*

Git: 
[https://github.com/apache/cassandra/compare/trunk...nvharikrishna:15890-trunk?expand=1]

CI: 
[https://app.circleci.com/pipelines/github/nvharikrishna/cassandra?branch=15890-trunk]

*3.11*

Git: 
[https://github.com/apache/cassandra/compare/cassandra-3.11...nvharikrishna:15890-cassandra-3.11?expand=1]

CI: 
[https://app.circleci.com/pipelines/github/nvharikrishna/cassandra?branch=15890-cassandra-3.11]

*3.0*

Git: 
[https://github.com/apache/cassandra/compare/cassandra-3.0...nvharikrishna:15890-cassandra-3.0?expand=1]

CI: 
[https://app.circleci.com/pipelines/github/nvharikrishna/cassandra?branch=15890-cassandra-3.0]

 

CI is still running for these branches. I think it is going to take some more 
time. I will update once CI passes. 







[jira] [Commented] (CASSANDRA-15890) Add token to tombstone warning and error log message

2020-06-22 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17142319#comment-17142319
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-15890:


Yes, I am preparing a patch and will post it soon.







[jira] [Created] (CASSANDRA-15890) Add token to tombstone warning and error log message

2020-06-22 Thread Venkata Harikrishna Nukala (Jira)
Venkata Harikrishna Nukala created CASSANDRA-15890:
--

 Summary: Add token to tombstone warning and error log message
 Key: CASSANDRA-15890
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15890
 Project: Cassandra
  Issue Type: Improvement
  Components: Observability/Logging
Reporter: Venkata Harikrishna Nukala
Assignee: Venkata Harikrishna Nukala


If Cassandra scans too many tombstones while reading a partition, then it 
prints log messages with query based on warning/failure thresholds. The token 
is not printed in the log message. If tombstones are hurting the 
instance/replica set, then running force compaction for the partition 
("nodetool compact" using start and end tokens i.e. token -/+ some delta) is 
one of the actions taken to recover. In order to find out the token, someone 
has to manually connect to cluster and run SELECT TOKEN query. Printing token 
with the log message helps to avoid manual effort and execute force compaction 
quickly.
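A hypothetical sketch of what the enriched log line could look like. The method name, message format, and token value below are invented for illustration; this is not the actual Cassandra logging code:

```java
public class TombstoneWarnSketch {
    // Format the tombstone-threshold warning with the partition's token
    // appended, so an operator can run a targeted "nodetool compact"
    // without first connecting to the cluster to run a SELECT TOKEN query.
    static String warnMessage(String query, long token, int tombstones, int threshold) {
        return String.format(
            "Read %d tombstones (threshold %d) for query %s (token: %d)",
            tombstones, threshold, query, token);
    }

    public static void main(String[] args) {
        System.out.println(
            warnMessage("SELECT * FROM ks.t WHERE pk = 1", -4069959284402364209L, 1500, 1000));
    }
}
```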






[jira] [Commented] (CASSANDRA-14781) Log message when mutation passed to CommitLog#add(Mutation) is too large is not descriptive enough

2020-05-05 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17100077#comment-17100077
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14781:


Thanks a lot [~jwest] for taking it forward!!

> Log message when mutation passed to CommitLog#add(Mutation) is too large is 
> not descriptive enough
> --
>
> Key: CASSANDRA-14781
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14781
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Hints, Local/Commit Log, Messaging/Client
>Reporter: Jordan West
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
>  Labels: protocolv5
> Fix For: 4.0-beta
>
> Attachments: CASSANDRA-14781.patch, CASSANDRA-14781_3.0.patch, 
> CASSANDRA-14781_3.11.patch
>
>
> When hitting 
> [https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/commitlog/CommitLog.java#L256-L257],
>  the log message produced does not help the operator track down what data is 
> being written. At a minimum the keyspace and cfIds involved would be useful 
> (and are available) – more detail might not be reasonable to include. 






[jira] [Commented] (CASSANDRA-14781) Log message when mutation passed to CommitLog#add(Mutation) is too large is not descriptive enough

2020-04-19 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17087386#comment-17087386
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14781:


[~jrwest] Raised CASSANDRA-15741 for validating and/or fixing the client timeout 
when a mutation exceeds the max size.

> Log message when mutation passed to CommitLog#add(Mutation) is too large is 
> not descriptive enough
> --
>
> Key: CASSANDRA-14781
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14781
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Hints, Local/Commit Log, Messaging/Client
>Reporter: Jordan West
>Assignee: Tom Petracca
>Priority: Normal
>  Labels: protocolv5
> Fix For: 4.0-beta
>
> Attachments: CASSANDRA-14781.patch, CASSANDRA-14781_3.0.patch, 
> CASSANDRA-14781_3.11.patch
>
>
> When hitting 
> [https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/commitlog/CommitLog.java#L256-L257],
>  the log message produced does not help the operator track down what data is 
> being written. At a minimum the keyspace and cfIds involved would be useful 
> (and are available) – more detail might not be reasonable to include. 






[jira] [Created] (CASSANDRA-15741) Mutation size exceeds max limit - clients get timeout?

2020-04-19 Thread Venkata Harikrishna Nukala (Jira)
Venkata Harikrishna Nukala created CASSANDRA-15741:
--

 Summary: Mutation size exceeds max limit - clients get timeout?
 Key: CASSANDRA-15741
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15741
 Project: Cassandra
  Issue Type: Task
Reporter: Venkata Harikrishna Nukala
Assignee: Venkata Harikrishna Nukala


Raising this ticket based on the discussion in CASSANDRA-14781, to validate 
that the coordinator returns a timeout when the mutation size exceeds the maximum 
limit (need to add a jvm-dtest to confirm). If it throws a timeout or any other 
exception which doesn't reflect the failure properly, then the response should be 
modified to throw a meaningful exception immediately.






[jira] [Updated] (CASSANDRA-15644) Schema/Query analyzer

2020-03-16 Thread Venkata Harikrishna Nukala (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-15644:
---
Description: 
This proposal is to build a schema/CQL analyzer which can help users analyze 
their queries before it is too late.
  
 A user may create stability issues by
 - Running expensive queries against the cluster, like SELECT *, queries without 
a WHERE clause, or IN clauses with many values
 - Creating not-so-optimal schemas
 - Leaving scope for data loss or schemas with performance issues (keyspace with 
durable writes set to false, table with many secondary indexes, etc.).
  
 Most of the times these Dos & Don'ts go into some knowledge base/documentation 
as best practices. Having rules for best practices (which user can execute 
against statements) can help to avoid bad schema/queries getting executed in 
cluster. The main idea is to enable the users to take corrective actions, by
 1) Allowing a user to validate a DDL/DML statements before it is 
applied/executed.
 2) Allowing a user to validate existing schema/queries.
  
 Imo, a validation result should:
 1. Have severity
 2. Tell where it hurts like instance/replica set/cluster.
 3. Tell if it causes data loss.
 4. Tell the strategy to recover.
  
 Few approaches I can think of:
 1. Write validation rules at server side + have a new type of statement to run 
validations (something like MySQL's EXPLAIN) and return validation 
results/errors.
 2. Keep validation rules in sidecar + expose a service to run validations. In 
this case user can submit his statements to this API and get validation results.
 3. Expose a UI in sidecar which accepts statements and run validations. 
Validation rules can be with UI or UI can make either of above options.
  
 Open for any other approach.
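Approach 1 above (server-side validation rules) could be sketched roughly as 
follows. All names here (ValidationRule, DurableWritesRule, etc.) are 
illustrative placeholders for this proposal, not existing Cassandra APIs:

```java
import java.util.ArrayList;
import java.util.List;

public class SchemaAnalyzerSketch {
    enum Severity { INFO, WARNING, CRITICAL }
    enum Scope { INSTANCE, REPLICA_SET, CLUSTER }

    // A result carries the four fields the proposal asks for:
    // severity, where it hurts, data-loss risk, and a recovery hint.
    record ValidationResult(Severity severity, Scope scope,
                            boolean dataLossRisk, String recoveryHint) {}

    interface ValidationRule {
        List<ValidationResult> validate(String cqlStatement);
    }

    // Example rule: flag keyspaces created with durable_writes = false.
    static class DurableWritesRule implements ValidationRule {
        @Override
        public List<ValidationResult> validate(String cql) {
            List<ValidationResult> results = new ArrayList<>();
            String normalized = cql.toLowerCase().replaceAll("\\s+", " ");
            if (normalized.contains("durable_writes") && normalized.contains("false"))
                results.add(new ValidationResult(Severity.CRITICAL, Scope.CLUSTER, true,
                        "Recreate the keyspace with durable_writes = true"));
            return results;
        }
    }

    public static int countFindings(String cql) {
        return new DurableWritesRule().validate(cql).size();
    }
}
```

A real implementation would parse the statement rather than match strings; the 
sketch only shows the shape of a rule and its result.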
  

  was:
This proposal is to build schema/cql analyser which can help users to analyze 
their queries before it is too late.
 
User may create stability issues by
- Running expensive queries against cluster like , SELECT * or query without 
where clause or IN clause with many values etc.
- Creating not so optimal schemas
- Leaving scope for data loss (keyspace with durable writes set to false or 
table with many secondary indexes etc...).
 
Most of the times these Dos & Don'ts go into some knowledge base/documentation 
as best practices. Having rules for best practices (which user can execute 
against statements) can help to avoid bad schema/queries getting executed in 
cluster. The main idea is to enable the users to take corrective actions, by
1) Allowing a user to validate a DDL/DML statements before it is 
applied/executed.
2) Allowing a user to validate existing schema/queries.
 
Imo, a validation result should:
1. Have severity
2. Tell where it hurts like instance/replica set/cluster.
3. Tell if it causes data loss.
4. Tell the strategy to recover.
 
Few approaches I can think of:
1. Write validation rules at server side + have a new type of statement to run 
validations (something like MySQL's EXPLAIN) and return validation 
results/errors.
2. Keep validation rules in sidecar + expose a service to run validations. In 
this case user can submit his statements to this API and get validation results.
3. Expose a UI in sidecar which accepts statements and run validations. 
Validation rules can be with UI or UI can make either of above options.
 
Open for any other approach.
 


> Schema/Query analyzer
> -
>
> Key: CASSANDRA-15644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15644
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Venkata Harikrishna Nukala
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
>
> This proposal is to build schema/cql analyser which can help users to analyze 
> their queries before it is too late.
>   
>  User may create stability issues by
>  - Running expensive queries against cluster like , SELECT * or query without 
> where clause or IN clause with many values etc.
>  - Creating not so optimal schemas
>  - Leaving scope for data loss or schema with performance issue (keyspace 
> with durable writes set to false or table with many secondary indexes etc...).
>   
>  Most of the times these Dos & Don'ts go into some knowledge 
> base/documentation as best practices. Having rules for best practices (which 
> user can execute against statements) can help to avoid bad schema/queries 
> getting executed in cluster. The main idea is to enable the users to take 
> corrective actions, by
>  1) Allowing a user to validate a DDL/DML statements before it is 
> applied/executed.
>  2) Allowing a user to validate existing schema/queries.
>   
>  Imo, a validation result should:
>  1. Have severity
>  2. Tell where it hurts like instance/replica set/cluster.
>  3. Tell if it causes data 

[jira] [Commented] (CASSANDRA-15644) Schema/Query analyzer

2020-03-16 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17060246#comment-17060246
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-15644:


As it is a new feature, I am not expecting it to be picked up for the 4.0 release.

> Schema/Query analyzer
> -
>
> Key: CASSANDRA-15644
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15644
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Venkata Harikrishna Nukala
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
>
> This proposal is to build schema/cql analyser which can help users to analyze 
> their queries before it is too late.
>  
> User may create stability issues by
> - Running expensive queries against cluster like , SELECT * or query without 
> where clause or IN clause with many values etc.
> - Creating not so optimal schemas
> - Leaving scope for data loss (keyspace with durable writes set to false or 
> table with many secondary indexes etc...).
>  
> Most of the times these Dos & Don'ts go into some knowledge 
> base/documentation as best practices. Having rules for best practices (which 
> user can execute against statements) can help to avoid bad schema/queries 
> getting executed in cluster. The main idea is to enable the users to take 
> corrective actions, by
> 1) Allowing a user to validate a DDL/DML statements before it is 
> applied/executed.
> 2) Allowing a user to validate existing schema/queries.
>  
> Imo, a validation result should:
> 1. Have severity
> 2. Tell where it hurts like instance/replica set/cluster.
> 3. Tell if it causes data loss.
> 4. Tell the strategy to recover.
>  
> Few approaches I can think of:
> 1. Write validation rules at server side + have a new type of statement to 
> run validations (something like MySQL's EXPLAIN) and return validation 
> results/errors.
> 2. Keep validation rules in sidecar + expose a service to run validations. In 
> this case user can submit his statements to this API and get validation 
> results.
> 3. Expose a UI in sidecar which accepts statements and run validations. 
> Validation rules can be with UI or UI can make either of above options.
>  
> Open for any other approach.
>  






[jira] [Created] (CASSANDRA-15644) Schema/Query analyzer

2020-03-16 Thread Venkata Harikrishna Nukala (Jira)
Venkata Harikrishna Nukala created CASSANDRA-15644:
--

 Summary: Schema/Query analyzer
 Key: CASSANDRA-15644
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15644
 Project: Cassandra
  Issue Type: New Feature
Reporter: Venkata Harikrishna Nukala
Assignee: Venkata Harikrishna Nukala


This proposal is to build schema/cql analyser which can help users to analyze 
their queries before it is too late.
 
User may create stability issues by
- Running expensive queries against cluster like , SELECT * or query without 
where clause or IN clause with many values etc.
- Creating not so optimal schemas
- Leaving scope for data loss (keyspace with durable writes set to false or 
table with many secondary indexes etc...).
 
Most of the times these Dos & Don'ts go into some knowledge base/documentation 
as best practices. Having rules for best practices (which user can execute 
against statements) can help to avoid bad schema/queries getting executed in 
cluster. The main idea is to enable the users to take corrective actions, by
1) Allowing a user to validate a DDL/DML statements before it is 
applied/executed.
2) Allowing a user to validate existing schema/queries.
 
Imo, a validation result should:
1. Have severity
2. Tell where it hurts like instance/replica set/cluster.
3. Tell if it causes data loss.
4. Tell the strategy to recover.
 
Few approaches I can think of:
1. Write validation rules at server side + have a new type of statement to run 
validations (something like MySQL's EXPLAIN) and return validation 
results/errors.
2. Keep validation rules in sidecar + expose a service to run validations. In 
this case user can submit his statements to this API and get validation results.
3. Expose a UI in sidecar which accepts statements and run validations. 
Validation rules can be with UI or UI can make either of above options.
 
Open for any other approach.
 






[jira] [Commented] (CASSANDRA-14781) Log message when mutation passed to CommitLog#add(Mutation) is too large is not descriptive enough

2020-01-14 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17015369#comment-17015369
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14781:


[~jrwest]
{quote}A few code review comments below. I did want to discuss if we are going 
to address the user facing concerns Aleksey brought up in this ticket? The 
patch addresses the operators lack of visibility into keyspace/table/partitions 
but still results in timeouts for the user. Are we going to address those in a 
separate ticket? My thought is that something for the operators is better than 
no patch (having been blind in this situation before besides custom tools) but 
if the user facing changes require protocol changes we should probably fix it 
pre-4.0 like we have or plan to w/ other similar tickets – but that could still 
be in a separate ticket.
{quote}
I would prefer to have a separate ticket. +1 on having something better than no 
patch.
{quote}
* We also shouldn’t duplicate the implementations between counter and regular 
mutations
* validateSize: since the two implementations are identical you could move them 
to a default implementation in IMutation
{quote}
The validateSize implementations look similar, but they use different 
serializers (Mutation uses MutationSerializer and CounterMutation uses 
CounterMutationSerializer), which are not visible through the IMutation 
interface. The serializedSize() methods need memoization (i.e. the 
serializedSize* fields), because of which we cannot move them to the interface, 
where fields would be final. We had already ruled out the option of a separate 
class (i.e. SizeHolder). That leaves me thinking about two options:

1. I could not find any size validation for {{CounterMutation}}. If that is 
expected, or validation is not required there, we can drop the 
{{CounterMutation}} changes and call {{Mutation.validateSize}} directly 
(instead of defining it in {{IMutation}}). The disadvantage I see with this 
approach is that the caller has to be aware of the concrete implementation 
rather than {{IMutation}}, which makes things harder to abstract.

2. Except for {{VirtualMutation}}, mutations are expected to be serialized 
and/or deserialized. Provide serialize, serializedSize and deserialize methods 
as part of {{IMutation}} (so that we can abstract away direct usages of 
{{Mutation.serializer}} and {{CounterMutation.serializer}}), with an abstract 
class in between holding the common functionality.

Or else pay the price of duplicated code. What do you think?
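To illustrate why the memoized size cannot live in the interface (interface 
fields are implicitly static final), here is a minimal sketch of the 
abstract-class option. The class names are invented for illustration and the 
per-version caching is simplified away:

```java
public class MutationSizeSketch {
    interface IMutationLike {
        long serializedSize(int version);
    }

    // The abstract base holds the memoized size; each subclass plugs in its
    // own (expensive) serializer walk via computeSerializedSize.
    static abstract class AbstractMutation implements IMutationLike {
        private long cachedSize = -1; // -1 means "not yet computed"

        protected abstract long computeSerializedSize(int version);

        @Override
        public final long serializedSize(int version) {
            if (cachedSize < 0)
                cachedSize = computeSerializedSize(version);
            return cachedSize;
        }
    }

    static class RegularMutation extends AbstractMutation {
        int computeCalls = 0;

        @Override
        protected long computeSerializedSize(int version) {
            computeCalls++; // pretend this is the expensive serializer walk
            return 1024;
        }
    }

    public static int callsAfterTwoLookups() {
        RegularMutation m = new RegularMutation();
        m.serializedSize(10);
        m.serializedSize(10);
        return m.computeCalls; // memoized: computed only once
    }
}
```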
{quote}MaxMutationExceededException: the sort in #prepareMessage could get 
pretty expensive, is it necessary?
{quote}
In Mutation, I see that there is only one PartitionUpdate per table, and 
according to the Mutation.merge() logic, a mutation can only carry changes for 
one keyspace and one key. Even if there are multiple updates for different rows 
of the same partition, they are merged into a single PartitionUpdate.

When I ran a small test sorting a list of Longs (on a laptop with an i7, 6 
cores and 16 GB RAM), it took approximately 33 ms, 6 ms and 1 ms for 100K, 10K 
and 1K elements respectively.

Given the merge logic and these numbers, unless there are thousands of tables 
in a keyspace and all of them are updated at once, I don't see a scenario where 
sorting can hurt (i.e. where the sort takes more than 1-2 ms).
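The kind of sort being discussed can be sketched as follows: order the 
partition sizes descending and keep only the largest few for the error message. 
The names here are illustrative, not the patch's actual identifiers:

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class TopPartitionsSketch {
    record PartitionSize(String key, long bytes) {}

    // With at most one PartitionUpdate per table, the list is tiny, so this
    // sort costs next to nothing even though it runs on the error path.
    public static List<String> largestKeys(List<PartitionSize> sizes, int limit) {
        return sizes.stream()
                    .sorted(Comparator.comparingLong(PartitionSize::bytes).reversed())
                    .limit(limit)
                    .map(PartitionSize::key)
                    .collect(Collectors.toList());
    }
}
```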
{quote}It also looks like there is an edge case where “and more” will be added 
even when there aren’t more. Using listIterator.hasNext() instead of 
topPartitions.size() > 0 should fix that
{quote}
I moved the code into a separate function and added unit test cases; it is 
working as expected. Using listIterator.hasNext() caused a few tests to fail. 
Did I miss any scenario to test?

Converted the serializedSize* long fields to int as suggested by Aleksey. 
Changes are here: 
https://github.com/apache/cassandra/compare/trunk...nvharikrishna:14781-trunk?expand=1

 

> Log message when mutation passed to CommitLog#add(Mutation) is too large is 
> not descriptive enough
> --
>
> Key: CASSANDRA-14781
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14781
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Hints, Local/Commit Log, Messaging/Client
>Reporter: Jordan West
>Assignee: Tom Petracca
>Priority: Normal
>  Labels: protocolv5
> Fix For: 4.0-beta
>
> Attachments: CASSANDRA-14781.patch, CASSANDRA-14781_3.0.patch, 
> CASSANDRA-14781_3.11.patch
>
>
> When hitting 
> [https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/commitlog/CommitLog.java#L256-L257],
>  the log message produced does not help the operator track down what data is 
> being written. At a minimum the keyspace and cfIds 

[jira] [Updated] (CASSANDRA-12993) License headers missing in some source files

2020-01-08 Thread Venkata Harikrishna Nukala (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-12993:
---
Test and Documentation Plan: 
Changes are related to adding license text only. Ran unit test cases. Here is 
the CI: 
[https://app.circleci.com/github/nvharikrishna/cassandra/pipelines/257856a5-2c72-4efc-9162-6acba2c61c1f/workflows/e87fcb23-ff95-4ecc-a3f0-47bc5b44d64c]

 

 
 Status: Patch Available  (was: Open)

> License headers missing in some source files
> 
>
> Key: CASSANDRA-12993
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12993
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Tomas Repik
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> The following source files are without license headers:
>   doc/source/_static/extra.css
>   src/java/org/apache/cassandra/db/commitlog/IntervalSet.java
>   src/java/org/apache/cassandra/utils/IntegerInterval.java
>   test/unit/org/apache/cassandra/db/commitlog/CommitLogCQLTest.java
>   test/unit/org/apache/cassandra/utils/IntegerIntervalsTest.java
>   tools/stress/src/org/apache/cassandra/stress/WorkManager.java
> Could you please confirm the licensing of code and/or content/s, and add 
> license headers?






[jira] [Comment Edited] (CASSANDRA-12993) License headers missing in some source files

2020-01-08 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010962#comment-17010962
 ] 

Venkata Harikrishna Nukala edited comment on CASSANDRA-12993 at 1/8/20 7:36 PM:


Added license text for the files mentioned in the ticket description. Here is 
the patch on trunk: 
[https://github.com/nvharikrishna/cassandra/commit/96326245de7a5157452fb753075dff1870a2def6]

 

 


was (Author: n.v.harikrishna):
Added license text for the files mentioned above. Here is the patch on trunk: 
[https://github.com/nvharikrishna/cassandra/commit/96326245de7a5157452fb753075dff1870a2def6]

 

 

> License headers missing in some source files
> 
>
> Key: CASSANDRA-12993
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12993
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Tomas Repik
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> The following source files are without license headers:
>   doc/source/_static/extra.css
>   src/java/org/apache/cassandra/db/commitlog/IntervalSet.java
>   src/java/org/apache/cassandra/utils/IntegerInterval.java
>   test/unit/org/apache/cassandra/db/commitlog/CommitLogCQLTest.java
>   test/unit/org/apache/cassandra/utils/IntegerIntervalsTest.java
>   tools/stress/src/org/apache/cassandra/stress/WorkManager.java
> Could you please confirm the licensing of code and/or content/s, and add 
> license headers?






[jira] [Commented] (CASSANDRA-12993) License headers missing in some source files

2020-01-08 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010962#comment-17010962
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-12993:


Added license text for the files mentioned above. Here is the patch on trunk: 
[https://github.com/nvharikrishna/cassandra/commit/96326245de7a5157452fb753075dff1870a2def6]

 

 

> License headers missing in some source files
> 
>
> Key: CASSANDRA-12993
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12993
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Tomas Repik
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> The following source files are without license headers:
>   doc/source/_static/extra.css
>   src/java/org/apache/cassandra/db/commitlog/IntervalSet.java
>   src/java/org/apache/cassandra/utils/IntegerInterval.java
>   test/unit/org/apache/cassandra/db/commitlog/CommitLogCQLTest.java
>   test/unit/org/apache/cassandra/utils/IntegerIntervalsTest.java
>   tools/stress/src/org/apache/cassandra/stress/WorkManager.java
> Could you please confirm the licensing of code and/or content/s, and add 
> license headers?






[jira] [Commented] (CASSANDRA-14781) Log message when mutation passed to CommitLog#add(Mutation) is too large is not descriptive enough

2020-01-08 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010954#comment-17010954
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14781:


[~jrwest] Made the changes. 

Here is the [updated 
patch|https://github.com/nvharikrishna/cassandra/commit/1eb9a9846187f669516c88c85fa3550e4efb08f7]
 and [CI|https://app.circleci.com/jobs/github/nvharikrishna/cassandra/120]. 

Summary of changes:
 * Removed SizeHolder
 * MutationExceededMaxSizeException
 ** Avoided calculating the size again.
 ** Changed the constant limiting the number of keys to a limit on the size of 
the message. We are mostly concerned about dumping a huge message to the log; 
the number of keys to log has to vary with their size, so there is no ideal 
config expressed as a key count. Changed it to a message-size limit (1 KB for 
now; we can increase it further).
 * IMutation
 ** Removed getMaxMutationSize and replaced it with the constant from CommitLog.
 * Replaced Mutation.serializer.serializedSize with mutation.serializedSize.
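The message-size cap described above could be sketched like this: keep 
appending partition keys until the rendered message would exceed the byte 
budget, then stop and note the truncation. The identifiers are illustrative, 
not the patch's actual code, and the sketch counts characters as a stand-in 
for encoded bytes:

```java
import java.util.List;

public class MutationLogMessageSketch {
    static final int MAX_MESSAGE_BYTES = 1024; // 1 KB budget, as in the summary

    public static String describe(List<String> partitionKeys) {
        StringBuilder sb = new StringBuilder();
        int included = 0;
        for (String key : partitionKeys) {
            String next = (included == 0 ? "" : ", ") + key;
            // Stop once the next key would blow the budget, so a mutation with
            // many (or very large) keys cannot flood the log.
            if (sb.length() + next.length() > MAX_MESSAGE_BYTES) {
                sb.append(" and more");
                break;
            }
            sb.append(next);
            included++;
        }
        return sb.toString();
    }
}
```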

> Log message when mutation passed to CommitLog#add(Mutation) is too large is 
> not descriptive enough
> --
>
> Key: CASSANDRA-14781
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14781
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Hints, Local/Commit Log, Messaging/Client
>Reporter: Jordan West
>Assignee: Tom Petracca
>Priority: Normal
>  Labels: protocolv5
> Fix For: 4.0-beta
>
> Attachments: CASSANDRA-14781.patch, CASSANDRA-14781_3.0.patch, 
> CASSANDRA-14781_3.11.patch
>
>
> When hitting 
> [https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/commitlog/CommitLog.java#L256-L257],
>  the log message produced does not help the operator track down what data is 
> being written. At a minimum the keyspace and cfIds involved would be useful 
> (and are available) – more detail might not be reasonable to include. 






[jira] [Commented] (CASSANDRA-12993) License headers missing in some source files

2020-01-07 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010385#comment-17010385
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-12993:


Taking it. Will update the patch soon.

> License headers missing in some source files
> 
>
> Key: CASSANDRA-12993
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12993
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Tomas Repik
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> The following source files are without license headers:
>   doc/source/_static/extra.css
>   src/java/org/apache/cassandra/db/commitlog/IntervalSet.java
>   src/java/org/apache/cassandra/utils/IntegerInterval.java
>   test/unit/org/apache/cassandra/db/commitlog/CommitLogCQLTest.java
>   test/unit/org/apache/cassandra/utils/IntegerIntervalsTest.java
>   tools/stress/src/org/apache/cassandra/stress/WorkManager.java
> Could you please confirm the licensing of code and/or content/s, and add 
> license headers?






[jira] [Assigned] (CASSANDRA-12993) License headers missing in some source files

2020-01-07 Thread Venkata Harikrishna Nukala (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala reassigned CASSANDRA-12993:
--

Assignee: Venkata Harikrishna Nukala

> License headers missing in some source files
> 
>
> Key: CASSANDRA-12993
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12993
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Packaging
>Reporter: Tomas Repik
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> The following source files are without license headers:
>   doc/source/_static/extra.css
>   src/java/org/apache/cassandra/db/commitlog/IntervalSet.java
>   src/java/org/apache/cassandra/utils/IntegerInterval.java
>   test/unit/org/apache/cassandra/db/commitlog/CommitLogCQLTest.java
>   test/unit/org/apache/cassandra/utils/IntegerIntervalsTest.java
>   tools/stress/src/org/apache/cassandra/stress/WorkManager.java
> Could you please confirm the licensing of code and/or content/s, and add 
> license headers?






[jira] [Commented] (CASSANDRA-14781) Log message when mutation passed to CommitLog#add(Mutation) is too large is not descriptive enough

2020-01-07 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17009986#comment-17009986
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14781:


{quote}Since the only call to #getSize() currently is IMutation#validateSize 
which is only called once from CommitLog.java. Consider removing the 
memoization.
{quote}
Size is also validated in the 
org.apache.cassandra.service.reads.repair.BlockingReadRepairs#createRepairMutation
 method, which may use a different serialization version depending on the 
destination, so I feel memoization is required. I somehow missed including the 
changes for this method in the patch; I am including them while addressing the 
other review comments and will update the patch asap.

> Log message when mutation passed to CommitLog#add(Mutation) is too large is 
> not descriptive enough
> --
>
> Key: CASSANDRA-14781
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14781
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Hints, Local/Commit Log, Messaging/Client
>Reporter: Jordan West
>Assignee: Tom Petracca
>Priority: Normal
>  Labels: protocolv5
> Fix For: 4.0-beta
>
> Attachments: CASSANDRA-14781.patch, CASSANDRA-14781_3.0.patch, 
> CASSANDRA-14781_3.11.patch
>
>
> When hitting 
> [https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/commitlog/CommitLog.java#L256-L257],
>  the log message produced does not help the operator track down what data is 
> being written. At a minimum the keyspace and cfIds involved would be useful 
> (and are available) – more detail might not be reasonable to include. 






[jira] [Commented] (CASSANDRA-14781) Log message when mutation passed to CommitLog#add(Mutation) is too large is not descriptive enough

2020-01-06 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17008622#comment-17008622
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14781:


[~jrwest] Made a patch with the changes. [~tpetracca] apologies if I am 
taking over from you.

Here is the 
[patch|https://github.com/nvharikrishna/cassandra/commit/5b3af390ce64860505dfeb3a3549cc9897987771]
 and [CI|https://app.circleci.com/jobs/github/nvharikrishna/cassandra/95].

I have a small question though. In 
[CommitLog.java|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/commitlog/CommitLog.java#L274],
 the commit log entry overhead is added to the mutation size when comparing 
against the max mutation size. Shouldn't only the mutation size be considered? 
I made my changes compatible with the overhead (I can change this based on 
your comments).

The {{SizeHolder}} introduced in this patch could be used in {{Message}} as 
well. I can make that change too if it is okay to mix it in.

> Log message when mutation passed to CommitLog#add(Mutation) is too large is 
> not descriptive enough
> --
>
> Key: CASSANDRA-14781
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14781
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Hints, Local/Commit Log, Messaging/Client
>Reporter: Jordan West
>Assignee: Tom Petracca
>Priority: Normal
>  Labels: protocolv5
> Fix For: 4.0-beta
>
> Attachments: CASSANDRA-14781.patch, CASSANDRA-14781_3.0.patch, 
> CASSANDRA-14781_3.11.patch
>
>
> When hitting 
> [https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/commitlog/CommitLog.java#L256-L257],
>  the log message produced does not help the operator track down what data is 
> being written. At a minimum the keyspace and cfIds involved would be useful 
> (and are available) – more detail might not be reasonable to include. 






[jira] [Commented] (CASSANDRA-14781) Log message when mutation passed to CommitLog#add(Mutation) is too large is not descriptive enough

2020-01-02 Thread Venkata Harikrishna Nukala (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17007005#comment-17007005
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14781:


[~tpetracca] Would you mind if I submit a patch for this? I ran into this 
requirement and made some progress on the patch.

> Log message when mutation passed to CommitLog#add(Mutation) is too large is 
> not descriptive enough
> --
>
> Key: CASSANDRA-14781
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14781
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Hints, Local/Commit Log, Messaging/Client
>Reporter: Jordan West
>Assignee: Tom Petracca
>Priority: Normal
>  Labels: protocolv5
> Fix For: 4.0-beta
>
> Attachments: CASSANDRA-14781.patch, CASSANDRA-14781_3.0.patch, 
> CASSANDRA-14781_3.11.patch
>
>
> When hitting 
> [https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/commitlog/CommitLog.java#L256-L257],
>  the log message produced does not help the operator track down what data is 
> being written. At a minimum the keyspace and cfIds involved would be useful 
> (and are available) – more detail might not be reasonable to include. 






[jira] [Updated] (CASSANDRA-15223) OutboundTcpConnection leaks direct memory

2019-08-30 Thread Venkata Harikrishna Nukala (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-15223:
---
Test and Documentation Plan: Ran unit tests and tested basic things using 
ccm with compression enabled.
 Status: Patch Available  (was: Open)

> OutboundTcpConnection leaks direct memory
> -
>
> Key: CASSANDRA-15223
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15223
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Benedict
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> On disconnect we set {{out}} to null without first closing it



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Updated] (CASSANDRA-15227) Remove StageManager

2019-08-19 Thread Venkata Harikrishna Nukala (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-15227:
---
Test and Documentation Plan: Updated the patch in the same branch. Tested 
tpstats and verified the thread dump.  (was: Minor change. Added branch and CI 
link in the comments.)
 Status: Patch Available  (was: In Progress)

> Remove StageManager
> ---
>
> Key: CASSANDRA-15227
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15227
> Project: Cassandra
>  Issue Type: Task
>  Components: Local/Other
>Reporter: Benedict
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> This is a minor cleanup; this class should not exist, but should be embedded 
> in the Stage enum.






[jira] [Updated] (CASSANDRA-15227) Remove StageManager

2019-08-19 Thread Venkata Harikrishna Nukala (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-15227:
---
Status: In Progress  (was: Changes Suggested)

> Remove StageManager
> ---
>
> Key: CASSANDRA-15227
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15227
> Project: Cassandra
>  Issue Type: Task
>  Components: Local/Other
>Reporter: Benedict
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> This is a minor cleanup; this class should not exist, but should be embedded 
> in the Stage enum.






[jira] [Commented] (CASSANDRA-15227) Remove StageManager

2019-08-16 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908928#comment-16908928
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-15227:


[~benedict] Made the changes and updated the branch (squashed commits). 

Made changes to pass _ExecutorServiceInitialiser_ as the last constructor param to 
keep the enum easy to read. I tried passing the executor itself as the constructor 
param, but passing jmxName, jmxType, etc. as params looks more consistent and clean. 
Removed jmxName as a member variable from Stage. Using thread-instance comparison 
for _Gossiper_, as it needs to check whether the code is being executed by the 
executor thread or not. Created a separate class (_JMXEnabledSingleThreadExecutor_) 
instead of _JMXEnabledThreadPoolExecutor_, because the latter lets somebody change 
the core/max threads via JMX, which _JMXEnabledSingleThreadExecutor_ doesn't allow; 
now the Gossip stage literally uses only one thread. The _ANTI_ENTROPY_, _MIGRATION_ 
and _MISC_ stages also use a single thread and could use 
_JMXEnabledSingleThreadExecutor_, but I did not change them, because that is worth 
tracking as a separate change and there could be some debate around them.
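As a rough illustration of the constructor shape described above, here is a hypothetical sketch (not the actual patch): the initialiser comes last, and jmxName/jmxType are consumed by the constructor rather than stored as member variables. The names `Stage`, `GOSSIP` and `MISC` are simplified stand-ins.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical sketch: each constant passes its executor factory as the
// last constructor parameter, keeping the enum declaration easy to read.
enum Stage
{
    GOSSIP("GossipStage", "internal", jmxName -> Executors.newSingleThreadExecutor()),
    MISC("MiscStage", "internal", jmxName -> Executors.newSingleThreadExecutor());

    private final ExecutorService executor;

    Stage(String jmxName, String jmxType, Function<String, ExecutorService> initialiser)
    {
        // jmxName/jmxType would be used here (e.g. to register an MBean)
        // instead of being kept as fields on the enum.
        this.executor = initialiser.apply(jmxName);
    }

    public ExecutorService executor()
    {
        return executor;
    }
}
```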

> Remove StageManager
> ---
>
> Key: CASSANDRA-15227
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15227
> Project: Cassandra
>  Issue Type: Task
>  Components: Local/Other
>Reporter: Benedict
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> This is a minor cleanup; this class should not exist, but should be embedded 
> in the Stage enum.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15225) FileUtils.close() does not handle non-IOException

2019-07-31 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16897432#comment-16897432
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-15225:


1) Instead of modifying the already-thrown exception, it is better to create a new 
exception and call _newException_.addSuppressed(). When the caller calls 
_newException_.getSuppressed(), the accurate list of exceptions is returned.

2) The log statement has only one '{}', so I don't think it will print the 
exception. I'd prefer the caller to handle the logging part.
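A minimal sketch of the suggested pattern (illustrative only, not the actual FileUtils code; `CloseAll` is a hypothetical name): every resource is closed, failures are attached to one new exception via addSuppressed(), and the caller recovers the full list from getSuppressed().

```java
import java.io.Closeable;
import java.io.IOException;

// Illustrative sketch: close everything, collect all failures on one new
// exception, throw it at the end so no remaining item is skipped.
final class CloseAll
{
    static void close(Iterable<? extends Closeable> items) throws IOException
    {
        IOException failure = null;
        for (Closeable c : items)
        {
            try
            {
                if (c != null)
                    c.close();
            }
            catch (Throwable t) // also handles non-IOException failures
            {
                if (failure == null)
                    failure = new IOException("error(s) while closing resources");
                failure.addSuppressed(t);
            }
        }
        if (failure != null)
            throw failure;
    }
}
```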

> FileUtils.close() does not handle non-IOException
> -
>
> Key: CASSANDRA-15225
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15225
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Benedict
>Assignee: Liudmila Kornilova
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This can lead to {{close}} not being invoked on remaining items






[jira] [Updated] (CASSANDRA-15227) Remove StageManager

2019-07-26 Thread Venkata Harikrishna Nukala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-15227:
---
Test and Documentation Plan: Minor change. Added branch and CI link in the 
comments.
 Status: Patch Available  (was: In Progress)

> Remove StageManager
> ---
>
> Key: CASSANDRA-15227
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15227
> Project: Cassandra
>  Issue Type: Task
>  Components: Local/Other
>Reporter: Benedict
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> This is a minor cleanup; this class should not exist, but should be embedded 
> in the Stage enum.






[jira] [Commented] (CASSANDRA-15227) Remove StageManager

2019-07-26 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16893474#comment-16893474
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-15227:


Made changes here: [https://github.com/nvharikrishna/cassandra/tree/15227-trunk]

CI: [https://circleci.com/gh/nvharikrishna/cassandra/tree/15227-trunk]

 

> Remove StageManager
> ---
>
> Key: CASSANDRA-15227
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15227
> Project: Cassandra
>  Issue Type: Task
>  Components: Local/Other
>Reporter: Benedict
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> This is a minor cleanup; this class should not exist, but should be embedded 
> in the Stage enum.






[jira] [Commented] (CASSANDRA-15225) FileUtils.close() does not handle non-IOException

2019-07-18 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16888298#comment-16888298
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-15225:


[~Override] Instead of delivering the last exception as an IOException, can we use 
{{Throwable.addSuppressed()}}?

cc: [~benedict]

> FileUtils.close() does not handle non-IOException
> -
>
> Key: CASSANDRA-15225
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15225
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/SSTable
>Reporter: Benedict
>Assignee: Liudmila Kornilova
>Priority: Normal
>  Labels: pull-request-available
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This can lead to {{close}} not being invoked on remaining items






[jira] [Assigned] (CASSANDRA-15227) Remove StageManager

2019-07-17 Thread Venkata Harikrishna Nukala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala reassigned CASSANDRA-15227:
--

Assignee: Venkata Harikrishna Nukala

> Remove StageManager
> ---
>
> Key: CASSANDRA-15227
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15227
> Project: Cassandra
>  Issue Type: Task
>  Components: Local/Other
>Reporter: Benedict
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> This is a minor cleanup; this class should not exist, but should be embedded 
> in the Stage enum.






[jira] [Commented] (CASSANDRA-15223) OutboundTcpConnection leaks direct memory

2019-07-17 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887287#comment-16887287
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-15223:


Made changes for 3.0 and 3.11.

 

3.0 changes: 
[https://github.com/nvharikrishna/cassandra/tree/15223-cassandra-3.0]

Circle ci: 
[https://circleci.com/gh/nvharikrishna/cassandra/tree/15223-cassandra-3%2E0]

 

3.11 changes: 
[https://github.com/nvharikrishna/cassandra/tree/15223-cassandra-3.11]

Circle ci: 
[https://circleci.com/gh/nvharikrishna/cassandra/tree/15223-cassandra-3%2E11]

 

Closing {{out}} closes the output stream passed to it 
({{socket.getOutputStream()}}). Closing the socket's output stream closes the 
socket too, so just closing {{out}} is sufficient.

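This close cascade can be demonstrated in isolation with plain JDK streams (a standalone sketch, not the OutboundTcpConnection code): closing a wrapper stream also closes the stream it decorates.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Demonstrates that closing a wrapper stream closes the wrapped stream,
// which is why closing only the outermost stream is enough.
public final class CloseCascadeDemo
{
    static boolean underlyingClosed = false;

    public static void main(String[] args) throws IOException
    {
        OutputStream underlying = new OutputStream()
        {
            @Override public void write(int b) { /* discard */ }
            @Override public void close() { underlyingClosed = true; }
        };

        OutputStream out = new BufferedOutputStream(underlying);
        out.close(); // cascades: flushes, then closes 'underlying'

        System.out.println("underlying closed: " + underlyingClosed); // prints "underlying closed: true"
    }
}
```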
> OutboundTcpConnection leaks direct memory
> -
>
> Key: CASSANDRA-15223
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15223
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Benedict
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> On disconnect we set {{out}} to null without first closing it






[jira] [Assigned] (CASSANDRA-15223) OutboundTcpConnection leaks direct memory

2019-07-16 Thread Venkata Harikrishna Nukala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala reassigned CASSANDRA-15223:
--

Assignee: Venkata Harikrishna Nukala

> OutboundTcpConnection leaks direct memory
> -
>
> Key: CASSANDRA-15223
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15223
> Project: Cassandra
>  Issue Type: Bug
>  Components: Messaging/Internode
>Reporter: Benedict
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> On disconnect we set {{out}} to null without first closing it






[jira] [Commented] (CASSANDRA-15165) Reference-Reaper detected leak while running FramingTest unit test cases

2019-06-17 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16865941#comment-16865941
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-15165:


Here are the patch details:

[15165-trunk|https://github.com/nvharikrishna/cassandra/tree/15165-trunk] 
[CircleCi|https://circleci.com/gh/nvharikrishna/cassandra/18]

 

> Reference-Reaper detected leak while running FramingTest unit test cases
> 
>
> Key: CASSANDRA-15165
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15165
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Venkata Harikrishna Nukala
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> Reference-Reaper detected leak while running FramingTest unit test cases. 
> Here are the leak details:
> {code}
> [junit-timeout] ERROR [Reference-Reaper] 2019-06-17 01:44:53,812 Ref.java:228 
> - LEAK DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@15460327) to @876994034 was 
> not released before the reference was garbage collected
> [junit-timeout] ERROR [Reference-Reaper] 2019-06-17 01:44:53,812 Ref.java:259 
> - Allocate trace org.apache.cassandra.utils.concurrent.Ref$State@15460327:
> [junit-timeout] Thread[main,5,main]
> [junit-timeout]   at java.lang.Thread.getStackTrace(Thread.java:1559)
> [junit-timeout]   at 
> org.apache.cassandra.utils.concurrent.Ref$Debug.(Ref.java:249)
> [junit-timeout]   at 
> org.apache.cassandra.utils.concurrent.Ref$State.(Ref.java:179)
> [junit-timeout]   at 
> org.apache.cassandra.utils.concurrent.Ref.(Ref.java:101)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$Chunk.setAttachment(BufferPool.java:960)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$Chunk.set(BufferPool.java:1100)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$Chunk.get(BufferPool.java:1090)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.tryGetInternal(BufferPool.java:721)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.tryGet(BufferPool.java:706)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.get(BufferPool.java:656)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.access$000(BufferPool.java:535)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool.getAtLeast(BufferPool.java:129)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.sequenceOfMessages(FramingTest.java:413)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.testRandomSequenceOfMessages(FramingTest.java:265)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.testSomeMessages(FramingTest.java:259)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.testRandomLegacy(FramingTest.java:243)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.testRandomLegacy(FramingTest.java:234)
> [junit-timeout]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> [junit-timeout]   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit-timeout]   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit-timeout]   at java.lang.reflect.Method.invoke(Method.java:498)
> [junit-timeout]   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> [junit-timeout]   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> [junit-timeout]   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> [junit-timeout]   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> [junit-timeout]   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> [junit-timeout]   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> [junit-timeout]   at 
>

[jira] [Issue Comment Deleted] (CASSANDRA-15165) Reference-Reaper detected leak while running FramingTest unit test cases

2019-06-17 Thread Venkata Harikrishna Nukala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-15165:
---
Comment: was deleted

(was: Made the changes. Here are the patch details:

Patch: 
[15165|https://github.pie.apple.com/hnukala/hnukala-cassandra/compare/trunk...framingtest-leak-fix?expand=1]
 [CircleCi|https://circleci.com/gh/nvharikrishna/cassandra/18]

Two test cases are failing in it, but it doesn't look like they are related to 
these changes.)

> Reference-Reaper detected leak while running FramingTest unit test cases
> 
>
> Key: CASSANDRA-15165
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15165
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Venkata Harikrishna Nukala
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> Reference-Reaper detected leak while running FramingTest unit test cases. 
> Here are the leak details:
> {code}
> [junit-timeout] ERROR [Reference-Reaper] 2019-06-17 01:44:53,812 Ref.java:228 
> - LEAK DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@15460327) to @876994034 was 
> not released before the reference was garbage collected
> [junit-timeout] ERROR [Reference-Reaper] 2019-06-17 01:44:53,812 Ref.java:259 
> - Allocate trace org.apache.cassandra.utils.concurrent.Ref$State@15460327:
> [junit-timeout] Thread[main,5,main]
> [junit-timeout]   at java.lang.Thread.getStackTrace(Thread.java:1559)
> [junit-timeout]   at 
> org.apache.cassandra.utils.concurrent.Ref$Debug.(Ref.java:249)
> [junit-timeout]   at 
> org.apache.cassandra.utils.concurrent.Ref$State.(Ref.java:179)
> [junit-timeout]   at 
> org.apache.cassandra.utils.concurrent.Ref.(Ref.java:101)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$Chunk.setAttachment(BufferPool.java:960)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$Chunk.set(BufferPool.java:1100)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$Chunk.get(BufferPool.java:1090)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.tryGetInternal(BufferPool.java:721)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.tryGet(BufferPool.java:706)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.get(BufferPool.java:656)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.access$000(BufferPool.java:535)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool.getAtLeast(BufferPool.java:129)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.sequenceOfMessages(FramingTest.java:413)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.testRandomSequenceOfMessages(FramingTest.java:265)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.testSomeMessages(FramingTest.java:259)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.testRandomLegacy(FramingTest.java:243)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.testRandomLegacy(FramingTest.java:234)
> [junit-timeout]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> [junit-timeout]   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit-timeout]   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit-timeout]   at java.lang.reflect.Method.invoke(Method.java:498)
> [junit-timeout]   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> [junit-timeout]   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> [junit-timeout]   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> [junit-timeout]   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> [junit-timeout]   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> [junit-timeout]   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 

[jira] [Commented] (CASSANDRA-15165) Reference-Reaper detected leak while running FramingTest unit test cases

2019-06-17 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16865930#comment-16865930
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-15165:


Made the changes. Here are the patch details:

Patch: 
[15165|https://github.pie.apple.com/hnukala/hnukala-cassandra/compare/trunk...framingtest-leak-fix?expand=1]
 [CircleCi|https://circleci.com/gh/nvharikrishna/cassandra/18]

Two test cases are failing in it, but it doesn't look like they are related to 
these changes.

> Reference-Reaper detected leak while running FramingTest unit test cases
> 
>
> Key: CASSANDRA-15165
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15165
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/unit
>Reporter: Venkata Harikrishna Nukala
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
>
> Reference-Reaper detected leak while running FramingTest unit test cases. 
> Here are the leak details:
> {code}
> [junit-timeout] ERROR [Reference-Reaper] 2019-06-17 01:44:53,812 Ref.java:228 
> - LEAK DETECTED: a reference 
> (org.apache.cassandra.utils.concurrent.Ref$State@15460327) to @876994034 was 
> not released before the reference was garbage collected
> [junit-timeout] ERROR [Reference-Reaper] 2019-06-17 01:44:53,812 Ref.java:259 
> - Allocate trace org.apache.cassandra.utils.concurrent.Ref$State@15460327:
> [junit-timeout] Thread[main,5,main]
> [junit-timeout]   at java.lang.Thread.getStackTrace(Thread.java:1559)
> [junit-timeout]   at 
> org.apache.cassandra.utils.concurrent.Ref$Debug.(Ref.java:249)
> [junit-timeout]   at 
> org.apache.cassandra.utils.concurrent.Ref$State.(Ref.java:179)
> [junit-timeout]   at 
> org.apache.cassandra.utils.concurrent.Ref.(Ref.java:101)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$Chunk.setAttachment(BufferPool.java:960)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$Chunk.set(BufferPool.java:1100)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$Chunk.get(BufferPool.java:1090)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.tryGetInternal(BufferPool.java:721)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.tryGet(BufferPool.java:706)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.get(BufferPool.java:656)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool$LocalPool.access$000(BufferPool.java:535)
> [junit-timeout]   at 
> org.apache.cassandra.utils.memory.BufferPool.getAtLeast(BufferPool.java:129)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.sequenceOfMessages(FramingTest.java:413)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.testRandomSequenceOfMessages(FramingTest.java:265)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.testSomeMessages(FramingTest.java:259)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.testRandomLegacy(FramingTest.java:243)
> [junit-timeout]   at 
> org.apache.cassandra.net.FramingTest.testRandomLegacy(FramingTest.java:234)
> [junit-timeout]   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method)
> [junit-timeout]   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> [junit-timeout]   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit-timeout]   at java.lang.reflect.Method.invoke(Method.java:498)
> [junit-timeout]   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> [junit-timeout]   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> [junit-timeout]   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> [junit-timeout]   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> [junit-timeout]   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> [junit-timeout]   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> [junit-timeout]   at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> [jun

[jira] [Created] (CASSANDRA-15165) Reference-Reaper detected leak while running FramingTest unit test cases

2019-06-17 Thread Venkata Harikrishna Nukala (JIRA)
Venkata Harikrishna Nukala created CASSANDRA-15165:
--

 Summary: Reference-Reaper detected leak while running FramingTest 
unit test cases
 Key: CASSANDRA-15165
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15165
 Project: Cassandra
  Issue Type: Bug
  Components: Test/unit
Reporter: Venkata Harikrishna Nukala
Assignee: Venkata Harikrishna Nukala


Reference-Reaper detected leak while running FramingTest unit test cases. Here 
are the leak details:

{code}
[junit-timeout] ERROR [Reference-Reaper] 2019-06-17 01:44:53,812 Ref.java:228 - 
LEAK DETECTED: a reference 
(org.apache.cassandra.utils.concurrent.Ref$State@15460327) to @876994034 was 
not released before the reference was garbage collected
[junit-timeout] ERROR [Reference-Reaper] 2019-06-17 01:44:53,812 Ref.java:259 - 
Allocate trace org.apache.cassandra.utils.concurrent.Ref$State@15460327:
[junit-timeout] Thread[main,5,main]
[junit-timeout] at java.lang.Thread.getStackTrace(Thread.java:1559)
[junit-timeout] at 
org.apache.cassandra.utils.concurrent.Ref$Debug.(Ref.java:249)
[junit-timeout] at 
org.apache.cassandra.utils.concurrent.Ref$State.(Ref.java:179)
[junit-timeout] at 
org.apache.cassandra.utils.concurrent.Ref.(Ref.java:101)
[junit-timeout] at 
org.apache.cassandra.utils.memory.BufferPool$Chunk.setAttachment(BufferPool.java:960)
[junit-timeout] at 
org.apache.cassandra.utils.memory.BufferPool$Chunk.set(BufferPool.java:1100)
[junit-timeout] at 
org.apache.cassandra.utils.memory.BufferPool$Chunk.get(BufferPool.java:1090)
[junit-timeout] at 
org.apache.cassandra.utils.memory.BufferPool$LocalPool.tryGetInternal(BufferPool.java:721)
[junit-timeout] at 
org.apache.cassandra.utils.memory.BufferPool$LocalPool.tryGet(BufferPool.java:706)
[junit-timeout] at 
org.apache.cassandra.utils.memory.BufferPool$LocalPool.get(BufferPool.java:656)
[junit-timeout] at 
org.apache.cassandra.utils.memory.BufferPool$LocalPool.access$000(BufferPool.java:535)
[junit-timeout] at 
org.apache.cassandra.utils.memory.BufferPool.getAtLeast(BufferPool.java:129)
[junit-timeout] at 
org.apache.cassandra.net.FramingTest.sequenceOfMessages(FramingTest.java:413)
[junit-timeout] at 
org.apache.cassandra.net.FramingTest.testRandomSequenceOfMessages(FramingTest.java:265)
[junit-timeout] at 
org.apache.cassandra.net.FramingTest.testSomeMessages(FramingTest.java:259)
[junit-timeout] at 
org.apache.cassandra.net.FramingTest.testRandomLegacy(FramingTest.java:243)
[junit-timeout] at 
org.apache.cassandra.net.FramingTest.testRandomLegacy(FramingTest.java:234)
[junit-timeout] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method)
[junit-timeout] at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[junit-timeout] at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[junit-timeout] at java.lang.reflect.Method.invoke(Method.java:498)
[junit-timeout] at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
[junit-timeout] at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
[junit-timeout] at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
[junit-timeout] at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
[junit-timeout] at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
[junit-timeout] at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
[junit-timeout] at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
[junit-timeout] at 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
[junit-timeout] at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
[junit-timeout] at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
[junit-timeout] at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
[junit-timeout] at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
[junit-timeout] at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
[junit-timeout] at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
[junit-timeout] at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363)
[junit-timeout] at 
junit.framework.JUnit4TestAdapter.run(JUnit4TestAdapter.java:38)
[junit-timeout] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:534)
[junit-timeout] at 
org.apache.tools.ant.taskdefs.optional.junit.JUnitTest

[jira] [Commented] (CASSANDRA-14516) filter sstables by min/max clustering bounds during reads

2019-05-17 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16841989#comment-16841989
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14516:


I tried to reproduce the issue by following these steps.

1. Create keyspace
{code:java}
CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy' , 
'replication_factor': 1 };{code}
2. Create table
{code:java}
CREATE TABLE ks1.t1 ( k1 int , c1 int, c2 int, PRIMARY KEY (k1, c1, c2));{code}
3. Insert a few rows
{code:java}
INSERT INTO ks1.t1 (k1 , c1 , c2 ) VALUES ( 5, 1, 1);
INSERT INTO ks1.t1 (k1 , c1 , c2 ) VALUES ( 5, 1, 2);
INSERT INTO ks1.t1 (k1 , c1 , c2 ) VALUES ( 5, 2, 3);
INSERT INTO ks1.t1 (k1 , c1 , c2 ) VALUES ( 5, 2, 4);
{code}
4. Nodetool flush.
5. Insert more rows
{code:java}
INSERT INTO ks1.t1 (k1 , c1 , c2 ) VALUES ( 5, 7, 5);
INSERT INTO ks1.t1 (k1 , c1 , c2 ) VALUES ( 5, 7, 6);
INSERT INTO ks1.t1 (k1 , c1 , c2 ) VALUES ( 5, 8, 7);
INSERT INTO ks1.t1 (k1 , c1 , c2 ) VALUES ( 5, 8, 8);
{code}
6. Nodetool flush.
7. Now run a select command with clustering bounds
{code:java}
SELECT * FROM ks1.t1 WHERE k1 = 5 and c1 > 1 and c1 < 3;

 k1 | c1 | c2
++
  5 |  2 |  3
  5 |  2 |  4

(2 rows)

Tracing session: bc217bd0-7818-11e9-a606-b7a04374fcea

 activity | timestamp | source | source_elapsed | client
----------+-----------+--------+----------------+--------
 Execute CQL3 query | 2019-05-17 01:55:24.237000 | 127.0.0.1 | 0 | 127.0.0.1
 Parsing SELECT * FROM ks1.t1 WHERE k1 = 5 and c1 > 1 and c1 < 3; [Native-Transport-Requests-1] | 2019-05-17 01:55:24.238000 | 127.0.0.1 | 321 | 127.0.0.1
 Preparing statement [Native-Transport-Requests-1] | 2019-05-17 01:55:24.238000 | 127.0.0.1 | 674 | 127.0.0.1
 Executing single-partition query on t1 [ReadStage-2] | 2019-05-17 01:55:24.239000 | 127.0.0.1 | 1402 | 127.0.0.1
 Acquiring sstable references [ReadStage-2] | 2019-05-17 01:55:24.239000 | 127.0.0.1 | 1500 | 127.0.0.1
 Skipped 1/2 non-slice-intersecting sstables, included 0 due to tombstones [ReadStage-2] | 2019-05-17 01:55:24.239000 | 127.0.0.1 | 1654 | 127.0.0.1
 Key cache hit for sstable 1 [ReadStage-2] | 2019-05-17 01:55:24.239000 | 127.0.0.1 | 1787 | 127.0.0.1
 Merged data from memtables and 1 sstables [ReadStage-2] | 2019-05-17 01:55:24.24 | 127.0.0.1 | 2067 | 127.0.0.1
 Read 2 live rows and 0 tombstone cells [ReadStage-2] | 2019-05-17 01:55:24.24 | 127.0.0.1 | 2181 | 127.0.0.1
 Request complete | 2019-05-17 01:55:24.239666 | 127.0.0.1 | 2666 | 127.0.0.1
{code}
 

From the trace I can see one SSTable was skipped while executing the query, so 
this looks like it is working as expected.

A check is already in the code: 
[https://github.com/apache/cassandra/blob/b80f6c65fb0b97a8c79f6da027deac06a4af9801/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java#L650]
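The skip decision in that check boils down to a range-intersection test. Here is a hypothetical sketch with plain ints standing in for clustering bounds (real Cassandra compares ClusteringBound objects; `ClusteringFilterSketch` and its method are illustrative names):

```java
// Illustrative sketch: an sstable whose min/max clustering bounds do not
// intersect the queried slice can be skipped entirely on the read path.
final class ClusteringFilterSketch
{
    // Two closed ranges [sstableMin, sstableMax] and [sliceMin, sliceMax]
    // overlap iff each one starts no later than the other ends.
    static boolean intersects(int sstableMin, int sstableMax, int sliceMin, int sliceMax)
    {
        return sstableMin <= sliceMax && sliceMin <= sstableMax;
    }
}
```

With the data above, the second flush produced an sstable covering c1 in [7, 8]; for the query `c1 > 1 and c1 < 3` (effectively c1 = 2) that range does not intersect, matching the "Skipped 1/2 non-slice-intersecting sstables" trace line.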

> filter sstables by min/max clustering bounds during reads
> -
>
> Key: CASSANDRA-14516
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14516
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Blake Eggleston
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> In SinglePartitionReadCommand, we don't filter out sstables whose min/max 
> clustering bounds don't intersect with the clustering bounds being queried. 
> This causes us to do extra work on the read path.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Assigned] (CASSANDRA-14516) filter sstables by min/max clustering bounds during reads

2019-05-10 Thread Venkata Harikrishna Nukala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala reassigned CASSANDRA-14516:
--

Assignee: Venkata Harikrishna Nukala

> filter sstables by min/max clustering bounds during reads
> -
>
> Key: CASSANDRA-14516
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14516
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Blake Eggleston
>Assignee: Venkata Harikrishna Nukala
>Priority: Normal
> Fix For: 4.0
>
>
> In SinglePartitionReadCommand, we don't filter out sstables whose min/max 
> clustering bounds don't intersect with the clustering bounds being queried. 
> This causes us to do extra work on the read path.






[jira] [Comment Edited] (CASSANDRA-14647) Reading cardinality from Statistics.db failed

2019-03-01 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781765#comment-16781765
 ] 

Venkata Harikrishna Nukala edited comment on CASSANDRA-14647 at 3/1/19 3:15 PM:


[~krummas] Updated the patch. 
||Branch||CircleCI||
|[14647-trunk|https://github.com/apache/cassandra/compare/trunk...nvharikrishna:14647-trunk]|[link|https://circleci.com/gh/nvharikrishna/cassandra/3#tests/containers/2]|


was (Author: n.v.harikrishna):
[~krummas] Updated the patch. 
||Branch||CircleCI||
|[14647-trunk\|https://github.com/apache/cassandra/compare/trunk...nvharikrishna:14647-trunk]|[link\|https://circleci.com/gh/nvharikrishna/cassandra/3#tests/containers/2]|

> Reading cardinality from Statistics.db failed
> -
>
> Key: CASSANDRA-14647
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14647
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
> Environment: Clients are doing only writes with LOCAL_ONE; the cluster 
> consists of 3 regions with RF 3.
> Storage is configured with jbod/XFS on 10 x 1 TB disks
> IOPS limit for each disk: 500 (total 5000 IOPS)
> Bandwidth for each disk: 60 MB/s (600 total)
> OS is Debian Linux.
>Reporter: Vitali Djatsuk
>Assignee: Venkata Harikrishna Nukala
>Priority: Minor
> Fix For: 3.0.x
>
> Attachments: 14647-trunk-1.patch, 
> cassandra_compaction_pending_tasks_7days.png
>
>
> There is an issue with sstable metadata which is visible in system.log; the 
> message says:
> {noformat}
> WARN  [Thread-6] 2018-07-25 07:12:47,928 SSTableReader.java:249 - Reading 
> cardinality from Statistics.db failed for 
> /opt/data/disk5/data/keyspace/table/mc-big-Data.db.{noformat}
> However, there is no such file. 
> The message appeared after I changed the compaction strategy from 
> SizeTiered to Leveled. The compaction strategy was changed region by region 
> (3 regions total), and it coincided with a doubling of client write traffic.
>  I have tried to run nodetool scrub to rebuild the sstable, but that does not 
> fix the issue.
> It is very hard to define the steps to reproduce, but they are probably:
>  # run a stress tool with write traffic
>  # under load, change the compaction strategy from SizeTiered to Leveled for 
> a bunch of hosts
>  # add more write traffic
> Reading the code, it says that if this metadata is broken, then "estimating 
> the keys will be done using index summary". 
>  
> [https://github.com/apache/cassandra/blob/cassandra-3.0.17/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java#L247]
>   
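The fallback the quoted description points at can be sketched as follows. This is a hedged model with invented names, not the actual SSTableReader code: the key-count estimate prefers the cardinality stored in Statistics.db and falls back to the index summary when that metadata cannot be read.

```java
// Illustrative sketch only: names and numbers are invented; the real code
// deserializes compaction metadata from the sstable's Statistics.db component.
public class KeyEstimateSketch {

    // Stands in for reading the cardinality estimator from Statistics.db;
    // returns null when the component is missing or unreadable, which is
    // when the "Reading cardinality from Statistics.db failed" WARN appears.
    static Long readCardinality(boolean statisticsReadable) {
        return statisticsReadable ? 1_000_000L : null;
    }

    // Stands in for the coarser IndexSummary-based key estimate.
    static long estimateFromIndexSummary() {
        return 950_000L;
    }

    static long estimateKeys(boolean statisticsReadable) {
        Long cardinality = readCardinality(statisticsReadable);
        // Fall back to the index summary when the metadata is broken.
        return cardinality != null ? cardinality : estimateFromIndexSummary();
    }

    public static void main(String[] args) {
        System.out.println(estimateKeys(true));  // 1000000 (from Statistics.db)
        System.out.println(estimateKeys(false)); // 950000 (index summary fallback)
    }
}
```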






[jira] [Commented] (CASSANDRA-14647) Reading cardinality from Statistics.db failed

2019-03-01 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16781765#comment-16781765
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14647:


[~krummas] Updated the patch. 
||Branch||CircleCI||
|[14647-trunk\|https://github.com/apache/cassandra/compare/trunk...nvharikrishna:14647-trunk]|[link\|https://circleci.com/gh/nvharikrishna/cassandra/3#tests/containers/2]|

> Reading cardinality from Statistics.db failed
> -
>
> Key: CASSANDRA-14647
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14647
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
> Environment: Clients are doing only writes with LOCAL_ONE; the cluster 
> consists of 3 regions with RF 3.
> Storage is configured with jbod/XFS on 10 x 1 TB disks
> IOPS limit for each disk: 500 (total 5000 IOPS)
> Bandwidth for each disk: 60 MB/s (600 total)
> OS is Debian Linux.
>Reporter: Vitali Djatsuk
>Assignee: Venkata Harikrishna Nukala
>Priority: Minor
> Fix For: 3.0.x
>
> Attachments: 14647-trunk-1.patch, 
> cassandra_compaction_pending_tasks_7days.png
>
>
> There is an issue with sstable metadata which is visible in system.log; the 
> message says:
> {noformat}
> WARN  [Thread-6] 2018-07-25 07:12:47,928 SSTableReader.java:249 - Reading 
> cardinality from Statistics.db failed for 
> /opt/data/disk5/data/keyspace/table/mc-big-Data.db.{noformat}
> However, there is no such file. 
> The message appeared after I changed the compaction strategy from 
> SizeTiered to Leveled. The compaction strategy was changed region by region 
> (3 regions total), and it coincided with a doubling of client write traffic.
>  I have tried to run nodetool scrub to rebuild the sstable, but that does not 
> fix the issue.
> It is very hard to define the steps to reproduce, but they are probably:
>  # run a stress tool with write traffic
>  # under load, change the compaction strategy from SizeTiered to Leveled for 
> a bunch of hosts
>  # add more write traffic
> Reading the code, it says that if this metadata is broken, then "estimating 
> the keys will be done using index summary". 
>  
> [https://github.com/apache/cassandra/blob/cassandra-3.0.17/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java#L247]
>   






[jira] [Comment Edited] (CASSANDRA-14647) Reading cardinality from Statistics.db failed

2019-02-01 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758140#comment-16758140
 ] 

Venkata Harikrishna Nukala edited comment on CASSANDRA-14647 at 2/1/19 9:35 AM:


[~krummas] I have created a patch for this and uploaded to this ticket  
[^14647-trunk-1.patch]


was (Author: n.v.harikrishna):
[~krummas] I have created a patch for this and upload to this ticket  
[^14647-trunk-1.patch] 

> Reading cardinality from Statistics.db failed
> -
>
> Key: CASSANDRA-14647
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14647
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
> Environment: Clients are doing only writes with LOCAL_ONE; the cluster 
> consists of 3 regions with RF 3.
> Storage is configured with jbod/XFS on 10 x 1 TB disks
> IOPS limit for each disk: 500 (total 5000 IOPS)
> Bandwidth for each disk: 60 MB/s (600 total)
> OS is Debian Linux.
>Reporter: Vitali Djatsuk
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 3.0.x
>
> Attachments: 14647-trunk-1.patch, 
> cassandra_compaction_pending_tasks_7days.png
>
>
> There is an issue with sstable metadata which is visible in system.log; the 
> message says:
> {noformat}
> WARN  [Thread-6] 2018-07-25 07:12:47,928 SSTableReader.java:249 - Reading 
> cardinality from Statistics.db failed for 
> /opt/data/disk5/data/keyspace/table/mc-big-Data.db.{noformat}
> However, there is no such file. 
> The message appeared after I changed the compaction strategy from 
> SizeTiered to Leveled. The compaction strategy was changed region by region 
> (3 regions total), and it coincided with a doubling of client write traffic.
>  I have tried to run nodetool scrub to rebuild the sstable, but that does not 
> fix the issue.
> It is very hard to define the steps to reproduce, but they are probably:
>  # run a stress tool with write traffic
>  # under load, change the compaction strategy from SizeTiered to Leveled for 
> a bunch of hosts
>  # add more write traffic
> Reading the code, it says that if this metadata is broken, then "estimating 
> the keys will be done using index summary". 
>  
> [https://github.com/apache/cassandra/blob/cassandra-3.0.17/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java#L247]
>   






[jira] [Commented] (CASSANDRA-14647) Reading cardinality from Statistics.db failed

2019-02-01 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758140#comment-16758140
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14647:


[~krummas] I have created a patch for this and upload to this ticket  
[^14647-trunk-1.patch] 

> Reading cardinality from Statistics.db failed
> -
>
> Key: CASSANDRA-14647
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14647
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
> Environment: Clients are doing only writes with LOCAL_ONE; the cluster 
> consists of 3 regions with RF 3.
> Storage is configured with jbod/XFS on 10 x 1 TB disks
> IOPS limit for each disk: 500 (total 5000 IOPS)
> Bandwidth for each disk: 60 MB/s (600 total)
> OS is Debian Linux.
>Reporter: Vitali Djatsuk
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 3.0.x
>
> Attachments: 14647-trunk-1.patch, 
> cassandra_compaction_pending_tasks_7days.png
>
>
> There is an issue with sstable metadata which is visible in system.log; the 
> message says:
> {noformat}
> WARN  [Thread-6] 2018-07-25 07:12:47,928 SSTableReader.java:249 - Reading 
> cardinality from Statistics.db failed for 
> /opt/data/disk5/data/keyspace/table/mc-big-Data.db.{noformat}
> However, there is no such file. 
> The message appeared after I changed the compaction strategy from 
> SizeTiered to Leveled. The compaction strategy was changed region by region 
> (3 regions total), and it coincided with a doubling of client write traffic.
>  I have tried to run nodetool scrub to rebuild the sstable, but that does not 
> fix the issue.
> It is very hard to define the steps to reproduce, but they are probably:
>  # run a stress tool with write traffic
>  # under load, change the compaction strategy from SizeTiered to Leveled for 
> a bunch of hosts
>  # add more write traffic
> Reading the code, it says that if this metadata is broken, then "estimating 
> the keys will be done using index summary". 
>  
> [https://github.com/apache/cassandra/blob/cassandra-3.0.17/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java#L247]
>   






[jira] [Updated] (CASSANDRA-14647) Reading cardinality from Statistics.db failed

2019-02-01 Thread Venkata Harikrishna Nukala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-14647:
---
Attachment: 14647-trunk-1.patch

> Reading cardinality from Statistics.db failed
> -
>
> Key: CASSANDRA-14647
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14647
> Project: Cassandra
>  Issue Type: Bug
>  Components: Observability/Metrics
> Environment: Clients are doing only writes with LOCAL_ONE; the cluster 
> consists of 3 regions with RF 3.
> Storage is configured with jbod/XFS on 10 x 1 TB disks
> IOPS limit for each disk: 500 (total 5000 IOPS)
> Bandwidth for each disk: 60 MB/s (600 total)
> OS is Debian Linux.
>Reporter: Vitali Djatsuk
>Assignee: Marcus Eriksson
>Priority: Minor
> Fix For: 3.0.x
>
> Attachments: 14647-trunk-1.patch, 
> cassandra_compaction_pending_tasks_7days.png
>
>
> There is an issue with sstable metadata which is visible in system.log; the 
> message says:
> {noformat}
> WARN  [Thread-6] 2018-07-25 07:12:47,928 SSTableReader.java:249 - Reading 
> cardinality from Statistics.db failed for 
> /opt/data/disk5/data/keyspace/table/mc-big-Data.db.{noformat}
> However, there is no such file. 
> The message appeared after I changed the compaction strategy from 
> SizeTiered to Leveled. The compaction strategy was changed region by region 
> (3 regions total), and it coincided with a doubling of client write traffic.
>  I have tried to run nodetool scrub to rebuild the sstable, but that does not 
> fix the issue.
> It is very hard to define the steps to reproduce, but they are probably:
>  # run a stress tool with write traffic
>  # under load, change the compaction strategy from SizeTiered to Leveled for 
> a bunch of hosts
>  # add more write traffic
> Reading the code, it says that if this metadata is broken, then "estimating 
> the keys will be done using index summary". 
>  
> [https://github.com/apache/cassandra/blob/cassandra-3.0.17/src/java/org/apache/cassandra/io/sstable/format/SSTableReader.java#L247]
>   






[jira] [Commented] (CASSANDRA-14344) Support filtering using IN restrictions

2019-01-30 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756386#comment-16756386
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14344:


[~blerer] Updated the patch ([^14344-trunk-3.patch]) and uploaded it to this 
ticket. 

> Support filtering using IN restrictions
> ---
>
> Key: CASSANDRA-14344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/CQL
>Reporter: Dikang Gu
>Assignee: Venkata Harikrishna Nukala
>Priority: Major
> Fix For: 4.x
>
> Attachments: 14344-trunk-2.txt, 14344-trunk-3.patch, 
> 14344-trunk-inexpression-approach-2.txt, 
> 14344-trunk-inexpression-approach.txt, 14344-trunk.txt
>
>
> Support IN filter query like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -+--+--+---
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> *{color:#ff}InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="IN restrictions are not supported on indexed columns"{color}*
> cqlsh:ks1>






[jira] [Updated] (CASSANDRA-14344) Support filtering using IN restrictions

2019-01-30 Thread Venkata Harikrishna Nukala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-14344:
---
Attachment: 14344-trunk-3.patch

> Support filtering using IN restrictions
> ---
>
> Key: CASSANDRA-14344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Legacy/CQL
>Reporter: Dikang Gu
>Assignee: Venkata Harikrishna Nukala
>Priority: Major
> Fix For: 4.x
>
> Attachments: 14344-trunk-2.txt, 14344-trunk-3.patch, 
> 14344-trunk-inexpression-approach-2.txt, 
> 14344-trunk-inexpression-approach.txt, 14344-trunk.txt
>
>
> Support IN filter query like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -+--+--+---
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> *{color:#ff}InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="IN restrictions are not supported on indexed columns"{color}*
> cqlsh:ks1>






[jira] [Commented] (CASSANDRA-14344) Support filtering using IN restrictions

2018-08-27 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16593474#comment-16593474
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14344:


Found another good reason to have a separate class for evaluating the IN 
expression.

Namely, if we run a query whose IN clause contains a single null value, 
something like:
{code}SELECT * FROM t1 WHERE col2 IN (null) ALLOW FILTERING;{code}

then it throws an error message saying:
{code}InvalidRequest: Error from server: code=2200 [Invalid query] 
message="Unsupported null value for column col2"{code}

I feel we should throw the same error message even for a query whose IN clause 
contains multiple values, something like:
{code}SELECT * FROM t2 WHERE c2 IN (10, null) ALLOW FILTERING;{code}

If we serialize the values into a single buffer, then we cannot do such 
validation; the values have to be inspected individually. So I created a new 
class for it, so that validation is performed per value and evaluation is 
performed optimally. I avoided adding a new value to the {{Expression.Kind}} 
enum to avoid impacting serialization and deserialization.

Attached patch [^14344-trunk-inexpression-approach-2.txt] with required changes.
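The validation point above can be illustrated with a small sketch (invented names, not the actual RowFilter code): when the IN values are kept as individual buffers, a null element can be rejected up front with the same "Unsupported null value" error, whereas a single pre-serialized buffer would hide it.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

// Hedged sketch: validateInValues is an invented helper showing why values
// must be inspected individually rather than as one serialized buffer.
public class InValueValidationSketch {

    static void validateInValues(String column, List<ByteBuffer> values) {
        for (ByteBuffer v : values) {
            // With individual buffers, the null element of IN (10, null)
            // is visible and can be rejected before evaluation.
            if (v == null)
                throw new IllegalArgumentException(
                        "Unsupported null value for column " + column);
        }
    }

    public static void main(String[] args) {
        ByteBuffer ten = ByteBuffer.wrap(new byte[] {0, 0, 0, 10});

        validateInValues("c2", List.of(ten)); // IN (10): passes

        try {
            // IN (10, null): should fail the same way IN (null) does.
            validateInValues("c2", Arrays.asList(ten, null));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```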


> Support filtering using IN restrictions
> ---
>
> Key: CASSANDRA-14344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>Assignee: Venkata Harikrishna Nukala
>Priority: Major
> Attachments: 14344-trunk-2.txt, 
> 14344-trunk-inexpression-approach-2.txt, 
> 14344-trunk-inexpression-approach.txt, 14344-trunk.txt
>
>
> Support IN filter query like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -+--+--+---
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> *{color:#ff}InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="IN restrictions are not supported on indexed columns"{color}*
> cqlsh:ks1>






[jira] [Updated] (CASSANDRA-14344) Support filtering using IN restrictions

2018-08-27 Thread Venkata Harikrishna Nukala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-14344:
---
Attachment: 14344-trunk-inexpression-approach-2.txt

> Support filtering using IN restrictions
> ---
>
> Key: CASSANDRA-14344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>Assignee: Venkata Harikrishna Nukala
>Priority: Major
> Attachments: 14344-trunk-2.txt, 
> 14344-trunk-inexpression-approach-2.txt, 
> 14344-trunk-inexpression-approach.txt, 14344-trunk.txt
>
>
> Support IN filter query like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -+--+--+---
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> *{color:#ff}InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="IN restrictions are not supported on indexed columns"{color}*
> cqlsh:ks1>






[jira] [Commented] (CASSANDRA-14344) Support filtering using IN restrictions

2018-08-20 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16587026#comment-16587026
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14344:


Still working on it; I will provide the updated patch in a couple of days.

> Support filtering using IN restrictions
> ---
>
> Key: CASSANDRA-14344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>Assignee: Venkata Harikrishna Nukala
>Priority: Major
> Attachments: 14344-trunk-2.txt, 
> 14344-trunk-inexpression-approach.txt, 14344-trunk.txt
>
>
> Support IN filter query like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -+--+--+---
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> *{color:#ff}InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="IN restrictions are not supported on indexed columns"{color}*
> cqlsh:ks1>






[jira] [Commented] (CASSANDRA-14344) Support filtering using IN restrictions

2018-06-25 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522138#comment-16522138
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14344:


[~blerer] Added test cases and uploaded a new patch, [^14344-trunk-2.txt], with 
small fixes. {{RowFilter.Expression}} accepts a ByteBuffer for the value, which 
forces all values to be serialized into a single buffer, and they have to be 
deserialized to compare. This approach forces deserialization for every check.

I am also trying another approach: I created a new class, 
{{RowFilter.InExpression}}, for evaluating {{IN}} expressions. The expression 
instance is shared across evaluations, so it can hold deserialized values and 
reuse them. I attached a patch of this approach, 
[^14344-trunk-inexpression-approach.txt]. Introducing a new expression type has 
side effects on serialization, i.e. the {{RowFilter.Expression.Serializer}} 
class; I am not sure yet how much it impacts. This patch is still half-baked 
(sorry for that); I just want to give some idea of the approach.
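The trade-off between the two approaches can be sketched as follows (invented names, int values assumed for simplicity): the single-buffer style must decode the candidate list on every row check, while an "InExpression"-style class decodes once and reuses the values across evaluations because the expression instance is shared.

```java
import java.nio.ByteBuffer;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hedged sketch of the two evaluation strategies; not the actual RowFilter
// code, and int columns are assumed for simplicity.
public class InEvaluationSketch {

    // Single-buffer style: decode candidate values on every row check.
    static boolean satisfiedDecodingEachTime(List<ByteBuffer> serialized, int rowValue) {
        for (ByteBuffer b : serialized)
            if (b.getInt(0) == rowValue) // absolute read, position untouched
                return true;
        return false;
    }

    // "InExpression" style: decode once in the constructor, then reuse the
    // decoded set for every subsequent evaluation.
    static final class InExpression {
        private final Set<Integer> decoded = new HashSet<>();

        InExpression(List<ByteBuffer> serialized) {
            for (ByteBuffer b : serialized)
                decoded.add(b.getInt(0));
        }

        boolean isSatisfiedBy(int rowValue) {
            return decoded.contains(rowValue);
        }
    }

    public static void main(String[] args) {
        List<ByteBuffer> values = List.of(
                ByteBuffer.wrap(new byte[] {0, 0, 0, 1}),
                ByteBuffer.wrap(new byte[] {0, 0, 0, 2}));

        InExpression in = new InExpression(values);
        System.out.println(in.isSatisfiedBy(1));                  // true
        System.out.println(in.isSatisfiedBy(3));                  // false
        System.out.println(satisfiedDecodingEachTime(values, 2)); // true
    }
}
```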

> Support filtering using IN restrictions
> ---
>
> Key: CASSANDRA-14344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>Assignee: Venkata Harikrishna Nukala
>Priority: Major
> Attachments: 14344-trunk-2.txt, 
> 14344-trunk-inexpression-approach.txt, 14344-trunk.txt
>
>
> Support IN filter query like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -+--+--+---
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> *{color:#ff}InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="IN restrictions are not supported on indexed columns"{color}*
> cqlsh:ks1>






[jira] [Updated] (CASSANDRA-14344) Support filtering using IN restrictions

2018-06-25 Thread Venkata Harikrishna Nukala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-14344:
---
Attachment: 14344-trunk-inexpression-approach.txt
14344-trunk-2.txt

> Support filtering using IN restrictions
> ---
>
> Key: CASSANDRA-14344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>Assignee: Venkata Harikrishna Nukala
>Priority: Major
> Attachments: 14344-trunk-2.txt, 
> 14344-trunk-inexpression-approach.txt, 14344-trunk.txt
>
>
> Support IN filter query like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -+--+--+---
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> *{color:#ff}InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="IN restrictions are not supported on indexed columns"{color}*
> cqlsh:ks1>






[jira] [Commented] (CASSANDRA-14344) Support filtering using IN restrictions

2018-06-16 Thread Venkata Harikrishna Nukala (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16514851#comment-16514851
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14344:


[~blerer] Thanks for reviewing the patch.

I agree with the comment "approach force the deserialization of all the list 
elements and of the value for each check", but as far as I know, this 
evaluation happens as part of an iterator with no additional state/context, 
making it difficult to reuse deserialized values across partitions. This 
approach is used by other operators too. Is there a better way?

I will add additional unit test cases.

> Support filtering using IN restrictions
> ---
>
> Key: CASSANDRA-14344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>Assignee: Venkata Harikrishna Nukala
>Priority: Major
> Attachments: 14344-trunk.txt
>
>
> Support IN filter query like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -+--+--+---
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> *{color:#ff}InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="IN restrictions are not supported on indexed columns"{color}*
> cqlsh:ks1>






[jira] [Commented] (CASSANDRA-14344) Support the IN restrictions on indexed columns

2018-05-17 Thread Venkata Harikrishna Nukala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478702#comment-16478702
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14344:


[~blerer] Thanks for the confirmation.

I've uploaded the patch to this ticket and added a few unit test cases. Tested 
in a multi-node setup too. To note: querying a secondary-indexed column with IN 
is still not supported by these changes. Please let me know if any changes are 
required to the patch.

> Support the IN restrictions on indexed columns
> --
>
> Key: CASSANDRA-14344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>Assignee: Venkata Harikrishna Nukala
>Priority: Major
> Attachments: 14344-trunk.txt
>
>
> Support IN filter query like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -+--+--+---
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> *{color:#ff}InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="IN restrictions are not supported on indexed columns"{color}*
> cqlsh:ks1>






[jira] [Updated] (CASSANDRA-14344) Support the IN restrictions on indexed columns

2018-05-17 Thread Venkata Harikrishna Nukala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-14344:
---
Status: Patch Available  (was: Open)

> Support the IN restrictions on indexed columns
> --
>
> Key: CASSANDRA-14344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>Assignee: Venkata Harikrishna Nukala
>Priority: Major
> Attachments: 14344-trunk.txt
>
>
> Support IN filter query like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -+--+--+---
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> *{color:#ff}InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="IN restrictions are not supported on indexed columns"{color}*
> cqlsh:ks1>






[jira] [Updated] (CASSANDRA-14344) Support the IN restrictions on indexed columns

2018-05-17 Thread Venkata Harikrishna Nukala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-14344:
---
Attachment: 14344-trunk.txt

> Support the IN restrictions on indexed columns
> --
>
> Key: CASSANDRA-14344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>Assignee: Venkata Harikrishna Nukala
>Priority: Major
> Attachments: 14344-trunk.txt
>
>
> Support IN filter query like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -+--+--+---
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> *{color:#ff}InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="IN restrictions are not supported on indexed columns"{color}*
> cqlsh:ks1>






[jira] [Commented] (CASSANDRA-14344) Support the IN restrictions on indexed columns

2018-05-15 Thread Venkata Harikrishna Nukala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476934#comment-16476934
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14344:


I am working on the code changes. I just want to check whether multi-column IN 
restrictions on clustering keys are also expected to be supported. For example:
 Table:
{code:java}
CREATE TABLE ks1.t3 (
key int,
col1 int,
col2 int,
col3 int,
value int,
PRIMARY KEY (key, col1, col2, col3)
) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC, col3 ASC);
{code}
Data:
{code:java}
 key | col1 | col2 | col3 | value
-+--+--+--+---
 100 |0 |0 |0 | 1
 102 |0 |1 |0 | 3
 101 |0 |0 |1 | 2
 103 |0 |1 |1 | 4
{code}
Query:
{code:java}
select * from ks1.t3 where (col2, col3) IN ((1,0), (1,1)) ALLOW FILTERING;
{code}

> Support the IN restrictions on indexed columns
> --
>
> Key: CASSANDRA-14344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>Assignee: Venkata Harikrishna Nukala
>Priority: Major
>
> Support IN filter query like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -+--+--+---
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> *{color:#ff}InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="IN restrictions are not supported on indexed columns"{color}*
> cqlsh:ks1>






[jira] [Commented] (CASSANDRA-14344) Support the IN restrictions on indexed columns

2018-05-05 Thread Venkata Harikrishna Nukala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16464877#comment-16464877
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14344:


Can I work on this? If yes, can someone assign this to me?

> Support the IN restrictions on indexed columns
> --
>
> Key: CASSANDRA-14344
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14344
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Dikang Gu
>Priority: Major
>
> Support IN filter query like this:
>  
> CREATE TABLE ks1.t1 (
>     key int,
>     col1 int,
>     col2 int,
>     value int,
>     PRIMARY KEY (key, col1, col2)
> ) WITH CLUSTERING ORDER BY (col1 ASC, col2 ASC)
>  
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1) allow filtering;
>  
>  key | col1 | col2 | value
> -+--+--+---
>    1 |    1 |    1 |     1
>    1 |    2 |    1 |     3
>  
> (2 rows)
> cqlsh:ks1> select * from t1 where key = 1 and col2 in (1, 2) allow filtering;
> *{color:#ff}InvalidRequest: Error from server: code=2200 [Invalid query] 
> message="IN restrictions are not supported on indexed columns"{color}*
> cqlsh:ks1>






[jira] [Updated] (CASSANDRA-13998) Cassandra stress distribution does not affect the result

2018-04-30 Thread Venkata Harikrishna Nukala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-13998:
---
Attachment: 13998-trunk.txt

> Cassandra stress distribution does not affect the result
> 
>
> Key: CASSANDRA-13998
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13998
> Project: Cassandra
>  Issue Type: Task
>  Components: Stress
> Environment: Windows 10
>Reporter: Mikhail Pliskovsky
>Assignee: Venkata Harikrishna Nukala
>Priority: Trivial
> Fix For: 3.11.x
>
> Attachments: 13998-trunk.txt, cqlstress-example.yaml
>
>
> When testing my schema on a single-node cluster, I get identical data on 
> each stress-test run.
> I specify my cassandra-stress.yaml file.
> Table and column spec:
> {code:java}
> table_definition: |
>   CREATE TABLE files (
> id uuid PRIMARY KEY,
> data blob
>   ) 
> columnspec:
>   - name: data
> size: UNIFORM(10..100)
> population: UNIFORM(1..100B)
> {code}
> But when I query the table rows after the test, each row contains the same 
> string.
> Command to run the test:
> {code:java}
> cassandra-stress user profile=..\cqlstress-example.yaml n=20 ops(insert=5) 
> -rate threads=8
> {code}
> What am I doing wrong?
> I would like the data to have variable length.






[jira] [Updated] (CASSANDRA-13998) Cassandra stress distribution does not affect the result

2018-04-30 Thread Venkata Harikrishna Nukala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-13998:
---
Reproduced In: 4.0
   Status: Patch Available  (was: Open)

> Cassandra stress distribution does not affect the result
> 
>
> Key: CASSANDRA-13998
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13998
> Project: Cassandra
>  Issue Type: Task
>  Components: Stress
> Environment: Windows 10
>Reporter: Mikhail Pliskovsky
>Assignee: Venkata Harikrishna Nukala
>Priority: Trivial
> Fix For: 3.11.x
>
> Attachments: 13998-trunk.txt, cqlstress-example.yaml
>
>
> When testing my schema on a single-node cluster, I get identical data on 
> each stress-test run.
> I specify my cassandra-stress.yaml file.
> Table and column spec:
> {code:java}
> table_definition: |
>   CREATE TABLE files (
> id uuid PRIMARY KEY,
> data blob
>   ) 
> columnspec:
>   - name: data
> size: UNIFORM(10..100)
> population: UNIFORM(1..100B)
> {code}
> But when I query the table rows after the test, each row contains the same 
> string.
> Command to run the test:
> {code:java}
> cassandra-stress user profile=..\cqlstress-example.yaml n=20 ops(insert=5) 
> -rate threads=8
> {code}
> What am I doing wrong?
> I would like the data to have variable length.






[jira] [Commented] (CASSANDRA-13998) Cassandra stress distribution does not affect the result

2018-04-30 Thread Venkata Harikrishna Nukala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16458489#comment-16458489
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-13998:


The issue is in UUID generation: each UUID is generated with identical MSB 
(most significant bits) and LSB (least significant bits), so the XOR of the two 
is always zero. _PartitionIterator.seed(Object object, AbstractType type, long 
seed)_ uses _MSB ^ LSB_ to derive the idSeed, which is therefore always zero 
(the same as its initial value). That idSeed is also used as the seed for the 
value columns, so with an unchanging zero seed, the same values of the same 
size are generated every time. I fixed the UUID generation to use FasterRandom.

After this change, I can see data generated with varying sizes and values. I 
tested with a table that has clustering columns and different primary key 
types.

Attaching the patch to this ticket.
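As an aside, the zero-seed behavior described above can be reproduced with 
plain JDK classes (an illustrative sketch only, not project code; the class and 
variable names are made up):

```java
import java.util.Random;
import java.util.UUID;

public class ZeroSeedDemo {
    public static void main(String[] args) {
        // A UUID whose most- and least-significant halves are identical
        // XORs to zero, mirroring what happens in PartitionIterator.seed.
        long half = 0x123456789abcdef0L;
        UUID uuid = new UUID(half, half);
        long idSeed = uuid.getMostSignificantBits() ^ uuid.getLeastSignificantBits();
        System.out.println(idSeed); // prints 0 whenever MSB == LSB

        // Two generators seeded with the same constant emit identical
        // sequences, hence identical sizes and values on every run.
        Random a = new Random(idSeed);
        Random b = new Random(idSeed);
        System.out.println(a.nextLong() == b.nextLong()); // prints true
    }
}
```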

> Cassandra stress distribution does not affect the result
> 
>
> Key: CASSANDRA-13998
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13998
> Project: Cassandra
>  Issue Type: Task
>  Components: Stress
> Environment: Windows 10
>Reporter: Mikhail Pliskovsky
>Assignee: Venkata Harikrishna Nukala
>Priority: Trivial
> Fix For: 3.11.x
>
> Attachments: cqlstress-example.yaml
>
>
> When testing my schema on a single-node cluster, I get identical data on 
> each stress-test run.
> I specify my cassandra-stress.yaml file.
> Table and column spec:
> {code:java}
> table_definition: |
>   CREATE TABLE files (
> id uuid PRIMARY KEY,
> data blob
>   ) 
> columnspec:
>   - name: data
> size: UNIFORM(10..100)
> population: UNIFORM(1..100B)
> {code}
> But when I query the table rows after the test, each row contains the same 
> string.
> Command to run the test:
> {code:java}
> cassandra-stress user profile=..\cqlstress-example.yaml n=20 ops(insert=5) 
> -rate threads=8
> {code}
> What am I doing wrong?
> I would like the data to have variable length.






[jira] [Commented] (CASSANDRA-13998) Cassandra stress distribution does not affect the result

2018-04-29 Thread Venkata Harikrishna Nukala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457988#comment-16457988
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-13998:


I can work on this. Can someone assign it to me?

> Cassandra stress distribution does not affect the result
> 
>
> Key: CASSANDRA-13998
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13998
> Project: Cassandra
>  Issue Type: Task
>  Components: Stress
> Environment: Windows 10
>Reporter: Mikhail Pliskovsky
>Priority: Trivial
> Fix For: 3.11.x
>
> Attachments: cqlstress-example.yaml
>
>
> When testing my schema on a single-node cluster, I get identical data on 
> each stress-test run.
> I specify my cassandra-stress.yaml file.
> Table and column spec:
> {code:java}
> table_definition: |
>   CREATE TABLE files (
> id uuid PRIMARY KEY,
> data blob
>   ) 
> columnspec:
>   - name: data
> size: UNIFORM(10..100)
> population: UNIFORM(1..100B)
> {code}
> But when I query the table rows after the test, each row contains the same 
> string.
> Command to run the test:
> {code:java}
> cassandra-stress user profile=..\cqlstress-example.yaml n=20 ops(insert=5) 
> -rate threads=8
> {code}
> What am I doing wrong?
> I would like the data to have variable length.






[jira] [Updated] (CASSANDRA-14372) data_file_directories config - update documentation in cassandra.yaml

2018-04-10 Thread Venkata Harikrishna Nukala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-14372:
---
Status: Patch Available  (was: Open)

> data_file_directories config - update documentation in cassandra.yaml
> -
>
> Key: CASSANDRA-14372
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14372
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation and Website
>Reporter: Venkata Harikrishna Nukala
>Assignee: Venkata Harikrishna Nukala
>Priority: Minor
> Attachments: 14372-trunk.txt
>
>
> If "data_file_directories" configuration is enabled with multiple 
> directories, data is partitioned by token range so that data gets distributed 
> evenly. But the current documentation says that "Cassandra will spread data 
> evenly across them, subject to the granularity of the configured compaction 
> strategy". Need to update this comment to reflect the correct behavior.






[jira] [Commented] (CASSANDRA-14372) data_file_directories config - update documentation in cassandra.yaml

2018-04-10 Thread Venkata Harikrishna Nukala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16432917#comment-16432917
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14372:


Attached a patch with the changes. Please review it.

> data_file_directories config - update documentation in cassandra.yaml
> -
>
> Key: CASSANDRA-14372
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14372
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation and Website
>Reporter: Venkata Harikrishna Nukala
>Assignee: Venkata Harikrishna Nukala
>Priority: Minor
> Attachments: 14372-trunk.txt
>
>
> If "data_file_directories" configuration is enabled with multiple 
> directories, data is partitioned by token range so that data gets distributed 
> evenly. But the current documentation says that "Cassandra will spread data 
> evenly across them, subject to the granularity of the configured compaction 
> strategy". Need to update this comment to reflect the correct behavior.






[jira] [Updated] (CASSANDRA-14372) data_file_directories config - update documentation in cassandra.yaml

2018-04-10 Thread Venkata Harikrishna Nukala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-14372:
---
Attachment: 14372-trunk.txt

> data_file_directories config - update documentation in cassandra.yaml
> -
>
> Key: CASSANDRA-14372
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14372
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation and Website
>Reporter: Venkata Harikrishna Nukala
>Assignee: Venkata Harikrishna Nukala
>Priority: Minor
> Attachments: 14372-trunk.txt
>
>
> If "data_file_directories" configuration is enabled with multiple 
> directories, data is partitioned by token range so that data gets distributed 
> evenly. But the current documentation says that "Cassandra will spread data 
> evenly across them, subject to the granularity of the configured compaction 
> strategy". Need to update this comment to reflect the correct behavior.






[jira] [Updated] (CASSANDRA-14372) data_file_directories config - update documentation in cassandra.yaml

2018-04-09 Thread Venkata Harikrishna Nukala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-14372:
---
Summary: data_file_directories config - update documentation in 
cassandra.yaml  (was: data_file_directories update documentation in 
cassandra.yaml)

> data_file_directories config - update documentation in cassandra.yaml
> -
>
> Key: CASSANDRA-14372
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14372
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation and Website
>Reporter: Venkata Harikrishna Nukala
>Priority: Minor
>
> If "data_file_directories" configuration is enabled with multiple 
> directories, data is partitioned by token range so that data gets distributed 
> evenly. But the current documentation says that "Cassandra will spread data 
> evenly across them, subject to the granularity of the configured compaction 
> strategy". Need to update this comment to reflect the correct behavior.






[jira] [Commented] (CASSANDRA-14372) data_file_directories config - update documentation in cassandra.yaml

2018-04-09 Thread Venkata Harikrishna Nukala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16431319#comment-16431319
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14372:


I can submit a patch for this. Can someone assign this ticket to me?

> data_file_directories config - update documentation in cassandra.yaml
> -
>
> Key: CASSANDRA-14372
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14372
> Project: Cassandra
>  Issue Type: Bug
>  Components: Documentation and Website
>Reporter: Venkata Harikrishna Nukala
>Priority: Minor
>
> If "data_file_directories" configuration is enabled with multiple 
> directories, data is partitioned by token range so that data gets distributed 
> evenly. But the current documentation says that "Cassandra will spread data 
> evenly across them, subject to the granularity of the configured compaction 
> strategy". Need to update this comment to reflect the correct behavior.






[jira] [Created] (CASSANDRA-14372) data_file_directories update documentation in cassandra.yaml

2018-04-09 Thread Venkata Harikrishna Nukala (JIRA)
Venkata Harikrishna Nukala created CASSANDRA-14372:
--

 Summary: data_file_directories update documentation in 
cassandra.yaml
 Key: CASSANDRA-14372
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14372
 Project: Cassandra
  Issue Type: Bug
  Components: Documentation and Website
Reporter: Venkata Harikrishna Nukala


If the "data_file_directories" configuration lists multiple directories, data 
is partitioned by token range so that it is distributed evenly across them. But 
the current documentation says that "Cassandra will spread data evenly across 
them, subject to the granularity of the configured compaction strategy". This 
comment needs to be updated to reflect the actual behavior.
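For context, the behavior in question concerns a configuration like the 
following sketch (the paths are made-up examples): when more than one directory 
is listed, each directory is assigned a slice of the node's local token ranges, 
rather than whole SSTables being balanced across directories.

```yaml
# Hypothetical cassandra.yaml fragment (example paths).
# With multiple entries, data is split across the directories by
# local token range, not "evenly, subject to the granularity of the
# configured compaction strategy" as the current comment states.
data_file_directories:
    - /var/lib/cassandra/data1
    - /var/lib/cassandra/data2
```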






[jira] [Updated] (CASSANDRA-14354) rename ColumnFamilyStoreCQLHelper to TableCQLHelper

2018-04-02 Thread Venkata Harikrishna Nukala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-14354:
---
Reviewer: Jon Haddad
  Status: Patch Available  (was: Open)

> rename ColumnFamilyStoreCQLHelper to TableCQLHelper
> ---
>
> Key: CASSANDRA-14354
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14354
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Jon Haddad
>Assignee: Venkata Harikrishna Nukala
>Priority: Major
> Attachments: 14354-trunk.txt
>
>
> Seems like a simple 1:1 rename.






[jira] [Commented] (CASSANDRA-14354) rename ColumnFamilyStoreCQLHelper to TableCQLHelper

2018-03-30 Thread Venkata Harikrishna Nukala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16421038#comment-16421038
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14354:


[~rustyrazorblade] Can you assign this ticket to me? I have uploaded the patch 
to this ticket. Please review it.

> rename ColumnFamilyStoreCQLHelper to TableCQLHelper
> ---
>
> Key: CASSANDRA-14354
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14354
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Jon Haddad
>Priority: Major
> Attachments: 14354-trunk.txt
>
>
> Seems like a simple 1:1 rename.






[jira] [Updated] (CASSANDRA-14354) rename ColumnFamilyStoreCQLHelper to TableCQLHelper

2018-03-30 Thread Venkata Harikrishna Nukala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-14354:
---
Attachment: 14354-trunk.txt

> rename ColumnFamilyStoreCQLHelper to TableCQLHelper
> ---
>
> Key: CASSANDRA-14354
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14354
> Project: Cassandra
>  Issue Type: Sub-task
>Reporter: Jon Haddad
>Priority: Major
> Attachments: 14354-trunk.txt
>
>
> Seems like a simple 1:1 rename.






[jira] [Commented] (CASSANDRA-14340) Refactor ColumnFamilyStore to Table

2018-03-29 Thread Venkata Harikrishna Nukala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419165#comment-16419165
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14340:


[~rustyrazorblade] Are you considering renaming the ColumnFamilyStoreCQLHelper 
class too? If so, would you mind if I submitted a patch for it?

> Refactor ColumnFamilyStore to Table
> ---
>
> Key: CASSANDRA-14340
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14340
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Jon Haddad
>Assignee: Jon Haddad
>Priority: Major
>
> This end result of this should change the ColumnFamily store, 
> ColumnFamilyStoreMBean, and tests that reference them by name.
> Deserves a note in news as this will break JMX compatibility and affect 
> scripts which change log level by class.






[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-02-26 Thread Venkata Harikrishna Nukala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377767#comment-16377767
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-13853:


[~pree] Please go ahead. I have stepped back.

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Updated] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-02-26 Thread Venkata Harikrishna Nukala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-13853:
---
Attachment: (was: 13853-trunk.txt)

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Issue Comment Deleted] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-02-26 Thread Venkata Harikrishna Nukala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-13853:
---
Comment: was deleted

(was: [~pree] My apologies if this is against the Cassandra community 
guidelines; I am also a newbie. The last update I saw was from last year, so I 
thought this ticket was stale and picked it up. I don't mind taking a step 
back.)

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Issue Comment Deleted] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-02-26 Thread Venkata Harikrishna Nukala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-13853:
---
Comment: was deleted

(was: [~rustyrazorblade] [~pree] 
Hi, I made a patch (attached) for the required changes. Hope you don't mind. 
With these changes, "nodetool describecluster" output looks like this:
{noformat}
Cluster Information:
Name: test
Live nodes: 6
Joining nodes: [127.0.0.6, 127.0.0.5, 127.0.0.4, 127.0.0.3, 127.0.0.2, 
127.0.0.1]
Moving nodes: []
Leaving nodes: []
Keyspaces:
system_traces RF: 2
system RF: 1
system_distributed RF: 3
system_schema RF: 1
system_auth RF: 1
Release versions:
4.0.0: [127.0.0.6, 127.0.0.5, 127.0.0.4, 127.0.0.3, 127.0.0.2, 
127.0.0.1]
Datacenters:
dc1 Up# 3 Down# 0 Unknown# 0
dc2 Up# 3 Down# 0 Unknown# 0
Snitch: org.apache.cassandra.locator.PropertyFileSnitch
DynamicEndPointSnitch: enabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
cc4dbfad-1e8f-386d-9a79-de80def27f33: [127.0.0.6, 127.0.0.5, 
127.0.0.4, 127.0.0.3, 127.0.0.2, 127.0.0.1]

{noformat}

Let me know what you think.)

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Comment Edited] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-02-26 Thread Venkata Harikrishna Nukala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377729#comment-16377729
 ] 

Venkata Harikrishna Nukala edited comment on CASSANDRA-13853 at 2/26/18 10:47 
PM:
--

[~pree] My apologies if this is against the Cassandra community guidelines; I 
am also a newbie. The last update I saw was from last year, so I thought this 
ticket was stale and picked it up. I don't mind taking a step back.


was (Author: hari_nv):
[~pree] My apologies if it is against the Cassandra community guidelines. I am 
also a newbie. That last update I saw was on last year. I thought this ticket 
is stale, so picked it up. I don't mind taking a step back.

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Attachments: 13853-trunk.txt
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-02-26 Thread Venkata Harikrishna Nukala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377729#comment-16377729
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-13853:


[~pree] My apologies if it is against the Cassandra community guidelines. I am 
also a newbie. The last update I saw was last year. I thought this ticket was 
stale, so I picked it up. I don't mind taking a step back.

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Attachments: 13853-trunk.txt
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Commented] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-02-26 Thread Venkata Harikrishna Nukala (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377591#comment-16377591
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-13853:


[~rustyrazorblade] [~pree] 
Hi, I made a patch (attached) for the required changes. Hope you don't mind. 
With these changes, "nodetool describecluster" output looks like this:
{noformat}
Cluster Information:
Name: test
Live nodes: 6
Joining nodes: [127.0.0.6, 127.0.0.5, 127.0.0.4, 127.0.0.3, 127.0.0.2, 
127.0.0.1]
Moving nodes: []
Leaving nodes: []
Keyspaces:
system_traces RF: 2
system RF: 1
system_distributed RF: 3
system_schema RF: 1
system_auth RF: 1
Release versions:
4.0.0: [127.0.0.6, 127.0.0.5, 127.0.0.4, 127.0.0.3, 127.0.0.2, 
127.0.0.1]
Datacenters:
dc1 Up# 3 Down# 0 Unknown# 0
dc2 Up# 3 Down# 0 Unknown# 0
Snitch: org.apache.cassandra.locator.PropertyFileSnitch
DynamicEndPointSnitch: enabled
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
cc4dbfad-1e8f-386d-9a79-de80def27f33: [127.0.0.6, 127.0.0.5, 
127.0.0.4, 127.0.0.3, 127.0.0.2, 127.0.0.1]

{noformat}

Let me know what you think.
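For illustration only (this is not the attached Java patch): the per-datacenter 
"Up# / Down#" counts in the output above can be produced by a simple grouping 
over the cluster's endpoint state. The sketch below is a minimal Python version 
that assumes a hypothetical (endpoint, datacenter, is_up) view of gossip state 
and omits the "Unknown" state for brevity.

```python
# Hypothetical sketch: aggregate per-datacenter Up/Down counts,
# similar in spirit to the enhanced describecluster output above.
from collections import Counter

def datacenter_summary(nodes):
    """nodes: iterable of (endpoint, datacenter, is_up) tuples.
    Returns {datacenter: (up_count, down_count)}."""
    up = Counter()
    down = Counter()
    for _endpoint, dc, is_up in nodes:
        (up if is_up else down)[dc] += 1
    return {dc: (up[dc], down[dc]) for dc in set(up) | set(down)}

# Example cluster matching the output above: 3 live nodes in each DC.
nodes = [
    ("127.0.0.1", "dc1", True), ("127.0.0.2", "dc1", True),
    ("127.0.0.3", "dc1", True), ("127.0.0.4", "dc2", True),
    ("127.0.0.5", "dc2", True), ("127.0.0.6", "dc2", True),
]
for dc, (n_up, n_down) in sorted(datacenter_summary(nodes).items()):
    print(f"{dc} Up# {n_up} Down# {n_down}")
```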

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Attachments: 13853-trunk.txt
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)






[jira] [Updated] (CASSANDRA-13853) nodetool describecluster should be more informative

2018-02-26 Thread Venkata Harikrishna Nukala (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-13853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venkata Harikrishna Nukala updated CASSANDRA-13853:
---
Attachment: 13853-trunk.txt

> nodetool describecluster should be more informative
> ---
>
> Key: CASSANDRA-13853
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13853
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Observability, Tools
>Reporter: Jon Haddad
>Assignee: Preetika Tyagi
>Priority: Minor
>  Labels: lhf
> Attachments: 13853-trunk.txt
>
>
> Additional information we should be displaying:
> * Total node count
> * List of datacenters, RF, with number of nodes per dc, how many are down, 
> * Version(s)





