[jira] [Comment Edited] (CASSANDRA-15081) LegacyLayout does not have same behavior as 2.x when handling unknown column names

2019-11-05 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968120#comment-16968120
 ] 

Michael Semb Wever edited comment on CASSANDRA-15081 at 11/6/19 6:37 AM:
-

Thanks [~cam1982]. I believe you're correct. But it needs to be checked. 
There's an upgrade dtest relevant to this, I will check it out and get back to 
you.


was (Author: michaelsembwever):
Thanks [~cam1982]. I believe you're correct. But it needs to be checked. I 
believe there's an upgrade dtest relevant. Will check it out and get back to 
you.

> LegacyLayout does not have same behavior as 2.x when handling unknown column 
> names
> --
>
> Key: CASSANDRA-15081
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15081
> Project: Cassandra
>  Issue Type: Bug
>  Components: Legacy/Local Write-Read Paths
>Reporter: Cameron Zemek
>Priority: High
>  Labels: patch, pull-request-available
> Attachments: 15081.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Due to a bug I haven't been able to reproduce, the production cluster had 
> unknown column names. To replicate the issue for this test I did the 
> following:
> {noformat}
> $ ccm create -v 2.1.19 -n 1 -s bug
> $ cat > schema.cql << 'EOF'
> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '1'} AND durable_writes = true;
> CREATE TABLE test.unknowntest (id int primary key, payload text, "paylo!d" 
> text);
> EOF
> $ ccm node1 cqlsh -f schema.cql
> $ export CASSANDRA_INCLUDE=~/.ccm/bug/node1/bin/cassandra.in.sh
> $ cat > bug.json << 'EOF'
> [
> {"key": "1",
> "cells": [["","",1554432501209207],
> ["paylo!d","hello world",1554432501209207],
> ["payload","hello world",1554432501209207]]}
> ]
> EOF
> $ ~/.ccm/repository/2.1.19/tools/bin/json2sstable -K test -c unknowntest 
> ~/bug.json 
> ~/.ccm/bug/node1/data0/test/unknowntest-/test-unknowntest-ka-1-Data.db{noformat}
> Then test the behavior of unknown columns in 2.1:
> {noformat}
> $ ccm stop
> $ ccm create -v 2.1.19 -n 1 -s bug2_1_19
> $ cat > schema2.cql << 'EOF'
> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '1'} AND durable_writes = true;
> CREATE TABLE test.unknowntest (id int primary key, payload text);
> EOF
> $ ccm node1 cqlsh -f schema2.cql
> $ ccm stop
> $ cp ~/.ccm/bug/node1/data0/test/unknowntest-/test-unknowntest-ka-1-* 
> ~/.ccm/bug2_1_19/node1/data0/test/unknowntest-/
> $ ccm start
> $ ccm node1 cqlsh
> Connected to bug2_1_19 at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 2.1.19 | CQL spec 3.2.1 | Native protocol v3]
> Use HELP for help.
> cqlsh> select * from test.unknowntest where id = 1;
>  id | payload
> ----+-------------
>   1 | hello world
> (1 rows){noformat}
> Compared to 3.11.4 which did the following:
> {noformat}
> $ ccm stop
> $ ccm create -v 3.11.4 -n 1 -s bug3_11_4
> $ ccm node1 cqlsh -f schema2.cql
> $ ccm stop
> $ cp ~/.ccm/bug/node1/data0/test/unknowntest-/test-unknowntest-ka-1-* 
> ~/.ccm/bug3_11_4/node1/data0/test/unknowntest-/
> $ ccm start
> $ ccm node1 cqlsh
> Connected to bug3_11_4 at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.11.4 | CQL spec 3.4.4 | Native protocol v4]
> Use HELP for help.
> cqlsh> select * from test.unknowntest where id = 1;
> ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] 
> message="Operation failed - received 0 responses and 1 failures" 
> info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 
> 'consistency': 'ONE'}
> {noformat}
> In the logs this resulted in an IllegalStateException from LegacyLayout line 
> 1127.
> The expected behavior would be to ignore the column and return the same 
> results as in 2.1.
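
The difference between the two behaviors described above can be illustrated with a small sketch. This is not Cassandra's LegacyLayout code: `decode_row`, `ROW_MARKER` and the `strict` flag are hypothetical names, and treating the empty cell name as a row marker is an assumption. The cells are the ones from bug.json in the report.

```python
import json

# Assumption: the cell with an empty name is a row marker, not a column.
ROW_MARKER = ""

def decode_row(cells, known_columns, strict=False):
    """Build a row from legacy cells, given the columns the schema knows.

    strict=False skips cells whose column is not in the schema (the behavior
    reported for 2.1); strict=True fails the read (the behavior reported
    for 3.11.4).
    """
    row = {}
    for name, value, _timestamp in cells:
        if name == ROW_MARKER:
            continue
        if name not in known_columns:
            if strict:
                raise ValueError(f"unknown column {name!r}")
            continue  # ignore the unknown column and keep reading
        row[name] = value
    return row

# The cells written by json2sstable in the reproduction steps.
cells = json.loads(
    '[["","",1554432501209207],'
    '["paylo!d","hello world",1554432501209207],'
    '["payload","hello world",1554432501209207]]'
)

print(decode_row(cells, {"payload"}))
```

With `strict=True` the same call raises on the `paylo!d` cell, which mirrors the failed read, while the lenient call returns only the `payload` value.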



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15081) LegacyLayout does not have same behavior as 2.x when handling unknown column names

2019-11-05 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968120#comment-16968120
 ] 

Michael Semb Wever commented on CASSANDRA-15081:


Thanks [~cam1982]. I believe you're correct. But it needs to be checked. I 
believe there's an upgrade dtest relevant. Will check it out and get back to 
you.




[jira] [Commented] (CASSANDRA-15081) LegacyLayout does not have same behavior as 2.x when handling unknown column names

2019-11-05 Thread Cameron Zemek (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968072#comment-16968072
 ] 

Cameron Zemek commented on CASSANDRA-15081:
---

[~mck] no input. As far as I could tell, CASSANDRA-13939 shouldn't be affected 
by this, but to be honest I didn't fully understand that issue. I mentioned it 
just in case it might be, so that someone more knowledgeable could 
interject if they see an issue. The unit tests passed, so I'm hoping that 
means I haven't broken anything elsewhere.




[jira] [Comment Edited] (CASSANDRA-15349) Add “Going away” message to the client protocol

2019-11-05 Thread Chris Lohfink (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968011#comment-16968011
 ] 

Chris Lohfink edited comment on CASSANDRA-15349 at 11/6/19 1:47 AM:


Would it be possible to send a CQL level event on the connection itself (i.e. a 
new opcode, 0x11)? On larger clusters it can take 10s to propagate a gossip event.

Perhaps even a "request to close" event or something on the connection itself, 
letting the client itself disconnect instead of just no longer sending requests 
to the node. Then the server can just forcibly shut down (like it does now) after 
some time, or move on once all connections have closed.
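
A rough client-side sketch of this idea follows. It assumes the 9-byte frame header of native protocol v3/v4 (version, flags, stream id, opcode, body length); the opcode value 0x11 is only the proposal from this comment, and `OPCODE_GOING_AWAY`, `parse_header` and `Connection` are hypothetical names, not part of any driver.

```python
import struct

# v3/v4 frame header: version (1), flags (1), stream id (2), opcode (1), length (4).
HEADER = struct.Struct(">BBhBi")

OPCODE_EVENT = 0x0C          # existing EVENT opcode
OPCODE_GOING_AWAY = 0x11     # hypothetical: the new opcode floated in this comment

def parse_header(data):
    """Decode the fixed-size frame header from the start of `data`."""
    version, flags, stream, opcode, length = HEADER.unpack_from(data)
    return {
        "version": version & 0x7F,           # low 7 bits carry the protocol version
        "is_response": bool(version & 0x80),  # high bit marks server-to-client frames
        "flags": flags,
        "stream": stream,
        "opcode": opcode,
        "length": length,
    }

class Connection:
    """Minimal per-connection state a client might keep."""

    def __init__(self):
        self.accepting_requests = True
        self.close_requested = False

    def on_frame(self, data):
        header = parse_header(data)
        if header["opcode"] == OPCODE_GOING_AWAY:
            # Stop routing new requests here and close once in-flight ones finish.
            self.accepting_requests = False
            self.close_requested = True
        return header

# A server-initiated v4 frame (stream id -1) with the hypothetical opcode and no body.
frame = HEADER.pack(0x84, 0, -1, OPCODE_GOING_AWAY, 0)
conn = Connection()
print(conn.on_frame(frame), conn.accepting_requests)
```

Handling the message per connection means the client does not depend on gossip propagation to learn that this particular node is shutting down.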


was (Author: cnlwsu):
Would it be possible to send a CQL level event on the connection itself? On 
larger clusters it can take 10s to propagate a gossip event.

Perhaps even a "request to close" event or something on the connection itself 
and let the client itself disconnect instead of just stop sending requests to 
it. Then the server can just forcibly shut down (like it does now) after some 
time or move on if all connections are left.

> Add “Going away” message to the client protocol
> ---
>
> Key: CASSANDRA-15349
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15349
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Messaging/Client
>Reporter: Alex Petrov
>Priority: Normal
>  Labels: client-impacting
>
> Add a “Going away” message that allows a node to announce its shutdown and lets 
> clients shut down gracefully and not attempt further requests.






[jira] [Commented] (CASSANDRA-15349) Add “Going away” message to the client protocol

2019-11-05 Thread Chris Lohfink (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16968011#comment-16968011
 ] 

Chris Lohfink commented on CASSANDRA-15349:
---

Would it be possible to send a CQL level event on the connection itself? On 
larger clusters it can take 10s to propagate a gossip event.

Perhaps even a "request to close" event or something on the connection itself 
and let the client itself disconnect instead of just stop sending requests to 
it. Then the server can just forcibly shut down (like it does now) after some 
time or move on if all connections are left.




[jira] [Updated] (CASSANDRA-15399) Add ability to track state in repair

2019-11-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CASSANDRA-15399:
---
Labels: pull-request-available  (was: )

> Add ability to track state in repair
> 
>
> Key: CASSANDRA-15399
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15399
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Consistency/Repair
>Reporter: David Capwell
>Assignee: David Capwell
>Priority: Normal
>  Labels: pull-request-available
>
> To enhance the visibility in repair, we should add in-memory objects that can 
> be exposed via JMX and virtual tables to show the state of the coordinator 
> and validations (leaving sync out for now).
> These objects should expose the timing (create, start, complete), current 
> state (enum specific to the entity), and progress estimate (% complete), 
> along with any useful entity-specific information.
> To help with growth, ActiveRepairService should periodically clean up 
> completed state after a configurable interval.






[jira] [Updated] (CASSANDRA-15399) Add ability to track state in repair

2019-11-05 Thread David Capwell (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Capwell updated CASSANDRA-15399:
--
Change Category: Operability
 Complexity: Normal
 Status: Open  (was: Triage Needed)




[jira] [Created] (CASSANDRA-15399) Add ability to track state in repair

2019-11-05 Thread David Capwell (Jira)
David Capwell created CASSANDRA-15399:
-

 Summary: Add ability to track state in repair
 Key: CASSANDRA-15399
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15399
 Project: Cassandra
  Issue Type: Improvement
  Components: Consistency/Repair
Reporter: David Capwell
Assignee: David Capwell


To enhance the visibility in repair, we should add in-memory objects that can 
be exposed via JMX and virtual tables to show the state of the coordinator and 
validations (leaving sync out for now).

These objects should expose the timing (create, start, complete), current state 
(enum specific to the entity), and progress estimate (% complete), along with 
any useful entity-specific information.

To help with growth, ActiveRepairService should periodically clean up completed 
state after a configurable interval.
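
As a rough illustration of the shape such objects could take, here is a minimal sketch. All names (`State`, `TrackedState`, `StateRegistry`, `retention_seconds`) are hypothetical and do not reflect the eventual patch; JMX and virtual table exposure are left out, and the clock is injectable only so the cleanup can be demonstrated without waiting.

```python
import enum
import time

class State(enum.Enum):
    CREATED = "created"
    STARTED = "started"
    COMPLETED = "completed"

class TrackedState:
    """In-memory record of timing, current state and progress for one entity."""

    def __init__(self, entity_id, clock=time.monotonic):
        self._clock = clock
        self.entity_id = entity_id
        self.state = State.CREATED
        self.progress = 0.0            # fraction complete, 0.0 to 1.0
        self.created_at = clock()
        self.started_at = None
        self.completed_at = None

    def start(self):
        self.state = State.STARTED
        self.started_at = self._clock()

    def update_progress(self, fraction):
        self.progress = min(1.0, max(0.0, fraction))

    def complete(self):
        self.state = State.COMPLETED
        self.progress = 1.0
        self.completed_at = self._clock()

class StateRegistry:
    """Holds tracked states and drops completed ones after a retention period."""

    def __init__(self, retention_seconds, clock=time.monotonic):
        self._clock = clock
        self._retention = retention_seconds
        self._states = {}

    def create(self, entity_id):
        state = TrackedState(entity_id, self._clock)
        self._states[entity_id] = state
        return state

    def snapshot(self):
        return {k: (s.state.value, s.progress) for k, s in self._states.items()}

    def cleanup(self):
        """Remove states completed at least `retention_seconds` ago."""
        now = self._clock()
        expired = [k for k, s in self._states.items()
                   if s.completed_at is not None
                   and now - s.completed_at >= self._retention]
        for k in expired:
            del self._states[k]
        return len(expired)

# Demonstration with a fake clock so no real time has to pass.
now = [0.0]
registry = StateRegistry(retention_seconds=60, clock=lambda: now[0])
validation = registry.create("validation-1")
validation.start()
validation.update_progress(0.5)
print(registry.snapshot())
```

In a real service the `cleanup` call would run on a schedule; here it is invoked by hand.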






[jira] [Commented] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-11-05 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967802#comment-16967802
 ] 

Benedict Elliott Smith commented on CASSANDRA-15397:


It's true that the costs of constructing a new {{IntervalTree}} are 
non-trivial, and it isn't necessarily reasonable to assume that it occurs 
sufficiently infrequently not to matter either.  The lookup cost is not a 
terribly significant cost to worry about, but reducing construction cost could 
be a win for some users, and this modification might improve that.  If we 
really cared about construction costs, it would be possible to introduce an 
immutable but updatable {{IntervalTree}} instead of building it from scratch 
every time.  But that's likely to be a lot more work.

It's worth noting that the {{OverlapIterator}} we already have in tree is very 
similar in principle, I assume (but with different assumptions about usage), 
though I haven't had a chance to look at your proposal yet.

> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/SSTable
>Reporter: Chandrasekhar Thumuluru
>Assignee: Chandrasekhar Thumuluru
>Priority: Low
> Attachments: 95p_1_SSTable_with_5000_Searches.png, 
> 95p_15000_SSTable_with_5000_Searches.png, 
> 95p_2_SSTable_with_5000_Searches.png, 
> 95p_25000_SSTable_with_5000_Searches.png, 
> 95p_3_SSTable_with_5000_Searches.png, 
> 95p_5000_SSTable_with_5000_Searches.png, 
> 99p_1_SSTable_with_5000_Searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java, 
> Mean_1_SSTable_with_5000_Searches.png, 
> Mean_15000_SSTable_with_5000_Searches.png, 
> Mean_2_SSTable_with_5000_Searches.png, 
> Mean_25000_SSTable_with_5000_Searches.png, 
> Mean_3_SSTable_with_5000_Searches.png, 
> Mean_5000_SSTable_with_5000_Searches.png
>
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with a 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs. In fact, we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (the idea is to eliminate intervals using 
> the start and end points of the search interval). 
> Based on the tests I ran, I noticed that Binary Search based elimination 
> almost always performs similarly to, or outperforms, IntervalTree based 
> search. The cost of IntervalTree construction is also substantial and 
> produces a lot of garbage during repairs. 
> I ran the tests using random intervals to build the tree/lists and another 
> randomly generated search interval with 5000 iterations. I'm attaching all 
> the relevant graphs. The x-axis in the graphs is the search interval 
> coverage. 10p means the search interval covered 10% of the intervals. The 
> y-axis is the time the search took in nanos. 
> PS: 
> # For the purpose of the test, I simplified the IntervalTree by removing the 
> data portion of the interval and changed the template version (Java generics) 
> to a specialized version. 
> # I used the code from Cassandra version _3.11_.
> # Time in the graph is in nanos. 
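
The two list-based approaches compared above can be sketched as follows. This is one possible reading of "elimination using the start and end points", not the attached IntervalListWithElimination.java; `IntervalList`, `linear_walk` and the closed-interval convention are assumptions made for the illustration.

```python
from bisect import bisect_left, bisect_right

def linear_walk(intervals, qs, qe):
    """Check every closed interval (start, end) against the query [qs, qe]."""
    return sorted(iv for iv in intervals if iv[0] <= qe and iv[1] >= qs)

class IntervalList:
    """Flat sorted lists that discard non-overlapping intervals by binary search."""

    def __init__(self, intervals):
        self.by_start = sorted(intervals, key=lambda iv: iv[0])
        self.by_end = sorted(intervals, key=lambda iv: iv[1])
        self.starts = [iv[0] for iv in self.by_start]
        self.ends = [iv[1] for iv in self.by_end]

    def search(self, qs, qe):
        # Intervals starting after qe cannot overlap: keep by_start[:hi].
        hi = bisect_right(self.starts, qe)
        # Intervals ending before qs cannot overlap: keep by_end[lo:].
        lo = bisect_left(self.ends, qs)
        # Scan whichever surviving candidate set is smaller.
        if hi <= len(self.by_end) - lo:
            hits = [iv for iv in self.by_start[:hi] if iv[1] >= qs]
        else:
            hits = [iv for iv in self.by_end[lo:] if iv[0] <= qe]
        return sorted(hits)

intervals = [(1, 5), (3, 8), (10, 12), (11, 20), (0, 2)]
print(IntervalList(intervals).search(4, 10))
print(linear_walk(intervals, 4, 10))
```

Both calls report the same overlapping intervals; the list version only pays two binary searches plus a scan of the smaller candidate set, and building it is just two sorts.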






[jira] [Commented] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-11-05 Thread Chandrasekhar Thumuluru (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967732#comment-16967732
 ] 

Chandrasekhar Thumuluru commented on CASSANDRA-15397:
-

Sure [~benedict]. I can make the changes and update the ticket with GitHub 
links. As you can see, I simplified the IntervalTree implementation for 
comparison purposes. I'll make the final changes with tests and push them to my 
fork by the weekend.

I completely agree with you that it's not a pressing change, but given the 
construction cost and the immutable nature of IntervalTree usage, I felt it's 
worth a shot. 




[jira] [Commented] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-11-05 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967715#comment-16967715
 ] 

Benedict Elliott Smith commented on CASSANDRA-15397:


Hi [~cthumuluru], this sounds like a plausible optimisation (without having 
thought about it much myself yet).  Unfortunately it's not a very _pressing_ 
optimisation, but I will try to find time within the next couple of weeks to 
give you some feedback.

If possible, we generally prefer links to GitHub branches.  Could you push your 
fork with these changes somewhere to look at?




[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-11-05 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15397:
---
Change Category: Performance
 Complexity: Normal
Component/s: Local/SSTable
  Reviewers: Benedict Elliott Smith
   Priority: Low  (was: Normal)
 Status: Open  (was: Triage Needed)

> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Local/SSTable
>Reporter: Chandrasekhar Thumuluru
>Assignee: Chandrasekhar Thumuluru
>Priority: Low
> Attachments: 95p_1_SSTable_with_5000_Searches.png, 
> 95p_15000_SSTable_with_5000_Searches.png, 
> 95p_2_SSTable_with_5000_Searches.png, 
> 95p_25000_SSTable_with_5000_Searches.png, 
> 95p_3_SSTable_with_5000_Searches.png, 
> 95p_5000_SSTable_with_5000_Searches.png, 
> 99p_1_SSTable_with_5000_Searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java, 
> Mean_1_SSTable_with_5000_Searches.png, 
> Mean_15000_SSTable_with_5000_Searches.png, 
> Mean_2_SSTable_with_5000_Searches.png, 
> Mean_25000_SSTable_with_5000_Searches.png, 
> Mean_3_SSTable_with_5000_Searches.png, 
> Mean_5000_SSTable_with_5000_Searches.png
>
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with a 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs. In fact, we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (the idea is to eliminate intervals using the 
> start and end points of the search interval). 
> Based on the tests I ran, Binary Search based elimination almost 
> always performs similarly to, or outperforms, IntervalTree based 
> search. The cost of IntervalTree construction is also substantial and 
> produces a lot of garbage during repairs. 
> I ran the tests using random intervals to build the tree/lists and another 
> randomly generated search interval with 5000 iterations. I'm attaching all 
> the relevant graphs. The x-axis in the graphs is the search interval 
> coverage. 10p means the search interval covered 10% of the intervals. The 
> y-axis is the time the search took in nanos. 
> PS: 
> # For the purpose of the test, I simplified the IntervalTree by removing the data 
> portion of the interval, and modified the template version (Java generics) to a 
> specialized version. 
> # I used the code from Cassandra version _3.11_.
> # Time in the graph is in nanos. 
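
The two list-based alternatives described in the issue can be sketched roughly as follows. This is a minimal illustration, not the attached IntervalList.java or IntervalListWithElimination.java: the class name, the closed-interval convention, and the choice to sort only by start point are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: intervals kept in a flat array sorted by start point.
final class IntervalSearchSketch {
    /** Closed interval [start, end]; a stand-in for an SSTable's covered range. */
    static final class Interval {
        final long start;
        final long end;
        Interval(long start, long end) { this.start = start; this.end = end; }
    }

    private final Interval[] byStart; // sorted ascending by start

    IntervalSearchSketch(List<Interval> intervals) {
        byStart = intervals.toArray(new Interval[0]);
        Arrays.sort(byStart, Comparator.comparingLong((Interval i) -> i.start));
    }

    /** Linear walk: test every interval against the search interval [qs, qe]. */
    List<Interval> searchLinear(long qs, long qe) {
        List<Interval> out = new ArrayList<>();
        for (Interval i : byStart)
            if (i.start <= qe && i.end >= qs)
                out.add(i);
        return out;
    }

    /**
     * Elimination: binary search for the first interval whose start is past qe.
     * Everything from that index on cannot overlap, so only the prefix is walked.
     */
    List<Interval> searchWithElimination(long qs, long qe) {
        int lo = 0, hi = byStart.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (byStart[mid].start <= qe) lo = mid + 1;
            else hi = mid;
        }
        List<Interval> out = new ArrayList<>();
        for (int i = 0; i < lo; i++)
            if (byStart[i].end >= qs)
                out.add(byStart[i]);
        return out;
    }
}
```

A second array sorted by end point would allow the same elimination from the other side; the sketch leaves that out to stay short.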






[jira] [Assigned] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-11-05 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith reassigned CASSANDRA-15397:
--

Assignee: Chandrasekhar Thumuluru

> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chandrasekhar Thumuluru
>Assignee: Chandrasekhar Thumuluru
>Priority: Normal
> Attachments: 95p_1_SSTable_with_5000_Searches.png, 
> 95p_15000_SSTable_with_5000_Searches.png, 
> 95p_2_SSTable_with_5000_Searches.png, 
> 95p_25000_SSTable_with_5000_Searches.png, 
> 95p_3_SSTable_with_5000_Searches.png, 
> 95p_5000_SSTable_with_5000_Searches.png, 
> 99p_1_SSTable_with_5000_Searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java, 
> Mean_1_SSTable_with_5000_Searches.png, 
> Mean_15000_SSTable_with_5000_Searches.png, 
> Mean_2_SSTable_with_5000_Searches.png, 
> Mean_25000_SSTable_with_5000_Searches.png, 
> Mean_3_SSTable_with_5000_Searches.png, 
> Mean_5000_SSTable_with_5000_Searches.png
>
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with a 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs. In fact, we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (the idea is to eliminate intervals using the 
> start and end points of the search interval). 
> Based on the tests I ran, Binary Search based elimination almost 
> always performs similarly to, or outperforms, IntervalTree based 
> search. The cost of IntervalTree construction is also substantial and 
> produces a lot of garbage during repairs. 
> I ran the tests using random intervals to build the tree/lists and another 
> randomly generated search interval with 5000 iterations. I'm attaching all 
> the relevant graphs. The x-axis in the graphs is the search interval 
> coverage. 10p means the search interval covered 10% of the intervals. The 
> y-axis is the time the search took in nanos. 
> PS: 
> # For the purpose of the test, I simplified the IntervalTree by removing the data 
> portion of the interval, and modified the template version (Java generics) to a 
> specialized version. 
> # I used the code from Cassandra version _3.11_.
> # Time in the graph is in nanos. 






[jira] [Comment Edited] (CASSANDRA-15390) Avoid unnecessary collection/iterator allocations during btree construction

2019-11-05 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967687#comment-16967687
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-15390 at 11/5/19 5:16 PM:
-

I like the pattern.  It might be worth putting in some explanatory comments, 
about why these functions exist?

I'm not thrilled by the name "Getter" since that usually means a member 
function that returns a value, but I don't have a much better suggestion.  
Perhaps {{IterateFunction}}?  Could also drop "get" as a prefix of the method 
name.

I wonder if there is any value in introducing a bulk {{nextAt}} method that can 
fetch into an array, for the leaf building mode.  We could fetch them all via 
{{arrayCopy}}, then loop over the array to invoke the {{UpdateFunction}} 
(conditionally on it not being no-op, even, since we seem to test this anyway 
already).
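
As a rough illustration of the bulk-fetch idea in the last paragraph, the sketch below copies a run of elements into a leaf array with a single System.arraycopy and only then applies an update function, skipping the loop entirely when the function is a no-op. The class and method names are hypothetical, and a null UnaryOperator stands in for a no-op {{UpdateFunction}}; none of this is the patch's actual API.

```java
import java.util.function.UnaryOperator;

// Hypothetical sketch of filling a btree leaf in bulk.
final class BulkLeafFillSketch {
    /**
     * Copies count elements of source, starting at from, into a new leaf array.
     * If update is non-null it is applied to each copied element in place;
     * null is treated as a no-op, so the per-element loop is skipped.
     */
    static Object[] fillLeaf(Object[] source, int from, int count, UnaryOperator<Object> update) {
        Object[] leaf = new Object[count];
        System.arraycopy(source, from, leaf, 0, count); // one bulk copy, no iterator
        if (update != null) {
            for (int i = 0; i < count; i++)
                leaf[i] = update.apply(leaf[i]);
        }
        return leaf;
    }
}
```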


was (Author: benedict):
I like the pattern.  It might be worth putting in some explanatory comments, 
about why these functions exist?

I'm not thrilled by the name "Getter" since that usually means a member 
function that returns a value, but I don't have a much better suggestion.  
Perhaps {{IterateFunction}}?  Could also drop "get" as a prefix of the method 
name.

I wonder if there is any value in introducing a bulk {{nextAt}} method that can 
fetch into an array, for the leaf building mode.  We could fetch them all via 
arrayCopy, then loop over the array to invoke the {{UpdateFunction}} 
(conditionally on it not being no-op, even, since we seem to test this anyway 
already).

> Avoid unnecessary collection/iterator allocations during btree construction
> ---
>
> Key: CASSANDRA-15390
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15390
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> A heavily used btree builder path does a lot of unnecessary conversions to 
> and from collections and iterators. Adding dedicated support for Object[] 
> reduces compaction garbage by up to 8.3%






[jira] [Commented] (CASSANDRA-15390) Avoid unnecessary collection/iterator allocations during btree construction

2019-11-05 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967687#comment-16967687
 ] 

Benedict Elliott Smith commented on CASSANDRA-15390:


I like the pattern.  It might be worth putting in some explanatory comments, 
about why these functions exist?

I'm not thrilled by the name "Getter" since that usually means a member 
function that returns a value, but I don't have a much better suggestion.  
Perhaps {{IterateFunction}}?  Could also drop "get" as a prefix of the method 
name.

I wonder if there is any value in introducing a bulk {{nextAt}} method that can 
fetch into an array, for the leaf building mode.  We could fetch them all via 
arrayCopy, then loop over the array to invoke the {{UpdateFunction}} 
(conditionally on it not being no-op, even, since we seem to test this anyway 
already).

> Avoid unnecessary collection/iterator allocations during btree construction
> ---
>
> Key: CASSANDRA-15390
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15390
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> A heavily used btree builder path does a lot of unnecessary conversions to 
> and from collections and iterators. Adding dedicated support for Object[] 
> reduces compaction garbage by up to 8.3%






[jira] [Updated] (CASSANDRA-15390) Avoid unnecessary collection/iterator allocations during btree construction

2019-11-05 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15390:
---
Reviewers: Benedict Elliott Smith, Benedict Elliott Smith  (was: Benedict 
Elliott Smith)
   Status: Review In Progress  (was: Patch Available)

> Avoid unnecessary collection/iterator allocations during btree construction
> ---
>
> Key: CASSANDRA-15390
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15390
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> A heavily used btree builder path does a lot of unnecessary conversions to 
> and from collections and iterators. Adding dedicated support for Object[] 
> reduces compaction garbage by up to 8.3%






[jira] [Updated] (CASSANDRA-15390) Avoid unnecessary collection/iterator allocations during btree construction

2019-11-05 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15390:
---
Status: Changes Suggested  (was: Review In Progress)

> Avoid unnecessary collection/iterator allocations during btree construction
> ---
>
> Key: CASSANDRA-15390
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15390
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> A heavily used btree builder path does a lot of unnecessary conversions to 
> and from collections and iterators. Adding dedicated support for Object[] 
> reduces compaction garbage by up to 8.3%






[jira] [Updated] (CASSANDRA-14773) Overflow of 32-bit integer during compaction.

2019-11-05 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-14773:
---
Reviewers:   (was: Benedict Elliott Smith)

> Overflow of 32-bit integer during compaction.
> -
>
> Key: CASSANDRA-14773
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14773
> Project: Cassandra
>  Issue Type: Bug
>  Components: Local/Compaction
>Reporter: Vladimir Bukhtoyarov
>Assignee: Vladimir Bukhtoyarov
>Priority: Urgent
> Fix For: 4.0, 4.0-beta
>
>
> In scope of CASSANDRA-13444 the compaction was significantly improved from 
> a CPU and memory perspective. However, this improvement introduced a bug in 
> rounding. When rounding an expiration time which is close to 
> *Cell.MAX_DELETION_TIME* (it is just *Integer.MAX_VALUE*), a math overflow 
> happens (because in scope of CASSANDRA-13444 the data type for point was 
> changed from Long to Integer in order to reduce memory footprint). As a result, 
> the point becomes negative and acts as a silent poison for internal structures of 
> StreamingTombstoneHistogramBuilder like *DistanceHolder* and *DataHolder*. 
> Then, depending on the point intervals:
>  * The TombstoneHistogram produces wrong values when the interval of points is 
> less than binSize; this is not critical.
>  * Compaction crashes with ArrayIndexOutOfBoundsException if the number of point 
> intervals is greater than binSize; this case is very critical.
>  
> This is pull request [https://github.com/apache/cassandra/pull/273] that 
> reproduces the issue and provides the fix. 
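
The overflow described above can be reproduced in isolation. The sketch below is not the code from the pull request: the rounding rule (round up to the next multiple of a bucket width) and all names are assumptions chosen to show how int arithmetic wraps to a negative value near Integer.MAX_VALUE, and how doing the addition in long and clamping avoids it.

```java
// Hypothetical sketch of the rounding overflow; not the actual Cassandra code.
final class RoundingOverflowSketch {
    /** Rounds point up to a multiple of roundSeconds using int math; wraps near Integer.MAX_VALUE. */
    static int roundUpUnsafe(int point, int roundSeconds) {
        int d = point % roundSeconds;
        return d > 0 ? point + (roundSeconds - d) : point; // int addition can overflow here
    }

    /** Same rounding, but the addition is done in long and the result is clamped. */
    static int roundUpSaturating(int point, int roundSeconds) {
        int d = point % roundSeconds;
        long rounded = d > 0 ? (long) point + (roundSeconds - d) : point;
        return (int) Math.min(rounded, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        int nearMax = Integer.MAX_VALUE - 1;
        System.out.println(roundUpUnsafe(nearMax, 60) < 0);    // the wrapped value is negative
        System.out.println(roundUpSaturating(nearMax, 60));    // clamped to Integer.MAX_VALUE
    }
}
```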
>  
> The stacktrace when running (on a codebase without the fix) 
> *testMathOverflowDuringRoundingOfLargeTimestamp* without the -ea JVM flag:
> {noformat}
> java.lang.ArrayIndexOutOfBoundsException
> at java.lang.System.arraycopy(Native Method)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$DistanceHolder.add(StreamingTombstoneHistogramBuilder.java:208)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.flushValue(StreamingTombstoneHistogramBuilder.java:140)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$$Lambda$1/1967205423.consume(Unknown
>  Source)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder$Spool.forEach(StreamingTombstoneHistogramBuilder.java:574)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.flushHistogram(StreamingTombstoneHistogramBuilder.java:124)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilder.build(StreamingTombstoneHistogramBuilder.java:184)
> at 
> org.apache.cassandra.utils.streamhist.StreamingTombstoneHistogramBuilderTest.testMathOverflowDuringRoundingOfLargeTimestamp(StreamingTombstoneHistogramBuilderTest.java:183)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:44)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:180)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:41)
> at org.junit.runners.ParentRunner$1.evaluate(ParentRunner.java:173)
> at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
> at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
> at org.junit.runner.JUnitCore.run(JUnitCore.java:159)
> at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
> at 
> com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
> at 
> com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
> at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> {noformat}
>  
> The stacktrace when running(on codebase without fix)  
> 

[jira] [Updated] (CASSANDRA-14779) Changing EndpointSnitch via JMX has problems

2019-11-05 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-14779:
---
Reviewers:   (was: Benedict Elliott Smith)

> Changing EndpointSnitch via JMX has problems
> 
>
> Key: CASSANDRA-14779
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14779
> Project: Cassandra
>  Issue Type: Bug
>  Components: Cluster/Membership, Observability/JMX
>Reporter: Benedict Elliott Smith
>Assignee: Ian Cleasby
>Priority: Low
> Fix For: 4.x
>
>
> The snitch can be set via StorageService over JMX, for what reason I’m 
> unsure. If this were to happen, we might encounter the following problems:
>  * If the effective local DC were to change, we would not update it. Perhaps 
> changing the local DC of a node should be rejected and cause it to fail, but 
> presently, it would simply result in our disagreeing with the snitch.
>  * During the transition, routing of queries might be broken, as we fetch the 
> snitch multiple times in different locations when deciding where to route our 
> query and writes. It’s not clear what the outcome of a discordant view of the 
> ring would be.
> Probably, changing this information in a live cluster is dangerous and we 
> should actually reject any effective changes to rack, or DC for any node. But 
> presently we don’t seem to corroborate that this information remains the 
> same. We don’t seem to perform any cluster wide confirmation that this data 
> is consistent, generally, which perhaps we should also consider.






[jira] [Updated] (CASSANDRA-15358) Cassandra alpha 4 testing - Nodes crashing due to bufferpool allocator issue

2019-11-05 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15358:
---
 Bug Category: Parent values: Availability(12983)Level 1 values: Response 
Crash(12991)
   Complexity: Normal
Discovered By: User Report
 Severity: Normal
   Status: Open  (was: Triage Needed)

> Cassandra alpha 4 testing - Nodes crashing due to bufferpool allocator issue
> 
>
> Key: CASSANDRA-15358
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15358
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/benchmark
>Reporter: Santhosh Kumar Ramalingam
>Assignee: Benedict Elliott Smith
>Priority: Normal
>  Labels: 4.0, alpha
>
> Hitting a bug with the Cassandra 4 alpha version. The same bug is repeated with 
> different versions of Java (8, 11 & 12) [~benedict]
>  
> Stack trace:
> {code:java}
> INFO [main] 2019-10-11 16:07:12,024 Server.java:164 - Starting listening for 
> CQL clients on /1.3.0.6:9042 (unencrypted)...
> WARN [OptionalTasks:1] 2019-10-11 16:07:13,961 CassandraRoleManager.java:343 
> - CassandraRoleManager skipped default role setup: some nodes were not ready
> INFO [OptionalTasks:1] 2019-10-11 16:07:13,961 CassandraRoleManager.java:369 
> - Setup task failed with error, rescheduling
> WARN [Messaging-EventLoop-3-2] 2019-10-11 16:07:22,038 NoSpamLogger.java:94 - 
> 10.3x.4x.5x:7000->1.3.0.5:7000-LARGE_MESSAGES-[no-channel] dropping message 
> of type PING_REQ whose timeout expired before reaching the network
> WARN [OptionalTasks:1] 2019-10-11 16:07:23,963 CassandraRoleManager.java:343 
> - CassandraRoleManager skipped default role setup: some nodes were not ready
> INFO [OptionalTasks:1] 2019-10-11 16:07:23,963 CassandraRoleManager.java:369 
> - Setup task failed with error, rescheduling
> INFO [Messaging-EventLoop-3-6] 2019-10-11 16:07:32,759 NoSpamLogger.java:91 - 
> 10.3x.4x.5x:7000->1.3.0.2:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) 
> failed: Connection refused: /1.3.0.2:7000
> Caused by: java.net.ConnectException: finishConnect(..) failed: Connection 
> refused
> at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124)
> at io.netty.channel.unix.Socket.finishConnect(Socket.java:243)
> at 
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:667)
> at 
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:644)
> at 
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:524)
> at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:414)
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:326)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
> at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:834)
> WARN [Messaging-EventLoop-3-3] 2019-10-11 16:11:32,639 NoSpamLogger.java:94 - 
> 1.3.4.6:7000->1.3.4.5:7000-URGENT_MESSAGES-[no-channel] dropping message of 
> type GOSSIP_DIGEST_SYN whose timeout expired before reaching the network
> INFO [Messaging-EventLoop-3-18] 2019-10-11 16:11:33,077 NoSpamLogger.java:91 
> - 1.3.4.5:7000->1.3.4.4:7000-URGENT_MESSAGES-[no-channel] failed to connect
>  
> ERROR [Messaging-EventLoop-3-11] 2019-10-10 01:34:34,407 
> InboundMessageHandler.java:657 - 
> 1.3.4.5:7000->1.3.4.8:7000-LARGE_MESSAGES-0b7d09cd unexpected exception 
> caught while processing inbound messages; terminating connection
> java.lang.IllegalArgumentException: initialBuffer is not a direct buffer.
> at io.netty.buffer.UnpooledDirectByteBuf.<init>(UnpooledDirectByteBuf.java:87)
> at 
> io.netty.buffer.UnpooledUnsafeDirectByteBuf.<init>(UnpooledUnsafeDirectByteBuf.java:59)
> at 
> org.apache.cassandra.net.BufferPoolAllocator$Wrapped.<init>(BufferPoolAllocator.java:95)
> at 
> org.apache.cassandra.net.BufferPoolAllocator.newDirectBuffer(BufferPoolAllocator.java:56)
> at 
> io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
> at 
> io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
> at 
> io.netty.channel.unix.PreferredDirectByteBufAllocator.ioBuffer(PreferredDirectByteBufAllocator.java:53)
> at 
> io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
> at 
> 

[jira] [Commented] (CASSANDRA-15358) Cassandra alpha 4 testing - Nodes crashing due to bufferpool allocator issue

2019-11-05 Thread Santhosh Kumar Ramalingam (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967677#comment-16967677
 ] 

Santhosh Kumar Ramalingam commented on CASSANDRA-15358:
---

[~benedict]

Our spark test was run on the version taken from the trunk. 

> Cassandra alpha 4 testing - Nodes crashing due to bufferpool allocator issue
> 
>
> Key: CASSANDRA-15358
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15358
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/benchmark
>Reporter: Santhosh Kumar Ramalingam
>Assignee: Benedict Elliott Smith
>Priority: Normal
>  Labels: 4.0, alpha
>
> Hitting a bug with the Cassandra 4 alpha version. The same bug is repeated with 
> different versions of Java (8, 11 & 12) [~benedict]
>  
> Stack trace:
> {code:java}
> INFO [main] 2019-10-11 16:07:12,024 Server.java:164 - Starting listening for 
> CQL clients on /1.3.0.6:9042 (unencrypted)...
> WARN [OptionalTasks:1] 2019-10-11 16:07:13,961 CassandraRoleManager.java:343 
> - CassandraRoleManager skipped default role setup: some nodes were not ready
> INFO [OptionalTasks:1] 2019-10-11 16:07:13,961 CassandraRoleManager.java:369 
> - Setup task failed with error, rescheduling
> WARN [Messaging-EventLoop-3-2] 2019-10-11 16:07:22,038 NoSpamLogger.java:94 - 
> 10.3x.4x.5x:7000->1.3.0.5:7000-LARGE_MESSAGES-[no-channel] dropping message 
> of type PING_REQ whose timeout expired before reaching the network
> WARN [OptionalTasks:1] 2019-10-11 16:07:23,963 CassandraRoleManager.java:343 
> - CassandraRoleManager skipped default role setup: some nodes were not ready
> INFO [OptionalTasks:1] 2019-10-11 16:07:23,963 CassandraRoleManager.java:369 
> - Setup task failed with error, rescheduling
> INFO [Messaging-EventLoop-3-6] 2019-10-11 16:07:32,759 NoSpamLogger.java:91 - 
> 10.3x.4x.5x:7000->1.3.0.2:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) 
> failed: Connection refused: /1.3.0.2:7000
> Caused by: java.net.ConnectException: finishConnect(..) failed: Connection 
> refused
> at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124)
> at io.netty.channel.unix.Socket.finishConnect(Socket.java:243)
> at 
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:667)
> at 
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:644)
> at 
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:524)
> at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:414)
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:326)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
> at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:834)
> WARN [Messaging-EventLoop-3-3] 2019-10-11 16:11:32,639 NoSpamLogger.java:94 - 
> 1.3.4.6:7000->1.3.4.5:7000-URGENT_MESSAGES-[no-channel] dropping message of 
> type GOSSIP_DIGEST_SYN whose timeout expired before reaching the network
> INFO [Messaging-EventLoop-3-18] 2019-10-11 16:11:33,077 NoSpamLogger.java:91 
> - 1.3.4.5:7000->1.3.4.4:7000-URGENT_MESSAGES-[no-channel] failed to connect
>  
> ERROR [Messaging-EventLoop-3-11] 2019-10-10 01:34:34,407 
> InboundMessageHandler.java:657 - 
> 1.3.4.5:7000->1.3.4.8:7000-LARGE_MESSAGES-0b7d09cd unexpected exception 
> caught while processing inbound messages; terminating connection
> java.lang.IllegalArgumentException: initialBuffer is not a direct buffer.
> at io.netty.buffer.UnpooledDirectByteBuf.<init>(UnpooledDirectByteBuf.java:87)
> at 
> io.netty.buffer.UnpooledUnsafeDirectByteBuf.<init>(UnpooledUnsafeDirectByteBuf.java:59)
> at 
> org.apache.cassandra.net.BufferPoolAllocator$Wrapped.<init>(BufferPoolAllocator.java:95)
> at 
> org.apache.cassandra.net.BufferPoolAllocator.newDirectBuffer(BufferPoolAllocator.java:56)
> at 
> io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
> at 
> io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
> at 
> io.netty.channel.unix.PreferredDirectByteBufAllocator.ioBuffer(PreferredDirectByteBufAllocator.java:53)
> at 
> io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
> at 
> io.netty.channel.epoll.EpollRecvByteAllocatorHandle.allocate(EpollRecvByteAllocatorHandle.java:75)
> at 
> 

[jira] [Commented] (CASSANDRA-15241) Virtual table to expose current running queries

2019-11-05 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967672#comment-16967672
 ] 

Benedict Elliott Smith commented on CASSANDRA-15241:


[~clohfink] sorry, for some reason in my mind this was still in your queue, not 
mine.  I will get it reviewed for you this week.

> Virtual table to expose current running queries
> ---
>
> Key: CASSANDRA-15241
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15241
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Feature/Virtual Tables
>Reporter: Chris Lohfink
>Assignee: Chris Lohfink
>Priority: Normal
>
> Expose current running queries and their duration.
> {code}cqlsh> select * from system_views.queries;
>  thread_id                    | duration_micros | task
> ------------------------------+-----------------+---------------------------------------------------------------------------------
>  Native-Transport-Requests-17 |            6325 | QUERY select * from system_views.queries; [pageSize = 100]
>   Native-Transport-Requests-4 |           14681 | EXECUTE f4115f91190d4acf09e452637f1f2444 with 0 values at consistency LOCAL_ONE
>   Native-Transport-Requests-6 |           14678 | EXECUTE f4115f91190d4acf09e452637f1f2444 with 0 values at consistency LOCAL_ONE
>                  ReadStage-10 |           16535 | SELECT * FROM basic.wide1 LIMIT 5000
>                  ReadStage-13 |           16535 | SELECT * FROM basic.wide1 LIMIT 5000
>                  ReadStage-14 |           16535 | SELECT * FROM basic.wide1 LIMIT 5000
>                  ReadStage-19 |           11861 | SELECT * FROM basic.wide1 LIMIT 5000
>                  ReadStage-20 |           11861 | SELECT * FROM basic.wide1 LIMIT 5000
>                  ReadStage-22 |            7279 | SELECT * FROM basic.wide1 LIMIT 5000
>                  ReadStage-23 |            4716 | SELECT * FROM basic.wide1 LIMIT 5000
>                   ReadStage-5 |           16535 | SELECT * FROM basic.wide1 LIMIT 5000
>                   ReadStage-7 |           16535 | SELECT * FROM basic.wide1 LIMIT 5000
>                   ReadStage-8 |           16535 | SELECT * FROM basic.wide1 LIMIT 5000{code}






[jira] [Commented] (CASSANDRA-15389) Minimize BTree iterator allocations

2019-11-05 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967627#comment-16967627
 ] 

Benedict Elliott Smith commented on CASSANDRA-15389:


Just collecting here some comments I made on GitHub:

h3. Rows.collectStats
* Could simply increment the long directly by 0x and 1, respectively, 
without unpacking
* The saturation checks seem to be of limited value after the loop terminates, 
and should perhaps be done on each increment?  It would throw if we had an 
overflow of 2B to 4B, but not 4B to 6B.  Not sure how likely either of these 
things are.
* The right-shift to extract should probably be unsigned (though unimportant if 
we haven't overflowed)
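
To make the packing points concrete, here is a small sketch of two 32-bit counters held in one long. The layout (one counter in the high 32 bits, one in the low 32 bits), the constant 1L << 32 and all names are assumptions for illustration and are not taken from the patch. It shows incrementing either half without unpacking, and why the extraction of the high half wants an unsigned shift.

```java
// Hypothetical sketch: two counters packed into a single long.
final class PackedCountersSketch {
    static final long HIGH_ONE = 1L << 32; // assumed increment for the high counter

    static long incHigh(long packed) { return packed + HIGH_ONE; } // no unpacking needed
    static long incLow(long packed)  { return packed + 1L; }       // carries into high on overflow

    /** Unsigned shift: the result stays non-negative even if bit 63 is set. */
    static long high(long packed) { return packed >>> 32; }
    static long low(long packed)  { return packed & 0xFFFFFFFFL; }

    public static void main(String[] args) {
        long packed = 0L;
        packed = incHigh(incHigh(packed));
        packed = incLow(incLow(incLow(packed)));
        System.out.println(high(packed) + " " + low(packed));
        // With every bit set, a signed shift sign-extends while the unsigned shift does not.
        System.out.println((-1L >> 32) + " vs " + (-1L >>> 32));
    }
}
```

Because incLow carries silently into the high half, a saturation check made only after the loop cannot tell a wrapped low counter from a legitimate high count, which is the concern raised in the second bullet.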

h3. SerializationHeader
Not sure if this would be an improvement or not, but 
{{FullBTreeSearchIterator}} has a rewind method, and this could be hoisted into 
{{SearchIterator}} to make it reusable. It's not clear if this would be faster 
than consulting a {{HashMap}}, particularly with the new 
{{LeafBTreeSearchIterator}}, which uses {{binarySearch}} without any optimisation 
for the case where we are looking up the same set of values in sequence. 
However, {{FullBTreeSearchIterator}} would have no indirect memory accesses for 
the common case of all (or most) columns being visited, and this could also be 
propagated to {{LeafBTreeSearchIterator}}.  Either way, it would mean fewer 
indirect memory accesses.
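
To make the rewind idea concrete, here is a rough sketch of a reusable search iterator over a sorted array. It is not the Cassandra {{SearchIterator}} API: the class {{RewindableSearch}} and its methods are hypothetical, and a plain {{String[]}} stands in for the BTree. It assumes keys are requested in ascending order between rewinds.

```java
import java.util.Arrays;

// Hypothetical stand-in for a search iterator that can be rewound and reused.
final class RewindableSearch
{
    private final String[] sorted; // must be sorted ascending
    private int position;          // everything before this index has been passed

    RewindableSearch(String[] sorted)
    {
        this.sorted = sorted;
    }

    // Returns the index of key, or -1 if absent. Only the range from the
    // current position onward is searched.
    int next(String key)
    {
        int idx = Arrays.binarySearch(sorted, position, sorted.length, key);
        if (idx < 0)
        {
            position = -idx - 1; // insertion point: later keys cannot be before it
            return -1;
        }
        position = idx + 1;
        return idx;
    }

    // Reset to the start so the same instance can serve the next row.
    void rewind()
    {
        position = 0;
    }
}
```

One instance can then be kept per header and rewound for each row, rather than allocating a new iterator per row; whether that beats a {{HashMap}} lookup would still need measuring.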

h3. BTreeRow
* {{hasComplex}} doesn't need to use an iterator at all - we can simply search 
for the first complex cell using {{BTree.find}} and the {{Cell}} equivalent of 
{{Columns.findFirstComplexIdx}} - however it looks like this method isn't even 
used, so we could simply remove it entirely.
* {{hasComplexDeletion}} could use the same logic to determine the 
{{firstComplexIdx}}, and instead of providing a {{StopCondition}} we could 
provide {{(firstComplexIdx, size)}} as the bounds to accumulate over.

These two would remove the need for a direction to the accumulate function, and 
the {{StopCondition}}, which I think would make for an easier-to-understand API 
(and an implementation that is easier to parse).
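
As a sketch of the bounded form suggested above: the accumulate function takes {{(from, to)}} bounds, and the caller finds the first complex cell by binary search. Everything here is a simplified stand-in (a sorted array instead of a BTree, a hypothetical {{Cell}} class with two flags), assuming simple cells sort before complex ones.

```java
// Hypothetical illustration of accumulate-with-bounds instead of a StopCondition.
final class BoundedAccumulate
{
    interface LongAccumulator<T>
    {
        long apply(T value, long accumulated);
    }

    // Stand-in for a cell; in this sketch simple cells sort before complex ones.
    static final class Cell
    {
        final boolean complex;
        final boolean deleted;

        Cell(boolean complex, boolean deleted)
        {
            this.complex = complex;
            this.deleted = deleted;
        }
    }

    // Walk items[from, to) and fold them into a long, with no iterator
    // allocation, no direction flag and no stop condition.
    static <T> long accumulate(T[] items, int from, int to, LongAccumulator<T> fn, long initial)
    {
        long acc = initial;
        for (int i = from; i < to; i++)
            acc = fn.apply(items[i], acc);
        return acc;
    }

    // Binary search for the first complex cell; returns cells.length if none.
    static int firstComplexIdx(Cell[] cells)
    {
        int lo = 0, hi = cells.length;
        while (lo < hi)
        {
            int mid = (lo + hi) >>> 1;
            if (cells[mid].complex)
                hi = mid;
            else
                lo = mid + 1;
        }
        return lo;
    }

    static boolean hasComplexDeletion(Cell[] cells)
    {
        int first = firstComplexIdx(cells);
        return accumulate(cells, first, cells.length, (c, acc) -> c.deleted ? acc + 1 : acc, 0L) > 0;
    }
}
```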

> Minimize BTree iterator allocations
> ---
>
> Key: CASSANDRA-15389
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15389
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> Allocations of BTree iterators contribute a large amount of garbage to the 
> compaction and read paths.
> This patch removes most btree iterator allocations on hot paths by:
>  • using Row#apply where appropriate on frequently called methods 
> (Row#digest, Row#validateData)
>  • adding BTree accumulate method. Like the apply method, this method walks 
> the btree with a function that takes and returns a long argument, this 
> eliminates iterator allocations without adding helper object allocations 
> (BTreeRow#hasComplex, BTreeRow#hasInvalidDeletions, BTreeRow#dataSize, 
> BTreeRow#unsharedHeapSizeExcludingData, Rows#collectStats, 
> UnfilteredSerializer#serializedRowBodySize) as well as eliminating the 
> allocation of helper objects in places where apply was used previously^[1]^.
>  • Create map of columns in SerializationHeader, this lets us avoid 
> allocating a btree search iterator for each row we serialize.
> These optimizations reduce garbage created during compaction by up to 13.5%
>  
> [1] the memory test does measure memory allocated by lambdas capturing objects






[jira] [Updated] (CASSANDRA-15394) Remove list iterators

2019-11-05 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-15394:
---
Status: Changes Suggested  (was: Review In Progress)

> Remove list iterators
> -
>
> Key: CASSANDRA-15394
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15394
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local/Compaction
>Reporter: Blake Eggleston
>Assignee: Blake Eggleston
>Priority: Normal
> Fix For: 4.0
>
>
> We allocate list iterators in several places in hot paths. This converts them 
> to get by index. This provides a ~4% improvement in relevant workloads.
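
For illustration only, the change described above amounts to the following pattern. The class and method names are hypothetical, and the sketch assumes a random-access list such as {{ArrayList}}, where {{get(i)}} is cheap.

```java
import java.util.List;

// Hypothetical illustration of replacing a list iterator with indexed access.
final class IndexedLoop
{
    // The enhanced for loop calls values.iterator(), which may allocate an
    // Iterator object on every invocation of this method.
    static long sumWithIterator(List<Long> values)
    {
        long sum = 0;
        for (long v : values)
            sum += v;
        return sum;
    }

    // Indexed access avoids the iterator object entirely. Only appropriate for
    // RandomAccess lists; on a LinkedList each get(i) would be O(n).
    static long sumByIndex(List<Long> values)
    {
        long sum = 0;
        for (int i = 0, size = values.size(); i < size; i++)
            sum += values.get(i);
        return sum;
    }
}
```

The JIT can sometimes remove the iterator allocation through escape analysis, so the gain depends on the call site.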






[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-11-05 Thread Chandrasekhar Thumuluru (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandrasekhar Thumuluru updated CASSANDRA-15397:

Attachment: Mean_3_SSTable_with_5000_Searches.png
Mean_25000_SSTable_with_5000_Searches.png
Mean_2_SSTable_with_5000_Searches.png
Mean_15000_SSTable_with_5000_Searches.png
Mean_1_SSTable_with_5000_Searches.png
Mean_5000_SSTable_with_5000_Searches.png

> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chandrasekhar Thumuluru
>Priority: Normal
> Attachments: 95p_1_SSTable_with_5000_Searches.png, 
> 95p_15000_SSTable_with_5000_Searches.png, 
> 95p_2_SSTable_with_5000_Searches.png, 
> 95p_25000_SSTable_with_5000_Searches.png, 
> 95p_3_SSTable_with_5000_Searches.png, 
> 95p_5000_SSTable_with_5000_Searches.png, 
> 99p_1_SSTable_with_5000_Searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java, 
> Mean_1_SSTable_with_5000_Searches.png, 
> Mean_15000_SSTable_with_5000_Searches.png, 
> Mean_2_SSTable_with_5000_Searches.png, 
> Mean_25000_SSTable_with_5000_Searches.png, 
> Mean_3_SSTable_with_5000_Searches.png, 
> Mean_5000_SSTable_with_5000_Searches.png
>
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with a 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs; in fact, we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (the idea is to eliminate intervals using 
> the start and end points of the search interval). 
> Based on the tests I ran, I noticed Binary Search based elimination almost 
> always performs similarly to, or outperforms, IntervalTree based 
> search. The cost of IntervalTree construction is also substantial and 
> produces a lot of garbage during repairs. 
> I ran the tests using random intervals to build the tree/lists and another 
> randomly generated search interval with 5000 iterations. I'm attaching all 
> the relevant graphs. The x-axis in the graphs is the search interval 
> coverage. 10p means the search interval covered 10% of the intervals. The 
> y-axis is the time the search took in nanos. 
> PS: 
> # For the purpose of the test, I simplified the IntervalTree by removing the 
> data portion of the interval, and changed the generic version (Java generics) 
> into a specialized version. 
> # I used the code from Cassandra version _3.11_.
> # Time in the graph is in nanos. 
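
One possible reading of "elimination using binary search" is sketched below. It is not the attached {{IntervalListWithElimination.java}}, which is not reproduced in this message; the class name, the inclusive bounds and the single sort by start point are all assumptions. Intervals are kept in an array sorted by start, a binary search discards every interval that starts after the query ends, and the remainder is filtered by end point.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of an interval list searched with binary-search elimination.
final class IntervalListSketch
{
    static final class Interval
    {
        final long start, end; // inclusive bounds

        Interval(long start, long end)
        {
            this.start = start;
            this.end = end;
        }
    }

    private final Interval[] byStart;

    IntervalListSketch(List<Interval> intervals)
    {
        byStart = intervals.toArray(new Interval[0]);
        Arrays.sort(byStart, Comparator.comparingLong((Interval i) -> i.start));
    }

    // Returns every interval overlapping [queryStart, queryEnd].
    List<Interval> search(long queryStart, long queryEnd)
    {
        // Binary search for the first interval whose start is after queryEnd;
        // that interval and all later ones cannot overlap and are eliminated.
        int lo = 0, hi = byStart.length;
        while (lo < hi)
        {
            int mid = (lo + hi) >>> 1;
            if (byStart[mid].start > queryEnd)
                hi = mid;
            else
                lo = mid + 1;
        }

        // Linear pass over the survivors, dropping those that end too early.
        List<Interval> result = new ArrayList<>();
        for (int i = 0; i < lo; i++)
            if (byStart[i].end >= queryStart)
                result.add(byStart[i]);
        return result;
    }
}
```

Building this structure is a single array sort, which is part of why construction could be cheaper than rebuilding a tree; the search itself remains linear in the number of surviving candidates.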






[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-11-05 Thread Chandrasekhar Thumuluru (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandrasekhar Thumuluru updated CASSANDRA-15397:

Attachment: 95p_5000_SSTable_with_5000_Searches.png
95p_1_SSTable_with_5000_Searches.png
95p_15000_SSTable_with_5000_Searches.png
95p_2_SSTable_with_5000_Searches.png
95p_25000_SSTable_with_5000_Searches.png
95p_3_SSTable_with_5000_Searches.png

> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chandrasekhar Thumuluru
>Priority: Normal
> Attachments: 95p_1_SSTable_with_5000_Searches.png, 
> 95p_15000_SSTable_with_5000_Searches.png, 
> 95p_2_SSTable_with_5000_Searches.png, 
> 95p_25000_SSTable_with_5000_Searches.png, 
> 95p_3_SSTable_with_5000_Searches.png, 
> 95p_5000_SSTable_with_5000_Searches.png, 
> 99p_1_SSTable_with_5000_Searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java
>
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with a 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs; in fact, we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (the idea is to eliminate intervals using 
> the start and end points of the search interval). 
> Based on the tests I ran, I noticed Binary Search based elimination almost 
> always performs similarly to, or outperforms, IntervalTree based 
> search. The cost of IntervalTree construction is also substantial and 
> produces a lot of garbage during repairs. 
> I ran the tests using random intervals to build the tree/lists and another 
> randomly generated search interval with 5000 iterations. I'm attaching all 
> the relevant graphs. The x-axis in the graphs is the search interval 
> coverage. 10p means the search interval covered 10% of the intervals. The 
> y-axis is the time the search took in nanos. 
> PS: 
> # For the purpose of test, I simplified the IntervalTree by removing the data 
> portion of the interval.  Modified the template version (Java generics) to a 
> specialized version. 
> # I used the code from Cassandra version _3.11_.
> # Time in the graph is in nanos. 






[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-11-05 Thread Chandrasekhar Thumuluru (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandrasekhar Thumuluru updated CASSANDRA-15397:

Description: 
Cassandra uses IntervalTrees to identify the SSTables that overlap with a search 
interval. In Cassandra, IntervalTrees are not mutated. They are recreated each 
time a mutation is required. This can be an issue during repairs; in fact, we 
noticed such issues during repair. 

Since lists are cache friendly compared to linked lists and trees, I decided to 
compare the search performance with:
* Linear Walk.
* Elimination using Binary Search (the idea is to eliminate intervals using the 
start and end points of the search interval). 

Based on the tests I ran, I noticed Binary Search based elimination almost 
always performs similarly to, or outperforms, IntervalTree based 
search. The cost of IntervalTree construction is also substantial and produces 
a lot of garbage during repairs. 

I ran the tests using random intervals to build the tree/lists and another 
randomly generated search interval with 5000 iterations. I'm attaching all the 
relevant graphs. The x-axis in the graphs is the search interval coverage. 10p 
means the search interval covered 10% of the intervals. The y-axis is the time 
the search took in nanos. 

PS: 
# For the purpose of the test, I simplified the IntervalTree by removing the 
data portion of the interval, and changed the generic version (Java generics) 
into a specialized version. 
# I used the code from Cassandra version _3.11_.
# Time in the graph is in nanos. 

  was:
Cassandra uses IntervalTrees to identify the SSTables that overlap with search 
interval. In Cassandra, IntervalTrees are not mutated. They are recreated each 
time a mutation is required. This can be an issue during repairs. In fact we 
noticed such issues during repair. 

Since lists are cache friendly compared to linked lists and trees, I decided to 
compare the search performance with:
* Linear Walk.
* Elimination using Binary Search (idea is to eliminate intervals using start 
and end points of search interval). 

Based on the tests I ran, I noticed Binary Search based elimination almost 
always performs similar to IntervalTree or out performs IntervalTree based 
search. The cost of IntervalTree construction is also substantial and produces 
lot of garbage during repairs. 

I ran the tests using random intervals to build the tree/lists and another 
randomly generated search interval with 5000 iterations. I'm attaching all the 
relevant graphs. The x-axis in the graphs is the search interval coverage. 10p 
means the search interval covered 10% of the intervals. The y-axis is the time 
the search took in nanos. 

PS: 
# For the purpose of test, I simplified the IntervalTree code by making it 
non-generic and removing the data portion of the interval.  
# I used the code from Cassandra version _3.11_.
# Time in the graph is in nanos. 


> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chandrasekhar Thumuluru
>Priority: Normal
> Attachments: 99p_1_SSTable_with_5000_Searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java
>
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs. In fact we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (idea is to eliminate intervals using start 
> and end points of search interval). 
> Based on the tests I ran, I noticed Binary Search based elimination almost 
> always performs similar to IntervalTree or out performs IntervalTree based 
> search. The cost of IntervalTree construction is also substantial and 
> produces lot of garbage during repairs. 
> I ran the tests using random intervals to build the tree/lists and another 
> randomly generated search interval with 5000 iterations. I'm attaching all 
> the relevant graphs. The x-axis in the graphs is the search interval 
> coverage. 10p means the search interval covered 10% of the intervals. The 
> y-axis is the time the search 

[jira] [Commented] (CASSANDRA-15358) Cassandra alpha 4 testing - Nodes crashing due to bufferpool allocator issue

2019-11-05 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967610#comment-16967610
 ] 

Benedict Elliott Smith commented on CASSANDRA-15358:


Could you try running the following branch? 
[15358|https://github.geo.apple.com/belliottsmith/cassandra/tree/15358]

It is based on trunk, so let me know if you would prefer it rebased to 
{{4.0-alpha1}} or {{4.0-alpha2}}, though I don't believe a great deal has 
changed since.

Given the log statements, there's a good chance this is the problem.  Even 
though I still think it's nearly impossible for a read-only buffer to be 
created and returned to the pool, I cannot find another more plausible cause.  
If this doesn't fix the problem, I'll see if I can put together a debug build 
that can maybe report where this {{ByteBuffer}} materialises from.
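
As an aside, the two buffer properties under discussion (direct versus heap, read-only versus writable) can be inspected with plain JDK calls. The helper below is a hypothetical sketch of what a debug log line might report; where such a check would sit in Cassandra is not specified here.

```java
import java.nio.ByteBuffer;

// Hypothetical debugging helper; uses only java.nio.ByteBuffer methods.
final class BufferChecks
{
    // Summarise the properties that the pool and Netty wrappers care about.
    static String describe(ByteBuffer buffer)
    {
        return (buffer.isDirect() ? "direct" : "heap")
             + (buffer.isReadOnly() ? ", read-only" : ", writable");
    }
}
```

For example, {{ByteBuffer.allocateDirect(16).asReadOnlyBuffer()}} is still reported as direct but read-only, while {{ByteBuffer.allocate(16)}} is a writable heap buffer.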

> Cassandra alpha 4 testing - Nodes crashing due to bufferpool allocator issue
> 
>
> Key: CASSANDRA-15358
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15358
> Project: Cassandra
>  Issue Type: Bug
>  Components: Test/benchmark
>Reporter: Santhosh Kumar Ramalingam
>Assignee: Benedict Elliott Smith
>Priority: Normal
>  Labels: 4.0, alpha
>
> Hitting a bug with cassandra 4 alpha version. The same bug is repeated with 
> different versions of Java (8, 11 & 12) [~benedict]
>  
> Stack trace:
> {code:java}
> INFO [main] 2019-10-11 16:07:12,024 Server.java:164 - Starting listening for 
> CQL clients on /1.3.0.6:9042 (unencrypted)...
> WARN [OptionalTasks:1] 2019-10-11 16:07:13,961 CassandraRoleManager.java:343 
> - CassandraRoleManager skipped default role setup: some nodes were not ready
> INFO [OptionalTasks:1] 2019-10-11 16:07:13,961 CassandraRoleManager.java:369 
> - Setup task failed with error, rescheduling
> WARN [Messaging-EventLoop-3-2] 2019-10-11 16:07:22,038 NoSpamLogger.java:94 - 
> 10.3x.4x.5x:7000->1.3.0.5:7000-LARGE_MESSAGES-[no-channel] dropping message 
> of type PING_REQ whose timeout expired before reaching the network
> WARN [OptionalTasks:1] 2019-10-11 16:07:23,963 CassandraRoleManager.java:343 
> - CassandraRoleManager skipped default role setup: some nodes were not ready
> INFO [OptionalTasks:1] 2019-10-11 16:07:23,963 CassandraRoleManager.java:369 
> - Setup task failed with error, rescheduling
> INFO [Messaging-EventLoop-3-6] 2019-10-11 16:07:32,759 NoSpamLogger.java:91 - 
> 10.3x.4x.5x:7000->1.3.0.2:7000-URGENT_MESSAGES-[no-channel] failed to connect
> io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) 
> failed: Connection refused: /1.3.0.2:7000
> Caused by: java.net.ConnectException: finishConnect(..) failed: Connection 
> refused
> at io.netty.channel.unix.Errors.throwConnectException(Errors.java:124)
> at io.netty.channel.unix.Socket.finishConnect(Socket.java:243)
> at 
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.doFinishConnect(AbstractEpollChannel.java:667)
> at 
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.finishConnect(AbstractEpollChannel.java:644)
> at 
> io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe.epollOutReady(AbstractEpollChannel.java:524)
> at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:414)
> at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:326)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
> at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
> at 
> io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
> at java.base/java.lang.Thread.run(Thread.java:834)
> WARN [Messaging-EventLoop-3-3] 2019-10-11 16:11:32,639 NoSpamLogger.java:94 - 
> 1.3.4.6:7000->1.3.4.5:7000-URGENT_MESSAGES-[no-channel] dropping message of 
> type GOSSIP_DIGEST_SYN whose timeout expired before reaching the network
> INFO [Messaging-EventLoop-3-18] 2019-10-11 16:11:33,077 NoSpamLogger.java:91 
> - 1.3.4.5:7000->1.3.4.4:7000-URGENT_MESSAGES-[no-channel] failed to connect
>  
> ERROR [Messaging-EventLoop-3-11] 2019-10-10 01:34:34,407 
> InboundMessageHandler.java:657 - 
> 1.3.4.5:7000->1.3.4.8:7000-LARGE_MESSAGES-0b7d09cd unexpected exception 
> caught while processing inbound messages; terminating connection
> java.lang.IllegalArgumentException: initialBuffer is not a direct buffer.
> at io.netty.buffer.UnpooledDirectByteBuf.<init>(UnpooledDirectByteBuf.java:87)
> at 
> io.netty.buffer.UnpooledUnsafeDirectByteBuf.<init>(UnpooledUnsafeDirectByteBuf.java:59)
> at 
> org.apache.cassandra.net.BufferPoolAllocator$Wrapped.<init>(BufferPoolAllocator.java:95)
> at 
> org.apache.cassandra.net.BufferPoolAllocator.newDirectBuffer(BufferPoolAllocator.java:56)
> at 
> 

[jira] [Updated] (CASSANDRA-15397) IntervalTree performance comparison with Linear Walk and Binary Search based Elimination.

2019-11-05 Thread Chandrasekhar Thumuluru (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-15397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chandrasekhar Thumuluru updated CASSANDRA-15397:

Description: 
Cassandra uses IntervalTrees to identify the SSTables that overlap with search 
interval. In Cassandra, IntervalTrees are not mutated. They are recreated each 
time a mutation is required. This can be an issue during repairs. In fact we 
noticed such issues during repair. 

Since lists are cache friendly compared to linked lists and trees, I decided to 
compare the search performance with:
* Linear Walk.
* Elimination using Binary Search (idea is to eliminate intervals using start 
and end points of search interval). 

Based on the tests I ran, I noticed Binary Search based elimination almost 
always performs similar to IntervalTree or out performs IntervalTree based 
search. The cost of IntervalTree construction is also substantial and produces 
lot of garbage during repairs. 

I ran the tests using random intervals to build the tree/lists and another 
randomly generated search interval with 5000 iterations. I'm attaching all the 
relevant graphs. The x-axis in the graphs is the search interval coverage. 10p 
means the search interval covered 10% of the intervals. The y-axis is the time 
the search took in nanos. 

PS: 
# For the purpose of test, I simplified the IntervalTree code by making it 
non-generic and removing the data portion of the interval.  
# I used the code from Cassandra version _3.11_.
# Time in the graph is in nanos. 

  was:
Cassandra uses IntervalTrees to identify the SSTables that overlap with search 
interval. In Cassandra, IntervalTrees are not mutated. They are recreated each 
time a mutation is required. This can be an issue during repairs. In fact we 
noticed such issues during repair. 

Since lists are cache friendly compared to linked lists and trees, I decided to 
compare the search performance with:
* Linear Walk.
* Elimination using Binary Search (idea is to eliminate intervals using start 
and end points of search interval). 

Based on the tests I ran, I noticed Binary Search based elimination almost 
always performs similar to IntervalTree performance or out performs 
IntervalTree based search. 

I ran the tests using random intervals to build the tree/lists and another 
randomly generated search interval with 5000 iterations. I'm attaching all the 
relevant graphs. 


PS: 
# For the purpose of test, I simplified the IntervalTree code by making it 
non-generic and removing the data portion of the interval.  
# I used the code from Cassandra version _3.11_.
# Time in the graph is in nanos. 


> IntervalTree performance comparison with Linear Walk and Binary Search based 
> Elimination. 
> --
>
> Key: CASSANDRA-15397
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15397
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Chandrasekhar Thumuluru
>Priority: Normal
> Attachments: 99p_1_SSTable_with_5000_Searches.png, 
> 99p_15000_SSTable_with_5000_Searches.png, 
> 99p_2_SSTable_with_5000_Searches.png, 
> 99p_25000_SSTable_with_5000_Searches.png, 
> 99p_3_SSTable_with_5000_Searches.png, 
> 99p_5000_SSTable_with_5000_Searches.png, IntervalList.java, 
> IntervalListWithElimination.java, IntervalTreeSimplified.java
>
>
> Cassandra uses IntervalTrees to identify the SSTables that overlap with 
> search interval. In Cassandra, IntervalTrees are not mutated. They are 
> recreated each time a mutation is required. This can be an issue during 
> repairs. In fact we noticed such issues during repair. 
> Since lists are cache friendly compared to linked lists and trees, I decided 
> to compare the search performance with:
> * Linear Walk.
> * Elimination using Binary Search (idea is to eliminate intervals using start 
> and end points of search interval). 
> Based on the tests I ran, I noticed Binary Search based elimination almost 
> always performs similar to IntervalTree or out performs IntervalTree based 
> search. The cost of IntervalTree construction is also substantial and 
> produces lot of garbage during repairs. 
> I ran the tests using random intervals to build the tree/lists and another 
> randomly generated search interval with 5000 iterations. I'm attaching all 
> the relevant graphs. The x-axis in the graphs is the search interval 
> coverage. 10p means the search interval covered 10% of the intervals. The 
> y-axis is the time the search took in nanos. 
> PS: 
> # For the purpose of test, I simplified the IntervalTree code by making it 
> non-generic and removing the data portion of the interval.  
> # I used the code from Cassandra version _3.11_.
> # Time in the graph is in nanos. 




[jira] [Commented] (CASSANDRA-14731) Transient Write Metrics

2019-11-05 Thread Abdul Aziz Ali (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967580#comment-16967580
 ] 

Abdul Aziz Ali commented on CASSANDRA-14731:


Thanks [~benedict], I'll try to send a patch or a PR in the next week or so.

> Transient Write Metrics
> ---
>
> Key: CASSANDRA-14731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14731
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core, Observability/Metrics
>Reporter: Benedict Elliott Smith
>Assignee: Abdul Aziz Ali
>Priority: Low
>  Labels: metrics, transient-replication
> Fix For: 4.x
>
>
> While we record the number of attempted transient writes, we do not record how 
> successful these were.
> Also, we do not count transient writes that happen due to the failure 
> detector.  While these are distinct from those writes that happen 
> ‘speculatively’ due to slow responses, there’s a strong chance they will be 
> the most common form of transient write.  It might be worth having separate 
> metrics.






[jira] [Created] (CASSANDRA-15398) TBD (minor, boring)

2019-11-05 Thread Aleksey Yeschenko (Jira)
Aleksey Yeschenko created CASSANDRA-15398:
-

 Summary: TBD (minor, boring)
 Key: CASSANDRA-15398
 URL: https://issues.apache.org/jira/browse/CASSANDRA-15398
 Project: Cassandra
  Issue Type: Bug
  Components: Cluster/Schema
Reporter: Aleksey Yeschenko
Assignee: Aleksey Yeschenko









[jira] [Comment Edited] (CASSANDRA-14731) Transient Write Metrics

2019-11-05 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967544#comment-16967544
 ] 

Benedict Elliott Smith edited comment on CASSANDRA-14731 at 11/5/19 1:54 PM:
-

Hi [~abdulazizali].  Sure, you're welcome to take this ticket.  It's been a 
while since I thought about this, so I will have to acclimatise to the context 
again, but there appear to be two proposals in this ticket: 1) tracking those 
transient writes we perform _because of the failure detector_, which are 
going to happen in {{sendToHintedReplicas}}; 2) tracking the success of a 
transient write, particularly for achieving the requested consistency level, 
which would happen in {{AbstractWriteResponseHandler}} but, in retrospect, I'm 
not entirely sure what this would look like.
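
Purely as a sketch of the shape such metrics might take, the two proposals could be kept as separate counters. None of this is Cassandra's metrics API: the class, the {{Cause}} enum and the use of the JDK's {{LongAdder}} are assumptions made for illustration.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical counters for transient writes, split by why the write happened.
final class TransientWriteCounters
{
    enum Cause { FAILURE_DETECTOR, SPECULATIVE }

    private final LongAdder failureDetectorAttempts = new LongAdder();
    private final LongAdder speculativeAttempts = new LongAdder();
    private final LongAdder successes = new LongAdder();

    // Proposal 1: count attempts separately per cause.
    void recordAttempt(Cause cause)
    {
        if (cause == Cause.FAILURE_DETECTOR)
            failureDetectorAttempts.increment();
        else
            speculativeAttempts.increment();
    }

    // Proposal 2: count transient writes that were acknowledged.
    void recordSuccess()
    {
        successes.increment();
    }

    long attempts(Cause cause)
    {
        return cause == Cause.FAILURE_DETECTOR ? failureDetectorAttempts.sum()
                                               : speculativeAttempts.sum();
    }

    long successes()
    {
        return successes.sum();
    }
}
```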


was (Author: benedict):
Hi [~abdulazizali].  Sure, you're welcome to take this ticket.  It's a while 
since I thought about this, so I will have to acclimatise to the context again, 
but there appear to be two proposals in this ticket: 1) tracking those 
transient writes we perform _because of the failure detector,_ and these are 
going to happen in {{sendToHintedReplicas}}.  If we want to track success of 
transient writes for achieving quorum, that would happen in 
{{AbstractWriteResponseHandler}} but, in retrospect, I'm not entirely sure what 
this would look like.

> Transient Write Metrics
> ---
>
> Key: CASSANDRA-14731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14731
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core, Observability/Metrics
>Reporter: Benedict Elliott Smith
>Assignee: Abdul Aziz Ali
>Priority: Low
>  Labels: metrics, transient-replication
> Fix For: 4.x
>
>
> While we record the number of attempted transient writes, we do not record how 
> successful these were.
> Also, we do not count transient writes that happen due to the failure 
> detector.  While these are distinct from those writes that happen 
> ‘speculatively’ due to slow responses, there’s a strong chance they will be 
> the most common form of transient write.  It might be worth having separate 
> metrics.






[jira] [Commented] (CASSANDRA-14731) Transient Write Metrics

2019-11-05 Thread Benedict Elliott Smith (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16967544#comment-16967544
 ] 

Benedict Elliott Smith commented on CASSANDRA-14731:


Hi [~abdulazizali].  Sure, you're welcome to take this ticket.  It's been a while 
since I thought about this, so I will have to acclimatise to the context again, 
but there appear to be two proposals in this ticket: 1) tracking those 
transient writes we perform _because of the failure detector,_ and these are 
going to happen in {{sendToHintedReplicas}}.  If we want to track success of 
transient writes for achieving quorum, that would happen in 
{{AbstractWriteResponseHandler}} but, in retrospect, I'm not entirely sure what 
this would look like.

> Transient Write Metrics
> ---
>
> Key: CASSANDRA-14731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14731
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Legacy/Core, Observability/Metrics
>Reporter: Benedict Elliott Smith
>Assignee: Abdul Aziz Ali
>Priority: Low
>  Labels: metrics, transient-replication
> Fix For: 4.x
>
>
> While we record the number of attempted transient writes, we do not record how 
> successful these were.
> Also, we do not count transient writes that happen due to the failure 
> detector.  While these are distinct from those writes that happen 
> ‘speculatively’ due to slow responses, there’s a strong chance they will be 
> the most common form of transient write.  It might be worth having separate 
> metrics.






[jira] [Updated] (CASSANDRA-14731) Transient Write Metrics

2019-11-05 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith updated CASSANDRA-14731:
---
Description: 
While we record the number of attempted transient writes, we do not record how 
successful these were.

Also, we do not count transient writes that happen due to the failure detector. 
 While these are distinct from those writes that happen ‘speculatively’ due to 
slow responses, there’s a strong chance they will be the most common form of 
transient write.  It might be worth having separate metrics.

  was:
While we record the number of attempt transient writes, we do not record how 
successful these were.

Also, we do not count transient writes that happen due to the failure detector. 
 Possibly, these While these are distinct from those writes that happen 
‘speculatively’ due to slow responses, there’s a strong chance they will be the 
most common form of transient write.  It might be worth having separate 


> Transient Write Metrics
> ---
>
> Key: CASSANDRA-14731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14731
> Project: Cassandra
> Issue Type: Improvement
> Components: Legacy/Core, Observability/Metrics
> Reporter: Benedict Elliott Smith
> Assignee: Abdul Aziz Ali
> Priority: Low
> Labels: metrics, transient-replication
> Fix For: 4.x
>
>
> While we record the number of attempted transient writes, we do not record how 
> successful these were.
> Also, we do not count transient writes that happen due to the failure 
> detector.  While these are distinct from those writes that happen 
> ‘speculatively’ due to slow responses, there’s a strong chance they will be 
> the most common form of transient write.  It might be worth having separate 
> metrics.






[jira] [Assigned] (CASSANDRA-14731) Transient Write Metrics

2019-11-05 Thread Benedict Elliott Smith (Jira)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benedict Elliott Smith reassigned CASSANDRA-14731:
--

Assignee: Abdul Aziz Ali

> Transient Write Metrics
> ---
>
> Key: CASSANDRA-14731
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14731
> Project: Cassandra
> Issue Type: Improvement
> Components: Legacy/Core, Observability/Metrics
> Reporter: Benedict Elliott Smith
> Assignee: Abdul Aziz Ali
> Priority: Low
> Labels: metrics, transient-replication
> Fix For: 4.x
>
>
> While we record the number of attempted transient writes, we do not record how 
> successful these were.
> Also, we do not count transient writes that happen due to the failure 
> detector.  While these are distinct from those writes that happen 
> ‘speculatively’ due to slow responses, there’s a strong chance they will be 
> the most common form of transient write.  It might be worth having separate 
> metrics.


