[jira] [Updated] (CASSANDRA-8630) Faster sequential IO (on compaction, streaming, etc)

2015-07-26 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-8630:

Attachment: flight_recorder_001_files.tar.gz

> Faster sequential IO (on compaction, streaming, etc)
> ----------------------------------------------------
>
> Key: CASSANDRA-8630
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8630
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core, Tools
>Reporter: Oleg Anastasyev
>Assignee: Stefania
>  Labels: compaction, performance
> Fix For: 3.x
>
> Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, 
> flight_recorder_001_files.tar.gz
>
>
> When a node is doing a lot of sequential IO (streaming, compacting, etc.) a 
> lot of CPU is lost in calls to RAF's int read() and DataOutputStream's 
> write(int).
> This is because the default implementations of readShort, readLong, etc., as 
> well as their matching write* methods, are implemented with numerous 
> byte-by-byte reads and writes.
> This also makes a lot of syscalls.
> A quick microbenchmark shows that just reimplementing these methods gives an 
> 8x speed increase.
> The attached patch implements the RandomAccessReader.read and 
> SequentialWriter.write methods in a more efficient way.
> I also eliminated some extra byte copies in CompositeType.split and 
> ColumnNameHelper.maxComponents, which were on my profiler's hotspot method 
> list during tests.
> Stress tests on my laptop show that this patch makes compaction 25-30% 
> faster on uncompressed sstables and 15% faster on compressed ones.
> A deployment to production shows much less CPU load for compaction.
> (I attached a CPU load graph from one of our production clusters; orange is 
> niced CPU load, i.e. compaction; yellow is user, i.e. tasks not related to 
> compaction.)
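
For a sense of what the patch targets, here is a minimal sketch (illustrative 
only, not the attached patch; the class name is hypothetical) contrasting the 
byte-by-byte default with a buffered reimplementation of readLong:

{code}
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

public class ReadLongSketch
{
    // Default DataInput-style readLong: eight single-byte read() calls,
    // each one a potential syscall on an unbuffered file.
    static long readLongByteByByte(RandomAccessFile raf) throws IOException
    {
        long v = 0;
        for (int i = 0; i < 8; i++)
        {
            int b = raf.read();
            if (b < 0)
                throw new EOFException();
            v = (v << 8) | b;
        }
        return v;
    }

    // Bulk reimplementation: one readFully into a buffer, decoded in memory.
    static long readLongBuffered(RandomAccessFile raf) throws IOException
    {
        byte[] buf = new byte[8];
        raf.readFully(buf);
        return ByteBuffer.wrap(buf).getLong();
    }
}
{code}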



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9265) Add checksum to saved cache files

2015-07-26 Thread Daniel Chia (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642360#comment-14642360
 ] 

Daniel Chia commented on CASSANDRA-9265:


To clarify my question further: it seems we already version the key cache 
file, but we don't seem to guarantee any sort of backwards compatibility. 
How important would it be to be able to read version 'b' if we went to version 
'c'?
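
A schematic of that trade-off (hypothetical names and format, not Cassandra's 
actual saved-cache layout): supporting version 'b' after moving to 'c' means 
keeping a legacy branch like the one below alive.

{code}
import java.io.DataInputStream;
import java.io.IOException;

final class SavedCacheReaderSketch
{
    static void read(DataInputStream in) throws IOException
    {
        String version = in.readUTF(); // version tag written at the head of the file
        switch (version)
        {
            case "c":
                readCurrent(in);
                break;
            case "b":
                readLegacyB(in); // only exists if we promise backwards compatibility
                break;
            default:
                throw new IOException("Unsupported saved cache version: " + version);
        }
    }

    private static void readCurrent(DataInputStream in) throws IOException { /* current format */ }

    private static void readLegacyB(DataInputStream in) throws IOException { /* version 'b' format */ }
}
{code}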

> Add checksum to saved cache files
> ---------------------------------
>
> Key: CASSANDRA-9265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9265
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Ariel Weisberg
> Fix For: 3.x
>
>
> Saved caches are not covered by a checksum. We should at least emit a 
> checksum. My suggestion is a large checksum of the whole file (convenient for 
> offline validation), and then smaller per-record checksums after each record 
> is written (possibly a subset of the incrementally maintained larger 
> checksum).
> I wouldn't go for anything fancy to try to recover from corruption, since it 
> is just a saved cache. If corruption is detected while reading, I would just 
> have it bail out. I would rather have less code to review and test in this 
> instance.
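
A minimal sketch of the scheme proposed above, using java.util.zip.CRC32 
(names and record layout are hypothetical, not Cassandra's cache format): 
per-record checksums after each record, plus an incrementally maintained 
whole-file checksum at the end.

{code}
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;

public class ChecksummedCacheWriterSketch
{
    static void write(String path, byte[][] records) throws IOException
    {
        CRC32 fileCrc = new CRC32(); // incrementally maintained over all records
        try (DataOutputStream out = new DataOutputStream(new FileOutputStream(path)))
        {
            for (byte[] record : records)
            {
                out.writeInt(record.length);
                out.write(record);

                CRC32 recordCrc = new CRC32(); // smaller per-record checksum
                recordCrc.update(record);
                out.writeLong(recordCrc.getValue());

                fileCrc.update(record);
            }
            out.writeLong(fileCrc.getValue()); // whole-file checksum for offline validation
        }
    }
}
{code}

A reader would verify each record checksum as it goes and simply bail out on 
the first mismatch, as suggested above.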



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9753) LOCAL_QUORUM reads can block cross-DC if there is a digest mismatch

2015-07-26 Thread sankalp kohli (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642264#comment-14642264
 ] 

sankalp kohli commented on CASSANDRA-9753:
------------------------------------------

I think this is a little different. In CASSANDRA-6887, they are discussing the 
global read_repair_chance vs a CL of LOCAL_*. Here, even when we are using 
dc_local_read_repair and a LOCAL consistency level, the read still blocks 
cross-DC. The reason is that speculative retry is getting mixed in here.

> LOCAL_QUORUM reads can block cross-DC if there is a digest mismatch
> -------------------------------------------------------------------
>
> Key: CASSANDRA-9753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9753
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Richard Low
>
> When there is a digest mismatch during the initial read, a data read request 
> is sent to all replicas involved in the initial read. This can be more than 
> the initial blockFor if read repair was done and if speculative retry kicked 
> in. E.g. for RF 3 in two DCs, the number of reads could be 4: 2 for 
> LOCAL_QUORUM, 1 for read repair and 1 for speculative read if one replica was 
> slow. If there is then a digest mismatch, Cassandra will issue the data read 
> to all 4 and set blockFor=4. Now the read query is blocked on cross-DC 
> latency. The digest mismatch read blockFor should be capped at RF for the 
> local DC when using CL.LOCAL_*.
> You can reproduce this behaviour by creating a keyspace with 
> NetworkTopologyStrategy, RF 3 per DC, dc_local_read_repair=1.0 and ALWAYS for 
> speculative read. If you force a digest mismatch (e.g. by deleting a replica's 
> SSTables and restarting) you can see in tracing that it is blocking for 4 
> responses.
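
The proposed cap, in schematic form (hypothetical names, not Cassandra's 
internal API):

{code}
public final class BlockForCapSketch
{
    // On a digest mismatch under a LOCAL_* consistency level, cap the number
    // of data-read responses we block for at the local DC's replication factor.
    static int blockForOnDigestMismatch(int contactedReplicas,
                                        boolean localConsistencyLevel,
                                        int localDcReplicationFactor)
    {
        return localConsistencyLevel
             ? Math.min(contactedReplicas, localDcReplicationFactor)
             : contactedReplicas;
    }
}
{code}

With the numbers from the description (4 contacted replicas, LOCAL_QUORUM, 
RF 3 per DC) this blocks for 3 responses instead of 4, so the extra data read 
still goes out cross-DC but no longer gates the response.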



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8180) Optimize disk seek using min/max column name meta data when the LIMIT clause is used

2015-07-26 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642236#comment-14642236
 ] 

Stefania commented on CASSANDRA-8180:
-------------------------------------

[~blambov] I've pushed a new commit where I ensure that lower bounds always 
compare less than real values when the clustering is the same. This means 
the existing merge iterator algorithm is now almost unchanged; the only 
difference is that I moved consume() into the candidate, where we make sure 
the lower bounds are never consumed by the reducer. We still use empty rows as 
lower bounds, but they are never used outside of the merge iterator candidate. 
We could use a specialized {{Unfiltered}} if it really bothers you, however. 
Can you take another look?
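
The tie-breaking rule described above, in schematic form (hypothetical types, 
not the actual merge iterator classes): when a lower-bound marker and a real 
value have equal clusterings, the lower bound must sort first, so the reducer 
can never be handed a bound in place of a real row.

{code}
import java.util.Comparator;

final class CandidateSketch<T>
{
    final T item;
    final boolean isLowerBound; // bound derived from sstable metadata, not a real row

    CandidateSketch(T item, boolean isLowerBound)
    {
        this.item = item;
        this.isLowerBound = isLowerBound;
    }

    static <T> Comparator<CandidateSketch<T>> ordering(Comparator<T> clusteringComparator)
    {
        return (a, b) ->
        {
            int cmp = clusteringComparator.compare(a.item, b.item);
            if (cmp != 0)
                return cmp;
            // Equal clustering: a lower bound compares less than a real value,
            // so consume() can skip it instead of passing it to the reducer.
            return Boolean.compare(b.isLowerBound, a.isLowerBound);
        };
    }
}
{code}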

As for performance, with this test the 8180 branch is ahead:

{code}
user profile=https://dl.dropboxusercontent.com/u/15683245/8180.yaml 
ops\(insert=1,\) n=5M -rate threads=300 -insert revisit=uniform\(1..100\) 
visits=fixed\(25\) -pop seq=1..1K read-lookback=uniform\(1..1K\) contents=SORTED
{code}

http://cstar.datastax.com/graph?stats=094f57cc-3409-11e5-bd2b-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=1609.08&ymin=0&ymax=4637.6

The read command is unchanged and performance is similar or maybe still better 
on trunk:

{code}
user profile=https://dl.dropboxusercontent.com/u/15683245/8180.yaml 
ops\(singleval=1,\) n=5M -rate threads=300
{code}

http://cstar.datastax.com/graph?stats=094f57cc-3409-11e5-bd2b-42010af0688f&metric=op_rate&operation=2_user&smoothing=1&show_aggregates=true&xmin=0&xmax=35.53&ymin=0&ymax=188763.3

I think we are still visiting all sstables on the second command because the 
global bounds are probably the same. The profile is attached as 8180_002.yaml.

I've also noticed with flight recorder that {{BigTableReader.getPosition()}} 
and {{Tracing.trace()}} are hotspots, both on trunk and on 8180; we should 
probably optimize them.

> Optimize disk seek using min/max column name meta data when the LIMIT clause 
> is used
> -----------------------------------------------------------------------------
>
> Key: CASSANDRA-8180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8180
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cassandra 2.0.10
>Reporter: DOAN DuyHai
>Assignee: Stefania
>Priority: Minor
> Fix For: 3.x
>
> Attachments: 8180_001.yaml, 8180_002.yaml
>
>
> I was working on an example of a sensor data table (time series) and hit a 
> use case where C* does not optimize the read on disk.
> {code}
> cqlsh:test> CREATE TABLE test(id int, col int, val text, PRIMARY KEY(id,col)) 
> WITH CLUSTERING ORDER BY (col DESC);
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 10, '10');
> ...
> >nodetool flush test test
> ...
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 20, '20');
> ...
> >nodetool flush test test
> ...
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 30, '30');
> ...
> >nodetool flush test test
> {code}
> After that, I activate request tracing:
> {code}
> cqlsh:test> SELECT * FROM test WHERE id=1 LIMIT 1;
>  activity | timestamp | source | source_elapsed
> ----------+--------------+-----------+----------------
> execute_cql3_query | 23:48:46,498 | 127.0.0.1 | 0
> Parsing SELECT * FROM test WHERE id=1 LIMIT 1; | 23:48:46,498 | 127.0.0.1 | 74
> Preparing statement | 23:48:46,499 | 127.0.0.1 | 253
> Executing single-partition query on test | 23:48:46,499 | 127.0.0.1 | 930
> Acquiring sstable references | 23:48:46,499 | 127.0.0.1 | 943
> Merging memtable tombstones | 23:48:46,499 | 127.0.0.1 | 1032
> Key cache hit for sstable 3 | 23:48:46,500 | 127.0.0.1 | 1160
> Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 | 1173
> Key cache hit for sstable 2 | 23:48:46,500 | 127.0.0.1 | 1889
> Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 | 1901
> Key cache hit for sstable 1 | 23:48:46,501 | 127.0.0.1 | 2373
> Seeking to partition beginning in data file | 23:48:46,501 | 127.0.0.1 | 2384
> {code}

[jira] [Updated] (CASSANDRA-8180) Optimize disk seek using min/max column name meta data when the LIMIT clause is used

2015-07-26 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-8180:

Attachment: 8180_002.yaml
8180_001.yaml

> Optimize disk seek using min/max column name meta data when the LIMIT clause 
> is used
> -----------------------------------------------------------------------------
>
> Key: CASSANDRA-8180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8180
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cassandra 2.0.10
>Reporter: DOAN DuyHai
>Assignee: Stefania
>Priority: Minor
> Fix For: 3.x
>
> Attachments: 8180_001.yaml, 8180_002.yaml
>
>
> I was working on an example of a sensor data table (time series) and hit a 
> use case where C* does not optimize the read on disk.
> {code}
> cqlsh:test> CREATE TABLE test(id int, col int, val text, PRIMARY KEY(id,col)) 
> WITH CLUSTERING ORDER BY (col DESC);
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 10, '10');
> ...
> >nodetool flush test test
> ...
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 20, '20');
> ...
> >nodetool flush test test
> ...
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 30, '30');
> ...
> >nodetool flush test test
> {code}
> After that, I activate request tracing:
> {code}
> cqlsh:test> SELECT * FROM test WHERE id=1 LIMIT 1;
>  activity | timestamp | source | source_elapsed
> ----------+--------------+-----------+----------------
> execute_cql3_query | 23:48:46,498 | 127.0.0.1 | 0
> Parsing SELECT * FROM test WHERE id=1 LIMIT 1; | 23:48:46,498 | 127.0.0.1 | 74
> Preparing statement | 23:48:46,499 | 127.0.0.1 | 253
> Executing single-partition query on test | 23:48:46,499 | 127.0.0.1 | 930
> Acquiring sstable references | 23:48:46,499 | 127.0.0.1 | 943
> Merging memtable tombstones | 23:48:46,499 | 127.0.0.1 | 1032
> Key cache hit for sstable 3 | 23:48:46,500 | 127.0.0.1 | 1160
> Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 | 1173
> Key cache hit for sstable 2 | 23:48:46,500 | 127.0.0.1 | 1889
> Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 | 1901
> Key cache hit for sstable 1 | 23:48:46,501 | 127.0.0.1 | 2373
> Seeking to partition beginning in data file | 23:48:46,501 | 127.0.0.1 | 2384
> Skipped 0/3 non-slice-intersecting sstables, included 0 due to tombstones | 23:48:46,501 | 127.0.0.1 | 2768
> Merging data from memtables and 3 sstables | 23:48:46,501 | 127.0.0.1 | 2784
> Read 2 live and 0 tombstoned cells | 23:48:46,501 | 127.0.0.1 | 2976
> Request complete | 23:48:46,501 | 127.0.0.1 | 3551
> {code}
> We can clearly see that C* hits 3 SSTables on disk instead of just one, 
> although it has the min/max column name metadata to decide which SSTable 
> contains the most recent data.
> Funny enough, if we add a clause on the clustering column to the select, this 
> time C* optimizes the read path:
> {code}
> cqlsh:test> SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1;
>  activity | timestamp | source | source_elapsed
> ----------+--------------+-----------+----------------
> execute_cql3_query | 23:52:31,888 | 127.0.0.1 | 0
> Parsing SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1; | 23:52:31,888 | 127.0.0.1 | 60
> Preparing statement | 23:52:31,888 | 127.0.0.1 | 277
> Executing single-partition query on test | 23:52:31,889 | 127.0.0.1 | 961
> Acquiring sstable references | 23:52:31,889 | 127.0.0.1 |
> {code}
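
The optimization being requested, reduced to a schematic (hypothetical names; 
the real decision also involves tombstones and overlapping sstables): consult 
each sstable's min/max clustering metadata in sort order and stop as soon as 
the LIMIT can be satisfied.

{code}
import java.util.Comparator;
import java.util.List;

final class SSTableSkipSketch
{
    static final class SSTableMeta
    {
        final int minCol, maxCol, rows; // min/max clustering metadata and row count
        SSTableMeta(int minCol, int maxCol, int rows) { this.minCol = minCol; this.maxCol = maxCol; this.rows = rows; }
    }

    // How many sstables a "LIMIT n" query in descending clustering order needs
    // to touch, assuming non-overlapping clustering ranges and no tombstones.
    static int sstablesToRead(List<SSTableMeta> sstables, int limit)
    {
        sstables.sort(Comparator.comparingInt((SSTableMeta s) -> s.maxCol).reversed());
        int rows = 0, touched = 0;
        for (SSTableMeta s : sstables)
        {
            touched++;
            rows += s.rows;
            if (rows >= limit)
                break; // newer sstables already satisfy the limit: skip the rest
        }
        return touched;
    }
}
{code}

For the example above (three sstables holding col=10, 20 and 30, LIMIT 1 in 
DESC order) this touches only the sstable with max col 30 instead of all three.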

[jira] [Updated] (CASSANDRA-8180) Optimize disk seek using min/max column name meta data when the LIMIT clause is used

2015-07-26 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-8180:

Attachment: (was: 8180.yaml)

> Optimize disk seek using min/max column name meta data when the LIMIT clause 
> is used
> -----------------------------------------------------------------------------
>
> Key: CASSANDRA-8180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8180
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cassandra 2.0.10
>Reporter: DOAN DuyHai
>Assignee: Stefania
>Priority: Minor
> Fix For: 3.x
>
> Attachments: 8180_001.yaml, 8180_002.yaml
>
>
> I was working on an example of a sensor data table (time series) and hit a 
> use case where C* does not optimize the read on disk.
> {code}
> cqlsh:test> CREATE TABLE test(id int, col int, val text, PRIMARY KEY(id,col)) 
> WITH CLUSTERING ORDER BY (col DESC);
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 10, '10');
> ...
> >nodetool flush test test
> ...
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 20, '20');
> ...
> >nodetool flush test test
> ...
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 30, '30');
> ...
> >nodetool flush test test
> {code}
> After that, I activate request tracing:
> {code}
> cqlsh:test> SELECT * FROM test WHERE id=1 LIMIT 1;
>  activity | timestamp | source | source_elapsed
> ----------+--------------+-----------+----------------
> execute_cql3_query | 23:48:46,498 | 127.0.0.1 | 0
> Parsing SELECT * FROM test WHERE id=1 LIMIT 1; | 23:48:46,498 | 127.0.0.1 | 74
> Preparing statement | 23:48:46,499 | 127.0.0.1 | 253
> Executing single-partition query on test | 23:48:46,499 | 127.0.0.1 | 930
> Acquiring sstable references | 23:48:46,499 | 127.0.0.1 | 943
> Merging memtable tombstones | 23:48:46,499 | 127.0.0.1 | 1032
> Key cache hit for sstable 3 | 23:48:46,500 | 127.0.0.1 | 1160
> Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 | 1173
> Key cache hit for sstable 2 | 23:48:46,500 | 127.0.0.1 | 1889
> Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 | 1901
> Key cache hit for sstable 1 | 23:48:46,501 | 127.0.0.1 | 2373
> Seeking to partition beginning in data file | 23:48:46,501 | 127.0.0.1 | 2384
> Skipped 0/3 non-slice-intersecting sstables, included 0 due to tombstones | 23:48:46,501 | 127.0.0.1 | 2768
> Merging data from memtables and 3 sstables | 23:48:46,501 | 127.0.0.1 | 2784
> Read 2 live and 0 tombstoned cells | 23:48:46,501 | 127.0.0.1 | 2976
> Request complete | 23:48:46,501 | 127.0.0.1 | 3551
> {code}
> We can clearly see that C* hits 3 SSTables on disk instead of just one, 
> although it has the min/max column name metadata to decide which SSTable 
> contains the most recent data.
> Funny enough, if we add a clause on the clustering column to the select, this 
> time C* optimizes the read path:
> {code}
> cqlsh:test> SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1;
>  activity | timestamp | source | source_elapsed
> ----------+--------------+-----------+----------------
> execute_cql3_query | 23:52:31,888 | 127.0.0.1 | 0
> Parsing SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1; | 23:52:31,888 | 127.0.0.1 | 60
> Preparing statement | 23:52:31,888 | 127.0.0.1 | 277
> Executing single-partition query on test | 23:52:31,889 | 127.0.0.1 | 961
> Acquiring sstable references | 23:52:31,889 | 127.0.0.1 |
> {code}

[jira] [Updated] (CASSANDRA-8180) Optimize disk seek using min/max column name meta data when the LIMIT clause is used

2015-07-26 Thread Stefania (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-8180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-8180:

Reviewer: Branimir Lambov  (was: Sylvain Lebresne)

> Optimize disk seek using min/max column name meta data when the LIMIT clause 
> is used
> -----------------------------------------------------------------------------
>
> Key: CASSANDRA-8180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8180
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
> Environment: Cassandra 2.0.10
>Reporter: DOAN DuyHai
>Assignee: Stefania
>Priority: Minor
> Fix For: 3.x
>
> Attachments: 8180.yaml
>
>
> I was working on an example of a sensor data table (time series) and hit a 
> use case where C* does not optimize the read on disk.
> {code}
> cqlsh:test> CREATE TABLE test(id int, col int, val text, PRIMARY KEY(id,col)) 
> WITH CLUSTERING ORDER BY (col DESC);
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 10, '10');
> ...
> >nodetool flush test test
> ...
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 20, '20');
> ...
> >nodetool flush test test
> ...
> cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 30, '30');
> ...
> >nodetool flush test test
> {code}
> After that, I activate request tracing:
> {code}
> cqlsh:test> SELECT * FROM test WHERE id=1 LIMIT 1;
>  activity | timestamp | source | source_elapsed
> ----------+--------------+-----------+----------------
> execute_cql3_query | 23:48:46,498 | 127.0.0.1 | 0
> Parsing SELECT * FROM test WHERE id=1 LIMIT 1; | 23:48:46,498 | 127.0.0.1 | 74
> Preparing statement | 23:48:46,499 | 127.0.0.1 | 253
> Executing single-partition query on test | 23:48:46,499 | 127.0.0.1 | 930
> Acquiring sstable references | 23:48:46,499 | 127.0.0.1 | 943
> Merging memtable tombstones | 23:48:46,499 | 127.0.0.1 | 1032
> Key cache hit for sstable 3 | 23:48:46,500 | 127.0.0.1 | 1160
> Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 | 1173
> Key cache hit for sstable 2 | 23:48:46,500 | 127.0.0.1 | 1889
> Seeking to partition beginning in data file | 23:48:46,500 | 127.0.0.1 | 1901
> Key cache hit for sstable 1 | 23:48:46,501 | 127.0.0.1 | 2373
> Seeking to partition beginning in data file | 23:48:46,501 | 127.0.0.1 | 2384
> Skipped 0/3 non-slice-intersecting sstables, included 0 due to tombstones | 23:48:46,501 | 127.0.0.1 | 2768
> Merging data from memtables and 3 sstables | 23:48:46,501 | 127.0.0.1 | 2784
> Read 2 live and 0 tombstoned cells | 23:48:46,501 | 127.0.0.1 | 2976
> Request complete | 23:48:46,501 | 127.0.0.1 | 3551
> {code}
> We can clearly see that C* hits 3 SSTables on disk instead of just one, 
> although it has the min/max column name metadata to decide which SSTable 
> contains the most recent data.
> Funny enough, if we add a clause on the clustering column to the select, this 
> time C* optimizes the read path:
> {code}
> cqlsh:test> SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1;
>  activity | timestamp | source | source_elapsed
> ----------+--------------+-----------+----------------
> execute_cql3_query | 23:52:31,888 | 127.0.0.1 | 0
> Parsing SELECT * FROM test WHERE id=1 AND col > 25 LIMIT 1; | 23:52:31,888 | 127.0.0.1 | 60
> Preparing statement | 23:52:31,888 | 127.0.0.1 | 277
> Executing single-partition query on test | 23:52:31,889 | 127.0.0.1 | 961
> Acquiring sstable references | 23:52:31,889 | 127.0.0.1 |
> {code}

[jira] [Commented] (CASSANDRA-7392) Abort in-progress queries that time out

2015-07-26 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642147#comment-14642147
 ] 

Stefania commented on CASSANDRA-7392:
-------------------------------------

Rebased today, CI currently running:

http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-7392-testall/lastCompletedBuild/testReport/
http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-7392-dtest/lastCompletedBuild/testReport/

> Abort in-progress queries that time out
> ---------------------------------------
>
> Key: CASSANDRA-7392
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7392
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Stefania
> Fix For: 3.x
>
>
> Currently we drop queries that time out before we get to them (because the 
> node is overloaded) but not queries that time out while being processed.  
> (Particularly common for index queries on data that shouldn't be indexed.)  
> Adding the latter, and logging when we have to interrupt one, gets us a poor 
> man's "slow query log" for free.
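
A minimal sketch of the idea (not Cassandra's implementation; names are 
hypothetical): run the query under a deadline, interrupt it on timeout, and 
log the interruption, which is the free "slow query log".

{code}
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class AbortingQueryExecutorSketch
{
    private final ExecutorService workers = Executors.newFixedThreadPool(8);

    <T> T execute(Callable<T> query, long timeoutMillis) throws Exception
    {
        Future<T> future = workers.submit(query);
        try
        {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        }
        catch (TimeoutException e)
        {
            future.cancel(true); // abort the in-progress query
            System.err.printf("Aborted query after %dms%n", timeoutMillis); // log the interruption
            throw e;
        }
    }
}
{code}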



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-26 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642123#comment-14642123
 ] 

Benedict commented on CASSANDRA-6477:
-------------------------------------

On second thoughts, I don't think that approach will suffice. Since the second 
write order group is started arbitrarily long after the first, issuing barriers 
against both offers no guarantees. It only shrinks the window of exposure to 
the problem.

I think the simplest and safest thing is to just make the write order global.

> Materialized Views (was: Global Indexes)
> ----------------------------------------
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>Assignee: Carl Yeksigian
>  Labels: cql
> Fix For: 3.0 alpha 1
>
> Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9753) LOCAL_QUORUM reads can block cross-DC if there is a digest mismatch

2015-07-26 Thread Paulo Motta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642031#comment-14642031
 ] 

Paulo Motta commented on CASSANDRA-9753:


What is the {{read_repair_chance}} (*NOT* {{dc_local_read_repair}})? If > 0, 
then maybe this is a duplicate of 
[CASSANDRA-8479|https://issues.apache.org/jira/browse/CASSANDRA-8479], because 
currently {{read_repair_chance}} is orthogonal to the consistency level and may 
cross DC boundaries even if the CL is LOCAL_*. You may be interested in the 
discussion in 
[CASSANDRA-6887|https://issues.apache.org/jira/browse/CASSANDRA-6887] about 
changing that behavior.

> LOCAL_QUORUM reads can block cross-DC if there is a digest mismatch
> -------------------------------------------------------------------
>
> Key: CASSANDRA-9753
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9753
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core
>Reporter: Richard Low
>
> When there is a digest mismatch during the initial read, a data read request 
> is sent to all replicas involved in the initial read. This can be more than 
> the initial blockFor if read repair was done and if speculative retry kicked 
> in. E.g. for RF 3 in two DCs, the number of reads could be 4: 2 for 
> LOCAL_QUORUM, 1 for read repair and 1 for speculative read if one replica was 
> slow. If there is then a digest mismatch, Cassandra will issue the data read 
> to all 4 and set blockFor=4. Now the read query is blocked on cross-DC 
> latency. The digest mismatch read blockFor should be capped at RF for the 
> local DC when using CL.LOCAL_*.
> You can reproduce this behaviour by creating a keyspace with 
> NetworkTopologyStrategy, RF 3 per DC, dc_local_read_repair=1.0 and ALWAYS for 
> speculative read. If you force a digest mismatch (e.g. by deleting a replica's 
> SSTables and restarting) you can see in tracing that it is blocking for 4 
> responses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (CASSANDRA-9898) cqlsh crashes if it load a utf-8 file.

2015-07-26 Thread Philip Thompson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Thompson updated CASSANDRA-9898:
---------------------------------------
Reproduced In: 2.1.8, 2.2.0 rc2  (was: 2.2.0 rc2, 2.1.8)
   Labels: cqlsh  (was: )

> cqlsh crashes if it load a utf-8 file.
> --------------------------------------
>
> Key: CASSANDRA-9898
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9898
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
> Environment: linux, os x yosemite.
>Reporter: Yasuharu Goto
>Assignee: Yasuharu Goto
>Priority: Minor
>  Labels: cqlsh
> Attachments: cassandra-2.1-9898.txt, cassandra-2.2-9898.txt
>
>
> cqlsh crashes when it loads a CQL script file encoded in UTF-8.
> This is a reproduction procedure.
> {quote}
> $cat ./test.cql
> // 日本語のコメント
> use system;
> select * from system.peers;
> $cqlsh --version
> cqlsh 5.0.1
> $cqlsh -f ./test.cql
> Traceback (most recent call last):
>   File "./cqlsh", line 2459, in <module>
> main(*read_options(sys.argv[1:], os.environ))
>   File "./cqlsh", line 2451, in main
> shell.cmdloop()
>   File "./cqlsh", line 940, in cmdloop
> line = self.get_input_line(self.prompt)
>   File "./cqlsh", line 909, in get_input_line
> self.lastcmd = self.stdin.readline()
>   File 
> "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py",
>  line 675, in readline
> return self.reader.readline(size)
>   File 
> "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py",
>  line 530, in readline
> data = self.read(readsize, firstline=True)
>   File 
> "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py",
>  line 477, in read
> newchars, decodedbytes = self.decode(data, self.errors)
> {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9459) SecondaryIndex API redesign

2015-07-26 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641921#comment-14641921
 ] 

Sylvain Lebresne commented on CASSANDRA-9459:
---------------------------------------------

bq. I believe CASSANDRA-8717 is/was broken by CASSANDRA-8099. The post 
reconcilliation processing step is still there, but it looks like the code for 
scanning all ranges was removed from StorageProxy.

I think we're good, or at least that's the intention. The "scan all ranges" 
option pre-CASSANDRA-8099 is just an ugly way to ask the code not to respect 
the user limit before the post-reconciliation function is called, since the 
limit is the only thing that makes us stop scanning all ranges. However, 
post-CASSANDRA-8099, the user limit is enforced _after_ the post-reconciliation 
call. So an implementation that wants to use CASSANDRA-8717 can consume as much 
of the iterator passed to the post-reconciliation function as it wants/needs, 
and in particular it will get all ranges if it consumes it all. In other words, 
we now support CASSANDRA-8717 with just the post-reconciliation function, and 
that's a feature since it's cleaner.
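
The ordering described above, as a schematic (hypothetical signatures, not the 
actual StorageProxy code): the post-reconciliation function receives the merged 
iterator before the user limit is applied, so it may consume as much of it as 
it needs.

{code}
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

final class PostReconciliationSketch
{
    static <T> List<T> query(Iterator<T> reconciled,
                             Function<Iterator<T>, Iterator<T>> postReconciliation,
                             int userLimit)
    {
        // The extension point sees everything, across all ranges if it reads that far...
        Iterator<T> processed = postReconciliation.apply(reconciled);

        // ...and only afterwards is the user limit enforced.
        List<T> result = new ArrayList<>();
        while (processed.hasNext() && result.size() < userLimit)
            result.add(processed.next());
        return result;
    }
}
{code}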

> SecondaryIndex API redesign
> ---------------------------
>
> Key: CASSANDRA-9459
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9459
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sam Tunnicliffe
>Assignee: Sam Tunnicliffe
> Fix For: 3.0 beta 1
>
>
> For some time now the index subsystem has been a pain point and in large part 
> this is due to the way that the APIs and principal classes have grown 
> organically over the years. It would be a good idea to conduct a wholesale 
> review of the area and see if we can come up with something a bit more 
> coherent.
> A few starting points:
> * There's a lot in AbstractPerColumnSecondaryIndex & its subclasses which 
> could be pulled up into SecondaryIndexSearcher (note that to an extent, this 
> is done in CASSANDRA-8099).
> * SecondaryIndexManager is overly complex and several of its functions should 
> be simplified/re-examined. The handling of which columns are indexed and 
> index selection on both the read and write paths are somewhat dense and 
> unintuitive.
> * The SecondaryIndex class hierarchy is rather convoluted and could use some 
> serious rework.
> There are a number of outstanding tickets which we should be able to roll 
> into this higher level one as subtasks (but I'll defer doing that until 
> getting into the details of the redesign):
> * CASSANDRA-7771
> * CASSANDRA-8103
> * CASSANDRA-9041
> * CASSANDRA-4458
> * CASSANDRA-8505
> Whilst they're not hard dependencies, I propose that this be done on top of 
> both CASSANDRA-8099 and CASSANDRA-6717. The former largely because the 
> storage engine changes may facilitate a friendlier index API, but also 
> because of the changes to SIS mentioned above. As for 6717, the changes to 
> schema tables there will help facilitate CASSANDRA-7771.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)