[jira] [Updated] (CASSANDRA-8630) Faster sequential IO (on compaction, streaming, etc)
[ https://issues.apache.org/jira/browse/CASSANDRA-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-8630: Attachment: flight_recorder_001_files.tar.gz > Faster sequential IO (on compaction, streaming, etc) > > > Key: CASSANDRA-8630 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8630 > Project: Cassandra > Issue Type: Improvement > Components: Core, Tools > Reporter: Oleg Anastasyev > Assignee: Stefania > Labels: compaction, performance > Fix For: 3.x > > Attachments: 8630-FasterSequencialReadsAndWrites.txt, cpu_load.png, > flight_recorder_001_files.tar.gz > > > When a node is doing a lot of sequential IO (streaming, compacting, etc) a lot > of CPU is lost in calls to RAF's int read() and DataOutputStream's write(int). > This is because the default implementations of readShort, readLong, etc, as well as > their matching write* methods, are implemented as numerous byte-by-byte > reads and writes. > This makes a lot of syscalls as well. > A quick microbenchmark shows that just reimplementing these methods gives > an 8x speed increase. > The attached patch implements the RandomAccessReader.read and > SequentialWriter.write methods in a more efficient way. > I also eliminated some extra byte copies in CompositeType.split and > ColumnNameHelper.maxComponents, which were on my profiler's hotspot method > list during tests. > Stress tests on my laptop show that this patch makes compaction 25-30% > faster on uncompressed sstables and 15% faster on compressed ones. > A deployment to production shows much less CPU load for compaction. > (I attached a CPU load graph from one of our production clusters; orange is niced CPU > load, i.e. compaction; yellow is user, i.e. tasks not related to compaction) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
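The speed-up described above comes from replacing byte-at-a-time read() calls (each one a virtual call, a bounds check, and potentially a syscall) with one bulk read into a buffer that is assembled in memory. A minimal sketch of the idea, not the patch's actual code (class and method names here are illustrative, not Cassandra's):

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Sketch: DataInputStream.readLong() issues eight single-byte read() calls;
// reading the eight bytes with one bulk read(byte[], int, int) and assembling
// the value in memory avoids the per-byte call and syscall overhead.
public final class BulkRead {

    // Read 8 bytes in one (or a few) bulk calls and assemble a big-endian long.
    public static long readLong(InputStream in) throws IOException {
        byte[] buf = new byte[8];
        int off = 0;
        while (off < 8) {
            int n = in.read(buf, off, 8 - off); // bulk read, not 8 separate calls
            if (n < 0)
                throw new EOFException("stream ended before 8 bytes were read");
            off += n;
        }
        long v = 0;
        for (int i = 0; i < 8; i++)
            v = (v << 8) | (buf[i] & 0xFFL); // assemble big-endian in memory
        return v;
    }
}
```

The same trick applies on the write side: staging multi-byte values into an internal buffer and flushing it in large chunks also reduces the syscall count.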
[jira] [Commented] (CASSANDRA-9265) Add checksum to saved cache files
[ https://issues.apache.org/jira/browse/CASSANDRA-9265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642360#comment-14642360 ] Daniel Chia commented on CASSANDRA-9265: To clarify my question further - it seems we already version the key cache file format, but we don't seem to guarantee any sort of backwards compatibility. How important would it be to be able to read version 'b' if we went to version 'c'? > Add checksum to saved cache files > - > > Key: CASSANDRA-9265 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9265 > Project: Cassandra > Issue Type: Improvement > Reporter: Ariel Weisberg > Fix For: 3.x > > > Saved caches are not covered by a checksum. We should at least emit a > checksum. My suggestion is a large checksum of the whole file (convenient for > offline validation), and then smaller per-record checksums after each record > is written (possibly a subset of the incrementally maintained larger > checksum). > I wouldn't go for anything fancy to try to recover from corruption, since it > is just a saved cache. If corruption is detected while reading, I would just > have it bail out. I would rather have less code to review and test in this > instance.
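The proposal above (a per-record checksum plus an incrementally maintained whole-file checksum) can be sketched with java.util.zip.CRC32. The record layout here is hypothetical, not the actual saved-cache format:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Sketch of the proposal: after each record, write a CRC of that record;
// at the end of the file, write a CRC of all record bytes. The layout
// (record bytes, then 4-byte CRC, ... , trailing 4-byte file CRC) is
// hypothetical and only illustrates the checksumming scheme.
public final class ChecksummedWriter {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private final CRC32 fileCrc = new CRC32(); // incrementally maintained

    public void writeRecord(byte[] record) throws IOException {
        CRC32 recordCrc = new CRC32();
        recordCrc.update(record);
        out.write(record);
        out.write(intToBytes((int) recordCrc.getValue())); // per-record checksum
        fileCrc.update(record); // feed the whole-file checksum as we go
    }

    public byte[] finish() throws IOException {
        out.write(intToBytes((int) fileCrc.getValue())); // trailing file checksum
        return out.toByteArray();
    }

    static byte[] intToBytes(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }
}
```

On read, a mismatch on either checksum would simply discard the rest of the cache, matching the "just bail out" policy described above.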
[jira] [Commented] (CASSANDRA-9753) LOCAL_QUORUM reads can block cross-DC if there is a digest mismatch
[ https://issues.apache.org/jira/browse/CASSANDRA-9753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642264#comment-14642264 ] sankalp kohli commented on CASSANDRA-9753: -- I think this is a little different. In CASSANDRA-6887, they are discussing global read repair chance vs a CL of LOCAL_*. Here, even when we are using dc_local_read_repair and a LOCAL consistency level, the read is still blocking. The reason is that speculative retry gets mixed in here. > LOCAL_QUORUM reads can block cross-DC if there is a digest mismatch > --- > > Key: CASSANDRA-9753 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9753 > Project: Cassandra > Issue Type: Bug > Components: Core > Reporter: Richard Low > > When there is a digest mismatch during the initial read, a data read request > is sent to all replicas involved in the initial read. This can be more than > the initial blockFor if read repair was done and if speculative retry kicked > in. E.g. for RF 3 in two DCs, the number of reads could be 4: 2 for > LOCAL_QUORUM, 1 for read repair and 1 for a speculative read if one replica was > slow. If there is then a digest mismatch, Cassandra will issue the data read > to all 4 and set blockFor=4. Now the read query is blocked on cross-DC > latency. The digest mismatch read blockFor should be capped at the RF of the > local DC when using CL.LOCAL_*. > You can reproduce this behaviour by creating a keyspace with > NetworkTopologyStrategy, RF 3 per DC, dc_local_read_repair=1.0 and ALWAYS for > speculative read. If you force a digest mismatch (e.g. by deleting a replica's > SSTables and restarting) you can see in tracing that it is blocking for 4 > responses.
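The fix the description asks for amounts to clamping the number of responses the follow-up data read waits on. A minimal sketch of the intended rule, with hypothetical names (the real logic lives in Cassandra's read callback, not in a helper like this):

```java
// Sketch of the proposed rule: when the consistency level is LOCAL_*, the
// number of data-read responses to block on after a digest mismatch should
// never exceed the replication factor of the local DC, even if read repair
// or speculative retry contacted more replicas. Names are hypothetical.
public final class BlockForCap {
    enum CL { LOCAL_ONE, LOCAL_QUORUM, QUORUM, ALL }

    static boolean isLocal(CL cl) {
        return cl == CL.LOCAL_ONE || cl == CL.LOCAL_QUORUM;
    }

    // contacted = replicas involved in the initial read (possibly including
    // read-repair and speculative targets); localRf = RF of the local DC.
    static int blockFor(CL cl, int contacted, int localRf) {
        return isLocal(cl) ? Math.min(contacted, localRf) : contacted;
    }
}
```

In the RF-3-per-DC example above, contacted is 4 but blockFor(LOCAL_QUORUM, 4, 3) is 3, so the query would no longer wait on a cross-DC response.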
[jira] [Commented] (CASSANDRA-8180) Optimize disk seek using min/max column name meta data when the LIMIT clause is used
[ https://issues.apache.org/jira/browse/CASSANDRA-8180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642236#comment-14642236 ] Stefania commented on CASSANDRA-8180: - [~blambov] I've pushed a new commit where I ensure that lower bounds always compare less than real values when the clustering is the same. This means the existing merge iterator algorithm is now almost unchanged; the only difference is that I moved consume() into the candidate, where we make sure the lower bounds are never consumed by the reducer. We still use empty rows as lower bounds, but they are never used outside of the merge iterator candidate. However, we could use a specialized {{Unfiltered}} if it really bothers you. Can you take another look? As for performance, with this test the 8180 branch is ahead: {code} user profile=https://dl.dropboxusercontent.com/u/15683245/8180.yaml ops\(insert=1,\) n=5M -rate threads=300 -insert revisit=uniform\(1..100\) visits=fixed\(25\) -pop seq=1..1K read-lookback=uniform\(1..1K\) contents=SORTED {code} http://cstar.datastax.com/graph?stats=094f57cc-3409-11e5-bd2b-42010af0688f&metric=op_rate&operation=1_user&smoothing=1&show_aggregates=true&xmin=0&xmax=1609.08&ymin=0&ymax=4637.6 The read command is unchanged and performance is similar, or maybe still better on trunk: {code} user profile=https://dl.dropboxusercontent.com/u/15683245/8180.yaml ops\(singleval=1,\) n=5M -rate threads=300 {code} http://cstar.datastax.com/graph?stats=094f57cc-3409-11e5-bd2b-42010af0688f&metric=op_rate&operation=2_user&smoothing=1&show_aggregates=true&xmin=0&xmax=35.53&ymin=0&ymax=188763.3 I think we are still visiting all sstables for the second command because the global bounds are probably the same. The profile is attached as 8180_002.yaml. I've also noticed with flight recorder that {{BigTableReader.getPosition()}} and {{Tracing.trace()}} are hotspots, both on trunk and on 8180; we should probably optimize them.
> Optimize disk seek using min/max column name meta data when the LIMIT clause > is used > > > Key: CASSANDRA-8180 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8180 > Project: Cassandra > Issue Type: Improvement > Components: Core > Environment: Cassandra 2.0.10 >Reporter: DOAN DuyHai >Assignee: Stefania >Priority: Minor > Fix For: 3.x > > Attachments: 8180_001.yaml, 8180_002.yaml > > > I was working on an example of sensor data table (timeseries) and face a use > case where C* does not optimize read on disk. > {code} > cqlsh:test> CREATE TABLE test(id int, col int, val text, PRIMARY KEY(id,col)) > WITH CLUSTERING ORDER BY (col DESC); > cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 10, '10'); > ... > >nodetool flush test test > ... > cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 20, '20'); > ... > >nodetool flush test test > ... > cqlsh:test> INSERT INTO test(id, col , val ) VALUES ( 1, 30, '30'); > ... > >nodetool flush test test > {code} > After that, I activate request tracing: > {code} > cqlsh:test> SELECT * FROM test WHERE id=1 LIMIT 1; > activity | > timestamp| source| source_elapsed > ---+--+---+ > execute_cql3_query | > 23:48:46,498 | 127.0.0.1 | 0 > Parsing SELECT * FROM test WHERE id=1 LIMIT 1; | > 23:48:46,498 | 127.0.0.1 | 74 >Preparing statement | > 23:48:46,499 | 127.0.0.1 |253 > Executing single-partition query on test | > 23:48:46,499 | 127.0.0.1 |930 > Acquiring sstable references | > 23:48:46,499 | 127.0.0.1 |943 >Merging memtable tombstones | > 23:48:46,499 | 127.0.0.1 | 1032 >Key cache hit for sstable 3 | > 23:48:46,500 | 127.0.0.1 | 1160 >Seeking to partition beginning in data file | > 23:48:46,500 | 127.0.0.1 | 1173 >Key cache hit for sstable 2 | > 23:48:46,500 | 127.0.0.1 | 1889 >Seeking to partition beginning in data file | > 23:48:46,500 | 127.0.0.1 | 1901 >Key cache hit for sstable 1 | > 23:48:46,501 | 127.0.0.1 | 2373 >Seeking to
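The trick Stefania describes — each source first contributes a cheap lower bound (e.g. from sstable metadata) that sorts strictly before any real value with the same clustering, and bounds are never handed to the reducer — can be sketched with a small k-way merge. Everything here is illustrative; Cassandra's actual MergeIterator and candidates are more involved:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Sketch: merge sorted int sources where each source is first represented
// by a lower bound. On equal keys the bound sorts first, so a source's real
// values are only pulled once its bound reaches the head of the queue, and
// bounds themselves are never emitted to the consumer. Illustrative only.
public final class LowerBoundMerge {
    static final class Candidate implements Comparable<Candidate> {
        final int key;
        final boolean isLowerBound;
        final Iterator<Integer> source;

        Candidate(int key, boolean isLowerBound, Iterator<Integer> source) {
            this.key = key;
            this.isLowerBound = isLowerBound;
            this.source = source;
        }

        @Override
        public int compareTo(Candidate o) {
            int c = Integer.compare(key, o.key);
            if (c != 0) return c;
            // On ties, a lower bound compares less than a real value.
            return Boolean.compare(o.isLowerBound, isLowerBound);
        }
    }

    // bounds.get(i) must be <= the first value of sources.get(i).
    static List<Integer> merge(List<Integer> bounds, List<Iterator<Integer>> sources) {
        PriorityQueue<Candidate> pq = new PriorityQueue<>();
        for (int i = 0; i < sources.size(); i++)
            pq.add(new Candidate(bounds.get(i), true, sources.get(i)));
        List<Integer> out = new ArrayList<>();
        while (!pq.isEmpty()) {
            Candidate c = pq.poll();
            if (!c.isLowerBound)
                out.add(c.key); // only real values reach the consumer
            if (c.source.hasNext()) // advance the source in either case
                pq.add(new Candidate(c.source.next(), false, c.source));
        }
        return out;
    }
}
```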
[jira] [Updated] (CASSANDRA-8180) Optimize disk seek using min/max column name meta data when the LIMIT clause is used
[ https://issues.apache.org/jira/browse/CASSANDRA-8180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-8180: Attachment: 8180_002.yaml 8180_001.yaml
[jira] [Updated] (CASSANDRA-8180) Optimize disk seek using min/max column name meta data when the LIMIT clause is used
[ https://issues.apache.org/jira/browse/CASSANDRA-8180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-8180: Attachment: (was: 8180.yaml)
[jira] [Updated] (CASSANDRA-8180) Optimize disk seek using min/max column name meta data when the LIMIT clause is used
[ https://issues.apache.org/jira/browse/CASSANDRA-8180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefania updated CASSANDRA-8180: Reviewer: Branimir Lambov (was: Sylvain Lebresne)
[jira] [Commented] (CASSANDRA-7392) Abort in-progress queries that time out
[ https://issues.apache.org/jira/browse/CASSANDRA-7392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642147#comment-14642147 ] Stefania commented on CASSANDRA-7392: - Rebased today, CI currently running: http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-7392-testall/lastCompletedBuild/testReport/ http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-7392-dtest/lastCompletedBuild/testReport/ > Abort in-progress queries that time out > --- > > Key: CASSANDRA-7392 > URL: https://issues.apache.org/jira/browse/CASSANDRA-7392 > Project: Cassandra > Issue Type: New Feature > Components: Core > Reporter: Jonathan Ellis > Assignee: Stefania > Fix For: 3.x > > > Currently we drop queries that time out before we get to them (because the node > is overloaded), but not queries that time out while being processed. > (Particularly common for index queries on data that shouldn't be indexed.) > Adding the latter, and logging when we have to interrupt one, gets us a poor > man's "slow query log" for free.
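The feature described (abort queries that time out mid-execution and log them) boils down to checkpointing a deadline inside the read loop. A minimal sketch under assumed names; this is not the actual CASSANDRA-7392 implementation:

```java
// Sketch of a "poor man's slow query log": an in-progress operation records
// its start time and a timeout; the execution loop calls checkpoint() at
// safe points and aborts, with a log line, once the deadline has passed.
public final class MonitoredOperation {
    private final long startNanos;
    private final long timeoutNanos;
    private final String description;

    public MonitoredOperation(String description, long timeoutMillis) {
        this.description = description;
        this.startNanos = System.nanoTime();
        this.timeoutNanos = timeoutMillis * 1_000_000L;
    }

    public boolean timedOut() {
        return System.nanoTime() - startNanos > timeoutNanos;
    }

    // Called periodically from the query's execution loop.
    public void checkpoint() {
        if (timedOut()) {
            System.err.println("Aborting slow query: " + description);
            throw new IllegalStateException("query timed out: " + description);
        }
    }
}
```

The log line emitted on abort is what turns the timeout mechanism into a free slow-query log: every interrupted query leaves a trace of what it was.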
[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)
[ https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642123#comment-14642123 ] Benedict commented on CASSANDRA-6477: - On second thoughts, I don't think that approach will suffice. Since the second write order group is started arbitrarily long after the first, issuing barriers against both offers no guarantees. It only shrinks the window of exposure to the problem. I think the simplest and safest thing is to just make the write order global. > Materialized Views (was: Global Indexes) > > > Key: CASSANDRA-6477 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6477 > Project: Cassandra > Issue Type: New Feature > Components: API, Core > Reporter: Jonathan Ellis > Assignee: Carl Yeksigian > Labels: cql > Fix For: 3.0 alpha 1 > > Attachments: test-view-data.sh, users.yaml > > > Local indexes are suitable for low-cardinality data, where spreading the > index across the cluster is a Good Thing. However, for high-cardinality > data, local indexes require querying most nodes in the cluster even if only a > handful of rows is returned.
[jira] [Commented] (CASSANDRA-9753) LOCAL_QUORUM reads can block cross-DC if there is a digest mismatch
[ https://issues.apache.org/jira/browse/CASSANDRA-9753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642031#comment-14642031 ] Paulo Motta commented on CASSANDRA-9753: What is the {{read_repair_chance}} (*NOT* {{dc_local_read_repair}})? If > 0, this may be a duplicate of [CASSANDRA-8479|https://issues.apache.org/jira/browse/CASSANDRA-8479], because currently {{read_repair_chance}} is orthogonal to the consistency level and may cross DC boundaries even if the CL is LOCAL_*. You may be interested in the discussion in [CASSANDRA-6887|https://issues.apache.org/jira/browse/CASSANDRA-6887] about changing that behavior.
[jira] [Updated] (CASSANDRA-9898) cqlsh crashes if it loads a UTF-8 file.
[ https://issues.apache.org/jira/browse/CASSANDRA-9898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Thompson updated CASSANDRA-9898: --- Reproduced In: 2.1.8, 2.2.0 rc2 (was: 2.2.0 rc2, 2.1.8) Labels: cqlsh (was: ) > cqlsh crashes if it loads a UTF-8 file. > -- > > Key: CASSANDRA-9898 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9898 > Project: Cassandra > Issue Type: Bug > Components: Tools > Environment: linux, os x yosemite. > Reporter: Yasuharu Goto > Assignee: Yasuharu Goto > Priority: Minor > Labels: cqlsh > Attachments: cassandra-2.1-9898.txt, cassandra-2.2-9898.txt > > > cqlsh crashes when it loads a CQL script file encoded in UTF-8. > This is a reproduction procedure. > {quote} > $cat ./test.cql > // 日本語のコメント (a comment in Japanese) > use system; > select * from system.peers; > $cqlsh --version > cqlsh 5.0.1 > $cqlsh -f ./test.cql > Traceback (most recent call last): > File "./cqlsh", line 2459, in > main(*read_options(sys.argv[1:], os.environ)) > File "./cqlsh", line 2451, in main > shell.cmdloop() > File "./cqlsh", line 940, in cmdloop > line = self.get_input_line(self.prompt) > File "./cqlsh", line 909, in get_input_line > self.lastcmd = self.stdin.readline() > File > "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", > line 675, in readline > return self.reader.readline(size) > File > "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", > line 530, in readline > data = self.read(readsize, firstline=True) > File > "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", > line 477, in read > newchars, decodedbytes = self.decode(data, self.errors) > {quote}
[jira] [Commented] (CASSANDRA-9459) SecondaryIndex API redesign
[ https://issues.apache.org/jira/browse/CASSANDRA-9459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14641921#comment-14641921 ] Sylvain Lebresne commented on CASSANDRA-9459: - bq. I believe CASSANDRA-8717 is/was broken by CASSANDRA-8099. The post reconcilliation processing step is still there, but it looks like the code for scanning all ranges was removed from StorageProxy. I think we're good, at least that's the intention. The "scan all ranges" option pre-CASSANDRA-8099 was just an ugly way to ask the code not to respect the user limit before the post-reconciliation function is called, since the limit is the only thing that makes us stop scanning ranges. However, post-CASSANDRA-8099, the user limit is enforced _after_ the post-reconciliation call. So an implementation that wants to use CASSANDRA-8717 can consume as much of the iterator passed to the post-reconciliation function as it wants/needs; in particular, it will get all ranges if it consumes the iterator fully. In other words, we now support CASSANDRA-8717 with just the post-reconciliation function, and that's a feature since it's cleaner. > SecondaryIndex API redesign > --- > > Key: CASSANDRA-9459 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9459 > Project: Cassandra > Issue Type: Improvement > Reporter: Sam Tunnicliffe > Assignee: Sam Tunnicliffe > Fix For: 3.0 beta 1 > > > For some time now the index subsystem has been a pain point, and in large part > this is due to the way that the APIs and principal classes have grown > organically over the years. It would be a good idea to conduct a wholesale > review of the area and see if we can come up with something a bit more > coherent. > A few starting points: > * There's a lot in AbstractPerColumnSecondaryIndex & its subclasses which > could be pulled up into SecondaryIndexSearcher (note that to an extent, this > is done in CASSANDRA-8099).
> * SecondaryIndexManager is overly complex and several of its functions should > be simplified/re-examined. The handling of which columns are indexed and > index selection on both the read and write paths are somewhat dense and > unintuitive. > * The SecondaryIndex class hierarchy is rather convoluted and could use some > serious rework. > There are a number of outstanding tickets which we should be able to roll > into this higher level one as subtasks (but I'll defer doing that until > getting into the details of the redesign): > * CASSANDRA-7771 > * CASSANDRA-8103 > * CASSANDRA-9041 > * CASSANDRA-4458 > * CASSANDRA-8505 > Whilst they're not hard dependencies, I propose that this be done on top of > both CASSANDRA-8099 and CASSANDRA-6717. The former largely because the > storage engine changes may facilitate a friendlier index API, but also > because of the changes to SIS mentioned above. As for 6717, the changes to > schema tables there will help facilitate CASSANDRA-7771.
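The enforcement order Sylvain describes in his comment above — the post-reconciliation hook sees the full merged stream, and the user limit is applied only afterwards — can be sketched as a simple pipeline. The names here are hypothetical, not Cassandra's API:

```java
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch of the post-CASSANDRA-8099 ordering described above: the
// post-reconciliation function is applied to the (lazy) reconciled stream
// first, and the user LIMIT cuts it off only afterwards. An index
// implementation can therefore consume as much of the stream as it needs.
public final class PostReconciliation {
    static List<String> execute(Stream<String> reconciled,
                                UnaryOperator<Stream<String>> postProcess,
                                long userLimit) {
        // post-reconciliation first, user limit second
        return postProcess.apply(reconciled)
                          .limit(userLimit)
                          .collect(Collectors.toList());
    }
}
```

With postProcess applied before limit, a filter that drops rows still yields a full page of results; applying the limit first (the pre-8099 behavior this replaces) would truncate the stream before the hook ever saw the later rows.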