[ https://issues.apache.org/jira/browse/CASSANDRA-13863 ]
Joshua McKenzie updated CASSANDRA-13863:
----------------------------------------
    Component/s: Core

Speculative retry causes read repair even if read_repair_chance is 0.0
-----------------------------------------------------------------------

                Key: CASSANDRA-13863
                URL: https://issues.apache.org/jira/browse/CASSANDRA-13863
            Project: Cassandra
         Issue Type: Improvement
         Components: Core
           Reporter: Hiro Wakabayashi
       Attachments: 0001-Use-read_repair_chance-when-starting-repairs-due-to-.patch, speculative retries.pdf

Setting {{read_repair_chance = 0.0}} and {{dclocal_read_repair_chance = 0.0}} should cause no read repair, yet read repair still happens when speculative retry kicks in. I think these two settings should disable read repair completely, because there are cases where users deliberately want it off.

{panel:title=Case 1: TWCS users}
The [documentation|http://cassandra.apache.org/doc/latest/operating/compaction.html?highlight=read_repair_chance] states how to disable read repair:
{quote}While TWCS tries to minimize the impact of comingled data, users should attempt to avoid this behavior. Specifically, users should avoid queries that explicitly set the timestamp via CQL USING TIMESTAMP. Additionally, users should run frequent repairs (which streams data in such a way that it does not become comingled), and disable background read repair by setting the table's read_repair_chance and dclocal_read_repair_chance to 0.
{quote}
{panel}

{panel:title=Case 2: Strict SLA for read latency}
During peak hours read latency is critical for us, and read repair pushes latency higher than it would otherwise be. We can run anti-entropy repair during off-peak hours to maintain consistency.
{panel}

Here is my procedure to reproduce the problem.

h3. 1. Create a cluster and set {{hinted_handoff_enabled}} to false.
{noformat}
$ ccm create -v 3.0.14 -n 3 cluster_3.0.14
$ for h in $(seq 1 3) ; do perl -pi -e 's/hinted_handoff_enabled: true/hinted_handoff_enabled: false/' ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
$ for h in $(seq 1 3) ; do grep "hinted_handoff_enabled:" ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
hinted_handoff_enabled: false
hinted_handoff_enabled: false
hinted_handoff_enabled: false
$ ccm start
{noformat}

h3. 2. Create a keyspace and a table.
{noformat}
$ ccm node1 cqlsh
DROP KEYSPACE IF EXISTS ks1;
CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
CREATE TABLE ks1.t1 (
    key text PRIMARY KEY,
    value blob
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.0
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = 'ALWAYS';
QUIT;
{noformat}
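For completeness, the effective options can be verified from the schema tables before proceeding (an optional sanity check of mine, not part of the original procedure; {{system_schema.tables}} carries these columns in 3.0). It should report 0 for both chances and ALWAYS for speculative retry:
{noformat}
$ ccm node1 cqlsh -e "SELECT read_repair_chance, dclocal_read_repair_chance, speculative_retry FROM system_schema.tables WHERE keyspace_name = 'ks1' AND table_name = 't1';"
{noformat}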
h3. 3. Stop node2 and node3, then insert a row.
{noformat}
$ ccm node3 stop && ccm node2 stop && ccm status
Cluster: 'cluster_3.0.14'
-------------------------
node1: UP
node3: DOWN
node2: DOWN
$ ccm node1 cqlsh -k ks1 -e "consistency; tracing on; insert into ks1.t1 (key, value) values ('mmullass', bigintAsBlob(1));"
Current consistency level is ONE.
Now Tracing is enabled

Tracing session: 01d74590-97cb-11e7-8ea7-c1bd4d549501

activity | timestamp | source | source_elapsed
---------+-----------+--------+----------------
Execute CQL3 query | 2017-09-12 23:59:42.316000 | 127.0.0.1 | 0
Parsing insert into ks1.t1 (key, value) values ('mmullass', bigintAsBlob(1)); [SharedPool-Worker-1] | 2017-09-12 23:59:42.319000 | 127.0.0.1 | 4323
Preparing statement [SharedPool-Worker-1] | 2017-09-12 23:59:42.320000 | 127.0.0.1 | 5250
Determining replicas for mutation [SharedPool-Worker-1] | 2017-09-12 23:59:42.327000 | 127.0.0.1 | 11886
Appending to commitlog [SharedPool-Worker-3] | 2017-09-12 23:59:42.327000 | 127.0.0.1 | 12195
Adding to t1 memtable [SharedPool-Worker-3] | 2017-09-12 23:59:42.327000 | 127.0.0.1 | 12392
Request complete | 2017-09-12 23:59:42.328680 | 127.0.0.1 | 12680

$ ccm node1 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 where key = 'mmullass';"
Current consistency level is ONE.
Now Tracing is enabled

 key      | value
----------+--------------------
 mmullass | 0x0000000000000001

(1 rows)

Tracing session: 3420ce90-97cb-11e7-8ea7-c1bd4d549501

activity | timestamp | source | source_elapsed
---------+-----------+--------+----------------
Execute CQL3 query | 2017-09-13 00:01:06.681000 | 127.0.0.1 | 0
Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-1] | 2017-09-13 00:01:06.681000 | 127.0.0.1 | 296
Preparing statement [SharedPool-Worker-1] | 2017-09-13 00:01:06.681000 | 127.0.0.1 | 561
Executing single-partition query on t1 [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 | 1056
Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 | 1142
Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 | 1206
Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 | 1455
Request complete | 2017-09-13 00:01:06.682794 | 127.0.0.1 | 1794
{noformat}

h3. 4. Start node2 and confirm that node2 has no data.
{noformat}
$ ccm node2 start && ccm status
Cluster: 'cluster_3.0.14'
-------------------------
node1: UP
node3: DOWN
node2: UP
$ ccm node2 nodetool flush
$ ls ~/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db
ls: /Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db: No such file or directory
{noformat}
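To put a number on the repair observed in the next step, the coordinator's counters can be captured first: {{nodetool netstats}} prints a "Read Repair Statistics" section with Attempted and Mismatch counters. This baseline capture is my addition, not part of the original procedure:
{noformat}
$ ccm node2 nodetool netstats
{noformat}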
h3. 5. Select the row from node2; read repair fires.
{noformat}
$ ccm node2 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 where key = 'mmullass';"
Current consistency level is ONE.
Now Tracing is enabled

 key | value
-----+-------

(0 rows)

Tracing session: 72a71fc0-97cb-11e7-83cc-a3af9d3da979

activity | timestamp | source | source_elapsed
---------+-----------+--------+----------------
Execute CQL3 query | 2017-09-13 00:02:51.582000 | 127.0.0.2 | 0
Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-2] | 2017-09-13 00:02:51.583000 | 127.0.0.2 | 1112
Preparing statement [SharedPool-Worker-2] | 2017-09-13 00:02:51.583000 | 127.0.0.2 | 1412
reading data from /127.0.0.1 [SharedPool-Worker-2] | 2017-09-13 00:02:51.584000 | 127.0.0.2 | 2107
Executing single-partition query on t1 [SharedPool-Worker-1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 | 3492
Sending READ message to /127.0.0.1 [MessagingService-Outgoing-/127.0.0.1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 | 3516
Acquiring sstable references [SharedPool-Worker-1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 | 3595
Merging memtable contents [SharedPool-Worker-1] | 2017-09-13 00:02:51.585001 | 127.0.0.2 | 3673
Read 0 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-09-13 00:02:51.585001 | 127.0.0.2 | 3851
READ message received from /127.0.0.2 [MessagingService-Incoming-/127.0.0.2] | 2017-09-13 00:02:51.588000 | 127.0.0.1 | 33
Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 | 12444
Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 | 12536
Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 | 12765
Enqueuing response to /127.0.0.2 [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 | 12929
Sending REQUEST_RESPONSE message to /127.0.0.2 [MessagingService-Outgoing-/127.0.0.2] | 2017-09-13 00:02:51.602000 | 127.0.0.1 | 14686
REQUEST_RESPONSE message received from /127.0.0.1 [MessagingService-Incoming-/127.0.0.1] | 2017-09-13 00:02:51.603000 | 127.0.0.2 | --
Processing response from /127.0.0.1 [SharedPool-Worker-3] | 2017-09-13 00:02:51.610000 | 127.0.0.2 | --
Initiating read-repair [SharedPool-Worker-3] | 2017-09-13 00:02:51.610000 | 127.0.0.2 | --
Digest mismatch: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(-4886857781295767937, 6d6d756c6c617373) (d41d8cd98f00b204e9800998ecf8427e vs f8e0f9262a889cd3ebf4e5d50159757b) [ReadRepairStage:1] | 2017-09-13 00:02:51.624000 | 127.0.0.2 | --
Request complete | 2017-09-13 00:02:51.586892 | 127.0.0.2 | 4892
{noformat}
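The digest mismatch above is the trigger: with {{speculative_retry = 'ALWAYS'}} the coordinator consults an extra replica even at consistency ONE, and the conflicting responses (node2 has nothing, node1 has the row) are reconciled by read repair regardless of the chance settings. Re-running the netstats check at this point should show the mismatch counters incremented relative to the baseline (again my optional cross-check, not part of the original procedure):
{noformat}
$ ccm node2 nodetool netstats
{noformat}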
h3. 6. As a result, node2 now has the row.
{noformat}
$ ccm node2 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 where key = 'mmullass';"
Current consistency level is ONE.
Now Tracing is enabled

 key      | value
----------+--------------------
 mmullass | 0x0000000000000001

(1 rows)

Tracing session: 78526330-97cb-11e7-83cc-a3af9d3da979

activity | timestamp | source | source_elapsed
---------+-----------+--------+----------------
Execute CQL3 query | 2017-09-13 00:03:01.091000 | 127.0.0.2 | 0
Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 | 216
Preparing statement [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 | 390
reading data from /127.0.0.1 [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 | 808
Executing single-partition query on t1 [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 | 1041
READ message received from /127.0.0.2 [MessagingService-Incoming-/127.0.0.2] | 2017-09-13 00:03:01.092000 | 127.0.0.1 | 33
Sending READ message to /127.0.0.1 [MessagingService-Outgoing-/127.0.0.1] | 2017-09-13 00:03:01.092000 | 127.0.0.2 | 1036
Executing single-partition query on t1 [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 | 189
Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 | 1113
Acquiring sstable references [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 | 276
Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 | 1172
Merging memtable contents [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 | 332
REQUEST_RESPONSE message received from /127.0.0.1 [MessagingService-Incoming-/127.0.0.1] | 2017-09-13 00:03:01.093000 | 127.0.0.2 | --
Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-09-13 00:03:01.093000 | 127.0.0.1 | 565
Enqueuing response to /127.0.0.2 [SharedPool-Worker-1] | 2017-09-13 00:03:01.093000 | 127.0.0.1 | 648
Sending REQUEST_RESPONSE message to /127.0.0.2 [MessagingService-Outgoing-/127.0.0.2] | 2017-09-13 00:03:01.093000 | 127.0.0.1 | 783
Processing response from /127.0.0.1 [SharedPool-Worker-1] | 2017-09-13 00:03:01.094000 | 127.0.0.2 | --
Initiating read-repair [SharedPool-Worker-1] | 2017-09-13 00:03:01.099000 | 127.0.0.2 | --
Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:03:01.101000 | 127.0.0.2 | 10113
Request complete | 2017-09-13 00:03:01.092830 | 127.0.0.2 | 1830

$ ccm node2 nodetool flush
$ ls ~/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db
/Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-ec659e0097ca11e78ea7c1bd4d549501/mc-1-big-Data.db
$ ~/.ccm/repository/3.0.14/tools/bin/sstabledump /Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-ec659e0097ca11e78ea7c1bd4d549501/mc-1-big-Data.db -k mmullass
[
  {
    "partition" : {
      "key" : [ "mmullass" ],
      "position" : 0
    },
    "rows" : [
      {
        "type" : "row",
        "position" : 36,
        "liveness_info" : { "tstamp" : "2017-09-12T14:59:42.312969Z" },
        "cells" : [
          { "name" : "value", "value" : "0000000000000001" }
        ]
      }
    ]
  }
]
{noformat}

In [CASSANDRA-11409|https://issues.apache.org/jira/browse/CASSANDRA-11409], [~cam1982] commented that this behavior is not a bug, so I have filed this issue as an improvement.
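For anyone who needs to avoid this behavior before it is addressed: in this reproduction the repair disappears if the table stops speculating, because at consistency ONE only a single replica is then contacted and no digests are compared. This is a possible mitigation sketch of mine, not a project recommendation, and it gives up the latency benefits of speculative retry:
{noformat}
$ ccm node1 cqlsh -e "ALTER TABLE ks1.t1 WITH speculative_retry = 'NONE';"
{noformat}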