[ 
https://issues.apache.org/jira/browse/CASSANDRA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aleksey Yeschenko updated CASSANDRA-13863:
------------------------------------------
    Component/s:     (was: Core)
                 Coordination

> Speculative retry causes read repair even if read_repair_chance is 0.0.
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-13863
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13863
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Coordination
>            Reporter: Hiro Wakabayashi
>         Attachments: 
> 0001-Use-read_repair_chance-when-starting-repairs-due-to-.patch, speculative 
> retries.pdf
>
>
> Setting {{read_repair_chance = 0.0}} and {{dclocal_read_repair_chance = 0.0}} 
> should disable read repair, but read repair still happens when a speculative 
> retry is sent. I think these two settings should stop read repair completely, 
> because in some cases the user needs read repair to be off.
> {panel:title=Case 1: TWCS users}
> The [documentation|http://cassandra.apache.org/doc/latest/operating/compaction.html?highlight=read_repair_chance] states how to disable read repair.
> {quote}While TWCS tries to minimize the impact of comingled data, users 
> should attempt to avoid this behavior. Specifically, users should avoid 
> queries that explicitly set the timestamp via CQL USING TIMESTAMP. 
> Additionally, users should run frequent repairs (which streams data in such a 
> way that it does not become comingled), and disable background read repair by 
> setting the table’s read_repair_chance and dclocal_read_repair_chance to 0.
> {quote}
> {panel}
> {panel:title=Case 2: Strict SLA for read latency}
> At peak time, read latency is critical for us, but read repair makes reads 
> slower than they are without it. We can run anti-entropy repair at off-peak 
> times to restore consistency.
> {panel}
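
For the 3.0.14 test cluster used in this report, the documentation's advice to set both options to 0 could be applied with a statement like the one below. This is only a sketch: the function name disable_read_repair_cql is made up for illustration, and it just prints the CQL rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the CQL that sets both read repair chances to 0
# for a table given as "keyspace.table". It only prints; it does not connect.
disable_read_repair_cql() {
    table="$1"
    printf 'ALTER TABLE %s WITH read_repair_chance = 0.0 AND dclocal_read_repair_chance = 0.0;\n' "$table"
}
```

The output can be passed to cqlsh the same way the report runs its queries, for example: ccm node1 cqlsh -e "$(disable_read_repair_cql ks1.t1)"
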
>  
> Here is my procedure to reproduce the problem.
> h3. 1. Create a cluster and set {{hinted_handoff_enabled}} to false.
> {noformat}
> $ ccm create -v 3.0.14 -n 3 cluster_3.0.14
> $ for h in $(seq 1 3) ; do perl -pi -e 's/hinted_handoff_enabled: true/hinted_handoff_enabled: false/' ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
> $ for h in $(seq 1 3) ; do grep "hinted_handoff_enabled:" ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
> hinted_handoff_enabled: false
> hinted_handoff_enabled: false
> hinted_handoff_enabled: false
> $ ccm start{noformat}
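
The perl one-liner in step 1 can also be written as a small function that edits one cassandra.yaml and checks the result. A minimal sketch, assuming the setting appears at the start of a line as "hinted_handoff_enabled: true"; the function name disable_hinted_handoff is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: set hinted_handoff_enabled to false in one cassandra.yaml.
# It writes to a temporary file first, so it does not depend on "sed -i",
# whose syntax differs between GNU and BSD sed.
disable_hinted_handoff() {
    yaml="$1"
    tmp="$yaml.tmp.$$"
    sed 's/^hinted_handoff_enabled: *true/hinted_handoff_enabled: false/' "$yaml" > "$tmp" &&
        mv "$tmp" "$yaml"
    # Succeed only if the file now contains the disabled setting.
    grep -q '^hinted_handoff_enabled: false' "$yaml"
}
```

It could be called once per node, for example: for h in 1 2 3 ; do disable_hinted_handoff ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
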
> h3. 2. Create a keyspace and a table.
> {noformat}
> $ ccm node1 cqlsh
> DROP KEYSPACE IF EXISTS ks1;
> CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '3'}  AND durable_writes = true;
> CREATE TABLE ks1.t1 (
>         key text PRIMARY KEY,
>         value blob
>     ) WITH bloom_filter_fp_chance = 0.01
>         AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>         AND comment = ''
>         AND compaction = {'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32', 'min_threshold': '4'}
>         AND compression = {'chunk_length_in_kb': '64', 'class': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>         AND crc_check_chance = 1.0
>         AND dclocal_read_repair_chance = 0.0
>         AND default_time_to_live = 0
>         AND gc_grace_seconds = 864000
>         AND max_index_interval = 2048
>         AND memtable_flush_period_in_ms = 0
>         AND min_index_interval = 128
>         AND read_repair_chance = 0.0
>         AND speculative_retry = 'ALWAYS';
> QUIT;
> {noformat}
> h3. 3. Stop node2 and node3. Insert a row.
> {noformat}
> $ ccm node3 stop && ccm node2 stop && ccm status
> Cluster: 'cluster_3.0.14'
> ----------------------
> node1: UP
> node3: DOWN
> node2: DOWN
> $ ccm node1 cqlsh -k ks1 -e "consistency; tracing on; insert into ks1.t1 
> (key, value) values ('mmullass', bigintAsBlob(1));"
> Current consistency level is ONE.
> Now Tracing is enabled
> Tracing session: 01d74590-97cb-11e7-8ea7-c1bd4d549501
>  activity                                                                                            | timestamp                  | source    | source_elapsed
> -----------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------
>                                                                                   Execute CQL3 query | 2017-09-12 23:59:42.316000 | 127.0.0.1 |              0
>  Parsing insert into ks1.t1 (key, value) values ('mmullass', bigintAsBlob(1)); [SharedPool-Worker-1] | 2017-09-12 23:59:42.319000 | 127.0.0.1 |           4323
>                                                            Preparing statement [SharedPool-Worker-1] | 2017-09-12 23:59:42.320000 | 127.0.0.1 |           5250
>                                              Determining replicas for mutation [SharedPool-Worker-1] | 2017-09-12 23:59:42.327000 | 127.0.0.1 |          11886
>                                                         Appending to commitlog [SharedPool-Worker-3] | 2017-09-12 23:59:42.327000 | 127.0.0.1 |          12195
>                                                          Adding to t1 memtable [SharedPool-Worker-3] | 2017-09-12 23:59:42.327000 | 127.0.0.1 |          12392
>                                                                                     Request complete | 2017-09-12 23:59:42.328680 | 127.0.0.1 |          12680
> $ ccm node1 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 
> where key = 'mmullass';"
> Current consistency level is ONE.
> Now Tracing is enabled
>  key      | value
> ----------+--------------------
>  mmullass | 0x0000000000000001
> (1 rows)
> Tracing session: 3420ce90-97cb-11e7-8ea7-c1bd4d549501
>  activity                                                                   | timestamp                  | source    | source_elapsed
> ----------------------------------------------------------------------------+----------------------------+-----------+----------------
>                                                          Execute CQL3 query | 2017-09-13 00:01:06.681000 | 127.0.0.1 |              0
>  Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-1] | 2017-09-13 00:01:06.681000 | 127.0.0.1 |            296
>                                   Preparing statement [SharedPool-Worker-1] | 2017-09-13 00:01:06.681000 | 127.0.0.1 |            561
>                Executing single-partition query on t1 [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 |           1056
>                          Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 |           1142
>                             Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 |           1206
>                     Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 |           1455
>                                                            Request complete | 2017-09-13 00:01:06.682794 | 127.0.0.1 |           1794
> {noformat}
> h3. 4. Start node2 and confirm node2 has no data.
> {noformat}
> $ ccm node2 start && ccm status
> Cluster: 'cluster_3.0.14'
> -------------------------
> node1: UP
> node3: DOWN
> node2: UP
> $ ccm node2 nodetool flush
> $ ls ~/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db
> ls: /Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db: No such file or directory
> {noformat}
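
The ls check above reports an error when no SSTable exists. One alternative is to count the matching files, which yields 0 instead of an error. The helper name count_sstables and its two arguments are assumptions for illustration; the directory layout is the one shown in the ls command. As in the report, nodetool flush should run first so that memtable contents reach disk.

```shell
#!/bin/sh
# Hypothetical helper: count *-Data.db files for one table.
#   $1: keyspace data directory, e.g. ~/.ccm/cluster_3.0.14/node2/data0/ks1
#   $2: table name, e.g. t1
count_sstables() {
    n=0
    # Same glob as the ls command; if nothing matches, the pattern stays
    # literal and the -f test below rejects it.
    for f in "$1"/"$2"-*/*-Data.db; do
        if [ -f "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}
```

For example: count_sstables ~/.ccm/cluster_3.0.14/node2/data0/ks1 t1
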
> h3. 5. Select the row from node2. Read repair is triggered.
> {noformat}
> $ ccm node2 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 
> where key = 'mmullass';"
> Current consistency level is ONE.
> Now Tracing is enabled
>  key | value
> -----+-------
> (0 rows)
> Tracing session: 72a71fc0-97cb-11e7-83cc-a3af9d3da979
>  activity                                                                                 | timestamp                  | source    | source_elapsed
> ------------------------------------------------------------------------------------------+----------------------------+-----------+----------------
>                                                                        Execute CQL3 query | 2017-09-13 00:02:51.582000 | 127.0.0.2 |              0
>                Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-2] | 2017-09-13 00:02:51.583000 | 127.0.0.2 |           1112
>                                                 Preparing statement [SharedPool-Worker-2] | 2017-09-13 00:02:51.583000 | 127.0.0.2 |           1412
>                                        reading data from /127.0.0.1 [SharedPool-Worker-2] | 2017-09-13 00:02:51.584000 | 127.0.0.2 |           2107
>                              Executing single-partition query on t1 [SharedPool-Worker-1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 |           3492
>                 Sending READ message to /127.0.0.1 [MessagingService-Outgoing-/127.0.0.1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 |           3516
>                                        Acquiring sstable references [SharedPool-Worker-1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 |           3595
>                                           Merging memtable contents [SharedPool-Worker-1] | 2017-09-13 00:02:51.585001 | 127.0.0.2 |           3673
>                                   Read 0 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-09-13 00:02:51.585001 | 127.0.0.2 |           3851
>              READ message received from /127.0.0.2 [MessagingService-Incoming-/127.0.0.2] | 2017-09-13 00:02:51.588000 | 127.0.0.1 |             33
>                                        Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 |          12444
>                                           Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 |          12536
>                                   Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 |          12765
>                                    Enqueuing response to /127.0.0.2 [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 |          12929
>     Sending REQUEST_RESPONSE message to /127.0.0.2 [MessagingService-Outgoing-/127.0.0.2] | 2017-09-13 00:02:51.602000 | 127.0.0.1 |          14686
>  REQUEST_RESPONSE message received from /127.0.0.1 [MessagingService-Incoming-/127.0.0.1] | 2017-09-13 00:02:51.603000 | 127.0.0.2 |             --
>                                 Processing response from /127.0.0.1 [SharedPool-Worker-3] | 2017-09-13 00:02:51.610000 | 127.0.0.2 |             --
>                                              Initiating read-repair [SharedPool-Worker-3] | 2017-09-13 00:02:51.610000 | 127.0.0.2 |             --
>  Digest mismatch: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(-4886857781295767937, 6d6d756c6c617373) (d41d8cd98f00b204e9800998ecf8427e vs f8e0f9262a889cd3ebf4e5d50159757b) [ReadRepairStage:1] | 2017-09-13 00:02:51.624000 | 127.0.0.2 |             --
>                                                                          Request complete | 2017-09-13 00:02:51.586892 | 127.0.0.2 |           4892
> {noformat}
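
Whether a read triggered read repair can be checked mechanically by searching the trace for the two activities that appear in the output above, "Initiating read-repair" and "Digest mismatch". The sketch below assumes the cqlsh output is piped in on stdin; the function name trace_shows_read_repair is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read cqlsh tracing output on stdin and report whether
# it contains a read repair activity. The two patterns are the activity
# strings that appear in the trace in this report.
trace_shows_read_repair() {
    if grep -q -e 'Initiating read-repair' -e 'Digest mismatch'; then
        echo "read repair observed"
    else
        echo "no read repair observed"
        return 1
    fi
}
```

For example: ccm node2 cqlsh -k ks1 -e "tracing on; select * from ks1.t1 where key = 'mmullass';" | trace_shows_read_repair
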
> h3. 6. As a result, node2 has the row.
> {noformat}
> $ ccm node2 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 
> where key = 'mmullass';"
> Current consistency level is ONE.
> Now Tracing is enabled
>  key      | value
> ----------+--------------------
>  mmullass | 0x0000000000000001
> (1 rows)
> Tracing session: 78526330-97cb-11e7-83cc-a3af9d3da979
>  activity                                                                                 | timestamp                  | source    | source_elapsed
> ------------------------------------------------------------------------------------------+----------------------------+-----------+----------------
>                                                                        Execute CQL3 query | 2017-09-13 00:03:01.091000 | 127.0.0.2 |              0
>                Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 |            216
>                                                 Preparing statement [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 |            390
>                                        reading data from /127.0.0.1 [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 |            808
>                              Executing single-partition query on t1 [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 |           1041
>              READ message received from /127.0.0.2 [MessagingService-Incoming-/127.0.0.2] | 2017-09-13 00:03:01.092000 | 127.0.0.1 |             33
>                 Sending READ message to /127.0.0.1 [MessagingService-Outgoing-/127.0.0.1] | 2017-09-13 00:03:01.092000 | 127.0.0.2 |           1036
>                              Executing single-partition query on t1 [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 |            189
>                                        Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 |           1113
>                                        Acquiring sstable references [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 |            276
>                                           Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 |           1172
>                                           Merging memtable contents [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 |            332
>  REQUEST_RESPONSE message received from /127.0.0.1 [MessagingService-Incoming-/127.0.0.1] | 2017-09-13 00:03:01.093000 | 127.0.0.2 |             --
>                                   Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-09-13 00:03:01.093000 | 127.0.0.1 |            565
>                                    Enqueuing response to /127.0.0.2 [SharedPool-Worker-1] | 2017-09-13 00:03:01.093000 | 127.0.0.1 |            648
>     Sending REQUEST_RESPONSE message to /127.0.0.2 [MessagingService-Outgoing-/127.0.0.2] | 2017-09-13 00:03:01.093000 | 127.0.0.1 |            783
>                                 Processing response from /127.0.0.1 [SharedPool-Worker-1] | 2017-09-13 00:03:01.094000 | 127.0.0.2 |             --
>                                              Initiating read-repair [SharedPool-Worker-1] | 2017-09-13 00:03:01.099000 | 127.0.0.2 |             --
>                                   Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:03:01.101000 | 127.0.0.2 |          10113
>                                                                          Request complete | 2017-09-13 00:03:01.092830 | 127.0.0.2 |           1830
> $ ccm node2 nodetool flush
> $ ls ~/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db
> /Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-ec659e0097ca11e78ea7c1bd4d549501/mc-1-big-Data.db
> $ ~/.ccm/repository/3.0.14/tools/bin/sstabledump /Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-ec659e0097ca11e78ea7c1bd4d549501/mc-1-big-Data.db -k mmullass
> [
>   {
>     "partition" : {
>       "key" : [ "mmullass" ],
>       "position" : 0
>     },
>     "rows" : [
>       {
>         "type" : "row",
>         "position" : 36,
>         "liveness_info" : { "tstamp" : "2017-09-12T14:59:42.312969Z" },
>         "cells" : [
>           { "name" : "value", "value" : "0000000000000001" }
>         ]
>       }
>     ]
>   }
> ]
> {noformat}
> In [CASSANDRA-11409|https://issues.apache.org/jira/browse/CASSANDRA-11409], 
> [~cam1982] commented that this behavior is not a bug, so I filed this issue 
> as an improvement.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
