[ https://issues.apache.org/jira/browse/CASSANDRA-11409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15994312#comment-15994312 ]

Cameron Zemek commented on CASSANDRA-11409:
-------------------------------------------

This is not a bug; it's just that the setting names are misleading. 
dclocal_read_repair_chance and read_repair_chance don't actually control the 
chance of a read repair; they control the chance that the read executor involves 
all (or all DC-local) live replicas in a read, which in turn checks consistency 
across those replicas.
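
To make that concrete, here is a rough, self-contained sketch of how the per-read 
roll works. The enum and method names below are illustrative only, not the actual 
source; in the real code the decision is made when the read executor is chosen.

{code:title=Illustrative sketch, not Cassandra source|borderStyle=solid}
import java.util.concurrent.ThreadLocalRandom;

enum ReadRepairDecision { NONE, DC_LOCAL, GLOBAL }

final class ReadRepairChanceSketch
{
    // Decide, per read, whether to contact extra replicas beyond what the
    // consistency level requires. This is all the "chance" settings control.
    static ReadRepairDecision decide(double readRepairChance, double dcLocalReadRepairChance)
    {
        double roll = ThreadLocalRandom.current().nextDouble();
        if (readRepairChance > roll)
            return ReadRepairDecision.GLOBAL;    // involve all live replicas
        if (dcLocalReadRepairChance > roll)
            return ReadRepairDecision.DC_LOCAL;  // involve all live replicas in the local DC
        return ReadRepairDecision.NONE;          // only the replicas the CL requires
    }

    public static void main(String[] args)
    {
        // With both chances at 0 this always prints NONE, i.e. no extra replicas
        // are contacted; it does NOT disable repair when a digest mismatch occurs.
        System.out.println(decide(0.0, 0.0));
    }
}
{code}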

A read repair happens, either as a blocking or a background process, whenever a 
digest mismatch occurs.

{code:title=StorageProxy.java|borderStyle=solid}
private static List<Row> fetchRows(List<ReadCommand> initialCommands, ConsistencyLevel consistencyLevel)
throws UnavailableException, ReadTimeoutException
{
                //...
                catch (DigestMismatchException ex)
                {
                    Tracing.trace("Digest mismatch: {}", ex);

                    ReadRepairMetrics.repairedBlocking.mark();

                    // Do a full data read to resolve the correct response (and repair node that need be)
                    RowDataResolver resolver = new RowDataResolver(exec.command.ksName, exec.command.key, exec.command.filter(), exec.command.timestamp);
                    ReadCallback<ReadResponse, Row> repairHandler = new ReadCallback<>(resolver,
                                                                                       ConsistencyLevel.ALL,
                                                                                       exec.getContactedReplicas().size(),
                                                                                       exec.command,
                                                                                       Keyspace.open(exec.command.getKeyspace()),
                                                                                       exec.handler.endpoints);

                    if (repairCommands == null)
                    {
                        repairCommands = new ArrayList<>();
                        repairResponseHandlers = new ArrayList<>();
                    }
                    repairCommands.add(exec.command);
                    repairResponseHandlers.add(repairHandler);

                    MessageOut<ReadCommand> message = exec.command.createMessage();
                    for (InetAddress endpoint : exec.getContactedReplicas())
                    {
                        Tracing.trace("Enqueuing full data read to {}", endpoint);
                        MessagingService.instance().sendRR(message, endpoint, repairHandler);
                    }
                }
{code}

{code:title=ReadCallback.java|borderStyle=solid}
    public void response(MessageIn<TMessage> message)
    {
        resolver.preprocess(message);
        int n = waitingFor(message)
              ? recievedUpdater.incrementAndGet(this)
              : received;
        if (n >= blockfor && resolver.isDataPresent())
        {
            condition.signalAll();

            // kick off a background digest comparison if this is a result that (may have) arrived after
            // the original resolve that get() kicks off as soon as the condition is signaled
            if (blockfor < endpoints.size() && n == endpoints.size())
            {
                TraceState traceState = Tracing.instance.get();
                if (traceState != null)
                    traceState.trace("Initiating read-repair");
                StageManager.getStage(Stage.READ_REPAIR).execute(new AsyncRepairRunner(traceState));
            }
        }
    }
{code}

So as you can see from ReadCallback, it waits in the background for all contacted 
replicas to respond, checks for a digest match, and executes a read repair if 
required. The read repair chance just includes more replicas in a read request, 
which puts more load on the cluster.

In addition to the misleading setting names, the documentation also states: 
"Additionally, users should run frequent repairs (which streams data in such a 
way that it does not become comingled), and disable background read repair by 
setting the table’s read_repair_chance and dclocal_read_repair_chance to 0." 
Which, as I just explained, is false. Setting them to 0 just makes a read repair 
less likely; it doesn't outright stop it. The only way to avoid read repair is to 
read at LOCAL_ONE or ONE, where no digest mismatch can occur.
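
As a hedged illustration (DataStax Java driver style; the keyspace/table name 
ks.tbl, the id column and the contact point are made up for the example), the 
only combination that actually avoids read repair is both chances at 0 plus a 
single-replica read:

{code:title=Illustrative sketch, not part of this ticket|borderStyle=solid}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class AvoidReadRepairSketch
{
    public static void main(String[] args)
    {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect())
        {
            // Stops the "involve extra replicas" roll described above...
            session.execute("ALTER TABLE ks.tbl WITH read_repair_chance = 0.0" +
                            " AND dclocal_read_repair_chance = 0.0");

            // ...but only a single-replica read guarantees that no digests are
            // compared, and therefore that no read repair can be triggered.
            SimpleStatement read = new SimpleStatement("SELECT * FROM ks.tbl WHERE id = 1");
            read.setConsistencyLevel(ConsistencyLevel.LOCAL_ONE);
            session.execute(read);
        }
    }
}
{code}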

Therefore, my suggestion regarding this issue is to add better documentation on 
read repairs.

> set read repair chance to 0 but find read repair process in trace
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-11409
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11409
>             Project: Cassandra
>          Issue Type: Bug
>          Components: CQL, Distributed Metadata
>         Environment: Cassandra 2.1.13 with centos 7
>            Reporter: Ryan Cho
>              Labels: lhf
>         Attachments: 螢幕快照 2016-03-23 下午2.06.10.png
>
>
> I have set dclocal_read_repair_chance and read_repair_chance to 0.0 for one 
> month, but I still find "Read-repair DC_LOCAL" and "Initiating read-repair" 
> activities in system_traces.events; the queries were executed in the last two 
> days, long after setting the read repair chances to 0.0.


