[ https://issues.apache.org/jira/browse/CASSANDRA-6887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012448#comment-14012448 ]

Sylvain Lebresne commented on CASSANDRA-6887:
---------------------------------------------

bq. Only for LOCAL_ONE though, b/c unlike CL#isSufficientLiveNodes(), CL#assureSufficientLiveNodes() does not handle LOCAL_ONE properly. So if there are live replicas - but none in the local DC - the request will go to a different one, potentially, with all RRDs.

Fair enough, but surely we can agree it's just a bug of assureSufficientLiveNodes. That doesn't invalidate the fact that allowing read repair globally even when LOCAL CL are used can be useful and shouldn't be removed imo, even if I 100% agree that it shouldn't be the default.

> LOCAL_ONE read repair only does local repair, in spite of global digest queries
> -------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-6887
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6887
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Cassandra 2.0.6, x86-64 ubuntu precise
>            Reporter: Duncan Sands
>            Assignee: Aleksey Yeschenko
>             Fix For: 2.0.9, 2.1.0
>
>         Attachments: 6887-2.0.txt
>
>
> I have a cluster spanning two data centres. Almost all of the writing (and a
> lot of reading) is done in DC1. DC2 is used for running the occasional
> analytics query. Reads in both data centres use LOCAL_ONE. Read repair
> settings are set to the defaults on all column families.
>
> I had a long network outage between the data centres; it lasted longer than
> the hints window, so after it was over DC2 didn't have the latest
> information. Even after reading data many many times in DC2, the returned
> data was still out of date: read repair was not correcting it.
>
> I then investigated using cqlsh in DC2, with tracing on.
> What I saw was:
> - with consistency ONE, after about 10 read requests a digest request would
>   be sent to many nodes (spanning both data centres), and the data in DC2
>   would be repaired.
> - with consistency LOCAL_ONE, after about 10 read requests a digest request
>   would be sent to many nodes (spanning both data centres), but the data in
>   DC2 would not be repaired. This is in spite of digest requests being sent
>   to DC1, as shown by the tracing.
>
> So it looks like digest requests are being sent to both data centres, but
> replies from outside the local data centre are ignored when using LOCAL_ONE.
>
> The same data is being queried all the time in DC1 with consistency
> LOCAL_ONE, but this didn't result in the data in DC2 being read repaired
> either. This is a slightly different case to what I described above: in that
> case the local node was out of date and the remote node had the latest data,
> while here it is the other way round.
>
> It could be argued that you don't want cross data centre read repair when
> using LOCAL_ONE. But then why bother sending cross data centre digest
> requests? And if only doing local read repair is how it is supposed to work
> then it would be good to document this somewhere.



--
This message was sent by Atlassian JIRA
(v6.2#6252)