Re: Possible issue with read repair?

Niklas Ekström Thu, 11 Oct 2012 08:10:30 -0700

Thanks!


2012-10-11 16:55, Jonathan Ellis wrote:

https://issues.apache.org/jira/browse/CASSANDRA-4792

On Wed, Oct 10, 2012 at 4:30 PM, Jonathan Ellis <[email protected]> wrote:

You're both right -- "read repair" as a concept is indeed performed
asynchronously, but RowRepairResolver is used for synchronous, high-CL
reads as well, which is the code Niklas is referring to.

Niklas, can you create a ticket to fix this officially?

On Wed, Oct 10, 2012 at 3:31 PM, Mikhail Panchenko <[email protected]> wrote:

I'll take a stab:

Without looking at the code, that seems perfectly fine - the purpose of
read repair is to repair potentially stale data out of band. It is
acceptable (from the viewpoint of the datastore) to have "stale" reads
while read-repair happens in the background. Once the repair is completed,
future reads will have the correct data ("eventually"). Reads do not and
should not block on read repair tasks. See
http://www.datastax.com/docs/1.1/cluster_architecture/about_client_requests#about-read-requests
for more info.

In order to achieve what you're looking for and eliminate the window you
are describing, one would write and read at QUORUM consistency level.
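The arithmetic behind the QUORUM suggestion can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and the replication factor of 3 is an assumed value, not something stated in this thread. A read is guaranteed to touch at least one replica that acknowledged the latest write when the number of replicas read plus the number written exceeds the replication factor.

```java
public class QuorumOverlapSketch {
    // Quorum size for a given replication factor: a strict majority.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when every read set must intersect every write set,
    // i.e. replicas read + replicas written > replication factor.
    static boolean overlaps(int replicasRead, int replicasWritten, int replicationFactor) {
        return replicasRead + replicasWritten > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3; // assumed replication factor, for illustration only
        int q = quorum(rf);
        System.out.println("QUORUM for RF=3: " + q);
        System.out.println("QUORUM read + QUORUM write overlap: " + overlaps(q, q, rf));
        System.out.println("ONE read + ONE write overlap: " + overlaps(1, 1, rf));
    }
}
```

With these assumed numbers, a QUORUM read and a QUORUM write each involve 2 of 3 replicas and therefore always share at least one, while ONE/ONE gives no such guarantee.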

On Wed, Oct 10, 2012 at 1:25 PM, Niklas Ekström <[email protected]> wrote:

Hi,

I’m looking in the file StorageProxy.java (Cassandra 1.1.5), and line 766
seems odd to me.

FBUtilities.waitOnFutures() is called with the repairResults from the
RowRepairResolver resolver.

The problem, though, is that repairResults is assigned in only two places:
when the object is created at line 737 in StorageProxy.java, where it is
set to Collections.emptyList(), and in the resolve() method in
RowRepairResolver, which is indirectly called from line 771 in
StorageProxy.java, that is, after the call to FBUtilities.waitOnFutures().

So the effect is that line 766 in StorageProxy.java is essentially a no-op.

If on the other hand line 766 is moved down to just below the try-catch
block under it (to line 777), the effect of the call to
FBUtilities.waitOnFutures() would be to wait for responses to the
READ_REPAIR message. Not waiting for responses to read repair messages
opens a window of time in which stale reads can happen.
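The ordering problem can be reproduced in isolation with a minimal sketch. Nothing below is the actual Cassandra code: Resolver and waitOnFutures are simplified, hypothetical stand-ins for RowRepairResolver and FBUtilities.waitOnFutures(), and the two repair futures are an assumed count, completed up front so the example is deterministic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class RepairOrderingSketch {
    // Hypothetical stand-in for RowRepairResolver.
    static class Resolver {
        // Starts out empty, as described for the constructor.
        List<Future<?>> repairResults = Collections.emptyList();

        // Only resolve() replaces the list with real repair futures.
        void resolve() {
            List<Future<?>> sent = new ArrayList<>();
            // Assume two stale replicas were sent a repair (illustrative).
            sent.add(CompletableFuture.completedFuture(null));
            sent.add(CompletableFuture.completedFuture(null));
            repairResults = sent;
        }
    }

    // Simplified stand-in: blocks on each future, returns how many it waited on.
    static int waitOnFutures(List<Future<?>> futures) {
        int waited = 0;
        try {
            for (Future<?> f : futures) {
                f.get();
                waited++;
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        return waited;
    }

    // Order described for line 766: wait first, resolve afterwards.
    static int waitBeforeResolve() {
        Resolver r = new Resolver();
        int waited = waitOnFutures(r.repairResults); // list is still empty
        r.resolve();
        return waited;
    }

    // Proposed order: resolve first, then wait on the repair futures.
    static int waitAfterResolve() {
        Resolver r = new Resolver();
        r.resolve();
        return waitOnFutures(r.repairResults);
    }

    public static void main(String[] args) {
        System.out.println("wait before resolve: " + waitBeforeResolve());
        System.out.println("wait after resolve:  " + waitAfterResolve());
    }
}
```

In this sketch, waiting before resolve() covers zero futures, so the call does nothing, while waiting after resolve() covers both repair futures.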

Does this sound reasonable or am I overlooking something?

Regards,
Niklas



--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
