[ 
https://issues.apache.org/jira/browse/KUDU-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KUDU-1702:
------------------------------
    Target Version/s:   (was: 1.5.0)

> Document/Implement read-your-writes for Impala/Spark etc.
> ---------------------------------------------------------
>
>                 Key: KUDU-1702
>                 URL: https://issues.apache.org/jira/browse/KUDU-1702
>             Project: Kudu
>          Issue Type: Sub-task
>          Components: client, tablet, tserver
>    Affects Versions: 1.1.0
>            Reporter: David Alves
>            Assignee: David Alves
>            Priority: Major
>
> Engines like Impala/Spark use many independent client instances, so we should 
> provide a way to have read-your-writes across many independent client 
> instances, which translates to providing a way to get linearizable behavior. 
> At first this can be done using the APIs that are already available. For 
> instance if the objective is to be sure to have the results of a write in a a 
> following scan, the following steps can be taken:
> - After a write the engine should collect the last observed timestamps from 
> kudu clients
> - The engine's coordinator then takes the max of those timestamps, adds 1 and 
> uses that as a snapshot scan timestamp.
> One important pre-requisite of the behavior above is that scans be done in 
> READ_AT_SNAPSHOT mode. Also the steps above currently don't actually 
> guarantee the expected behavior, but should as the currently anomalies are 
> taken care of (as part of KUDU-430).
> In the immediate future we'll add APIs to the Kudu client so as to make the 
> inner workings of getting this behavior oblivious to the engine. The steps 
> will still be the same, i.e. timestamps or timestamp tokens will still be 
> passed around, but the kudu client will encapsulate the choice of the 
> timestamp for the scan.
> Later we will add a way to obtain this behavior without timestamp 
> propagation, either by doing a write-side commit-wait, where clients wait out 
> the clock error after/during the last write thus making sure any future 
> operation will have a higher timestamp; or by making read-side commit wait, 
> where we provide an api on the kudu client for the engine to perform a 
> similar call before the scan call to obtain a scan timestamp.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to