[ https://issues.apache.org/jira/browse/KUDU-1702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Henke updated KUDU-1702:
------------------------------
    Target Version/s:   (was: 1.5.0)

> Document/Implement read-your-writes for Impala/Spark etc.
> ---------------------------------------------------------
>
>                 Key: KUDU-1702
>                 URL: https://issues.apache.org/jira/browse/KUDU-1702
>             Project: Kudu
>          Issue Type: Sub-task
>          Components: client, tablet, tserver
>    Affects Versions: 1.1.0
>            Reporter: David Alves
>            Assignee: David Alves
>            Priority: Major
>
> Engines like Impala/Spark use many independent client instances, so we should
> provide a way to have read-your-writes across many independent client
> instances, which translates to providing a way to get linearizable behavior.
> At first this can be done using the APIs that are already available. For
> instance, if the objective is to be sure that the results of a write are
> visible in a following scan, the following steps can be taken:
> - After a write, the engine should collect the last observed timestamps from
> the Kudu clients.
> - The engine's coordinator then takes the max of those timestamps, adds 1 and
> uses that as a snapshot scan timestamp.
> One important prerequisite of the behavior above is that scans be done in
> READ_AT_SNAPSHOT mode. Also, the steps above currently don't actually
> guarantee the expected behavior, but should once the current anomalies are
> taken care of (as part of KUDU-430).
> In the immediate future we'll add APIs to the Kudu client so as to hide the
> inner workings of getting this behavior from the engine. The steps
> will still be the same, i.e. timestamps or timestamp tokens will still be
> passed around, but the Kudu client will encapsulate the choice of the
> timestamp for the scan.
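The coordinator step described above (take the max of the collected timestamps, add 1) can be sketched as follows. This is a minimal Python illustration, not Kudu client code: the function name `choose_snapshot_timestamp` and the sample values are hypothetical, timestamps are treated as opaque integers, and how each client reports its last observed timestamp is left out.

```python
from typing import Iterable


def choose_snapshot_timestamp(last_observed: Iterable[int]) -> int:
    """Pick a snapshot scan timestamp that is later than every write.

    `last_observed` holds the last observed timestamp reported by each
    independent client after its writes (hypothetical input; the real
    values would come from the Kudu clients).
    """
    timestamps = list(last_observed)
    if not timestamps:
        raise ValueError("need at least one observed timestamp")
    # Max of all reported timestamps, plus 1, as described in the issue.
    return max(timestamps) + 1


# Hypothetical timestamps reported by three independent writer clients.
observed = [1001, 1007, 1003]
snapshot_ts = choose_snapshot_timestamp(observed)
print(snapshot_ts)
```

The resulting value would then be used as the snapshot timestamp for a scan in READ_AT_SNAPSHOT mode. As the issue notes, this does not yet guarantee the expected behavior until the anomalies tracked in KUDU-430 are resolved.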
> Later we will add a way to obtain this behavior without timestamp
> propagation, either by doing a write-side commit-wait, where clients wait out
> the clock error after/during the last write, thus making sure any future
> operation will have a higher timestamp; or by doing a read-side commit-wait,
> where we provide an API on the Kudu client for the engine to perform a
> similar call before the scan call to obtain a scan timestamp.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)