Jeffrey Zhong created HBASE-9768:
------------------------------------

             Summary: Two issues in AsyncProcess
                 Key: HBASE-9768
                 URL: https://issues.apache.org/jira/browse/HBASE-9768
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.96.0
            Reporter: Jeffrey Zhong
            Assignee: Nicolas Liochon
             Fix For: 0.98.0, 0.96.1

There may be two issues in the AsyncProcess code:

1) In HTable#backgroundFlushCommits, we have the following code:
{code}
if (ap.hasError()) {
  if (!clearBufferOnFail) {
    // if clearBufferOnFailed is not set, we're supposed to keep the failed operation in the
    // write buffer. This is a questionable feature kept here for backward compatibility
    writeAsyncBuffer.addAll(ap.getFailedOperations());
  }
  RetriesExhaustedWithDetailsException e = ap.getErrors();
  ap.clearErrors();
  throw e;
}
{code}
A rare situation can arise as follows: while some updates are still in flight, a client calls put (which internally triggers backgroundFlushCommits). The first ap.hasError() check returns false and the second ap.hasError() check returns true. As a result, we can throw an exception to the caller while writeAsyncBuffer is not empty, because some updates are still ongoing. If the client then retries with different values for the same keys, we could end up in a nondeterministic state.

2) The following code only updates the location cache for the first row. We should update the cache for all the regions inside resultForRS, because actions are sent to multiple regions per region server (RS):
{code}
if (failureCount++ == 0) { // We're doing this once per location.
  hConnection.updateCachedLocations(this.tableName, row.getRow(), result, location);
  if (errorsByServer != null) {
    errorsByServer.reportServerError(location);
    canRetry = errorsByServer.canRetryMore();
  }
}
{code}
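
For issue 2, one possible direction is to pick one representative row per region and refresh the cached location once for each of them, instead of once per server. The following is only a minimal, self-contained sketch of that grouping step, not the actual patch: FailedAction, firstRowPerRegion and the region names are hypothetical stand-ins and are not HBase APIs. A real fix would call the client's location-cache update at the point marked in main.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PerRegionCacheUpdate {

    /** Hypothetical stand-in for a failed action: the region it targeted and its row key. */
    static final class FailedAction {
        final String regionName;
        final String row;

        FailedAction(String regionName, String row) {
            this.regionName = regionName;
            this.row = row;
        }
    }

    /**
     * Returns one representative row per region, in first-seen order.
     * Each entry corresponds to one cache update, so every region that
     * had a failure is refreshed, not only the region of the first row.
     */
    static Map<String, String> firstRowPerRegion(List<FailedAction> failed) {
        Map<String, String> rows = new LinkedHashMap<>();
        for (FailedAction action : failed) {
            rows.putIfAbsent(action.regionName, action.row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Three failures on one server, spread over two regions (made-up data).
        List<FailedAction> failed = new ArrayList<>();
        failed.add(new FailedAction("region-A", "row1"));
        failed.add(new FailedAction("region-A", "row2"));
        failed.add(new FailedAction("region-B", "row9"));

        for (Map.Entry<String, String> e : firstRowPerRegion(failed).entrySet()) {
            // Assumption: a real client would update its cached location here.
            System.out.println("update cache for " + e.getKey() + " using row " + e.getValue());
        }
    }
}
```

The server-level error reporting (reportServerError and canRetryMore in the snippet above) would presumably still run once per location; only the cache update moves to per-region granularity.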