empiredan commented on PR #1399:
URL: 
https://github.com/apache/incubator-pegasus/pull/1399#issuecomment-1477666493

   Currently the primary replica server responds to the client only after the 
write to rocksdb has completed. However, the rocksdb write interface may return 
`kCorruption` or `kIOError`; that error is passed back to the client, which then 
treats the request as failed. In fact, both the primary and secondary logs have 
already been written successfully, so the request should be considered 
successful. The client will typically retry the write, which leads to 
inconsistency for non-idempotent writes such as `incr`, `check_and_set` and 
`check_and_mutate`.
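
As a rough illustration of the double-apply problem, here is a minimal model in C++. Everything in it (`CommitLog`, `handle_incr`, `replay`, `incr_with_retry`) is a hypothetical name made up for this sketch and not Pegasus code; it only assumes that a log entry is committed before the storage write result is reported to the client.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: each entry is the delta of one `incr` request that has
// already been committed to the replicated log.
using CommitLog = std::vector<int64_t>;

// Result reported to the client (illustrative only, not the Pegasus API).
enum class ClientResult { kOk, kError };

// Models the current behavior: the log entry is committed first, and the
// outcome of the storage write alone decides what the client is told.
ClientResult handle_incr(CommitLog& log, int64_t delta, bool storage_write_fails) {
    log.push_back(delta);  // durable on primary and secondaries
    return storage_write_fails ? ClientResult::kError : ClientResult::kOk;
}

// Value a healthy replica ends up with after applying every committed entry.
int64_t replay(const CommitLog& log) {
    int64_t value = 0;
    for (int64_t delta : log) {
        value += delta;
    }
    return value;
}

// Client side: send `incr` once and retry once if an error comes back.
int64_t incr_with_retry(CommitLog& log, int64_t delta, bool first_attempt_fails) {
    if (handle_incr(log, delta, first_attempt_fails) == ClientResult::kError) {
        handle_incr(log, delta, /*storage_write_fails=*/false);
    }
    return replay(log);
}
```

In this model, a client that wanted a single `incr` by 1 and saw an error on the first attempt leaves two committed entries in the log, so a replica that applies the log ends up with 2 instead of 1.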
   
   To solve this problem, I think we can make the rocksdb write asynchronous. 
If the asynchronous write to rocksdb fails, for example with `kCorruption` or 
`kIOError`, just remove the replica, move its rocksdb directory to `.err`, and 
hand the primary role over to one of the secondary replicas. Consistency is 
guaranteed if and only if the logs are consistent, so the rocksdb write itself 
can safely be done asynchronously. 
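
The proposed flow could be sketched roughly as follows. This is a single-threaded model under stated assumptions, not an implementation: `Replica`, `ApplyStatus`, `commit_and_ack` and `apply_deferred` are hypothetical names, the sequence numbers stand in for whatever the replication log uses, and "move the rocksdb directory to `.err`" is modeled as appending a `.err` suffix to the directory name. Real scheduling of the deferred step on another thread, and the actual primary handover, are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical stand-in for the storage write status; only the two error
// kinds mentioned in this comment are modeled.
enum class ApplyStatus { kOk, kCorruption, kIOError };

enum class ReplicaState { kServing, kError };

struct Replica {
    std::string data_dir;
    ReplicaState state = ReplicaState::kServing;
    int64_t last_committed = 0;  // highest sequence number durable in all logs
    int64_t last_applied = 0;    // highest sequence number applied to storage
};

// Step 1: once the logs are durable, commit and reply to the client right
// away. The reply no longer depends on the storage write.
bool commit_and_ack(Replica& r, int64_t seq) {
    if (r.state != ReplicaState::kServing) {
        return false;  // a quarantined replica must not serve writes
    }
    r.last_committed = seq;
    return true;  // the client sees success
}

// Step 2: apply to storage later, off the reply path. On failure the replica
// is taken out of service and its directory is set aside for inspection.
void apply_deferred(Replica& r, int64_t seq,
                    const std::function<ApplyStatus(int64_t)>& apply) {
    assert(seq <= r.last_committed);  // only committed entries are applied
    const ApplyStatus status = apply(seq);
    if (status == ApplyStatus::kOk) {
        r.last_applied = seq;
        return;
    }
    r.state = ReplicaState::kError;
    r.data_dir += ".err";  // models moving the directory aside
}
```

The point of the split is that a storage failure changes the state of the replica but never changes the answer the client already received, so the client has no reason to retry a non-idempotent write.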


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

