Hello all, 

I am looking for a reviewer for a fix for the long-standing bug HBASE-27781, 
which affects the 2.5.3+/2.6.x/branch-2 sync client. The bug is an edge case in 
the client's handling of an operation timeout: there is a scenario where meta 
slowness can lead to the sync client throwing an unchecked AssertionError that 
bubbles up to the user application layer, instead of being handled properly, 
which would result in a RetriesExhaustedWithDetailsException. This can be 
catastrophic because the user application layer is not expecting, and should 
not be catching, this unchecked exception: the user application could crash 
when it is encountered, or user application threads could silently die. The 
AssertionError in question is thrown explicitly in the client; it is not an 
“assert <condition>” statement that can be disabled/enabled with the ‘-ea’ JVM 
flag. 
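
To illustrate why this matters to callers, here is a minimal standalone sketch 
(not HBase code; the class and method names are hypothetical) of how an 
explicitly thrown AssertionError passes straight through the usual 
catch (Exception e) handling, regardless of whether the JVM runs with '-ea':

```java
public class UncheckedErrorDemo {

    // Hypothetical stand-in for a client call that hits the failing code path.
    // The error is thrown explicitly, so the -ea/-da JVM flags have no effect.
    static void clientCall() {
        throw new AssertionError("simulated client failure");
    }

    // Mimics typical application code that only guards against Exception.
    static String describeOutcome() {
        try {
            try {
                clientCall();
                return "completed";
            } catch (Exception e) {
                // Never reached: AssertionError extends Error, not Exception.
                return "caught as Exception";
            }
        } catch (Throwable t) {
            // Only a Throwable/Error handler sees it.
            return "escaped catch (Exception): " + t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeOutcome());
    }
}
```

In a real application there is usually no outer catch (Throwable t), so the 
error would terminate the calling thread instead.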



One of the triggering scenarios for the bug is meta slowness, which, while not 
very common, is not exceedingly rare. A lot of sync client work has been done 
in the past on better handling of meta slowness and operation timeouts. This 
bug is also a blocker for HBASE-28730, which attempts to complete the earlier 
work needed for the sync client to respect operation timeouts. 

I have taken care to provide a lot of detail about the bug and the fix in the 
JIRA. The functional scope of the changes is limited to the timeout/location 
error handling in the sync client's groupAndSendMulti; the happy path, where 
there are no sync client location errors, is completely unaffected by the 
patch. I have added a test case using MiniCluster that reproduces the meta 
slowness that triggers the bug. Without the fix, the test errors out with 
AssertionError. I would greatly appreciate a review of the bug fix so we can 
work towards resolving this long-standing bug and also unblock HBASE-28730. 



JIRA: https://issues.apache.org/jira/browse/HBASE-27781
PR: https://github.com/apache/hbase/pull/7079

Thank you,

Daniel Roudnitsky

