[ https://issues.apache.org/jira/browse/HBASE-27781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17950030#comment-17950030 ]
Daniel Roudnitsky edited comment on HBASE-27781 at 5/7/25 3:10 PM:
-------------------------------------------------------------------

Less verbosely, "fail all actions" really means failing every action that is being processed in groupAndSendMulti at the moment the operation timeout is exceeded. None of those actions has been successfully executed yet, and there is no time remaining to execute them, so we fail them. We have to take care not to fail any action twice, or we hit an assertion error, which is what the patch addresses.

was (Author: JIRAUSER304178):
Less verbosely, "fail all actions" really means failing every action that is being processed in groupAndSendMulti at the moment the operation timeout is exceeded. None of those actions has been successfully executed yet, and there is no time remaining to execute them, so we fail them. We have to take care not to fail any action twice, or we hit an assertion error.

> AssertionError in AsyncRequestFutureImpl when timing out during location
> resolution
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-27781
>                 URL: https://issues.apache.org/jira/browse/HBASE-27781
>             Project: HBase
>          Issue Type: Bug
>          Components: asyncclient
>            Reporter: Bryan Beaudreault
>            Assignee: Daniel Roudnitsky
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.6.3
>
>
> In AsyncRequestFutureImpl we fail fast when the operation timeout is exceeded
> during location resolution
> [here|https://github.com/apache/hbase/blob/branch-2.5/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncRequestFutureImpl.java#L460-L462].
> In that handling, we loop over all actions and set them as failed. The problem
> is that some number of actions may have already finished when we get to this
> spot, so actionsInProgress would already have been decremented for those, and
> now we decrement it again, once for every action. This causes an assertion
> error since the counter goes negative
> [here|https://github.com/apache/hbase/blob/branch-2.5/hbase-client/src/main/java/org/apache/hadoop/hbase/client/AsyncRequestFutureImpl.java#L1197],
> causing the HBase client to throw an unchecked exception which can kill the
> caller thread that invoked the operation which should have timed out, as
> callers of the client should not be catching {{Error}} and its subclasses
> like {{AssertionError}}.
> We still want to fail all actions, because none will be executed. But we need
> special handling to avoid this case. Maybe don't bother decrementing
> actionsInProgress at all and instead set it to 0.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
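
The double decrement described in the issue can be reproduced in isolation. The sketch below is not the HBase code and not the actual patch: the class ActionCounterSketch and its methods are hypothetical names invented for illustration, and the per-action "done" flag is only one possible way to avoid failing an action twice. It assumes a counter that starts at the number of actions and must never go below zero.

```java
/**
 * Hypothetical stand-in for the bookkeeping described in the issue.
 * None of these names come from HBase; this only models the counter.
 */
final class ActionCounterSketch {
    private long actionsInProgress;   // must never drop below zero
    private final boolean[] done;     // done[i] is true once action i has a result or an error

    ActionCounterSketch(int numActions) {
        this.actionsInProgress = numActions;
        this.done = new boolean[numActions];
    }

    /** Naive completion: always decrements, so completing an action twice goes negative. */
    synchronized void completeNaive(int index) {
        actionsInProgress--;
        if (actionsInProgress < 0) {
            throw new AssertionError("actionsInProgress went negative: " + actionsInProgress);
        }
        done[index] = true;
    }

    /** Guarded completion: decrements only the first time. Returns true if this call counted. */
    synchronized boolean completeOnce(int index) {
        if (done[index]) {
            return false;             // already finished or failed, nothing to decrement
        }
        done[index] = true;
        actionsInProgress--;
        return true;
    }

    /** Timeout path: fail every action that has not completed yet. Returns how many were failed. */
    synchronized int failAllRemaining() {
        int failed = 0;
        for (int i = 0; i < done.length; i++) {
            if (completeOnce(i)) {
                failed++;
            }
        }
        return failed;
    }

    synchronized long inProgress() {
        return actionsInProgress;
    }

    public static void main(String[] args) {
        ActionCounterSketch tracker = new ActionCounterSketch(3);
        tracker.completeOnce(0);                      // one action finishes before the timeout
        int failed = tracker.failAllRemaining();      // timeout: fail only the other two
        System.out.println("failed=" + failed + " inProgress=" + tracker.inProgress());
    }
}
```

With completeNaive, the same sequence (one action finishes, then all three are failed) would decrement four times against a counter of three and trip the negative check, which mirrors the AssertionError reported above. Setting the counter directly to 0, as the description suggests, is an alternative to per-action flags.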