ZhenyuLi created HBASE-28589:
--------------------------------

             Summary: Client Does not Stop Retrying after DoNotRetryException
                 Key: HBASE-28589
                 URL: https://issues.apache.org/jira/browse/HBASE-28589
             Project: HBase
          Issue Type: Bug
          Components: IPC/RPC
    Affects Versions: 2.0.0, 1.5.0, 1.4.0, 1.3.0, 1.2.0
            Reporter: ZhenyuLi

I recently discovered that the fix for HBASE-14598 does not completely resolve the issue. That fix addressed two aspects. First, when a Scan/Get RPC attempts to allocate a very large array that could lead to an out-of-memory (OOM) error, the server checks the size of the array before allocation and throws an exception directly, which prevents the region server from crashing and avoids possible cascading failures. Second, the developer intended for the client to stop retrying after such a failure, since retrying will not resolve the issue.

However, the fix relies on throwing a DoNotRetryException, and that exception never reaches the client. After ByteBufferOutputStream.write throws the DoNotRetryException, it propagates up the call stack (ByteBufferOutputStream.write --> encoder.write --> encodeCellsTo --> this.cellBlockBuilder.buildCellBlockStream --> call.setResponse) and is ultimately caught in CallRunner.run, which only prints a log message. Consequently, the DoNotRetryException is not sent back to the client side. Instead, the client receives a generic exception for the failed RPC request and continues retrying, which is not the desired behavior.
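
The behavior described above can be illustrated with a small standalone sketch. This is not HBase code: the class `RetrySketch`, the nested `DoNotRetryException` stand-in, the `Server` interface, and the methods `serverSwallows`, `serverPropagates`, and `attempts` are all hypothetical names chosen for illustration, and the retry loop is a simplified assumption about how a client might decide whether to retry.

```java
import java.io.IOException;

public class RetrySketch {
    /** Hypothetical stand-in for the do-not-retry exception type (not the HBase class). */
    static class DoNotRetryException extends IOException {
        DoNotRetryException(String msg) { super(msg); }
    }

    /** Simulates response building that fails because the requested buffer is too large. */
    static void buildResponse() throws IOException {
        throw new DoNotRetryException("requested buffer exceeds the allowed size");
    }

    /** Reported behavior: the exception is caught and only logged, so its type is lost. */
    static IOException serverSwallows() {
        try {
            buildResponse();
            return null;
        } catch (IOException e) {
            System.err.println("logged on server only: " + e.getMessage());
            return new IOException("call failed"); // generic error seen by the client
        }
    }

    /** Desired behavior: the original exception is handed back to the client. */
    static IOException serverPropagates() {
        try {
            buildResponse();
            return null;
        } catch (IOException e) {
            return e;
        }
    }

    /** Hypothetical server call: returns null on success, otherwise the error the client sees. */
    interface Server {
        IOException call();
    }

    /** Simplified client retry loop; returns the number of attempts made. */
    static int attempts(Server server, int maxRetries) {
        int n = 0;
        while (n < maxRetries) {
            n++;
            IOException err = server.call();
            // Stop on success, or when the server signals that retrying is pointless.
            if (err == null || err instanceof DoNotRetryException) {
                break;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println("swallowed:  " + attempts(RetrySketch::serverSwallows, 5) + " attempts");
        System.out.println("propagated: " + attempts(RetrySketch::serverPropagates, 5) + " attempts");
    }
}
```

In this sketch the client exhausts all of its retries when the server swallows the exception, and stops after the first attempt when the exception type is preserved. That difference is what a fix would need to restore.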