Daniel Roudnitsky created HBASE-29654:
-----------------------------------------
Summary: BinaryComponentComparator filter fails non gracefully
with ArrayIndexOutOfBoundsException
Key: HBASE-29654
URL: https://issues.apache.org/jira/browse/HBASE-29654
Project: HBase
Issue Type: Bug
Components: Filters
Reporter: Daniel Roudnitsky
Assignee: Daniel Roudnitsky
+Problem+
The BinaryComponentComparator filter lets a user compare against a subset of a
byte array by specifying an offset into the byte array at which the comparison
should start. If the offset provided by the user extends beyond the end of a
byte array encountered by the scan (e.g. an offset of 40 is used and there is a
"short" byte array of length 38 somewhere in the table), the query fails
ungracefully when the scan reaches the short byte array: each scan RPC fails
with an opaque remote ArrayIndexOutOfBoundsException, the client keeps retrying
the failed RPC until it exhausts all retries, and the client ultimately fails
with an unintuitive RetriesExhaustedException.
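The failure mode can be illustrated without a cluster. The sketch below is not
HBase code: compareComponent is a hypothetical stand-in for the filter's
comparison, and it uses the JDK's Arrays.compareUnsigned in place of HBase's
own byte comparison helper. It only shows that an unchecked ranged comparison
at offset 40 throws when it meets a 38-byte array.

```java
import java.util.Arrays;

final class ShortArrayRepro {
    /**
     * Hypothetical stand-in for the filter's comparison: compares `expected`
     * against cell[offset, offset + expected.length) with no bounds check.
     */
    static int compareComponent(byte[] expected, int offset, byte[] cell) {
        return Arrays.compareUnsigned(expected, 0, expected.length,
                                      cell, offset, offset + expected.length);
    }

    public static void main(String[] args) {
        byte[] expected = {0x01, 0x02};

        // A byte array long enough for offset 40: the comparison succeeds.
        byte[] longCell = new byte[64];
        longCell[40] = 0x01;
        longCell[41] = 0x02;
        System.out.println("long array: " + compareComponent(expected, 40, longCell));

        // The "short" byte array of length 38 from the example above.
        byte[] shortCell = new byte[38];
        try {
            compareComponent(expected, 40, shortCell);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("short array: " + e);
        }
    }
}
```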
+Root cause+
On the server side, the scan request handler throws an unhandled
ArrayIndexOutOfBoundsException during request processing when it applies the
filter and attempts an unchecked byte array comparison using an offset that
extends beyond the length of an encountered byte array. The client treats the
remote exception from the server as an IOException (which wraps the remote
exception). Because IOException is retryable, the client exhausts all of its
retries re-running the same query, every attempt of which is guaranteed to
fail, and eventually fails in a nonobvious and ungraceful manner with a stack
trace that looks like:
{code:java}
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=2, exceptions:
2025-09-26T19:09:35.531Z, RpcRetryingCaller{globalStartTime=2025-09-26T19:09:35.412Z, pause=1000, maxAttempts=2}, java.io.IOException: java.io.IOException
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:451)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:139)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:369)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:349)
Caused by: java.lang.ArrayIndexOutOfBoundsException
2025-09-26T19:09:36.659Z, RpcRetryingCaller{globalStartTime=2025-09-26T19:09:35.412Z, pause=1000, maxAttempts=2}, java.io.IOException: java.io.IOException
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:451)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:139)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:369)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:349)
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:142)
    at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:73)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
{code}
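The retry behavior described in the root cause can be modeled with a small
loop. This is a deliberately simplified sketch, not the HBase client's retry
code: NoRetryIOException is a hypothetical stand-in for DoNotRetryIOException,
and attemptsUsed is an illustrative helper. It shows why a plain IOException
consumes every attempt while a do-not-retry exception stops after the first.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

final class RetrySketch {
    /** Hypothetical stand-in for HBase's DoNotRetryIOException. */
    static class NoRetryIOException extends IOException {
        NoRetryIOException(String message) {
            super(message);
        }
    }

    /**
     * Runs `call` up to maxAttempts times and returns how many attempts were
     * made before it succeeded or the caller gave up.
     */
    static int attemptsUsed(Callable<Void> call, int maxAttempts) {
        int attempts = 0;
        while (attempts < maxAttempts) {
            attempts++;
            try {
                call.call();
                return attempts;            // success
            } catch (NoRetryIOException e) {
                return attempts;            // fail fast: retrying cannot help
            } catch (Exception e) {
                // Treated as retryable: loop and run the same call again.
            }
        }
        return attempts;                    // retries exhausted
    }

    public static void main(String[] args) {
        int retryable = attemptsUsed(() -> { throw new IOException("remote AIOOBE"); }, 2);
        int failFast = attemptsUsed(() -> { throw new NoRetryIOException("bad offset"); }, 2);
        System.out.println("IOException attempts: " + retryable);
        System.out.println("NoRetryIOException attempts: " + failFast);
    }
}
```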
+Proposed solution+
Instead of throwing an unhandled ArrayIndexOutOfBoundsException, the server
should return a DoNotRetryIOException. This prevents the client from making
retries that are guaranteed to fail. The DoNotRetryIOException should clearly
explain the root cause to the user: the user-provided byte offset for the
filter exceeded the length of a byte array that the scan encountered.
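A minimal sketch of the kind of bounds check this implies follows. It is an
illustration under assumptions, not the actual patch: the class, the method
name, the message wording, and NoRetryIOException (a local stand-in for
DoNotRetryIOException) are all hypothetical, and where the check would live in
HBase (the comparator, the filter, or the scan handler) is not specified here.

```java
import java.io.IOException;
import java.util.Arrays;

final class CheckedComponentCompare {
    /** Hypothetical stand-in for HBase's DoNotRetryIOException. */
    static class NoRetryIOException extends IOException {
        NoRetryIOException(String message) {
            super(message);
        }
    }

    /**
     * Compares `expected` against cell[offset, offset + expected.length),
     * validating the range first so that a short array produces a descriptive,
     * non-retryable error instead of ArrayIndexOutOfBoundsException.
     */
    static int compare(byte[] expected, int offset, byte[] cell) throws NoRetryIOException {
        // Widen to long so offset + length cannot overflow int.
        if (offset < 0 || (long) offset + expected.length > cell.length) {
            throw new NoRetryIOException(
                "Filter offset " + offset + " with comparison length " + expected.length
                    + " exceeds the length of an encountered byte array (" + cell.length + ")");
        }
        return Arrays.compareUnsigned(expected, 0, expected.length,
                                      cell, offset, offset + expected.length);
    }

    public static void main(String[] args) {
        try {
            compare(new byte[] {0x01, 0x02}, 40, new byte[38]);
        } catch (NoRetryIOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```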