[ https://issues.apache.org/jira/browse/HBASE-29654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daniel Roudnitsky updated HBASE-29654:
--------------------------------------
Description:
+Problem+
The BinaryComponentComparator filter lets a user compare against a subset of a
byte array by specifying an offset into the byte array from which the
comparison should start. If the offset provided by the user extends past the
end of a byte array encountered by the scan query (e.g. an offset of 40 is
used, and there is a "short" byte array of length 38 somewhere in the table),
the query fails ungracefully when the scan reaches the "short" byte array: the
scan RPC fails with an unexplained ArrayIndexOutOfBoundsException, the client
keeps retrying the failed RPC until it exhausts all retries, and the client
operation ultimately fails with an unintuitive RetriesExhaustedException.
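To make the failure mode concrete, here is a minimal, self-contained Java sketch. It does not use HBase: {{uncheckedCompare}} and {{ShortArrayRepro}} are hypothetical names, and the sketch assumes the comparison reads {{component.length}} bytes starting at the user-supplied offset, which is how the filter is characterized above.

```java
class ShortArrayRepro {
  // Hypothetical stand-in for an unchecked component comparison (not HBase code).
  // Compares `component` byte by byte (unsigned) against `value`, starting at
  // `offset` in `value`, without checking that the window fits inside `value`.
  static int uncheckedCompare(byte[] component, int offset, byte[] value) {
    for (int i = 0; i < component.length; i++) {
      int diff = (component[i] & 0xff) - (value[offset + i] & 0xff);
      if (diff != 0) {
        return diff;
      }
    }
    return 0;
  }

  public static void main(String[] args) {
    byte[] component = {0x01, 0x02};

    // A byte array long enough for an offset of 40.
    byte[] longValue = new byte[48];
    longValue[40] = 0x01;
    longValue[41] = 0x02;
    System.out.println("long array: " + uncheckedCompare(component, 40, longValue));

    // The "short" byte array of length 38 from the example above.
    byte[] shortValue = new byte[38];
    try {
      uncheckedCompare(component, 40, shortValue);
    } catch (ArrayIndexOutOfBoundsException e) {
      System.out.println("short array: " + e);
    }
  }
}
```

The comparison works for every array that is long enough, so the failure only shows up once the scan happens to reach a shorter array.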
+Root cause+
On the server side, the scan request handler throws an unhandled
ArrayIndexOutOfBoundsException while applying the BinaryComponentComparator
filter, because it attempts an unchecked byte array comparison using an offset
that extends beyond the length of the "short" byte array being read. The client
treats the remote exception from the server as an IOException (which wraps the
remote exception). Because IOException is retryable, the client exhausts all of
its retries re-running the same RPC (with every retry guaranteed to fail), and
the operation eventually fails in a nonobvious way with a stack trace that
looks like:
{code:java}
org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=2, exceptions:
2025-09-26T19:09:35.531Z, RpcRetryingCaller{globalStartTime=2025-09-26T19:09:35.412Z, pause=1000, maxAttempts=2}, java.io.IOException: java.io.IOException
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:451)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:139)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:369)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:349)
Caused by: java.lang.ArrayIndexOutOfBoundsException
2025-09-26T19:09:36.659Z, RpcRetryingCaller{globalStartTime=2025-09-26T19:09:35.412Z, pause=1000, maxAttempts=2}, java.io.IOException: java.io.IOException
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:451)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:139)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:369)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:349)
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:142)
    at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:73)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840){code}
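The retry behavior described under Root cause can be illustrated with a simplified loop. This is a hedged sketch, not the HBase client: {{RetrySketch}}, {{RpcCall}}, {{RetriesExhausted}} and this {{callWithRetries}} are hypothetical stand-ins, pauses between attempts are omitted, and the only assumption carried over from the description is that a plain IOException is treated as retryable.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class RetrySketch {
  /** Hypothetical stand-in for a remote call (not an HBase type). */
  interface RpcCall<T> {
    T call() throws IOException;
  }

  /** Hypothetical stand-in for RetriesExhaustedException. */
  static class RetriesExhausted extends IOException {
    final List<IOException> failures;

    RetriesExhausted(List<IOException> failures) {
      super("Failed after attempts=" + failures.size());
      this.failures = failures;
    }
  }

  // Every IOException is assumed to be retryable, so a deterministic failure
  // is simply repeated until maxAttempts is reached.
  static <T> T callWithRetries(RpcCall<T> rpc, int maxAttempts) throws IOException {
    List<IOException> failures = new ArrayList<>();
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return rpc.call();
      } catch (IOException e) {
        failures.add(e);
      }
    }
    throw new RetriesExhausted(failures);
  }

  public static void main(String[] args) {
    int[] callsMade = {0};
    try {
      callWithRetries(() -> {
        callsMade[0]++;
        // Mimics the wrapped server-side failure: same input, same error.
        throw new IOException(new ArrayIndexOutOfBoundsException());
      }, 2);
    } catch (IOException e) {
      System.out.println(e.getMessage() + ", calls made=" + callsMade[0]);
    }
  }
}
```

Because the input never changes between attempts, each retry repeats the same failure, and the caller only sees the aggregate exception at the end.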
+Proposed solution+
Instead of letting an unhandled ArrayIndexOutOfBoundsException escape from
BinaryComponentComparator, the server should check the byte array length before
attempting the byte array comparison and "gracefully" return a
DoNotRetryIOException. This prevents the client from attempting retries that
are guaranteed to fail. The DoNotRetryIOException should clearly explain the
root cause to the user: the user-provided byte offset for the filter exceeded
the length of a byte array that the scan encountered. This approach preserves
the existing behavior of the filter in that it will still fail when an
unexpectedly short byte array is encountered, rather than skipping byte arrays
that are too short to compare.
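A length check of this kind might look like the sketch below. It is self-contained and makes several assumptions: {{CheckedComponentCompare}}, {{compareComponent}} and {{NonRetryableException}} are hypothetical names, the exception class stands in for HBase's DoNotRetryIOException so that the example compiles without HBase, and where such a check would actually live on the server side is left open.

```java
import java.io.IOException;
import java.util.Arrays;

class CheckedComponentCompare {
  /** Hypothetical stand-in for DoNotRetryIOException (not the HBase class). */
  static class NonRetryableException extends IOException {
    NonRetryableException(String message) {
      super(message);
    }
  }

  // Compares `component` (unsigned, lexicographic) against the bytes of `value`
  // in the window [offset, offset + component.length), after validating that
  // the window fits inside `value`.
  static int compareComponent(byte[] component, int offset, byte[] value)
      throws NonRetryableException {
    if (offset < 0 || offset > value.length - component.length) {
      throw new NonRetryableException("Filter offset " + offset
          + " plus component length " + component.length
          + " exceeds byte array length " + value.length);
    }
    return Arrays.compareUnsigned(component, 0, component.length,
        value, offset, offset + component.length);
  }

  public static void main(String[] args) {
    byte[] component = {0x01, 0x02};
    byte[] shortValue = new byte[38]; // the "short" byte array from the example
    try {
      compareComponent(component, 40, shortValue);
    } catch (NonRetryableException e) {
      // The message names the offset and the array length, so the user can
      // see why the scan failed without reading a server stack trace.
      System.out.println(e.getMessage());
    }
  }
}
```

The sketch still fails on a short array, matching the stated goal of preserving the filter's existing behavior; only the exception type and message change.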
was:
+Proposed solution+
Instead of letting an unhandled ArrayIndexOutOfBoundsException escape from
BinaryComponentComparator, the server should "gracefully" return a
DoNotRetryIOException. This prevents the client from attempting retries that
are guaranteed to fail. The DoNotRetryIOException should clearly explain the
root cause to the user: the user-provided byte offset for the filter exceeded
the length of a byte array that the scan encountered. This approach preserves
the existing behavior of the filter in that it will still fail when an
unexpectedly short byte array is encountered, rather than skipping byte arrays
that are too short to compare.
> BinaryComponentComparator filter fails non gracefully with
> ArrayIndexOutOfBoundsException
> -----------------------------------------------------------------------------------------
>
> Key: HBASE-29654
> URL: https://issues.apache.org/jira/browse/HBASE-29654
> Project: HBase
> Issue Type: Bug
> Components: Filters
> Reporter: Daniel Roudnitsky
> Assignee: Daniel Roudnitsky
> Priority: Major
--
This message was sent by Atlassian Jira
(v8.20.10#820010)