Ray Mattingly created HBASE-29630:
-------------------------------------

             Summary: RetryingCallerInterceptor should be wired up for batch 
calls
                 Key: HBASE-29630
                 URL: https://issues.apache.org/jira/browse/HBASE-29630
             Project: HBase
          Issue Type: Improvement
    Affects Versions: 2.6.3
            Reporter: Ray Mattingly
            Assignee: Ray Mattingly


h3. Summary

{{RetryingCallerInterceptor}} should be invoked for *batch operations* 
({{{}callWithoutRetries(...){}}}) in addition to single operations 
({{{}callWithRetries(...){}}}). This will ensure consistent behavior for 
failure handling, fast-fail logic, and custom metrics across all client 
operations.
h3. Problem
 * *Single operations* ({{{}HTable.get(Get){}}}, etc.) run through 
{{{}RpcRetryingCaller.callWithRetries(){}}}, which wires in interceptor hooks 
({{{}intercept(){}}}, {{{}handleFailure(){}}}, {{{}updateFailureInfo(){}}}).

 * *Batch operations* ({{{}HTable.batch(){}}}, {{{}HTable.get(List<Get>){}}}) 
go through {{{}RpcRetryingCaller.callWithoutRetries(){}}}, which currently 
bypasses interceptor hooks.

 * As a result:

 ** Custom interceptors (e.g., handling {{{}RpcThrottlingException{}}}) do not 
run for batch calls.

 ** Metrics and fast-fail logic only apply to single operations, not batch 
operations.

 ** Behavior is inconsistent between single and batch APIs.

h3. Proposed Change

Update {{RpcRetryingCallerImpl.callWithoutRetries(...)}} to use the same 
interceptor lifecycle as {{{}callWithRetries(...){}}}:
 # Create an interceptor context at call start.

 # Call {{interceptor.intercept()}} before executing the RPC.

 # On exception:

 ** Translate the throwable.

 ** Pass the translated exception into {{{}interceptor.handleFailure(){}}}.

 ** Preserve {{PreemptiveFastFailException}} behavior.

 # Call {{interceptor.updateFailureInfo()}} in the {{finally}} block when 
failures occur.

 # Add an {{AttemptType.NO_RETRY}} marker to the context for clarity in 
logs/metrics.

 # Optionally enrich the context when {{RetriesExhaustedWithDetailsException}} 
(REWDE) is thrown, so interceptors can see per-item failures.

h3. Benefits
 * *Consistency:* Single and batch operations follow the same interceptor 
pattern.

 * *Extensibility:* Interceptors can handle throttling and emit metrics for 
batch calls without custom forks.

 * *Maintainability:* Small, contained change in {{{}RpcRetryingCallerImpl{}}}; 
no invasive changes to {{{}AsyncProcess{}}}.

 * *Compatibility:* No breaking API changes. No-op interceptors behave as 
before.

 * *Community value:* Opens the door to richer failure handling for all client 
operations.

h3. Example Use Case

At my day job we use interceptors to detect {{RpcThrottlingException}} and fail 
fast while reporting metrics. This works for single gets/puts but is currently 
bypassed in batch calls, limiting visibility and resilience. With this change, 
batch requests can also fast-fail and report throttling metrics.
h3. Risks & Mitigation
 * *Performance regression:* Interceptor calls are lightweight; validate with 
before/after microbenchmarks.

 * *Behavior change for existing interceptors:* Only makes interceptors {_}more 
consistent{_}. Default no-op interceptor ensures compatibility.

 * *Naming confusion (“RetryingCaller” on no-retry path):* Clarify with 
Javadoc: interceptors apply to both retrying and no-retry paths.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to