[ 
https://issues.apache.org/jira/browse/KUDU-3668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shen yushi updated KUDU-3668:
-----------------------------
    Status: In Review  (was: Open)

> Kudu Client Write Retry Behavior After Timeout
> ----------------------------------------------
>
>                 Key: KUDU-3668
>                 URL: https://issues.apache.org/jira/browse/KUDU-3668
>             Project: Kudu
>          Issue Type: Bug
>          Components: client
>            Reporter: shen yushi
>            Priority: Major
>
> *Issue Description*
> The Kudu client continues internal retries of write requests _after_ 
> returning a timeout status to the caller. In our implementation, callers may 
> initiate new write operations upon receiving this timeout status. This can 
> cause overlapping writes if the client's internal retry completes 
> concurrently, potentially resulting in data duplication or conflicts.
> *Technical Analysis*
> Based on our investigation (referencing 
> [gerrit.cloudera.org/c/12338|https://gerrit.cloudera.org/c/12338]):
>  # Each KuduRpc object contains an RpcTimeoutTask.
>  # When a tablet server stalls (simulated by suspending the process with gdb), 
> this task fires: it invokes the RPC's error callback and returns a timeout 
> status to the caller.
>  # Meanwhile, the Connection object retains the original KuduRpc.
>  # Upon tablet server recovery, the connection may receive exceptions/errors, 
> and AsyncKuduClient.handleRetryableError then initiates internal retries of 
> the retained KuduRpc.
> *Environment*
>  * Kudu Version: 1.15.0
> *Key Question*
> Is this dual-retry behavior (caller + client) intentional? We found no 
> existing issues in the community that address this scenario. We'd appreciate 
> your insights on:
>  * Whether this constitutes a bug
>  * Recommended solutions or configuration adjustments
> We're committed to providing additional details and contributing fixes if 
> needed. Thank you for your time and expertise!
>  
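
The sequence in the Technical Analysis above can be sketched in isolation. The following is a minimal, self-contained Java illustration, not Kudu code: PendingWrite, failWithTimeout, and retryIfStillPending are hypothetical names invented here, and the sketch assumes only what the report describes, namely that the timeout task and the connection's retry path share the same RPC object. It shows one possible guard: once the timeout has been reported to the caller, a later retry attempt is refused.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class RetryAfterTimeoutSketch {

    /** Hypothetical stand-in for an in-flight write RPC; NOT the real KuduRpc class. */
    static final class PendingWrite {
        // Set once the caller has been given a final status (here: a timeout).
        final AtomicBoolean completed = new AtomicBoolean(false);
        // Number of times the write has been sent; starts at 1 for the first send.
        final AtomicInteger attemptsSent = new AtomicInteger(1);

        /**
         * Models the timeout task firing (assumption: it runs at most once per RPC).
         * Returns true if this call is the one that reported the timeout to the caller.
         */
        boolean failWithTimeout() {
            return completed.compareAndSet(false, true);
        }

        /**
         * Models the retry path that runs when the connection sees a retryable error.
         * Refuses to resend if the caller has already been told the write timed out.
         * A real implementation would also have to close the window between this
         * check and the actual resend; the sketch ignores that for brevity.
         */
        boolean retryIfStillPending() {
            if (completed.get()) {
                return false; // caller already saw a timeout: do not resend
            }
            attemptsSent.incrementAndGet();
            return true;
        }
    }

    public static void main(String[] args) {
        PendingWrite rpc = new PendingWrite();

        // Step 2 of the analysis: the tablet server stalls and the timeout task fires.
        boolean callerSawTimeout = rpc.failWithTimeout();

        // Step 4 of the analysis: the server recovers and the connection, which
        // still holds the RPC, tries to retry it.
        boolean retried = rpc.retryIfStillPending();

        System.out.println("callerSawTimeout=" + callerSawTimeout
                + " retried=" + retried
                + " attemptsSent=" + rpc.attemptsSent.get());
    }
}
```

Without the check in retryIfStillPending, the second send would proceed while the caller, having seen the timeout, may already be issuing its own replacement write, which is the overlap the report describes. Whether a guard of this kind fits the actual client is for the maintainers to judge; the sketch only makes the reported ordering concrete.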



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
