[ 
https://issues.apache.org/jira/browse/KUDU-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexey Serbin reassigned KUDU-2033:
-----------------------------------

    Assignee:     (was: Edward Fancher)

> Add a 'torture' scenario to verify Java client's behavior during fail-over 
> ---------------------------------------------------------------------------
>
>                 Key: KUDU-2033
>                 URL: https://issues.apache.org/jira/browse/KUDU-2033
>             Project: Kudu
>          Issue Type: Test
>          Components: client, java
>            Reporter: Alexey Serbin
>            Priority: Major
>              Labels: newbie, newbie++
>
> For the Kudu Java client we have the {{TestLeaderFailover}} test, which verifies 
> how the client handles the tablet server fail-over scenario.  However, the 
> test covers only one fail-over event and mainly performs write operations 
> while the backend handles the 'unexpected crash' of the tablet server.
> It would be nice to add more tests which cover the client's fail-over 
> behavior:
>   * Add the mixed workload scenario, i.e. combine inserts/scans during the 
> fail-over.  Running the scans would not only verify that the data eventually 
> reaches the destination, but also verify that the client automatically retries 
> the scan operations and eventually succeeds in reading the data from the cluster.
>   * Induce more fail-over events while running the scenario, i.e. pause and 
> then resume the tserver processes many more times and run the test longer.  
> This is to spot possible bugs that surface during transitions and when 
> multiple fail-over events occur.
>   * In the mixed workload scenarios, run scan operations in READ_AT_SNAPSHOT 
> mode with different selectors: LEADER_ONLY and CLOSEST_REPLICA.  That's to 
> cover the retry code paths for both cases (as of now, I could see only the 
> LEADER_ONLY path covered, but I might be mistaken).
> The general idea is to make sure the Java client during fail-over events:
> * Retries write and read operations automatically when an error occurs due to 
> a fail-over event.
> * Does not silently lose any data: if the client cannot send the data due to 
> a timeout or running out of retry attempts, it should report the failure.
>    
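
The "does not silently lose any data" requirement above could be checked with some simple bookkeeping around the workload. The following is only a minimal sketch under stated assumptions: {{WriteLedger}} and all of its methods are hypothetical names, not part of the Kudu client or its existing tests, and rows are assumed to be identified by a long key. The test would record every key it attempts to write, every key the client acknowledges, and every key the client reports as failed, then compare those sets against a final scan taken after the workload has stopped.

```java
import java.util.Set;
import java.util.TreeSet;

/**
 * Hypothetical bookkeeping helper for a fail-over test.
 * Not part of the Kudu client API; all names are illustrative.
 */
final class WriteLedger {
    private final Set<Long> attempted = new TreeSet<>();
    private final Set<Long> acked = new TreeSet<>();
    private final Set<Long> reportedFailed = new TreeSet<>();

    /** Call before handing a row with this key to the client. */
    void recordAttempt(long key) { attempted.add(key); }

    /** Call when the client confirms the write succeeded. */
    void recordAck(long key) { acked.add(key); }

    /** Call when the client surfaces an error for this key. */
    void recordReportedFailure(long key) { reportedFailed.add(key); }

    /** Keys that were acknowledged but are missing from the final scan. */
    Set<Long> silentlyLost(Set<Long> scannedKeys) {
        Set<Long> lost = new TreeSet<>(acked);
        lost.removeAll(scannedKeys);
        return lost;
    }

    /** Keys that were attempted but neither acknowledged nor reported as failed. */
    Set<Long> unaccounted() {
        Set<Long> missing = new TreeSet<>(attempted);
        missing.removeAll(acked);
        missing.removeAll(reportedFailed);
        return missing;
    }
}
```

In such a test, both {{silentlyLost(...)}} and {{unaccounted()}} would be expected to return empty sets once the scenario finishes; a non-empty result would point at an acknowledged row that never reached the cluster, or at a write whose outcome the client never reported.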



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
