[ https://issues.apache.org/jira/browse/KUDU-2033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltan Chovan resolved KUDU-2033.
---------------------------------
    Fix Version/s: 1.5.0
       Resolution: Fixed

> Add a 'torture' scenario to verify Java client's behavior during fail-over
> ---------------------------------------------------------------------------
>
>                 Key: KUDU-2033
>                 URL: https://issues.apache.org/jira/browse/KUDU-2033
>             Project: Kudu
>          Issue Type: Test
>          Components: client, java
>            Reporter: Alexey Serbin
>            Priority: Major
>              Labels: newbie, newbie++
>             Fix For: 1.5.0
>
>
> For the Kudu Java client we have the {{TestLeaderFailover}} test, which
> verifies how the client handles the tablet server fail-over scenario.
> However, the test covers only one fail-over event and mainly performs write
> operations while the backend handles the 'unexpected crash' of the tablet
> server.
> It would be nice to add more tests which cover the client's fail-over
> behavior:
> * Add a mixed workload scenario, i.e. combine inserts/scans during the
> fail-over. Running the scans would not only verify that the data eventually
> reaches the destination, but also verify that the client automatically
> retries the scan operations and eventually succeeds in reading the data from
> the cluster.
> * Induce more fail-over events while running the scenario, i.e. pause and
> then resume the tserver processes many more times and run the test longer.
> This is to spot possible bugs during the transitions and when multiple
> fail-over events occur.
> * In the mixed workload scenarios, run scan operations in READ_AT_SNAPSHOT
> mode with different selectors: LEADER_ONLY and CLOSEST_REPLICA. That's to
> cover the retry code paths for both cases (as of now, I could see only the
> LEADER_ONLY path covered, but I might be mistaken).
> The general idea is to make sure the Java client during fail-over events:
> * Retries write and read operations automatically when an error occurs due to
> a fail-over event.
> * Does not silently lose any data: if the client cannot send the data due to
> a timeout or because it ran out of retry attempts, it should report that.
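
The two properties listed above (automatic retries, and no silent data loss once retries run out) can be illustrated with a small stand-alone sketch. It does not use the Kudu client API or a real cluster: FlakyStore, withRetries, runMixedWorkload and Result are hypothetical names invented for this illustration, and the "fail-over" is simulated by failing every N-th call. It also assumes a failed call is rejected before the row is written, which a real timeout does not guarantee.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Supplier;

public class FailoverRetrySketch {

    /** Hypothetical stand-in for a cluster whose tablet server is paused now and then. */
    static final class FlakyStore {
        private final Set<Integer> rows = new TreeSet<>();
        private final int failEvery;
        private int calls = 0;

        FlakyStore(int failEvery) { this.failEvery = failEvery; }

        boolean insert(int key) { maybeFail(); return rows.add(key); }

        List<Integer> scan() { maybeFail(); return new ArrayList<>(rows); }

        /** Verification-only view that bypasses the simulated fail-over. */
        Set<Integer> storedRows() { return new TreeSet<>(rows); }

        // Assumption: every failEvery-th call fails before touching any data.
        private void maybeFail() {
            if (++calls % failEvery == 0) {
                throw new IllegalStateException("simulated fail-over");
            }
        }
    }

    /** What the workload reports back: nothing is dropped without a trace. */
    static final class Result {
        final Set<Integer> ackedKeys = new TreeSet<>();
        final Set<Integer> failedKeys = new TreeSet<>();
        int failedScans = 0;
        Set<Integer> storedRows = new TreeSet<>();
    }

    /** Retries op up to maxAttempts times, then rethrows so the caller can report it. */
    static <T> T withRetries(int maxAttempts, Supplier<T> op) {
        IllegalStateException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (IllegalStateException e) {
                last = e;
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts", last);
    }

    /** Mixed workload: inserts with a scan after every fifth key. */
    static Result runMixedWorkload(int numRows, int failEvery, int maxAttempts) {
        FlakyStore store = new FlakyStore(failEvery);
        Result result = new Result();
        for (int key = 0; key < numRows; key++) {
            final int k = key;
            try {
                withRetries(maxAttempts, () -> store.insert(k));
                result.ackedKeys.add(k);
            } catch (IllegalStateException e) {
                result.failedKeys.add(k); // reported, not silently dropped
            }
            if (key % 5 == 4) {
                try {
                    List<Integer> seen = withRetries(maxAttempts, store::scan);
                    if (!seen.containsAll(result.ackedKeys)) {
                        throw new AssertionError("acknowledged row missing from scan");
                    }
                } catch (IllegalStateException e) {
                    result.failedScans++;
                }
            }
        }
        result.storedRows = store.storedRows();
        return result;
    }

    public static void main(String[] args) {
        Result r = runMixedWorkload(100, 3, 5);
        System.out.println("acked=" + r.ackedKeys.size()
                + " failed=" + r.failedKeys.size()
                + " failedScans=" + r.failedScans);
    }
}
```

The invariant worth checking in such a test is that every key ends up either acknowledged or reported as failed, and that the stored rows match the acknowledged set. A real scenario would replace FlakyStore with an actual cluster and pause/resume the tserver processes instead of counting calls.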