steveloughran opened a new pull request, #6716:
URL: https://github.com/apache/hadoop/pull/6716

   
   Improve resilience of task commit save and rename operation with retries.
     
   * Retries of save():
     5 attempts, with a 500 millisecond sleep between them. Not configurable.
     Open question: should this be made configurable?
   * Split delete(path, recursive) into deleteFile and rmdir for separate
     statistics.
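
The retry behaviour described for save() can be sketched roughly as below. This is a minimal illustration, not the code in this patch: the class `SaveRetrySketch`, the `IOOperation` interface and the `retry` method are hypothetical names, and only the values of 5 attempts and 500 millis come from the description above.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

/** Hypothetical sketch of a fixed-attempt retry loop; not the committer's code. */
final class SaveRetrySketch {

  /** Values quoted in the PR description. */
  static final int ATTEMPTS = 5;
  static final long SLEEP_MILLIS = 500;

  /** An operation which may fail with an IOException (assumed shape). */
  @FunctionalInterface
  interface IOOperation<T> {
    T apply() throws IOException;
  }

  /**
   * Invoke the operation up to {@code attempts} times, sleeping between
   * failed attempts, and rethrow the last failure if every attempt fails.
   */
  static <T> T retry(IOOperation<T> operation, int attempts, long sleepMillis)
      throws IOException {
    if (attempts < 1) {
      throw new IllegalArgumentException("attempts must be >= 1");
    }
    IOException last = null;
    for (int attempt = 1; attempt <= attempts; attempt++) {
      try {
        return operation.apply();
      } catch (IOException e) {
        last = e;
        if (attempt < attempts) {
          try {
            // fixed sleep; no backoff or jitter in this sketch
            Thread.sleep(sleepMillis);
          } catch (InterruptedException ie) {
            // restore the interrupt flag and stop retrying
            Thread.currentThread().interrupt();
            InterruptedIOException iioe =
                new InterruptedIOException("interrupted during retry");
            iioe.initCause(ie);
            throw iioe;
          }
        }
      }
    }
    throw last;
  }
}
```

A caller would wrap the save in something like `SaveRetrySketch.retry(() -> save(...), ATTEMPTS, SLEEP_MILLIS)`. If the values were made configurable, they would replace the two constants.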
     
   The test failure simulation is expanded to:
   * Support recovery through a countdown of calls to fail.
   * Simulate timeout before *and after* rename calls.
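
The two simulation features above could look something like the following. This is only a sketch under stated assumptions: `FailingRenameSketch`, `Renamer`, `FailurePoint` and `CountdownFailingRenamer` are hypothetical names invented for illustration and are not the test classes in this patch.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical fault-injection sketch; not the patch's test code. */
final class FailingRenameSketch {

  /** Where the simulated timeout is raised relative to the real rename. */
  enum FailurePoint { NONE, BEFORE_RENAME, AFTER_RENAME }

  /** Minimal rename abstraction assumed for this sketch. */
  interface Renamer {
    boolean rename(String source, String dest) throws IOException;
  }

  /** Fails the first N calls, then lets every later call through. */
  static final class CountdownFailingRenamer implements Renamer {
    private final Renamer inner;
    private final FailurePoint failurePoint;
    private final AtomicInteger failuresRemaining;

    CountdownFailingRenamer(Renamer inner, FailurePoint failurePoint,
        int failures) {
      this.inner = inner;
      this.failurePoint = failurePoint;
      this.failuresRemaining = new AtomicInteger(failures);
    }

    @Override
    public boolean rename(String source, String dest) throws IOException {
      // decrement the countdown, never going below zero
      boolean fail = failurePoint != FailurePoint.NONE
          && failuresRemaining.getAndUpdate(n -> n > 0 ? n - 1 : 0) > 0;
      if (fail && failurePoint == FailurePoint.BEFORE_RENAME) {
        // the rename never reaches the store
        throw new IOException("simulated timeout before rename");
      }
      boolean renamed = inner.rename(source, dest);
      if (fail && failurePoint == FailurePoint.AFTER_RENAME) {
        // the rename took place, but the caller sees a failure
        throw new IOException("simulated timeout after rename");
      }
      return renamed;
    }
  }
}
```

In this sketch the after-rename case is the harder one to recover from: the underlying rename has already happened, so a naive retry would find the source missing and the destination present.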
   
   This is based on #6596 but skips the rate limiting logic spanning common and
   azure; it only contains changes in the manifest committer, which makes it
   easier to backport.
   
   
   ### How was this patch tested?
   
   * manual run of the new tests
   * full test suite left to yetus
   * azure test run in progress.
   
   
   ### For code changes:
   
   - [X] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

