[ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=612538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612538
 ]

ASF GitHub Bot logged work on HADOOP-17764:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/Jun/21 10:29
            Start Date: 21/Jun/21 10:29
    Worklog Time Spent: 10m 
      Work Description: majdyz commented on a change in pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#discussion_r655260406



##########
File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java
##########
@@ -396,6 +396,41 @@ private void incrementBytesRead(long bytesRead) {
     }
   }
 
+  @FunctionalInterface
+  interface CheckedIntSupplier {
+    int get() throws IOException;
+  }
+
+  /**
+   * Helper function that allows to retry an IntSupplier in case of `IOException`.
+   * This function is used by `read()` and `read(buf, off, len)` functions. It tries to run
+   * `readFn` and in case of `IOException`:
+   *   1. If it gets an EOFException, return -1
+   *   2. Else, run `onReadFailure` and retry running `readFn`. If it fails again,
+   *   we run `onReadFailure` and re-throw the error.
+   * @param readFn the function to read, it must return an integer
+   * @param length length of data being attempted to read
+   * @return -1 if `readFn` throws EOFException, else returns int value from the result of `readFn`
+   * @throws IOException if retry of `readFn` also fails with `IOException`
+   */
+  private int retryReadOnce(CheckedIntSupplier readFn, int length) throws IOException {
+    try {
+      return readFn.get();
+    } catch (EOFException e) {
+      return -1;
+    } catch (IOException e) {
+      onReadFailure(e, length, e instanceof SocketTimeoutException);

Review comment:
       This method is used by both `read()` and `read(b, off, len)`: we pass 
length = 1 for `read()` and the caller's `len` for `read(b, off, len)`. This is 
intended to keep the current behaviour.
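
For readers without the full diff, the retry-once pattern described in the Javadoc above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code from the pull request: the class name `RetryReadOnceSketch`, the `recoveries` and `lastLength` fields, and the stubbed `onReadFailure` are hypothetical, and the behaviour after the first `onReadFailure` call (including mapping an `EOFException` on the retry to -1) is inferred from the Javadoc, since the quoted hunk ends at that line.

```java
import java.io.EOFException;
import java.io.IOException;
import java.net.SocketTimeoutException;

final class RetryReadOnceSketch {

  @FunctionalInterface
  interface CheckedIntSupplier {
    int get() throws IOException;
  }

  // Hypothetical bookkeeping so the sketch can be observed; not part of the patch.
  int recoveries = 0;
  int lastLength = -1;

  // Stand-in for S3AInputStream.onReadFailure. The real method is expected
  // to recover the stream; here it only records that it was called.
  void onReadFailure(IOException cause, int length, boolean forceAbort) {
    recoveries++;
    lastLength = length;
  }

  // Run readFn; on a non-EOF IOException, recover and retry exactly once.
  int retryReadOnce(CheckedIntSupplier readFn, int length) throws IOException {
    try {
      return readFn.get();
    } catch (EOFException e) {
      return -1;
    } catch (IOException e) {
      onReadFailure(e, length, e instanceof SocketTimeoutException);
      try {
        return readFn.get();
      } catch (EOFException e2) {
        // Assumption: EOF on the retry is treated like EOF on the first try.
        return -1;
      } catch (IOException e2) {
        // Recover again so the next caller starts from a clean stream, then rethrow.
        onReadFailure(e2, length, e2 instanceof SocketTimeoutException);
        throw e2;
      }
    }
  }

  public static void main(String[] args) throws IOException {
    RetryReadOnceSketch stream = new RetryReadOnceSketch();
    int[] attempts = {0};
    // Single-byte read: length = 1, as described in the review comment.
    int value = stream.retryReadOnce(() -> {
      if (attempts[0]++ == 0) {
        throw new IOException("Connection reset");
      }
      return 7;
    }, 1);
    System.out.println(value + " after " + stream.recoveries + " recovery call(s)");
  }
}
```

In this sketch the `length` argument is only forwarded to `onReadFailure`, which matches the comment above: a single-byte `read()` would pass 1, and `read(b, off, len)` would pass its `len`.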






Issue Time Tracking
-------------------

    Worklog Id:     (was: 612538)
    Time Spent: 1h  (was: 50m)

> S3AInputStream read does not re-open the input stream on the second read 
> retry attempt
> --------------------------------------------------------------------------------------
>
>                 Key: HADOOP-17764
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17764
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.3.1
>            Reporter: Zamil Majdy
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> *Bug description:*
> The read method in S3AInputStream has the following behaviour when an 
> IOException occurs during the read:
>  * {{reopen and read quickly}}: After the first {{read}} attempt fails, the 
> client reopens the stream and tries reading again without {{sleep}}.
>  * {{reopen and wait for fixed duration}}: After a {{read}} attempt fails, the 
> client reopens the stream, sleeps for {{fs.s3a.retry.interval}} milliseconds 
> (defaults to 500 ms), and then tries reading from the stream.
> During the {{reopen and read quickly}} process, if the second attempt also 
> fails, the subsequent read is retried without reopening the input stream. 
> This causes some of the bytes already read to be skipped, which results in 
> corrupt data or less data than requested.
>  
> *Scenario to reproduce:*
>  * Execute S3AInputStream `read()` or `read(b, off, len)`.
>  * The read fails and throws a `Connection Reset` exception after reading 
> some data.
>  * The InputStream is re-opened and another `read()` or `read(b, off, len)` 
> is executed.
>  * The read fails for the second time and throws a `Connection Reset` 
> exception after reading some data.
>  * The InputStream is not re-opened, and another `read()` or `read(b, off, 
> len)` is executed after the sleep.
>  * The read succeeds, but it skips the first few bytes that were already 
> read during the second failed attempt.
>  
> *Proposed fix:*
> [https://github.com/apache/hadoop/pull/3109]
> The pull request adds a test that reproduces the issue along with the fix.
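
The reproduction scenario in the issue description can be modelled in isolation. The following is a simplified simulation, not the test added in the pull request and not the actual S3AInputStream code: `SkippedBytesSketch`, `FlakySource`, `Connection`, and `readAll` are hypothetical names, and the model assumes each failed read consumes two bytes from the open connection before throwing.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class SkippedBytesSketch {

  // Hypothetical remote object whose first `failuresLeft` reads fail
  // after consuming two bytes from the open connection.
  static final class FlakySource {
    final byte[] data;
    int failuresLeft;

    FlakySource(byte[] data, int failures) {
      this.data = data;
      this.failuresLeft = failures;
    }

    Connection open(int pos) {
      return new Connection(pos);
    }

    final class Connection {
      int offset;

      Connection(int pos) {
        this.offset = pos;
      }

      int read(byte[] buf, int len) throws IOException {
        int n = Math.min(len, data.length - offset);
        if (n <= 0) {
          return -1;
        }
        if (failuresLeft > 0) {
          failuresLeft--;
          offset += Math.min(2, n); // bytes consumed before the failure
          throw new IOException("Connection reset");
        }
        System.arraycopy(data, offset, buf, 0, n);
        offset += n;
        return n;
      }
    }
  }

  // Reads the whole content with up to two retries. The flag selects whether
  // the connection is reopened after the second failure (the proposed
  // behaviour) or reused as-is (the reported bug).
  static String readAll(String content, boolean reopenOnSecondFailure) throws IOException {
    FlakySource source = new FlakySource(content.getBytes(StandardCharsets.US_ASCII), 2);
    int pos = 0; // logical position the caller expects to read from
    byte[] buf = new byte[content.length()];
    FlakySource.Connection conn = source.open(pos);
    int n;
    try {
      n = conn.read(buf, buf.length);
    } catch (IOException first) {
      conn = source.open(pos); // first failure: always reopen
      try {
        n = conn.read(buf, buf.length);
      } catch (IOException second) {
        if (reopenOnSecondFailure) {
          conn = source.open(pos);
        }
        n = conn.read(buf, buf.length);
      }
    }
    return new String(buf, 0, Math.max(n, 0), StandardCharsets.US_ASCII);
  }

  public static void main(String[] args) throws IOException {
    System.out.println(readAll("0123456789", false)); // prints 23456789
    System.out.println(readAll("0123456789", true));  // prints 0123456789
  }
}
```

Without the reopen, the third read continues from the connection's advanced offset rather than from the caller's logical position, so the first two bytes are silently dropped.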


