[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-07-06 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=619240&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-619240
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 06/Jul/21 11:36
Start Date: 06/Jul/21 11:36
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-874303444


   Note: we could also backport to 3.2.x if you want to cherrypick the 3.3 
changes and retest...then provide a new PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 619240)
Time Spent: 9h 10m  (was: 9h)

> S3AInputStream read does not re-open the input stream on the second read 
> retry attempt
> --
>
> Key: HADOOP-17764
> URL: https://issues.apache.org/jira/browse/HADOOP-17764
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/s3
> Affects Versions: 3.3.1
> Reporter: Zamil Majdy
> Assignee: Zamil Majdy
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.2
>
> Time Spent: 9h 10m
> Remaining Estimate: 0h
>
> *Bug description:*
> The read method in S3AInputStream behaves as follows when an IOException 
> occurs during a read:
>  * {{reopen and read quickly}}: after the first failed {{read}} attempt, the 
> client reopens the stream and retries the read without a {{sleep}}.
>  * {{reopen and wait for fixed duration}}: after a failed {{read}} attempt, 
> the client reopens the stream, sleeps for {{fs.s3a.retry.interval}} 
> milliseconds (defaults to 500 ms), and then retries the read.
> During the {{reopen and read quickly}} process, if the read fails a second 
> time, the subsequent read is retried without reopening the input stream. 
> Some of the bytes are therefore skipped, and the caller receives corrupt or 
> incomplete data.
>  
> *Scenario to reproduce:*
>  * Execute S3AInputStream `read()` or `read(b, off, len)`.
>  * The read fails and throws a `Connection Reset` exception after reading 
> some data.
>  * The InputStream is re-opened and another `read()` or `read(b, off, len)` 
> is executed.
>  * The read fails for the second time and throws a `Connection Reset` 
> exception after reading some data.
>  * The InputStream is not re-opened and another `read()` or `read(b, off, 
> len)` is executed after a sleep.
>  * The read succeeds, but it skips the first few bytes that had already been 
> read before the second failure.
>  
> *Proposed fix:*
> [https://github.com/apache/hadoop/pull/3109]
> Added a test that reproduces the issue, along with the fix.
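
The retry sequence described above can be modelled with a small, self-contained sketch. This is a toy model, not the actual S3AInputStream code or the change in the pull request: `FlakySource`, `readAll` and the `reopenOnEveryRetry` flag are hypothetical names, the source is assumed to consume two bytes from the connection before each simulated `Connection reset`, and the sleep between retries is omitted.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Toy model of the retry behaviour; NOT the real S3AInputStream. */
final class ReopenOnRetryDemo {

  /** Hypothetical remote object whose connection drops a fixed number of times. */
  static final class FlakySource {
    private final byte[] data;
    private int failuresLeft;
    private int wirePos; // offset the open connection has advanced to

    FlakySource(byte[] data, int failures) {
      this.data = data;
      this.failuresLeft = failures;
    }

    /** Open a fresh connection positioned at the given object offset. */
    void reopen(int pos) {
      wirePos = pos;
    }

    /** Assumption: a failing read consumes 2 bytes before the connection resets. */
    int read(byte[] buf, int off, int len) throws IOException {
      if (failuresLeft > 0) {
        failuresLeft--;
        wirePos = Math.min(data.length, wirePos + 2);
        throw new IOException("Connection reset");
      }
      int n = Math.min(len, data.length - wirePos);
      System.arraycopy(data, wirePos, buf, off, n);
      wirePos += n;
      return n;
    }
  }

  /** Reads up to length bytes, retrying failed reads (sleep omitted). */
  static byte[] readAll(FlakySource src, int length, boolean reopenOnEveryRetry)
      throws IOException {
    byte[] out = new byte[length];
    int pos = 0;      // bytes successfully returned to the caller so far
    int failures = 0;
    src.reopen(0);
    while (pos < length) {
      try {
        int n = src.read(out, pos, length - pos);
        if (n <= 0) {
          break; // connection exhausted before the caller got everything
        }
        pos += n;
      } catch (IOException e) {
        failures++;
        if (failures > 3) {
          throw e;
        }
        // Buggy variant: reopen only after the first failure.
        if (reopenOnEveryRetry || failures == 1) {
          src.reopen(pos);
        }
      }
    }
    return Arrays.copyOf(out, pos);
  }

  /** Runs the scenario with two simulated connection resets. */
  static String demo(boolean reopenOnEveryRetry) {
    byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);
    try {
      byte[] got = readAll(new FlakySource(data, 2), data.length, reopenOnEveryRetry);
      return new String(got, StandardCharsets.US_ASCII);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("reopen on first failure only: " + demo(false)); // 23456789
    System.out.println("reopen on every failure:      " + demo(true));  // 0123456789
  }
}
```

In this model, reopening only after the first failure lets the third read continue from the advanced connection position, so the first two bytes are lost. Reopening at the caller's position after every failure returns the full content.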



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-07-05 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=618771&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-618771
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 05/Jul/21 19:56
Start Date: 05/Jul/21 19:56
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-874303444


   Note: we could also backport to 3.2.x if you want to cherrypick the 3.3 
changes and retest...then provide a new PR.




Issue Time Tracking
---

Worklog Id: (was: 618771)
Time Spent: 9h  (was: 8h 50m)




[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-28 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=615506&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615506
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 28/Jun/21 09:03
Start Date: 28/Jun/21 09:03
Worklog Time Spent: 10m 
  Work Description: majdyz closed pull request #3132:
URL: https://github.com/apache/hadoop/pull/3132


   




Issue Time Tracking
---

Worklog Id: (was: 615506)
Time Spent: 8h 50m  (was: 8h 40m)




[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=615183&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615183
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 25/Jun/21 20:11
Start Date: 25/Jun/21 20:11
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-868808282


   thanks. Merged to trunk then (locally) cherrypicked that to branch-3.3, ran 
the new test (and only that test!) and pushed up.
   
   @majdyz thanks! your contribution is appreciated




Issue Time Tracking
---

Worklog Id: (was: 615183)
Time Spent: 8h 40m  (was: 8.5h)




[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=615181&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615181
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 25/Jun/21 20:09
Start Date: 25/Jun/21 20:09
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-867763605


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 52s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  59m 35s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 44s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 26s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 21s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 11s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m  5s |  |  branch has no errors when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 38s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   0m 38s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 31s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 19s | [/results-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/10/artifact/out/results-checkstyle-hadoop-tools_hadoop-aws.txt) |  hadoop-tools/hadoop-aws: The patch generated 4 new + 9 unchanged - 0 fixed = 13 total (was 9)  |
   | +1 :green_heart: |  mvnsite  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 14s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 23s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m 59s |  |  patch has no errors when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 30s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 28s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 107m 10s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/10/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3109 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 38ed7c3f7332 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / bb232121d40a1d1a6473341a4869907739fa3956 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/10/testReport/ |
   | Max. process+thread count | 599 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-a

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=615182&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615182
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 25/Jun/21 20:09
Start Date: 25/Jun/21 20:09
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-868684088


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  35m 53s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 49s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 42s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 29s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 48s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 22s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 47s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 32s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  18m 24s |  |  branch has no errors when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   0m 40s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 23s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 40s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 15s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m 55s |  |  patch has no errors when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 43s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 30s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  86m  1s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/11/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3109 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux e92c2d7139cd 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 448b2ef2baefcc74e7f974245a3d654a80d292c8 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/11/testReport/ |
   | Max. process+thread count | 577 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/11/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This me

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=615151&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615151
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 25/Jun/21 19:02
Start Date: 25/Jun/21 19:02
Worklog Time Spent: 10m 
  Work Description: steveloughran merged pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109


   




Issue Time Tracking
---

Worklog Id: (was: 615151)
Time Spent: 8h 10m  (was: 8h)




[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=615090&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615090
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 25/Jun/21 16:38
Start Date: 25/Jun/21 16:38
Worklog Time Spent: 10m 
  Work Description: majdyz commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-868690774


   There seems to be no complaint from Yetus now :) 




Issue Time Tracking
---

Worklog Id: (was: 615090)
Time Spent: 8h  (was: 7h 50m)




[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=615088&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615088
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 25/Jun/21 16:26
Start Date: 25/Jun/21 16:26
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-868684088


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  35m 53s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 49s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 42s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 29s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 48s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 22s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 47s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 32s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  18m 24s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   0m 40s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 23s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 40s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 15s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m 55s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 43s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 30s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  86m  1s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/11/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3109 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux e92c2d7139cd 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 448b2ef2baefcc74e7f974245a3d654a80d292c8 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/11/testReport/ |
   | Max. process+thread count | 577 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/11/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message wa

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=615085&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615085
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 25/Jun/21 16:20
Start Date: 25/Jun/21 16:20
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3132:
URL: https://github.com/apache/hadoop/pull/3132#issuecomment-868680478


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 48s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 40s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 46s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 27s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 21s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 29s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m  8s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 32s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   0m 37s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 28s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 19s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 14s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 24s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m 41s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 17s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 30s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  80m  6s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3132/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3132 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 5a556fbf7533 4.15.0-128-generic #131-Ubuntu SMP Wed Dec 9 06:57:35 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 448b2ef2baefcc74e7f974245a3d654a80d292c8 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3132/4/testReport/ |
   | Max. process+thread count | 517 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3132/4/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was au

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-25 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=615027&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-615027
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 25/Jun/21 14:19
Start Date: 25/Jun/21 14:19
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-868534727


   OK. Just fix those line-length checkstyle warnings and we are good to merge. 
As these are just formatting changes, there is no need to rerun the tests.
   
   Regarding the second failure: I've updated the JIRA to "let's just cut it"; 
it's part of the fault injection of inconsistencies we needed to test S3Guard. 
Now that S3 is consistent, it is just a needless failure.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 615027)
Time Spent: 7.5h  (was: 7h 20m)

> S3AInputStream read does not re-open the input stream on the second read 
> retry attempt
> --
>
> Key: HADOOP-17764
> URL: https://issues.apache.org/jira/browse/HADOOP-17764
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.1
>Reporter: Zamil Majdy
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> *Bug description:*
> The read method in S3AInputStream behaves as follows when an 
> IOException occurs during the read:
>  * {{reopen and read quickly}}: after the first attempt of {{read}} fails, 
> the client reopens the stream and tries reading again without {{sleep}}.
>  * {{reopen and wait for fixed duration}}: after an attempt of {{read}} 
> fails, the client reopens the stream, sleeps for 
> {{fs.s3a.retry.interval}} milliseconds (defaults to 500 ms), and then tries 
> reading from the stream.
> During the {{reopen and read quickly}} process, if a second failure occurs, 
> the subsequent read is retried without reopening the input stream. As a 
> result, some of the bytes are skipped, which yields corrupt data or less 
> data than required. 
>  
> *Scenario to reproduce:*
>  * Execute S3AInputStream `read()` or `read(b, off, len)`.
>  * The read fails and throws a `Connection Reset` exception after reading 
> some data.
>  * The InputStream is re-opened and another `read()` or `read(b, off, len)` 
> is executed.
>  * The read fails for the second time and throws a `Connection Reset` 
> exception after reading some data.
>  * The InputStream is not re-opened and another `read()` or `read(b, off, 
> len)` is executed after the sleep.
>  * The read succeeds, but it skips the first few bytes that were already 
> read before the second failure.
>  
> *Proposed fix:*
> [https://github.com/apache/hadoop/pull/3109]
> Added the test that reproduces the issue along with the fix
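
The failure mode described above can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions and is not the S3AInputStream code or the change in the pull request: `FlakySource`, `readAll`, and the `reopenOnEveryRetry` flag are hypothetical names invented for this example, and the "connection" is just an in-memory byte array whose first reads consume two bytes and then throw.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class ReopenOnRetrySketch {

    /** Hypothetical stand-in for a remote object whose first reads fail. */
    static final class FlakySource {
        private final byte[] data;
        private int failuresLeft;
        private int cursor; // offset of the currently open "connection"

        FlakySource(byte[] data, int failures) {
            this.data = data;
            this.failuresLeft = failures;
        }

        /** Opens a fresh "connection" positioned at the given offset. */
        void open(int position) {
            cursor = position;
        }

        /** While failures remain, consumes 2 bytes and then throws. */
        int read(byte[] buf, int off, int len) throws IOException {
            if (cursor >= data.length) {
                return -1;
            }
            if (failuresLeft > 0) {
                failuresLeft--;
                // Bytes already pulled off the broken connection are lost.
                cursor = Math.min(data.length, cursor + 2);
                throw new IOException("Connection reset");
            }
            int n = Math.min(len, data.length - cursor);
            System.arraycopy(data, cursor, buf, off, n);
            cursor += n;
            return n;
        }
    }

    /**
     * Reads the whole source, retrying failed reads up to maxRetries times.
     * With reopenOnEveryRetry=false only the first retry reopens the stream,
     * which mimics the behaviour reported in the issue.
     */
    static byte[] readAll(FlakySource src, int maxRetries,
                          boolean reopenOnEveryRetry) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4];
        int pos = 0; // bytes successfully handed to the caller so far
        src.open(pos);
        while (true) {
            int n = -1;
            int attempt = 0;
            while (true) {
                try {
                    n = src.read(buf, 0, buf.length);
                    break;
                } catch (IOException e) {
                    attempt++;
                    if (attempt > maxRetries) {
                        throw e;
                    }
                    if (reopenOnEveryRetry || attempt == 1) {
                        src.open(pos); // restart from the last good offset
                    }
                }
            }
            if (n < 0) {
                break;
            }
            out.write(buf, 0, n);
            pos += n;
        }
        return out.toByteArray();
    }
}
```

Calling `readAll` with two injected failures and `reopenOnEveryRetry=true` returns the complete input, while the same call with `false` silently drops the bytes consumed during the second failed read. The key design point in the sketch is that the caller-visible offset `pos` only advances after a successful read, so reopening at `pos` on every retry cannot skip data.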



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=614703&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614703
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 24/Jun/21 20:13
Start Date: 24/Jun/21 20:13
Worklog Time Spent: 10m 
  Work Description: majdyz commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-867921545


   I have re-run all the `hadoop-aws` tests and only 2 tests still fail. The 
rest of the errors turned out to be an IAM configuration issue on my end: once 
I explicitly allowed access to the `osm-pds` and `landsat-pds` buckets for the 
account, the remaining tests pass. Here are the 2 failures:
   
   * 
org.apache.hadoop.tools.contract.AbstractContractDistCpTest#testDistCpWithIterator
   `test timed out after 180 milliseconds`
   * org.apache.hadoop.fs.s3a.ITestS3AInconsistency#testGetFileStatus
   `java.lang.AssertionError: getFileStatus should fail due to delayed visibility.`
   
   The second test, which you mentioned in 
https://issues.apache.org/jira/browse/HADOOP-17457,
   fails consistently on my end rather than being flaky.
   




Issue Time Tracking
---

Worklog Id: (was: 614703)
Time Spent: 7h 20m  (was: 7h 10m)







[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=614592&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614592
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 24/Jun/21 16:05
Start Date: 24/Jun/21 16:05
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-867763605


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 52s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  59m 35s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 44s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 26s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 21s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 11s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  17m  5s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 38s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   0m 38s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 31s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 19s | [/results-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/10/artifact/out/results-checkstyle-hadoop-tools_hadoop-aws.txt) |  hadoop-tools/hadoop-aws: The patch generated 4 new + 9 unchanged - 0 fixed = 13 total (was 9)  |
   | +1 :green_heart: |  mvnsite  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 14s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 23s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  16m 59s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 30s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 28s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 107m 10s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/10/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3109 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 38ed7c3f7332 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / bb232121d40a1d1a6473341a4869907739fa3956 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/10/testReport/ |
   | Max. process+thread count | 599 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
   

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=614589&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614589
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 24/Jun/21 15:57
Start Date: 24/Jun/21 15:57
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3132:
URL: https://github.com/apache/hadoop/pull/3132#issuecomment-867757850


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 51s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  54m  2s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 45s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 27s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 44s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 20s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 12s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  16m 56s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   0m 39s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 30s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 19s | [/results-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3132/3/artifact/out/results-checkstyle-hadoop-tools_hadoop-aws.txt) |  hadoop-tools/hadoop-aws: The patch generated 4 new + 9 unchanged - 0 fixed = 13 total (was 9)  |
   | +1 :green_heart: |  mvnsite  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 14s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 23s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  17m  5s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 32s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 30s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 101m 37s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3132/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3132 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux ac3ae025f069 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / bb232121d40a1d1a6473341a4869907739fa3956 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3132/3/testReport/ |
   | Max. process+thread count | 524 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
   | C

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=614568&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614568
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 24/Jun/21 15:28
Start Date: 24/Jun/21 15:28
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-867732100


   Final changes LGTM; if the test run is happy then it's ready to commit




Issue Time Tracking
---

Worklog Id: (was: 614568)
Time Spent: 6h 50m  (was: 6h 40m)







[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=614567&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614567
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 24/Jun/21 15:28
Start Date: 24/Jun/21 15:28
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-865842982


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 58s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  37m 33s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 50s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 42s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 33s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 53s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 27s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 26s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  19m  1s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 45s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | -1 :x: |  javac  |   0m 45s | [/results-compile-javac-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/9/artifact/out/results-compile-javac-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 20 unchanged - 1 fixed = 21 total (was 21)  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | -1 :x: |  javac  |   0m 37s | [/results-compile-javac-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/9/artifact/out/results-compile-javac-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt) |  hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 20 unchanged - 1 fixed = 21 total (was 21)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 25s | [/results-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/9/artifact/out/results-checkstyle-hadoop-tools_hadoop-aws.txt) |  hadoop-tools/hadoop-aws: The patch generated 20 new + 9 unchanged - 0 fixed = 29 total (was 9)  |
   | +1 :green_heart: |  mvnsite  |   0m 43s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 17s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 27s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 23s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  19m 19s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 57s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 37s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  92m  1s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/9/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3109 |
   | Optional T

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=614566&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614566
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 24/Jun/21 15:27
Start Date: 24/Jun/21 15:27
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-864965245


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  30m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 46s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 31s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 48s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 28s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 10s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  14m 22s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 38s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | -1 :x: |  javac  |   0m 38s | [/results-compile-javac-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/8/artifact/out/results-compile-javac-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 20 unchanged - 1 fixed = 21 total (was 21)  |
   | +1 :green_heart: |  compile  |   0m 31s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | -1 :x: |  javac  |   0m 31s | [/results-compile-javac-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/8/artifact/out/results-compile-javac-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt) |  hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 20 unchanged - 1 fixed = 21 total (was 21)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 22s | [/results-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/8/artifact/out/results-checkstyle-hadoop-tools_hadoop-aws.txt) |  hadoop-tools/hadoop-aws: The patch generated 27 new + 9 unchanged - 0 fixed = 36 total (was 9)  |
   | +1 :green_heart: |  mvnsite  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 17s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  14m  9s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 26s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 34s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  73m 36s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/8/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3109 |
   | Optional T

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=614564&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614564
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 24/Jun/21 15:27
Start Date: 24/Jun/21 15:27
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-862506761


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 51s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  30m 41s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 47s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 40s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 31s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 47s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 28s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m  9s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  14m 14s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | -1 :x: |  javac  |   0m 37s | [/results-compile-javac-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/5/artifact/out/results-compile-javac-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 20 unchanged - 1 fixed = 21 total (was 21)  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | -1 :x: |  javac  |   0m 32s | [/results-compile-javac-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/5/artifact/out/results-compile-javac-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt) |  hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 20 unchanged - 1 fixed = 21 total (was 21)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 22s | [/results-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/5/artifact/out/results-checkstyle-hadoop-tools_hadoop-aws.txt) |  hadoop-tools/hadoop-aws: The patch generated 20 new + 9 unchanged - 0 fixed = 29 total (was 9)  |
   | +1 :green_heart: |  mvnsite  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 17s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | -1 :x: |  javadoc  |   0m 25s | [/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/5/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt) |  hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 3 new + 63 unchanged - 0 fixed = 66 total (was 63)  |
   | +1 :green_heart: |  spotbugs  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  13m 59s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 26s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_he

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=614563&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614563
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 24/Jun/21 15:27
Start Date: 24/Jun/21 15:27
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-863309072


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 51s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  29m 25s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 46s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 33s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 46s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 27s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 36s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 10s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  14m 12s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 37s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | -1 :x: |  javac  |   0m 37s | [/results-compile-javac-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/7/artifact/out/results-compile-javac-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 20 unchanged - 1 fixed = 21 total (was 21)  |
   | +1 :green_heart: |  compile  |   0m 30s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | -1 :x: |  javac  |   0m 30s | [/results-compile-javac-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/7/artifact/out/results-compile-javac-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt) |  hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 20 unchanged - 1 fixed = 21 total (was 21)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 21s | [/results-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/7/artifact/out/results-checkstyle-hadoop-tools_hadoop-aws.txt) |  hadoop-tools/hadoop-aws: The patch generated 27 new + 9 unchanged - 0 fixed = 36 total (was 9)  |
   | +1 :green_heart: |  mvnsite  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 17s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  15m  7s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 27s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 34s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  73m  9s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/7/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3109 |
   | Optional T

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=614191&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614191
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 23/Jun/21 19:52
Start Date: 23/Jun/21 19:52
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on a change in pull request 
#3109:
URL: https://github.com/apache/hadoop/pull/3109#discussion_r657412668



##
File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AInputStreamRetry.java
##
@@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import javax.net.ssl.SSLException;
+import java.io.IOException;
+import java.net.SocketException;
+import java.nio.charset.Charset;
+
+import com.amazonaws.services.s3.model.GetObjectRequest;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import com.amazonaws.services.s3.model.S3Object;
+import com.amazonaws.services.s3.model.S3ObjectInputStream;
+import org.junit.Test;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.s3a.audit.impl.NoopSpan;
+import org.apache.hadoop.fs.s3a.auth.delegation.EncryptionSecrets;
+import org.apache.hadoop.fs.s3a.impl.ChangeDetectionPolicy;
+
+import static java.lang.Math.min;
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+
+/**
+ * Tests S3AInputStream retry behavior on read failure.
+ * These tests are for validating expected behavior of retrying the S3AInputStream
+ * read() and read(b, off, len), it tests that the read should reopen the input stream and retry
+ * the read when IOException is thrown during the read process.
+ */
+public class TestS3AInputStreamRetry extends AbstractS3AMockTest {
+
+  String input = "ab";
+
+  @Test
+  public void testInputStreamReadRetryForException() throws IOException {
+    S3AInputStream s3AInputStream = getMockedS3AInputStream();
+
+    assertEquals("'a' from the test input stream 'ab' should be the first character being read",
+        input.charAt(0), s3AInputStream.read());
+    assertEquals("'b' from the test input stream 'ab' should be the second character being read",
+        input.charAt(1), s3AInputStream.read());
+  }
+
+  @Test
+  public void testInputStreamReadRetryLengthForException() throws IOException {
+    byte[] result = new byte[input.length()];
+    S3AInputStream s3AInputStream = getMockedS3AInputStream();
+    s3AInputStream.read(result, 0, input.length());
+
+    assertArrayEquals("The read result should equals to the test input stream content",
+        input.getBytes(), result);
+  }
+
+  private S3AInputStream getMockedS3AInputStream() {
+    Path path = new Path("test-path");
+    String eTag = "test-etag";
+    String versionId = "test-version-id";
+    String owner = "test-owner";
+
+    S3AFileStatus s3AFileStatus = new S3AFileStatus(
+        input.length(), 0, path, input.length(), owner, eTag, versionId);
+
+    S3ObjectAttributes s3ObjectAttributes = new S3ObjectAttributes(
+        fs.getBucket(), path, fs.pathToKey(path), fs.getServerSideEncryptionAlgorithm(),
+        new EncryptionSecrets().getEncryptionKey(), eTag, versionId, input.length());
+
+    S3AReadOpContext s3AReadOpContext = fs.createReadContext(s3AFileStatus, S3AInputPolicy.Normal,
+        ChangeDetectionPolicy.getPolicy(fs.getConf()), 100, NoopSpan.INSTANCE);
+
+    return new S3AInputStream(s3AReadOpContext, s3ObjectAttributes, getMockedInputStreamCallback());
+  }
+
+  // Get mocked InputStreamCallbacks where we return mocked S3Object
+  private S3AInputStream.InputStreamCallbacks getMockedInputStreamCallback() {
+    return new S3AInputStream.InputStreamCallbacks() {
+
+      final S3Object mockedS3Object = getMockedS3Object();
+
+      @Override
+      public S3Object getObject(GetObjectRequest request) {
+        // Set s3 client to return mocked s3object with already defined read behavior
+        return mockedS3Object;
+      }
+
+      @Override
+      public

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=614187&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614187
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 23/Jun/21 19:41
Start Date: 23/Jun/21 19:41
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-867108247


   Thanks for the details.
   I agree, these are all unrelated. Some of them we've seen before and I'd say 
"you are distant from your S3 bucket/slow network/overloaded laptop". There's a 
couple of new ones though, both with hints of security/permissions.
   
   > org.apache.hadoop.tools.contract.AbstractContractDistCpTest#testDistCpWithIterator
   > org.junit.runners.model.TestTimedOutException: test timed out after 180 milliseconds
   
   
   Probably a variant on [HADOOP-17628](https://issues.apache.org/jira/browse/HADOOP-17628): 
we need to make the test directory tree smaller. It'd make the test faster for 
all too. Patches welcome :)
   
   
   > org.apache.hadoop.fs.contract.AbstractContractUnbufferTest#testUnbufferOnClosedFile
   > java.lang.AssertionError: failed to read expected number of bytes from stream. This may be transient Expected :1024 Actual :605
   
   You aren't alone here; it's read() returning an underfull buffer. We can't 
switch to readFully() as the test really wants to call read(). Ignore it. 
Happens when I use many threads in parallel runs. 
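
   For readers unfamiliar with the short-read behaviour mentioned above, here is a minimal sketch in plain Java. It does not use any Hadoop or S3A class: `ShortReadSketch`, `chunked` and `readUntilFull` are hypothetical names made up for illustration, and the 605-byte chunk size is simply borrowed from the assertion message. It shows why a single `read(b, off, len)` may legally return fewer bytes than requested, and how a readFully-style loop hides that.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ShortReadSketch {

  /** Hypothetical stream that hands back at most `chunk` bytes per read call. */
  static ByteArrayInputStream chunked(byte[] data, int chunk) {
    return new ByteArrayInputStream(data) {
      @Override
      public synchronized int read(byte[] b, int off, int len) {
        return super.read(b, off, Math.min(len, chunk));
      }
    };
  }

  /** Keeps calling read() until the buffer is full or EOF; returns bytes read. */
  static int readUntilFull(InputStream in, byte[] buf) {
    int filled = 0;
    try {
      while (filled < buf.length) {
        int n = in.read(buf, filled, buf.length - filled);
        if (n < 0) {
          break; // EOF before the buffer was filled
        }
        filled += n;
      }
    } catch (IOException e) {
      // Wrapped only to keep the sketch short.
      throw new UncheckedIOException(e);
    }
    return filled;
  }

  public static void main(String[] args) {
    byte[] data = new byte[1024];
    byte[] buf = new byte[1024];
    // A single read() is allowed to return fewer bytes than requested.
    System.out.println(chunked(data, 605).read(buf, 0, buf.length)); // prints 605
    // Looping fills the whole buffer.
    System.out.println(readUntilFull(chunked(data, 605), buf)); // prints 1024
  }
}
```

   A test that asserts on the result of one `read()` call, as the unbuffer contract test does, is therefore sensitive to how many bytes happen to be available at that moment.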
   
   > org.apache.hadoop.fs.contract.s3a.ITestS3AContractUnbuffer
   > java.lang.AssertionError: failed to read expected number of bytes from stream. This may be transient Expected :1024 Actual :605
   
   same transient; ignore
   
   > org.apache.hadoop.fs.s3a.ITestS3AInconsistency#testGetFileStatus
   > java.lang.AssertionError: getFileStatus should fail due to delayed visibility.
   
   Looks like you are seeing https://issues.apache.org/jira/browse/HADOOP-17457
   Given S3 is now consistent, I'd fix this by removing the entire test suite :)
   
   
   ```
   org.apache.hadoop.fs.s3a.tools.ITestMarkerTool
   java.nio.file.AccessDeniedException: : listObjects: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; 
   ```
   
   This is new. Can you file a JIRA with the stack trace, just so we have a 
history of it? 
   MarkerTool should just be trying to call listObjects under a path in the 
test dir. 
   
   ```
   org.apache.hadoop.fs.s3a.auth.delegation.ITestDelegatedMRJob
   java.nio.file.AccessDeniedException: s3a://osm-pds/planet/planet-latest.orc#_partition.lst: getFileStatus on s3a://osm-pds/planet/planet-latest.orc#_partition.lst: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: A1Y4D90WW452Q8A9; S3 Extended Request ID: b/IV48OeMEgTaxikC9raP+IiHVPve3rIeoVkCymMc5opNp/70Iyc0tY2WZ0zpixFl0w7WT3bBCQ=; Proxy: null), S3 Extended Request ID: b/IV48OeMEgTaxikC9raP+IiHVPve3rIeoVkCymMc5opNp/70Iyc0tY2WZ0zpixFl0w7WT3bBCQ=:403 Forbidden
   ```
   
   This is *very* new, which makes it interesting. If you are seeing this, it 
means it may surface in the wild. I suspect it's because you've got an IAM 
permission set up blocking access to this (public) dataset.
   
   Can you file a JIRA with this too? I'll probably give you some tasks to find 
out more about the cause, but at least there'll be an indexed reference to the 
issue.
   
   




Issue Time Tracking
---

Worklog Id: (was: 614187)
Time Spent: 5h 50m  (was: 5h 40m)


[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=614106&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614106
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 23/Jun/21 16:24
Start Date: 23/Jun/21 16:24
Worklog Time Spent: 10m 
  Work Description: majdyz edited a comment on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-866982708


   Here are the failing tests:
   
   - org.apache.hadoop.tools.contract.AbstractContractDistCpTest#testDistCpWithIterator
   `org.junit.runners.model.TestTimedOutException: test timed out after 180 milliseconds`
   - org.apache.hadoop.fs.contract.AbstractContractUnbufferTest#testUnbufferOnClosedFile
   `java.lang.AssertionError: failed to read expected number of bytes from stream. This may be transient 
   Expected :1024
   Actual   :605`
   - org.apache.hadoop.fs.s3a.ITestS3AInconsistency#testGetFileStatus
   `java.lang.AssertionError: getFileStatus should fail due to delayed visibility.`
   - org.apache.hadoop.fs.contract.s3a.ITestS3AContractUnbuffer
   `java.lang.AssertionError: failed to read expected number of bytes from stream. This may be transient 
   Expected :1024
   Actual   :605`
   - org.apache.hadoop.fs.s3a.tools.ITestMarkerTool
   `java.nio.file.AccessDeniedException: : listObjects: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: FN0SQ82F85TGTZPW; S3 Extended Request ID: j1bhdYzzKkMVQqSgPDEPW7QQXMkVE+WJeLKP81l/qs7uF0RVx1xcUk2r6Wri4NQFlt/XE9W+FBo=; Proxy: null), S3 Extended Request ID: j1bhdYzzKkMVQqSgPDEPW7QQXMkVE+WJeLKP81l/qs7uF0RVx1xcUk2r6Wri4NQFlt/XE9W+FBo=:AccessDenied`
   - org.apache.hadoop.fs.s3a.select.ITestS3SelectMRJob
   - org.apache.hadoop.fs.s3a.statistics.ITestAWSStatisticCollection
   - org.apache.hadoop.fs.s3a.auth.delegation.ITestDelegatedMRJob
   `java.nio.file.AccessDeniedException: s3a://osm-pds/planet/planet-latest.orc#_partition.lst: getFileStatus on s3a://osm-pds/planet/planet-latest.orc#_partition.lst: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: A1Y4D90WW452Q8A9; S3 Extended Request ID: b/IV48OeMEgTaxikC9raP+IiHVPve3rIeoVkCymMc5opNp/70Iyc0tY2WZ0zpixFl0w7WT3bBCQ=; Proxy: null), S3 Extended Request ID: b/IV48OeMEgTaxikC9raP+IiHVPve3rIeoVkCymMc5opNp/70Iyc0tY2WZ0zpixFl0w7WT3bBCQ=:403 Forbidden`
   




Issue Time Tracking
---

Worklog Id: (was: 614106)
Time Spent: 5h 40m  (was: 5.5h)

> S3AInputStream read does not re-open the input stream on the second read 
> retry attempt
> --
>
> Key: HADOOP-17764
> URL: https://issues.apache.org/jira/browse/HADOOP-17764
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.1
>Reporter: Zamil Majdy
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> *Bug description:*
> The read method in S3AInputStream has this following behaviour when an 
> IOException happening during the read:
>  * {{reopen and read quickly}}: The client after failing in the first attempt 
> of {{read}}, will reopen the stream and try reading again without {{sleep}}.
>  * {{reopen and wait for fixed duration}}: The client after failing in the 
> attempt of {{read}}, will reopen the stream, sleep for 
> {{fs.s3a.retry.interval}} milliseconds (defaults to 500 ms), and then try 
> reading from the stream.
> While doing the {{reopen and read quickly}} process, the subsequent read will 
> be retried without reopening the input stream in case of the second failure 
> happened. This leads to some of the bytes read being skipped which results to 
> corrupt/less data than required. 
>  
> *Scenario to reproduce:*
>  * Execute S3AInputStream `read()` or `read(b, off, len)`.
>  * The read failed and throws `Connection Reset` exception after reading some 
> data.
>  * The InputStream is re-opened and another `read()` or `read(b, off, len)` 
> is executed
>  * The read failed for the second time and throws `Connection Reset` 
> exception after reading some data.
>  * The InputStream is not re-opened and another `read()` or `read(b, off, 
> len)` is executed after sleep
>  * The read succeed, but it skips the first few bytes that has already been 
> read on the second fail
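
The scenario above can be sketched in a few lines of plain Java. This is an illustration under stated assumptions, not the S3AInputStream code or the fix from the pull request: `ReopenOnRetrySketch`, `openAt`, `readAll` and the `failuresLeft` counter are all hypothetical, the object is the six-byte string "abcdef", the sleep between retries is omitted, and a failed read is assumed to consume one byte from the underlying stream before throwing "Connection reset".

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ReopenOnRetrySketch {

  static final byte[] OBJECT = "abcdef".getBytes(StandardCharsets.US_ASCII);

  /** Number of reads that should still fail (hypothetical fault injection). */
  static int failuresLeft;

  /** Opens a stream at `pos`; a failing read consumes a byte, then throws. */
  static InputStream openAt(int pos) {
    InputStream raw = new ByteArrayInputStream(OBJECT, pos, OBJECT.length - pos);
    return new InputStream() {
      @Override
      public int read() throws IOException {
        int b = raw.read(); // the byte is consumed even if we fail below
        if (failuresLeft > 0) {
          failuresLeft--;
          throw new IOException("Connection reset");
        }
        return b;
      }
    };
  }

  /** Reads the whole object, retrying each failed read up to three times. */
  static String readAll(int failures, boolean reopenOnEveryRetry) {
    failuresLeft = failures;
    StringBuilder out = new StringBuilder();
    int pos = 0; // count of bytes successfully returned to the caller
    InputStream in = openAt(pos);
    while (pos < OBJECT.length) {
      int b = -1;
      for (int attempt = 0; ; attempt++) {
        try {
          b = in.read();
          break;
        } catch (IOException e) {
          if (attempt >= 3) {
            throw new UncheckedIOException(e);
          }
          // Reported behaviour: reopen only after the first failure.
          // Safe behaviour: reopen at `pos` before every retry.
          if (attempt == 0 || reopenOnEveryRetry) {
            in = openAt(pos);
          }
        }
      }
      if (b < 0) {
        break; // underlying stream hit EOF early
      }
      out.append((char) b);
      pos++;
    }
    return out.toString();
  }

  public static void main(String[] args) {
    System.out.println(readAll(2, false)); // prints "bcdef": first byte skipped
    System.out.println(readAll(2, true));  // prints "abcdef"
  }
}
```

With two consecutive failures and no reopen on the second retry, the retried read continues on a stream whose position has already moved past the lost byte, which is the data skipping the report describes.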

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=614105&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-614105
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 23/Jun/21 16:23
Start Date: 23/Jun/21 16:23
Worklog Time Spent: 10m 
  Work Description: majdyz commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-866982708


   Here are the failing tests:
   
   - org.apache.hadoop.tools.contract.AbstractContractDistCpTest#testDistCpWithIterator
   `org.junit.runners.model.TestTimedOutException: test timed out after 180 milliseconds`
   - org.apache.hadoop.fs.contract.AbstractContractUnbufferTest#testUnbufferOnClosedFile
   `java.lang.AssertionError: failed to read expected number of bytes from stream. This may be transient 
   Expected :1024
   Actual   :605`
   - org.apache.hadoop.fs.s3a.ITestS3AInconsistency#testGetFileStatus
   `java.lang.AssertionError: getFileStatus should fail due to delayed visibility.`
   - org.apache.hadoop.fs.s3a.tools.ITestMarkerTool
   `java.nio.file.AccessDeniedException: : listObjects: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: FN0SQ82F85TGTZPW; S3 Extended Request ID: j1bhdYzzKkMVQqSgPDEPW7QQXMkVE+WJeLKP81l/qs7uF0RVx1xcUk2r6Wri4NQFlt/XE9W+FBo=; Proxy: null), S3 Extended Request ID: j1bhdYzzKkMVQqSgPDEPW7QQXMkVE+WJeLKP81l/qs7uF0RVx1xcUk2r6Wri4NQFlt/XE9W+FBo=:AccessDenied`
   - org.apache.hadoop.fs.s3a.select.ITestS3SelectMRJob
   - org.apache.hadoop.fs.s3a.statistics.ITestAWSStatisticCollection
   - org.apache.hadoop.fs.s3a.auth.delegation.ITestDelegatedMRJob
   `java.nio.file.AccessDeniedException: s3a://osm-pds/planet/planet-latest.orc#_partition.lst: getFileStatus on s3a://osm-pds/planet/planet-latest.orc#_partition.lst: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: A1Y4D90WW452Q8A9; S3 Extended Request ID: b/IV48OeMEgTaxikC9raP+IiHVPve3rIeoVkCymMc5opNp/70Iyc0tY2WZ0zpixFl0w7WT3bBCQ=; Proxy: null), S3 Extended Request ID: b/IV48OeMEgTaxikC9raP+IiHVPve3rIeoVkCymMc5opNp/70Iyc0tY2WZ0zpixFl0w7WT3bBCQ=:403 Forbidden`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 614105)
Time Spent: 5.5h  (was: 5h 20m)

> S3AInputStream read does not re-open the input stream on the second read 
> retry attempt
> --
>
> Key: HADOOP-17764
> URL: https://issues.apache.org/jira/browse/HADOOP-17764
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.1
>Reporter: Zamil Majdy
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> *Bug description:*
> The read method in S3AInputStream has the following behaviour when an 
> IOException happens during the read:
>  * {{reopen and read quickly}}: After failing in the first attempt of 
> {{read}}, the client reopens the stream and tries reading again without 
> {{sleep}}.
>  * {{reopen and wait for fixed duration}}: After failing in the attempt of 
> {{read}}, the client reopens the stream, sleeps for 
> {{fs.s3a.retry.interval}} milliseconds (defaults to 500 ms), and then tries 
> reading from the stream.
> During the {{reopen and read quickly}} process, if a second failure occurs, 
> the subsequent read is retried without reopening the input stream. This 
> causes some of the bytes already read to be skipped, which results in 
> corrupt data or less data than required. 
>  
> *Scenario to reproduce:*
>  * Execute S3AInputStream `read()` or `read(b, off, len)`.
>  * The read fails and throws a `Connection Reset` exception after reading some 
> data.
>  * The InputStream is re-opened and another `read()` or `read(b, off, len)` 
> is executed.
>  * The read fails for the second time and throws a `Connection Reset` 
> exception after reading some data.
>  * The InputStream is not re-opened and another `read()` or `read(b, off, 
> len)` is executed after sleep.
>  * The read succeeds, but it skips the first few bytes that had already been 
> read on the second failure.
>  
> *Proposed fix:*
> [https://github.com/apache/hadoop/pull/3109]
> Added the test that reproduces the issue along with the fix
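
The scenario above can be modelled with a small, self-contained sketch. This is not the S3AInputStream code or the change in the pull request: `RetryReopenSketch`, `FlakySource`, `readAll` and the `reopenOnEveryRetry` flag are hypothetical names invented for illustration, the "connection" is an in-memory byte array, and the sleep between retries is left out. The only point it tries to show is why a retry that does not re-open the stream can skip bytes: the failed read has already moved the underlying connection forward, while the caller's logical position has not moved.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Toy model of the retry scenario; all names here are hypothetical. */
final class RetryReopenSketch {

  /** Stand-in for a remote object whose connection can be reset mid-read. */
  static final class FlakySource {
    private final byte[] data;
    private int failuresLeft; // number of reads that will still fail
    private int wirePos;      // position of the underlying connection

    FlakySource(byte[] data, int failures) {
      this.data = data;
      this.failuresLeft = failures;
    }

    /** Re-open the connection at the caller's logical position. */
    void reopen(int pos) {
      wirePos = pos;
    }

    /** A failing read consumes up to two bytes from the wire, then throws. */
    int read(byte[] buf, int off, int len) throws IOException {
      int n = Math.min(len, data.length - wirePos);
      if (n <= 0) {
        return -1; // end of stream
      }
      if (failuresLeft > 0) {
        failuresLeft--;
        wirePos += Math.min(2, n);
        throw new IOException("Connection reset");
      }
      System.arraycopy(data, wirePos, buf, off, n);
      wirePos += n;
      return n;
    }
  }

  /** Reads up to length bytes, retrying each read at most three times. */
  static byte[] readAll(FlakySource src, int length, boolean reopenOnEveryRetry) {
    byte[] out = new byte[length];
    int pos = 0; // logical position: bytes actually delivered to the caller
    while (pos < length) {
      int failures = 0;
      int n = 0;
      while (true) {
        try {
          n = src.read(out, pos, length - pos);
          break;
        } catch (IOException e) {
          failures++;
          if (failures > 3) {
            throw new UncheckedIOException(e);
          }
          // Modelled bug: only the first failure triggers a reopen.
          if (reopenOnEveryRetry || failures == 1) {
            src.reopen(pos);
          }
        }
      }
      if (n < 0) {
        return Arrays.copyOf(out, pos); // stream ended early: short result
      }
      pos += n;
    }
    return out;
  }

  public static void main(String[] args) {
    byte[] data = "0123456789".getBytes(StandardCharsets.US_ASCII);
    byte[] buggy = readAll(new FlakySource(data, 2), data.length, false);
    byte[] fixed = readAll(new FlakySource(data, 2), data.length, true);
    // Prints 23456789: the two bytes consumed by the second failure are lost.
    System.out.println(new String(buggy, StandardCharsets.US_ASCII));
    // Prints 0123456789: every retry starts from the logical position.
    System.out.println(new String(fixed, StandardCharsets.US_ASCII));
  }
}
```

In this model, re-opening at the logical position before every retry is what keeps the two positions in sync; whether the retry also sleeps first does not affect which bytes are returned.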



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613473
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 13:04
Start Date: 22/Jun/21 13:04
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-865964446


   Thanks for running the tests. 
   1. Can you run the entire suite (i.e. in the hadoop-aws module, run `mvn 
verify ...` and leave out the -Dtest and -Dit.test bits)?
   2. Paste in the stack traces of the distcp failures here. We'll look and see 
if they are related or not, based on our experience of flaky tests. 




Issue Time Tracking
---

Worklog Id: (was: 613473)
Time Spent: 5h 20m  (was: 5h 10m)

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613438&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613438
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 11:49
Start Date: 22/Jun/21 11:49
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3132:
URL: https://github.com/apache/hadoop/pull/3132#issuecomment-865915522


   :confetti_ball: **+1 overall**
   
   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: | reexec | 0m 47s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 1s | | codespell was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: | mvninstall | 33m 8s | | trunk passed |
   | +1 :green_heart: | compile | 0m 43s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | compile | 0m 36s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | checkstyle | 0m 26s | | trunk passed |
   | +1 :green_heart: | mvnsite | 0m 42s | | trunk passed |
   | +1 :green_heart: | javadoc | 0m 22s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | javadoc | 0m 30s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | spotbugs | 1m 8s | | trunk passed |
   | +1 :green_heart: | shadedclient | 16m 34s | | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: | mvninstall | 0m 35s | | the patch passed |
   | +1 :green_heart: | compile | 0m 37s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | javac | 0m 37s | | the patch passed |
   | +1 :green_heart: | compile | 0m 30s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | javac | 0m 30s | | the patch passed |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | -0 :warning: | checkstyle | 0m 19s | [/results-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3132/2/artifact/out/results-checkstyle-hadoop-tools_hadoop-aws.txt) | hadoop-tools/hadoop-aws: The patch generated 20 new + 9 unchanged - 0 fixed = 29 total (was 9) |
   | +1 :green_heart: | mvnsite | 0m 34s | | the patch passed |
   | +1 :green_heart: | javadoc | 0m 14s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | javadoc | 0m 22s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | spotbugs | 1m 11s | | the patch passed |
   | +1 :green_heart: | shadedclient | 17m 3s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 2m 36s | | hadoop-aws in the patch passed. |
   | +1 :green_heart: | asflicense | 0m 30s | | The patch does not generate ASF License warnings. |
   | | | 80m 14s | | |
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3132/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3132 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux b377c0a688da 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 02ec4ae6b78f05549ccb2428e31e7ff3357f1502 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3132/2/testReport/ |
   | Max. process+thread count | 593 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
   | 

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613404&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613404
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 10:28
Start Date: 22/Jun/21 10:28
Worklog Time Spent: 10m 
  Work Description: majdyz opened a new pull request #3132:
URL: https://github.com/apache/hadoop/pull/3132


   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.)
   For more details, please see 
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
   




Issue Time Tracking
---

Worklog Id: (was: 613404)
Time Spent: 5h  (was: 4h 50m)




[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613403&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613403
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 10:27
Start Date: 22/Jun/21 10:27
Worklog Time Spent: 10m 
  Work Description: majdyz edited a comment on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-865863147


   I got an error that requires me to provide a DynamoDB table to run the S3A 
tests. Is this expected? 
   Is there any workaround to avoid using it?




Issue Time Tracking
---

Worklog Id: (was: 613403)
Time Spent: 4h 50m  (was: 4h 40m)




[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613402&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613402
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 10:26
Start Date: 22/Jun/21 10:26
Worklog Time Spent: 10m 
  Work Description: majdyz edited a comment on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-865863147


   @steveloughran I got an error that requires me to provide a DynamoDB table to 
run the S3A tests. Is this expected? 
   Is there any workaround to avoid using it?




Issue Time Tracking
---

Worklog Id: (was: 613402)
Time Spent: 4h 40m  (was: 4.5h)




[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613401&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613401
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 10:26
Start Date: 22/Jun/21 10:26
Worklog Time Spent: 10m 
  Work Description: majdyz commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-865863147


   @steveloughran I got an error that requires me to provide a DynamoDB table to 
run the S3A tests. Is this expected? 
   Is there a way to exclude these tests?




Issue Time Tracking
---

Worklog Id: (was: 613401)
Time Spent: 4.5h  (was: 4h 20m)




[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613393&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613393
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 09:57
Start Date: 22/Jun/21 09:57
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-865842982


   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: | reexec | 0m 58s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 0s | | codespell was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: | mvninstall | 37m 33s | | trunk passed |
   | +1 :green_heart: | compile | 0m 50s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | compile | 0m 42s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | checkstyle | 0m 33s | | trunk passed |
   | +1 :green_heart: | mvnsite | 0m 53s | | trunk passed |
   | +1 :green_heart: | javadoc | 0m 27s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | javadoc | 0m 38s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | spotbugs | 1m 26s | | trunk passed |
   | +1 :green_heart: | shadedclient | 19m 1s | | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: | mvninstall | 0m 42s | | the patch passed |
   | +1 :green_heart: | compile | 0m 45s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | -1 :x: | javac | 0m 45s | [/results-compile-javac-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/9/artifact/out/results-compile-javac-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) | hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 20 unchanged - 1 fixed = 21 total (was 21) |
   | +1 :green_heart: | compile | 0m 37s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | -1 :x: | javac | 0m 37s | [/results-compile-javac-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/9/artifact/out/results-compile-javac-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt) | hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 20 unchanged - 1 fixed = 21 total (was 21) |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | -0 :warning: | checkstyle | 0m 25s | [/results-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/9/artifact/out/results-checkstyle-hadoop-tools_hadoop-aws.txt) | hadoop-tools/hadoop-aws: The patch generated 20 new + 9 unchanged - 0 fixed = 29 total (was 9) |
   | +1 :green_heart: | mvnsite | 0m 43s | | the patch passed |
   | +1 :green_heart: | javadoc | 0m 17s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | javadoc | 0m 27s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | spotbugs | 1m 23s | | the patch passed |
   | +1 :green_heart: | shadedclient | 19m 19s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 2m 57s | | hadoop-aws in the patch passed. |
   | +1 :green_heart: | asflicense | 0m 37s | | The patch does not generate ASF License warnings. |
   | | | 92m 1s | | |
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/9/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3109 |
   | Optional Tests | d

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613356&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613356
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 08:30
Start Date: 22/Jun/21 08:30
Worklog Time Spent: 10m 
  Work Description: majdyz closed pull request #3132:
URL: https://github.com/apache/hadoop/pull/3132


   




Issue Time Tracking
---

Worklog Id: (was: 613356)
Time Spent: 4h 10m  (was: 4h)

> S3AInputStream read does not re-open the input stream on the second read 
> retry attempt
> --
>
> Key: HADOOP-17764
> URL: https://issues.apache.org/jira/browse/HADOOP-17764
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.1
>Reporter: Zamil Majdy
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> *Bug description:*
> The read method in S3AInputStream behaves as follows when an IOException 
> occurs during a read:
>  * {{reopen and read quickly}}: after the first failed {{read}} attempt, the 
> client reopens the stream and tries reading again without a {{sleep}}.
>  * {{reopen and wait for fixed duration}}: after a failed {{read}} attempt, 
> the client reopens the stream, sleeps for {{fs.s3a.retry.interval}} 
> milliseconds (defaults to 500 ms), and then tries reading from the stream.
> If a second failure happens during the {{reopen and read quickly}} step, the 
> subsequent read is retried without reopening the input stream. Some of the 
> bytes are then skipped, so the caller receives corrupt data or less data 
> than requested. 
>  
> *Scenario to reproduce:*
>  * Execute S3AInputStream `read()` or `read(b, off, len)`.
>  * The read fails and throws a `Connection Reset` exception after reading 
> some data.
>  * The InputStream is re-opened and another `read()` or `read(b, off, len)` 
> is executed.
>  * The read fails for the second time and throws a `Connection Reset` 
> exception after reading some data.
>  * The InputStream is not re-opened and another `read()` or `read(b, off, 
> len)` is executed after a sleep.
>  * The read succeeds, but it skips the first few bytes that had already been 
> read during the second failure.
>  
> *Proposed fix:*
> [https://github.com/apache/hadoop/pull/3109]
> Added a test that reproduces the issue, along with the fix.
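
The scenario in the description above can be modelled in miniature without S3. The following is only a sketch under stated assumptions, not the S3AInputStream code: `RetrySkipDemo`, `Source`, `Connection` and `readByte` are hypothetical names, and a failing read is assumed to consume one byte from the connection before throwing. With `reopenOnEveryFailure` set to false the retry loop reopens only after the first failure, which is the behaviour the issue reports.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class RetrySkipDemo {

  /** Hypothetical remote object; the next `failuresLeft` reads on any connection fail. */
  static final class Source {
    final byte[] data;
    int failuresLeft;

    Source(String content, int failures) {
      this.data = content.getBytes(StandardCharsets.US_ASCII);
      this.failuresLeft = failures;
    }

    /** Opens a fresh connection positioned at the given offset. */
    Connection open(int pos) {
      return new Connection(this, pos);
    }
  }

  /** Hypothetical connection: a failed read still consumes one byte from the wire. */
  static final class Connection {
    private final Source source;
    private int offset;

    Connection(Source source, int offset) {
      this.source = source;
      this.offset = offset;
    }

    int read() throws IOException {
      if (source.failuresLeft > 0) {
        source.failuresLeft--;
        offset++; // the byte is lost: consumed here but never delivered to the caller
        throw new IOException("Connection reset");
      }
      return offset < source.data.length ? source.data[offset++] : -1;
    }
  }

  /**
   * Reads the byte at logical position `pos`, retrying on failure.
   * When reopenOnEveryFailure is false, only the first failure triggers a reopen.
   */
  static int readByte(Source source, int pos, boolean reopenOnEveryFailure) {
    Connection conn = source.open(pos);
    int failures = 0;
    while (true) {
      try {
        return conn.read();
      } catch (IOException e) {
        failures++;
        if (failures > 3) {
          throw new UncheckedIOException(e);
        }
        if (failures == 1 || reopenOnEveryFailure) {
          conn = source.open(pos); // restart from the logical position
        }
      }
    }
  }

  public static void main(String[] args) {
    // Two consecutive failures, reopen only after the first: 'a' is skipped.
    System.out.println((char) readByte(new Source("ab", 2), 0, false)); // prints b
    // Reopen after every failure: the expected byte is returned.
    System.out.println((char) readByte(new Source("ab", 2), 0, true)); // prints a
  }
}
```

In this model the skipped byte comes purely from retrying on a connection whose offset has moved past the caller's logical position; reopening at the logical position on every failure removes the mismatch.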



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613354&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613354
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 08:30
Start Date: 22/Jun/21 08:30
Worklog Time Spent: 10m 
  Work Description: majdyz edited a comment on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-865719554


   Agreed, adding a slight delay on the second retry won't do any harm. I have 
updated the main code to just throw the exception, without any manual retry.
   
   I'm running the tests on `aws us-west-1`; the command I'm using is 
`mvn verify -Dtest=TestS3A* -Dit.test=ITestS3A* -Dparallel-tests`. There were 
two failing tests in `ITestS3AContractDistCp`, which I don't think are 
related. Let me re-run the tests and see how it goes.




Issue Time Tracking
---

Worklog Id: (was: 613354)
Time Spent: 4h  (was: 3h 50m)




[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613353&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613353
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 08:29
Start Date: 22/Jun/21 08:29
Worklog Time Spent: 10m 
  Work Description: majdyz commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-865719554


   Agreed, adding a slight delay on the second retry won't do any harm. I have 
updated the main code to just throw the exception, without any manual retry.
   
   I'm running the tests on `aws us-west-1`; the command I'm using is 
`mvn verify -Dtest=TestS3A* -Dit.test=ITestS3A* -Dparallel-tests`. There were 
two failing tests in `ITestS3AContractDistCp`, which I don't think are 
related. Let me re-run the tests and see how it goes.
   
   As for the Yetus issue, it's weird, actually: I re-created the same branch 
and pushed it as a separate PR here: https://github.com/apache/hadoop/pull/3132
   and the error is gone.




Issue Time Tracking
---

Worklog Id: (was: 613353)
Time Spent: 3h 50m  (was: 3h 40m)




[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613352&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613352
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 08:24
Start Date: 22/Jun/21 08:24
Worklog Time Spent: 10m 
  Work Description: majdyz commented on a change in pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#discussion_r655992116



##
File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AInputStreamRetry.java
##
@@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import javax.net.ssl.SSLException;
+import java.io.IOException;
+import java.net.SocketException;
+import java.nio.charset.Charset;
+
+import com.amazonaws.services.s3.model.GetObjectRequest;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import com.amazonaws.services.s3.model.S3Object;
+import com.amazonaws.services.s3.model.S3ObjectInputStream;
+import org.junit.Test;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.s3a.audit.impl.NoopSpan;
+import org.apache.hadoop.fs.s3a.auth.delegation.EncryptionSecrets;
+import org.apache.hadoop.fs.s3a.impl.ChangeDetectionPolicy;
+
+import static java.lang.Math.min;
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+
+/**
+ * Tests S3AInputStream retry behavior on read failure.
+ * These tests are for validating expected behavior of retrying the S3AInputStream
+ * read() and read(b, off, len), it tests that the read should reopen the input stream and retry
+ * the read when IOException is thrown during the read process.
+ */
+public class TestS3AInputStreamRetry extends AbstractS3AMockTest {
+
+  String input = "ab";
+
+  @Test
+  public void testInputStreamReadRetryForException() throws IOException {
+    S3AInputStream s3AInputStream = getMockedS3AInputStream();
+
+    assertEquals("'a' from the test input stream 'ab' should be the first character being read",
+        input.charAt(0), s3AInputStream.read());
+    assertEquals("'b' from the test input stream 'ab' should be the second character being read",
+        input.charAt(1), s3AInputStream.read());
+  }
+
+  @Test
+  public void testInputStreamReadRetryLengthForException() throws IOException {
+    byte[] result = new byte[input.length()];
+    S3AInputStream s3AInputStream = getMockedS3AInputStream();
+    s3AInputStream.read(result, 0, input.length());
+
+    assertArrayEquals("The read result should equals to the test input stream content",
+        input.getBytes(), result);
+  }
+
+  private S3AInputStream getMockedS3AInputStream() {
+    Path path = new Path("test-path");
+    String eTag = "test-etag";
+    String versionId = "test-version-id";
+    String owner = "test-owner";
+
+    S3AFileStatus s3AFileStatus = new S3AFileStatus(
+        input.length(), 0, path, input.length(), owner, eTag, versionId);
+
+    S3ObjectAttributes s3ObjectAttributes = new S3ObjectAttributes(
+        fs.getBucket(), path, fs.pathToKey(path), fs.getServerSideEncryptionAlgorithm(),
+        new EncryptionSecrets().getEncryptionKey(), eTag, versionId, input.length());
+
+    S3AReadOpContext s3AReadOpContext = fs.createReadContext(s3AFileStatus, S3AInputPolicy.Normal,
+        ChangeDetectionPolicy.getPolicy(fs.getConf()), 100, NoopSpan.INSTANCE);
+
+    return new S3AInputStream(s3AReadOpContext, s3ObjectAttributes, getMockedInputStreamCallback());
+  }
+
+  // Get mocked InputStreamCallbacks where we return mocked S3Object
+  private S3AInputStream.InputStreamCallbacks getMockedInputStreamCallback() {
+    return new S3AInputStream.InputStreamCallbacks() {
+
+      final S3Object mockedS3Object = getMockedS3Object();
+
+      @Override
+      public S3Object getObject(GetObjectRequest request) {
+        // Set s3 client to return mocked s3object with already defined read behavior
+        return mockedS3Object;
+      }
+
+      @Override
+      public GetObje

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613332
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 08:17
Start Date: 22/Jun/21 08:17
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-864965245


   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  30m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 46s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 31s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 48s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 28s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 10s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  14m 22s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 38s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | -1 :x: |  javac  |   0m 38s | [/results-compile-javac-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/8/artifact/out/results-compile-javac-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 20 unchanged - 1 fixed = 21 total (was 21)  |
   | +1 :green_heart: |  compile  |   0m 31s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | -1 :x: |  javac  |   0m 31s | [/results-compile-javac-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/8/artifact/out/results-compile-javac-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt) |  hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 20 unchanged - 1 fixed = 21 total (was 21)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 22s | [/results-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/8/artifact/out/results-checkstyle-hadoop-tools_hadoop-aws.txt) |  hadoop-tools/hadoop-aws: The patch generated 27 new + 9 unchanged - 0 fixed = 36 total (was 9)  |
   | +1 :green_heart: |  mvnsite  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 17s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  14m  9s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 26s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 34s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  73m 36s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/8/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3109 |
   | Optional Tests | d

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613304&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613304
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 08:14
Start Date: 22/Jun/21 08:14
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on a change in pull request 
#3109:
URL: https://github.com/apache/hadoop/pull/3109#discussion_r655400585



##
File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##
@@ -1439,10 +1439,13 @@ public S3Object getObject(GetObjectRequest request) {
    * using FS state as well as the status.
    * @param fileStatus file status.
    * @param seekPolicy input policy for this operation
+   * @param changePolicy change policy for this operation.
    * @param readAheadRange readahead value.
+   * @param auditSpan audit span.
    * @return a context for read and select operations.
    */
-  private S3AReadOpContext createReadContext(
+  @VisibleForTesting
+  protected S3AReadOpContext createReadContext(

Review comment:
   Afraid we currently do. However, if we move to a builder API for that 
ReadOpContext then the test could construct something very minimal (would only 
need the Invoker ref). I'd support that change here as it would help future 
tests.
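
To illustrate the builder idea raised in this review comment, here is a minimal sketch of what a builder-style read context could look like. It is only an illustration under stated assumptions: `ReadContextBuilderDemo`, `ReadContext`, `Builder` and every field name are hypothetical and are not the actual `S3AReadOpContext` API, and a plain `retryLimit` integer stands in for the Invoker reference mentioned above.

```java
final class ReadContextBuilderDemo {

  /** Hypothetical, immutable read context; not Hadoop's S3AReadOpContext. */
  static final class ReadContext {
    final String path;
    final long readAheadRange;
    final int retryLimit; // stand-in for the retry machinery a real context would carry

    private ReadContext(Builder builder) {
      this.path = builder.path;
      this.readAheadRange = builder.readAheadRange;
      this.retryLimit = builder.retryLimit;
    }

    static Builder builder() {
      return new Builder();
    }
  }

  /** Builder with defaults, so a test sets only the fields it cares about. */
  static final class Builder {
    private String path = "";
    private long readAheadRange = 64 * 1024L;
    private int retryLimit = 3;

    Builder withPath(String value) {
      this.path = value;
      return this;
    }

    Builder withReadAheadRange(long value) {
      this.readAheadRange = value;
      return this;
    }

    Builder withRetryLimit(int value) {
      this.retryLimit = value;
      return this;
    }

    ReadContext build() {
      if (readAheadRange < 0 || retryLimit < 0) {
        throw new IllegalArgumentException("negative values are not allowed");
      }
      return new ReadContext(this);
    }
  }

  public static void main(String[] args) {
    // A test that only cares about retries overrides one field and keeps the defaults.
    ReadContext context = ReadContext.builder().withRetryLimit(5).build();
    System.out.println(context.retryLimit + " " + context.readAheadRange); // prints 5 65536
  }
}
```

The point of the design is that a test would no longer need a long positional argument list or a widened method visibility just to obtain a context.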
   


[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613245&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613245
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 08:05
Start Date: 22/Jun/21 08:05
Worklog Time Spent: 10m 
  Work Description: mukund-thakur commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-864980696


   Really a corner case scenario. Nice catch. 
   Reviewed the code. It looks good. 
   From what I understand, we are now catching the SocketTimeoutException 
during the second read as well, re-opening the stream, and throwing the 
exception. 
   
   Though I was wondering: could the same be achieved by always re-opening the 
stream and throwing the exception, so that the invoker does the retry? 
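
The alternative raised in this comment (always drop the stream on failure and rethrow, leaving the retry to the invoker) could be sketched roughly as below. This is a hedged illustration, not the patch itself: `ReopenAndRethrowDemo`, `retry`, `IOCall` and `Reader` are hypothetical names, `retry` is only a stand-in for a retrying invoker, and a failing read is assumed to consume one byte before throwing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class ReopenAndRethrowDemo {

  @FunctionalInterface
  interface IOCall<T> {
    T call() throws IOException;
  }

  /** Hypothetical stand-in for a retrying invoker: fixed sleep between attempts. */
  static <T> T retry(int maxAttempts, long sleepMillis, IOCall<T> operation) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return operation.call();
      } catch (IOException e) {
        last = e;
      }
      if (attempt < maxAttempts) {
        try {
          Thread.sleep(sleepMillis);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
          throw new IllegalStateException("interrupted while waiting to retry", ie);
        }
      }
    }
    throw new UncheckedIOException(last);
  }

  /** Hypothetical reader that never retries itself: on failure it closes and rethrows. */
  static final class Reader {
    private final byte[] data;
    private int pos;             // logical position, advanced only on success
    private int wireOffset = -1; // -1 means "no open connection"
    private int failuresLeft;
    int opens;

    Reader(String content, int failures) {
      this.data = content.getBytes(StandardCharsets.US_ASCII);
      this.failuresLeft = failures;
    }

    int readOnce() throws IOException {
      if (wireOffset < 0) {
        wireOffset = pos; // lazily (re)open at the logical position
        opens++;
      }
      try {
        int b = rawRead();
        if (b >= 0) {
          pos++;
        }
        return b;
      } catch (IOException e) {
        wireOffset = -1; // drop the connection so the next attempt reopens
        throw e;
      }
    }

    private int rawRead() throws IOException {
      if (failuresLeft > 0) {
        failuresLeft--;
        wireOffset++; // a byte is consumed from the wire and lost
        throw new IOException("Connection reset");
      }
      return wireOffset < data.length ? data[wireOffset++] : -1;
    }
  }

  public static void main(String[] args) {
    Reader reader = new Reader("ab", 2);
    int first = retry(5, 0L, reader::readOnce);
    int second = retry(5, 0L, reader::readOnce);
    System.out.println((char) first + "" + (char) second + " opens=" + reader.opens);
    // prints "ab opens=3"
  }
}
```

Because the reader has a single failure path, there is no separate "first retry" branch that could forget to reopen; the cost is that every failure, including the first, goes through the invoker's delay.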




Issue Time Tracking
---

Worklog Id: (was: 613245)
Time Spent: 3h 10m  (was: 3h)




[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613174&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613174
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 07:56
Start Date: 22/Jun/21 07:56
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-862319524








Issue Time Tracking
---

Worklog Id: (was: 613174)
Time Spent: 3h  (was: 2h 50m)




[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613159&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613159
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 07:54
Start Date: 22/Jun/21 07:54
Worklog Time Spent: 10m 
  Work Description: majdyz commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-864940353


   Hi @bogthe, thanks for the feedback. 
   Sorry for getting back on this late. I'm running the tests right now and 
will get back on this soon.
   
   And for the testing part:
   
   - Aside from `.getObjectContent()`, we also need the current 
implementation of the retry policy (the part where we delay before the 
next retry), which is attached to the `S3AReadOpContext`. I tried to keep the 
mocking to a minimum by only setting the behaviour of the input stream, but to 
attach it we need to go from `InputStreamCallbacks` -> `S3Object` -> 
`S3ObjectInputStream`, so two small mocks are needed before that. Let me know 
if there is a better approach for this.
   - For the failure scenario, I provided these two tests where the current 
implementation breaks. Let me know if there is any other scenario I should 
cover in the tests.
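
For the innermost piece of the mocking chain described above, the stream whose read behaviour is scripted, a plain `java.io` test double may be enough. The sketch below is an assumption about how such a double could look, not the code in the PR: `FlakyStreamDemo`, `FlakyInputStream` and `failuresBeforeSuccess` are hypothetical names, and unlike the reported bug this double fails before consuming any bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.SocketException;
import java.nio.charset.StandardCharsets;

final class FlakyStreamDemo {

  /** Hypothetical test double: throws on the first N reads, then delegates. */
  static final class FlakyInputStream extends FilterInputStream {
    private int failuresLeft;

    FlakyInputStream(InputStream in, int failures) {
      super(in);
      this.failuresLeft = failures;
    }

    @Override
    public int read() throws IOException {
      failIfScheduled();
      return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
      failIfScheduled();
      return super.read(b, off, len);
    }

    private void failIfScheduled() throws IOException {
      if (failuresLeft > 0) {
        failuresLeft--;
        throw new SocketException("Connection reset");
      }
    }
  }

  /** Calls read() until it succeeds; returns the failures seen, or -1 if none succeeded. */
  static int failuresBeforeSuccess(InputStream in, int maxAttempts) {
    int failures = 0;
    for (int i = 0; i < maxAttempts; i++) {
      try {
        in.read();
        return failures;
      } catch (IOException e) {
        failures++;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    InputStream in = new FlakyInputStream(
        new ByteArrayInputStream("ab".getBytes(StandardCharsets.US_ASCII)), 2);
    System.out.println(failuresBeforeSuccess(in, 5)); // prints 2
  }
}
```

A double like this would still have to be wrapped in whatever stream and object types the code under test expects, which is where the two small mocks mentioned above come in.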




Issue Time Tracking
---

Worklog Id: (was: 613159)
Time Spent: 2h 50m  (was: 2h 40m)

> S3AInputStream read does not re-open the input stream on the second read 
> retry attempt
> --
>
> Key: HADOOP-17764
> URL: https://issues.apache.org/jira/browse/HADOOP-17764
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs/s3
>Affects Versions: 3.3.1
>Reporter: Zamil Majdy
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> *Bug description:*
> The read method in S3AInputStream behaves as follows when an 
> IOException occurs during a read:
>  * {{reopen and read quickly}}: After the first failed attempt of {{read}}, 
> the client reopens the stream and tries reading again without {{sleep}}.
>  * {{reopen and wait for fixed duration}}: After a failed attempt of 
> {{read}}, the client reopens the stream, sleeps for 
> {{fs.s3a.retry.interval}} milliseconds (defaults to 500 ms), and then tries 
> reading from the stream.
> During the {{reopen and read quickly}} process, if the read fails a second 
> time, the subsequent read is retried without reopening the input stream. 
> As a result, some of the bytes are skipped, and the caller gets corrupt data 
> or less data than requested. 
>  
> *Scenario to reproduce:*
>  * Execute S3AInputStream `read()` or `read(b, off, len)`.
>  * The read fails and throws a `Connection Reset` exception after reading 
> some data.
>  * The InputStream is re-opened and another `read()` or `read(b, off, len)` 
> is executed.
>  * The read fails for the second time and throws a `Connection Reset` 
> exception after reading some data.
>  * The InputStream is not re-opened and another `read()` or `read(b, off, 
> len)` is executed after the sleep.
>  * The read succeeds, but it skips the first few bytes that were already 
> read during the second failure.
>  
> *Proposed fix:*
> [https://github.com/apache/hadoop/pull/3109]
> Added the test that reproduces the issue along with the fix
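
The scenario in the description can be reproduced in miniature. The sketch below is a simplified model and not the S3A code: `FlakySource`, `RetryDemo` and the `reopenOnEveryRetry` flag are hypothetical, and the model assumes that a failing read still consumes a byte from the open connection. With the flag off, only the first retry reopens the stream, as reported; with the flag on, every retry reopens it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical remote object whose first few reads fail after consuming a byte. */
class FlakySource {
  private final byte[] data;
  private int failuresLeft;
  private int wirePos;  // offset of the currently open connection

  FlakySource(String text, int failures) {
    this.data = text.getBytes(StandardCharsets.UTF_8);
    this.failuresLeft = failures;
  }

  /** Opens a fresh connection positioned at {@code pos}. */
  void reopen(int pos) {
    wirePos = pos;
  }

  int read() throws IOException {
    if (wirePos >= data.length) {
      return -1;
    }
    int b = data[wirePos++] & 0xff;  // the byte leaves the wire even if the call fails
    if (failuresLeft > 0) {
      failuresLeft--;
      throw new IOException("Connection reset");
    }
    return b;
  }
}

class RetryDemo {
  /** Reads the byte at logical offset {@code pos}, retrying up to three times. */
  static int readWithRetry(FlakySource src, int pos, boolean reopenOnEveryRetry)
      throws IOException {
    for (int attempt = 1; ; attempt++) {
      try {
        return src.read();
      } catch (IOException e) {
        if (attempt > 3) {
          throw e;
        }
        // Modelled bug: only the first retry reopens the stream.
        if (attempt == 1 || reopenOnEveryRetry) {
          src.reopen(pos);
        }
      }
    }
  }

  /** Drains "ab" from a source that fails twice and returns what the caller saw. */
  static String drain(boolean reopenOnEveryRetry) {
    FlakySource src = new FlakySource("ab", 2);
    src.reopen(0);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
      int pos = 0;
      int b;
      while ((b = readWithRetry(src, pos, reopenOnEveryRetry)) >= 0) {
        out.write(b);
        pos++;
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    System.out.println(drain(false));  // prints "b": the byte 'a' was skipped
    System.out.println(drain(true));   // prints "ab"
  }
}
```

In the failing run the second failure advances the connection past `a`, and because nothing rewinds it, the third attempt returns `b` as if it were the first byte.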



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613143&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613143
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 07:52
Start Date: 22/Jun/21 07:52
Worklog Time Spent: 10m 
  Work Description: majdyz commented on a change in pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#discussion_r655258977



##
File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AInputStreamRetry.java
##
@@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import javax.net.ssl.SSLException;
+import java.io.IOException;
+import java.net.SocketException;
+import java.nio.charset.Charset;
+
+import com.amazonaws.services.s3.model.GetObjectRequest;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import com.amazonaws.services.s3.model.S3Object;
+import com.amazonaws.services.s3.model.S3ObjectInputStream;
+import org.junit.Test;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.s3a.audit.impl.NoopSpan;
+import org.apache.hadoop.fs.s3a.auth.delegation.EncryptionSecrets;
+import org.apache.hadoop.fs.s3a.impl.ChangeDetectionPolicy;
+
+import static java.lang.Math.min;
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+
+/**
+ * Tests S3AInputStream retry behavior on read failure.
+ * These tests are for validating expected behavior of retrying the S3AInputStream
+ * read() and read(b, off, len), it tests that the read should reopen the input stream and retry
+ * the read when IOException is thrown during the read process.
+ */
+public class TestS3AInputStreamRetry extends AbstractS3AMockTest {
+
+  String input = "ab";
+
+  @Test
+  public void testInputStreamReadRetryForException() throws IOException {
+    S3AInputStream s3AInputStream = getMockedS3AInputStream();
+
+    assertEquals("'a' from the test input stream 'ab' should be the first character being read",
+        input.charAt(0), s3AInputStream.read());
+    assertEquals("'b' from the test input stream 'ab' should be the second character being read",
+        input.charAt(1), s3AInputStream.read());
+  }
+
+  @Test
+  public void testInputStreamReadRetryLengthForException() throws IOException {
+    byte[] result = new byte[input.length()];
+    S3AInputStream s3AInputStream = getMockedS3AInputStream();
+    s3AInputStream.read(result, 0, input.length());
+
+    assertArrayEquals("The read result should equals to the test input stream content",
+        input.getBytes(), result);
+  }
+
+  private S3AInputStream getMockedS3AInputStream() {
+    Path path = new Path("test-path");
+    String eTag = "test-etag";
+    String versionId = "test-version-id";
+    String owner = "test-owner";
+
+    S3AFileStatus s3AFileStatus = new S3AFileStatus(
+        input.length(), 0, path, input.length(), owner, eTag, versionId);
+
+    S3ObjectAttributes s3ObjectAttributes = new S3ObjectAttributes(
+        fs.getBucket(), path, fs.pathToKey(path), fs.getServerSideEncryptionAlgorithm(),
+        new EncryptionSecrets().getEncryptionKey(), eTag, versionId, input.length());
+
+    S3AReadOpContext s3AReadOpContext = fs.createReadContext(s3AFileStatus, S3AInputPolicy.Normal,
+        ChangeDetectionPolicy.getPolicy(fs.getConf()), 100, NoopSpan.INSTANCE);
+
+    return new S3AInputStream(s3AReadOpContext, s3ObjectAttributes, getMockedInputStreamCallback());
+  }
+
+  // Get mocked InputStreamCallbacks where we return mocked S3Object
+  private S3AInputStream.InputStreamCallbacks getMockedInputStreamCallback() {
+    return new S3AInputStream.InputStreamCallbacks() {
+
+      final S3Object mockedS3Object = getMockedS3Object();
+
+      @Override
+      public S3Object getObject(GetObjectRequest request) {
+        // Set s3 client to return mocked s3object with already defined read behavior
+        return mockedS3Object;
+      }
+
+      @Override
+  public GetObje

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613114&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613114
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 07:49
Start Date: 22/Jun/21 07:49
Worklog Time Spent: 10m 
  Work Description: majdyz opened a new pull request #3132:
URL: https://github.com/apache/hadoop/pull/3132


   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.)
   For more details, please see 
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
   



Issue Time Tracking
---

Worklog Id: (was: 613114)
Time Spent: 2.5h  (was: 2h 20m)



[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=613072&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-613072
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 22/Jun/21 07:44
Start Date: 22/Jun/21 07:44
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3132:
URL: https://github.com/apache/hadoop/pull/3132#issuecomment-865392265


   :confetti_ball: **+1 overall**
   
   | Vote | Subsystem | Runtime | Logfile | Comment |
   |:----:|----------:|--------:|:-------:|:-------:|
   | +0 :ok: | reexec | 1m 6s | | Docker mode activated. |
   |||| _ Prechecks _ |
   | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
   | +0 :ok: | codespell | 0m 1s | | codespell was not available. |
   | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
   | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: | mvninstall | 36m 51s | | trunk passed |
   | +1 :green_heart: | compile | 0m 51s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | compile | 0m 44s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | checkstyle | 0m 31s | | trunk passed |
   | +1 :green_heart: | mvnsite | 0m 49s | | trunk passed |
   | +1 :green_heart: | javadoc | 0m 27s | | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | javadoc | 0m 37s | | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | spotbugs | 1m 21s | | trunk passed |
   | +1 :green_heart: | shadedclient | 18m 39s | | branch has no errors when building and testing our client artifacts. |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: | mvninstall | 0m 39s | | the patch passed |
   | +1 :green_heart: | compile | 0m 45s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | javac | 0m 45s | | the patch passed |
   | +1 :green_heart: | compile | 0m 31s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | javac | 0m 31s | | the patch passed |
   | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
   | -0 :warning: | checkstyle | 0m 21s | [/results-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3132/1/artifact/out/results-checkstyle-hadoop-tools_hadoop-aws.txt) | hadoop-tools/hadoop-aws: The patch generated 20 new + 9 unchanged - 0 fixed = 29 total (was 9) |
   | +1 :green_heart: | mvnsite | 0m 41s | | the patch passed |
   | +1 :green_heart: | javadoc | 0m 16s | | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
   | +1 :green_heart: | javadoc | 0m 26s | | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | +1 :green_heart: | spotbugs | 1m 16s | | the patch passed |
   | +1 :green_heart: | shadedclient | 18m 26s | | patch has no errors when building and testing our client artifacts. |
   |||| _ Other Tests _ |
   | +1 :green_heart: | unit | 2m 45s | | hadoop-aws in the patch passed. |
   | +1 :green_heart: | asflicense | 0m 36s | | The patch does not generate ASF License warnings. |
   | | | 89m 27s | | |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3132/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3132 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux bb656ba2e052 4.15.0-136-generic #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 02ec4ae6b78f05549ccb2428e31e7ff3357f1502 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3132/1/testReport/ |
   | Max. process+thread count | 622 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
   | 

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=612903&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612903
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 21/Jun/21 22:45
Start Date: 21/Jun/21 22:45
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3132:
URL: https://github.com/apache/hadoop/pull/3132#issuecomment-865392265



[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=612876&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612876
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 21/Jun/21 21:14
Start Date: 21/Jun/21 21:14
Worklog Time Spent: 10m 
  Work Description: majdyz opened a new pull request #3132:
URL: https://github.com/apache/hadoop/pull/3132



Issue Time Tracking
---

Worklog Id: (was: 612876)
Time Spent: 2h  (was: 1h 50m)



[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=612637&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612637
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 21/Jun/21 13:57
Start Date: 21/Jun/21 13:57
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on a change in pull request 
#3109:
URL: https://github.com/apache/hadoop/pull/3109#discussion_r655400585



##
File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##
@@ -1439,10 +1439,13 @@ public S3Object getObject(GetObjectRequest request) {
    * using FS state as well as the status.
    * @param fileStatus file status.
    * @param seekPolicy input policy for this operation
+   * @param changePolicy change policy for this operation.
    * @param readAheadRange readahead value.
+   * @param auditSpan audit span.
    * @return a context for read and select operations.
    */
-  private S3AReadOpContext createReadContext(
+  @VisibleForTesting
+  protected S3AReadOpContext createReadContext(

Review comment:
   Afraid we currently do. However, if we move to a builder API for that 
ReadOpContext then the test could construct something very minimal (would only 
need the Invoker ref). I'd support that change here as it would help future 
tests.
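
As a rough illustration of the builder idea raised in this review comment, the sketch below shows how a test could build a context from the invoker alone. Everything in it is hypothetical: `ReadContextSketch`, `RetryHook`, the field names and the defaults are invented for the example and are not the real `S3AReadOpContext` or `Invoker` API.

```java
import java.util.Objects;

/** Hypothetical stand-in for the retry invoker the read path depends on. */
interface RetryHook {
  int maxAttempts();
}

/** Hypothetical read context; field names and defaults are illustrative only. */
final class ReadContextSketch {
  private final RetryHook invoker;
  private final long readahead;
  private final String inputPolicy;

  private ReadContextSketch(Builder b) {
    this.invoker = b.invoker;
    this.readahead = b.readahead;
    this.inputPolicy = b.inputPolicy;
  }

  static Builder builder() {
    return new Builder();
  }

  RetryHook getInvoker() {
    return invoker;
  }

  long getReadahead() {
    return readahead;
  }

  String getInputPolicy() {
    return inputPolicy;
  }

  static final class Builder {
    private RetryHook invoker;
    private long readahead = 64 * 1024;     // arbitrary default for this sketch
    private String inputPolicy = "normal";  // arbitrary default for this sketch

    Builder withInvoker(RetryHook value) {
      this.invoker = value;
      return this;
    }

    Builder withReadahead(long value) {
      this.readahead = value;
      return this;
    }

    Builder withInputPolicy(String value) {
      this.inputPolicy = value;
      return this;
    }

    /** Only the invoker is mandatory; everything else falls back to a default. */
    ReadContextSketch build() {
      Objects.requireNonNull(invoker, "invoker is required");
      return new ReadContextSketch(this);
    }
  }
}
```

A test would then need a single line such as `ReadContextSketch.builder().withInvoker(() -> 3).build()`, with no file status, change policy or audit span to set up.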
   


[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=612590&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612590
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 21/Jun/21 12:07
Start Date: 21/Jun/21 12:07
Worklog Time Spent: 10m 
  Work Description: mukund-thakur commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-864980696


   Really a corner-case scenario. Nice catch. 
   Reviewed the code. It looks good. 
   From what I understand, we now catch the SocketTimeoutException during the 
second read as well, re-open the stream, and rethrow the exception. 
   
   Though I was wondering: can the same be achieved by always re-opening the 
stream and throwing the exception so that the invoker does the retry? 
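
A minimal sketch of the alternative asked about here, under stated assumptions: `SimpleInvoker` and `ReopeningReader` are hypothetical stand-ins and not the Hadoop `Invoker` or `S3AInputStream`, the retry limit of four attempts is arbitrary, and the simulated connection consumes a byte before it fails. The read path never retries by itself; on any `IOException` it discards the connection and rethrows, and the next attempt made by the invoker reopens at the caller's position.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical invoker: re-runs an operation until it succeeds or attempts run out. */
final class SimpleInvoker {
  interface IntOperation {
    int run() throws IOException;
  }

  static int retry(int maxAttempts, IntOperation op) throws IOException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return op.run();
      } catch (IOException e) {
        last = e;  // a real retry policy would sleep here between attempts
      }
    }
    throw last;
  }
}

/** Simulated stream that drops its connection on every failure and rethrows. */
final class ReopeningReader {
  private final byte[] data;
  private int failuresLeft;
  private int pos;           // bytes successfully handed to the caller
  private int wirePos = -1;  // -1 means no connection is open
  private int openCount;

  ReopeningReader(String text, int failures) {
    this.data = text.getBytes(StandardCharsets.UTF_8);
    this.failuresLeft = failures;
  }

  private int rawRead() throws IOException {
    if (wirePos < 0) {  // lazily (re)open at the caller's position
      wirePos = pos;
      openCount++;
    }
    if (wirePos >= data.length) {
      return -1;
    }
    int b = data[wirePos++] & 0xff;
    if (failuresLeft > 0) {
      failuresLeft--;
      throw new IOException("Connection reset");
    }
    return b;
  }

  int read() throws IOException {
    int b = SimpleInvoker.retry(4, () -> {
      try {
        return rawRead();
      } catch (IOException e) {
        wirePos = -1;  // always discard the broken connection, then let the invoker retry
        throw e;
      }
    });
    if (b >= 0) {
      pos++;
    }
    return b;
  }

  int getOpenCount() {
    return openCount;
  }

  /** Reads to the end of the stream, wrapping a final failure as unchecked. */
  static String drain(ReopeningReader reader) {
    StringBuilder sb = new StringBuilder();
    try {
      int b;
      while ((b = reader.read()) >= 0) {
        sb.append((char) b);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sb.toString();
  }
}
```

Because the reopen decision lives in one place, there is no distinction between the first and later failures: `new ReopeningReader("ab", 2)` drains to `ab` after three opens, while a reader that fails more often than the invoker allows surfaces the last exception.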



Issue Time Tracking
---

Worklog Id: (was: 612590)
Time Spent: 1h 40m  (was: 1.5h)



[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=612571&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612571
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 21/Jun/21 11:40
Start Date: 21/Jun/21 11:40
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-864965245


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  30m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 46s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 31s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 48s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 28s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 10s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  14m 22s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 38s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | -1 :x: |  javac  |   0m 38s | [/results-compile-javac-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/8/artifact/out/results-compile-javac-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-tools_hadoop-aws-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 generated 1 new + 20 unchanged - 1 fixed = 21 total (was 21)  |
   | +1 :green_heart: |  compile  |   0m 31s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | -1 :x: |  javac  |   0m 31s | [/results-compile-javac-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/8/artifact/out/results-compile-javac-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt) |  hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 1 new + 20 unchanged - 1 fixed = 21 total (was 21)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 22s | [/results-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/8/artifact/out/results-checkstyle-hadoop-tools_hadoop-aws.txt) |  hadoop-tools/hadoop-aws: The patch generated 27 new + 9 unchanged - 0 fixed = 36 total (was 9)  |
   | +1 :green_heart: |  mvnsite  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 17s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 26s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  14m  9s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 26s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 34s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  73m 36s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/8/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3109 |
   | Optional Tests | d

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=612546&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612546
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 21/Jun/21 11:00
Start Date: 21/Jun/21 11:00
Worklog Time Spent: 10m 
  Work Description: majdyz commented on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-864940353


   Hi @bogthe, thanks for the feedback. 
   Sorry for the late reply. I'm running the tests right now and will 
report back soon.
   
   On the testing part:
   
   - Aside from `.getObjectContent()`, the test also needs the current 
implementation of the retry policy (the part that adds a delay before the 
next retry), which is attached to the `S3AReadOpContext`. I tried to keep the 
mocking to a minimum by only setting the behaviour of the input stream, but 
to attach it we have to go through `InputStreamCallbacks` -> `S3Object` -> 
`S3ObjectInputStream`, so two small mocks are needed before that. Let me know 
if there is a better approach.
   - For the failure scenario, I provided two tests where the current 
implementation breaks. Let me know if there is any other scenario the tests 
should cover.
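
The "delay before the next retry" mentioned in the comment above can be sketched as a fixed-interval retry loop. This is only an illustration of the idea under stated assumptions: `FixedIntervalRetry`, `IoCall` and the parameter names are hypothetical and are not Hadoop's retry policy classes, and the interval would correspond to a setting such as `fs.s3a.retry.interval`.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

/** Hypothetical helper: retry an I/O call with a fixed pause between attempts. */
final class FixedIntervalRetry {

  @FunctionalInterface
  interface IoCall<T> {
    T call() throws IOException;
  }

  /**
   * Runs the call, retrying up to maxRetries times after the first failure
   * and sleeping intervalMillis before each retry.
   */
  static <T> T run(IoCall<T> call, int maxRetries, long intervalMillis)
      throws IOException {
    int failures = 0;
    while (true) {
      try {
        return call.call();
      } catch (IOException e) {
        if (failures++ >= maxRetries) {
          throw e; // retries exhausted: surface the last failure
        }
        try {
          Thread.sleep(intervalMillis);
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt(); // preserve the interrupt flag
          throw new InterruptedIOException("interrupted while waiting to retry");
        }
      }
    }
  }
}
```

In a test, passing a very small interval keeps the delay logic on the code path without slowing the test down.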



Issue Time Tracking
---

Worklog Id: (was: 612546)
Time Spent: 1h 20m  (was: 1h 10m)



[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=612540&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612540
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 21/Jun/21 10:31
Start Date: 21/Jun/21 10:31
Worklog Time Spent: 10m 
  Work Description: majdyz commented on a change in pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#discussion_r655261509



##
File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##
@@ -1439,10 +1439,13 @@ public S3Object getObject(GetObjectRequest request) {
* using FS state as well as the status.
* @param fileStatus file status.
* @param seekPolicy input policy for this operation
+   * @param changePolicy change policy for this operation.
* @param readAheadRange readahead value.
+   * @param auditSpan audit span.
* @return a context for read and select operations.
*/
-  private S3AReadOpContext createReadContext(
+  @VisibleForTesting
+  protected S3AReadOpContext createReadContext(

Review comment:
   The read context carries the retry policy and the retry logic used by the 
input stream, so this seems to be the entry point a test can use without 
exposing many internal implementation details. 





Issue Time Tracking
---

Worklog Id: (was: 612540)
Time Spent: 1h 10m  (was: 1h)



[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=612538&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612538
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 21/Jun/21 10:29
Start Date: 21/Jun/21 10:29
Worklog Time Spent: 10m 
  Work Description: majdyz commented on a change in pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#discussion_r655260406



##
File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java
##
@@ -396,6 +396,41 @@ private void incrementBytesRead(long bytesRead) {
 }
   }
 
+  @FunctionalInterface
+  interface CheckedIntSupplier {
+    int get() throws IOException;
+  }
+
+  /**
+   * Helper function that allows to retry an IntSupplier in case of `IOException`.
+   * This function is used by `read()` and `read(buf, off, len)` functions. It tries to run
+   * `readFn` and in case of `IOException`:
+   *   1. If it gets an EOFException, return -1
+   *   2. Else, run `onReadFailure` and retry running `readFn`. If it fails again,
+   *   we run `onReadFailure` and re-throw the error.
+   * @param readFn the function to read, it must return an integer
+   * @param length length of data being attempted to read
+   * @return -1 if `readFn` throws EOFException, else returns int value from the result of `readFn`
+   * @throws IOException if retry of `readFn` also fails with `IOException`
+   */
+  private int retryReadOnce(CheckedIntSupplier readFn, int length) throws IOException {
+    try {
+      return readFn.get();
+    } catch (EOFException e) {
+      return -1;
+    } catch (IOException e) {
+      onReadFailure(e, length, e instanceof SocketTimeoutException);

Review comment:
   This method is used by both `read()` and `read(b, off, len)`: we pass 
length = 1 for `read()` and the caller's length for `read(b, off, len)`. This 
is intended to keep the current behaviour.





Issue Time Tracking
---

Worklog Id: (was: 612538)
Time Spent: 1h  (was: 50m)



[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=612535&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612535
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 21/Jun/21 10:27
Start Date: 21/Jun/21 10:27
Worklog Time Spent: 10m 
  Work Description: majdyz commented on a change in pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#discussion_r655258977



##
File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AInputStreamRetry.java
##
@@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import javax.net.ssl.SSLException;
+import java.io.IOException;
+import java.net.SocketException;
+import java.nio.charset.Charset;
+
+import com.amazonaws.services.s3.model.GetObjectRequest;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import com.amazonaws.services.s3.model.S3Object;
+import com.amazonaws.services.s3.model.S3ObjectInputStream;
+import org.junit.Test;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.s3a.audit.impl.NoopSpan;
+import org.apache.hadoop.fs.s3a.auth.delegation.EncryptionSecrets;
+import org.apache.hadoop.fs.s3a.impl.ChangeDetectionPolicy;
+
+import static java.lang.Math.min;
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+
+/**
+ * Tests S3AInputStream retry behavior on read failure.
+ * These tests are for validating expected behavior of retrying the S3AInputStream
+ * read() and read(b, off, len), it tests that the read should reopen the input stream and retry
+ * the read when IOException is thrown during the read process.
+ */
+public class TestS3AInputStreamRetry extends AbstractS3AMockTest {
+
+  String input = "ab";
+
+  @Test
+  public void testInputStreamReadRetryForException() throws IOException {
+    S3AInputStream s3AInputStream = getMockedS3AInputStream();
+
+    assertEquals("'a' from the test input stream 'ab' should be the first character being read",
+        input.charAt(0), s3AInputStream.read());
+    assertEquals("'b' from the test input stream 'ab' should be the second character being read",
+        input.charAt(1), s3AInputStream.read());
+  }
+
+  @Test
+  public void testInputStreamReadRetryLengthForException() throws IOException {
+    byte[] result = new byte[input.length()];
+    S3AInputStream s3AInputStream = getMockedS3AInputStream();
+    s3AInputStream.read(result, 0, input.length());
+
+    assertArrayEquals("The read result should equals to the test input stream content",
+        input.getBytes(), result);
+  }
+
+  private S3AInputStream getMockedS3AInputStream() {
+    Path path = new Path("test-path");
+    String eTag = "test-etag";
+    String versionId = "test-version-id";
+    String owner = "test-owner";
+
+    S3AFileStatus s3AFileStatus = new S3AFileStatus(
+        input.length(), 0, path, input.length(), owner, eTag, versionId);
+
+    S3ObjectAttributes s3ObjectAttributes = new S3ObjectAttributes(
+        fs.getBucket(), path, fs.pathToKey(path), fs.getServerSideEncryptionAlgorithm(),
+        new EncryptionSecrets().getEncryptionKey(), eTag, versionId, input.length());
+
+    S3AReadOpContext s3AReadOpContext = fs.createReadContext(s3AFileStatus, S3AInputPolicy.Normal,
+        ChangeDetectionPolicy.getPolicy(fs.getConf()), 100, NoopSpan.INSTANCE);
+
+    return new S3AInputStream(s3AReadOpContext, s3ObjectAttributes, getMockedInputStreamCallback());
+  }
+
+  // Get mocked InputStreamCallbacks where we return mocked S3Object
+  private S3AInputStream.InputStreamCallbacks getMockedInputStreamCallback() {
+    return new S3AInputStream.InputStreamCallbacks() {
+
+      final S3Object mockedS3Object = getMockedS3Object();
+
+      @Override
+      public S3Object getObject(GetObjectRequest request) {
+        // Set s3 client to return mocked s3object with already defined read behavior
+        return mockedS3Object;
+      }
+
+      @Override
+  public GetObje

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=612532&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612532
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 21/Jun/21 10:18
Start Date: 21/Jun/21 10:18
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-863242835


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m  0s |  |  Docker mode activated.  |
   | -1 :x: |  patch  |   0m 17s |  |  https://github.com/apache/hadoop/pull/3109 does not apply to trunk. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help.  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/3109 |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/6/console |
   | versions | git=2.17.1 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   



Issue Time Tracking
---

Worklog Id: (was: 612532)
Time Spent: 40m  (was: 0.5h)



[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=612531&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612531
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 21/Jun/21 10:18
Start Date: 21/Jun/21 10:18
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-862336119







Issue Time Tracking
---

Worklog Id: (was: 612531)
Time Spent: 0.5h  (was: 20m)



[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=612527&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612527
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 21/Jun/21 10:00
Start Date: 21/Jun/21 10:00
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus removed a comment on pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#issuecomment-862319524


   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 52s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 44s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 45s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   0m 30s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 46s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 25s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   0m 33s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   1m  9s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  14m 45s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 31s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 20s | [/results-checkstyle-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/2/artifact/out/results-checkstyle-hadoop-tools_hadoop-aws.txt) |  hadoop-tools/hadoop-aws: The patch generated 20 new + 9 unchanged - 0 fixed = 29 total (was 9)  |
   | +1 :green_heart: |  mvnsite  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 15s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | -1 :x: |  javadoc  |   0m 25s | [/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/2/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10.txt) |  hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 generated 3 new + 63 unchanged - 0 fixed = 66 total (was 63)  |
   | +1 :green_heart: |  spotbugs  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  14m 40s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 28s |  |  hadoop-aws in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 34s |  |  The patch does not generate ASF License warnings.  |
   |  |   |  76m 14s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3109/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3109 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 63f8d712e9e2 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 163bfd49e47f393dbbc4c1257b15d66c312aa4d6 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-

[jira] [Work logged] (HADOOP-17764) S3AInputStream read does not re-open the input stream on the second read retry attempt

2021-06-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17764?focusedWorklogId=611897&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-611897
 ]

ASF GitHub Bot logged work on HADOOP-17764:
---

Author: ASF GitHub Bot
Created on: 18/Jun/21 20:54
Start Date: 18/Jun/21 20:54
Worklog Time Spent: 10m 
  Work Description: bogthe commented on a change in pull request #3109:
URL: https://github.com/apache/hadoop/pull/3109#discussion_r653931489



##
File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java
##
@@ -396,6 +396,41 @@ private void incrementBytesRead(long bytesRead) {
     }
   }
 
+  @FunctionalInterface
+  interface CheckedIntSupplier {
+    int get() throws IOException;
+  }
+
+  /**
+   * Helper function that allows to retry an IntSupplier in case of `IOException`.
+   * This function is used by `read()` and `read(buf, off, len)` functions. It tries to run
+   * `readFn` and in case of `IOException`:
+   *   1. If it gets an EOFException, return -1
+   *   2. Else, run `onReadFailure` and retry running `readFn`. If it fails again,
+   *   we run `onReadFailure` and re-throw the error.
+   * @param readFn the function to read, it must return an integer
+   * @param length length of data being attempted to read
+   * @return -1 if `readFn` throws EOFException, else returns int value from the result of `readFn`
+   * @throws IOException if retry of `readFn` also fails with `IOException`
+   */
+  private int retryReadOnce(CheckedIntSupplier readFn, int length) throws IOException {
+    try {
+      return readFn.get();
+    } catch (EOFException e) {
+      return -1;
+    } catch (IOException e) {
+      onReadFailure(e, length, e instanceof SocketTimeoutException);

Review comment:
   I see you're calling `onReadFailure` with `length` instead of `1`. What is the reasoning behind this?
   
   That value is used to calculate the range for the `GetObjectRequest` when the stream is reopened. If the change is intentional, I'd be curious about its impact on larger objects. Have you done any testing around it?
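
To make the question above concrete, here is a minimal Java sketch of the retry-once pattern described in the javadoc, together with a simplified range calculation. It is not the Hadoop implementation: `RetryOnceSketch`, `FailureHandler`, and `rangeEnd` are hypothetical names, the handler stands in for `onReadFailure`, and the rule "request at least the readahead, or `length` if larger, capped at the object size" is an assumption about how the reopen range is derived. Returning -1 when the retry itself hits EOF is also an assumption.

```java
import java.io.EOFException;
import java.io.IOException;

/** Illustrative sketch only; names and range rule are assumptions. */
final class RetryOnceSketch {

  /** Like IntSupplier, but allowed to throw IOException. */
  @FunctionalInterface
  interface CheckedIntSupplier {
    int get() throws IOException;
  }

  /** Hypothetical stand-in for onReadFailure: sees the requested length. */
  @FunctionalInterface
  interface FailureHandler {
    void onReadFailure(IOException cause, int length) throws IOException;
  }

  /**
   * Runs readFn. EOF maps to -1. Any other IOException triggers the
   * handler (which would reopen the stream) and exactly one retry.
   * If the retry fails too, the handler runs again and the error is rethrown.
   */
  static int retryReadOnce(CheckedIntSupplier readFn, int length,
      FailureHandler handler) throws IOException {
    try {
      return readFn.get();
    } catch (EOFException e) {
      return -1;
    } catch (IOException e) {
      handler.onReadFailure(e, length);
      try {
        return readFn.get();
      } catch (EOFException e2) {
        return -1; // assumption: EOF on the retry is also end of stream
      } catch (IOException e2) {
        handler.onReadFailure(e2, length);
        throw e2;
      }
    }
  }

  /**
   * Assumed rule for the exclusive end of the range requested on reopen:
   * at least readahead bytes, or length if larger, never past the object end.
   */
  static long rangeEnd(long pos, long length, long readahead,
      long contentLength) {
    return Math.min(contentLength, pos + Math.max(readahead, length));
  }
}
```

Under that assumed rule, passing `1` versus `length` only changes the requested range when `length` exceeds the readahead: for example, `rangeEnd(100, 1, 64, 1000)` and `rangeEnd(100, 32, 64, 1000)` both give 164, while `rangeEnd(100, 500, 64, 1000)` gives 600.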

##
File path: 
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##
@@ -1439,10 +1439,13 @@ public S3Object getObject(GetObjectRequest request) {
    * using FS state as well as the status.
    * @param fileStatus file status.
    * @param seekPolicy input policy for this operation
+   * @param changePolicy change policy for this operation.
    * @param readAheadRange readahead value.
+   * @param auditSpan audit span.
    * @return a context for read and select operations.
    */
-  private S3AReadOpContext createReadContext(
+  @VisibleForTesting
+  protected S3AReadOpContext createReadContext(

Review comment:
   I'm not really convinced that this is needed. Check the main comment for details.

##
File path: 
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/TestS3AInputStreamRetry.java
##
@@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a;
+
+import javax.net.ssl.SSLException;
+import java.io.IOException;
+import java.net.SocketException;
+import java.nio.charset.Charset;
+
+import com.amazonaws.services.s3.model.GetObjectRequest;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import com.amazonaws.services.s3.model.S3Object;
+import com.amazonaws.services.s3.model.S3ObjectInputStream;
+import org.junit.Test;
+
+import org.apache.commons.io.IOUtils;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.s3a.audit.impl.NoopSpan;
+import org.apache.hadoop.fs.s3a.auth.delegation.EncryptionSecrets;
+import org.apache.hadoop.fs.s3a.impl.ChangeDetectionPolicy;
+
+import static java.lang.Math.min;
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+
+/**
+ * Tests S3AInputStream retry behavior on read failure.
+ * These tests are for validating expected behavior of retrying the S3AInputStream
+ * read() and read(b, off, len), it tests that the read should reopen the input stream and