[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695471#comment-17695471
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

hadoop-yetus commented on PR #5445:
URL: https://github.com/apache/hadoop/pull/5445#issuecomment-1451337470

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 52s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  41m 41s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 31s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 28s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 40s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  26m  3s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 26s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 54s |  |  
hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 86 unchanged - 2 
fixed = 86 total (was 88)  |
   | +1 :green_heart: |  mvnsite  |   1m 29s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 54s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 42s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   4m 35s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  36m 17s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 260m 44s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5445/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 57s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 388m 53s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeReconfiguration |
   |   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
   |   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithShortCircuitRead |
   |   | hadoop.hdfs.server.mover.TestMover |
   |   | hadoop.hdfs.server.mover.TestStorageMover |
   |   | hadoop.hdfs.server.blockmanagement.TestBlockManager |
   |   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   |   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5445/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5445 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux c7f13b8c3285 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 
18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / e7c82dbbd47d725de6e0cbd16f6ba90cb10bca7e |
   | Default Java | Private 

[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695441#comment-17695441
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

hadoop-yetus commented on PR #5445:
URL: https://github.com/apache/hadoop/pull/5445#issuecomment-1451264241

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 43s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  39m  3s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 28s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  5s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 32s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 10s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 25s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 50s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 15s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |   1m 19s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 53s |  |  
hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 85 unchanged - 2 
fixed = 85 total (was 87)  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 14s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 18s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 208m 13s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 45s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 315m 23s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5445/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5445 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux d5bbeba1b09f 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 
18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 56459b337c0160106aaef5a8fb50fb51b963cb39 |
   | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5445/2/testReport/ |
   | Max. process+thread count | 2836 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5445/2/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695422#comment-17695422
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

hadoop-yetus commented on PR #5445:
URL: https://github.com/apache/hadoop/pull/5445#issuecomment-1451224598

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 41s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  41m 11s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 33s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |   1m 29s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  8s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 36s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 12s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 44s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m  3s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 49s |  |  
hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 85 unchanged - 2 
fixed = 85 total (was 87)  |
   | +1 :green_heart: |  mvnsite  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 49s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 21s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m  5s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 205m 36s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 49s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 318m 21s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5445/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5445 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 52247ef2efc5 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 
18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 64dc8ea3920f2a5b1ca01c41cc036d118e95f9f6 |
   | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5445/1/testReport/ |
   | Max. process+thread count | 3845 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5445/1/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   





[jira] [Commented] (HDFS-16934) org.apache.hadoop.hdfs.tools.TestDFSAdmin#testAllDatanodesReconfig regression

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695419#comment-17695419
 ] 

ASF GitHub Bot commented on HDFS-16934:
---

hadoop-yetus commented on PR #5434:
URL: https://github.com/apache/hadoop/pull/5434#issuecomment-1451187716

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 46s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  38m 36s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 26s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 33s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  9s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 28s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 40s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 51s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 27s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 12s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 19s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 206m 57s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5434/6/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 51s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 313m 40s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5434/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5434 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 45aaa91fd3c9 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 
18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / b525d2854d098d4ad3a7877fa13ebf6343c5538c |
   | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5434/6/testReport/ |
   | Max. process+thread count | 2973 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 

[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695400#comment-17695400
 ] 

ASF GitHub Bot commented on HDFS-16896:
---

hadoop-yetus commented on PR #5322:
URL: https://github.com/apache/hadoop/pull/5322#issuecomment-1451124197

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 43s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 25s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 38s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   6m  1s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |   5m 39s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m 18s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 30s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 52s |  |  trunk passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   2m 17s |  |  trunk passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   5m 57s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 26s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  5s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 55s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |   5m 55s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 36s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   5m 36s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m  6s | 
[/results-checkstyle-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5322/10/artifact/out/results-checkstyle-hadoop-hdfs-project.txt)
 |  hadoop-hdfs-project: The patch generated 1 new + 41 unchanged - 1 fixed = 
42 total (was 42)  |
   | +1 :green_heart: |  mvnsite  |   2m  9s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  the patch passed with JDK 
Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 56s |  |  the patch passed with JDK 
Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   5m 49s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 33s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 27s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  | 203m 50s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5322/10/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 50s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 344m 52s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5322/10/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5322 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux a927011dee89 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 
18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 40b77b2a52ef6faabcd3190738ec01418a6ca550 |
   | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | Multi-JDK versions 

[jira] [Resolved] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read

2023-03-01 Thread Tom McCormick (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom McCormick resolved HDFS-16896.
--
Resolution: Fixed

> HDFS Client hedged read has increased failure rate than without hedged read
> ---
>
> Key: HDFS-16896
> URL: https://issues.apache.org/jira/browse/HDFS-16896
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Tom McCormick
>Assignee: Tom McCormick
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> When hedged read is enabled by HDFS client, we see an increased failure rate 
> on reads.
> *stacktrace*
>  
> {code:java}
> Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1183972111-10.197.192.88-1590025572374:blk_17114848218_16043459722 file=/data/tracking/streaming/AdImpressionEvent/daily/2022/07/18/compaction_1/part-r-1914862.1658217125623.1362294472.orc
> at org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1077)
> at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1060)
> at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1039)
> at org.apache.hadoop.hdfs.DFSInputStream.hedgedFetchBlockByteRange(DFSInputStream.java:1365)
> at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1572)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1535)
> at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
> at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
> at org.apache.hadoop.fs.RetryingInputStream.lambda$readFully$3(RetryingInputStream.java:172)
> at org.apache.hadoop.fs.RetryPolicy.lambda$run$0(RetryPolicy.java:137)
> at org.apache.hadoop.fs.NoOpRetryPolicy.run(NoOpRetryPolicy.java:36)
> at org.apache.hadoop.fs.RetryPolicy.run(RetryPolicy.java:136)
> at org.apache.hadoop.fs.RetryingInputStream.readFully(RetryingInputStream.java:168)
> at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
> at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
> at io.trino.plugin.hive.orc.HdfsOrcDataSource.readInternal(HdfsOrcDataSource.java:76)
> ... 46 more
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695395#comment-17695395
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

virajjasani commented on PR #5445:
URL: https://github.com/apache/hadoop/pull/5445#issuecomment-1451105473

   > If someone removes processQueueMessages itself from sendHeartbeat, then 
this test should also fail, or at least some test should
   
   +1




> Utility to trigger heartbeat and wait until BP thread queue is fully processed
> --
>
> Key: HDFS-16938
> URL: https://issues.apache.org/jira/browse/HDFS-16938
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Viraj Jasani
>Assignee: Viraj Jasani
>Priority: Major
>  Labels: pull-request-available
>
> As a follow-up to HDFS-16935, we should provide a utility to trigger a heartbeat 
> and wait until the BP thread queue is fully processed. This would ensure 100% 
> consistency w.r.t. the active namenode being able to receive bad block reports 
> from the given datanode. This utility would resolve flakes in the tests that 
> rely on the namenode's awareness of the bad blocks reported by datanodes.
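
For illustration only, the idea in the description above (trigger a heartbeat, then wait until the BP thread queue is fully processed) can be sketched with a plain single-threaded work queue. This is not the code from PR #5445 and uses no Hadoop classes: `BpQueueSketch`, `enqueue`, and `triggerHeartbeatAndWait` are hypothetical names, and modelling the heartbeat as one more queued action is an assumption made to keep the sketch self-contained.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a block pool (BP) service thread; not a Hadoop class.
class BpQueueSketch {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    // Actions that were enqueued but have not finished running yet.
    private final AtomicInteger pending = new AtomicInteger();
    private final Thread worker;

    BpQueueSketch() {
        worker = new Thread(this::drainLoop, "bp-queue-sketch");
        worker.setDaemon(true);
        worker.start();
    }

    // Single consumer: runs queued actions one at a time, in FIFO order.
    private void drainLoop() {
        try {
            while (true) {
                Runnable action = queue.take(); // blocks until work arrives
                try {
                    action.run();
                } finally {
                    pending.decrementAndGet();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // shutdown() was called
        }
    }

    // Queue an action (for example, a bad block report) for the worker thread.
    void enqueue(Runnable action) {
        pending.incrementAndGet();
        queue.add(action);
    }

    // Queue the heartbeat, then poll until every queued action has completed.
    // Returns false on timeout or if the calling thread is interrupted.
    boolean triggerHeartbeatAndWait(Runnable heartbeat, long timeoutMillis) {
        enqueue(heartbeat);
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (pending.get() > 0) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    void shutdown() {
        worker.interrupt();
    }
}
```

A test built on such a helper would enqueue the report, call `triggerHeartbeatAndWait(...)`, and assert on the observed state only after it returns true, rather than sleeping for a fixed interval and hoping the queue has drained.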






[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695394#comment-17695394
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

virajjasani commented on PR #5445:
URL: https://github.com/apache/hadoop/pull/5445#issuecomment-1451104757

   > It is inducing a race by nextHeartbeatTime
   
   Absolutely, that's what I thought too. But yes, you are right: other than 
adding sleeps, it's a bit tricky to reproduce. Our Jenkins builds are like 
that; some of the weirdness can be reproduced locally only if we inject sleeps. 
Anyway, as long as the daily builds stay happy, it's fine.
   Thanks again :)










[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695393#comment-17695393
 ] 

ASF GitHub Bot commented on HDFS-16896:
---

omalley merged PR #5444:
URL: https://github.com/apache/hadoop/pull/5444




> HDFS Client hedged read has increased failure rate than without hedged read
> ---
>
> Key: HDFS-16896
> URL: https://issues.apache.org/jira/browse/HDFS-16896
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Tom McCormick
>Assignee: Tom McCormick
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> When hedged read is enabled by HDFS client, we see an increased failure rate 
> on reads.
> *stacktrace*
>  
> {code:java}
> Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1183972111-10.197.192.88-1590025572374:blk_17114848218_16043459722 file=/data/tracking/streaming/AdImpressionEvent/daily/2022/07/18/compaction_1/part-r-1914862.1658217125623.1362294472.orc
>     at org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1077)
>     at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1060)
>     at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1039)
>     at org.apache.hadoop.hdfs.DFSInputStream.hedgedFetchBlockByteRange(DFSInputStream.java:1365)
>     at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1572)
>     at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1535)
>     at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
>     at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
>     at org.apache.hadoop.fs.RetryingInputStream.lambda$readFully$3(RetryingInputStream.java:172)
>     at org.apache.hadoop.fs.RetryPolicy.lambda$run$0(RetryPolicy.java:137)
>     at org.apache.hadoop.fs.NoOpRetryPolicy.run(NoOpRetryPolicy.java:36)
>     at org.apache.hadoop.fs.RetryPolicy.run(RetryPolicy.java:136)
>     at org.apache.hadoop.fs.RetryingInputStream.readFully(RetryingInputStream.java:168)
>     at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
>     at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
>     at io.trino.plugin.hive.orc.HdfsOrcDataSource.readInternal(HdfsOrcDataSource.java:76)
>     ... 46 more
> {code}
>  
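
For readers unfamiliar with the term, the general idea of a hedged read can be sketched as follows. This is a minimal illustration under stated assumptions, not the `DFSInputStream` code: the class `HedgedReadSketch`, the method `hedgedRead`, and the two `Callable` "replicas" are hypothetical names made up for this example. The primary read is started first; if it has not finished within a threshold, a second read is raced against it, and the first successful result wins.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Illustrative sketch only; this is not the DFSInputStream implementation. */
class HedgedReadSketch {

  /**
   * Runs the primary read and, if it is slow or fails, a backup read.
   * Returns the first successful result; throws only if both attempts fail.
   */
  static String hedgedRead(Callable<String> primary, Callable<String> backup,
      long thresholdMillis) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    CompletionService<String> attempts = new ExecutorCompletionService<>(pool);
    try {
      attempts.submit(primary);
      int pending = 1;
      boolean backupStarted = false;
      Exception lastFailure = null;
      // Give the primary a head start of thresholdMillis.
      Future<String> done = attempts.poll(thresholdMillis, TimeUnit.MILLISECONDS);
      while (true) {
        if (done != null) {
          pending--;
          try {
            return done.get();
          } catch (ExecutionException e) {
            // One failed attempt does not fail the read while another may succeed.
            lastFailure = e;
          }
        }
        if (!backupStarted) {
          // Primary was slow or failed: start the hedge.
          attempts.submit(backup);
          backupStarted = true;
          pending++;
        }
        if (pending == 0) {
          throw new IllegalStateException("all read attempts failed", lastFailure);
        }
        done = attempts.take();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while reading", e);
    } finally {
      // Cancel whichever attempt is still running.
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println(hedgedRead(
        () -> { Thread.sleep(5_000); return "primary"; }, () -> "backup", 50));
  }
}
```

In this sketch the loop keeps waiting on the remaining attempt after one of them fails; that is a design choice of the example and is not meant to describe how the HDFS client was changed.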






[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695392#comment-17695392
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

ayushtkn commented on PR #5445:
URL: https://github.com/apache/hadoop/pull/5445#issuecomment-1451100991

   Playing a bit more with it: a single sleep doesn't repro it by itself. My wild 
guess is that it isn't testing what we started with. It is inducing a race on 
**nextHeartbeatTime**, and only because we induced sleeps; otherwise that is very 
tough to hit with the kind of code we have now.
   
   > we might rather want to wait for source code to do that so that if 
something changes in source code sequence or so, our test would be able to 
catch it 
   
   That is not the intention of the original test. It just waits for the message 
in the queue to be sent and processed, and then checks the namenode's response to 
that, i.e. whether the namenode acknowledges it or not. (I still feel we could have 
invoked **processQueueMessages** directly and saved some time, but let's see, 
the present code is also working.)
   
   If someone removes processQueueMessages itself from 
**sendHeartbeat**, this test should also fail, or at least some test should; beyond 
that, we can't guarantee everything...
   
   >but anyways nothing wrong with waiting for sometime and circling back to 
this when/if required.
   
   Yep, let's wait and see if there are any failures, then circle back in future 
and see what the potential solutions are.
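
To make "waiting for the message in the queue to be sent and processed" concrete, here is a minimal, self-contained sketch. It uses no Hadoop classes: `QueueDrainSketch`, `triggerHeartbeat`, and `triggerAndWaitUntilProcessed` are hypothetical names for this example, and the worker thread only stands in for a BP service actor. The point it illustrates is that an empty queue is not a sufficient condition, because an action can be dequeued but not yet finished; comparing a processed counter against an enqueued counter avoids that gap.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative sketch only; the worker stands in for an actor thread. */
class QueueDrainSketch {
  private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
  private final AtomicLong enqueued = new AtomicLong();
  private final AtomicLong processed = new AtomicLong();
  private final Object wakeup = new Object();
  private volatile boolean running = true;
  private final Thread worker = new Thread(this::loop, "actor-sketch");

  void start() {
    worker.setDaemon(true);
    worker.start();
  }

  void stop() {
    running = false;
    triggerHeartbeat();
  }

  void enqueue(Runnable action) {
    enqueued.incrementAndGet();  // counted before the action becomes visible
    queue.add(action);
  }

  /** Wakes the worker early instead of waiting for its next interval. */
  void triggerHeartbeat() {
    synchronized (wakeup) {
      wakeup.notifyAll();
    }
  }

  private void loop() {
    while (running) {
      Runnable action;
      while ((action = queue.poll()) != null) {
        action.run();
        processed.incrementAndGet();  // counted only after the action finished
      }
      synchronized (wakeup) {
        try {
          wakeup.wait(1_000);  // stand-in for the heartbeat interval
        } catch (InterruptedException e) {
          return;
        }
      }
    }
  }

  /** Returns true once everything enqueued before this call has been processed. */
  boolean triggerAndWaitUntilProcessed(long timeoutMillis) {
    long target = enqueued.get();
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (processed.get() < target) {
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      triggerHeartbeat();  // re-trigger in case an earlier wakeup was missed
      try {
        Thread.sleep(10);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    QueueDrainSketch actor = new QueueDrainSketch();
    actor.start();
    AtomicInteger reports = new AtomicInteger();
    actor.enqueue(reports::incrementAndGet);
    System.out.println(actor.triggerAndWaitUntilProcessed(5_000) + " " + reports.get());
    actor.stop();
  }
}
```

Whether such a helper is worth its extra wait time in real tests is exactly the trade-off under discussion in this thread; the sketch takes no position on it.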







[jira] [Updated] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist

2023-03-01 Thread Erik Krogen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-16923:
---
Fix Version/s: 3.4.0
   3.3.6

> The getListing RPC will throw NPE if the path does not exist
> 
>
> Key: HDFS-16923
> URL: https://issues.apache.org/jira/browse/HDFS-16923
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.6
>
>
> The getListing RPC will throw an NPE if the path does not exist. The stack 
> trace is as below:
> {code:java}
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): org.apache.hadoop.ipc.RemoteException(java.lang.NullPointerException): java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4195)
>     at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:1421)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:783)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:622)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:590)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:574)
>  {code}
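
The general failure mode behind a trace like this can be sketched in a few lines. The example below is hypothetical and is not the `FSNamesystem` code or the actual fix: `ListingNullGuardSketch`, its in-memory `namespace` map, and the `entriesReturned` counter are assumptions made for illustration. It shows a lookup that legitimately yields `null` for a missing path, followed by post-processing that must guard against that `null` before dereferencing the result.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; not the FSNamesystem implementation. */
class ListingNullGuardSketch {
  private final Map<String, List<String>> namespace = new HashMap<>();
  private long entriesReturned;  // stand-in for bookkeeping done after the lookup

  void mkdir(String path, String... children) {
    namespace.put(path, Arrays.asList(children));
  }

  /** Returns the children of the path, or null when the path does not exist. */
  List<String> getListing(String path) {
    List<String> listing = namespace.get(path);
    // Without this guard, listing.size() would throw a NullPointerException
    // for a missing path, even though null is a valid lookup result here.
    if (listing != null) {
      entriesReturned += listing.size();
    }
    return listing;
  }

  long entriesReturned() {
    return entriesReturned;
  }

  public static void main(String[] args) {
    ListingNullGuardSketch ns = new ListingNullGuardSketch();
    ns.mkdir("/data", "a", "b");
    System.out.println(ns.getListing("/data"));
    System.out.println(ns.getListing("/missing"));
  }
}
```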






[jira] [Resolved] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist

2023-03-01 Thread Erik Krogen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen resolved HDFS-16923.

Resolution: Fixed




[jira] [Commented] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695379#comment-17695379
 ] 

ASF GitHub Bot commented on HDFS-16923:
---

xkrogen merged PR #5400:
URL: https://github.com/apache/hadoop/pull/5400







[jira] [Commented] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695375#comment-17695375
 ] 

ASF GitHub Bot commented on HDFS-16923:
---

xkrogen commented on PR #5400:
URL: https://github.com/apache/hadoop/pull/5400#issuecomment-1451080156

   The only test failure is `TestDirectoryScanner.testThrottling`:
   ```
   [ERROR] Tests run: 13, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 575.031 s <<< FAILURE! - in org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner
   [ERROR] testThrottling(org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner)  Time elapsed: 162.513 s  <<< FAILURE!
   java.lang.AssertionError: Throttle is too permissive
   ```
   This one is a bit nondeterministic, as it actually runs things and 
checks how long they take. It doesn't look related.
   
   Merging to `trunk` and `branch-3.3`. Thanks @ZanderXu! I'm also going to 
see about getting this into 3.3.5, since we have 
[HDFS-16732](https://issues.apache.org/jira/browse/HDFS-16732) there and this 
is a pretty bad bug.







[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695372#comment-17695372
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

virajjasani commented on PR #5445:
URL: https://github.com/apache/hadoop/pull/5445#issuecomment-1451077691

   > If my intent is just for processQueueMessages, I will expose and just 
shoot that directly, rather than doing the whole loop.
   
   That would also work but as part of the test, we might rather want to wait 
for source code to do that so that if something changes in source code sequence 
or so, our test would be able to catch it (if the intention of the test is to 
wait for processQueueMessages to be successfully called and completed by source 
code).
   
   
   > Changing the existing uses with this would be slowing down tests, which 
are already above tolerable limits.
   
   For this PR, only `testReportBadBlocks` test is updated to use it, but 
anyways nothing wrong with waiting for sometime and circling back to this 
when/if required.
   Thanks Ayush :)







[jira] [Updated] (HDFS-16923) The getListing RPC will throw NPE if the path does not exist

2023-03-01 Thread Erik Krogen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-16923:
---
Priority: Critical  (was: Major)




[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695368#comment-17695368
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

ayushtkn commented on PR #5445:
URL: https://github.com/apache/hadoop/pull/5445#issuecomment-1451064227

   The first patch isn't something to consider by itself. The second one is very 
unrealistic in general: there is nothing above that we can blame for consuming 
time; those are some naive if checks or so, which can't take this long...
   I don't think we have a use case for this util for now. If my intent is just 
for processQueueMessages, I will expose and just shoot that directly, rather 
than doing the whole loop.
   Changing the existing uses with this would be slowing down tests, which are 
already above tolerable limits. 
   
   Let's hold it and observe; if we get something around this in future, we can 
circle back.







[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695363#comment-17695363
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

virajjasani commented on PR #5445:
URL: https://github.com/apache/hadoop/pull/5445#issuecomment-1451021126

   With the above patch applied together with this PR's changes, the test passes 
consistently, whereas without the PR changes it fails consistently (failed 7 
times locally without the PR changes, passed 5 times with them).







[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695362#comment-17695362
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

virajjasani commented on PR #5445:
URL: https://github.com/apache/hadoop/pull/5445#issuecomment-1451015537

   Another way I am able to repro consistently:
   
   ```
   diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
   index e9f424604b4..9b17a126da1 100755
   -
   ```




[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695358#comment-17695358
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

virajjasani commented on PR #5445:
URL: https://github.com/apache/hadoop/pull/5445#issuecomment-1451008842

   Though it's difficult to reproduce, I thought this utility would help tests 
ensure with 100% certainty that the namenode has received the report as part of 
`ReportBadBlockAction#reportTo`.







[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695356#comment-17695356
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

virajjasani commented on PR #5445:
URL: https://github.com/apache/hadoop/pull/5445#issuecomment-1450990755

   I tried multiple cases, and with some sleeps I am able to repro, but only 
sometimes.
   
   The only way I am able to consistently repro the failure is by applying this 
patch:
   ```
   diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
   index e9f424604b4..c39eca73f38 100755
   -
   ```




[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695343#comment-17695343
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

ayushtkn commented on PR #5445:
URL: https://github.com/apache/hadoop/pull/5445#issuecomment-1450903655

   Viraj, can you help me repro the scenario? I added a 10K sleep before the 
processEnqueed method last time and the test didn't fail for me; 
triggerHeartbeat was working there.







[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695337#comment-17695337
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

virajjasani commented on PR #5445:
URL: https://github.com/apache/hadoop/pull/5445#issuecomment-1450896182

   @ayushtkn @tomscut could you please review this PR?







[jira] [Commented] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695335#comment-17695335
 ] 

ASF GitHub Bot commented on HDFS-16938:
---

virajjasani opened a new pull request, #5445:
URL: https://github.com/apache/hadoop/pull/5445

   As a follow-up to HDFS-16935, we should provide a utility to trigger a heartbeat 
and wait until the BP thread queue is fully processed. This would ensure 100% 
consistency w.r.t. the active namenode being able to receive bad block reports from 
the given datanode. This utility would resolve flakes for the tests that rely 
on the namenode's awareness of the bad blocks reported by datanodes.







[jira] [Updated] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16938:
--
Labels: pull-request-available  (was: )




[jira] [Created] (HDFS-16938) Utility to trigger heartbeat and wait until BP thread queue is fully processed

2023-03-01 Thread Viraj Jasani (Jira)
Viraj Jasani created HDFS-16938:
---

 Summary: Utility to trigger heartbeat and wait until BP thread 
queue is fully processed
 Key: HDFS-16938
 URL: https://issues.apache.org/jira/browse/HDFS-16938
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Viraj Jasani
Assignee: Viraj Jasani


As a follow-up to HDFS-16935, we should provide a utility to trigger a heartbeat 
and wait until the BP thread queue is fully processed. This would ensure 100% 
consistency w.r.t. the active namenode being able to receive bad block reports from 
the given datanode. This utility would resolve flakes for the tests that rely 
on the namenode's awareness of the bad blocks reported by datanodes.






[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695305#comment-17695305
 ] 

ASF GitHub Bot commented on HDFS-16896:
---

hadoop-yetus commented on PR #5444:
URL: https://github.com/apache/hadoop/pull/5444#issuecomment-1450786773

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m  0s |  |  Docker mode activated.  |
   | -1 :x: |  docker  |   5m 55s |  |  Docker failed to build run-specific yetus/hadoop:tp-9052}.  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/5444 |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5444/1/console |
   | versions | git=2.17.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> HDFS Client hedged read has increased failure rate than without hedged read
> ---
>
> Key: HDFS-16896
> URL: https://issues.apache.org/jira/browse/HDFS-16896
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Tom McCormick
>Assignee: Tom McCormick
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.5
>
>
> When hedged read is enabled by HDFS client, we see an increased failure rate 
> on reads.
> *stacktrace*
>  
> {code:java}
> Caused by: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-1183972111-10.197.192.88-1590025572374:blk_17114848218_16043459722 file=/data/tracking/streaming/AdImpressionEvent/daily/2022/07/18/compaction_1/part-r-1914862.1658217125623.1362294472.orc
>     at org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1077)
>     at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1060)
>     at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1039)
>     at org.apache.hadoop.hdfs.DFSInputStream.hedgedFetchBlockByteRange(DFSInputStream.java:1365)
>     at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1572)
>     at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1535)
>     at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:121)
>     at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
>     at org.apache.hadoop.fs.RetryingInputStream.lambda$readFully$3(RetryingInputStream.java:172)
>     at org.apache.hadoop.fs.RetryPolicy.lambda$run$0(RetryPolicy.java:137)
>     at org.apache.hadoop.fs.NoOpRetryPolicy.run(NoOpRetryPolicy.java:36)
>     at org.apache.hadoop.fs.RetryPolicy.run(RetryPolicy.java:136)
>     at org.apache.hadoop.fs.RetryingInputStream.readFully(RetryingInputStream.java:168)
>     at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
>     at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:112)
>     at io.trino.plugin.hive.orc.HdfsOrcDataSource.readInternal(HdfsOrcDataSource.java:76)
>     ... 46 more
> {code}
>  






[jira] [Updated] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read

2023-03-01 Thread Tom McCormick (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tom McCormick updated HDFS-16896:
-
Fix Version/s: 3.4.0
   3.3.5







[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695301#comment-17695301
 ] 

ASF GitHub Bot commented on HDFS-16896:
---

mccormickt12 opened a new pull request, #5444:
URL: https://github.com/apache/hadoop/pull/5444

   …… (#5322)
   
   HDFS-16896: Clear the ignoredNodes list when we clear the deadNodes list in 
refetchLocations. The ignoredNodes list is only used on the hedged read code path.
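
To illustrate why clearing only the dead-node list may not be enough, here is a deliberately simplified model. It is not the actual `DFSInputStream` logic: the class `HedgedReadSketch`, its methods, and the datanode names are hypothetical. It assumes that node selection skips every replica found in either exclusion set, and that a "refetch" forgets earlier failures before retrying once.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Simplified, hypothetical model of replica selection; not Hadoop code. */
class HedgedReadSketch {

  /** Picks the first replica that is neither dead nor ignored. */
  static Optional<String> choose(List<String> replicas,
                                 Set<String> dead, Set<String> ignored) {
    return replicas.stream()
        .filter(n -> !dead.contains(n) && !ignored.contains(n))
        .findFirst();
  }

  /**
   * Tries to choose a node. If none is left, "refetches": clears the dead
   * set (and, optionally, the ignored set) and tries once more.
   */
  static Optional<String> chooseWithRefetch(List<String> replicas,
                                            Set<String> dead,
                                            Set<String> ignored,
                                            boolean clearIgnored) {
    Optional<String> node = choose(replicas, dead, ignored);
    if (node.isPresent()) {
      return node;
    }
    dead.clear();
    if (clearIgnored) {
      ignored.clear();
    }
    return choose(replicas, dead, ignored);
  }

  /** Returns true if a node could be chosen after the refetch. */
  static boolean demo(boolean clearIgnored) {
    List<String> replicas = Arrays.asList("dn1", "dn2", "dn3");
    Set<String> dead = new HashSet<>(Arrays.asList("dn3"));
    Set<String> ignored = new HashSet<>(Arrays.asList("dn1", "dn2"));
    return chooseWithRefetch(replicas, dead, ignored, clearIgnored).isPresent();
  }
}
```

In this model, with `clearIgnored` set to false, only `dn3` becomes selectable again after the refetch; if all three replicas had ended up in the ignored set, no node could be chosen at all, which would surface as a read failure.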
   
   
   
   ### Description of PR
   Backporting hedged read fixes to branch 3.3
   
   ### How was this patch tested?
   Added tests; LinkedIn Trino was also used to verify the performance improvements.
   
   ### For code changes:
   
   - [ ] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [ ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   










[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695290#comment-17695290
 ] 

ASF GitHub Bot commented on HDFS-16896:
---

omalley merged PR #5322:
URL: https://github.com/apache/hadoop/pull/5322










[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695277#comment-17695277
 ] 

ASF GitHub Bot commented on HDFS-16896:
---

mccormickt12 commented on code in PR #5322:
URL: https://github.com/apache/hadoop/pull/5322#discussion_r1122202223


##
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java:
##
@@ -955,6 +965,10 @@ private DNAddrPair chooseDataNode(LocatedBlock block,
 }
   }
 
+  /**
+   * RefetchLocations should only be called when there are no active requests
+   * to datanodes. In the hedged read case this means futures should be empty
+   */

Review Comment:
   Added











[jira] [Commented] (HDFS-16896) HDFS Client hedged read has increased failure rate than without hedged read

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695276#comment-17695276
 ] 

ASF GitHub Bot commented on HDFS-16896:
---

mccormickt12 commented on code in PR #5322:
URL: https://github.com/apache/hadoop/pull/5322#discussion_r1122201971


##
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java:
##
@@ -197,6 +197,15 @@ private void clearLocalDeadNodes() {
 deadNodes.clear();
   }
 
+  /**
+   * Clear list of ignored nodes used for hedged reads.
+   */
+  private void clearIgnoredNodes(Collection ignoredNodes) {

Review Comment:
   fixed











[jira] [Commented] (HDFS-16935) TestFsDatasetImpl.testReportBadBlocks brittle

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695273#comment-17695273
 ] 

ASF GitHub Bot commented on HDFS-16935:
---

virajjasani commented on code in PR #5432:
URL: https://github.com/apache/hadoop/pull/5432#discussion_r1122197968


##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java:
##
@@ -1101,15 +1099,12 @@ public void testReportBadBlocks() throws Exception {
 
   block = DFSTestUtil.getFirstBlock(fs, filePath);
   // Test for the overloaded method reportBadBlocks
-  dataNode.reportBadBlocks(block, dataNode.getFSDataset()
-  .getFsVolumeReferences().get(0));
-  Thread.sleep(3000);
-  BlockManagerTestUtil.updateState(cluster.getNamesystem()
-  .getBlockManager());
-  // Verify the bad block has been reported to namenode
-  Assert.assertEquals(1, cluster.getNamesystem().getCorruptReplicaBlocks());
-} finally {
-  cluster.shutdown();
+  dataNode.reportBadBlocks(block, dataNode.getFSDataset().getFsVolumeReferences().get(0));
+  GenericTestUtils.waitFor(() -> {
+BlockManagerTestUtil.updateState(cluster.getNamesystem().getBlockManager());
+// Verify the bad block has been reported to namenode
+return 1 == cluster.getNamesystem().getCorruptReplicaBlocks();
+  }, 100, 1, "Corrupted replica blocks could not be found");

Review Comment:
   Let me create follow-up work to make this strongly consistent with reporting 
of the bad block.





> TestFsDatasetImpl.testReportBadBlocks brittle
> -
>
> Key: HDFS-16935
> URL: https://issues.apache.org/jira/browse/HDFS-16935
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 3.4.0, 3.3.5, 3.3.9
>Reporter: Steve Loughran
>Assignee: Viraj Jasani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.9
>
>
> jenkins failure as sleep() time not long enough
> {code}
> Failing for the past 1 build (Since #4 )
> Took 7.4 sec.
> Error Message
> expected:<1> but was:<0>
> Stacktrace
> java.lang.AssertionError: expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:89)
>   at org.junit.Assert.failNotEquals(Assert.java:835)
>   at org.junit.Assert.assertEquals(Assert.java:647)
>   at org.junit.Assert.assertEquals(Assert.java:633)
> {code}
> assert is after a 3s sleep waiting for reports coming in.
> {code}
>   dataNode.reportBadBlocks(block, dataNode.getFSDataset()
>   .getFsVolumeReferences().get(0));
>   Thread.sleep(3000);   // 3s sleep
>   BlockManagerTestUtil.updateState(cluster.getNamesystem()
>   .getBlockManager());
>   // Verify the bad block has been reported to namenode
>   Assert.assertEquals(1, cluster.getNamesystem().getCorruptReplicaBlocks());  // here
> {code}
> LambdaTestUtils.eventually() should be used around this assert, maybe with an 
> even shorter initial delay so on faster systems, test is faster.
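
The difference between a fixed sleep and polling can be shown with a small standalone helper. This is a sketch under stated assumptions, not the Hadoop test utilities themselves: `PollSketch`, its `waitFor` method and the `demo` scenario are hypothetical, and the counter merely stands in for the namenode's corrupt-replica count.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper; an illustration, not the Hadoop test utility. */
class PollSketch {

  /**
   * Re-evaluates the check every intervalMillis until it passes or
   * timeoutMillis elapses. Returns true as soon as the check passes.
   */
  static boolean waitFor(BooleanSupplier check,
                         long intervalMillis, long timeoutMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      if (check.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out without the condition becoming true
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  /** Simulates a report that arrives asynchronously after a short delay. */
  static boolean demo() {
    AtomicInteger corruptReplicas = new AtomicInteger(0);
    Thread reporter = new Thread(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        return;
      }
      corruptReplicas.set(1);
    });
    reporter.start();
    // Poll every 10 ms for up to 5 s instead of sleeping a fixed 3 s.
    return waitFor(() -> corruptReplicas.get() == 1, 10, 5_000);
  }
}
```

On a fast machine the poll returns shortly after the report arrives, while a slow machine still gets the full timeout, which is the trade-off the issue description asks for.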






[jira] [Resolved] (HDFS-16935) TestFsDatasetImpl.testReportBadBlocks brittle

2023-03-01 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HDFS-16935.
---
Resolution: Fixed







[jira] [Updated] (HDFS-16935) TestFsDatasetImpl.testReportBadBlocks brittle

2023-03-01 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran updated HDFS-16935:
--
Fix Version/s: 3.4.0
   3.3.9







[jira] [Commented] (HDFS-16935) TestFsDatasetImpl.testReportBadBlocks brittle

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695266#comment-17695266
 ] 

ASF GitHub Bot commented on HDFS-16935:
---

virajjasani commented on code in PR #5432:
URL: https://github.com/apache/hadoop/pull/5432#discussion_r1122177909


##
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/TestFsDatasetImpl.java:
##
@@ -1101,15 +1099,12 @@ public void testReportBadBlocks() throws Exception {
 
   block = DFSTestUtil.getFirstBlock(fs, filePath);
   // Test for the overloaded method reportBadBlocks
-  dataNode.reportBadBlocks(block, dataNode.getFSDataset()
-  .getFsVolumeReferences().get(0));
-  Thread.sleep(3000);
-  BlockManagerTestUtil.updateState(cluster.getNamesystem()
-  .getBlockManager());
-  // Verify the bad block has been reported to namenode
-  Assert.assertEquals(1, cluster.getNamesystem().getCorruptReplicaBlocks());
-} finally {
-  cluster.shutdown();
+  dataNode.reportBadBlocks(block, dataNode.getFSDataset().getFsVolumeReferences().get(0));
+  GenericTestUtils.waitFor(() -> {
+BlockManagerTestUtil.updateState(cluster.getNamesystem().getBlockManager());
+// Verify the bad block has been reported to namenode
+return 1 == cluster.getNamesystem().getCorruptReplicaBlocks();
+  }, 100, 1, "Corrupted replica blocks could not be found");

Review Comment:
   > I gave it a try and was able to repro this. With triggerHeartbeat, it 
worked for me. I think that has been standard practice for such cases for a 
long time (at least in my time).
   
   Great, sounds good.
   
   > do you intend to say if we put a sleep just before processQueueMessages, 
things should screw up?
   
   Yes, that's what I suspected so far, but it looks like even with the sleep, 
reproducing the failure is difficult. I have now tried several times but have 
been unable to reproduce it.
   
   Also, I now understand what could still go wrong with the race 
condition; let me update the patch.











[jira] [Commented] (HDFS-16935) TestFsDatasetImpl.testReportBadBlocks brittle

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695264#comment-17695264
 ] 

ASF GitHub Bot commented on HDFS-16935:
---

steveloughran merged PR #5432:
URL: https://github.com/apache/hadoop/pull/5432










[jira] [Commented] (HDFS-16934) org.apache.hadoop.hdfs.tools.TestDFSAdmin#testAllDatanodesReconfig regression

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17695133#comment-17695133
 ] 

ASF GitHub Bot commented on HDFS-16934:
---

hadoop-yetus commented on PR #5434:
URL: https://github.com/apache/hadoop/pull/5434#issuecomment-1450289619

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 45s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  38m  7s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 27s |  |  trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 34s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  7s |  |  trunk passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 34s |  |  trunk passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 29s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 53s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 51s | [/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5434/5/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  the patch passed with JDK Private Build-1.8.0_352-8u352-ga-1~20.04-b08  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 33s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 212m 14s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5434/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 49s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 318m 19s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   |   | hadoop.hdfs.server.namenode.ha.TestObserverNode |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5434/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5434 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 17457753e6f3 4.15.0-200-generic #211-Ubuntu SMP Thu Nov 24 18:16:04 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 45c2a08193920c45906836f965b1f37491ed7fdb |
   | Default Java | Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.17+8-post-Ubuntu-1ubuntu220.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_352-8u352-ga-1~20.04-b08 |
 

[jira] [Commented] (HDFS-16937) Delete RPC should also record number of delete blocks in audit log

2023-03-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694920#comment-17694920
 ] 

ASF GitHub Bot commented on HDFS-16937:
---

hfutatzhanghb commented on PR #5442:
URL: https://github.com/apache/hadoop/pull/5442#issuecomment-1449538280

   > Cannot change the operation name like this; changing audit log output is an incompatible change
   
   @ayushtkn, thanks for the reminder. By the way, what can I do about this? 
Could you please give me some advice?




> Delete RPC should also record number of delete blocks in audit log
> ------------------------------------------------------------------
>
>                 Key: HDFS-16937
>                 URL: https://issues.apache.org/jira/browse/HDFS-16937
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.3.4
>            Reporter: ZhangHB
>            Priority: Minor
>              Labels: pull-request-available
>
> To better trace the jitter caused by the delete RPC, we should also record the 
> number of deleted blocks in the audit log. With this information, we can tell 
> which user causes the jitter.


