[jira] [Work logged] (HDFS-16604) Install gtest via FetchContent_Declare in CMake

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16604?focusedWorklogId=776129&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776129
 ]

ASF GitHub Bot logged work on HDFS-16604:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 31/May/22 08:43
Start Date: 31/May/22 08:43
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4374:
URL: https://github.com/apache/hadoop/pull/4374#issuecomment-1141849469

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:---------:|
   | +0 :ok: |  reexec  |   1m 11s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  41m 11s |  |  trunk passed  |
   | -1 :x: |  compile  |   0m 39s | 
[/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4374/5/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client.txt)
 |  hadoop-hdfs-native-client in trunk failed.  |
   | +1 :green_heart: |  mvnsite  |   0m 37s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 40s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  65m 16s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 19s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   4m  2s |  |  the patch passed  |
   | -1 :x: |  cc  |   4m  2s | 
[/results-compile-cc-hadoop-hdfs-project_hadoop-hdfs-native-client.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4374/5/artifact/out/results-compile-cc-hadoop-hdfs-project_hadoop-hdfs-native-client.txt)
 |  hadoop-hdfs-project_hadoop-hdfs-native-client generated 8 new + 0 unchanged 
- 0 fixed = 8 total (was 0)  |
   | +1 :green_heart: |  golang  |   4m  2s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   4m  2s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  mvnsite  |   0m 22s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 29s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  |  51m 42s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs-native-client.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4374/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-native-client.txt)
 |  hadoop-hdfs-native-client in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 46s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 149m 49s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed CTEST tests | test_libhdfs_threaded_hdfspp_test_shim_static |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4374/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4374 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient codespell detsecrets cc golang |
   | uname | Linux 5fe627051356 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 
17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 00f704b25d5f191b4e0450675f0314ae8a8cbac9 |
   | Default Java | Red Hat, Inc.-1.8.0_332-b09 |
   | CTEST | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4374/5/artifact/out/patch-hadoop-hdfs-project_hadoop-hdfs-native-client-ctest.txt
 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4374/5/testReport/ |
   | Max. process+thread count | 598 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4374/5/console |
   | versions | git=2.9.5 maven=3.6.3 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 776129)

[jira] [Updated] (HDFS-15737) Don't remove datanodes from outOfServiceNodeBlocks while checking in DatanodeAdminManager

2022-05-31 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-15737:

Fix Version/s: (was: 2.10.2)

> Don't remove datanodes from outOfServiceNodeBlocks while checking in 
> DatanodeAdminManager
> ------------------------------------------------------------------------------------------
>
> Key: HDFS-15737
> URL: https://issues.apache.org/jira/browse/HDFS-15737
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Ye Ni
>Assignee: Ye Ni
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> With CyclicIteration, removing an item while iterating causes either an 
> infinite loop or a ConcurrentModificationException.
> Instead, the item should be queued for removal via {{toRemove.add(dn);}} and 
> removed only after the iteration completes.
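A minimal, self-contained sketch of the deferred-removal idiom the description
calls for; the map and entry names here are illustrative stand-ins, not the
actual DatanodeAdminManager code:

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DeferredRemovalSketch {
  public static void main(String[] args) {
    // Illustrative stand-in for outOfServiceNodeBlocks.
    Map<String, List<String>> outOfServiceNodeBlocks = new LinkedHashMap<>();
    outOfServiceNodeBlocks.put("dn1", Arrays.asList("blk_1"));
    outOfServiceNodeBlocks.put("dn2", Collections.<String>emptyList());

    // Removing entries while iterating risks a
    // ConcurrentModificationException, and with a cyclic iterator that
    // wraps around it can loop forever.
    List<String> toRemove = new ArrayList<>();
    for (Map.Entry<String, List<String>> entry
        : outOfServiceNodeBlocks.entrySet()) {
      if (entry.getValue().isEmpty()) {
        toRemove.add(entry.getKey()); // defer the removal
      }
    }
    // Mutate the map only after the iteration has finished.
    for (String dn : toRemove) {
      outOfServiceNodeBlocks.remove(dn);
    }
    System.out.println(outOfServiceNodeBlocks); // prints {dn1=[blk_1]}
  }
}
{code}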






[jira] [Created] (HDFS-16610) Make fsck read timeout configurable

2022-05-31 Thread Stephen O'Donnell (Jira)
Stephen O'Donnell created HDFS-16610:


 Summary: Make fsck read timeout configurable
 Key: HDFS-16610
 URL: https://issues.apache.org/jira/browse/HDFS-16610
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Reporter: Stephen O'Donnell
Assignee: Stephen O'Donnell


In a cluster with a lot of small files, we encountered a case where fsck was 
very slow. I believe this is due to contention with the many other threads 
reading and writing data on the cluster.

Sometimes fsck does not report any progress for more than 60 seconds and the 
client times out. Currently the connect and read timeouts are hardcoded to 60 
seconds; this change makes them configurable.
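For illustration, the general shape of such a change is to read the two values
from the client configuration and apply them to the fsck HTTP connection. A
minimal sketch, assuming the key names from the draft patch discussed later in
this thread (not necessarily what was finally committed):

{code:java}
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import org.apache.hadoop.conf.Configuration;

public class FsckTimeoutSketch {
  // Key names follow the draft patch quoted later in this thread.
  static final String CONNECT_TIMEOUT_KEY =
      "dfs.client.fsck.connect.timeout.ms";
  static final String READ_TIMEOUT_KEY =
      "dfs.client.fsck.read.timeout.ms";

  static URLConnection openFsckConnection(URL url, Configuration conf)
      throws IOException {
    URLConnection connection = url.openConnection();
    // Both values were previously hardcoded to 60 seconds.
    connection.setConnectTimeout(conf.getInt(CONNECT_TIMEOUT_KEY, 60 * 1000));
    connection.setReadTimeout(conf.getInt(READ_TIMEOUT_KEY, 60 * 1000));
    return connection;
  }
}
{code}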






[jira] [Work logged] (HDFS-16610) Make fsck read timeout configurable

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16610?focusedWorklogId=776268&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776268
 ]

ASF GitHub Bot logged work on HDFS-16610:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 31/May/22 12:01
Start Date: 31/May/22 12:01
Worklog Time Spent: 10m 
  Work Description: sodonnel opened a new pull request, #4384:
URL: https://github.com/apache/hadoop/pull/4384

   ### Description of PR
   
   In a cluster with a lot of small files, we encountered a case where fsck was 
very slow. I believe this is due to contention with the many other threads 
reading and writing data on the cluster.
   
   Sometimes fsck does not report any progress for more than 60 seconds and the 
client times out. Currently the connect and read timeouts are hardcoded to 60 
seconds; this change makes them configurable.
   
   ### How was this patch tested?
   
   Tested manually by inserting a sleep into the fsck logic in the NameNode, 
then adjusting the read timeout to verify that the client did or did not time 
out depending on the configured value.




Issue Time Tracking
---

Worklog Id: (was: 776268)
Remaining Estimate: 0h
Time Spent: 10m

> Make fsck read timeout configurable
> ---
>
> Key: HDFS-16610
> URL: https://issues.apache.org/jira/browse/HDFS-16610
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In a cluster with a lot of small files, we encountered a case where fsck was 
> very slow. I believe this is due to contention with the many other threads 
> reading and writing data on the cluster.
> Sometimes fsck does not report any progress for more than 60 seconds and the 
> client times out. Currently the connect and read timeouts are hardcoded to 60 
> seconds; this change makes them configurable.






[jira] [Updated] (HDFS-16610) Make fsck read timeout configurable

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16610:
----------------------------------
Labels: pull-request-available  (was: )

> Make fsck read timeout configurable
> ---
>
> Key: HDFS-16610
> URL: https://issues.apache.org/jira/browse/HDFS-16610
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In a cluster with a lot of small files, we encountered a case where fsck was 
> very slow. I believe this is due to contention with the many other threads 
> reading and writing data on the cluster.
> Sometimes fsck does not report any progress for more than 60 seconds and the 
> client times out. Currently the connect and read timeouts are hardcoded to 60 
> seconds; this change makes them configurable.






[jira] [Work logged] (HDFS-16600) Deadlock on DataNode

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16600?focusedWorklogId=776281&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776281
 ]

ASF GitHub Bot logged work on HDFS-16600:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 31/May/22 12:15
Start Date: 31/May/22 12:15
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4367:
URL: https://github.com/apache/hadoop/pull/4367#issuecomment-1142055562

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:---------:|
   | +0 :ok: |  reexec  |   0m 44s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  37m 48s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 28s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m  9s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 35s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  6s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 26s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 38s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 26s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 54s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 19s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 19s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 17s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 241m 46s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4367/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 51s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 346m 57s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestRollingUpgrade |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4367/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4367 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 0c52856a8f1c 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 
10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 0100f4917d24e4db256e5032f75a8f64bb76391a |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |

[jira] [Work logged] (HDFS-16604) Install gtest via FetchContent_Declare in CMake

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16604?focusedWorklogId=776267&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776267
 ]

ASF GitHub Bot logged work on HDFS-16604:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 31/May/22 11:59
Start Date: 31/May/22 11:59
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4374:
URL: https://github.com/apache/hadoop/pull/4374#issuecomment-1142040742

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:---------:|
   | +0 :ok: |  reexec  |  39m 26s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  37m 52s |  |  trunk passed  |
   | -1 :x: |  compile  |   0m 47s | 
[/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4374/6/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client.txt)
 |  hadoop-hdfs-native-client in trunk failed.  |
   | +1 :green_heart: |  mvnsite  |   0m 47s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  59m  1s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   3m 40s |  |  the patch passed  |
   | -1 :x: |  cc  |   3m 40s | 
[/results-compile-cc-hadoop-hdfs-project_hadoop-hdfs-native-client.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4374/6/artifact/out/results-compile-cc-hadoop-hdfs-project_hadoop-hdfs-native-client.txt)
 |  hadoop-hdfs-project_hadoop-hdfs-native-client generated 8 new + 0 unchanged 
- 0 fixed = 8 total (was 0)  |
   | +1 :green_heart: |  golang  |   3m 40s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   3m 40s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  mvnsite  |   0m 27s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 22s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  19m 58s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  34m 53s |  |  hadoop-hdfs-native-client in 
the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 50s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 161m 16s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4374/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4374 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient codespell detsecrets cc golang |
   | uname | Linux d82b0a18e275 4.15.0-156-generic #163-Ubuntu SMP Thu Aug 19 
23:31:58 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 00f704b25d5f191b4e0450675f0314ae8a8cbac9 |
   | Default Java | Red Hat, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4374/6/testReport/ |
   | Max. process+thread count | 545 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4374/6/console |
   | versions | git=2.9.5 maven=3.6.3 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 776267)
Time Spent: 2h 20m  (was: 2h 10m)

> Install gtest via FetchContent_Declare in CMake
> ---
>
> Key: HDFS-16604
> URL: https://issues.apache.org/jira/browse/HDFS-16604
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs++
>Affects Versions: 3.4.0
>Reporter: Gautham Banasandra

[jira] [Work logged] (HDFS-16610) Make fsck read timeout configurable

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16610?focusedWorklogId=776306&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776306
 ]

ASF GitHub Bot logged work on HDFS-16610:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 31/May/22 13:02
Start Date: 31/May/22 13:02
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on code in PR #4384:
URL: https://github.com/apache/hadoop/pull/4384#discussion_r885609952


##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java:
##########
@@ -273,6 +273,14 @@ public interface HdfsClientConfigKeys {
   String DFS_LEASE_HARDLIMIT_KEY = "dfs.namenode.lease-hard-limit-sec";
   long DFS_LEASE_HARDLIMIT_DEFAULT = 20 * 60;
 
+  String DFS_CLIENT_FSCK_CONNECT_TIMEOUT_MS =
+  "dfs.client.fsck.connect.timeout.ms";
+  int DFS_CLIENT_FSCK_CONNECT_TIMEOUT_MS_DEFAULT = 60 * 1000;
+
+  String DFS_CLIENT_FSCK_READ_TIMEOUT_MS =
+  "dfs.client.fsck.read.timeout.ms";
+  int DFS_CLIENT_FSCK_READ_TIMEOUT_MS_DEFAULT = 60 * 1000;

Review Comment:
   do you want to restrict the config to `ms`? Can keep only 
``dfs.client.fsck.connect.timeout`` similar to 
``dfs.webhdfs.socket.connect-timeout`` and allow timeunits?
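   For context, suffixed time units like these are read with Hadoop's
`Configuration.getTimeDuration`. A minimal sketch of how the suggested key
could be consumed (the key name is the reviewer's proposal, not necessarily
the final one):

   ```java
   import java.util.concurrent.TimeUnit;
   import org.apache.hadoop.conf.Configuration;

   public class TimeDurationSketch {
     public static void main(String[] args) {
       Configuration conf = new Configuration(false);
       // Users may write "90s", "2m", or a bare number interpreted in
       // the default unit when read via getTimeDuration.
       conf.set("dfs.client.fsck.connect.timeout", "90s");
       long connectMs = conf.getTimeDuration(
           "dfs.client.fsck.connect.timeout", 60000L, TimeUnit.MILLISECONDS);
       System.out.println(connectMs); // prints 90000
     }
   }
   ```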





Issue Time Tracking
---

Worklog Id: (was: 776306)
Time Spent: 20m  (was: 10m)

> Make fsck read timeout configurable
> ---
>
> Key: HDFS-16610
> URL: https://issues.apache.org/jira/browse/HDFS-16610
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In a cluster with a lot of small files, we encountered a case where fsck was 
> very slow. I believe this is due to contention with the many other threads 
> reading and writing data on the cluster.
> Sometimes fsck does not report any progress for more than 60 seconds and the 
> client times out. Currently the connect and read timeouts are hardcoded to 60 
> seconds; this change makes them configurable.






[jira] [Work logged] (HDFS-16600) Deadlock on DataNode

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16600?focusedWorklogId=776340&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776340
 ]

ASF GitHub Bot logged work on HDFS-16600:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 31/May/22 14:05
Start Date: 31/May/22 14:05
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4367:
URL: https://github.com/apache/hadoop/pull/4367#issuecomment-1142182281

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:---------:|
   | +0 :ok: |  reexec  |   0m 49s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  38m 55s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 41s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 32s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 17s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 42s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 18s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 45s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 54s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  26m 12s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 29s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 29s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 23s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 28s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 59s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 38s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 28s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 336m 31s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4367/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 59s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 452m 24s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.tools.TestDFSAdmin |
   |   | hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4367/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4367 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 6d2952785abf 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 
17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / f08e25d23aa96705511da6358769b81a4a711080 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |

[jira] [Work logged] (HDFS-16610) Make fsck read timeout configurable

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16610?focusedWorklogId=776379&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776379
 ]

ASF GitHub Bot logged work on HDFS-16610:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 31/May/22 14:56
Start Date: 31/May/22 14:56
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on code in PR #4384:
URL: https://github.com/apache/hadoop/pull/4384#discussion_r885745283


##########
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java:
##########
@@ -273,6 +273,14 @@ public interface HdfsClientConfigKeys {
   String DFS_LEASE_HARDLIMIT_KEY = "dfs.namenode.lease-hard-limit-sec";
   long DFS_LEASE_HARDLIMIT_DEFAULT = 20 * 60;
 
+  String DFS_CLIENT_FSCK_CONNECT_TIMEOUT_MS =
+  "dfs.client.fsck.connect.timeout.ms";
+  int DFS_CLIENT_FSCK_CONNECT_TIMEOUT_MS_DEFAULT = 60 * 1000;
+
+  String DFS_CLIENT_FSCK_READ_TIMEOUT_MS =
+  "dfs.client.fsck.read.timeout.ms";
+  int DFS_CLIENT_FSCK_READ_TIMEOUT_MS_DEFAULT = 60 * 1000;

Review Comment:
   This suggestion makes sense. I have changed it to use the same technique as 
with webhdfs.





Issue Time Tracking
---

Worklog Id: (was: 776379)
Time Spent: 0.5h  (was: 20m)

> Make fsck read timeout configurable
> ---
>
> Key: HDFS-16610
> URL: https://issues.apache.org/jira/browse/HDFS-16610
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In a cluster with a lot of small files, we encountered a case where fsck was 
> very slow. I believe this is due to contention with the many other threads 
> reading and writing data on the cluster.
> Sometimes fsck does not report any progress for more than 60 seconds and the 
> client times out. Currently the connect and read timeouts are hardcoded to 60 
> seconds; this change makes them configurable.






[jira] [Work logged] (HDFS-16609) Fix Flaky JUnit Tests that often report timeouts

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16609?focusedWorklogId=776382&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776382
 ]

ASF GitHub Bot logged work on HDFS-16609:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 31/May/22 15:07
Start Date: 31/May/22 15:07
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on PR #4382:
URL: https://github.com/apache/hadoop/pull/4382#issuecomment-1142257866

   @tomscut @Hexiaoqiao please help me to review the code, thank you very much!




Issue Time Tracking
---

Worklog Id: (was: 776382)
Time Spent: 0.5h  (was: 20m)

> Fix Flaky JUnit Tests that often report timeouts
> -------------------------------------------------
>
> Key: HDFS-16609
> URL: https://issues.apache.org/jira/browse/HDFS-16609
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> While working on HDFS-16590, JUnit tests often reported errors. I found that 
> one class of failures is timeouts, and these can be avoided by increasing the 
> timeout values.
> The modified methods are as follows:
> 1.org.apache.hadoop.hdfs.TestFileCreation#testServerDefaultsWithMinimalCaching
> {code:java}
> [ERROR] 
> testServerDefaultsWithMinimalCaching(org.apache.hadoop.hdfs.TestFileCreation) 
>  Time elapsed: 7.136 s  <<< ERROR!
> java.util.concurrent.TimeoutException: 
> Timed out waiting for condition. 
> Thread diagnostics: 
> [WARNING] 
> org.apache.hadoop.hdfs.TestFileCreation.testServerDefaultsWithMinimalCaching(org.apache.hadoop.hdfs.TestFileCreation)
> [ERROR]   Run 1: TestFileCreation.testServerDefaultsWithMinimalCaching:277 
> Timeout Timed out ...
> [INFO]   Run 2: PASS{code}
> 2.org.apache.hadoop.hdfs.TestDFSShell#testFilePermissions
> {code:java}
> [ERROR] testFilePermissions(org.apache.hadoop.hdfs.TestDFSShell)  Time 
> elapsed: 30.022 s  <<< ERROR!
> org.junit.runners.model.TestTimedOutException: test timed out after 30000 
> milliseconds
>   at java.lang.Thread.dumpThreads(Native Method)
>   at java.lang.Thread.getStackTrace(Thread.java:1549)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout.createTimeoutException(FailOnTimeout.java:182)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout.getResult(FailOnTimeout.java:177)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout.evaluate(FailOnTimeout.java:128)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> [WARNING] 
> org.apache.hadoop.hdfs.TestDFSShell.testFilePermissions(org.apache.hadoop.hdfs.TestDFSShell)
> [ERROR]   Run 1: TestDFSShell.testFilePermissions TestTimedOut test timed out 
> after 3 mil...
> [INFO]   Run 2: PASS {code}
> 3.org.apache.hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier#testSPSWhenFileHasExcessRedundancyBlocks
> {code:java}
> [ERROR] 
> testSPSWhenFileHasExcessRedundancyBlocks(org.apache.hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier)
>   Time elapsed: 67.904 s  <<< ERROR!
> java.util.concurrent.TimeoutException: 
> Timed out waiting for condition. 
> [WARNING] 
> org.apache.hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier.testSPSWhenFileHasExcessRedundancyBlocks(org.apache.hadoop.hdfs.server.sps.TestExternalStoragePolicySatisfier)
> [ERROR]   Run 1: 
> TestExternalStoragePolicySatisfier.testSPSWhenFileHasExcessRedundancyBlocks:1379
>  Timeout
> [ERROR]   Run 2: 
> TestExternalStoragePolicySatisfier.testSPSWhenFileHasExcessRedundancyBlocks:1379
>  Timeout
> [INFO]   Run 3: PASS {code}
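A minimal sketch of the kind of adjustment described above, with illustrative
values only (the actual patch may touch different constants):

{code:java}
import org.junit.Test;

public class TimeoutTuningSketch {
  // Before: a tight cap such as @Test(timeout = 30000) flakes on slow
  // CI hosts. A larger cap still fails fast on a genuine hang.
  @Test(timeout = 90000)
  public void testScenarioWithMoreHeadroom() throws Exception {
    // test body unchanged; only the timeout annotation is raised
  }
}
{code}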






[jira] [Created] (HDFS-16611) improve TestSeveralNameNodes#testCircularLinkedListWrites Params

2022-05-31 Thread fanshilun (Jira)
fanshilun created HDFS-16611:


 Summary: improve TestSeveralNameNodes#testCircularLinkedListWrites 
Params
 Key: HDFS-16611
 URL: https://issues.apache.org/jira/browse/HDFS-16611
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.4.0
Reporter: fanshilun
Assignee: fanshilun









[jira] [Updated] (HDFS-16611) improve TestSeveralNameNodes#testCircularLinkedListWrites Params

2022-05-31 Thread fanshilun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fanshilun updated HDFS-16611:
-----------------------------
Description: 
While working on HDFS-16590, JUnit tests often reported errors; the following 
failure messages appeared frequently in 
org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes#testCircularLinkedListWrites.
{code:java}
1st run
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 114.252 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 43
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 42
 done: false
] expected:<0> but was:<3>

2nd run

[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 110.349 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 50
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 49
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 49
 done: false
] expected:<0> but was:<3>
at org.junit.Assert.fail(Assert.java:89)



3rd run
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 109.364 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 46
 done: false
] expected:<0> but was:<3>
at org.junit.Assert.fail(Assert.java:89)


{code}
 

 

> improve TestSeveralNameNodes#testCircularLinkedListWrites Params
> ---
>
> Key: HDFS-16611
> URL: https://issues.apache.org/jira/browse/HDFS-16611
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>
> While working on HDFS-16590, JUnit tests often reported errors; the following 
> failure messages appeared frequently in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes#testCircularLinkedListWrites.
> {code:java}
> 1st run
> [ERROR] 
> testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
>   Time elapsed: 114.252 s  <<< FAILURE!
> java.lang.AssertionError: 
> Some writers didn't complete in expected runtime! Current writer 
> state:[Circular Writer:
>directory: /test-0
>target length: 50
>current item: 43
>done: false
> , Circular Writer:
>directory: /test-1
>target length: 50
>current item: 47
>done: false
> , Circular Writer:
>directory: /test-2
>target length: 50
>current item: 42
>done: false
> ] expected:<0> but was:<3>
> 
> 2nd run
> [ERROR] 
> testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
>   Time elapsed: 110.349 s  <<< FAILURE!
> java.lang.AssertionError: 
> Some writers didn't complete in expected runtime! Current writer 
> state:[Circular Writer:
>directory: /test-0
>target length: 50
>current item: 50
>done: false
> , Circular Writer:
>directory: /test-1
>target length: 50
>current item: 49
>done: false
> , Circular Writer:
>directory: /test-2
>target length: 50
>current item: 49
>done: false
> ] expected:<0> but was:<3>
>   at org.junit.Assert.fail(Assert.java:89)
> 
> 3rd run
> [ERROR] 
> testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
>   Time elapsed: 109.364 s  <<< FAILURE!
> java.lang.AssertionError: 
> Some writers didn't complete in expected runtime! Current writer 
> state:[Circular Writer:
>directory: /test-0
>target length: 50
>current item: 47
>done: false
> , Circular Writer:
>directory: /test-1
>target length: 50
>current item: 47
>done: false
> , Circular Writer:
>directory: /test-2
>target length: 50
>current item: 46
>done: false
> ] expected:<0> but was:<3>
>   at org.junit.Assert.fail(Assert.java:89)
> {code}

[jira] [Updated] (HDFS-16611) improve TestSeveralNameNodes#testCircularLinkedListWrites Params

2022-05-31 Thread fanshilun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fanshilun updated HDFS-16611:
-----------------------------
Description: 
While working on HDFS-16590, JUnit tests often reported errors; the following 
failure messages appeared frequently in 
org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes#testCircularLinkedListWrites.

1st run
{code:java}
1st run
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 114.252 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 43
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 42
 done: false
] expected:<0> but was:<3>
{code}

2nd run
{code:java}
 [ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 110.349 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 50
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 49
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 49
 done: false
] expected:<0> but was:<3>
{code}
 
3rd run
{code:java}
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 109.364 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 46
 done: false
] expected:<0> but was:<3>
{code}

  was:
While working on HDFS-16590, JUnit tests often reported errors; the following 
failure messages appeared frequently in 
org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes#testCircularLinkedListWrites.
{code:java}
1st run
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 114.252 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 43
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 42
 done: false
] expected:<0> but was:<3>

2nd run

[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 110.349 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 50
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 49
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 49
 done: false
] expected:<0> but was:<3>
at org.junit.Assert.fail(Assert.java:89)



3rd run
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 109.364 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 46
 done: false
] expected:<0> but was:<3>
at org.junit.Assert.fail(Assert.java:89)


{code}
 

 


> improve TestSeveralNameNodes#testCircularLinkedListWrites Params
> ---
>
> Key: HDFS-16611
> URL: https://issues.apache.org/jira/browse/HDFS-16611
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor

[jira] [Updated] (HDFS-16611) improve TestSeveralNameNodes#testCircularLinkedListWrites Params

2022-05-31 Thread fanshilun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fanshilun updated HDFS-16611:
-----------------------------
Description: 
While working on HDFS-16590, JUnit tests often reported errors; the following 
failure messages appeared frequently in 
org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes#testCircularLinkedListWrites.
 * 1st run

{code:java}
1st run
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 114.252 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 43
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 42
 done: false
] expected:<0> but was:<3>
{code}
 * 2nd run

{code:java}
 [ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 110.349 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 50
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 49
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 49
 done: false
] expected:<0> but was:<3>
{code}
 * 3rd run

{code:java}
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 109.364 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 46
 done: false
] expected:<0> but was:<3>
{code}

  was:
While working on HDFS-16590, JUnit tests often reported errors; the following 
failure messages appeared frequently in 
org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes#testCircularLinkedListWrites.

1st run
{code:java}
1st run
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 114.252 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 43
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 42
 done: false
] expected:<0> but was:<3>
{code}

2nd run
{code:java}
 [ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 110.349 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 50
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 49
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 49
 done: false
] expected:<0> but was:<3>
{code}
 
3rd run
{code:java}
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 109.364 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 46
 done: false
] expected:<0> but was:<3>
{code}


> improve TestSeveralNameNodes#testCircularLinkedListWrites Params
> ---
>
> Key: HDFS-16611
> URL: https://issues.apache.org/jira/browse/HDFS-16611
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor

[jira] [Work started] (HDFS-16611) improve TestSeveralNameNodes#testCircularLinkedListWrites Params

2022-05-31 Thread fanshilun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-16611 started by fanshilun.

> improve TestSeveralNameNodes#testCircularLinkedListWrites Params
> ---
>
> Key: HDFS-16611
> URL: https://issues.apache.org/jira/browse/HDFS-16611
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>
> While working on HDFS-16590, JUnit tests often reported errors; the following 
> failure messages appeared frequently in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes#testCircularLinkedListWrites.
>  * 1st run
> {code:java}
> 1st run
> [ERROR] 
> testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
>   Time elapsed: 114.252 s  <<< FAILURE!
> java.lang.AssertionError: 
> Some writers didn't complete in expected runtime! Current writer 
> state:[Circular Writer:
>directory: /test-0
>target length: 50
>current item: 43
>done: false
> , Circular Writer:
>directory: /test-1
>target length: 50
>current item: 47
>done: false
> , Circular Writer:
>directory: /test-2
>target length: 50
>current item: 42
>done: false
> ] expected:<0> but was:<3>
> {code}
>  * 2nd run
> {code:java}
>  [ERROR] 
> testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
>   Time elapsed: 110.349 s  <<< FAILURE!
> java.lang.AssertionError: 
> Some writers didn't complete in expected runtime! Current writer 
> state:[Circular Writer:
>directory: /test-0
>target length: 50
>current item: 50
>done: false
> , Circular Writer:
>directory: /test-1
>target length: 50
>current item: 49
>done: false
> , Circular Writer:
>directory: /test-2
>target length: 50
>current item: 49
>done: false
> ] expected:<0> but was:<3>
> {code}
>  * 3rd run
> {code:java}
> [ERROR] 
> testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
>   Time elapsed: 109.364 s  <<< FAILURE!
> java.lang.AssertionError: 
> Some writers didn't complete in expected runtime! Current writer 
> state:[Circular Writer:
>directory: /test-0
>target length: 50
>current item: 47
>done: false
> , Circular Writer:
>directory: /test-1
>target length: 50
>current item: 47
>done: false
> , Circular Writer:
>directory: /test-2
>target length: 50
>current item: 46
>done: false
> ] expected:<0> but was:<3>
> {code}






[jira] [Work logged] (HDFS-16611) improve TestSeveralNameNodes#testCircularLinkedListWrites Params

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16611?focusedWorklogId=776396&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776396
 ]

ASF GitHub Bot logged work on HDFS-16611:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 31/May/22 15:29
Start Date: 31/May/22 15:29
Worklog Time Spent: 10m 
  Work Description: slfan1989 opened a new pull request, #4387:
URL: https://github.com/apache/hadoop/pull/4387

   JIRA: HDFS-16611. improve TestSeveralNameNodes#testCircularLinkedListWrites 
Params.




Issue Time Tracking
---

Worklog Id: (was: 776396)
Remaining Estimate: 0h
Time Spent: 10m

> improve TestSeveralNameNodes#testCircularLinkedListWrites Params
> ---
>
> Key: HDFS-16611
> URL: https://issues.apache.org/jira/browse/HDFS-16611
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While working on HDFS-16590, JUnit tests often reported errors; the following 
> failure messages appeared frequently in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes#testCircularLinkedListWrites.
>  * 1st run
> {code:java}
> 1st run
> [ERROR] 
> testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
>   Time elapsed: 114.252 s  <<< FAILURE!
> java.lang.AssertionError: 
> Some writers didn't complete in expected runtime! Current writer 
> state:[Circular Writer:
>directory: /test-0
>target length: 50
>current item: 43
>done: false
> , Circular Writer:
>directory: /test-1
>target length: 50
>current item: 47
>done: false
> , Circular Writer:
>directory: /test-2
>target length: 50
>current item: 42
>done: false
> ] expected:<0> but was:<3>
> {code}
>  * 2nd run
> {code:java}
>  [ERROR] 
> testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
>   Time elapsed: 110.349 s  <<< FAILURE!
> java.lang.AssertionError: 
> Some writers didn't complete in expected runtime! Current writer 
> state:[Circular Writer:
>directory: /test-0
>target length: 50
>current item: 50
>done: false
> , Circular Writer:
>directory: /test-1
>target length: 50
>current item: 49
>done: false
> , Circular Writer:
>directory: /test-2
>target length: 50
>current item: 49
>done: false
> ] expected:<0> but was:<3>
> {code}
>  * 3rd run
> {code:java}
> [ERROR] 
> testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
>   Time elapsed: 109.364 s  <<< FAILURE!
> java.lang.AssertionError: 
> Some writers didn't complete in expected runtime! Current writer 
> state:[Circular Writer:
>directory: /test-0
>target length: 50
>current item: 47
>done: false
> , Circular Writer:
>directory: /test-1
>target length: 50
>current item: 47
>done: false
> , Circular Writer:
>directory: /test-2
>target length: 50
>current item: 46
>done: false
> ] expected:<0> but was:<3>
> {code}






[jira] [Updated] (HDFS-16611) improve TestSeveralNameNodes#testCircularLinkedListWrites Params

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16611:
----------------------------------
Labels: pull-request-available  (was: )

> improve TestSeveralNameNodes#testCircularLinkedListWrites Params
> ---
>
> Key: HDFS-16611
> URL: https://issues.apache.org/jira/browse/HDFS-16611
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> While working on HDFS-16590, JUnit tests often reported errors; the following 
> failure messages appeared frequently in 
> org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes#testCircularLinkedListWrites.
>  * 1st run
> {code:java}
> 1st run
> [ERROR] 
> testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
>   Time elapsed: 114.252 s  <<< FAILURE!
> java.lang.AssertionError: 
> Some writers didn't complete in expected runtime! Current writer 
> state:[Circular Writer:
>directory: /test-0
>target length: 50
>current item: 43
>done: false
> , Circular Writer:
>directory: /test-1
>target length: 50
>current item: 47
>done: false
> , Circular Writer:
>directory: /test-2
>target length: 50
>current item: 42
>done: false
> ] expected:<0> but was:<3>
> {code}
>  * 2nd run
> {code:java}
>  [ERROR] 
> testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
>   Time elapsed: 110.349 s  <<< FAILURE!
> java.lang.AssertionError: 
> Some writers didn't complete in expected runtime! Current writer 
> state:[Circular Writer:
>directory: /test-0
>target length: 50
>current item: 50
>done: false
> , Circular Writer:
>directory: /test-1
>target length: 50
>current item: 49
>done: false
> , Circular Writer:
>directory: /test-2
>target length: 50
>current item: 49
>done: false
> ] expected:<0> but was:<3>
> {code}
>  * 3rd run
> {code:java}
> [ERROR] 
> testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
>   Time elapsed: 109.364 s  <<< FAILURE!
> java.lang.AssertionError: 
> Some writers didn't complete in expected runtime! Current writer 
> state:[Circular Writer:
>directory: /test-0
>target length: 50
>current item: 47
>done: false
> , Circular Writer:
>directory: /test-1
>target length: 50
>current item: 47
>done: false
> , Circular Writer:
>directory: /test-2
>target length: 50
>current item: 46
>done: false
> ] expected:<0> but was:<3>
> {code}






[jira] [Updated] (HDFS-16611) Improve TestSeveralNameNodes#testCircularLinkedListWrites Params

2022-05-31 Thread fanshilun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fanshilun updated HDFS-16611:
-
Description: 
While working on HDFS-16590, the JUnit tests often failed, and the following 
error messages appeared repeatedly in

org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes#

testCircularLinkedListWrites

The test comes very close to passing: in all three runs below, the current 
item is nearly equal to the target length. Reducing LIST_LENGTH and extending 
RUNTIME should therefore noticeably increase the success rate of this test.

Reducing LIST_LENGTH does not change the purpose of the test; it still 
exercises circular writes in the case of NN failover.
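
A minimal sketch of the kind of change proposed (the exact constants and 
values are assumptions inferred from the failing runs below, not the final 
patch):
{code:java}
// TestSeveralNameNodes (sketch): a shorter list means fewer writes per
// writer, and a longer runtime gives more headroom across NN failovers.
private static final int LIST_LENGTH = 30;          // was 50 in the runs below
private static final long RUNTIME = 2 * 60 * 1000;  // ms; illustrative value
{code}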
 * 1st run

{code:java}
1st run
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 114.252 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 43
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 42
 done: false
] expected:<0> but was:<3>
{code}
 * 2nd run

{code:java}
 [ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 110.349 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 50
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 49
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 49
 done: false
] expected:<0> but was:<3>
{code}
 * 3rd run

{code:java}
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 109.364 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 46
 done: false
] expected:<0> but was:<3>
{code}

  was:
While working on HDFS-16590, the JUnit tests often failed, and the following 
error messages appeared repeatedly in

org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes#

testCircularLinkedListWrites
 * 1st run

{code:java}
1st run
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 114.252 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 43
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 42
 done: false
] expected:<0> but was:<3>
{code}
 * 2nd run

{code:java}
 [ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 110.349 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 50
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 49
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 49
 done: false
] expected:<0> but was:<3>
{code}
 * 3rd run

{code:java}
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 109.364 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 curren

[jira] [Created] (HDFS-16612) Improve import * In HDFS Project

2022-05-31 Thread fanshilun (Jira)
fanshilun created HDFS-16612:


 Summary: Improve import * In HDFS Project
 Key: HDFS-16612
 URL: https://issues.apache.org/jira/browse/HDFS-16612
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.4.0
Reporter: fanshilun
Assignee: fanshilun






--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-16612) Improve import * In HDFS Project

2022-05-31 Thread fanshilun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-16612 started by fanshilun.

> Improve import * In HDFS Project
> ---
>
> Key: HDFS-16612
> URL: https://issues.apache.org/jira/browse/HDFS-16612
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-05-31 Thread caozhiqiang (Jira)
caozhiqiang created HDFS-16613:
--

 Summary: EC: Improve performance of decommissioning dn with many 
ec blocks
 Key: HDFS-16613
 URL: https://issues.apache.org/jira/browse/HDFS-16613
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ec, erasure-coding, namenode
Affects Versions: 3.4.0
Reporter: caozhiqiang
Assignee: caozhiqiang


In an HDFS cluster with many EC blocks, decommissioning a DataNode is very 
slow. Unlike replicated blocks, which can be copied from any DataNode holding 
a replica, an EC block can only be replicated from the decommissioning 
DataNode itself. The configurations dfs.namenode.replication.max-streams and 
dfs.namenode.replication.max-streams-hard-limit cap the replication speed, but 
raising them puts the whole cluster's network at risk. A new configuration 
should therefore limit streams on the decommissioning DataNode separately from 
the cluster-wide max-streams limit.
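
A minimal sketch of the idea, assuming a hypothetical key name (the real name 
and default would be decided in the patch):
{code:java}
// Hypothetical: a separate replication-stream limit for a decommissioning
// DataNode, so the cluster-wide max-streams settings can stay conservative.
public static final String DFS_NAMENODE_DECOMMISSION_MAX_STREAMS_KEY =
    "dfs.namenode.decommission.max-streams";   // assumed key name
public static final int DFS_NAMENODE_DECOMMISSION_MAX_STREAMS_DEFAULT = 32;

int decommissionMaxStreams = conf.getInt(
    DFS_NAMENODE_DECOMMISSION_MAX_STREAMS_KEY,
    DFS_NAMENODE_DECOMMISSION_MAX_STREAMS_DEFAULT);
{code}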



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16576) Remove unused Imports in Hadoop HDFS project

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16576:
--
Labels: pull-request-available  (was: )

> Remove unused Imports in Hadoop HDFS project
> 
>
> Key: HDFS-16576
> URL: https://issues.apache.org/jira/browse/HDFS-16576
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ashutosh Gupta
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Optimize Imports to keep code clean
>  # Remove any unused imports



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16576) Remove unused Imports in Hadoop HDFS project

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16576?focusedWorklogId=776428&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776428
 ]

ASF GitHub Bot logged work on HDFS-16576:
-

Author: ASF GitHub Bot
Created on: 31/May/22 16:28
Start Date: 31/May/22 16:28
Worklog Time Spent: 10m 
  Work Description: ashutoshcipher opened a new pull request, #4389:
URL: https://github.com/apache/hadoop/pull/4389

   ### Description of PR
   Remove unused Imports in Hadoop HDFS project
   * JIRA: HDFS-16576
   
   
   - [x] Does the title of this PR start with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   




Issue Time Tracking
---

Worklog Id: (was: 776428)
Remaining Estimate: 0h
Time Spent: 10m

> Remove unused Imports in Hadoop HDFS project
> 
>
> Key: HDFS-16576
> URL: https://issues.apache.org/jira/browse/HDFS-16576
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ashutosh Gupta
>Assignee: Ashutosh Gupta
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> h3. Optimize Imports to keep code clean
>  # Remove any unused imports



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-05-31 Thread caozhiqiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

caozhiqiang updated HDFS-16613:
---
Description: 
In an HDFS cluster with many EC blocks, decommissioning a DataNode is very 
slow. Unlike replicated blocks, which can be copied from any DataNode holding 
a replica, an EC block can only be replicated from the decommissioning 
DataNode itself.

The configurations dfs.namenode.replication.max-streams and 
dfs.namenode.replication.max-streams-hard-limit cap the replication speed, but 
raising them puts the whole cluster's network at risk. A new configuration 
should therefore limit streams on the decommissioning DataNode separately from 
the cluster-wide max-streams limit.

  was:In an HDFS cluster with many EC blocks, decommissioning a DataNode is 
very slow. Unlike replicated blocks, which can be copied from any DataNode 
holding a replica, an EC block can only be replicated from the decommissioning 
DataNode itself. The configurations dfs.namenode.replication.max-streams and 
dfs.namenode.replication.max-streams-hard-limit cap the replication speed, but 
raising them puts the whole cluster's network at risk. A new configuration 
should therefore limit streams on the decommissioning DataNode separately from 
the cluster-wide max-streams limit.


> EC: Improve performance of decommissioning dn with many ec blocks
> -
>
> Key: HDFS-16613
> URL: https://issues.apache.org/jira/browse/HDFS-16613
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ec, erasure-coding, namenode
>Affects Versions: 3.4.0
>Reporter: caozhiqiang
>Assignee: caozhiqiang
>Priority: Major
>
> In an HDFS cluster with many EC blocks, decommissioning a DataNode is very 
> slow. Unlike replicated blocks, which can be copied from any DataNode 
> holding a replica, an EC block can only be replicated from the 
> decommissioning DataNode itself.
> The configurations dfs.namenode.replication.max-streams and 
> dfs.namenode.replication.max-streams-hard-limit cap the replication speed, 
> but raising them puts the whole cluster's network at risk. A new 
> configuration should therefore limit streams on the decommissioning 
> DataNode separately from the cluster-wide max-streams limit.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-05-31 Thread caozhiqiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-16613 started by caozhiqiang.
--
> EC: Improve performance of decommissioning dn with many ec blocks
> -
>
> Key: HDFS-16613
> URL: https://issues.apache.org/jira/browse/HDFS-16613
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ec, erasure-coding, namenode
>Affects Versions: 3.4.0
>Reporter: caozhiqiang
>Assignee: caozhiqiang
>Priority: Major
>
> In an HDFS cluster with many EC blocks, decommissioning a DataNode is very 
> slow. Unlike replicated blocks, which can be copied from any DataNode 
> holding a replica, an EC block can only be replicated from the 
> decommissioning DataNode itself.
> The configurations dfs.namenode.replication.max-streams and 
> dfs.namenode.replication.max-streams-hard-limit cap the replication speed, 
> but raising them puts the whole cluster's network at risk. A new 
> configuration should therefore limit streams on the decommissioning 
> DataNode separately from the cluster-wide max-streams limit.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-05-31 Thread caozhiqiang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

caozhiqiang updated HDFS-16613:
---
Status: Patch Available  (was: In Progress)

> EC: Improve performance of decommissioning dn with many ec blocks
> -
>
> Key: HDFS-16613
> URL: https://issues.apache.org/jira/browse/HDFS-16613
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ec, erasure-coding, namenode
>Affects Versions: 3.4.0
>Reporter: caozhiqiang
>Assignee: caozhiqiang
>Priority: Major
>
> In an HDFS cluster with many EC blocks, decommissioning a DataNode is very 
> slow. Unlike replicated blocks, which can be copied from any DataNode 
> holding a replica, an EC block can only be replicated from the 
> decommissioning DataNode itself.
> The configurations dfs.namenode.replication.max-streams and 
> dfs.namenode.replication.max-streams-hard-limit cap the replication speed, 
> but raising them puts the whole cluster's network at risk. A new 
> configuration should therefore limit streams on the decommissioning 
> DataNode separately from the cluster-wide max-streams limit.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16611) Improve TestSeveralNameNodes#testCircularLinkedListWrites Params

2022-05-31 Thread fanshilun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

fanshilun updated HDFS-16611:
-
Description: 
While working on HDFS-16590, the JUnit tests often failed, and the following 
error messages appeared repeatedly in

org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes#

testCircularLinkedListWrites

The test comes very close to passing: in all three runs below, the current 
item is nearly equal to the target length. Reducing LIST_LENGTH and extending 
RUNTIME should therefore noticeably increase the success rate of this test.

Reducing LIST_LENGTH does not change the purpose of the test; it still 
exercises circular writes in the case of NN failover.
 * 1st run

{code:java}
1st run
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 114.252 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 43
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 42
 done: false
] expected:<0> but was:<3>
{code}
 * 2nd run

{code:java}
 [ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 110.349 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 50
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 49
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 49
 done: false
] expected:<0> but was:<3>
{code}
 * 3rd run

{code:java}
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 109.364 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 46
 done: false
] expected:<0> but was:<3>
{code}

  was:
While working on HDFS-16590, the JUnit tests often failed, and the following 
error messages appeared repeatedly in

org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes#

testCircularLinkedListWrites

The test comes very close to passing: in all three runs below, the current 
item is nearly equal to the target length. Reducing LIST_LENGTH and extending 
RUNTIME should therefore noticeably increase the success rate of this test.

Reducing LIST_LENGTH does not change the purpose of the test; it still 
exercises circular writes in the case of NN failover.
 * 1st run

{code:java}
1st run
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 114.252 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 43
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 47
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 42
 done: false
] expected:<0> but was:<3>
{code}
 * 2nd run

{code:java}
 [ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 110.349 s  <<< FAILURE!
java.lang.AssertionError: 
Some writers didn't complete in expected runtime! Current writer 
state:[Circular Writer:
 directory: /test-0
 target length: 50
 current item: 50
 done: false
, Circular Writer:
 directory: /test-1
 target length: 50
 current item: 49
 done: false
, Circular Writer:
 directory: /test-2
 target length: 50
 current item: 49
 done: false
] expected:<0> but was:<3>
{code}
 * 3rd run

{code:java}
[ERROR] 
testCircularLinkedListWrites(org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes)
  Time elapsed: 109.364 s  <<< FAILURE!
java.lang.Assertion

[jira] [Work logged] (HDFS-16605) Improve Code With Lambda in hadoop-hdfs-rbf module

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16605?focusedWorklogId=776473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776473
 ]

ASF GitHub Bot logged work on HDFS-16605:
-

Author: ASF GitHub Bot
Created on: 31/May/22 17:36
Start Date: 31/May/22 17:36
Worklog Time Spent: 10m 
  Work Description: goiri commented on code in PR #4375:
URL: https://github.com/apache/hadoop/pull/4375#discussion_r885918464


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/resolver/order/TestLocalResolver.java:
##
@@ -78,12 +77,8 @@ public void testLocalResolver() throws IOException {
 StringBuilder sb = new StringBuilder("clientX");
 LocalResolver localResolver = new LocalResolver(conf, router);
 LocalResolver spyLocalResolver = spy(localResolver);
-doAnswer(new Answer<String>() {
-  @Override
-  public String answer(InvocationOnMock invocation) throws Throwable {
-return sb.toString();
-  }
-}).when(spyLocalResolver).getClientAddr();
+doAnswer((Answer<String>) invocation ->

Review Comment:
   The line split is a little weird. Maybe better as:
   ```
   doAnswer((Answer<String>) invocation -> sb.toString()
   ).when(spyLocalResolver).getClientAddr();
   ```
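
   For reference, the complete statement might then read (a sketch; the 
generic cast mirrors the surrounding test code and may be optional):
   ```
   doAnswer((Answer<String>) invocation -> sb.toString())
       .when(spyLocalResolver).getClientAddr();
   ```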





Issue Time Tracking
---

Worklog Id: (was: 776473)
Time Spent: 50m  (was: 40m)

> Improve Code With Lambda in hadoop-hdfs-rbf module
> --
>
> Key: HDFS-16605
> URL: https://issues.apache.org/jira/browse/HDFS-16605
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16610) Make fsck read timeout configurable

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16610?focusedWorklogId=776491&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776491
 ]

ASF GitHub Bot logged work on HDFS-16610:
-

Author: ASF GitHub Bot
Created on: 31/May/22 18:27
Start Date: 31/May/22 18:27
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4384:
URL: https://github.com/apache/hadoop/pull/4384#issuecomment-1142507615

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 37s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 48s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  24m 47s |  |  trunk passed  |
   | -1 :x: |  compile  |   2m 24s | 
[/branch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/1/artifact/out/branch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs-project in trunk failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   2m 17s | 
[/branch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/1/artifact/out/branch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  hadoop-hdfs-project in trunk failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  checkstyle  |   1m 36s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m  2s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 22s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 44s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   6m 24s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m  1s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 22s |  |  the patch passed  |
   | -1 :x: |  compile  |   2m 11s | 
[/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/1/artifact/out/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs-project in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  javac  |   2m 11s | 
[/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/1/artifact/out/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs-project in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   2m  0s | 
[/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/1/artifact/out/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  hadoop-hdfs-project in the patch failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | -1 :x: |  javac  |   2m  0s | 
[/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/1/artifact/out/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  hadoop-hdfs-project in the patch failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  ch

[jira] [Reopened] (HDFS-16583) DatanodeAdminDefaultMonitor can get stuck in an infinite loop

2022-05-31 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell reopened HDFS-16583:
--

Reopening to add a branch-3.2 PR.

> DatanodeAdminDefaultMonitor can get stuck in an infinite loop
> -
>
> Key: HDFS-16583
> URL: https://issues.apache.org/jira/browse/HDFS-16583
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> We encountered a case where the decommission monitor in the namenode got 
> stuck for about 6 hours. The logs give:
> {code}
> 2022-05-15 01:09:25,490 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.132:50010
> 2022-05-15 01:10:20,918 INFO org.apache.hadoop.http.HttpServer2: Process 
> Thread Dump: jsp requested
> 
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753665_3428271426
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753659_3428271420
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753662_3428271423
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753663_3428271424
> 2022-05-15 06:00:57,281 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.34:50010
> 2022-05-15 06:00:58,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem write lock 
> held for 17492614 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:263)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:220)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1601)
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:496)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>   Number of suppressed write-lock reports: 0
>   Longest write-lock held interval: 17492614
> {code}
> We only have the one thread dump triggered by the FC:
> {code}
> Thread 80 (DatanodeAdminMonitor-0):
>   State: RUNNABLE
>   Blocked count: 16
>   Waited count: 453693
>   Stack:
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.check(DatanodeAdminManager.java:538)
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:494)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
> {code}
> This was the line of code:
> {code}
> private void check() {
>   final Iterator<Map.Entry<DatanodeDescriptor, AbstractList<BlockInfo>>>
>   it = new CyclicIteration<>(outOfServiceNodeBlocks,
>   iterkey).iterator();
>   final LinkedList<DatanodeDescriptor> toRemove = new LinkedList<>();
>   while (it.hasNext() && !exceededNumBlocksPerCheck() && namesystem
>   .isRunning()) {
> numNodesChecked++;
> final Map.Entry<DatanodeDescriptor, AbstractList<BlockInfo>>
> entry = it.next();
> final DatanodeDescriptor dn = entry.getKey();
> AbstractList<BlockInfo> blocks = entry.getValue();
> boolean fullScan = false;
> if (dn.isMaintenance() && d

[jira] [Work logged] (HDFS-16583) DatanodeAdminDefaultMonitor can get stuck in an infinite loop

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16583?focusedWorklogId=776551&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776551
 ]

ASF GitHub Bot logged work on HDFS-16583:
-

Author: ASF GitHub Bot
Created on: 31/May/22 20:27
Start Date: 31/May/22 20:27
Worklog Time Spent: 10m 
  Work Description: sodonnel opened a new pull request, #4394:
URL: https://github.com/apache/hadoop/pull/4394

   ### Description of PR
   
   Avoid concurrently modifying the outOfServiceNodeBlocks map in the 
DatanodeAdminDefaultMonitor when ending maintenance. The concurrent 
modification can cause the monitor to get stuck in an infinite loop in some 
circumstances, while holding the NN write lock.
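
   A minimal sketch of the pattern, reusing names from the snippets in the 
Jira (an illustration under assumptions, not the exact patch): removals are 
deferred until the cyclic iteration has finished, so the map is never mutated 
mid-iteration.
   ```
   final List<DatanodeDescriptor> toRemove = new ArrayList<>();
   while (it.hasNext() && !exceededNumBlocksPerCheck()
       && namesystem.isRunning()) {
     final Map.Entry<DatanodeDescriptor, AbstractList<BlockInfo>> entry =
         it.next();
     final DatanodeDescriptor dn = entry.getKey();
     if (shouldStopTracking(dn)) { // hypothetical helper for finished nodes
       toRemove.add(dn);           // do NOT remove from the map here
       continue;
     }
     // ... scan the node's blocks as before ...
   }
   for (DatanodeDescriptor dn : toRemove) {
     outOfServiceNodeBlocks.remove(dn); // safe: iteration has completed
   }
   ```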
   
   The issue does not affect the DatanodeAdminBackoffMonitor, as it is 
implemented slightly differently.
   
   More details in the Jira https://issues.apache.org/jira/browse/HDFS-16583
   
   This is a backport for branch-3.2 as the newer branches have diverged 
significantly.
   
   ### How was this patch tested?
   
   Existing tests cover it.
   
   




Issue Time Tracking
---

Worklog Id: (was: 776551)
Time Spent: 1.5h  (was: 1h 20m)

> DatanodeAdminDefaultMonitor can get stuck in an infinite loop
> -
>
> Key: HDFS-16583
> URL: https://issues.apache.org/jira/browse/HDFS-16583
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> We encountered a case where the decommission monitor in the namenode got 
> stuck for about 6 hours. The logs give:
> {code}
> 2022-05-15 01:09:25,490 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.132:50010
> 2022-05-15 01:10:20,918 INFO org.apache.hadoop.http.HttpServer2: Process 
> Thread Dump: jsp requested
> 
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753665_3428271426
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753659_3428271420
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753662_3428271423
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753663_3428271424
> 2022-05-15 06:00:57,281 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.34:50010
> 2022-05-15 06:00:58,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem write lock 
> held for 17492614 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:263)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:220)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1601)
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:496)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>   Number of suppressed write-lock reports: 0
>   Longest write-lock held interval: 17492614
> {code}
> We only have the one thread dump triggered by the FC:
> {code}
> Thread 80 (DatanodeAdminMonitor-0):
>   State: RUNNABLE
>   Blocked count: 16
>   Waited count: 453693
>   Stack:
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.check(DatanodeAdminManager.java:538)
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:494)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> 
> java.util.concurrent.ScheduledThreadPool

[jira] [Work logged] (HDFS-16583) DatanodeAdminDefaultMonitor can get stuck in an infinite loop

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16583?focusedWorklogId=776552&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776552
 ]

ASF GitHub Bot logged work on HDFS-16583:
-

Author: ASF GitHub Bot
Created on: 31/May/22 20:29
Start Date: 31/May/22 20:29
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on PR #4394:
URL: https://github.com/apache/hadoop/pull/4394#issuecomment-1142611040

   @jojochuang Please have a look if you have time. This is a backport of the 
earlier PR.




Issue Time Tracking
---

Worklog Id: (was: 776552)
Time Spent: 1h 40m  (was: 1.5h)

> DatanodeAdminDefaultMonitor can get stuck in an infinite loop
> -
>
> Key: HDFS-16583
> URL: https://issues.apache.org/jira/browse/HDFS-16583
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We encountered a case where the decommission monitor in the namenode got 
> stuck for about 6 hours. The logs give:
> {code}
> 2022-05-15 01:09:25,490 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.132:50010
> 2022-05-15 01:10:20,918 INFO org.apache.hadoop.http.HttpServer2: Process 
> Thread Dump: jsp requested
> 
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753665_3428271426
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753659_3428271420
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753662_3428271423
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753663_3428271424
> 2022-05-15 06:00:57,281 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.34:50010
> 2022-05-15 06:00:58,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem write lock 
> held for 17492614 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:263)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:220)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1601)
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:496)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>   Number of suppressed write-lock reports: 0
>   Longest write-lock held interval: 17492614
> {code}
> We only have the one thread dump triggered by the FC:
> {code}
> Thread 80 (DatanodeAdminMonitor-0):
>   State: RUNNABLE
>   Blocked count: 16
>   Waited count: 453693
>   Stack:
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.check(DatanodeAdminManager.java:538)
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:494)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
> {code}
> This was the line of code:
> {code}
> private void check() {
>   final Iterator<Map.Entry<DatanodeDescriptor, AbstractList<BlockInfo>>>
>   it = new CyclicI

[jira] [Work logged] (HDFS-16610) Make fsck read timeout configurable

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16610?focusedWorklogId=776561&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776561
 ]

ASF GitHub Bot logged work on HDFS-16610:
-

Author: ASF GitHub Bot
Created on: 31/May/22 21:15
Start Date: 31/May/22 21:15
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4384:
URL: https://github.com/apache/hadoop/pull/4384#issuecomment-1142646480

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 39s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 47s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  24m 49s |  |  trunk passed  |
   | -1 :x: |  compile  |   2m 32s | 
[/branch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/2/artifact/out/branch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs-project in trunk failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   2m 17s | 
[/branch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/2/artifact/out/branch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  hadoop-hdfs-project in trunk failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  checkstyle  |   1m 34s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m  3s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 22s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 38s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   6m 35s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 52s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 32s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 21s |  |  the patch passed  |
   | -1 :x: |  compile  |   2m 20s | 
[/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/2/artifact/out/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs-project in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  javac  |   2m 20s | 
[/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/2/artifact/out/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs-project in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   1m 57s | 
[/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/2/artifact/out/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  hadoop-hdfs-project in the patch failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | -1 :x: |  javac  |   1m 57s | 
[/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4384/2/artifact/out/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  hadoop-hdfs-project in the patch failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  ch

[jira] [Work logged] (HDFS-16610) Make fsck read timeout configurable

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16610?focusedWorklogId=776566&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776566
 ]

ASF GitHub Bot logged work on HDFS-16610:
-

Author: ASF GitHub Bot
Created on: 31/May/22 21:30
Start Date: 31/May/22 21:30
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on PR #4384:
URL: https://github.com/apache/hadoop/pull/4384#issuecomment-1142657950

   The native client is failing to build, but I cannot see how this change 
could cause that. I wonder if there is something else going on, or some other 
change has broken the build recently?
   
   ```
   [INFO] Reactor Summary for Apache Hadoop HDFS Project 3.4.0-SNAPSHOT:
   [INFO] 
   [INFO] Apache Hadoop HDFS Client .. SUCCESS [ 34.048 
s]
   [INFO] Apache Hadoop HDFS . SUCCESS [ 57.213 
s]
   [INFO] Apache Hadoop HDFS Native Client ... FAILURE [  7.360 
s]
   ```




Issue Time Tracking
---

Worklog Id: (was: 776566)
Time Spent: 1h  (was: 50m)

> Make fsck read timeout configurable
> ---
>
> Key: HDFS-16610
> URL: https://issues.apache.org/jira/browse/HDFS-16610
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> In a cluster with a lot of small files, we encountered a case where fsck was 
> very slow. I believe it is due to contention with many other threads reading 
> / writing data on the cluster.
> Sometimes fsck does not report any progress for more than 60 seconds and the 
> client times out. Currently the connect and read timeouts are hardcoded to 
> 60 seconds; this change makes them configurable.
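
A minimal sketch of the configurable form, assuming hypothetical key names 
(the patch may choose different ones), replacing the hardcoded 60-second 
values:
{code:java}
// Assumed keys; the defaults keep today's 60-second behaviour.
int connectTimeout = (int) conf.getTimeDuration(
    "dfs.fsck.connect.timeout", 60_000, TimeUnit.MILLISECONDS);
int readTimeout = (int) conf.getTimeDuration(
    "dfs.fsck.read.timeout", 60_000, TimeUnit.MILLISECONDS);

URLConnection connection = url.openConnection();
connection.setConnectTimeout(connectTimeout);
connection.setReadTimeout(readTimeout);
{code}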



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16605) Improve Code With Lambda in hadoop-hdfs-rbf module

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16605?focusedWorklogId=776581&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776581
 ]

ASF GitHub Bot logged work on HDFS-16605:
-

Author: ASF GitHub Bot
Created on: 31/May/22 22:29
Start Date: 31/May/22 22:29
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on code in PR #4375:
URL: https://github.com/apache/hadoop/pull/4375#discussion_r886177581


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/resolver/order/TestLocalResolver.java:
##
@@ -78,12 +77,8 @@ public void testLocalResolver() throws IOException {
 StringBuilder sb = new StringBuilder("clientX");
 LocalResolver localResolver = new LocalResolver(conf, router);
 LocalResolver spyLocalResolver = spy(localResolver);
-doAnswer(new Answer<String>() {
-  @Override
-  public String answer(InvocationOnMock invocation) throws Throwable {
-return sb.toString();
-  }
-}).when(spyLocalResolver).getClientAddr();
+doAnswer((Answer<String>) invocation ->

Review Comment:
   Thanks for your help reviewing the code, I will fix it.





Issue Time Tracking
---

Worklog Id: (was: 776581)
Time Spent: 1h  (was: 50m)

> Improve Code With Lambda in hadoop-hdfs-rbf module
> --
>
> Key: HDFS-16605
> URL: https://issues.apache.org/jira/browse/HDFS-16605
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: rbf
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16603) Improve DatanodeHttpServer With Netty recommended method

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16603?focusedWorklogId=776587&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776587
 ]

ASF GitHub Bot logged work on HDFS-16603:
-

Author: ASF GitHub Bot
Created on: 31/May/22 22:45
Start Date: 31/May/22 22:45
Worklog Time Spent: 10m 
  Work Description: jojochuang merged PR #4372:
URL: https://github.com/apache/hadoop/pull/4372




Issue Time Tracking
---

Worklog Id: (was: 776587)
Time Spent: 40m  (was: 0.5h)

> Improve DatanodeHttpServer With Netty recommended method
> 
>
> Key: HDFS-16603
> URL: https://issues.apache.org/jira/browse/HDFS-16603
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When reading the code, I found that some API usages are deprecated after 
> the upgrade of the Netty components.
> {color:#172b4d}*1.DatanodeHttpServer#Constructor*{color}
> {code:java}
> @Deprecated
> public static final ChannelOption<Integer> WRITE_BUFFER_HIGH_WATER_MARK = 
> valueOf("WRITE_BUFFER_HIGH_WATER_MARK"); 
> Deprecated. Use WRITE_BUFFER_WATER_MARK
> @Deprecated
> public static final ChannelOption<Integer> WRITE_BUFFER_LOW_WATER_MARK = 
> valueOf("WRITE_BUFFER_LOW_WATER_MARK");
> Deprecated. Use WRITE_BUFFER_WATER_MARK
> -
> this.httpServer.childOption(
>           ChannelOption.WRITE_BUFFER_HIGH_WATER_MARK,
>           conf.getInt(
>               DFSConfigKeys.DFS_WEBHDFS_NETTY_HIGH_WATERMARK,
>               DFSConfigKeys.DFS_WEBHDFS_NETTY_HIGH_WATERMARK_DEFAULT));
> this.httpServer.childOption(
>           ChannelOption.WRITE_BUFFER_LOW_WATER_MARK,
>           conf.getInt(
>               DFSConfigKeys.DFS_WEBHDFS_NETTY_LOW_WATERMARK,
>               DFSConfigKeys.DFS_WEBHDFS_NETTY_LOW_WATERMARK_DEFAULT));
> {code}
> *2.Duplicate code* 
> {code:java}
> ChannelFuture f = httpServer.bind(infoAddr);
> try {
>  f.syncUninterruptibly();
> } catch (Throwable e) {
>   if (e instanceof BindException) {
>    throw NetUtils.wrapException(null, 0, infoAddr.getHostName(),
>    infoAddr.getPort(), (SocketException) e);
>  } else {
>    throw e;
>  }
> }
> httpAddress = (InetSocketAddress) f.channel().localAddress();
> LOG.info("Listening HTTP traffic on " + httpAddress);{code}
> *3.io.netty.bootstrap.ChannelFactory Deprecated*
> *use io.netty.channel.ChannelFactory instead.*
> {code:java}
> /** @deprecated */
> @Deprecated
> public interface ChannelFactory<T extends Channel> {
>     T newChannel();
> }{code}
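
For reference, a sketch of the non-deprecated form: a single 
WriteBufferWaterMark (low watermark first, then high) replaces the two 
separate options, reusing the existing DFSConfigKeys values:
{code:java}
this.httpServer.childOption(
    ChannelOption.WRITE_BUFFER_WATER_MARK,
    new WriteBufferWaterMark(
        conf.getInt(DFSConfigKeys.DFS_WEBHDFS_NETTY_LOW_WATERMARK,
            DFSConfigKeys.DFS_WEBHDFS_NETTY_LOW_WATERMARK_DEFAULT),
        conf.getInt(DFSConfigKeys.DFS_WEBHDFS_NETTY_HIGH_WATERMARK,
            DFSConfigKeys.DFS_WEBHDFS_NETTY_HIGH_WATERMARK_DEFAULT)));
{code}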



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16603) Improve DatanodeHttpServer With Netty recommended method

2022-05-31 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16603.

Resolution: Fixed

> Improve DatanodeHttpServer With Netty recommended method
> 
>
> Key: HDFS-16603
> URL: https://issues.apache.org/jira/browse/HDFS-16603
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> When reading the code, I found that some API usages are deprecated after 
> the upgrade of the Netty components.
> {color:#172b4d}*1.DatanodeHttpServer#Constructor*{color}
> {code:java}
> @Deprecated
> public static final ChannelOption<Integer> WRITE_BUFFER_HIGH_WATER_MARK = 
> valueOf("WRITE_BUFFER_HIGH_WATER_MARK"); 
> Deprecated. Use WRITE_BUFFER_WATER_MARK
> @Deprecated
> public static final ChannelOption<Integer> WRITE_BUFFER_LOW_WATER_MARK = 
> valueOf("WRITE_BUFFER_LOW_WATER_MARK");
> Deprecated. Use WRITE_BUFFER_WATER_MARK
> -
> this.httpServer.childOption(
>           ChannelOption.WRITE_BUFFER_HIGH_WATER_MARK,
>           conf.getInt(
>               DFSConfigKeys.DFS_WEBHDFS_NETTY_HIGH_WATERMARK,
>               DFSConfigKeys.DFS_WEBHDFS_NETTY_HIGH_WATERMARK_DEFAULT));
> this.httpServer.childOption(
>           ChannelOption.WRITE_BUFFER_LOW_WATER_MARK,
>           conf.getInt(
>               DFSConfigKeys.DFS_WEBHDFS_NETTY_LOW_WATERMARK,
>               DFSConfigKeys.DFS_WEBHDFS_NETTY_LOW_WATERMARK_DEFAULT));
> {code}
> *2.Duplicate code* 
> {code:java}
> ChannelFuture f = httpServer.bind(infoAddr);
> try {
>  f.syncUninterruptibly();
> } catch (Throwable e) {
>   if (e instanceof BindException) {
>    throw NetUtils.wrapException(null, 0, infoAddr.getHostName(),
>    infoAddr.getPort(), (SocketException) e);
>  } else {
>    throw e;
>  }
> }
> httpAddress = (InetSocketAddress) f.channel().localAddress();
> LOG.info("Listening HTTP traffic on " + httpAddress);{code}
> *3.io.netty.bootstrap.ChannelFactory Deprecated*
> *use io.netty.channel.ChannelFactory instead.*
> {code:java}
> /** @deprecated */
> @Deprecated
> public interface ChannelFactory<T extends Channel> {
>     T newChannel();
> }{code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16576) Remove unused Imports in Hadoop HDFS project

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16576?focusedWorklogId=776595&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776595
 ]

ASF GitHub Bot logged work on HDFS-16576:
-

Author: ASF GitHub Bot
Created on: 31/May/22 23:53
Start Date: 31/May/22 23:53
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4389:
URL: https://github.com/apache/hadoop/pull/4389#issuecomment-1142778740

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 43s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 32 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 49s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m 52s |  |  trunk passed  |
   | -1 :x: |  compile  |   2m 45s | 
[/branch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4389/1/artifact/out/branch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs-project in trunk failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   2m 26s | 
[/branch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4389/1/artifact/out/branch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  hadoop-hdfs-project in trunk failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  checkstyle  |   1m 49s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   5m 45s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   4m 31s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   5m  7s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |  10m 31s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 14s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m 12s |  |  the patch passed  |
   | -1 :x: |  compile  |   2m 29s | 
[/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4389/1/artifact/out/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs-project in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  javac  |   2m 29s | 
[/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4389/1/artifact/out/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  hadoop-hdfs-project in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   2m  9s | 
[/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4389/1/artifact/out/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  hadoop-hdfs-project in the patch failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | -1 :x: |  javac  |   2m  9s | 
[/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4389/1/artifact/out/patch-compile-hadoop-hdfs-project-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  hadoop-hdfs-project in the patch failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 22s |  |  hadoop-hdfs-project: The 
patch generated 0 new + 507 unchanged - 63 fixed = 507 total (was 570)  |
   | +1 :green_heart: |  mvnsite  |   4m 32s |  |  the patch passed  |

[jira] [Work logged] (HDFS-16583) DatanodeAdminDefaultMonitor can get stuck in an infinite loop

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16583?focusedWorklogId=776597&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776597
 ]

ASF GitHub Bot logged work on HDFS-16583:
-

Author: ASF GitHub Bot
Created on: 31/May/22 23:55
Start Date: 31/May/22 23:55
Worklog Time Spent: 10m 
  Work Description: jojochuang merged PR #4394:
URL: https://github.com/apache/hadoop/pull/4394




Issue Time Tracking
---

Worklog Id: (was: 776597)
Time Spent: 1h 50m  (was: 1h 40m)

> DatanodeAdminDefaultMonitor can get stuck in an infinite loop
> -
>
> Key: HDFS-16583
> URL: https://issues.apache.org/jira/browse/HDFS-16583
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.3.4
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> We encountered a case where the decommission monitor in the namenode got 
> stuck for about 6 hours. The logs give:
> {code}
> 2022-05-15 01:09:25,490 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.132:50010
> 2022-05-15 01:10:20,918 INFO org.apache.hadoop.http.HttpServer2: Process 
> Thread Dump: jsp requested
> 
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753665_3428271426
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753659_3428271420
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753662_3428271423
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753663_3428271424
> 2022-05-15 06:00:57,281 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.34:50010
> 2022-05-15 06:00:58,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem write lock 
> held for 17492614 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:263)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:220)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1601)
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:496)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>   Number of suppressed write-lock reports: 0
>   Longest write-lock held interval: 17492614
> {code}
> We only have the one thread dump triggered by the FC:
> {code}
> Thread 80 (DatanodeAdminMonitor-0):
>   State: RUNNABLE
>   Blocked count: 16
>   Waited count: 453693
>   Stack:
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.check(DatanodeAdminManager.java:538)
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:494)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
> {code}
> This was the line of code:
> {code}
> private void check() {
>   final Iterator<Map.Entry<DatanodeDescriptor, AbstractList<BlockInfo>>>
>       it = new CyclicIteration<>(outOfServiceNodeBlocks,
>           iterkey).iterator();
>   final LinkedList<DatanodeDescriptor> toRemove = new LinkedList<>();

[jira] [Updated] (HDFS-16583) DatanodeAdminDefaultMonitor can get stuck in an infinite loop

2022-05-31 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-16583:
---
Fix Version/s: 3.2.4

> DatanodeAdminDefaultMonitor can get stuck in an infinite loop
> -
>
> Key: HDFS-16583
> URL: https://issues.apache.org/jira/browse/HDFS-16583
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.4
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> We encountered a case where the decommission monitor in the namenode got 
> stuck for about 6 hours. The logs give:
> {code}
> 2022-05-15 01:09:25,490 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.132:50010
> 2022-05-15 01:10:20,918 INFO org.apache.hadoop.http.HttpServer2: Process 
> Thread Dump: jsp requested
> 
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753665_3428271426
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753659_3428271420
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753662_3428271423
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753663_3428271424
> 2022-05-15 06:00:57,281 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.34:50010
> 2022-05-15 06:00:58,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem write lock 
> held for 17492614 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:263)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:220)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1601)
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:496)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>   Number of suppressed write-lock reports: 0
>   Longest write-lock held interval: 17492614
> {code}
> We only have the one thread dump triggered by the FC:
> {code}
> Thread 80 (DatanodeAdminMonitor-0):
>   State: RUNNABLE
>   Blocked count: 16
>   Waited count: 453693
>   Stack:
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.check(DatanodeAdminManager.java:538)
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:494)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
> {code}
> This was the line of code:
> {code}
> private void check() {
>   final Iterator<Map.Entry<DatanodeDescriptor, AbstractList<BlockInfo>>>
>       it = new CyclicIteration<>(outOfServiceNodeBlocks,
>           iterkey).iterator();
>   final LinkedList<DatanodeDescriptor> toRemove = new LinkedList<>();
>   while (it.hasNext() && !exceededNumBlocksPerCheck() && namesystem
>       .isRunning()) {
>     numNodesChecked++;
>     final Map.Entry<DatanodeDescriptor, AbstractList<BlockInfo>>
>         entry = it.next();
>     final DatanodeDescriptor dn = entry.getKey();
>     AbstractList<BlockInfo> blocks = entry.getValue();
>     boolean fullScan = false;
>     if (dn.isMaintenance() && dn.maintenanceExpired()) {

[jira] [Resolved] (HDFS-16583) DatanodeAdminDefaultMonitor can get stuck in an infinite loop

2022-05-31 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-16583.

Resolution: Fixed

> DatanodeAdminDefaultMonitor can get stuck in an infinite loop
> -
>
> Key: HDFS-16583
> URL: https://issues.apache.org/jira/browse/HDFS-16583
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.4, 3.3.4
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> We encountered a case where the decommission monitor in the namenode got 
> stuck for about 6 hours. The logs give:
> {code}
> 2022-05-15 01:09:25,490 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.132:50010
> 2022-05-15 01:10:20,918 INFO org.apache.hadoop.http.HttpServer2: Process 
> Thread Dump: jsp requested
> 
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753665_3428271426
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753659_3428271420
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753662_3428271423
> 2022-05-15 01:19:06,810 WARN 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: 
> PendingReconstructionMonitor timed out blk_4501753663_3428271424
> 2022-05-15 06:00:57,281 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager: Stopping 
> maintenance of dead node 10.185.3.34:50010
> 2022-05-15 06:00:58,105 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem write lock 
> held for 17492614 ms via
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:263)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:220)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1601)
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:496)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>   Number of suppressed write-lock reports: 0
>   Longest write-lock held interval: 17492614
> {code}
> We only have the one thread dump triggered by the FC:
> {code}
> Thread 80 (DatanodeAdminMonitor-0):
>   State: RUNNABLE
>   Blocked count: 16
>   Waited count: 453693
>   Stack:
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.check(DatanodeAdminManager.java:538)
> 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeAdminManager$Monitor.run(DatanodeAdminManager.java:494)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
> {code}
> This was the line of code:
> {code}
> private void check() {
>   final Iterator<Map.Entry<DatanodeDescriptor, AbstractList<BlockInfo>>>
>       it = new CyclicIteration<>(outOfServiceNodeBlocks,
>           iterkey).iterator();
>   final LinkedList<DatanodeDescriptor> toRemove = new LinkedList<>();
>   while (it.hasNext() && !exceededNumBlocksPerCheck() && namesystem
>       .isRunning()) {
>     numNodesChecked++;
>     final Map.Entry<DatanodeDescriptor, AbstractList<BlockInfo>>
>         entry = it.next();
>     final DatanodeDescriptor dn = entry.getKey();
>     AbstractList<BlockInfo> blocks = entry.getValue();
>     boolean fullScan = false;
>     if (dn.isMaintenance() && dn.maintenanceExpired()) {
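
A self-contained illustration of the loop shape quoted above: CyclicIteration 
revisits map entries starting from iterkey, so if none of the while() exit 
conditions ever changes, check() never returns and the FSNamesystem write 
lock is never released. This sketch only demonstrates that shape -- the 
zero-block entries are an assumption made up for the demo, not the actual 
root cause or the fix committed for this issue:
{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

public class CyclicScanSketch {
  public static void main(String[] args) {
    // Three tracked "nodes", each contributing zero blocks (demo assumption).
    Deque<Integer> blocksPerNode = new ArrayDeque<>();
    blocksPerNode.add(0);
    blocksPerNode.add(0);
    blocksPerNode.add(0);
    int numBlocksChecked = 0;
    final int blocksPerCheckLimit = 100;
    long iterations = 0;

    // Mirrors the quoted while(): entries remain and the per-check block
    // budget is not exceeded. With zero-block entries the budget never
    // fills, so only the demo's safety cap terminates the loop.
    while (!blocksPerNode.isEmpty() && numBlocksChecked < blocksPerCheckLimit
        && iterations < 1_000_000) {
      int blocks = blocksPerNode.removeFirst();
      numBlocksChecked += blocks;     // stays 0 forever
      blocksPerNode.addLast(blocks);  // cyclic: the entry is revisited
      iterations++;
    }
    System.out.println("iterations = " + iterations); // prints 1000000
  }
}
{code}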

[jira] [Work logged] (HDFS-16611) improve TestSeveralNameNodes#testCircularLinkedListWrites Params

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16611?focusedWorklogId=776607&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776607
 ]

ASF GitHub Bot logged work on HDFS-16611:
-

Author: ASF GitHub Bot
Created on: 01/Jun/22 00:32
Start Date: 01/Jun/22 00:32
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4387:
URL: https://github.com/apache/hadoop/pull/4387#issuecomment-1142869411

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 59s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  37m  1s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 44s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 43s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 50s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 38s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 56s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 25s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 22s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 33s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 432m 59s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4387/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 14s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 541m 42s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestReplaceDatanodeFailureReplication |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4387/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4387 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux f95b2572752b 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 
17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / fda2b21a4615c84d509a64d0fd066e7020de84ae |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multib

[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-05-31 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17544658#comment-17544658
 ] 

Takanobu Asanuma commented on HDFS-16613:
-

Nice catch. We have also seen this problem. CC: [~hadachi] 

> EC: Improve performance of decommissioning dn with many ec blocks
> -
>
> Key: HDFS-16613
> URL: https://issues.apache.org/jira/browse/HDFS-16613
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ec, erasure-coding, namenode
>Affects Versions: 3.4.0
>Reporter: caozhiqiang
>Assignee: caozhiqiang
>Priority: Major
>
> In an HDFS cluster with a lot of EC blocks, decommissioning a DataNode is 
> very slow. The reason is that, unlike a replicated block, which can be 
> copied from any DataNode holding a replica, an EC block has to be 
> replicated from the decommissioning DataNode itself.
> The configurations dfs.namenode.replication.max-streams and 
> dfs.namenode.replication.max-streams-hard-limit will limit the replication 
> speed, but increasing them puts the whole cluster's network at risk. So a 
> new configuration should be added to limit the decommissioning DataNode, 
> distinguished from the cluster-wide max-streams limit.
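
A hypothetical sketch of the proposal's shape. Everything here except 
dfs.namenode.replication.max-streams is invented for illustration -- the 
issue does not name the new key or its wiring:
{code:java}
import org.apache.hadoop.conf.Configuration;

public class DecommissionStreamLimitSketch {
  // Invented key: a separate cap for DataNodes being decommissioned, so
  // their EC reconstruction budget can be raised without raising the
  // cluster-wide limit for every node.
  static final String DECOM_KEY = "dfs.namenode.decommission.ec.max-streams";

  static int maxStreamsFor(Configuration conf, boolean decommissioning) {
    int clusterWide = conf.getInt("dfs.namenode.replication.max-streams", 2);
    return decommissioning ? conf.getInt(DECOM_KEY, clusterWide)
                           : clusterWide;
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.setInt(DECOM_KEY, 8);
    System.out.println(maxStreamsFor(conf, true));   // 8
    System.out.println(maxStreamsFor(conf, false));  // 2 (fallback default)
  }
}
{code}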



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16605) Improve Code With Lambda in hadoop-hdfs-rbf moudle

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16605?focusedWorklogId=776623&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776623
 ]

ASF GitHub Bot logged work on HDFS-16605:
-

Author: ASF GitHub Bot
Created on: 01/Jun/22 01:40
Start Date: 01/Jun/22 01:40
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4375:
URL: https://github.com/apache/hadoop/pull/4375#issuecomment-1143021805

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 50s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 10 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  39m 47s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m  3s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   0m 48s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 45s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 53s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  0s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m  4s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 46s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 45s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 39s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 42s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   0m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 36s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 36s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 22s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 39s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 38s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 21s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  39m 45s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 45s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 143m  9s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4375/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4375 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 3cad363d5ded 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 
17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 9d1c85d71abf1877b07a3d3a9b91b845dc2fa8ee |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4375/4/testReport/ |
   | Max. process+thread count | 2131 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4375/4/console |

[jira] [Work logged] (HDFS-16583) DatanodeAdminDefaultMonitor can get stuck in an infinite loop

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16583?focusedWorklogId=776626&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776626
 ]

ASF GitHub Bot logged work on HDFS-16583:
-

Author: ASF GitHub Bot
Created on: 01/Jun/22 01:46
Start Date: 01/Jun/22 01:46
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4394:
URL: https://github.com/apache/hadoop/pull/4394#issuecomment-1143027080

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  15m  5s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ branch-3.2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 33s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m  8s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  mvnsite  |   1m 21s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  javadoc  |   1m 14s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 25s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  shadedclient  |  18m 25s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 49s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m  3s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   3m 13s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  19m 25s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 214m 10s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4394/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 53s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 317m 59s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestFsck |
   |   | hadoop.hdfs.server.datanode.TestBPOfferService |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4394/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4394 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 4599d1500ca9 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 
17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.2 / 9ef4509580ba23a559ca0b7404d0e5fbb9b48b28 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4394/1/testReport/ |
   | Max. process+thread count | 2069 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4394/1/console |
   | versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 776626)
Time Spent: 2h  (was: 1h 50m)

> DatanodeAdminDefaultMonitor can get stuck in an infinite loop
> -
>
> Key: HDFS-16583
> URL: https://issues.apache.org/jira/browse/HDFS-16583
>  

[jira] [Work logged] (HDFS-13522) RBF: Support observer node from Router-Based Federation

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13522?focusedWorklogId=776641&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776641
 ]

ASF GitHub Bot logged work on HDFS-13522:
-

Author: ASF GitHub Bot
Created on: 01/Jun/22 03:33
Start Date: 01/Jun/22 03:33
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4311:
URL: https://github.com/apache/hadoop/pull/4311#issuecomment-1143079362

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 27s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  24m 38s |  |  trunk passed  |
   | -1 :x: |  compile  |   4m  5s | 
[/branch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4311/5/artifact/out/branch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  root in trunk failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   3m 38s | 
[/branch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4311/5/artifact/out/branch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  root in trunk failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  checkstyle  |   3m 59s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   5m 47s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   4m 30s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   4m 54s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |  10m 30s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 10s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m  9s |  |  the patch passed  |
   | -1 :x: |  compile  |   3m 52s | 
[/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4311/5/artifact/out/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  root in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  javac  |   3m 52s | 
[/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4311/5/artifact/out/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  root in the patch failed with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   3m 22s | 
[/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4311/5/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  root in the patch failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | -1 :x: |  javac  |   3m 22s | 
[/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4311/5/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  root in the patch failed with JDK Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   3m 33s | 
[/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4311/5/artifact/out/results-checkstyle-root.txt)
 |  root: The patch generated 2 new + 196 unchanged - 1 fixed = 198 total (was 
197)  |
   | +1 :green_heart: |  mvnsite  |   4m 51s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 33s |  |  the patch passed with JDK 
P

[jira] [Work logged] (HDFS-16604) Install gtest via FetchContent_Declare in CMake

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16604?focusedWorklogId=776657&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776657
 ]

ASF GitHub Bot logged work on HDFS-16604:
-

Author: ASF GitHub Bot
Created on: 01/Jun/22 04:45
Start Date: 01/Jun/22 04:45
Worklog Time Spent: 10m 
  Work Description: aajisaka merged PR #4374:
URL: https://github.com/apache/hadoop/pull/4374




Issue Time Tracking
---

Worklog Id: (was: 776657)
Time Spent: 2.5h  (was: 2h 20m)

> Install gtest via FetchContent_Declare in CMake
> ---
>
> Key: HDFS-16604
> URL: https://issues.apache.org/jira/browse/HDFS-16604
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs++
>Affects Versions: 3.4.0
>Reporter: Gautham Banasandra
>Assignee: Gautham Banasandra
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> CMake is unable to check out the *release-1.10.0* version of GoogleTest -
> {code}
> [WARNING] -- Build files have been written to: 
> /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4370/centos-7/src/hadoop-hdfs-project/hadoop-hdfs-native-client/target/main/native/libhdfspp/googletest-download
> [WARNING] Scanning dependencies of target googletest
> [WARNING] [ 11%] Creating directories for 'googletest'
> [WARNING] [ 22%] Performing download step (git clone) for 'googletest'
> [WARNING] Cloning into 'googletest-src'...
> [WARNING] fatal: invalid reference: release-1.10.0
> [WARNING] CMake Error at 
> googletest-download/googletest-prefix/tmp/googletest-gitclone.cmake:40 
> (message):
> [WARNING]   Failed to checkout tag: 'release-1.10.0'
> [WARNING] 
> [WARNING] 
> [WARNING] gmake[2]: *** [CMakeFiles/googletest.dir/build.make:111: 
> googletest-prefix/src/googletest-stamp/googletest-download] Error 1
> [WARNING] gmake[1]: *** [CMakeFiles/Makefile2:95: 
> CMakeFiles/googletest.dir/all] Error 2
> [WARNING] gmake: *** [Makefile:103: all] Error 2
> [WARNING] CMake Error at main/native/libhdfspp/CMakeLists.txt:68 (message):
> [WARNING]   Build step for googletest failed: 2
> {code}
> Jenkins run - 
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4370/6/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client.txt
> We need to use *FetchContent_Declare* since we're getting the source code 
> exactly at the given commit SHA. This avoids the checkout step altogether and 
> solves the above issue.
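
A minimal CMake sketch of the approach described above -- not the committed 
CMakeLists.txt change. The URL form and the <commit-sha> placeholder are 
assumptions, and FetchContent_MakeAvailable requires CMake >= 3.14:
{code}
include(FetchContent)
# Fetching a source archive pinned to an exact commit downloads that
# snapshot directly; there is no "git checkout release-1.10.0" step
# left to fail.
FetchContent_Declare(
  googletest
  URL https://github.com/google/googletest/archive/<commit-sha>.tar.gz
)
FetchContent_MakeAvailable(googletest)
{code}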



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16604) Install gtest via FetchContent_Declare in CMake

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16604?focusedWorklogId=776658&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776658
 ]

ASF GitHub Bot logged work on HDFS-16604:
-

Author: ASF GitHub Bot
Created on: 01/Jun/22 04:46
Start Date: 01/Jun/22 04:46
Worklog Time Spent: 10m 
  Work Description: aajisaka commented on PR #4374:
URL: https://github.com/apache/hadoop/pull/4374#issuecomment-1143112425

   Merged. Thank you @GauthamBanasandra @goiri 




Issue Time Tracking
---

Worklog Id: (was: 776658)
Time Spent: 2h 40m  (was: 2.5h)

> Install gtest via FetchContent_Declare in CMake
> ---
>
> Key: HDFS-16604
> URL: https://issues.apache.org/jira/browse/HDFS-16604
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs++
>Affects Versions: 3.4.0
>Reporter: Gautham Banasandra
>Assignee: Gautham Banasandra
>Priority: Blocker
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> CMake is unable to check out the *release-1.10.0* version of GoogleTest -
> {code}
> [WARNING] -- Build files have been written to: 
> /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4370/centos-7/src/hadoop-hdfs-project/hadoop-hdfs-native-client/target/main/native/libhdfspp/googletest-download
> [WARNING] Scanning dependencies of target googletest
> [WARNING] [ 11%] Creating directories for 'googletest'
> [WARNING] [ 22%] Performing download step (git clone) for 'googletest'
> [WARNING] Cloning into 'googletest-src'...
> [WARNING] fatal: invalid reference: release-1.10.0
> [WARNING] CMake Error at 
> googletest-download/googletest-prefix/tmp/googletest-gitclone.cmake:40 
> (message):
> [WARNING]   Failed to checkout tag: 'release-1.10.0'
> [WARNING] 
> [WARNING] 
> [WARNING] gmake[2]: *** [CMakeFiles/googletest.dir/build.make:111: 
> googletest-prefix/src/googletest-stamp/googletest-download] Error 1
> [WARNING] gmake[1]: *** [CMakeFiles/Makefile2:95: 
> CMakeFiles/googletest.dir/all] Error 2
> [WARNING] gmake: *** [Makefile:103: all] Error 2
> [WARNING] CMake Error at main/native/libhdfspp/CMakeLists.txt:68 (message):
> [WARNING]   Build step for googletest failed: 2
> {code}
> Jenkins run - 
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4370/6/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client.txt
> We need to use *FetchContent_Declare* since we're getting the source code 
> exactly at the given commit SHA. This avoids the checkout step altogether and 
> solves the above issue.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16604) Install gtest via FetchContent_Declare in CMake

2022-05-31 Thread Akira Ajisaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka resolved HDFS-16604.
--
Fix Version/s: 3.4.0
   Resolution: Fixed

Merged the PR into trunk.

> Install gtest via FetchContent_Declare in CMake
> ---
>
> Key: HDFS-16604
> URL: https://issues.apache.org/jira/browse/HDFS-16604
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: libhdfs++
>Affects Versions: 3.4.0
>Reporter: Gautham Banasandra
>Assignee: Gautham Banasandra
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> CMake is unable to check out the *release-1.10.0* version of GoogleTest -
> {code}
> [WARNING] -- Build files have been written to: 
> /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4370/centos-7/src/hadoop-hdfs-project/hadoop-hdfs-native-client/target/main/native/libhdfspp/googletest-download
> [WARNING] Scanning dependencies of target googletest
> [WARNING] [ 11%] Creating directories for 'googletest'
> [WARNING] [ 22%] Performing download step (git clone) for 'googletest'
> [WARNING] Cloning into 'googletest-src'...
> [WARNING] fatal: invalid reference: release-1.10.0
> [WARNING] CMake Error at 
> googletest-download/googletest-prefix/tmp/googletest-gitclone.cmake:40 
> (message):
> [WARNING]   Failed to checkout tag: 'release-1.10.0'
> [WARNING] 
> [WARNING] 
> [WARNING] gmake[2]: *** [CMakeFiles/googletest.dir/build.make:111: 
> googletest-prefix/src/googletest-stamp/googletest-download] Error 1
> [WARNING] gmake[1]: *** [CMakeFiles/Makefile2:95: 
> CMakeFiles/googletest.dir/all] Error 2
> [WARNING] gmake: *** [Makefile:103: all] Error 2
> [WARNING] CMake Error at main/native/libhdfspp/CMakeLists.txt:68 (message):
> [WARNING]   Build step for googletest failed: 2
> {code}
> Jenkins run - 
> https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4370/6/artifact/out/branch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client.txt
> We need to use *FetchContent_Declare* since we're getting the source code 
> exactly at the given commit SHA. This avoids the checkout step altogether and 
> solves the above issue.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16590) Fix Junit Test Deprecated assertThat

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16590?focusedWorklogId=776660&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776660
 ]

ASF GitHub Bot logged work on HDFS-16590:
-

Author: ASF GitHub Bot
Created on: 01/Jun/22 04:53
Start Date: 01/Jun/22 04:53
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao commented on PR #4349:
URL: https://github.com/apache/hadoop/pull/4349#issuecomment-1143115937

   @slfan1989 Thanks for involving me. It makes sense to me. Please check the 
reason for the build failure first before check-in.
   BTW, after checking the JUnit API documentation, I found that assertEquals 
and assertThat are both deprecated; would you mind addressing them together? 
It is also fine to file another Jira to fix assertEquals. Thanks
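   
   A minimal sketch of the migration under discussion, assuming the tests use 
JUnit 4: JUnit 4.13 deprecated org.junit.Assert.assertThat in favor of 
Hamcrest's own MatcherAssert.assertThat, so the fix is usually just an 
import swap:
{code:java}
import static org.hamcrest.CoreMatchers.is;
// Before (deprecated): import static org.junit.Assert.assertThat;
import static org.hamcrest.MatcherAssert.assertThat;

public class AssertThatMigrationSketch {
  public static void main(String[] args) {
    // Same matcher and call shape -- only the static import changes.
    assertThat("actual", is("actual"));
  }
}
{code}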




Issue Time Tracking
---

Worklog Id: (was: 776660)
Time Spent: 2h 10m  (was: 2h)

> Fix Junit Test Deprecated assertThat
> 
>
> Key: HDFS-16590
> URL: https://issues.apache.org/jira/browse/HDFS-16590
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16598) All datanodes [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]] are bad. Aborting...

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16598?focusedWorklogId=776664&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776664
 ]

ASF GitHub Bot logged work on HDFS-16598:
-

Author: ASF GitHub Bot
Created on: 01/Jun/22 05:06
Start Date: 01/Jun/22 05:06
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao commented on PR #4366:
URL: https://github.com/apache/hadoop/pull/4366#issuecomment-1143122253

   @ZanderXu Thanks for your catches. Sorry, I don't get why 
[HDFS-16534](https://issues.apache.org/jira/browse/HDFS-16534) could lead to 
GS inconsistency between the client and the datanode. Thanks again.
   cc @MingXiangLi 




Issue Time Tracking
---

Worklog Id: (was: 776664)
Time Spent: 40m  (was: 0.5h)

> All datanodes 
> [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]]
>  are bad. Aborting...
> --
>
> Key: HDFS-16598
> URL: https://issues.apache.org/jira/browse/HDFS-16598
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> org.apache.hadoop.hdfs.testPipelineRecoveryOnRestartFailure failed with a 
> stack trace like:
> {code:java}
> java.io.IOException: All datanodes 
> [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DataStreamer.handleBadDatanode(DataStreamer.java:1667)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1601)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1587)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1371)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:674)
> {code}
> After tracing the root cause, we found this bug was introduced by 
> [HDFS-16534|https://issues.apache.org/jira/browse/HDFS-16534], because the 
> client's block GS may be smaller than the DataNode's when pipeline recovery 
> fails.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16610) Make fsck read timeout configurable

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16610?focusedWorklogId=776668&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776668
 ]

ASF GitHub Bot logged work on HDFS-16610:
-

Author: ASF GitHub Bot
Created on: 01/Jun/22 05:27
Start Date: 01/Jun/22 05:27
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on PR #4384:
URL: https://github.com/apache/hadoop/pull/4384#issuecomment-1143132080

   Maybe it's due to HDFS-16604; try running the build again...




Issue Time Tracking
---

Worklog Id: (was: 776668)
Time Spent: 1h 10m  (was: 1h)

> Make fsck read timeout configurable
> ---
>
> Key: HDFS-16610
> URL: https://issues.apache.org/jira/browse/HDFS-16610
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In a cluster with a lot of small files, we encountered a case where fsck was 
> very slow. I believe it is due to contention with many other threads reading 
> / writing data on the cluster.
> Sometimes fsck does not report any progress for more than 60 seconds, and 
> the client times out. Currently the connect and read timeouts are hardcoded 
> to 60 seconds. This change makes them configurable.
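
A hypothetical sketch of making the two timeouts configurable; the key names 
below are invented for illustration (the issue only says the hardcoded 
60-second connect and read timeouts should come from configuration):
{code:java}
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import org.apache.hadoop.conf.Configuration;

public class FsckTimeoutSketch {
  static HttpURLConnection open(Configuration conf, URL url)
      throws IOException {
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    // Fall back to the old hardcoded 60 seconds when the keys are unset.
    conn.setConnectTimeout(
        conf.getInt("dfs.fsck.connect.timeout.ms", 60_000));  // invented key
    conn.setReadTimeout(
        conf.getInt("dfs.fsck.read.timeout.ms", 60_000));     // invented key
    return conn;
  }
}
{code}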



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-13522) RBF: Support observer node from Router-Based Federation

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13522?focusedWorklogId=776676&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776676
 ]

ASF GitHub Bot logged work on HDFS-13522:
-

Author: ASF GitHub Bot
Created on: 01/Jun/22 05:54
Start Date: 01/Jun/22 05:54
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4127:
URL: https://github.com/apache/hadoop/pull/4127#issuecomment-1143145619

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 50s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 12 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 35s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m 32s |  |  trunk passed  |
   | -1 :x: |  compile  |   4m 18s | [/branch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4127/12/artifact/out/branch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  root in trunk failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   3m 35s | [/branch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4127/12/artifact/out/branch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  root in trunk failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  checkstyle  |   4m 12s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   5m 19s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   4m  1s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   4m 18s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |  10m 28s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 45s |  |  branch has no errors when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 26s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 58s |  |  the patch passed  |
   | -1 :x: |  compile  |   4m  4s | [/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4127/12/artifact/out/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  root in the patch failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  javac  |   4m  4s | [/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4127/12/artifact/out/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  root in the patch failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   3m 26s | [/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4127/12/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  root in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | -1 :x: |  javac  |   3m 26s | [/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4127/12/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  root in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   3m 57s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4127/12/artifact/out/results-checkstyle-root.txt) |  root: The patch generated 2 new + 337 unchanged - 1 fixed = 339 total (was 338)  |
   | +1 :green_heart: |  mvnsite  |   4m 39s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 19s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |

[jira] [Work logged] (HDFS-16598) All datanodes [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]] are bad. Aborting...

2022-05-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16598?focusedWorklogId=776684&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776684
 ]

ASF GitHub Bot logged work on HDFS-16598:
-

Author: ASF GitHub Bot
Created on: 01/Jun/22 06:48
Start Date: 01/Jun/22 06:48
Worklog Time Spent: 10m 
  Work Description: ZanderXu commented on PR #4366:
URL: https://github.com/apache/hadoop/pull/4366#issuecomment-1143182980

   In the original pipeline recovery process, if the pipeline recovery fails, the block GS (generation stamp) may become inconsistent between the client and the DataNodes. So during pipeline recovery, GS inconsistency is expected.
   
   [HDFS-16534](https://issues.apache.org/jira/browse/HDFS-16534) has a bug in handling an inconsistent GS, which causes the **All datanodes XXX are bad. Aborting...** error.
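   
   To make this concrete, here is a minimal, hypothetical Java sketch (class and method names are invented; this is not the actual `org.apache.hadoop.hdfs.DataStreamer` logic) of how a strict GS comparison turns the expected post-recovery inconsistency into an "all datanodes are bad" abort, while a tolerant comparison accepts it:
   
   ```java
   // Hypothetical sketch -- illustrative names only, not the real DataStreamer code.
   public class GsCheckSketch {
   
     /** Strict check: any GS mismatch marks the DataNode bad (the buggy behavior). */
     static boolean isBadDatanodeStrict(long clientGs, long datanodeGs) {
       return clientGs != datanodeGs;
     }
   
     /**
      * Tolerant check: after a failed pipeline recovery the DataNode may already
      * hold a newer GS than the client, so only an older GS on the DataNode
      * should be treated as suspect.
      */
     static boolean isBadDatanodeTolerant(long clientGs, long datanodeGs) {
       return datanodeGs < clientGs;
     }
   
     public static void main(String[] args) {
       long clientGs = 1001;   // client still holds the pre-recovery GS
       long datanodeGs = 1002; // DataNode already applied the bumped GS
   
       // Strict check wrongly reports the healthy DataNode as bad, which
       // cascades into "All datanodes ... are bad. Aborting...".
       System.out.println("strict:   " + isBadDatanodeStrict(clientGs, datanodeGs));   // true
       // Tolerant check accepts the expected inconsistency.
       System.out.println("tolerant: " + isBadDatanodeTolerant(clientGs, datanodeGs)); // false
     }
   }
   ```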
   




Issue Time Tracking
---

Worklog Id: (was: 776684)
Time Spent: 50m  (was: 40m)

> All datanodes 
> [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]]
>  are bad. Aborting...
> --
>
> Key: HDFS-16598
> URL: https://issues.apache.org/jira/browse/HDFS-16598
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> org.apache.hadoop.hdfs.testPipelineRecoveryOnRestartFailure failed with a 
> stack trace like:
> {code:java}
> java.io.IOException: All datanodes 
> [DatanodeInfoWithStorage[127.0.0.1:57448,DS-1b5f7e33-a2bf-4edc-9122-a74c995a99f5,DISK]]
>  are bad. Aborting...
>   at 
> org.apache.hadoop.hdfs.DataStreamer.handleBadDatanode(DataStreamer.java:1667)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineInternal(DataStreamer.java:1601)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1587)
>   at 
> org.apache.hadoop.hdfs.DataStreamer.processDatanodeOrExternalError(DataStreamer.java:1371)
>   at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:674)
> {code}
> After tracing the root cause, I found this bug was introduced by 
> [HDFS-16534|https://issues.apache.org/jira/browse/HDFS-16534], because the 
> block GS of the client may be smaller than that of the DN when pipeline 
> recovery fails.
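
To illustrate the sequence described in the quoted report, here is a hypothetical timeline sketch (names invented for illustration; this is not the actual HDFS recovery code) of how the client's block GS can fall behind the DataNode's when a pipeline recovery fails partway through:

{code:java}
// Hypothetical timeline sketch -- not the actual HDFS recovery code.
public class GsTimelineSketch {
  public static void main(String[] args) {
    long clientGs = 1001; // client and DataNode agree before recovery
    long dnGs = 1001;

    // Step 1: recovery obtains a new GS and the DataNode applies it.
    long newGs = 1002;
    dnGs = newGs;

    // Step 2: the recovery then fails before the client records the new GS,
    // so the client keeps the old value.
    boolean recoverySucceeded = false;
    if (recoverySucceeded) {
      clientGs = newGs;
    }

    // Result: clientGs (1001) < dnGs (1002) -- exactly the "block GS of the
    // client may be smaller than that of the DN" state that the error
    // handling must tolerate instead of aborting with "All datanodes ... are bad".
    System.out.println("clientGs=" + clientGs + ", dnGs=" + dnGs);
  }
}
{code}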


