[jira] [Commented] (HDFS-17518) In the lease monitor, if a file is closed, we should sync the editslog

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845662#comment-17845662
 ] 

ASF GitHub Bot commented on HDFS-17518:
---

vinayakumarb commented on code in PR #6809:
URL: https://github.com/apache/hadoop/pull/6809#discussion_r1597551434


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:
##
@@ -3738,7 +3738,7 @@ boolean internalReleaseLease(Lease lease, String src, 
INodesInPath iip,
   NameNode.stateChangeLog.warn("BLOCK*" +
   " internalReleaseLease: All existing blocks are COMPLETE," +
   " lease removed, file " + src + " closed.");
-  return true;  // closed!
+  return false;  // closed!

Review Comment:
   As per the javadoc of this method, the return value indicates whether the
file was closed or not.
   
   Changing that value here may solve the logSync() problem in this particular
case, but it will be problematic for other usages of this method.
   
   For example: the recoverLease() RPC will always get false, even though the
file was closed.
   
   As per the javadoc, even if the return value is false, there are edits
logged (reassigning the lease when block recovery is initiated), so calling
logSync() is required in both these cases. That said, we cannot blindly call
logSync() always.
   
   So the more correct approach is to return a combination of these values
from this method (i.e. completed and needSync), and determine whether to call
sync or not in the caller.
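   A minimal sketch of that suggestion (hypothetical names; nothing below
exists in Hadoop as written):
   
   ```java
   // Hypothetical result holder: whether the file was closed, and whether
   // edits were logged that still need a logSync().
   final class ReleaseLeaseResult {
     final boolean closed;   // file was closed (today's boolean return)
     final boolean needSync; // edits were logged and still need a logSync()
     ReleaseLeaseResult(boolean closed, boolean needSync) {
       this.closed = closed;
       this.needSync = needSync;
     }
   }
   ```
   
   The caller (for example the lease monitor's check loop) would then do
something like `if (res.needSync) { getEditLog().logSync(); }`, while
recoverLease() keeps using `res.closed`, so the RPC contract is unchanged.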





> In the lease monitor, if a file is closed, we should sync the editslog
> --
>
> Key: HDFS-17518
> URL: https://issues.apache.org/jira/browse/HDFS-17518
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: lei w
>Priority: Minor
>  Labels: pull-request-available
>
> In the lease monitor, if a file is closed, the checkLease method will return 
> true, and then the edit log will not be synced. In my opinion, we should sync 
> the edit log to avoid leaving the state on the standby NameNode out of date 
> for a long time.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17509) RBF: Fix ClientProtocol.concat will throw NPE if tgr is a empty file.

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845632#comment-17845632
 ] 

ASF GitHub Bot commented on HDFS-17509:
---

hadoop-yetus commented on PR #6784:
URL: https://github.com/apache/hadoop/pull/6784#issuecomment-2105966551

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 01s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  spotbugs  |   0m 00s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 00s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  88m 01s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 17s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   4m 32s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   5m 07s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   4m 35s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 141m 27s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   3m 00s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   2m 20s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   2m 20s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 00s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   2m 00s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 25s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   2m 07s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 149m 42s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   5m 23s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 403m 30s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6784 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | MINGW64_NT-10.0-17763 40ce0a78e9b0 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / acc81392df1ebeb5567c816ad549ce5e81313a99 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6784/3/testReport/
 |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6784/3/console
 |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> RBF: Fix ClientProtocol.concat  will throw NPE if tgr is a empty file.
> --
>
> Key: HDFS-17509
> URL: https://issues.apache.org/jira/browse/HDFS-17509
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: liuguanghua
>Priority: Minor
>  Labels: pull-request-available
>
> hdfs dfs -concat  /tmp/merge /tmp/t1 /tmp/t2
> When /tmp/merge is an empty file, this command will throw an NPE via DFSRouter. 
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17506) [FGL] Performance for phase 1

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845624#comment-17845624
 ] 

ASF GitHub Bot commented on HDFS-17506:
---

hadoop-yetus commented on PR #6806:
URL: https://github.com/apache/hadoop/pull/6806#issuecomment-2105929542

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 00s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m 01s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ HDFS-17384 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  | 108m 02s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  compile  |   7m 17s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  checkstyle  |   5m 38s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  mvnsite  |   8m 07s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  javadoc  |   7m 02s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  shadedclient  | 177m 17s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m 46s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   4m 29s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   4m 29s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 00s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   2m 53s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   5m 08s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   4m 19s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 193m 49s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   6m 26s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 512m 36s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6806 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | MINGW64_NT-10.0-17763 a4a169f40776 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | HDFS-17384 / db92d267a8d598d36f9fbe59ec8c24ccf754c558 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6806/3/testReport/
 |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6806/3/console
 |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> [FGL] Performance for phase 1
> -
>
> Key: HDFS-17506
> URL: https://issues.apache.org/jira/browse/HDFS-17506
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> Do some benchmark testing for phase 1.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17520) TestDFSAdmin.testAllDatanodesReconfig and TestDFSAdmin.testDecommissionDataNodesReconfig failed

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845622#comment-17845622
 ] 

ASF GitHub Bot commented on HDFS-17520:
---

hadoop-yetus commented on PR #6812:
URL: https://github.com/apache/hadoop/pull/6812#issuecomment-2105927891

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 00s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m 00s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  94m 54s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   6m 43s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   5m 06s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   6m 59s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   6m 10s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 156m 39s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m 00s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   3m 44s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   3m 44s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 00s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   2m 27s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   4m 28s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   4m 06s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 170m 16s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   6m 17s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 450m 52s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6812 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | MINGW64_NT-10.0-17763 17b2ed3cc898 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 2d929bda295e05d4b7a3194a7dc311944b662661 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6812/1/testReport/
 |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6812/1/console
 |
   | versions | git=2.45.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> TestDFSAdmin.testAllDatanodesReconfig and 
> TestDFSAdmin.testDecommissionDataNodesReconfig failed
> ---
>
> Key: HDFS-17520
> URL: https://issues.apache.org/jira/browse/HDFS-17520
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> [ERROR] Tests run: 21, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 
> 44.521 s <<< FAILURE! - in org.apache.hadoop.hdfs.tools.TestDFSAdmin
> [ERROR] testAllDatanodesReconfig(org.apache.hadoop.hdfs.tools.TestDFSAdmin)  
> Time elapsed: 2.086 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expecting:
>  <["Reconfiguring status for node [127.0.0.1:43731]: SUCCESS: Changed 
> property dfs.datanode.peer.stats.enabled",
> " From: "false"",
> " To: "true"",
> "started at Fri May 10 13:02:51 UTC 2024 and finished at Fri May 10 
> 13:02:51 UTC 2024."]>
> to contain subsequence:
>  <["SUCCESS: Changed property dfs.datanode.peer.stats.enabled",
> " From: "false"",
> " To: "true""]>
>   at 
> org.apache.hadoop.hdfs.tools.TestDFSAdmin.testAllDatanodesReconfig(TestDFSAdmin.java:1286)
>   at 

[jira] [Commented] (HDFS-17509) RBF: Fix ClientProtocol.concat will throw NPE if tgr is a empty file.

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845615#comment-17845615
 ] 

ASF GitHub Bot commented on HDFS-17509:
---

hadoop-yetus commented on PR #6784:
URL: https://github.com/apache/hadoop/pull/6784#issuecomment-2105783690

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 53s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  49m 54s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 44s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 30s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 43s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 44s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 32s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  39m 15s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 34s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 34s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 29s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 29s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 19s | 
[/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6784/6/artifact/out/results-checkstyle-hadoop-hdfs-project_hadoop-hdfs-rbf.txt)
 |  hadoop-hdfs-project/hadoop-hdfs-rbf: The patch generated 1 new + 3 
unchanged - 0 fixed = 4 total (was 3)  |
   | +1 :green_heart: |  mvnsite  |   0m 33s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 29s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 24s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 19s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  38m 48s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  31m 34s |  |  hadoop-hdfs-rbf in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 176m 26s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6784/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6784 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux d96a40ec6336 5.15.0-106-generic #116-Ubuntu SMP Wed Apr 17 
09:17:56 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / acc81392df1ebeb5567c816ad549ce5e81313a99 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6784/6/testReport/ |
   | Max. process+thread count | 3527 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
   | Console output | 

[jira] [Commented] (HDFS-17521) Erasure Coding: Fix calculation errors caused by special index order

2024-05-11 Thread Chenyu Zheng (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845613#comment-17845613
 ] 

Chenyu Zheng commented on HDFS-17521:
-

Possible calculation errors have always been the reason why we have not used EC 
on a very large scale. So I think this is a very important fix.

[~weichiu] [~hexiaoqiao]  Can you please review this?

> Erasure Coding: Fix calculation errors caused by special index order
> 
>
> Key: HDFS-17521
> URL: https://issues.apache.org/jira/browse/HDFS-17521
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Critical
>  Labels: pull-request-available
>
> I found that if the erasedIndexes distribution is such that a parity index 
> is in front of a data index, EC will produce wrong results when decoding.
> In fact, HDFS-15186 described this problem, but did not fundamentally 
> solve it.
> The reason is that the code assumes that in erasedIndexes the data indexes 
> come first, followed by the parity indexes. If a parity index is placed in 
> front of a data index, a calculation error will occur.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17521) Erasure Coding: Fix calculation errors caused by special index order

2024-05-11 Thread Chenyu Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenyu Zheng updated HDFS-17521:

Description: 
I found that if the erasedIndexes distribution is such that a parity index is 
in front of a data index, EC will produce wrong results when decoding.

In fact, HDFS-15186 described this problem, but did not fundamentally 
solve it.

The reason is that the code assumes that in erasedIndexes the data indexes come 
first, followed by the parity indexes. If a parity index is placed in front 
of a data index, a calculation error will occur.

  was:
I found that if the erasedIndexes distribution is such that a parity index is 
in front of a data index, EC will produce wrong results when decoding.

In fact, HDFS-15186 described this problem, but did not fundamentally 
solve it.

The reason is that the code assumes that in erasedIndexes the data indexes come 
first, followed by the parity indexes. If a parity index is placed in front 
of a data index in the incoming code, a calculation error will occur.


> Erasure Coding: Fix calculation errors caused by special index order
> 
>
> Key: HDFS-17521
> URL: https://issues.apache.org/jira/browse/HDFS-17521
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Critical
>  Labels: pull-request-available
>
> I found that if the erasedIndexes distribution is such that a parity index 
> is in front of a data index, EC will produce wrong results when decoding.
> In fact, HDFS-15186 described this problem, but did not fundamentally 
> solve it.
> The reason is that the code assumes that in erasedIndexes the data indexes 
> come first, followed by the parity indexes. If a parity index is placed in 
> front of a data index, a calculation error will occur.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17521) Erasure Coding: Fix calculation errors caused by special index order

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845611#comment-17845611
 ] 

ASF GitHub Bot commented on HDFS-17521:
---

zhengchenyu opened a new pull request, #6813:
URL: https://github.com/apache/hadoop/pull/6813

   ### Description of PR
   
   I found that if the erasedIndexes distribution is such that a parity index 
is in front of a data index, EC will produce wrong results when decoding.
   
   In fact, [HDFS-15186](https://issues.apache.org/jira/browse/HDFS-15186) 
described this problem, but did not fundamentally solve it.
   
   The reason is that the code assumes that in erasedIndexes the data indexes 
come first, followed by the parity indexes. If a parity index is placed in 
front of a data index in the incoming code, a calculation error will occur.
   
   ### How was this patch tested?
   
   The TestErasureCodingEncodeAndDecode unit test and the erasure_code_test 
binary were executed on different machines. The test machines include those 
with isa-l installed and those without isa-l installed.
   
   ### For code changes:
   
   - Make erasedIndexes support arbitrary index distribution.
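   
   For illustration only, a minimal sketch (not the actual patch) of the
ordering assumption: sorting erasedIndexes so that data indexes precede parity
indexes would also satisfy the old assumption, provided the corresponding
output buffers are reordered the same way. The patch instead makes the decoder
tolerate arbitrary order.
   
   ```java
   import java.util.Arrays;
   
   final class ErasedIndexOrder {
     // Ascending sort is enough to put data indexes first: for RS(k, m) the
     // data units are 0..k-1 and the parity units are k..k+m-1.
     static int[] normalize(int[] erasedIndexes) {
       int[] sorted = Arrays.copyOf(erasedIndexes, erasedIndexes.length);
       Arrays.sort(sorted);
       return sorted;
     }
   
     public static void main(String[] args) {
       // RS(6,3): units 0..5 are data, 6..8 are parity. A parity index (7)
       // listed before a data index (2) is the shape that triggered the bug.
       System.out.println(Arrays.toString(normalize(new int[] {7, 2}))); // [2, 7]
     }
   }
   ```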
   
   




> Erasure Coding: Fix calculation errors caused by special index order
> 
>
> Key: HDFS-17521
> URL: https://issues.apache.org/jira/browse/HDFS-17521
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Critical
>
> I found that if the erasedIndexes distribution is such that a parity index 
> is in front of a data index, EC will produce wrong results when decoding.
> In fact, HDFS-15186 described this problem, but did not fundamentally 
> solve it.
> The reason is that the code assumes that in erasedIndexes the data indexes 
> come first, followed by the parity indexes. If a parity index is placed in 
> front of a data index in the incoming code, a calculation error will occur.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17521) Erasure Coding: Fix calculation errors caused by special index order

2024-05-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17521:
--
Labels: pull-request-available  (was: )

> Erasure Coding: Fix calculation errors caused by special index order
> 
>
> Key: HDFS-17521
> URL: https://issues.apache.org/jira/browse/HDFS-17521
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Critical
>  Labels: pull-request-available
>
> I found that if the erasedIndexes distribution is such that a parity index 
> is in front of a data index, EC will produce wrong results when decoding.
> In fact, HDFS-15186 described this problem, but did not fundamentally 
> solve it.
> The reason is that the code assumes that in erasedIndexes the data indexes 
> come first, followed by the parity indexes. If a parity index is placed in 
> front of a data index in the incoming code, a calculation error will occur.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17521) Erasure Coding: Fix calculation errors caused by special index order

2024-05-11 Thread Chenyu Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenyu Zheng updated HDFS-17521:

Description: 
I found that if the erasedIndexes distribution is such that a parity index is 
in front of a data index, EC will produce wrong results when decoding.

In fact, HDFS-15186 described this problem, but did not fundamentally 
solve it.

The reason is that the code assumes that in erasedIndexes the data indexes come 
first, followed by the parity indexes. If a parity index is placed in front 
of a data index in the incoming code, a calculation error will occur.

  was:
I found that if the erasedIndexes distribution is such that a parity index is 
in front of a data index, EC will produce wrong results when decoding.

In fact, HDFS-15186 described this problem, but did not fundamentally 
solve it.


> Erasure Coding: Fix calculation errors caused by special index order
> 
>
> Key: HDFS-17521
> URL: https://issues.apache.org/jira/browse/HDFS-17521
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Critical
>
> I found that if the erasedIndexes distribution is such that a parity index 
> is in front of a data index, EC will produce wrong results when decoding.
> In fact, HDFS-15186 described this problem, but did not fundamentally 
> solve it.
> The reason is that the code assumes that in erasedIndexes the data indexes 
> come first, followed by the parity indexes. If a parity index is placed in 
> front of a data index in the incoming code, a calculation error will occur.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17520) TestDFSAdmin.testAllDatanodesReconfig and TestDFSAdmin.testDecommissionDataNodesReconfig failed

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845609#comment-17845609
 ] 

ASF GitHub Bot commented on HDFS-17520:
---

slfan1989 commented on PR #6812:
URL: https://github.com/apache/hadoop/pull/6812#issuecomment-2105740105

   > @slfan1989 Master, I see you are familiar with 
`testDecommissionDataNodesReconfig`, please help me review it. Thanks
   
   @ZanderXu  Thank you for your contribution! I will reply later.




> TestDFSAdmin.testAllDatanodesReconfig and 
> TestDFSAdmin.testDecommissionDataNodesReconfig failed
> ---
>
> Key: HDFS-17520
> URL: https://issues.apache.org/jira/browse/HDFS-17520
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> [ERROR] Tests run: 21, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 
> 44.521 s <<< FAILURE! - in org.apache.hadoop.hdfs.tools.TestDFSAdmin
> [ERROR] testAllDatanodesReconfig(org.apache.hadoop.hdfs.tools.TestDFSAdmin)  
> Time elapsed: 2.086 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expecting:
>  <["Reconfiguring status for node [127.0.0.1:43731]: SUCCESS: Changed 
> property dfs.datanode.peer.stats.enabled",
> " From: "false"",
> " To: "true"",
> "started at Fri May 10 13:02:51 UTC 2024 and finished at Fri May 10 
> 13:02:51 UTC 2024."]>
> to contain subsequence:
>  <["SUCCESS: Changed property dfs.datanode.peer.stats.enabled",
> " From: "false"",
> " To: "true""]>
>   at 
> org.apache.hadoop.hdfs.tools.TestDFSAdmin.testAllDatanodesReconfig(TestDFSAdmin.java:1286)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17521) Erasure Coding: Fix calculation errors caused by special index order

2024-05-11 Thread Chenyu Zheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chenyu Zheng updated HDFS-17521:

Description: 
I found that if the erasedIndexes distribution is such that a parity index is 
in front of a data index, EC will produce wrong results when decoding.

In fact, HDFS-15186 described this problem, but did not fundamentally 
solve it.

> Erasure Coding: Fix calculation errors caused by special index order
> 
>
> Key: HDFS-17521
> URL: https://issues.apache.org/jira/browse/HDFS-17521
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chenyu Zheng
>Assignee: Chenyu Zheng
>Priority: Critical
>
> I found that if the erasedIndexes distribution is such that a parity index 
> is in front of a data index, EC will produce wrong results when decoding.
> In fact, HDFS-15186 described this problem, but did not fundamentally 
> solve it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17506) [FGL] Performance for phase 1

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845591#comment-17845591
 ] 

ASF GitHub Bot commented on HDFS-17506:
---

hadoop-yetus commented on PR #6806:
URL: https://github.com/apache/hadoop/pull/6806#issuecomment-2105680116

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|:--------|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 23s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ HDFS-17384 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 21s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  compile  |   0m 43s |  |  HDFS-17384 passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   0m 43s |  |  HDFS-17384 passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 40s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  mvnsite  |   0m 47s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  javadoc  |   0m 44s |  |  HDFS-17384 passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m  7s |  |  HDFS-17384 passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 53s |  |  HDFS-17384 passed  |
   | +1 :green_heart: |  shadedclient  |  21m 56s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 38s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 42s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   0m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 35s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   0m 35s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 39s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   1m  6s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 51s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 49s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 213m 38s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 31s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 306m 47s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6806/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6806 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 6412d6991c5d 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | HDFS-17384 / db92d267a8d598d36f9fbe59ec8c24ccf754c558 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6806/3/testReport/ |
   | Max. process+thread count | 4448 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6806/3/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
 

[jira] [Created] (HDFS-17521) Erasure Coding: Fix calculation errors caused by special index order

2024-05-11 Thread Chenyu Zheng (Jira)
Chenyu Zheng created HDFS-17521:
---

 Summary: Erasure Coding: Fix calculation errors caused by special 
index order
 Key: HDFS-17521
 URL: https://issues.apache.org/jira/browse/HDFS-17521
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Chenyu Zheng
Assignee: Chenyu Zheng






--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17509) RBF: Fix ClientProtocol.concat will throw NPE if tgr is a empty file.

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845576#comment-17845576
 ] 

ASF GitHub Bot commented on HDFS-17509:
---

ZanderXu commented on code in PR #6784:
URL: https://github.com/apache/hadoop/pull/6784#discussion_r1597413470


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpc.java:
##
@@ -1224,6 +1224,17 @@ public void testProxyConcatFile() throws Exception {
 String badPath = "/unknownlocation/unknowndir";
 compareResponses(routerProtocol, nnProtocol, m,
 new Object[] {badPath, new String[] {routerFile}});
+
+// Test when concat trg is a empty file

Review Comment:
   Can you modify the UT to cover the case that one or more source files are 
empty?



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterClientProtocol.java:
##
@@ -1009,6 +1000,20 @@ public HdfsFileStatus getFileInfo(String src) throws 
IOException {
 return ret;
   }
 
+  public RemoteResult 
getFileRemoteResult(String path)
+  throws IOException {
+rpcServer.checkOperation(NameNode.OperationCategory.READ);
+
+final List locations = rpcServer.getLocationsForPath(path, 
false, false);
+RemoteMethod method =
+new RemoteMethod("getFileInfo", new Class[] {String.class}, new 
RemoteParam());
+// Check for file information sequentially
+RemoteResult result =

Review Comment:
   RBF is simply responsible for locating the downstream namespace and then 
proxying the request.
   So if the input path is mounted to only one namespace, RBF only needs to 
proxy the request directly. RBF does not need to check whether the file exists 
in that single downstream namespace, right?





> RBF: Fix ClientProtocol.concat  will throw NPE if tgr is a empty file.
> --
>
> Key: HDFS-17509
> URL: https://issues.apache.org/jira/browse/HDFS-17509
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: liuguanghua
>Priority: Minor
>  Labels: pull-request-available
>
> hdfs dfs -concat  /tmp/merge /tmp/t1 /tmp/t2
> When /tmp/merge is an empty file, this command will throw an NPE via DFSRouter. 
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-2139) Fast copy for HDFS.

2024-05-11 Thread ZanderXu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845565#comment-17845565
 ] 

ZanderXu commented on HDFS-2139:


https://docs.google.com/document/d/1uGHA2dXLldlNoaYF-4c63baYjCuft_T88wdvhwVgh6c/edit?usp=sharing

> Fast copy for HDFS.
> ---
>
> Key: HDFS-2139
> URL: https://issues.apache.org/jira/browse/HDFS-2139
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Pritam Damania
>Assignee: Rituraj
>Priority: Major
> Attachments: HDFS-2139-For-2.7.1.patch, HDFS-2139.patch, 
> HDFS-2139.patch, image-2022-08-11-11-48-17-994.png
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> There is a need to perform fast file copy on HDFS. The fast copy mechanism 
> for a file works as follows:
> 1) Query metadata for all blocks of the source file.
> 2) For each block 'b' of the file, find out its datanode locations.
> 3) For each block of the file, add an empty block to the namesystem for
> the destination file.
> 4) For each location of the block, instruct the datanode to make a local
> copy of that block.
> 5) Once each datanode has copied over its respective blocks, they
> report to the namenode about it.
> 6) Wait for all blocks to be copied and exit.
> This would speed up the copying process considerably by removing top of
> the rack data transfers.
> Note: An extra improvement would be to instruct the datanode to create a
> hardlink of the block file if we are copying a block on the same datanode.
> [~xuzq_zander] provided a design doc: 
> https://docs.google.com/document/d/1uGHA2dXLldlNoaYF-4c63baYjCuft_T88wdvhwVgh6c/edit?usp=sharing
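
A minimal sketch of the six-step flow quoted above (every type and method below
is an illustrative stand-in, not a real HDFS API):

```java
import java.util.List;

// Hypothetical interfaces standing in for the namenode/datanode RPCs.
interface NamenodeOps {
  List<String> getBlocks(String src);           // 1) source block metadata
  List<String> getLocations(String block);      // 2) datanode locations
  String addEmptyBlock(String dst);             // 3) empty block for dst
  boolean allBlocksReported(String dst);        // 5) datanodes have reported
}

interface DatanodeOps {
  // 4) local copy of a block; could be a hardlink when the source and
  //    destination block files live on the same datanode.
  void copyBlockLocally(String datanode, String srcBlock, String dstBlock);
}

final class FastCopySketch {
  static void fastCopy(NamenodeOps nn, DatanodeOps dn, String src, String dst)
      throws InterruptedException {
    for (String block : nn.getBlocks(src)) {
      String target = nn.addEmptyBlock(dst);
      for (String location : nn.getLocations(block)) {
        dn.copyBlockLocally(location, block, target);
      }
    }
    while (!nn.allBlocksReported(dst)) {        // 6) wait for all copies
      Thread.sleep(100);
    }
  }
}
```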



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-2139) Fast copy for HDFS.

2024-05-11 Thread ZanderXu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17581740#comment-17581740
 ] 

ZanderXu edited comment on HDFS-2139 at 5/11/24 9:13 AM:
-

https://docs.google.com/document/d/1uGHA2dXLldlNoaYF-4c63baYjCuft_T88wdvhwVgh6c/edit?usp=sharing

[~ferhui] [~weichiu] [~ayushtkn] [~pengbei] Master, sorry for the late design. 
Please help me review this design. I will start the development work in 
parallel.


was (Author: xuzq_zander):
https://docs.google.com/document/d/1OHdUpQmKD3TZ3xdmQsXNmlXJetn2QFPinMH31Q4BqkI/edit?usp=sharing

[~ferhui][~weichiu][~ayushtkn][~pengbei] Master, sorry for the late design. 
Please help me review this design. I will start the development work in 
parallel.

> Fast copy for HDFS.
> ---
>
> Key: HDFS-2139
> URL: https://issues.apache.org/jira/browse/HDFS-2139
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Pritam Damania
>Assignee: Rituraj
>Priority: Major
> Attachments: HDFS-2139-For-2.7.1.patch, HDFS-2139.patch, 
> HDFS-2139.patch, image-2022-08-11-11-48-17-994.png
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> There is a need to perform fast file copy on HDFS. The fast copy mechanism 
> for a file works as follows:
> 1) Query metadata for all blocks of the source file.
> 2) For each block 'b' of the file, find out its datanode locations.
> 3) For each block of the file, add an empty block to the namesystem for
> the destination file.
> 4) For each location of the block, instruct the datanode to make a local
> copy of that block.
> 5) Once each datanode has copied over its respective blocks, they
> report to the namenode about it.
> 6) Wait for all blocks to be copied and exit.
> This would speed up the copying process considerably by removing top of
> the rack data transfers.
> Note: An extra improvement would be to instruct the datanode to create a
> hardlink of the block file if we are copying a block on the same datanode.
> [~xuzq_zander] provided a design doc: 
> https://docs.google.com/document/d/1uGHA2dXLldlNoaYF-4c63baYjCuft_T88wdvhwVgh6c/edit?usp=sharing



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17509) RBF: Fix ClientProtocol.concat will throw NPE if tgr is a empty file.

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845562#comment-17845562
 ] 

ASF GitHub Bot commented on HDFS-17509:
---

LiuGuH commented on code in PR #6784:
URL: https://github.com/apache/hadoop/pull/6784#discussion_r1597405273


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpc.java:
##
@@ -1224,6 +1224,17 @@ public void testProxyConcatFile() throws Exception {
 String badPath = "/unknownlocation/unknowndir";
 compareResponses(routerProtocol, nnProtocol, m,
 new Object[] {badPath, new String[] {routerFile}});
+
+// Test when concat trg is a empty file

Review Comment:
   When srclist has an empty file, both the namenode and the dfsrouter will 
throw the same IOException: 
   
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirConcatOp.java#L153-L155





> RBF: Fix ClientProtocol.concat  will throw NPE if tgr is a empty file.
> --
>
> Key: HDFS-17509
> URL: https://issues.apache.org/jira/browse/HDFS-17509
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: liuguanghua
>Priority: Minor
>  Labels: pull-request-available
>
> hdfs dfs -concat  /tmp/merge /tmp/t1 /tmp/t2
> When /tmp/merge is an empty file, this command will throw an NPE via DFSRouter. 
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17509) RBF: Fix ClientProtocol.concat will throw NPE if tgr is a empty file.

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845559#comment-17845559
 ] 

ASF GitHub Bot commented on HDFS-17509:
---

LiuGuH commented on code in PR #6784:
URL: https://github.com/apache/hadoop/pull/6784#discussion_r1597404047


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterClientProtocol.java:
##
@@ -1009,6 +1000,20 @@ public HdfsFileStatus getFileInfo(String src) throws 
IOException {
 return ret;
   }
 
+  public RemoteResult 
getFileRemoteResult(String path)
+  throws IOException {
+rpcServer.checkOperation(NameNode.OperationCategory.READ);
+
+final List locations = rpcServer.getLocationsForPath(path, 
false, false);
+RemoteMethod method =
+new RemoteMethod("getFileInfo", new Class[] {String.class}, new 
RemoteParam());
+// Check for file information sequentially
+RemoteResult result =

Review Comment:
   This may not be true. Even if locations contains only one namespace, the 
router still cannot decide whether the file exists or not. So getFileInfo is 
better executed at least once; otherwise the request is sent to the namenode, 
and the namenode throws a file-not-found error. 
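   
   A hedged sketch of that guard (assuming RemoteResult exposes its value via
a getResult()-style accessor; accessor and variable names here may differ from
the actual class):
   
   ```java
   // Always resolve the concat target via getFileInfo once, even for a single
   // namespace, so a missing file surfaces as a clear error instead of an NPE.
   RemoteResult<RemoteLocation, HdfsFileStatus> result =
       getFileRemoteResult(trg);
   if (result == null || result.getResult() == null) {
     throw new FileNotFoundException("Target file " + trg + " does not exist.");
   }
   ```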





> RBF: Fix ClientProtocol.concat  will throw NPE if tgr is a empty file.
> --
>
> Key: HDFS-17509
> URL: https://issues.apache.org/jira/browse/HDFS-17509
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: liuguanghua
>Priority: Minor
>  Labels: pull-request-available
>
> hdfs dfs -concat  /tmp/merge /tmp/t1 /tmp/t2
> When /tmp/merge is an empty file, this command will throw an NPE via DFSRouter. 
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17497) Logic for committed blocks is mixed when computing file size

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845556#comment-17845556
 ] 

ASF GitHub Bot commented on HDFS-17497:
---

hfutatzhanghb commented on code in PR #6765:
URL: https://github.com/apache/hadoop/pull/6765#discussion_r1597392676


##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirectory.java:
##
@@ -1105,15 +1106,12 @@ static void unprotectedUpdateCount(INodesInPath 
inodesInPath,
   /**
* Update the cached quota space for a block that is being completed.
* Must only be called once, as the block is being completed.
-   * @param completeBlk - Completed block for which to update space
-   * @param inodes - INodes in path to file containing completeBlk; if null
-   * this will be resolved internally
+   * @param commitBlock - Committed block for which to update space
+   * @param iip - INodes in path to file containing committedBlock
*/
-  public void updateSpaceForCompleteBlock(BlockInfo completeBlk,
-  INodesInPath inodes) throws IOException {
+  public void updateSpaceForCommittedBlock(Block commitBlock,

Review Comment:
   Which would be the better parameter name: commitBlock or committedBlock?



##
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java:
##
@@ -3887,7 +3887,11 @@ void commitOrCompleteLastBlock(
   final Block commitBlock) throws IOException {
 assert hasWriteLock();
 Preconditions.checkArgument(fileINode.isUnderConstruction());
-blockManager.commitOrCompleteLastBlock(fileINode, commitBlock, iip);
+if (!blockManager.commitOrCompleteLastBlock(fileINode, commitBlock)) {
+  return;
+}
+// Updating QuotaUsage when committing block since block size will not be 
changed
+getFSDirectory().updateSpaceForCommittedBlock(commitBlock, iip);

Review Comment:
   Sir, how about
   ```java
   if (blockManager.commitOrCompleteLastBlock(fileINode, commitBlock)) {
   // Updating QuotaUsage when committing block since block size will 
not be changed
   getFSDirectory().updateSpaceForCommittedBlock(commitBlock, iip);
   }
   ```





> Logic for committed blocks is mixed when computing file size
> 
>
> Key: HDFS-17497
> URL: https://issues.apache.org/jira/browse/HDFS-17497
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> One in-writing HDFS file may contains multiple committed blocks, as follows 
> (assume one file contains three blocks):
> || ||Block 1||Block 2||Block 3||
> |Case 1|Complete|Commit|UnderConstruction|
> |Case 2|Complete|Commit|Commit|
> |Case 3|Commit|Commit|Commit|
>  
> But the logic for committed blocks is mixed when computing file size: it 
> ignores the bytes of the last committed block yet includes the bytes of the 
> other committed blocks.
> {code:java}
> public final long computeFileSize(boolean includesLastUcBlock,
> boolean usePreferredBlockSize4LastUcBlock) {
>   if (blocks.length == 0) {
> return 0;
>   }
>   final int last = blocks.length - 1;
>   //check if the last block is BlockInfoUnderConstruction
>   BlockInfo lastBlk = blocks[last];
>   long size = lastBlk.getNumBytes();
>   // the last committed block is not complete, so its bytes may be ignored.
>   if (!lastBlk.isComplete()) {
>  if (!includesLastUcBlock) {
>size = 0;
>  } else if (usePreferredBlockSize4LastUcBlock) {
>size = isStriped()?
>getPreferredBlockSize() *
>((BlockInfoStriped)lastBlk).getDataBlockNum() :
>getPreferredBlockSize();
>  }
>   }
>   // The bytes of other committed blocks are calculated into the file length.
>   for (int i = 0; i < last; i++) {
> size += blocks[i].getNumBytes();
>   }
>   return size;
> } {code}
> The bytes of one committed block will not be changed, so the bytes of the 
> last committed block should be calculated into the file length too.
>  
> And the logic for committed blocks is mixed too when computing file length in 
> DFSInputStream. Normally DFSInputStream does not need to fetch the visible 
> length for a committed block, regardless of whether the committed block is 
> the last block or not.
>  
> -HDFS-10843- encountered one bug which was actually caused by a committed 
> block, and fixed it by updating quota usage when completing the block. The 
> number of bytes of a committed block will no longer change, so we should 
> update the quota usage when the block is committed, which reduces the 
> quota-usage delta sooner.
>  
> So there are some things we need to do:
>  * Unify the calculation logic for all committed blocks in {{computeFileSize}} of 
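
The description is truncated above, but based on the part shown, a unified
treatment could look like the sketch below (an illustration, not the committed
patch; the COMMITTED-state check via getBlockUCState() is an assumption about
the available BlockInfo API):

{code:java}
// Sketch: committed bytes are final, so only a block still UNDER_CONSTRUCTION
// is subject to the includesLastUcBlock / preferred-size special-casing.
public final long computeFileSize(boolean includesLastUcBlock,
    boolean usePreferredBlockSize4LastUcBlock) {
  if (blocks.length == 0) {
    return 0;
  }
  final int last = blocks.length - 1;
  final BlockInfo lastBlk = blocks[last];
  long size = lastBlk.getNumBytes();
  final boolean lastIsCommitted =
      lastBlk.getBlockUCState() == BlockUCState.COMMITTED;
  // Only a truly under-construction last block may have its bytes ignored
  // or replaced by the preferred block size.
  if (!lastBlk.isComplete() && !lastIsCommitted) {
    if (!includesLastUcBlock) {
      size = 0;
    } else if (usePreferredBlockSize4LastUcBlock) {
      size = isStriped()
          ? getPreferredBlockSize()
              * ((BlockInfoStriped) lastBlk).getDataBlockNum()
          : getPreferredBlockSize();
    }
  }
  // Bytes of all other blocks, complete or committed, are always counted.
  for (int i = 0; i < last; i++) {
    size += blocks[i].getNumBytes();
  }
  return size;
}
{code}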

[jira] [Updated] (HDFS-17492) [FGL] Abstract a INodeLockManager to manage acquiring and releasing locks in the directory-tree [I]

2024-05-11 Thread ZanderXu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZanderXu updated HDFS-17492:

Summary: [FGL] Abstract a INodeLockManager to manage acquiring and 
releasing locks in the directory-tree [I]  (was: [FGL] Abstract a 
INodeLockManager to manage acquiring and releasing locks in the directory-tree)

> [FGL] Abstract a INodeLockManager to manage acquiring and releasing locks in 
> the directory-tree [I]
> ---
>
> Key: HDFS-17492
> URL: https://issues.apache.org/jira/browse/HDFS-17492
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>
> Abstract an INodeLockManager to manage acquiring and releasing locks in the 
> directory-tree.
>  # Abstract a lock type to cover all cases in NN
>  # Acquire the full path lock for the input path based on the input lock type
>  # Acquire the full path lock for the input iNodeId based on the input lock 
> type
>  # Acquire the full path lock for some input paths, such as for rename, concat
>  
> INodeLockManager should return an IIP which contains both iNodes and locks (a 
> hypothetical shape is sketched below).
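
A hypothetical shape for such a manager, assuming nothing beyond what the
description lists (every name below is illustrative, not committed API):

{code:java}
public interface INodeLockManager {
  /** Lock modes covering the NameNode's cases (illustrative). */
  enum LockType { READ, WRITE }

  /** Acquire locks along the full path and resolve it to an IIP. */
  INodesInPath lock(String path, LockType type) throws IOException;

  /** Acquire locks for the full path of the given inode id. */
  INodesInPath lock(long iNodeId, LockType type) throws IOException;

  /** Acquire locks for several paths at once, e.g. for rename or concat. */
  List<INodesInPath> lock(List<String> paths, LockType type)
      throws IOException;

  /** Release everything acquired by one of the lock calls above. */
  void unlock(INodesInPath iip);
}
{code}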



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17488) DN can fail IBRs with NPE when a volume is removed

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1784#comment-1784
 ] 

ASF GitHub Bot commented on HDFS-17488:
---

hadoop-yetus commented on PR #6759:
URL: https://github.com/apache/hadoop/pull/6759#issuecomment-2105622801

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | -1 :x: |  patch  |   0m 55s |  |  https://github.com/apache/hadoop/pull/6759 does not apply to trunk. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute for help.  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6759 |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6759/7/console |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> DN can fail IBRs with NPE when a volume is removed
> --
>
> Key: HDFS-17488
> URL: https://issues.apache.org/jira/browse/HDFS-17488
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>
>
>  
> Error logs
> {code:java}
> 2024-04-22 15:46:33,422 [BP-1842952724-10.22.68.249-1713771988830 
> heartbeating to localhost/127.0.0.1:64977] ERROR datanode.DataNode 
> (BPServiceActor.java:run(922)) - Exception in BPOfferService for Block pool 
> BP-1842952724-10.22.68.249-1713771988830 (Datanode Uuid 
> 1659ffaf-1a80-4a8e-a542-643f6bd97ed4) service to localhost/127.0.0.1:64977
> java.lang.NullPointerException
>     at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.blockReceivedAndDeleted(DatanodeProtocolClientSideTranslatorPB.java:246)
>     at 
> org.apache.hadoop.hdfs.server.datanode.IncrementalBlockReportManager.sendIBRs(IncrementalBlockReportManager.java:218)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:749)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:920)
>     at java.lang.Thread.run(Thread.java:748) {code}
> The root cause is in BPOfferService#notifyNamenodeBlock; it happens when the 
> method is called on a block belonging to a volume that was already removed. 
> Because the volume was already removed:
>  
> {code:java}
> private void notifyNamenodeBlock(ExtendedBlock block, BlockStatus status,
> String delHint, String storageUuid, boolean isOnTransientStorage) {
>   checkBlock(block);
>   final ReceivedDeletedBlockInfo info = new ReceivedDeletedBlockInfo(
>   block.getLocalBlock(), status, delHint);
>   final DatanodeStorage storage = dn.getFSDataset().getStorage(storageUuid);
>   
>   // storage == null here because it's already removed earlier.
>   for (BPServiceActor actor : bpServices) {
> actor.getIbrManager().notifyNamenodeBlock(info, storage,
> isOnTransientStorage);
>   }
> } {code}
> so IBRs with a null storage are now pending.
> The reason why notifyNamenodeBlock can trigger on such blocks is up in 
> DirectoryScanner#reconcile
> {code:java}
>   public void reconcile() throws IOException {
>     LOG.debug("reconcile start DirectoryScanning");
>     scan();
> // If a volume is removed here after scan() already finished running,
> // diffs is stale and checkAndUpdate will run on a removed volume
>     // HDFS-14476: run checkAndUpdate with batch to avoid holding the lock too
>     // long
>     int loopCount = 0;
>     synchronized (diffs) {
>       for (final Map.Entry entry : diffs.getEntries()) {
>         dataset.checkAndUpdate(entry.getKey(), entry.getValue());        
>     ...
>   } {code}
> Inside checkAndUpdate, memBlockInfo is null because all the block metadata in 
> memory was removed during the volume removal, but the diskFile still exists. 
> Then DataNode#notifyNamenodeDeletedBlock (and further down the line, 
> notifyNamenodeBlock) is called on this block.
>  
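
One defensive option, sketched from the quoted code (an illustration, not
necessarily the merged fix; the warning message is invented for the sketch):

{code:java}
private void notifyNamenodeBlock(ExtendedBlock block, BlockStatus status,
    String delHint, String storageUuid, boolean isOnTransientStorage) {
  checkBlock(block);
  final DatanodeStorage storage = dn.getFSDataset().getStorage(storageUuid);
  if (storage == null) {
    // The volume was removed after the directory scan; queuing this block
    // would leave an IBR with a null storage and later an NPE in sendIBRs.
    LOG.warn("Ignoring block {} on removed storage {}", block, storageUuid);
    return;
  }
  final ReceivedDeletedBlockInfo info = new ReceivedDeletedBlockInfo(
      block.getLocalBlock(), status, delHint);
  for (BPServiceActor actor : bpServices) {
    actor.getIbrManager().notifyNamenodeBlock(info, storage,
        isOnTransientStorage);
  }
}
{code}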



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-17488) DN can fail IBRs with NPE when a volume is removed

2024-05-11 Thread ZanderXu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZanderXu resolved HDFS-17488.
-
Fix Version/s: 3.5.0
   Resolution: Fixed

> DN can fail IBRs with NPE when a volume is removed
> --
>
> Key: HDFS-17488
> URL: https://issues.apache.org/jira/browse/HDFS-17488
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.0
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17488) DN can fail IBRs with NPE when a volume is removed

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845551#comment-17845551
 ] 

ASF GitHub Bot commented on HDFS-17488:
---

ZanderXu commented on PR #6759:
URL: https://github.com/apache/hadoop/pull/6759#issuecomment-2105617264

   Merged. Thanks @kokonguyen191 for your contribution and thanks @Hexiaoqiao 
@haiyang1987 @hfutatzhanghb for your review. 




> DN can fail IBRs with NPE when a volume is removed
> --
>
> Key: HDFS-17488
> URL: https://issues.apache.org/jira/browse/HDFS-17488
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17488) DN can fail IBRs with NPE when a volume is removed

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845550#comment-17845550
 ] 

ASF GitHub Bot commented on HDFS-17488:
---

ZanderXu merged PR #6759:
URL: https://github.com/apache/hadoop/pull/6759




> DN can fail IBRs with NPE when a volume is removed
> --
>
> Key: HDFS-17488
> URL: https://issues.apache.org/jira/browse/HDFS-17488
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Felix N
>Assignee: Felix N
>Priority: Major
>  Labels: pull-request-available
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17509) RBF: Fix ClientProtocol.concat will throw NPE if trg is an empty file.

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845548#comment-17845548
 ] 

ASF GitHub Bot commented on HDFS-17509:
---

ZanderXu commented on code in PR #6784:
URL: https://github.com/apache/hadoop/pull/6784#discussion_r1597385087


##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterClientProtocol.java:
##
@@ -1009,6 +1000,20 @@ public HdfsFileStatus getFileInfo(String src) throws IOException {
 return ret;
   }
 
+  public RemoteResult<RemoteLocation, HdfsFileStatus> getFileRemoteResult(String path)
+      throws IOException {
+    rpcServer.checkOperation(NameNode.OperationCategory.READ);
+
+    final List<RemoteLocation> locations =
+        rpcServer.getLocationsForPath(path, false, false);
+    RemoteMethod method = new RemoteMethod("getFileInfo",
+        new Class<?>[] {String.class}, new RemoteParam());
+    // Check for file information sequentially
+    RemoteResult<RemoteLocation, HdfsFileStatus> result =

Review Comment:
   If `locations` only contains one namespace, we can returns this namespace 
directly instead of getting the namespace through `getFileInfo`, right?
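
   Rough shape of the suggested short-circuit (illustrative; the RemoteResult
   constructor used here is an assumption about the RBF API):

   ```java
   // With exactly one candidate namespace there is nothing to disambiguate,
   // so skip the extra getFileInfo round trip and answer from the mount table.
   if (locations.size() == 1) {
     return new RemoteResult<>(locations.get(0), null);
   }
   ```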



##
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpc.java:
##
@@ -1224,6 +1224,17 @@ public void testProxyConcatFile() throws Exception {
 String badPath = "/unknownlocation/unknowndir";
 compareResponses(routerProtocol, nnProtocol, m,
 new Object[] {badPath, new String[] {routerFile}});
+
+// Test when concat trg is an empty file

Review Comment:
   Do the namenode and rbf throw the same Exception?
   
   Maybe RBF throws NPE, but NN throws 
`org.apache.hadoop.HadoopIllegalArgumentException`.
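
   A hedged sketch of how the test could pin the exception type down on both
   paths (LambdaTestUtils ships with hadoop-common's test artifacts; the
   expected exception class and the file names are assumptions to verify
   against the NameNode's actual behavior):

   ```java
   // Both the router path and the direct NN path should fail the same way
   // for an empty concat target once the NPE is fixed.
   LambdaTestUtils.intercept(HadoopIllegalArgumentException.class,
       () -> routerProtocol.concat(emptyTargetFile, new String[] {srcFile}));
   LambdaTestUtils.intercept(HadoopIllegalArgumentException.class,
       () -> nnProtocol.concat(emptyTargetFile, new String[] {srcFile}));
   ```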





> RBF: Fix ClientProtocol.concat will throw NPE if trg is an empty file.
> --
>
> Key: HDFS-17509
> URL: https://issues.apache.org/jira/browse/HDFS-17509
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: liuguanghua
>Priority: Minor
>  Labels: pull-request-available
>
> hdfs dfs -concat  /tmp/merge /tmp/t1 /tmp/t2
> When /tmp/merge is an empty file, this command will throw NPE via DFSRouter. 
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17520) TestDFSAdmin.testAllDatanodesReconfig and TestDFSAdmin.testDecommissionDataNodesReconfig failed

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845545#comment-17845545
 ] 

ASF GitHub Bot commented on HDFS-17520:
---

ZanderXu commented on PR #6812:
URL: https://github.com/apache/hadoop/pull/6812#issuecomment-2105610789

   @slfan1989 Master, I see you are familiar with 
`testDecommissionDataNodesReconfig`, please help me review it. Thanks 




> TestDFSAdmin.testAllDatanodesReconfig and 
> TestDFSAdmin.testDecommissionDataNodesReconfig failed
> ---
>
> Key: HDFS-17520
> URL: https://issues.apache.org/jira/browse/HDFS-17520
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> {code:java}
> [ERROR] Tests run: 21, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 
> 44.521 s <<< FAILURE! - in org.apache.hadoop.hdfs.tools.TestDFSAdmin
> [ERROR] testAllDatanodesReconfig(org.apache.hadoop.hdfs.tools.TestDFSAdmin)  
> Time elapsed: 2.086 s  <<< FAILURE!
> java.lang.AssertionError: 
> Expecting:
>  <["Reconfiguring status for node [127.0.0.1:43731]: SUCCESS: Changed 
> property dfs.datanode.peer.stats.enabled",
> " From: "false"",
> " To: "true"",
> "started at Fri May 10 13:02:51 UTC 2024 and finished at Fri May 10 
> 13:02:51 UTC 2024."]>
> to contain subsequence:
>  <["SUCCESS: Changed property dfs.datanode.peer.stats.enabled",
> " From: "false"",
> " To: "true""]>
>   at 
> org.apache.hadoop.hdfs.tools.TestDFSAdmin.testAllDatanodesReconfig(TestDFSAdmin.java:1286)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>   at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
>   at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) 
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-17520) TestDFSAdmin.testAllDatanodesReconfig and TestDFSAdmin.testDecommissionDataNodesReconfig failed

2024-05-11 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-17520:
--
Labels: pull-request-available  (was: )

> TestDFSAdmin.testAllDatanodesReconfig and 
> TestDFSAdmin.testDecommissionDataNodesReconfig failed
> ---
>
> Key: HDFS-17520
> URL: https://issues.apache.org/jira/browse/HDFS-17520
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17520) TestDFSAdmin.testAllDatanodesReconfig and TestDFSAdmin.testDecommissionDataNodesReconfig failed

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845544#comment-17845544
 ] 

ASF GitHub Bot commented on HDFS-17520:
---

ZanderXu opened a new pull request, #6812:
URL: https://github.com/apache/hadoop/pull/6812

   TestDFSAdmin.testAllDatanodesReconfig and 
TestDFSAdmin.testDecommissionDataNodesReconfig failed. 
   
   [HDFS-17506](https://github.com/apache/hadoop/pull/6806) encountered this 
failing UT, with an error message like:
   ```
   [ERROR] Tests run: 21, Failures: 3, Errors: 0, Skipped: 0, Time elapsed: 
44.521 s <<< FAILURE! - in org.apache.hadoop.hdfs.tools.TestDFSAdmin
   [ERROR] testAllDatanodesReconfig(org.apache.hadoop.hdfs.tools.TestDFSAdmin)  
Time elapsed: 2.086 s  <<< FAILURE!
   java.lang.AssertionError: 
   
   Expecting:
<["Reconfiguring status for node [127.0.0.1:43731]: SUCCESS: Changed 
property dfs.datanode.peer.stats.enabled",
   "From: "false"",
   "To: "true"",
   "started at Fri May 10 13:02:51 UTC 2024 and finished at Fri May 10 
13:02:51 UTC 2024."]>
   to contain subsequence:
<["SUCCESS: Changed property dfs.datanode.peer.stats.enabled",
   "From: "false"",
   "To: "true""]>
   
at 
org.apache.hadoop.hdfs.tools.TestDFSAdmin.testAllDatanodesReconfig(TestDFSAdmin.java:1286)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
   ```
   
   `getReconfigurationStatusUtil` concurrently gets the reconfiguration status 
from multiple DNs, which can interleave the messages written to the output, 
such as:
   ```
   Line1: Reconfiguring status for node [127.0.0.1:65229]: Reconfiguring status 
for node [127.0.0.1:65224]: started at Sat May 11 15:05:49 CST 2024started at 
Sat May 11 15:05:49 CST 2024 and finished at Sat May 11 15:05:49 CST 2024.
   
   Line2: and finished at Sat May 11 15:05:49 CST 2024.
   ```
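
   A sketch of the buffering idea (names are illustrative, not the committed
   fix): render each datanode's report into its own buffer and emit it with a
   single write, so parallel status fetches cannot interleave lines.

   ```java
   // Build the whole per-node report off to the side...
   StringBuilder report = new StringBuilder();
   report.append("Reconfiguring status for node [").append(nodeAddr)
       .append("]: ");
   // ...append the started/finished line and one line per changed property...
   // ...then emit it atomically with respect to the other fetch threads.
   synchronized (out) {
     out.println(report);
   }
   ```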




> TestDFSAdmin.testAllDatanodesReconfig and 
> TestDFSAdmin.testDecommissionDataNodesReconfig failed
> ---
>
> Key: HDFS-17520
> URL: https://issues.apache.org/jira/browse/HDFS-17520
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>

[jira] [Created] (HDFS-17520) TestDFSAdmin.testAllDatanodesReconfig and TestDFSAdmin.testDecommissionDataNodesReconfig failed

2024-05-11 Thread ZanderXu (Jira)
ZanderXu created HDFS-17520:
---

 Summary: TestDFSAdmin.testAllDatanodesReconfig and 
TestDFSAdmin.testDecommissionDataNodesReconfig failed
 Key: HDFS-17520
 URL: https://issues.apache.org/jira/browse/HDFS-17520
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: ZanderXu
Assignee: ZanderXu





--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-17486) VIO: dumpXattrs logic optimization

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-17486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845541#comment-17845541
 ] 

ASF GitHub Bot commented on HDFS-17486:
---

YaAYadeer commented on PR #6797:
URL: https://github.com/apache/hadoop/pull/6797#issuecomment-2105598177

   The test class TestOfflineImageViewer already exists.
   The TestOfflineImageViewer.testPBImageXmlWriter() method will call 
dumpXattrs whenever the input image contains xattrs.
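
   A sketch of the reuse the Jira proposes (assuming
   FSImageFormatPBINode.Loader.loadXAttrs keeps its current signature and that
   PBImageXmlWriter's o() helper and loaded stringTable are in scope):

   ```java
   private void dumpXattrs(INodeSection.XAttrFeatureProto xattrs) {
     out.print("<xattrs>");
     // Delegate the proto decoding to the loader instead of duplicating it.
     for (XAttr xattr :
         FSImageFormatPBINode.Loader.loadXAttrs(xattrs, stringTable)) {
       out.print("<xattr>");
       o("ns", xattr.getNameSpace().toString());
       o("name", xattr.getName());
       // Keep the existing encoding of the raw value bytes here.
       out.print("</xattr>");
     }
     out.print("</xattrs>");
   }
   ```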




> VIO: dumpXattrs logic optimization
> --
>
> Key: HDFS-17486
> URL: https://issues.apache.org/jira/browse/HDFS-17486
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.2.0, 3.3.3
>Reporter: wangzhihui
>Priority: Minor
>  Labels: pull-request-available
>
> The dumpXattrs logic in the OIV (offline image viewer) should use 
> FSImageFormatPBINode.Loader.loadXAttrs() to get the XAttrs attribute, for 
> easier maintenance.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org