[jira] [Work logged] (HDFS-16386) Reduce DataNode load when FsDatasetAsyncDiskService is working

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16386?focusedWorklogId=774362&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774362
 ]

ASF GitHub Bot logged work on HDFS-16386:
-

Author: ASF GitHub Bot
Created on: 25/May/22 05:00
Start Date: 25/May/22 05:00
Worklog Time Spent: 10m 
  Work Description: ZanderXu commented on PR #3806:
URL: https://github.com/apache/hadoop/pull/3806#issuecomment-1136727763

   Thanks, and I will create a new PR to do it.




Issue Time Tracking
---

Worklog Id: (was: 774362)
Time Spent: 4h 10m  (was: 4h)

> Reduce DataNode load when FsDatasetAsyncDiskService is working
> --
>
> Key: HDFS-16386
> URL: https://issues.apache.org/jira/browse/HDFS-16386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.2, 3.2.4
>
> Attachments: monitor.png
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Our DataNode node has 36 disks. When FsDatasetAsyncDiskService is working, it
> can cause a high load on the DataNode.
> Here is some monitoring data related to memory:
>  !monitor.png! 
> Since each disk deletes blocks asynchronously, and each disk allows up to 4
> threads to work (so a 36-disk node can run up to 36 x 4 = 144 deletion
> threads), this causes some trouble for the DataNode, such as increased CPU
> and memory usage.
> We should appropriately reduce the total number of deletion threads so that
> the DataNode can work better.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16386) Reduce DataNode load when FsDatasetAsyncDiskService is working

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16386?focusedWorklogId=774354&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774354
 ]

ASF GitHub Bot logged work on HDFS-16386:
-

Author: ASF GitHub Bot
Created on: 25/May/22 04:19
Start Date: 25/May/22 04:19
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on PR #3806:
URL: https://github.com/apache/hadoop/pull/3806#issuecomment-1136708070

   @ZanderXu, nice to communicate with you.
   I suggest that the number of active threads here should be set reasonably,
according to the load capacity of the cluster.




Issue Time Tracking
---

Worklog Id: (was: 774354)
Time Spent: 4h  (was: 3h 50m)

> Reduce DataNode load when FsDatasetAsyncDiskService is working
> --
>
> Key: HDFS-16386
> URL: https://issues.apache.org/jira/browse/HDFS-16386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.2, 3.2.4
>
> Attachments: monitor.png
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>
> Our DataNode node has 36 disks. When FsDatasetAsyncDiskService is working, it
> can cause a high load on the DataNode.
> Here is some monitoring data related to memory:
>  !monitor.png! 
> Since each disk deletes blocks asynchronously, and each disk allows up to 4
> threads to work (so a 36-disk node can run up to 36 x 4 = 144 deletion
> threads), this causes some trouble for the DataNode, such as increased CPU
> and memory usage.
> We should appropriately reduce the total number of deletion threads so that
> the DataNode can work better.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16386) Reduce DataNode load when FsDatasetAsyncDiskService is working

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16386?focusedWorklogId=774343&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774343
 ]

ASF GitHub Bot logged work on HDFS-16386:
-

Author: ASF GitHub Bot
Created on: 25/May/22 03:52
Start Date: 25/May/22 03:52
Worklog Time Spent: 10m 
  Work Description: ZanderXu commented on PR #3806:
URL: https://github.com/apache/hadoop/pull/3806#issuecomment-1136695695

   Thanks @jianghuazhu for your comment. 
   - I have a question: if the queue is unbounded, will the number of active 
threads in the ThreadPool ever be greater than the number of core threads?
   - I think we need to support dynamically adjusting the number of core 
threads, so that we can tune it in time for different loads to achieve the 
best result. 
   
   




Issue Time Tracking
---

Worklog Id: (was: 774343)
Time Spent: 3h 50m  (was: 3h 40m)

> Reduce DataNode load when FsDatasetAsyncDiskService is working
> --
>
> Key: HDFS-16386
> URL: https://issues.apache.org/jira/browse/HDFS-16386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.2, 3.2.4
>
> Attachments: monitor.png
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> Our DataNode node has 36 disks. When FsDatasetAsyncDiskService is working, it
> can cause a high load on the DataNode.
> Here is some monitoring data related to memory:
>  !monitor.png! 
> Since each disk deletes blocks asynchronously, and each disk allows up to 4
> threads to work (so a 36-disk node can run up to 36 x 4 = 144 deletion
> threads), this causes some trouble for the DataNode, such as increased CPU
> and memory usage.
> We should appropriately reduce the total number of deletion threads so that
> the DataNode can work better.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16555) rename mixlead method name in DistCpOptions

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16555?focusedWorklogId=774339&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774339
 ]

ASF GitHub Bot logged work on HDFS-16555:
-

Author: ASF GitHub Bot
Created on: 25/May/22 03:41
Start Date: 25/May/22 03:41
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4216:
URL: https://github.com/apache/hadoop/pull/4216#issuecomment-1136691451

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |  19m 30s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 3 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  40m 57s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   0m 43s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   0m 39s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 39s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 45s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 44s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 13s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 16s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 28s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 32s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   0m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   0m 26s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   0m 26s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 20s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 23s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 23s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   0m 54s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 42s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  47m 16s |  |  hadoop-distcp in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 44s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 165m 20s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4216/6/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4216 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 7072f516dc64 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 
17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 68e501c931618a65d46a9eebc2b12ffd0dabadc5 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4216/6/testReport/ |
   | Max. process+thread count | 606 (vs. ulimit of 5500) |
   | modules | C: hadoop-tools/hadoop-distcp U: hadoop-tools/hadoop-distcp |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4216/6/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

[jira] [Work logged] (HDFS-16593) Correct inaccurate BlocksRemoved metric on DataNode side

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16593?focusedWorklogId=774333&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774333
 ]

ASF GitHub Bot logged work on HDFS-16593:
-

Author: ASF GitHub Bot
Created on: 25/May/22 02:55
Start Date: 25/May/22 02:55
Worklog Time Spent: 10m 
  Work Description: ZanderXu opened a new pull request, #4353:
URL: https://github.com/apache/hadoop/pull/4353

   Correct inaccurate BlocksRemoved metric on DataNode side




Issue Time Tracking
---

Worklog Id: (was: 774333)
Remaining Estimate: 0h
Time Spent: 10m

> Correct inaccurate BlocksRemoved metric on DataNode side
> 
>
> Key: HDFS-16593
> URL: https://issues.apache.org/jira/browse/HDFS-16593
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When tracing the root cause of a production issue, I found that the 
> BlocksRemoved metric on the DataNode side was inaccurate.
> {code:java}
> case DatanodeProtocol.DNA_INVALIDATE:
>   //
>   // Some local block(s) are obsolete and can be 
>   // safely garbage-collected.
>   //
>   Block toDelete[] = bcmd.getBlocks();
>   try {
> // using global fsdataset
> dn.getFSDataset().invalidate(bcmd.getBlockPoolId(), toDelete);
>   } catch(IOException e) {
> // Exceptions caught here are not expected to be disk-related.
> throw e;
>   }
>   dn.metrics.incrBlocksRemoved(toDelete.length);
>   break;
> {code}
> This is because even if the invalidate method throws an exception, some 
> blocks may have been successfully deleted internally before the failure.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16593) Correct inaccurate BlocksRemoved metric on DataNode side

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16593:
--
Labels: pull-request-available  (was: )

> Correct inaccurate BlocksRemoved metric on DataNode side
> 
>
> Key: HDFS-16593
> URL: https://issues.apache.org/jira/browse/HDFS-16593
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When tracing the root cause of a production issue, I found that the 
> BlocksRemoved metric on the DataNode side was inaccurate.
> {code:java}
> case DatanodeProtocol.DNA_INVALIDATE:
>   //
>   // Some local block(s) are obsolete and can be 
>   // safely garbage-collected.
>   //
>   Block toDelete[] = bcmd.getBlocks();
>   try {
> // using global fsdataset
> dn.getFSDataset().invalidate(bcmd.getBlockPoolId(), toDelete);
>   } catch(IOException e) {
> // Exceptions caught here are not expected to be disk-related.
> throw e;
>   }
>   dn.metrics.incrBlocksRemoved(toDelete.length);
>   break;
> {code}
> This is because even if the invalidate method throws an exception, some 
> blocks may have been successfully deleted internally before the failure.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16593) Correct inaccurate BlocksRemoved metric on DataNode side

2022-05-24 Thread ZanderXu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ZanderXu updated HDFS-16593:

Priority: Minor  (was: Major)

> Correct inaccurate BlocksRemoved metric on DataNode side
> 
>
> Key: HDFS-16593
> URL: https://issues.apache.org/jira/browse/HDFS-16593
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Minor
>
> When tracing the root cause of a production issue, I found that the 
> BlocksRemoved metric on the DataNode side was inaccurate.
> {code:java}
> case DatanodeProtocol.DNA_INVALIDATE:
>   //
>   // Some local block(s) are obsolete and can be 
>   // safely garbage-collected.
>   //
>   Block toDelete[] = bcmd.getBlocks();
>   try {
> // using global fsdataset
> dn.getFSDataset().invalidate(bcmd.getBlockPoolId(), toDelete);
>   } catch(IOException e) {
> // Exceptions caught here are not expected to be disk-related.
> throw e;
>   }
>   dn.metrics.incrBlocksRemoved(toDelete.length);
>   break;
> {code}
> This is because even if the invalidate method throws an exception, some 
> blocks may have been successfully deleted internally before the failure.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16593) Correct inaccurate BlocksRemoved metric on DataNode side

2022-05-24 Thread ZanderXu (Jira)
ZanderXu created HDFS-16593:
---

 Summary: Correct inaccurate BlocksRemoved metric on DataNode side
 Key: HDFS-16593
 URL: https://issues.apache.org/jira/browse/HDFS-16593
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: ZanderXu
Assignee: ZanderXu


When tracing the root cause of a production issue, I found that the BlocksRemoved 
metric on the DataNode side was inaccurate.

{code:java}
case DatanodeProtocol.DNA_INVALIDATE:
  //
  // Some local block(s) are obsolete and can be 
  // safely garbage-collected.
  //
  Block toDelete[] = bcmd.getBlocks();
  try {
// using global fsdataset
dn.getFSDataset().invalidate(bcmd.getBlockPoolId(), toDelete);
  } catch(IOException e) {
// Exceptions caught here are not expected to be disk-related.
throw e;
  }
  dn.metrics.incrBlocksRemoved(toDelete.length);
  break;
{code}

This is because even if the invalidate method throws an exception, some blocks 
may have been successfully deleted internally before the failure.
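
A hedged sketch of the counting problem, with hypothetical names rather than the 
actual Hadoop classes: if invalidate() removes some blocks and then throws, 
incrementing the metric by toDelete.length in the caller either never runs 
(under-count) or, if moved before the call, over-counts. Counting per 
successfully deleted block keeps the metric accurate either way.

{code:java}
import java.io.IOException;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only -- BlocksRemovedSketch and its invalidate() are
// hypothetical stand-ins, not the HDFS-16593 patch itself.
class BlocksRemovedSketch {
  static final AtomicLong blocksRemoved = new AtomicLong();

  static void invalidate(String[] toDelete) throws IOException {
    for (String block : toDelete) {
      if (block.isEmpty()) {
        throw new IOException("deletion failed part-way");  // partial failure
      }
      blocksRemoved.incrementAndGet();  // count each successful deletion
    }
  }

  public static void main(String[] args) {
    try {
      invalidate(new String[] {"blk_1", "blk_2", "", "blk_4"});
    } catch (IOException e) {
      // Two blocks were really removed before the failure.
      System.out.println("blocksRemoved = " + blocksRemoved.get());
    }
  }
}
{code}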




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16386) Reduce DataNode load when FsDatasetAsyncDiskService is working

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16386?focusedWorklogId=774332&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774332
 ]

ASF GitHub Bot logged work on HDFS-16386:
-

Author: ASF GitHub Bot
Created on: 25/May/22 02:37
Start Date: 25/May/22 02:37
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on PR #3806:
URL: https://github.com/apache/hadoop/pull/3806#issuecomment-1136659630

   Thanks @ZanderXu for following up.
   Here are some explanations:
   1. The main job of FsDatasetAsyncDiskService is to delete replica files, 
synchronously or asynchronously. The replica files to be deleted are all files 
on the local DataNode, so their number is limited. Although the thread pool 
uses an unbounded queue, tasks do not accumulate indefinitely, because they 
are always being consumed. And these replicas have already been loaded into 
memory while the DataNode is working, so the probability of an OOM here is 
very low.
   2. If a replica is deleted asynchronously, the thread pool does the work. 
Each disk corresponds to one thread pool, and each pool has at most 4 fixed 
threads; this limit is fixed. In our cluster, DataNodes have different numbers 
of disks: nodes with 12, 36, and 60 disks all exist. Taking a DataNode with 36 
or 60 disks as an example, during peak hours the DataNode needs to start a lot 
of threads. Adjusting the number of threads flexibly would reduce the workload 
of the DataNode.
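   
   A minimal sketch of the per-volume pool setup being described (constant 
   names follow the FsDatasetAsyncDiskService snippet quoted elsewhere in this 
   thread; the core size of 1 and the surrounding class are assumptions for 
   illustration, not the actual Hadoop source):
   
   ```java
   import java.util.concurrent.LinkedBlockingQueue;
   import java.util.concurrent.ThreadPoolExecutor;
   import java.util.concurrent.TimeUnit;
   
   class PerVolumeDeletionPools {
       static final int CORE_THREADS_PER_VOLUME = 1;   // assumed core size
       static final int MAX_THREADS_PER_VOLUME = 4;    // the fixed per-disk cap discussed above
       static final long THREADS_KEEP_ALIVE_SECONDS = 60;
   
       // One executor per disk volume: a 36-disk DataNode gets 36 pools,
       // i.e. up to 36 * 4 = 144 deletion threads in the worst case.
       static ThreadPoolExecutor newVolumeExecutor() {
           return new ThreadPoolExecutor(
               CORE_THREADS_PER_VOLUME, MAX_THREADS_PER_VOLUME,
               THREADS_KEEP_ALIVE_SECONDS, TimeUnit.SECONDS,
               new LinkedBlockingQueue<Runnable>());
       }
   }
   ```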




Issue Time Tracking
---

Worklog Id: (was: 774332)
Time Spent: 3h 40m  (was: 3.5h)

> Reduce DataNode load when FsDatasetAsyncDiskService is working
> --
>
> Key: HDFS-16386
> URL: https://issues.apache.org/jira/browse/HDFS-16386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.2, 3.2.4
>
> Attachments: monitor.png
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Our DataNode node has 36 disks. When FsDatasetAsyncDiskService is working, it
> can cause a high load on the DataNode.
> Here is some monitoring data related to memory:
>  !monitor.png! 
> Since each disk deletes blocks asynchronously, and each disk allows up to 4
> threads to work (so a 36-disk node can run up to 36 x 4 = 144 deletion
> threads), this causes some trouble for the DataNode, such as increased CPU
> and memory usage.
> We should appropriately reduce the total number of deletion threads so that
> the DataNode can work better.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16555) rename mixlead method name in DistCpOptions

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16555?focusedWorklogId=774313&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774313
 ]

ASF GitHub Bot logged work on HDFS-16555:
-

Author: ASF GitHub Bot
Created on: 25/May/22 00:56
Start Date: 25/May/22 00:56
Worklog Time Spent: 10m 
  Work Description: GuoPhilipse commented on code in PR #4216:
URL: https://github.com/apache/hadoop/pull/4216#discussion_r881089014


##
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptions.java:
##
@@ -684,11 +684,23 @@ public Builder withAppend(boolean newAppend) {
   return this;
 }
 
+/**
+ * whether builder with crc.
+ * @param newSkipCRC whether to skip crc check
+ * @return  Builder object whether to skip crc check
+ * @deprecated Use {@link #withSkipCRC(boolean)} instead.
+ */
+@Deprecated
 public Builder withCRC(boolean newSkipCRC) {
   this.skipCRC = newSkipCRC;
   return this;
 }
 
+public Builder withSkipCRC(boolean newSkipCRC) {

Review Comment:
   > copy the javadocs from above, now you've written them
   
   Thanks @steveloughran for pointing this out; I have just updated.
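   
   For anyone skimming the thread, a hedged usage sketch of the renamed builder 
   method (paths are placeholders and the Builder constructor signature is 
   assumed, not taken from this PR):
   
   ```java
   import java.util.Collections;
   import org.apache.hadoop.fs.Path;
   import org.apache.hadoop.tools.DistCpOptions;
   
   class SkipCrcExample {
       // withCRC(true) reads as "check CRC" but actually means "skip the CRC
       // check"; the renamed withSkipCRC says what it does.
       static DistCpOptions build() {
           return new DistCpOptions.Builder(
                   Collections.singletonList(new Path("hdfs://nn/src")),
                   new Path("hdfs://nn/dst"))
               .withSkipCRC(true)   // skip the CRC check -- name matches behavior
               .build();
       }
   }
   ```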





Issue Time Tracking
---

Worklog Id: (was: 774313)
Time Spent: 1h 40m  (was: 1.5h)

> rename mixlead method name in DistCpOptions
> ---
>
> Key: HDFS-16555
> URL: https://issues.apache.org/jira/browse/HDFS-16555
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp
>Affects Versions: 3.3.2
>Reporter: guophilipse
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Currently the withCRC method in DistCpOptions is used as follows:
> ```
> withCRC(true) means check without crc
> withCRC(false) means check with crc
> ```
> which misleads developers when passing the parameter. We can rename the 
> method to make this clear; after the rename it should be:
> ```
> withSkipCRC(true) means check without crc
> withSkipCRC(false) means check with crc
> ```
> so it will be more understandable.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16555) rename mixlead method name in DistCpOptions

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16555?focusedWorklogId=774255&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774255
 ]

ASF GitHub Bot logged work on HDFS-16555:
-

Author: ASF GitHub Bot
Created on: 24/May/22 22:01
Start Date: 24/May/22 22:01
Worklog Time Spent: 10m 
  Work Description: steveloughran commented on code in PR #4216:
URL: https://github.com/apache/hadoop/pull/4216#discussion_r880984625


##
hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCpOptions.java:
##
@@ -684,11 +684,23 @@ public Builder withAppend(boolean newAppend) {
   return this;
 }
 
+/**
+ * whether builder with crc.
+ * @param newSkipCRC whether to skip crc check
+ * @return  Builder object whether to skip crc check
+ * @deprecated Use {@link #withSkipCRC(boolean)} instead.
+ */
+@Deprecated
 public Builder withCRC(boolean newSkipCRC) {
   this.skipCRC = newSkipCRC;
   return this;
 }
 
+public Builder withSkipCRC(boolean newSkipCRC) {

Review Comment:
   copy the javadocs from above, now you've written them





Issue Time Tracking
---

Worklog Id: (was: 774255)
Time Spent: 1.5h  (was: 1h 20m)

> rename mixlead method name in DistCpOptions
> ---
>
> Key: HDFS-16555
> URL: https://issues.apache.org/jira/browse/HDFS-16555
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: distcp
>Affects Versions: 3.3.2
>Reporter: guophilipse
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> Currently the withCRC method in DistCpOptions is used as follows:
> ```
> withCRC(true) means check without crc
> withCRC(false) means check with crc
> ```
> which misleads developers when passing the parameter. We can rename the 
> method to make this clear; after the rename it should be:
> ```
> withSkipCRC(true) means check without crc
> withSkipCRC(false) means check with crc
> ```
> so it will be more understandable.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16590) Fix Junit Test Deprecated assertThat

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16590?focusedWorklogId=774259&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774259
 ]

ASF GitHub Bot logged work on HDFS-16590:
-

Author: ASF GitHub Bot
Created on: 24/May/22 22:07
Start Date: 24/May/22 22:07
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on PR #4349:
URL: https://github.com/apache/hadoop/pull/4349#issuecomment-1136477439

   I ran this JUnit test and found that it passes.
   
   ```
   [INFO] ---
   [INFO]  T E S T S
   [INFO] ---
   [INFO] Running org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes
   [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
61.054 s - in org.apache.hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes
   [INFO] 
   [INFO] Results:
   [INFO] 
   [INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
   ```
   
   I found the following information in the test report:
   ```
   org.apache.hadoop.hdfs.server.datanode.TestBlockScanner
   ExecutionException The forked VM terminated without properly saying goodbye. 
VM crash or System.exit called?
   Command was /bin/sh -c cd 
/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4349/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs
 && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx2048m 
-XX:+HeapDumpOnOutOfMemoryError -DminiClusterDedicatedDirs=true -jar 
/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4349/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/target/surefire/surefirebooter4063913185211432001.jar
 
/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4349/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/target/surefire
 2022-05-24T06-25-53_745-jvmRun1 surefire5817288024834392866tmp 
surefire_3427099837231563976684tmp
   Error occurred in starting fork, check output in log
   Process Exit Code: 1
   Crashed tests:
   org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter
   ExecutionException The forked VM terminated without properly saying goodbye. 
VM crash or System.exit called?
   Command was /bin/sh -c cd 
/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4349/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs
 && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx2048m 
-XX:+HeapDumpOnOutOfMemoryError -DminiClusterDedicatedDirs=true -jar 
/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4349/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/target/surefire/surefirebooter8070136178396967706.jar
 
/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4349/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/target/surefire
 2022-05-24T06-25-53_745-jvmRun1 surefire722018860489712113tmp 
surefire_4633368331952091928634tmp
   Error occurred in starting fork, check output in log
   Process Exit Code: 1
   Crashed tests:
   org.apache.hadoop.hdfs.server.blockmanagement.TestSequentialBlockId
   ExecutionException The forked VM terminated without properly saying goodbye. 
VM crash or System.exit called?
   Command was /bin/sh -c cd 
/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4349/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs
 && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx2048m 
-XX:+HeapDumpOnOutOfMemoryError -DminiClusterDedicatedDirs=true -jar 
/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4349/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/target/surefire/surefirebooter4034957644830842693.jar
 
/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4349/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/target/surefire
 2022-05-24T06-25-53_745-jvmRun2 surefire8748449737419433268tmp 
surefire_5939041876649571544204tmp
   Error occurred in starting fork, check output in log
   Process Exit Code: 1
   Crashed tests:
   org.apache.hadoop.hdfs.client.impl.TestBlockReaderLocal
   ExecutionException The forked VM terminated without properly saying goodbye. 
VM crash or System.exit called?
   Command was /bin/sh -c cd 
/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4349/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs
 && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xmx2048m 
-XX:+HeapDumpOnOutOfMemoryError -DminiClusterDedicatedDirs=true -jar 
/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4349/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/target/surefire/surefirebooter5086311145150655861.jar
 
/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4349/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/target/surefire
 2022-05-24T06-25-53_745-jvmRun1 surefire2732095862261680658tmp 
surefire_668579198910796942474tmp
   Error occurred in starting fork, check output in log
   Process Exit Code: 1
   Crashed tests:
   

[jira] [Work logged] (HDFS-16586) Purge FsDatasetAsyncDiskService threadgroup; it causes BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal exception and exit'

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16586?focusedWorklogId=774256&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774256
 ]

ASF GitHub Bot logged work on HDFS-16586:
-

Author: ASF GitHub Bot
Created on: 24/May/22 22:02
Start Date: 24/May/22 22:02
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4347:
URL: https://github.com/apache/hadoop/pull/4347#issuecomment-1136471915

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m 10s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ branch-3.3 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  37m 59s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |   1m 25s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |   1m 36s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   1m 44s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 41s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  28m 43s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 19s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 47s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   3m 26s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  28m  8s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 219m 51s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4347/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  1s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 333m 19s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestRollingUpgrade |
   |   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithShortCircuitRead |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4347/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4347 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 44f8f8cbf6c6 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 
17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.3 / 9b929f5863b9789ca429ee1cf3d7f44d9b349991 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4347/3/testReport/ |
   | Max. process+thread count | 2321 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4347/3/console |
   | versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 774256)
Time Spent: 2h 40m  (was: 2.5h)

> Purge FsDatasetAsyncDiskService threadgroup; it causes 
> BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal 
> exception and exit' 
> -
>
> 

[jira] [Work logged] (HDFS-16583) DatanodeAdminDefaultMonitor can get stuck in an infinite loop

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16583?focusedWorklogId=774247&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774247
 ]

ASF GitHub Bot logged work on HDFS-16583:
-

Author: ASF GitHub Bot
Created on: 24/May/22 21:45
Start Date: 24/May/22 21:45
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4332:
URL: https://github.com/apache/hadoop/pull/4332#issuecomment-1136461239

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  39m 25s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 46s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 31s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 19s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 37s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 20s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 50s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  26m  7s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 30s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 59s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 35s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 50s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 377m 24s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4332/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 58s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 493m 51s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestReplaceDatanodeFailureReplication |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4332/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4332 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 87c9decf67e9 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 
17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 2fe9c5ebb481a05cf95df0da4e8bc115bce68959 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 

[jira] [Work logged] (HDFS-16586) Purge FsDatasetAsyncDiskService threadgroup; it causes BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal exception and exit'

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16586?focusedWorklogId=774246&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774246
 ]

ASF GitHub Bot logged work on HDFS-16586:
-

Author: ASF GitHub Bot
Created on: 24/May/22 21:41
Start Date: 24/May/22 21:41
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4348:
URL: https://github.com/apache/hadoop/pull/4348#issuecomment-1136459189

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  4s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ branch-3.2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m  7s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 10s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  mvnsite  |   1m 34s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 49s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  shadedclient  |  19m  8s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 24s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 50s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   3m 24s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 17s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 224m  3s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4348/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 58s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 315m 20s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMXBean |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4348/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4348 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 8941ba45faa6 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 
17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.2 / a622237ea99ca1bcd38f3fb118fc85e04244f9f7 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4348/3/testReport/ |
   | Max. process+thread count | 1908 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4348/3/console |
   | versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 774246)
Time Spent: 2.5h  (was: 2h 20m)

> Purge FsDatasetAsyncDiskService threadgroup; it causes 
> BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal 
> exception and exit' 
> -
>
> Key: HDFS-16586
> URL: https://issues.apache.org/jira/browse/HDFS-16586

[jira] [Work logged] (HDFS-16590) Fix Junit Test Deprecated assertThat

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16590?focusedWorklogId=774227&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774227
 ]

ASF GitHub Bot logged work on HDFS-16590:
-

Author: ASF GitHub Bot
Created on: 24/May/22 21:09
Start Date: 24/May/22 21:09
Worklog Time Spent: 10m 
  Work Description: slfan1989 commented on PR #4349:
URL: https://github.com/apache/hadoop/pull/4349#issuecomment-1136434600

   > The new javac warnings make sense, so they are OK to me.
   > 
   > Unit test failures need looking at; I'm not sure if they are intermittent 
   > issues. I don't usually review HDFS patches.
   
   @dannycjones Thank you very much, I will follow up on this PR.
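   
   For context on the deprecation being fixed here: JUnit 4.13 deprecated 
   org.junit.Assert.assertThat in favor of calling Hamcrest's MatcherAssert 
   directly. A minimal before/after sketch (not necessarily the exact call 
   sites in this PR):
   
   ```java
   import static org.hamcrest.CoreMatchers.is;
   // Before (deprecated since JUnit 4.13):
   //   import static org.junit.Assert.assertThat;
   // After:
   import static org.hamcrest.MatcherAssert.assertThat;
   
   public class AssertThatMigrationTest {
       @org.junit.Test
       public void replicaCount() {
           int liveReplicas = 3;
           assertThat(liveReplicas, is(3));  // same matcher API, new entry point
       }
   }
   ```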




Issue Time Tracking
---

Worklog Id: (was: 774227)
Time Spent: 50m  (was: 40m)

> Fix Junit Test Deprecated assertThat
> 
>
> Key: HDFS-16590
> URL: https://issues.apache.org/jira/browse/HDFS-16590
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16590) Fix Junit Test Deprecated assertThat

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16590?focusedWorklogId=774109&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774109
 ]

ASF GitHub Bot logged work on HDFS-16590:
-

Author: ASF GitHub Bot
Created on: 24/May/22 16:09
Start Date: 24/May/22 16:09
Worklog Time Spent: 10m 
  Work Description: dannycjones commented on PR #4349:
URL: https://github.com/apache/hadoop/pull/4349#issuecomment-1136121490

   The new javac warnings make sense, so they are OK to me.
   
   Unit test failures need looking at; I'm not sure if they are intermittent 
issues. I don't usually review HDFS patches.




Issue Time Tracking
---

Worklog Id: (was: 774109)
Time Spent: 40m  (was: 0.5h)

> Fix Junit Test Deprecated assertThat
> 
>
> Key: HDFS-16590
> URL: https://issues.apache.org/jira/browse/HDFS-16590
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16590) Fix Junit Test Deprecated assertThat

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16590?focusedWorklogId=774090&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774090
 ]

ASF GitHub Bot logged work on HDFS-16590:
-

Author: ASF GitHub Bot
Created on: 24/May/22 15:53
Start Date: 24/May/22 15:53
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4349:
URL: https://github.com/apache/hadoop/pull/4349#issuecomment-1136104471

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  6s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 60 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  17m 59s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 28s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m  8s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  20m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   4m 30s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |  15m 55s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |  14m  0s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |  13m 46s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |  21m 26s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 25s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 35s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   7m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 15s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | -1 :x: |  javac  |  22m 15s | 
[/results-compile-javac-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4349/1/artifact/out/results-compile-javac-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt)
 |  root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 with JDK Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 generated 4 new + 2460 unchanged - 451 
fixed = 2464 total (was 2911)  |
   | +1 :green_heart: |  compile  |  20m 43s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | -1 :x: |  javac  |  20m 43s | 
[/results-compile-javac-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4349/1/artifact/out/results-compile-javac-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt)
 |  root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 4 new + 2255 
unchanged - 451 fixed = 2259 total (was 2706)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 17s |  |  root: The patch generated 
0 new + 688 unchanged - 2 fixed = 688 total (was 690)  |
   | +1 :green_heart: |  mvnsite  |  15m 45s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |  13m 50s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |  13m 39s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |  23m 21s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 35s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   4m 23s |  |  hadoop-auth in the patch 
passed.  |
   | +1 :green_heart: |  unit  |  18m 48s |  |  hadoop-common in the patch 
passed.  |
   | -1 :x: |  unit  | 454m 38s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4349/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  unit  |   4m 43s |  |  hadoop-hdfs-nfs in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   5m 58s |  |  hadoop-yarn-common in the 

[jira] [Resolved] (HDFS-16588) Backport HDFS-16584 to branch-3.3.

2022-05-24 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He resolved HDFS-16588.

Fix Version/s: 3.3.4
 Hadoop Flags: Reviewed
   Resolution: Fixed

Committed to branch-3.3

> Backport HDFS-16584 to branch-3.3.
> --
>
> Key: HDFS-16588
> URL: https://issues.apache.org/jira/browse/HDFS-16588
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover, namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.3.4
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This issue has been dealt with in trunk and now needs to be backported to 
> branch-3.3 and other active branches.
> See HDFS-16584.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16588) Backport HDFS-16584 to branch-3.3.

2022-05-24 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-16588:
---
Summary: Backport HDFS-16584 to branch-3.3.  (was: Backport HDFS-16584 to 
branch-3.3 and other active old branches)

> Backport HDFS-16584 to branch-3.3.
> --
>
> Key: HDFS-16588
> URL: https://issues.apache.org/jira/browse/HDFS-16588
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover, namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This issue has been dealt with in trunk and now needs to be backported to 
> branch-3.3 and other active branches.
> See HDFS-16584.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16588) Backport HDFS-16584 to branch-3.3 and other active old branches

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16588?focusedWorklogId=774089&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774089
 ]

ASF GitHub Bot logged work on HDFS-16588:
-

Author: ASF GitHub Bot
Created on: 24/May/22 15:48
Start Date: 24/May/22 15:48
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao commented on PR #4342:
URL: https://github.com/apache/hadoop/pull/4342#issuecomment-1136099049

   Committed to branch-3.3. Thanks @jianghuazhu for your contribution. Thanks 
@tomscut for your review.




Issue Time Tracking
---

Worklog Id: (was: 774089)
Time Spent: 1h  (was: 50m)

> Backport HDFS-16584 to branch-3.3 and other active old branches
> ---
>
> Key: HDFS-16588
> URL: https://issues.apache.org/jira/browse/HDFS-16588
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover, namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This issue has been dealt with in trunk and now needs to be backported to 
> branch-3.3 and other active branches.
> See HDFS-16584.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16588) Backport HDFS-16584 to branch-3.3 and other active old branches

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16588?focusedWorklogId=774088&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774088
 ]

ASF GitHub Bot logged work on HDFS-16588:
-

Author: ASF GitHub Bot
Created on: 24/May/22 15:47
Start Date: 24/May/22 15:47
Worklog Time Spent: 10m 
  Work Description: Hexiaoqiao merged PR #4342:
URL: https://github.com/apache/hadoop/pull/4342




Issue Time Tracking
---

Worklog Id: (was: 774088)
Time Spent: 50m  (was: 40m)

> Backport HDFS-16584 to branch-3.3 and other active old branches
> ---
>
> Key: HDFS-16588
> URL: https://issues.apache.org/jira/browse/HDFS-16588
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover, namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> This issue has been dealt with in trunk and now needs to be backported to 
> branch-3.3 and other active branches.
> See HDFS-16584.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16386) Reduce DataNode load when FsDatasetAsyncDiskService is working

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16386?focusedWorklogId=773976=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773976
 ]

ASF GitHub Bot logged work on HDFS-16386:
-

Author: ASF GitHub Bot
Created on: 24/May/22 11:26
Start Date: 24/May/22 11:26
Worklog Time Spent: 10m 
  Work Description: ZanderXu commented on PR #3806:
URL: https://github.com/apache/hadoop/pull/3806#issuecomment-1135792522

   ```java
   ThreadPoolExecutor executor = new ThreadPoolExecutor(
       CORE_THREADS_PER_VOLUME, maxNumThreadsPerVolume,
       THREADS_KEEP_ALIVE_SECONDS, TimeUnit.SECONDS,
       new LinkedBlockingQueue<Runnable>(), threadFactory);
   ```
   
   The ThreadPoolExecutor uses an unbounded LinkedBlockingQueue, so the actual 
number of threads will always be less than or equal to corePoolSize. When the NN 
asks one DN to delete a large number of blocks, that DN creates a large number 
of ReplicaFileDeleteTasks and stores them all in the LinkedBlockingQueue of the 
ThreadPoolExecutor, resulting in increased memory usage or even an OOM.
   
   Feel free to correct me if there are mistakes.
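   
   A minimal, self-contained sketch of that behavior (pure JDK, with made-up 
pool sizes and a sleep standing in for the disk delete; not Hadoop code): with 
an unbounded work queue, the pool never grows past corePoolSize and the backlog 
accumulates in memory instead.
   
   ```java
   import java.util.concurrent.LinkedBlockingQueue;
   import java.util.concurrent.ThreadPoolExecutor;
   import java.util.concurrent.TimeUnit;
   
   public class UnboundedQueueDemo {
     public static void main(String[] args) throws InterruptedException {
       // Same shape as the pool above: core=1, max=4, unbounded queue.
       ThreadPoolExecutor executor = new ThreadPoolExecutor(
           1, 4, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());
   
       // Submit many long-running "deletion" tasks.
       for (int i = 0; i < 10_000; i++) {
         executor.execute(() -> {
           try {
             Thread.sleep(10); // stand-in for a disk delete
           } catch (InterruptedException e) {
             Thread.currentThread().interrupt();
           }
         });
       }
   
       // The pool only grows beyond corePoolSize when the queue rejects an
       // offer; an unbounded LinkedBlockingQueue never rejects, so the extra
       // (max - core) threads are never created and the queue just grows.
       System.out.println("pool size   = " + executor.getPoolSize());      // 1
       System.out.println("queue depth = " + executor.getQueue().size());  // ~9,999
   
       executor.shutdown();
       executor.awaitTermination(5, TimeUnit.MINUTES);
     }
   }
   ```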




Issue Time Tracking
---

Worklog Id: (was: 773976)
Time Spent: 3.5h  (was: 3h 20m)

> Reduce DataNode load when FsDatasetAsyncDiskService is working
> --
>
> Key: HDFS-16386
> URL: https://issues.apache.org/jira/browse/HDFS-16386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.2, 3.2.4
>
> Attachments: monitor.png
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Our DataNode has 36 disks. When FsDatasetAsyncDiskService is working, it 
> causes a high load on the DataNode.
> Here is some monitoring related to memory:
>  !monitor.png! 
> Since each disk deletes blocks asynchronously, and each disk allows 4 threads 
> to work, this causes some trouble for the DataNode, such as increased CPU and 
> memory usage.
> We should appropriately reduce the total number of threads so that the 
> DataNode can work better.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16386) Reduce DataNode load when FsDatasetAsyncDiskService is working

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16386?focusedWorklogId=773971=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773971
 ]

ASF GitHub Bot logged work on HDFS-16386:
-

Author: ASF GitHub Bot
Created on: 24/May/22 11:12
Start Date: 24/May/22 11:12
Worklog Time Spent: 10m 
  Work Description: ZanderXu commented on PR #3806:
URL: https://github.com/apache/hadoop/pull/3806#issuecomment-1135779919

   @jianghuazhu I'm sorry to bring this issue up again.
   Can setting a smaller max thread count really reduce memory usage? 
[HDFS-16386](https://issues.apache.org/jira/browse/HDFS-16386)
   




Issue Time Tracking
---

Worklog Id: (was: 773971)
Time Spent: 3h 20m  (was: 3h 10m)

> Reduce DataNode load when FsDatasetAsyncDiskService is working
> --
>
> Key: HDFS-16386
> URL: https://issues.apache.org/jira/browse/HDFS-16386
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0, 3.2.3, 3.3.2, 3.2.4
>
> Attachments: monitor.png
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Our DataNode has 36 disks. When FsDatasetAsyncDiskService is working, it 
> causes a high load on the DataNode.
> Here is some monitoring related to memory:
>  !monitor.png! 
> Since each disk deletes blocks asynchronously, and each disk allows 4 threads 
> to work, this causes some trouble for the DataNode, such as increased CPU and 
> memory usage.
> We should appropriately reduce the total number of threads so that the 
> DataNode can work better.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16590) Fix Junit Test Deprecated assertThat

2022-05-24 Thread fanshilun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541416#comment-17541416
 ] 

fanshilun commented on HDFS-16590:
--

Hi [~ste...@apache.org], thank you very much!

> Fix Junit Test Deprecated assertThat
> 
>
> Key: HDFS-16590
> URL: https://issues.apache.org/jira/browse/HDFS-16590
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16586) Purge FsDatasetAsyncDiskService threadgroup; it causes BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal exception and exit'

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16586?focusedWorklogId=773961=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773961
 ]

ASF GitHub Bot logged work on HDFS-16586:
-

Author: ASF GitHub Bot
Created on: 24/May/22 10:23
Start Date: 24/May/22 10:23
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4347:
URL: https://github.com/apache/hadoop/pull/4347#issuecomment-1135730139

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  9s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
   |||| _ branch-3.3 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  41m 21s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  compile  |   1m 31s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  checkstyle  |   1m  4s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  mvnsite  |   1m 35s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  javadoc  |   1m 46s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 43s |  |  branch-3.3 passed  |
   | +1 :green_heart: |  shadedclient  |  28m 43s |  |  branch has no errors 
when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 47s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 19s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 24s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   3m 26s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  28m 11s |  |  patch has no errors 
when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 227m 56s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4347/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 16s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 344m 49s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestRollingUpgrade |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4347/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4347 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 5504327bf8bc 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 
17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.3 / f3f74013f773e0dbe599a80cee462c73b230946a |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4347/2/testReport/ |
   | Max. process+thread count | 2138 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4347/2/console |
   | versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 773961)
Time Spent: 2h 20m  (was: 2h 10m)

> Purge FsDatasetAsyncDiskService threadgroup; it causes 
> BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal 
> exception and exit' 
> -
>
> Key: HDFS-16586
> URL: 

[jira] [Work logged] (HDFS-16592) Fix typo for BalancingPolicy

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16592?focusedWorklogId=773946=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773946
 ]

ASF GitHub Bot logged work on HDFS-16592:
-

Author: ASF GitHub Bot
Created on: 24/May/22 09:52
Start Date: 24/May/22 09:52
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on PR #4351:
URL: https://github.com/apache/hadoop/pull/4351#issuecomment-1135682996

   Here are some failing unit tests such as:
   TestWebHDFS
   TestUnderReplicatedBlocks
   TestExternalStoragePolicySatisfier
   TestIncrementalBlockReports
   TestRedudantBlocks
   
   It looks like these failures have little to do with the code I submitted.
   @aajisaka @ferhui, could you help review this PR?
   Thank you very much.




Issue Time Tracking
---

Worklog Id: (was: 773946)
Time Spent: 0.5h  (was: 20m)

> Fix typo for BalancingPolicy
> 
>
> Key: HDFS-16592
> URL: https://issues.apache.org/jira/browse/HDFS-16592
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover, documentation, namenode
>Affects Versions: 3.4.0
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2022-05-24-11-29-14-019.png
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
>  !image-2022-05-24-11-29-14-019.png! 
> 'NOT' should be changed to lowercase rather than uppercase.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16592) Fix typo for BalancingPolicy

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16592?focusedWorklogId=773941=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773941
 ]

ASF GitHub Bot logged work on HDFS-16592:
-

Author: ASF GitHub Bot
Created on: 24/May/22 09:40
Start Date: 24/May/22 09:40
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4351:
URL: https://github.com/apache/hadoop/pull/4351#issuecomment-1135664263

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 51s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  40m 30s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 27s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m  6s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 25s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 21s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 48s |  |  branch has no errors 
when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 18s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 11s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 51s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 22s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 10s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 42s |  |  patch has no errors 
when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 255m 28s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4351/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 55s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 361m 33s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestClientProtocolForPipelineRecovery |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4351/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4351 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 9fc2a6dd2321 4.15.0-169-generic #177-Ubuntu SMP Thu Feb 3 
10:50:38 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 8081ef3ae322798ebaf16b7d054212ce50077d8a |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 

[jira] [Work logged] (HDFS-16586) Purge FsDatasetAsyncDiskService threadgroup; it causes BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal exception and exit'

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16586?focusedWorklogId=773937=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773937
 ]

ASF GitHub Bot logged work on HDFS-16586:
-

Author: ASF GitHub Bot
Created on: 24/May/22 09:33
Start Date: 24/May/22 09:33
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4348:
URL: https://github.com/apache/hadoop/pull/4348#issuecomment-1135653728

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 52s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
   |||| _ branch-3.2 Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 18s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  compile  |   1m  9s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  mvnsite  |   1m 19s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  javadoc  |   1m 12s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 26s |  |  branch-3.2 passed  |
   | +1 :green_heart: |  shadedclient  |  18m  4s |  |  branch has no errors 
when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  2s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m  2s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 45s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  the patch passed  |
   | +1 :green_heart: |  spotbugs  |   3m  8s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  18m 36s |  |  patch has no errors 
when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 205m 33s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4348/2/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 56s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 291m 36s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestBlockTokenWithShortCircuitRead |
   |   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4348/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4348 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 8a88be9e620b 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 
17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | branch-3.2 / 5b9672b89d60bb6b55b2dd5d9b38ce28d10adac2 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4348/2/testReport/ |
   | Max. process+thread count | 2267 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4348/2/console |
   | versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




Issue Time Tracking
---

Worklog Id: (was: 773937)
Time Spent: 2h 10m  (was: 2h)

> Purge FsDatasetAsyncDiskService threadgroup; it causes 
> BPServiceActor$CommandProcessingThread IllegalThreadStateException 'fatal 
> exception and exit' 
> 

[jira] [Commented] (HDFS-16590) Fix Junit Test Deprecated assertThat

2022-05-24 Thread Steve Loughran (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541360#comment-17541360
 ] 

Steve Loughran commented on HDFS-16590:
---

FYI, I added you in Jira as a contributor; feel free to assign to yourself issues 
you are actively working on.

> Fix Junit Test Deprecated assertThat
> 
>
> Key: HDFS-16590
> URL: https://issues.apache.org/jira/browse/HDFS-16590
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-16590) Fix Junit Test Deprecated assertThat

2022-05-24 Thread Steve Loughran (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran reassigned HDFS-16590:
-

Assignee: fanshilun

> Fix Junit Test Deprecated assertThat
> 
>
> Key: HDFS-16590
> URL: https://issues.apache.org/jira/browse/HDFS-16590
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13678) StorageType is incompatible when rolling upgrade to 2.6/2.6+ versions

2022-05-24 Thread Masatake Iwasaki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541340#comment-17541340
 ] 

Masatake Iwasaki commented on HDFS-13678:
-

updated the target version in preparation for the 2.10.2 release.

> StorageType is incompatible when rolling upgrade to 2.6/2.6+ versions
> -
>
> Key: HDFS-13678
> URL: https://issues.apache.org/jira/browse/HDFS-13678
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rolling upgrades
>Affects Versions: 2.5.0
>Reporter: Yiqun Lin
>Priority: Major
>
> In version 2.6.0, we supported more storage types in HDFS, implemented in 
> HDFS-6584. But this seems to be an incompatible change: when we rolling-upgraded 
> our cluster from 2.5.0 to 2.6.0, it threw the following error.
> {noformat}
> 2018-06-14 11:43:39,246 ERROR [DataNode: 
> [[[DISK]file:/home/vipshop/hard_disk/dfs/, [DISK]file:/data1/dfs/, 
> [DISK]file:/data2/dfs/]] heartbeating to xx.xx.xx.xx:8022] 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService 
> for Block pool BP-670256553-xx.xx.xx.xx-1528795419404 (Datanode Uuid 
> ab150e05-fcb7-49ed-b8ba-f05c27593fee) service to xx.xx.xx.xx:8022
> java.lang.ArrayStoreException
>  at java.util.ArrayList.toArray(ArrayList.java:412)
>  at 
> java.util.Collections$UnmodifiableCollection.toArray(Collections.java:1034)
>  at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1030)
>  at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:836)
>  at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:146)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:566)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:664)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:835)
>  at java.lang.Thread.run(Thread.java:748)
> {noformat}
> The scenario is that an old DN fails to parse the StorageType it got from a 
> new NN. This error takes place while sending heartbeats to the NN, and blocks 
> won't be reported to the NN successfully. This leads to subsequent errors.
> Corresponding logic in 2.5.0:
> {code}
>   public static BlockCommand convert(BlockCommandProto blkCmd) {
>     ...
>     StorageType[][] targetStorageTypes = new StorageType[targetList.size()][];
>     List<StorageTypesProto> targetStorageTypesList =
>         blkCmd.getTargetStorageTypesList();
>     if (targetStorageTypesList.isEmpty()) { // missing storage types
>       for (int i = 0; i < targetStorageTypes.length; i++) {
>         targetStorageTypes[i] = new StorageType[targets[i].length];
>         Arrays.fill(targetStorageTypes[i], StorageType.DEFAULT);
>       }
>     } else {
>       for (int i = 0; i < targetStorageTypes.length; i++) {
>         List<StorageTypeProto> p =
>             targetStorageTypesList.get(i).getStorageTypesList();
>         targetStorageTypes[i] = p.toArray(new StorageType[p.size()]); // <-- error here
>       }
>     }
> {code}
> But given the current logic, it would be better to return the default type 
> instead of throwing an exception in case StorageType changed (new fields or 
> new types added) in newer versions during a rolling upgrade.
> {code:java}
> public static StorageType convertStorageType(StorageTypeProto type) {
>   switch (type) {
>   case DISK:
>     return StorageType.DISK;
>   case SSD:
>     return StorageType.SSD;
>   case ARCHIVE:
>     return StorageType.ARCHIVE;
>   case RAM_DISK:
>     return StorageType.RAM_DISK;
>   case PROVIDED:
>     return StorageType.PROVIDED;
>   default:
>     throw new IllegalStateException(
>         "BUG: StorageTypeProto not found, type=" + type);
>   }
> }
> {code}
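
For illustration, a sketch of the lenient behavior proposed above: the default 
branch falls back to StorageType.DEFAULT instead of throwing. The LOG field is 
an assumed SLF4J logger; this is a suggested shape, not a committed patch.

{code:java}
public static StorageType convertStorageType(StorageTypeProto type) {
  switch (type) {
  case DISK:
    return StorageType.DISK;
  case SSD:
    return StorageType.SSD;
  case ARCHIVE:
    return StorageType.ARCHIVE;
  case RAM_DISK:
    return StorageType.RAM_DISK;
  case PROVIDED:
    return StorageType.PROVIDED;
  default:
    // Unknown value from a newer peer during a rolling upgrade: degrade
    // gracefully instead of failing the whole heartbeat.
    LOG.warn("Unknown StorageTypeProto {}, falling back to DEFAULT", type);
    return StorageType.DEFAULT;
  }
}
{code}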



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13678) StorageType is incompatible when rolling upgrade to 2.6/2.6+ versions

2022-05-24 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-13678:

Target Version/s: 2.10.3, 2.9.3  (was: 2.9.3, 2.10.2)

> StorageType is incompatible when rolling upgrade to 2.6/2.6+ versions
> -
>
> Key: HDFS-13678
> URL: https://issues.apache.org/jira/browse/HDFS-13678
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rolling upgrades
>Affects Versions: 2.5.0
>Reporter: Yiqun Lin
>Priority: Major
>
> In version 2.6.0, we supported more storage types in HDFS, implemented in 
> HDFS-6584. But this seems to be an incompatible change: when we rolling-upgraded 
> our cluster from 2.5.0 to 2.6.0, it threw the following error.
> {noformat}
> 2018-06-14 11:43:39,246 ERROR [DataNode: 
> [[[DISK]file:/home/vipshop/hard_disk/dfs/, [DISK]file:/data1/dfs/, 
> [DISK]file:/data2/dfs/]] heartbeating to xx.xx.xx.xx:8022] 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService 
> for Block pool BP-670256553-xx.xx.xx.xx-1528795419404 (Datanode Uuid 
> ab150e05-fcb7-49ed-b8ba-f05c27593fee) service to xx.xx.xx.xx:8022
> java.lang.ArrayStoreException
>  at java.util.ArrayList.toArray(ArrayList.java:412)
>  at 
> java.util.Collections$UnmodifiableCollection.toArray(Collections.java:1034)
>  at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1030)
>  at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:836)
>  at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:146)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:566)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:664)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:835)
>  at java.lang.Thread.run(Thread.java:748)
> {noformat}
> The scenario is that an old DN fails to parse the StorageType it got from a 
> new NN. This error takes place while sending heartbeats to the NN, and blocks 
> won't be reported to the NN successfully. This leads to subsequent errors.
> Corresponding logic in 2.5.0:
> {code}
>   public static BlockCommand convert(BlockCommandProto blkCmd) {
>     ...
>     StorageType[][] targetStorageTypes = new StorageType[targetList.size()][];
>     List<StorageTypesProto> targetStorageTypesList =
>         blkCmd.getTargetStorageTypesList();
>     if (targetStorageTypesList.isEmpty()) { // missing storage types
>       for (int i = 0; i < targetStorageTypes.length; i++) {
>         targetStorageTypes[i] = new StorageType[targets[i].length];
>         Arrays.fill(targetStorageTypes[i], StorageType.DEFAULT);
>       }
>     } else {
>       for (int i = 0; i < targetStorageTypes.length; i++) {
>         List<StorageTypeProto> p =
>             targetStorageTypesList.get(i).getStorageTypesList();
>         targetStorageTypes[i] = p.toArray(new StorageType[p.size()]); // <-- error here
>       }
>     }
> {code}
> But given the current logic, it would be better to return the default type 
> instead of throwing an exception in case StorageType changed (new fields or 
> new types added) in newer versions during a rolling upgrade.
> {code:java}
> public static StorageType convertStorageType(StorageTypeProto type) {
>   switch (type) {
>   case DISK:
>     return StorageType.DISK;
>   case SSD:
>     return StorageType.SSD;
>   case ARCHIVE:
>     return StorageType.ARCHIVE;
>   case RAM_DISK:
>     return StorageType.RAM_DISK;
>   case PROVIDED:
>     return StorageType.PROVIDED;
>   default:
>     throw new IllegalStateException(
>         "BUG: StorageTypeProto not found, type=" + type);
>   }
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14794) [SBN read] reportBadBlock is rejected by Observer.

2022-05-24 Thread Masatake Iwasaki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541339#comment-17541339
 ] 

Masatake Iwasaki commented on HDFS-14794:
-

updated the target version in preparation for the 2.10.2 release.

> [SBN read] reportBadBlock is rejected by Observer.
> --
>
> Key: HDFS-14794
> URL: https://issues.apache.org/jira/browse/HDFS-14794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Priority: Major
>
> {{reportBadBlock}} is rejected by Observer via StandbyException
> {code}StandbyException: Operation category WRITE is not supported in state 
> observer{code}
> We should investigate what the consequences of this are and whether we should 
> treat {{reportBadBlock}} as IBRs. Note that {{reportBadBlock}} is a part of 
> both {{ClientProtocol}} and {{DatanodeProtocol}}.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14794) [SBN read] reportBadBlock is rejected by Observer.

2022-05-24 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-14794:

Target Version/s: 2.10.3  (was: 2.10.2)

> [SBN read] reportBadBlock is rejected by Observer.
> --
>
> Key: HDFS-14794
> URL: https://issues.apache.org/jira/browse/HDFS-14794
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Priority: Major
>
> {{reportBadBlock}} is rejected by Observer via StandbyException
> {code}StandbyException: Operation category WRITE is not supported in state 
> observer{code}
> We should investigate what the consequences of this are and whether we should 
> treat {{reportBadBlock}} as IBRs. Note that {{reportBadBlock}} is a part of 
> both {{ClientProtocol}} and {{DatanodeProtocol}}.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15037) Encryption Zone operations should not block other RPC calls while retreiving encryption keys.

2022-05-24 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-15037:

Target Version/s: 2.10.3  (was: 2.10.2)

> Encryption Zone operations should not block other RPC calls while retreiving 
> encryption keys.
> -
>
> Key: HDFS-15037
> URL: https://issues.apache.org/jira/browse/HDFS-15037
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Priority: Major
>
> I believe the intention was to avoid blocking other operations while 
> retrieving keys by holding only {{FSDirectory.dirLock}}. But in reality all 
> other operations first enter {{FSNamesystemLock}} and then {{dirLock}}, so 
> they are all blocked waiting for the key.
> We see a substantial increase in RPC wait time ({{RpcQueueTimeAvgTime}}) on the 
> NameNode when encryption operations are intermixed with regular workloads.
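
A minimal, runnable sketch of the lock-ordering problem, with plain 
ReentrantLocks standing in for {{FSNamesystemLock}} and {{FSDirectory.dirLock}} 
and a sleep standing in for the slow key retrieval (illustrative only, not 
NameNode code):

{code:java}
import java.util.concurrent.locks.ReentrantLock;

public class LockOrderingDemo {
  // Stand-ins for FSNamesystemLock (outer) and FSDirectory.dirLock (inner).
  static final ReentrantLock fsLock = new ReentrantLock();
  static final ReentrantLock dirLock = new ReentrantLock();

  // EZ operation: takes only the inner lock, then does a slow key fetch.
  static void encryptionZoneOp() throws InterruptedException {
    dirLock.lock();
    try {
      Thread.sleep(5_000); // stand-in for a slow KMS call under dirLock
    } finally {
      dirLock.unlock();
    }
  }

  // Regular operation: outer lock first, then inner lock.
  static void regularOp() {
    fsLock.lock();
    try {
      dirLock.lock(); // waits out the entire key fetch...
      try {
        // ...while fsLock is still held, so every other operation that
        // needs fsLock queues behind this one, key or no key.
      } finally {
        dirLock.unlock();
      }
    } finally {
      fsLock.unlock();
    }
  }

  public static void main(String[] args) throws Exception {
    Thread ez = new Thread(() -> {
      try { encryptionZoneOp(); } catch (InterruptedException ignored) { }
    });
    ez.start();
    Thread.sleep(100); // let the EZ op grab dirLock first
    long t0 = System.nanoTime();
    regularOp(); // blocks ~5s even though it never touches a key
    System.out.printf("regularOp took %.1f s%n", (System.nanoTime() - t0) / 1e9);
    ez.join();
  }
}
{code}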



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15357) Do not trust bad block reports from clients

2022-05-24 Thread Masatake Iwasaki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541336#comment-17541336
 ] 

Masatake Iwasaki commented on HDFS-15357:
-

updated the target version in preparation for the 2.10.2 release.

> Do not trust bad block reports from clients
> ---
>
> Key: HDFS-15357
> URL: https://issues.apache.org/jira/browse/HDFS-15357
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Priority: Major
>
> {{reportBadBlocks()}} is implemented by both ClientNamenodeProtocol and 
> DatanodeProtocol. When DFSClient calls it, a faulty client can cause 
> data availability issues in a cluster. 
> In the past we had such an incident where a node with a faulty NIC was 
> randomly corrupting data. All clients running on that machine reported all 
> accessed blocks and all associated replicas to be corrupt. More recently, a 
> single faulty client process caused a small number of missing blocks. In 
> all cases, the actual data was fine.
> The bad block reports from clients shouldn't be trusted blindly. Instead, the 
> namenode should send a datanode command to verify the claim. A bonus would be 
> to keep the record for a while and ignore repeated reports from the same 
> nodes.
> At minimum, there should be an option to ignore bad block reports from 
> clients, perhaps after logging them. A very crude way would be to short-circuit 
> it in {{ClientNamenodeProtocolServerSideTranslatorPB#reportBadBlocks()}}. 
> A more sophisticated way would be to check for the datanode user name in 
> {{FSNamesystem#reportBadBlocks()}} so that it can be easily logged, or 
> optionally do further processing.
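
For illustration, a rough sketch of the "crude" option above. The 
{{ignoreClientBadBlockReports}} flag, its config key, the LOG field, and the 
{{processReportBadBlocks()}} placeholder for the existing logic are all 
hypothetical; only the early-return guard is the point here.

{code:java}
@Override
public ReportBadBlocksResponseProto reportBadBlocks(
    RpcController controller, ReportBadBlocksRequestProto req)
    throws ServiceException {
  // Hypothetical switch, e.g. dfs.namenode.ignore.client.bad.block.reports.
  if (ignoreClientBadBlockReports) {
    // Log it, as suggested above, but do not mark any replica corrupt.
    LOG.warn("Ignoring bad block report from client: {} block(s)",
        req.getBlocksCount());
  } else {
    // Existing behavior: hand the reported blocks to the namenode.
    processReportBadBlocks(req); // placeholder for the current logic
  }
  return ReportBadBlocksResponseProto.newBuilder().build();
}
{code}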



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15004) Refactor TestBalancer for faster execution.

2022-05-24 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-15004:

Target Version/s: 2.10.3  (was: 2.10.2)

> Refactor TestBalancer for faster execution.
> ---
>
> Key: HDFS-15004
> URL: https://issues.apache.org/jira/browse/HDFS-15004
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, test
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Priority: Major
>
> {{TestBalancer}} is a big test by itself, it is also a part of many other 
> tests. Running these tests involves spinning of {{MiniDFSCluter}} and 
> shutting it down for every test case, which is inefficient. Many of the test 
> cases can run using the same instance of {{MiniDFSCluter}}, but not all of 
> them. Would be good to refactor the tests to optimize their running time.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15004) Refactor TestBalancer for faster execution.

2022-05-24 Thread Masatake Iwasaki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541338#comment-17541338
 ] 

Masatake Iwasaki commented on HDFS-15004:
-

updated the target version in preparation for the 2.10.2 release.

> Refactor TestBalancer for faster execution.
> ---
>
> Key: HDFS-15004
> URL: https://issues.apache.org/jira/browse/HDFS-15004
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs, test
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Priority: Major
>
> {{TestBalancer}} is a big test by itself, it is also a part of many other 
> tests. Running these tests involves spinning of {{MiniDFSCluter}} and 
> shutting it down for every test case, which is inefficient. Many of the test 
> cases can run using the same instance of {{MiniDFSCluter}}, but not all of 
> them. Would be good to refactor the tests to optimize their running time.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15037) Encryption Zone operations should not block other RPC calls while retreiving encryption keys.

2022-05-24 Thread Masatake Iwasaki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541337#comment-17541337
 ] 

Masatake Iwasaki commented on HDFS-15037:
-

updated the target version in preparation for the 2.10.2 release.

> Encryption Zone operations should not block other RPC calls while retreiving 
> encryption keys.
> -
>
> Key: HDFS-15037
> URL: https://issues.apache.org/jira/browse/HDFS-15037
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption, namenode
>Affects Versions: 2.10.0
>Reporter: Konstantin Shvachko
>Priority: Major
>
> I believe the intention was to avoid blocking other operations while 
> retrieving keys by holding only {{FSDirectory.dirLock}}. But in reality all 
> other operations first enter {{FSNamesystemLock}} and then {{dirLock}}, so 
> they are all blocked waiting for the key.
> We see a substantial increase in RPC wait time ({{RpcQueueTimeAvgTime}}) on the 
> NameNode when encryption operations are intermixed with regular workloads.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15357) Do not trust bad block reports from clients

2022-05-24 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-15357:

Target Version/s: 3.4.0, 2.10.3  (was: 3.4.0, 2.10.2)

> Do not trust bad block reports from clients
> ---
>
> Key: HDFS-15357
> URL: https://issues.apache.org/jira/browse/HDFS-15357
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Priority: Major
>
> {{reportBadBlocks()}} is implemented by both ClientNamenodeProtocol and 
> DatanodeProtocol. When DFSClient calls it, a faulty client can cause 
> data availability issues in a cluster. 
> In the past we had such an incident where a node with a faulty NIC was 
> randomly corrupting data. All clients running on that machine reported all 
> accessed blocks and all associated replicas to be corrupt. More recently, a 
> single faulty client process caused a small number of missing blocks. In 
> all cases, the actual data was fine.
> The bad block reports from clients shouldn't be trusted blindly. Instead, the 
> namenode should send a datanode command to verify the claim. A bonus would be 
> to keep the record for a while and ignore repeated reports from the same 
> nodes.
> At minimum, there should be an option to ignore bad block reports from 
> clients, perhaps after logging them. A very crude way would be to short-circuit 
> it in {{ClientNamenodeProtocolServerSideTranslatorPB#reportBadBlocks()}}. 
> A more sophisticated way would be to check for the datanode user name in 
> {{FSNamesystem#reportBadBlocks()}} so that it can be easily logged, or 
> optionally do further processing.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16165) Backport the Hadoop 3.x Kerberos synchronization fix to Hadoop 2.x

2022-05-24 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-16165:

Target Version/s: 2.10.3  (was: 2.10.2)

> Backport the Hadoop 3.x Kerberos synchronization fix to Hadoop 2.x
> --
>
> Key: HDFS-16165
> URL: https://issues.apache.org/jira/browse/HDFS-16165
> Project: Hadoop HDFS
>  Issue Type: Wish
> Environment: Can be reproduced in docker HDFS environment with 
> Kerberos 
> https://github.com/vdesabou/kafka-docker-playground/blob/93a93de293ad2f9bb22afb244f2d8729a178296e/connect/connect-hdfs2-sink/hdfs2-sink-ha-kerberos-repro-gss-exception.sh
>Reporter: Daniel Osvath
>Priority: Major
>  Labels: Confluent
>
> *Problem Description*
> For more than a year Apache Kafka Connect users have been running into a 
> Kerberos renewal issue that causes our HDFS2 connectors to fail. 
> We have been able to consistently reproduce the issue under high load with 40 
> connectors (threads) that use the library. When we try an alternate 
> workaround that uses the Kerberos keytab on the system, the connector operates 
> without issues.
> We identified the root cause to be a race condition bug in the Hadoop 2.x 
> library that causes the ticket renewal to fail with the error below: 
> {code:java}
> Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
>  at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
> {code}
> We reached the conclusion about the root cause once we tried the same 
> environment (40 connectors) with Hadoop 3.x and our HDFS3 connectors, and it 
> operated without renewal issues. Additionally, having identified that the 
> synchronization issue has been fixed in the newer Hadoop 3.x releases, we 
> confirmed our hypothesis about the root cause.
> There are many changes in HDFS 3 
> [UserGroupInformation.java|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java]
>  related to UGI synchronization, which were done as part of 
> https://issues.apache.org/jira/browse/HADOOP-9747, and those changes suggest 
> some race conditions were happening with the older version, i.e. Hadoop 2.x, 
> which would explain why we can reproduce the problem with HDFS2.
> For example(among others):
> {code:java}
>   private void relogin(HadoopLoginContext login, boolean ignoreLastLoginTime)
>   throws IOException {
> // ensure the relogin is atomic to avoid leaving credentials in an
> // inconsistent state.  prevents other ugi instances, SASL, and SPNEGO
> // from accessing or altering credentials during the relogin.
> synchronized(login.getSubjectLock()) {
>   // another racing thread may have beat us to the relogin.
>   if (login == getLogin()) {
> unprotectedRelogin(login, ignoreLastLoginTime);
>   }
> }
>   }
> {code}
> All those changes were not backported to Hadoop 2.x (our HDFS2 connector uses 
> 2.10.1), on which several CDH distributions are based. 
> *Request*
> We would like to ask for the synchronization fix to be backported to Hadoop 
> 2.x so that our users can operate without issues. 
> *Impact*
> The older 2.x Hadoop version is used by our HDFS connector, which is used in 
> production by our community. Currently, the issue causes our HDFS connector 
> to fail, as it is unable to recover and renew the ticket at a later point. 
> Having the backported fix would allow our users to operate without issues 
> that require manual intervention every week (or every few days in some cases). 
> The only workaround available to the community is to run a command or 
> restart their workers. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16165) Backport the Hadoop 3.x Kerberos synchronization fix to Hadoop 2.x

2022-05-24 Thread Masatake Iwasaki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541334#comment-17541334
 ] 

Masatake Iwasaki commented on HDFS-16165:
-

updated the target version in preparation for the 2.10.2 release.

> Backport the Hadoop 3.x Kerberos synchronization fix to Hadoop 2.x
> --
>
> Key: HDFS-16165
> URL: https://issues.apache.org/jira/browse/HDFS-16165
> Project: Hadoop HDFS
>  Issue Type: Wish
> Environment: Can be reproduced in docker HDFS environment with 
> Kerberos 
> https://github.com/vdesabou/kafka-docker-playground/blob/93a93de293ad2f9bb22afb244f2d8729a178296e/connect/connect-hdfs2-sink/hdfs2-sink-ha-kerberos-repro-gss-exception.sh
>Reporter: Daniel Osvath
>Priority: Major
>  Labels: Confluent
>
> *Problem Description*
> For more than a year Apache Kafka Connect users have been running into a 
> Kerberos renewal issue that causes our HDFS2 connectors to fail. 
> We have been able to consistently reproduce the issue under high load with 40 
> connectors (threads) that use the library. When we try an alternate 
> workaround that uses the Kerberos keytab on the system, the connector operates 
> without issues.
> We identified the root cause to be a race condition bug in the Hadoop 2.x 
> library that causes the ticket renewal to fail with the error below: 
> {code:java}
> Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by 
> GSSException: No valid credentials provided (Mechanism level: Failed to find 
> any Kerberos tgt)]
>  at 
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
> {code}
> We reached the conclusion about the root cause once we tried the same 
> environment (40 connectors) with Hadoop 3.x and our HDFS3 connectors, and it 
> operated without renewal issues. Additionally, having identified that the 
> synchronization issue has been fixed in the newer Hadoop 3.x releases, we 
> confirmed our hypothesis about the root cause.
> There are many changes in HDFS 3 
> [UserGroupInformation.java|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java]
>  related to UGI synchronization, which were done as part of 
> https://issues.apache.org/jira/browse/HADOOP-9747, and those changes suggest 
> some race conditions were happening with the older version, i.e. Hadoop 2.x, 
> which would explain why we can reproduce the problem with HDFS2.
> For example(among others):
> {code:java}
>   private void relogin(HadoopLoginContext login, boolean ignoreLastLoginTime)
>   throws IOException {
> // ensure the relogin is atomic to avoid leaving credentials in an
> // inconsistent state.  prevents other ugi instances, SASL, and SPNEGO
> // from accessing or altering credentials during the relogin.
> synchronized(login.getSubjectLock()) {
>   // another racing thread may have beat us to the relogin.
>   if (login == getLogin()) {
> unprotectedRelogin(login, ignoreLastLoginTime);
>   }
> }
>   }
> {code}
> All those changes were not backported to Hadoop 2.x (our HDFS2 connector uses 
> 2.10.1), on which several CDH distributions are based. 
> *Request*
> We would like to ask for the synchronization fix to be backported to Hadoop 
> 2.x so that our users can operate without issues. 
> *Impact*
> The older 2.x Hadoop version is used by our HDFS connector, which is used in 
> production by our community. Currently, the issue causes our HDFS connector 
> to fail, as it is unable to recover and renew the ticket at a later point. 
> Having the backported fix would allow our users to operate without issues 
> that require manual intervention every week (or every few days in some cases). 
> The only workaround available to the community is to run a command or 
> restart their workers. 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16592) Fix typo for BalancingPolicy

2022-05-24 Thread JiangHua Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JiangHua Zhu updated HDFS-16592:

Component/s: documentation

> Fix typo for BalancingPolicy
> 
>
> Key: HDFS-16592
> URL: https://issues.apache.org/jira/browse/HDFS-16592
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: balancer & mover, documentation, namenode
>Affects Versions: 3.4.0
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2022-05-24-11-29-14-019.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  !image-2022-05-24-11-29-14-019.png! 
> 'NOT' should be changed to lowercase rather than uppercase.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14277) [SBN read] Observer benchmark results

2022-05-24 Thread Masatake Iwasaki (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541287#comment-17541287
 ] 

Masatake Iwasaki commented on HDFS-14277:
-

I updated the target version and priority in preparation for the 2.10.3 release.

> [SBN read] Observer benchmark results
> -
>
> Key: HDFS-14277
> URL: https://issues.apache.org/jira/browse/HDFS-14277
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: ha, namenode
>Affects Versions: 2.10.0, 3.3.0
> Environment: Hardware: 4-node cluster, each node has 4 core, Xeon 
> 2.5Ghz, 25GB memory.
> Software: CentOS 7.4, CDH 6.0 + Consistent Reads from Standby, Kerberos, SSL, 
> RPC encryption + Data Transfer Encryption, Cloudera Navigator.
>Reporter: Wei-Chiu Chuang
>Priority: Major
> Attachments: Observer profiler.png, Screen Shot 2019-02-14 at 
> 11.50.37 AM.png, observer RPC queue processing time.png
>
>
> Ran a few benchmarks and profiler (VisualVM) today on an Observer-enabled 
> cluster. Would like to share the results with the community. The cluster has 
> 1 Observer node.
> h2. NNThroughputBenchmark
> Generate 1 million files and send fileStatus RPCs.
> {code:java}
> hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs 
>   -op fileStatus -threads 100 -files 100 -useExisting 
> -keepResults
> {code}
> h3. Kerberos, SSL, RPC encryption, Data Transfer Encryption enabled:
> ||Node||fileStatus (Ops per sec)||
> |Active NameNode|4865|
> |Observer|3996|
> h3. Kerberos, SSL:
> ||Node||fileStatus (Ops per sec)||
> |Active NameNode|7078|
> |Observer|6459|
> Observation:
>  * Due to the edit tailing overhead, the Observer node consumes 30% CPU 
> utilization even when the cluster is idle.
>  * While the Active NN has less than 1ms RPC processing time, the Observer 
> node has >5ms RPC processing time. I am still looking for the source of the 
> longer processing time. The longer RPC processing time may be the cause of the 
> performance degradation compared to the Active NN. Note the cluster has 
> Cloudera Navigator installed, which adds additional overhead to RPC processing 
> time.
>  * {{GlobalStateIdContext#isCoordinatedCall()}} pops up as one of the top 
> hotspots in the profiler. 
>  



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14277) [SBN read] Observer benchmark results

2022-05-24 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-14277:

Target Version/s: 2.10.3  (was: 2.10.2)

> [SBN read] Observer benchmark results
> -
>
> Key: HDFS-14277
> URL: https://issues.apache.org/jira/browse/HDFS-14277
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: ha, namenode
>Affects Versions: 2.10.0, 3.3.0
> Environment: Hardware: 4-node cluster, each node has 4 core, Xeon 
> 2.5Ghz, 25GB memory.
> Software: CentOS 7.4, CDH 6.0 + Consistent Reads from Standby, Kerberos, SSL, 
> RPC encryption + Data Transfer Encryption, Cloudera Navigator.
>Reporter: Wei-Chiu Chuang
>Priority: Blocker
> Attachments: Observer profiler.png, Screen Shot 2019-02-14 at 
> 11.50.37 AM.png, observer RPC queue processing time.png
>
>
> Ran a few benchmarks and profiler (VisualVM) today on an Observer-enabled 
> cluster. Would like to share the results with the community. The cluster has 
> 1 Observer node.
> h2. NNThroughputBenchmark
> Generate 1 million files and send fileStatus RPCs.
> {code:java}
> hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs 
>   -op fileStatus -threads 100 -files 100 -useExisting 
> -keepResults
> {code}
> h3. Kerberos, SSL, RPC encryption, Data Transfer Encryption enabled:
> ||Node||fileStatus (Ops per sec)||
> |Active NameNode|4865|
> |Observer|3996|
> h3. Kerberos, SSL:
> ||Node||fileStatus (Ops per sec)||
> |Active NameNode|7078|
> |Observer|6459|
> Observation:
>  * Due to the edit tailing overhead, the Observer node consumes 30% CPU 
> utilization even when the cluster is idle.
>  * While the Active NN has less than 1ms RPC processing time, the Observer 
> node has >5ms RPC processing time. I am still looking for the source of the 
> longer processing time. The longer RPC processing time may be the cause of the 
> performance degradation compared to the Active NN. Note the cluster has 
> Cloudera Navigator installed, which adds additional overhead to RPC processing 
> time.
>  * {{GlobalStateIdContext#isCoordinatedCall()}} pops up as one of the top 
> hotspots in the profiler. 
>  
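> The idle CPU cost above is driven by how aggressively the Observer tails 
> edits. A hedged tuning sketch using the standard fast-tailing properties; 
> the 100 ms period is illustrative, not a recommendation:
> {code:java}
> import java.util.concurrent.TimeUnit;
> import org.apache.hadoop.conf.Configuration;
> 
> public class ObserverTailingConf {
>   public static void main(String[] args) {
>     Configuration conf = new Configuration();
>     // Tail in-progress edit-log segments over RPC rather than waiting for
>     // finalized segments; this keeps Observer staleness low but costs CPU.
>     conf.setBoolean("dfs.ha.tail-edits.in-progress", true);
>     // Polling period for new edits: a small value trades idle CPU for
>     // freshness; raising it reduces the background load observed above.
>     conf.setTimeDuration("dfs.ha.tail-edits.period", 100, TimeUnit.MILLISECONDS);
>     System.out.println(conf.get("dfs.ha.tail-edits.period"));
>   }
> }
> {code}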



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14277) [SBN read] Observer benchmark results

2022-05-24 Thread Masatake Iwasaki (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-14277:

Priority: Major  (was: Blocker)

> [SBN read] Observer benchmark results
> -
>
> Key: HDFS-14277
> URL: https://issues.apache.org/jira/browse/HDFS-14277
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: ha, namenode
>Affects Versions: 2.10.0, 3.3.0
> Environment: Hardware: 4-node cluster; each node has 4 cores, Xeon 
> 2.5 GHz, 25 GB memory.
> Software: CentOS 7.4, CDH 6.0 + Consistent Reads from Standby, Kerberos, SSL, 
> RPC encryption + Data Transfer Encryption, Cloudera Navigator.
>Reporter: Wei-Chiu Chuang
>Priority: Major
> Attachments: Observer profiler.png, Screen Shot 2019-02-14 at 
> 11.50.37 AM.png, observer RPC queue processing time.png
>
>
> Ran a few benchmarks and a profiler (VisualVM) today on an Observer-enabled 
> cluster, and would like to share the results with the community. The cluster 
> has 1 Observer node.
> h2. NNThroughputBenchmark
> Generate 1 million files and send fileStatus RPCs.
> {code:java}
> hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark \
>   -fs <namenode URI> -op fileStatus -threads 100 -files 1000000 \
>   -useExisting -keepResults
> {code}
> h3. Kerberos, SSL, RPC encryption, Data Transfer Encryption enabled:
> ||Node||fileStatus (Ops per sec)||
> |Active NameNode|4865|
> |Observer|3996|
> h3. Kerberos, SSL:
> ||Node||fileStatus (Ops per sec)||
> |Active NameNode|7078|
> |Observer|6459|
> Observations:
>  * Due to the edit-tailing overhead, the Observer node consumes about 30% 
> CPU even when the cluster is idle.
>  * While the Active NN's RPC processing time is under 1 ms, the Observer's 
> exceeds 5 ms. I am still looking for the source of the difference; the 
> longer processing time may explain the performance degradation relative to 
> the Active NN. Note that the cluster runs Cloudera Navigator, which adds 
> overhead to RPC processing.
>  * {{GlobalStateIdContext#isCoordinatedCall()}} shows up as one of the top 
> hotspots in the profiler (an illustrative sketch follows after this list).
>  
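> Why {{isCoordinatedCall()}} sits on the hot path: it is consulted for 
> every incoming RPC to decide whether the call must wait for the client's 
> state id. A simplified, hypothetical model of that check; the real 
> GlobalStateIdContext works against ClientProtocol's annotations, and all 
> names below are illustrative:
> {code:java}
> import java.util.Set;
> 
> /** Hypothetical, simplified model of the per-RPC coordinated-call check. */
> class CoordinationCheck {
>   // Precomputed once so the per-call check is a cheap set lookup; scanning
>   // annotations on every call would make this method a profiler hotspot.
>   private final Set<String> coordinatedReads;
> 
>   CoordinationCheck(Set<String> coordinatedReads) {
>     this.coordinatedReads = coordinatedReads;
>   }
> 
>   /** Called for every incoming RPC on the Observer. */
>   boolean isCoordinatedCall(String protocolName, String methodName) {
>     return "ClientProtocol".equals(protocolName)
>         && coordinatedReads.contains(methodName);
>   }
> 
>   public static void main(String[] args) {
>     CoordinationCheck check =
>         new CoordinationCheck(Set.of("getFileInfo", "getListing"));
>     System.out.println(check.isCoordinatedCall("ClientProtocol", "getFileInfo"));
>   }
> }
> {code}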



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15878) RBF: Flaky test TestRouterWebHDFSContractCreate>AbstractContractCreateTest#testSyncable in Trunk

2022-05-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15878?focusedWorklogId=773882&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-773882
 ]

ASF GitHub Bot logged work on HDFS-15878:
-

Author: ASF GitHub Bot
Created on: 24/May/22 06:04
Start Date: 24/May/22 06:04
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on PR #4340:
URL: https://github.com/apache/hadoop/pull/4340#issuecomment-1135447208

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 45s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  42m 11s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 48s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |  20m 54s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 43s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 13s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 40s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 10s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m  5s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m  9s |  |  branch has no errors 
when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 10s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 59s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  22m 59s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 46s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  21m 46s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 35s |  |  
hadoop-common-project/hadoop-common: The patch generated 0 new + 0 unchanged - 
5 fixed = 0 total (was 5)  |
   | +1 :green_heart: |  mvnsite  |   2m  5s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 30s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m  6s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 58s |  |  patch has no errors 
when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  18m 18s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   1m 25s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 221m 21s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4340/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4340 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux ba989b951e01 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 44f3349cef4d6c29c77e2da7b1aac51e6ac47924 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4340/2/testReport/ |
   | Max. process+thread count | 3158 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4340/2/console |
   |