[jira] [Commented] (HDFS-16271) RBF: NullPointerException when setQuota through routers with quota disabled

2021-10-13 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428592#comment-17428592
 ] 

Hadoop QA commented on HDFS-16271:
--

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 47s | Docker mode activated. |
|| Prechecks ||
| +1 | dupname | 0m 0s | No case conflicting files found. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test file. |
|| trunk Compile Tests ||
| +1 | mvninstall | 20m 36s | trunk passed |
| +1 | compile | 0m 42s | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | compile | 0m 39s | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | checkstyle | 0m 27s | trunk passed |
| +1 | mvnsite | 0m 43s | trunk passed |
| +1 | shadedclient | 19m 13s | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 47s | trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 0m 58s | trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| 0 | spotbugs | 22m 15s | Both FindBugs and SpotBugs are enabled, using SpotBugs. |
| +1 | spotbugs | 1m 17s | trunk passed |
|| Patch Compile Tests ||
| +1 | mvninstall | 0m 34s | the patch passed |
| +1 | compile | 0m 35s | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javac | 0m 35s | the patch passed |
| +1 | compile | 0m 30s | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | javac | 0m 30s | the patch passed |
| +1 | checkstyle | 0m 18s | the patch passed |
| +1 | mvnsite | 0m 33s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 17m 35s | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 38s | the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 |
| +1 | javadoc | 0m 54s | the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
| +1 | spotbugs | 1m 20s |

[jira] [Commented] (HDFS-16271) RBF: NullPointerException when setQuota through routers with quota disabled

2021-10-13 Thread Chengwei Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428575#comment-17428575
 ] 

Chengwei Wang commented on HDFS-16271:
--

Hi [~elgoiri], thanks for your review.
Submitted patch v002: [^HDFS-16271.002.patch]

> RBF: NullPointerException when setQuota through routers with quota disabled
> ---
>
> Key: HDFS-16271
> URL: https://issues.apache.org/jira/browse/HDFS-16271
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Chengwei Wang
>Assignee: Chengwei Wang
>Priority: Major
> Fix For: 3.3.1
>
> Attachments: HDFS-16271.001.patch, HDFS-16271.002.patch
>
>
> When we start routers with *dfs.federation.router.quota.enable=false* and 
> try to setQuota through them, a NullPointerException is thrown.
> The cause of the NPE is that Router#quotaManager is not initialized when 
> dfs.federation.router.quota.enable=false, but when a setQuota RPC request is 
> executed inside the router, it is used in the method Quota#isMountEntry 
> without a null check.
> I think it's better to check whether Router#isQuotaEnabled is true before 
> using Router#quotaManager, and throw an IOException with a readable message 
> if needed.
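The guard described above can be sketched in isolation. The class and member names below are simplified stand-ins for the Router internals, not the actual Hadoop API:

```java
import java.io.IOException;

// Simplified stand-in for the router's quota handling (names illustrative):
// the manager is only created when quota support is enabled, so every access
// goes through a guard that fails with a readable message instead of letting
// callers dereference null.
class RouterQuotaSketch {
    private final boolean quotaEnabled;
    private final Object quotaManager; // placeholder for the real quota manager

    RouterQuotaSketch(boolean quotaEnabled) {
        this.quotaEnabled = quotaEnabled;
        this.quotaManager = quotaEnabled ? new Object() : null;
    }

    /** Throws instead of returning null when quota support is off. */
    Object getQuotaManager() throws IOException {
        if (!quotaEnabled) {
            throw new IOException("Quota is disabled: set "
                + "dfs.federation.router.quota.enable=true to use setQuota.");
        }
        return quotaManager;
    }
}
```

With a guard like this, a setQuota call against a quota-disabled router surfaces a descriptive IOException rather than an NPE.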



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16271) RBF: NullPointerException when setQuota through routers with quota disabled

2021-10-13 Thread Chengwei Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengwei Wang updated HDFS-16271:
-
Attachment: HDFS-16271.002.patch

> RBF: NullPointerException when setQuota through routers with quota disabled
> ---
>
> Key: HDFS-16271
> URL: https://issues.apache.org/jira/browse/HDFS-16271
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Chengwei Wang
>Assignee: Chengwei Wang
>Priority: Major
> Fix For: 3.3.1
>
> Attachments: HDFS-16271.001.patch, HDFS-16271.002.patch
>
>
> When we start routers with *dfs.federation.router.quota.enable=false* and 
> try to setQuota through them, a NullPointerException is thrown.
> The cause of the NPE is that Router#quotaManager is not initialized when 
> dfs.federation.router.quota.enable=false, but when a setQuota RPC request is 
> executed inside the router, it is used in the method Quota#isMountEntry 
> without a null check.
> I think it's better to check whether Router#isQuotaEnabled is true before 
> using Router#quotaManager, and throw an IOException with a readable message 
> if needed.






[jira] [Work logged] (HDFS-16270) Improve NNThroughputBenchmark#printUsage() related to block size

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16270?focusedWorklogId=665490&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-665490
 ]

ASF GitHub Bot logged work on HDFS-16270:
-

Author: ASF GitHub Bot
Created on: 14/Oct/21 02:17
Start Date: 14/Oct/21 02:17
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3547:
URL: https://github.com/apache/hadoop/pull/3547#issuecomment-942885036


   In the Jenkins results there are some test failures, such as:
   TestHDFSFileSystemContract
   TestLeaseRecovery
   TestFileTruncate
   TestBlockTokenWithDFSStriped
   TestViewDistributedFileSystemWithMountLinks
   After analysis, they do not seem to have much connection with the code I 
submitted.
   
   @tomscut @prasad-acit, would you like to spend some time helping to review 
this PR?
   Thank you very much.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 665490)
Time Spent: 50m  (was: 40m)

> Improve NNThroughputBenchmark#printUsage() related to block size
> 
>
> Key: HDFS-16270
> URL: https://issues.apache.org/jira/browse/HDFS-16270
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: benchmarks, namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When using the NNThroughputBenchmark test, if the usage is not correct, we 
> will get some prompt messages.
> E.g.:
> '
> If connecting to a remote NameNode with -fs option, 
> dfs.namenode.fs-limits.min-block-size should be set to 16.
> 21/10/13 11:55:32 INFO util.ExitUtil: Exiting with status -1: ExitException
> '
> Yes, this way is good.
> However, even when the setting of 'dfs.blocksize' has been completed before 
> execution, for example:
> conf.setInt(DFSConfigKeys.DFS_BLOCK_SIZE_KEY, 16);
> we will still get the above prompt, which is wrong.
> At the same time, it should also be explained that the hint here should not 
> refer to 'dfs.namenode.fs-limits.min-block-size' but to 'dfs.blocksize', 
> because in the NNThroughputBenchmark constructor, 
> 'dfs.namenode.fs-limits.min-block-size' has already been set to 0 in advance.
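A minimal, self-contained sketch of the improved hint suggested above. The key names come from the report; the method, defaults, and map-based configuration are illustrative, not the benchmark's actual code:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative consistency check: when the requested block size is below the
// effective minimum, point the user at dfs.blocksize -- the key they actually
// control -- since the benchmark itself pre-sets the minimum.
class BlockSizeHintSketch {
    static final String MIN_KEY = "dfs.namenode.fs-limits.min-block-size";
    static final String SIZE_KEY = "dfs.blocksize";

    /** Returns null when consistent, otherwise a hint naming dfs.blocksize. */
    static String check(Map<String, Long> conf) {
        long min = conf.getOrDefault(MIN_KEY, 1024L * 1024);        // assumed default
        long size = conf.getOrDefault(SIZE_KEY, 128L * 1024 * 1024); // assumed default
        if (size < min) {
            return SIZE_KEY + " (" + size + ") is below " + MIN_KEY
                + " (" + min + "); raise " + SIZE_KEY + " or lower the minimum.";
        }
        return null;
    }
}
```

The point of the design is that the error message names the key the user set, so the prompt stays actionable.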






[jira] [Commented] (HDFS-16268) Balancer stuck when moving striped blocks due to NPE

2021-10-13 Thread Jing Zhao (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428552#comment-17428552
 ] 

Jing Zhao commented on HDFS-16268:
--

I've committed the fix. Thanks for the contribution, [~LeonG]!

> Balancer stuck when moving striped blocks due to NPE
> 
>
> Key: HDFS-16268
> URL: https://issues.apache.org/jira/browse/HDFS-16268
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, erasure-coding
>Affects Versions: 3.2.2
>Reporter: Leon Gao
>Assignee: Leon Gao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code:java}
> 21/10/11 06:11:26 WARN balancer.Dispatcher: Dispatcher thread failed
> java.lang.NullPointerException
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.markMovedIfGoodBlock(Dispatcher.java:289)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.chooseBlockAndProxy(Dispatcher.java:272)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2500(Dispatcher.java:236)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$Source.chooseNextMove(Dispatcher.java:899)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$Source.dispatchBlocks(Dispatcher.java:958)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$Source.access$3300(Dispatcher.java:757)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$2.run(Dispatcher.java:1226)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> Due to the NPE in the middle, there will be pending moves left in the queue, 
> so the balancer will be stuck forever.
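The failure mode above can be reproduced in miniature. This is not the Balancer's code, just an illustration of how a single null entry kills a worker loop and strands the remaining work:

```java
import java.util.Arrays;
import java.util.List;

// A worker loop that dies on the first null "move": everything after the
// null stays pending forever. In the Balancer this shows up as a dead
// dispatcher thread and a queue of pending moves that never drains.
class StuckWorkerSketch {
    /** Returns how many moves completed before the loop died. */
    static int dispatchAll(List<String> moves) {
        int done = 0;
        try {
            for (String m : moves) {
                m.length(); // NPE on a null move aborts the whole loop
                done++;
            }
        } catch (NullPointerException e) {
            // the worker ends here; remaining moves are never retried
        }
        return done;
    }
}
```

With moves ("move-1", null, "move-3"), only the first completes; the third is stranded, mirroring the "stuck forever" symptom described above.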






[jira] [Resolved] (HDFS-16268) Balancer stuck when moving striped blocks due to NPE

2021-10-13 Thread Jing Zhao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao resolved HDFS-16268.
--
Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Balancer stuck when moving striped blocks due to NPE
> 
>
> Key: HDFS-16268
> URL: https://issues.apache.org/jira/browse/HDFS-16268
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, erasure-coding
>Affects Versions: 3.2.2
>Reporter: Leon Gao
>Assignee: Leon Gao
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code:java}
> 21/10/11 06:11:26 WARN balancer.Dispatcher: Dispatcher thread failed
> java.lang.NullPointerException
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.markMovedIfGoodBlock(Dispatcher.java:289)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.chooseBlockAndProxy(Dispatcher.java:272)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2500(Dispatcher.java:236)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$Source.chooseNextMove(Dispatcher.java:899)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$Source.dispatchBlocks(Dispatcher.java:958)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$Source.access$3300(Dispatcher.java:757)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$2.run(Dispatcher.java:1226)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> Due to the NPE in the middle, there will be pending moves left in the queue, 
> so the balancer will be stuck forever.






[jira] [Work logged] (HDFS-16268) Balancer stuck when moving striped blocks due to NPE

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16268?focusedWorklogId=665480&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-665480
 ]

ASF GitHub Bot logged work on HDFS-16268:
-

Author: ASF GitHub Bot
Created on: 14/Oct/21 01:14
Start Date: 14/Oct/21 01:14
Worklog Time Spent: 10m 
  Work Description: Jing9 merged pull request #3546:
URL: https://github.com/apache/hadoop/pull/3546


   




Issue Time Tracking
---

Worklog Id: (was: 665480)
Time Spent: 40m  (was: 0.5h)

> Balancer stuck when moving striped blocks due to NPE
> 
>
> Key: HDFS-16268
> URL: https://issues.apache.org/jira/browse/HDFS-16268
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, erasure-coding
>Affects Versions: 3.2.2
>Reporter: Leon Gao
>Assignee: Leon Gao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> {code:java}
> 21/10/11 06:11:26 WARN balancer.Dispatcher: Dispatcher thread failed
> java.lang.NullPointerException
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.markMovedIfGoodBlock(Dispatcher.java:289)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.chooseBlockAndProxy(Dispatcher.java:272)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2500(Dispatcher.java:236)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$Source.chooseNextMove(Dispatcher.java:899)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$Source.dispatchBlocks(Dispatcher.java:958)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$Source.access$3300(Dispatcher.java:757)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$2.run(Dispatcher.java:1226)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> Due to the NPE in the middle, there will be pending moves left in the queue, 
> so the balancer will be stuck forever.






[jira] [Assigned] (HDFS-16272) Int overflow in computing safe length during EC block recovery

2021-10-13 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell reassigned HDFS-16272:


Assignee: daimin

> Int overflow in computing safe length during EC block recovery
> --
>
> Key: HDFS-16272
> URL: https://issues.apache.org/jira/browse/HDFS-16272
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: 3.1.1
>Affects Versions: 3.3.0, 3.3.1
> Environment: Cluster settings: EC RS-8-2-256k, Block Size 1GiB.
>Reporter: daimin
>Assignee: daimin
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> There exists an int overflow problem in StripedBlockUtil#getSafeLength, 
> which can produce a negative or zero length:
> 1. With a negative length, it fails the later >= 0 check and crashes the 
> BlockRecoveryWorker thread, which makes the lease recovery operation unable 
> to finish.
> 2. With a zero length, it passes the check and directly truncates the block 
> size to zero, leading to data loss.






[jira] [Work logged] (HDFS-16272) Int overflow in computing safe length during EC block recovery

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16272?focusedWorklogId=665413&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-665413
 ]

ASF GitHub Bot logged work on HDFS-16272:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 21:57
Start Date: 13/Oct/21 21:57
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on a change in pull request #3548:
URL: https://github.com/apache/hadoop/pull/3548#discussion_r728476447



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/StripedBlockUtil.java
##
@@ -245,8 +245,7 @@ public static long getSafeLength(ErasureCodingPolicy 
ecPolicy,
 Arrays.sort(cpy);
 // full stripe is a stripe has at least dataBlkNum full cells.
 // lastFullStripeIdx is the index of the last full stripe.
-int lastFullStripeIdx =
-(int) (cpy[cpy.length - dataBlkNum] / cellSize);
+long lastFullStripeIdx = cpy[cpy.length - dataBlkNum] / cellSize;

Review comment:
   The change you have made here makes sense to me, although I still don't 
fully understand how the method is used in practice. However, I do worry where 
else a bug like this may exist.
   
   I think there is a similar problem in `offsetInBlkToOffsetInBG()` in this 
same class. It only seems to be used in test code, but it would be good to fix 
it in case it is used in non-test code later.
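The wrap that the diff avoids can be shown in isolation. The numbers below follow the reported environment (RS-8-2-256k with 1 GiB blocks: stripe index 4096, cell size 256 KiB, 8 data units); the expressions are a simplified illustration, not the exact code in StripedBlockUtil:

```java
// With an int stripe index, a product such as idx * cellSize * dataBlkNum is
// evaluated entirely in 32-bit arithmetic and wraps before it is widened:
// 4096 * 262144 = 2^30 still fits an int, but * 8 = 2^33 wraps to exactly 0.
class SafeLengthOverflowSketch {
    static long buggySafeLength(int lastFullStripeIdx, int cellSize, int dataBlkNum) {
        return lastFullStripeIdx * cellSize * dataBlkNum; // int math: wraps to 0
    }

    static long fixedSafeLength(long lastFullStripeIdx, int cellSize, int dataBlkNum) {
        return lastFullStripeIdx * cellSize * dataBlkNum; // promoted to long: correct
    }
}
```

Here buggySafeLength(4096, 256 * 1024, 8) yields 0, the "zero length" case from the issue description that silently truncates the block, while fixedSafeLength gives the correct 2^33-byte value.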






Issue Time Tracking
---

Worklog Id: (was: 665413)
Time Spent: 1h  (was: 50m)

> Int overflow in computing safe length during EC block recovery
> --
>
> Key: HDFS-16272
> URL: https://issues.apache.org/jira/browse/HDFS-16272
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: 3.1.1
>Affects Versions: 3.3.0, 3.3.1
> Environment: Cluster settings: EC RS-8-2-256k, Block Size 1GiB.
>Reporter: daimin
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> There exists an int overflow problem in StripedBlockUtil#getSafeLength, 
> which can produce a negative or zero length:
> 1. With a negative length, it fails the later >= 0 check and crashes the 
> BlockRecoveryWorker thread, which makes the lease recovery operation unable 
> to finish.
> 2. With a zero length, it passes the check and directly truncates the block 
> size to zero, leading to data loss.






[jira] [Work logged] (HDFS-16272) Int overflow in computing safe length during EC block recovery

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16272?focusedWorklogId=665347&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-665347
 ]

ASF GitHub Bot logged work on HDFS-16272:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 20:16
Start Date: 13/Oct/21 20:16
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on a change in pull request #3548:
URL: https://github.com/apache/hadoop/pull/3548#discussion_r728414217



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/util/StripedBlockUtil.java
##
@@ -245,8 +245,7 @@ public static long getSafeLength(ErasureCodingPolicy 
ecPolicy,
 Arrays.sort(cpy);
 // full stripe is a stripe has at least dataBlkNum full cells.
 // lastFullStripeIdx is the index of the last full stripe.
-int lastFullStripeIdx =
-(int) (cpy[cpy.length - dataBlkNum] / cellSize);
+long lastFullStripeIdx = cpy[cpy.length - dataBlkNum] / cellSize;

Review comment:
   I know this is existing code, but I'd like to understand what is 
happening here to review this.
   
   This method receives an array of internal block lengths, so for 3-2 it will 
have 5 entries, 6-3 it will have 9 etc.
   
   Then it sorts the lengths smallest to largest. Then it selects the one at 
position num_blocks - numDataUnits.
   
   Why does it not just pick the first one, which would be the smallest, as the 
smallest data block in the group indicates the last full stripe.
   
   Why is the safe length based on the full stripe, and not a potentially 
partial last stripe?






Issue Time Tracking
---

Worklog Id: (was: 665347)
Time Spent: 50m  (was: 40m)

> Int overflow in computing safe length during EC block recovery
> --
>
> Key: HDFS-16272
> URL: https://issues.apache.org/jira/browse/HDFS-16272
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: 3.1.1
>Affects Versions: 3.3.0, 3.3.1
> Environment: Cluster settings: EC RS-8-2-256k, Block Size 1GiB.
>Reporter: daimin
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> There exists an int overflow problem in StripedBlockUtil#getSafeLength, 
> which can produce a negative or zero length:
> 1. With a negative length, it fails the later >= 0 check and crashes the 
> BlockRecoveryWorker thread, which makes the lease recovery operation unable 
> to finish.
> 2. With a zero length, it passes the check and directly truncates the block 
> size to zero, leading to data loss.






[jira] [Work logged] (HDFS-16267) Make hdfs_df tool cross platform

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16267?focusedWorklogId=665231&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-665231
 ]

ASF GitHub Bot logged work on HDFS-16267:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 18:57
Start Date: 13/Oct/21 18:57
Worklog Time Spent: 10m 
  Work Description: goiri merged pull request #3542:
URL: https://github.com/apache/hadoop/pull/3542


   




Issue Time Tracking
---

Worklog Id: (was: 665231)
Time Spent: 1.5h  (was: 1h 20m)

> Make hdfs_df tool cross platform
> 
>
> Key: HDFS-16267
> URL: https://issues.apache.org/jira/browse/HDFS-16267
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, libhdfs++, tools
>Affects Versions: 3.4.0
> Environment: Centos 7, Centos 8, Debian 10, Ubuntu Focal
>Reporter: Gautham Banasandra
>Assignee: Gautham Banasandra
>Priority: Major
>  Labels: libhdfscpp, pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The source files for hdfs_df uses *getopt* for parsing the command line 
> arguments. getopt is available only on Linux and thus, isn't cross platform. 
> We need to replace getopt with *boost::program_options* to make this cross 
> platform.






[jira] [Work logged] (HDFS-16272) Int overflow in computing safe length during EC block recovery

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16272?focusedWorklogId=665227&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-665227
 ]

ASF GitHub Bot logged work on HDFS-16272:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 18:57
Start Date: 13/Oct/21 18:57
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3548:
URL: https://github.com/apache/hadoop/pull/3548#issuecomment-942563834


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  1s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 55s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m 22s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   6m 30s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   5m 58s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 38s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 52s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   2m 13s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  27m 41s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   6m 21s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   6m 21s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 53s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   5m 53s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m 17s | 
[/results-checkstyle-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3548/1/artifact/out/results-checkstyle-hadoop-hdfs-project.txt)
 |  hadoop-hdfs-project: The patch generated 8 new + 23 unchanged - 0 fixed = 
31 total (was 23)  |
   | +1 :green_heart: |  mvnsite  |   2m 28s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   2m  3s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m 33s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  27m 48s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 23s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  | 441m 51s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3548/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   2m 15s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 597m  6s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.web.TestWebHdfsFileSystemContract |
   |   | hadoop.hdfs.TestViewDistributedFileSystemContract |
   |   | hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS |
   |   | 
hadoop.fs.viewfs.TestViewFileSystemOverloadSchemeHdfsFileSystemContract |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3548/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3548 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 16fff83

[jira] [Work logged] (HDFS-16270) Improve NNThroughputBenchmark#printUsage() related to block size

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16270?focusedWorklogId=665131&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-665131
 ]

ASF GitHub Bot logged work on HDFS-16270:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 18:47
Start Date: 13/Oct/21 18:47
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3547:
URL: https://github.com/apache/hadoop/pull/3547#issuecomment-942349833


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 58s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 16s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 14s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 12s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 53s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 47s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 18s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 20s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 54s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 383m 26s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3547/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch failed.  |
   | +1 :green_heart: |  asflicense  |   0m 54s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 487m  4s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestHDFSFileSystemContract |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3547/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3547 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux b9993f782838 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 
19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 4b3a9d7e04fd7e7d55e1af93e07691b379e8d8c7 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3547/1/testReport/ |
   | Max. process+thread count | 2009 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdf

[jira] [Work logged] (HDFS-16244) Add the necessary write lock in Checkpointer#doCheckpoint()

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16244?focusedWorklogId=665097&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-665097
 ]

ASF GitHub Bot logged work on HDFS-16244:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 18:43
Start Date: 13/Oct/21 18:43
Worklog Time Spent: 10m 
  Work Description: ferhui merged pull request #3497:
URL: https://github.com/apache/hadoop/pull/3497


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 665097)
Time Spent: 2h 40m  (was: 2.5h)

> Add the necessary write lock in Checkpointer#doCheckpoint()
> ---
>
> Key: HDFS-16244
> URL: https://issues.apache.org/jira/browse/HDFS-16244
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> When BackupNode is enabled, Checkpointer#doCheckpoint() starts to work.
> When the image file needs to be reloaded, there is a call chain such as:
> FSImage#reloadFromImageFile() -> FSNamesystem#clear() -> FSDirectory#reset().
> FSDirectory#reset() requires the write lock to be held in advance, for
> example:
>    void reset() {
>      writeLock();
>      try {
>        ...
>      } finally {
>        writeUnlock();
>      }
>    }
> However, no write lock is acquired before this call, so an assertion error
> is raised at this point, for example:
> java.lang.AssertionError: Should hold namesystem write lock
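The locking pattern described above can be sketched with a plain java.util.concurrent lock as a stand-in for the FSNamesystem lock. This is a minimal illustration of the fix's shape, not the actual Hadoop code; the class and method names below are invented.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Stand-in for the namesystem lock; illustrative only.
public class CheckpointLockSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    boolean hasWriteLock() { return lock.isWriteLockedByCurrentThread(); }

    // Mirrors FSDirectory#reset(): it must only run under the write lock,
    // otherwise the "Should hold namesystem write lock" assertion fires.
    void reset() {
        if (!hasWriteLock()) {
            throw new AssertionError("Should hold namesystem write lock");
        }
        // ... clear in-memory directory state ...
    }

    // Mirrors the fixed checkpoint path: take the write lock before the
    // reload triggers clear()/reset(), and release it afterwards.
    void reloadFromImageFile() {
        lock.writeLock().lock();
        try {
            reset();
        } finally {
            lock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        new CheckpointLockSketch().reloadFromImageFile();
        System.out.println("reload completed under write lock");
    }
}
```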



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16268) Balancer stuck when moving striped blocks due to NPE

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16268?focusedWorklogId=665068&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-665068
 ]

ASF GitHub Bot logged work on HDFS-16268:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 18:41
Start Date: 13/Oct/21 18:41
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3546:
URL: https://github.com/apache/hadoop/pull/3546#issuecomment-941845531






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 665068)
Time Spent: 0.5h  (was: 20m)

> Balancer stuck when moving striped blocks due to NPE
> 
>
> Key: HDFS-16268
> URL: https://issues.apache.org/jira/browse/HDFS-16268
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, erasure-coding
>Affects Versions: 3.2.2
>Reporter: Leon Gao
>Assignee: Leon Gao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> {code:java}
> 21/10/11 06:11:26 WARN balancer.Dispatcher: Dispatcher thread failed
> java.lang.NullPointerException
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.markMovedIfGoodBlock(Dispatcher.java:289)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.chooseBlockAndProxy(Dispatcher.java:272)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2500(Dispatcher.java:236)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$Source.chooseNextMove(Dispatcher.java:899)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$Source.dispatchBlocks(Dispatcher.java:958)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$Source.access$3300(Dispatcher.java:757)
>         at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$2.run(Dispatcher.java:1226)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> Due to the NPE in the middle of dispatching, pending moves are left in the 
> queue, so the balancer is stuck forever.
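The failure mode above can be sketched in isolation: one unresolvable entry throws mid-loop, the dispatcher thread dies, and the remaining pending moves are never drained. The names below are invented for illustration and are not the real Dispatcher code; a null guard of this shape lets the loop finish.

```java
import java.util.LinkedList;
import java.util.Queue;

// Illustrative dispatch loop; null stands in for a striped block whose
// internal block could not be resolved (the source of the NPE).
public class DispatchSketch {
    // Returns how many moves were dispatched; null entries are skipped
    // instead of being allowed to crash the worker thread.
    static int drain(Queue<String> pendingMoves) {
        int dispatched = 0;
        while (!pendingMoves.isEmpty()) {
            String move = pendingMoves.poll();
            if (move == null) {
                continue;  // guard: skip rather than throw mid-dispatch
            }
            dispatched++;
        }
        return dispatched;
    }

    public static void main(String[] args) {
        Queue<String> moves = new LinkedList<>();  // LinkedList permits nulls
        moves.add("blk_1");
        moves.add(null);
        moves.add("blk_2");
        System.out.println(drain(moves));  // 2
    }
}
```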



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16244) Add the necessary write lock in Checkpointer#doCheckpoint()

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16244?focusedWorklogId=665075&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-665075
 ]

ASF GitHub Bot logged work on HDFS-16244:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 18:41
Start Date: 13/Oct/21 18:41
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3497:
URL: https://github.com/apache/hadoop/pull/3497#issuecomment-942078112






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 665075)
Time Spent: 2.5h  (was: 2h 20m)

> Add the necessary write lock in Checkpointer#doCheckpoint()
> ---
>
> Key: HDFS-16244
> URL: https://issues.apache.org/jira/browse/HDFS-16244
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> When BackupNode is enabled, Checkpointer#doCheckpoint() starts to work.
> When the image file needs to be reloaded, there is a call chain such as:
> FSImage#reloadFromImageFile() -> FSNamesystem#clear() -> FSDirectory#reset().
> FSDirectory#reset() requires the write lock to be held in advance, for
> example:
>    void reset() {
>      writeLock();
>      try {
>        ...
>      } finally {
>        writeUnlock();
>      }
>    }
> However, no write lock is acquired before this call, so an assertion error
> is raised at this point, for example:
> java.lang.AssertionError: Should hold namesystem write lock



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16186) Datanode kicks out hard disk logic optimization

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16186?focusedWorklogId=665064&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-665064
 ]

ASF GitHub Bot logged work on HDFS-16186:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 18:41
Start Date: 13/Oct/21 18:41
Worklog Time Spent: 10m 
  Work Description: singer-bin closed pull request #3334:
URL: https://github.com/apache/hadoop/pull/3334


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 665064)
Time Spent: 3h  (was: 2h 50m)

> Datanode kicks out hard disk logic optimization
> ---
>
> Key: HDFS-16186
> URL: https://issues.apache.org/jira/browse/HDFS-16186
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.1.2
> Environment: In a Hadoop cluster, a hard disk on a Datanode failed, but 
> HDFS did not kick the disk out in time, causing the Datanode to become a 
> slow node
>Reporter: yanbin.zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 3h
>  Remaining Estimate: 0h
>
> 2021-08-24 08:56:10,456 WARN datanode.DataNode 
> (BlockSender.java:readChecksum(681)) - Could not read or failed to verify 
> checksum for data at offset 113115136 for block 
> BP-1801371083-x.x.x.x-1603704063698:blk_5635828768_4563943709
> java.io.IOException: Input/output error
>  at java.io.FileInputStream.readBytes(Native Method)
>  at java.io.FileInputStream.read(FileInputStream.java:255)
>  at 
> org.apache.hadoop.hdfs.server.datanode.FileIoProvider$WrappedFileInputStream.read(FileIoProvider.java:876)
>  at java.io.FilterInputStream.read(FilterInputStream.java:133)
>  at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>  at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
>  at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
>  at java.io.DataInputStream.read(DataInputStream.java:149)
>  at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:210)
>  at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.ReplicaInputStreams.readChecksumFully(ReplicaInputStreams.java:90)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.readChecksum(BlockSender.java:679)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:588)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:803)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:750)
>  at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:448)
>  at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:558)
>  at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:633)
> 2021-08-24 08:56:11,121 WARN datanode.VolumeScanner 
> (VolumeScanner.java:handle(292)) - Reporting bad 
> BP-1801371083-x.x.x.x-1603704063698:blk_5635828768_4563943709 on 
> /data11/hdfs/data



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16186) Datanode kicks out hard disk logic optimization

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16186?focusedWorklogId=665022&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-665022
 ]

ASF GitHub Bot logged work on HDFS-16186:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 18:36
Start Date: 13/Oct/21 18:36
Worklog Time Spent: 10m 
  Work Description: singer-bin removed a comment on pull request #3334:
URL: https://github.com/apache/hadoop/pull/3334#issuecomment-941838725






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 665022)
Time Spent: 2h 50m  (was: 2h 40m)

> Datanode kicks out hard disk logic optimization
> ---
>
> Key: HDFS-16186
> URL: https://issues.apache.org/jira/browse/HDFS-16186
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.1.2
> Environment: In a Hadoop cluster, a hard disk on a Datanode failed, but 
> HDFS did not kick the disk out in time, causing the Datanode to become a 
> slow node
>Reporter: yanbin.zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> 2021-08-24 08:56:10,456 WARN datanode.DataNode 
> (BlockSender.java:readChecksum(681)) - Could not read or failed to verify 
> checksum for data at offset 113115136 for block 
> BP-1801371083-x.x.x.x-1603704063698:blk_5635828768_4563943709
> java.io.IOException: Input/output error
>  at java.io.FileInputStream.readBytes(Native Method)
>  at java.io.FileInputStream.read(FileInputStream.java:255)
>  at 
> org.apache.hadoop.hdfs.server.datanode.FileIoProvider$WrappedFileInputStream.read(FileIoProvider.java:876)
>  at java.io.FilterInputStream.read(FilterInputStream.java:133)
>  at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>  at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
>  at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
>  at java.io.DataInputStream.read(DataInputStream.java:149)
>  at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:210)
>  at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.ReplicaInputStreams.readChecksumFully(ReplicaInputStreams.java:90)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.readChecksum(BlockSender.java:679)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:588)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:803)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:750)
>  at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:448)
>  at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:558)
>  at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:633)
> 2021-08-24 08:56:11,121 WARN datanode.VolumeScanner 
> (VolumeScanner.java:handle(292)) - Reporting bad 
> BP-1801371083-x.x.x.x-1603704063698:blk_5635828768_4563943709 on 
> /data11/hdfs/data



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16269) [Fix] Improve NNThroughputBenchmark#blockReport operation

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16269?focusedWorklogId=664996&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664996
 ]

ASF GitHub Bot logged work on HDFS-16269:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 18:34
Start Date: 13/Oct/21 18:34
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3544:
URL: https://github.com/apache/hadoop/pull/3544#issuecomment-941836374


   @ayushtkn @virajjasani, are you willing to spend some time reviewing this 
PR? Thank you very much.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 664996)
Time Spent: 50m  (was: 40m)

> [Fix] Improve NNThroughputBenchmark#blockReport operation
> -
>
> Key: HDFS-16269
> URL: https://issues.apache.org/jira/browse/HDFS-16269
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: benchmarks, namenode
>Affects Versions: 2.9.2
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> When using NNThroughputBenchmark to benchmark the blockReport operation, it 
> fails with an exception.
> Commands used:
> ./bin/hadoop org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -fs 
>  -op blockReport -datanodes 3 -reports 1
> The exception information:
> 21/10/12 14:35:18 INFO namenode.NNThroughputBenchmark: Starting benchmark: 
> blockReport
> 21/10/12 14:35:19 INFO namenode.NNThroughputBenchmark: Creating 10 files with 
> 10 blocks each.
> 21/10/12 14:35:19 ERROR namenode.NNThroughputBenchmark: 
> java.lang.ArrayIndexOutOfBoundsException: 50009
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550)
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 50009
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.addBlocks(NNThroughputBenchmark.java:1161)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$BlockReportStats.generateInputs(NNThroughputBenchmark.java:1143)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark$OperationStatsBase.benchmark(NNThroughputBenchmark.java:257)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.run(NNThroughputBenchmark.java:1528)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.runBenchmark(NNThroughputBenchmark.java:1430)
> at 
> org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark.main(NNThroughputBenchmark.java:1550)
> Inspecting the code shows where the problem occurs:
> private ExtendedBlock addBlocks(String fileName, String clientName)
>     throws IOException {
>   for (DatanodeInfo dnInfo : loc.getLocations()) {
>     int dnIdx = dnInfo.getXferPort() - 1;
>     datanodes[dnIdx].addBlock(loc.getBlock().getLocalBlock());
>   }
> }
> dnInfo.getXferPort() returns a port number, so it should not be used as an 
> index into the datanodes array.
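A minimal, self-contained illustration of this bug and one possible fix follows. TinyDatanode and the port values are invented for the sketch and are not the real benchmark classes; the point is that a transfer port identifies a datanode and must be mapped back to it, not used directly as an array index.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of the indexing bug and a lookup-based fix.
public class PortIndexSketch {
    static class TinyDatanode {
        final int xferPort;
        int blockCount;
        TinyDatanode(int xferPort) { this.xferPort = xferPort; }
    }

    // Fix: resolve the target datanode by its transfer port instead of
    // using the port itself as an array index.
    static TinyDatanode findByPort(TinyDatanode[] datanodes, int xferPort) {
        Map<Integer, TinyDatanode> byPort = new HashMap<>();
        for (TinyDatanode dn : datanodes) {
            byPort.put(dn.xferPort, dn);
        }
        return byPort.get(xferPort);
    }

    public static void main(String[] args) {
        // Datanodes get real ephemeral ports (e.g. 50010+), so the buggy
        // pattern datanodes[dnInfo.getXferPort() - 1] indexes far past the
        // end of a 3-element array: ArrayIndexOutOfBoundsException.
        TinyDatanode[] datanodes = {
            new TinyDatanode(50010), new TinyDatanode(50011), new TinyDatanode(50012)
        };
        findByPort(datanodes, 50011).blockCount++;  // addBlock on the right node
        System.out.println(datanodes[1].blockCount);  // 1
    }
}
```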



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16270) Improve NNThroughputBenchmark#printUsage() related to block size

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16270?focusedWorklogId=664971&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664971
 ]

ASF GitHub Bot logged work on HDFS-16270:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 18:32
Start Date: 13/Oct/21 18:32
Worklog Time Spent: 10m 
  Work Description: jianghuazhu opened a new pull request #3547:
URL: https://github.com/apache/hadoop/pull/3547


   ### Description of PR
   When NNThroughputBenchmark#printUsage() is executed, some configurations are 
verified first so that the correct prompts are printed.
   
   ### How was this patch tested?
   This change only prints usage information, so the testing burden is small.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 664971)
Time Spent: 0.5h  (was: 20m)

> Improve NNThroughputBenchmark#printUsage() related to block size
> 
>
> Key: HDFS-16270
> URL: https://issues.apache.org/jira/browse/HDFS-16270
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: benchmarks, namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When the NNThroughputBenchmark test is used incorrectly, it prints a prompt 
> message, e.g.:
> '
> If connecting to a remote NameNode with -fs option, 
> dfs.namenode.fs-limits.min-block-size should be set to 16.
> 21/10/13 11:55:32 INFO util.ExitUtil: Exiting with status -1: ExitException
> '
> That behavior is fine in principle. However, even when 'dfs.blocksize' has 
> already been set before execution, for example:
> conf.setInt(DFSConfigKeys.DFS_BLOCK_SIZE_KEY, 16);
> we still get the prompt above, which is wrong.
> The hint should also refer to 'dfs.blocksize' rather than 
> 'dfs.namenode.fs-limits.min-block-size', because the NNThroughputBenchmark 
> constructor has already set 'dfs.namenode.fs-limits.min-block-size' to 0 in 
> advance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16272) Int overflow in computing safe length during EC block recovery

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16272?focusedWorklogId=664978&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664978
 ]

ASF GitHub Bot logged work on HDFS-16272:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 18:32
Start Date: 13/Oct/21 18:32
Worklog Time Spent: 10m 
  Work Description: cndaimin opened a new pull request #3548:
URL: https://github.com/apache/hadoop/pull/3548


   Fix the int overflow problem.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 664978)
Time Spent: 0.5h  (was: 20m)

> Int overflow in computing safe length during EC block recovery
> --
>
> Key: HDFS-16272
> URL: https://issues.apache.org/jira/browse/HDFS-16272
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: 3.1.1
>Affects Versions: 3.3.0, 3.3.1
> Environment: Cluster settings: EC RS-8-2-256k, Block Size 1GiB.
>Reporter: daimin
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> There is an int overflow in StripedBlockUtil#getSafeLength, which can 
> produce a negative or zero length:
> 1. A negative length fails the later >= 0 check and crashes the 
> BlockRecoveryWorker thread, leaving the lease recovery operation unable to 
> finish.
> 2. A zero length passes the check and directly truncates the block size to 
> zero, leading to data loss.
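The overflow can be reproduced in isolation. The exact expression in StripedBlockUtil#getSafeLength differs, but the failure mode is the same: a product of ints exceeds 2^31 and wraps to zero or a negative value, and promoting to long before multiplying (the shape of the fix) avoids it. The numbers below are chosen to match the reported cluster settings (RS-8-2-256k, 1 GiB blocks).

```java
// Minimal sketch of the safe-length overflow; illustrative only.
public class SafeLengthSketch {
    // Buggy: the product is evaluated in 32-bit int arithmetic and wraps.
    static long unsafeLength(int stripeCount, int cellSize, int dataBlkNum) {
        return stripeCount * cellSize * dataBlkNum;  // int overflow
    }

    // Fixed: promote to long before multiplying.
    static long safeLength(int stripeCount, int cellSize, int dataBlkNum) {
        return (long) stripeCount * cellSize * dataBlkNum;
    }

    public static void main(String[] args) {
        int cellSize = 256 * 1024;  // RS-8-2-256k cells
        int dataBlkNum = 8;
        int stripeCount = 4096;     // enough stripes for a 1 GiB block group

        // 4096 * 262144 * 8 = 2^33, which wraps to 0 in 32-bit arithmetic:
        // exactly the "zero length" case that truncates the block.
        System.out.println(unsafeLength(stripeCount, cellSize, dataBlkNum));  // 0
        System.out.println(safeLength(stripeCount, cellSize, dataBlkNum));    // 8589934592
    }
}
```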



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16244) Add the necessary write lock in Checkpointer#doCheckpoint()

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16244?focusedWorklogId=664894&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664894
 ]

ASF GitHub Bot logged work on HDFS-16244:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 18:25
Start Date: 13/Oct/21 18:25
Worklog Time Spent: 10m 
  Work Description: ferhui commented on pull request #3497:
URL: https://github.com/apache/hadoop/pull/3497#issuecomment-942103734


   @jianghuazhu Thanks for the contribution. @virajjasani @jojochuang Thanks 
for the review! Merged


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 664894)
Time Spent: 2h 20m  (was: 2h 10m)

> Add the necessary write lock in Checkpointer#doCheckpoint()
> ---
>
> Key: HDFS-16244
> URL: https://issues.apache.org/jira/browse/HDFS-16244
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> When BackupNode is enabled, Checkpointer#doCheckpoint() starts to work.
> When the image file needs to be reloaded, there is a call chain such as:
> FSImage#reloadFromImageFile() -> FSNamesystem#clear() -> FSDirectory#reset().
> FSDirectory#reset() requires the write lock to be held in advance, for
> example:
>    void reset() {
>      writeLock();
>      try {
>        ...
>      } finally {
>        writeUnlock();
>      }
>    }
> However, no write lock is acquired before this call, so an assertion error
> is raised at this point, for example:
> java.lang.AssertionError: Should hold namesystem write lock



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16186) Datanode kicks out hard disk logic optimization

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16186?focusedWorklogId=664882&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664882
 ]

ASF GitHub Bot logged work on HDFS-16186:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 18:23
Start Date: 13/Oct/21 18:23
Worklog Time Spent: 10m 
  Work Description: singer-bin commented on pull request #3334:
URL: https://github.com/apache/hadoop/pull/3334#issuecomment-941838725






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 664882)
Time Spent: 2h 40m  (was: 2.5h)

> Datanode kicks out hard disk logic optimization
> ---
>
> Key: HDFS-16186
> URL: https://issues.apache.org/jira/browse/HDFS-16186
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.1.2
> Environment: In the Hadoop cluster, a hard disk on a Datanode had a 
> problem, but HDFS did not kick the disk out in time, causing the Datanode to 
> become a slow node
>Reporter: yanbin.zhang
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> 2021-08-24 08:56:10,456 WARN datanode.DataNode 
> (BlockSender.java:readChecksum(681)) - Could not read or failed to verify 
> checksum for data at offset 113115136 for block 
> BP-1801371083-x.x.x.x-1603704063698:blk_5635828768_4563943709
> java.io.IOException: Input/output error
>  at java.io.FileInputStream.readBytes(Native Method)
>  at java.io.FileInputStream.read(FileInputStream.java:255)
>  at 
> org.apache.hadoop.hdfs.server.datanode.FileIoProvider$WrappedFileInputStream.read(FileIoProvider.java:876)
>  at java.io.FilterInputStream.read(FilterInputStream.java:133)
>  at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>  at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
>  at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
>  at java.io.DataInputStream.read(DataInputStream.java:149)
>  at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:210)
>  at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.ReplicaInputStreams.readChecksumFully(ReplicaInputStreams.java:90)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.readChecksum(BlockSender.java:679)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:588)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:803)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:750)
>  at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:448)
>  at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:558)
>  at 
> org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:633)
> 2021-08-24 08:56:11,121 WARN datanode.VolumeScanner 
> (VolumeScanner.java:handle(292)) - Reporting bad 
> BP-1801371083-x.x.x.x-1603704063698:blk_5635828768_4563943709 on 
> /data11/hdfs/data



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16269) [Fix] Improve NNThroughputBenchmark#blockReport operation

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16269?focusedWorklogId=664820&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664820
 ]

ASF GitHub Bot logged work on HDFS-16269:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 18:18
Start Date: 13/Oct/21 18:18
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3544:
URL: https://github.com/apache/hadoop/pull/3544#issuecomment-941185360


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 56s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 12s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 13s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 24s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 58s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 23s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 19s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  23m 55s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  9s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  9s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 55s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 13s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 47s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 20s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m  1s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 323m 11s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 40s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 426m 15s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3544/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3544 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux dbace46031da 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 
19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / e66171dd599c8ccf6dfb79d06edfed1af0e9da1d |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3544/1/testReport/ |
   | Max. process+thread count | 2154 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3544/1/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |
   
   
   This 

[jira] [Commented] (HDFS-16271) RBF: NullPointerException when setQuota through routers with quota disabled

2021-10-13 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428392#comment-17428392
 ] 

Íñigo Goiri commented on HDFS-16271:


Thanks [~smarthan] for the patch.
Can you use LamdaTestUtils#intercept to validate the exception?

> RBF: NullPointerException when setQuota through routers with quota disabled
> ---
>
> Key: HDFS-16271
> URL: https://issues.apache.org/jira/browse/HDFS-16271
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Chengwei Wang
>Assignee: Chengwei Wang
>Priority: Major
> Fix For: 3.3.1
>
> Attachments: HDFS-16271.001.patch
>
>
> When we started routers with *dfs.federation.router.quota.enable=false* and 
> tried to setQuota through them, a NullPointerException was thrown.
> The cause of the NPE is that Router#quotaManager is not initialized when 
> dfs.federation.router.quota.enable=false,
>  but when executing a setQuota RPC request inside the router, it is used in 
> Quota#isMountEntry without a null check.
> I think it's better to check whether Router#isQuotaEnabled is true before using 
> Router#quotaManager, and throw an IOException with a readable message if needed.
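The guard proposed above can be sketched as follows. This is a minimal, standalone illustration; RouterSketch and its fields only mirror the RBF Router's quota handling and are hypothetical, not the actual Hadoop code or its exact message text.

```java
import java.io.IOException;

// Standalone sketch of checking the quota-enabled flag before using the
// quota manager, instead of dereferencing a possibly-null field (NPE).
class RouterSketch {
    private final boolean quotaEnabled;
    private final Object quotaManager; // null when quota support is disabled

    RouterSketch(boolean quotaEnabled) {
        this.quotaEnabled = quotaEnabled;
        this.quotaManager = quotaEnabled ? new Object() : null;
    }

    // Guarded accessor: fail fast with a readable message when disabled.
    Object getQuotaManager() throws IOException {
        if (!quotaEnabled) {
            throw new IOException(
                "The quota system is disabled in this Router.");
        }
        return quotaManager;
    }

    public static void main(String[] args) {
        try {
            new RouterSketch(false).getQuotaManager();
        } catch (IOException e) {
            // A clear IOException instead of a NullPointerException.
            System.out.println(e.getMessage());
        }
    }
}
```

With the guard in place, a setQuota request against a quota-disabled router surfaces a descriptive IOException to the client rather than an NPE from deep inside Quota#isMountEntry.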



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16271) RBF: NullPointerException when setQuota through routers with quota disabled

2021-10-13 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-16271:
---
Summary: RBF: NullPointerException when setQuota through routers with quota 
disabled  (was: NullPointerException when setQuota through routers with quota 
disabled.)

> RBF: NullPointerException when setQuota through routers with quota disabled
> ---
>
> Key: HDFS-16271
> URL: https://issues.apache.org/jira/browse/HDFS-16271
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Chengwei Wang
>Assignee: Chengwei Wang
>Priority: Major
> Fix For: 3.3.1
>
> Attachments: HDFS-16271.001.patch
>
>
> When we started routers with *dfs.federation.router.quota.enable=false* and 
> tried to setQuota through them, a NullPointerException was thrown.
> The cause of the NPE is that Router#quotaManager is not initialized when 
> dfs.federation.router.quota.enable=false,
>  but when executing a setQuota RPC request inside the router, it is used in 
> Quota#isMountEntry without a null check.
> I think it's better to check whether Router#isQuotaEnabled is true before using 
> Router#quotaManager, and throw an IOException with a readable message if needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16272) Int overflow in computing safe length during EC block recovery

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16272?focusedWorklogId=664800&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664800
 ]

ASF GitHub Bot logged work on HDFS-16272:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 17:44
Start Date: 13/Oct/21 17:44
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3548:
URL: https://github.com/apache/hadoop/pull/3548#issuecomment-942563834


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m  1s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  12m 55s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m 22s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   6m 30s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   5m 58s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m 21s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 38s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 52s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   2m 13s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  27m 41s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   6m 21s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   6m 21s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 53s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   5m 53s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m 17s | 
[/results-checkstyle-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3548/1/artifact/out/results-checkstyle-hadoop-hdfs-project.txt)
 |  hadoop-hdfs-project: The patch generated 8 new + 23 unchanged - 0 fixed = 
31 total (was 23)  |
   | +1 :green_heart: |  mvnsite  |   2m 28s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 35s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   2m  3s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   6m 33s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  27m 48s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 23s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  | 441m 51s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3548/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   2m 15s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 597m  6s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.web.TestWebHdfsFileSystemContract |
   |   | hadoop.hdfs.TestViewDistributedFileSystemContract |
   |   | hadoop.fs.viewfs.TestViewFSOverloadSchemeWithMountTableConfigInHDFS |
   |   | 
hadoop.fs.viewfs.TestViewFileSystemOverloadSchemeHdfsFileSystemContract |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3548/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3548 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 16fff83

[jira] [Commented] (HDFS-16267) Make hdfs_df tool cross platform

2021-10-13 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-16267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428309#comment-17428309
 ] 

Íñigo Goiri commented on HDFS-16267:


Thanks [~gautham] for the PR.
Merged PR 3542 to trunk.

> Make hdfs_df tool cross platform
> 
>
> Key: HDFS-16267
> URL: https://issues.apache.org/jira/browse/HDFS-16267
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, libhdfs++, tools
>Affects Versions: 3.4.0
> Environment: Centos 7, Centos 8, Debian 10, Ubuntu Focal
>Reporter: Gautham Banasandra
>Assignee: Gautham Banasandra
>Priority: Major
>  Labels: libhdfscpp, pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The source files for hdfs_df use *getopt* for parsing the command line 
> arguments. getopt is available only on Linux and thus isn't cross platform. 
> We need to replace getopt with *boost::program_options* to make this tool 
> cross platform.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16267) Make hdfs_df tool cross platform

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16267?focusedWorklogId=664746&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664746
 ]

ASF GitHub Bot logged work on HDFS-16267:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 16:11
Start Date: 13/Oct/21 16:11
Worklog Time Spent: 10m 
  Work Description: goiri merged pull request #3542:
URL: https://github.com/apache/hadoop/pull/3542


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 664746)
Time Spent: 1h 20m  (was: 1h 10m)

> Make hdfs_df tool cross platform
> 
>
> Key: HDFS-16267
> URL: https://issues.apache.org/jira/browse/HDFS-16267
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, libhdfs++, tools
>Affects Versions: 3.4.0
> Environment: Centos 7, Centos 8, Debian 10, Ubuntu Focal
>Reporter: Gautham Banasandra
>Assignee: Gautham Banasandra
>Priority: Major
>  Labels: libhdfscpp, pull-request-available
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The source files for hdfs_df use *getopt* for parsing the command line 
> arguments. getopt is available only on Linux and thus isn't cross platform. 
> We need to replace getopt with *boost::program_options* to make this tool 
> cross platform.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16267) Make hdfs_df tool cross platform

2021-10-13 Thread Jira


 [ 
https://issues.apache.org/jira/browse/HDFS-16267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri resolved HDFS-16267.

Fix Version/s: 3.4.0
 Hadoop Flags: Reviewed
   Resolution: Fixed

> Make hdfs_df tool cross platform
> 
>
> Key: HDFS-16267
> URL: https://issues.apache.org/jira/browse/HDFS-16267
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, libhdfs++, tools
>Affects Versions: 3.4.0
> Environment: Centos 7, Centos 8, Debian 10, Ubuntu Focal
>Reporter: Gautham Banasandra
>Assignee: Gautham Banasandra
>Priority: Major
>  Labels: libhdfscpp, pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The source files for hdfs_df use *getopt* for parsing the command line 
> arguments. getopt is available only on Linux and thus isn't cross platform. 
> We need to replace getopt with *boost::program_options* to make this tool 
> cross platform.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16270) Improve NNThroughputBenchmark#printUsage() related to block size

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16270?focusedWorklogId=664656&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664656
 ]

ASF GitHub Bot logged work on HDFS-16270:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 14:13
Start Date: 13/Oct/21 14:13
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3547:
URL: https://github.com/apache/hadoop/pull/3547#issuecomment-942349833


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 58s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 16s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 15s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  1s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 20s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 26s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 14s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 12s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 53s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 47s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 18s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 20s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m 54s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 383m 26s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3547/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 54s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 487m  4s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.hdfs.TestHDFSFileSystemContract |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3547/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3547 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux b9993f782838 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 
19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 4b3a9d7e04fd7e7d55e1af93e07691b379e8d8c7 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3547/1/testReport/ |
   | Max. process+thread count | 2009 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdf

[jira] [Work logged] (HDFS-16268) Balancer stuck when moving striped blocks due to NPE

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16268?focusedWorklogId=664582&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664582
 ]

ASF GitHub Bot logged work on HDFS-16268:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 11:22
Start Date: 13/Oct/21 11:22
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3546:
URL: https://github.com/apache/hadoop/pull/3546#issuecomment-942199526


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 53s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  39m  5s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |   1m 12s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  |  trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  |  trunk passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 16s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m  5s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |   1m 16s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m  8s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |   1m  8s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 51s |  |  
hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 151 unchanged - 1 
fixed = 151 total (was 152)  |
   | +1 :green_heart: |  mvnsite  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 48s |  |  the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 20s |  |  the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   3m 20s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  24m  4s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  | 328m 10s |  |  hadoop-hdfs in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 435m 49s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3546/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3546 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux 3388ad215de1 4.15.0-143-generic #147-Ubuntu SMP Wed Apr 14 
16:10:11 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 20f7fbf4b82e89d8d155593a000d51e7e692adb3 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3546/2/testReport/ |
   | Max. process+thread count | 1821 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3546/2/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 

[jira] [Commented] (HDFS-16271) NullPointerException when setQuota through routers with quota disabled.

2021-10-13 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428146#comment-17428146
 ] 

Hadoop QA commented on HDFS-16271:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 13m 
17s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 1 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
20s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
39s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
42s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 55s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 21m 
51s{color} | {color:blue}{color} | {color:blue} Both FindBugs and SpotBugs are 
enabled, using SpotBugs. {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  1m 
12s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
34s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
34s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
34s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
30s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
30s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
34s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 34s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 {color} |
| {color:green}+1{color} | {color:green} spotbugs {color} | {color:green}  1m 
22s{color} | {color

[jira] [Work logged] (HDFS-16244) Add the necessary write lock in Checkpointer#doCheckpoint()

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16244?focusedWorklogId=664541&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664541
 ]

ASF GitHub Bot logged work on HDFS-16244:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 09:50
Start Date: 13/Oct/21 09:50
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3497:
URL: https://github.com/apache/hadoop/pull/3497#issuecomment-942125168


   @ferhui Thank you very much.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 664541)
Time Spent: 2h 10m  (was: 2h)

> Add the necessary write lock in Checkpointer#doCheckpoint()
> ---
>
> Key: HDFS-16244
> URL: https://issues.apache.org/jira/browse/HDFS-16244
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When BackupNode is enabled, Checkpointer#doCheckpoint() starts to work.
> When the image file needs to be reloaded, there is a call chain, for example:
> FSImage#reloadFromImageFile()->FSNamesystem#clear()->FSDirectory#reset().
> FSDirectory#reset() expects the write lock to have been acquired in advance, 
> for example:
>void reset() {
>  writeLock();
>  try {
>  ..
>  } finally {
>writeUnlock();
>  }
>}
> However, no write lock has been acquired before this point.
> You will then get an exception message such as:
> java.lang.AssertionError: Should hold namesystem write lock
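
The locking problem above can be sketched with a toy lock. This is a hypothetical stand-in, not the actual Hadoop FSNamesystem/FSDirectory code; the class and method names only mirror the report: reset() asserts the write lock is held, so the reload path must acquire it first.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch of the fix described in HDFS-16244 (not the actual
// Hadoop code): reset() asserts the namesystem write lock is held, so the
// checkpoint/reload path must take the lock before calling into it.
class NamesystemSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    void reset() {
        // Mirrors the assertion that produced
        // "java.lang.AssertionError: Should hold namesystem write lock".
        if (!lock.isWriteLockedByCurrentThread()) {
            throw new AssertionError("Should hold namesystem write lock");
        }
        // ... clear the in-memory directory state ...
    }

    void reloadFromImageFile() {
        lock.writeLock().lock(); // the missing step: take the lock up front
        try {
            reset();             // now safe: the assertion passes
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Calling reset() without holding the write lock reproduces the AssertionError; wrapping the call as in reloadFromImageFile() avoids it.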



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16244) Add the necessary write lock in Checkpointer#doCheckpoint()

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16244?focusedWorklogId=664529&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664529
 ]

ASF GitHub Bot logged work on HDFS-16244:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 09:22
Start Date: 13/Oct/21 09:22
Worklog Time Spent: 10m 
  Work Description: ferhui commented on pull request #3497:
URL: https://github.com/apache/hadoop/pull/3497#issuecomment-942103734


   @jianghuazhu Thanks for contribution. @virajjasani @jojochuang Thanks for 
review! Merged


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 664529)
Time Spent: 2h  (was: 1h 50m)

> Add the necessary write lock in Checkpointer#doCheckpoint()
> ---
>
> Key: HDFS-16244
> URL: https://issues.apache.org/jira/browse/HDFS-16244
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> When BackupNode is enabled, Checkpointer#doCheckpoint() starts to work.
> When the image file needs to be reloaded, there is a call chain, for example:
> FSImage#reloadFromImageFile()->FSNamesystem#clear()->FSDirectory#reset().
> FSDirectory#reset() expects the write lock to have been acquired in advance, 
> for example:
>void reset() {
>  writeLock();
>  try {
>  ..
>  } finally {
>writeUnlock();
>  }
>}
> However, no write lock has been acquired before this point.
> You will then get an exception message such as:
> java.lang.AssertionError: Should hold namesystem write lock



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDFS-16244) Add the necessary write lock in Checkpointer#doCheckpoint()

2021-10-13 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei resolved HDFS-16244.

Fix Version/s: 3.4.0
   Resolution: Fixed

> Add the necessary write lock in Checkpointer#doCheckpoint()
> ---
>
> Key: HDFS-16244
> URL: https://issues.apache.org/jira/browse/HDFS-16244
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When BackupNode is enabled, Checkpointer#doCheckpoint() starts to work.
> When the image file needs to be reloaded, there is a call chain, for example:
> FSImage#reloadFromImageFile()->FSNamesystem#clear()->FSDirectory#reset().
> FSDirectory#reset() expects the write lock to have been acquired in advance, 
> for example:
>void reset() {
>  writeLock();
>  try {
>  ..
>  } finally {
>writeUnlock();
>  }
>}
> However, no write lock has been acquired before this point.
> You will then get an exception message such as:
> java.lang.AssertionError: Should hold namesystem write lock



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16244) Add the necessary write lock in Checkpointer#doCheckpoint()

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16244?focusedWorklogId=664527&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664527
 ]

ASF GitHub Bot logged work on HDFS-16244:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 09:21
Start Date: 13/Oct/21 09:21
Worklog Time Spent: 10m 
  Work Description: ferhui merged pull request #3497:
URL: https://github.com/apache/hadoop/pull/3497


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 664527)
Time Spent: 1h 50m  (was: 1h 40m)

> Add the necessary write lock in Checkpointer#doCheckpoint()
> ---
>
> Key: HDFS-16244
> URL: https://issues.apache.org/jira/browse/HDFS-16244
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When BackupNode is enabled, Checkpointer#doCheckpoint() starts to work.
> When the image file needs to be reloaded, there is a call chain, for example:
> FSImage#reloadFromImageFile()->FSNamesystem#clear()->FSDirectory#reset().
> FSDirectory#reset() expects the write lock to have been acquired in advance, 
> for example:
>void reset() {
>  writeLock();
>  try {
>  ..
>  } finally {
>writeUnlock();
>  }
>}
> However, no write lock has been acquired before this point.
> You will then get an exception message such as:
> java.lang.AssertionError: Should hold namesystem write lock



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16271) NullPointerException when setQuota through routers with quota disabled.

2021-10-13 Thread Chengwei Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengwei Wang updated HDFS-16271:
-
   Attachment: HDFS-16271.001.patch
Fix Version/s: 3.3.1
 Tags: RBF
 Target Version/s: 3.3.1
Affects Version/s: 3.3.1
   Status: Patch Available  (was: Open)

Submit patch v001.

> NullPointerException when setQuota through routers with quota disabled.
> ---
>
> Key: HDFS-16271
> URL: https://issues.apache.org/jira/browse/HDFS-16271
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Chengwei Wang
>Assignee: Chengwei Wang
>Priority: Major
> Fix For: 3.3.1
>
> Attachments: HDFS-16271.001.patch
>
>
> When we started routers with *dfs.federation.router.quota.enable=false* and 
> tried to setQuota through them, a NullPointerException was thrown.
> The cause of the NPE is that Router#quotaManager is not initialized when 
> dfs.federation.router.quota.enable=false, but when executing a setQuota RPC 
> request inside the router, it is used in Quota#isMountEntry without a null 
> check.
> I think it's better to check whether Router#isQuotaEnabled is true before 
> using Router#quotaManager, and throw an IOException with a readable message 
> if needed.
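
The proposed guard can be sketched as follows. This is a hypothetical illustration, not the actual Router code; the field and method names only mirror the report: with quota disabled, quotaManager stays null, so setQuota should fail fast with a readable IOException instead of an NPE.

```java
// Hypothetical sketch of the guard proposed in HDFS-16271 (not the actual
// Hadoop Router code): when quota support is disabled, quotaManager is
// never initialized, so quota RPCs must check isQuotaEnabled() first.
class RouterQuotaSketch {
    private final boolean quotaEnabled;  // dfs.federation.router.quota.enable
    private final Object quotaManager;   // stays null when quota is disabled

    RouterQuotaSketch(boolean quotaEnabled) {
        this.quotaEnabled = quotaEnabled;
        this.quotaManager = quotaEnabled ? new Object() : null;
    }

    boolean isQuotaEnabled() {
        return quotaEnabled;
    }

    void setQuota(String path) throws java.io.IOException {
        // Proposed check: consult isQuotaEnabled() before touching
        // quotaManager, and report a readable error instead of an NPE.
        if (!isQuotaEnabled()) {
            throw new java.io.IOException(
                "The quota system is disabled in Router.");
        }
        quotaManager.toString(); // safe here: non-null when quota is enabled
    }
}
```

With the guard in place, a router started with quota disabled rejects setQuota with a clear IOException rather than dereferencing the null quotaManager.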



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16271) NullPointerException when setQuota through routers with quota disabled.

2021-10-13 Thread Chengwei Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengwei Wang updated HDFS-16271:
-
Summary: NullPointerException when setQuota through routers with quota 
disabled.  (was: NullPointerException when setQuota through routers with 
dfs.federation.router.quota.enable=false.)

> NullPointerException when setQuota through routers with quota disabled.
> ---
>
> Key: HDFS-16271
> URL: https://issues.apache.org/jira/browse/HDFS-16271
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chengwei Wang
>Assignee: Chengwei Wang
>Priority: Major
>
> When we started routers with *dfs.federation.router.quota.enable=false* and 
> tried to setQuota through them, a NullPointerException was thrown.
> The cause of the NPE is that Router#quotaManager is not initialized when 
> dfs.federation.router.quota.enable=false, but when executing a setQuota RPC 
> request inside the router, it is used in Quota#isMountEntry without a null 
> check.
> I think it's better to check whether Router#isQuotaEnabled is true before 
> using Router#quotaManager, and throw an IOException with a readable message 
> if needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16244) Add the necessary write lock in Checkpointer#doCheckpoint()

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16244?focusedWorklogId=664514&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664514
 ]

ASF GitHub Bot logged work on HDFS-16244:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 08:52
Start Date: 13/Oct/21 08:52
Worklog Time Spent: 10m 
  Work Description: jianghuazhu commented on pull request #3497:
URL: https://github.com/apache/hadoop/pull/3497#issuecomment-942078112


   @ayushtkn  @ferhui  @Hexiaoqiao , are you willing to spend some time to help 
review this pr.
   Thank you very much.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 664514)
Time Spent: 1h 40m  (was: 1.5h)

> Add the necessary write lock in Checkpointer#doCheckpoint()
> ---
>
> Key: HDFS-16244
> URL: https://issues.apache.org/jira/browse/HDFS-16244
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When BackupNode is enabled, Checkpointer#doCheckpoint() starts to work.
> When the image file needs to be reloaded, there is a call chain, for example:
> FSImage#reloadFromImageFile()->FSNamesystem#clear()->FSDirectory#reset().
> FSDirectory#reset() expects the write lock to have been acquired in advance, 
> for example:
>void reset() {
>  writeLock();
>  try {
>  ..
>  } finally {
>writeUnlock();
>  }
>}
> However, no write lock has been acquired before this point.
> You will then get an exception message such as:
> java.lang.AssertionError: Should hold namesystem write lock



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16271) NullPointerException when setQuota through routers with dfs.federation.router.quota.enable=false.

2021-10-13 Thread Chengwei Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengwei Wang updated HDFS-16271:
-
Description: 
When we started routers with *dfs.federation.router.quota.enable=false* and 
tried to setQuota through them, a NullPointerException was thrown.

The cause of the NPE is that Router#quotaManager is not initialized when 
dfs.federation.router.quota.enable=false, but when executing a setQuota RPC 
request inside the router, it is used in Quota#isMountEntry without a null 
check.

I think it's better to check whether Router#isQuotaEnabled is true before 
using Router#quotaManager, and throw an IOException with a readable message 
if needed.

  was:
When we started routers with *dfs.federation.router.quota.enable=false* and 
tried to setQuota through them, a NullPointerException was thrown.

The cause of the NPE is that Router#quotaManager is not initialized when 
dfs.federation.router.quota.enable=false, but when executing a setQuota RPC 
request inside the router, it is used in Quota#isMountEntry without a null 
check.

I think it's better to check whether Router#quotaManager is null, and throw 
an IOException with a readable message if it is null.


> NullPointerException when setQuota through routers with 
> dfs.federation.router.quota.enable=false.
> -
>
> Key: HDFS-16271
> URL: https://issues.apache.org/jira/browse/HDFS-16271
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chengwei Wang
>Assignee: Chengwei Wang
>Priority: Major
>
> When we started routers with *dfs.federation.router.quota.enable=false* and 
> tried to setQuota through them, a NullPointerException was thrown.
> The cause of the NPE is that Router#quotaManager is not initialized when 
> dfs.federation.router.quota.enable=false, but when executing a setQuota RPC 
> request inside the router, it is used in Quota#isMountEntry without a null 
> check.
> I think it's better to check whether Router#isQuotaEnabled is true before 
> using Router#quotaManager, and throw an IOException with a readable message 
> if needed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16272) Int overflow in computing safe length during EC block recovery

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-16272:
--
Labels: pull-request-available  (was: )

> Int overflow in computing safe length during EC block recovery
> --
>
> Key: HDFS-16272
> URL: https://issues.apache.org/jira/browse/HDFS-16272
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: 3.1.1
>Affects Versions: 3.3.0, 3.3.1
> Environment: Cluster settings: EC RS-8-2-256k, Block Size 1GiB.
>Reporter: daimin
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is an int overflow problem in StripedBlockUtil#getSafeLength, which 
> can produce a negative or zero length:
> 1. With a negative length, it fails the later >= 0 check, which crashes the 
> BlockRecoveryWorker thread and leaves the lease recovery operation unable to 
> finish.
> 2. With a zero length, it passes the check and directly truncates the block 
> size to zero, leading to data loss.
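
The overflow can be demonstrated numerically. This is a hypothetical illustration, not the actual StripedBlockUtil#getSafeLength code: with the reported settings (RS-8-2, 256 KiB cells, 1 GiB block size), multiplying the stripe count by the cell size and data-block count in int arithmetic wraps to zero, matching the "zero length" case in the report.

```java
// Hypothetical illustration of the int overflow reported in HDFS-16272
// (not the actual StripedBlockUtil code): with RS-8-2-256k and a 1 GiB
// block size, an all-int product reaches 2^33 and wraps to 0.
public class SafeLengthOverflow {
    static final int CELL_SIZE = 256 * 1024;            // 256 KiB EC cell
    static final int DATA_BLK_NUM = 8;                  // RS-8-2: 8 data blocks
    static final long BLOCK_SIZE = 1024L * 1024 * 1024; // 1 GiB internal block

    // Buggy variant: the whole product is computed in int and overflows.
    static long buggySafeLength() {
        int stripes = (int) (BLOCK_SIZE / CELL_SIZE);   // 4096 stripes
        return stripes * CELL_SIZE * DATA_BLK_NUM;      // 2^33 wraps to 0
    }

    // Fixed variant: promote to long before multiplying.
    static long fixedSafeLength() {
        int stripes = (int) (BLOCK_SIZE / CELL_SIZE);
        return (long) stripes * CELL_SIZE * DATA_BLK_NUM;
    }

    public static void main(String[] args) {
        System.out.println(buggySafeLength()); // prints 0: the data-loss case
        System.out.println(fixedSafeLength()); // prints 8589934592 (8 GiB)
    }
}
```

Promoting the first operand to long before the multiplication keeps the intermediate product in 64-bit arithmetic and yields the correct 8 GiB safe length.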



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-16272) Int overflow in computing safe length during EC block recovery

2021-10-13 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16272?focusedWorklogId=664490&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-664490
 ]

ASF GitHub Bot logged work on HDFS-16272:
-

Author: ASF GitHub Bot
Created on: 13/Oct/21 07:45
Start Date: 13/Oct/21 07:45
Worklog Time Spent: 10m 
  Work Description: cndaimin opened a new pull request #3548:
URL: https://github.com/apache/hadoop/pull/3548


   Fix the int overflow problem.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 664490)
Remaining Estimate: 0h
Time Spent: 10m

> Int overflow in computing safe length during EC block recovery
> --
>
> Key: HDFS-16272
> URL: https://issues.apache.org/jira/browse/HDFS-16272
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: 3.1.1
>Affects Versions: 3.3.0, 3.3.1
> Environment: Cluster settings: EC RS-8-2-256k, Block Size 1GiB.
>Reporter: daimin
>Priority: Critical
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> There is an int overflow problem in StripedBlockUtil#getSafeLength, which 
> can produce a negative or zero length:
> 1. With a negative length, it fails the later >= 0 check, which crashes the 
> BlockRecoveryWorker thread and leaves the lease recovery operation unable to 
> finish.
> 2. With a zero length, it passes the check and directly truncates the block 
> size to zero, leading to data loss.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-16272) Int overflow in computing safe length during EC block recovery

2021-10-13 Thread daimin (Jira)
daimin created HDFS-16272:
-

 Summary: Int overflow in computing safe length during EC block 
recovery
 Key: HDFS-16272
 URL: https://issues.apache.org/jira/browse/HDFS-16272
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: 3.1.1
Affects Versions: 3.3.1, 3.3.0
 Environment: Cluster settings: EC RS-8-2-256k, Block Size 1GiB.
Reporter: daimin


There is an int overflow problem in StripedBlockUtil#getSafeLength, which 
can produce a negative or zero length:
1. With a negative length, it fails the later >= 0 check, which crashes the 
BlockRecoveryWorker thread and leaves the lease recovery operation unable to 
finish.
2. With a zero length, it passes the check and directly truncates the block 
size to zero, leading to data loss.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16271) NullPointerException when setQuota through routers with dfs.federation.router.quota.enable=false.

2021-10-13 Thread Chengwei Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengwei Wang updated HDFS-16271:
-
Description: 
When we started routers with *dfs.federation.router.quota.enable=false* and 
tried to setQuota through them, a NullPointerException was thrown.

The cause of the NPE is that Router#quotaManager is not initialized when 
dfs.federation.router.quota.enable=false, but when executing a setQuota RPC 
request inside the router, it is used in Quota#isMountEntry without a null 
check.

I think it's better to check whether Router#quotaManager is null, and throw 
an IOException with a readable message if it is null.

  was:
When we started routers with dfs.federation.router.quota.enable=false and 
tried to setQuota through them, a NullPointerException was thrown.

The cause of the NPE is that Router#quotaManager is not initialized when 
dfs.federation.router.quota.enable=false, but when executing a setQuota RPC 
request inside the router, it is used in Quota#isMountEntry without a null 
check.

I think it's better to check whether Router#quotaManager is null, and throw 
an IOException with a readable message if it is null.


> NullPointerException when setQuota through routers with 
> dfs.federation.router.quota.enable=false.
> -
>
> Key: HDFS-16271
> URL: https://issues.apache.org/jira/browse/HDFS-16271
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chengwei Wang
>Assignee: Chengwei Wang
>Priority: Major
>
> When we started routers with *dfs.federation.router.quota.enable=false* and 
> tried to setQuota through them, a NullPointerException was thrown.
> The cause of the NPE is that Router#quotaManager is not initialized when 
> dfs.federation.router.quota.enable=false, but when executing a setQuota RPC 
> request inside the router, it is used in Quota#isMountEntry without a null 
> check.
> I think it's better to check whether Router#quotaManager is null, and throw 
> an IOException with a readable message if it is null.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-16271) NullPointerException when setQuota through routers with dfs.federation.router.quota.enable=false.

2021-10-13 Thread Chengwei Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-16271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chengwei Wang updated HDFS-16271:
-
Description: 
When we started routers with dfs.federation.router.quota.enable=false and 
tried to setQuota through them, a NullPointerException was thrown.

The cause of the NPE is that Router#quotaManager is not initialized when 
dfs.federation.router.quota.enable=false, but when executing a setQuota RPC 
request inside the router, it is used in Quota#isMountEntry without a null 
check.

I think it's better to check whether Router#quotaManager is null, and throw 
an IOException with a readable message if it is null.

> NullPointerException when setQuota through routers with 
> dfs.federation.router.quota.enable=false.
> -
>
> Key: HDFS-16271
> URL: https://issues.apache.org/jira/browse/HDFS-16271
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Chengwei Wang
>Assignee: Chengwei Wang
>Priority: Major
>
> When we started routers with dfs.federation.router.quota.enable=false and 
> tried to setQuota through them, a NullPointerException was thrown.
> The cause of the NPE is that Router#quotaManager is not initialized when 
> dfs.federation.router.quota.enable=false, but when executing a setQuota RPC 
> request inside the router, it is used in Quota#isMountEntry without a null 
> check.
> I think it's better to check whether Router#quotaManager is null, and throw 
> an IOException with a readable message if it is null.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org