[jira] [Updated] (HDFS-15795) Returned wrong checksum when reconstruction was failed by exception
[ https://issues.apache.org/jira/browse/HDFS-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yushi Hayasaka updated HDFS-15795: -- Description: If the reconstruction task fails in StripedBlockChecksumReconstructor with an exception, the returned checksum is wrong because it is calculated from every block except the failed one. The root cause is that the exception is caught in an inappropriate way; as a result, the failed block is never fetched again. was: If the reconstruction task is failed on StripedBlockChecksumReconstructor by exception, the checksum becomes wrong one because it is calculated with blocks without a failure one. It is caused by catching exception with not appropriate way, so we need to fix it. > Returned wrong checksum when reconstruction was failed by exception > --- > > Key: HDFS-15795 > URL: https://issues.apache.org/jira/browse/HDFS-15795 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, ec, erasure-coding >Reporter: Yushi Hayasaka >Assignee: Yushi Hayasaka >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > If the reconstruction task fails in StripedBlockChecksumReconstructor with an > exception, the returned checksum is wrong because it is calculated from every > block except the failed one. > The root cause is that the exception is caught in an inappropriate way; as a > result, the failed block is never fetched again. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
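The failure mode described above can be sketched in a simplified, self-contained model (the class and method names below are illustrative, not the actual StripedBlockChecksumReconstructor code): if the exception raised while reading one block is caught and swallowed, the loop "succeeds" and the combined checksum is silently computed over the remaining blocks only; propagating the exception instead lets the caller re-fetch the failed block and retry.

```java
import java.io.IOException;
import java.util.zip.CRC32;

// Hypothetical, simplified model of the HDFS-15795 failure mode.
public class SwallowedExceptionChecksum {

    // Simulates reading one block; the block at failIndex throws.
    static byte[] readBlock(byte[][] blocks, int i, int failIndex) throws IOException {
        if (i == failIndex) {
            throw new IOException("block " + i + " unreadable");
        }
        return blocks[i];
    }

    // Buggy pattern: catch-and-continue. Always returns a checksum, but the
    // result silently omits the failed block's bytes.
    static long checksumSwallowing(byte[][] blocks, int failIndex) {
        CRC32 crc = new CRC32();
        for (int i = 0; i < blocks.length; i++) {
            try {
                crc.update(readBlock(blocks, i, failIndex));
            } catch (IOException e) {
                // swallowed: the caller never learns a block was skipped
            }
        }
        return crc.getValue();
    }

    // Fixed pattern: propagate the exception so the caller can retry.
    static long checksumPropagating(byte[][] blocks, int failIndex) throws IOException {
        CRC32 crc = new CRC32();
        for (int i = 0; i < blocks.length; i++) {
            crc.update(readBlock(blocks, i, failIndex));
        }
        return crc.getValue();
    }
}
```

With the buggy pattern, a reconstruction failure yields a plausible-looking but different checksum; with the fixed pattern the same failure surfaces as an IOException instead of a wrong answer.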
[jira] [Work logged] (HDFS-15795) Returned wrong checksum when reconstruction was failed by exception
[ https://issues.apache.org/jira/browse/HDFS-15795?focusedWorklogId=543401=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543401 ] ASF GitHub Bot logged work on HDFS-15795: - Author: ASF GitHub Bot Created on: 28/Jan/21 06:33 Start Date: 28/Jan/21 06:33 Worklog Time Spent: 10m Work Description: crossfire commented on a change in pull request #2657: URL: https://github.com/apache/hadoop/pull/2657#discussion_r565850014 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockChecksumHelper.java ## @@ -503,6 +503,7 @@ void compute() throws IOException { } } catch (IOException e) { Review comment: It may be okay to just remove here instead of rethrowing exception because it is handled below too: https://github.com/apache/hadoop/blob/f8769e0f4b917d9fda8ff7a9fddb4d755d246a1e/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java#L324 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 543401) Time Spent: 20m (was: 10m) > Returned wrong checksum when reconstruction was failed by exception > --- > > Key: HDFS-15795 > URL: https://issues.apache.org/jira/browse/HDFS-15795 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, ec, erasure-coding >Reporter: Yushi Hayasaka >Assignee: Yushi Hayasaka >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > If the reconstruction task is failed on StripedBlockChecksumReconstructor by > exception, the checksum becomes wrong one because it is calculated with > blocks without a failure one. > It is caused by catching exception with not appropriate way, so we need to > fix it. 
[jira] [Updated] (HDFS-15796) ConcurrentModificationException error happens on NameNode occasionally
[ https://issues.apache.org/jira/browse/HDFS-15796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Ma updated HDFS-15796: - Description: ConcurrentModificationException error happens on NameNode occasionally. {code:java} 2021-01-23 20:21:18,107 | ERROR | RedundancyMonitor | RedundancyMonitor thread received Runtime exception. | BlockManager.java:4746 java.util.ConcurrentModificationException at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909) at java.util.ArrayList$Itr.next(ArrayList.java:859) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReconstructionWorkForBlocks(BlockManager.java:1907) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeBlockReconstructionWork(BlockManager.java:1859) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4862) at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4729) at java.lang.Thread.run(Thread.java:748) {code} was: ConcurrentModificationException error happens on NameNode occasionally !file:///C:/Users/m00425105/AppData/Roaming/eSpace_Desktop/UserData/m00425105/imagefiles/10B02DC2-A9F0-4AE6-949B-92B8F1E9249A.png! > ConcurrentModificationException error happens on NameNode occasionally > -- > > Key: HDFS-15796 > URL: https://issues.apache.org/jira/browse/HDFS-15796 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.1.1 >Reporter: Daniel Ma >Priority: Critical > Fix For: 3.1.1 > > > ConcurrentModificationException error happens on NameNode occasionally. > > {code:java} > 2021-01-23 20:21:18,107 | ERROR | RedundancyMonitor | RedundancyMonitor > thread received Runtime exception. 
| BlockManager.java:4746 > java.util.ConcurrentModificationException > at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909) > at java.util.ArrayList$Itr.next(ArrayList.java:859) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReconstructionWorkForBlocks(BlockManager.java:1907) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeBlockReconstructionWork(BlockManager.java:1859) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4862) > at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4729) > at java.lang.Thread.run(Thread.java:748) > {code}
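The stack trace above is the classic fail-fast iterator failure: ArrayList's iterator throws ConcurrentModificationException as soon as the list is structurally modified mid-iteration. A minimal reproduction (illustrative only, not the BlockManager code) and one common remedy, iterating over a snapshot:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

// Minimal reproduction of the failure class behind HDFS-15796.
public class CmeDemo {

    // Returns true if removing during iteration trips the fail-fast check.
    static boolean triggersCme() {
        List<String> work = new ArrayList<>(List.of("a", "b", "c"));
        try {
            for (String s : work) {
                work.remove(s); // structural modification during iteration
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    // Remedy: iterate over a defensive copy so the live list may change.
    static int processSnapshot() {
        List<String> work = new ArrayList<>(List.of("a", "b", "c"));
        int processed = 0;
        for (String s : new ArrayList<>(work)) { // snapshot
            work.remove(s);
            processed++;
        }
        return processed;
    }
}
```

In multi-threaded code such as RedundancyMonitor, the same exception can also appear when another thread mutates the list; the usual fixes are the same snapshot copy, holding the appropriate lock across the iteration, or a concurrent collection.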
[jira] [Commented] (HDFS-15790) Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist
[ https://issues.apache.org/jira/browse/HDFS-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273357#comment-17273357 ] Vinayakumar B commented on HDFS-15790: -- Thanks for reporting this issue [~belugabehr]. please check the history of HADOOP-13363 for details regarding why and how the upgrade was done. I will try to review proposed changes this weekend. Thanks. > Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist > -- > > Key: HDFS-15790 > URL: https://issues.apache.org/jira/browse/HDFS-15790 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Changing from Protobuf 2 to Protobuf 3 broke some stuff in Apache Hive > project. This was not an awesome thing to do between minor versions in > regards to backwards compatibility for downstream projects. > Additionally, these two frameworks are not drop-in replacements, they have > some differences. Also, Protobuf 2 is not deprecated or anything so let us > have both protocols available at the same time. In Hadoop 4.x Protobuf 2 > support can be dropped.
[jira] [Assigned] (HDFS-15795) Returned wrong checksum when reconstruction was failed by exception
[ https://issues.apache.org/jira/browse/HDFS-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yushi Hayasaka reassigned HDFS-15795: - Assignee: Yushi Hayasaka > Returned wrong checksum when reconstruction was failed by exception > --- > > Key: HDFS-15795 > URL: https://issues.apache.org/jira/browse/HDFS-15795 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, ec, erasure-coding >Reporter: Yushi Hayasaka >Assignee: Yushi Hayasaka >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > If the reconstruction task is failed on StripedBlockChecksumReconstructor by > exception, the checksum becomes wrong one because it is calculated with > blocks without a failure one. > It is caused by catching exception with not appropriate way, so we need to > fix it.
[jira] [Created] (HDFS-15796) ConcurrentModificationException error happens on NameNode occasionally
Daniel Ma created HDFS-15796: Summary: ConcurrentModificationException error happens on NameNode occasionally Key: HDFS-15796 URL: https://issues.apache.org/jira/browse/HDFS-15796 Project: Hadoop HDFS Issue Type: Bug Components: hdfs Affects Versions: 3.1.1 Reporter: Daniel Ma Fix For: 3.1.1 ConcurrentModificationException error happens on NameNode occasionally !file:///C:/Users/m00425105/AppData/Roaming/eSpace_Desktop/UserData/m00425105/imagefiles/10B02DC2-A9F0-4AE6-949B-92B8F1E9249A.png!
[jira] [Comment Edited] (HDFS-15162) Optimize frequency of regular block reports
[ https://issues.apache.org/jira/browse/HDFS-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273266#comment-17273266 ] JiangHua Zhu edited comment on HDFS-15162 at 1/28/21, 6:21 AM: --- [~ayushtkn] , I noticed your opinion. I agree with what you said. When the DN connects to the NN abnormally, it means that the NN is under pressure or the midway connection fails. Recently I encountered a problem. When DN connected to NN, after frequent retries many times (for example, 50 times), an exception broke out. The log is as follows: 2021-01-01 17:55:21,099 [15993307503]-INFO [cluster lifeline to /:port:Client$Connection@948]-Retrying connect to server: /:port. Already tried 49 time(s) ; retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS) 2021-01-01 17:55:21,100 [15993307504]-WARN [cluster lifeline to /:port:BPServiceActor$LifelineSender@1008]-IOException in LifelineSender for Block pool (Datanode Uuid ) service to /: port java.net.ConnectException: Call From / to :port failed on connection exception: java.net.ConnectException: Connection refused; For more details see: [http://wiki.apache.org/hadoop/ConnectionRefused] at sun.reflect.GeneratedConstructorAccessor68.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:824) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:754) at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511) at org.apache.hadoop.ipc.Client.call(Client.java:1453) at org.apache.hadoop.ipc.Client.call(Client.java:1363) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) at com.sun.proxy.$Proxy21.sendLifeline(Unknown Source) at 
org.apache.hadoop.hdfs.protocolPB.DatanodeLifelineProtocolClientSideTranslatorPB.sendLifeline(DatanodeLifelineProtocolClientSideTranslatorPB.java:100) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.sendLifeline(BPServiceActor.java:1074) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.sendLifelineIfDue(BPServiceActor.java:1058) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.run(BPServiceActor.java:1003) FBR should not be triggered at this time. was (Author: jianghuazhu): [~ayushtkn] , I noticed your opinion. I agree with what you said. When the DN connects to the NN abnormally, it means that the NN is under pressure or the midway connection fails. Recently I encountered a problem. When DN connected to NN, after frequent retries many times (for example, 50 times), an exception broke out. The log is as follows: 2021-01-01 17:55:21,099 [15993307503]-INFO [cluster lifeline to /:port:Client$Connection@948]-Retrying connect to server: /:port. 
Already tried 49 time(s) ; retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS) 2021-01-12 17:55:21,100 [15993307504]-WARN [cluster lifeline to /:port:BPServiceActor$LifelineSender@1008]-IOException in LifelineSender for Block pool (Datanode Uuid ) service to /: port java.net.ConnectException: Call From / to :port failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at sun.reflect.GeneratedConstructorAccessor68.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:824) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:754) at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511) at org.apache.hadoop.ipc.Client.call(Client.java:1453) at org.apache.hadoop.ipc.Client.call(Client.java:1363) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) at com.sun.proxy.$Proxy21.sendLifeline(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.DatanodeLifelineProtocolClientSideTranslatorPB.sendLifeline(DatanodeLifelineProtocolClientSideTranslatorPB.java:100) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.sendLifeline(BPServiceActor.java:1074) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.sendLifelineIfDue(BPServiceActor.java:1058) at
[jira] [Work logged] (HDFS-15795) Returned wrong checksum when reconstruction was failed by exception
[ https://issues.apache.org/jira/browse/HDFS-15795?focusedWorklogId=543361=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543361 ] ASF GitHub Bot logged work on HDFS-15795: - Author: ASF GitHub Bot Created on: 28/Jan/21 04:11 Start Date: 28/Jan/21 04:11 Worklog Time Spent: 10m Work Description: crossfire opened a new pull request #2657: URL: https://github.com/apache/hadoop/pull/2657 …ed by exception. ## NOTICE Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.) For more details, please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 543361) Remaining Estimate: 0h Time Spent: 10m > Returned wrong checksum when reconstruction was failed by exception > --- > > Key: HDFS-15795 > URL: https://issues.apache.org/jira/browse/HDFS-15795 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, ec, erasure-coding >Reporter: Yushi Hayasaka >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > > If the reconstruction task is failed on StripedBlockChecksumReconstructor by > exception, the checksum becomes wrong one because it is calculated with > blocks without a failure one. > It is caused by catching exception with not appropriate way, so we need to > fix it.
[jira] [Updated] (HDFS-15795) Returned wrong checksum when reconstruction was failed by exception
[ https://issues.apache.org/jira/browse/HDFS-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDFS-15795: -- Labels: pull-request-available (was: ) > Returned wrong checksum when reconstruction was failed by exception > --- > > Key: HDFS-15795 > URL: https://issues.apache.org/jira/browse/HDFS-15795 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, ec, erasure-coding >Reporter: Yushi Hayasaka >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > If the reconstruction task is failed on StripedBlockChecksumReconstructor by > exception, the checksum becomes wrong one because it is calculated with > blocks without a failure one. > It is caused by catching exception with not appropriate way, so we need to > fix it.
[jira] [Created] (HDFS-15795) Returned wrong checksum when reconstruction was failed by exception
Yushi Hayasaka created HDFS-15795: - Summary: Returned wrong checksum when reconstruction was failed by exception Key: HDFS-15795 URL: https://issues.apache.org/jira/browse/HDFS-15795 Project: Hadoop HDFS Issue Type: Bug Components: datanode, ec, erasure-coding Reporter: Yushi Hayasaka If the reconstruction task is failed on StripedBlockChecksumReconstructor by exception, the checksum becomes wrong one because it is calculated with blocks without a failure one. It is caused by catching exception with not appropriate way, so we need to fix it.
[jira] [Work logged] (HDFS-15740) Make basename cross-platform
[ https://issues.apache.org/jira/browse/HDFS-15740?focusedWorklogId=543343=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543343 ] ASF GitHub Bot logged work on HDFS-15740: - Author: ASF GitHub Bot Created on: 28/Jan/21 03:03 Start Date: 28/Jan/21 03:03 Worklog Time Spent: 10m Work Description: GauthamBanasandra commented on pull request #2567: URL: https://github.com/apache/hadoop/pull/2567#issuecomment-768764746 @aajisaka Could you please review my PR? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 543343) Remaining Estimate: 16h 10m (was: 16h 20m) Time Spent: 7h 50m (was: 7h 40m) > Make basename cross-platform > > > Key: HDFS-15740 > URL: https://issues.apache.org/jira/browse/HDFS-15740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: libhdfs++ >Affects Versions: 3.4.0 >Reporter: Gautham Banasandra >Assignee: Gautham Banasandra >Priority: Major > Labels: pull-request-available > Original Estimate: 24h > Time Spent: 7h 50m > Remaining Estimate: 16h 10m > > The *basename* function isn't available on Visual Studio 2019 compiler. We > need to make it cross platform.
[jira] [Commented] (HDFS-15162) Optimize frequency of regular block reports
[ https://issues.apache.org/jira/browse/HDFS-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273268#comment-17273268 ] JiangHua Zhu commented on HDFS-15162: - At this time, I understand that two things should be done: 1. FBR should be sent only after the DN has connected to the NN normally; 2. After the abnormality is resolved, the FBR can be done at the next fixed interval, or the last unfinished FBR can be completed. > Optimize frequency of regular block reports > --- > > Key: HDFS-15162 > URL: https://issues.apache.org/jira/browse/HDFS-15162 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Critical > > Avoid sending block report at regular interval, if there is no failover, > DiskError or any exception encountered in connecting to the Namenode. > This JIRA intends to limit the regular block reports to be sent only in case > of the above scenarios and during re-registration of datanode, to eliminate > the overhead of processing BlockReports at Namenode in case of huge clusters. > *Eg.* If a block report was sent at hours and the next was scheduled at > 0600 hours if there is no above mentioned scenario, it will skip sending the > BR, and schedule it to next 1200 hrs. if something of such sort happens > between 06:- 12: it would send the BR normally. > *NOTE*: This would be optional and can be turned off by default. Would add a > configuration to enable this.
[jira] [Commented] (HDFS-15162) Optimize frequency of regular block reports
[ https://issues.apache.org/jira/browse/HDFS-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273266#comment-17273266 ] JiangHua Zhu commented on HDFS-15162: - [~ayushtkn] , I noticed your opinion. I agree with what you said. When the DN connects to the NN abnormally, it means that the NN is under pressure or the midway connection fails. Recently I encountered a problem. When DN connected to NN, after frequent retries many times (for example, 50 times), an exception broke out. The log is as follows: 2021-01-01 17:55:21,099 [15993307503]-INFO [cluster lifeline to /:port:Client$Connection@948]-Retrying connect to server: /:port. Already tried 49 time(s) ; retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS) 2021-01-12 17:55:21,100 [15993307504]-WARN [cluster lifeline to /:port:BPServiceActor$LifelineSender@1008]-IOException in LifelineSender for Block pool (Datanode Uuid ) service to /: port java.net.ConnectException: Call From / to :port failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused at sun.reflect.GeneratedConstructorAccessor68.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:824) at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:754) at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511) at org.apache.hadoop.ipc.Client.call(Client.java:1453) at org.apache.hadoop.ipc.Client.call(Client.java:1363) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) at com.sun.proxy.$Proxy21.sendLifeline(Unknown Source) at 
org.apache.hadoop.hdfs.protocolPB.DatanodeLifelineProtocolClientSideTranslatorPB.sendLifeline(DatanodeLifelineProtocolClientSideTranslatorPB.java:100) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.sendLifeline(BPServiceActor.java:1074) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.sendLifelineIfDue(BPServiceActor.java:1058) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.run(BPServiceActor.java:1003) FBR should not be triggered at this time. > Optimize frequency of regular block reports > --- > > Key: HDFS-15162 > URL: https://issues.apache.org/jira/browse/HDFS-15162 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Critical > > Avoid sending block report at regular interval, if there is no failover, > DiskError or any exception encountered in connecting to the Namenode. > This JIRA intends to limit the regular block reports to be sent only in case > of the above scenarios and during re-registration of datanode, to eliminate > the overhead of processing BlockReports at Namenode in case of huge clusters. > *Eg.* If a block report was sent at hours and the next was scheduled at > 0600 hours if there is no above mentioned scenario, it will skip sending the > BR, and schedule it to next 1200 hrs. if something of such sort happens > between 06:- 12: it would send the BR normally. > *NOTE*: This would be optional and can be turned off by default. Would add a > configuration to enable this.
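The log above names RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS). As a rough sketch of that behavior (this is an illustrative helper, not Hadoop's actual RetryPolicy API): retry a failing call up to a maximum count, sleeping a fixed interval between attempts, and only let the exception escape once retries are exhausted.

```java
import java.util.concurrent.Callable;

// Illustrative fixed-sleep retry loop (not Hadoop's RetryPolicy interface).
public class FixedSleepRetry {

    // Runs the action, retrying up to maxRetries additional times with
    // sleepMillis between attempts. Rethrows the last failure when exhausted.
    static <T> T call(Callable<T> action, int maxRetries, long sleepMillis)
            throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // remember the failure, then back off and retry
                if (attempt < maxRetries) {
                    Thread.sleep(sleepMillis);
                }
            }
        }
        throw last;
    }
}
```

In the reported scenario, only after all 50 retries fail does the ConnectException surface in the LifelineSender, which is why the burst of "Already tried N time(s)" lines precedes the WARN.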
[jira] [Updated] (HDFS-15661) The DeadNodeDetector shouldn't be shared by different DFSClients.
[ https://issues.apache.org/jira/browse/HDFS-15661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-15661: --- Fix Version/s: 3.4.0 Resolution: Fixed Status: Resolved (was: Patch Available) > The DeadNodeDetector shouldn't be shared by different DFSClients. > - > > Key: HDFS-15661 > URL: https://issues.apache.org/jira/browse/HDFS-15661 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Fix For: 3.4.0 > > Attachments: HDFS-15661.001.patch, HDFS-15661.002.patch, > HDFS-15661.003.patch, HDFS-15661.004.patch, HDFS-15661.005.patch > > > Currently the DeadNodeDetector is a member of ClientContext. That means it is > shared by many different DFSClients. When one DFSClient.close() is invoked, > the DeadNodeDetector thread would be interrupted and impact other DFSClients. > From the original design of HDFS-13571 we could see the DeadNodeDetector is > supposed to share dead nodes of many input streams from the same client. > We should move the DeadNodeDetector as a member of DFSClient instead of > ClientContext.
[jira] [Commented] (HDFS-15661) The DeadNodeDetector shouldn't be shared by different DFSClients.
[ https://issues.apache.org/jira/browse/HDFS-15661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273259#comment-17273259 ] Lisheng Sun commented on HDFS-15661: Committed to trunk. Thanks [~LiJinglun] for your report and contribution! Thanks [~weichiu] for your review! > The DeadNodeDetector shouldn't be shared by different DFSClients. > - > > Key: HDFS-15661 > URL: https://issues.apache.org/jira/browse/HDFS-15661 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Attachments: HDFS-15661.001.patch, HDFS-15661.002.patch, > HDFS-15661.003.patch, HDFS-15661.004.patch, HDFS-15661.005.patch > > > Currently the DeadNodeDetector is a member of ClientContext. That means it is > shared by many different DFSClients. When one DFSClient.close() is invoked, > the DeadNodeDetector thread would be interrupted and impact other DFSClients. > From the original design of HDFS-13571 we could see the DeadNodeDetector is > supposed to share dead nodes of many input streams from the same client. > We should move the DeadNodeDetector as a member of DFSClient instead of > ClientContext.
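The ownership problem described in this issue can be sketched abstractly (class names below are illustrative stand-ins, not the real DFSClient/ClientContext/DeadNodeDetector classes): when a helper lives in a context shared by many clients, closing one client tears the helper down for all of them; owning the helper per client scopes the shutdown correctly.

```java
// Hedged sketch of the HDFS-15661 ownership fix; names are hypothetical.
public class OwnershipDemo {

    // Stand-in for a background helper such as a dead-node detector thread.
    static class Detector {
        boolean running = true;
        void stop() { running = false; }
    }

    // Shared-context model: every client holds the same detector instance.
    static class SharedContext {
        final Detector detector = new Detector();
    }

    // A client stops whatever detector it holds when it is closed. If the
    // detector is shared, this also kills it for every other client.
    static class Client {
        final Detector detector;
        Client(Detector d) { this.detector = d; }
        void close() { detector.stop(); }
    }
}
```

With per-client ownership (each Client constructed with its own Detector), closing one client leaves the others' detectors running, which mirrors moving the detector from ClientContext into DFSClient.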
[jira] [Commented] (HDFS-15794) IBR and FBR use different queues to load data.
[ https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273253#comment-17273253 ] JiangHua Zhu commented on HDFS-15794: - [~kihwal] , thank you very much. I think your suggestion is very meaningful. > IBR and FBR use different queues to load data. > -- > > Key: HDFS-15794 > URL: https://issues.apache.org/jira/browse/HDFS-15794 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Major > > When the DataNode reports data to the NameNode, both IBR and FBR are involved. > After the NameNode receives the DataNode request, it temporarily stores the > data in a queue, here referring to > BlockManager#BlockReportProcessingThread#queue. > NameNodeRpcServer#blockReport() > for (int r = 0; r < reports.length; r++) { > final BlockListAsLongs blocks = reports[r].getBlocks(); > final int index = r; > noStaleStorages = bm.runBlockOp(() -> > bm.processReport(nodeReg, reports[index].getStorage(), > blocks, context)); > } > NameNodeRpcServer#blockReceivedAndDeleted() > for (final StorageReceivedDeletedBlocks r: receivedAndDeletedBlocks) { > bm.enqueueBlockOp(new Runnable() { > @Override > public void run() { > try { > namesystem.processIncrementalBlockReport(nodeReg, r); > } catch (Exception ex) { > // usually because the node is unregistered/dead. next heartbeat > // will correct the problem > blockStateChangeLog.error( > "*BLOCK* NameNode.blockReceivedAndDeleted: " > + "failed from "+ nodeReg + ":" + ex.getMessage()); > } > } > }); > } > The problem here is that when the NameNode is blocked processing an IBR, the > FBR requested by the DN from the NameNode is delayed; the same holds in the > other direction when the NameNode is blocked processing an FBR.
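The head-of-line blocking described above can be illustrated with a toy single-worker model (this is not the BlockManager code; the class and costs below are hypothetical): with one shared FIFO queue, a cheap IBR task queued behind an expensive FBR task cannot finish until the FBR does, whereas a dedicated IBR queue removes that coupling.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model of one worker draining a FIFO queue of report-processing tasks.
public class ReportQueues {

    // Each entry is {type, cost}: type 0 = FBR, type 1 = IBR.
    // Returns the simulated completion time of the last IBR task.
    static int ibrCompletionTime(Queue<int[]> queue) {
        int clock = 0;
        int lastIbrDone = 0;
        while (!queue.isEmpty()) {
            int[] task = queue.poll();
            clock += task[1];            // worker spends the task's cost
            if (task[0] == 1) {
                lastIbrDone = clock;     // IBR finishes only after everything ahead of it
            }
        }
        return lastIbrDone;
    }
}
```

In the shared-queue case an IBR of cost 1 behind an FBR of cost 100 completes at time 101; in a dedicated IBR queue the same task completes at time 1, which is the motivation for splitting the queues.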
[jira] [Commented] (HDFS-15786) Minor improvement use isEmpty
[ https://issues.apache.org/jira/browse/HDFS-15786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273201#comment-17273201 ] Jason Wen commented on HDFS-15786: -- I see the PR is only about changing String method calls. For a String object, string.isEmpty() is equivalent to string.length() == 0, so it makes no difference; both are O(1). > Minor improvement use isEmpty > - > > Key: HDFS-15786 > URL: https://issues.apache.org/jira/browse/HDFS-15786 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Arturo Bernal >Assignee: Arturo Bernal >Priority: Minor > Labels: pull-request-available > Time Spent: 3.5h > Remaining Estimate: 0h > > Use isEmpty() instead of size() == 0. > > {{size()}} can be *O(1)* or *O(N)*, depending on the {{data structure}}; > {{.isEmpty()}} is never *O(N)*. -- This message was sent by Atlassian Jira (v8.3.4#803005)
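Both points in this thread are right for their respective types, and one concrete case where the issue's claim holds is ConcurrentLinkedQueue: its documented size() traverses the whole queue (O(n)) because the count is not maintained eagerly, while isEmpty() only needs to inspect the head (O(1)). A small sketch contrasting the two forms:

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// For plain String, isEmpty() and length() == 0 are equivalent, but for
// ConcurrentLinkedQueue the two emptiness checks have different costs:
// size() is an O(n) traversal, isEmpty() is an O(1) head check.
public class IsEmptyDemo {

    static boolean viaSize(ConcurrentLinkedQueue<Integer> q) {
        return q.size() == 0;   // traverses every node to count them
    }

    static boolean viaIsEmpty(ConcurrentLinkedQueue<Integer> q) {
        return q.isEmpty();     // only checks whether a first element exists
    }
}
```

Both methods agree on the answer, so the change is purely a (sometimes asymptotic) performance and readability improvement, never a behavior change.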
[jira] [Work logged] (HDFS-15790) Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist
[ https://issues.apache.org/jira/browse/HDFS-15790?focusedWorklogId=543168=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543168 ] ASF GitHub Bot logged work on HDFS-15790: - Author: ASF GitHub Bot Created on: 27/Jan/21 23:02 Start Date: 27/Jan/21 23:02 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2650: URL: https://github.com/apache/hadoop/pull/2650#issuecomment-768635544 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 31s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. | | +0 :ok: | buf | 0m 0s | | buf was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 5 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 14m 9s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 20m 29s | | trunk passed | | +1 :green_heart: | compile | 20m 33s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | compile | 17m 56s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | +1 :green_heart: | checkstyle | 4m 6s | | trunk passed | | +1 :green_heart: | mvnsite | 6m 15s | | trunk passed | | +1 :green_heart: | shadedclient | 24m 31s | | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 4m 48s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | javadoc | 6m 24s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | +0 :ok: | spotbugs | 1m 26s | | Used deprecated FindBugs config; considering switching to SpotBugs. 
| | +1 :green_heart: | findbugs | 11m 34s | | trunk passed | _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 26s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 4m 11s | | the patch passed | | +1 :green_heart: | compile | 20m 3s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | -1 :x: | cc | 20m 3s | [/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2650/3/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt) | root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 generated 42 new + 370 unchanged - 42 fixed = 412 total (was 412) | | -1 :x: | javac | 20m 3s | [/diff-compile-javac-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2650/3/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt) | root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 generated 1 new + 2035 unchanged - 0 fixed = 2036 total (was 2035) | | +1 :green_heart: | compile | 18m 1s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | -1 :x: | cc | 18m 1s | [/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2650/3/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt) | root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 generated 37 new + 375 unchanged - 37 fixed = 412 total (was 412) | | -1 :x: | javac | 18m 1s | 
[/diff-compile-javac-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2650/3/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt) | root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 generated 1 new + 1930 unchanged - 0 fixed = 1931 total (was 1930) | | -0 :warning: | checkstyle | 3m 57s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2650/3/artifact/out/diff-checkstyle-root.txt) | root: The patch generated 4 new + 557 unchanged - 3 fixed = 561 total (was 560) | | +1 :green_heart: | mvnsite | 6m 11s | | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues.
[jira] [Commented] (HDFS-15789) Lease renewal does not require namesystem lock
[ https://issues.apache.org/jira/browse/HDFS-15789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273129#comment-17273129 ] Kihwal Lee commented on HDFS-15789: --- This is a safe change. The FSN lock only protects the NN against renewing a lease during an HA transition, since renewals should be done only by the active NN. So after this patch, there can be a case where a lease renewal request is received while a NN is active but finishes processing during or after the transition to standby. However, this does not affect file system consistency or violate the existing file system API semantics. The important states are whether the file is open and who holds the lease. Anything that changes these states is edit-logged. The renewal does not revive expired/revoked leases and is thus not edit-logged. +1 for the patch. > Lease renewal does not require namesystem lock > -- > > Key: HDFS-15789 > URL: https://issues.apache.org/jira/browse/HDFS-15789 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.4.0 >Reporter: Jim Brennan >Assignee: Jim Brennan >Priority: Major > Attachments: HDFS-15789.001.patch > > > [~daryn] found this while testing the performance for HDFS-15704. > The lease manager is independent of the namesystem. Acquiring the lock causes > unnecessary lock contention that degrades throughput. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
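A toy model of why the change discussed above is safe — purely illustrative (LeaseSketch and its methods are invented for this sketch, not the actual LeaseManager API): renewal only bumps a timestamp on an already-existing lease under the lease manager's own lock; it never creates or revives state that would need the namesystem lock or an edit-log entry.

```java
import java.util.HashMap;
import java.util.Map;

public class LeaseSketch {
    // holder -> time of last renewal (millis); guarded by this object's lock
    private final Map<String, Long> leases = new HashMap<>();

    synchronized void add(String holder, long now) {
        leases.put(holder, now); // lease creation IS edit-logged in real NN
    }

    // Renewal bumps the timestamp only if the lease still exists; it never
    // revives an expired/removed lease, which is why it needs neither the
    // namesystem lock nor an edit-log entry.
    synchronized boolean renew(String holder, long now) {
        return leases.computeIfPresent(holder, (h, t) -> now) != null;
    }

    synchronized boolean expired(String holder, long now, long limitMs) {
        Long t = leases.get(holder);
        return t == null || now - t > limitMs;
    }
}
```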
[jira] [Commented] (HDFS-10498) Intermittent test failure org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength.testSnapshotfileLength
[ https://issues.apache.org/jira/browse/HDFS-10498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17273123#comment-17273123 ] Kihwal Lee commented on HDFS-10498: --- +1 > Intermittent test failure > org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength.testSnapshotfileLength > --- > > Key: HDFS-10498 > URL: https://issues.apache.org/jira/browse/HDFS-10498 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs, snapshots >Affects Versions: 3.0.0-alpha1 >Reporter: Hanisha Koneru >Assignee: Jim Brennan >Priority: Major > Attachments: HDFS-10498.001.patch, test_failure.txt > > > Error Details > Per https://builds.apache.org/job/PreCommit-HDFS-Build/15646/testReport/, we > had the following failure. Local rerun is successful. > Error Details: > {panel} > Fail to get block MD5 for > LocatedBlock{BP-145245805-172.17.0.3-1464981728847:blk_1073741826_1002; > getBlockSize()=1; corrupt=false; offset=1024; > locs=[DatanodeInfoWithStorage[127.0.0.1:55764,DS-a33d7c97-9d4a-4694-a47e-a3187a33ed5a,DISK]]} > {panel} > Stack Trace: > {panel} > java.io.IOException: Fail to get block MD5 for > LocatedBlock{BP-145245805-172.17.0.3-1464981728847:blk_1073741826_1002; > getBlockSize()=1; corrupt=false; offset=1024; > locs=[DatanodeInfoWithStorage[127.0.0.1:55764,DS-a33d7c97-9d4a-4694-a47e-a3187a33ed5a,DISK]]} > at > org.apache.hadoop.hdfs.FileChecksumHelper$ReplicatedFileChecksumComputer.checksumBlocks(FileChecksumHelper.java:289) > at > org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:206) > at org.apache.hadoop.hdfs.DFSClient.getFileChecksum(DFSClient.java:1731) > at > org.apache.hadoop.hdfs.DistributedFileSystem$31.doCall(DistributedFileSystem.java:1482) > at > org.apache.hadoop.hdfs.DistributedFileSystem$31.doCall(DistributedFileSystem.java:1479) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1490) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength.testSnapshotfileLength(TestSnapshotFileLength.java:137) > Standard Output 7 sec > {panel} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDFS-15740) Make basename cross-platform
[ https://issues.apache.org/jira/browse/HDFS-15740?focusedWorklogId=543035=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543035 ] ASF GitHub Bot logged work on HDFS-15740: - Author: ASF GitHub Bot Created on: 27/Jan/21 19:46 Start Date: 27/Jan/21 19:46 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2567: URL: https://github.com/apache/hadoop/pull/2567#issuecomment-768532520 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 33s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | markdownlint | 0m 1s | | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 14m 9s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 21m 38s | | trunk passed | | +1 :green_heart: | compile | 22m 41s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | compile | 19m 24s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | +1 :green_heart: | mvnsite | 26m 1s | | trunk passed | | +1 :green_heart: | shadedclient | 118m 5s | | branch has no errors when building and testing our client artifacts. | | -0 :warning: | patch | 118m 26s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 27s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 21m 9s | | the patch passed | | +1 :green_heart: | compile | 23m 44s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | -1 :x: | cc | 23m 44s | [/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2567/21/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt) | root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 generated 31 new + 381 unchanged - 31 fixed = 412 total (was 412) | | +1 :green_heart: | golang | 23m 44s | | the patch passed | | +1 :green_heart: | javac | 23m 44s | | the patch passed | | +1 :green_heart: | compile | 19m 41s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | -1 :x: | cc | 19m 41s | [/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2567/21/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt) | root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 generated 33 new + 379 unchanged - 33 fixed = 412 total (was 412) | | +1 :green_heart: | golang | 19m 41s | | the patch passed | | +1 :green_heart: | javac | 19m 41s | | the patch passed | | +1 :green_heart: | mvnsite | 21m 13s | | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 14m 14s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 692m 58s | [/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2567/21/artifact/out/patch-unit-root.txt) | root in the patch passed. 
| | +1 :green_heart: | asflicense | 1m 33s | | The patch does not generate ASF License warnings. | | | | 916m 0s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks | | | hadoop.hdfs.server.balancer.TestBalancer | | | hadoop.tools.dynamometer.TestDynamometerInfra | | | hadoop.yarn.service.TestYarnNativeServices | | | hadoop.yarn.server.resourcemanager.TestRMRestart | | | hadoop.yarn.client.api.impl.TestAMRMClient | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base:
[jira] [Work logged] (HDFS-15740) Make basename cross-platform
[ https://issues.apache.org/jira/browse/HDFS-15740?focusedWorklogId=543026=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543026 ] ASF GitHub Bot logged work on HDFS-15740: - Author: ASF GitHub Bot Created on: 27/Jan/21 19:38 Start Date: 27/Jan/21 19:38 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2567: URL: https://github.com/apache/hadoop/pull/2567#issuecomment-768528104 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 31s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | -1 :x: | test4tests | 0m 0s | | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 14m 16s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 21m 42s | | trunk passed | | +1 :green_heart: | compile | 22m 3s | | trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | +1 :green_heart: | compile | 19m 32s | | trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | +1 :green_heart: | mvnsite | 26m 11s | | trunk passed | | +1 :green_heart: | shadedclient | 118m 18s | | branch has no errors when building and testing our client artifacts. | | -0 :warning: | patch | 118m 39s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. 
| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 27s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 21m 42s | | the patch passed | | +1 :green_heart: | compile | 23m 28s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 | | -1 :x: | cc | 23m 28s | [/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2567/20/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt) | root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 generated 51 new + 361 unchanged - 51 fixed = 412 total (was 412) | | +1 :green_heart: | golang | 23m 28s | | the patch passed | | +1 :green_heart: | javac | 23m 28s | | the patch passed | | +1 :green_heart: | compile | 20m 2s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 | | -1 :x: | cc | 20m 2s | [/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2567/20/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt) | root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 generated 42 new + 370 unchanged - 42 fixed = 412 total (was 412) | | +1 :green_heart: | golang | 20m 2s | | the patch passed | | +1 :green_heart: | javac | 20m 2s | | the patch passed | | +1 :green_heart: | mvnsite | 21m 42s | | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | | The patch has no whitespace issues. | | +1 :green_heart: | shadedclient | 13m 54s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 692m 47s | [/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2567/20/artifact/out/patch-unit-root.txt) | root in the patch passed. 
| | +1 :green_heart: | asflicense | 1m 33s | | The patch does not generate ASF License warnings. | | | | 916m 49s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks | | | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure | | | hadoop.hdfs.server.balancer.TestBalancer | | | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized | | | hadoop.tools.dynamometer.TestDynamometerInfra | | | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer | | | hadoop.yarn.server.router.clientrm.TestFederationClientInterceptor | | Subsystem | Report/Notes |
[jira] [Work logged] (HDFS-15791) Possible Resource Leak in FSImageFormatProtobuf
[ https://issues.apache.org/jira/browse/HDFS-15791?focusedWorklogId=542937=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-542937 ] ASF GitHub Bot logged work on HDFS-15791: - Author: ASF GitHub Bot Created on: 27/Jan/21 17:15 Start Date: 27/Jan/21 17:15 Worklog Time Spent: 10m Work Description: Nargeshdb commented on a change in pull request #2652: URL: https://github.com/apache/hadoop/pull/2652#discussion_r565486885 ## File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java ## @@ -269,14 +269,20 @@ public InputStream getInputStreamForSection(FileSummary.Section section, String compressionCodec) throws IOException { FileInputStream fin = new FileInputStream(filename); - FileChannel channel = fin.getChannel(); - channel.position(section.getOffset()); - InputStream in = new BufferedInputStream(new LimitInputStream(fin, - section.getLength())); + try { Review comment: Thanks a lot for the review. I really appreciate it. I was wondering if I need to do anything else to get the change merged. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 542937) Time Spent: 1h 40m (was: 1.5h) > Possible Resource Leak in FSImageFormatProtobuf > --- > > Key: HDFS-15791 > URL: https://issues.apache.org/jira/browse/HDFS-15791 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Narges Shadab >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > We noticed a possible resource leak > [here|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java#L271]. 
> If an I/O error occurs at line > [273|https://github.com/apache/hadoop/blob/06a5d3437f68546207f18d23fe527895920c756a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java#L273] > or > [277|https://github.com/apache/hadoop/blob/06a5d3437f68546207f18d23fe527895920c756a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java#L277], > {{fin}} remains open since the exception isn't caught locally, and there is > no way for any caller to close the FileInputStream > I'll submit a pull request to fix it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
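The leak-fix pattern under review can be sketched as follows — a hedged, simplified version; openSection and its parameters are illustrative, not the actual FSImageFormatProtobuf method signature:

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SectionOpener {
    // If positioning the channel or building the wrapper stream throws,
    // the raw FileInputStream must be closed before the exception
    // propagates; otherwise the file descriptor leaks exactly as
    // described in the issue above (no caller can reach fin to close it).
    static InputStream openSection(String filename, long offset)
            throws IOException {
        FileInputStream fin = new FileInputStream(filename);
        try {
            fin.getChannel().position(offset); // may throw IOException
            return new BufferedInputStream(fin);
        } catch (IOException | RuntimeException e) {
            fin.close(); // release the descriptor on the failure path
            throw e;
        }
    }
}
```

On success, ownership of the open descriptor transfers to the returned stream, so closing the BufferedInputStream closes fin as well.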
[jira] [Comment Edited] (HDFS-15794) IBR and FBR use different queues to load data.
[ https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17272999#comment-17272999 ] Kihwal Lee edited comment on HDFS-15794 at 1/27/21, 4:58 PM: - {quote}The problem here is that when the NameNode is blocked in processing the IBR, the FBR requested by the DN from the NameNode will be affected. Similarly, when the NameNode processing FBR is blocked. {quote} The serial processing of IBR and FBR is not a side-effect of the way a data structure is used (single queue). In current namespace and block manager design, each report is processed with the fsn write lock held. The queue made it possible to process multiple IBRs under one lock, thus increasing throughput. Having multiple queues for IBRs and FBRs won't help with concurrency. In fact, it will complicate things, as it needs to maintain certain processing order across multiple queues. In order to make a meaningful performance improvement, we have to make NN perform concurrent block report processing. was (Author: kihwal): {quote}The problem here is that when the NameNode is blocked in processing the IBR, the FBR requested by the DN from the NameNode will be affected. Similarly, when the NameNode processing FBR is blocked. {quote} The serial processing of IBR and FBR is not a side-effect of way a data structure is used (single queue). In current namespace and block manager design, each report is processed with the fsn write lock held. The queue made it possible to process multiple IBRs under one lock, thus increasing throughput. Having multiple queues for IBRs and FBRs won't help with concurrency. In fact, it will complicate things, as it needs to maintain certain processing order across multiple queues. In order to make a meaningful performance improvement, we have to make NN perform concurrent block report processing. > IBR and FBR use different queues to load data. 
> -- > > Key: HDFS-15794 > URL: https://issues.apache.org/jira/browse/HDFS-15794 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Major > > When DataNode reports data to NameNode, IBR and FBR are included here. > After the NameNode receives the DataNode request, it temporarily stores the > data in a queue, here it refers to > BlockManager#BlockReportProcessingThread#queue.
> NameNodeRpcServer#blockReport()
> for (int r = 0; r < reports.length; r++) {
>   final BlockListAsLongs blocks = reports[r].getBlocks();
>   final int index = r;
>   noStaleStorages = bm.runBlockOp(() ->
>       bm.processReport(nodeReg, reports[index].getStorage(),
>           blocks, context));
> }
> NameNodeRpcServer#blockReceivedAndDeleted()
> for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
>   bm.enqueueBlockOp(new Runnable() {
>     @Override
>     public void run() {
>       try {
>         namesystem.processIncrementalBlockReport(nodeReg, r);
>       } catch (Exception ex) {
>         // usually because the node is unregistered/dead. next heartbeat
>         // will correct the problem
>         blockStateChangeLog.error(
>             "*BLOCK* NameNode.blockReceivedAndDeleted: "
>             + "failed from " + nodeReg + ":" + ex.getMessage());
>       }
>     }
>   });
> }
> The problem here is that when the NameNode is blocked processing IBRs, the FBRs sent by the DNs to the NameNode are delayed as well; similarly, IBR handling is delayed while the NameNode is blocked processing an FBR. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-15794) IBR and FBR use different queues to load data.
[ https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17272999#comment-17272999 ] Kihwal Lee commented on HDFS-15794: --- {quote}The problem here is that when the NameNode is blocked in processing the IBR, the FBR requested by the DN from the NameNode will be affected. Similarly, when the NameNode processing FBR is blocked. {quote} The serial processing of IBR and FBR is not a side-effect of the way a data structure is used (single queue). In the current namespace and block manager design, each report is processed with the fsn write lock held. The queue made it possible to process multiple IBRs under one lock, thus increasing throughput. Having multiple queues for IBRs and FBRs won't help with concurrency. In fact, it will complicate things, as it needs to maintain a certain processing order across multiple queues. In order to make a meaningful performance improvement, we have to make the NN perform concurrent block report processing. > IBR and FBR use different queues to load data. > -- > > Key: HDFS-15794 > URL: https://issues.apache.org/jira/browse/HDFS-15794 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Major > > When DataNode reports data to NameNode, IBR and FBR are included here. > After the NameNode receives the DataNode request, it temporarily stores the > data in a queue, here it refers to > BlockManager#BlockReportProcessingThread#queue. 
> NameNodeRpcServer#blockReport()
> for (int r = 0; r < reports.length; r++) {
>   final BlockListAsLongs blocks = reports[r].getBlocks();
>   final int index = r;
>   noStaleStorages = bm.runBlockOp(() ->
>       bm.processReport(nodeReg, reports[index].getStorage(),
>           blocks, context));
> }
> NameNodeRpcServer#blockReceivedAndDeleted()
> for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
>   bm.enqueueBlockOp(new Runnable() {
>     @Override
>     public void run() {
>       try {
>         namesystem.processIncrementalBlockReport(nodeReg, r);
>       } catch (Exception ex) {
>         // usually because the node is unregistered/dead. next heartbeat
>         // will correct the problem
>         blockStateChangeLog.error(
>             "*BLOCK* NameNode.blockReceivedAndDeleted: "
>             + "failed from " + nodeReg + ":" + ex.getMessage());
>       }
>     }
>   });
> }
> The problem here is that when the NameNode is blocked processing IBRs, the FBRs sent by the DNs to the NameNode are delayed as well; similarly, IBR handling is delayed while the NameNode is blocked processing an FBR. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
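Kihwal's point — that the single consumer thread holding the FSN write lock, not the number of queues, bounds throughput — can be sketched as follows (all names here are illustrative, not the actual BlockManager#BlockReportProcessingThread code):

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReportQueueSketch {
    private final ConcurrentLinkedQueue<Runnable> queue =
        new ConcurrentLinkedQueue<>();
    private final ReentrantReadWriteLock fsnLock =
        new ReentrantReadWriteLock();

    void enqueue(Runnable reportOp) {
        queue.add(reportOp); // IBR and FBR ops share this one queue
    }

    // Single consumer: drains all queued ops under ONE write-lock
    // acquisition (the batching described above). A second queue would
    // not add concurrency, since every op still needs this same write
    // lock; it would only complicate ordering between the two queues.
    int drainOnce() {
        int processed = 0;
        fsnLock.writeLock().lock();
        try {
            Runnable op;
            while ((op = queue.poll()) != null) {
                op.run();
                processed++;
            }
        } finally {
            fsnLock.writeLock().unlock();
        }
        return processed;
    }

    public static void main(String[] args) {
        ReportQueueSketch s = new ReportQueueSketch();
        s.enqueue(() -> System.out.println("processReport (FBR)"));
        s.enqueue(() -> System.out.println("processIncrementalBlockReport (IBR)"));
        System.out.println("processed " + s.drainOnce() + " ops under one lock hold");
    }
}
```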
[jira] [Commented] (HDFS-15793) Add command to DFSAdmin for Balancer max concurrent threads
[ https://issues.apache.org/jira/browse/HDFS-15793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17272871#comment-17272871 ] Hadoop QA commented on HDFS-15793: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Logfile || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 34m 40s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || || | {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 1s{color} | {color:green}{color} | {color:green} No case conflicting files found. {color} | | {color:blue}0{color} | {color:blue} buf {color} | {color:blue} 0m 0s{color} | {color:blue}{color} | {color:blue} buf was not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green}{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} {color} | {color:green} 0m 0s{color} | {color:green}test4tests{color} | {color:green} The patch appears to include 1 new or modified test files. 
{color} | || || || || {color:brown} trunk Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 2m 38s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 11s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 41s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 20s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 34s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 2m 58s{color} | {color:green}{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 18m 39s{color} | {color:green}{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 16s{color} | {color:green}{color} | {color:green} trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 4s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} | | {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 1m 13s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 32s{color} | {color:green}{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || || | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 33s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 38s{color} | {color:green}{color} | {color:green} the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 4m 38s{color} | {color:red}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/430/artifact/out/diff-compile-cc-hadoop-hdfs-project-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt{color} | {color:red} hadoop-hdfs-project-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 generated 7 new + 86 unchanged - 7 fixed = 93 total (was 93) {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 38s{color} | {color:green}{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 31s{color} | {color:green}{color} | {color:green} the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} | | {color:red}-1{color} | {color:red} cc {color} | {color:red} 4m 31s{color} |
[jira] [Comment Edited] (HDFS-15794) IBR and FBR use different queues to load data.
[ https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17272840#comment-17272840 ] JiangHua Zhu edited comment on HDFS-15794 at 1/27/21, 1:04 PM: --- [~weichiu] , thank you for your reply. I noticed [HDFS-14997|https://issues.apache.org/jira/browse/HDFS-14997]. The improvements made here are very meaningful. But what I want to explain here is that HDFS-14997 is about the improvement of DataNode. What I want to express is that something meaningful can be done on the NameNode side. When NN processes IBR and FBR data, it can use different queues for processing, instead of sharing one queue (BlockManager#BlockReportProcessingThread#queue). This will benefit the NN's capabilities. was (Author: jianghuazhu): [~weichiu] , thank you for your reply. I noticed 14997. The improvements made here are very meaningful. But what I want to explain here is that [HDFS-14997|https://issues.apache.org/jira/browse/HDFS-14997] is about the improvement of DataNode. What I want to express is that something meaningful can be done on the NameNode side. When NN processes IBR and FBR data, it can use different queues for processing, instead of sharing one queue (BlockManager#BlockReportProcessingThread#queue). This will benefit the NN's capabilities. > IBR and FBR use different queues to load data. > -- > > Key: HDFS-15794 > URL: https://issues.apache.org/jira/browse/HDFS-15794 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Major > > When DataNode reports data to NameNode, IBR and FBR are included here. > After the NameNode receives the DataNode request, it temporarily stores the > data in a queue, here it refers to > BlockManager#BlockReportProcessingThread#queue. 
> NameNodeRpcServer#blockReport()
> for (int r = 0; r < reports.length; r++) {
>   final BlockListAsLongs blocks = reports[r].getBlocks();
>   final int index = r;
>   noStaleStorages = bm.runBlockOp(() ->
>       bm.processReport(nodeReg, reports[index].getStorage(),
>           blocks, context));
> }
> NameNodeRpcServer#blockReceivedAndDeleted()
> for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
>   bm.enqueueBlockOp(new Runnable() {
>     @Override
>     public void run() {
>       try {
>         namesystem.processIncrementalBlockReport(nodeReg, r);
>       } catch (Exception ex) {
>         // usually because the node is unregistered/dead. next heartbeat
>         // will correct the problem
>         blockStateChangeLog.error(
>             "*BLOCK* NameNode.blockReceivedAndDeleted: "
>                 + "failed from " + nodeReg + ":" + ex.getMessage());
>       }
>     }
>   });
> }
> The problem is that when the NameNode is blocked processing IBRs, the FBRs
> sent by DataNodes are also delayed; similarly, IBRs are delayed when the
> NameNode is blocked processing an FBR.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
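The proposal above (splitting the shared BlockManager#BlockReportProcessingThread#queue into one queue per report type, so a stalled full block report cannot starve incremental reports) can be sketched roughly as follows. This is a hypothetical illustration, not Hadoop's actual API: the class name, method names, and queue capacities are invented for the sketch, and the enqueued Runnables stand in for the real runBlockOp/enqueueBlockOp bodies.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of the two-queue proposal (not Hadoop's real API):
// IBR and FBR operations each get their own bounded queue and drainer
// thread, so blocking in one report type does not delay the other.
class SplitReportQueues {
  // Capacities are illustrative; IBRs are small and frequent, FBRs large and rare.
  private final BlockingQueue<Runnable> ibrQueue = new LinkedBlockingQueue<>(1024);
  private final BlockingQueue<Runnable> fbrQueue = new LinkedBlockingQueue<>(64);

  void start() {
    startDrainer(ibrQueue, "IBR-processor");
    startDrainer(fbrQueue, "FBR-processor");
  }

  // Non-blocking enqueue; returns false when the queue is full,
  // mirroring how enqueueBlockOp offers into its queue.
  boolean enqueueIbr(Runnable op) { return ibrQueue.offer(op); }
  boolean enqueueFbr(Runnable op) { return fbrQueue.offer(op); }

  private void startDrainer(BlockingQueue<Runnable> q, String name) {
    Thread t = new Thread(() -> {
      while (!Thread.currentThread().isInterrupted()) {
        try {
          q.take().run();  // block until an op is available, then run it
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();  // shut down cleanly
        } catch (Exception e) {
          // like BlockReportProcessingThread: swallow, log, keep draining
        }
      }
    }, name);
    t.setDaemon(true);
    t.start();
  }
}
```

With this shape, a long-running FBR op only occupies the FBR drainer thread; IBR ops queued behind it in today's shared queue would instead keep flowing through the IBR drainer.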
[jira] [Commented] (HDFS-15794) IBR and FBR use different queues to load data.
[ https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17272840#comment-17272840 ]

JiangHua Zhu commented on HDFS-15794:
-
[~weichiu], thank you for your reply. I noticed HDFS-14997; the improvements made there are very meaningful. However, HDFS-14997 improves the DataNode side, while my point is that something meaningful can also be done on the NameNode side: when the NN processes IBR and FBR data, it could use separate queues instead of sharing one queue (BlockManager#BlockReportProcessingThread#queue). This would benefit the NN's processing capability.

> IBR and FBR use different queues to load data.
> --
>
> Key: HDFS-15794
> URL: https://issues.apache.org/jira/browse/HDFS-15794
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: JiangHua Zhu
> Assignee: JiangHua Zhu
> Priority: Major
>
> When a DataNode reports data to the NameNode, this includes both IBRs and FBRs.
> After the NameNode receives the DataNode request, it temporarily stores the
> data in a queue, namely BlockManager#BlockReportProcessingThread#queue.
> NameNodeRpcServer#blockReport()
> for (int r = 0; r < reports.length; r++) {
>   final BlockListAsLongs blocks = reports[r].getBlocks();
>   final int index = r;
>   noStaleStorages = bm.runBlockOp(() ->
>       bm.processReport(nodeReg, reports[index].getStorage(),
>           blocks, context));
> }
> NameNodeRpcServer#blockReceivedAndDeleted()
> for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
>   bm.enqueueBlockOp(new Runnable() {
>     @Override
>     public void run() {
>       try {
>         namesystem.processIncrementalBlockReport(nodeReg, r);
>       } catch (Exception ex) {
>         // usually because the node is unregistered/dead. next heartbeat
>         // will correct the problem
>         blockStateChangeLog.error(
>             "*BLOCK* NameNode.blockReceivedAndDeleted: "
>                 + "failed from " + nodeReg + ":" + ex.getMessage());
>       }
>     }
>   });
> }
> The problem is that when the NameNode is blocked processing IBRs, the FBRs
> sent by DataNodes are also delayed; similarly, IBRs are delayed when the
> NameNode is blocked processing an FBR.
[jira] [Commented] (HDFS-15794) IBR and FBR use different queues to load data.
[ https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17272795#comment-17272795 ]

Wei-Chiu Chuang commented on HDFS-15794:
-
I think we have made quite a few improvements in this area recently. One of them is HDFS-14997.

> IBR and FBR use different queues to load data.
> --
>
> Key: HDFS-15794
> URL: https://issues.apache.org/jira/browse/HDFS-15794
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: JiangHua Zhu
> Assignee: JiangHua Zhu
> Priority: Major
>
> When a DataNode reports data to the NameNode, this includes both IBRs and FBRs.
> After the NameNode receives the DataNode request, it temporarily stores the
> data in a queue, namely BlockManager#BlockReportProcessingThread#queue.
> NameNodeRpcServer#blockReport()
> for (int r = 0; r < reports.length; r++) {
>   final BlockListAsLongs blocks = reports[r].getBlocks();
>   final int index = r;
>   noStaleStorages = bm.runBlockOp(() ->
>       bm.processReport(nodeReg, reports[index].getStorage(),
>           blocks, context));
> }
> NameNodeRpcServer#blockReceivedAndDeleted()
> for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
>   bm.enqueueBlockOp(new Runnable() {
>     @Override
>     public void run() {
>       try {
>         namesystem.processIncrementalBlockReport(nodeReg, r);
>       } catch (Exception ex) {
>         // usually because the node is unregistered/dead. next heartbeat
>         // will correct the problem
>         blockStateChangeLog.error(
>             "*BLOCK* NameNode.blockReceivedAndDeleted: "
>                 + "failed from " + nodeReg + ":" + ex.getMessage());
>       }
>     }
>   });
> }
> The problem is that when the NameNode is blocked processing IBRs, the FBRs
> sent by DataNodes are also delayed; similarly, IBRs are delayed when the
> NameNode is blocked processing an FBR.
[jira] [Work logged] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly
[ https://issues.apache.org/jira/browse/HDFS-15714?focusedWorklogId=542754=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-542754 ] ASF GitHub Bot logged work on HDFS-15714: - Author: ASF GitHub Bot Created on: 27/Jan/21 10:32 Start Date: 27/Jan/21 10:32 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on pull request #2655: URL: https://github.com/apache/hadoop/pull/2655#issuecomment-768192307 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 1m 27s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 4s | | No case conflicting files found. | | +0 :ok: | buf | 0m 1s | | buf was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | | 0m 0s | [test4tests](test4tests) | The patch appears to include 79 new or modified test files. | _ HDFS-15714 Compile Tests _ | | +0 :ok: | mvndep | 13m 53s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 23m 50s | | HDFS-15714 passed | | +1 :green_heart: | compile | 21m 54s | | HDFS-15714 passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 | | +1 :green_heart: | compile | 18m 22s | | HDFS-15714 passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 | | +1 :green_heart: | checkstyle | 4m 9s | | HDFS-15714 passed | | +1 :green_heart: | mvnsite | 6m 3s | | HDFS-15714 passed | | +1 :green_heart: | shadedclient | 27m 51s | | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 4m 30s | | HDFS-15714 passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 | | +1 :green_heart: | javadoc | 5m 53s | | HDFS-15714 passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 | | +0 :ok: | spotbugs | 0m 46s | | Used deprecated FindBugs config; considering switching to SpotBugs. 
| | +1 :green_heart: | findbugs | 11m 26s | | HDFS-15714 passed | _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 26s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 4m 31s | | the patch passed | | +1 :green_heart: | compile | 21m 7s | | the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 | | -1 :x: | cc | 21m 7s | [/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2655/1/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) | root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 generated 30 new + 142 unchanged - 30 fixed = 172 total (was 172) | | -1 :x: | javac | 21m 7s | [/diff-compile-javac-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2655/1/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt) | root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 generated 67 new + 2006 unchanged - 27 fixed = 2073 total (was 2033) | | +1 :green_heart: | compile | 22m 21s | | the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 | | -1 :x: | cc | 22m 21s | [/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2655/1/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 generated 33 new + 139 unchanged - 33 fixed = 172 total (was 172) | | -1 :x: | javac | 22m 21s | 
[/diff-compile-javac-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2655/1/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt) | root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 generated 67 new + 1901 unchanged - 27 fixed = 1968 total (was 1928) | | -0 :warning: | checkstyle | 4m 43s | [/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2655/1/artifact/out/diff-checkstyle-root.txt) | root: The patch generated 144 new + 4280 unchanged - 35 fixed = 4424 total (was 4315) | | +1 :green_heart: | mvnsite | 9m 18s | | the patch passed | | -1 :x: | whitespace | 0m
[jira] [Comment Edited] (HDFS-15794) IBR and FBR use different queues to load data.
[ https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17272716#comment-17272716 ] JiangHua Zhu edited comment on HDFS-15794 at 1/27/21, 9:45 AM:
---
When a DataNode sends IBRs and FBRs to the NameNode, the NameNode could process them with separate queues (in BlockManager). This can improve the NameNode's performance in handling these two types of requests. [~weichiu] [~elgoiri] Do you have different ideas?

was (Author: jianghuazhu):
When a DataNode sends IBRs and FBRs to the NameNode, the NameNode could process them with separate queues (in BlockManager). This can improve the NameNode's performance in handling these two types of requests. [~weichiu] [~elgoiri] Do you have different opinions?

> IBR and FBR use different queues to load data.
> ----------------------------------------------
>
>                 Key: HDFS-15794
>                 URL: https://issues.apache.org/jira/browse/HDFS-15794
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: JiangHua Zhu
>            Assignee: JiangHua Zhu
>            Priority: Major
>
> When a DataNode reports data to the NameNode, both IBRs (incremental block reports) and FBRs (full block reports) are involved.
> After the NameNode receives such a request, it temporarily stores the work in a single queue, namely BlockManager#BlockReportProcessingThread#queue.
> NameNodeRpcServer#blockReport():
> for (int r = 0; r < reports.length; r++) {
>   final BlockListAsLongs blocks = reports[r].getBlocks();
>   final int index = r;
>   noStaleStorages = bm.runBlockOp(() ->
>       bm.processReport(nodeReg, reports[index].getStorage(),
>           blocks, context));
> }
> NameNodeRpcServer#blockReceivedAndDeleted():
> for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
>   bm.enqueueBlockOp(new Runnable() {
>     @Override
>     public void run() {
>       try {
>         namesystem.processIncrementalBlockReport(nodeReg, r);
>       } catch (Exception ex) {
>         // usually because the node is unregistered/dead. next heartbeat
>         // will correct the problem
>         blockStateChangeLog.error(
>             "*BLOCK* NameNode.blockReceivedAndDeleted: "
>                 + "failed from " + nodeReg + ": " + ex.getMessage());
>       }
>     }
>   });
> }
> The problem is that when the NameNode is blocked processing IBRs, FBRs sent by DataNodes are delayed as well. Similarly, IBR processing is affected when the NameNode is blocked processing an FBR.
--
This message was sent by Atlassian Jira (v8.3.4#803005)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
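The separation proposed above can be sketched with two independent queues, each drained by its own worker thread, so a slow full report cannot starve incremental reports. This is a minimal illustration, not the actual BlockManager change: the class and method names (ReportDispatcher, enqueueIbr, enqueueFbr) are hypothetical, and the real BlockReportProcessingThread has additional batching and back-pressure logic.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch of the HDFS-15794 proposal: IBRs and FBRs get
 * their own queues and worker threads, so a slow FBR does not block IBRs.
 */
public class ReportDispatcher {
  private final BlockingQueue<Runnable> ibrQueue = new ArrayBlockingQueue<>(1024);
  private final BlockingQueue<Runnable> fbrQueue = new ArrayBlockingQueue<>(1024);

  public ReportDispatcher() {
    worker(ibrQueue, "ibr-processor").start();
    worker(fbrQueue, "fbr-processor").start();
  }

  // One daemon thread per queue, draining ops one at a time.
  private static Thread worker(BlockingQueue<Runnable> q, String name) {
    Thread t = new Thread(() -> {
      try {
        while (true) {
          q.take().run();
        }
      } catch (InterruptedException ignored) {
        // shutdown path
      }
    }, name);
    t.setDaemon(true);
    return t;
  }

  public void enqueueIbr(Runnable op) throws InterruptedException {
    ibrQueue.put(op);
  }

  public void enqueueFbr(Runnable op) throws InterruptedException {
    fbrQueue.put(op);
  }

  public static void main(String[] args) throws Exception {
    ReportDispatcher d = new ReportDispatcher();
    AtomicInteger ibrProcessed = new AtomicInteger();
    CountDownLatch done = new CountDownLatch(1);

    // A slow FBR occupies only its own worker thread...
    d.enqueueFbr(() -> {
      try { Thread.sleep(500); } catch (InterruptedException ignored) { }
    });
    // ...while IBRs keep flowing through the other queue.
    for (int i = 0; i < 10; i++) {
      d.enqueueIbr(ibrProcessed::incrementAndGet);
    }
    d.enqueueIbr(done::countDown);

    done.await(2, TimeUnit.SECONDS);
    System.out.println("IBRs processed during slow FBR: " + ibrProcessed.get());
  }
}
```

With the single shared queue used today, the ten IBR operations would sit behind the 500 ms FBR; with two queues they complete immediately, which is the performance argument the comment makes.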
[jira] [Commented] (HDFS-15794) IBR and FBR use different queues to load data.
[ https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17272716#comment-17272716 ] JiangHua Zhu commented on HDFS-15794:
-
When a DataNode sends IBRs and FBRs to the NameNode, the NameNode could process them with separate queues (in BlockManager). This can improve the NameNode's performance in handling these two types of requests. [~weichiu] [~elgoiri] Do you have different opinions?
[jira] [Created] (HDFS-15794) IBR and FBR use different queues to load data.
JiangHua Zhu created HDFS-15794:
---
             Summary: IBR and FBR use different queues to load data.
                 Key: HDFS-15794
                 URL: https://issues.apache.org/jira/browse/HDFS-15794
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: namenode
            Reporter: JiangHua Zhu

When a DataNode reports data to the NameNode, both IBRs (incremental block reports) and FBRs (full block reports) are involved.
After the NameNode receives such a request, it temporarily stores the work in a single queue, namely BlockManager#BlockReportProcessingThread#queue.

NameNodeRpcServer#blockReport():
for (int r = 0; r < reports.length; r++) {
  final BlockListAsLongs blocks = reports[r].getBlocks();
  final int index = r;
  noStaleStorages = bm.runBlockOp(() ->
      bm.processReport(nodeReg, reports[index].getStorage(),
          blocks, context));
}

NameNodeRpcServer#blockReceivedAndDeleted():
for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
  bm.enqueueBlockOp(new Runnable() {
    @Override
    public void run() {
      try {
        namesystem.processIncrementalBlockReport(nodeReg, r);
      } catch (Exception ex) {
        // usually because the node is unregistered/dead. next heartbeat
        // will correct the problem
        blockStateChangeLog.error(
            "*BLOCK* NameNode.blockReceivedAndDeleted: "
                + "failed from " + nodeReg + ": " + ex.getMessage());
      }
    }
  });
}

The problem is that when the NameNode is blocked processing IBRs, FBRs sent by DataNodes are delayed as well. Similarly, IBR processing is affected when the NameNode is blocked processing an FBR.
[jira] [Assigned] (HDFS-15794) IBR and FBR use different queues to load data.
[ https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] JiangHua Zhu reassigned HDFS-15794:
---
Assignee: JiangHua Zhu