[jira] [Updated] (HDFS-15795) Returned wrong checksum when reconstruction was failed by exception

2021-01-27 Thread Yushi Hayasaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yushi Hayasaka updated HDFS-15795:
--
Description: 
If the reconstruction task fails with an exception in
StripedBlockChecksumReconstructor, the resulting checksum is wrong because it is
calculated from all blocks except the failed one.
This is caused by catching the exception in an inappropriate way; as a result,
the failed block is not fetched again.
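As an illustration of the failure mode (a hypothetical Java sketch; ChecksumSketch and readBlockChecksum are invented names, not the actual BlockChecksumHelper code), swallowing the per-block exception yields a plausible-looking digest computed over fewer blocks, while propagating it lets the caller notice and re-fetch the failed block:

```java
import java.io.IOException;
import java.util.List;

public class ChecksumSketch {
  // Buggy pattern: the failure is swallowed, so the caller cannot tell
  // that the returned digest covers fewer blocks than requested.
  static long checksumSwallow(List<long[]> blocks) {
    long crc = 0;
    for (long[] block : blocks) {
      try {
        crc ^= readBlockChecksum(block);
      } catch (IOException e) {
        // BUG: continue silently; crc now excludes the failed block
      }
    }
    return crc;
  }

  // Fixed pattern: the exception propagates so the caller can retry
  // or re-fetch the failed block instead of trusting a partial digest.
  static long checksumPropagate(List<long[]> blocks) throws IOException {
    long crc = 0;
    for (long[] block : blocks) {
      crc ^= readBlockChecksum(block);
    }
    return crc;
  }

  // Stand-in for a datanode read: a null "block" simulates an I/O failure.
  static long readBlockChecksum(long[] block) throws IOException {
    if (block == null) {
      throw new IOException("block unreadable");
    }
    long crc = 0;
    for (long b : block) {
      crc = crc * 31 + b;
    }
    return crc;
  }
}
```

With one unreadable block, the swallowing variant returns the same digest as if that block never existed, which is exactly the wrong-checksum symptom; the propagating variant surfaces the failure instead.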

  was:
If the reconstruction task fails with an exception in
StripedBlockChecksumReconstructor, the resulting checksum is wrong because it is
calculated from the blocks without the failed one.
This is caused by catching the exception in an inappropriate way, so we need to
fix it.


> Returned wrong checksum when reconstruction was failed by exception
> ---
>
> Key: HDFS-15795
> URL: https://issues.apache.org/jira/browse/HDFS-15795
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, ec, erasure-coding
>Reporter: Yushi Hayasaka
>Assignee: Yushi Hayasaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If the reconstruction task fails with an exception in
> StripedBlockChecksumReconstructor, the resulting checksum is wrong because it
> is calculated from all blocks except the failed one.
> This is caused by catching the exception in an inappropriate way; as a result,
> the failed block is not fetched again.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15795) Returned wrong checksum when reconstruction was failed by exception

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15795?focusedWorklogId=543401&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543401
 ]

ASF GitHub Bot logged work on HDFS-15795:
-

Author: ASF GitHub Bot
Created on: 28/Jan/21 06:33
Start Date: 28/Jan/21 06:33
Worklog Time Spent: 10m 
  Work Description: crossfire commented on a change in pull request #2657:
URL: https://github.com/apache/hadoop/pull/2657#discussion_r565850014



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockChecksumHelper.java
##
@@ -503,6 +503,7 @@ void compute() throws IOException {
   }
 } catch (IOException e) {

Review comment:
   It may be okay to just remove the catch here instead of rethrowing the 
exception, because it is also handled below:
   
https://github.com/apache/hadoop/blob/f8769e0f4b917d9fda8ff7a9fddb4d755d246a1e/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiver.java#L324





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 543401)
Time Spent: 20m  (was: 10m)

> Returned wrong checksum when reconstruction was failed by exception
> ---
>
> Key: HDFS-15795
> URL: https://issues.apache.org/jira/browse/HDFS-15795
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, ec, erasure-coding
>Reporter: Yushi Hayasaka
>Assignee: Yushi Hayasaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> If the reconstruction task fails with an exception in
> StripedBlockChecksumReconstructor, the resulting checksum is wrong because it
> is calculated from the blocks without the failed one.
> This is caused by catching the exception in an inappropriate way, so we need
> to fix it.






[jira] [Updated] (HDFS-15796) ConcurrentModificationException error happens on NameNode occasionally

2021-01-27 Thread Daniel Ma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Ma updated HDFS-15796:
-
Description: 
A ConcurrentModificationException error occasionally occurs on the NameNode.

 
{code:java}
2021-01-23 20:21:18,107 | ERROR | RedundancyMonitor | RedundancyMonitor thread 
received Runtime exception.  | BlockManager.java:4746
java.util.ConcurrentModificationException
at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
at java.util.ArrayList$Itr.next(ArrayList.java:859)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReconstructionWorkForBlocks(BlockManager.java:1907)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeBlockReconstructionWork(BlockManager.java:1859)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4862)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4729)
at java.lang.Thread.run(Thread.java:748)
{code}
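The trace above is ArrayList's fail-fast iterator detecting a structural modification mid-iteration. A minimal, self-contained Java reproduction of that failure mode (the names here are illustrative, not the actual BlockManager fields) together with the iterator-based fix:

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

public class CmeSketch {
  // Returns true when the for-each loop hits ConcurrentModificationException:
  // removing from the list directly changes modCount, and the iterator's
  // next checkForComodification() call throws, as in the stack trace above.
  static boolean removeDuringForEach(List<Integer> work) {
    try {
      for (Integer w : work) {
        if (w % 2 == 0) {
          work.remove(w);   // structural change behind the iterator's back
        }
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  // Safe variant: structural changes go through the iterator itself.
  static void removeViaIterator(List<Integer> work) {
    for (Iterator<Integer> it = work.iterator(); it.hasNext(); ) {
      if (it.next() % 2 == 0) {
        it.remove();        // legal removal during iteration
      }
    }
  }
}
```

In the NameNode case the modifying code runs on another thread, so the fix there also needs proper synchronization or a concurrent collection, not just Iterator.remove(); the sketch only shows the detection mechanism.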
 

 

  was:
A ConcurrentModificationException error occasionally occurs on the NameNode.

 

!file:///C:/Users/m00425105/AppData/Roaming/eSpace_Desktop/UserData/m00425105/imagefiles/10B02DC2-A9F0-4AE6-949B-92B8F1E9249A.png!


> ConcurrentModificationException error happens on NameNode occasionally
> --
>
> Key: HDFS-15796
> URL: https://issues.apache.org/jira/browse/HDFS-15796
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.1
>Reporter: Daniel Ma
>Priority: Critical
> Fix For: 3.1.1
>
>
> A ConcurrentModificationException error occasionally occurs on the NameNode.
>  
> {code:java}
> 2021-01-23 20:21:18,107 | ERROR | RedundancyMonitor | RedundancyMonitor 
> thread received Runtime exception.  | BlockManager.java:4746
> java.util.ConcurrentModificationException
>   at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
>   at java.util.ArrayList$Itr.next(ArrayList.java:859)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReconstructionWorkForBlocks(BlockManager.java:1907)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeBlockReconstructionWork(BlockManager.java:1859)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4862)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4729)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
>  
>  






[jira] [Commented] (HDFS-15790) Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist

2021-01-27 Thread Vinayakumar B (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273357#comment-17273357
 ] 

Vinayakumar B commented on HDFS-15790:
--

Thanks for reporting this issue [~belugabehr].

Please check the history of HADOOP-13363 for details regarding why and how the 
upgrade was done.

 

I will try to review proposed changes this weekend.

Thanks.

> Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist
> --
>
> Key: HDFS-15790
> URL: https://issues.apache.org/jira/browse/HDFS-15790
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Changing from Protobuf 2 to Protobuf 3 broke some stuff in the Apache Hive 
> project.  This was not an awesome thing to do between minor versions with 
> regard to backwards compatibility for downstream projects.
> Additionally, these two frameworks are not drop-in replacements; they have 
> some differences.  Also, Protobuf 2 is not deprecated or anything, so let us 
> have both protocols available at the same time.  In Hadoop 4.x, Protobuf 2 
> support can be dropped.






[jira] [Assigned] (HDFS-15795) Returned wrong checksum when reconstruction was failed by exception

2021-01-27 Thread Yushi Hayasaka (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yushi Hayasaka reassigned HDFS-15795:
-

Assignee: Yushi Hayasaka

> Returned wrong checksum when reconstruction was failed by exception
> ---
>
> Key: HDFS-15795
> URL: https://issues.apache.org/jira/browse/HDFS-15795
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, ec, erasure-coding
>Reporter: Yushi Hayasaka
>Assignee: Yushi Hayasaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If the reconstruction task fails with an exception in
> StripedBlockChecksumReconstructor, the resulting checksum is wrong because it
> is calculated from the blocks without the failed one.
> This is caused by catching the exception in an inappropriate way, so we need
> to fix it.






[jira] [Created] (HDFS-15796) ConcurrentModificationException error happens on NameNode occasionally

2021-01-27 Thread Daniel Ma (Jira)
Daniel Ma created HDFS-15796:


 Summary: ConcurrentModificationException error happens on NameNode 
occasionally
 Key: HDFS-15796
 URL: https://issues.apache.org/jira/browse/HDFS-15796
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Affects Versions: 3.1.1
Reporter: Daniel Ma
 Fix For: 3.1.1


A ConcurrentModificationException error occasionally occurs on the NameNode.

 

!file:///C:/Users/m00425105/AppData/Roaming/eSpace_Desktop/UserData/m00425105/imagefiles/10B02DC2-A9F0-4AE6-949B-92B8F1E9249A.png!






[jira] [Comment Edited] (HDFS-15162) Optimize frequency of regular block reports

2021-01-27 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273266#comment-17273266
 ] 

JiangHua Zhu edited comment on HDFS-15162 at 1/28/21, 6:21 AM:
---

[~ayushtkn], I noticed your opinion and agree with what you said. When the DN's 
connection to the NN is abnormal, it means that the NN is under pressure or the 
connection failed midway.
 Recently I encountered a problem: after a DN retried its connection to the NN 
many times (for example, 50 times), an exception was thrown. The log is as 
follows:
 2021-01-01 17:55:21,099 [15993307503]-INFO [cluster lifeline to 
/:port:Client$Connection@948]-Retrying connect to server: 
/:port. Already tried 49 time(s) ; retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
 2021-01-01 17:55:21,100 [15993307504]-WARN [cluster lifeline to 
/:port:BPServiceActor$LifelineSender@1008]-IOException in 
LifelineSender for Block pool  (Datanode Uuid ) service to /: 
port
 java.net.ConnectException: Call From / to :port failed on 
connection exception: java.net.ConnectException: Connection refused; For more 
details see: [http://wiki.apache.org/hadoop/ConnectionRefused]
 at sun.reflect.GeneratedConstructorAccessor68.newInstance(Unknown Source)
 at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
 at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:824)
 at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:754)
 at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
 at org.apache.hadoop.ipc.Client.call(Client.java:1453)
 at org.apache.hadoop.ipc.Client.call(Client.java:1363)
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
 at com.sun.proxy.$Proxy21.sendLifeline(Unknown Source)
 at 
org.apache.hadoop.hdfs.protocolPB.DatanodeLifelineProtocolClientSideTranslatorPB.sendLifeline(DatanodeLifelineProtocolClientSideTranslatorPB.java:100)
 at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.sendLifeline(BPServiceActor.java:1074)
 at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.sendLifelineIfDue(BPServiceActor.java:1058)
 at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.run(BPServiceActor.java:1003)

FBR should not be triggered at this time.


was (Author: jianghuazhu):
[~ayushtkn], I noticed your opinion and agree with what you said. When the DN's 
connection to the NN is abnormal, it means that the NN is under pressure or the 
connection failed midway.
Recently I encountered a problem: after a DN retried its connection to the NN 
many times (for example, 50 times), an exception was thrown. The log is as 
follows:
2021-01-01 17:55:21,099 [15993307503]-INFO [cluster lifeline to 
/:port:Client$Connection@948]-Retrying connect to server: 
/:port. Already tried 49 time(s) ; retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2021-01-12 17:55:21,100 [15993307504]-WARN [cluster lifeline to 
/:port:BPServiceActor$LifelineSender@1008]-IOException in 
LifelineSender for Block pool  (Datanode Uuid ) service to /: 
port
java.net.ConnectException: Call From / to :port failed on 
connection exception: java.net.ConnectException: Connection refused; For more 
details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor68.newInstance(Unknown Source)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:824)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:754)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
at org.apache.hadoop.ipc.Client.call(Client.java:1453)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy21.sendLifeline(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.DatanodeLifelineProtocolClientSideTranslatorPB.sendLifeline(DatanodeLifelineProtocolClientSideTranslatorPB.java:100)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.sendLifeline(BPServiceActor.java:1074)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.sendLifelineIfDue(BPServiceActor.java:1058)
at 

[jira] [Work logged] (HDFS-15795) Returned wrong checksum when reconstruction was failed by exception

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15795?focusedWorklogId=543361&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543361
 ]

ASF GitHub Bot logged work on HDFS-15795:
-

Author: ASF GitHub Bot
Created on: 28/Jan/21 04:11
Start Date: 28/Jan/21 04:11
Worklog Time Spent: 10m 
  Work Description: crossfire opened a new pull request #2657:
URL: https://github.com/apache/hadoop/pull/2657


   …ed by exception.
   
   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.)
   For more details, please see 
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
   





Issue Time Tracking
---

Worklog Id: (was: 543361)
Remaining Estimate: 0h
Time Spent: 10m

> Returned wrong checksum when reconstruction was failed by exception
> ---
>
> Key: HDFS-15795
> URL: https://issues.apache.org/jira/browse/HDFS-15795
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, ec, erasure-coding
>Reporter: Yushi Hayasaka
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If the reconstruction task fails with an exception in
> StripedBlockChecksumReconstructor, the resulting checksum is wrong because it
> is calculated from the blocks without the failed one.
> This is caused by catching the exception in an inappropriate way, so we need
> to fix it.






[jira] [Updated] (HDFS-15795) Returned wrong checksum when reconstruction was failed by exception

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-15795:
--
Labels: pull-request-available  (was: )

> Returned wrong checksum when reconstruction was failed by exception
> ---
>
> Key: HDFS-15795
> URL: https://issues.apache.org/jira/browse/HDFS-15795
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, ec, erasure-coding
>Reporter: Yushi Hayasaka
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> If the reconstruction task fails with an exception in
> StripedBlockChecksumReconstructor, the resulting checksum is wrong because it
> is calculated from the blocks without the failed one.
> This is caused by catching the exception in an inappropriate way, so we need
> to fix it.






[jira] [Created] (HDFS-15795) Returned wrong checksum when reconstruction was failed by exception

2021-01-27 Thread Yushi Hayasaka (Jira)
Yushi Hayasaka created HDFS-15795:
-

 Summary: Returned wrong checksum when reconstruction was failed by 
exception
 Key: HDFS-15795
 URL: https://issues.apache.org/jira/browse/HDFS-15795
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode, ec, erasure-coding
Reporter: Yushi Hayasaka


If the reconstruction task fails with an exception in
StripedBlockChecksumReconstructor, the resulting checksum is wrong because it is
calculated from the blocks without the failed one.
This is caused by catching the exception in an inappropriate way, so we need to
fix it.






[jira] [Work logged] (HDFS-15740) Make basename cross-platform

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15740?focusedWorklogId=543343&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543343
 ]

ASF GitHub Bot logged work on HDFS-15740:
-

Author: ASF GitHub Bot
Created on: 28/Jan/21 03:03
Start Date: 28/Jan/21 03:03
Worklog Time Spent: 10m 
  Work Description: GauthamBanasandra commented on pull request #2567:
URL: https://github.com/apache/hadoop/pull/2567#issuecomment-768764746


   @aajisaka Could you please review my PR?





Issue Time Tracking
---

Worklog Id: (was: 543343)
Remaining Estimate: 16h 10m  (was: 16h 20m)
Time Spent: 7h 50m  (was: 7h 40m)

> Make basename cross-platform
> 
>
> Key: HDFS-15740
> URL: https://issues.apache.org/jira/browse/HDFS-15740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: libhdfs++
>Affects Versions: 3.4.0
>Reporter: Gautham Banasandra
>Assignee: Gautham Banasandra
>Priority: Major
>  Labels: pull-request-available
>   Original Estimate: 24h
>  Time Spent: 7h 50m
>  Remaining Estimate: 16h 10m
>
> The *basename* function isn't available on Visual Studio 2019 compiler. We 
> need to make it cross platform.






[jira] [Commented] (HDFS-15162) Optimize frequency of regular block reports

2021-01-27 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273268#comment-17273268
 ] 

JiangHua Zhu commented on HDFS-15162:
-

At this point, I understand that two things should be done:
1. FBR should run only after the DN has connected to the NN normally;
2. After the abnormal condition clears, FBR can run at the next fixed interval, 
or the last unfinished FBR can be resumed.

> Optimize frequency of regular block reports
> ---
>
> Key: HDFS-15162
> URL: https://issues.apache.org/jira/browse/HDFS-15162
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Critical
>
> Avoid sending block report at regular interval, if there is no failover, 
> DiskError or any exception encountered in connecting to the Namenode.
> This JIRA intends to limit the regular block reports to be sent only in case 
> of the above scenarios and during re-registration  of datanode, to eliminate 
> the overhead of processing BlockReports at Namenode in case of huge clusters.
> *Eg.* If a block report was sent at  hours and the next was scheduled at 
> 0600 hours if there is no above mentioned scenario, it will skip sending the 
> BR, and schedule it to next 1200 hrs. if something of such sort happens 
> between 06:- 12: it would send the BR normally.
> *NOTE*: This would be optional and can be turned off by default. Would add a 
> configuration to enable this.






[jira] [Commented] (HDFS-15162) Optimize frequency of regular block reports

2021-01-27 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273266#comment-17273266
 ] 

JiangHua Zhu commented on HDFS-15162:
-

[~ayushtkn], I noticed your opinion and agree with what you said. When the DN's 
connection to the NN is abnormal, it means that the NN is under pressure or the 
connection failed midway.
Recently I encountered a problem: after a DN retried its connection to the NN 
many times (for example, 50 times), an exception was thrown. The log is as 
follows:
2021-01-01 17:55:21,099 [15993307503]-INFO [cluster lifeline to 
/:port:Client$Connection@948]-Retrying connect to server: 
/:port. Already tried 49 time(s) ; retry policy is 
RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
2021-01-12 17:55:21,100 [15993307504]-WARN [cluster lifeline to 
/:port:BPServiceActor$LifelineSender@1008]-IOException in 
LifelineSender for Block pool  (Datanode Uuid ) service to /: 
port
java.net.ConnectException: Call From / to :port failed on 
connection exception: java.net.ConnectException: Connection refused; For more 
details see: http://wiki.apache.org/hadoop/ConnectionRefused
at sun.reflect.GeneratedConstructorAccessor68.newInstance(Unknown Source)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:824)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:754)
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1511)
at org.apache.hadoop.ipc.Client.call(Client.java:1453)
at org.apache.hadoop.ipc.Client.call(Client.java:1363)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy21.sendLifeline(Unknown Source)
at 
org.apache.hadoop.hdfs.protocolPB.DatanodeLifelineProtocolClientSideTranslatorPB.sendLifeline(DatanodeLifelineProtocolClientSideTranslatorPB.java:100)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.sendLifeline(BPServiceActor.java:1074)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.sendLifelineIfDue(BPServiceActor.java:1058)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor$LifelineSender.run(BPServiceActor.java:1003)

FBR should not be triggered at this time.

> Optimize frequency of regular block reports
> ---
>
> Key: HDFS-15162
> URL: https://issues.apache.org/jira/browse/HDFS-15162
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Ayush Saxena
>Assignee: Ayush Saxena
>Priority: Critical
>
> Avoid sending block report at regular interval, if there is no failover, 
> DiskError or any exception encountered in connecting to the Namenode.
> This JIRA intends to limit the regular block reports to be sent only in case 
> of the above scenarios and during re-registration  of datanode, to eliminate 
> the overhead of processing BlockReports at Namenode in case of huge clusters.
> *Eg.* If a block report was sent at  hours and the next was scheduled at 
> 0600 hours if there is no above mentioned scenario, it will skip sending the 
> BR, and schedule it to next 1200 hrs. if something of such sort happens 
> between 06:- 12: it would send the BR normally.
> *NOTE*: This would be optional and can be turned off by default. Would add a 
> configuration to enable this.






[jira] [Updated] (HDFS-15661) The DeadNodeDetector shouldn't be shared by different DFSClients.

2021-01-27 Thread Lisheng Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisheng Sun updated HDFS-15661:
---
Fix Version/s: 3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> The DeadNodeDetector shouldn't be shared by different DFSClients.
> -
>
> Key: HDFS-15661
> URL: https://issues.apache.org/jira/browse/HDFS-15661
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Fix For: 3.4.0
>
> Attachments: HDFS-15661.001.patch, HDFS-15661.002.patch, 
> HDFS-15661.003.patch, HDFS-15661.004.patch, HDFS-15661.005.patch
>
>
> Currently the DeadNodeDetector is a member of ClientContext. That means it is 
> shared by many different DFSClients. When one DFSClient.close() is invoked, 
> the DeadNodeDetector thread would be interrupted and impact other DFSClients.
> From the original design of HDFS-13571 we could see the DeadNodeDetector is 
> supposed to share dead nodes of many input streams from the same client. 
> We should move the DeadNodeDetector as a member of DFSClient instead of 
> ClientContext. 






[jira] [Commented] (HDFS-15661) The DeadNodeDetector shouldn't be shared by different DFSClients.

2021-01-27 Thread Lisheng Sun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273259#comment-17273259
 ] 

Lisheng Sun commented on HDFS-15661:


Committed to trunk.
 Thanks [~LiJinglun] for your report and contribution!

Thanks [~weichiu] for your review!

> The DeadNodeDetector shouldn't be shared by different DFSClients.
> -
>
> Key: HDFS-15661
> URL: https://issues.apache.org/jira/browse/HDFS-15661
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Jinglun
>Assignee: Jinglun
>Priority: Major
> Attachments: HDFS-15661.001.patch, HDFS-15661.002.patch, 
> HDFS-15661.003.patch, HDFS-15661.004.patch, HDFS-15661.005.patch
>
>
> Currently the DeadNodeDetector is a member of ClientContext. That means it is 
> shared by many different DFSClients. When one DFSClient.close() is invoked, 
> the DeadNodeDetector thread would be interrupted and impact other DFSClients.
> From the original design of HDFS-13571 we could see the DeadNodeDetector is 
> supposed to share dead nodes of many input streams from the same client. 
> We should move the DeadNodeDetector as a member of DFSClient instead of 
> ClientContext. 






[jira] [Commented] (HDFS-15794) IBR and FBR use different queues to load data.

2021-01-27 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273253#comment-17273253
 ] 

JiangHua Zhu commented on HDFS-15794:
-

[~kihwal] , thank you very much.
I think your suggestion is very meaningful.

 

> IBR and FBR use different queues to load data.
> --
>
> Key: HDFS-15794
> URL: https://issues.apache.org/jira/browse/HDFS-15794
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>
> When DataNode reports data to NameNode, IBR and FBR are included here.
> After the NameNode receives the DataNode request, it temporarily stores the 
> data in a queue, here it refers to 
> BlockManager#BlockReportProcessingThread#queue.
> NameNodeRpcServer#blockReport():
> {code:java}
> for (int r = 0; r < reports.length; r++) {
>   final BlockListAsLongs blocks = reports[r].getBlocks();
>   final int index = r;
>   noStaleStorages = bm.runBlockOp(() ->
>       bm.processReport(nodeReg, reports[index].getStorage(),
>           blocks, context));
> }
> {code}
> NameNodeRpcServer#blockReceivedAndDeleted():
> {code:java}
> for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
>   bm.enqueueBlockOp(new Runnable() {
>     @Override
>     public void run() {
>       try {
>         namesystem.processIncrementalBlockReport(nodeReg, r);
>       } catch (Exception ex) {
>         // usually because the node is unregistered/dead. next heartbeat
>         // will correct the problem
>         blockStateChangeLog.error(
>             "*BLOCK* NameNode.blockReceivedAndDeleted: "
>                 + "failed from " + nodeReg + ":" + ex.getMessage());
>       }
>     }
>   });
> }
> {code}
> The problem here is that when the NameNode is blocked processing IBRs, the
> FBRs that DNs send to the NameNode are delayed, and likewise IBRs are delayed
> when the NameNode is blocked processing an FBR.
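As a rough sketch of the separation being proposed (hypothetical class and method names; this is not the actual BlockManager/BlockReportProcessingThread code), giving each report type its own single-threaded queue lets queued IBRs drain even while a slow FBR is still being processed:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ReportQueues {
  // One dedicated single-threaded queue per report type, so a slow full
  // block report (FBR) cannot sit in front of incremental reports (IBR).
  private final ExecutorService ibrQueue = Executors.newSingleThreadExecutor();
  private final ExecutorService fbrQueue = Executors.newSingleThreadExecutor();

  public Future<?> submitIncremental(Runnable ibr) {
    return ibrQueue.submit(ibr);
  }

  public Future<?> submitFull(Runnable fbr) {
    return fbrQueue.submit(fbr);
  }

  // Blocks until the future completes, hiding checked exceptions for brevity.
  public static void join(Future<?> f) {
    try {
      f.get();
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  public void shutdown() {
    ibrQueue.shutdown();
    fbrQueue.shutdown();
  }
}
```

With a single shared queue, an IBR submitted after a long-running FBR would have to wait for it; with separate queues it completes immediately.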






[jira] [Commented] (HDFS-15786) Minor improvement use isEmpty

2021-01-27 Thread Jason Wen (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273201#comment-17273201
 ] 

Jason Wen commented on HDFS-15786:
--

I see the PR is all about changing String object methods. For a String, 
string.isEmpty() is equivalent to string.length() == 0, so it does not make any 
difference; both are O(1).
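
Jason's point can be illustrated with a collection where the two calls do 
differ. For String both are O(1), but for example 
java.util.concurrent.ConcurrentLinkedQueue documents size() as an O(N) 
traversal, while isEmpty() only inspects the head; that is where the issue's 
advice pays off. The class and method names below are illustrative, not from 
the PR:

```java
import java.util.concurrent.ConcurrentLinkedQueue;

public class IsEmptyDemo {
    // size() must traverse a ConcurrentLinkedQueue (O(N)),
    // while isEmpty() only inspects the head (O(1)).
    static boolean hasWork(ConcurrentLinkedQueue<String> queue) {
        return !queue.isEmpty();   // preferred over queue.size() > 0
    }

    public static void main(String[] args) {
        ConcurrentLinkedQueue<String> q = new ConcurrentLinkedQueue<>();
        assert !hasWork(q);
        q.add("block-report");
        assert hasWork(q);
        System.out.println("ok");
    }
}
```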

> Minor improvement use isEmpty
> -
>
> Key: HDFS-15786
> URL: https://issues.apache.org/jira/browse/HDFS-15786
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Arturo Bernal
>Assignee: Arturo Bernal
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Use isEmpty() instead of size() > 0.
>  
> {{size()}} can be *O(1)* or *O(N)*, depending on the {{data structure}}; 
> {{.isEmpty()}} is never *O(N)*.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15790) Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15790?focusedWorklogId=543168&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543168
 ]

ASF GitHub Bot logged work on HDFS-15790:
-

Author: ASF GitHub Bot
Created on: 27/Jan/21 23:02
Start Date: 27/Jan/21 23:02
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2650:
URL: https://github.com/apache/hadoop/pull/2650#issuecomment-768635544


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 31s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  buf  |   0m  0s |  |  buf was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 5 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m  9s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  20m 29s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  20m 33s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |  17m 56s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  checkstyle  |   4m  6s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   6m 15s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 31s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   4m 48s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   6m 24s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 26s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |  11m 34s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 26s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m  3s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | -1 :x: |  cc  |  20m  3s | 
[/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2650/3/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt)
 |  root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 generated 42 new + 370 unchanged - 42 
fixed = 412 total (was 412)  |
   | -1 :x: |  javac  |  20m  3s | 
[/diff-compile-javac-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2650/3/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt)
 |  root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 generated 1 new + 2035 unchanged - 0 
fixed = 2036 total (was 2035)  |
   | +1 :green_heart: |  compile  |  18m  1s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | -1 :x: |  cc  |  18m  1s | 
[/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2650/3/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt)
 |  root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 generated 37 new + 375 
unchanged - 37 fixed = 412 total (was 412)  |
   | -1 :x: |  javac  |  18m  1s | 
[/diff-compile-javac-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2650/3/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt)
 |  root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 generated 1 new + 1930 
unchanged - 0 fixed = 1931 total (was 1930)  |
   | -0 :warning: |  checkstyle  |   3m 57s | 
[/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2650/3/artifact/out/diff-checkstyle-root.txt)
 |  root: The patch generated 4 new + 557 unchanged - 3 fixed = 561 total (was 
560)  |
   | +1 :green_heart: |  mvnsite  |   6m 11s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |

[jira] [Commented] (HDFS-15789) Lease renewal does not require namesystem lock

2021-01-27 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273129#comment-17273129
 ] 

Kihwal Lee commented on HDFS-15789:
---

This is a safe change. 

The FSN lock only protects the NN against renewing a lease during an HA 
transition, since renewals should be done only by the active NN.  So after this 
patch, a lease renewal request may be received while a NN is active but finish 
processing during or after the transition to standby.  However, this does not 
affect file system consistency or violate the existing file system API 
semantics.  The important states are whether a file is open and who holds the 
lease; anything that changes these states is edit-logged.  A renewal does not 
revive expired/revoked leases and is thus not edit-logged.

+1 for the patch.
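
A minimal sketch of why the renewal path is safe without the lock, under the 
semantics described above: a renewal only refreshes a timestamp, never revives 
an expired lease, and is not edit-logged. All names are illustrative stand-ins, 
not the actual LeaseManager code:

```java
public class LeaseDemo {
    static class Lease {
        final String holder;
        volatile long lastRenewed;   // renewal only touches this field
        Lease(String holder, long now) { this.holder = holder; lastRenewed = now; }
    }

    // Refreshing the timestamp changes no namespace state, so it needs
    // no namesystem lock and no edit-log entry. Expired leases are NOT
    // revived, matching the invariant in the comment above.
    static boolean renew(Lease lease, long now, long hardLimitMs) {
        if (now - lease.lastRenewed > hardLimitMs) {
            return false;            // expired: renewal is rejected
        }
        lease.lastRenewed = now;
        return true;
    }

    public static void main(String[] args) {
        Lease l = new Lease("client-1", 0L);
        assert renew(l, 1_000L, 60_000L);      // within limit: renewed
        assert l.lastRenewed == 1_000L;
        assert !renew(l, 120_000L, 60_000L);   // past hard limit: rejected
        System.out.println("ok");
    }
}
```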

> Lease renewal does not require namesystem lock
> --
>
> Key: HDFS-15789
> URL: https://issues.apache.org/jira/browse/HDFS-15789
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HDFS-15789.001.patch
>
>
> [~daryn] found this while testing the performance for HDFS-15704.
> The lease manager is independent of the namesystem. Acquiring the lock causes 
> unnecessary lock contention that degrades throughput.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10498) Intermittent test failure org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength.testSnapshotfileLength

2021-01-27 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-10498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17273123#comment-17273123
 ] 

Kihwal Lee commented on HDFS-10498:
---

+1

> Intermittent test failure 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength.testSnapshotfileLength
> ---
>
> Key: HDFS-10498
> URL: https://issues.apache.org/jira/browse/HDFS-10498
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, snapshots
>Affects Versions: 3.0.0-alpha1
>Reporter: Hanisha Koneru
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HDFS-10498.001.patch, test_failure.txt
>
>
> Error Details
> Per https://builds.apache.org/job/PreCommit-HDFS-Build/15646/testReport/, we 
> had the following failure. Local rerun is successful.
> Error Details:
> {panel}
> Fail to get block MD5 for 
> LocatedBlock{BP-145245805-172.17.0.3-1464981728847:blk_1073741826_1002; 
> getBlockSize()=1; corrupt=false; offset=1024; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:55764,DS-a33d7c97-9d4a-4694-a47e-a3187a33ed5a,DISK]]}
> {panel}
> Stack Trace: 
> {panel}
> java.io.IOException: Fail to get block MD5 for 
> LocatedBlock{BP-145245805-172.17.0.3-1464981728847:blk_1073741826_1002; 
> getBlockSize()=1; corrupt=false; offset=1024; 
> locs=[DatanodeInfoWithStorage[127.0.0.1:55764,DS-a33d7c97-9d4a-4694-a47e-a3187a33ed5a,DISK]]}
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$ReplicatedFileChecksumComputer.checksumBlocks(FileChecksumHelper.java:289)
>   at 
> org.apache.hadoop.hdfs.FileChecksumHelper$FileChecksumComputer.compute(FileChecksumHelper.java:206)
>   at org.apache.hadoop.hdfs.DFSClient.getFileChecksum(DFSClient.java:1731)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$31.doCall(DistributedFileSystem.java:1482)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem$31.doCall(DistributedFileSystem.java:1479)
>   at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>   at 
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileChecksum(DistributedFileSystem.java:1490)
>   at 
> org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength.testSnapshotfileLength(TestSnapshotFileLength.java:137)
>  Standard Output  7 sec
> {panel}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15740) Make basename cross-platform

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15740?focusedWorklogId=543035&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543035
 ]

ASF GitHub Bot logged work on HDFS-15740:
-

Author: ASF GitHub Bot
Created on: 27/Jan/21 19:46
Start Date: 27/Jan/21 19:46
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2567:
URL: https://github.com/apache/hadoop/pull/2567#issuecomment-768532520


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 33s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m  9s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  21m 38s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  22m 41s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |  19m 24s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  mvnsite  |  26m  1s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 118m  5s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  | 118m 26s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  21m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  23m 44s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | -1 :x: |  cc  |  23m 44s | 
[/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2567/21/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt)
 |  root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 generated 31 new + 381 unchanged - 31 
fixed = 412 total (was 412)  |
   | +1 :green_heart: |  golang  |  23m 44s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  23m 44s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 41s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | -1 :x: |  cc  |  19m 41s | 
[/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2567/21/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt)
 |  root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 generated 33 new + 379 
unchanged - 33 fixed = 412 total (was 412)  |
   | +1 :green_heart: |  golang  |  19m 41s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  19m 41s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |  21m 13s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  14m 14s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 692m 58s | 
[/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2567/21/artifact/out/patch-unit-root.txt)
 |  root in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 33s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 916m  0s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks |
   |   | hadoop.hdfs.server.balancer.TestBalancer |
   |   | hadoop.tools.dynamometer.TestDynamometerInfra |
   |   | hadoop.yarn.service.TestYarnNativeServices |
   |   | hadoop.yarn.server.resourcemanager.TestRMRestart |
   |   | hadoop.yarn.client.api.impl.TestAMRMClient |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 

[jira] [Work logged] (HDFS-15740) Make basename cross-platform

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15740?focusedWorklogId=543026&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543026
 ]

ASF GitHub Bot logged work on HDFS-15740:
-

Author: ASF GitHub Bot
Created on: 27/Jan/21 19:38
Start Date: 27/Jan/21 19:38
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2567:
URL: https://github.com/apache/hadoop/pull/2567#issuecomment-768528104


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 31s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 16s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  21m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  22m  3s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |  19m 32s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  mvnsite  |  26m 11s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 118m 18s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | -0 :warning: |  patch  | 118m 39s |  |  Used diff version of patch file. 
Binary files and potentially other changes not applied. Please rebase and 
squash commits if necessary.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |  21m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  23m 28s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | -1 :x: |  cc  |  23m 28s | 
[/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2567/20/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt)
 |  root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 generated 51 new + 361 unchanged - 51 
fixed = 412 total (was 412)  |
   | +1 :green_heart: |  golang  |  23m 28s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  23m 28s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m  2s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | -1 :x: |  cc  |  20m  2s | 
[/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2567/20/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt)
 |  root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 generated 42 new + 370 
unchanged - 42 fixed = 412 total (was 412)  |
   | +1 :green_heart: |  golang  |  20m  2s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  20m  2s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |  21m 42s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  13m 54s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 692m 47s | 
[/patch-unit-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2567/20/artifact/out/patch-unit-root.txt)
 |  root in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 33s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 916m 49s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   |   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
   |   | hadoop.hdfs.server.balancer.TestBalancer |
   |   | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized |
   |   | hadoop.tools.dynamometer.TestDynamometerInfra |
   |   | hadoop.yarn.server.resourcemanager.security.TestDelegationTokenRenewer 
|
   |   | hadoop.yarn.server.router.clientrm.TestFederationClientInterceptor |
   
   
   | Subsystem | Report/Notes |
   

[jira] [Work logged] (HDFS-15791) Possible Resource Leak in FSImageFormatProtobuf

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15791?focusedWorklogId=542937&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-542937
 ]

ASF GitHub Bot logged work on HDFS-15791:
-

Author: ASF GitHub Bot
Created on: 27/Jan/21 17:15
Start Date: 27/Jan/21 17:15
Worklog Time Spent: 10m 
  Work Description: Nargeshdb commented on a change in pull request #2652:
URL: https://github.com/apache/hadoop/pull/2652#discussion_r565486885



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java
##
@@ -269,14 +269,20 @@ public InputStream 
getInputStreamForSection(FileSummary.Section section,
 String compressionCodec)
 throws IOException {
   FileInputStream fin = new FileInputStream(filename);
-  FileChannel channel = fin.getChannel();
-  channel.position(section.getOffset());
-  InputStream in = new BufferedInputStream(new LimitInputStream(fin,
-  section.getLength()));
+  try {

Review comment:
   Thanks a lot for the review. I really appreciate it. 
   I was wondering if I need to do anything else to get the change merged.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 542937)
Time Spent: 1h 40m  (was: 1.5h)

> Possible Resource Leak in FSImageFormatProtobuf
> ---
>
> Key: HDFS-15791
> URL: https://issues.apache.org/jira/browse/HDFS-15791
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: Narges Shadab
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> We noticed a possible resource leak 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java#L271].
>  If an I/O error occurs at line 
> [273|https://github.com/apache/hadoop/blob/06a5d3437f68546207f18d23fe527895920c756a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java#L273]
>  or 
> [277|https://github.com/apache/hadoop/blob/06a5d3437f68546207f18d23fe527895920c756a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormatProtobuf.java#L277],
>  {{fin}} remains open since the exception isn't caught locally, and there is 
> no way for any caller to close the FileInputStream.
> I'll submit a pull request to fix it.
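
The fix pattern can be sketched as follows: if positioning the channel or 
wrapping the stream throws after the FileInputStream is opened, close it before 
rethrowing. This is an illustrative sketch under the issue's description, not 
the actual FSImageFormatProtobuf code:

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SafeOpen {
    // On any IOException after the stream is opened, close it before
    // rethrowing so the file descriptor is not leaked; on success the
    // caller owns (and must close) the returned stream.
    static InputStream openSection(File file, long offset) throws IOException {
        FileInputStream fin = new FileInputStream(file);
        try {
            fin.getChannel().position(offset);   // may throw IOException
            return new BufferedInputStream(fin);
        } catch (IOException e) {
            fin.close();                         // avoid the resource leak
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("section", ".img");
        try (InputStream in = openSection(tmp, 0)) {
            assert in.read() == -1;              // empty file: end of stream
        }
        tmp.delete();
        System.out.println("ok");
    }
}
```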



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15794) IBR and FBR use different queues to load data.

2021-01-27 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272999#comment-17272999
 ] 

Kihwal Lee edited comment on HDFS-15794 at 1/27/21, 4:58 PM:
-

{quote}The problem here is that when the NameNode is blocked in processing the 
IBR, the FBR requested by the DN from the NameNode will be affected. Similarly, 
when the NameNode processing FBR is blocked.
{quote}

The serial processing of IBR and FBR is not a side-effect of the way a data 
structure is used (single queue).  In the current namespace and block manager 
design, each report is processed with the fsn write lock held. The queue made 
it possible to process multiple IBRs under one lock, thus increasing 
throughput.  Having multiple queues for IBRs and FBRs won't help with 
concurrency. In fact, it will complicate things, as it needs to maintain 
certain processing order across multiple queues.

In order to make a meaningful performance improvement, we have to make NN 
perform concurrent block report processing. 


was (Author: kihwal):
{quote}The problem here is that when the NameNode is blocked in processing the 
IBR, the FBR requested by the DN from the NameNode will be affected. Similarly, 
when the NameNode processing FBR is blocked.
{quote}

The serial processing of IBR and FBR is not a side-effect of way a data 
structure is used (single queue).  In current namespace and block manager 
design, each report is processed with the fsn write lock held. The queue made 
it possible to process multiple IBRs under one lock, thus increasing 
throughput.  Having multiple queues for IBRs and FBRs won't help with 
concurrency. In fact, it will complicate things, as it needs to maintain 
certain processing order across multiple queues.

In order to make a meaningful performance improvement, we have to make NN 
perform concurrent block report processing. 

> IBR and FBR use different queues to load data.
> --
>
> Key: HDFS-15794
> URL: https://issues.apache.org/jira/browse/HDFS-15794
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>
> When a DataNode reports block data to the NameNode, both IBRs (incremental 
> block reports) and FBRs (full block reports) are involved.
> After the NameNode receives a DataNode request, it temporarily stores the 
> data in a queue, namely 
> BlockManager#BlockReportProcessingThread#queue.
> NameNodeRpcServer#blockReport()
> for (int r = 0; r < reports.length; r++) {
>  final BlockListAsLongs blocks = reports[r].getBlocks();
>  final int index = r;
>  noStaleStorages = bm.runBlockOp(() ->
>  bm.processReport(nodeReg, reports[index].getStorage(),
>  blocks, context));
>  }
> NameNodeRpcServer#blockReceivedAndDeleted()
> for (final StorageReceivedDeletedBlocks r: receivedAndDeletedBlocks) {
>  bm.enqueueBlockOp(new Runnable() {
>  @Override
>  public void run() {
>  try {
>  namesystem.processIncrementalBlockReport(nodeReg, r);
>  } catch (Exception ex) {
>  // usually because the node is unregistered/dead. next heartbeat
>  // will correct the problem
>  blockStateChangeLog.error(
>  "*BLOCK* NameNode.blockReceivedAndDeleted: "
>  + "failed from "+ nodeReg + ":" + ex.getMessage());
>  }
>  }
>  });
>  }
> The problem here is that when the NameNode is blocked processing IBRs, the 
> FBRs sent by DataNodes are also delayed; likewise, IBR processing is delayed 
> while the NameNode is blocked processing an FBR.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15794) IBR and FBR use different queues to load data.

2021-01-27 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272999#comment-17272999
 ] 

Kihwal Lee commented on HDFS-15794:
---

{quote}The problem here is that when the NameNode is blocked in processing the 
IBR, the FBR requested by the DN from the NameNode will be affected. Similarly, 
when the NameNode processing FBR is blocked.
{quote}

The serial processing of IBR and FBR is not a side-effect of the way a data 
structure is used (single queue).  In the current namespace and block manager 
design, each report is processed with the fsn write lock held. The queue made 
it possible to process multiple IBRs under one lock, thus increasing 
throughput.  Having multiple queues for IBRs and FBRs won't help with 
concurrency. In fact, it will complicate things, as it needs to maintain 
certain processing order across multiple queues.

In order to make a meaningful performance improvement, we have to make NN 
perform concurrent block report processing. 
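
The batching described above can be sketched as draining all currently queued 
report operations and running them under a single lock acquisition. Names here 
are illustrative stand-ins for BlockManager#BlockReportProcessingThread and the 
FSN write lock, not the actual Hadoop code:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.locks.ReentrantLock;

public class BatchedQueueDemo {
    private final BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1024);
    private final ReentrantLock writeLock = new ReentrantLock(); // stand-in for FSN write lock
    int lockAcquisitions = 0;

    void enqueue(Runnable reportOp) { queue.add(reportOp); }

    // Drain everything currently queued and run it under ONE lock
    // acquisition, instead of locking once per report. Returns the
    // number of operations processed in this batch.
    int processBatch() {
        List<Runnable> batch = new ArrayList<>();
        queue.drainTo(batch);
        if (batch.isEmpty()) return 0;
        writeLock.lock();
        try {
            lockAcquisitions++;
            for (Runnable op : batch) op.run();
        } finally {
            writeLock.unlock();
        }
        return batch.size();
    }

    public static void main(String[] args) {
        BatchedQueueDemo d = new BatchedQueueDemo();
        for (int i = 0; i < 5; i++) d.enqueue(() -> {});
        assert d.processBatch() == 5;
        assert d.lockAcquisitions == 1;   // five IBRs, one lock acquisition
        System.out.println("ok");
    }
}
```

Splitting IBRs and FBRs into separate queues would not change this picture: 
each batch still runs under the same single write lock.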

> IBR and FBR use different queues to load data.
> --
>
> Key: HDFS-15794
> URL: https://issues.apache.org/jira/browse/HDFS-15794
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>
> When DataNode reports data to NameNode, IBR and FBR are included here.
> After the NameNode receives the DataNode request, it temporarily stores the 
> data in a queue, here it refers to 
> BlockManager#BlockReportProcessingThread#queue.
> NameNodeRpcServer#blockReport()
> for (int r = 0; r   final BlockListAsLongs blocks = reports[r].getBlocks();
>  final int index = r;
>  noStaleStorages = bm.runBlockOp(() ->
>  bm.processReport(nodeReg, reports[index].getStorage(),
>  blocks, context));
>  }
> NameNodeRpcServer#blockReport()
> for (final StorageReceivedDeletedBlocks r: receivedAndDeletedBlocks) {
>  bm.enqueueBlockOp(new Runnable() {
>  @Override
>  public void run() {
>  try {
>  namesystem.processIncrementalBlockReport(nodeReg, r);
>  } catch (Exception ex) {
>  // usually because the node is unregistered/dead. next heartbeat
>  // will correct the problem
>  blockStateChangeLog.error(
>  "*BLOCK* NameNode.blockReceivedAndDeleted: "
>  + "failed from "+ nodeReg + ":" + ex.getMessage());
>  }
>  }
>  });
>  }
> The problem here is that when the NameNode is blocked processing IBRs, the 
> FBRs sent by DataNodes are also delayed; likewise, IBR processing is delayed 
> while the NameNode is blocked processing an FBR.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15793) Add command to DFSAdmin for Balancer max concurrent threads

2021-01-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272871#comment-17272871
 ] 

Hadoop QA commented on HDFS-15793:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 34m 
40s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
1s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:blue}0{color} | {color:blue} buf {color} | {color:blue}  0m  0s{color} 
| {color:blue}{color} | {color:blue} buf was not available. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 1 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  2m 
38s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
11s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
41s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
20s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
34s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
58s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 39s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
16s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m  
4s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m 
13s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
32s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
33s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
38s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  4m 38s{color} | 
{color:red}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/430/artifact/out/diff-compile-cc-hadoop-hdfs-project-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt{color}
 | {color:red} hadoop-hdfs-project-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 
with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 generated 7 new + 86 unchanged 
- 7 fixed = 93 total (was 93) {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
38s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
31s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  4m 31s{color} | 

[jira] [Comment Edited] (HDFS-15794) IBR and FBR use different queues to load data.

2021-01-27 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272840#comment-17272840
 ] 

JiangHua Zhu edited comment on HDFS-15794 at 1/27/21, 1:04 PM:
---

[~weichiu], thank you for your reply.
 I noticed [HDFS-14997|https://issues.apache.org/jira/browse/HDFS-14997]; the 
improvements made there are very meaningful. However, HDFS-14997 improves the 
DataNode side, while my point is that something meaningful can also be done on 
the NameNode side: when the NN processes IBR and FBR data, it could use 
separate queues instead of sharing a single one 
(BlockManager#BlockReportProcessingThread#queue).
 This would improve the NN's report-processing throughput.
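The separate-queue idea can be sketched as follows. This is a hypothetical, simplified model in plain Java; the class and method names are invented for illustration and are not taken from BlockManager. IBRs and FBRs land on their own bounded queues, and a single consumer takes at most one op from each queue per pass, so a backlog of one report type cannot starve the other.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch, not the actual BlockManager code: separate bounded
// queues for incremental and full block reports, drained fairly.
public class SplitReportQueues {
  private final BlockingQueue<Runnable> ibrQueue = new ArrayBlockingQueue<>(1024);
  private final BlockingQueue<Runnable> fbrQueue = new ArrayBlockingQueue<>(64);

  boolean enqueueIbr(Runnable op) { return ibrQueue.offer(op); }
  boolean enqueueFbr(Runnable op) { return fbrQueue.offer(op); }

  /** Take at most one op from each queue per pass so neither type starves. */
  int drainOnce() {
    int processed = 0;
    Runnable ibr = ibrQueue.poll();
    if (ibr != null) { ibr.run(); processed++; }
    Runnable fbr = fbrQueue.poll();
    if (fbr != null) { fbr.run(); processed++; }
    return processed;
  }

  /** Enqueue a burst of IBRs, then one FBR, and record execution order. */
  static String drainOrder() {
    SplitReportQueues q = new SplitReportQueues();
    StringBuilder order = new StringBuilder();
    for (int i = 0; i < 3; i++) {
      q.enqueueIbr(() -> order.append('i'));
    }
    q.enqueueFbr(() -> order.append('f'));
    while (q.drainOnce() > 0) { }
    return order.toString();  // the FBR runs right after the first IBR
  }

  public static void main(String[] args) {
    System.out.println(drainOrder());
  }
}
```

With a shared FIFO queue the FBR would have run last; here the fair drain interleaves it after the first IBR, yielding the order "ifii".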

 

 



> IBR and FBR use different queues to load data.
> --
>
> Key: HDFS-15794
> URL: https://issues.apache.org/jira/browse/HDFS-15794
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>
> When a DataNode reports blocks to the NameNode, the reports include both 
> incremental block reports (IBRs) and full block reports (FBRs).
> After the NameNode receives the DataNode request, it temporarily stores the 
> work in a single queue, BlockManager#BlockReportProcessingThread#queue.
> NameNodeRpcServer#blockReport()
> for (int r = 0; r < reports.length; r++) {
>   final BlockListAsLongs blocks = reports[r].getBlocks();
>   final int index = r;
>   noStaleStorages = bm.runBlockOp(() ->
>       bm.processReport(nodeReg, reports[index].getStorage(),
>           blocks, context));
> }
> NameNodeRpcServer#blockReceivedAndDeleted()
> for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
>   bm.enqueueBlockOp(new Runnable() {
>     @Override
>     public void run() {
>       try {
>         namesystem.processIncrementalBlockReport(nodeReg, r);
>       } catch (Exception ex) {
>         // usually because the node is unregistered/dead. next heartbeat
>         // will correct the problem
>         blockStateChangeLog.error(
>             "*BLOCK* NameNode.blockReceivedAndDeleted: "
>             + "failed from " + nodeReg + ":" + ex.getMessage());
>       }
>     }
>   });
> }
> The problem is that when the NameNode is blocked processing IBRs, the FBRs 
> sent by DataNodes are delayed as well; likewise, IBR processing stalls while 
> the NameNode is busy with an FBR.
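The head-of-line blocking described above can be modeled with a single shared FIFO queue. This is a simplified, hypothetical model, not the real BlockReportProcessingThread: whatever sits at the head of the queue, here an FBR, delays every report enqueued behind it.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Simplified model of the shared-queue behavior: one FIFO queue, one
// consumer, so an FBR at the head delays all IBRs queued behind it.
public class SharedQueueDemo {
  static String drainOrder() {
    Queue<Runnable> queue = new ArrayDeque<>();
    StringBuilder log = new StringBuilder();
    queue.add(() -> log.append("FBR "));   // enqueued first
    queue.add(() -> log.append("IBR1 "));  // must wait for the FBR
    queue.add(() -> log.append("IBR2 "));
    while (!queue.isEmpty()) {
      queue.poll().run();  // strict FIFO: no way to prioritize either type
    }
    return log.toString().trim();
  }

  public static void main(String[] args) {
    System.out.println(drainOrder());  // prints: FBR IBR1 IBR2
  }
}
```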



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org





[jira] [Commented] (HDFS-15794) IBR and FBR use different queues to load data.

2021-01-27 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272840#comment-17272840
 ] 

JiangHua Zhu commented on HDFS-15794:
-

[~weichiu], thank you for your reply.
I noticed HDFS-14997; the improvements made there are very meaningful. 
However, HDFS-14997 improves the DataNode side, while my point is that 
something meaningful can also be done on the NameNode side: when the NN 
processes IBR and FBR data, it could use separate queues instead of sharing 
one queue (BlockManager#BlockReportProcessingThread#queue).
This would improve the NN's report-processing throughput.

 

> IBR and FBR use different queues to load data.
> --
>
> Key: HDFS-15794
> URL: https://issues.apache.org/jira/browse/HDFS-15794
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>
> When a DataNode reports blocks to the NameNode, the reports include both 
> incremental block reports (IBRs) and full block reports (FBRs).
> After the NameNode receives the DataNode request, it temporarily stores the 
> work in a single queue, BlockManager#BlockReportProcessingThread#queue.
> NameNodeRpcServer#blockReport()
> for (int r = 0; r < reports.length; r++) {
>   final BlockListAsLongs blocks = reports[r].getBlocks();
>   final int index = r;
>   noStaleStorages = bm.runBlockOp(() ->
>       bm.processReport(nodeReg, reports[index].getStorage(),
>           blocks, context));
> }
> NameNodeRpcServer#blockReceivedAndDeleted()
> for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
>   bm.enqueueBlockOp(new Runnable() {
>     @Override
>     public void run() {
>       try {
>         namesystem.processIncrementalBlockReport(nodeReg, r);
>       } catch (Exception ex) {
>         // usually because the node is unregistered/dead. next heartbeat
>         // will correct the problem
>         blockStateChangeLog.error(
>             "*BLOCK* NameNode.blockReceivedAndDeleted: "
>             + "failed from " + nodeReg + ":" + ex.getMessage());
>       }
>     }
>   });
> }
> The problem is that when the NameNode is blocked processing IBRs, the FBRs 
> sent by DataNodes are delayed as well; likewise, IBR processing stalls while 
> the NameNode is busy with an FBR.






[jira] [Commented] (HDFS-15794) IBR and FBR use different queues to load data.

2021-01-27 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17272795#comment-17272795
 ] 

Wei-Chiu Chuang commented on HDFS-15794:


I think we have made quite a few improvements in this area recently; one of 
them is HDFS-14997.

> IBR and FBR use different queues to load data.
> --
>
> Key: HDFS-15794
> URL: https://issues.apache.org/jira/browse/HDFS-15794
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>
> When a DataNode reports blocks to the NameNode, the reports include both 
> incremental block reports (IBRs) and full block reports (FBRs).
> After the NameNode receives the DataNode request, it temporarily stores the 
> work in a single queue, BlockManager#BlockReportProcessingThread#queue.
> NameNodeRpcServer#blockReport()
> for (int r = 0; r < reports.length; r++) {
>   final BlockListAsLongs blocks = reports[r].getBlocks();
>   final int index = r;
>   noStaleStorages = bm.runBlockOp(() ->
>       bm.processReport(nodeReg, reports[index].getStorage(),
>           blocks, context));
> }
> NameNodeRpcServer#blockReceivedAndDeleted()
> for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
>   bm.enqueueBlockOp(new Runnable() {
>     @Override
>     public void run() {
>       try {
>         namesystem.processIncrementalBlockReport(nodeReg, r);
>       } catch (Exception ex) {
>         // usually because the node is unregistered/dead. next heartbeat
>         // will correct the problem
>         blockStateChangeLog.error(
>             "*BLOCK* NameNode.blockReceivedAndDeleted: "
>             + "failed from " + nodeReg + ":" + ex.getMessage());
>       }
>     }
>   });
> }
> The problem is that when the NameNode is blocked processing IBRs, the FBRs 
> sent by DataNodes are delayed as well; likewise, IBR processing stalls while 
> the NameNode is busy with an FBR.






[jira] [Work logged] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2021-01-27 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15714?focusedWorklogId=542754&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-542754
 ]

ASF GitHub Bot logged work on HDFS-15714:
-

Author: ASF GitHub Bot
Created on: 27/Jan/21 10:32
Start Date: 27/Jan/21 10:32
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2655:
URL: https://github.com/apache/hadoop/pull/2655#issuecomment-768192307


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   1m 27s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  4s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  buf  |   0m  1s |  |  buf was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 79 new or modified test files.  |
    _ HDFS-15714 Compile Tests _ |
   | +0 :ok: |  mvndep  |  13m 53s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m 50s |  |  HDFS-15714 passed  |
   | +1 :green_heart: |  compile  |  21m 54s |  |  HDFS-15714 passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  compile  |  18m 22s |  |  HDFS-15714 passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +1 :green_heart: |  checkstyle  |   4m  9s |  |  HDFS-15714 passed  |
   | +1 :green_heart: |  mvnsite  |   6m  3s |  |  HDFS-15714 passed  |
   | +1 :green_heart: |  shadedclient  |  27m 51s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   4m 30s |  |  HDFS-15714 passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | +1 :green_heart: |  javadoc  |   5m 53s |  |  HDFS-15714 passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | +0 :ok: |  spotbugs  |   0m 46s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |  11m 26s |  |  HDFS-15714 passed  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 26s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m  7s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04  |
   | -1 :x: |  cc  |  21m  7s | 
[/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2655/1/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt)
 |  root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 generated 30 new + 142 unchanged - 30 
fixed = 172 total (was 172)  |
   | -1 :x: |  javac  |  21m  7s | 
[/diff-compile-javac-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2655/1/artifact/out/diff-compile-javac-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04.txt)
 |  root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 generated 67 new + 2006 unchanged - 27 
fixed = 2073 total (was 2033)  |
   | +1 :green_heart: |  compile  |  22m 21s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01  |
   | -1 :x: |  cc  |  22m 21s | 
[/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2655/1/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt)
 |  root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 generated 33 new + 139 
unchanged - 33 fixed = 172 total (was 172)  |
   | -1 :x: |  javac  |  22m 21s | 
[/diff-compile-javac-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2655/1/artifact/out/diff-compile-javac-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01.txt)
 |  root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01 generated 67 new + 1901 
unchanged - 27 fixed = 1968 total (was 1928)  |
   | -0 :warning: |  checkstyle  |   4m 43s | 
[/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2655/1/artifact/out/diff-checkstyle-root.txt)
 |  root: The patch generated 144 new + 4280 unchanged - 35 fixed = 4424 total 
(was 4315)  |
   | +1 :green_heart: |  mvnsite  |   9m 18s |  |  the patch passed  |
   | -1 :x: |  whitespace  |   0m  

[jira] [Commented] (HDFS-15714) HDFS Provided Storage Read/Write Mount Support On-the-fly

2021-01-27 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17272751#comment-17272751
 ] 

Hadoop QA commented on HDFS-15714:
--

| (x) *{color:red}-1 overall{color}* |
[jira] [Comment Edited] (HDFS-15794) IBR and FBR use different queues to load data.

2021-01-27 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17272716#comment-17272716
 ] 

JiangHua Zhu edited comment on HDFS-15794 at 1/27/21, 9:45 AM:
---

When a DataNode sends IBRs and FBRs to the NameNode, the NameNode could process 
them through separate queues (in BlockManager). This could improve the NameNode's 
performance in handling these two types of requests.

[~weichiu] [~elgoiri] Do you have different ideas?

 


was (Author: jianghuazhu):
When a DataNode sends IBRs and FBRs to the NameNode, the NameNode could process 
them through separate queues (in BlockManager). This could improve the NameNode's 
performance in handling these two types of requests.

[~weichiu] [~elgoiri] Do you have different opinions?

 

> IBR and FBR use different queues to load data.
> --
>
> Key: HDFS-15794
> URL: https://issues.apache.org/jira/browse/HDFS-15794
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>
> When a DataNode reports blocks to the NameNode, this includes both IBRs 
> (incremental block reports) and FBRs (full block reports).
> After the NameNode receives such a request, it temporarily queues the work in 
> BlockManager#BlockReportProcessingThread#queue.
> NameNodeRpcServer#blockReport():
> for (int r = 0; r < reports.length; r++) {
>   final BlockListAsLongs blocks = reports[r].getBlocks();
>   final int index = r;
>   noStaleStorages = bm.runBlockOp(() ->
>       bm.processReport(nodeReg, reports[index].getStorage(),
>           blocks, context));
> }
> NameNodeRpcServer#blockReceivedAndDeleted():
> for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
>   bm.enqueueBlockOp(new Runnable() {
>     @Override
>     public void run() {
>       try {
>         namesystem.processIncrementalBlockReport(nodeReg, r);
>       } catch (Exception ex) {
>         // usually because the node is unregistered/dead. next heartbeat
>         // will correct the problem
>         blockStateChangeLog.error(
>             "*BLOCK* NameNode.blockReceivedAndDeleted: "
>             + "failed from " + nodeReg + ":" + ex.getMessage());
>       }
>     }
>   });
> }
> The problem is that both report types go through this same queue: when the 
> NameNode is blocked processing IBRs, FBRs from the DataNodes are delayed, and 
> likewise IBR processing is delayed while the NameNode is blocked on an FBR.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15794) IBR and FBR use different queues to load data.

2021-01-27 Thread JiangHua Zhu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17272716#comment-17272716
 ] 

JiangHua Zhu commented on HDFS-15794:
-

When a DataNode sends IBRs and FBRs to the NameNode, the NameNode could process 
them through separate queues (in BlockManager). This could improve the NameNode's 
performance in handling these two types of requests.

[~weichiu] [~elgoiri] Do you have different opinions?

 

> IBR and FBR use different queues to load data.
> --
>
> Key: HDFS-15794
> URL: https://issues.apache.org/jira/browse/HDFS-15794
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>
> When a DataNode reports blocks to the NameNode, this includes both IBRs 
> (incremental block reports) and FBRs (full block reports).
> After the NameNode receives such a request, it temporarily queues the work in 
> BlockManager#BlockReportProcessingThread#queue.
> NameNodeRpcServer#blockReport():
> for (int r = 0; r < reports.length; r++) {
>   final BlockListAsLongs blocks = reports[r].getBlocks();
>   final int index = r;
>   noStaleStorages = bm.runBlockOp(() ->
>       bm.processReport(nodeReg, reports[index].getStorage(),
>           blocks, context));
> }
> NameNodeRpcServer#blockReceivedAndDeleted():
> for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
>   bm.enqueueBlockOp(new Runnable() {
>     @Override
>     public void run() {
>       try {
>         namesystem.processIncrementalBlockReport(nodeReg, r);
>       } catch (Exception ex) {
>         // usually because the node is unregistered/dead. next heartbeat
>         // will correct the problem
>         blockStateChangeLog.error(
>             "*BLOCK* NameNode.blockReceivedAndDeleted: "
>             + "failed from " + nodeReg + ":" + ex.getMessage());
>       }
>     }
>   });
> }
> The problem is that both report types go through this same queue: when the 
> NameNode is blocked processing IBRs, FBRs from the DataNodes are delayed, and 
> likewise IBR processing is delayed while the NameNode is blocked on an FBR.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15794) IBR and FBR use different queues to load data.

2021-01-27 Thread JiangHua Zhu (Jira)
JiangHua Zhu created HDFS-15794:
---

 Summary: IBR and FBR use different queues to load data.
 Key: HDFS-15794
 URL: https://issues.apache.org/jira/browse/HDFS-15794
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: JiangHua Zhu


When a DataNode reports blocks to the NameNode, this includes both IBRs 
(incremental block reports) and FBRs (full block reports).
After the NameNode receives such a request, it temporarily queues the work in 
BlockManager#BlockReportProcessingThread#queue.
NameNodeRpcServer#blockReport():
for (int r = 0; r < reports.length; r++) {
  final BlockListAsLongs blocks = reports[r].getBlocks();
  final int index = r;
  noStaleStorages = bm.runBlockOp(() ->
      bm.processReport(nodeReg, reports[index].getStorage(),
          blocks, context));
}
NameNodeRpcServer#blockReceivedAndDeleted():
for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
  bm.enqueueBlockOp(new Runnable() {
    @Override
    public void run() {
      try {
        namesystem.processIncrementalBlockReport(nodeReg, r);
      } catch (Exception ex) {
        // usually because the node is unregistered/dead. next heartbeat
        // will correct the problem
        blockStateChangeLog.error(
            "*BLOCK* NameNode.blockReceivedAndDeleted: "
            + "failed from " + nodeReg + ":" + ex.getMessage());
      }
    }
  });
}
The problem is that both report types go through this same queue: when the 
NameNode is blocked processing IBRs, FBRs from the DataNodes are delayed, and 
likewise IBR processing is delayed while the NameNode is blocked on an FBR.
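The two-queue idea above can be sketched with a small, hypothetical model. This is 
not the real BlockManager API; the class and method names below are invented for 
illustration only. The point is that with a separate queue per report type, each 
queue can be drained independently, so a backlog of one report type does not delay 
the other:

```java
import java.util.ArrayDeque;
import java.util.Queue;

/**
 * Hypothetical sketch of the HDFS-15794 proposal: give incremental block
 * reports (IBRs) and full block reports (FBRs) their own queues so that a
 * backlog of expensive FBRs cannot delay cheap IBRs, and vice versa.
 * NOT the actual BlockManager implementation.
 */
public class SplitReportQueues {
    private final Queue<Runnable> ibrQueue = new ArrayDeque<>();
    private final Queue<Runnable> fbrQueue = new ArrayDeque<>();

    void enqueueIbr(Runnable op) { ibrQueue.add(op); }
    void enqueueFbr(Runnable op) { fbrQueue.add(op); }

    /** Drain one queue independently of the other queue's backlog. */
    private static void drain(Queue<Runnable> q) {
        Runnable op;
        while ((op = q.poll()) != null) {
            op.run();
        }
    }

    public static void main(String[] args) {
        SplitReportQueues queues = new SplitReportQueues();
        StringBuilder order = new StringBuilder();

        // An expensive FBR is enqueued before a small IBR...
        queues.enqueueFbr(() -> order.append("fbr "));
        queues.enqueueIbr(() -> order.append("ibr "));

        // ...but because the queues are separate, the IBR handler can run
        // first, unaffected by the FBR sitting in the other queue. With a
        // single shared queue, the IBR would be stuck behind the FBR.
        drain(queues.ibrQueue);
        drain(queues.fbrQueue);

        System.out.println("processing order: " + order.toString().trim());
    }
}
```

In the actual proposal each queue would presumably be serviced by its own handler 
thread, so processing one report type never waits on the other's backlog.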



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-15794) IBR and FBR use different queues to load data.

2021-01-27 Thread JiangHua Zhu (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JiangHua Zhu reassigned HDFS-15794:
---

Assignee: JiangHua Zhu

> IBR and FBR use different queues to load data.
> --
>
> Key: HDFS-15794
> URL: https://issues.apache.org/jira/browse/HDFS-15794
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: JiangHua Zhu
>Assignee: JiangHua Zhu
>Priority: Major
>
> When a DataNode reports blocks to the NameNode, this includes both IBRs 
> (incremental block reports) and FBRs (full block reports).
> After the NameNode receives such a request, it temporarily queues the work in 
> BlockManager#BlockReportProcessingThread#queue.
> NameNodeRpcServer#blockReport():
> for (int r = 0; r < reports.length; r++) {
>   final BlockListAsLongs blocks = reports[r].getBlocks();
>   final int index = r;
>   noStaleStorages = bm.runBlockOp(() ->
>       bm.processReport(nodeReg, reports[index].getStorage(),
>           blocks, context));
> }
> NameNodeRpcServer#blockReceivedAndDeleted():
> for (final StorageReceivedDeletedBlocks r : receivedAndDeletedBlocks) {
>   bm.enqueueBlockOp(new Runnable() {
>     @Override
>     public void run() {
>       try {
>         namesystem.processIncrementalBlockReport(nodeReg, r);
>       } catch (Exception ex) {
>         // usually because the node is unregistered/dead. next heartbeat
>         // will correct the problem
>         blockStateChangeLog.error(
>             "*BLOCK* NameNode.blockReceivedAndDeleted: "
>             + "failed from " + nodeReg + ":" + ex.getMessage());
>       }
>     }
>   });
> }
> The problem is that both report types go through this same queue: when the 
> NameNode is blocked processing IBRs, FBRs from the DataNodes are delayed, and 
> likewise IBR processing is delayed while the NameNode is blocked on an FBR.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org