[jira] [Commented] (HDFS-15169) RBF: Router FSCK should consider the mount table

2020-03-22 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064569#comment-17064569
 ] 

Xiaoqiao He commented on HDFS-15169:


Attaching the Jenkins result link 
https://builds.apache.org/job/PreCommit-HDFS-Build/28995/console since Jenkins 
has been misbehaving for a few days.
Hi [~aajisaka], [~elgoiri], [~ayushtkn], would you like to review?

> RBF: Router FSCK should consider the mount table
> 
>
> Key: HDFS-15169
> URL: https://issues.apache.org/jira/browse/HDFS-15169
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Akira Ajisaka
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15169.001.patch
>
>
> HDFS-13989 implemented FSCK in DFSRouter; however, for now it just redirects 
> the requests to all the active downstream NameNodes. The DFSRouter should 
> consider the mount table when redirecting the requests.
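For illustration, a minimal self-contained sketch of what consulting the mount
table before redirecting could look like; the class and names below are
hypothetical, not the Router's actual API:
{code:java}
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/**
 * Toy sketch only: pick the downstream nameservices for an fsck path from
 * the mount table instead of fanning the request out to every active NN.
 */
public class MountAwareFsckSketch {

  static class MountEntry {
    final String mountPoint;
    final List<String> nameservices;
    MountEntry(String mountPoint, List<String> nameservices) {
      this.mountPoint = mountPoint;
      this.nameservices = nameservices;
    }
  }

  /**
   * Longest-prefix match of the fsck path against the mount table
   * (a real resolver would match on path components, not raw prefixes).
   */
  static List<String> namespacesFor(String fsckPath, List<MountEntry> table) {
    MountEntry best = null;
    for (MountEntry e : table) {
      if (fsckPath.startsWith(e.mountPoint)
          && (best == null || e.mountPoint.length() > best.mountPoint.length())) {
        best = e;
      }
    }
    // No match: the caller could still fall back to all namespaces.
    return best == null ? Collections.<String>emptyList() : best.nameservices;
  }

  public static void main(String[] args) {
    List<MountEntry> table = Arrays.asList(
        new MountEntry("/data", Arrays.asList("ns0")),
        new MountEntry("/logs", Arrays.asList("ns1", "ns2")));
    System.out.println(namespacesFor("/data/warehouse", table)); // [ns0]
  }
}
{code}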






[jira] [Commented] (HDFS-15075) Remove process command timing from BPServiceActor

2020-03-22 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064541#comment-17064541
 ] 

Xiaoqiao He commented on HDFS-15075:


Thanks [~elgoiri] for your good catches. v007 updates the patch following your 
suggestions. Please take another look. Thanks again.

> Remove process command timing from BPServiceActor
> -
>
> Key: HDFS-15075
> URL: https://issues.apache.org/jira/browse/HDFS-15075
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15075.001.patch, HDFS-15075.002.patch, 
> HDFS-15075.003.patch, HDFS-15075.004.patch, HDFS-15075.005.patch, 
> HDFS-15075.006.patch, HDFS-15075.007.patch
>
>
> HDFS-14997 made command processing asynchronous.
> Right now, we are timing how long it takes to add a command to the queue.
> We should remove this measurement and perhaps move the timing inside the 
> processing thread.
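As a rough sketch of the idea (names are hypothetical, not BPServiceActor's
actual code), timing the work inside the async worker instead of the queue
offer could look like:
{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Toy sketch: stop timing the cheap enqueue, time the real processing. */
public class CommandProcessingTimerSketch {

  private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();

  void enqueue(Runnable command) {
    // No timing here any more: offering to the queue is cheap.
    queue.offer(command);
  }

  void processLoop() throws InterruptedException {
    while (!Thread.currentThread().isInterrupted()) {
      Runnable command = queue.take();
      long startNanos = System.nanoTime();            // time the real work...
      command.run();
      long elapsedMs = (System.nanoTime() - startNanos) / 1_000_000L;
      recordProcessCommandLatency(elapsedMs);         // ...and report it
    }
  }

  void recordProcessCommandLatency(long millis) {
    // Placeholder for a DataNode metrics update (method name hypothetical).
  }
}
{code}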






[jira] [Commented] (HDFS-15232) Some CTESTs are failing after HADOOP-16054

2020-03-22 Thread Akira Ajisaka (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064540#comment-17064540
 ] 

Akira Ajisaka commented on HDFS-15232:
--

The tests failed with a SEGV.
Error log: https://gist.github.com/aajisaka/f2982c01c2dccb0d8b679d1a0ece6827

> Some CTESTs are failing after HADOOP-16054
> --
>
> Key: HDFS-15232
> URL: https://issues.apache.org/jira/browse/HDFS-15232
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: native
>Reporter: Akira Ajisaka
>Priority: Major
>
> Failed CTEST tests after HADOOP-16054:
> * remote_block_reader
> * memcheck_remote_block_reader
> * bad_datanode
> * memcheck_bad_datanode






[jira] [Updated] (HDFS-15075) Remove process command timing from BPServiceActor

2020-03-22 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-15075:
---
Attachment: HDFS-15075.007.patch

> Remove process command timing from BPServiceActor
> -
>
> Key: HDFS-15075
> URL: https://issues.apache.org/jira/browse/HDFS-15075
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15075.001.patch, HDFS-15075.002.patch, 
> HDFS-15075.003.patch, HDFS-15075.004.patch, HDFS-15075.005.patch, 
> HDFS-15075.006.patch, HDFS-15075.007.patch
>
>
> HDFS-14997 made command processing asynchronous.
> Right now, we are timing how long it takes to add a command to the queue.
> We should remove this measurement and perhaps move the timing inside the 
> processing thread.






[jira] [Created] (HDFS-15232) Some CTESTs are failing after HADOOP-16054

2020-03-22 Thread Akira Ajisaka (Jira)
Akira Ajisaka created HDFS-15232:


 Summary: Some CTESTs are failing after HADOOP-16054
 Key: HDFS-15232
 URL: https://issues.apache.org/jira/browse/HDFS-15232
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: native
Reporter: Akira Ajisaka


Failed CTEST tests after HADOOP-16054:
* remote_block_reader
* memcheck_remote_block_reader
* bad_datanode
* memcheck_bad_datanode






[jira] [Commented] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-03-22 Thread Wei-Chiu Chuang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064516#comment-17064516
 ] 

Wei-Chiu Chuang commented on HDFS-15113:


+1

> Missing IBR when NameNode restart if open processCommand async feature
> --
>
> Key: HDFS-15113
> URL: https://issues.apache.org/jira/browse/HDFS-15113
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Blocker
> Fix For: 3.3.0
>
> Attachments: HDFS-15113.001.patch, HDFS-15113.002.patch, 
> HDFS-15113.003.patch, HDFS-15113.004.patch, HDFS-15113.005.patch, 
> HDFS-15113.addendum.patch
>
>
> Recently I met a case where the NameNode was missing blocks after a restart, 
> which is related to HDFS-14997.
> a. During NameNode restart, the NameNode returns the `DNA_REGISTER` command to 
> a DataNode when it receives certain RPC requests from that DataNode.
> b. When the DataNode receives the `DNA_REGISTER` command, it runs #reRegister 
> asynchronously.
> {code:java}
>   void reRegister() throws IOException {
>     if (shouldRun()) {
>       // re-retrieve namespace info to make sure that, if the NN
>       // was restarted, we still match its version (HDFS-2120)
>       NamespaceInfo nsInfo = retrieveNamespaceInfo();
>       // and re-register
>       register(nsInfo);
>       scheduler.scheduleHeartbeat();
>       // HDFS-9917, Standby NN IBR can be very huge if standby namenode is down
>       // for some time.
>       if (state == HAServiceState.STANDBY || state == HAServiceState.OBSERVER) {
>         ibrManager.clearIBRs();
>       }
>     }
>   }
> {code}
> c. As we know, #register triggers a full block report (FBR) immediately.
> d. Because #reRegister runs asynchronously, we cannot be sure whether sending 
> the FBR or clearing the IBRs happens first. If clearing the IBRs runs first, it 
> is OK. But if the FBR is sent first and the IBRs are cleared afterwards, blocks 
> received between these two points in time will be missing until the next FBR.
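A tiny self-contained model of the ordering problem in point d (not DataNode
code, names are illustrative only); it shows how a block received between the
FBR and the clear goes unreported until the next FBR:
{code:java}
import java.util.ArrayList;
import java.util.List;

/** Toy model of the FBR / clearIBRs race, not the actual implementation. */
public class IbrRaceSketch {

  static List<String> pendingIbrs = new ArrayList<>();
  static List<String> reportedToNn = new ArrayList<>();

  static void sendFullBlockReport() {
    reportedToNn.addAll(pendingIbrs);   // the FBR covers everything known so far
  }

  public static void main(String[] args) {
    pendingIbrs.add("blk_1");           // received before re-registration
    sendFullBlockReport();              // FBR goes out first...
    pendingIbrs.add("blk_2");           // ...a new block arrives...
    pendingIbrs.clear();                // ...then clearIBRs() wipes it
    System.out.println(reportedToNn);   // [blk_1] -- blk_2 is missing until next FBR
  }
}
{code}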






[jira] [Commented] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-03-22 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064506#comment-17064506
 ] 

Xiaoqiao He commented on HDFS-15113:


Thanks [~weichiu]. Please refer to the Yetus result: 
https://builds.apache.org/job/PreCommit-HDFS-Build/29007/console

> Missing IBR when NameNode restart if open processCommand async feature
> --
>
> Key: HDFS-15113
> URL: https://issues.apache.org/jira/browse/HDFS-15113
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Blocker
> Fix For: 3.3.0
>
> Attachments: HDFS-15113.001.patch, HDFS-15113.002.patch, 
> HDFS-15113.003.patch, HDFS-15113.004.patch, HDFS-15113.005.patch, 
> HDFS-15113.addendum.patch
>
>
> Recently I met a case where the NameNode was missing blocks after a restart, 
> which is related to HDFS-14997.
> a. During NameNode restart, the NameNode returns the `DNA_REGISTER` command to 
> a DataNode when it receives certain RPC requests from that DataNode.
> b. When the DataNode receives the `DNA_REGISTER` command, it runs #reRegister 
> asynchronously.
> {code:java}
>   void reRegister() throws IOException {
>     if (shouldRun()) {
>       // re-retrieve namespace info to make sure that, if the NN
>       // was restarted, we still match its version (HDFS-2120)
>       NamespaceInfo nsInfo = retrieveNamespaceInfo();
>       // and re-register
>       register(nsInfo);
>       scheduler.scheduleHeartbeat();
>       // HDFS-9917, Standby NN IBR can be very huge if standby namenode is down
>       // for some time.
>       if (state == HAServiceState.STANDBY || state == HAServiceState.OBSERVER) {
>         ibrManager.clearIBRs();
>       }
>     }
>   }
> {code}
> c. As we know, #register triggers a full block report (FBR) immediately.
> d. Because #reRegister runs asynchronously, we cannot be sure whether sending 
> the FBR or clearing the IBRs happens first. If clearing the IBRs runs first, it 
> is OK. But if the FBR is sent first and the IBRs are cleared afterwards, blocks 
> received between these two points in time will be missing until the next FBR.






[jira] [Commented] (HDFS-14783) Expired SampleStat needs to be removed from SlowPeersReport

2020-03-22 Thread Haibin Huang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064491#comment-17064491
 ] 

Haibin Huang commented on HDFS-14783:
-

[~elgoiri], I'm sorry, the newest patch is [^HDFS-14783-005.patch], and I don't 
know why it doesn't trigger Hadoop QA any more.

> Expired SampleStat needs to be removed from SlowPeersReport
> ---
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, 
> HDFS-14783-003.patch, HDFS-14783-004.patch, HDFS-14783-005.patch
>
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it 
> can appear in the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> The SampleStat is stored in a LinkedBlockingDeque and is not removed until the 
> queue is full and a newer one is generated. Therefore, if dn1 does not send any 
> packets to dn2 for a long time, the old SampleStat stays in the queue and is 
> still used to calculate slow peers. I think these old SampleStats should be 
> treated as expired and ignored when generating a new SlowPeersReport.
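As a rough sketch of the proposal (hypothetical names, not the attached patch),
expiring old samples by timestamp before building the report could look like:
{code:java}
import java.util.HashMap;
import java.util.Map;

/** Toy sketch: skip SampleStats older than a window when building the report. */
public class ExpiringSamplesSketch {

  static class TimedSample {
    final double meanLatencyMs;
    final long capturedAtMs;
    TimedSample(double meanLatencyMs, long capturedAtMs) {
      this.meanLatencyMs = meanLatencyMs;
      this.capturedAtMs = capturedAtMs;
    }
  }

  /** Keep only samples newer than maxAgeMs when computing the slow peer report. */
  static Map<String, Double> freshLatencies(Map<String, TimedSample> samples,
      long nowMs, long maxAgeMs) {
    Map<String, Double> fresh = new HashMap<>();
    for (Map.Entry<String, TimedSample> e : samples.entrySet()) {
      if (nowMs - e.getValue().capturedAtMs <= maxAgeMs) {
        fresh.put(e.getKey(), e.getValue().meanLatencyMs);
      }
      // expired entries are simply ignored
    }
    return fresh;
  }
}
{code}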






[jira] [Updated] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-03-22 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-15113:
---
Status: Patch Available  (was: Reopened)

> Missing IBR when NameNode restart if open processCommand async feature
> --
>
> Key: HDFS-15113
> URL: https://issues.apache.org/jira/browse/HDFS-15113
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Blocker
> Fix For: 3.3.0
>
> Attachments: HDFS-15113.001.patch, HDFS-15113.002.patch, 
> HDFS-15113.003.patch, HDFS-15113.004.patch, HDFS-15113.005.patch, 
> HDFS-15113.addendum.patch
>
>
> Recently I met a case where the NameNode was missing blocks after a restart, 
> which is related to HDFS-14997.
> a. During NameNode restart, the NameNode returns the `DNA_REGISTER` command to 
> a DataNode when it receives certain RPC requests from that DataNode.
> b. When the DataNode receives the `DNA_REGISTER` command, it runs #reRegister 
> asynchronously.
> {code:java}
>   void reRegister() throws IOException {
>     if (shouldRun()) {
>       // re-retrieve namespace info to make sure that, if the NN
>       // was restarted, we still match its version (HDFS-2120)
>       NamespaceInfo nsInfo = retrieveNamespaceInfo();
>       // and re-register
>       register(nsInfo);
>       scheduler.scheduleHeartbeat();
>       // HDFS-9917, Standby NN IBR can be very huge if standby namenode is down
>       // for some time.
>       if (state == HAServiceState.STANDBY || state == HAServiceState.OBSERVER) {
>         ibrManager.clearIBRs();
>       }
>     }
>   }
> {code}
> c. As we know, #register triggers a full block report (FBR) immediately.
> d. Because #reRegister runs asynchronously, we cannot be sure whether sending 
> the FBR or clearing the IBRs happens first. If clearing the IBRs runs first, it 
> is OK. But if the FBR is sent first and the IBRs are cleared afterwards, blocks 
> received between these two points in time will be missing until the next FBR.






[jira] [Reopened] (HDFS-15113) Missing IBR when NameNode restart if open processCommand async feature

2020-03-22 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reopened HDFS-15113:


Reopening to have the addendum tested.

> Missing IBR when NameNode restart if open processCommand async feature
> --
>
> Key: HDFS-15113
> URL: https://issues.apache.org/jira/browse/HDFS-15113
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Blocker
> Fix For: 3.3.0
>
> Attachments: HDFS-15113.001.patch, HDFS-15113.002.patch, 
> HDFS-15113.003.patch, HDFS-15113.004.patch, HDFS-15113.005.patch, 
> HDFS-15113.addendum.patch
>
>
> Recently I met a case where the NameNode was missing blocks after a restart, 
> which is related to HDFS-14997.
> a. During NameNode restart, the NameNode returns the `DNA_REGISTER` command to 
> a DataNode when it receives certain RPC requests from that DataNode.
> b. When the DataNode receives the `DNA_REGISTER` command, it runs #reRegister 
> asynchronously.
> {code:java}
>   void reRegister() throws IOException {
>     if (shouldRun()) {
>       // re-retrieve namespace info to make sure that, if the NN
>       // was restarted, we still match its version (HDFS-2120)
>       NamespaceInfo nsInfo = retrieveNamespaceInfo();
>       // and re-register
>       register(nsInfo);
>       scheduler.scheduleHeartbeat();
>       // HDFS-9917, Standby NN IBR can be very huge if standby namenode is down
>       // for some time.
>       if (state == HAServiceState.STANDBY || state == HAServiceState.OBSERVER) {
>         ibrManager.clearIBRs();
>       }
>     }
>   }
> {code}
> c. As we know, #register triggers a full block report (FBR) immediately.
> d. Because #reRegister runs asynchronously, we cannot be sure whether sending 
> the FBR or clearing the IBRs happens first. If clearing the IBRs runs first, it 
> is OK. But if the FBR is sent first and the IBRs are cleared afterwards, blocks 
> received between these two points in time will be missing until the next FBR.






[jira] [Commented] (HDFS-14783) Expired SampleStat needs to be removed from SlowPeersReport

2020-03-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-14783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064332#comment-17064332
 ] 

Íñigo Goiri commented on HDFS-14783:


+1 on  [^HDFS-14783-003.patch].

> Expired SampleStat needs to be removed from SlowPeersReport
> ---
>
> Key: HDFS-14783
> URL: https://issues.apache.org/jira/browse/HDFS-14783
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Major
> Attachments: HDFS-14783, HDFS-14783-001.patch, HDFS-14783-002.patch, 
> HDFS-14783-003.patch, HDFS-14783-004.patch, HDFS-14783-005.patch
>
>
> SlowPeersReport is calculated from the SampleStat between two DataNodes, so it 
> can appear in the NameNode's JMX like this:
> {code:java}
> "SlowPeersReport" :[{"SlowNode":"dn2","ReportingNodes":["dn1"]}]
> {code}
> The SampleStat is stored in a LinkedBlockingDeque and is not removed until the 
> queue is full and a newer one is generated. Therefore, if dn1 does not send any 
> packets to dn2 for a long time, the old SampleStat stays in the queue and is 
> still used to calculate slow peers. I think these old SampleStats should be 
> treated as expired and ignored when generating a new SlowPeersReport.






[jira] [Comment Edited] (HDFS-15075) Remove process command timing from BPServiceActor

2020-03-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064329#comment-17064329
 ] 

Íñigo Goiri edited comment on HDFS-15075 at 3/22/20, 4:14 PM:
--

Thanks [~hexiaoqiao] for the updated patch.
Good call on the doc file.
After checking more, I have a few minor comments... sorry for bringing them up 
late:
* The finally block in the new TestDataNodeMetrics test should be expanded; 
usually Yetus would complain.
* Add a few basic comments to the new test (e.g., "Write into a file to trigger 
DN metrics").
* I know that DataNodeMetrics doesn't have javadocs, but given that some 
parameters are named with "latency" and others with "millis", I would call all 
of them "latency" and add a javadoc saying the unit is milliseconds (let's do 
this just for the new methods).
* Let's move the subtraction of the time inside the null checks all over 
FsDatasetImpl; there is no point doing the subtraction and then not doing 
anything when the metrics object is null.


was (Author: elgoiri):
Thanks [~hexiaoqiao] for the updated patch.
Good call on the doc file.
After checking more, I have a few minor comments... sorry for bringing them up 
late:
* The finally block in the new TestDataNodeMetrics test, should be expanded. 
Usually, Yetus would complain.
* Add a few basic comments to the new test (e.g., "Write into a file to trigger 
DN metrics".
* I know that DataNodeMetrics doesn't have javadocs, but given that we have 
latencies and millis in some parameters, I would make all of them called 
"latency" and to have a javadoc saying is milliseconds (let's do this just for 
the new methods).
* Let's move the substraction of the time inside the null checks all over 
FsDatasetImpl, there is no point doing the substraction and then not doing 
anything if it is null.

> Remove process command timing from BPServiceActor
> -
>
> Key: HDFS-15075
> URL: https://issues.apache.org/jira/browse/HDFS-15075
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15075.001.patch, HDFS-15075.002.patch, 
> HDFS-15075.003.patch, HDFS-15075.004.patch, HDFS-15075.005.patch, 
> HDFS-15075.006.patch
>
>
> HDFS-14997 made command processing asynchronous.
> Right now, we are timing how long it takes to add a command to the queue.
> We should remove this measurement and perhaps move the timing inside the 
> processing thread.






[jira] [Commented] (HDFS-15075) Remove process command timing from BPServiceActor

2020-03-22 Thread Jira


[ 
https://issues.apache.org/jira/browse/HDFS-15075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064329#comment-17064329
 ] 

Íñigo Goiri commented on HDFS-15075:


Thanks [~hexiaoqiao] for the updated patch.
Good call on the doc file.
After checking more, I have a few minor comments... sorry for bringing them up 
late:
* The finally block in the new TestDataNodeMetrics test should be expanded; 
usually Yetus would complain.
* Add a few basic comments to the new test (e.g., "Write into a file to trigger 
DN metrics").
* I know that DataNodeMetrics doesn't have javadocs, but given that some 
parameters are named with "latency" and others with "millis", I would call all 
of them "latency" and add a javadoc saying the unit is milliseconds (let's do 
this just for the new methods).
* Let's move the subtraction of the time inside the null checks all over 
FsDatasetImpl; there is no point doing the subtraction and then not doing 
anything when the metrics object is null.
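
To illustrate that last point, a minimal sketch with hypothetical names (not
the actual FsDatasetImpl members):
{code:java}
/** Toy sketch: compute the elapsed time only when it will be recorded. */
class LatencyRecordingSketch {
  interface Metrics { void addLatency(long millis); }

  private final Metrics metrics;        // may be null in some configurations
  LatencyRecordingSketch(Metrics metrics) { this.metrics = metrics; }

  void doWork() {
    long startMs = System.currentTimeMillis();
    // ... the actual work ...
    if (metrics != null) {
      // the subtraction happens inside the null check, never wasted
      metrics.addLatency(System.currentTimeMillis() - startMs);
    }
  }
}
{code}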

> Remove process command timing from BPServiceActor
> -
>
> Key: HDFS-15075
> URL: https://issues.apache.org/jira/browse/HDFS-15075
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15075.001.patch, HDFS-15075.002.patch, 
> HDFS-15075.003.patch, HDFS-15075.004.patch, HDFS-15075.005.patch, 
> HDFS-15075.006.patch
>
>
> HDFS-14997 made command processing asynchronous.
> Right now, we are timing how long it takes to add a command to the queue.
> We should remove this measurement and perhaps move the timing inside the 
> processing thread.






[jira] [Updated] (HDFS-15191) EOF when reading legacy buffer in BlockTokenIdentifier

2020-03-22 Thread Steven Rand (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rand updated HDFS-15191:
---
Attachment: HDFS-15191-002.patch

> EOF when reading legacy buffer in BlockTokenIdentifier
> --
>
> Key: HDFS-15191
> URL: https://issues.apache.org/jira/browse/HDFS-15191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.1
>Reporter: Steven Rand
>Assignee: Steven Rand
>Priority: Major
> Attachments: HDFS-15191-001.patch, HDFS-15191-002.patch
>
>
> We have an HDFS client application which recently upgraded from 3.2.0 to 
> 3.2.1. After this upgrade (but not before), we sometimes see these errors 
> when this application is used with clusters still running Hadoop 2.x (more 
> specifically CDH 5.12.1):
> {code}
> WARN  [2020-02-24T00:54:32.856Z] 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory: I/O error constructing 
> remote block reader. (_sampled: true)
> java.io.EOFException:
> at java.io.DataInputStream.readByte(DataInputStream.java:272)
> at 
> org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
> at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
> at 
> org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:227)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.peerSend(SaslDataTransferClient.java:170)
> at 
> org.apache.hadoop.hdfs.DFSUtilClient.peerFromSocketAndKey(DFSUtilClient.java:730)
> at 
> org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2942)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:822)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:747)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:380)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:829)
> at java.io.DataInputStream.read(DataInputStream.java:100)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2314)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2270)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2291)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2246)
> at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:765)
> {code}
> We get this warning for all DataNodes with a copy of the block, so the read 
> fails.
> I haven't been able to figure out what changed between 3.2.0 and 3.2.1 to 
> cause this, but HDFS-13617 and HDFS-14611 seem related, so tagging 
> [~vagarychen] in case you have any ideas.






[jira] [Updated] (HDFS-15191) EOF when reading legacy buffer in BlockTokenIdentifier

2020-03-22 Thread Steven Rand (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rand updated HDFS-15191:
---
Status: Open  (was: Patch Available)

> EOF when reading legacy buffer in BlockTokenIdentifier
> --
>
> Key: HDFS-15191
> URL: https://issues.apache.org/jira/browse/HDFS-15191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.1
>Reporter: Steven Rand
>Assignee: Steven Rand
>Priority: Major
> Attachments: HDFS-15191-001.patch, HDFS-15191-002.patch
>
>
> We have an HDFS client application which recently upgraded from 3.2.0 to 
> 3.2.1. After this upgrade (but not before), we sometimes see these errors 
> when this application is used with clusters still running Hadoop 2.x (more 
> specifically CDH 5.12.1):
> {code}
> WARN  [2020-02-24T00:54:32.856Z] 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory: I/O error constructing 
> remote block reader. (_sampled: true)
> java.io.EOFException:
> at java.io.DataInputStream.readByte(DataInputStream.java:272)
> at 
> org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
> at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
> at 
> org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:227)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.peerSend(SaslDataTransferClient.java:170)
> at 
> org.apache.hadoop.hdfs.DFSUtilClient.peerFromSocketAndKey(DFSUtilClient.java:730)
> at 
> org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2942)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:822)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:747)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:380)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:829)
> at java.io.DataInputStream.read(DataInputStream.java:100)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2314)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2270)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2291)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2246)
> at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:765)
> {code}
> We get this warning for all DataNodes with a copy of the block, so the read 
> fails.
> I haven't been able to figure out what changed between 3.2.0 and 3.2.1 to 
> cause this, but HDFS-13617 and HDFS-14611 seem related, so tagging 
> [~vagarychen] in case you have any ideas.






[jira] [Updated] (HDFS-15191) EOF when reading legacy buffer in BlockTokenIdentifier

2020-03-22 Thread Steven Rand (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rand updated HDFS-15191:
---
Status: Patch Available  (was: Open)

> EOF when reading legacy buffer in BlockTokenIdentifier
> --
>
> Key: HDFS-15191
> URL: https://issues.apache.org/jira/browse/HDFS-15191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.1
>Reporter: Steven Rand
>Assignee: Steven Rand
>Priority: Major
> Attachments: HDFS-15191-001.patch, HDFS-15191-002.patch
>
>
> We have an HDFS client application which recently upgraded from 3.2.0 to 
> 3.2.1. After this upgrade (but not before), we sometimes see these errors 
> when this application is used with clusters still running Hadoop 2.x (more 
> specifically CDH 5.12.1):
> {code}
> WARN  [2020-02-24T00:54:32.856Z] 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory: I/O error constructing 
> remote block reader. (_sampled: true)
> java.io.EOFException:
> at java.io.DataInputStream.readByte(DataInputStream.java:272)
> at 
> org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
> at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
> at 
> org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:227)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.peerSend(SaslDataTransferClient.java:170)
> at 
> org.apache.hadoop.hdfs.DFSUtilClient.peerFromSocketAndKey(DFSUtilClient.java:730)
> at 
> org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2942)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:822)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:747)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:380)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:829)
> at java.io.DataInputStream.read(DataInputStream.java:100)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2314)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2270)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2291)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2246)
> at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:765)
> {code}
> We get this warning for all DataNodes with a copy of the block, so the read 
> fails.
> I haven't been able to figure out what changed between 3.2.0 and 3.2.1 to 
> cause this, but HDFS-13617 and HDFS-14611 seem related, so tagging 
> [~vagarychen] in case you have any ideas.






[jira] [Commented] (HDFS-15191) EOF when reading legacy buffer in BlockTokenIdentifier

2020-03-22 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064280#comment-17064280
 ] 

Hadoop QA commented on HDFS-15191:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  8s{color} 
| {color:red} HDFS-15191 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-15191 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12997361/HDFS-15191-001.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29005/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> EOF when reading legacy buffer in BlockTokenIdentifier
> --
>
> Key: HDFS-15191
> URL: https://issues.apache.org/jira/browse/HDFS-15191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.1
>Reporter: Steven Rand
>Assignee: Steven Rand
>Priority: Major
> Attachments: HDFS-15191-001.patch
>
>
> We have an HDFS client application which recently upgraded from 3.2.0 to 
> 3.2.1. After this upgrade (but not before), we sometimes see these errors 
> when this application is used with clusters still running Hadoop 2.x (more 
> specifically CDH 5.12.1):
> {code}
> WARN  [2020-02-24T00:54:32.856Z] 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory: I/O error constructing 
> remote block reader. (_sampled: true)
> java.io.EOFException:
> at java.io.DataInputStream.readByte(DataInputStream.java:272)
> at 
> org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
> at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
> at 
> org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:227)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.peerSend(SaslDataTransferClient.java:170)
> at 
> org.apache.hadoop.hdfs.DFSUtilClient.peerFromSocketAndKey(DFSUtilClient.java:730)
> at 
> org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2942)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:822)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:747)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:380)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:829)
> at java.io.DataInputStream.read(DataInputStream.java:100)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2314)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2270)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2291)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2246)
> at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:765)
> {code}
> We get this warning for all DataNodes with a copy of the block, so the read 
> fails.
> I haven't been able to figure out what changed between 3.2.0 and 3.2.1 to 
> cause this, but HDFS-13617 and HDFS-14611 seem related, so tagging 
> [~vagarychen] in case you have any ideas.



[jira] [Commented] (HDFS-15191) EOF when reading legacy buffer in BlockTokenIdentifier

2020-03-22 Thread Steven Rand (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064269#comment-17064269
 ] 

Steven Rand commented on HDFS-15191:


Attached a patch which uses the same approach as HDFS-14611, which is to just 
catch the EOF. Also included a unit test which fails without the rest of the 
patch and passes with it.
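
For illustration, a minimal self-contained sketch of the catch-the-EOF approach
(not the actual BlockTokenIdentifier code; names are hypothetical):
{code:java}
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

/**
 * Toy sketch: fields added after the legacy format are treated as optional,
 * so hitting end-of-stream while reading them is not an error.
 */
public class LegacyFieldReaderSketch {

  static String readOptionalTrailingField(DataInputStream in) throws IOException {
    try {
      return in.readUTF();      // field that old writers never serialized
    } catch (EOFException e) {
      return null;              // legacy buffer: stop here instead of failing
    }
  }

  public static void main(String[] args) throws IOException {
    // An empty buffer simulates a token written by an older cluster.
    DataInputStream legacy =
        new DataInputStream(new ByteArrayInputStream(new byte[0]));
    System.out.println(readOptionalTrailingField(legacy));  // null, no exception
  }
}
{code}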

> EOF when reading legacy buffer in BlockTokenIdentifier
> --
>
> Key: HDFS-15191
> URL: https://issues.apache.org/jira/browse/HDFS-15191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.1
>Reporter: Steven Rand
>Assignee: Steven Rand
>Priority: Major
> Attachments: HDFS-15191-001.patch
>
>
> We have an HDFS client application which recently upgraded from 3.2.0 to 
> 3.2.1. After this upgrade (but not before), we sometimes see these errors 
> when this application is used with clusters still running Hadoop 2.x (more 
> specifically CDH 5.12.1):
> {code}
> WARN  [2020-02-24T00:54:32.856Z] 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory: I/O error constructing 
> remote block reader. (_sampled: true)
> java.io.EOFException:
> at java.io.DataInputStream.readByte(DataInputStream.java:272)
> at 
> org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
> at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
> at 
> org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:227)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.peerSend(SaslDataTransferClient.java:170)
> at 
> org.apache.hadoop.hdfs.DFSUtilClient.peerFromSocketAndKey(DFSUtilClient.java:730)
> at 
> org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2942)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:822)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:747)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:380)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:829)
> at java.io.DataInputStream.read(DataInputStream.java:100)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2314)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2270)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2291)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2246)
> at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:765)
> {code}
> We get this warning for all DataNodes with a copy of the block, so the read 
> fails.
> I haven't been able to figure out what changed between 3.2.0 and 3.2.1 to 
> cause this, but HDFS-13617 and HDFS-14611 seem related, so tagging 
> [~vagarychen] in case you have any ideas.






[jira] [Updated] (HDFS-15191) EOF when reading legacy buffer in BlockTokenIdentifier

2020-03-22 Thread Steven Rand (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rand updated HDFS-15191:
---
Attachment: HDFS-15191-001.patch

> EOF when reading legacy buffer in BlockTokenIdentifier
> --
>
> Key: HDFS-15191
> URL: https://issues.apache.org/jira/browse/HDFS-15191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.1
>Reporter: Steven Rand
>Priority: Major
> Attachments: HDFS-15191-001.patch
>
>
> We have an HDFS client application which recently upgraded from 3.2.0 to 
> 3.2.1. After this upgrade (but not before), we sometimes see these errors 
> when this application is used with clusters still running Hadoop 2.x (more 
> specifically CDH 5.12.1):
> {code}
> WARN  [2020-02-24T00:54:32.856Z] 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory: I/O error constructing 
> remote block reader. (_sampled: true)
> java.io.EOFException:
> at java.io.DataInputStream.readByte(DataInputStream.java:272)
> at 
> org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
> at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
> at 
> org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:227)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.peerSend(SaslDataTransferClient.java:170)
> at 
> org.apache.hadoop.hdfs.DFSUtilClient.peerFromSocketAndKey(DFSUtilClient.java:730)
> at 
> org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2942)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:822)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:747)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:380)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:829)
> at java.io.DataInputStream.read(DataInputStream.java:100)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2314)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2270)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2291)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2246)
> at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:765)
> {code}
> We get this warning for all DataNodes with a copy of the block, so the read 
> fails.
> I haven't been able to figure out what changed between 3.2.0 and 3.2.1 to 
> cause this, but HDFS-13617 and HDFS-14611 seem related, so tagging 
> [~vagarychen] in case you have any ideas.






[jira] [Updated] (HDFS-15191) EOF when reading legacy buffer in BlockTokenIdentifier

2020-03-22 Thread Steven Rand (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Rand updated HDFS-15191:
---
Assignee: Steven Rand
  Status: Patch Available  (was: Open)

> EOF when reading legacy buffer in BlockTokenIdentifier
> --
>
> Key: HDFS-15191
> URL: https://issues.apache.org/jira/browse/HDFS-15191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.1
>Reporter: Steven Rand
>Assignee: Steven Rand
>Priority: Major
> Attachments: HDFS-15191-001.patch
>
>
> We have an HDFS client application which recently upgraded from 3.2.0 to 
> 3.2.1. After this upgrade (but not before), we sometimes see these errors 
> when this application is used with clusters still running Hadoop 2.x (more 
> specifically CDH 5.12.1):
> {code}
> WARN  [2020-02-24T00:54:32.856Z] 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory: I/O error constructing 
> remote block reader. (_sampled: true)
> java.io.EOFException:
> at java.io.DataInputStream.readByte(DataInputStream.java:272)
> at 
> org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
> at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
> at 
> org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:227)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.peerSend(SaslDataTransferClient.java:170)
> at 
> org.apache.hadoop.hdfs.DFSUtilClient.peerFromSocketAndKey(DFSUtilClient.java:730)
> at 
> org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2942)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:822)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:747)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:380)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:829)
> at java.io.DataInputStream.read(DataInputStream.java:100)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2314)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2270)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2291)
> at org.apache.commons.io.IOUtils.copy(IOUtils.java:2246)
> at org.apache.commons.io.IOUtils.toByteArray(IOUtils.java:765)
> {code}
> We get this warning for all DataNodes with a copy of the block, so the read 
> fails.
> I haven't been able to figure out what changed between 3.2.0 and 3.2.1 to 
> cause this, but HDFS-13617 and HDFS-14611 seem related, so tagging 
> [~vagarychen] in case you have any ideas.






[jira] [Commented] (HDFS-12862) CacheDirective becomes invalid when NN restart or failover

2020-03-22 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064262#comment-17064262
 ] 

Hadoop QA commented on HDFS-12862:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
21s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
 6s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
18s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
19s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 54s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
4s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} branch-3.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  7s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 49s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
28s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}161m  4s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.diskbalancer.TestDiskBalancer |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.8 Server=19.03.8 Image:yetus/hadoop:70a0ef5d4a6 |
| JIRA Issue | HDFS-12862 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12973890/HDFS-12862.branch-3.1.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 978d64ac9523 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.1 / 61915fb |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29002/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29002/testReport/ |
| Max. process+thread count | 2902 (vs. ulimit of 5500) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29002/console |
| Power

[jira] [Commented] (HDFS-12733) Option to disable to namenode local edits

2020-03-22 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064253#comment-17064253
 ] 

Xiaoqiao He commented on HDFS-12733:


Hi [~shv],[~brahmareddy],[~elgoiri],[~ayushtkn], sorry for leaving this issue 
pending for a long time since we did not reach agreement last time, but the issue 
is still there and can sometimes impact performance. I would like to know if we 
could move forward without introducing a new configuration parameter, as v008 
shows. Thanks everyone again.
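
For context, one hedged sketch of what "moving forward without a new configuration 
parameter" could look like: derive the decision from the HA setup itself and 
journal only to the shared edits directories when they are configured. This is 
illustrative only, not necessarily what v008 does.

{code:java}
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class EditsDirSelector {
  /**
   * Illustrative helper (not the real HDFS code): pick the edits directories
   * the NameNode should journal to. If shared edits (QJM/NFS) are configured,
   * the redundant local-only directories are skipped; otherwise everything is
   * kept, preserving the non-HA behaviour.
   */
  static List<URI> selectEditsDirs(List<URI> allEditsDirs,
                                   List<URI> sharedEditsDirs) {
    if (sharedEditsDirs.isEmpty()) {
      return allEditsDirs;                  // non-HA: keep local edits dirs
    }
    List<URI> selected = new ArrayList<>();
    for (URI dir : allEditsDirs) {
      if (sharedEditsDirs.contains(dir)) {  // HA: keep only shared dirs
        selected.add(dir);
      }
    }
    return selected;
  }
}
{code}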

> Option to disable to namenode local edits
> -
>
> Key: HDFS-12733
> URL: https://issues.apache.org/jira/browse/HDFS-12733
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, performance
>Reporter: Brahma Reddy Battula
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-12733-001.patch, HDFS-12733-002.patch, 
> HDFS-12733-003.patch, HDFS-12733.004.patch, HDFS-12733.005.patch, 
> HDFS-12733.006.patch, HDFS-12733.007.patch, HDFS-12733.008.patch
>
>
> As of now, edits are written to both local and shared locations, which is 
> redundant since the local edits are never used in an HA setup.
> Disabling local edits gives a small performance improvement.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15191) EOF when reading legacy buffer in BlockTokenIdentifier

2020-03-22 Thread Steven Rand (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17063999#comment-17063999
 ] 

Steven Rand edited comment on HDFS-15191 at 3/22/20, 12:10 PM:
---

Also, I noticed that the original implementation of 
{{BlockTokenIdentifier#readFieldsLegacy}} from HDFS-11026 stops at exactly the 
right place, and it was only after HDFS-6708 that we can get this EOF, because 
it added more to that method: 
[https://github.com/apache/hadoop/commit/2f73396b5901fd5fe29f6cd76fc1b3134b854b37#diff-4bd9e663e048a075423d94ba53b8d315R217].
 
It seems like we should remove the additions to 
{{BlockTokenIdentifier#readFieldsLegacy}} from HDFS-6708, since if we're 
reading a legacy block token, then we know that the NameNode was on too old of 
a version to add the information about storages to the block token. Does that 
sound right?


was (Author: steven rand):
Also, I noticed that the original implementation of 
{{BlockTokenIdentifier#readFieldsLegacy}} from HDFS-11026 stops at exactly the 
right place, and it was only after HDFS-6708 that we can get this EOF, because 
it added more to that method: 
[https://github.com/apache/hadoop/commit/2f73396b5901fd5fe29f6cd76fc1b3134b854b37#diff-4bd9e663e048a075423d94ba53b8d315R217].
 
It seems like we should remove the additions to 
{{BlockTokenIdentifier#readFieldsLegacy}} from HDFS-6708, since if we're 
reading a legacy block token, then we know that the DataNode was on too old of 
a version to add the information about storages to the block token. Does that 
sound right?
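
To make the failure mode concrete, here is a small, self-contained illustration. 
It does not reproduce the real BlockTokenIdentifier wire format (the fields and 
values below are made up); it only shows how a reader that also expects the 
post-HDFS-6708 storage fields runs off the end of a buffer produced by a legacy 
writer and gets the EOFException seen in the stack trace.

{code:java}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;

import org.apache.hadoop.io.WritableUtils;

public class LegacyTokenEofDemo {
  public static void main(String[] args) throws IOException {
    // "Legacy" writer: serializes only the old fields.
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    WritableUtils.writeVLong(out, 1234567890L);   // e.g. expiryDate
    WritableUtils.writeVInt(out, 42);             // e.g. keyId
    WritableUtils.writeVLong(out, 1073741825L);   // e.g. blockId

    // Reader: consumes the legacy fields, then also expects a newer
    // storage-related field that the old writer never wrote.
    DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
    WritableUtils.readVLong(in);
    WritableUtils.readVInt(in);
    WritableUtils.readVLong(in);
    try {
      WritableUtils.readVInt(in);                 // storageTypes length: missing
    } catch (EOFException e) {
      // Same EOF as in DataInputStream.readByte in the stack trace above.
      System.out.println("EOFException while reading the extra field: " + e);
    }
  }
}
{code}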

> EOF when reading legacy buffer in BlockTokenIdentifier
> --
>
> Key: HDFS-15191
> URL: https://issues.apache.org/jira/browse/HDFS-15191
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.2.1
>Reporter: Steven Rand
>Priority: Major
>
> We have an HDFS client application which recently upgraded from 3.2.0 to 
> 3.2.1. After this upgrade (but not before), we sometimes see these errors 
> when this application is used with clusters still running Hadoop 2.x (more 
> specifically CDH 5.12.1):
> {code}
> WARN  [2020-02-24T00:54:32.856Z] 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory: I/O error constructing 
> remote block reader. (_sampled: true)
> java.io.EOFException:
> at java.io.DataInputStream.readByte(DataInputStream.java:272)
> at 
> org.apache.hadoop.io.WritableUtils.readVLong(WritableUtils.java:308)
> at org.apache.hadoop.io.WritableUtils.readVInt(WritableUtils.java:329)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFieldsLegacy(BlockTokenIdentifier.java:240)
> at 
> org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier.readFields(BlockTokenIdentifier.java:221)
> at 
> org.apache.hadoop.security.token.Token.decodeIdentifier(Token.java:200)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:530)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:342)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:276)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:245)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:227)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.peerSend(SaslDataTransferClient.java:170)
> at 
> org.apache.hadoop.hdfs.DFSUtilClient.peerFromSocketAndKey(DFSUtilClient.java:730)
> at 
> org.apache.hadoop.hdfs.DFSClient.newConnectedPeer(DFSClient.java:2942)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.nextTcpPeer(BlockReaderFactory.java:822)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:747)
> at 
> org.apache.hadoop.hdfs.client.impl.BlockReaderFactory.build(BlockReaderFactory.java:380)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:644)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:575)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:757)
> at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:829)
> at java.io.DataInputStream.read(DataInputStream.java:100)
> at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2314)
> at org.apache.commons.io.IOUtils.copy(I

[jira] [Commented] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only

2020-03-22 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064245#comment-17064245
 ] 

Xiaoqiao He commented on HDFS-15051:


v008 changes the permission check logic when adding a mount point: if the 
immediate parent doesn't exist, only EXECUTE (rather than WRITE) permission is 
checked on the nearest existing ancestor.
[~ayushtkn] Please take another review if you have bandwidth. Thanks.
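
For readers following along, a rough, self-contained sketch of that check 
(hypothetical helper names, not the code in the patch):

{code:java}
import java.io.IOException;

import org.apache.hadoop.fs.permission.FsAction;

public class MountPermissionSketch {
  /** Illustrative view of the mount table tree; not a real RBF interface. */
  interface MountTree {
    boolean exists(String path);
    void checkPermission(String path, FsAction access) throws IOException;
  }

  /** Hypothetical helper: permission check before adding a mount point at src. */
  static void checkCanAddMount(MountTree tree, String src) throws IOException {
    String parent = parentOf(src);
    if (tree.exists(parent)) {
      // Immediate parent exists: adding a child modifies it, so require WRITE.
      tree.checkPermission(parent, FsAction.WRITE);
      return;
    }
    // Immediate parent missing: walk up to the nearest existing ancestor and
    // require only EXECUTE (traversal) on it.
    String ancestor = parentOf(parent);
    while (!ancestor.isEmpty() && !tree.exists(ancestor)) {
      ancestor = parentOf(ancestor);
    }
    tree.checkPermission(ancestor.isEmpty() ? "/" : ancestor, FsAction.EXECUTE);
  }

  private static String parentOf(String path) {
    int idx = path.lastIndexOf('/');
    return idx <= 0 ? "" : path.substring(0, idx);
  }
}
{code}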

> RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
> -
>
> Key: HDFS-15051
> URL: https://issues.apache.org/jira/browse/HDFS-15051
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15051.001.patch, HDFS-15051.002.patch, 
> HDFS-15051.003.patch, HDFS-15051.004.patch, HDFS-15051.005.patch, 
> HDFS-15051.006.patch, HDFS-15051.007.patch, HDFS-15051.008.patch
>
>
> The current permission checker of #MountTableStoreImpl is not very strict.
> In some cases, any user could add/update/remove a MountTableEntry without the 
> expected permission checking.
> The following code segment tries to check permissions when operating on a 
> MountTableEntry; however, the mountTable object comes from the Client/RouterAdmin 
> ({{MountTable mountTable = request.getEntry();}}), and a user could pass any mode, 
> which could bypass the permission checker.
> {code:java}
>   public void checkPermission(MountTable mountTable, FsAction access)
>   throws AccessControlException {
> if (isSuperUser()) {
>   return;
> }
> FsPermission mode = mountTable.getMode();
> if (getUser().equals(mountTable.getOwnerName())
> && mode.getUserAction().implies(access)) {
>   return;
> }
> if (isMemberOfGroup(mountTable.getGroupName())
> && mode.getGroupAction().implies(access)) {
>   return;
> }
> if (!getUser().equals(mountTable.getOwnerName())
> && !isMemberOfGroup(mountTable.getGroupName())
> && mode.getOtherAction().implies(access)) {
>   return;
> }
> throw new AccessControlException(
> "Permission denied while accessing mount table "
> + mountTable.getSourcePath()
> + ": user " + getUser() + " does not have " + access.toString()
> + " permissions.");
>   }
> {code}
> I propose to restrict the WRITE MountTableEntry privilege to the super user only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only

2020-03-22 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-15051:
---
Comment: was deleted

(was: Thanks [~ayushtkn] for picking up this JIRA.
{quote}If the immediate parent doesn't exist, the parent above is checked for 
WRITE permission only, IMO it should be EXECUTE only. If the parent is there then 
we can check WRITE, else we can consider it to exist virtually with the required 
permissions, and move up normally.{quote}
This makes sense to me, I would like to update it in the next two days.)

> RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
> -
>
> Key: HDFS-15051
> URL: https://issues.apache.org/jira/browse/HDFS-15051
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15051.001.patch, HDFS-15051.002.patch, 
> HDFS-15051.003.patch, HDFS-15051.004.patch, HDFS-15051.005.patch, 
> HDFS-15051.006.patch, HDFS-15051.007.patch, HDFS-15051.008.patch
>
>
> The current permission checker of #MountTableStoreImpl is not very strict.
> In some cases, any user could add/update/remove a MountTableEntry without the 
> expected permission checking.
> The following code segment tries to check permissions when operating on a 
> MountTableEntry; however, the mountTable object comes from the Client/RouterAdmin 
> ({{MountTable mountTable = request.getEntry();}}), and a user could pass any mode, 
> which could bypass the permission checker.
> {code:java}
>   public void checkPermission(MountTable mountTable, FsAction access)
>   throws AccessControlException {
> if (isSuperUser()) {
>   return;
> }
> FsPermission mode = mountTable.getMode();
> if (getUser().equals(mountTable.getOwnerName())
> && mode.getUserAction().implies(access)) {
>   return;
> }
> if (isMemberOfGroup(mountTable.getGroupName())
> && mode.getGroupAction().implies(access)) {
>   return;
> }
> if (!getUser().equals(mountTable.getOwnerName())
> && !isMemberOfGroup(mountTable.getGroupName())
> && mode.getOtherAction().implies(access)) {
>   return;
> }
> throw new AccessControlException(
> "Permission denied while accessing mount table "
> + mountTable.getSourcePath()
> + ": user " + getUser() + " does not have " + access.toString()
> + " permissions.");
>   }
> {code}
> I propose to restrict the WRITE MountTableEntry privilege to the super user only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15075) Remove process command timing from BPServiceActor

2020-03-22 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064243#comment-17064243
 ] 

Xiaoqiao He commented on HDFS-15075:


Thanks [~elgoiri] for your comments, v006 tries to update the patch following the 
above suggestions:
a. add a unit test to verify part of the new metrics (the ones that can be checked 
directly), though not both of them.
b. update the metrics documentation.
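
As a rough illustration of (a), a test along these lines could assert one of the 
new counters. The record name "DataNodeActivity" and the metric name 
"ProcessedCommandsOpNumOps" are assumptions made for this sketch, not necessarily 
what the patch registers.

{code:java}
import static org.apache.hadoop.test.MetricsAsserts.getLongCounter;
import static org.apache.hadoop.test.MetricsAsserts.getMetrics;
import static org.junit.Assert.assertTrue;

import org.apache.hadoop.metrics2.MetricsRecordBuilder;
import org.junit.Test;

public class TestCommandProcessingMetricSketch {
  @Test
  public void processedCommandCountIsRecorded() {
    // Assumes a MiniDFSCluster (not shown) has already delivered at least one
    // command to the DataNode, and that the DataNode metrics source is
    // registered under the record name used below.
    MetricsRecordBuilder rb = getMetrics("DataNodeActivity");
    assertTrue("expected the processed-commands counter to be present",
        getLongCounter("ProcessedCommandsOpNumOps", rb) >= 0);
  }
}
{code}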

> Remove process command timing from BPServiceActor
> -
>
> Key: HDFS-15075
> URL: https://issues.apache.org/jira/browse/HDFS-15075
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15075.001.patch, HDFS-15075.002.patch, 
> HDFS-15075.003.patch, HDFS-15075.004.patch, HDFS-15075.005.patch, 
> HDFS-15075.006.patch
>
>
> HDFS-14997 made the command processing asynchronous.
> Right now, we are measuring the time taken to add a command to the queue.
> We should remove that and maybe move the timing into the processing thread.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only

2020-03-22 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-15051:
---
Attachment: HDFS-15051.008.patch

> RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
> -
>
> Key: HDFS-15051
> URL: https://issues.apache.org/jira/browse/HDFS-15051
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15051.001.patch, HDFS-15051.002.patch, 
> HDFS-15051.003.patch, HDFS-15051.004.patch, HDFS-15051.005.patch, 
> HDFS-15051.006.patch, HDFS-15051.007.patch, HDFS-15051.008.patch
>
>
> The current permission checker of #MountTableStoreImpl is not very strict.
> In some cases, any user could add/update/remove a MountTableEntry without the 
> expected permission checking.
> The following code segment tries to check permissions when operating on a 
> MountTableEntry; however, the mountTable object comes from the Client/RouterAdmin 
> ({{MountTable mountTable = request.getEntry();}}), and a user could pass any mode, 
> which could bypass the permission checker.
> {code:java}
>   public void checkPermission(MountTable mountTable, FsAction access)
>   throws AccessControlException {
> if (isSuperUser()) {
>   return;
> }
> FsPermission mode = mountTable.getMode();
> if (getUser().equals(mountTable.getOwnerName())
> && mode.getUserAction().implies(access)) {
>   return;
> }
> if (isMemberOfGroup(mountTable.getGroupName())
> && mode.getGroupAction().implies(access)) {
>   return;
> }
> if (!getUser().equals(mountTable.getOwnerName())
> && !isMemberOfGroup(mountTable.getGroupName())
> && mode.getOtherAction().implies(access)) {
>   return;
> }
> throw new AccessControlException(
> "Permission denied while accessing mount table "
> + mountTable.getSourcePath()
> + ": user " + getUser() + " does not have " + access.toString()
> + " permissions.");
>   }
> {code}
> I propose to restrict the WRITE MountTableEntry privilege to the super user only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15075) Remove process command timing from BPServiceActor

2020-03-22 Thread Xiaoqiao He (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoqiao He updated HDFS-15075:
---
Attachment: HDFS-15075.006.patch

> Remove process command timing from BPServiceActor
> -
>
> Key: HDFS-15075
> URL: https://issues.apache.org/jira/browse/HDFS-15075
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Íñigo Goiri
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15075.001.patch, HDFS-15075.002.patch, 
> HDFS-15075.003.patch, HDFS-15075.004.patch, HDFS-15075.005.patch, 
> HDFS-15075.006.patch
>
>
> HDFS-14997 made the command processing asynchronous.
> Right now, we are measuring the time taken to add a command to the queue.
> We should remove that and maybe move the timing into the processing thread.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12862) CacheDirective becomes invalid when NN restart or failover

2020-03-22 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-12862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064223#comment-17064223
 ] 

Hadoop QA commented on HDFS-12862:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 27m 
44s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.1 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 29m 
 4s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
47s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
17s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  9s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
13s{color} | {color:green} branch-3.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} branch-3.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 23s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}114m 13s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}213m  3s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|   | hadoop.hdfs.server.diskbalancer.TestDiskBalancer |
|   | hadoop.hdfs.server.namenode.TestRedudantBlocks |
|   | hadoop.hdfs.server.namenode.TestFsck |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.8 Server=19.03.8 Image:yetus/hadoop:70a0ef5d4a6 |
| JIRA Issue | HDFS-12862 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12973890/HDFS-12862.branch-3.1.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux f10f263edd47 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 
08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | branch-3.1 / 61915fb |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_242 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29000/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/29000/testReport/ |
| Ma

[jira] [Commented] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only

2020-03-22 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064210#comment-17064210
 ] 

Xiaoqiao He commented on HDFS-15051:


Thanks [~ayushtkn] for picking up this JIRA.
{quote}If the immediate parent doesn't exist, the parent above is checked for 
WRITE permission only, IMO it should be EXECUTE only. If the parent is there then 
we can check WRITE, else we can consider it to exist virtually with the required 
permissions, and move up normally.{quote}
This makes sense to me, I would like to update it in the next two days.

> RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
> -
>
> Key: HDFS-15051
> URL: https://issues.apache.org/jira/browse/HDFS-15051
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15051.001.patch, HDFS-15051.002.patch, 
> HDFS-15051.003.patch, HDFS-15051.004.patch, HDFS-15051.005.patch, 
> HDFS-15051.006.patch, HDFS-15051.007.patch
>
>
> The current permission checker of #MountTableStoreImpl is not very strict.
> In some cases, any user could add/update/remove a MountTableEntry without the 
> expected permission checking.
> The following code segment tries to check permissions when operating on a 
> MountTableEntry; however, the mountTable object comes from the Client/RouterAdmin 
> ({{MountTable mountTable = request.getEntry();}}), and a user could pass any mode, 
> which could bypass the permission checker.
> {code:java}
>   public void checkPermission(MountTable mountTable, FsAction access)
>   throws AccessControlException {
> if (isSuperUser()) {
>   return;
> }
> FsPermission mode = mountTable.getMode();
> if (getUser().equals(mountTable.getOwnerName())
> && mode.getUserAction().implies(access)) {
>   return;
> }
> if (isMemberOfGroup(mountTable.getGroupName())
> && mode.getGroupAction().implies(access)) {
>   return;
> }
> if (!getUser().equals(mountTable.getOwnerName())
> && !isMemberOfGroup(mountTable.getGroupName())
> && mode.getOtherAction().implies(access)) {
>   return;
> }
> throw new AccessControlException(
> "Permission denied while accessing mount table "
> + mountTable.getSourcePath()
> + ": user " + getUser() + " does not have " + access.toString()
> + " permissions.");
>   }
> {code}
> I propose to restrict the WRITE MountTableEntry privilege to the super user only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15051) RBF: Propose to revoke WRITE MountTableEntry privilege to super user only

2020-03-22 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064211#comment-17064211
 ] 

Xiaoqiao He commented on HDFS-15051:


Thanks [~ayushtkn] for picking up this JIRA.
{quote}If the immediate parent doesn't exist, the parent above is checked for 
WRITE permission only, IMO it should be EXECUTE only. If the parent is there then 
we can check WRITE, else we can consider it to exist virtually with the required 
permissions, and move up normally.{quote}
This makes sense to me, I would like to update it in the next two days.

> RBF: Propose to revoke WRITE MountTableEntry privilege to super user only
> -
>
> Key: HDFS-15051
> URL: https://issues.apache.org/jira/browse/HDFS-15051
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: rbf
>Reporter: Xiaoqiao He
>Assignee: Xiaoqiao He
>Priority: Major
> Attachments: HDFS-15051.001.patch, HDFS-15051.002.patch, 
> HDFS-15051.003.patch, HDFS-15051.004.patch, HDFS-15051.005.patch, 
> HDFS-15051.006.patch, HDFS-15051.007.patch
>
>
> The current permission checker of #MountTableStoreImpl is not very strict.
> In some cases, any user could add/update/remove a MountTableEntry without the 
> expected permission checking.
> The following code segment tries to check permissions when operating on a 
> MountTableEntry; however, the mountTable object comes from the Client/RouterAdmin 
> ({{MountTable mountTable = request.getEntry();}}), and a user could pass any mode, 
> which could bypass the permission checker.
> {code:java}
>   public void checkPermission(MountTable mountTable, FsAction access)
>   throws AccessControlException {
> if (isSuperUser()) {
>   return;
> }
> FsPermission mode = mountTable.getMode();
> if (getUser().equals(mountTable.getOwnerName())
> && mode.getUserAction().implies(access)) {
>   return;
> }
> if (isMemberOfGroup(mountTable.getGroupName())
> && mode.getGroupAction().implies(access)) {
>   return;
> }
> if (!getUser().equals(mountTable.getOwnerName())
> && !isMemberOfGroup(mountTable.getGroupName())
> && mode.getOtherAction().implies(access)) {
>   return;
> }
> throw new AccessControlException(
> "Permission denied while accessing mount table "
> + mountTable.getSourcePath()
> + ": user " + getUser() + " does not have " + access.toString()
> + " permissions.");
>   }
> {code}
> I propose to restrict the WRITE MountTableEntry privilege to the super user only.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-22 Thread Aiphago (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17064192#comment-17064192
 ] 

Aiphago commented on HDFS-15180:


Fix the problem in UT

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: Aiphago
>Priority: Major
> Attachments: HDFS-15180.001.patch, HDFS-15180.002.patch, 
> HDFS-15180.003.patch, HDFS-15180.004.patch, 
> image-2020-03-10-17-22-57-391.png, image-2020-03-10-17-31-58-830.png, 
> image-2020-03-10-17-34-26-368.png
>
>
> Now the FsDatasetImpl datasetLock is heavy when there are many namespaces in a 
> big cluster. We could split the FsDatasetImpl datasetLock by block pool.
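
The description above proposes splitting the dataset-wide lock per block pool. A 
minimal sketch of that idea (illustrative only, not the attached patches):

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

public class BlockPoolLockManager {
  // One fair read-write lock per block pool id instead of a single dataset lock.
  private final ConcurrentHashMap<String, ReentrantReadWriteLock> locks =
      new ConcurrentHashMap<>();

  private ReentrantReadWriteLock lockFor(String bpid) {
    return locks.computeIfAbsent(bpid, k -> new ReentrantReadWriteLock(true));
  }

  /** Runs an action while holding only the write lock of the given block pool. */
  public <T> T withWriteLock(String bpid, Supplier<T> action) {
    ReentrantReadWriteLock.WriteLock lock = lockFor(bpid).writeLock();
    lock.lock();
    try {
      return action.get();
    } finally {
      lock.unlock();
    }
  }
}
{code}

With per-pool locks, operations on different namespaces no longer serialize on a 
single dataset-wide lock.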



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15180) DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.

2020-03-22 Thread Aiphago (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aiphago updated HDFS-15180:
---
Attachment: HDFS-15180.004.patch

>  DataNode FsDatasetImpl Fine-Grained Locking via BlockPool.
> ---
>
> Key: HDFS-15180
> URL: https://issues.apache.org/jira/browse/HDFS-15180
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: zhuqi
>Assignee: Aiphago
>Priority: Major
> Attachments: HDFS-15180.001.patch, HDFS-15180.002.patch, 
> HDFS-15180.003.patch, HDFS-15180.004.patch, 
> image-2020-03-10-17-22-57-391.png, image-2020-03-10-17-31-58-830.png, 
> image-2020-03-10-17-34-26-368.png
>
>
> Now the FsDatasetImpl datasetLock is heavy when there are many namespaces in a 
> big cluster. We could split the FsDatasetImpl datasetLock by block pool.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org