[jira] [Commented] (HDFS-15382) Split FsDatasetImpl from blockpool lock to blockpool volume lock

2022-02-20 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17495328#comment-17495328
 ] 

Xiaoqiao He commented on HDFS-15382:


Thanks [~yuanbo] for your feedback. [~Aiphag0] is working for pushing this 
feature forward now. The latest PR refer to 
https://github.com/apache/hadoop/pull/3941. Before that we have merged 
https://issues.apache.org/jira/browse/HDFS-16429. Welcome to any suggestions or 
work together here.
{quote}It seems not a  compatible feature{quote}
We do not create new feature branch for this improvement. IMO there is no any 
incompatible changes for end user now.

> Split FsDatasetImpl from blockpool lock to blockpool volume lock 
> -
>
> Key: HDFS-15382
> URL: https://issues.apache.org/jira/browse/HDFS-15382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Mingxiang Li
>Assignee: Mingxiang Li
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-15382-sample.patch, image-2020-06-02-1.png, 
> image-2020-06-03-1.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In HDFS-15180 we split lock to blockpool grain size.But when one volume is in 
> heavy load and will block other request which in same blockpool but different 
> volume.So we split lock to two leval to avoid this happend.And to improve 
> datanode performance.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15382) Split FsDatasetImpl from blockpool lock to blockpool volume lock

2022-02-20 Thread Yuanbo Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17495308#comment-17495308
 ] 

Yuanbo Liu commented on HDFS-15382:
---

It seems not a  compatible feature, how about merging all sub-task patches into 
feature branch, and then merge it into trunk?

> Split FsDatasetImpl from blockpool lock to blockpool volume lock 
> -
>
> Key: HDFS-15382
> URL: https://issues.apache.org/jira/browse/HDFS-15382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Mingxiang Li
>Assignee: Mingxiang Li
>Priority: Major
>  Labels: pull-request-available
> Attachments: HDFS-15382-sample.patch, image-2020-06-02-1.png, 
> image-2020-06-03-1.png
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> In HDFS-15180 we split lock to blockpool grain size.But when one volume is in 
> heavy load and will block other request which in same blockpool but different 
> volume.So we split lock to two leval to avoid this happend.And to improve 
> datanode performance.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15382) Split FsDatasetImpl from blockpool lock to blockpool volume lock

2022-01-16 Thread Mingxinag Li (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17476764#comment-17476764
 ] 

Mingxinag Li commented on HDFS-15382:
-

[https://github.com/apache/hadoop/pull/3889]  update pre dependencies pr

> Split FsDatasetImpl from blockpool lock to blockpool volume lock 
> -
>
> Key: HDFS-15382
> URL: https://issues.apache.org/jira/browse/HDFS-15382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Mingxinag Li
>Assignee: Mingxinag Li
>Priority: Major
> Attachments: HDFS-15382-sample.patch, image-2020-06-02-1.png, 
> image-2020-06-03-1.png
>
>
> In HDFS-15180 we split lock to blockpool grain size.But when one volume is in 
> heavy load and will block other request which in same blockpool but different 
> volume.So we split lock to two leval to avoid this happend.And to improve 
> datanode performance.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15382) Split FsDatasetImpl from blockpool lock to blockpool volume lock

2021-11-16 Thread Yuanbo Liu (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17444371#comment-17444371
 ] 

Yuanbo Liu commented on HDFS-15382:
---

The design of splitting data lock by volume seems promising as the size of 
volume is getting bigger and bigger nowadays. We're looking forword this patch 
could be applied into branch-3. Is anyone working on this JIRA?

> Split FsDatasetImpl from blockpool lock to blockpool volume lock 
> -
>
> Key: HDFS-15382
> URL: https://issues.apache.org/jira/browse/HDFS-15382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Aiphago
>Assignee: Aiphago
>Priority: Major
> Attachments: HDFS-15382-sample.patch, image-2020-06-02-1.png, 
> image-2020-06-03-1.png
>
>
> In HDFS-15180 we split lock to blockpool grain size.But when one volume is in 
> heavy load and will block other request which in same blockpool but different 
> volume.So we split lock to two leval to avoid this happend.And to improve 
> datanode performance.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15382) Split FsDatasetImpl from blockpool lock to blockpool volume lock

2020-09-29 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17204434#comment-17204434
 ] 

Xiaoqiao He commented on HDFS-15382:


Thanks [~sodonnell],[~Jiang Xin] for your comments.
HDFS-15150 and HDFS-15160 is very interesting improvement for DataNode, and I 
think the result is also impressive. But this solution does not solve coupling 
issue between different BlockPools and different Volumes when enable Federation 
feature. Especially one of BlockPool/Volume's load is very high, other 
BlockPools/Volumes read/write operation will be blocked since still some IO 
operation in Lock which could hold for long time, such as 
#updateReplicaUnderRecovery. In our inner branch, this issue is very critical. 
Please reference 
https://drive.google.com/file/d/1eaE8vSEhIli0H3j2eDiPJNYuKAC0MFgu/view?usp=sharing
 if interesting details.
IMO, the key of this improvement is decoupling BlockPools and Volumes and try 
to improve performance further. with HDFS-15150 and HDFS-15160, it will get a 
better result. 
About the demo patch, if agreement, we will split to some subtask to push this 
feature forwards. cc [~Aiphag0]
Thanks [~sodonnell] and [~LiJinglun] again. Welcome more discussion and 
suggestions.

> Split FsDatasetImpl from blockpool lock to blockpool volume lock 
> -
>
> Key: HDFS-15382
> URL: https://issues.apache.org/jira/browse/HDFS-15382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Aiphago
>Assignee: Aiphago
>Priority: Major
> Attachments: HDFS-15382-sample.patch, image-2020-06-02-1.png, 
> image-2020-06-03-1.png
>
>
> In HDFS-15180 we split lock to blockpool grain size.But when one volume is in 
> heavy load and will block other request which in same blockpool but different 
> volume.So we split lock to two leval to avoid this happend.And to improve 
> datanode performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15382) Split FsDatasetImpl from blockpool lock to blockpool volume lock

2020-09-29 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203990#comment-17203990
 ] 

Jinglun commented on HDFS-15382:


Thanks very much for [~sodonnell] your suggestions ! I'd like to give a try of 
the read/write lock !

> Split FsDatasetImpl from blockpool lock to blockpool volume lock 
> -
>
> Key: HDFS-15382
> URL: https://issues.apache.org/jira/browse/HDFS-15382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Aiphago
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.2.1
>
> Attachments: HDFS-15382-sample.patch, image-2020-06-02-1.png, 
> image-2020-06-03-1.png
>
>
> In HDFS-15180 we split lock to blockpool grain size.But when one volume is in 
> heavy load and will block other request which in same blockpool but different 
> volume.So we split lock to two leval to avoid this happend.And to improve 
> datanode performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15382) Split FsDatasetImpl from blockpool lock to blockpool volume lock

2020-09-29 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203837#comment-17203837
 ] 

Stephen O'Donnell commented on HDFS-15382:
--

[~LiJinglun] There is a similar change already committed on trunk in HDFS-15150 
and HDFS-15160. These changes do not go as far as the changes suggested here, 
but they are simpler and hence easier to backport / review. Some people who 
tried these patches out reported good results in reducing DN pauses.

[~hexiaoqiao] The approach here does seem like a good one and worth exploring. 
The concern I have, is around the complexity of it and the amount of change it 
introduces. It would be great to also benchmark the simple read/write lock 
along with the change here to see how they compare.

I also think it would be worth exploring where the lock is held during IO 
operations (ie potentially held a long time) and try to avoid holding the lock 
during a disk IO. If we could do this on the common code paths (create/write 
block, read block) then it would make most problems go away I think.

There is also the recent changes to remove the locking in DirectoryScanner, 
which we are seeing cause a lot of problems on the 3.x branches - HDFS-15415

> Split FsDatasetImpl from blockpool lock to blockpool volume lock 
> -
>
> Key: HDFS-15382
> URL: https://issues.apache.org/jira/browse/HDFS-15382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Aiphago
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.2.1
>
> Attachments: HDFS-15382-sample.patch, image-2020-06-02-1.png, 
> image-2020-06-03-1.png
>
>
> In HDFS-15180 we split lock to blockpool grain size.But when one volume is in 
> heavy load and will block other request which in same blockpool but different 
> volume.So we split lock to two leval to avoid this happend.And to improve 
> datanode performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15382) Split FsDatasetImpl from blockpool lock to blockpool volume lock

2020-09-29 Thread Jinglun (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17203790#comment-17203790
 ] 

Jinglun commented on HDFS-15382:


Great work [~Aiphag0]  [~hexiaoqiao] !!! Haven't go into details(it is a very 
big patch) but the design makes sense to me. Our data streaming service is 
facing occasionally slow read/write. I'd like to test this patch on my cluster. 
A patch based on 3.x will be very helpful. 

> Split FsDatasetImpl from blockpool lock to blockpool volume lock 
> -
>
> Key: HDFS-15382
> URL: https://issues.apache.org/jira/browse/HDFS-15382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Aiphago
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.2.1
>
> Attachments: HDFS-15382-sample.patch, image-2020-06-02-1.png, 
> image-2020-06-03-1.png
>
>
> In HDFS-15180 we split lock to blockpool grain size.But when one volume is in 
> heavy load and will block other request which in same blockpool but different 
> volume.So we split lock to two leval to avoid this happend.And to improve 
> datanode performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15382) Split FsDatasetImpl from blockpool lock to blockpool volume lock

2020-09-22 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17200157#comment-17200157
 ] 

Xiaoqiao He commented on HDFS-15382:


[~weichiu],[~sodonnell],[~linyiqun] do you have time to review this solution? 
It works well in our internal cluster. I believe this is useful feature if we 
can push it forward.
Any suggestions and comments are welcome. We will prepare new patch based on 
trunk if come to an agreement on this solution.

> Split FsDatasetImpl from blockpool lock to blockpool volume lock 
> -
>
> Key: HDFS-15382
> URL: https://issues.apache.org/jira/browse/HDFS-15382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Aiphago
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.2.1
>
> Attachments: HDFS-15382-sample.patch, image-2020-06-02-1.png, 
> image-2020-06-03-1.png
>
>
> In HDFS-15180 we split lock to blockpool grain size.But when one volume is in 
> heavy load and will block other request which in same blockpool but different 
> volume.So we split lock to two leval to avoid this happend.And to improve 
> datanode performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15382) Split FsDatasetImpl from blockpool lock to blockpool volume lock

2020-09-16 Thread Aiphago (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17196824#comment-17196824
 ] 

Aiphago commented on HDFS-15382:


This is a sample base our 2.7 version, mian logic is similar, but apply to 
trunck need some wrok todo.

> Split FsDatasetImpl from blockpool lock to blockpool volume lock 
> -
>
> Key: HDFS-15382
> URL: https://issues.apache.org/jira/browse/HDFS-15382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Aiphago
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.2.1
>
> Attachments: HDFS-15382-sample.patch, image-2020-06-02-1.png, 
> image-2020-06-03-1.png
>
>
> In HDFS-15180 we split lock to blockpool grain size.But when one volume is in 
> heavy load and will block other request which in same blockpool but different 
> volume.So we split lock to two leval to avoid this happend.And to improve 
> datanode performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15382) Split FsDatasetImpl from blockpool lock to blockpool volume lock

2020-09-14 Thread Jiang Xin (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17195317#comment-17195317
 ] 

Jiang Xin commented on HDFS-15382:
--

Thanks [~Aiphag0] for your proposal, seems it helps a lot on IO heavy DNs and 
we planned to do this recently. Would you submit a sample patch or push it 
forward?

> Split FsDatasetImpl from blockpool lock to blockpool volume lock 
> -
>
> Key: HDFS-15382
> URL: https://issues.apache.org/jira/browse/HDFS-15382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Aiphago
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.2.1
>
> Attachments: image-2020-06-02-1.png, image-2020-06-03-1.png
>
>
> In HDFS-15180 we split lock to blockpool grain size.But when one volume is in 
> heavy load and will block other request which in same blockpool but different 
> volume.So we split lock to two leval to avoid this happend.And to improve 
> datanode performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15382) Split FsDatasetImpl from blockpool lock to blockpool volume lock

2020-06-03 Thread Aiphago (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124767#comment-17124767
 ] 

Aiphago commented on HDFS-15382:


!image-2020-06-03-1.png|width=628,height=378!

The simple lock model like this.Parts of implemented as follows
 # As for finalizeReplica(),append(),createRbw()First get BlockPoolLock 
read lock,and then get BlockPoolLock-volume-lock write lock.

 # As for getStoredBlock(),getMetaDataInputStream()First get BlockPoolLock 
read lock,and the then get BlockPoolLock-volume-lock read lock.

 # As for deepCopyReplica(),getBlockReports() get the BlockPoolLock read lock.

 # As for delete hold the BlockPoolLock write lock.
 # The change of replicaMap's Gset change to sync to make thread safe.And 
replicaMap itself is the same as HDFS-15180  only control blockpool lock

> Split FsDatasetImpl from blockpool lock to blockpool volume lock 
> -
>
> Key: HDFS-15382
> URL: https://issues.apache.org/jira/browse/HDFS-15382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Aiphago
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.2.1
>
> Attachments: image-2020-06-02-1.png, image-2020-06-03-1.png
>
>
> In HDFS-15180 we split lock to blockpool grain size.But when one volume is in 
> heavy load and will block other request which in same blockpool but different 
> volume.So we split lock to two leval to avoid this happend.And to improve 
> datanode performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15382) Split FsDatasetImpl from blockpool lock to blockpool volume lock

2020-06-03 Thread Aiphago (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17124736#comment-17124736
 ] 

Aiphago commented on HDFS-15382:


{quote}it is confused with {{ReplicaCachingGetSpaceUsed}}, IIUC, 
{{ReplicaCachingGetSpaceUsed}} is calculated in memory directly rather than 
sync info from disk, right? so why is it related to this changes? Also the log 
print is based on our internal version rather than branch trunk, some notes 
could be better.
{quote}
{{ReplicaCachingGetSpaceUsed copy replica from FsDataSetImpl most of time is 
spend in wait }}{{FsDataSetImpl lock,so 'Copy replica infos' time spend can 
reflect the time wait for the lock.}}{{}}

> Split FsDatasetImpl from blockpool lock to blockpool volume lock 
> -
>
> Key: HDFS-15382
> URL: https://issues.apache.org/jira/browse/HDFS-15382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Aiphago
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.2.1
>
> Attachments: image-2020-06-02-1.png
>
>
> In HDFS-15180 we split lock to blockpool grain size.But when one volume is in 
> heavy load and will block other request which in same blockpool but different 
> volume.So we split lock to two leval to avoid this happend.And to improve 
> datanode performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15382) Split FsDatasetImpl from blockpool lock to blockpool volume lock

2020-06-02 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17123392#comment-17123392
 ] 

Xiaoqiao He commented on HDFS-15382:


Thanks [~Aiphag0] for your proposal here. The performance improvement is 
impressive especially the load of DataNode is very high. A few nits,
a. the monitor index is our internal item, it is more helpful to explain what 
it is meaning.
b. it is confused with {{ReplicaCachingGetSpaceUsed}}, IIUC, 
{{ReplicaCachingGetSpaceUsed}} is calculated in memory directly rather than 
sync info from disk, right? so why is it related to this changes? Also the log 
print is based on our internal version rather than branch trunk, some notes 
could be better. 
c. If could we offer a simple design document (maybe include one design chart 
with a simple design/refactor description), IMO it is useful to someone who is 
interested this improvement.

> Split FsDatasetImpl from blockpool lock to blockpool volume lock 
> -
>
> Key: HDFS-15382
> URL: https://issues.apache.org/jira/browse/HDFS-15382
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Aiphago
>Assignee: Aiphago
>Priority: Major
> Fix For: 3.2.1
>
> Attachments: image-2020-06-02-1.png
>
>
> In HDFS-15180 we split lock to blockpool grain size.But when one volume is in 
> heavy load and will block other request which in same blockpool but different 
> volume.So we split lock to two leval to avoid this happend.And to improve 
> datanode performance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org