[jira] [Commented] (HDFS-15000) Improve FsDatasetImpl to avoid IO operation in datasetLock

Stephen O'Donnell (Jira) Thu, 19 Dec 2019 04:07:48 -0800


    [ 
https://issues.apache.org/jira/browse/HDFS-15000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17000006#comment-17000006
 ]


Stephen O'Donnell commented on HDFS-15000:
------------------------------------------

I wonder if this could be done by simply releasing the lock, doing the IO, and
then re-taking the lock, which would avoid the need for futures etc.

I have not looked at this in great detail, so my suggestion may have some flaws.
Looking at FsDatasetImpl.createRbw(), it does approximately the following:

{code}
lock {
  1. check_block_id does not already exist
  2. check enough space available etc
  3. select the volume

  4. Perform the IO via newReplicaInfo = v.createRbw(b);

  5. Add the new block to the volume map
}
{code}

A problem with dropping the lock in the middle while doing the IO is that 
another thread could come in with the same block ID, and it would pass check 
(1) above, and then we would have a race condition.

I wonder if it would be possible to refactor things so we do steps 1, 2, 3 and 
5, drop the lock and then do the IO operation to actually create the file. In 
the event the IO fails, re-take the lock and clean up the volume map.

This would require some refactoring of a few methods, as the volume map needs 
to store a reference to the replicaInfo, which currently is created in 
v.createRbw(b) along with the file on disk, but I don't think it needs to be - 
we could break those two apart.
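
To make the idea concrete, here is a minimal sketch of the reordered flow. It
is not the real FsDatasetImpl code: RbwSketch, DiskIo, Replica and contains()
are hypothetical stand-ins, a plain HashMap stands in for the volume map, and
steps 2 and 3 are omitted. It only shows the entry being reserved under the
lock, the IO running outside the lock, and the cleanup on failure.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

class RbwSketch {
    /** Hypothetical stand-in for the file creation currently done in v.createRbw(b). */
    interface DiskIo {
        void createFile(long blockId) throws IOException;
    }

    /** Hypothetical in-memory replica record, built without touching the disk. */
    static final class Replica {
        final long blockId;
        Replica(long blockId) { this.blockId = blockId; }
    }

    private final ReentrantLock datasetLock = new ReentrantLock();
    private final Map<Long, Replica> volumeMap = new HashMap<>();
    private final DiskIo io;

    RbwSketch(DiskIo io) { this.io = io; }

    Replica createRbw(long blockId) throws IOException {
        Replica replica;
        datasetLock.lock();
        try {
            // Step 1: reject a block ID that is already known.
            if (volumeMap.containsKey(blockId)) {
                throw new IOException("Block " + blockId + " already exists");
            }
            // Steps 2 and 3 (space check, volume selection) are omitted here.
            // Step 5: reserve the entry before any IO, so a concurrent caller
            // with the same block ID fails check 1 instead of racing.
            replica = new Replica(blockId);
            volumeMap.put(blockId, replica);
        } finally {
            datasetLock.unlock();
        }

        try {
            // Step 4: the slow part now runs without holding the lock.
            io.createFile(blockId);
        } catch (IOException e) {
            // IO failed: re-take the lock and remove only our own reservation.
            datasetLock.lock();
            try {
                volumeMap.remove(blockId, replica);
            } finally {
                datasetLock.unlock();
            }
            throw e;
        }
        return replica;
    }

    boolean contains(long blockId) {
        datasetLock.lock();
        try {
            return volumeMap.containsKey(blockId);
        } finally {
            datasetLock.unlock();
        }
    }
}
```

One consequence of this ordering is that other threads can see the entry in
the map before the file exists on disk, so readers of the volume map would
probably need some way to tell that a replica is still being created.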

> Improve FsDatasetImpl to avoid IO operation in datasetLock
> ----------------------------------------------------------
>
>                 Key: HDFS-15000
>                 URL: https://issues.apache.org/jira/browse/HDFS-15000
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Xiaoqiao He
>            Assignee: Aiphago
>            Priority: Major
>         Attachments: HDFS-15000.001.patch
>
>
> As HDFS-14997 mentioned, some methods in #FsDatasetImpl such as
> #finalizeBlock, #finalizeReplica and #createRbw perform IO operations while
> holding the datasetLock, which blocks other logic when IO load is very high.
> We should make the lock finer-grained or move IO operations out of datasetLock.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
