[ https://issues.apache.org/jira/browse/HDFS-13769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558665#comment-16558665 ]
Kihwal Lee commented on HDFS-13769:
-----------------------------------

bq. This seems to apply not only for trash dir, but also any directory with large amount of data,

You mean the performance hit? Sure. But the same kind of logic cannot be used as a generic solution. It is equivalent to users dividing a large dir structure and deleting the pieces individually. If this logic were applied by default in FSShell, it would break the delete semantics. We might add an option for FSShell to delete in this mode, with a clear warning that the delete is no longer atomic. In any case, we can't do this on the RPC server side (i.e. the namenode).

> Namenode gets stuck when deleting large dir in trash
> ----------------------------------------------------
>
>                 Key: HDFS-13769
>                 URL: https://issues.apache.org/jira/browse/HDFS-13769
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.8.2, 3.1.0
>            Reporter: Tao Jie
>            Assignee: Tao Jie
>            Priority: Major
>         Attachments: HDFS-13769.001.patch
>
>
> Similar to the situation discussed in HDFS-13671, the Namenode gets stuck for a
> long time when deleting a trash dir with a large amount of data. We found this log in
> the namenode:
> {quote}
> 2018-06-08 20:00:59,042 INFO namenode.FSNamesystem
> (FSNamesystemLock.java:writeUnlock(252)) - FSNamesystem write lock held for
> 23018 ms via
> java.lang.Thread.getStackTrace(Thread.java:1552)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1033)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystemLock.writeUnlock(FSNamesystemLock.java:254)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1567)
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:2820)
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1047)
> {quote}
> One simple solution is to avoid deleting large data in one delete RPC call.
> We implemented a TrashPolicy that divides the delete operation into several
> delete RPCs, so that no single deletion removes too many files.
> Any thought? [~linyiqun]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
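The attached patch (HDFS-13769.001.patch) is not shown in this thread, so the following is only a rough sketch of the idea being discussed: instead of one recursive delete over the whole tree (which the namenode executes under a single long write-lock hold), issue one smaller delete per immediate child and remove the now-empty root last. The class and method names are hypothetical, and `java.nio.file` on the local filesystem stands in for HDFS delete RPCs purely for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class ChunkedDelete {

    // One subtree removed per "call" -- stands in for a single delete RPC,
    // whose cost (and write-lock hold time on the namenode) scales with
    // the size of the subtree it removes.
    static void deleteSubtree(Path p) throws IOException {
        try (Stream<Path> walk = Files.walk(p)) {
            walk.sorted(Comparator.reverseOrder())  // children before parents
                .forEach(q -> {
                    try {
                        Files.delete(q);
                    } catch (IOException e) {
                        throw new UncheckedIOException(e);
                    }
                });
        }
    }

    // Instead of one recursive delete of root, issue one smaller delete per
    // immediate child, then remove the (now empty) root. Returns the number
    // of delete "calls" issued. Note the sequence is NOT atomic: a failure
    // midway leaves some children deleted and others not -- exactly the
    // semantic change discussed in the comment above.
    static int deleteInChunks(Path root) throws IOException {
        int calls = 0;
        try (DirectoryStream<Path> children = Files.newDirectoryStream(root)) {
            for (Path child : children) {
                deleteSubtree(child);
                calls++;
            }
        }
        Files.delete(root);
        return calls + 1;
    }
}
```

This keeps each individual operation small, which is safe for trash expunging (partially deleted trash is harmless) but, as noted in the comment, would break delete semantics if made the default in FSShell or applied inside the namenode's RPC handler.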