[ https://issues.apache.org/jira/browse/HDFS-14908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16954497#comment-16954497 ]
Jinglun commented on HDFS-14908:
--------------------------------

I ran a benchmark and here is the result. The test file is Test.java.

*Design*

Compare path.startsWith(parent) and DFSUtil.isParent(path, parent), both when path starts with parent and when it does not.
{quote}
java -Xmx512m Test 10000000000 /dir0/dir1/dir2/dir3 /dir0/dir1
java -Xmx512m Test 10000000000 /dir0/dir1/dir2/dir3 /dir0/dir2
java -Xmx512m Test 10000000000 dir0/dir1/dir2/dir3 dir0/dir1
java -Xmx512m Test 10000000000 /dir0/dir1/dir2/dir3/ /dir0/dir1/
{quote}

*Result*

Case 1: path=/dir0/dir1/dir2/dir3 parent=/dir0/dir1
||Iterations||1,000,000,000||10,000,000,000||
|startsWith()|4,095ms|40,657ms|
|isParent()|4,665ms|46,543ms|

Case 2: path=/dir0/dir1/dir2/dir3 parent=/dir0/dir2
||Iterations||1,000,000,000||10,000,000,000||
|startsWith()|4,029ms|39,559ms|
|isParent()|3,984ms|39,095ms|

Case 3: path=dir0/dir1/dir2/dir3 parent=dir0/dir1
||Iterations||1,000,000,000||10,000,000,000||
|startsWith()|3,519ms|34,562ms|
|isParent()|235ms|2,328ms|

Case 4: path=/dir0/dir1/dir2/dir3/ parent=/dir0/dir1/
||Iterations||1,000,000,000||10,000,000,000||
|startsWith()|4,546ms|45,355ms|
|isParent()|19,409ms|189,750ms|

*Conclusion*

Case 1 shows that the overhead of isParent() is not serious: about 14% slower than startsWith() in both runs. I think that is acceptable.

Case 2 is a little confusing, because isParent() is slightly faster than startsWith(). I repeated the test many times and the results were the same. isParent() does several additional checks and then calls startsWith(), so it should be slower than calling startsWith() directly. Maybe the compiler/JVM has optimized the code?

Case 3 is as expected, because in isParent() the character-by-character comparison is skipped.

Case 4 shows that isParent() costs much more time (roughly 4x) when the path ends with a '/'. That is because the strings are copied. We can optimize this by introducing a new character-comparison method that takes a startIndex and an endIndex, and using it to replace the startsWith() call at L1769.
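The optimization suggested for Case 4 could look roughly like the sketch below. This is not the code from the attached patches: the class name ParentCheckSketch and the methods isParentNoCopy() and trimmedLength() are hypothetical, and two behaviors are assumptions made for illustration (relative paths are rejected, and a path counts as its own parent). The idea is to compute the effective end index of each string and compare a bounded range with String.regionMatches(), so no trimmed copy of either string is created.

```java
final class ParentCheckSketch {
    private ParentCheckSketch() {}

    /**
     * Hypothetical copy-free parent check. Returns true if parent is an
     * ancestor of path, or the same path (assumption), ignoring trailing '/'.
     */
    static boolean isParentNoCopy(String path, String parent) {
        // Assumption: only absolute paths are considered.
        if (path == null || parent == null
                || !path.startsWith("/") || !parent.startsWith("/")) {
            return false;
        }
        // End indexes with trailing separators ignored; nothing is copied.
        int pathEnd = trimmedLength(path);
        int parentEnd = trimmedLength(parent);
        if (parentEnd > pathEnd) {
            return false;
        }
        // Compare only the first parentEnd characters of both strings.
        if (!path.regionMatches(0, parent, 0, parentEnd)) {
            return false;
        }
        // The root directory is an ancestor of every absolute path.
        if (parentEnd == 1) {
            return true;
        }
        // The match must end on a path-component boundary,
        // so "/dir0/dir1" is not a parent of "/dir0/dir11".
        return pathEnd == parentEnd || path.charAt(parentEnd) == '/';
    }

    /** Length of s without trailing '/' characters, keeping a lone "/". */
    static int trimmedLength(String s) {
        int end = s.length();
        while (end > 1 && s.charAt(end - 1) == '/') {
            end--;
        }
        return end;
    }

    public static void main(String[] args) {
        // The four path/parent pairs from the benchmark above.
        System.out.println(isParentNoCopy("/dir0/dir1/dir2/dir3", "/dir0/dir1"));
        System.out.println(isParentNoCopy("/dir0/dir1/dir2/dir3", "/dir0/dir2"));
        System.out.println(isParentNoCopy("dir0/dir1/dir2/dir3", "dir0/dir1"));
        System.out.println(isParentNoCopy("/dir0/dir1/dir2/dir3/", "/dir0/dir1/"));
    }
}
```

Whether this actually closes the gap seen in Case 4 would need to be measured with the same Test.java harness; the sketch only shows that the trailing-slash handling does not require a string copy.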
> LeaseManager should check parent-child relationship when filter open files.
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-14908
>                 URL: https://issues.apache.org/jira/browse/HDFS-14908
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.0.1
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Minor
>         Attachments: HDFS-14908.001.patch, HDFS-14908.002.patch
>
>
> Now when doing listOpenFiles(), LeaseManager only checks whether the filter
> path is the prefix of the open files. We should check whether the filter path
> is the parent/ancestor of the open files.