[jira] [Commented] (HDFS-5498) Improve datanode startup time
[ https://issues.apache.org/jira/browse/HDFS-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906993#comment-13906993 ]

Kihwal Lee commented on HDFS-5498:
----------------------------------

[~azuryy], the patch depends on HDFS-5583 and HDFS-5924. [~brandonli] is reviewing them.

> Improve datanode startup time
> -----------------------------
>
> Key: HDFS-5498
> URL: https://issues.apache.org/jira/browse/HDFS-5498
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-5498.with_du_change.patch, HDFS-5498.with_du_change.patch, HDFS-5498_sh.patch
>
>
> Similarly to HDFS-5027, an improvement can be made for getVolumeMap(). This is
> the phase in which ReplicaMap is populated. But it will be even better if
> datanode scans only once and does both.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5498) Improve datanode startup time
[ https://issues.apache.org/jira/browse/HDFS-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906663#comment-13906663 ]

Fengdong Yu commented on HDFS-5498:
-----------------------------------

Who can commit this?

> Improve datanode startup time
> -----------------------------
>
> Key: HDFS-5498
> URL: https://issues.apache.org/jira/browse/HDFS-5498
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-5498.with_du_change.patch, HDFS-5498.with_du_change.patch, HDFS-5498_sh.patch
>
>
> Similarly to HDFS-5027, an improvement can be made for getVolumeMap(). This is
> the phase in which ReplicaMap is populated. But it will be even better if
> datanode scans only once and does both.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5498) Improve datanode startup time
[ https://issues.apache.org/jira/browse/HDFS-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901817#comment-13901817 ]

Brandon Li commented on HDFS-5498:
----------------------------------

+1. The patch looks good to me.

> Improve datanode startup time
> -----------------------------
>
> Key: HDFS-5498
> URL: https://issues.apache.org/jira/browse/HDFS-5498
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-5498.with_du_change.patch, HDFS-5498.with_du_change.patch, HDFS-5498_sh.patch
>
>
> Similarly to HDFS-5027, an improvement can be made for getVolumeMap(). This is
> the phase in which ReplicaMap is populated. But it will be even better if
> datanode scans only once and does both.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5498) Improve datanode startup time
[ https://issues.apache.org/jira/browse/HDFS-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901620#comment-13901620 ]

Vinayakumar B commented on HDFS-5498:
-------------------------------------

The patch looks good, Kihwal. +1

Yes, adding a DN-level hook is beyond the scope of this Jira.

> Improve datanode startup time
> -----------------------------
>
> Key: HDFS-5498
> URL: https://issues.apache.org/jira/browse/HDFS-5498
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-5498.with_du_change.patch, HDFS-5498.with_du_change.patch, HDFS-5498_sh.patch
>
>
> Similarly to HDFS-5027, an improvement can be made for getVolumeMap(). This is
> the phase in which ReplicaMap is populated. But it will be even better if
> datanode scans only once and does both.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5498) Improve datanode startup time
[ https://issues.apache.org/jira/browse/HDFS-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901115#comment-13901115 ]

Vinayakumar B commented on HDFS-5498:
-------------------------------------

Is this with HDFS-5585?

Yes, with HDFS-5585 and the dfsadmin shutdown command this will work. But I don't think it will work for a normal shutdown using hadoop-daemon.sh. Am I missing something?

> Improve datanode startup time
> -----------------------------
>
> Key: HDFS-5498
> URL: https://issues.apache.org/jira/browse/HDFS-5498
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-5498.with_du_change.patch, HDFS-5498.with_du_change.patch
>
>
> Similarly to HDFS-5027, an improvement can be made for getVolumeMap(). This is
> the phase in which ReplicaMap is populated. But it will be even better if
> datanode scans only once and does both.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5498) Improve datanode startup time
[ https://issues.apache.org/jira/browse/HDFS-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900516#comment-13900516 ]

Kihwal Lee commented on HDFS-5498:
----------------------------------

bq. saveDfsUsed(); will store the cache file, it's called only in BlockPoolSlice#shutdown(), but in a real environment this will never be called. So no cache will be written.

Actually, it is being called during shutdown.

{panel}
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.shutdown(BlockPoolSlice.java:391)
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdownBlockPool(FsVolumeImpl.java:265)
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.removeBlockPool(FsVolumeList.java:242)
at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.shutdownBlockPool(FsDatasetImpl.java:1779)
at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:909)
at org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:362)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:611)
at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:811)
at java.lang.Thread.run(Thread.java:695)
{panel}

> Improve datanode startup time
> -----------------------------
>
> Key: HDFS-5498
> URL: https://issues.apache.org/jira/browse/HDFS-5498
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-5498.with_du_change.patch, HDFS-5498.with_du_change.patch
>
>
> Similarly to HDFS-5027, an improvement can be made for getVolumeMap(). This is
> the phase in which ReplicaMap is populated. But it will be even better if
> datanode scans only once and does both.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5498) Improve datanode startup time
[ https://issues.apache.org/jira/browse/HDFS-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13892407#comment-13892407 ]

Brandon Li commented on HDFS-5498:
----------------------------------

[~kihwal], the patch needs to be rebased. Could you please upload a new one?

> Improve datanode startup time
> -----------------------------
>
> Key: HDFS-5498
> URL: https://issues.apache.org/jira/browse/HDFS-5498
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-5498.with_du_change.patch, HDFS-5498.with_du_change.patch
>
>
> Similarly to HDFS-5027, an improvement can be made for getVolumeMap(). This is
> the phase in which ReplicaMap is populated. But it will be even better if
> datanode scans only once and does both.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (HDFS-5498) Improve datanode startup time
[ https://issues.apache.org/jira/browse/HDFS-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828837#comment-13828837 ]

Vinay commented on HDFS-5498:
-----------------------------

Good improvement, Kihwal.

{{saveDfsUsed();}} will store the cache file. It is called only in {{BlockPoolSlice#shutdown()}}, but in a real environment this will never be called, so no cache will be written.

How about saving this cache file every 10 min (refreshInterval), or adding a shutdownHook to call {{BlockPoolSlice#shutdown()}}?

Typo here: *cahed*
{code}+ * If the cahed value is not available or too old, -1 is returned.{code}

> Improve datanode startup time
> -----------------------------
>
> Key: HDFS-5498
> URL: https://issues.apache.org/jira/browse/HDFS-5498
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-5498.with_du_change.patch, HDFS-5498.with_du_change.patch
>
>
> Similarly to HDFS-5027, an improvement can be made for getVolumeMap(). This is
> the phase in which ReplicaMap is populated. But it will be even better if
> datanode scans only once and does both.

--
This message was sent by Atlassian JIRA (v6.1#6144)
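The two alternatives raised in the comment above (a periodic save, or a shutdown hook) could look roughly like the following. This is a minimal sketch in plain Java, not the patch itself: the class `CacheSaver`, its method names, and the thread names are hypothetical, and the `Runnable` stands in for whatever would call the save logic in `BlockPoolSlice`. A JVM shutdown hook runs on normal exit and on signals such as SIGTERM, but not when the process is killed with SIGKILL.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper illustrating the two options discussed in the comment.
public class CacheSaver {

    /** Registers a JVM shutdown hook that runs the save action; returns the hook thread. */
    static Thread saveOnShutdown(Runnable saveAction) {
        Thread hook = new Thread(saveAction, "dfsUsed-cache-saver");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    /** Runs the save action at a fixed interval on a daemon thread. */
    static ScheduledExecutorService savePeriodically(
            Runnable saveAction, long interval, TimeUnit unit) {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "dfsUsed-cache-refresh");
                t.setDaemon(true); // do not keep the JVM alive just for saving
                return t;
            });
        // First save happens after one full interval, then repeats.
        scheduler.scheduleAtFixedRate(saveAction, interval, interval, unit);
        return scheduler;
    }
}
```

With the 10-minute interval suggested above, a caller might use `CacheSaver.savePeriodically(saveAction, 10, TimeUnit.MINUTES)`, where `saveAction` is an assumed wrapper around the cache write.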
[jira] [Commented] (HDFS-5498) Improve datanode startup time
[ https://issues.apache.org/jira/browse/HDFS-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828425#comment-13828425 ]

Hadoop QA commented on HDFS-5498:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12615019/HDFS-5498.with_du_change.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5515//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5515//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5515//console

This message is automatically generated.

> Improve datanode startup time
> -----------------------------
>
> Key: HDFS-5498
> URL: https://issues.apache.org/jira/browse/HDFS-5498
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-5498.with_du_change.patch, HDFS-5498.with_du_change.patch
>
>
> Similarly to HDFS-5027, an improvement can be made for getVolumeMap(). This is
> the phase in which ReplicaMap is populated. But it will be even better if
> datanode scans only once and does both.

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5498) Improve datanode startup time
[ https://issues.apache.org/jira/browse/HDFS-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13828245#comment-13828245 ]

Kihwal Lee commented on HDFS-5498:
----------------------------------

The TestBalancerWithNodeGroup failure is not caused by this patch.

TestDFSUpgrade failed because the storage checksum comparison in the test does not take the cache file into account. It is already skipping VERSION files, so it is a simple matter of adding the cache file to the blacklist.

> Improve datanode startup time
> -----------------------------
>
> Key: HDFS-5498
> URL: https://issues.apache.org/jira/browse/HDFS-5498
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-5498.with_du_change.patch, HDFS-5498.with_du_change.patch
>
>
> Similarly to HDFS-5027, an improvement can be made for getVolumeMap(). This is
> the phase in which ReplicaMap is populated. But it will be even better if
> datanode scans only once and does both.

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5498) Improve datanode startup time
[ https://issues.apache.org/jira/browse/HDFS-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13827236#comment-13827236 ]

Hadoop QA commented on HDFS-5498:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12614733/HDFS-5498.with_du_change.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.

{color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.

{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.

{color:red}-1 findbugs{color}. The patch appears to introduce 2 new Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.

{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup
org.apache.hadoop.hdfs.TestDFSUpgrade

{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5493//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/5493//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5493//console

This message is automatically generated.

> Improve datanode startup time
> -----------------------------
>
> Key: HDFS-5498
> URL: https://issues.apache.org/jira/browse/HDFS-5498
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Attachments: HDFS-5498.with_du_change.patch
>
>
> Similarly to HDFS-5027, an improvement can be made for getVolumeMap(). This is
> the phase in which ReplicaMap is populated. But it will be even better if
> datanode scans only once and does both.

--
This message was sent by Atlassian JIRA (v6.1#6144)
[jira] [Commented] (HDFS-5498) Improve datanode startup time
[ https://issues.apache.org/jira/browse/HDFS-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13825886#comment-13825886 ]

Kihwal Lee commented on HDFS-5498:
----------------------------------

For a quick restart (e.g. rolling upgrade), the initialization of {{BlockPoolSlice}} can be made faster by caching {{dfsUsed}}. Currently DU unconditionally runs the du command in its constructor. Filed HADOOP-10111 to add an option to specify an initial "used" value and delay the initial scan. When I incorporated this change in {{BlockPoolSlice}}, the initialization time for each volume became several milliseconds on restart. It looks for the cache file and uses the value if it is less than 5 minutes old (the default DU interval).

Another part of the start-up time improvement will come from parallelizing the scanning and populating of {{ReplicaMap}}.

> Improve datanode startup time
> -----------------------------
>
> Key: HDFS-5498
> URL: https://issues.apache.org/jira/browse/HDFS-5498
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kihwal Lee
>
> Similarly to HDFS-5027, an improvement can be made for getVolumeMap(). This is
> the phase in which ReplicaMap is populated. But it will be even better if
> datanode scans only once and does both.

--
This message was sent by Atlassian JIRA (v6.1#6144)
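The caching idea described in the comment above can be illustrated with a small standalone sketch. It is not the code from the attached patch: the class `DfsUsedCache`, the on-disk format (`<used> <timestampMs>`), and the method names are assumptions made for illustration. Only the behavior is taken from the thread: the cached value is used if it is less than 5 minutes old, and -1 is returned when it is unavailable or too old.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for the dfsUsed cache handling discussed in the thread.
public class DfsUsedCache {

    // Maximum age of a usable cached value: 5 minutes, the default DU interval
    // mentioned in the comment.
    static final long MAX_AGE_MS = 5 * 60 * 1000L;

    /** Writes "<used> <timestampMs>" to the cache file (assumed format). */
    static void save(Path cacheFile, long used, long nowMs) throws IOException {
        String line = used + " " + nowMs;
        Files.write(cacheFile, line.getBytes(StandardCharsets.UTF_8));
    }

    /** Returns the cached value, or -1 if it is missing, malformed, or too old. */
    static long load(Path cacheFile, long nowMs) {
        try {
            String content =
                new String(Files.readAllBytes(cacheFile), StandardCharsets.UTF_8);
            String[] parts = content.trim().split("\\s+");
            if (parts.length != 2) {
                return -1;
            }
            long used = Long.parseLong(parts[0]);
            long savedAtMs = Long.parseLong(parts[1]);
            long ageMs = nowMs - savedAtMs;
            // Reject negative values, timestamps from the future, and stale entries.
            if (used < 0 || ageMs < 0 || ageMs >= MAX_AGE_MS) {
                return -1;
            }
            return used;
        } catch (IOException | NumberFormatException e) {
            return -1; // caller falls back to a full du scan
        }
    }
}
```

A caller that receives -1 would fall back to the regular du scan; otherwise it could pass the cached value as the initial "used" value, which is the option HADOOP-10111 proposes.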