[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

Lei (Eddy) Xu (JIRA) Thu, 05 Feb 2015 17:33:59 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308419#comment-14308419
 ]


Lei (Eddy) Xu commented on HDFS-7692:
-------------------------------------

[~guoleitao] Thanks for working on this. It looks very good overall.

Just have a few comments:

{code}
try {
   addStorageThreadPool.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
} catch (InterruptedException e) {
  e.printStackTrace();
}
{code}

Could you log the Exception here and rethrow it as {{IOException}}? 

Moreover, in tests, I think you do not need to catch this 
{{InterruptedException}}. If such an exception occurs, it would be better to 
just let the test case fail.

In {{TestDataStorage#testAddStorageDirectoreis}}:

{code}
 for (NamespaceInfo ni : namespaceInfos) {
139           storage.addStorageLocations(mockDN, ni, locations, START_OPT);    
        
148           pool = Executors.newFixedThreadPool(numThreads);
149           storage.addStorageLocations(mockDN, ni, locations, START_OPT, 
pool,
150               addedLocation);
151           pool.shutdown();
{code}

You created one thread pool for each Namespace, and wait this pool to terminate 
in the loop. I guess that your intention was creating a global pool and let 
each namespace share the same thread pool?

{code}
Executors.newFixedThreadPool(null != datanode.getConf() ? datanode
493                 .getConf().getInt(
494                     DFSConfigKeys.DFS_DATANODE_DIRECTORYSCAN_THREADS_KEY,
495                     
DFSConfigKeys.DFS_DATANODE_DIRECTORYSCAN_THREADS_DEFAULT)
496                 : dataDirs.size());
{code}

Will this {{null != datanode.getConf()}} condition actually happen?

{code}
/*
* @param successVolumes a list of successfully loaded volumes.
*/
{code}

Maybe it is better to re-phrase this as something like "An output container to 
fill with successfully loaded volumes"?

I'd love to give a non-binding +1 once these comments are addressed.

> DataStorage#addStorageLocations(...) should support MultiThread to speedup 
> the upgrade of block pool at multi storage directories.
> ----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7692
>                 URL: https://issues.apache.org/jira/browse/HDFS-7692
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 2.5.2
>            Reporter: Leitao Guo
>         Attachments: HDFS-7692.01.patch
>
>
> {code:title=DataStorage#addStorageLocations(...)|borderStyle=solid}
>     for (StorageLocation dataDir : dataDirs) {
>       File root = dataDir.getFile();
>      ... ...
>         bpStorage.recoverTransitionRead(datanode, nsInfo, bpDataDirs, 
> startOpt);
>         addBlockPoolStorage(bpid, bpStorage);
>     ... ...
>       successVolumes.add(dataDir);
>     }
> {code}
> In the above code the storage directories will be analyzed one by one, which 
> is really time consuming when upgrading HDFS with datanodes have dozens of 
> large volumes.  MultiThread dataDirs analyzing should be supported here to 
> speedup upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

Reply via email to