[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

2015-06-19 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594136#comment-14594136
 ] 

Lei (Eddy) Xu commented on HDFS-7692:
-

Hi, [~guoleitao]. Thanks a lot for updating this. It looks good after 
addressing the comments.

One question I have: why not create the {{ExecutorService}} and collect 
{{successVolumes}} within {{DataStorage#addStorageLocation}}? The interface 
might look simpler (fewer modifications) that way.

One minor nit:
{code}
  e.printStackTrace();
{code}

Can we replace it with {{log.warn(e)}} or {{log.error(e)}}?
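
A minimal sketch of the suggested change, assuming nothing about the actual patch: it uses {{java.util.logging}} only to stay self-contained, and {{loadVolume}} and {{tryLoadVolume}} are hypothetical names. The point is that the exception is handed to the logger together with a message, so the stack trace follows the configured log level and handlers instead of going to stderr.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogInsteadOfPrintStackTrace {
  private static final Logger LOG =
      Logger.getLogger(LogInsteadOfPrintStackTrace.class.getName());

  /** Hypothetical stand-in for a per-volume step that may fail. */
  static void loadVolume(String dir) throws IOException {
    if (dir.isEmpty()) {
      throw new IOException("empty storage directory path");
    }
  }

  /** Returns true if the volume loaded, false if the failure was logged. */
  static boolean tryLoadVolume(String dir) {
    try {
      loadVolume(dir);
      return true;
    } catch (IOException e) {
      // Instead of e.printStackTrace(): pass a message and the exception
      // to the logger so the stack trace ends up in the regular log output.
      LOG.log(Level.WARNING, "Failed to load volume '" + dir + "'", e);
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(tryLoadVolume("/data/1")); // loads without error
    System.out.println(tryLoadVolume(""));        // failure is logged
  }
}
```

With a commons-logging style {{Log}}, the two-argument form {{log.warn(message, e)}} is probably the closer equivalent, since the one-argument form treats the exception as the message object.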

> DataStorage#addStorageLocations(...) should support MultiThread to speedup 
> the upgrade of block pool at multi storage directories.
> --
>
> Key: HDFS-7692
> URL: https://issues.apache.org/jira/browse/HDFS-7692
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 2.5.2
> Reporter: Leitao Guo
> Assignee: Leitao Guo
> Labels: BB2015-05-TBR
> Attachments: HDFS-7692.01.patch, HDFS-7692.02.patch
>
>
> {code:title=DataStorage#addStorageLocations(...)|borderStyle=solid}
> for (StorageLocation dataDir : dataDirs) {
>   File root = dataDir.getFile();
>   ... ...
>   bpStorage.recoverTransitionRead(datanode, nsInfo, bpDataDirs, startOpt);
>   addBlockPoolStorage(bpid, bpStorage);
>   ... ...
>   successVolumes.add(dataDir);
> }
> {code}
> In the above code, the storage directories are analyzed one by one, which 
> is very time-consuming when upgrading HDFS on datanodes that have dozens of 
> large volumes. Multithreaded analysis of dataDirs should be supported here 
> to speed up the upgrade.
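
The idea in the description can be sketched roughly as follows. This is an illustration under stated assumptions, not the attached patch: directories are plain strings, {{analyzeDir}} is a hypothetical stand-in for the per-directory work (such as {{recoverTransitionRead}}), and the pool size of one thread per directory is an arbitrary choice. The pool is created and shut down inside the method, so callers see the same interface as in the sequential version.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelVolumeLoader {

  /** Hypothetical stand-in for the per-directory analysis or upgrade. */
  static void analyzeDir(String dataDir) throws IOException {
    if (dataDir.contains("bad")) {
      throw new IOException("cannot analyze " + dataDir);
    }
  }

  /** Analyzes all directories concurrently and returns those that succeeded. */
  static List<String> addStorageLocations(List<String> dataDirs)
      throws InterruptedException {
    List<String> successVolumes = new ArrayList<>();
    if (dataDirs.isEmpty()) {
      return successVolumes; // a fixed pool cannot have zero threads
    }
    ExecutorService executor = Executors.newFixedThreadPool(dataDirs.size());
    try {
      List<Callable<String>> tasks = new ArrayList<>();
      for (final String dataDir : dataDirs) {
        tasks.add(() -> {
          analyzeDir(dataDir);
          return dataDir;
        });
      }
      // invokeAll blocks until every task is done and returns the futures
      // in the same order as the submitted tasks.
      for (Future<String> result : executor.invokeAll(tasks)) {
        try {
          successVolumes.add(result.get());
        } catch (ExecutionException e) {
          // This directory failed; the others are still collected.
        }
      }
      return successVolumes;
    } finally {
      executor.shutdown();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println(
        addStorageLocations(Arrays.asList("/data/1", "/data/bad", "/data/3")));
  }
}
```

Only the worker threads run the per-directory step; the calling thread alone appends to {{successVolumes}}, so the list needs no extra synchronization in this sketch.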



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

2015-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14550131#comment-14550131
 ] 

Hadoop QA commented on HDFS-7692:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 37s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac |   7m 26s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc |   9m 35s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit |   0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle |   0m 37s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install |   1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 32s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   3m  2s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native |   3m 13s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 164m  3s | Tests failed in hadoop-hdfs. |
| | | 205m  9s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.ha.TestBootstrapStandbyWithQJM |
|   | hadoop.hdfs.server.namenode.ha.TestHASafeMode |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12733723/HDFS-7692.02.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 0790275 |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11047/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11047/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11047/console |


This message was automatically generated.



[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

2015-05-18 Thread Leitao Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549930#comment-14549930
 ] 

Leitao Guo commented on HDFS-7692:
--

Sorry, it was my mistake to post the same comment so many times here! It seems 
that my network connection is not very good right now...



[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

2015-05-18 Thread Leitao Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549913#comment-14549913
 ] 

Leitao Guo commented on HDFS-7692:
--

[~eddyxu], thanks for your comments. Please take a look at the new patch.
1. In DataStorage#recoverTransitionRead, log the InterruptedException and 
rethrow it as an InterruptedIOException.
2. In TestDataStorage#testAddStorageDirectories, catch the InterruptedException 
and let the test case fail.
3. The multithreading in DataStorage#addStorageLocations() is for one specific 
namespace, so in TestDataStorage#testAddStorageDirectories my intention is to 
create one thread pool for each namespace. No change here.
4. Rephrased the parameter successVolumes.

[~szetszwo], thanks for your comments. Please take a look at the new patch.
1. InterruptedException is now rethrown as InterruptedIOException.
2. I think it's a good idea to log the upgrade progress for each dir, but so 
far we cannot easily get the progress from the current API. Do you think it's 
necessary to file a new JIRA to follow up on this?
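
As a rough illustration of the first item in each list (log the InterruptedException, then rethrow it as an InterruptedIOException), here is a minimal sketch. It is not taken from the patch: {{waitFor}} is a hypothetical helper, {{java.util.logging}} is used only to keep the example self-contained, and restoring the interrupt flag before rethrowing is an added assumption about what callers would want.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.logging.Level;
import java.util.logging.Logger;

public class InterruptTranslation {
  private static final Logger LOG =
      Logger.getLogger(InterruptTranslation.class.getName());

  /** Waits for a task and reports interruption as an IOException subtype. */
  static <T> T waitFor(Future<T> task) throws IOException {
    try {
      return task.get();
    } catch (InterruptedException e) {
      String msg = "Interrupted while waiting for a storage directory task";
      LOG.log(Level.WARNING, msg, e);
      // Assumption: keep the interrupt visible to code further up the stack.
      Thread.currentThread().interrupt();
      InterruptedIOException iioe = new InterruptedIOException(msg);
      iioe.initCause(e); // InterruptedIOException has no cause constructor
      throw iioe;
    } catch (ExecutionException e) {
      throw new IOException("Storage directory task failed", e.getCause());
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(waitFor(CompletableFuture.completedFuture("done")));
  }
}
```

Because InterruptedIOException extends IOException, a method that already declares {{throws IOException}} can surface the interruption without changing its signature.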



[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

2015-05-18 Thread Leitao Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549906#comment-14549906
 ] 

Leitao Guo commented on HDFS-7692:
--

[~eddyxu], thanks for your comments. Please check the new patch.
1. In DataStorage#recoverTransitionRead, log the InterruptedException and 
rethrow it as InterruptedIOException.
2. In TestDataStorage#testAddStorageDirectoreis, catch InterruptedException and 
let the test case fail.
3. The multithreading in DataStorage#addStorageLocations() is for one specific 
namespace, so in TestDataStorage#testAddStorageDirectoreis my intention is 
to create one thread pool for each namespace. No change here.
4. Rephrased the parameter successVolumes.

[~szetszwo], thanks for your comments. Please check the new patch.
1. InterruptedException is rethrown as InterruptedIOException.
2. I think it's a good idea to log the upgrade progress for each dir, but so 
far we cannot easily get the progress from the current API. Do you think it's 
necessary to file a new JIRA to follow up on this?
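
For readers following along, here is a rough sketch of the "log the InterruptedException and rethrow it as InterruptedIOException" idea from the points above. It is not the code in the attached patches: the class InterruptRethrowSketch, the method runAll, and the use of java.util.logging are assumptions made only for illustration.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical helper for illustration only; not part of the HDFS-7692 patch. */
class InterruptRethrowSketch {
  private static final Logger LOG = Logger.getLogger("InterruptRethrowSketch");

  /**
   * Runs all tasks on a fixed-size pool and waits for them in submission order.
   * An interrupt while waiting is logged and surfaced as InterruptedIOException,
   * so callers that only declare IOException still see the interruption.
   */
  static <T> List<T> runAll(List<Callable<T>> tasks, int threads) throws IOException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<T>> futures = new ArrayList<>();
      for (Callable<T> task : tasks) {
        futures.add(pool.submit(task));
      }
      List<T> results = new ArrayList<>();
      for (Future<T> future : futures) {
        try {
          results.add(future.get());
        } catch (InterruptedException e) {
          LOG.log(Level.WARNING, "Interrupted while waiting for storage tasks", e);
          // Restore the interrupt flag so code further up the stack can see it.
          Thread.currentThread().interrupt();
          InterruptedIOException iioe =
              new InterruptedIOException("Interrupted while waiting for storage tasks");
          iioe.initCause(e);
          throw iioe;
        } catch (ExecutionException e) {
          throw new IOException("Storage task failed", e.getCause());
        }
      }
      return results;
    } finally {
      // Cancel anything still running; a no-op if all tasks already finished.
      pool.shutdownNow();
    }
  }
}
```

Restoring the interrupt flag before rethrowing is a common convention rather than something stated in this thread, so treat that line as a design suggestion.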

> DataStorage#addStorageLocations(...) should support MultiThread to speedup 
> the upgrade of block pool at multi storage directories.
> --
>
> Key: HDFS-7692
> URL: https://issues.apache.org/jira/browse/HDFS-7692
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 2.5.2
> Reporter: Leitao Guo
> Assignee: Leitao Guo
> Labels: BB2015-05-TBR
> Attachments: HDFS-7692.01.patch, HDFS-7692.02.patch
>
>
> {code:title=DataStorage#addStorageLocations(...)|borderStyle=solid}
> for (StorageLocation dataDir : dataDirs) {
>   File root = dataDir.getFile();
>   ... ...
>   bpStorage.recoverTransitionRead(datanode, nsInfo, bpDataDirs, startOpt);
>   addBlockPoolStorage(bpid, bpStorage);
>   ... ...
>   successVolumes.add(dataDir);
> }
> {code}
> In the above code, the storage directories are analyzed one by one, which 
> is really time consuming when upgrading HDFS on datanodes that have dozens of 
> large volumes. Multithreaded analysis of dataDirs should be supported here to 
> speed up the upgrade.
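
To make the proposal in the quoted description concrete, the sequential loop could in principle be turned into something like the sketch below, which submits one task per directory and collects the directories that loaded successfully. This is only an illustration under assumptions: ParallelDirSketch, DirLoader, and loadAll are hypothetical names, directories are plain strings, and the real per-directory work (recoverTransitionRead and friends) is replaced by a callback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical illustration only; not the DataStorage code from the patch. */
class ParallelDirSketch {

  /** Stand-in for the per-directory work done inside the original loop. */
  interface DirLoader {
    void load(String dir) throws Exception;
  }

  /**
   * Loads every directory on a fixed-size pool and returns the ones that
   * succeeded, in the same order as the input list.
   */
  static List<String> loadAll(List<String> dirs, DirLoader loader, int threads)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      // Submit one task per directory; futures.get(i) belongs to dirs.get(i).
      List<Future<Void>> futures = new ArrayList<>();
      for (final String dir : dirs) {
        Callable<Void> task = () -> {
          loader.load(dir);
          return null;
        };
        futures.add(pool.submit(task));
      }
      // Wait for each task; a failed directory is reported and skipped.
      List<String> successDirs = new ArrayList<>();
      for (int i = 0; i < dirs.size(); i++) {
        try {
          futures.get(i).get();
          successDirs.add(dirs.get(i));
        } catch (ExecutionException e) {
          System.err.println("Failed to load " + dirs.get(i) + ": " + e.getCause());
        }
      }
      return successDirs;
    } finally {
      pool.shutdown();
    }
  }
}
```

Collecting results on the calling thread keeps the success list free of concurrent writes, which is one simple way to avoid sharing a mutable list between workers.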



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

2015-02-06 Thread Leitao Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310592#comment-14310592
 ] 

Leitao Guo commented on HDFS-7692:
--

[~szetszwo] thanks for your comments, I will update the patch asap.

> DataStorage#addStorageLocations(...) should support MultiThread to speedup 
> the upgrade of block pool at multi storage directories.
> --
>
> Key: HDFS-7692
> URL: https://issues.apache.org/jira/browse/HDFS-7692
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.2
>Reporter: Leitao Guo
>Assignee: Leitao Guo
> Attachments: HDFS-7692.01.patch
>
>
> {code:title=DataStorage#addStorageLocations(...)|borderStyle=solid}
> for (StorageLocation dataDir : dataDirs) {
>   File root = dataDir.getFile();
>  ... ...
> bpStorage.recoverTransitionRead(datanode, nsInfo, bpDataDirs, 
> startOpt);
> addBlockPoolStorage(bpid, bpStorage);
> ... ...
>   successVolumes.add(dataDir);
> }
> {code}
> In the above code the storage directories will be analyzed one by one, which 
> is really time consuming when upgrading HDFS with datanodes have dozens of 
> large volumes.  MultiThread dataDirs analyzing should be supported here to 
> speedup upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

2015-02-06 Thread Leitao Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310591#comment-14310591
 ] 

Leitao Guo commented on HDFS-7692:
--

When upgrading before the patch, I found high CPU utilization (~90%) in 
our cluster, so I think we had better limit the number of threads here. I will 
run a test to verify this.
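
One way to limit the number of threads is to cap the pool size instead of using one thread per directory. The sketch below is only an illustration under assumptions: {{UpgradePoolSizing}}, {{poolSize}} and the {{maxThreads}} cap are hypothetical names, not existing Hadoop code or configuration keys.

```java
// Hypothetical helper, not part of Hadoop.
final class UpgradePoolSizing {
  /**
   * Returns a pool size that is at least 1 and at most maxThreads.
   * The lower bound matters because Executors.newFixedThreadPool rejects
   * a size of 0 with an IllegalArgumentException.
   */
  static int poolSize(int numDataDirs, int maxThreads) {
    if (maxThreads < 1) {
      throw new IllegalArgumentException("maxThreads must be >= 1");
    }
    return Math.max(1, Math.min(numDataDirs, maxThreads));
  }
}
```

With a cap of 8, a datanode with 24 volumes would get 8 worker threads, while one with 3 volumes would still get one thread per volume.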

> DataStorage#addStorageLocations(...) should support MultiThread to speedup 
> the upgrade of block pool at multi storage directories.
> --
>
> Key: HDFS-7692
> URL: https://issues.apache.org/jira/browse/HDFS-7692
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.2
>Reporter: Leitao Guo
>Assignee: Leitao Guo
> Attachments: HDFS-7692.01.patch
>
>
> {code:title=DataStorage#addStorageLocations(...)|borderStyle=solid}
> for (StorageLocation dataDir : dataDirs) {
>   File root = dataDir.getFile();
>  ... ...
> bpStorage.recoverTransitionRead(datanode, nsInfo, bpDataDirs, 
> startOpt);
> addBlockPoolStorage(bpid, bpStorage);
> ... ...
>   successVolumes.add(dataDir);
> }
> {code}
> In the above code the storage directories will be analyzed one by one, which 
> is really time consuming when upgrading HDFS with datanodes have dozens of 
> large volumes.  MultiThread dataDirs analyzing should be supported here to 
> speedup upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

2015-02-06 Thread Leitao Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310587#comment-14310587
 ] 

Leitao Guo commented on HDFS-7692:
--

[~eddyxu], thanks for your review and comments!

When running the tests after applying the patch, I got a NullPointerException 
at this line, so I added the check {{null != datanode.getConf()}} here.
{code}
Executors.newFixedThreadPool(null != datanode.getConf()
    ? datanode.getConf().getInt(
        DFSConfigKeys.DFS_DATANODE_DIRECTORYSCAN_THREADS_KEY,
        DFSConfigKeys.DFS_DATANODE_DIRECTORYSCAN_THREADS_DEFAULT)
    : dataDirs.size());
{code}

I will update the patch according to your comments, thanks!

> DataStorage#addStorageLocations(...) should support MultiThread to speedup 
> the upgrade of block pool at multi storage directories.
> --
>
> Key: HDFS-7692
> URL: https://issues.apache.org/jira/browse/HDFS-7692
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.2
>Reporter: Leitao Guo
>Assignee: Leitao Guo
> Attachments: HDFS-7692.01.patch
>
>
> {code:title=DataStorage#addStorageLocations(...)|borderStyle=solid}
> for (StorageLocation dataDir : dataDirs) {
>   File root = dataDir.getFile();
>  ... ...
> bpStorage.recoverTransitionRead(datanode, nsInfo, bpDataDirs, 
> startOpt);
> addBlockPoolStorage(bpid, bpStorage);
> ... ...
>   successVolumes.add(dataDir);
> }
> {code}
> In the above code the storage directories will be analyzed one by one, which 
> is really time consuming when upgrading HDFS with datanodes have dozens of 
> large volumes.  MultiThread dataDirs analyzing should be supported here to 
> speedup upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310019#comment-14310019
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7692:
---

Some more comments:
- InterruptedException should be re-thrown as InterruptedIOException rather 
than ignored.
- Instead of waiting forever as below, 
{code}
+  addStorageThreadPool.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
{code}
how about waiting for a short period, say 1 minute, and printing an info 
message?  It would be great if we could print the progress for each dir.
{code}
  while (!addStorageThreadPool.awaitTermination(1, TimeUnit.MINUTES)) {
    LOG.info(..);
  }
{code}

BTW, thanks a lot for working on this!
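
The polling wait suggested above could look roughly like the following. This is a sketch under assumptions, not the patch: {{ProgressWait}} and {{runWithProgress}} are hypothetical names, the tasks are plain {{Runnable}}s standing in for the per-directory work, and progress is approximated by a counter of finished tasks and printed to stdout rather than through {{LOG.info}}.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Hadoop.
final class ProgressWait {
  /**
   * Runs all tasks on a fixed pool and reports progress every pollMillis
   * until the pool terminates. Returns the number of finished tasks.
   */
  static int runWithProgress(List<Runnable> tasks, long pollMillis)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
    AtomicInteger done = new AtomicInteger();
    for (Runnable task : tasks) {
      pool.execute(() -> {
        try {
          task.run();
        } finally {
          done.incrementAndGet(); // counted even if the task throws
        }
      });
    }
    pool.shutdown(); // no new tasks; submitted ones keep running
    // Wake up periodically instead of blocking indefinitely.
    while (!pool.awaitTermination(pollMillis, TimeUnit.MILLISECONDS)) {
      System.out.println("Still waiting: " + done.get() + "/" + tasks.size()
          + " directories finished");
    }
    return done.get();
  }
}
```

A counter of finished tasks only gives per-directory granularity; reporting progress inside a single directory would need support from the upgrade code itself.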

> DataStorage#addStorageLocations(...) should support MultiThread to speedup 
> the upgrade of block pool at multi storage directories.
> --
>
> Key: HDFS-7692
> URL: https://issues.apache.org/jira/browse/HDFS-7692
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.2
>Reporter: Leitao Guo
> Attachments: HDFS-7692.01.patch
>
>
> {code:title=DataStorage#addStorageLocations(...)|borderStyle=solid}
> for (StorageLocation dataDir : dataDirs) {
>   File root = dataDir.getFile();
>  ... ...
> bpStorage.recoverTransitionRead(datanode, nsInfo, bpDataDirs, 
> startOpt);
> addBlockPoolStorage(bpid, bpStorage);
> ... ...
>   successVolumes.add(dataDir);
> }
> {code}
> In the above code the storage directories will be analyzed one by one, which 
> is really time consuming when upgrading HDFS with datanodes have dozens of 
> large volumes.  MultiThread dataDirs analyzing should be supported here to 
> speedup upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

2015-02-06 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14309994#comment-14309994
 ] 

Tsz Wo Nicholas Sze commented on HDFS-7692:
---

I think we don't need DFS_DATANODE_DIRECTORYSCAN_THREADS_KEY.  Since the number 
of dataDirs won't be large, why not always use dataDirs.size()?

> DataStorage#addStorageLocations(...) should support MultiThread to speedup 
> the upgrade of block pool at multi storage directories.
> --
>
> Key: HDFS-7692
> URL: https://issues.apache.org/jira/browse/HDFS-7692
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.2
>Reporter: Leitao Guo
> Attachments: HDFS-7692.01.patch
>
>
> {code:title=DataStorage#addStorageLocations(...)|borderStyle=solid}
> for (StorageLocation dataDir : dataDirs) {
>   File root = dataDir.getFile();
>  ... ...
> bpStorage.recoverTransitionRead(datanode, nsInfo, bpDataDirs, 
> startOpt);
> addBlockPoolStorage(bpid, bpStorage);
> ... ...
>   successVolumes.add(dataDir);
> }
> {code}
> In the above code the storage directories will be analyzed one by one, which 
> is really time consuming when upgrading HDFS with datanodes have dozens of 
> large volumes.  MultiThread dataDirs analyzing should be supported here to 
> speedup upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

2015-02-05 Thread Lei (Eddy) Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14308419#comment-14308419
 ] 

Lei (Eddy) Xu commented on HDFS-7692:
-

[~guoleitao] Thanks for working on this. It looks very good overall.

Just have a few comments:

{code}
try {
  addStorageThreadPool.awaitTermination(Long.MAX_VALUE, TimeUnit.DAYS);
} catch (InterruptedException e) {
  e.printStackTrace();
}
{code}

Could you log the Exception here and rethrow it as {{IOException}}? 

Moreover, in tests, I think you do not need to catch this 
{{InterruptedException}}. If such an exception occurs, it would be better to 
just let the test case fail.

In {{TestDataStorage#testAddStorageDirectoreis}}:

{code}
for (NamespaceInfo ni : namespaceInfos) {
  storage.addStorageLocations(mockDN, ni, locations, START_OPT);

  pool = Executors.newFixedThreadPool(numThreads);
  storage.addStorageLocations(mockDN, ni, locations, START_OPT, pool,
      addedLocation);
  pool.shutdown();
{code}

You created one thread pool for each namespace and wait for that pool to 
terminate inside the loop. I guess your intention was to create a global pool 
and let each namespace share it?
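
For comparison, a shared pool across namespaces might be sketched as below. Everything here is illustrative and assumed: {{SharedPoolSketch}} and {{loadAll}} are hypothetical, namespaces and locations are plain strings, and the string concatenation stands in for the real per-directory work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of Hadoop.
final class SharedPoolSketch {
  /** Loads every (namespace, location) pair on one shared pool; returns the loaded pairs. */
  static List<String> loadAll(List<String> namespaces, List<String> locations,
      int numThreads) throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(numThreads); // created once
    List<String> loaded = new ArrayList<>();
    try {
      for (String ns : namespaces) {
        List<Callable<String>> tasks = new ArrayList<>();
        for (String loc : locations) {
          tasks.add(() -> ns + ":" + loc); // stand-in for the per-directory work
        }
        // invokeAll blocks until this namespace's tasks are done and returns
        // the futures in task order; the pool stays usable for the next namespace.
        for (Future<String> f : pool.invokeAll(tasks)) {
          loaded.add(f.get());
        }
      }
    } finally {
      pool.shutdown(); // shut down once, after all namespaces
    }
    return loaded;
  }
}
```

The difference from a per-namespace pool is that the threads are created once and reused, at the cost of passing the pool in from outside.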

{code}
Executors.newFixedThreadPool(null != datanode.getConf()
    ? datanode.getConf().getInt(
        DFSConfigKeys.DFS_DATANODE_DIRECTORYSCAN_THREADS_KEY,
        DFSConfigKeys.DFS_DATANODE_DIRECTORYSCAN_THREADS_DEFAULT)
    : dataDirs.size());
{code}

Is this {{null != datanode.getConf()}} check actually necessary?

{code}
/*
* @param successVolumes a list of successfully loaded volumes.
*/
{code}

Maybe it is better to re-phrase this as something like "An output container to 
fill with successfully loaded volumes"?

I'd love to give a non-binding +1 once these comments are addressed.
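
To illustrate the "output container" wording for {{successVolumes}}: once directories are loaded by several threads, the container passed in has to tolerate concurrent {{add()}} calls. The sketch below is an assumption-laden example, not the patch: {{OutputContainerSketch}} and {{loadDirs}} are hypothetical, directories are plain strings, and the non-empty check stands in for "loaded without error".

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Hadoop.
final class OutputContainerSketch {
  /**
   * @param dirs directories to load (plain strings here, for illustration)
   * @param successVolumes an output container to fill with successfully
   *        loaded volumes; must be safe for concurrent add() calls
   */
  static void loadDirs(List<String> dirs, List<String> successVolumes)
      throws InterruptedException {
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, dirs.size()));
    for (String dir : dirs) {
      pool.execute(() -> {
        if (!dir.isEmpty()) { // stand-in for "the directory loaded without error"
          successVolumes.add(dir);
        }
      });
    }
    pool.shutdown();
    while (!pool.awaitTermination(1, TimeUnit.SECONDS)) {
      // not terminated yet, keep waiting
    }
  }
}
```

A caller could pass {{Collections.synchronizedList(new ArrayList<>())}} as the container; the order of the entries then depends on which thread finishes first.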

> DataStorage#addStorageLocations(...) should support MultiThread to speedup 
> the upgrade of block pool at multi storage directories.
> --
>
> Key: HDFS-7692
> URL: https://issues.apache.org/jira/browse/HDFS-7692
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.2
>Reporter: Leitao Guo
> Attachments: HDFS-7692.01.patch
>
>
> {code:title=DataStorage#addStorageLocations(...)|borderStyle=solid}
> for (StorageLocation dataDir : dataDirs) {
>   File root = dataDir.getFile();
>  ... ...
> bpStorage.recoverTransitionRead(datanode, nsInfo, bpDataDirs, 
> startOpt);
> addBlockPoolStorage(bpid, bpStorage);
> ... ...
>   successVolumes.add(dataDir);
> }
> {code}
> In the above code the storage directories will be analyzed one by one, which 
> is really time consuming when upgrading HDFS with datanodes have dozens of 
> large volumes.  MultiThread dataDirs analyzing should be supported here to 
> speedup upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7692) DataStorage#addStorageLocations(...) should support MultiThread to speedup the upgrade of block pool at multi storage directories.

2015-02-04 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306034#comment-14306034
 ] 

Hadoop QA commented on HDFS-7692:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12696523/HDFS-7692.01.patch
  against trunk revision 064e077.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.qjournal.TestSecureNNWithQJM

  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.namenode.TestSecurityTokenEditLog

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9430//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9430//console

This message is automatically generated.

> DataStorage#addStorageLocations(...) should support MultiThread to speedup 
> the upgrade of block pool at multi storage directories.
> --
>
> Key: HDFS-7692
> URL: https://issues.apache.org/jira/browse/HDFS-7692
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.5.2
>Reporter: Leitao Guo
> Attachments: HDFS-7692.01.patch
>
>
> {code:title=DataStorage#addStorageLocations(...)|borderStyle=solid}
> for (StorageLocation dataDir : dataDirs) {
>   File root = dataDir.getFile();
>  ... ...
> bpStorage.recoverTransitionRead(datanode, nsInfo, bpDataDirs, 
> startOpt);
> addBlockPoolStorage(bpid, bpStorage);
> ... ...
>   successVolumes.add(dataDir);
> }
> {code}
> In the above code the storage directories will be analyzed one by one, which 
> is really time consuming when upgrading HDFS with datanodes have dozens of 
> large volumes.  MultiThread dataDirs analyzing should be supported here to 
> speedup upgrade.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)