[jira] [Commented] (HDFS-8392) DataNode support for multiple datasets

2015-06-01 Thread Arpit Agarwal (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568109#comment-14568109 ]

Arpit Agarwal commented on HDFS-8392:
-

Thanks for the review, Jitendra. I fixed the first point and committed to 
branch-7240 since the delta is trivial.

bq. Ideally we should change the tests to use the datasetsMap instead of legacy 
field 'data'. That can be taken in a separate jira.
I'll file a Jira so that we fix it in trunk after the merge. I'm deferring it 
for now to avoid frequent merge pain on the branch.

Delta for reference:
{code}
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
@@ -2598,7 +2598,7 @@ public void scheduleAllBlockReport(long delay) {
    *
    * @return the fsdataset that stores the blocks
    */
-  @VisibleForTesting
+  @Deprecated
   public FsDatasetSpi<?> getFSDataset() {
     Preconditions.checkState(datasets.size() <= 1,
         "Did not expect more than one Dataset here.");
@@ -2621,7 +2621,7 @@ public BlockScanner getBlockScanner() {
    *
    * @return
    */
-  @VisibleForTesting
+  @Deprecated
   DirectoryScanner getDirectoryScanner() {
     return directoryScannersMap.get(getFSDataset());
   }
{code}
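
For readers without the branch checked out, here is a minimal, self-contained sketch of the pattern in the delta: a legacy single-dataset accessor that is marked deprecated and fails fast if more than one dataset exists. The class {{LegacyAccessorSketch}}, the nested {{Dataset}} interface and {{addDataset}} are hypothetical stand-ins, not the real {{DataNode}} or {{FsDatasetSpi}}, and a plain {{IllegalStateException}} stands in for Guava's {{Preconditions.checkState}}.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the DataNode; names are illustrative only.
class LegacyAccessorSketch {

  /** Hypothetical stand-in for a dataset; not the real FsDatasetSpi. */
  interface Dataset {
    String name();
  }

  private final List<Dataset> datasets = new ArrayList<>();

  void addDataset(Dataset dataset) {
    datasets.add(dataset);
  }

  /**
   * Legacy accessor kept so that existing callers still compile.
   *
   * @deprecated new code should look the dataset up per block pool.
   */
  @Deprecated
  Dataset getFSDataset() {
    // Same guard as the delta: only valid while at most one dataset exists.
    if (datasets.size() > 1) {
      throw new IllegalStateException(
          "Did not expect more than one Dataset here.");
    }
    return datasets.isEmpty() ? null : datasets.get(0);
  }

  public static void main(String[] args) {
    LegacyAccessorSketch dn = new LegacyAccessorSketch();
    dn.addDataset(() -> "hdfs");
    System.out.println(dn.getFSDataset().name());
  }
}
```

Marking the accessor {{@Deprecated}} makes the compiler warn at every new call site, which is the effect the review asked for: existing tests keep working, but new ones are steered away from the single-dataset assumption.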


 DataNode support for multiple datasets
 --

 Key: HDFS-8392
 URL: https://issues.apache.org/jira/browse/HDFS-8392
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
 Attachments: HDFS-8392-HDFS-7240.01.patch, 
 HDFS-8392-HDFS-7240.02.patch, HDFS-8392-HDFS-7240.03.patch


 For HDFS-7240 we would like to share available DataNode storage across HDFS 
 blocks and Ozone objects.
 The DataNode already supports sharing available storage across multiple block 
 pool IDs for the federation feature. However all federated block pools use 
 the same dataset implementation i.e. {{FsDatasetImpl}}.
 We can extend the DataNode to support multiple dataset implementations so the 
 same storage space can be shared across one or more HDFS block pools and one 
 or more Ozone block pools.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8392) DataNode support for multiple datasets

2015-05-31 Thread Jitendra Nath Pandey (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566953#comment-14566953 ]

Jitendra Nath Pandey commented on HDFS-8392:


[~arpitagarwal], the patch looks very good. I have a couple of minor comments.
   # Datanode#getFSDataset(), getDirectoryScanner: Please annotate them as 
deprecated. We should probably also remove @VisibleForTesting, so that new 
tests are not written using these methods.
   # Ideally we should change the tests to use the datasetsMap instead of the 
legacy field 'data'. That can be taken up in a separate jira.

I am +1 on the patch.



[jira] [Commented] (HDFS-8392) DataNode support for multiple datasets

2015-05-22 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557061#comment-14557061 ]

Hadoop QA commented on HDFS-8392:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 48s | Pre-patch HDFS-7240 compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 4 new or modified test files. |
| {color:green}+1{color} | javac | 7m 32s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 35s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 2m 14s | The applied patch generated 8 new checkstyle issues (total was 665, now 659). |
| {color:red}-1{color} | whitespace | 0m 9s | The patch has 7 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 35s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 3m 4s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 3m 14s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 162m 28s | Tests failed in hadoop-hdfs. |
| | | 205m 41s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12734938/HDFS-8392-HDFS-7240.03.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7240 / 770ed92 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/2/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/2/artifact/patchprocess/whitespace.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/2/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/2/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf907.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/2/console |


This message was automatically generated.



[jira] [Commented] (HDFS-8392) DataNode support for multiple datasets

2015-05-21 Thread Arpit Agarwal (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555264#comment-14555264 ]

Arpit Agarwal commented on HDFS-8392:
-

The attached patch is the first step toward supporting alternate Dataset 
implementations. It removes assumptions in the DataNode code that there is a 
single dataset instance. Instead, the block pool ID is used to look up the 
dataset. Dataset instantiation is keyed off the service type, so all 
federated block pools share the same dataset instance.

I left {{Datanode#data}} and {{Datanode#getFSDataset}} in place to avoid 
massive changes to existing test cases and tagged them {{@VisibleForTesting}}. 
HDFS unit tests will never have more than one dataset instance.
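
The lookup scheme described above can be sketched roughly as follows. This is an illustration only, not code from the patch: {{DatasetLookupSketch}}, the {{ServiceType}} enum values, {{register}} and {{lookup}} are all hypothetical names. It shows a dataset being created once per service type and then found by block pool ID, so two federated HDFS block pools resolve to the same instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from the actual patch.
class DatasetLookupSketch {

  /** Assumed service types; the real set of values may differ. */
  enum ServiceType { HDFS, OZONE }

  /** Placeholder dataset that only remembers which service it serves. */
  static final class Dataset {
    final ServiceType type;

    Dataset(ServiceType type) {
      this.type = type;
    }
  }

  // One dataset instance per service type.
  private final Map<ServiceType, Dataset> byService = new HashMap<>();
  // Block pool ID -> dataset that services it.
  private final Map<String, Dataset> byBlockPool = new HashMap<>();

  /** Registers a block pool, creating the service's dataset on first use. */
  Dataset register(String bpid, ServiceType type) {
    Dataset ds = byService.get(type);
    if (ds == null) {
      ds = new Dataset(type);
      byService.put(type, ds);
    }
    byBlockPool.put(bpid, ds);
    return ds;
  }

  /** Looks up the dataset for a block pool instead of using a single field. */
  Dataset lookup(String bpid) {
    Dataset ds = byBlockPool.get(bpid);
    if (ds == null) {
      throw new IllegalArgumentException("Unknown block pool: " + bpid);
    }
    return ds;
  }

  public static void main(String[] args) {
    DatasetLookupSketch dn = new DatasetLookupSketch();
    dn.register("BP-1", ServiceType.HDFS);
    dn.register("BP-2", ServiceType.HDFS);
    // Both federated block pools share one dataset instance.
    System.out.println(dn.lookup("BP-1") == dn.lookup("BP-2"));
  }
}
```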



[jira] [Commented] (HDFS-8392) DataNode support for multiple datasets

2015-05-21 Thread Hadoop QA (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555459#comment-14555459 ]

Hadoop QA commented on HDFS-8392:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 41s | Pre-patch HDFS-7240 compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 4 new or modified test files. |
| {color:green}+1{color} | javac | 7m 28s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 39s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 2m 12s | The applied patch generated 38 new checkstyle issues (total was 665, now 691). |
| {color:red}-1{color} | whitespace | 0m 8s | The patch has 7 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| {color:green}+1{color} | install | 1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 3m 4s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 3m 14s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 168m 28s | Tests failed in hadoop-hdfs. |
| | | 211m 32s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.server.namenode.ha.TestHAStateTransitions |
|   | hadoop.hdfs.server.namenode.TestCheckpoint |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestWriteToReplica |
|   | hadoop.hdfs.web.TestWebHdfsWithMultipleNameNodes |
|   | hadoop.hdfs.server.datanode.TestDeleteBlockPool |
|   | hadoop.hdfs.server.namenode.TestFsckWithMultipleNameNodes |
|   | hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.TestDecommission |
|   | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.TestAppendSnapshotTruncate |
|   | hadoop.fs.viewfs.TestViewFileSystemHdfs |
|   | hadoop.hdfs.server.datanode.TestRefreshNamenodes |
|   | hadoop.hdfs.server.datanode.TestDataNodeExit |
|   | hadoop.hdfs.server.datanode.TestBlockScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12734668/HDFS-8392-HDFS-7240.01.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | HDFS-7240 / 770ed92 |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/11092/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt |
| whitespace | https://builds.apache.org/job/PreCommit-HDFS-Build/11092/artifact/patchprocess/whitespace.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/11092/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/11092/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/11092/console |


This message was automatically generated.



[jira] [Commented] (HDFS-8392) DataNode support for multiple datasets

2015-05-14 Thread Arpit Agarwal (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14544064#comment-14544064 ]

Arpit Agarwal commented on HDFS-8392:
-

FsDatasetSpi is geared towards storing and retrieving files. In the object 
store we want to be able to store and retrieve metadata containers and data 
containers. Files may not be the best abstraction for these containers. For 
these we'll introduce a StorageContainerDataset. We don't foresee a third 
dataset type right now.

The DataNode already supports multiple block pools per storage volume, and most 
of the difficult work was done as part of the federation feature. It is 
relatively straightforward to extend it to support the notion of a dataset per 
block pool. So in a cluster running non-federated HDFS and object store 
services, the DataNodes would have two block pools and two datasets, each 
servicing one block pool.

Hope that's a little clearer. I intend to post a patch next week.
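
As a rough illustration of the two-block-pool case described above (a sketch under assumptions, not the planned implementation), a DataNode could hold one file-oriented dataset and one container-oriented dataset, each bound to its own block pool. {{TwoDatasetSketch}}, {{Dataset}}, {{FileDataset}}, {{ContainerDataset}} and {{buildDatasets}} are hypothetical names; the real types under discussion are {{FsDatasetSpi}} and the proposed {{StorageContainerDataset}}, whose interfaces are not shown in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the interfaces below are placeholders.
class TwoDatasetSketch {

  /** Minimal common view of a dataset, for illustration only. */
  interface Dataset {
    String describe();
  }

  /** Stand-in for a file-oriented dataset that serves HDFS blocks. */
  static final class FileDataset implements Dataset {
    @Override
    public String describe() {
      return "file-oriented dataset";
    }
  }

  /** Stand-in for a dataset that serves metadata and data containers. */
  static final class ContainerDataset implements Dataset {
    @Override
    public String describe() {
      return "container-oriented dataset";
    }
  }

  /** Non-federated HDFS plus object store: two block pools, two datasets. */
  static Map<String, Dataset> buildDatasets(String hdfsBpid,
                                            String objectStoreBpid) {
    Map<String, Dataset> datasets = new LinkedHashMap<>();
    datasets.put(hdfsBpid, new FileDataset());
    datasets.put(objectStoreBpid, new ContainerDataset());
    return datasets;
  }

  public static void main(String[] args) {
    Map<String, Dataset> datasets = buildDatasets("BP-hdfs", "BP-objectstore");
    for (Map.Entry<String, Dataset> e : datasets.entrySet()) {
      System.out.println(e.getKey() + " -> " + e.getValue().describe());
    }
  }
}
```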



[jira] [Commented] (HDFS-8392) DataNode support for multiple datasets

2015-05-13 Thread Joe Pallas (JIRA)

[ https://issues.apache.org/jira/browse/HDFS-8392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14542952#comment-14542952 ]

Joe Pallas commented on HDFS-8392:
--

The current Ozone Architecture document seems to say that storage would simply 
use different block pools, so it isn't clear what motivates this. The datanode 
is firmly built around the notion of a single dataset at present, and it isn't 
clear how multiple datasets might share the same volumes (if that is the 
intent). Could you elaborate on what problems this would be trying to solve?

