[jira] [Commented] (HDFS-4987) Namenode changes to track multiple storages

2013-08-30 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755403#comment-13755403
 ] 

Junping Du commented on HDFS-4987:
--

Filed HDFS-5154 to track the unit test failures.

 Namenode changes to track multiple storages
 ---

 Key: HDFS-4987
 URL: https://issues.apache.org/jira/browse/HDFS-4987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Suresh Srinivas
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: Heterogeneous Storage (HDFS-2832)

 Attachments: h4987_20130822.patch, h4987_20130827b.patch, 
 h4987_20130827c.patch, h4987_20130827.patch, h4987_20130828.patch


 Currently the namenode tracks the corresponding DatanodeDescriptor in
 BlockInfo. I propose replacing this with a new abstraction, StorageDescriptor.
 This change will also make DatanodeDescriptor a collection of
 StorageDescriptors. Given a block, this will make it possible to identify
 both its storage and its Datanode.
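
 The proposal above can be pictured with a minimal sketch. It is only an
 illustration under assumptions: the class names DatanodeSketch, StorageSketch
 and BlockSketch, and all of their methods, are hypothetical and are not the
 classes in the patch. It shows a datanode holding a collection of storages,
 and a block that records storages, so that both the storage and the datanode
 can be reached from the block.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a datanode modeled as a collection of storages.
final class DatanodeSketch {
    final String name;
    private final Map<String, StorageSketch> storages = new LinkedHashMap<>();

    DatanodeSketch(String name) {
        this.name = name;
    }

    // Returns the storage with this ID, creating it on first use.
    StorageSketch addStorage(String storageId) {
        return storages.computeIfAbsent(storageId, id -> new StorageSketch(id, this));
    }

    int storageCount() {
        return storages.size();
    }

    public static void main(String[] args) {
        DatanodeSketch dn = new DatanodeSketch("dn1");
        StorageSketch first = dn.addStorage("storage-a");
        dn.addStorage("storage-b");

        BlockSketch block = new BlockSketch(42L);
        block.addLocation(first);

        // Given a block, walk to its storage and then to the owning datanode.
        for (StorageSketch s : block.locations()) {
            // prints "42 -> storage-a on dn1"
            System.out.println(block.blockId + " -> " + s.storageId + " on " + s.datanode.name);
        }
    }
}

// Hypothetical stand-in for the proposed per-storage abstraction.
final class StorageSketch {
    final String storageId;
    final DatanodeSketch datanode; // back-reference to the owning datanode

    StorageSketch(String storageId, DatanodeSketch datanode) {
        this.storageId = storageId;
        this.datanode = datanode;
    }
}

// Hypothetical stand-in for a block that records storages rather than datanodes.
final class BlockSketch {
    final long blockId;
    private final List<StorageSketch> locations = new ArrayList<>();

    BlockSketch(long blockId) {
        this.blockId = blockId;
    }

    void addLocation(StorageSketch storage) {
        locations.add(storage);
    }

    List<StorageSketch> locations() {
        return locations;
    }
}
```

 Keeping the back-reference on the storage, rather than on the block, is one
 possible design choice: the block stores a single reference per replica and
 the datanode is derived from it.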

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4987) Namenode changes to track multiple storages

2013-08-27 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751473#comment-13751473
 ] 

Arpit Agarwal commented on HDFS-4987:
-

Hi Nicholas,

A couple of comment-related nitpicks that we can fix later.

# BlockInfo.java:44 - Comment should be updated to say {{block belongs to 
triplets[3*i] is the reference to the StorageInfo}}
# BlockInfo#findStorageInfo - Javadoc needs to be updated to {{   * @param 
dn}}


 Namenode changes to track multiple storages
 ---

 Key: HDFS-4987
 URL: https://issues.apache.org/jira/browse/HDFS-4987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Suresh Srinivas
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h4987_20130822.patch, h4987_20130827b.patch, 
 h4987_20130827c.patch, h4987_20130827.patch




[jira] [Commented] (HDFS-4987) Namenode changes to track multiple storages

2013-08-27 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13751573#comment-13751573
 ] 

Arpit Agarwal commented on HDFS-4987:
-

I hit submit too soon.

{quote}
For #1, I tried to move blockContentsStale to be per storage but it is quite 
involved. How about we move it when we change block report to be per storage?
{quote}
This sounds good. I filed HDFS-5134 to track this and a couple of other items
that can be done per-storage now.

These are Javadoc nitpicks, but I am basically +1 on the patch. Please feel
free to commit this today.
# {{BlockManager#processReport}} - Javadoc should be updated e.g. {{   * The 
given datanode is reporting all its blocks on the given Storage.}}
# BlockInfo.java:44 - Comment should be updated to say {{block belongs to 
triplets[3*i] is the reference to the StorageInfo}}
# {{BlockInfo#findStorageInfo}} - Parameter should be  {{* @param dn}}
# {{BlockManager#addStoredBlockImmediate}} - Link in header comment needs to be 
updated.

Thanks!
Arpit
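
The triplets comment mentioned in the nitpicks above can be illustrated with a
small sketch. This is a simplified, hypothetical model (TripletsSketch and its
methods are not the real BlockInfo API). The thread only states that
triplets[3*i] is the reference to the storage; the use of the other two slots
for the previous and next block in that storage's block list is an assumption
here.

```java
// Hypothetical, simplified model of the triplets layout; not the real BlockInfo class.
final class TripletsSketch {
    // For the i-th replica: [3*i] storage, [3*i+1] previous block, [3*i+2] next block.
    // The previous/next slots are an assumption; the thread only mentions [3*i].
    private final Object[] triplets;

    TripletsSketch(int replication) {
        this.triplets = new Object[3 * replication];
    }

    int capacity() {
        return triplets.length / 3;
    }

    Object getStorage(int i) {
        return triplets[3 * i];
    }

    void setStorage(int i, Object storage) {
        triplets[3 * i] = storage;
    }

    TripletsSketch getPrevious(int i) {
        return (TripletsSketch) triplets[3 * i + 1];
    }

    void setPrevious(int i, TripletsSketch block) {
        triplets[3 * i + 1] = block;
    }

    TripletsSketch getNext(int i) {
        return (TripletsSketch) triplets[3 * i + 2];
    }

    void setNext(int i, TripletsSketch block) {
        triplets[3 * i + 2] = block;
    }

    // Returns the replica index whose storage slot holds this reference, or -1.
    int findStorage(Object storage) {
        for (int i = 0; i < capacity(); i++) {
            if (triplets[3 * i] == storage) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Object storageA = "storage-a"; // any object can stand in for a storage here
        Object storageB = "storage-b";
        TripletsSketch block = new TripletsSketch(3);
        block.setStorage(0, storageA);
        block.setStorage(1, storageB);
        System.out.println(block.findStorage(storageB)); // prints "1"
    }
}
```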



 Namenode changes to track multiple storages
 ---

 Key: HDFS-4987
 URL: https://issues.apache.org/jira/browse/HDFS-4987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Suresh Srinivas
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h4987_20130822.patch, h4987_20130827b.patch, 
 h4987_20130827c.patch, h4987_20130827.patch




[jira] [Commented] (HDFS-4987) Namenode changes to track multiple storages

2013-08-27 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752122#comment-13752122
 ] 

Arpit Agarwal commented on HDFS-4987:
-

Thanks Nicholas! +1 for the updated patch.

 Namenode changes to track multiple storages
 ---

 Key: HDFS-4987
 URL: https://issues.apache.org/jira/browse/HDFS-4987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Suresh Srinivas
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h4987_20130822.patch, h4987_20130827b.patch, 
 h4987_20130827c.patch, h4987_20130827.patch, h4987_20130828.patch




[jira] [Commented] (HDFS-4987) Namenode changes to track multiple storages

2013-08-27 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752128#comment-13752128
 ] 

Junping Du commented on HDFS-4987:
--

Thanks for addressing the comments in the new patch. The patch looks good to me. +1

 Namenode changes to track multiple storages
 ---

 Key: HDFS-4987
 URL: https://issues.apache.org/jira/browse/HDFS-4987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Suresh Srinivas
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h4987_20130822.patch, h4987_20130827b.patch, 
 h4987_20130827c.patch, h4987_20130827.patch, h4987_20130828.patch




[jira] [Commented] (HDFS-4987) Namenode changes to track multiple storages

2013-08-26 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750287#comment-13750287
 ] 

Arpit Agarwal commented on HDFS-4987:
-

Hi Nicholas,

Couple of comments.

# {{blockContentsStale}} should be per storage, since a DataNode may want to
send separate block reports per storage.
# {{DatanodeStorageInfo}} should have the Storage type.

Thanks.
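
The two review points above can be sketched as follows. This is a hypothetical
illustration, not the patch: StorageStateSketch, its Type enum and its methods
are made-up names, and the assumed semantics are that a storage's contents
count as stale until that storage has sent a block report.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-storage record; not the real DatanodeStorageInfo class.
final class StorageStateSketch {
    // Hypothetical storage types; the thread only asks that a type be recorded.
    enum Type { DISK, SSD }

    final String storageId;
    final Type type;
    // Assumed semantics: contents count as stale until this storage reports.
    private boolean blockContentsStale = true;

    StorageStateSketch(String storageId, Type type) {
        this.storageId = storageId;
        this.type = type;
    }

    // Called when a block report for this one storage has been processed.
    void receivedBlockReport() {
        blockContentsStale = false;
    }

    boolean areBlockContentsStale() {
        return blockContentsStale;
    }

    // Datanode-level view: stale if any of its storages is still stale.
    static boolean anyStale(List<StorageStateSketch> storages) {
        for (StorageStateSketch s : storages) {
            if (s.areBlockContentsStale()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<StorageStateSketch> storages = new ArrayList<>();
        storages.add(new StorageStateSketch("storage-a", Type.DISK));
        storages.add(new StorageStateSketch("storage-b", Type.SSD));

        storages.get(0).receivedBlockReport();
        System.out.println(anyStale(storages)); // prints "true": storage-b has not reported

        storages.get(1).receivedBlockReport();
        System.out.println(anyStale(storages)); // prints "false"
    }
}
```

With the flag kept per storage, a report from one storage does not clear the
stale state of the others, which is the situation that separate per-storage
block reports would create.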

 Namenode changes to track multiple storages
 ---

 Key: HDFS-4987
 URL: https://issues.apache.org/jira/browse/HDFS-4987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Suresh Srinivas
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h4987_20130822.patch




[jira] [Commented] (HDFS-4987) Namenode changes to track multiple storages

2013-08-26 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750717#comment-13750717
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-4987:
--

Arpit, thanks for the comments.

For #1, I tried to move blockContentsStale to be per storage but it is quite 
involved.  How about we move it when we change block report to be per storage?

For #2, I added storageType to DatanodeStorageInfo.

I also found another problem: LocatedBlock was used in reportBadBlocks(..)
but it did not have a storage ID.  I think we need to add a storage ID to
LocatedBlock.  I will file a JIRA.

 Namenode changes to track multiple storages
 ---

 Key: HDFS-4987
 URL: https://issues.apache.org/jira/browse/HDFS-4987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Suresh Srinivas
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h4987_20130822.patch




[jira] [Commented] (HDFS-4987) Namenode changes to track multiple storages

2013-08-26 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750753#comment-13750753
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-4987:
--

 ... I will file a JIRA.

Since we already have HDFS-5009, I am not going to file a new JIRA; I will
assign HDFS-5009 to myself instead.

 Namenode changes to track multiple storages
 ---

 Key: HDFS-4987
 URL: https://issues.apache.org/jira/browse/HDFS-4987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Suresh Srinivas
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h4987_20130822.patch, h4987_20130827.patch




[jira] [Commented] (HDFS-4987) Namenode changes to track multiple storages

2013-08-26 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750845#comment-13750845
 ] 

Junping Du commented on HDFS-4987:
--

Nicholas, nice refactoring work on reportDiff(...). The previous delimiter
seems to be unnecessary. My only question: if we remove TestBlockInfo.java,
which is the only unit test and the only consumer of moveBlockToHead() in
DatanodeDescriptor/BlockInfo, should we consider removing this method as well,
rather than leaving it there unused and untested?

 Namenode changes to track multiple storages
 ---

 Key: HDFS-4987
 URL: https://issues.apache.org/jira/browse/HDFS-4987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Suresh Srinivas
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h4987_20130822.patch, h4987_20130827.patch




[jira] [Commented] (HDFS-4987) Namenode changes to track multiple storages

2013-08-26 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750874#comment-13750874
 ] 

Junping Du commented on HDFS-4987:
--

One additional comment: we could iterate over only the blocks on the DN that
belong to a specific StorageID (we may add an API for this later). Currently,
for each StorageBlockReport, it iterates over the whole block list, which does
not seem to be the most efficient way.

 Namenode changes to track multiple storages
 ---

 Key: HDFS-4987
 URL: https://issues.apache.org/jira/browse/HDFS-4987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Suresh Srinivas
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h4987_20130822.patch, h4987_20130827.patch




[jira] [Commented] (HDFS-4987) Namenode changes to track multiple storages

2013-08-26 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750960#comment-13750960
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-4987:
--

Good catch Junping, I will remove moveBlockToHead(..) from BlockInfo.

For your second comment, I actually don't understand.  We already have 
individual StorageBlockReports in the NN-side.  Do you mean the DN side?  I 
have not checked it yet. 

 Namenode changes to track multiple storages
 ---

 Key: HDFS-4987
 URL: https://issues.apache.org/jira/browse/HDFS-4987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Suresh Srinivas
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h4987_20130822.patch, h4987_20130827b.patch, 
 h4987_20130827.patch




[jira] [Commented] (HDFS-4987) Namenode changes to track multiple storages

2013-08-26 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750964#comment-13750964
 ] 

Junping Du commented on HDFS-4987:
--

bq. For your second comment, I actually don't understand. We already have 
individual StorageBlockReports in the NN-side. Do you mean the DN side? I have 
not checked it yet.
Yes. I mean that your reportDiff() first adds all blocks of the DN to the
remove list. In the future, adding only the blocks that belong to the specific
storageID may make more sense in this case?
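
The suggestion above could look roughly like the sketch below. It is not the
reportDiff() from the patch: ReportDiffSketch, toRemove and the map of block
IDs per storage are hypothetical, and block IDs are plain longs for brevity.
The removal candidates are seeded from the blocks known on the reporting
storage only, and every block that the report still lists is then dropped.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a per-storage diff; not the real reportDiff().
final class ReportDiffSketch {
    // Seeds the removal candidates with the blocks known on ONE storage,
    // then drops every block that the storage's report still lists.
    static Set<Long> toRemove(Map<String, Set<Long>> blocksByStorage,
                              String storageId,
                              Set<Long> reported) {
        Set<Long> known =
            blocksByStorage.getOrDefault(storageId, Collections.<Long>emptySet());
        Set<Long> remove = new TreeSet<>(known);
        remove.removeAll(reported);
        return remove;
    }

    public static void main(String[] args) {
        Map<String, Set<Long>> blocksByStorage = new HashMap<>();
        blocksByStorage.put("storage-a", new HashSet<>(Arrays.asList(1L, 2L, 3L)));
        blocksByStorage.put("storage-b", new HashSet<>(Arrays.asList(4L, 5L)));

        // storage-a reports blocks 1 and 3, so only block 2 is a removal candidate.
        Set<Long> reported = new HashSet<>(Arrays.asList(1L, 3L));
        System.out.println(toRemove(blocksByStorage, "storage-a", reported)); // prints "[2]"
    }
}
```

In this sketch the blocks on storage-b are never visited when storage-a
reports, which is the efficiency point raised in the comment.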


 Namenode changes to track multiple storages
 ---

 Key: HDFS-4987
 URL: https://issues.apache.org/jira/browse/HDFS-4987
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Suresh Srinivas
Assignee: Tsz Wo (Nicholas), SZE
 Attachments: h4987_20130822.patch, h4987_20130827b.patch, 
 h4987_20130827.patch

