[ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13842846#comment-13842846
 ] 

Andrew Wang commented on HDFS-2832:
-----------------------------------

I'm doing my best here to make this a friendly technical discussion. I guess my 
timing here is unfortunate with the merge vote pending, but since I don't 
intend to vote -1, this is just me reading the design doc again and posting 
more comments. Please take it as such. I'll restate that I think everything in 
the branch so far looks great.

Arpit, I actually reviewed the previous comment history very carefully before 
posting, as well as the updated design document. Yes, I know that I made some 
of these comments before, but I brought them up in the first place because I 
felt like they were interesting problems related to this feature. I bring them 
up again because, 4 months later, I was wondering if you had any thoughts on 
potential solutions that could be added to the doc. It's fine if automatic 
migration, open files, more elaborate resource management, and additional 
storage types are all not in immediate scope, but I assume we'll want them in 
the future.

I'd also like it if the design doc were reworked to reflect what's in phase 1 
vs. phase 2 vs. future work. This would also save me from making comments I 
apparently shouldn't be making on WIP sections (e.g. 6.2, 6.4). I'll say this 
again too: if phase 1 is done and phase 2 is starting, it seems like the design 
of phase 2 is exactly what should be the focus of our attention right now.

The back and forth:

I mention SCR over and over again because HBase is very interested in using 
SSDs, and I figured supporting one of our biggest downstream projects would be 
a prime use case and potentially in scope. I'd be sad if not, but it'd be good 
to at least gather their requirements and see how we might get there.

bq. HDFS does not support rolling upgrades today.

Well, CDH supports rolling upgrades in some situations. ATM is working on 
metadata upgrade with HA enabled (HDFS-5138) and I've seen some recent JIRAs 
related to rolling upgrade (HDFS-5535), so it seems like a reasonable question. 
At least at the protobuf level, everything so far looks compatible, so I 
thought it might work as long as the handler code is compatible too.

These questions were also not answered:

bq. Do you foresee heartbeats and block reports always being combined in 
realistic scenarios? Or are there reasons to split it? Is there any additional 
overhead from splitting? Can we save any complexity by not supporting split 
reports? I see this on the test matrix.
bq. Have you put any thought into metrics and tooling to help users and admins 
debug their quota usage and issues with migrating files to certain storage 
types? Especially because of SCR.

Thanks Arpit. Again, this feedback isn't really merge related, it's just 
technical discussion. Not trying to block anything here.

> Enable support for heterogeneous storages in HDFS
> -------------------------------------------------
>
>                 Key: HDFS-2832
>                 URL: https://issues.apache.org/jira/browse/HDFS-2832
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.24.0
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>         Attachments: 20130813-HeterogeneousStorage.pdf, 
> 20131125-HeterogeneousStorage-TestPlan.pdf, 
> 20131125-HeterogeneousStorage.pdf, 
> 20131202-HeterogeneousStorage-TestPlan.pdf, 
> 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, 
> editsStored, h2832_20131023.patch, h2832_20131023b.patch, 
> h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, 
> h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, 
> h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, 
> h2832_20131110.patch, h2832_20131110b.patch, h2832_20131111.patch, 
> h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, 
> h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, 
> h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, 
> h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, 
> h2832_20131203.patch
>
>
> HDFS currently supports configuration where storages are a list of 
> directories. Typically each of these directories corresponds to a volume with 
> its own file system. All these directories are homogeneous and therefore 
> identified as a single storage at the namenode. I propose changing the 
> current model, where a Datanode *is a* storage, to one where a Datanode 
> *is a collection* of storages. 



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
