[ 
https://issues.apache.org/jira/browse/HDFS-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843549#comment-13843549
 ] 

Arpit Agarwal commented on HDFS-2832:
-------------------------------------

{quote}
I bring them up again because, 4 months later, I was wondering if you had any 
thoughts on potential solutions that could be added to the doc. It's fine if 
automatic migration, open files, more elaborate resource management, and 
additional storage types are all not in immediate scope, but I assume we'll 
want them in the future.
{quote}
Andrew, automatic migration is not in scope for our design. Regarding open files, can 
you describe a specific use case you think we should be handling that we have 
not described? Maybe that will help me understand your concern better. If you are 
concerned about reclaiming capacity for in-use blocks, that is analogous to 
asking "If a process keeps a long-lived handle to a file, what will the 
operating system do to reclaim the disk space used by the file?" and the answer is 
the same - nothing.
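The operating-system analogy above can be sketched directly. This is a minimal illustration assuming POSIX/Linux semantics (on Windows, unlinking an open file behaves differently); the file name and contents are invented for the example:

```python
import os
import tempfile

# POSIX semantics: deleting a file that a process still holds open does
# not reclaim its disk space; the data stays reachable via the open handle.
path = os.path.join(tempfile.mkdtemp(), "blockfile")
with open(path, "w") as w:
    w.write("block data")

fh = open(path, "r")   # long-lived handle, like a client reading a block
os.unlink(path)        # "delete" the file while it is still open

print(os.path.exists(path))  # the name is gone from the namespace
data = fh.read()             # but the contents are still readable
print(data)
fh.close()                   # only after the last close can the OS free the space
```

The same reasoning applies to in-use blocks: the system reclaims capacity only once the last reader is done.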

I don't want anyone reading your comments to get a false impression that the 
feature is incompatible with SCR.

{quote}
Well, CDH supports rolling upgrades in some situations. ATM is working on 
metadata upgrade with HA enabled (HDFS-5138) and I've seen some recent JIRAs 
related to rolling upgrade (HDFS-5535), so it seems like a reasonable question. 
At least at the protobuf level, everything so far looks compatible, so I 
thought it might work as long as the handler code is compatible too.
{quote}
I am not familiar with how CDH does rolling upgrades, so I cannot tell you 
whether it will work. You recently bumped the layout version for caching, so you 
might recall that HDFS layout version checks prevent a DN from registering with an 
NN that has a mismatched version. To my knowledge, HDFS-5535 will not fix this 
limitation either. That said, we have retained wire-protocol compatibility.

{quote}
Do you foresee heartbeats and block reports always being combined in realistic 
scenarios? Or are there reasons to split them? Is there any additional overhead 
from splitting? Can we save any complexity by not supporting split reports? I 
see this on the test matrix.
{quote}
I thought I had answered this; if you describe your concerns, maybe I can give you a 
better answer. When the test plan says 'split', I mean splitting the reports 
across multiple requests. Reports will always be split by storage, but we are 
not splitting them across multiple messages for now. What kind of overhead are 
you thinking of?
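The distinction above - reports split by storage but carried in a single message - can be sketched as follows. All names here are illustrative placeholders, not actual HDFS classes or RPC types:

```python
from dataclasses import dataclass
from typing import List

# Hypothetical sketch: a block report that is split *by storage* internally,
# yet still travels to the NameNode as one message (one RPC), rather than
# being split across multiple requests.

@dataclass
class StorageBlockReport:
    storage_id: str       # one per storage/volume on the Datanode
    block_ids: List[int]  # blocks residing on that storage

@dataclass
class BlockReportRequest:
    datanode_id: str
    reports: List[StorageBlockReport]  # per-storage sections, single message

req = BlockReportRequest(
    datanode_id="dn-1",
    reports=[
        StorageBlockReport("DS-disk-0", [101, 102]),
        StorageBlockReport("DS-ssd-0", [201]),
    ],
)
# One request object carries all per-storage sections.
```

Splitting across multiple requests would mean sending each `StorageBlockReport` in its own RPC, which is what the test plan's 'split' case refers to.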

{quote}
b1. Have you given any thought to metrics and tooling to help users and admins 
debug their quota usage and issues with migrating files to certain storage 
types?
{quote}
We'll include it in the next design rev as we start phase 2.

> Enable support for heterogeneous storages in HDFS
> -------------------------------------------------
>
>                 Key: HDFS-2832
>                 URL: https://issues.apache.org/jira/browse/HDFS-2832
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 0.24.0
>            Reporter: Suresh Srinivas
>            Assignee: Suresh Srinivas
>         Attachments: 20130813-HeterogeneousStorage.pdf, 
> 20131125-HeterogeneousStorage-TestPlan.pdf, 
> 20131125-HeterogeneousStorage.pdf, 
> 20131202-HeterogeneousStorage-TestPlan.pdf, 
> 20131203-HeterogeneousStorage-TestPlan.pdf, H2832_20131107.patch, 
> editsStored, h2832_20131023.patch, h2832_20131023b.patch, 
> h2832_20131025.patch, h2832_20131028.patch, h2832_20131028b.patch, 
> h2832_20131029.patch, h2832_20131103.patch, h2832_20131104.patch, 
> h2832_20131105.patch, h2832_20131107b.patch, h2832_20131108.patch, 
> h2832_20131110.patch, h2832_20131110b.patch, h2832_20131111.patch, 
> h2832_20131112.patch, h2832_20131112b.patch, h2832_20131114.patch, 
> h2832_20131118.patch, h2832_20131119.patch, h2832_20131119b.patch, 
> h2832_20131121.patch, h2832_20131122.patch, h2832_20131122b.patch, 
> h2832_20131123.patch, h2832_20131124.patch, h2832_20131202.patch, 
> h2832_20131203.patch
>
>
> HDFS currently supports a configuration where storage is a list of 
> directories. Typically each of these directories corresponds to a volume with 
> its own file system. All these directories are homogeneous and are therefore 
> identified as a single storage at the namenode. I propose changing the 
> current model, where a Datanode *is a* storage, to one where a Datanode 
> *is a collection of* storages. 
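The proposed model shift can be sketched as below. The class and field names are hypothetical, chosen only to contrast the two models; they are not the actual HDFS implementation:

```python
from dataclasses import dataclass
from enum import Enum
from typing import List

# Illustrative sketch of the proposal: instead of the NameNode treating a
# Datanode as a single storage, each Datanode exposes a collection of
# storages, one per configured directory/volume, each with its own type.

class StorageType(Enum):
    DISK = "DISK"
    SSD = "SSD"

@dataclass
class Storage:
    storage_id: str
    directory: str
    storage_type: StorageType

@dataclass
class Datanode:
    datanode_id: str
    storages: List[Storage]  # was: one implicit, homogeneous storage

dn = Datanode("dn-1", [
    Storage("DS-0", "/data/disk0", StorageType.DISK),
    Storage("DS-1", "/data/ssd0", StorageType.SSD),
])
```

Under this model the namenode can distinguish storages of different types on the same Datanode, which is what enables heterogeneous storage policies.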



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
