[ 
https://issues.apache.org/jira/browse/HDFS-13224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16386553#comment-16386553
 ] 

Íñigo Goiri commented on HDFS-13224:
------------------------------------

I should add some documentation at some point (maybe in a separate JIRA?), but
the idea is that one can specify multiple subclusters for a mount point.
There are different approaches to choosing among them; currently we have:
* HASH: use consistent hashing on the first level under the mount point and
pick the subcluster based on that.
* LOCAL: use the local subcluster (this one is good for locality).
* RANDOM: pick a random subcluster (good for load balancing).
* HASH_ALL: distribute all the files in the mount point subtree using
consistent hashing. The problem with this approach is that it requires the
whole tree structure (subfolders) to exist in all subclusters.
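
To make the four orders above concrete, here is a minimal, self-contained sketch of how they could pick a subcluster. It is not the code in the attached patch: the class name `MountOrderSketch`, the `resolve` method, the CRC32-based hash ring, and the choice of 100 virtual nodes per subcluster are all assumptions made only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Hypothetical illustration only; names and hashing details are assumptions,
// not the Router-based Federation implementation.
class MountOrderSketch {
    enum Order { HASH, LOCAL, RANDOM, HASH_ALL }

    private final String mountPoint;
    private final List<String> subclusters;
    private final String localSubcluster;
    // Consistent-hash ring: position on the ring -> subcluster id.
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final Random random = new Random();

    MountOrderSketch(String mountPoint, List<String> subclusters,
                     String localSubcluster) {
        this.mountPoint = mountPoint;
        this.subclusters = subclusters;
        this.localSubcluster = localSubcluster;
        // Place several virtual nodes per subcluster to smooth the distribution.
        for (String sc : subclusters) {
            for (int i = 0; i < 100; i++) {
                ring.put(hash(sc + "#" + i), sc);
            }
        }
    }

    private static long hash(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    // Walk clockwise on the ring to the first node at or after the key's hash.
    private String lookup(String key) {
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    // First path component below the mount point: "/data/a/b" under "/data" is "a".
    private String firstLevel(String path) {
        String rest = path.substring(mountPoint.length());
        if (rest.startsWith("/")) {
            rest = rest.substring(1);
        }
        int slash = rest.indexOf('/');
        return slash < 0 ? rest : rest.substring(0, slash);
    }

    String resolve(String path, Order order) {
        switch (order) {
            case HASH:
                // Everything under the same first-level folder lands together.
                return lookup(firstLevel(path));
            case HASH_ALL:
                // Every file is hashed by its full path, so sibling files may
                // land in different subclusters; parent folders must therefore
                // exist in all of them.
                return lookup(path);
            case LOCAL:
                return localSubcluster;
            case RANDOM:
                return subclusters.get(random.nextInt(subclusters.size()));
            default:
                throw new IllegalArgumentException("Unknown order: " + order);
        }
    }
}
```

In this sketch, `resolve("/data/a/x", Order.HASH)` and `resolve("/data/a/y/z", Order.HASH)` always agree because both hash the key `a`, whereas HASH_ALL hashes each full path independently, which is why it needs the folder structure in every subcluster.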

We have all of these working internally, but there seems to be a preference for
HASH_ALL even though it has some limitations.

It may make sense to split this into a couple of JIRAs. Anyway, let's get some
feedback and proposals on [^HDFS-13224.000.patch] for now.

> RBF: Mount points across multiple subclusters
> ---------------------------------------------
>
>                 Key: HDFS-13224
>                 URL: https://issues.apache.org/jira/browse/HDFS-13224
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Íñigo Goiri
>            Assignee: Íñigo Goiri
>            Priority: Major
>         Attachments: HDFS-13224.000.patch
>
>
> Currently, a mount point points to a single subcluster. We should be able to 
> spread files in a mount point across subclusters.



