[
https://issues.apache.org/jira/browse/HBASE-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859244#action_12859244
]
Jan Lukavsky commented on HBASE-57:
-----------------------------------
@stack
Our cluster is not "long running": we have written approx. 1 TB of data, which
we are trying to analyze using M/R. Once the data is written, we do not
modify it. In that case, I'm afraid dislocated blocks will stay put until a
major compaction (which might therefore solve this problem for us). Blocks
may get dislocated, for example, after some RS becomes temporarily unavailable.
I therefore think assigning regions based on where their blocks actually
reside is somewhat 'cleaner', so that once the RS becomes available again it
regains its regions.
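
To make the idea concrete, here is a minimal sketch of locality-based assignment. It is not HBase code: the function name pick_server and its inputs (block_hosts, live_servers, load) are hypothetical, and it assumes the master can somehow obtain, for each HDFS block of a region, the hostnames holding a replica. It prefers the live regionserver that holds the most blocks of the region and falls back to the least-loaded server when no live server holds any.

```python
from collections import Counter


def pick_server(block_hosts, live_servers, load):
    """Pick a regionserver for one region (illustrative sketch only).

    block_hosts:  list of lists; hostnames holding a replica of each block
    live_servers: set of hostnames of currently available regionservers
    load:         dict hostname -> number of regions already assigned
    """
    if not live_servers:
        raise ValueError("no live regionservers to assign to")

    # Count, per live server, how many of the region's blocks it holds locally.
    local_blocks = Counter(
        host
        for replicas in block_hosts
        for host in set(replicas)
        if host in live_servers
    )

    if not local_blocks:
        # No live server holds any block: fall back to plain load balancing.
        return min(sorted(live_servers), key=lambda s: load.get(s, 0))

    # Among the servers with the most local blocks, prefer the least loaded.
    best = max(local_blocks.values())
    candidates = sorted(h for h, n in local_blocks.items() if n == best)
    return min(candidates, key=lambda s: load.get(s, 0))


# Made-up example: rs1 holds a replica of every block of the region.
blocks = [["rs1", "rs2", "rs3"], ["rs1", "rs4", "rs5"], ["rs1", "rs2", "rs6"]]
load = {"rs1": 10, "rs2": 2, "rs4": 1}

while_rs1_down = pick_server(blocks, {"rs2", "rs4"}, load)         # "rs2"
after_rs1_back = pick_server(blocks, {"rs1", "rs2", "rs4"}, load)  # "rs1"
```

With this rule the region moves to rs2 while rs1 is unavailable and goes back to rs1 once it returns, which is the behaviour described above. A real balancer would have to weigh locality against load rather than always preferring locality.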
> [hbase] Master should allocate regions to regionservers based upon data
> locality and rack awareness
> ---------------------------------------------------------------------------------------------------
>
> Key: HBASE-57
> URL: https://issues.apache.org/jira/browse/HBASE-57
> Project: Hadoop HBase
> Issue Type: Improvement
> Components: master
> Affects Versions: 0.2.0
> Reporter: stack
>
> Currently, regions are assigned to regionservers based on a basic load
> attribute. A factor to include in the assignment calculation is the location
> of the region in hdfs; i.e. the servers hosting replicas of the region's
> data. If the cluster is such that regionservers run on the same nodes as
> those running hdfs, then ideally the regionserver for a particular region
> should run on a server that hosts a replica of that region.