[ https://issues.apache.org/jira/browse/HBASE-9556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770943#comment-13770943 ]
Nick Dimiduk commented on HBASE-9556:
-------------------------------------
I like it.
> Provide key range support to bulkload to avoid too many reducers even the
> data belongs to few regions
> -----------------------------------------------------------------------------------------------------
>
> Key: HBASE-9556
> URL: https://issues.apache.org/jira/browse/HBASE-9556
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Reporter: rajeshbabu
> Assignee: rajeshbabu
> Priority: Minor
>
> Presently the number of reducers in a bulk load is equal to the number of regions.
> Suppose a table has 500 regions and the import data belongs to only 10 of them:
> we still start 500 reducers (one per region) instead of 10,
> which consumes more time and resources.
> If the user knows the row key range of the import data, then we can pass startkey
> and/or endkey as input and, based on the key range, define the
> partitions and the number of reducers (the regions to which the data belongs). This
> avoids starting many reducers that do nothing and also avoids
> contention in shuffling.
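
To make the proposal quoted above concrete, here is a minimal sketch of how a key range could narrow the set of partitions. It is not the patch attached to the issue and does not use the HBase API: the class and method names are hypothetical, plain strings stand in for byte[] row keys, and an empty string is assumed to mean "unbounded", mirroring the HBase convention for the first region's start key and the last region's end key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of HBase.
public class KeyRangePartitions {

    // Returns the start keys of the regions that overlap [startKey, endKey).
    // regionStartKeys must be sorted; region i spans from its own start key
    // up to the start key of region i + 1 (the last region is unbounded).
    static List<String> regionsInRange(List<String> regionStartKeys,
                                       String startKey, String endKey) {
        List<String> selected = new ArrayList<>();
        for (int i = 0; i < regionStartKeys.size(); i++) {
            String regionStart = regionStartKeys.get(i);
            String regionEnd = (i + 1 < regionStartKeys.size())
                    ? regionStartKeys.get(i + 1)
                    : ""; // assumption: empty string means "no upper bound"
            // The region must begin before the import range ends ...
            boolean startsBeforeRangeEnds =
                    endKey.isEmpty() || regionStart.compareTo(endKey) < 0;
            // ... and must end after the import range begins.
            boolean endsAfterRangeStarts =
                    regionEnd.isEmpty() || regionEnd.compareTo(startKey) > 0;
            if (startsBeforeRangeEnds && endsAfterRangeStarts) {
                selected.add(regionStart);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Six regions: ["", d), [d, h), [h, m), [m, r), [r, w), [w, "")
        List<String> regionStartKeys = Arrays.asList("", "d", "h", "m", "r", "w");
        List<String> selected = regionsInRange(regionStartKeys, "e", "k");
        // One reducer per selected region instead of one per table region.
        System.out.println("reducers: " + selected.size() + " " + selected);
        // prints: reducers: 2 [d, h]
    }
}
```

In this sketch only the selected start keys would be handed to the partitioner, so the reducer count follows the regions the import actually touches. Real row keys are compared bytewise, so an actual implementation would compare byte arrays, not strings.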