rajeshbabu created HBASE-9556:
---------------------------------

             Summary: Provide key range support to bulkload to avoid too many 
reducers even the data belongs to few regions
                 Key: HBASE-9556
                 URL: https://issues.apache.org/jira/browse/HBASE-9556
             Project: HBase
          Issue Type: Improvement
          Components: mapreduce
            Reporter: rajeshbabu
            Assignee: rajeshbabu
            Priority: Minor


Currently, the number of reducers in a bulk load equals the number of regions
in the table. Suppose a table has 500 regions and the import data belongs to
only 10 of them: we still start 500 reducers (one per region) instead of 10,
which consumes more time and resources than necessary.

If the user knows the row key range of the import data, we can accept a start
key and/or end key as input and use that range to define the partitions and
the number of reducers (one per region to which the data belongs). This avoids
starting many reducers that do nothing and also reduces contention during the
shuffle.
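
As a rough illustration of the idea, the sketch below picks only the regions that overlap a user-supplied key range and uses their count as the reducer count. It is not taken from any patch on this issue: the class `KeyRangePartitions` and the method `selectRegionStartKeys` are hypothetical names, keys are modelled as `String` rather than `byte[]` for brevity, and the end key is assumed to be exclusive, with an empty key meaning "unbounded".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of HBase: selects the regions a key range touches.
final class KeyRangePartitions {

    // regionStartKeys must be sorted, with "" as the start key of the first region.
    // Region i covers [regionStartKeys[i], regionStartKeys[i + 1]); the last region is unbounded.
    // An empty startKey means "from the first region"; an empty endKey means "to the last region".
    static List<String> selectRegionStartKeys(List<String> regionStartKeys,
                                              String startKey, String endKey) {
        List<String> selected = new ArrayList<>();
        for (int i = 0; i < regionStartKeys.size(); i++) {
            String regionStart = regionStartKeys.get(i);
            // The end of a region is the start of the next one; null marks the last region.
            String regionEnd = (i + 1 < regionStartKeys.size())
                    ? regionStartKeys.get(i + 1) : null;
            boolean endsAfterRangeStart =
                    regionEnd == null || startKey.isEmpty() || regionEnd.compareTo(startKey) > 0;
            boolean startsBeforeRangeEnd =
                    endKey.isEmpty() || regionStart.compareTo(endKey) < 0;
            if (endsAfterRangeStart && startsBeforeRangeEnd) {
                selected.add(regionStart);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Five regions: [,b) [b,d) [d,f) [f,h) [h,)
        List<String> regions = Arrays.asList("", "b", "d", "f", "h");
        List<String> selected = selectRegionStartKeys(regions, "c", "e");
        System.out.println(selected);        // prints [b, d]
        System.out.println(selected.size()); // prints 2, the reducer count for this import
    }
}
```

With five regions and the range [c, e), only two regions are selected, so the job would need two reducers instead of five.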

