[ https://issues.apache.org/jira/browse/HBASE-5140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220725#comment-13220725 ]
Rajesh Balamohan commented on HBASE-5140:
-----------------------------------------

@Josh - Thanks for this patch. The for loop in getSplits() generates the splits with the help of generateRegionSplits(). However, the returned List<InputSplit> is not added back to the list declared as "List<InputSplit> splits = new ArrayList<InputSplit>(keys.getFirst().length);".

> TableInputFormat subclass to allow N number of splits per region during MR jobs
> -------------------------------------------------------------------------------
>
>                 Key: HBASE-5140
>                 URL: https://issues.apache.org/jira/browse/HBASE-5140
>             Project: HBase
>          Issue Type: New Feature
>          Components: mapreduce
>    Affects Versions: 0.90.4
>            Reporter: Josh Wymer
>            Priority: Trivial
>              Labels: mapreduce, split
>             Fix For: 0.90.4
>
>         Attachments: Added_functionality_to_TableInputFormat_that_allows_splitting_of_regions.patch,
>                      Added_functionality_to_TableInputFormat_that_allows_splitting_of_regions.patch.1,
>                      Added_functionality_to_split_n_times_per_region_on_mapreduce_jobs.patch
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> With regard to [HBASE-5138|https://issues.apache.org/jira/browse/HBASE-5138], I
> am working on a patch for the TableInputFormat class that overrides getSplits
> in order to generate N splits per region and/or N splits per job. The idea is
> to convert the startKey and endKey for each region from byte[] to BigDecimal,
> take the difference, divide by N, convert back to byte[], and generate splits
> on the resulting values. Assuming your keys are evenly distributed, this
> should generate splits with nearly the same number of rows each. Any
> suggestions on this issue are welcome.
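
For reference, the key-interpolation idea from the issue description and the collection step raised in the comment can be sketched in plain Java. This is not the code from the attached patches. The names RegionSplitSketch, splitPoints and splitRegions are hypothetical, no HBase classes are used, and BigInteger stands in for the BigDecimal mentioned in the description. The sketch assumes both keys are non-empty and startKey sorts before endKey, so the empty keys of the first and last region are not handled.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RegionSplitSketch {

    /** Converts a non-negative value back to exactly `length` big-endian bytes. */
    static byte[] toFixedLength(BigInteger value, int length) {
        byte[] raw = value.toByteArray(); // may be shorter, or carry a leading sign byte
        byte[] out = new byte[length];
        int copy = Math.min(raw.length, length);
        System.arraycopy(raw, raw.length - copy, out, length - copy, copy);
        return out;
    }

    /** Returns n + 1 boundary keys that divide [startKey, endKey] into n ranges. */
    static List<byte[]> splitPoints(byte[] startKey, byte[] endKey, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        // Right-pad the shorter key with zero bytes so both compare as numbers.
        int length = Math.max(startKey.length, endKey.length);
        BigInteger start = new BigInteger(1, Arrays.copyOf(startKey, length));
        BigInteger end = new BigInteger(1, Arrays.copyOf(endKey, length));
        if (start.compareTo(end) >= 0) {
            throw new IllegalArgumentException("startKey must sort before endKey");
        }
        BigInteger range = end.subtract(start);
        List<byte[]> points = new ArrayList<>(n + 1);
        for (int i = 0; i <= n; i++) {
            // start + range * i / n, using integer division
            BigInteger offset = range.multiply(BigInteger.valueOf(i)).divide(BigInteger.valueOf(n));
            points.add(toFixedLength(start.add(offset), length));
        }
        return points;
    }

    /** Splits every region ({start, end} pair) into n {start, end} ranges. */
    static List<byte[][]> splitRegions(List<byte[][]> regions, int n) {
        List<byte[][]> splits = new ArrayList<>(regions.size() * n);
        for (byte[][] region : regions) {
            List<byte[]> points = splitPoints(region[0], region[1], n);
            List<byte[][]> regionSplits = new ArrayList<>(n);
            for (int i = 0; i < n; i++) {
                regionSplits.add(new byte[][] {points.get(i), points.get(i + 1)});
            }
            // The step the comment describes as missing: keep each region's result.
            splits.addAll(regionSplits);
        }
        return splits;
    }

    public static void main(String[] args) {
        List<byte[][]> regions = new ArrayList<>();
        regions.add(new byte[][] {{0x00}, {0x30}});
        regions.add(new byte[][] {{0x30}, {0x60}});
        for (byte[][] split : splitRegions(regions, 3)) {
            System.out.println(new BigInteger(1, split[0]).toString(16)
                    + " .. " + new BigInteger(1, split[1]).toString(16));
        }
    }
}
```

When a region's key range is narrower than n, adjacent boundaries coincide and some ranges come out empty. A real implementation would probably want to drop those before creating input splits.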