[ https://issues.apache.org/jira/browse/HCATALOG-506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Travis Crawford resolved HCATALOG-506.
--------------------------------------
Resolution: Fixed
Fix Version/s: 0.5
Committed to trunk. Thanks for the patch, Greg! If you run into issues when
using it with Giraph, please let us know.
> desired number of input splits for large files
> ----------------------------------------------
>
> Key: HCATALOG-506
> URL: https://issues.apache.org/jira/browse/HCATALOG-506
> Project: HCatalog
> Issue Type: Improvement
> Affects Versions: 0.4
> Reporter: Greg Malewicz
> Assignee: Travis Crawford
> Labels: performance
> Fix For: 0.5
>
> Attachments: HCATALOG-506.patch, HCATALOG-506-revised.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Allow the user to specify the desired number of input splits through a new
> configuration parameter, hcatalog.desiredNumInputSplits. Two existing
> parameters may also need to be set: mapred.min.split.size and
> mapred.max.split.size. This is useful when the input consists of a few
> large files that we want to divide into many splits, which increases the
> parallelism of loading them.
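
To illustrate why the two split-size bounds may also need to be set, here is a
minimal Java sketch. It assumes the desired split count is treated as a hint:
the total input size is divided by the count and the result is clamped between
the minimum and maximum split size. The class SplitSizeSketch and its methods
are hypothetical and are not part of HCatalog or Hadoop; the committed patch
may compute splits differently.

```java
// Hypothetical sketch, not HCatalog code: shows how a desired split count
// could interact with minimum and maximum split-size bounds.
final class SplitSizeSketch {

    // Assumed rule: aim for totalBytes / desiredNumSplits bytes per split,
    // then clamp the result to [minSplitSize, maxSplitSize].
    static long targetSplitSize(long totalBytes, int desiredNumSplits,
                                long minSplitSize, long maxSplitSize) {
        if (desiredNumSplits <= 0) {
            throw new IllegalArgumentException("desiredNumSplits must be positive");
        }
        // Ceiling division so the splits cover the whole input.
        long goal = (totalBytes + desiredNumSplits - 1) / desiredNumSplits;
        return Math.max(minSplitSize, Math.min(goal, maxSplitSize));
    }

    // Number of splits produced when the input is cut into pieces of splitSize.
    static long numSplits(long totalBytes, long splitSize) {
        return (totalBytes + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long total = 10L * 1024 * 1024 * 1024;  // one 10 GiB input file
        long min = 1L * 1024 * 1024;            // 1 MiB lower bound
        int desired = 100;

        // With a generous upper bound, the desired count is honored.
        long wideMax = 256L * 1024 * 1024;
        long size = targetSplitSize(total, desired, min, wideMax);
        System.out.println("max=256MiB splits=" + numSplits(total, size));

        // With a tight upper bound, the clamp wins and more splits result.
        long tightMax = 64L * 1024 * 1024;
        size = targetSplitSize(total, desired, min, tightMax);
        System.out.println("max=64MiB  splits=" + numSplits(total, size));
    }
}
```

Under this assumed rule, a maximum split size below the per-split goal
overrides the requested count, which is one reason the bounds might have to be
adjusted together with hcatalog.desiredNumInputSplits.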