[ https://issues.apache.org/jira/browse/HCATALOG-506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Greg Malewicz updated HCATALOG-506:
-----------------------------------
Attachment: HCATALOG-506-revised.patch
Attached a revised patch.
> desired number of input splits for large files
> ----------------------------------------------
>
> Key: HCATALOG-506
> URL: https://issues.apache.org/jira/browse/HCATALOG-506
> Project: HCatalog
> Issue Type: Improvement
> Affects Versions: 0.4
> Reporter: Greg Malewicz
> Labels: performance
> Attachments: HCATALOG-506.patch, HCATALOG-506-revised.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Allow the user to specify the desired number of input splits through a new
> configuration parameter, hcatalog.desiredNumInputSplits. Two existing
> parameters, mapred.min.split.size and mapred.max.split.size, may also need
> to be set. This is useful when the input consists of a few large files
> that should be divided into many splits to increase the parallelism of
> loading them.
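
To illustrate why the two existing split-size parameters may also need to be set, here is a minimal sketch of one way a desired split count could interact with them. It is not taken from the attached patch: the function names, the clamping rule, and the file sizes are assumptions made for illustration only. The idea is that the desired count implies a target split size, which the minimum and maximum then bound.

```python
# Hypothetical model, NOT the logic of HCATALOG-506.patch.
# Assumption: target size = total input bytes / desired splits,
# clamped to [min_split, max_split] (standing in for
# mapred.min.split.size and mapred.max.split.size).

MIB = 1024 ** 2
GIB = 1024 ** 3


def ceil_div(a, b):
    """Integer ceiling division, avoiding float rounding on large sizes."""
    return -(-a // b)


def target_split_size(total_bytes, desired_splits, min_split, max_split):
    """Split size implied by the desired count, bounded by min/max."""
    if desired_splits < 1:
        raise ValueError("desired_splits must be at least 1")
    ideal = ceil_div(total_bytes, desired_splits)
    return max(min_split, min(max_split, ideal))


def count_splits(file_sizes, split_size):
    """Splits produced when each file is cut independently."""
    return sum(ceil_div(size, split_size) for size in file_sizes if size > 0)


if __name__ == "__main__":
    files = [10 * GIB, 10 * GIB]  # few but large input files
    desired = 100

    # Generous maximum: the desired count is reachable.
    size = target_split_size(sum(files), desired, MIB, GIB)
    print(size, count_splits(files, size))

    # Tight maximum (64 MiB): the bound wins and more splits result.
    size = target_split_size(sum(files), desired, MIB, 64 * MIB)
    print(size, count_splits(files, size))
```

Under this model, a maximum split size smaller than the implied target yields more splits than requested, and a minimum larger than the target yields fewer, so the bounds have to be loose enough for the desired count to take effect.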