[ 
https://issues.apache.org/jira/browse/HCATALOG-506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Malewicz updated HCATALOG-506:
-----------------------------------

    Attachment: HCATALOG-506.patch
    
> desired number of input splits for large files
> ----------------------------------------------
>
>                 Key: HCATALOG-506
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-506
>             Project: HCatalog
>          Issue Type: Improvement
>    Affects Versions: 0.4
>            Reporter: Greg Malewicz
>              Labels: performance
>         Attachments: HCATALOG-506.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Allow the user to specify the desired number of input splits through a new
> configuration parameter, hcatalog.desiredNumInputSplits. Two existing
> parameters, mapred.min.split.size and mapred.max.split.size, may also need
> to be set. This is useful when the input consists of a few large files
> that should be divided into many splits, so that the splits can be loaded
> with greater parallelism.
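
To illustrate why the minimum and maximum split sizes may also need to be set, here is a minimal sketch of how a desired split count could interact with those two bounds. It is not taken from the attached patch. The class and method names (SplitSizeSketch, targetSplitSize, splitCount) are hypothetical, and the clamping rule is an assumption modelled on the usual max(min, min(max, goal)) pattern.

```java
// Hypothetical sketch, NOT the logic of HCATALOG-506.patch.
// Assumption: the desired split count is turned into a goal size, which is
// then clamped by the configured minimum and maximum split sizes.
final class SplitSizeSketch {

    // Goal size = ceil(totalBytes / desiredSplits), clamped to [minSize, maxSize].
    static long targetSplitSize(long totalBytes, int desiredSplits,
                                long minSize, long maxSize) {
        if (totalBytes < 0 || desiredSplits <= 0 || minSize <= 0 || maxSize < minSize) {
            throw new IllegalArgumentException("invalid split parameters");
        }
        long goal = (totalBytes + desiredSplits - 1) / desiredSplits;
        return Math.max(minSize, Math.min(maxSize, goal));
    }

    // Number of splits produced when the input is cut into pieces of splitSize bytes.
    static long splitCount(long totalBytes, long splitSize) {
        return (totalBytes + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long gib = 1024L * mib;
        long total = 10 * gib; // one large 10 GiB input, 100 splits requested

        // A generous maximum lets the requested count through.
        long loose = targetSplitSize(total, 100, mib, 256 * mib);
        System.out.println(loose + " bytes per split, " + splitCount(total, loose) + " splits");

        // A tight maximum caps the split size, so more splits are produced.
        long capped = targetSplitSize(total, 100, mib, 64 * mib);
        System.out.println(capped + " bytes per split, " + splitCount(total, capped) + " splits");
    }
}
```

Under these assumptions, a 10 GiB input with 100 requested splits yields 100 splits when the maximum split size is 256 MiB, but 160 splits when the maximum is 64 MiB. In that case the bound, not the requested count, decides the outcome.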

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira