[ 
https://issues.apache.org/jira/browse/HCATALOG-506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13463227#comment-13463227
 ] 

Travis Crawford commented on HCATALOG-506:
------------------------------------------

Hey Greg, this patch doesn't apply cleanly. It looks like it was generated 
with plain {{git diff}}, whose default a/ and b/ path prefixes do not apply 
correctly when patching into svn repos. In the future, please use 
{{git diff --no-prefix}}, which generates patches in the correct format.

https://cwiki.apache.org/confluence/display/HCATALOG/HowToContribute#HowToContribute-Creatingapatchwithgit

This patch is pretty small so I'll take care of things this time.
                
> desired number of input splits for large files
> ----------------------------------------------
>
>                 Key: HCATALOG-506
>                 URL: https://issues.apache.org/jira/browse/HCATALOG-506
>             Project: HCatalog
>          Issue Type: Improvement
>    Affects Versions: 0.4
>            Reporter: Greg Malewicz
>            Assignee: Travis Crawford
>              Labels: performance
>         Attachments: HCATALOG-506.patch, HCATALOG-506-revised.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Allow the user to specify the desired number of input splits through a new 
> configuration parameter hcatalog.desiredNumInputSplits. Two existing 
> parameters may also need to be specified: mapred.min.split.size and 
> mapred.max.split.size. This is useful when there are few but large input 
> files that we want to split into many splits, so as to enhance the 
> parallelism of loading the splits.
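
For illustration only, here is a rough sketch of how a desired split count could interact with minimum and maximum split-size bounds. It is not taken from the attached patch: the class and method names are hypothetical, and the clamping rule (target size = total bytes / desired splits, then bounded by min and max) is an assumption about one reasonable way to combine the three settings.

```java
public class SplitSizeSketch {

    // Hypothetical helper, not part of HCatalog or Hadoop.
    // Picks a per-split byte size so that roughly desiredSplits splits
    // are produced, then clamps it to the [minSize, maxSize] bounds.
    public static long splitSize(long totalBytes, int desiredSplits,
                                 long minSize, long maxSize) {
        if (desiredSplits <= 0) {
            throw new IllegalArgumentException("desiredSplits must be positive");
        }
        // Ceiling division, so desiredSplits splits cover all the bytes.
        long goal = (totalBytes + desiredSplits - 1) / desiredSplits;
        // The minimum bound wins if the two bounds conflict.
        return Math.max(minSize, Math.min(maxSize, goal));
    }

    // Number of splits that result from cutting totalBytes into
    // pieces of splitSize bytes (the last piece may be shorter).
    public static long numSplits(long totalBytes, long splitSize) {
        return (totalBytes + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long total = 10L * 1024 * 1024 * 1024; // one 10 GiB input file
        long size = splitSize(total, 100, 1L, Long.MAX_VALUE);
        System.out.println(size + " bytes per split, "
                + numSplits(total, size) + " splits");
    }
}
```

Under these assumptions, a minimum split size larger than the target reduces the number of splits, and a maximum smaller than the target increases it, which may be why the two existing parameters need to be set alongside the new one.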
