[
https://issues.apache.org/jira/browse/PIG-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263289#comment-13263289
]
Daniel Dai commented on PIG-2661:
---------------------------------
For an order-by, we first generate three MR jobs: the first job processes
everything before the order-by, the second is the sample job, and the third is
the sort job. We try to drop the first job in SampleOptimizer. Currently we only
drop the first job when it is empty. If the first job is not empty, we may be
able to merge the first job's pipeline into the second and third jobs; however,
we need to make sure that we sample the input after the pipeline and that
WeightedRangePartitioner also partitions the input after the pipeline. This
seems to require some non-trivial work.
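
To illustrate the constraint, here is a rough Python sketch, not Pig code. It
assumes a hypothetical pipeline() function standing in for the first job's
work, and hypothetical sample_boundaries() and partition() helpers standing in
for the sample job and the partitioner. The point it tries to show is that both
the sampler and the partitioner have to look at the pipeline's output, not the
raw input. It is a plain range partitioner and does not model any weighting.

```python
import bisect
import random


def pipeline(record):
    # Hypothetical stand-in for the first job's pipeline
    # (whatever runs before the order-by); it yields the sort key.
    return (record * 7) % 100


def sample_boundaries(records, num_partitions, sample_size, seed=0):
    # Sample the input AFTER the pipeline, then pick evenly spaced
    # quantiles of the sampled keys as partition boundaries.
    rng = random.Random(seed)
    sampled = sorted(pipeline(r) for r in rng.sample(records, sample_size))
    return [sampled[i * len(sampled) // num_partitions]
            for i in range(1, num_partitions)]


def partition(record, boundaries):
    # Partition on the key AFTER the pipeline as well, so the
    # boundaries and the keys being compared are in the same space.
    return bisect.bisect_right(boundaries, pipeline(record))


records = list(range(1000))
boundaries = sample_boundaries(records, num_partitions=4, sample_size=100)
partitions = [partition(r, boundaries) for r in records]
```

If partition() compared the raw record against boundaries computed from
pipeline() output (or the other way around), the ranges would no longer line up
with the keys being sorted, which is why merging the first job is not a simple
change.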
> Pig uses an extra job for loading data in Pigmix L9
> ---------------------------------------------------
>
> Key: PIG-2661
> URL: https://issues.apache.org/jira/browse/PIG-2661
> Project: Pig
> Issue Type: Improvement
> Affects Versions: 0.9.0
> Reporter: Jie Li
>
> See
> https://issues.apache.org/jira/browse/PIG-200?focusedCommentId=13260155&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13260155