[
https://issues.apache.org/jira/browse/PIG-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914663#action_12914663
]
Thejas M Nair commented on PIG-1642:
------------------------------------
Comments on the patch:
- In SampleOptimizer.java, the code expects the sampling MR plan to have only
one integer argument, which carries the number of reducers that will be used in
the successor of the sampling job (order-by/skewed-join). We might not remember
this assumption if we change the sampling plan, so it would be safer to throw
an error if more than one integer constant is seen in the plan.
- In the test case, the expected number of reducers is computed dynamically and
used for the check in the first scenario; it can be used in the last scenario
as well.
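
To illustrate the first point, here is a minimal sketch of the kind of guard
being suggested. It is not the code in SampleOptimizer.java: the class name,
the method name, and the representation of the plan's constants as a plain
List<Object> are all hypothetical, chosen only to keep the example
self-contained. Rejecting a plan with no integer constant at all is an extra
assumption of this sketch, not something requested above.

```java
import java.util.List;

// Hypothetical helper, not part of Pig: shows the "fail loudly" check only.
final class SingleIntConstantCheck {

    // Returns the single Integer among the given constants.
    // Throws if more than one Integer is present (the case raised in the
    // review comment) or if none is present (an assumption of this sketch).
    static int findReducerCount(List<Object> constants) {
        Integer found = null;
        for (Object c : constants) {
            if (c instanceof Integer) {
                if (found != null) {
                    throw new IllegalStateException(
                        "Expected exactly one integer constant in the "
                        + "sampling plan, but found more than one");
                }
                found = (Integer) c;
            }
        }
        if (found == null) {
            throw new IllegalStateException(
                "No integer constant found in the sampling plan");
        }
        return found;
    }
}
```

With a check like this, a later change that adds a second integer constant to
the sampling plan would fail immediately instead of silently rewriting the
wrong value.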
> Order by doesn't use estimation to determine the parallelism
> ------------------------------------------------------------
>
> Key: PIG-1642
> URL: https://issues.apache.org/jira/browse/PIG-1642
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Richard Ding
> Assignee: Richard Ding
> Fix For: 0.8.0
>
> Attachments: PIG-1642.patch, PIG-1642_1.patch, PIG-1642_1.patch
>
>
> With PIG-1249, a simple heuristic is used to determine the number of reducers
> if it isn't specified (via PARALLEL or default_parallel). For an order by
> statement, however, it still defaults to 1.