[ https://issues.apache.org/jira/browse/PIG-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914663#action_12914663 ]
Thejas M Nair commented on PIG-1642:
------------------------------------

Comments on the patch:

- SampleOptimizer.java expects the sampling MR plan to have only one integer argument, which carries the number of reducers that will be used in the successor of the sampling job (order-by/skewed-join). We might not remember this assumption when we make changes to the sampling plan, so it would be safer to throw an error if more than one integer constant is seen in the plan.
- In the test case, the expected number of reducers is computed dynamically and used for the check in the first scenario; it can be used in the last scenario as well.

> Order by doesn't use estimation to determine the parallelism
> ------------------------------------------------------------
>
>                 Key: PIG-1642
>                 URL: https://issues.apache.org/jira/browse/PIG-1642
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Richard Ding
>            Assignee: Richard Ding
>             Fix For: 0.8.0
>
>         Attachments: PIG-1642.patch, PIG-1642_1.patch, PIG-1642_1.patch
>
>
> With PIG-1249, a simple heuristic is used to determine the number of reducers
> if it isn't specified (via PARALLEL or default_parallel). For order by
> statement, however, it still defaults to 1.
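
To make the first review comment concrete, here is a minimal sketch of the kind of guard being suggested. It is not the code in SampleOptimizer.java and does not use Pig's plan classes: the class name, the method name, and the idea of passing the plan's constants as a plain list are all assumptions made for illustration. It only shows the check itself, which is to fail loudly as soon as a second integer constant is seen.

```java
import java.util.Arrays;
import java.util.List;

public class SingleIntConstantCheck {

    /**
     * Hypothetical helper (not Pig's API): scans the constant values collected
     * from a sampling plan and returns the index of the only Integer among them.
     * Returns -1 if there is no Integer at all, and throws if there is more
     * than one, since the "exactly one reducer-count constant" assumption
     * would no longer hold.
     */
    public static int findOnlyIntegerConstant(List<Object> constants) {
        int found = -1;
        for (int i = 0; i < constants.size(); i++) {
            if (constants.get(i) instanceof Integer) {
                if (found != -1) {
                    throw new IllegalStateException(
                        "Expected at most one integer constant in the sampling plan, "
                        + "but found a second one at position " + i);
                }
                found = i;
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // One integer constant: the assumed reducer count.
        List<Object> ok = Arrays.<Object>asList("sampler", 10, "path");
        System.out.println("reducer constant at index " + findOnlyIntegerConstant(ok));

        // Two integer constants: the guard rejects the plan.
        List<Object> ambiguous = Arrays.<Object>asList(10, "sampler", 20);
        try {
            findOnlyIntegerConstant(ambiguous);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing early here means a later change that adds another integer constant to the sampling plan shows up as an explicit error rather than as a silently wrong reducer count.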