[ https://issues.apache.org/jira/browse/PIG-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Olga Natkovich updated PIG-1144:
--------------------------------

    Status: Patch Available  (was: Open)

Resubmitting to rerun the tests.

> set default_parallelism construct does not set the number of reducers correctly
> -------------------------------------------------------------------------------
>
>                 Key: PIG-1144
>                 URL: https://issues.apache.org/jira/browse/PIG-1144
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.6.0
>         Environment: Hadoop 20 cluster with multi-node installation
>            Reporter: Viraj Bhat
>            Assignee: Daniel Dai
>             Fix For: 0.7.0
>
>         Attachments: brokenparallel.out, genericscript_broken_parallel.pig, PIG-1144-1.patch, PIG-1144-2.patch, PIG-1144-3.patch
>
>
> Hi all,
> I have a Pig script where I set the parallelism using the following set construct: "set default_parallel 100". I modified "MRPrinter.java" to print out the parallelism:
> {code}
> ...
> public void visitMROp(MapReduceOper mr) {
>     mStream.println("MapReduce node " + mr.getOperatorKey().toString()
>             + " Parallelism " + mr.getRequestedParallelism());
> }
> ...
> {code}
> When I run an explain on the script, I see that the last job, which does the actual sort, runs as a single-reducer job. This can be corrected by adding the PARALLEL keyword to the ORDER BY.
> Attaching the script and the explain output.
> Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
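For reference, the workaround described in the issue (stating the parallelism explicitly on the ORDER BY instead of relying on default_parallel) can be sketched as a Pig Latin script. The relation names, field names, and input/output paths below are hypothetical, not taken from the attached genericscript_broken_parallel.pig:

```pig
-- default_parallel should apply to every reduce-side job, but per this
-- issue the final sort job for ORDER BY ran with a single reducer.
set default_parallel 100;

A = LOAD 'input' AS (name:chararray, cnt:long);   -- hypothetical input
B = GROUP A BY name;
C = FOREACH B GENERATE group, SUM(A.cnt) AS total;

-- Workaround: request the parallelism explicitly on the ORDER BY
-- so the sort job does not fall back to one reducer.
D = ORDER C BY total DESC PARALLEL 100;

STORE D INTO 'output';
```

With the fix in the attached patches, the PARALLEL clause on the ORDER BY should no longer be needed, since default_parallel is meant to cover the sort job as well.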