Parameter subsitution using -param option runs into problems when substituing entire pig statements in a shell script (maybe this is a bash problem) ----------------------------------------------------------------------------------------------------------------------------------------------------
Key: PIG-1586 URL: https://issues.apache.org/jira/browse/PIG-1586 Project: Pig Issue Type: Bug Affects Versions: 0.8.0 Reporter: Viraj Bhat I have a Pig script as a template: {code} register Countwords.jar; A = $INPUT; B = FOREACH A GENERATE examples.udf.SubString($0,0,1), $1 as num; C = GROUP B BY $0; D = FOREACH C GENERATE group, SUM(B.num); STORE D INTO $OUTPUT; {code} I attempt to do Parameter substitutions using the following: Using Shell script: {code} #!/bin/bash java -cp ~/pig-svn/trunk/pig.jar:$HADOOP_CONF_DIR org.apache.pig.Main -r -file sub.pig \ -param INPUT="(foreach (COGROUP(load '/user/viraj/dataset1' USING PigStorage() AS (word:chararray,num:int)) by (word),(load '/user/viraj/dataset2' USING PigStorage() AS (word:chararray,num:int)) by (word)) generate flatten(examples.udf.CountWords(\\$0,\\$1,\\$2)))" \ -param OUTPUT="\'/user/viraj/output\' USING PigStorage()" {code} register Countwords.jar; A = (foreach (COGROUP(load '/user/viraj/dataset1' USING PigStorage() AS (word:chararray,num:int)) by (word),(load '/user/viraj/dataset2' USING PigStorage() AS (word:chararray,num:int)) by (word)) generate flatten(examples.udf.CountWords(runsub.sh,,))); B = FOREACH A GENERATE examples.udf.SubString($0,0,1), $1 as num; C = GROUP B BY $0; D = FOREACH C GENERATE group, SUM(B.num); STORE D INTO /user/viraj/output; {code} The shell substitutes the $0 before passing it to java. a) Is there a workaround for this? b) Is this is Pig param problem? Viraj -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.