[ https://issues.apache.org/jira/browse/PIG-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Viraj Bhat updated PIG-1586:
----------------------------

    Description:
I have a Pig script as a template:
{code}
register Countwords.jar;
A = $INPUT;
B = FOREACH A GENERATE
examples.udf.SubString($0,0,1),
$1 as num;
C = GROUP B BY $0;
D = FOREACH C GENERATE group, SUM(B.num);
STORE D INTO $OUTPUT;
{code}
I attempt parameter substitution using the following shell script:
{code}
#!/bin/bash
java -cp ~/pig-svn/trunk/pig.jar:$HADOOP_CONF_DIR org.apache.pig.Main -r -file sub.pig \
             -param INPUT="(foreach (COGROUP(load '/user/viraj/dataset1' USING PigStorage() AS (word:chararray,num:int)) by (word),(load '/user/viraj/dataset2' USING PigStorage() AS (word:chararray,num:int)) by (word)) generate flatten(examples.udf.CountWords(\\$0,\\$1,\\$2)))" \
             -param OUTPUT="\'/user/viraj/output\' USING PigStorage()"
{code}
The substituted script that results is:
{code}
register Countwords.jar;
A = (foreach (COGROUP(load '/user/viraj/dataset1' USING PigStorage() AS (word:chararray,num:int)) by (word),(load '/user/viraj/dataset2' USING PigStorage() AS (word:chararray,num:int)) by (word)) generate flatten(examples.udf.CountWords(runsub.sh,,)));
B = FOREACH A GENERATE
examples.udf.SubString($0,0,1),
$1 as num;
C = GROUP B BY $0;
D = FOREACH C GENERATE group, SUM(B.num);
STORE D INTO /user/viraj/output;
{code}
The shell substitutes the $0 before passing it to java.
a) Is there a workaround for this?
b) Is this a Pig param problem?

Viraj
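
On the shell side of the question, a minimal bash sketch of what each quoting style actually hands to the child process may help. `show_arg` is a hypothetical helper standing in for the java process; nothing here runs Pig, so whether Pig's own preprocessor then leaves a literal `$0` alone is not verified by this example.

{code}
#!/bin/bash
# Hypothetical stand-in for the java process: prints its first
# argument exactly as the shell delivered it.
show_arg() { printf '%s\n' "$1"; }

# Double quotes with \\$0: bash collapses \\ to a single backslash
# and still expands $0, $1 and $2 (as in the script in the report).
double_escaped=$(show_arg "CountWords(\\$0,\\$1,\\$2)")

# Double quotes with \$0: the backslash protects the dollar sign,
# so the child receives a literal $0.
single_escaped=$(show_arg "CountWords(\$0,\$1,\$2)")

# Single quotes: bash expands nothing inside them.
single_quoted=$(show_arg 'CountWords($0,$1,$2)')

printf '%s\n' "$double_escaped" "$single_escaped" "$single_quoted"
{code}

Only the first form lets bash replace `$0` with the script name, which matches the `runsub.sh,,` seen in the substituted script. The other two pass the text through unchanged, so they are the candidates to try when building the -param value.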

> Parameter subsitution using -param option runs into problems when substituing
> entire pig statements in a shell script (maybe this is a bash problem)
> ----------------------------------------------------------------------------
>
>                 Key: PIG-1586
>                 URL: https://issues.apache.org/jira/browse/PIG-1586
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.8.0
>            Reporter: Viraj Bhat

--
This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.