I will. There is also a "bug" in the Pig documentation here:
http://pig.apache.org/docs/r0.8.1/piglatin_ref2.html
where it says:

    In this example the command is executed and its stdout is used as the
    parameter value.

    %declare CMD 'generate_date';

It should really be `generate_date` with backticks, not single quotes.

On Wed, Aug 17, 2011 at 6:18 PM, Dmitriy Ryaboy <dvrya...@gmail.com> wrote:

> Nice job figuring out a fix!
> You should seriously file a bug with AMR for that. That's kind of
> ridiculous.
>
> D
>
> On Wed, Aug 17, 2011 at 6:03 PM, Dexin Wang <wangde...@gmail.com> wrote:
>
> > I solved my own problem and just want to share with whoever might
> > encounter the same issue.
> >
> > I pass a colon-separated list, then convert it to a comma-separated list
> > inside the pig script using the declare command.
> >
> > Submit pig job like this:
> >
> > -p SOURCE_DIRS="2011-08:2011-07:2011-06"
> >
> > and in Pig script
> >
> > %declare SOURCE_DIRS_CONVERTED `echo $SOURCE_DIRS | tr ':' ','`;
> > LOAD '/root_dir/{$SOURCE_DIRS_CONVERTED}' ...
> >
> >
> > On Wed, Aug 17, 2011 at 4:21 PM, Dexin Wang <wangde...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I'm running pig jobs using Amazon pig support, where you submit jobs
> > > with comma-concatenated parameters like this:
> > >
> > > elastic-mapreduce --pig-script --args myscript.pig --args -p,PARAM1=value1,-p,PARAM2=value2,-p,PARAM3=value3
> > >
> > > In my script, I need to pass multiple directories for the pig script to
> > > load like this:
> > >
> > > raw = LOAD '/root_dir/{$SOURCE_DIRS}'
> > >
> > > and SOURCE_DIRS is computed. For example, it can be
> > > "2011-08,2011-07,2011-06", meaning my pig script needs to load data for
> > > the past 3 months. This works fine when I run my job using local or
> > > direct hadoop mode.
> > > But with Amazon pig, I have to do something like this:
> > >
> > > elastic-mapreduce --pig-script --args myscript.pig -p,SOURCE_DIRS="2011-08,2011-07,2011-06"
> > >
> > > but emr will just replace commas with spaces, so it breaks the parameter
> > > passing syntax. I've tried adding backslashes before the commas, but I
> > > simply end up with a backslash with a space in between.
> > >
> > > So the question becomes:
> > >
> > > 1. can I do something different from what I'm doing to pass multiple
> > >    folders to the pig script (without commas), or
> > > 2. does anyone know how to properly pass commas to elastic-mapreduce?
> > >
> > > Thanks!
> > >
> > > Dexin
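
For anyone who wants to check the colon-to-comma conversion outside of Pig, the pipeline inside the backticks can be run in a plain shell. This is only a minimal sketch, assuming a POSIX shell with `tr` available. The `LOAD_PATH` variable is hypothetical and is there just to show what the glob in the LOAD statement should look like after substitution.

```shell
# Value as it would be passed on the command line with -p SOURCE_DIRS=...
SOURCE_DIRS="2011-08:2011-07:2011-06"

# Same pipeline that %declare runs inside the backticks
SOURCE_DIRS_CONVERTED=$(echo "$SOURCE_DIRS" | tr ':' ',')

# Hypothetical helper variable: the path Pig should see after substitution
LOAD_PATH="/root_dir/{$SOURCE_DIRS_CONVERTED}"

echo "$SOURCE_DIRS_CONVERTED"
echo "$LOAD_PATH"
```

If the second line printed is not a brace glob with commas, the parameter was probably altered before it reached the script.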