Thank you, this is what I was expecting. So DEFINE commands can serve as 'local' reduce functions. I am not sure this is specifically mentioned in any of the wiki pages. I feel this is a pretty neat way to implement low-level 'local' map/reduce functionality over Pig (though the user should realize that the commands operate only on local data).
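To make the 'local' versus global distinction concrete, here is a minimal sketch in plain Python, run outside Pig. The input splits, `count_rows`, and `run_streaming_task` are hypothetical names made up for illustration; the loop only imitates what each streamed task would see, and the final `sum` stands in for the GROUP ALL plus SUM step.

```python
#!/usr/bin/env python
import io
import sys


def count_rows(stream):
    """Count the lines on one input stream (one task's share of the data)."""
    return sum(1 for _ in stream)


def run_streaming_task(stdin=sys.stdin, stdout=sys.stdout):
    """Body of the streamed script: emit a single local count."""
    # write() needs a string, so the integer count is converted explicitly.
    stdout.write(str(count_rows(stdin)) + "\n")


# Hypothetical input splits, standing in for the data each mapper would see.
splits = ["a\nb\nc\n", "d\ne\n", "f\n"]

local_counts = []
for split in splits:
    out = io.StringIO()
    run_streaming_task(io.StringIO(split), out)
    local_counts.append(int(out.getvalue()))

# Stand-in for GROUP ... ALL followed by SUM: combine the local counts.
global_count = sum(local_counts)
print(local_counts, global_count)
```

Each call yields a count for its own split only; the total appears only once the local counts are brought together in one place.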
On Thu, Feb 18, 2010 at 3:33 PM, Ankur Goel <[email protected]> wrote:

> Prasenjit,
>      Whether this is executed in the map or reduce phase, it will
> only produce a 'local' sum. To produce a global sum you should be able to do
> something like this:
>
> A = Load ...
> DEFINE CMD `script` ship('/a/b/script');
> B = Stream A through CMD as (count: long);
> C = GROUP B ALL;
> D = FOREACH C GENERATE 'Num Rows', SUM(B.count);
>
> Notice that the GROUP ALL after streaming is what will ship the counts computed
> by your python script in each mapper to a single reducer, where they will be
> summed to produce a global sum.
>
> Hope that helps
>
> -...@nkur
>
>
> prasenjit mukherjee wrote:
>
>> Apologies if I was not clear enough.
>>
>> Can I use the following python script in my DEFINE command to compute
>> number of rows in my relation ( basically same as the SUM command) :
>>
>> #!/usr/bin/python
>> import sys
>> my_sum = 0
>> for line in sys.stdin:
>>     my_sum += 1
>> sys.stdout.write(str(my_sum) + "\n")
>>
>> -Prasen
>>
>> On Thu, Feb 18, 2010 at 2:56 PM, Ankur Goel <[email protected]> wrote:
>>
>>> Depending upon where it is placed in your pig script, it will be invoked
>>> in either the map or the reduce phase.
>>> To get a better understanding of your pig script's execution plan you can
>>> do this from the grunt shell:
>>>
>>> explain -script <your-script> -dot -out <dot-output-file>
>>>
>>> You can then feed the dot output file into a dot parser to generate the
>>> DAG in jpg/gif format.
>>>
>>> -...@nkur
>>>
>>>
>>> prasenjit mukherjee wrote:
>>>
>>>> Just wondering if I can use the DEFINE command to write my custom
>>>> mapper/reducer functions. The mapper (I believe) I can, but I am not sure
>>>> about the reducer. I guess this depends on how the DEFINE commands are
>>>> invoked.
>>>>
>>>> -Prasen
