Thank you very much for your explanation. Just to verify that I
understood correctly: for example, if myfile contains the following data

1 3 4
3 4 6
7 8 2
4 5 9
9 3 5
6 6 2

so all this data will be sent to the Proj(0) operator, which gives as a
result 1 3 7 4 9 6. After that, all the data in myfile will be sent to
the filter operator, so that the filter takes two inputs: the myfile data
and the result of Proj(0) > 5, which is 7 9 6.

Regards

On Mon, Jan 24, 2011 at 10:08 PM, Alan Gates <[email protected]> wrote:

> The logical plan for your script will look like:
>
> Load -> Filter -> Store
>
> Filter will have an expression plan that looks like Proj($0) > const(5)
>
> So yes, all your data will go through the filter operator. But keep in
> mind that there is a filter operator in each map task, so all your data
> will not go through any one instance of the operator (unless myfile is
> small). Hope that helps.
>
> Unfortunately, there is not any great architecture document on Pig.
> Probably the best substitute is a paper we published in VLDB 2009, which
> you can get here:
> http://infolab.stanford.edu/~olston/publications/vldb09.pdf. Since this
> is almost 2 years old now, some of the specific information is out of
> date, but the basic structure is still correct.
>
> Alan.
>
>
> On Jan 24, 2011, at 12:48 PM, Baraa Mohamad wrote:
>
>> Hello all:
>>
>> I'm a new user of Pig, and I'm very interested in the architecture of
>> Pig. I have a question about the logical plan.
>>
>> In the logical plan of this example (attached):
>> a = load 'myfile';
>> b = filter a by $0 > 5;
>> store b into 'myfilteredfile';
>>
>>
>> Will all the data in 'myfile' be sent in its entirety to the Proj(0)
>> operator and to the Filter operator?
>> More generally, what is running on the arrows in the logical plan?
>>
>> What is the best documentation for understanding the architecture of
>> Pig, not only how to use it? I'll try to use it in the medical domain,
>> but first I have to understand it deeply.
>>
>> Thank you very much for your help.
>>
>>
>> Baraa MOHAMAD
>> PhD student in computer science
>> ISIMA-LIMOS
>> Université Blaise Pascal
>> Clermont-Ferrand
>> France
>> Tel: +33 658900080
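
To make the Load -> Filter -> Store discussion concrete, here is a small Python sketch of how an expression plan like Proj($0) > const(5) could be evaluated one record at a time. It is only an illustration, not Pig's actual implementation: the names proj, const, greater_than and filter_op are hypothetical, and the row layout (six records of three fields each) is an assumption based on the numbers in the message above. In this sketch the expression is evaluated inside the filter for each record, and the filter passes along the whole record whenever the expression is true.

```python
# Assumed layout of myfile: six records, three fields each.
rows = [(1, 3, 4), (3, 4, 6), (7, 8, 2), (4, 5, 9), (9, 3, 5), (6, 6, 2)]


def proj(index):
    """Hypothetical stand-in for Proj($index): pick one field of a record."""
    return lambda record: record[index]


def const(value):
    """Hypothetical stand-in for const(value): ignore the record."""
    return lambda record: value


def greater_than(left, right):
    """Combine two sub-expressions into a boolean expression."""
    return lambda record: left(record) > right(record)


def filter_op(records, predicate):
    """Evaluate the predicate per record and keep whole matching records."""
    for record in records:
        if predicate(record):
            yield record


# Expression plan for: b = filter a by $0 > 5;
predicate = greater_than(proj(0), const(5))
filtered = list(filter_op(rows, predicate))
print(filtered)
```

Under these assumptions, the first fields seen by proj(0) are 1 3 7 4 9 6, and the records kept are the ones whose first field is 7, 9 and 6, each with its other fields still attached.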
