Indeed. I have observed Pig running considerably faster than hand-written MR programs, precisely because it is willing and able to do optimizations that decrease the number of passes over the data. These optimizations break abstraction boundaries in a way that would be very unpleasant, or outright infeasible, to maintain in hand-written Java code.
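To make the "fewer passes" point concrete, here is a rough sketch in plain Python. It is not Pig or Hadoop code and does not show how Pig implements anything; the record layout and the function names `two_passes` and `one_pass` are made up for illustration. It only shows the general idea of folding two aggregations into a single scan of the input.

```python
from collections import Counter

# Hypothetical input: (user, url) click records.
RECORDS = [
    ("alice", "/home"),
    ("bob", "/home"),
    ("alice", "/search"),
    ("carol", "/home"),
]


def two_passes(records):
    """Two independent jobs, each scanning the full input once."""
    by_user = Counter(user for user, _ in records)  # first scan
    by_url = Counter(url for _, url in records)     # second scan
    return by_user, by_url


def one_pass(records):
    """Both aggregations share a single scan of the input."""
    by_user, by_url = Counter(), Counter()
    for user, url in records:
        by_user[user] += 1
        by_url[url] += 1
    return by_user, by_url


if __name__ == "__main__":
    # Both strategies produce the same counts; only the number of scans differs.
    print(two_passes(RECORDS) == one_pass(RECORDS))
```

The fused version is harder to keep tidy by hand, because two logically separate computations now live in one loop. That is the kind of boundary-crossing that is much easier to leave to an optimizer.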
In general, I don't think performance will be all that strongly impacted by introducing Pig components.

On Tue, Feb 23, 2010 at 11:54 PM, Ankur C. Goel <gan...@yahoo-inc.com> wrote:

> Pig also provides very nice features like MultiQuery optimization and
> skewed & merge join that are hard to implement in Java M/R every time you
> need them.
>
> With the latest pig release 0.6 the performance gap between Java M/R and
> Pig has been narrowed to a good extent.

--
Ted Dunning, CTO
DeepDyve