Hi all,
According to the VLDB '09 paper, the split operator and all of its downstream
operators run in memory, with no blocking in between. However, the source
code (version 0.7) shows that an MR job actually ends when it reaches the
split operator, and multiple new MR jobs are then created, one per branch.
This write-once, read-multiple-times approach differs from the in-memory
approach described in the paper. Has Pig changed its strategy for split, or is
there still an in-memory version of split that I haven't found?

Thanks,
-Gang


