Re: split operator

2010-08-23 Thread Daniel Dai
> This question is from a while ago, but I have some more thoughts on it. In a query as simple as this:
>
> A = LOAD 'input';
> B = FILTER A BY $1 == 1;
> C = COGROUP A BY $0, B BY $0;
>
> the optimizer will insert a split operator to reuse A. Acc

Re: split operator

2010-08-23 Thread Gang Luo
Hi Daniel,
This question is from a while ago, but I have some more thoughts on it. In a query as simple as this:
    A = LOAD 'input';
    B = FILTER A BY $1 == 1;
    C = COGROUP A BY $0, B BY $0;
the optimizer will insert a split operator to reuse A. According to the source c
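
To make the role of the split in the query above concrete, here is a minimal Python sketch of its data flow. It is not Pig's implementation: `run_query` and the sample rows are hypothetical, and the only point is that a single scan of A can feed both the FILTER branch and the COGROUP input.

```python
from collections import defaultdict


def run_query(rows):
    """Hypothetical stand-in for the three-statement query.

    One pass over A hands each tuple to both branches, which is the
    job of the implicit split; the result mimics COGROUP A BY $0, B BY $0.
    """
    # key -> (bag of A tuples, bag of B tuples)
    groups = defaultdict(lambda: ([], []))
    for row in rows:                    # single scan of the input
        key = row[0]
        groups[key][0].append(row)      # branch 1: A goes straight to COGROUP
        if row[1] == 1:                 # branch 2: B = FILTER A BY $1 == 1
            groups[key][1].append(row)
    return dict(groups)


rows = [("x", 1), ("x", 2), ("y", 3)]   # made-up sample input
result = run_query(rows)
```

In this sketch a key with no matching B tuples still appears with an empty B bag, which matches how COGROUP treats keys present in only one input.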

Re: split operator

2010-07-26 Thread Daniel Dai
> in 4.3.1, the example and figure 6 show this. The last paragraph of 5.1 says the split operator maintains a one-tuple buffer for each branch and talks about how to synchronize multiple branches. I do think that is the in-memory split.
>
> here is the paper: http://www.vldb.org/pvldb/2/vldb09-10

Re: split operator

2010-07-26 Thread Gang Luo
Hi Daniel, in 4.3.1, the example and figure 6 show this. The last paragraph of 5.1 says the split operator maintains a one-tuple buffer for each branch and talks about how to synchronize multiple branches. I do think that is the in-memory split. Here is the paper: http://www.vldb.org/pvldb/2/vldb09-1074.pdf

Re: split operator

2010-07-26 Thread Daniel Dai
jobs)
Daniel
Gang Luo wrote:
> Hi all, according to the VLDB '09 paper, the split operator and all its subsequent operators reside in memory without any blocking in between. However, the source code (version 0.7) shows that an MR job is actually ended when it meets the split operator and multiple new MR

split operator

2010-07-25 Thread Gang Luo
Hi all, according to the VLDB '09 paper, the split operator and all its subsequent operators reside in memory without any blocking in between. However, the source code (version 0.7) shows that an MR job is actually ended when it meets the split operator and multiple new MR jobs are created, each