Hi Daniel,
This question is from a while ago, but I have come up with some more thoughts on
it. In a query as simple as this:
A = LOAD 'input';
B = FILTER A BY $1 == 1;
C = COGROUP A BY $0, B BY $0;
the optimizer will insert a split operator to reuse A. According to the source
c
Hi Daniel,
In Section 4.3.1 of the paper, the example and Figure 6 show this. The last
paragraph of Section 5.1 says the split operator maintains a one-tuple buffer
for each branch and discusses how to synchronize multiple branches. I do think
that is the in-memory split.
Here is the paper: http://www.vldb.org/pvldb/2/vldb09-1074.pdf
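To make the in-memory reading concrete, here is a toy Python sketch of the query above. It is not Pig's code and does not model the per-branch buffer or the synchronization the paper describes; it only illustrates the idea that each input tuple is handed to every branch before the next tuple is read, with nothing written out in between. All function names (run_split, branch_a, branch_b, cogroup) are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


def run_split(tuples, branches):
    """Feed each input tuple to every branch before reading the next tuple."""
    outputs = [[] for _ in branches]
    for t in tuples:
        for i, branch in enumerate(branches):
            result = branch(t)  # a branch returns None to drop the tuple
            if result is not None:
                outputs[i].append(result)
    return outputs


def branch_a(t):
    # Branch for A: pass every tuple through unchanged.
    return t


def branch_b(t):
    # Branch for B: FILTER A BY $1 == 1.
    return t if t[1] == 1 else None


def cogroup(a_rows, b_rows):
    """Group both inputs by field $0, keeping one bag per input."""
    groups = defaultdict(lambda: ([], []))
    for row in a_rows:
        groups[row[0]][0].append(row)
    for row in b_rows:
        groups[row[0]][1].append(row)
    return dict(groups)


rows = [("k1", 1), ("k1", 2), ("k2", 1)]
a_out, b_out = run_split(rows, [branch_a, branch_b])
c = cogroup(a_out, b_out)
```

In this sketch the input is scanned once and both branches consume it in the same pass, which is how I read the paper's description, as opposed to ending the job at the split and starting new ones.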
jobs)
Daniel
Gang Luo wrote:
Hi all,
According to the VLDB '09 paper, the split operator and all of its successive
operators reside in memory without any blocking in between. However, the source
code (version 0.7) shows that an MR job actually ends when it reaches the split
operator, and multiple new MR jobs are created, each