I mean that the head operators have different parallelism:

IterativeDataStream ids = ...
ids.map().setParallelism(2)
ids.map().setParallelism(4)
//...
ids.closeWith(feedback)

Aljoscha Krettek <aljos...@apache.org> wrote (on Fri, 31 Jul 2015, 14:23):

> I thought about having some tighter restrictions here. My idea was to
> enforce that the feedback edges must have the same parallelism as the
> original input stream, otherwise shipping strategies such as "keyBy",
> "shuffle", "rebalance" don't seem to make sense because they would differ
> from the distribution of the original elements (at least IMHO). Maybe I'm
> wrong there, though.
>
> To me it seems intuitive that I get the feedback at the head the way I
> specify it at the tail. But maybe that's also just me... :D
>
> On Fri, 31 Jul 2015 at 14:00 Gyula Fóra <gyf...@apache.org> wrote:
>
> > Hey,
> >
> > I am not sure what is the intuitive behaviour here. As you are not applying
> > a transformation on the feedback stream but pass it to a closeWith method,
> > I thought it was somehow natural that it gets the partitioning of the
> > iteration input, but maybe it's not intuitive.
> >
> > If others also think that preserving feedback partitioning should be the
> > default I am not against it :)
> >
> > Btw, this still won't make it very simple. We still need as many
> > source/sink pairs as we have different parallelism among the head
> > operators. Otherwise the forwarding logic won't work.
> >
> > Cheers,
> > Gyula
> >
> > Aljoscha Krettek <aljos...@apache.org> wrote (on Fri, 31 Jul 2015, 11:52):
> >
> > > Hi,
> > > I'm currently working on making the StreamGraph generation more centralized
> > > (i.e. not spread across the different API classes). The question is now why
> > > we need to switch to preserve partitioning? Could we not make "preserve"
> > > partitioning the default and if users want to have shuffle partitioning or
> > > anything they have to specify it manually when adding the feedback edge?
> > >
> > > This would make for a very simple scheme where the iteration sources are
> > > always connected to the heads using "forward" and the tails are connected
> > > to the iteration sinks using whatever partitioner was set by the user. This
> > > would make it more transparent than the current default of the "shuffle"
> > > between tails and iteration sinks.
> > >
> > > Cheers,
> > > Aljoscha
> > >
> > > P.S. I know we had quite some discussion about introducing "preserve
> > > partitioning" but now, when I think of it, it should be the default... :D