I mean that the head operators have different parallelism:

IterativeDataStream ids = ...

ids.map().setParallelism(2)
ids.map().setParallelism(4)

//...

ids.closeWith(feedback)

Aljoscha Krettek <aljos...@apache.org> ezt írta (időpont: 2015. júl. 31.,
P, 14:23):

> I thought about having some tighter restrictions here. My idea was to
> enforce that the feedback edges must have the same parallelism as the
> original input stream, otherwise shipping strategies such as "keyBy",
> "shuffle", "rebalance" don't seem to make sense because they would differ
> from the distribution of the original elements (at least IMHO). Maybe I'm
> wrong there, though.
>
> To me it seems intuitive that I get the feedback at the head they way I
> specify it at the tail. But maybe that's also just me... :D
>
> On Fri, 31 Jul 2015 at 14:00 Gyula Fóra <gyf...@apache.org> wrote:
>
> > Hey,
> >
> > I am not sure what is the intuitive behaviour here. As you are not
> applying
> > a transformation on the feedback stream but pass it to a closeWith
> method,
> > I thought it was somehow nature that it gets the partitioning of the
> > iteration input, but maybe its not intuitive.
> >
> > If others also think that preserving feedback partitioning should be the
> > default I am not against it :)
> >
> > Btw, this still won't make it very simple. We still need as many
> > source/sink pairs as we have different parallelism among the head
> > operators. Otherwise the forwarding logic wont work.
> >
> > Cheers,
> > Gyula
> >
> > Aljoscha Krettek <aljos...@apache.org> ezt írta (időpont: 2015. júl.
> 31.,
> > P, 11:52):
> >
> > > Hi,
> > > I'm currently working on making the StreamGraph generation more
> > centralized
> > > (i.e. not spread across the different API classes). The question is now
> > why
> > > we need to switch to preserve partitioning? Could we not make
> "preserve"
> > > partitioning the default and if users want to have shuffle partitioning
> > or
> > > anything they have to specify it manually when adding the feedback
> edge?
> > >
> > > This would make for a very simple scheme where the iteration sources
> are
> > > always connected to the heads using "forward" and the tails are
> connected
> > > to the iteration sinks using whatever partitioner was set by the user.
> > This
> > > would make it more transparent than the current default of the
> "shuffle"
> > > betweens tails and iteration sinks.
> > >
> > > Cheers,
> > > Aljoscha
> > >
> > > P.S. I now we had quite some discussion about introducing "preserve
> > > partitioning" but now, when I think of it it should be the default...
> :D
> > >
> >
>

Reply via email to