Yes, please

Vasiliki Kalavri <vasilikikala...@gmail.com> wrote (on Wed, 25 Nov 2015,
14:37):

> So, do we all agree that the current behavior is not correct? Shall I open
> a JIRA about this?
>
> On 25 November 2015 at 13:58, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Well, it kind of depends on which definition of union we are using. If
> > this is a union in the set-theoretic sense, we can argue that the union of
> > a stream with itself should be the same stream, because it contains exactly
> > the same elements with the same timestamps and lineage.
> >
> > On the other hand, stream and stream.map(id) are not exactly the same, as
> > they might have elements in a different order (the lineage differs).
> >
> > So I wouldn't say that any one self-union semantics is the only possible one.
> >
> > Gyula
> >
> > Bruecke, Christoph <christoph.brue...@campus.tu-berlin.de> wrote
> > (on Wed, 25 Nov 2015, 13:47):
> >
> > > Hi,
> > >
> > > the operation “stream.union(stream.map(id))” is equivalent to
> > > “stream.union(stream)” isn’t it? So it might also duplicate the data.
> > >
> > > - Christoph
> > >
> > >
> > > > On 25 Nov 2015, at 11:24, Stephan Ewen <se...@apache.org> wrote:
> > > >
> > > > "stream.union(stream.map(..))" should definitely be possible. Not sure
> > > > why this is not permitted.
> > > >
> > > > "stream.union(stream)" would contain each element twice, so should either
> > > > give an error or actually union (or duplicate) elements...
> > > >
> > > > Stephan
> > > >
> > > >
> > > > On Wed, Nov 25, 2015 at 10:42 AM, Gyula Fóra <gyf...@apache.org> wrote:
> > > >
> > > >> Yes, I am not sure if this is the intentional behaviour. I think you are
> > > >> supposed to be able to do the things you described.
> > > >>
> > > >> stream.union(stream.map(..)) and things like this are fair operations.
> > > >> Also, maybe stream.union(stream) should just give stream instead of an
> > > >> error.
> > > >>
> > > >> Could someone who knows the reasoning behind the current mechanics
> > > >> comment on this?
> > > >>
> > > >> Gyula
> > > >>
> > > >> Vasiliki Kalavri <vasilikikala...@gmail.com> wrote (on Tue, 24 Nov
> > > >> 2015, 16:46):
> > > >>
> > > >>> Hi squirrels,
> > > >>>
> > > >>> when porting the gelly streaming code from 0.9 to 0.10 today with
> > > >>> Paris, we hit an exception in union: "*A DataStream cannot be unioned
> > > >>> with itself*".
> > > >>>
> > > >>> The code raising this exception looks like this:
> > > >>> stream.union(stream.map(...)).
> > > >>>
> > > >>> Taking a look at the union code, we see that it's now not allowed to
> > > >>> union a stream, not only with itself, but with any product of itself.
> > > >>>
> > > >>> First, we are wondering, why is that? Does it make building the stream
> > > >>> graph easier in some way?
> > > >>> Second, we might want to give a better error message there, e.g. "*A
> > > >>> DataStream cannot be unioned with itself or a product of itself*", and
> > > >>> finally, we should update the docs, which currently state that unioning
> > > >>> a stream with itself is allowed and that "*If you union a data stream
> > > >>> with itself you will still only get each element once.*"
> > > >>>
> > > >>> Cheers,
> > > >>> -Vasia.
> > > >>>
> > > >>
> > >
> > >
> >
>
