Yes, please.

Vasiliki Kalavri <vasilikikala...@gmail.com> wrote (on Wed, 25 Nov 2015, 14:37):

> So, do we all agree that the current behavior is not correct? Shall I open
> a JIRA about this?
>
> On 25 November 2015 at 13:58, Gyula Fóra <gyula.f...@gmail.com> wrote:
>
> > Well, it kind of depends on what definition of union we are using. If this
> > is a union in a set-theoretical way, we can argue that the union of a stream
> > with itself should be the same stream, because it contains exactly the same
> > elements with the same timestamps and lineage.
> >
> > On the other hand, stream and stream.map(id) are not exactly the same, as
> > they might have elements with different order (the lineage differs).
> >
> > So I wouldn't say that any self-union semantics is the only possible one.
> >
> > Gyula
> >
> > Bruecke, Christoph <christoph.brue...@campus.tu-berlin.de> wrote
> > (on Wed, 25 Nov 2015, 13:47):
> >
> > > Hi,
> > >
> > > the operation “stream.union(stream.map(id))” is equivalent to
> > > “stream.union(stream)”, isn’t it? So it might also duplicate the data.
> > >
> > > - Christoph
> > >
> > > > On 25 Nov 2015, at 11:24, Stephan Ewen <se...@apache.org> wrote:
> > > >
> > > > "stream.union(stream.map(..))" should definitely be possible. Not sure
> > > > why this is not permitted.
> > > >
> > > > "stream.union(stream)" would contain each element twice, so should
> > > > either give an error or actually union (or duplicate) elements...
> > > >
> > > > Stephan
> > > >
> > > >
> > > > On Wed, Nov 25, 2015 at 10:42 AM, Gyula Fóra <gyf...@apache.org> wrote:
> > > >
> > > >> Yes, I am not sure if this is the intentional behaviour. I think you
> > > >> are supposed to be able to do the things you described.
> > > >>
> > > >> stream.union(stream.map(..)) and things like this are fair operations.
> > > >> Also, maybe stream.union(stream) should just give stream instead of an
> > > >> error.
> > > >>
> > > >> Could someone comment on this who knows the reasoning behind the
> > > >> current mechanics?
> > > >>
> > > >> Gyula
> > > >>
> > > >> Vasiliki Kalavri <vasilikikala...@gmail.com> wrote
> > > >> (on Tue, 24 Nov 2015, 16:46):
> > > >>
> > > >>> Hi squirrels,
> > > >>>
> > > >>> when porting the gelly streaming code from 0.9 to 0.10 today with
> > > >>> Paris, we hit an exception in union: "*A DataStream cannot be unioned
> > > >>> with itself*".
> > > >>>
> > > >>> The code raising this exception looks like this:
> > > >>> stream.union(stream.map(...)).
> > > >>>
> > > >>> Taking a look into the union code, we see that it's now not allowed
> > > >>> to union a stream, not only with itself, but with any product of
> > > >>> itself.
> > > >>>
> > > >>> First, we are wondering, why is that? Does it make building the
> > > >>> stream graph easier in some way?
> > > >>> Second, we might want to give a better error message there, e.g.
> > > >>> "*A DataStream cannot be unioned with itself or a product of
> > > >>> itself*", and finally, we should update the docs, which currently
> > > >>> state that unioning a stream with itself is allowed and that "*If you
> > > >>> union a data stream with itself you will still only get each element
> > > >>> once.*"
> > > >>>
> > > >>> Cheers,
> > > >>> -Vasia.
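
For reference, here is a rough sketch of the two union semantics discussed in this thread. It is a toy model in plain Python lists and does not use or describe the Flink API: the (value, timestamp, lineage) tuples and the helper names map_identity, bag_union and set_union are all made up for illustration.

```python
# Toy model only: each element is (value, timestamp, lineage).
# This is an assumption for illustration, not how Flink represents records.
stream = [("a", 1, "src"), ("b", 2, "src")]


def map_identity(elements, op_name="map"):
    # Identity map: values and timestamps stay the same,
    # but the lineage records one extra operator.
    return [(v, ts, lineage + ">" + op_name) for (v, ts, lineage) in elements]


def bag_union(left, right):
    # Multiset-style union: every element of both inputs is kept.
    return left + right


def set_union(left, right):
    # Set-style union: elements with the same value, timestamp
    # and lineage collapse into one.
    return sorted(set(left) | set(right))


self_bag = bag_union(stream, stream)                   # each element twice
self_set = set_union(stream, stream)                   # equal to the stream
mapped_set = set_union(stream, map_identity(stream))   # lineage differs
```

Under the multiset reading, stream.union(stream) doubles every element. Under the set-style reading it collapses back to the stream, while the union with the identity-mapped stream still keeps both copies because their lineage differs.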