You are right, but you should never try to compute full stream median, thats the point :D
On Tue, Apr 21, 2015 at 2:52 PM, Bruno Cadonna < cado...@informatik.hu-berlin.de> wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Hi Gyula, > > I read your comments of your PR. > > I have a question to this comment: > > "It only allows aggregations so we dont need to keep the full history > in a buffer." > > What if the user implements an aggregation function like a median? > > For a median you need the full history, don't you? > > Am I missing something? > > Cheers, > Bruno > > On 21.04.2015 14:31, Gyula Fóra wrote: > > I have opened a PR for this feature: > > > > https://github.com/apache/flink/pull/614 > > > > Cheers, Gyula > > > > On Tue, Apr 21, 2015 at 1:10 PM, Gyula Fóra <gyula.f...@gmail.com> > > wrote: > > > >> Thats a good idea, I will modify my PR to that :) > >> > >> Gyula > >> > >> On Tue, Apr 21, 2015 at 12:09 PM, Fabian Hueske > >> <fhue...@gmail.com> wrote: > >> > >>> Is it possible to switch the order of the statements, i.e., > >>> > >>> dataStream.every(Time.of(4,sec)).reduce(...) instead of > >>> dataStream.reduce(...).every(Time.of(4,sec)) > >>> > >>> I think that would be more consistent with the structure of the > >>> remaining API. > >>> > >>> Cheers, Fabian > >>> > >>> 2015-04-21 10:57 GMT+02:00 Gyula Fóra <gyf...@apache.org>: > >>> > >>>> Hi Bruno, > >>>> > >>>> Of course you can do that as well. (That's the good part :p > >>>> ) > >>>> > >>>> I will open a PR soon with the proposed changes (first > >>>> without breaking > >>> the > >>>> current Api) and I will post it here. > >>>> > >>>> Cheers, Gyula > >>>> > >>>> On Tuesday, April 21, 2015, Bruno Cadonna < > >>> cado...@informatik.hu-berlin.de > >>>>> > >>>> wrote: > >>>> > > Hi Gyula, > > > > I have a question regarding your suggestion. > > > > Can the current continuous aggregation be also specified with your > > proposed periodic aggregation? > > > > I am thinking about something like > > > > dataStream.reduce(...).every(Count.of(1)) > > > > Cheers, Bruno > > > > On 20.04.2015 22:32, Gyula Fóra wrote: > >>>>>>> Hey all, > >>>>>>> > >>>>>>> I think we are missing a quite useful feature that > >>>>>>> could be implemented (with some slight modifications) > >>>>>>> on top of the current windowing api. > >>>>>>> > >>>>>>> We currently provide 2 ways of aggregating (or > >>>>>>> reducing) over streams: doing a continuous aggregation > >>>>>>> and always output the aggregated value (which cannot be > >>>>>>> done properly in parallel) or doing aggregation in a > >>>>>>> window periodically. > >>>>>>> > >>>>>>> What we don't have at the moment is periodic > >>>>>>> aggregations on the whole stream. I would even go as > >>>>>>> far as to remove the continuous outputting > >>>>>>> reduce/aggregate it and replace it with this version > >>>>>>> as this in return can be done properly in parallel. > >>>>>>> > >>>>>>> My suggestion would be that a call: > >>>>>>> > >>>>>>> dataStream.reduce(..) dataStream.sum(..) > >>>>>>> > >>>>>>> would return a windowed data stream where the window is > >>>>>>> the whole record history, and the user would need to > >>>>>>> define a trigger to get the actual reduced values > >>>>>>> like: > >>>>>>> > >>>>>>> dataStream.reduce(...).every(Time.of(4,sec)) to get the > >>>>>>> actual reduced results. dataStream.sum(...).every(...) > >>>>>>> > >>>>>>> I think the current data stream reduce/aggregation is > >>>>>>> very confusing without being practical for any normal > >>>>>>> use-case. > >>>>>>> > >>>>>>> Also this would be a very api breaking change (but I > >>>>>>> would still make this change as it is much more > >>>>>>> intuitive than the current behaviour) so I would try to > >>>>>>> push it before the release if we can agree. > >>>>>>> > >>>>>>> Cheers, Gyula > >>>>>>> > > > >>>>> > >>>> > >>> > >> > >> > > > > - -- > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > Dr. Bruno Cadonna > Postdoctoral Researcher > > Databases and Information Systems > Department of Computer Science > Humboldt-Universität zu Berlin > > http://www.informatik.hu-berlin.de/~cadonnab > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.11 (GNU/Linux) > > iQEcBAEBAgAGBQJVNkgYAAoJEKdCIJx7flKw7rUIAMmu80ZuMvfA/BvQemkEo7As > bU3iWre+e3OUWNRLuf2JfG9CHMKFSjBJG6Jax/pWZBXTYh8oaYDrYixq7e+vljqf > P9ypurhd1h8In71aSUyUPIsrTg6aJ5xo/beUxA6LFbB2LpVqawNDe0gjn3ZRMobM > zmn962kqp0oHAVipYI2mzEU6RNl1Kh0PoaLaZRLRh+dlgKofqDFcBiB3hhG/VEoF > sCsCAsC1bXtpToPRZ29cRcEfpHcnE3zCgivPeG83JsWYr4mIEj7gp+smFUz0PjoI > 1wHv/pnZJS4Onk38HH1GcP95/uYpqm4gz3OBCuE7v+3b1bI852bIvnUZrCGLOew= > =u1R0 > -----END PGP SIGNATURE----- >