Thanks for bringing up this discussion, Kenn! Definitely +1 for the proposal.
I have left some questions in the documentation :)

Best,
Jincheng

On Wed, Dec 11, 2019 at 5:23 AM Rui Wang <ruw...@google.com> wrote:

> Since I am not seeing more people commenting on this proposal,
> can we consider it already accepted by the Beam community?
>
> If it is accepted, I want to start a discussion on deprecating the old GROUP
> BY windowing style and keeping only table-valued function windowing.
>
>
> -Rui
>
> On Thu, Jul 25, 2019 at 11:32 AM Kenneth Knowles <k...@apache.org> wrote:
>
>> We hope it does enter the SQL standard. It is one reason for coming
>> together to write this paper.
>>
>> OVER clause is mentioned often.
>>
>> - TUMBLE can actually just be a function so you don't need OVER or any
>> of the fancy stuff we propose; it is just done to make them all look similar
>> - HOP still doesn't work since OVER clause has one value per input row,
>> it is still 1 to 1 input/output ratio
>> - SESSION GAP 5 MINUTES (PARTITION BY key) is actually a natural syntax
>> that could work well
>>
>> None of them require ORDER, by design.
>>
>> On the other hand, implementing the general OVER clause and the rank,
>> running sum, etc, could be done with GBK + sort values. That is not related
>> to windowing. And since in SQL users of windowing will think of OVER as
>> related to ordering, I personally don't want to also use it for something
>> that has nothing to do with ordering.
>>
>> But if you would write something up, that could be interesting to discuss
>> more.
>>
>> Kenn
>>
>> On Wed, Jul 24, 2019 at 2:24 PM Mingmin Xu <mingm...@gmail.com> wrote:
>>
>>> +1 to remove those magic words in Calcite streaming SQL, just because
>>> they're not SQL standard. The idea to replace HOP/TUMBLE with
>>> table-view-functions makes it concise, my only question is, is it (or will
>>> it be) part of SQL standard?
>>> --I'm a big fan of aligning with standards :lol
>>>
>>> Ps, although the concept of `window` used here is different from the window
>>> function in SQL, the syntax gives some insight. Take the example of
>>> `ROW_NUMBER() OVER (PARTITION BY COL1 ORDER BY COL2) AS row_number`:
>>> `ROW_NUMBER()` assigns a sequence value for records in the subgroup with
>>> key 'COL1'. We can introduce another function, like TUMBLE(), which will
>>> assign a window instance (more instances for HOP()) for the record.
>>>
>>> Mingmin
>>>
>>>
>>> On Sun, Jul 21, 2019 at 9:42 PM Manu Zhang <owenzhang1...@gmail.com>
>>> wrote:
>>>
>>>> Thanks Kenn,
>>>> great paper and left some newbie questions on the proposal.
>>>>
>>>> Manu
>>>>
>>>> On Fri, Jul 19, 2019 at 1:51 AM Kenneth Knowles <k...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I recently had the great privilege to work with others from Beam plus
>>>>> Calcite and Flink SQL contributors to build a new and minimal proposal for
>>>>> adding streaming extensions to standard SQL: event time, watermarks,
>>>>> windowing, triggers, stream materialization.
>>>>>
>>>>> We hope this will influence the standards body and also Calcite and
>>>>> Flink and other projects working on streaming SQL.
>>>>>
>>>>> I would like to start implementing these extensions in Beam, moving
>>>>> from our current streaming extensions to the new proposal.
>>>>>
>>>>> The whole paper is https://arxiv.org/abs/1905.12133
>>>>>
>>>>> My small proposal to start in Beam:
>>>>> https://s.apache.org/streaming-beam-sql
>>>>>
>>>>> TL;DR: replace `GROUP BY Tumble/Hop/Session` with table functions that
>>>>> do Tumble, Hop, Session. The details of why to make this change are
>>>>> explained in the appendix to my proposal. For the big picture of how it
>>>>> fits in, the full paper is best.
>>>>>
>>>>> Kenn
>>>>>
>>>>
>>>
>>> --
>>> ----
>>> Mingmin
>>>
>>
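
The point made in the thread, that TUMBLE can be an ordinary per-row function while HOP cannot (one input row may fall into several windows, so the input/output ratio is no longer 1 to 1), can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `tumble` and `hop`, their signatures, and the alignment of windows to the Unix epoch are hypothetical and are not Beam, Calcite, or Flink APIs.

```python
from datetime import datetime, timedelta

# Assumption for this sketch: windows are aligned to the Unix epoch.
EPOCH = datetime(1970, 1, 1)


def tumble(ts, size):
    """Return the single fixed window [start, end) that contains ts."""
    start = ts - (ts - EPOCH) % size
    return (start, start + size)


def hop(ts, size, period):
    """Return every sliding window [start, end) that contains ts.

    Windows start every `period` and last `size`, so a single timestamp
    can belong to more than one window when size > period.
    """
    # Latest window start at or before ts, aligned to `period`.
    start = ts - (ts - EPOCH) % period
    windows = []
    # Walk backwards while the window still covers ts.
    while start + size > ts:
        windows.append((start, start + size))
        start -= period
    return sorted(windows)


if __name__ == "__main__":
    ts = datetime(2019, 7, 19, 1, 51)
    print(tumble(ts, timedelta(minutes=10)))
    for window in hop(ts, timedelta(minutes=10), timedelta(minutes=5)):
        print(window)
```

Here `tumble` yields exactly one window per row, which is why it fits a scalar function (or an OVER-style one-value-per-row form), whereas `hop` returns a list of windows per row, which is the shape a table-valued function can express.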