Since no further comments have come in on this proposal, can we consider it accepted by the Beam community?
If it is accepted, I would like to start a discussion on deprecating the old GROUP BY windowing style and keeping only table-valued function windowing.

-Rui

On Thu, Jul 25, 2019 at 11:32 AM Kenneth Knowles <k...@apache.org> wrote:

> We hope it does enter the SQL standard. It is one reason for coming
> together to write this paper.
>
> OVER clause is mentioned often.
>
> - TUMBLE can actually just be a function so you don't need OVER or any of
> the fancy stuff we propose; it is just done to make them all look similar
> - HOP still doesn't work since OVER clause has one value per input row,
> it is still 1 to 1 input/output ratio
> - SESSION GAP 5 MINUTES (PARTITION BY key) is actually a natural syntax
> that could work well
>
> None of them require ORDER, by design.
>
> On the other hand, implementing the general OVER clause and the rank,
> running sum, etc, could be done with GBK + sort values. That is not related
> to windowing. And since in SQL users of windowing will think of OVER as
> related to ordering, I personally don't want to also use it for something
> that has nothing to do with ordering.
>
> But if you would write up something that could be interesting to discuss
> more.
>
> Kenn
>
> On Wed, Jul 24, 2019 at 2:24 PM Mingmin Xu <mingm...@gmail.com> wrote:
>
>> +1 to remove those magic words in Calcite streaming SQL, just because
>> they're not SQL standard. The idea to replace HOP/TUMBLE with
>> table-view-functions makes it concise, my only question is, is it (or will
>> it be) part of SQL standard? --I'm a big fan to align with standards :lol
>>
>> Ps, although the concept of `window` used here is different from window
>> function in SQL, the syntax gives some insight. Take the example of
>> `ROW_NUMBER() OVER (PARTITION BY COL1 ORDER BY COL2) AS row_number`,
>> `ROW_NUMBER()` assigns a sequence value for records in subgroup with
>> key 'COL1'. We can introduce another function, like TUMBLE() which will
>> assign a window instance (more instances for HOP()) for the record.
>>
>> Mingmin
>>
>>
>> On Sun, Jul 21, 2019 at 9:42 PM Manu Zhang <owenzhang1...@gmail.com>
>> wrote:
>>
>>> Thanks Kenn,
>>> great paper and left some newbie questions on the proposal.
>>>
>>> Manu
>>>
>>> On Fri, Jul 19, 2019 at 1:51 AM Kenneth Knowles <k...@apache.org> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I recently had the great privilege to work with others from Beam plus
>>>> Calcite and Flink SQL contributors to build a new and minimal proposal for
>>>> adding streaming extensions to standard SQL: event time, watermarks,
>>>> windowing, triggers, stream materialization.
>>>>
>>>> We hope this will influence the standards body and also Calcite and
>>>> Flink and other projects working on streaming SQL.
>>>>
>>>> I would like to start implementing these extensions in Beam, moving
>>>> from our current streaming extensions to the new proposal.
>>>>
>>>> The whole paper is https://arxiv.org/abs/1905.12133
>>>>
>>>> My small proposal to start in Beam:
>>>> https://s.apache.org/streaming-beam-sql
>>>>
>>>> TL;DR: replace `GROUP BY Tumble/Hop/Session` with table functions that
>>>> do Tumble, Hop, Session. The details of why to make this change are
>>>> explained in the appendix to my proposal. For the big picture of how it
>>>> fits in, the full paper is best.
>>>>
>>>> Kenn
>>>>
>>>
>>
>> --
>> ----
>> Mingmin
>>
>
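
For anyone skimming the thread, the point quoted above (TUMBLE is 1 to 1, while HOP can produce several windows per input row) can be sketched in a few lines of Python. This is only an illustration of window assignment, not Beam SQL or Calcite code: the names `tumble_windows` and `hop_windows`, the integer-second timestamps, and the epoch-aligned window boundaries are all assumptions made for the example.

```python
from typing import List, Tuple

# A window is a half-open interval [start, end) in seconds (assumed unit).
Window = Tuple[int, int]


def tumble_windows(ts: int, size: int) -> List[Window]:
    """Tumbling windows: every timestamp falls into exactly one window."""
    start = ts - ts % size
    return [(start, start + size)]


def hop_windows(ts: int, size: int, period: int) -> List[Window]:
    """Hopping windows: a timestamp may fall into several overlapping windows."""
    windows = []
    # Latest window start at or before ts, aligned to the hop period.
    start = ts - ts % period
    # Walk backwards while the window [start, start + size) still contains ts.
    while start > ts - size:
        windows.append((start, start + size))
        start -= period
    return sorted(windows)


print(tumble_windows(7, 5))   # one window for the row
print(hop_windows(7, 10, 5))  # two overlapping windows for the same row
```

A table function can return one output row per assigned window, so the HOP case simply yields two rows for the input at timestamp 7. A function evaluated once per row in an OVER clause has to return a single value, which is the mismatch described in the quoted reply.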