Since no further comments have come in on this proposal, can we consider
it accepted by the Beam community?

If it is accepted, I want to start a discussion on deprecating the old GROUP
BY windowing style and keeping only table-valued function windowing.
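
To make the table-valued style concrete, here is a minimal Python sketch. It
is not Beam or Calcite code; the names `tumble`, `hop`, `wstart` and `wend`
are made up for illustration, and timestamps are assumed to be plain
integers. Each function takes rows and returns rows with window columns
appended, so grouping by window becomes an ordinary GROUP BY on those columns.

```python
from collections import Counter

def tumble(rows, timecol, dur):
    """Hypothetical table function: one output row per input row."""
    for row in rows:
        start = row[timecol] - row[timecol] % dur
        yield {**row, "wstart": start, "wend": start + dur}

def hop(rows, timecol, dur, period):
    """Hypothetical table function: one output row per window containing the row."""
    for row in rows:
        t = row[timecol]
        start = t - t % period          # latest window start that is <= t
        while start > t - dur:          # window [start, start + dur) still contains t
            yield {**row, "wstart": start, "wend": start + dur}
            start -= period

# Illustrative input only; integer event times.
rows = [{"key": "a", "ts": 3}, {"key": "a", "ts": 12}, {"key": "b", "ts": 14}]

tumbled = list(tumble(rows, "ts", 10))
hopped = list(hop(rows, "ts", 10, 5))

# Plain grouping on the appended window columns, no special GROUP BY syntax.
counts = Counter((r["wstart"], r["wend"]) for r in tumbled)
```

With these inputs, `tumble` keeps a 1 to 1 input/output ratio, while `hop`
emits each row once per overlapping window, which is the case a per-row
function cannot express.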


-Rui

On Thu, Jul 25, 2019 at 11:32 AM Kenneth Knowles <k...@apache.org> wrote:

> We hope it does enter the SQL standard. It is one reason for coming
> together to write this paper.
>
> OVER clause is mentioned often.
>
>  - TUMBLE can actually just be a function so you don't need OVER or any of
> the fancy stuff we propose; it is just done to make them all look similar
>  - HOP still doesn't work, since the OVER clause has one value per input
> row; it is still a 1 to 1 input/output ratio
>  - SESSION GAP 5 MINUTES (PARTITION BY key) is actually a natural syntax
> that could work well
>
> None of them require ORDER, by design.
>
> On the other hand, implementing the general OVER clause and the rank,
> running sum, etc, could be done with GBK + sort values. That is not related
> to windowing. And since in SQL users of windowing will think of OVER as
> related to ordering, I personally don't want to also use it for something
> that has nothing to do with ordering.
>
> But if you would write something up, that could be interesting to discuss
> more.
>
> Kenn
>
> On Wed, Jul 24, 2019 at 2:24 PM Mingmin Xu <mingm...@gmail.com> wrote:
>
>> +1 to remove those magic words in Calcite streaming SQL, just because
>> they're not SQL standard. The idea to replace HOP/TUMBLE with
>> table-valued functions makes it concise. My only question is: is it (or
>> will it be) part of the SQL standard? I'm a big fan of aligning with
>> standards :lol
>>
>> PS: although the concept of `window` used here is different from the
>> window function in SQL, the syntax gives some insight. Take the example of
>> `ROW_NUMBER() OVER (PARTITION BY COL1 ORDER BY COL2) AS row_number`:
>> `ROW_NUMBER()` assigns a sequence value to records in the subgroup with
>> key 'COL1'. We can introduce another function, like TUMBLE(), which will
>> assign a window instance (more instances for HOP()) to the record.
>>
>> Mingmin
>>
>>
>> On Sun, Jul 21, 2019 at 9:42 PM Manu Zhang <owenzhang1...@gmail.com>
>> wrote:
>>
>>> Thanks Kenn,
>>> great paper. I left some newbie questions on the proposal.
>>>
>>> Manu
>>>
>>> On Fri, Jul 19, 2019 at 1:51 AM Kenneth Knowles <k...@apache.org> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I recently had the great privilege to work with others from Beam plus
>>>> Calcite and Flink SQL contributors to build a new and minimal proposal for
>>>> adding streaming extensions to standard SQL: event time, watermarks,
>>>> windowing, triggers, stream materialization.
>>>>
>>>> We hope this will influence the standards body and also Calcite and
>>>> Flink and other projects working on streaming SQL.
>>>>
>>>> I would like to start implementing these extensions in Beam, moving
>>>> from our current streaming extensions to the new proposal.
>>>>
>>>>    The whole paper is https://arxiv.org/abs/1905.12133
>>>>
>>>>    My small proposal to start in Beam:
>>>> https://s.apache.org/streaming-beam-sql
>>>>
>>>> TL;DR: replace `GROUP BY Tumble/Hop/Session` with table functions that
>>>> do Tumble, Hop, Session. The details of why to make this change are
>>>> explained in the appendix to my proposal. For the big picture of how it
>>>> fits in, the full paper is best.
>>>>
>>>> Kenn
>>>>
>>>
>>
>> --
>> ----
>> Mingmin
>>
>
