Thanks for bringing up this discussion, Kenn!

Definitely +1 for the proposal.

I have left some questions in the document :)

Best,
Jincheng

On Wed, Dec 11, 2019 at 5:23 AM Rui Wang <ruw...@google.com> wrote:

> Since I am not seeing any more comments on this proposal, can we consider
> it accepted by the Beam community?
>
> If it is accepted, I want to start a discussion on deprecating the old GROUP
> BY windowing style and keeping only table-valued function windowing.
>
>
> -Rui
>
> On Thu, Jul 25, 2019 at 11:32 AM Kenneth Knowles <k...@apache.org> wrote:
>
>> We hope it does enter the SQL standard. It is one reason for coming
>> together to write this paper.
>>
>> The OVER clause is mentioned often.
>>
>>  - TUMBLE can actually just be a function, so you don't need OVER or any
>> of the fancy stuff we propose; it is just done to make them all look similar
>>  - HOP still doesn't work, since the OVER clause produces one value per
>> input row; it is still a 1-to-1 input/output ratio
>>  - SESSION GAP 5 MINUTES (PARTITION BY key) is actually a natural syntax
>> that could work well
>>
>> None of them require ORDER, by design.
>>
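
To make the 1-to-1 versus 1-to-many point concrete, here is a minimal Python sketch of window assignment. The function names, the integer-seconds timestamps, and the half-open `[start, end)` convention are assumptions for illustration only; this is not Beam's or Calcite's implementation.

```python
from typing import List, Tuple

Window = Tuple[int, int]  # half-open interval [start, end), in seconds (assumed)


def tumble(ts: int, size: int) -> Window:
    """Assign a timestamp to the single fixed window that contains it."""
    start = ts - ts % size
    return (start, start + size)


def hop(ts: int, size: int, period: int) -> List[Window]:
    """Assign a timestamp to every sliding window that contains it."""
    windows = []
    # Latest window start at or before ts, aligned to the hop period.
    start = ts - ts % period
    # Walk backwards while the window [start, start + size) still covers ts.
    while start > ts - size:
        windows.append((start, start + size))
        start -= period
    return sorted(windows)


print(tumble(65, 60))   # exactly one window per input row
print(hop(65, 60, 30))  # several windows for the same input row
```

With a 60-second size and a 30-second period, `hop` returns two windows for one timestamp, which a one-value-per-row OVER clause cannot express, while `tumble` always returns exactly one.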
>> On the other hand, implementing the general OVER clause and the rank,
>> running sum, etc, could be done with GBK + sort values. That is not related
>> to windowing. And since in SQL users of windowing will think of OVER as
>> related to ordering, I personally don't want to also use it for something
>> that has nothing to do with ordering.
>>
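
As a rough illustration of the "GBK + sort values" idea, the Python sketch below groups rows by key, sorts each group, and then computes a row number and a running sum. The row layout and the function name `over_partition` are hypothetical, and the running sum is accumulated row by row (closer to a ROWS frame than to SQL's default RANGE frame).

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Row = Tuple[str, int, int]  # (key, order_col, value): assumed layout


def over_partition(rows: Iterable[Row]) -> List[Tuple[str, int, int, int, int]]:
    """Emulate ROW_NUMBER() and a running SUM(value)
    OVER (PARTITION BY key ORDER BY order_col)."""
    groups: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for key, order_col, value in rows:
        groups[key].append((order_col, value))  # the "group by key" step

    out = []
    for key in sorted(groups):
        running = 0
        # The "sort values" step; ties on order_col are broken by value.
        for n, (order_col, value) in enumerate(sorted(groups[key]), start=1):
            running += value
            out.append((key, order_col, value, n, running))
    return out


rows = [("a", 2, 10), ("b", 1, 5), ("a", 1, 7)]
for row in over_partition(rows):
    print(row)
```

Nothing here involves event-time windows: the output depends only on the key and the sort order, which is why this can be treated separately from windowing.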
>> But if you would write something up, that could be interesting to discuss
>> more.
>>
>> Kenn
>>
>> On Wed, Jul 24, 2019 at 2:24 PM Mingmin Xu <mingm...@gmail.com> wrote:
>>
>>> +1 to removing those magic words in Calcite streaming SQL, just because
>>> they're not SQL standard. The idea of replacing HOP/TUMBLE with
>>> table-valued functions makes it concise. My only question is: is it (or will
>>> it be) part of the SQL standard? I'm a big fan of aligning with standards :lol
>>>
>>> PS: although the concept of `window` used here is different from window
>>> functions in SQL, the syntax gives some insight. Take the example of
>>> `ROW_NUMBER() OVER (PARTITION BY COL1 ORDER BY COL2) AS row_number`:
>>> `ROW_NUMBER()` assigns a sequence value to records in the subgroup with key
>>> 'COL1'. We can introduce another function, like TUMBLE(), which will assign
>>> a window instance (more instances for HOP()) to the record.
>>>
>>> Mingmin
>>>
>>>
>>> On Sun, Jul 21, 2019 at 9:42 PM Manu Zhang <owenzhang1...@gmail.com>
>>> wrote:
>>>
>>>> Thanks Kenn,
>>>> great paper! I left some newbie questions on the proposal.
>>>>
>>>> Manu
>>>>
>>>> On Fri, Jul 19, 2019 at 1:51 AM Kenneth Knowles <k...@apache.org>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I recently had the great privilege to work with others from Beam plus
>>>>> Calcite and Flink SQL contributors to build a new and minimal proposal for
>>>>> adding streaming extensions to standard SQL: event time, watermarks,
>>>>> windowing, triggers, stream materialization.
>>>>>
>>>>> We hope this will influence the standards body and also Calcite,
>>>>> Flink, and other projects working on streaming SQL.
>>>>>
>>>>> I would like to start implementing these extensions in Beam, moving
>>>>> from our current streaming extensions to the new proposal.
>>>>>
>>>>>    The whole paper is https://arxiv.org/abs/1905.12133
>>>>>
>>>>>    My small proposal to start in Beam:
>>>>> https://s.apache.org/streaming-beam-sql
>>>>>
>>>>> TL;DR: replace `GROUP BY Tumble/Hop/Session` with table functions that
>>>>> do Tumble, Hop, Session. The details of why to make this change are
>>>>> explained in the appendix to my proposal. For the big picture of how it
>>>>> fits in, the full paper is best.
>>>>>
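
One way to picture the TL;DR is the Python sketch below: a table function takes rows in and returns the same rows with window columns appended, after which grouping is an ordinary GROUP BY on those columns. The names `tumble_table`, `wstart`, and `wend`, and the integer-seconds timestamps, are assumptions for this sketch rather than the syntax defined in the proposal.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple


def tumble_table(rows: Iterable[Dict], time_col: str, size: int) -> Iterator[Dict]:
    """Table function: yield each input row with wstart/wend columns appended."""
    for row in rows:
        start = row[time_col] - row[time_col] % size
        yield {**row, "wstart": start, "wend": start + size}


def count_per_window(rows: Iterable[Dict]) -> Dict[Tuple[int, int], int]:
    """Plain grouping on the appended columns; no special GROUP BY syntax."""
    return dict(Counter((r["wstart"], r["wend"]) for r in rows))


bids = [
    {"ts": 5, "price": 2},
    {"ts": 59, "price": 3},
    {"ts": 61, "price": 4},
]
print(count_per_window(tumble_table(bids, "ts", 60)))
```

Because the windowing step is just a relation-to-relation function, the grouping step needs no knowledge of Tumble, Hop, or Session at all.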
>>>>> Kenn
>>>>>
>>>>
>>>
>>> --
>>> ----
>>> Mingmin
>>>
>>
