Thanks for considering our needs.

I'm pretty sure that windows are in almost every streaming pipeline with aggregations. Unlike a regular Java API, SQL syntax is very difficult to deprecate.

We usually give Flink users 1-2 releases to update their code. Once Calcite supports polymorphic table functions, I think 6 months would be helpful; otherwise we would need to maintain our own fork, which we have mostly been able to avoid so far.

Regards,
Timo

On 29.04.20 00:49, Rui Wang wrote:
Agreed. I would like to get more feedback to have a
reasonable accommodation for users.


-Rui

On Mon, Apr 27, 2020 at 11:50 AM Julian Hyde <jh...@apache.org> wrote:

Changing my +1 to +0. We have to make reasonable accommodations for our
users. Glad we had this discussion.

On Apr 24, 2020, at 11:10 AM, Rui Wang <amaliu...@apache.org> wrote:

Hi Timo,

My intention is to fully drop concepts such as SqlGroupedWindowFunction and
auxiliary group functions, including the relevant code in the parser/syntax,
operators, planner, etc.

Since you mentioned the need for more time to migrate: how many Calcite
releases do you think would leave enough buffer time? (Calcite schedules
4 releases a year, so 2 releases would give 6 months.)


-Rui

On Fri, Apr 24, 2020 at 1:50 AM Timo Walther <twal...@apache.org> wrote:

Hi everyone,

so far Apache Flink depends on this feature. We are fine with improving
the SQL compliance and eventually dropping GROUP BY TUMBLE/HOP/SESSION
in the future. However, we would like to give our users some time to
migrate their existing pipelines.

What does dropping mean for Calcite? Will users of Calcite still be able to
support this syntax? In particular, are you intending to also drop
concepts such as SqlGroupedWindowFunction and auxiliary group functions?
Or are you intending to just remove entries from Calcite's default
operator table?

Regards,
Timo


On 24.04.20 10:30, Julian Hyde wrote:
+1

Let’s remove TUMBLE etc from the GROUP BY clause. Since this is a SQL
change, not an API change, I don’t think we need to give notice. Let’s just
do it.

Julian

On Apr 22, 2020, at 4:05 PM, Rui Wang <amaliu...@apache.org> wrote:

I made a mistake in the example above; the updated version is as follows:

// Table function windowing syntax.
SELECT
        product_id, count(*), window_start
FROM TABLE(TUMBLE(order, DESCRIPTOR(rowtime), INTERVAL '1' hour))
GROUP BY product_id, window_start

On Wed, Apr 22, 2020 at 2:31 PM Rui Wang <amaliu...@apache.org> wrote:

Hi community,

I want to kick off a discussion about deprecating grouped window functions
(GROUP BY TUMBLE/HOP/SESSION) now that table function windowing support
(FROM TABLE(TUMBLE/HOP/SESSION)) is becoming a thing [1]. The current state
of table function windowing is that TUMBLE support is checked in; HOP and
SESSION support is likely to be merged in 1.23.0.

A brief example of the two windowing syntaxes:

// Grouped window functions.
SELECT
       product_id, count(*), TUMBLE_START() as window_start
FROM order
GROUP BY product_id, TUMBLE(rowtime, INTERVAL '1' hour); // an hour-long fixed window size.

// Table function windowing syntax.
SELECT
        product_id, count(*), window_start
FROM TABLE(TUMBLE(order, DESCRIPTOR(.rowtime), INTERVAL '1' hour)
GROUP BY product_id

Here is a short, selective comparison:

The places where table function windowing behaves better:
1) no GROUPING/GROUP BY is enforced. Enforced grouping becomes a problem in
streaming JOIN. For example, one use case is to apply a JOIN on two streams
for each hour. In this case, no GROUP BY is needed.
2) grouped window functions allow multiple window calls in GROUP BY. For
example, GROUP BY TUMBLE(...), HOP(...), SESSION(...) is syntactically
valid, but it is an illegal query.
3) Calcite includes an Enumerable implementation of table function
windowing, while grouped window functions do not have one.
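
To make point 1) concrete, here is a rough Python sketch (not Calcite code) of a per-hour JOIN over two streams. It assumes each row is a dict with a "rowtime" field; the helper names tumble and windowed_join, the stream names, and the sample rows are all made up for illustration. Once every row carries its own window_start, the window is just part of the join key and no aggregation is involved.

```python
from datetime import datetime, timedelta

HOUR = timedelta(hours=1)
EPOCH = datetime(1970, 1, 1)  # assumption: windows are aligned to the epoch


def tumble(rows, size=HOUR):
    """Append window_start/window_end to each row (hypothetical helper)."""
    out = []
    for row in rows:
        offset = (row["rowtime"] - EPOCH) % size
        start = row["rowtime"] - offset
        out.append({**row, "window_start": start, "window_end": start + size})
    return out


def windowed_join(left, right, key):
    """Pair rows that share both the key and the window; no GROUP BY needed."""
    index = {}
    for r in right:
        index.setdefault((r[key], r["window_start"]), []).append(r)
    return [
        (l, r)
        for l in left
        for r in index.get((l[key], l["window_start"]), [])
    ]


orders = tumble([{"product_id": 1, "rowtime": datetime(2020, 4, 22, 14, 31)}])
shipments = tumble([
    {"product_id": 1, "rowtime": datetime(2020, 4, 22, 14, 50)},
    {"product_id": 1, "rowtime": datetime(2020, 4, 22, 15, 5)},
])
# Only the 14:50 shipment falls into the same hourly window as the order.
pairs = windowed_join(orders, shipments, "product_id")
```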


The places where table function windowing behaves worse:
1) table function windowing adds the "window_start" and "window_end" columns
to the table directly, which increases the volume of data (by number of
rows * sizeof(timestamp) * 2).
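
As a back-of-the-envelope illustration of that formula, here is a small Python sketch. The 8-byte timestamp size and the row count are assumptions chosen only for the example; real storage formats may differ.

```python
TIMESTAMP_BYTES = 8  # assumption: a timestamp stored as a 64-bit value


def window_column_overhead(num_rows, timestamp_bytes=TIMESTAMP_BYTES):
    """Extra bytes from adding window_start and window_end to every row."""
    return num_rows * timestamp_bytes * 2


# One million rows gain two extra timestamp columns.
overhead = window_column_overhead(1_000_000)
```

Under these assumptions the two extra columns add 16,000,000 bytes (about 16 MB) per million rows.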


I want to focus on discussing two questions in this thread:
1) Do people support deprecating grouped window functions?
2) If the answer to 1) is yes, by which version should grouped window
functions be completely removed?



[1]: https://jira.apache.org/jira/browse/CALCITE-3271


-Rui






