Hi community,

I want to kick off a discussion about deprecating grouped window functions (GROUP BY TUMBLE/HOP/SESSION) now that table function windowing (FROM TABLE(TUMBLE/HOP/SESSION)) is becoming available [1]. As for the current state of table function windowing: TUMBLE support is checked in, and HOP and SESSION support is likely to be merged in 1.23.0.

A brief example of the two windowing syntaxes:

Grouped window functions:

    SELECT product_id, count(*), TUMBLE_START(rowtime, INTERVAL '1' hour) AS window_start
    FROM order
    GROUP BY product_id, TUMBLE(rowtime, INTERVAL '1' hour); -- an hour-long fixed window size

Table function windowing syntax:

    SELECT product_id, count(*), window_start
    FROM TABLE(TUMBLE(TABLE order, DESCRIPTOR(rowtime), INTERVAL '1' hour))
    GROUP BY product_id, window_start;

Here is a short, selective comparison.

Places where table function windowing behaves better:

1) It does not force a GROUP BY. The GROUP BY that grouped window functions require becomes a problem for streaming JOINs. For example, one use case is to apply a JOIN on two streams for each hour. In that case, no GROUP BY is needed.
2) Grouped window functions allow multiple window calls in one GROUP BY. For example, GROUP BY TUMBLE(...), HOP(...), SESSION(...) is syntactically valid SQL, but it is an illegal query.
3) Calcite includes an Enumerable implementation of table function windowing, while grouped window functions do not have one.

Places where table function windowing behaves worse:

1) Table function windowing adds "window_start" and "window_end" columns to the table directly, which increases the volume of data by (number of rows * sizeof(timestamp) * 2).

I want to focus on two questions in this thread:

1) Do people support deprecating grouped window functions?
2) If the answer to 1) is yes, by which version would people prefer grouped window functions to be removed completely?

[1]: https://jira.apache.org/jira/browse/CALCITE-3271

-Rui