By the way, a principle I would follow when composing windows is that all that 
comes out of a query is simply data values. I don’t think this is 
controversial. But I don’t want anyone to suggest that we pass fuzzy concepts 
like “window instances” from sub-query to enclosing query.

Julian


> On May 18, 2017, at 11:53 AM, Julian Hyde <jh...@apache.org> wrote:
> 
> Have you considered using TUMBLE_START? This is an inclusive bound, so you 
> should be able to compose windows.
> 
> It’s possible that TUMBLE_START isn’t returning the right value yet, e.g. see 
> https://insight.io/github.com/apache/calcite/blob/a11d14054e9c1d2ce22f60e11536f1885faaae7c/core/src/main/java/org/apache/calcite/sql2rel/AuxiliaryConverter.java#L56
>  
> <https://insight.io/github.com/apache/calcite/blob/a11d14054e9c1d2ce22f60e11536f1885faaae7c/core/src/main/java/org/apache/calcite/sql2rel/AuxiliaryConverter.java#L56>
>  but in principle it’s the right thing to use.
> 
> Julian
> 
> 
>> On May 17, 2017, at 4:46 AM, Timo Walther <twal...@apache.org 
>> <mailto:twal...@apache.org>> wrote:
>> 
>> Hi everyone,
>> 
>> we are very happy to support TUMBLE/HOP/SESSION in our upcoming Flink 1.3 
>> release. However, there are some problems regarding nested window queries 
>> that we would like to discuss with the Calcite community.
>> 
>> Take the following query:
>> 
>> SELECT
>>  rowtime, SUM(x)
>> FROM (
>>  SELECT
>>     TUMBLE_END(rowtime, INTERVAL '2' MINUTE) AS rowtime,
>>     MIN(x) AS x
>>  FROM MyTable
>>  GROUP BY TUMBLE(rowtime, INTERVAL '2' MINUTE)
>> )
>> GROUP BY TUMBLE(rowtime, INTERVAL '1' HOUR)
>> 
>> 
>> Initially, we thought that we can use the xxx_END() group auxiliary 
>> functions to define the rowtime for the upper query. However, according to 
>> http://calcite.apache.org/docs/stream.html 
>> <http://calcite.apache.org/docs/stream.html>, TUMBLE_END should return the 
>> timestamp of the exclusive window end, i.e., for a window of 1 hour that 
>> contains all elements from 12:00:00.000 until 12:59:59.999 (inclusive), 
>> TUMBLE_END would return 13:00:00.000. The problem is that Flink uses the 
>> inclusive window end as new timestamp. The reason for that is that if you do 
>> preaggregation with a window, say 5 minute windows which later will be 
>> aggregated into 1 hour windows, the last 5 minute window (from 12:55:00.000 
>> until 12:59:59.999 incl) would have a timestamp of 13:00:00.000 and fall 
>> into the next window starting at 13:00:00.000.
>> 
>> 
>> The question is how Calcite is planning to support nested windows. Right now 
>> we see the following options:
>> 
>> - TUMBLE_END returns the inclusive window end
>> 
>> - we introduce an additional group auxiliary function for the inclusive 
>> window end like: SELECT TUMBLE_TIME(rowtime, INTERVAL '2' MINUTE) AS rowtime 
>> ...
>> 
>> - we allow references to the window in the select: SELECT TUMBLE(rowtime, 
>> INTERVAL '1' HOUR) AS rowtime ...
>> 
>> What do you think?
>> 
>> 
>> Regards,
>> 
>> Timo
>> 
>> 
>> 
>> 
> 

Reply via email to