By the way, a principle I would follow when composing windows is that all that comes out of a query is simply data values. I don’t think this is controversial. But I don’t want anyone to suggest that we pass fuzzy concepts like “window instances” from sub-query to enclosing query.
Julian > On May 18, 2017, at 11:53 AM, Julian Hyde <jh...@apache.org> wrote: > > Have you considered using TUMBLE_START? This is an inclusive bound, so you > should be able to compose windows. > > It’s possible that TUMBLE_START isn’t returning the right value yet, e.g. see > https://insight.io/github.com/apache/calcite/blob/a11d14054e9c1d2ce22f60e11536f1885faaae7c/core/src/main/java/org/apache/calcite/sql2rel/AuxiliaryConverter.java#L56 > > <https://insight.io/github.com/apache/calcite/blob/a11d14054e9c1d2ce22f60e11536f1885faaae7c/core/src/main/java/org/apache/calcite/sql2rel/AuxiliaryConverter.java#L56> > but in principle it’s the right thing to use. > > Julian > > >> On May 17, 2017, at 4:46 AM, Timo Walther <twal...@apache.org >> <mailto:twal...@apache.org>> wrote: >> >> Hi everyone, >> >> we are very happy to support TUMBLE/HOP/SESSION in our upcoming Flink 1.3 >> release. However, there are some problems regarding nested window queries >> that we would like to discuss with the Calcite community. >> >> Take the following query: >> >> SELECT >> rowtime, SUM(x) >> FROM ( >> SELECT >> TUMBLE_END(rowtime, INTERVAL '2' MINUTE) AS rowtime, >> MIN(x) AS x >> FROM MyTable >> GROUP BY TUMBLE(rowtime, INTERVAL '2' MINUTE) >> ) >> GROUP BY TUMBLE(rowtime, INTERVAL '1' HOUR) >> >> >> Initially, we thought that we can use the xxx_END() group auxiliary >> functions to define the rowtime for the upper query. However, according to >> http://calcite.apache.org/docs/stream.html >> <http://calcite.apache.org/docs/stream.html>, TUMBLE_END should return the >> timestamp of the exclusive window end, i.e., for a window of 1 hour that >> contains all elements from 12:00:00.000 until 12:59:59.999 (inclusive), >> TUMBLE_END would return 13:00:00.000. The problem is that Flink uses the >> inclusive window end as new timestamp. The reason for that is that if you do >> preaggregation with a window, say 5 minute windows which later will be >> aggregated into 1 hour windows, the last 5 minute window (from 12:55:00.000 >> until 12:59:59.999 incl) would have a timestamp of 13:00:00.000 and fall >> into the next window starting at 13:00:00.000. >> >> >> The question is how Calcite is planning to support nested windows. Right now >> we see the following options: >> >> - TUMBLE_END returns the inclusive window end >> >> - we introduce an additional group auxiliary function for the inclusive >> window end like: SELECT TUMBLE_TIME(rowtime, INTERVAL '2' MINUTE) AS rowtime >> ... >> >> - we allow references to the window in the select: SELECT TUMBLE(rowtime, >> INTERVAL '1' HOUR) AS rowtime ... >> >> What do you think? >> >> >> Regards, >> >> Timo >> >> >> >> >