Hi,

Thanks for these points. I am not sure I fully understood the implications of using this option in stream mode. I get the point that if we have 20 rows then we have 20 outputs. However, I wonder what happens when a new record comes in and we have the query you proposed:
SELECT STREAM orderId, price, AVG(price) OVER (ORDER BY orderTime ROWS 5 PRECEDING)
FROM Orders

Assume that up to moment T we have the following 5 orders: ordN1, ordN2, ordN3, ordN4, ordN5, and that we get ordN6 at moment T+1. Will the query produce only one result, corresponding to ordN6, and thus the average over ordN2, ordN3, ordN4, ordN5, ordN6? Or, because ordN2 to ordN5 are still in the system, will the query return 5 results?

If the answer is 1 output in this case, corresponding to element ordN6, then indeed it can do the job for this scenario.

-----Original Message-----
From: Julian Hyde [mailto:jh...@apache.org]
Sent: Tuesday, September 27, 2016 2:41 AM
To: dev@calcite.apache.org
Subject: Re: New type of window semantics

Have you considered the sliding window, which is already part of standard SQL? We propose to support it in streaming SQL also. Here is an example:

SELECT orderId, price, AVG(price) OVER (ORDER BY orderTime ROWS 5 PRECEDING)
FROM Orders

(This is a non-streaming query, but you can add the STREAM keyword to get a streaming query.)

Given orders 1 .. 20, then order 10 would show the average for orders 5 .. 10 inclusive, order 11 would show the average for orders 6 .. 11, and so forth.

In streaming queries, windows are often used in the GROUP BY clause, but we do not use a GROUP BY here. The OVER clause with sliding windows does not aggregate rows. If 20 rows come in, then 20 rows go out. It makes sense, because each row cannot have its own window if multiple rows are squashed into one.

Julian

> On Sep 26, 2016, at 12:53 AM, Radu Tudoran <radu.tudo...@huawei.com> wrote:
>
> Hi,
>
> First of all let me introduce myself - My name is Radu Tudoran and I am
> working in the field of Big Data processing with a high focus on streaming
> and more recently in the area of SQL.
>
> I wanted to raise a question/proposal for discussion in the community:
>
> Based on our requirements I realized that I would need to create a window
> (e.g. hop window) that would move on every incoming element. The syntax
> that I have in mind for it is
>
> HOP(column_name, # EVENT, INTERVAL #) (or should it rather be # ELEMENT
> instead of EVENT?)
>
> I wanted to check with you what you think about adding such a grammar
> directly to Calcite. I think it is relevant for streaming scenarios where you
> do not necessarily have events coming at regular time intervals but you would
> still like to react to every event.
>
> As an example you can consider a stock market application where you would
> always compute for every new offer the average over the last hour.
>
> Best regards,
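
For illustration, the sliding-window behaviour discussed in this thread can be sketched in plain Python. This is only a minimal model under stated assumptions, not how Calcite evaluates the query: the function name sliding_avg and the sample prices are hypothetical, and rows are assumed to arrive already ordered by orderTime. It relies on the standard SQL rule that ROWS 5 PRECEDING is shorthand for ROWS BETWEEN 5 PRECEDING AND CURRENT ROW, so the frame holds the current row plus up to five earlier rows.

```python
from collections import deque


def sliding_avg(orders, preceding=5):
    """Yield one (order_id, price, avg) tuple per incoming order.

    Hypothetical model of AVG(price) OVER (ORDER BY orderTime ROWS 5 PRECEDING).
    Assumes `orders` is already sorted by orderTime.
    """
    # The frame is the current row plus `preceding` earlier rows.
    frame = deque(maxlen=preceding + 1)
    for order_id, price in orders:
        frame.append(price)  # the oldest price drops out once the frame is full
        yield order_id, price, sum(frame) / len(frame)


# Made-up sample data: ordN1 .. ordN7 with prices 10, 20, ..., 70.
orders = [(f"ordN{i}", float(10 * i)) for i in range(1, 8)]
results = list(sliding_avg(orders))
for row in results:
    print(row)
```

In this sketch each arriving order produces exactly one output row, so the arrival of ordN6 yields a single result. With a frame of ROWS 5 PRECEDING that result averages ordN1 through ordN6; a frame covering only ordN2 through ordN6 would correspond to ROWS 4 PRECEDING.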