Hi everyone,

Thanks everyone for this healthy discussion. I think we have addressed all the concerns, so I would like to proceed with the vote. If you have any new objections, feel free to let me know.

Best,
Jark

On Sat, 10 Oct 2020 at 17:54, Jark Wu <imj...@gmail.com> wrote:

> Hi Jingsong,
>
> That's a good question. I searched a lot and didn't find any system that provides such an out-of-the-box function.
> I guess the reason is that in traditional batch systems this feature is supported by the over window, so they don't need to invent a new function/syntax for it.
> For streaming systems, we are the first to propose this new window.
>
> However, I think CUMULATE is a good name, because almost all databases call such scenarios a "cumulative window", e.g. Snowflake [1], SQL Server [2], Postgres [3].
> Thus we chose "cumulative" as the base name, but use the verb form "cumulate" because other window function names are also verbs, e.g. tumble, hop.
>
> I hope this addresses your concern.
>
> Best,
> Jark
>
> [1]: https://docs.snowflake.com/en/sql-reference/functions-analytic.html#cumulative-window-frame-examples
> [2]: https://docs.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql?view=sql-server-ver15#c-producing-a-moving-average-and-cumulative-total
> [3]: https://popsql.com/learn-sql/postgresql/how-to-calculate-cumulative-sum-running-total-in-postgresql
>
> On Sat, 10 Oct 2020 at 17:26, Jingsong Li <jingsongl...@gmail.com> wrote:
>
>> +1 for voting. Thanks Jark for driving.
>>
>> +1 for TVF. It has been put forward in theory and is supported by Calcite. It will greatly enhance window-related operations.
>>
>> My personal feeling is that after TVF, the following operations can be similar to traditional batch SQL, as long as the window-related attributes are included in the key.
>>
>> I am not sure about the CUMULATE window. Yes, it's a common requirement. Is there any more evidence (other systems) to prove this word ("CUMULATE") is appropriate?
>>
>> Best,
>> Jingsong
>>
>> On Sat, Oct 10, 2020 at 3:43 PM Jark Wu <imj...@gmail.com> wrote:
>>
>> > Hi Pengcheng,
>> >
>> > IIUC, the "stream operators" you mean are the non-time operators, also called regular operators, such as regular join and regular aggregate.
>> > But you may have misunderstood me: only the time operators can't be applied after the new window operators, because of the missing time attributes.
>> > The regular operators can still be applied after the new window operators.
>> >
>> > Regarding using window TVFs to re-assign event time and watermarks, I'm not sure about this,
>> > because assigning a watermark requires defining the watermark strategy, and the window TVF doesn't provide such an ability.
>> > Polymorphic table functions are table functions which just append additional columns and convert N rows into M rows; they can't touch meta information.
>> >
>> > Best,
>> > Jark
>> >
>> > On Sat, 10 Oct 2020 at 15:41, Jark Wu <imj...@gmail.com> wrote:
>> >
>> > > Hi Danny,
>> > >
>> > > Thanks for the hint about the named params syntax. I added examples with named params in the FLIP.
>> > >
>> > > Best,
>> > > Jark
>> > >
>> > > On Sat, 10 Oct 2020 at 15:03, Pengcheng Liu <pengchengliucr...@gmail.com> wrote:
>> > >
>> > >> Hi, Jark,
>> > >>
>> > >> I have a different opinion here. I think it's a very common use case to use window operators in combination with streaming operators (even time operators),
>> > >> e.g. for some tables users only care about data within a period, but for other tables they may want the whole historical data.
>> > >> The pipeline may look like this:
>> > >> window join -> dimension table join -> stream aggregate -> stream sort
>> > >>
>> > >> Just as you said, the key clause can be used to distinguish whether an operator should be translated to a window operator or a streaming operator.
>> > >>
>> > >> Also, as I've mentioned before:
>> > >> 1) For a time operator after window aggregation, the auxiliary function which is used to access the time attribute column can actually be replaced with (window_end - 1). We only need to make the results of the upstream contain a time column whose range is within (window_start, window_end), and thus the downstream time operators can work on it (driven by the original watermark in the source).
>> > >> 2) For a time operator after other window operators, the downstream time operators can access the time column directly from its input.
>> > >>
>> > >> One more thought: maybe the window TVFs can re-assign timestamps and watermarks, so that in cases where the watermark cannot be retrieved from the source directly (it may need some conversion), the watermark can still be assigned dynamically in SQL (using the time column as the watermark column) and thus make it work. I think this can save much time revising the event time column in some cases (this is a real demand in our production environment).
>> > >>
>> > >> I strongly suggest that we support the combined usage of window operators and streaming operators. And I think we can achieve this with little work.
>> > >>
>> > >> Best,
>> > >> Pengcheng
>> > >>
>> > >> On Sat, Oct 10, 2020 at 1:45 PM, Jark Wu <imj...@gmail.com> wrote:
>> > >>
>> > >>> Hi Benchao,
>> > >>>
>> > >>> That's a good question.
>> > >>>
>> > >>> IMO, the new windowed operators and the current time operators are two different sets of functions, just like time operators and non-time operators are two different sets of functions.
>> > >>> I think it's fine if we don't support integrating them, just like time operators can't be applied on a non-windowed aggregate.
>> > >>> If users want to use time operators in the whole pipeline, they can use the grouped window aggregates instead of the window TVFs.
>> > >>>
>> > >>> The key idea of window TVF is that all the operators in the pipeline are based on the **windows**.
>> > >>> In terms of syntax, if the key clause (e.g. group by, partitioned by, join on, order by) contains window_start and window_end, it can be translated into windowed operators.
>> > >>> Thus, we will have windowed CEP, windowed sort, and windowed over aggregate in the future to make it possible to build a windowed pipeline.
>> > >>>
>> > >>> But I think we can elaborate on the integration more in the future if users need it. Actually, I don't fully understand the scenario of integrating window TVF and time operators at this point.
>> > >>> For example, interval joining an input stream and a window join result: I don't see why it can't be expressed by a nested window join and why users have to use an interval join here.
>> > >>> Maybe we can wait for more input from users when the window TVF is released and elaborate on it again.
>> > >>>
>> > >>> Best,
>> > >>> Jark
>> > >>>
>> > >>> On Sat, 10 Oct 2020 at 12:01, 刘 芃成 <pengchengliucr...@gmail.com> wrote:
>> > >>>
>> > >>> > Hi, Benchao,
>> > >>> > I think I got your point. Actually, in the current implementation of group window aggregation, the value of the time attributes (e.g.
>> > >>> > TUMBLE_ROWTIME/TUMBLE_PROCTIME) is calculated as (window_end - 1), so I think we can just use it directly if you need this. But I think this time attribute is mainly meant for cascaded window operations.
>> > >>> > Regarding the example you provided, I think the semantics of the SQL in your example, which does an interval join (e.g. with TUMBLE_ROWTIME) after window aggregation, is not clear in the current implementation, and I think that's a strong reason why we need the new TVF syntax.
>> > >>> > With the new syntax, users should understand which time column to use and how to generate it when doing an interval join, etc.
>> > >>> >
>> > >>> > Best,
>> > >>> > Pengcheng
>> > >>> >
>> > >>> > From: Benchao Li <libenc...@apache.org>
>> > >>> > Date: Saturday, October 10, 2020, 11:02 AM
>> > >>> > To: pengcheng Liu <pengchengliucr...@gmail.com>
>> > >>> > Cc: dev <dev@flink.apache.org>
>> > >>> > Subject: Re: [DISCUSS] FLIP-145: Support SQL windowing table-valued function
>> > >>> >
>> > >>> > Hi pengcheng,
>> > >>> >
>> > >>> > Thanks for your response.
>> > >>> > I knew that the original time attribute column will be retained after the TVF; what I'm questioning is how we get the time attribute column after aggregation.
>> > >>> > Your answer did not remove my doubts about this.
>> > >>> >
>> > >>> > It's OK if we do not plan to integrate the new TVF aggregate with the old "time attribute scenarios" listed in my previous email in this FLIP. However, it would be good to elaborate more on this and leave it to the future plan.
>> > >>> >
>> > >>> > On Sat, Oct 10, 2020 at 10:45 AM, pengcheng Liu <pengchengliucr...@gmail.com> wrote:
>> > >>> > Hi, Benchao,
>> > >>> > In TVFs, the time attributes are just passed through from the parent rels, and the TVFs just add two additional window attributes (i.e. window_start & window_end). Also, I think the time column can be not only a time attribute of type `TimeIndicatorType` but also a regular column of type `Timestamp`.
>> > >>> >
>> > >>> > For cascaded window operations, we can use the window_start/window_end of the previous window result directly to indicate operating on the same window, or use a new DESCRIPTOR column to assign new windows in case the time column changes (e.g. in some cases the original timestamp is inaccurate and needs some conversion before it can be used).
>> > >>> >
>> > >>> > You can check the definition or signature of these TVFs in the FLIP, e.g.
>> > >>> > SELECT * FROM TABLE(
>> > >>> >   TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES))
>> > >>> > In the example, `bidtime` is the time attribute column, which is the first operand of the DESCRIPTOR function.
>> > >>> >
>> > >>> > +1 to start voting.
>> > >>> >
>> > >>> > On Sat, Oct 10, 2020 at 10:08 AM, Benchao Li <libenc...@apache.org> wrote:
>> > >>> > Hi Jark,
>> > >>> >
>> > >>> > 2 & 3 sound good to me.
>> > >>> >
>> > >>> > Regarding the time attribute,
>> > >>> > I still have some questions. I know it's easy to support cascaded window aggregates using the new TVFs.
>> > >>> > However, there are some other places that need a time attribute:
>> > >>> > - CEP
>> > >>> > - interval join
>> > >>> > - order by
>> > >>> > - over window
>> > >>> > If there is no time attribute column, how do we integrate these old features with the new TVFs?
>> > >>> > E.g.
>> > >>> > StreamA -> new window aggregate -> interval join -> Sink
>> > >>> >                                            /
>> > >>> > StreamB -----------------------------------
>> > >>> >
>> > >>> > On Fri, Oct 9, 2020 at 11:51 PM, Jark Wu <imj...@gmail.com> wrote:
>> > >>> > Hi Benchao,
>> > >>> >
>> > >>> > 1) time attribute
>> > >>> > Yes. We don't need a time attribute auxiliary function, because the new window operations are all based on the window_start and window_end columns instead of on the time attributes. So we don't need to propagate time attributes.
>> > >>> > A cascaded window aggregate can be expressed by simply GROUP BY the window_start and window_end of the previous window result.
>> > >>> > I have added a cascaded window aggregate example in the Tumbling Window section in the FLIP.
>> > >>> > If you want to define a proctime window aggregate, the time column in the TVF should be a proctime attribute field (or the PROCTIME() function).
>> > >>> >
>> > >>> > 2) batch support
>> > >>> > Yes. The proposed syntax/API is unified for batch and streaming. Batch support is in the plan, but we may not have enough time to catch up with 1.12.
>> > >>> >
>> > >>> > 3) support `grouping sets`
>> > >>> > This is not included in the FLIP, but I think it would be great if we can support `grouping sets`.
>> > >>> > The existing window impl doesn't support this because we convert the LogicalAggregate into WindowAggregate in the beginning, so the expand grouping sets rule can't be applied in this situation.
>> > >>> > Fortunately, with the new window impl, the conversion to WindowAggregate will happen at the end, so I think the expand rule can be applied and support this feature naturally.
>> > >>> > Therefore, IMO, we don't need to include this feature in this FLIP, to avoid the FLIP being too large.
>> > >>> > This can be a follow-up issue (maybe just adding tests and docs) after the FLIP.
>> > >>> >
>> > >>> > Best,
>> > >>> > Jark
>> > >>> >
>> > >>> > On Fri, 9 Oct 2020 at 19:09, 刘 芃成 <pengchengliucr...@gmail.com> wrote:
>> > >>> >
>> > >>> > > Hi, Benchao,
>> > >>> > > Welcome to the discussion. Yes, this new syntax can make SQL clearer and simpler.
>> > >>> > > For your first question, the `window_start` and `window_end` columns will be added automatically, so we don't need to use auxiliary group functions to infer or access the window properties.
>> > >>> > >
>> > >>> > > For `grouping sets` on TVFs, I think it would be interesting if we can support it, as we already support `grouping sets` on streaming aggregates in the blink planner. But I'm not sure if it will be included in this FLIP.
>> > >>> > >
>> > >>> > > cc @Jark Wu
>> > >>> > >
>> > >>> > > Best,
>> > >>> > > Pengcheng
>> > >>> > >
>> > >>> > > On 2020/10/9 at 5:25 PM, "Benchao Li" <libenc...@apache.org> wrote:
>> > >>> > >
>> > >>> > > Thanks Jark for bringing up this discussion, I like this FLIP very much.
>> > >>> > >
>> > >>> > > Especially the cumulate window: it's much like the current TUMBLE window + Fast Emit (which is an undocumented experimental feature), however, it's more powerful.
>> > >>> > >
>> > >>> > > And this will make the SQL semantics more standard, especially for the HOPPING window.
>> > >>> > >
>> > >>> > > Regarding the time attribute,
>> > >>> > > It seems that we don't need a specific function to infer the time attribute, like `TUMBLE_ROWTIME` / `TUMBLE_PROCTIME`. Then are the `window_start` and `window_end` columns time attribute columns automatically?
>> > >>> > > - If not, what will be the time attribute of the result relation of these TVFs? Especially after the window aggregation.
>> > >>> > > - If yes, then how do we handle proctime?
>> > >>> > >
>> > >>> > > Regarding batch operators,
>> > >>> > > It's great to hear that we can reuse the batch operators in continuous batch mode as you mentioned in the FLIP.
>> > >>> > > The current window aggregate can also be used in batch mode with rowtime. Do you plan to support these TVFs for batch mode in this FLIP? Since Table/SQL is a unified API, it would be great if we can keep the features complete in both streaming and batch mode.
>> > >>> > >
>> > >>> > > There is one more question, I don't know whether it should be considered in this FLIP.
>> > >>> > > Does the new window support `grouping sets`? (It's not supported in the old window impl.)
>> > >>> > >
>> > >>> > > On Fri, Oct 9, 2020 at 4:14 PM, Jark Wu <imj...@gmail.com> wrote:
>> > >>> > >
>> > >>> > > > Hi all,
>> > >>> > > >
>> > >>> > > > I know we have a lot of discussion and development going on right now, but it would be great if we can get FLIP-145 into a votable state.
>> > >>> > > > If there are no objections, I would like to start voting in the next few days.
>> > >>> > > >
>> > >>> > > > Best,
>> > >>> > > > Jark
>> > >>> > > >
>> > >>> > > > On Thu, 1 Oct 2020 at 14:29, Jark Wu <imj...@gmail.com> wrote:
>> > >>> > > >
>> > >>> > > > > Hi everyone,
>> > >>> > > > >
>> > >>> > > > > I have added a section for Performance Optimization to describe how to improve the performance in the short term and long term, and to sketch the future performance potential under the new window API.
>> > >>> > > > > Introducing the window API is just the first step; we will continuously improve the performance to make it powerful and useful.
>> > >>> > > > >
>> > >>> > > > > Best,
>> > >>> > > > > Jark
>> > >>> > > > >
>> > >>> > > > > On Thu, 1 Oct 2020 at 14:28, Jark Wu <imj...@gmail.com> wrote:
>> > >>> > > > >
>> > >>> > > > >> Hi Pengcheng,
>> > >>> > > > >>
>> > >>> > > > >> Yes, the window TVF is part of the FLIP. You are welcome to contribute and join the discussion.
>> > >>> > > > >> Regarding the SESSION window aggregation, users can use the existing grouped session window function.
>> > >>> > > > >>
>> > >>> > > > >> Best,
>> > >>> > > > >> Jark
>> > >>> > > > >>
>> > >>> > > > >> On Sun, 27 Sep 2020 at 21:24, liupengcheng <pengchengliucr...@gmail.com> wrote:
>> > >>> > > > >>
>> > >>> > > > >>> Hi Jark,
>> > >>> > > > >>> Thanks for the reply. Yes, I think it's a good feature; it can improve the NRT scenarios as you mentioned in the FLIP.
>> > >>> > > > >>> Also, I think it can improve streaming SQL greatly: it can support richer window operations in Flink SQL and bring great convenience to users (we currently only support group windows in Flink).
>> > >>> > > > >>>
>> > >>> > > > >>> Regarding the SESSION window, I think it's especially useful for user behavior analysis (e.g. counting user visits on a news website or social platform), but I agree that we can keep it out of the FLIP for now to catch up with 1.12.
>> > >>> > > > >>>
>> > >>> > > > >>> Recently, I've done some work on the stream planner with the TVFs, and I'm willing to contribute to this part. Is it in the plan of this FLIP?
>> > >>> > > > >>>
>> > >>> > > > >>> Best,
>> > >>> > > > >>> PengchengLiu
>> > >>> > > > >>>
>> > >>> > > > >>> On 2020/9/26 at 11:09 PM, "Jark Wu" <imj...@gmail.com> wrote:
>> > >>> > > > >>>
>> > >>> > > > >>> Hi pengcheng,
>> > >>> > > > >>>
>> > >>> > > > >>> It's great to see that you also have the need for window join.
>> > >>> > > > >>> You are right, the windowing TVF is a powerful feature which can support more operations in the future.
>> > >>> > > > >>> I think of it as the date-time "partition" selection in batch SQL jobs; with this new syntax, I think it is possible to migrate traditional batch SQL jobs to Flink SQL by changing a few lines.
>> > >>> > > > >>>
>> > >>> > > > >>> Regarding the SESSION window, it is kept out of the FLIP on purpose, because we want to keep the FLIP small to catch up with 1.12, and a SESSION TVF is rarely useful (e.g. session window join?).
>> > >>> > > > >>>
>> > >>> > > > >>> Best,
>> > >>> > > > >>> Jark
>> > >>> > > > >>>
>> > >>> > > > >>> On Fri, 25 Sep 2020 at 22:59, liupengcheng <pengchengliucr...@gmail.com> wrote:
>> > >>> > > > >>>
>> > >>> > > > >>> > Hi, Jark,
>> > >>> > > > >>> > I'm very interested in this feature, and I've also been working on this recently.
>> > >>> > > > >>> > I just had a glance at the FLIP. It's good, but I found that there is no plan to add SESSION windows.
>> > >>> > > > >>> > Also, I think there are more things we can do based on this new syntax. For example,
>> > >>> > > > >>> > - window sort support
>> > >>> > > > >>> > - window union/intersect/minus support
>> > >>> > > > >>> > - Improve dimension table join
>> > >>> > > > >>> > We can have a deeper discussion on this new feature later.
>> > >>> > > > >>> > I've also recently opened a Jira issue that is related to this feature:
>> > >>> > > > >>> > https://issues.apache.org/jira/browse/FLINK-18830
>> > >>> > > > >>> >
>> > >>> > > > >>> > Best!
>> > >>> > > > >>> > PengchengLiu
>> > >>> > > > >>> >
>> > >>> > > > >>> > On 2020/9/25 at 10:30 PM, "Jark Wu" <imj...@gmail.com> wrote:
>> > >>> > > > >>> >
>> > >>> > > > >>> > Hi everyone,
>> > >>> > > > >>> >
>> > >>> > > > >>> > I want to start a FLIP about supporting windowing table-valued functions (TVF).
>> > >>> > > > >>> > The main purpose of this FLIP is to improve the near real-time (NRT) experience of Flink.
>> > >>> > > > >>> >
>> > >>> > > > >>> > FLIP-145:
>> > >>> > > > >>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-145%3A+Support+SQL+windowing+table-valued+function
>> > >>> > > > >>> >
>> > >>> > > > >>> > We want to introduce TUMBLE, HOP, and CUMULATE windowing TVFs; CUMULATE is a new kind of window.
>> > >>> > > > >>> > With the windowing TVFs, we can support richer operations on windows, including window join, window TopN and so on.
>> > >>> > > > >>> > This makes things simple: we only need to assign windows at the beginning of the query, and then apply operations after that like in traditional batch SQL.
>> > >>> > > > >>> > We hope it can help to reduce the learning curve of windows, improve NRT for Flink, and attract more batch users.
>> > >>> > > > >>> >
>> > >>> > > > >>> > A simple code snippet for a 10-minute tumbling window aggregate:
>> > >>> > > > >>> >
>> > >>> > > > >>> > SELECT window_start, window_end, SUM(price)
>> > >>> > > > >>> > FROM TABLE(
>> > >>> > > > >>> >   TUMBLE(TABLE Bid, DESCRIPTOR(bidtime), INTERVAL '10' MINUTES))
>> > >>> > > > >>> > GROUP BY window_start, window_end;
>> > >>> > > > >>> >
>> > >>> > > > >>> > I'm looking forward to your feedback.
>> > >>> > > > >>> >
>> > >>> > > > >>> > Best,
>> > >>> > > > >>> > Jark
>> > >>> > >
>> > >>> > > --
>> > >>> > >
>> > >>> > > Best,
>> > >>> > > Benchao Li
>> > >>> >
>> > >>> > --
>> > >>> >
>> > >>> > Best,
>> > >>> > Benchao Li
>> > >>> >
>> > >>> > --
>> > >>> >
>> > >>> > Best,
>> > >>> > Benchao Li
>>
>> --
>> Best, Jingsong Lee
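
For readers following the thread, the window semantics discussed above can be illustrated with a small Python sketch. It is not Flink code and not taken from the FLIP: the helper names `tumble`, `cumulate`, and `window_rowtime` are hypothetical, timestamps are assumed to be epoch milliseconds, no window offset is modeled, and the CUMULATE behavior (one shared window start per maximum size, with the end growing by a fixed step) is an assumption about how a cumulative window is usually defined. It shows how a row would get its `window_start`/`window_end` columns, and the `window_end - 1` time value mentioned in the thread.

```python
from typing import List, Tuple

# A window is (window_start, window_end) in epoch milliseconds, end exclusive.
Window = Tuple[int, int]


def tumble(ts: int, size: int) -> Window:
    """Assign a timestamp to the single fixed-size window that contains it."""
    start = ts - ts % size
    return (start, start + size)


def cumulate(ts: int, step: int, max_size: int) -> List[Window]:
    """Assign a timestamp to every cumulating window that contains it.

    All windows share the start of the enclosing max_size window; their ends
    grow by `step` until max_size is reached (assumed semantics, see lead-in).
    """
    if max_size % step != 0:
        raise ValueError("max_size must be a multiple of step")
    start = ts - ts % max_size
    # End of the smallest cumulating window that still contains ts.
    first_end = start + ((ts - start) // step + 1) * step
    return [(start, end) for end in range(first_end, start + max_size + 1, step)]


def window_rowtime(window: Window) -> int:
    """Time value of a window result as described in the thread: window_end - 1."""
    return window[1] - 1


MINUTE = 60_000
bid_time = 7 * MINUTE  # a bid that arrives 7 minutes after the epoch

tumbling = tumble(bid_time, 10 * MINUTE)
cumulating = cumulate(bid_time, 2 * MINUTE, 10 * MINUTE)
```

Grouping rows by the resulting `(window_start, window_end)` pair is what the thread means by putting the window columns in the key clause; a cascaded aggregate would group again on the same pair from the previous result.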