Hi Henry, praiseAggr is an append table, so it contains "100,101,102,100,101,103,100". 1. if you change your sql to s"SELECT article_id FROM praise GROUP BY article_id", the answer is "101,102,103" 2. if you change your sql to s"SELECT last_value(article_id) FROM praise", the answer is "100"
Best, Hequn On Tue, Aug 21, 2018 at 8:52 PM, 徐涛 <happydexu...@gmail.com> wrote: > Hi Fabian, > Thanks for your response. This question puzzles me for quite a long time. > If the praiseAggr has the following value: > window-1 100,101,102 > window-2 100,101,103 > window-3 100 > > the last time the article table joins praiseAggr, which of the following > value does praiseAggr table has? > 1— 100,101,102,100,101,103,100 collect all the element of all > the window > 2— 100 the element of the latest window > 3— 101,102,103 the distinct value of all the window > > > Best, > Henry > > > 在 2018年8月21日,下午8:02,Fabian Hueske <fhue...@gmail.com> 写道: > > Hi, > > The semantics of a query do not depend on the way that it is used. > praiseAggr is a table that grows by one row per second and article_id. If > you use that table in a join, the join will fully materialize the table. > This is a special case because the same row is added multiple times, so > the state won't grow that quickly, but the performance will decrease > because for each row from article will join with multiple (a growing > number) of rows from praiseAggr. > > Best, Fabian > > 2018-08-21 12:19 GMT+02:00 徐涛 <happydexu...@gmail.com>: > >> Hi All, >> var praiseAggr = tableEnv.sqlQuery(s"SELECT article_id FROM praise GROUP >> BY HOP(updated_time, INTERVAL '1' SECOND,INTERVAL '3' MINUTE) , article_id" >> ) >> tableEnv.registerTable("praiseAggr", praiseAggr) >> >> var finalTable = tableEnv.sqlQuery(*s**”**SELECT 1 FROM article a join >> praiseAggr p on a.article_id=p.article_id" *) >> tableEnv.registerTable("finalTable", finalTable) >> >> I know that praiseAggr, if written to sink, is append mode , so if a >> table joins praiseAggr, what the table “see”, is a table contains the >> latest value, or a table that grows larger and larger? If it is the later, >> will it introduce performance problem? >> Thanks a lot. >> >> >> Best, >> Henry >> > > >