Yes thanks Perhaps my example is too simple
*select xx, count(), sum() from ttt group by xx* Why the querie value can't be calculated each 2 secs / waterMark arrive ? I'm try to find the video of: http://flink-forward.org/kb_ sessions/scaling-stream-processing-with-apache-flink-to-very-large-state/ 2016-11-07 22:02 GMT+01:00 Fabian Hueske <fhue...@gmail.com>: > First of all, the document only proposes semantics for Flink's support of > relational queries on streams. > It does not describe the implementation and in fact most of it is not > implemented. > > How the queries will be executed would depend on the definition of the > table, i.e., whether the tables are derived in append or replace mode. > For the second query we do not necessarily need to "store all events as > is" but could do some pre-aggregation depending on the configured update > rate. > Watermarks will be used to track time in a query, i.e., to evaluate a > predicate like "*BETWEEN now() - INTERVAL '1' HOUR AND now()"* where > now() would be the current watermark time. > > There are a couple of tricks one can play to reduce the memory > requirements and the implementation should try to optimize for that. > However, it is true that for some queries we will need to keep the > complete input relation (within its time bounds) as state. > The good news is that Flink is very good a managing large state and can > easily scale to hundreds of nodes. > > Did that answer your questions? > > 2016-11-07 21:33 GMT+01:00 Alberto Ramón <a.ramonporto...@gmail.com>: > >> From "Relational Queries on Data Stream in Apache Flink" > Bounday >> Memory Requirements >> (https://docs.google.com/document/d/1qVVt_16kdaZQ8RTfA_f4kon >> QPW4tnl8THw6rzGUdaqU/edit#) >> >> >> *SELECT user, page, COUNT(page) AS pCntFROM pageviews* >> >> *GROUP BY user, page* >> >> *-Versus-* >> >> >> *SELECT user, page, COUNT(page) AS pCntFROM pageviews* >> >> *WHERE rowtime BETWEEN now() - INTERVAL '1' HOUR AND now() // only last >> hour* >> >> *GROUP BY user, page* >> >> I understand: >> >> - Not use WaterMark to pre-calculate agrr, and save memory >> - Store all events "as is" until the end of window >> >> are My assumptions true ? >> >> >