Fabian, ultimately, I just want to perform a join on the last values for each key.
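[Editor's note: a minimal plain-Python sketch of the semantics discussed below — reduce two append-only changelogs to their latest value per key, then join those "upsert tables" on the key. This only simulates what the LAST_VAL view plus a join would compute; it is not Flink API code, and all names (`latest_per_key`, `join_latest`, the sample data) are illustrative.]

```python
def latest_per_key(changelog):
    """Materialize an append-only changelog into a last-value-per-key table."""
    table = {}
    for key, value in changelog:  # later records overwrite earlier ones
        table[key] = value
    return table

def join_latest(changelog_a, changelog_b):
    """Inner-join the latest values of two changelogs on their key."""
    a = latest_per_key(changelog_a)
    b = latest_per_key(changelog_b)
    return {k: (a[k], b[k]) for k in a.keys() & b.keys()}

users = [("u1", "Alice"), ("u2", "Bob"), ("u1", "Alicia")]  # u1 is updated
orders = [("u1", "order-7"), ("u3", "order-9")]

print(join_latest(users, orders))  # {'u1': ('Alicia', 'order-7')}
```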
On Tue, Aug 6, 2019 at 8:07 PM Maatary Okouya <maatarioko...@gmail.com> wrote:

> Fabian,
>
> could you please clarify the following statement:
>
>     However, joining an append-only table with this view without adding a
>     temporal join condition means that the stream is fully materialized as
>     state. This is because previously emitted results must be updated when
>     the view changes. It really depends on the semantics of the join and
>     the query you need how much state the query will have to maintain.
>
> I am not sure I understand the problem. If I have two append-only tables
> and perform a join on them, what is the issue?
>
> On Tue, Aug 6, 2019 at 8:03 PM Maatary Okouya <maatarioko...@gmail.com> wrote:
>
>> Thank you for the clarification. Really appreciated.
>>
>> Is LAST_VAL part of the API?
>>
>> On Fri, Aug 2, 2019 at 10:49 AM Fabian Hueske <fhue...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Flink does not distinguish between streams and tables. For the Table API
>>> / SQL, there are only tables that change over time, i.e., dynamic tables.
>>> A stream in the Kafka Streams or KSQL sense is, in Flink, a table with
>>> append-only changes, i.e., records are only inserted and never deleted
>>> or modified.
>>> A table in the Kafka Streams or KSQL sense is, in Flink, a table with
>>> upsert and delete changes, i.e., the table has a unique key and records
>>> are inserted, deleted, or updated per key.
>>>
>>> In the current version, Flink does not have native support for ingesting
>>> an upsert stream as a dynamic table (right now only append-only tables
>>> can be ingested; native support for upsert tables will be added soon).
>>> However, you can create a view with the following SQL query on an
>>> append-only table that produces an upsert table:
>>>
>>> SELECT key, LAST_VAL(v1), LAST_VAL(v2), ...
>>> FROM appendOnlyTable
>>> GROUP BY key
>>>
>>> Given this view, you can run all kinds of SQL queries on it.
>>> However, joining an append-only table with this view without adding a
>>> temporal join condition means that the stream is fully materialized as
>>> state. This is because previously emitted results must be updated when
>>> the view changes. It really depends on the semantics of the join and the
>>> query you need how much state the query will have to maintain.
>>>
>>> An alternative to the Table API / SQL and its dynamic table abstraction
>>> is Flink's DataStream API and ProcessFunctions. These APIs are more
>>> low-level and expose access to state and timers, which are the core
>>> ingredients of stream processing. You can implement pretty much all of
>>> the logic of KStreams, and more, with these APIs.
>>>
>>> Best, Fabian
>>>
>>> On Tue., Jul 23, 2019 at 13:06, Maatary Okouya <maatarioko...@gmail.com> wrote:
>>>
>>>> I would like to have a KTable, or in Flink terms a dynamic table, that
>>>> only contains the latest value for each keyed record. This would allow
>>>> me to perform aggregations and joins based on the latest state of every
>>>> record, as opposed to every record over time or over a period of time.
>>>>
>>>> On Sun, Jul 21, 2019 at 8:21 AM miki haiat <miko5...@gmail.com> wrote:
>>>>
>>>>> Can you elaborate more about your use case?
>>>>>
>>>>> On Sat, Jul 20, 2019 at 1:04 AM Maatary Okouya <maatarioko...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have been a user of Kafka Streams so far. However, I have run into
>>>>>> several limitations, in particular when performing joins on KTables.
>>>>>>
>>>>>> I was wondering what the approach in Flink is to achieve (1) the
>>>>>> concept of a KTable, i.e., a table that represents a changelog with
>>>>>> only the latest version of each keyed record, and (2) joining such
>>>>>> tables.
>>>>>> There are currently a lot of limitations around that in Kafka
>>>>>> Streams, and I need this for an ETL process where I mirror entire
>>>>>> databases into Kafka and then join the tables to emit logical
>>>>>> entities into Kafka topics. I was hoping I could achieve that by
>>>>>> using Flink as an intermediary.
>>>>>>
>>>>>> I can see that you support all kinds of joins, but I just don't see
>>>>>> the notion of a KTable.
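[Editor's note: a plain-Python sketch of Fabian's point about join state above — to keep a non-temporal join result correct under updates, the operator must retain the latest row of every key from both inputs indefinitely, and re-emit previously emitted results when either side changes. This simulates the behavior only; it is not Flink API code, and the class and method names are illustrative.]

```python
class LastValueJoin:
    """Toy two-input upsert join: all state is kept forever (fully materialized)."""

    def __init__(self):
        self.left = {}   # key -> latest left value, never evicted
        self.right = {}  # key -> latest right value, never evicted

    def on_left(self, key, value):
        self.left[key] = value
        if key in self.right:  # (re-)emit the joined result for this key
            return ("upsert", key, value, self.right[key])
        return None            # no match yet, nothing to emit

    def on_right(self, key, value):
        self.right[key] = value
        if key in self.left:
            return ("upsert", key, self.left[key], value)
        return None

j = LastValueJoin()
print(j.on_left("u1", "Alice"))     # None: no matching right row yet
print(j.on_right("u1", "order-7"))  # ('upsert', 'u1', 'Alice', 'order-7')
print(j.on_left("u1", "Alicia"))    # ('upsert', 'u1', 'Alicia', 'order-7')
```

Note how the third call must update the result emitted by the second one: this is exactly why previously emitted results force the join to hold both sides as state.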