Ah, I see, so the job just auto-restarts to pick up the new schema. I'd love to understand how Paimon does this. They have a database sync action <https://paimon.apache.org/docs/master/flink/cdc-ingestion/mysql-cdc/#synchronizing-databases> which will sync entire databases, handle schema evolution, and, I'm pretty sure (I think I saw this in my local test), also pick up new tables.
https://github.com/apache/paimon/blob/master/paimon-flink/paimon-flink-cdc/src/main/java/org/apache/paimon/flink/action/cdc/SyncDatabaseActionBase.java#L45

I'm sure that Paimon table format is great, but at Wikimedia Foundation we are on the Iceberg train. Imagine if there was a flink-cdc full database sync to Flink IcebergSink!

On Thu, May 23, 2024 at 3:47 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:

> I will ask Marton about the slides.
>
> The solution was something like this in a nutshell:
> - Make sure that on job start the latest Iceberg schema is read from the
>   Iceberg table
> - Throw a SuppressRestartsException when data arrives with the wrong schema
> - Use Flink Kubernetes Operator to restart your failed jobs by setting
>   kubernetes.operator.job.restart.failed
>
> Thanks,
> Peter
>
> On Thu, May 23, 2024, 20:29 Andrew Otto <o...@wikimedia.org> wrote:
>
>> Wow, I would LOVE to see this talk. If there is no recording, perhaps
>> there are slides somewhere?
>>
>> On Thu, May 23, 2024 at 11:00 AM Carlos Sanabria Miranda <
>> sanabria.miranda.car...@gmail.com> wrote:
>>
>>> Hi everyone!
>>>
>>> I have found in the Flink Forward website the following presentation:
>>> "Self-service ingestion pipelines with evolving schema via Flink and Iceberg
>>> <https://www.flink-forward.org/seattle-2023/agenda#self-service-ingestion-pipelines-with-evolving-schema-via-flink-and-iceberg>"
>>> by Márton Balassi from the 2023 conference in Seattle, but I cannot find
>>> the recording anywhere. I have found the recordings of the other
>>> presentations in the Ververica Academy website
>>> <https://www.ververica.academy/app>, but not this one.
>>>
>>> Does anyone know where I can find it? Or at least the slides?
>>>
>>> We are using Flink with the Iceberg sink connector to write streaming
>>> events to Iceberg tables, and we are researching how to handle schema
>>> evolution properly. I saw that presentation and I thought it could be of
>>> great help to us.
>>>
>>> Thanks in advance!
>>>
>>> Carlos
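
For anyone skimming the thread later, here is a rough sketch of the first two steps Péter describes in his quoted reply: capture the schema once at job start, then fail hard when a record does not match it. This is only an illustration under stated assumptions, not the code from the talk. `SchemaGuardSketch` and `SchemaGuard` are hypothetical names, the schema is reduced to a plain list of column names instead of an Iceberg `Schema`, and the nested `SuppressRestartsException` is a local stand-in so the snippet compiles without Flink on the classpath (in a real job you would presumably throw `org.apache.flink.runtime.execution.SuppressRestartsException`, which also wraps a cause). The third step, `kubernetes.operator.job.restart.failed`, is operator configuration and is not shown here.

```java
import java.util.List;

public class SchemaGuardSketch {

    /**
     * Local stand-in (assumption) for Flink's SuppressRestartsException,
     * defined here only so the sketch is self-contained.
     */
    static class SuppressRestartsException extends RuntimeException {
        SuppressRestartsException(Throwable cause) {
            super("Unrecoverable failure, suppressing restarts", cause);
        }
    }

    /** Hypothetical guard holding the column names captured at job start. */
    static final class SchemaGuard {
        private final List<String> expectedColumns;

        SchemaGuard(List<String> columnsReadAtJobStart) {
            // In a real job this would come from loading the Iceberg table.
            this.expectedColumns = List.copyOf(columnsReadAtJobStart);
        }

        /** Throws if the record's columns differ from the captured schema. */
        void check(List<String> recordColumns) {
            if (!expectedColumns.equals(recordColumns)) {
                throw new SuppressRestartsException(
                        new IllegalStateException(
                                "Expected columns " + expectedColumns
                                        + " but record has " + recordColumns));
            }
        }
    }

    public static void main(String[] args) {
        SchemaGuard guard = new SchemaGuard(List.of("id", "name"));

        // Matching record: passes silently.
        guard.check(List.of("id", "name"));

        // Record with an extra column: the job should fail instead of retrying.
        try {
            guard.check(List.of("id", "name", "email"));
        } catch (SuppressRestartsException e) {
            System.out.println("Job would fail here: " + e.getCause().getMessage());
        }
    }
}
```

The idea, as I understand it from the thread, is that failing without Flink's own restart strategy kicking in lets something external (the Kubernetes operator) resubmit the job, and the fresh start then reads the updated table schema.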