Hello all,

I am using the iceberg-flink-runtime library to read an Iceberg table into a Flink DataStream, with Glue as the catalog. I use the Flink Table API to build and query an Iceberg table, and then use toDataStream to convert it into a DataStream<Row>. Here is the code:
Table table = streamTableEnv.from(<table>).select(..).where(...);
DataStream<Row> stream = streamTableEnv.toDataStream(table);
stream.executeAndCollect();

I have observed that the table construction and the stream construction (the first two lines of code above) are quite slow, taking 6 to 7 seconds. Debugging and profiling revealed some inefficiencies. streamTableEnv.toDataStream does not use the cachingCatalog created and attached to the streamTableEnv, so it hits the external catalog multiple times. The toDataStream call also creates a new DummyStreamExecEnv and all the related objects again. I think this is where the latency is coming from.

Has anyone experienced this? I would appreciate any suggestions for overcoming the slowness.

Thank you,
Chetas