pvary commented on code in PR #8553:
URL: https://github.com/apache/iceberg/pull/8553#discussion_r1398107792
##########
flink/v1.17/flink/src/main/java/org/apache/iceberg/flink/source/IcebergSource.java:
##########
@@ -453,6 +492,18 @@ public IcebergSource<T> build() {
         contextBuilder.project(FlinkSchemaUtil.convert(icebergSchema, projectedFlinkSchema));
       }
+      SerializableRecordEmitter<T> emitter = SerializableRecordEmitter.defaultEmitter();
+      if (watermarkColumn != null) {

Review Comment:
   The focus of this feature is correct watermark generation. We need to make sure that watermarks are emitted in order, but that does not automatically mean that records must be emitted in order too. These are two different aspects of a data stream.

   In the case of combined splits, we do not advance the watermark, so they cause no issues with respect to watermark generation.

   Users can decide whether record out-of-orderness is a problem for them. If it is, they can set the configuration. If they have enough memory to keep the state, they may instead decide that reading speed (combining files into splits) is more important than reading files in order.
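
To illustrate the distinction between watermark order and record order, here is a minimal sketch in plain Java. It does not use the Iceberg or Flink APIs: `WatermarkOrderSketch`, `CombinedSplit`, `Output`, and `read` are hypothetical names made up for this example. It assumes that a watermark is emitted only when a new split starts, that its value is the smallest timestamp in that split, and that a record counts as late only if its timestamp is strictly below the current watermark. These assumptions are a simplified reading of the behaviour described above, not a statement of how the PR implements it.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: plain Java, not the Iceberg or Flink API. All names are hypothetical.
final class WatermarkOrderSketch {

  /** Hypothetical combined split: several files, each a list of event timestamps. */
  static final class CombinedSplit {
    final List<List<Long>> files;

    CombinedSplit(List<List<Long>> files) {
      this.files = files;
    }

    /** Smallest timestamp in any file of the split (assumed to be known up front). */
    long minTimestamp() {
      long min = Long.MAX_VALUE;
      for (List<Long> file : files) {
        for (long ts : file) {
          min = Math.min(min, ts);
        }
      }
      return min;
    }
  }

  /** What the simulated reader produced. */
  static final class Output {
    final List<Long> watermarks = new ArrayList<>();
    final List<Long> records = new ArrayList<>();
    boolean sawLateRecord = false;
  }

  /** Reads splits in the given order; the watermark moves only at split boundaries. */
  static Output read(List<CombinedSplit> splits) {
    Output out = new Output();
    long currentWatermark = Long.MIN_VALUE;
    for (CombinedSplit split : splits) {
      long candidate = split.minTimestamp();
      if (candidate > currentWatermark) {
        currentWatermark = candidate;
        out.watermarks.add(candidate);
      }
      // Records are emitted file by file, in whatever order the files hold them.
      for (List<Long> file : split.files) {
        for (long ts : file) {
          if (ts < currentWatermark) {
            out.sawLateRecord = true; // simplified notion of "late"
          }
          out.records.add(ts);
        }
      }
    }
    return out;
  }

  /** Two combined splits whose files overlap in time inside each split. */
  static List<CombinedSplit> exampleSplits() {
    return List.of(
        new CombinedSplit(List.of(List.of(30L, 10L), List.of(20L, 40L))),
        new CombinedSplit(List.of(List.of(50L, 70L), List.of(60L))));
  }

  public static void main(String[] args) {
    Output out = read(exampleSplits());
    System.out.println("watermarks: " + out.watermarks);
    System.out.println("records:    " + out.records);
    System.out.println("late:       " + out.sawLateRecord);
  }
}
```

In this toy run the watermarks increase monotonically while the records inside each combined split arrive out of order, and no record falls behind the watermark. A downstream operator that needs strictly ordered records would still have to buffer them in state, which is the trade-off left to the user.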