pvary commented on code in PR #9179:
URL: https://github.com/apache/iceberg/pull/9179#discussion_r1410606706
##########
docs/flink-queries.md:
##########
@@ -277,6 +277,58 @@ DataStream<Row> stream = env.fromSource(source,
WatermarkStrategy.noWatermarks()
"Iceberg Source as Avro GenericRecord", new
GenericRecordAvroTypeInfo(avroSchema));
```
+### Emitting watermarks
+Emitting watermarks from the source itself could be beneficial for several purposes, like harnessing the
+[Flink Watermark Alignment](https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/event-time/generating_watermarks/#watermark-alignment)
+feature to prevent runaway readers, or providing triggers for [Flink windowing](https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/operators/windows/).
Review Comment:
I think it is very important to understand that windowing and watermark
generation based on records could cause surprising results, especially with
batch reads or in backfill situations. Without this feature there is no
guarantee on the order in which the files are read. Window triggering will only
become reliable when the source controls the emitted watermarks.
I am not sure how detailed the description should be, but I think this is
important to note here, so I am open to suggestions if you think we
should add more detail.
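To make the concern concrete, here is a minimal sketch of what I mean. It is a hypothetical simulation, not Iceberg or Flink code: `RecordWatermarkSketch` and `countLateRecords` are made-up names, and the watermark logic is a simplified stand-in for a record-based generator with no allowed out-of-orderness. It only shows that the same two files can yield zero or many late records depending on the order in which they happen to be read.

```java
import java.util.List;

// Hypothetical simulation, not Iceberg or Flink code: it only mimics a
// record-based watermark that tracks the maximum timestamp seen so far.
class RecordWatermarkSketch {

    // Counts records whose timestamp is behind the watermark built from the
    // records read before them, i.e. records a window could treat as late.
    static int countLateRecords(List<long[]> filesInReadOrder) {
        long watermark = Long.MIN_VALUE;
        int late = 0;
        for (long[] file : filesInReadOrder) {
            for (long ts : file) {
                if (ts < watermark) {
                    late++;
                }
                watermark = Math.max(watermark, ts);
            }
        }
        return late;
    }

    public static void main(String[] args) {
        // Two files with disjoint event-time ranges (epoch milliseconds).
        long[] older = {1_000L, 2_000L, 3_000L};
        long[] newer = {10_000L, 11_000L, 12_000L};

        System.out.println("older file first: "
            + countLateRecords(List.of(older, newer)) + " late records");
        System.out.println("newer file first: "
            + countLateRecords(List.of(newer, older)) + " late records");
    }
}
```

In this sketch, reading the newer file first pushes the watermark past every record of the older file, which is the kind of surprise a backfill can hit when the read order is not controlled.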
##########
docs/flink-queries.md:
##########
@@ -277,6 +277,58 @@ DataStream<Row> stream = env.fromSource(source,
WatermarkStrategy.noWatermarks()
"Iceberg Source as Avro GenericRecord", new
GenericRecordAvroTypeInfo(avroSchema));
```
+### Emitting watermarks
+Emitting watermarks from the source itself could be beneficial for several purposes, like harnessing the
+[Flink Watermark Alignment](https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/event-time/generating_watermarks/#watermark-alignment)
+feature to prevent runaway readers, or providing triggers for [Flink windowing](https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/operators/windows/).
+
+Enable watermark generation for an `IcebergSource` by setting the `watermarkColumn`.
+The supported column types are `timestamp`, `timestamptz` and `long`.
+Timestamp columns are automatically converted to milliseconds since the Java epoch of
+1970-01-01T00:00:00Z. Use `watermarkTimeUnit` to configure the conversion for long columns.
Review Comment:
Done
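For readers of this thread, the long-column conversion described in the hunk above can be pictured with a small sketch. It uses only `java.util.concurrent.TimeUnit` from the JDK; `WatermarkUnitSketch` and `toEpochMillis` are hypothetical names, and whether the source performs the conversion exactly this way (including truncation of sub-millisecond values) is an assumption, not something stated in the PR.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Iceberg: illustrates how a long column
// value plus a configured time unit maps to epoch milliseconds.
class WatermarkUnitSketch {

    // Converts a raw long column value, expressed in the given unit since
    // the epoch, to milliseconds. TimeUnit.toMillis truncates finer units.
    static long toEpochMillis(long columnValue, TimeUnit unit) {
        return unit.toMillis(columnValue);
    }

    public static void main(String[] args) {
        // A column storing seconds since the epoch.
        System.out.println(toEpochMillis(1_700_000_000L, TimeUnit.SECONDS));
        // A column storing microseconds since the epoch.
        System.out.println(toEpochMillis(1_700_000_000_123_456L, TimeUnit.MICROSECONDS));
    }
}
```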
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]