Kafka supports server-side and client-side timestamps since version 0.10.1. KafkaIO in Beam can provide much better watermark, especially for topics with server-side timestamps. The default implementation currently just uses processing time for event time and watermark, which is not very useful.
Wrote a short doc <https://docs.google.com/document/d/1DyWcLJpALRoUfvYUbiPCDVikYb_Xz2X7Co2aDUVVd4I/edit?usp=sharing>[1] about the proposal. Your feedback is welcome. I am planning to work on it, and don't mind guiding if anyone else is interested (it is fairly accessible for newcomers) . TL;DR : *server-side timestamp* : It monotonically increases within a Kafka partition. We can provide near perfect watermark : min(timestamp of latest record consumed on a partition). *client-side / custom timestamp* : Watermark is min(timestamp over last few seconds) similar to PubsubIO in Beam. This is not great, but we will let user provide tighter bounds or provide entirely own implementation. Thanks, Raghu. [1] https://docs.google.com/document/d/1DyWcLJpALRoUfvYUbiPCDVikYb_Xz2X7Co2aDUVVd4I/edit?usp=sharing