For demonstration purposes, I was running Structured Streaming on data with older timestamps. The data was from 2018, the window was 24 hours, and the watermark was 0 seconds. A few things that I saw and could not explain:

1. The initial streaming batch contained around 60 windows. Spark processed all but the last one.
2. The data for a window is not sent to the writer immediately.
3. If I ingest 2019 data midway through the run, it is not processed. In fact, Spark did not output the 2019 data at all.
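
For reference, here is a minimal PySpark sketch of a query with the settings described above (24-hour window, 0-second watermark). Only those two values come from the description; the column name `event_time`, the JSON file source, the input path, the console sink, and the append output mode are assumptions made for illustration and may not match the actual job.

```python
# Values taken from the description above.
WINDOW_DURATION = "24 hours"
WATERMARK_DELAY = "0 seconds"

# Hypothetical placeholders (assumptions, not from the actual job).
EVENT_TIME_COL = "event_time"
INPUT_PATH = "/tmp/events"


def build_windowed_counts(spark, input_path=INPUT_PATH):
    """Return a streaming DataFrame with one event count per 24-hour window."""
    # Imported lazily so the settings above can be inspected without Spark.
    from pyspark.sql import functions as F

    events = (
        spark.readStream
        .schema(f"{EVENT_TIME_COL} TIMESTAMP, value STRING")  # assumed schema
        .json(input_path)                                     # assumed source
    )
    return (
        events
        .withWatermark(EVENT_TIME_COL, WATERMARK_DELAY)
        .groupBy(F.window(F.col(EVENT_TIME_COL), WINDOW_DURATION))
        .count()
    )


def start_console_query(spark):
    """Start the query with a console sink in append mode (both assumed)."""
    return (
        build_windowed_counts(spark)
        .writeStream
        .outputMode("append")
        .format("console")
        .option("truncate", "false")
        .start()
    )
```

With an active `SparkSession` named `spark`, calling `start_console_query(spark)` would start the stream and return the running query handle.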

Can someone point me to documentation or an explanation of how Structured Streaming handles data with non-current timestamps?

Thanks in advance,
Hemant