Re: Why does join use rows that were sent after watermark of 20 seconds?

2018-12-10 Thread Jungtaek Lim
Please refer to the Structured Streaming guide, which explains very clearly when a query will have unbounded state: http://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#inner-joins-with-optional-watermarking Quoting the doc: In other words, you will hav

Re: Why does join use rows that were sent after watermark of 20 seconds?

2018-12-10 Thread Abhijeet Kumar
You mean to say that Spark will store all the data in memory forever? :)

> On 10-Dec-2018, at 6:16 PM, Sandeep Katta wrote:
>
> Hi Abhijeet,
>
> You are using an inner join with unbounded state, which means every row in one stream will keep matching rows from the other stream indefinitely.
> If you want the inten
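
To illustrate the point about unbounded state, here is a minimal toy model in plain Python. It is not Spark code and not how Spark implements joins. The class `ToyStreamJoin` and its parameters `watermark_delay` and `time_range` are hypothetical names made up for this sketch, times are plain integers standing for seconds, and a single global watermark is assumed for both sides. It only shows the general idea from the guide: buffered rows can be dropped only when there is both a watermark and an event-time constraint in the join condition; otherwise every row has to be kept because a future row might still match it.

```python
class ToyStreamJoin:
    """Toy model of a stream-stream inner join buffer (illustration only)."""

    def __init__(self, watermark_delay=None, time_range=None):
        # Both values are in seconds; None means "not configured".
        self.watermark_delay = watermark_delay
        self.time_range = time_range
        self.state = {"left": [], "right": []}
        self.max_event_time = 0

    def process(self, side, key, event_time):
        """Join one incoming row against the rows buffered for the other side."""
        other = "right" if side == "left" else "left"
        matches = []
        for k, t in self.state[other]:
            if k != key:
                continue
            # The event-time constraint applies only when time_range is set.
            if self.time_range is not None and abs(t - event_time) > self.time_range:
                continue
            # Output rows are always (key, left_time, right_time).
            if side == "left":
                matches.append((key, event_time, t))
            else:
                matches.append((key, t, event_time))
        self.state[side].append((key, event_time))
        self.max_event_time = max(self.max_event_time, event_time)
        self._evict()
        return matches

    def _evict(self):
        # Without BOTH a watermark and a time-range constraint, nothing can
        # ever be dropped, so the buffered state grows without bound.
        if self.watermark_delay is None or self.time_range is None:
            return
        watermark = self.max_event_time - self.watermark_delay
        cutoff = watermark - self.time_range
        for side in self.state:
            self.state[side] = [(k, t) for k, t in self.state[side] if t >= cutoff]

    def buffered_rows(self):
        return len(self.state["left"]) + len(self.state["right"])


def feed(join, n_rows=100, step=10):
    """Send n_rows rows per side for one key, spaced `step` seconds apart."""
    for i in range(n_rows):
        join.process("left", "a", i * step)
        join.process("right", "a", i * step + 1)
    return join.buffered_rows()


unbounded = ToyStreamJoin()
bounded = ToyStreamJoin(watermark_delay=20, time_range=10)
print("buffered without constraints:", feed(unbounded))
print("buffered with watermark and time range:", feed(bounded))
```

In this toy model the unconstrained join keeps every row it has ever seen and will happily match a row from time 0 with one arriving hundreds of seconds later, while the constrained join keeps only a small, recent window.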