Re: [Beginner][StructuredStreaming] Using Spark aggregation - WithWatermark on old data

2018-05-24 Thread karthikjay
My data looks like this:

{ "ts2" : "2018/05/01 00:02:50.041", "serviceGroupId" : "123", "userId" : "avv-0", "stream" : "", "lastUserActivity" : "00:02:50", "lastUserActivityCount" : "0" }
{ "ts2" : "2018/05/01 00:09:02.079", "serviceGroupId" : "123", "userId" : "avv-0",

[Beginner][StructuredStreaming] Using Spark aggregation - WithWatermark on old data

2018-05-23 Thread karthikjay
I am doing the following aggregation on the data:

val channelChangesAgg = tunerDataJsonDF
  .withWatermark("ts2", "10 seconds")
  .groupBy(window(col("ts2"), "10 seconds"), col("env"), col("servicegroupid"))
  .agg(count("linetransactionid") as
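
For readers following the thread, here is a minimal, self-contained sketch of a watermarked window aggregation over records shaped like the sample data. It is not the poster's actual job: the object name, the "eventTime" and "events" column names, the choice of userId as the counted column, and the two hard-coded rows are all assumptions made for illustration. It parses the string ts2 into a timestamp first because, as far as I know, Spark expects the event-time column passed to withWatermark to be of timestamp type when the query is streaming. The sketch runs on a static DataFrame, where withWatermark is a no-op, so it only shows the shape of the query and not how late or old data is dropped.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, count, to_timestamp, window}

object WatermarkSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("watermark-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical rows mirroring a few fields of the sample records; ts2 is a plain string.
    val raw = Seq(
      ("2018/05/01 00:02:50.041", "123", "avv-0"),
      ("2018/05/01 00:09:02.079", "123", "avv-0")
    ).toDF("ts2", "serviceGroupId", "userId")

    // Parse the string into a timestamp column ("eventTime" is an assumed name).
    val withEventTime = raw.withColumn(
      "eventTime",
      to_timestamp(col("ts2"), "yyyy/MM/dd HH:mm:ss.SSS")
    )

    // Same shape as the aggregation in the post: watermark, 10-second tumbling window, count.
    // On a static (non-streaming) DataFrame the watermark has no effect.
    val agg = withEventTime
      .withWatermark("eventTime", "10 seconds")
      .groupBy(window(col("eventTime"), "10 seconds"), col("serviceGroupId"))
      .agg(count(col("userId")).as("events"))

    agg.show(truncate = false)
    spark.stop()
  }
}
```

In a real streaming job the static Seq would be replaced by a readStream source, and the watermark would then bound how long window state is kept relative to the largest event time seen so far.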