I have the following query:
val ds = dataFrame
.filter(! $"requri".endsWith(".m3u8"))
.filter(! $"bserver".contains("trimmer"))
.withWatermark("time", "120 seconds")
.groupBy(window(dataFrame.col("time"),"60
seconds"),col("channelName"))
.agg(sum("bytes")/1000000 as "byte_count")
How do I implement a foreach writer so that its process method is triggered
only once for every watermarking interval. i.e in the aforementioned
example, I will get the following
10.00-10.01 Channel-1 100(bytes)
10.00-10.01 Channel-2 120(bytes)
10.01-10.02 Channel-1 110(bytes)
...
--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
---------------------------------------------------------------------
To unsubscribe e-mail: [email protected]