I have the following query:

    val ds = dataFrame
      .filter(! $"requri".endsWith(".m3u8"))
      .filter(! $"bserver".contains("trimmer"))
      .withWatermark("time", "120 seconds")
      .groupBy(window(dataFrame.col("time"),"60
seconds"),col("channelName"))
      .agg(sum("bytes")/1000000 as "byte_count")

How do I implement a foreach writer so that its process method is triggered
only once for every watermarking interval. i.e in the aforementioned
example, I will get the following

10.00-10.01 Channel-1 100(bytes)
10.00-10.01 Channel-2 120(bytes)
10.01-10.02 Channel-1 110(bytes)
...



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: [email protected]

Reply via email to