Hi All,

When can we expect multiple aggregations to be supported in Spark
Structured Streaming?

For example,

id | amount | my_timestamp
---+--------+--------------------------
 1 |      5 | 2018-04-01T01:00:00.000Z
 1 |     10 | 2018-04-01T01:10:00.000Z
 2 |     20 | 2018-04-01T01:20:00.000Z
 2 |     30 | 2018-04-01T01:25:00.000Z
 2 |     40 | 2018-04-01T01:30:00.000Z


I want to run a query like the one below to solve the problem in an
all-streaming fashion:

select sum(amount) from (
  select amount, max(my_timestamp)
  from table
  group by id, window(my_timestamp, "1 hour")
)

I just want the output to be:

sum(amount)
------------------
 50
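To make the intended semantics concrete, here is a minimal batch sketch in plain Python (not streaming, and not the Spark API): for each (id, 1-hour tumbling window) group, keep the amount carried by the latest timestamp, then sum those amounts. The window bucketing by truncating to the hour is an assumption matching window(my_timestamp, "1 hour").

```python
from datetime import datetime

# Sample rows: (id, amount, event time), matching the table above.
rows = [
    (1, 5,  datetime(2018, 4, 1, 1, 0)),
    (1, 10, datetime(2018, 4, 1, 1, 10)),
    (2, 20, datetime(2018, 4, 1, 1, 20)),
    (2, 30, datetime(2018, 4, 1, 1, 25)),
    (2, 40, datetime(2018, 4, 1, 1, 30)),
]

# First aggregation: per (id, 1-hour window), keep the amount that
# has the latest timestamp in that group.
latest = {}
for id_, amount, ts in rows:
    window = ts.replace(minute=0, second=0, microsecond=0)  # 1-hour tumbling window
    key = (id_, window)
    if key not in latest or ts > latest[key][0]:
        latest[key] = (ts, amount)

# Second aggregation: sum the per-group latest amounts.
total = sum(amount for _, amount in latest.values())
print(total)  # 50 (id 1 contributes 10, id 2 contributes 40)
```

This chains two aggregations (a per-group max followed by a global sum), which is exactly the shape Structured Streaming rejects.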

I am trying to find a solution that avoids flatMapGroupsWithState and
order by. I am using Spark 2.3.1 (custom-built from master), and I have
already tried the self-join workaround, but there I again run into
"multiple aggregations are not supported".

Thanks!
