Hi all,
When can we expect multiple aggregations to be supported in Spark
Structured Streaming?
For example, given this input:
id | amount | my_timestamp
------------------------------------------------------
1 | 5 | 2018-04-01T01:00:00.000Z
1 | 10 | 2018-04-01T01:10:00.000Z
2 | 20 | 2018-04-01T01:20:00.000Z
2 | 30 | 2018-04-01T01:25:00.000Z
2 | 40 | 2018-04-01T01:30:00.000Z
I want to run a query like the one below, so that the problem is solved
entirely in a streaming fashion:
select sum(amount) from (select amount, max(my_timestamp) from table group
by id, window("my_timestamp", "1 hours"))
I just want the output to be:
sum(amount)
------------------
50
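
To make the intended semantics concrete, here is a minimal plain-Python
sketch of the batch equivalent. It assumes the goal is to take the amount
with the latest my_timestamp for each (id, 1-hour window) and then sum
those amounts, which gives 10 + 40 for the sample rows. It is not a
streaming solution, and the names parse_ts and sum_of_latest_amounts are
made up for illustration.

```python
from datetime import datetime

# Sample rows from the table above: (id, amount, my_timestamp)
rows = [
    (1, 5, "2018-04-01T01:00:00.000Z"),
    (1, 10, "2018-04-01T01:10:00.000Z"),
    (2, 20, "2018-04-01T01:20:00.000Z"),
    (2, 30, "2018-04-01T01:25:00.000Z"),
    (2, 40, "2018-04-01T01:30:00.000Z"),
]


def parse_ts(value):
    """Parse timestamps of the form 2018-04-01T01:00:00.000Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")


def sum_of_latest_amounts(rows):
    """Sum the amount of the latest row per (id, 1-hour tumbling window)."""
    latest = {}  # (id, window_start) -> (timestamp, amount)
    for row_id, amount, ts_text in rows:
        ts = parse_ts(ts_text)
        # Truncate to the hour to get the start of the tumbling window.
        window_start = ts.replace(minute=0, second=0, microsecond=0)
        key = (row_id, window_start)
        if key not in latest or ts > latest[key][0]:
            latest[key] = (ts, amount)
    return sum(amount for _, amount in latest.values())


result = sum_of_latest_amounts(rows)
print(result)
```

The first step (latest row per group) and the second step (sum over the
groups) are the two aggregations that I would like to chain in one
streaming query.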
I am trying to find a solution without using flatMapGroupsWithState or
ORDER BY. I am using Spark 2.3.1 (custom built from master), and I have
already tried a self-join solution, but again I am running into "multiple
aggregations are not supported".
Thanks!