Hi, Das:
Thanks for your answer.
I'm talking about multiple streaming aggregations like this:
df.groupBy("key").agg(min("colA").as("min")).groupBy("min").count()
For example: the data source is a user login record. There are two fields in my
temp view (USER_LOGIN_TABLE): user_id and login_failed. I want to figure out the number of
users who failed to log in at least 3 times within 5 minutes. My first SQL is:
SELECT user_id, count(1) AS failed_num
FROM USER_LOGIN_TABLE
WHERE login_failed
GROUP BY user_id
I register the result of that SQL as a new temp view (USER_FAILED_TABLE). Then the
second SQL is:
SELECT count(user_id)
FROM USER_FAILED_TABLE
WHERE failed_num>=3
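Putting the two together on a stream, here is a minimal sketch of what I am trying to run (the rate source, the app name, and the derived user_id / login_failed columns are just stand-ins for my real login stream):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder.appName("login-failures").getOrCreate()

// Stand-in streaming source; my real source produces (user_id, login_failed).
val logins = spark.readStream
  .format("rate")
  .load()
  .select(
    (col("value") % 100).cast("string").as("user_id"),  // synthetic user ids
    (col("value") % 3 === 0).as("login_failed")          // synthetic failure flag
  )
logins.createOrReplaceTempView("USER_LOGIN_TABLE")

// First aggregation: failures per user.
// (The 5-minute window from the use case would be an extra
//  window(timestamp, '5 minutes') term in the GROUP BY; omitted to keep the sketch small.)
spark.sql(
  """SELECT user_id, count(1) AS failed_num
    |FROM USER_LOGIN_TABLE
    |WHERE login_failed
    |GROUP BY user_id""".stripMargin
).createOrReplaceTempView("USER_FAILED_TABLE")

// Second aggregation on top of the first: number of users with at least 3 failures.
val result = spark.sql(
  """SELECT count(user_id)
    |FROM USER_FAILED_TABLE
    |WHERE failed_num >= 3""".stripMargin
)

// Starting this query fails for me with an AnalysisException along the lines of
// "Multiple streaming aggregations are not supported with streaming DataFrames/Datasets".
result.writeStream.outputMode("complete").format("console").start()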
Thanks.
------------------ Original Message ------------------
Hello,
What do you mean by multiple streaming aggregations? Something like this is
already supported.
df.groupBy("key").agg(min("colA"), max("colB"), avg("colC"))
But the following is not supported.
df.groupBy("key").agg(min("colA").as("min")).groupBy("min").count()
In other words, multiple aggregations ONE AFTER ANOTHER are NOT supported yet,
and we currently don't have any plans to support it by 2.3.
If this is what you want, then can you explain the use case of why you want
multiple aggregations?
On Tue, Nov 28, 2017 at 9:46 PM, Georg Heiler <[email protected]> wrote:
2.3 around January
0.0 <[email protected]> wrote on Wed., Nov. 29, 2017 at 05:08:
Hi, all:
Multiple streaming aggregations are not yet supported. When will it be
supported? Is it in the plan?
Thanks.