Re: Multiple aggregations over streaming dataframes

2016-07-08 Thread Arnaud Bailly
ted. In the Kafka world each topic is allowed to define a TTL SLA. I.e. > The consumer must read the data within a limited window of time. > > Andy > > From: Michael Armbrust > Date: Thursday, July 7, 2016 at 2:31 PM > To: Arnaud Bailly > Cc: Sivakumaran S , "u

Re: Multiple aggregations over streaming dataframes

2016-07-07 Thread Andy Davidson
2:31 PM To: Arnaud Bailly Cc: Sivakumaran S , "user @spark" Subject: Re: Multiple aggregations over streaming dataframes > We are planning to address this issue in the future. > > At a high level, we'll have to add a delta mode so that updates can be > communicat

Re: Multiple aggregations over streaming dataframes

2016-07-07 Thread Michael Armbrust
We are planning to address this issue in the future. At a high level, we'll have to add a delta mode so that updates can be communicated from one operator to the next. On Thu, Jul 7, 2016 at 8:59 AM, Arnaud Bailly wrote: > Indeed. But nested aggregation does not work with Structured Streaming,
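
A minimal sketch of what passing deltas from one operator to the next could look like, written in plain Python rather than against any Spark API. The thread only names a "delta mode"; the class names `CountPerKey` and `HistogramOfCounts`, the `(key, old, new)` delta shape, and the sample keys are all assumptions made for illustration, not Spark's actual design.

```python
from collections import Counter


class CountPerKey:
    """First operator (hypothetical): keeps a running count per key."""

    def __init__(self):
        self.counts = {}

    def update(self, key):
        # Emit a delta describing how this key's aggregate changed.
        old = self.counts.get(key)  # None if the key is new
        new = (old or 0) + 1
        self.counts[key] = new
        return (key, old, new)


class HistogramOfCounts:
    """Second operator (hypothetical): how many keys have each count."""

    def __init__(self):
        self.hist = Counter()

    def apply(self, delta):
        _key, old, new = delta
        if old is not None:
            # Retract the key's previous contribution.
            self.hist[old] -= 1
            if self.hist[old] == 0:
                del self.hist[old]
        # Add the key's new contribution.
        self.hist[new] += 1


first = CountPerKey()
second = HistogramOfCounts()
for key in ["a", "b", "a", "c", "a"]:  # illustrative input stream
    second.apply(first.update(key))

result = dict(second.hist)
print(result)
```

The point of the sketch is that the second aggregation never rescans the first operator's full state: it only needs the old and new value for the key that changed.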

Re: Multiple aggregations over streaming dataframes

2016-07-07 Thread Arnaud Bailly
Indeed. But nested aggregation does not work with Structured Streaming, that's the point. I would like to know if there is a workaround, or what the plan is regarding this feature, which seems quite useful to me. If the implementation is not overly complex and it is just a matter of manpower, I am fin

Re: Multiple aggregations over streaming dataframes

2016-07-07 Thread Sivakumaran S
Arnaud, You could aggregate the first table and then merge it with the second table (assuming that they are similarly structured) and then carry out the second aggregation. Unless the data is very large, I don’t see why you should persist it to disk. IMO, nested aggregation is more elegant and

Re: Multiple aggregations over streaming dataframes

2016-07-07 Thread Arnaud Bailly
It's aggregation at multiple levels in a query: first do some aggregation on one table, then join with another table and do a second aggregation. I could probably rewrite the query in such a way that it does aggregation in one pass but that would obfuscate the purpose of the various stages. Le 7 ju
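
To make the shape of such a query concrete, here is a small batch sketch in plain Python (not Spark code): aggregate one table, join the result with a second table, then aggregate again. The table contents, column names, and the functions `total_per_user` and `max_total_per_region` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical sample data, for illustration only.
events = [
    {"user": "u1", "amount": 10},
    {"user": "u1", "amount": 5},
    {"user": "u2", "amount": 7},
    {"user": "u3", "amount": 3},
]
user_region = {"u1": "EU", "u2": "EU", "u3": "US"}


def total_per_user(rows):
    """First aggregation: sum of amount per user."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["user"]] += row["amount"]
    return dict(totals)


def max_total_per_region(user_totals, regions):
    """Join per-user totals with the region table, then aggregate again."""
    best = {}
    for user, total in user_totals.items():
        region = regions.get(user)
        if region is None:
            continue  # inner join: skip users without a region
        # Second aggregation: largest per-user total in each region.
        best[region] = max(best.get(region, total), total)
    return best


per_user = total_per_user(events)
per_region = max_total_per_region(per_user, user_region)
print(per_region)
```

Because the second stage takes a maximum over per-user sums, the two stages cannot simply be folded into one grouped sum, which is why keeping them separate makes the intent easier to read.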

Re: Multiple aggregations over streaming dataframes

2016-07-07 Thread Sivakumaran S
Hi Arnaud, Sorry for the doubt, but what exactly is multiple aggregation? What is the use case? Regards, Sivakumaran > On 07-Jul-2016, at 11:18 AM, Arnaud Bailly wrote: > > Hello, > > I understand multiple aggregations over streaming dataframes is not currently > supported in Spark 2.0.