Re: What's the root cause of not supporting multiple aggregations in structured streaming?

2020-11-28 Thread Jungtaek Lim
To be clear, what Arun meant in the old PR is that watermark and output mode are not related. As long as we only deal with watermark, it's limited to append mode anyway. So in this phase we don't (and shouldn't) bring output mode into the topic and make things complicated, unless we really have a solid plan

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

2020-11-26 Thread Yuanjian Li
Nice blog! Thanks for sharing, Etienne! Let's try to raise this discussion again after the 3.1 release. I do think more committers/contributors have realized the issue of global watermark per SPARK-24634 and SPARK-33259

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

2020-11-26 Thread Etienne Chauchot
Hi, Regarding this subject, I wrote a blog article that gives details about the watermark architecture proposal that was discussed in the design doc and in the PR: https://echauchot.blogspot.com/2020/11/watermark-architecture-proposal-for.html Best, Etienne On 29/09/2020 03:24, Yuanjian Li

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

2020-09-28 Thread Yuanjian Li
Thanks for the great discussion! I'm also interested in this feature and did some investigation before. As Arun mentioned, similar to the "update" mode, the "complete" mode also needs more design. We might need an operation-level output mode for the complete mode support. That is to say, if we use

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

2020-09-25 Thread Jungtaek Lim
Thanks Etienne! Yeah I forgot to say nice talking with you again. And sorry I forgot to send the reply (was in draft). Regarding investment in SS, well, unfortunately I don't know - I'm just an individual. There might be various reasons to do so, most probably "priority" among the stuff. There's

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

2020-09-04 Thread Etienne Chauchot
Hi Jungtaek Lim, Nice to hear from you again since the last time we talked :) and congrats on becoming a Spark committer in the meantime! (if I'm not mistaken you were not at the time) I totally agree with what you're saying on merging structural parts of Spark without having a broader

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

2020-09-04 Thread Jungtaek Lim
Unfortunately I don't see enough active committers working on Structured Streaming; I don't expect major features/improvements can be brought in this situation. Technically I can review and merge the PR on major improvements in SS, but that depends on how much the proposal changes. If the

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

2020-08-31 Thread Etienne Chauchot
Hi all, I'm also very interested in this feature, but the PR has been open since January 2019 and has not been updated. It raised a design discussion around watermarks and a design doc was written

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

2019-05-21 Thread 张万新
Thanks, I'll check it out.
On Tue, May 21, 2019 at 01:31, Arun Mahadevan wrote:
> Here's the proposal for supporting it in "append" mode -
> https://github.com/apache/spark/pull/23576. You could see if it addresses
> your requirement and post your feedback in the PR.
> For "update" mode it's going to be much

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

2019-05-20 Thread Arun Mahadevan
Here's the proposal for supporting it in "append" mode - https://github.com/apache/spark/pull/23576. You could see if it addresses your requirement and post your feedback in the PR. For "update" mode it's going to be much harder to support this without first adding support for "retractions",
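
To illustrate why "update" mode is harder, here is a toy model in plain Python. It is not Spark code and not the design from the PR above; the function names are hypothetical and chosen only for this sketch. It assumes the first aggregation emits a fresh running count for a key on every event, and shows that a second aggregation double counts unless it first retracts the value it previously received for that key.

```python
from collections import defaultdict


def update_mode_counts(events):
    """First aggregation (toy): running count per key.

    Emits the updated (key, count) row after every event, which is how
    an "update"-style output behaves in this simplified model.
    """
    counts = defaultdict(int)
    for key in events:
        counts[key] += 1
        yield key, counts[key]


def naive_downstream_sum(updates):
    """Second aggregation that treats every upstream row as brand-new data."""
    total = 0
    for _key, count in updates:
        total += count
    return total


def retraction_aware_sum(updates):
    """Second aggregation that retracts the previous value for a key
    before applying the new one."""
    last_seen = {}
    total = 0
    for key, count in updates:
        total -= last_seen.get(key, 0)  # retract the stale upstream value
        total += count                  # apply the updated value
        last_seen[key] = count
    return total


events = ["a", "a", "b", "a"]
naive_total = naive_downstream_sum(update_mode_counts(events))
correct_total = retraction_aware_sum(update_mode_counts(events))

# The naive chain overcounts; the retraction-aware chain matches the
# number of input events.
print(naive_total, correct_total)  # 7 4
```

In "append" mode under this model, each upstream result would be emitted only once, after it is final, so the downstream aggregation would have nothing to retract.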

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

2019-05-20 Thread Gabor Somogyi
There is a PR for this, but it is not yet merged.
On Mon, May 20, 2019 at 10:13 AM 张万新 wrote:
> Hi there,
>
> I'd like to know what's the root reason why multiple aggregations on
> a streaming DataFrame are not allowed since it's a very useful feature, and
> Flink has supported it for a long time.
>
>

What's the root cause of not supporting multiple aggregations in structured streaming?

2019-05-20 Thread 张万新
Hi there, I'd like to know what's the root reason why multiple aggregations on a streaming DataFrame are not allowed since it's a very useful feature, and Flink has supported it for a long time. Thanks.