Re: [DISCUSS] Time to evaluate "continuous mode" in SS?

2020-09-16 Thread Cheng Su
Thanks, Cheng Su

Re: [DISCUSS] Time to evaluate "continuous mode" in SS?

2020-09-15 Thread Jungtaek Lim
Yeah I realized there's a proposal for push-based shuffle, and I agree that may unblock the architectural issue on true-streaming. (The root concern of the continuous mode has been that it doesn't fit with the architecture of Spark, and probably push-based shuffle could persuade me.) I guess

Re: [DISCUSS] Time to evaluate "continuous mode" in SS?

2020-09-15 Thread mshen
Hi Joseph, I would be interested in discussing your thoughts on how push-based shuffle could help with continuous mode in SS. We have discussed internally at LinkedIn with our Samza peers, as well as with the Alibaba Flink team, the applicability of push-based shuffle to streaming engines, especially

Re: [DISCUSS] Time to evaluate "continuous mode" in SS?

2020-09-15 Thread Joseph Torres
It's worth noting that the push-based shuffle SPIP currently in progress addresses a substantial blocker in the area. If you remember when we removed the half-finished stateful query support, the lack of that functionality and the challenge of implementing it is basically why it was half-finished.

Re: [DISCUSS] Time to evaluate "continuous mode" in SS?

2020-09-15 Thread Sean Owen
I think we certainly can't remove it without deprecation and a few releases. If there were big problems with it that weren't getting fixed, sure maybe, but lack of interest in reviewing minor changes isn't necessarily a bad sign. By the same logic you'd have deleted GraphX long ago. Anecdotally, yes

Re: [DISCUSS] Time to evaluate "continuous mode" in SS?

2020-09-15 Thread Jungtaek Lim
It probably depends on the meaning of "experimental". My understanding of "experimental" is closer to "incubation": something that may eventually graduate, or may be retired. To be clear, I'm evaluating the continuous mode as a "candidate to retire", unless there are actual use cases in production

Re: [DISCUSS] Time to evaluate "continuous mode" in SS?

2020-09-15 Thread Sean Owen
If you're suggesting making it un-Experimental, probably yes, as it is de facto not going to change much I expect. If you're saying remove it, probably not? I don't see that it's anywhere near deprecated, and not sure it's unmaintained - obviously tests etc still have to keep passing. On Mon, Sep

Re: [DISCUSS] Time to evaluate "continuous mode" in SS?

2020-09-15 Thread Gabor Somogyi
Hi Jungtaek, All I see at the moment is that most users choose Flink over Spark when continuous processing is needed. Unless there is a revolution in this area, there is no point in continuing to maintain it. 2.5 years is a lot in the big data industry. If there will be efforts in this area then happy to

[DISCUSS] Time to evaluate "continuous mode" in SS?

2020-09-14 Thread Jungtaek Lim
Hi devs, Spark 2.3, released in Feb 2018, introduced continuous mode in Structured Streaming as "experimental". We are now 2.5 years past that release, and I feel it would be a good time to evaluate the mode, whether the mode has been widely used or not, and the mode has been making