Bump this again. I filed SPARK-31985 [1] and plan to submit a PR in a couple of days if there's no voice on the reason we should keep it.
1. https://issues.apache.org/jira/browse/SPARK-31985 On Thu, May 21, 2020 at 8:54 AM Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote: > Let me share the effect on removing the incomplete and undocumented code > path. I manually tried out removing the code path and here's the change. > > > https://github.com/HeartSaVioR/spark/commit/aa53e9b1b33c0b8aec37704ad290b42ffb2962d8 > > 1,120 lines deleted without hurting any existing streaming tests, except a > suite I removed as well since it tests such code path. No need to update > documentation as it was never publicized. This also removes some rules > which only apply for continuous mode, which gave exceptions on the fact > that "continuous mode is only available for map-like operations". Removing > them would be back to simplify the usage of continuous mode. > > Also worth noting that I had to manually remove the code path instead of > revert, because the code path has been changed to reflect DSv2 change. What > this means? We have to "update" the code path and concern > about compatibility, etc. while it never be used in production. > > On Tue, May 19, 2020 at 1:14 PM Jungtaek Lim <kabhwan.opensou...@gmail.com> > wrote: > >> Hi devs, >> >> during experiment on complete mode I realized we left some incomplete >> code parts on supporting aggregation for continuous mode. (shuffle & >> coalesce) >> >> The work had been occurred around first half of 2018 and stopped, and no >> work has been done for around 2 years (so I don't expect anyone is working >> on this). The functionality is undocumented (as the work was only done >> partially) and continuous mode is experimental so I don't feel risks to get >> rid of the part. >> >> What do you think? If it makes sense then I'll raise a PR to get rid of >> the incomplete codes. >> >> Thanks, >> Jungtaek Lim (HeartSaVioR) >> >> ps. I'm kind of feeling the same with continuous mode itself - it has >> been still experimental over the 2 years, and while it lacks lots of >> essential things (backpressure, epoch synchronization, query progress, >> etc.) there's no additional major work over 1 year. We eventually have been >> excluded continuous mode for the new streaming feature like observable >> metric, streaming UI, etc. because it is on top of the feature >> which continuous mode lacks. >> >> Unlike incomplete code path, I'm not strongly against this, as it >> has been documented and someone might use the feature in production. I >> still think it's ideal to retire the feature smoothly. >> (Please chime in with the use case if someone has the production cases.) >> >> ps2. It feels like "feature branch" looks to be a thing to consider for >> efforts on one big feature. >> >