Never mind, forget about the dead code. Turned out that reverting "via manual" can be very easily done - remove config and apply to the tests -> remove field -> remove the changes into docs. It was considered as non-trivial because we only consider about "git revert" but there's no strict rule to do so.
Please take a look at this PR, https://github.com/apache/spark/pull/28517 On Wed, May 13, 2020 at 3:10 AM Russell Spitzer <russell.spit...@gmail.com> wrote: > I think that the dead code approach, while a bit unpalatable and worse > than reverting, is probably better than leaving the parameter (even if it > is hidden) > > On Tue, May 12, 2020 at 12:46 PM Ryan Blue <rb...@netflix.com> wrote: > >> +1 for the approach Jungtaek suggests. That will avoid needing to support >> behavior that is not well understood with minimal changes. >> >> On Tue, May 12, 2020 at 1:45 AM Jungtaek Lim < >> kabhwan.opensou...@gmail.com> wrote: >> >>> Before I forget, we'd better not forget to change the doc, as create >>> table doc looks to represent current syntax which will be incorrect later. >>> >>> On Tue, May 12, 2020 at 5:32 PM Jungtaek Lim < >>> kabhwan.opensou...@gmail.com> wrote: >>> >>>> It's not only for end users, but also for us. Spark itself uses the >>>> config "true" and "false" in tests and it still brings confusion. We still >>>> have to deal with both situations. >>>> >>>> I'm wondering how long days it would be needed to revert it cleanly, >>>> but if we worry about the amount of code change just around the new RC, >>>> what about make the code dirty (should be fixed soon) but less headache via >>>> applying traditional (and bad) way? >>>> >>>> Let's just remove the config so that the config cannot be used in any >>>> way (even in Spark codebase), and set corresponding field in parser to the >>>> constant value so that no one can modify in any way. This would make the >>>> dead code by intention which should be cleaned it up later, so let's add >>>> FIXME comment there so that anyone can take it up for cleaning up the code >>>> later. (If no one volunteers then I'll probably pick up.) >>>> >>>> That is a bad pattern, but still better as we prevent end users (even >>>> early adopters) go through the undocumented path in any way, and that will >>>> be explicitly marked as "should be fixed". This is different from retaining >>>> config - I don't expect unified create table syntax will be landed in >>>> bugfix version, so even unified create table syntax can be landed in 3.1.0 >>>> (this is also not guaranteed) the config will live in 3.0.x in any way. If >>>> we temporarily go dirty way then we can clean up the code in any version, >>>> even from bugfix version, maybe within a couple of weeks just after 3.0.0 >>>> is released. >>>> >>>> Does it sound valid? >>>> >>>> On Tue, May 12, 2020 at 2:35 PM Wenchen Fan <cloud0...@gmail.com> >>>> wrote: >>>> >>>>> SPARK-30098 was merged about 6 months ago. It's not a clean revert and >>>>> we may need to spend quite a bit of time to resolve conflicts and fix >>>>> tests. >>>>> >>>>> I don't see why it's still a problem if a feature is disabled and >>>>> hidden from end-users (it's undocumented, the config is internal). The >>>>> related code will be replaced in the master branch sooner or later, when >>>>> we >>>>> unify the syntaxes. >>>>> >>>>> >>>>> >>>>> On Tue, May 12, 2020 at 6:16 AM Ryan Blue <rb...@netflix.com.invalid> >>>>> wrote: >>>>> >>>>>> I'm all for getting the unified syntax into master. The only issue >>>>>> appears to be whether or not to pass the presence of the EXTERNAL keyword >>>>>> through to a catalog in v2. Maybe it's time to start a discuss thread for >>>>>> that issue so we're not stuck for another 6 weeks on it. >>>>>> >>>>>> On Mon, May 11, 2020 at 3:13 PM Jungtaek Lim < >>>>>> kabhwan.opensou...@gmail.com> wrote: >>>>>> >>>>>>> Btw another wondering here is, is it good to retain the flag on >>>>>>> master as an intermediate step? Wouldn't it be better for us to >>>>>>> start "unified create table syntax" from scratch? >>>>>>> >>>>>>> >>>>>>> On Tue, May 12, 2020 at 6:50 AM Jungtaek Lim < >>>>>>> kabhwan.opensou...@gmail.com> wrote: >>>>>>> >>>>>>>> I'm sorry, but I have to agree with Ryan and Russell. I chose the >>>>>>>> option 1 because it's less worse than option 2, but it doesn't mean I >>>>>>>> fully >>>>>>>> agree with option 1. >>>>>>>> >>>>>>>> Let's make below things clear if we really go with option 1, >>>>>>>> otherwise please consider reverting it. >>>>>>>> >>>>>>>> * Do you fully indicate about "all" the paths where the second >>>>>>>> create table syntax is taken? >>>>>>>> * Could you explain "why" to end users without any confusion? Do >>>>>>>> you think end users will understand it easily? >>>>>>>> * Do you have an actual end users to guide to turn this on? Or do >>>>>>>> you have a plan to turn this on for your team/customers and deal with >>>>>>>> the ambiguity? >>>>>>>> * Could you please document about how things will change if the >>>>>>>> flag is turned on? >>>>>>>> >>>>>>>> I guess the option 1 is to leave a flag as "undocumented" one and >>>>>>>> forget about the path to turn on, but I think that would lead to make >>>>>>>> the >>>>>>>> feature be "broken window" even we are not able to touch. >>>>>>>> >>>>>>>> On Tue, May 12, 2020 at 6:45 AM Russell Spitzer < >>>>>>>> russell.spit...@gmail.com> wrote: >>>>>>>> >>>>>>>>> I think reverting 30098 is the right decision here if we want to >>>>>>>>> unblock 3.0. We shouldn't ship with features which we know do not >>>>>>>>> function >>>>>>>>> in the way we intend, regardless of how little exposure most users >>>>>>>>> have to >>>>>>>>> them. Even if it's off my default, we should probably work to avoid >>>>>>>>> switches that cause things to behave unpredictably or require a flow >>>>>>>>> chart >>>>>>>>> to actually determine what will happen. >>>>>>>>> >>>>>>>>> On Mon, May 11, 2020 at 3:07 PM Ryan Blue >>>>>>>>> <rb...@netflix.com.invalid> wrote: >>>>>>>>> >>>>>>>>>> I'm all for fixing behavior in master by turning this off as an >>>>>>>>>> intermediate step, but I don't think that Spark 3.0 can safely >>>>>>>>>> include >>>>>>>>>> SPARK-30098. >>>>>>>>>> >>>>>>>>>> The problem is that SPARK-30098 introduces strange behavior, as >>>>>>>>>> Jungtaek pointed out. And that behavior is not fully understood. >>>>>>>>>> While >>>>>>>>>> working on a unified CREATE TABLE syntax, I hit additional test >>>>>>>>>> failures >>>>>>>>>> <https://github.com/apache/spark/pull/28026#issuecomment-606967363> >>>>>>>>>> where the wrong create path was being used. >>>>>>>>>> >>>>>>>>>> Unless we plan to NOT support the behavior >>>>>>>>>> when spark.sql.legacy.createHiveTableByDefault.enabled is disabled, >>>>>>>>>> we >>>>>>>>>> should not ship Spark 3.0 with SPARK-30098. Otherwise, we will have >>>>>>>>>> to deal >>>>>>>>>> with this problem for years to come. >>>>>>>>>> >>>>>>>>>> On Mon, May 11, 2020 at 1:06 AM JackyLee <qcsd2...@163.com> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> +1. Agree with Xiao Li and Jungtaek Lim. >>>>>>>>>>> >>>>>>>>>>> This seems to be controversial, and can not be done in a short >>>>>>>>>>> time. It is >>>>>>>>>>> necessary to choose option 1 to unblock Spark 3.0 and support it >>>>>>>>>>> in 3.1. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Sent from: >>>>>>>>>>> http://apache-spark-developers-list.1001551.n3.nabble.com/ >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> --------------------------------------------------------------------- >>>>>>>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Ryan Blue >>>>>>>>>> Software Engineer >>>>>>>>>> Netflix >>>>>>>>>> >>>>>>>>> >>>>>> >>>>>> -- >>>>>> Ryan Blue >>>>>> Software Engineer >>>>>> Netflix >>>>>> >>>>> >> >> -- >> Ryan Blue >> Software Engineer >> Netflix >> >