+1 for the approach Jungtaek suggests. That will avoid needing to support
behavior that is not well understood with minimal changes.

On Tue, May 12, 2020 at 1:45 AM Jungtaek Lim <kabhwan.opensou...@gmail.com>
wrote:

> Before I forget, we'd better not forget to change the doc, as create table
> doc looks to represent current syntax which will be incorrect later.
>
> On Tue, May 12, 2020 at 5:32 PM Jungtaek Lim <kabhwan.opensou...@gmail.com>
> wrote:
>
>> It's not only for end users, but also for us. Spark itself uses the
>> config "true" and "false" in tests and it still brings confusion. We still
>> have to deal with both situations.
>>
>> I'm wondering how long days it would be needed to revert it cleanly, but
>> if we worry about the amount of code change just around the new RC, what
>> about make the code dirty (should be fixed soon) but less headache via
>> applying traditional (and bad) way?
>>
>> Let's just remove the config so that the config cannot be used in any way
>> (even in Spark codebase), and set corresponding field in parser to the
>> constant value so that no one can modify in any way. This would make the
>> dead code by intention which should be cleaned it up later, so let's add
>> FIXME comment there so that anyone can take it up for cleaning up the code
>> later. (If no one volunteers then I'll probably pick up.)
>>
>> That is a bad pattern, but still better as we prevent end users (even
>> early adopters) go through the undocumented path in any way, and that will
>> be explicitly marked as "should be fixed". This is different from retaining
>> config - I don't expect unified create table syntax will be landed in
>> bugfix version, so even unified create table syntax can be landed in 3.1.0
>> (this is also not guaranteed) the config will live in 3.0.x in any way. If
>> we temporarily go dirty way then we can clean up the code in any version,
>> even from bugfix version, maybe within a couple of weeks just after 3.0.0
>> is released.
>>
>> Does it sound valid?
>>
>> On Tue, May 12, 2020 at 2:35 PM Wenchen Fan <cloud0...@gmail.com> wrote:
>>
>>> SPARK-30098 was merged about 6 months ago. It's not a clean revert and
>>> we may need to spend quite a bit of time to resolve conflicts and fix tests.
>>>
>>> I don't see why it's still a problem if a feature is disabled and hidden
>>> from end-users (it's undocumented, the config is internal). The related
>>> code will be replaced in the master branch sooner or later, when we unify
>>> the syntaxes.
>>>
>>>
>>>
>>> On Tue, May 12, 2020 at 6:16 AM Ryan Blue <rb...@netflix.com.invalid>
>>> wrote:
>>>
>>>> I'm all for getting the unified syntax into master. The only issue
>>>> appears to be whether or not to pass the presence of the EXTERNAL keyword
>>>> through to a catalog in v2. Maybe it's time to start a discuss thread for
>>>> that issue so we're not stuck for another 6 weeks on it.
>>>>
>>>> On Mon, May 11, 2020 at 3:13 PM Jungtaek Lim <
>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>
>>>>> Btw another wondering here is, is it good to retain the flag on master
>>>>> as an intermediate step? Wouldn't it be better for us to start "unified
>>>>> create table syntax" from scratch?
>>>>>
>>>>>
>>>>> On Tue, May 12, 2020 at 6:50 AM Jungtaek Lim <
>>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>>
>>>>>> I'm sorry, but I have to agree with Ryan and Russell. I chose the
>>>>>> option 1 because it's less worse than option 2, but it doesn't mean I 
>>>>>> fully
>>>>>> agree with option 1.
>>>>>>
>>>>>> Let's make below things clear if we really go with option 1,
>>>>>> otherwise please consider reverting it.
>>>>>>
>>>>>> * Do you fully indicate about "all" the paths where the second create
>>>>>> table syntax is taken?
>>>>>> * Could you explain "why" to end users without any confusion? Do you
>>>>>> think end users will understand it easily?
>>>>>> * Do you have an actual end users to guide to turn this on? Or do you
>>>>>> have a plan to turn this on for your team/customers and deal with
>>>>>> the ambiguity?
>>>>>> * Could you please document about how things will change if the flag
>>>>>> is turned on?
>>>>>>
>>>>>> I guess the option 1 is to leave a flag as "undocumented" one and
>>>>>> forget about the path to turn on, but I think that would lead to make the
>>>>>> feature be "broken window" even we are not able to touch.
>>>>>>
>>>>>> On Tue, May 12, 2020 at 6:45 AM Russell Spitzer <
>>>>>> russell.spit...@gmail.com> wrote:
>>>>>>
>>>>>>> I think reverting 30098 is the right decision here if we want to
>>>>>>> unblock 3.0. We shouldn't ship with features which we know do not 
>>>>>>> function
>>>>>>> in the way we intend, regardless of how little exposure most users have 
>>>>>>> to
>>>>>>> them. Even if it's off my default, we should probably work to avoid
>>>>>>> switches that cause things to behave unpredictably or require a flow 
>>>>>>> chart
>>>>>>> to actually determine what will happen.
>>>>>>>
>>>>>>> On Mon, May 11, 2020 at 3:07 PM Ryan Blue <rb...@netflix.com.invalid>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I'm all for fixing behavior in master by turning this off as an
>>>>>>>> intermediate step, but I don't think that Spark 3.0 can safely include
>>>>>>>> SPARK-30098.
>>>>>>>>
>>>>>>>> The problem is that SPARK-30098 introduces strange behavior, as
>>>>>>>> Jungtaek pointed out. And that behavior is not fully understood. While
>>>>>>>> working on a unified CREATE TABLE syntax, I hit additional test
>>>>>>>> failures
>>>>>>>> <https://github.com/apache/spark/pull/28026#issuecomment-606967363>
>>>>>>>> where the wrong create path was being used.
>>>>>>>>
>>>>>>>> Unless we plan to NOT support the behavior
>>>>>>>> when spark.sql.legacy.createHiveTableByDefault.enabled is disabled, we
>>>>>>>> should not ship Spark 3.0 with SPARK-30098. Otherwise, we will have to 
>>>>>>>> deal
>>>>>>>> with this problem for years to come.
>>>>>>>>
>>>>>>>> On Mon, May 11, 2020 at 1:06 AM JackyLee <qcsd2...@163.com> wrote:
>>>>>>>>
>>>>>>>>> +1. Agree with Xiao Li and Jungtaek Lim.
>>>>>>>>>
>>>>>>>>> This seems to be controversial, and can not be done in a short
>>>>>>>>> time. It is
>>>>>>>>> necessary to choose option 1 to unblock Spark 3.0 and support it
>>>>>>>>> in 3.1.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Sent from:
>>>>>>>>> http://apache-spark-developers-list.1001551.n3.nabble.com/
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Ryan Blue
>>>>>>>> Software Engineer
>>>>>>>> Netflix
>>>>>>>>
>>>>>>>
>>>>
>>>> --
>>>> Ryan Blue
>>>> Software Engineer
>>>> Netflix
>>>>
>>>

-- 
Ryan Blue
Software Engineer
Netflix

Reply via email to