Hi,

This is quite a long discussion to follow, and I hope I didn’t misunderstand
anything. Of the proposals presented by Xuefu, I would vote:

-1 for #1 and #2 
+1 for #3

Besides #3 being IMO more general and more consistent, having qualified names
(#3) would make it easier to write cross-database/cross-catalog queries
(joining multiple data sets/streams). For example, such a query could use
functions registered in the respective catalogs to manipulate, clean up, or
convert the data stored in each of those catalogs.
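
To make the cross-catalog case concrete, here is a rough sketch in Python of how lookup could behave under #3. It is only an illustration: `FunctionRegistry`, `create_temporary` and `resolve` are hypothetical names and not Flink APIs, and the lookup order is simply the one Xuefu lists for #2/#3 in the quoted mail below.

```python
# Hypothetical sketch of option #3 lookup; none of these names are Flink APIs.
SYSTEM = ("system", "system")  # special namespace discussed in this thread


class FunctionRegistry:
    def __init__(self, current_catalog, current_db):
        self.current = (current_catalog, current_db)
        self.builtin = {}     # name -> function
        self.temporary = {}   # (cat, db, name) -> function
        self.persistent = {}  # (cat, db, name) -> function

    def create_temporary(self, name, fn, catalog=None, db=None):
        # Unqualified names land in the current catalog/database.
        cat, database = (catalog, db) if catalog else self.current
        self.temporary[(cat, database, name)] = fn

    def resolve(self, name, catalog=None, db=None):
        if catalog is not None:
            # Qualified reference: temp functions -> persistent functions.
            key = (catalog, db, name)
            if key in self.temporary:
                return self.temporary[key]
            if (catalog, db) == SYSTEM:
                return self.builtin[name]
            return self.persistent[key]
        # Unqualified reference, in the order listed for #2/#3:
        # global temp -> built-in -> temp in current cat/db -> catalog function.
        candidates = (
            self.temporary.get(SYSTEM + (name,)),
            self.builtin.get(name),
            self.temporary.get(self.current + (name,)),
            self.persistent.get(self.current + (name,)),
        )
        for fn in candidates:
            if fn is not None:
                return fn
        raise KeyError(name)


registry = FunctionRegistry("cat1", "db1")
registry.persistent[("cat2", "db2", "clean")] = str.strip
# A temp function tied to cat2.db2 shadows the catalog function there,
# even though cat1.db1 is the current namespace.
registry.create_temporary("clean", str.upper, catalog="cat2", db="db2")
print(registry.resolve("clean", "cat2", "db2")(" a "))
```

With qualified names, a query joining data from cat1 and cat2 could refer to each catalog's functions explicitly, and a temporary function could shadow exactly one of them.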

Piotrek 

> On 19 Sep 2019, at 06:35, Jark Wu <imj...@gmail.com> wrote:
> 
> I agree with Xuefu that inconsistent handling with all the other objects is
> not a big problem.
> 
> Regarding option #3, the special "system.system" namespace may confuse
> users.
> Users need to know the set of built-in function names to know when to use
> the "system.system" namespace.
> What will happen if a user registers a non-built-in function name under the
> "system.system" namespace?
> Besides, I think it doesn't solve the "explode" problem I mentioned at the
> beginning of this thread.
> 
> So here is my vote:
> 
> +1 for #1
> 0 for #2
> -1 for #3
> 
> Best,
> Jark
> 
> 
> On Thu, 19 Sep 2019 at 08:38, Xuefu Z <usxu...@gmail.com> wrote:
> 
>> @Dawid, Re: we also don't need additional referencing the special catalog
>> anywhere.
>> 
>> True. But once we allow such a reference, users can use it in any
>> place where a function name is expected, and we have to handle that.
>> That's a big difference, I think.
>> 
>> Thanks,
>> Xuefu
>> 
>> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <
>> wysakowicz.da...@gmail.com>
>> wrote:
>> 
>>> @Bowen I am not suggesting introducing an additional catalog. I think we
>>> need to get rid of the current built-in catalog.
>>> 
>>> @Xuefu in option #3 we also don't need additional referencing the special
>>> catalog anywhere else besides in the CREATE statement. The resolution
>>> behaviour is exactly the same in both options.
>>> 
>>> On Thu, 19 Sep 2019, 08:17 Xuefu Z, <usxu...@gmail.com> wrote:
>>> 
>>>> Hi Dawid,
>>>> 
>>>> "GLOBAL" is a temporary keyword that was given to the approach. It can
>>>> be changed to something better.
>>>> 
>>>> The difference between this and the #3 approach is that we only need
>>>> the keyword for the CREATE DDL. For other places (such as function
>>>> referencing), no keyword or special namespace is needed.
>>>> 
>>>> Thanks,
>>>> Xuefu
>>>> 
>>>> On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <
>>>> wysakowicz.da...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Hi,
>>>>> I think it makes sense to start voting at this point.
>>>>> 
>>>>> Option 1: Only 1-part identifiers
>>>>> PROS:
>>>>> - allows shadowing built-in functions
>>>>> CONS:
>>>>> - inconsistent with all the other objects, both permanent & temporary
>>>>> - does not allow shadowing catalog functions
>>>>> 
>>>>> Option 2: Special keyword for built-in function
>>>>> I think this is quite similar to the special catalog/db. The thing I
>>>>> am strongly against in this proposal is the GLOBAL keyword. This keyword
>>>>> has a meaning in RDBMS systems: it denotes a function that is present
>>>>> for the lifetime of the session in which it was created, but available
>>>>> in all other sessions. Therefore I really don't want to use this
>>>>> keyword in a different context.
>>>>> 
>>>>> Option 3: Special catalog/db
>>>>> 
>>>>> PROS:
>>>>> - allows shadowing built-in functions
>>>>> - allows shadowing catalog functions
>>>>> - consistent with other objects
>>>>> CONS:
>>>>> - we introduce a special namespace for built-in functions
>>>>> 
>>>>> I don't see a problem with introducing the special namespace. In the
>>>>> end it is very similar to the keyword approach. In this case the
>>>>> catalog/db combination would be the "keyword".
>>>>> 
>>>>> Therefore my votes:
>>>>> Option 1: -0
>>>>> Option 2: -1 (I might change to +0 if we can come up with a better
>>>>> keyword)
>>>>> Option 3: +1
>>>>> 
>>>>> Best,
>>>>> Dawid
>>>>> 
>>>>> 
>>>>> On Thu, 19 Sep 2019, 05:12 Xuefu Z, <usxu...@gmail.com> wrote:
>>>>> 
>>>>>> Hi Aljoscha,
>>>>>> 
>>>>>> Thanks for the summary; these are great questions to be answered.
>>>>>> The answer to your first question is clear: there is a general
>>>>>> agreement to override built-in functions with temp functions.
>>>>>> 
>>>>>> However, your second and third questions are somewhat related, as a
>>>>>> function reference can be either just a function name (like "func") or
>>>>>> in the form of "cat.db.func". When a reference is just a function name,
>>>>>> it can mean either a built-in function or a function defined in the
>>>>>> current cat/db. If we support overriding a built-in function with a
>>>>>> temp function, such overriding can also cover a function in the
>>>>>> current cat/db.
>>>>>> 
>>>>>> I think what Timo referred to as "overriding a catalog function" means
>>>>>> that a temp function defined as "cat.db.func" overrides a catalog
>>>>>> function "func" in cat/db even if cat/db is not current. To support
>>>>>> this, a temp function has to be tied to a cat/db. That's why I said
>>>>>> above that the 2nd and 3rd questions are related. The problem with such
>>>>>> support is the ambiguity when a user defines a function w/o a
>>>>>> namespace, "CREATE TEMPORARY FUNCTION func ...". Here "func" can mean
>>>>>> a global temp function, or a temp function in the current cat/db. If we
>>>>>> assume the former, this creates an inconsistency because "CREATE
>>>>>> FUNCTION func" actually means a function in the current cat/db. If we
>>>>>> assume the latter, then there is no way for a user to create a global
>>>>>> temp function.
>>>>>> 
>>>>>> Giving a special namespace for built-in functions may solve the
>>>>>> ambiguity problem above, but it also introduces an artificial
>>>>>> catalog/database that needs special treatment and pollutes the
>>>>>> cleanness of the code. I would rather introduce a syntax in DDL to
>>>>>> solve the problem, like "CREATE [GLOBAL] TEMPORARY FUNCTION func".
>>>>>> 
>>>>>> Thus, I'd like to summarize a few candidate proposals for voting
>>>>>> purposes:
>>>>>> 
>>>>>> 1. Support only global, temporary functions without namespace. Such
>>>>>> temp functions override built-in functions and catalog functions in
>>>>>> the current cat/db. The resolution order is: temp functions -> built-in
>>>>>> functions -> catalog functions. (Partially or fully qualified
>>>>>> functions have no ambiguity!)
>>>>>> 
>>>>>> 2. In addition to #1, support creating and referencing temporary
>>>>>> functions associated with a cat/db, with a "GLOBAL" qualifier in DDL
>>>>>> for global temp functions. The resolution order is: global temp
>>>>>> functions -> built-in functions -> temp functions in current cat/db ->
>>>>>> catalog functions. (Resolution for a partially or fully qualified
>>>>>> function reference is: temp functions -> persistent functions.)
>>>>>> 
>>>>>> 3. In addition to #1, support creating and referencing temporary
>>>>>> functions associated with a cat/db, with a special namespace for
>>>>>> built-in functions and global temp functions. The resolution is the
>>>>>> same as #2, except that the special namespace might be prefixed to a
>>>>>> reference to a built-in function or global temp function. (In the
>>>>>> absence of the special namespace, the resolution order is the same as
>>>>>> in #2.)
>>>>>> 
>>>>>> My personal preference is #1, given the unknown use case and the
>>>>>> complexity introduced by #2 and #3. However, #2 is an acceptable
>>>>>> alternative. Thus, my votes are:
>>>>>> 
>>>>>> +1 for #1
>>>>>> +0 for #2
>>>>>> -1 for #3
>>>>>> 
>>>>>> Everyone, please cast your vote (in the above format please!), or let
>>>>>> me know if you have more questions or other candidates.
>>>>>> 
>>>>>> Thanks,
>>>>>> Xuefu
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <aljos...@apache.org>
>>>>>> wrote:
>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I think this discussion and the one for FLIP-64 are very connected.
>>>>>>> To resolve the differences, I think we have to think about the basic
>>>>>>> principles and find consensus there. The basic questions I see are:
>>>>>>> 
>>>>>>> - Do we want to support overriding builtin functions?
>>>>>>> - Do we want to support overriding catalog functions?
>>>>>>> - And then later: should temporary functions be tied to a
>>>>>>> catalog/database?
>>>>>>> 
>>>>>>> I don’t have much to say about these, except that we should somewhat
>>>>>>> stick to what the industry does. But I also understand that the
>>>>>>> industry is already very divided on this.
>>>>>>> 
>>>>>>> Best,
>>>>>>> Aljoscha
>>>>>>> 
>>>>>>>> On 18. Sep 2019, at 11:41, Jark Wu <imj...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> +1 to strive for reaching consensus on the remaining topics. We are
>>>>>>>> close to the truth. It will waste a lot of time if we resume the
>>>>>>>> topic some time later.
>>>>>>>> 
>>>>>>>> +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun”
>>>>>>>> way to override a catalog function.
>>>>>>>> 
>>>>>>>> I’m not sure about “system.system.fun”; it introduces a nonexistent
>>>>>>>> cat & db? And we still need to do special treatment for the dedicated
>>>>>>>> system.system cat & db?
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Jark
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On 18 Sep 2019, at 06:54, Timo Walther <twal...@apache.org> wrote:
>>>>>>>>> 
>>>>>>>>> Hi everyone,
>>>>>>>>> 
>>>>>>>>> @Xuefu: I would like to avoid adding too many things incrementally.
>>>>>>>>> Users should be able to override all catalog objects consistently
>>>>>>>>> according to FLIP-64 (Support for Temporary Objects in Table module).
>>>>>>>>> If functions are treated completely differently, we need more code
>>>>>>>>> and special cases. From an implementation perspective, this topic
>>>>>>>>> only affects the lookup logic, which is rather low implementation
>>>>>>>>> effort, which is why I would like to clarify the remaining items. As
>>>>>>>>> you said, we have a slight consensus on overriding built-in
>>>>>>>>> functions; we should also strive for reaching consensus on the
>>>>>>>>> remaining topics.
>>>>>>>>> 
>>>>>>>>> @Dawid: I like your idea as it makes registering catalog objects
>>>>>>>>> consistent and the overriding of built-in functions more explicit.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Timo
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 17.09.19 11:59, kai wang wrote:
>>>>>>>>>> Hi everyone,
>>>>>>>>>> I think this FLIP is very meaningful. It supports functions that
>>>>>>>>>> can be shared by different catalogs and dbs, reducing the
>>>>>>>>>> duplication of functions.
>>>>>>>>>> 
>>>>>>>>>> Our group implemented a create-function feature based on Flink's
>>>>>>>>>> SQL parser module. It stores the parsed function metadata and
>>>>>>>>>> schema in MySQL; we also customized the catalog and the sql-client
>>>>>>>>>> to support custom schemas and functions. These are loaded, but the
>>>>>>>>>> functions are currently global and are not subdivided according to
>>>>>>>>>> catalog and db.
>>>>>>>>>> 
>>>>>>>>>> In addition, I very much hope to participate in the development of
>>>>>>>>>> this FLIP. I have been following the community, but have found it
>>>>>>>>>> rather difficult to join.
>>>>>>>>>> Thank you.
>>>>>>>>>> 
>>>>>>>>>> On Tue, 17 Sep 2019 at 11:19 AM, Xuefu Z <usxu...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Thanks to Timo and Dawid for sharing thoughts.
>>>>>>>>>>> 
>>>>>>>>>>> It seems to me that there is a general consensus on having temp
>>>>>>>>>>> functions that have no namespaces and overwrite built-in
>>>>>>>>>>> functions. (As a side note for comparability, the current
>>>>>>>>>>> user-defined functions are all temporary and have no namespaces.)
>>>>>>>>>>> 
>>>>>>>>>>> Nevertheless, I can also see the merit of having namespaced temp
>>>>>>>>>>> functions that can overwrite functions defined in a specific
>>>>>>>>>>> cat/db. However, this idea appears orthogonal to the former and
>>>>>>>>>>> can be added incrementally.
>>>>>>>>>>> 
>>>>>>>>>>> How about we first implement non-namespaced temp functions now and
>>>>>>>>>>> leave the door open for namespaced ones in later releases, as the
>>>>>>>>>>> requirement might become clearer? This also helps shorten the
>>>>>>>>>>> debate and allows us to make some progress in this direction.
>>>>>>>>>>> 
>>>>>>>>>>> As to Dawid's idea of having a dedicated cat/db to host the
>>>>>>>>>>> temporary functions that don't have namespaces, my only concern is
>>>>>>>>>>> the special treatment for a cat/db, which makes code less clean,
>>>>>>>>>>> as is evident in how the built-in catalog is treated currently.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Xuefu
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
>>>>>>>>>>> wysakowicz.da...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> Another idea to consider on top of Timo's suggestion. How about we
>>>>>>>>>>>> have a special namespace (catalog + database) for built-in
>>>>>>>>>>>> objects? This catalog would be invisible to users, as Xuefu was
>>>>>>>>>>>> suggesting.
>>>>>>>>>>>> 
>>>>>>>>>>>> Then users could still override built-in functions, if they fully
>>>>>>>>>>>> qualify the object with the built-in namespace, but by default the
>>>>>>>>>>>> common logic of the current cat & db would be used.
>>>>>>>>>>>> 
>>>>>>>>>>>> CREATE TEMPORARY FUNCTION func ...
>>>>>>>>>>>> registers temporary function in current cat & db
>>>>>>>>>>>> 
>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
>>>>>>>>>>>> registers temporary function in cat db
>>>>>>>>>>>> 
>>>>>>>>>>>> CREATE TEMPORARY FUNCTION system.system.func ...
>>>>>>>>>>>> Overrides built-in function with temporary function
>>>>>>>>>>>> 
>>>>>>>>>>>> The built-in/system namespace would not be writable for permanent
>>>>>>>>>>>> objects.
>>>>>>>>>>>> WDYT?
>>>>>>>>>>>> 
>>>>>>>>>>>> This way I think we can have benefits of both solutions.
>>>>>>>>>>>> 
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Dawid
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <twal...@apache.org> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Bowen,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I understand the potential benefit of overriding certain built-in
>>>>>>>>>>>>> functions. I'm open to such a feature if many people agree.
>>>>>>>>>>>>> However, it would be great to still support overriding catalog
>>>>>>>>>>>>> functions with temporary functions in order to prototype a query
>>>>>>>>>>>>> even though a catalog/database might not be available currently
>>>>>>>>>>>>> or should not be modified yet. How about we support both cases?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> CREATE TEMPORARY FUNCTION abs
>>>>>>>>>>>>> -> creates/overrides a built-in function and never considers the
>>>>>>>>>>>>> current catalog and database; inconsistent with other DDL but
>>>>>>>>>>>>> acceptable for functions I guess.
>>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
>>>>>>>>>>>>> -> creates/overrides a catalog function
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Regarding "Flink doesn't have any other built-in objects (tables,
>>>>>>>>>>>>> views) except functions", this might change in the near future.
>>>>>>>>>>>>> Take https://issues.apache.org/jira/browse/FLINK-13900 as an
>>>>>>>>>>>>> example.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Timo
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On 14.09.19 01:40, Bowen Li wrote:
>>>>>>>>>>>>>> Hi Fabian,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Yes, I agree 1-part/no-override is the least favorable, thus I
>>>>>>>>>>>>>> didn't include that as a voting option, and the discussion is
>>>>>>>>>>>>>> mainly between 1-part/override builtin and 3-part/not override
>>>>>>>>>>>>>> builtin.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Re > However, it means that temp functions are differently
>>>>>>>>>>>>>> treated than other db objects.
>>>>>>>>>>>>>> IMO, the treatment difference results from the fact that
>>>>>>>>>>>>>> functions are a bit different from other objects - Flink doesn't
>>>>>>>>>>>>>> have any other built-in objects (tables, views) except functions.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Bowen
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> --
>>>>>>>>>>> Xuefu Zhang
>>>>>>>>>>> 
>>>>>>>>>>> "In Honey We Trust!"
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Xuefu Zhang
>>>>>> 
>>>>>> "In Honey We Trust!"
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Xuefu Zhang
>>>> 
>>>> "In Honey We Trust!"
>>>> 
>>> 
>> 
>> 
>> --
>> Xuefu Zhang
>> 
>> "In Honey We Trust!"
>> 
