Looks like I'm the only person who is willing to +1 to #2 for now :-)
But I would suggest to change the keyword from GLOBAL to
something like BUILTIN.

I think #2 and #3 are almost the same proposal, just with different
format to indicate whether it want to override built-in functions.

My biggest reason to choose it is I want this behavior be consistent
with temporal tables. I will give some examples to show the behavior
and also make sure I'm not misunderstanding anything here.

For most DBs, when user create a temporary table with:

CREATE TEMPORARY TABLE t1

It's actually equivalent with:

CREATE TEMPORARY TABLE `curent_db`.t1

If user change current database, they will not be able to access t1 without
fully qualified name, .i.e db1.t1 (assuming db1 is current database when
this temporary table is created).

Only #2 and #3 followed this behavior and I would vote for this since this
makes such behavior consistent through temporal tables and functions.

Why I'm not voting for #3 is a special catalog and database just looks very
hacky to me. It gave a imply that our built-in functions saved at a special
catalog and database, which is actually not. Introducing a dedicated keyword
like CREATE TEMPORARY BUILTIN FUNCTION looks more clear and
straightforward. One can argue that we should avoid introducing new keyword,
but it's also very rare that a system can overwrite built-in functions.
Since we
decided to support this, introduce a new keyword is not a big deal IMO.

Best,
Kurt


On Thu, Sep 19, 2019 at 3:07 PM Piotr Nowojski <pi...@ververica.com> wrote:

> Hi,
>
> It is a quite long discussion to follow and I hope I didn’t misunderstand
> anything. From the proposals presented by Xuefu I would vote:
>
> -1 for #1 and #2
> +1 for #3
>
> Besides #3 being IMO more general and more consistent, having qualified
> names (#3) would help/make easier for someone to use cross
> databases/catalogs queries (joining multiple data sets/streams). For
> example with some functions to manipulate/clean up/convert the stored data
> in different catalogs registered in the respective catalogs.
>
> Piotrek
>
> > On 19 Sep 2019, at 06:35, Jark Wu <imj...@gmail.com> wrote:
> >
> > I agree with Xuefu that inconsistent handling with all the other objects
> is
> > not a big problem.
> >
> > Regarding to option#3, the special "system.system" namespace may confuse
> > users.
> > Users need to know the set of built-in function names to know when to use
> > "system.system" namespace.
> > What will happen if user registers a non-builtin function name under the
> > "system.system" namespace?
> > Besides, I think it doesn't solve the "explode" problem I mentioned at
> the
> > beginning of this thread.
> >
> > So here is my vote:
> >
> > +1 for #1
> > 0 for #2
> > -1 for #3
> >
> > Best,
> > Jark
> >
> >
> > On Thu, 19 Sep 2019 at 08:38, Xuefu Z <usxu...@gmail.com> wrote:
> >
> >> @Dawid, Re: we also don't need additional referencing the specialcatalog
> >> anywhere.
> >>
> >> True. But once we allow such reference, then user can do so in any
> possible
> >> place where a function name is expected, for which we have to handle.
> >> That's a big difference, I think.
> >>
> >> Thanks,
> >> Xuefu
> >>
> >> On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <
> >> wysakowicz.da...@gmail.com>
> >> wrote:
> >>
> >>> @Bowen I am not suggesting introducing additional catalog. I think we
> >> need
> >>> to get rid of the current built-in catalog.
> >>>
> >>> @Xuefu in option #3 we also don't need additional referencing the
> special
> >>> catalog anywhere else besides in the CREATE statement. The resolution
> >>> behaviour is exactly the same in both options.
> >>>
> >>> On Thu, 19 Sep 2019, 08:17 Xuefu Z, <usxu...@gmail.com> wrote:
> >>>
> >>>> Hi Dawid,
> >>>>
> >>>> "GLOBAL" is a temporary keyword that was given to the approach. It can
> >> be
> >>>> changed to something else for better.
> >>>>
> >>>> The difference between this and the #3 approach is that we only need
> >> the
> >>>> keyword for this create DDL. For other places (such as function
> >>>> referencing), no keyword or special namespace is needed.
> >>>>
> >>>> Thanks,
> >>>> Xuefu
> >>>>
> >>>> On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz <
> >>>> wysakowicz.da...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi,
> >>>>> I think it makes sense to start voting at this point.
> >>>>>
> >>>>> Option 1: Only 1-part identifiers
> >>>>> PROS:
> >>>>> - allows shadowing built-in functions
> >>>>> CONS:
> >>>>> - incosistent with all the other objects, both permanent & temporary
> >>>>> - does not allow shadowing catalog functions
> >>>>>
> >>>>> Option 2: Special keyword for built-in function
> >>>>> I think this is quite similar to the special catalog/db. The thing I
> >> am
> >>>>> strongly against in this proposal is the GLOBAL keyword. This keyword
> >>>> has a
> >>>>> meaning in rdbms systems and means a function that is present for a
> >>>>> lifetime of a session in which it was created, but available in all
> >>> other
> >>>>> sessions. Therefore I really don't want to use this keyword in a
> >>>> different
> >>>>> context.
> >>>>>
> >>>>> Option 3: Special catalog/db
> >>>>>
> >>>>> PROS:
> >>>>> - allows shadowing built-in functions
> >>>>> - allows shadowing catalog functions
> >>>>> - consistent with other objects
> >>>>> CONS:
> >>>>> - we introduce a special namespace for built-in functions
> >>>>>
> >>>>> I don't see a problem with introducing the special namespace. In the
> >>> end
> >>>> it
> >>>>> is very similar to the keyword approach. In this case the catalog/db
> >>>>> combination would be the "keyword"
> >>>>>
> >>>>> Therefore my votes:
> >>>>> Option 1: -0
> >>>>> Option 2: -1 (I might change to +0 if we can come up with a better
> >>>> keyword)
> >>>>> Option 3: +1
> >>>>>
> >>>>> Best,
> >>>>> Dawid
> >>>>>
> >>>>>
> >>>>> On Thu, 19 Sep 2019, 05:12 Xuefu Z, <usxu...@gmail.com> wrote:
> >>>>>
> >>>>>> Hi Aljoscha,
> >>>>>>
> >>>>>> Thanks for the summary and these are great questions to be
> >> answered.
> >>>> The
> >>>>>> answer to your first question is clear: there is a general
> >> agreement
> >>> to
> >>>>>> override built-in functions with temp functions.
> >>>>>>
> >>>>>> However, your second and third questions are sort of related, as a
> >>>>> function
> >>>>>> reference can be either just function name (like "func") or in the
> >>> form
> >>>>> or
> >>>>>> "cat.db.func". When a reference is just function name, it can mean
> >>>>> either a
> >>>>>> built-in function or a function defined in the current cat/db. If
> >> we
> >>>>>> support overriding a built-in function with a temp function, such
> >>>>>> overriding can also cover a function in the current cat/db.
> >>>>>>
> >>>>>> I think what Timo referred as "overriding a catalog function"
> >> means a
> >>>>> temp
> >>>>>> function defined as "cat.db.func" overrides a catalog function
> >> "func"
> >>>> in
> >>>>>> cat/db even if cat/db is not current. To support this, temp
> >> function
> >>>> has
> >>>>> to
> >>>>>> be tied to a cat/db. What's why I said above that the 2nd and 3rd
> >>>>> questions
> >>>>>> are related. The problem with such support is the ambiguity when
> >> user
> >>>>>> defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func
> >>> ...".
> >>>>>> Here "func" can means a global temp function, or a temp function in
> >>>>> current
> >>>>>> cat/db. If we can assume the former, this creates an inconsistency
> >>>>> because
> >>>>>> "CREATE FUNCTION func" actually means a function in current cat/db.
> >>> If
> >>>> we
> >>>>>> assume the latter, then there is no way for user to create a global
> >>>> temp
> >>>>>> function.
> >>>>>>
> >>>>>> Giving a special namespace for built-in functions may solve the
> >>>> ambiguity
> >>>>>> problem above, but it also introduces artificial catalog/database
> >>> that
> >>>>>> needs special treatment and pollutes the cleanness of  the code. I
> >>>> would
> >>>>>> rather introduce a syntax in DDL to solve the problem, like "CREATE
> >>>>>> [GLOBAL] TEMPORARY FUNCTION func".
> >>>>>>
> >>>>>> Thus, I'd like to summarize a few candidate proposals for voting
> >>>>> purposes:
> >>>>>>
> >>>>>> 1. Support only global, temporary functions without namespace. Such
> >>>> temp
> >>>>>> functions overrides built-in functions and catalog functions in
> >>> current
> >>>>>> cat/db. The resolution order is: temp functions -> built-in
> >> functions
> >>>> ->
> >>>>>> catalog functions. (Partially or fully qualified functions has no
> >>>>>> ambiguity!)
> >>>>>>
> >>>>>> 2. In addition to #1, support creating and referencing temporary
> >>>>> functions
> >>>>>> associated with a cat/db with "GLOBAL" qualifier in DDL for global
> >>> temp
> >>>>>> functions. The resolution order is: global temp functions ->
> >> built-in
> >>>>>> functions -> temp functions in current cat/db -> catalog function.
> >>>>>> (Resolution for partially or fully qualified function reference is:
> >>>> temp
> >>>>>> functions -> persistent functions.)
> >>>>>>
> >>>>>> 3. In addition to #1, support creating and referencing temporary
> >>>>> functions
> >>>>>> associated with a cat/db with a special namespace for built-in
> >>>> functions
> >>>>>> and global temp functions. The resolution is the same as #2, except
> >>>> that
> >>>>>> the special namespace might be prefixed to a reference to a
> >> built-in
> >>>>>> function or global temp function. (In absence of the special
> >>> namespace,
> >>>>> the
> >>>>>> resolution order is the same as in #2.)
> >>>>>>
> >>>>>> My personal preference is #1, given the unknown use case and
> >>> introduced
> >>>>>> complexity for #2 and #3. However, #2 is an acceptable alternative.
> >>>> Thus,
> >>>>>> my votes are:
> >>>>>>
> >>>>>> +1 for #1
> >>>>>> +0 for #2
> >>>>>> -1 for #3
> >>>>>>
> >>>>>> Everyone, please cast your vote (in above format please!), or let
> >> me
> >>>> know
> >>>>>> if you have more questions or other candidates.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Xuefu
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <
> >>> aljos...@apache.org>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I think this discussion and the one for FLIP-64 are very
> >> connected.
> >>>> To
> >>>>>>> resolve the differences, think we have to think about the basic
> >>>>>> principles
> >>>>>>> and find consensus there. The basic questions I see are:
> >>>>>>>
> >>>>>>> - Do we want to support overriding builtin functions?
> >>>>>>> - Do we want to support overriding catalog functions?
> >>>>>>> - And then later: should temporary functions be tied to a
> >>>>>>> catalog/database?
> >>>>>>>
> >>>>>>> I don’t have much to say about these, except that we should
> >>> somewhat
> >>>>>> stick
> >>>>>>> to what the industry does. But I also understand that the
> >> industry
> >>> is
> >>>>>>> already very divided on this.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Aljoscha
> >>>>>>>
> >>>>>>>> On 18. Sep 2019, at 11:41, Jark Wu <imj...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> +1 to strive for reaching consensus on the remaining topics. We
> >>> are
> >>>>>>> close to the truth. It will waste a lot of time if we resume the
> >>>> topic
> >>>>>> some
> >>>>>>> time later.
> >>>>>>>>
> >>>>>>>> +1 to “1-part/override” and I’m also fine with Timo’s
> >>> “cat.db.fun”
> >>>>> way
> >>>>>>> to override a catalog function.
> >>>>>>>>
> >>>>>>>> I’m not sure about “system.system.fun”, it introduces a
> >>> nonexistent
> >>>>> cat
> >>>>>>> & db? And we still need to do special treatment for the dedicated
> >>>>>>> system.system cat & db?
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Jark
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> 在 2019年9月18日,06:54,Timo Walther <twal...@apache.org> 写道:
> >>>>>>>>>
> >>>>>>>>> Hi everyone,
> >>>>>>>>>
> >>>>>>>>> @Xuefu: I would like to avoid adding too many things
> >>>> incrementally.
> >>>>>>> Users should be able to override all catalog objects consistently
> >>>>>> according
> >>>>>>> to FLIP-64 (Support for Temporary Objects in Table module). If
> >>>>> functions
> >>>>>>> are treated completely different, we need more code and special
> >>>> cases.
> >>>>>> From
> >>>>>>> an implementation perspective, this topic only affects the lookup
> >>>> logic
> >>>>>>> which is rather low implementation effort which is why I would
> >> like
> >>>> to
> >>>>>>> clarify the remaining items. As you said, we have a slight
> >> consenus
> >>>> on
> >>>>>>> overriding built-in functions; we should also strive for reaching
> >>>>>> consensus
> >>>>>>> on the remaining topics.
> >>>>>>>>>
> >>>>>>>>> @Dawid: I like your idea as it ensures registering catalog
> >>> objects
> >>>>>>> consistent and the overriding of built-in functions more
> >> explicit.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Timo
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 17.09.19 11:59, kai wang wrote:
> >>>>>>>>>> hi, everyone
> >>>>>>>>>> I think this flip is very meaningful. it supports functions
> >>> that
> >>>>> can
> >>>>>> be
> >>>>>>>>>> shared by different catalogs and dbs, reducing the
> >> duplication
> >>> of
> >>>>>>> functions.
> >>>>>>>>>>
> >>>>>>>>>> Our group based on flink's sql parser module implements
> >> create
> >>>>>> function
> >>>>>>>>>> feature, stores the parsed function metadata and schema into
> >>>> mysql,
> >>>>>> and
> >>>>>>>>>> also customizes the catalog, customizes sql-client to support
> >>>>> custom
> >>>>>>>>>> schemas and functions. Loaded, but the function is currently
> >>>>> global,
> >>>>>>> and is
> >>>>>>>>>> not subdivided according to catalog and db.
> >>>>>>>>>>
> >>>>>>>>>> In addition, I very much hope to participate in the
> >> development
> >>>> of
> >>>>>> this
> >>>>>>>>>> flip, I have been paying attention to the community, but
> >> found
> >>> it
> >>>>> is
> >>>>>>> more
> >>>>>>>>>> difficult to join.
> >>>>>>>>>> thank you.
> >>>>>>>>>>
> >>>>>>>>>> Xuefu Z <usxu...@gmail.com> 于2019年9月17日周二 上午11:19写道:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks to Tmo and Dawid for sharing thoughts.
> >>>>>>>>>>>
> >>>>>>>>>>> It seems to me that there is a general consensus on having
> >>> temp
> >>>>>>> functions
> >>>>>>>>>>> that have no namespaces and overwrite built-in functions.
> >> (As
> >>> a
> >>>>> side
> >>>>>>> note
> >>>>>>>>>>> for comparability, the current user defined functions are
> >> all
> >>>>>>> temporary and
> >>>>>>>>>>> having no namespaces.)
> >>>>>>>>>>>
> >>>>>>>>>>> Nevertheless, I can also see the merit of having namespaced
> >>> temp
> >>>>>>> functions
> >>>>>>>>>>> that can overwrite functions defined in a specific cat/db.
> >>>>> However,
> >>>>>>> this
> >>>>>>>>>>> idea appears orthogonal to the former and can be added
> >>>>>> incrementally.
> >>>>>>>>>>>
> >>>>>>>>>>> How about we first implement non-namespaced temp functions
> >> now
> >>>> and
> >>>>>>> leave
> >>>>>>>>>>> the door open for namespaced ones for later releases as the
> >>>>>>> requirement
> >>>>>>>>>>> might become more crystal? This also helps shorten the
> >> debate
> >>>> and
> >>>>>>> allow us
> >>>>>>>>>>> to make some progress along this direction.
> >>>>>>>>>>>
> >>>>>>>>>>> As to Dawid's idea of having a dedicated cat/db to host the
> >>>>>> temporary
> >>>>>>> temp
> >>>>>>>>>>> functions that don't have namespaces, my only concern is the
> >>>>> special
> >>>>>>>>>>> treatment for a cat/db, which makes code less clean, as
> >>> evident
> >>>> in
> >>>>>>> treating
> >>>>>>>>>>> the built-in catalog currently.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Xuefiu
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
> >>>>>>>>>>> wysakowicz.da...@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>> Another idea to consider on top of Timo's suggestion. How
> >>> about
> >>>>> we
> >>>>>>> have a
> >>>>>>>>>>>> special namespace (catalog + database) for built-in
> >> objects?
> >>>> This
> >>>>>>> catalog
> >>>>>>>>>>>> would be invisible for users as Xuefu was suggesting.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Then users could still override built-in functions, if they
> >>>> fully
> >>>>>>> qualify
> >>>>>>>>>>>> object with the built-in namespace, but by default the
> >> common
> >>>>> logic
> >>>>>>> of
> >>>>>>>>>>>> current dB & cat would be used.
> >>>>>>>>>>>>
> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION func ...
> >>>>>>>>>>>> registers temporary function in current cat & dB
> >>>>>>>>>>>>
> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
> >>>>>>>>>>>> registers temporary function in cat db
> >>>>>>>>>>>>
> >>>>>>>>>>>> CREATE TEMPORARY FUNCTION system.system.func ...
> >>>>>>>>>>>> Overrides built-in function with temporary function
> >>>>>>>>>>>>
> >>>>>>>>>>>> The built-in/system namespace would not be writable for
> >>>> permanent
> >>>>>>>>>>> objects.
> >>>>>>>>>>>> WDYT?
> >>>>>>>>>>>>
> >>>>>>>>>>>> This way I think we can have benefits of both solutions.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best,
> >>>>>>>>>>>> Dawid
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <
> >> twal...@apache.org
> >>>>
> >>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Bowen,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I understand the potential benefit of overriding certain
> >>>>> built-in
> >>>>>>>>>>>>> functions. I'm open to such a feature if many people
> >> agree.
> >>>>>>> However, it
> >>>>>>>>>>>>> would be great to still support overriding catalog
> >> functions
> >>>>> with
> >>>>>>>>>>>>> temporary functions in order to prototype a query even
> >>> though
> >>>> a
> >>>>>>>>>>>>> catalog/database might not be available currently or
> >> should
> >>>> not
> >>>>> be
> >>>>>>>>>>>>> modified yet. How about we support both cases?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION abs
> >>>>>>>>>>>>> -> creates/overrides a built-in function and never
> >>> consideres
> >>>>>>> current
> >>>>>>>>>>>>> catalog and database; inconsistent with other DDL but
> >>>> acceptable
> >>>>>> for
> >>>>>>>>>>>>> functions I guess.
> >>>>>>>>>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
> >>>>>>>>>>>>> -> creates/overrides a catalog function
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regarding "Flink don't have any other built-in objects
> >>>> (tables,
> >>>>>>> views)
> >>>>>>>>>>>>> except functions", this might change in the near future.
> >>> Take
> >>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an
> >>>>> example.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Timo
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 14.09.19 01:40, Bowen Li wrote:
> >>>>>>>>>>>>>> Hi Fabian,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Yes, I agree 1-part/no-override is the least favorable
> >>> thus I
> >>>>>>> didn't
> >>>>>>>>>>>>>> include that as a voting option, and the discussion is
> >>> mainly
> >>>>>>> between
> >>>>>>>>>>>>>> 1-part/override builtin and 3-part/not override builtin.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Re > However, it means that temp functions are
> >> differently
> >>>>>> treated
> >>>>>>>>>>> than
> >>>>>>>>>>>>>> other db objects.
> >>>>>>>>>>>>>> IMO, the treatment difference results from the fact that
> >>>>>> functions
> >>>>>>>>>>> are
> >>>>>>>>>>>> a
> >>>>>>>>>>>>>> bit different from other objects - Flink don't have any
> >>> other
> >>>>>>>>>>> built-in
> >>>>>>>>>>>>>> objects (tables, views) except functions.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>> Bowen
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> --
> >>>>>>>>>>> Xuefu Zhang
> >>>>>>>>>>>
> >>>>>>>>>>> "In Honey We Trust!"
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Xuefu Zhang
> >>>>>>
> >>>>>> "In Honey We Trust!"
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Xuefu Zhang
> >>>>
> >>>> "In Honey We Trust!"
> >>>>
> >>>
> >>
> >>
> >> --
> >> Xuefu Zhang
> >>
> >> "In Honey We Trust!"
> >>
>
>

Reply via email to