Hi Aljoscha, Timo

Thanks for the reminder. I've update the details in FLIP wiki, and will
kick off a voting thread.

On Fri, Oct 4, 2019 at 1:51 PM Timo Walther <twal...@apache.org> wrote:

> Hi,
>
> I agree with Aljoscha. It is not transparent to me which votes are
> binding to the current status of the FLIP.
>
> Some other minor comments from my side:
>
> - We don't need to deprecate methods in FunctionCatalog. This class is
> internal. We can simply change the method signatures.
> - `String name` is missing in the FunctionIdentifier code example; can
> we call FunctionIdentifier.getSimpleName() just
> FunctionIdentifier.getName()?
> - Add the methods that we discussed to the example:  `of(String)`,
> `of(ObjectIdentifier)`
>
> Other than that, I'm happy to give my +1 to this proposal.
>
> Thanks for the productive discussion,
> Timo
>
>
> On 04.10.19 13:29, Aljoscha Krettek wrote:
> > Hi,
> >
> > I see there was quite some discussion and changes on the FLIP after this
> VOTE was started. I would suggest to start a new voting thread on the
> current state of the FLIP (keeping in mind that a FLIP vote needs at least
> three committer/PMC votes).
> >
> > For the future, we should probably keep discussion to the [DISCUSS]
> thread and use the vote thread only for voting.
> >
> > Best,
> > Aljoscha
> >
> >> On 3. Oct 2019, at 21:17, Bowen Li <bowenl...@gmail.com> wrote:
> >>
> >> I'm glad to announce that the community has accepted the design of
> FLIP-57,
> >> and we are moving forward to implementing it.
> >>
> >> Thanks everyone!
> >>
> >> On Wed, Oct 2, 2019 at 11:01 AM Bowen Li <bowenl...@gmail.com> wrote:
> >>
> >>> Introducing a new term "path" to APIs like
> >>> "getShortPath(Identifier)/getLongPath(Identifier)" would be confusing
> to
> >>> users, thus I feel "getSimpleName/getIdentifier" is fine.
> >>>
> >>> To summarize the discussion result.
> >>>
> >>>    - categorize functions into 2 dimensions - system v.s. catalog,
> >>>    non-temp v.s. temp - and that give us 4 combinations
> >>>    - definition of FunctionIdentifier
> >>>
> >>>                      @PublicEvolving
> >>>
> >>> Class FunctionIdentifier {
> >>>
> >>>     String name;
> >>>
> >>>     ObjectIdentifier oi;
> >>>
> >>>         // for temporary/non-temporary system function
> >>>     public FunctionIdentifier of(String name) {  }
> >>>     // for temporary/non-temporary catalog function
> >>>     public FunctionIdentifier of(ObjectIdentifier identifier) {  }
> >>>
> >>>
> >>>     Optional<ObjectIdentifier> getIdentifier() {}
> >>>
> >>>     Optional<String> getSimpleName() {}
> >>>
> >>> }
> >>>
> >>>
> >>> I've updated them to FLIP wiki. Please take a final look. I'll close
> the
> >>> voting if there's no other concern raised within 24 hours.
> >>>
> >>> Cheers
> >>>
> >>> On Wed, Oct 2, 2019 at 4:54 AM Dawid Wysakowicz <
> dwysakow...@apache.org>
> >>> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> I very much agree with Xuefu's summary of the two points, especially
> on
> >>>> the "functionIdentifier doesn't need to reflect the categories".
> >>>>
> >>>> For the factory methods I think methods of should be enough:
> >>>>
> >>>>   // for temporary/non-temporary system function
> >>>>     public FunctionIdentifier of(String name) {  }
> >>>>   // for temporary/non-temporary catalog function
> >>>>     public FunctionIdentifier of(ObjectIdentifier identifier){  }
> >>>>
> >>>> In case of the getters I did not like the method name `getName` in the
> >>>> original proposal, as in my opinion it could imply that it can return
> >>>> also just the name part of an ObjectIdentifier, which should not be
> the
> >>>> case.
> >>>>
> >>>> I'm fine with getSimpleName/getIdentifier, but want to throw in few
> >>>> other suggestions:
> >>>>
> >>>> * getShortPath(Identifier)/getLongPath(Identifier),
> >>>>
> >>>> * getSystemPath(Identifier)/getCatalogPath(Identifier)
> >>>>
> >>>> +1 to any of the 3 options.
> >>>>
> >>>> One additional thing the FunctionIdentifier should be a PublicEvolving
> >>>> class, as it is part of a PublicEvolving APIs e.g. CallExpression,
> which
> >>>> user might need to access e.g. in a filter pushdown.
> >>>>
> >>>> I also support the Xuefu's suggestion not to support the "ALL" keyword
> >>>> in the "SHOW [TEMPORARY] FUNCTIONS" statement, but as the exact design
> >>>> of it  is not part of the FLIP-57, we do not need to agree on that in
> >>>> this thread.
> >>>>
> >>>> Overall I think after updating the FLIP with the outcome of the
> >>>> discussion I vote +1 for it.
> >>>>
> >>>> Best,
> >>>>
> >>>> Dawid
> >>>>
> >>>>
> >>>> On 02/10/2019 00:21, Xuefu Z wrote:
> >>>>> Here are some of my thoughts on the minor debates above:
> >>>>>
> >>>>> 1. +1 for 4 categories of functions. They are categorized along two
> >>>>> dimensions of binary values: X: *temporary* vs non-temporary
> >>>> (persistent);
> >>>>> Y: *system* vs non-system (so said catalog).
> >>>>> 2. In my opinion, class functionIdentifier doesn't really need to
> >>>> reflect
> >>>>> the categories of the functions. Instead, we should decouple them to
> >>>> make
> >>>>> the API more stable. Thus, my suggestion is:
> >>>>>
> >>>>> @Internal
> >>>>> class FunctionIdentifier {
> >>>>>   // for temporary/non-temporary system function
> >>>>>     public FunctionIdentifier ofSimpleName(String name) {  }
> >>>>>   // for temporary/non-temporary catalog function
> >>>>>     public FunctionIdentifier ofIdentifier(ObjectIdentifier
> >>>>> identifier){  }
> >>>>>     public Optional<String> getSimpleName() {  }
> >>>>>     public Optional<ObjectIdentifier> getIdentifier() {  }
> >>>>> }
> >>>>> 3. DDLs -- I don't think we need "ALL" keyword. The grammar can just
> be:
> >>>>>
> >>>>> SHOW [TEMPORARY] [SYSTEM] FUNCTIONS.
> >>>>>
> >>>>> When either keyword is missing, "ALL" is implied along that
> dimension.
> >>>> We
> >>>>> should always limit the search to the system function catalog and the
> >>>>> current catalog/DB. I don't see a need of listing functions across
> >>>>> different catalogs and databases. (It can be added later if that
> >>>> arises.)
> >>>>> Thanks,
> >>>>> Xuefu
> >>>>>
> >>>>> On Tue, Oct 1, 2019 at 11:12 AM Bowen Li <bowenl...@gmail.com>
> wrote:
> >>>>>
> >>>>>> Hi Dawid,
> >>>>>>
> >>>>>> Thanks for bringing the suggestions up. I was prototyping yesterday
> and
> >>>>>> found out those places exactly as what you suggested.
> >>>>>>
> >>>>>> For CallExpression and UnresolvedCallExpression, I've added them to
> >>>>>> FLIP-57. We will replace ObjectIdentifier with FunctionIdentifier
> and
> >>>> mark
> >>>>>> that as a breaking change
> >>>>>>
> >>>>>> For FunctionIdentifier, the suggested changes LGTM. Just want to
> bring
> >>>> up
> >>>>>> an issue on naming. It seems to me how we now name functions
> >>>> categories is
> >>>>>> a bit unclear and confusing, which is reflected on the suggested
> APIs
> >>>> - in
> >>>>>> FunctionIdentifier you lay out, "builtin function" would include
> >>>> builtin
> >>>>>> functions and temporary system functions as we are kind of using
> >>>> "system"
> >>>>>> and "built-in" interchangeably, and "catalog function" would include
> >>>>>> catalog functions and temporary functions. I currently can see two
> >>>>>> approaches to make it clearer to users.
> >>>>>>
> >>>>>> 1) Simplify FunctionIdentifier to be the following. As it's
> internal,
> >>>> we
> >>>>>> add comments and explanation to devs on which case the APIs support.
> >>>>>> However, I feel this approach would be somehow a bit conflicting to
> >>>> what
> >>>>>> you want to achieve for the clarity of APIs
> >>>>>>
> >>>>>> @Internal
> >>>>>> class FunctionIdentifier {
> >>>>>>   // for built-in function and temporary system function
> >>>>>>     public FunctionIdentifier of(String name) {  }
> >>>>>>   // for temporary function and catalog function
> >>>>>>     public FunctionIdentifier of(ObjectIdentifier identifier){  }
> >>>>>>     public Optional<String> getFunctionName() {  }
> >>>>>>     public Optional<ObjectIdentifier> getObjectIdentifier() {  }
> >>>>>> }
> >>>>>>
> >>>>>> 2) We can rename our function categories as following so there'll be
> >>>> mainly
> >>>>>> just two categories of functions, "system functions" and "catalog
> >>>>>> functions", either of which can have temporary ones
> >>>>>>
> >>>>>>   - built-in functions -> officially rename to "system functions"
> and
> >>>> note
> >>>>>> to users that "system" and "built-in" can be used interchangeably.
> We
> >>>>>> prefer "system" because that's the keyword we decided to use in DDL
> >>>> that
> >>>>>> creates its temporary peers ("CREATE TEMPORARY SYSTEM FUNCTION")
> >>>>>>   - temporary system functions
> >>>>>>   - catalog functions
> >>>>>>   - temporary functions  -> rename to "temporary catalog function"
> >>>>>>
> >>>>>> @Internal
> >>>>>> class FunctionIdentifier {
> >>>>>>   // for temporary/non-temporary system function
> >>>>>>     public FunctionIdentifier ofSystemFunction(String name) {  }
> >>>>>>   // for temporary/non-temporary catalog function
> >>>>>>     public FunctionIdentifier ofCatalogFunction(ObjectIdentifier
> >>>>>> identifier){  }
> >>>>>>     public Optional<String> getSystemFunctionName() {  }
> >>>>>>     public Optional<ObjectIdentifier> getCatalogFunctionIdentifier()
> >>>> {  }
> >>>>>> }
> >>>>>>
> >>>>>> WDYT?
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Oct 1, 2019 at 5:48 AM Fabian Hueske <fhue...@gmail.com>
> >>>> wrote:
> >>>>>>> Thanks for the summary Bowen!
> >>>>>>>
> >>>>>>> Looks good to me.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Fabian
> >>>>>>>
> >>>>>>> Am Mo., 30. Sept. 2019 um 23:24 Uhr schrieb Bowen Li <
> >>>>>> bowenl...@gmail.com
> >>>>>>>> :
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> I've updated the FLIP wiki with the following changes:
> >>>>>>>>
> >>>>>>>> - Lifespan of temp functions are not tied to those of catalogs and
> >>>>>>>> databases. Users can create temp functions even though
> catalogs/dbs
> >>>> in
> >>>>>>>> their fully qualified names don't even exist.
> >>>>>>>> - some new SQL commands
> >>>>>>>>     - "SHOW FUNCTIONS" - list names of temp and non-temp
> >>>>>> system/built-in
> >>>>>>>> functions, and names of temp and catalog functions in the current
> >>>>>> catalog
> >>>>>>>> and db
> >>>>>>>>     - "SHOW ALL FUNCTIONS" - list names of temp and non-temp
> >>>>>> system/built
> >>>>>>>> functions, and fully qualified names of temp and catalog
> functions in
> >>>>>> all
> >>>>>>>> catalogs and dbs
> >>>>>>>>     - "SHOW ALL TEMPORARY FUNCTIONS" - list fully qualified names
> of
> >>>>>> temp
> >>>>>>>> functions in all catalog and db
> >>>>>>>>     - "SHOW ALL TEMPORARY SYSTEM FUNCTIONS" - list names of all
> temp
> >>>>>>> system
> >>>>>>>> functions
> >>>>>>>>
> >>>>>>>> Let me know if you have any questions.
> >>>>>>>>
> >>>>>>>> Seems we have resolved all concerns. If there's no more ones, I'd
> >>>> like
> >>>>>> to
> >>>>>>>> close the vote by this time tomorrow.
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Bowen
> >>>>>>>>
> >>>>>>>> On Mon, Sep 30, 2019 at 11:59 AM Bowen Li <bowenl...@gmail.com>
> >>>> wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I think above are some valid points, and we can adopt the
> >>>>>> suggestions.
> >>>>>>>>> To elaborate a bit on the new SQL syntax, it would imply that,
> >>>> unlike
> >>>>>>>>> "SHOW FUNCTION" which only return function names, "SHOW ALL
> >>>>>> [TEMPORARY]
> >>>>>>>>> FUNCTIONS" would return functions' fully qualified names with
> >>>> catalog
> >>>>>>> and
> >>>>>>>>> db names.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Sep 30, 2019 at 6:38 AM Timo Walther <twal...@apache.org
> >
> >>>>>>> wrote:
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> I support Fabian's arguments. In my opinion, temporary objects
> >>>>>> should
> >>>>>>>>>> just be an additional layer on top of the regular
> catalog/database
> >>>>>>>>>> lookup logic. Thus, a temporary table or function has always
> >>>> highest
> >>>>>>>>>> precedence and should be stable within the local session.
> Otherwise
> >>>>>> it
> >>>>>>>>>> could magically disappear while someone else is performing
> >>>>>>> modifications
> >>>>>>>>>> in the catalog.
> >>>>>>>>>>
> >>>>>>>>>> Furthermore, this feature is very useful for prototyping as
> users
> >>>>>> can
> >>>>>>>>>> simply express that a catalog/database is present even through
> they
> >>>>>>>>>> might not have access to it currently.
> >>>>>>>>>>
> >>>>>>>>>> Regards,
> >>>>>>>>>> Timo
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 30.09.19 14:57, Fabian Hueske wrote:
> >>>>>>>>>>> Hi all,
> >>>>>>>>>>>
> >>>>>>>>>>> Sorry for the late reply.
> >>>>>>>>>>>
> >>>>>>>>>>> I think it might lead to confusing situations if temporary
> >>>>>> functions
> >>>>>>>> (or
> >>>>>>>>>>> any temporary db objects for that matter) are bound to the life
> >>>>>>> cycle
> >>>>>>>>>> of an
> >>>>>>>>>>> (external) db/catalog.
> >>>>>>>>>>> Imaging a situation where you create a temp function in a db
> in an
> >>>>>>>>>> external
> >>>>>>>>>>> catalog and use it but at some point it does not work anymore
> >>>>>>> because
> >>>>>>>>>> some
> >>>>>>>>>>> other dropped the database from the external catalog.
> >>>>>>>>>>> Shouldn't temporary objects be only controlled by the owner of
> a
> >>>>>>>>>> session?
> >>>>>>>>>>> I agree that creating temp objects in non-existing db/catalogs
> >>>>>>> sounds
> >>>>>>>> a
> >>>>>>>>>> bit
> >>>>>>>>>>> strange, but IMO the opposite (the db/catalog must exist for a
> >>>>>> temp
> >>>>>>>>>>> function to be created/exist) can have significant implications
> >>>>>> like
> >>>>>>>> the
> >>>>>>>>>>> one I described.
> >>>>>>>>>>> I think it would be quite easy for users to understand that
> >>>>>>> temporary
> >>>>>>>>>>> objects are solely owned by them (and their session).
> >>>>>>>>>>> The problem of listing temporary objects could be solved by
> >>>>>> adding a
> >>>>>>>> ALL
> >>>>>>>>>>> [TEMPORARY] clause:
> >>>>>>>>>>>
> >>>>>>>>>>> SHOW ALL FUNCTIONS; could show all functions regardless of the
> >>>>>>>>>>> catalog/database including temporary functions.
> >>>>>>>>>>> SHOW ALL TEMPORARY FUNCTIONS; could show all temporary
> functions
> >>>>>>>>>> regardless
> >>>>>>>>>>> of the catalog/database.
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Fabian
> >>>>>>>>>>>
> >>>>>>>>>>> Am Sa., 28. Sept. 2019 um 02:21 Uhr schrieb Bowen Li <
> >>>>>>>>>> bowenl...@gmail.com>:
> >>>>>>>>>>>> @Dawid, do you have any other concerns? If not, I hope we can
> >>>>>> close
> >>>>>>>> the
> >>>>>>>>>>>> voting.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Thu, Sep 26, 2019 at 8:14 PM Rui Li <lirui.fu...@gmail.com
> >
> >>>>>>>> wrote:
> >>>>>>>>>>>>> I'm not sure how much benefit #1 can bring us. If users just
> >>>>>> want
> >>>>>>> to
> >>>>>>>>>> try
> >>>>>>>>>>>>> out temporary functions, they can create temporary system
> >>>>>>> functions
> >>>>>>>>>> which
> >>>>>>>>>>>>> don't require a catalog/DB. IIUC, the main reason why we
> allow
> >>>>>>>>>> temporary
> >>>>>>>>>>>>> catalog function is to let users override permanent catalog
> >>>>>>>> functions.
> >>>>>>>>>>>>> Therefore a temporary function in a non-existing catalog
> won't
> >>>>>>> serve
> >>>>>>>>>> that
> >>>>>>>>>>>>> purpose. Besides, each session is provided with a default
> >>>>>> catalog
> >>>>>>>> and
> >>>>>>>>>> DB.
> >>>>>>>>>>>>> So even if users simply want to create some catalog functions
> >>>>>> they
> >>>>>>>> can
> >>>>>>>>>>>>> forget about after the session, wouldn't the default
> catalog/DB
> >>>>>> be
> >>>>>>>>>> enough
> >>>>>>>>>>>>> for such experiments?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Thu, Sep 26, 2019 at 4:38 AM Bowen Li <
> bowenl...@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>>>>>>> Re 1) As described in the FLIP, a temp function lookup will
> >>>>>> first
> >>>>>>>>>> make
> >>>>>>>>>>>>> sure
> >>>>>>>>>>>>>> the db exists. If the db doesn't exist, a lazy drop is
> >>>>>> triggered
> >>>>>>> to
> >>>>>>>>>>>>> remove
> >>>>>>>>>>>>>> that temp function.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I agree Hive doesn't handle it consistently, and we are not
> >>>>>>> copying
> >>>>>>>>>>>> Hive.
> >>>>>>>>>>>>>> IMHO, allowing registering temp functions in nonexistent
> >>>>>>> catalog/db
> >>>>>>>>>> is
> >>>>>>>>>>>>>> hacky and problematic. For instance, "SHOW FUNCTIONS" would
> >>>>>> list
> >>>>>>>>>> system
> >>>>>>>>>>>>>> functions and functions in the current catalog/db, since
> users
> >>>>>>>> cannot
> >>>>>>>>>>>>>> designate a nonexistent catalog/db as current ones, how can
> >>>>>> they
> >>>>>>>> list
> >>>>>>>>>>>>>> functions in nonexistent catalog/db? They may end up never
> >>>>>>> knowing
> >>>>>>>>>> what
> >>>>>>>>>>>>>> temp functions they've created unless trying out with
> queries
> >>>>>> or
> >>>>>>> we
> >>>>>>>>>>>>>> introducing some more nonstandard SQL statements. The same
> >>>>>>> applies
> >>>>>>>> to
> >>>>>>>>>>>>> other
> >>>>>>>>>>>>>> temp objects like temp tables.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Re 2) A standalone FunctionIdentifier sounds good to me
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Wed, Sep 25, 2019 at 4:46 AM Dawid Wysakowicz <
> >>>>>>>>>>>>>> wysakowicz.da...@gmail.com>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Ad. 1
> >>>>>>>>>>>>>>> I wouldn't say it is hacky.
> >>>>>>>>>>>>>>> Moreover, how do you want ensure that the dB always exists
> >>>>>> when
> >>>>>>> a
> >>>>>>>>>>>>>> temporary
> >>>>>>>>>>>>>>> object is used?( in this particular case function). Do you
> >>>>>> want
> >>>>>>> to
> >>>>>>>>>>>>> query
> >>>>>>>>>>>>>>> for the database existence whenever e.g a temporary
> function
> >>>>>> is
> >>>>>>>>>>>> used? I
> >>>>>>>>>>>>>>> think important aspect here is that the database can be
> >>>>>> dropped
> >>>>>>>> from
> >>>>>>>>>>>>>>> external system, not just flink or a different flink
> session.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> E.g in case of hive, you cannot create a temporary table
> in a
> >>>>>>>>>>>> database
> >>>>>>>>>>>>>> that
> >>>>>>>>>>>>>>> does not exist, that's true. But if you create a temporary
> >>>>>> table
> >>>>>>>> in
> >>>>>>>>>> a
> >>>>>>>>>>>>>>> database and drop that database from a different session,
> you
> >>>>>>> can
> >>>>>>>>>>>> still
> >>>>>>>>>>>>>>> query the previously created temporary table from the
> original
> >>>>>>>>>>>> session.
> >>>>>>>>>>>>>> It
> >>>>>>>>>>>>>>> does not sound like a consistent behaviour to me. Why
> don't we
> >>>>>>>> make
> >>>>>>>>>>>>> this
> >>>>>>>>>>>>>>> behaviour of not binding a temporary objects to the
> lifetime
> >>>>>> of
> >>>>>>> a
> >>>>>>>>>>>>>> database
> >>>>>>>>>>>>>>> explicit as part of the temporary objects contract? In the
> end
> >>>>>>>> they
> >>>>>>>>>>>>> exist
> >>>>>>>>>>>>>>> in different layers. Permanent objects & databases in a
> >>>>>> catalog
> >>>>>>>> (in
> >>>>>>>>>>>>> case
> >>>>>>>>>>>>>> of
> >>>>>>>>>>>>>>> hive megastore) whereas temporary objects in flink
> sessions.
> >>>>>>>> That's
> >>>>>>>>>>>>> also
> >>>>>>>>>>>>>>> true for the original hive client. The temporary objects
> live
> >>>>>> in
> >>>>>>>> the
> >>>>>>>>>>>>> hive
> >>>>>>>>>>>>>>> client whereas databases are created in the metastore.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Ad.2
> >>>>>>>>>>>>>>> I'm open for suggestions here. The one thing I wanted to
> >>>>>> achieve
> >>>>>>>>>> here
> >>>>>>>>>>>>> is
> >>>>>>>>>>>>>> so
> >>>>>>>>>>>>>>> that we do not change the contract of ObjectIdentifier. One
> >>>>>>>>>> important
> >>>>>>>>>>>>>> thing
> >>>>>>>>>>>>>>> to remember here is that we need the function identifier
> to be
> >>>>>>>> part
> >>>>>>>>>>>> of
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>> FunctionDefinition object and not only as the result of the
> >>>>>>>> function
> >>>>>>>>>>>>>>> lookup. At some point we want to be able to store
> >>>>>>> QueryOperations
> >>>>>>>> in
> >>>>>>>>>>>>> the
> >>>>>>>>>>>>>>> catalogs. They can contain function calls within which we
> need
> >>>>>>> to
> >>>>>>>>>>>> have
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>> identifier.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I agree my initial suggestion is over complicated. How
> about
> >>>>>> we
> >>>>>>>> have
> >>>>>>>>>>>>> just
> >>>>>>>>>>>>>>> the FunctionIdentifier as top level class without making
> the
> >>>>>>>>>>>>>>> ObjectIdentifier extend from it? I think it's pretty much
> the
> >>>>>>> same
> >>>>>>>>>>>> what
> >>>>>>>>>>>>>> you
> >>>>>>>>>>>>>>> suggested. The only difference is that it would be a top
> level
> >>>>>>>> class
> >>>>>>>>>>>>>> with a
> >>>>>>>>>>>>>>> more descriptive name.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Wed, 25 Sep 2019, 13:57 Bowen Li, <bowenl...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>>>>>>>>>> Sorry, I missed some parts of the solution. The complete
> >>>>>>>>>>>> alternative
> >>>>>>>>>>>>> is
> >>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>> following, basically having separate APIs in
> FunctionLookup
> >>>>>> for
> >>>>>>>>>>>>>> ambiguous
> >>>>>>>>>>>>>>>> and precise function lookup since planner is able to tell
> >>>>>> which
> >>>>>>>> API
> >>>>>>>>>>>>> to
> >>>>>>>>>>>>>>> call
> >>>>>>>>>>>>>>>> with parsed queries, and have a unified result:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> ```
> >>>>>>>>>>>>>>>> class FunctionLookup {
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Optional<Result> lookupAmbiguousFunction(String name);
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Optional<Result> lookupPreciseFunction(ObjectIdentifier
> oi);
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> class Result {
> >>>>>>>>>>>>>>>>      Optional<ObjectIdentifier>  getObjectIdentifier() {
> ...
> >>>>>> }
> >>>>>>>>>>>>>>>>      String getName() { ... }
> >>>>>>>>>>>>>>>>      // ...
> >>>>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>>> ```
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>> Bowen
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Tue, Sep 24, 2019 at 9:42 PM Bowen Li <
> >>>>>> bowenl...@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>> Hi Dawid,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Re 1): I agree making it easy for users to run
> experiments
> >>>>>> is
> >>>>>>>>>>>>>>> important.
> >>>>>>>>>>>>>>>>> However, I'm not sure allowing users to register temp
> >>>>>>> functions
> >>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>> nonexistent catalog/db is the optimal way. It seems a bit
> >>>>>>> hacky,
> >>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>> breaks
> >>>>>>>>>>>>>>>>> the current contract between Flink and users that
> catalog/db
> >>>>>>>> must
> >>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>> valid
> >>>>>>>>>>>>>>>>> in order to operate on.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> How about we instead focus on making it convenient to
> create
> >>>>>>>>>>>>>> catalogs?
> >>>>>>>>>>>>>>>>> Users actually can already do it with ease via program or
> >>>>>> SQL
> >>>>>>>> CLI
> >>>>>>>>>>>>>> yaml
> >>>>>>>>>>>>>>>> file
> >>>>>>>>>>>>>>>>> for an in-memory catalog which has neither extra
> dependency
> >>>>>>> nor
> >>>>>>>>>>>>>>> external
> >>>>>>>>>>>>>>>>> connections. What we can further improve is DDL for
> >>>>>> catalogs,
> >>>>>>>>>>>> and I
> >>>>>>>>>>>>>>>> raised
> >>>>>>>>>>>>>>>>> it in discussion of [FLIP 69 - Flink SQL DDL Enhancement]
> >>>>>>> driven
> >>>>>>>>>>>> by
> >>>>>>>>>>>>>>> Terry
> >>>>>>>>>>>>>>>>> now.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> In that case, if users would like to experiment via SQL,
> >>>>>> they
> >>>>>>>> can
> >>>>>>>>>>>>>>> easily
> >>>>>>>>>>>>>>>>> create an in memory catalog/database using DDL, then play
> >>>>>> with
> >>>>>>>>>>>> temp
> >>>>>>>>>>>>>>>>> functions.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Re 2): For the assumption, IIUIC, function
> ObjectIdentifier
> >>>>>>> has
> >>>>>>>>>>>> not
> >>>>>>>>>>>>>>> been
> >>>>>>>>>>>>>>>>> resolved when stack call reaches
> >>>>>>>>>>>> FunctionCatalog#lookupFunction(),
> >>>>>>>>>>>>>> but
> >>>>>>>>>>>>>>>> only
> >>>>>>>>>>>>>>>>> been parsed?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I agree keeping ObjectIdentifier as-is would be good.
> I'm ok
> >>>>>>>> with
> >>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> suggested classes, though making ObjectIdentifier a
> subclass
> >>>>>>> of
> >>>>>>>>>>>>>>>>> FunctionIdentifier seem a bit counter intuitive.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Another potentially simpler way is:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> ```
> >>>>>>>>>>>>>>>>> // in class FunctionLookup
> >>>>>>>>>>>>>>>>> class Result {
> >>>>>>>>>>>>>>>>>      Optional<ObjectIdentifier>  getObjectIdentifier() {
> >>>>>> ... }
> >>>>>>>>>>>>>>>>>      String getName() { ... }
> >>>>>>>>>>>>>>>>>      ...
> >>>>>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>>>> ```
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> WDYT?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Tue, Sep 24, 2019 at 3:41 PM Dawid Wysakowicz <
> >>>>>>>>>>>>>>>>> wysakowicz.da...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>> I really like the flip and think it clarifies important
> >>>>>>> aspects
> >>>>>>>>>>>> of
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>> system.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I have two, I hope small suggestions, which will not
> take
> >>>>>>> much
> >>>>>>>>>>>>> time
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>> agree on.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 1. Could we follow the MySQL approach in regards to the
> >>>>>>>>>>>> existence
> >>>>>>>>>>>>> of
> >>>>>>>>>>>>>>>>>> cat/db
> >>>>>>>>>>>>>>>>>> for temporary functions? That means not to check it, so
> >>>>>> e.g.
> >>>>>>>>>>>> it's
> >>>>>>>>>>>>>>>> possible
> >>>>>>>>>>>>>>>>>> to create a temporary function in a database that does
> not
> >>>>>>>>>>>> exist.
> >>>>>>>>>>>>> I
> >>>>>>>>>>>>>>>> think
> >>>>>>>>>>>>>>>>>> it's really useful e.g in cases when user wants to
> perform
> >>>>>>>>>>>>>> experiments
> >>>>>>>>>>>>>>>> but
> >>>>>>>>>>>>>>>>>> does not have access to the db yet or temporarily does
> not
> >>>>>>> have
> >>>>>>>>>>>>>>>> connection
> >>>>>>>>>>>>>>>>>> to a catalog.
> >>>>>>>>>>>>>>>>>> 2. Could we not change the ObjectIdentifier? Could we
> not
> >>>>>>>> loosen
> >>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>> requirements for all catalog objects such as tables,
> views,
> >>>>>>>>>>>> types
> >>>>>>>>>>>>>> just
> >>>>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>>> the functions? It's really important later on from e.g
> the
> >>>>>>>>>>>>>>>> serializability
> >>>>>>>>>>>>>>>>>> perspective. The important aspect of the
> ObjectIdentifier
> >>>>>> is
> >>>>>>>>>>>> that
> >>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>> know
> >>>>>>>>>>>>>>>>>> that it has been resolved. The suggested changes break
> that
> >>>>>>>>>>>>>>> assumption.
> >>>>>>>>>>>>>>>>>> What do you think about adding an interface
> >>>>>>> FunctionIdentifier
> >>>>>>>> {
> >>>>>>>>>>>>>>>>>> String getName();
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> /**
> >>>>>>>>>>>>>>>>>>    Return 3-part identifier. Empty in case of a built-in
> >>>>>>>>>>>> function.
> >>>>>>>>>>>>>>>>>> */
> >>>>>>>>>>>>>>>>>> Optional<ObjectIdentifier> getObjectIdentifier()
> >>>>>>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> class ObjectIdentifier implements FunctionIdentifier {
> >>>>>>>>>>>>>>>>>> Optional<ObjectIdentifier> getObjectIdentifier() {
> >>>>>>>>>>>>>>>>>>   return Optional.of(this);
> >>>>>>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>>>>> }
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> class SystemFunctionIdentifier implements
> >>>>>> FunctionIdentifier
> >>>>>>>>>>>> {...}
> >>>>>>>>>>>>>>>>>> WDYT?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Wed, 25 Sep 2019, 04:50 Xuefu Z, <usxu...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>> +1. LGTM
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Tue, Sep 24, 2019 at 6:09 AM Terry Wang <
> >>>>>>>>>>>> zjuwa...@gmail.com>
> >>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>> +1
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>> Terry Wang
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> 在 2019年9月24日,上午10:42,Kurt Young <ykt...@gmail.com>
> 写道:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> +1
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>> Kurt
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On Tue, Sep 24, 2019 at 2:30 AM Bowen Li <
> >>>>>>>>>>>>> bowenl...@gmail.com
> >>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> I'd like to start a voting thread for FLIP-57 [1],
> >>>>>> which
> >>>>>>>>>>>>>> we've
> >>>>>>>>>>>>>>>>>> reached
> >>>>>>>>>>>>>>>>>>>>>> consensus in [2].
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> This voting will be open for minimum 3 days till
> 6:30pm
> >>>>>>>>>>>>> UTC,
> >>>>>>>>>>>>>>> Sep
> >>>>>>>>>>>>>>>>>> 26.
> >>>>>>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>>>>>> Bowen
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-57%3A+Rework+FunctionCatalog
> >>>>>>>>>>>>>>>>>>>>>> [2]
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-57-Rework-FunctionCatalog-td32291.html#a32613
> >>>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>> Xuefu Zhang
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> "In Honey We Trust!"
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>> Best regards!
> >>>>>>>>>>>>> Rui Li
> >>>>>>>>>>>>>
> >>>>
>
>

Reply via email to