@Dawid, do you have any other concerns? If not, I hope we can close the voting.
On Thu, Sep 26, 2019 at 8:14 PM Rui Li <lirui.fu...@gmail.com> wrote: > I'm not sure how much benefit #1 can bring us. If users just want to try > out temporary functions, they can create temporary system functions which > don't require a catalog/DB. IIUC, the main reason why we allow temporary > catalog function is to let users override permanent catalog functions. > Therefore a temporary function in a non-existing catalog won't serve that > purpose. Besides, each session is provided with a default catalog and DB. > So even if users simply want to create some catalog functions they can > forget about after the session, wouldn't the default catalog/DB be enough > for such experiments? > > On Thu, Sep 26, 2019 at 4:38 AM Bowen Li <bowenl...@gmail.com> wrote: > > > Re 1) As described in the FLIP, a temp function lookup will first make > sure > > the db exists. If the db doesn't exist, a lazy drop is triggered to > remove > > that temp function. > > > > I agree Hive doesn't handle it consistently, and we are not copying Hive. > > > > IMHO, allowing registering temp functions in nonexistent catalog/db is > > hacky and problematic. For instance, "SHOW FUNCTIONS" would list system > > functions and functions in the current catalog/db, since users cannot > > designate a nonexistent catalog/db as current ones, how can they list > > functions in nonexistent catalog/db? They may end up never knowing what > > temp functions they've created unless trying out with queries or we > > introducing some more nonstandard SQL statements. The same applies to > other > > temp objects like temp tables. > > > > Re 2) A standalone FunctionIdentifier sounds good to me > > > > On Wed, Sep 25, 2019 at 4:46 AM Dawid Wysakowicz < > > wysakowicz.da...@gmail.com> > > wrote: > > > > > Ad. 1 > > > I wouldn't say it is hacky. > > > Moreover, how do you want ensure that the dB always exists when a > > temporary > > > object is used?( in this particular case function). Do you want to > query > > > for the database existence whenever e.g a temporary function is used? I > > > think important aspect here is that the database can be dropped from > > > external system, not just flink or a different flink session. > > > > > > E.g in case of hive, you cannot create a temporary table in a database > > that > > > does not exist, that's true. But if you create a temporary table in a > > > database and drop that database from a different session, you can still > > > query the previously created temporary table from the original session. > > It > > > does not sound like a consistent behaviour to me. Why don't we make > this > > > behaviour of not binding a temporary objects to the lifetime of a > > database > > > explicit as part of the temporary objects contract? In the end they > exist > > > in different layers. Permanent objects & databases in a catalog (in > case > > of > > > hive megastore) whereas temporary objects in flink sessions. That's > also > > > true for the original hive client. The temporary objects live in the > hive > > > client whereas databases are created in the metastore. > > > > > > Ad.2 > > > I'm open for suggestions here. The one thing I wanted to achieve here > is > > so > > > that we do not change the contract of ObjectIdentifier. One important > > thing > > > to remember here is that we need the function identifier to be part of > > the > > > FunctionDefinition object and not only as the result of the function > > > lookup. At some point we want to be able to store QueryOperations in > the > > > catalogs. They can contain function calls within which we need to have > > the > > > identifier. > > > > > > I agree my initial suggestion is over complicated. How about we have > just > > > the FunctionIdentifier as top level class without making the > > > ObjectIdentifier extend from it? I think it's pretty much the same what > > you > > > suggested. The only difference is that it would be a top level class > > with a > > > more descriptive name. > > > > > > > > > On Wed, 25 Sep 2019, 13:57 Bowen Li, <bowenl...@gmail.com> wrote: > > > > > > > Sorry, I missed some parts of the solution. The complete alternative > is > > > the > > > > following, basically having separate APIs in FunctionLookup for > > ambiguous > > > > and precise function lookup since planner is able to tell which API > to > > > call > > > > with parsed queries, and have a unified result: > > > > > > > > ``` > > > > class FunctionLookup { > > > > > > > > Optional<Result> lookupAmbiguousFunction(String name); > > > > > > > > > > > > Optional<Result> lookupPreciseFunction(ObjectIdentifier oi); > > > > > > > > > > > > class Result { > > > > Optional<ObjectIdentifier> getObjectIdentifier() { ... } > > > > String getName() { ... } > > > > // ... > > > > } > > > > > > > > } > > > > ``` > > > > > > > > Thanks, > > > > Bowen > > > > > > > > > > > > On Tue, Sep 24, 2019 at 9:42 PM Bowen Li <bowenl...@gmail.com> > wrote: > > > > > > > > > Hi Dawid, > > > > > > > > > > Re 1): I agree making it easy for users to run experiments is > > > important. > > > > > However, I'm not sure allowing users to register temp functions in > > > > > nonexistent catalog/db is the optimal way. It seems a bit hacky, > and > > > > breaks > > > > > the current contract between Flink and users that catalog/db must > be > > > > valid > > > > > in order to operate on. > > > > > > > > > > How about we instead focus on making it convenient to create > > catalogs? > > > > > Users actually can already do it with ease via program or SQL CLI > > yaml > > > > file > > > > > for an in-memory catalog which has neither extra dependency nor > > > external > > > > > connections. What we can further improve is DDL for catalogs, and I > > > > raised > > > > > it in discussion of [FLIP 69 - Flink SQL DDL Enhancement] driven by > > > Terry > > > > > now. > > > > > > > > > > In that case, if users would like to experiment via SQL, they can > > > easily > > > > > create an in memory catalog/database using DDL, then play with temp > > > > > functions. > > > > > > > > > > Re 2): For the assumption, IIUIC, function ObjectIdentifier has not > > > been > > > > > resolved when stack call reaches FunctionCatalog#lookupFunction(), > > but > > > > only > > > > > been parsed? > > > > > > > > > > I agree keeping ObjectIdentifier as-is would be good. I'm ok with > the > > > > > suggested classes, though making ObjectIdentifier a subclass of > > > > > FunctionIdentifier seem a bit counter intuitive. > > > > > > > > > > Another potentially simpler way is: > > > > > > > > > > ``` > > > > > // in class FunctionLookup > > > > > class Result { > > > > > Optional<ObjectIdentifier> getObjectIdentifier() { ... } > > > > > String getName() { ... } > > > > > ... > > > > > } > > > > > ``` > > > > > > > > > > WDYT? > > > > > > > > > > > > > > > > > > > > On Tue, Sep 24, 2019 at 3:41 PM Dawid Wysakowicz < > > > > > wysakowicz.da...@gmail.com> wrote: > > > > > > > > > >> Hi, > > > > >> I really like the flip and think it clarifies important aspects of > > the > > > > >> system. > > > > >> > > > > >> I have two, I hope small suggestions, which will not take much > time > > to > > > > >> agree on. > > > > >> > > > > >> 1. Could we follow the MySQL approach in regards to the existence > of > > > > >> cat/db > > > > >> for temporary functions? That means not to check it, so e.g. it's > > > > possible > > > > >> to create a temporary function in a database that does not exist. > I > > > > think > > > > >> it's really useful e.g in cases when user wants to perform > > experiments > > > > but > > > > >> does not have access to the db yet or temporarily does not have > > > > connection > > > > >> to a catalog. > > > > >> 2. Could we not change the ObjectIdentifier? Could we not loosen > the > > > > >> requirements for all catalog objects such as tables, views, types > > just > > > > for > > > > >> the functions? It's really important later on from e.g the > > > > serializability > > > > >> perspective. The important aspect of the ObjectIdentifier is that > we > > > > know > > > > >> that it has been resolved. The suggested changes break that > > > assumption. > > > > >> > > > > >> What do you think about adding an interface FunctionIdentifier { > > > > >> > > > > >> String getName(); > > > > >> > > > > >> /** > > > > >> Return 3-part identifier. Empty in case of a built-in function. > > > > >> */ > > > > >> Optional<ObjectIdentifier> getObjectIdentifier() > > > > >> } > > > > >> > > > > >> class ObjectIdentifier implements FunctionIdentifier { > > > > >> Optional<ObjectIdentifier> getObjectIdentifier() { > > > > >> return Optional.of(this); > > > > >> } > > > > >> } > > > > >> > > > > >> class SystemFunctionIdentifier implements FunctionIdentifier {...} > > > > >> > > > > >> WDYT? > > > > >> > > > > >> On Wed, 25 Sep 2019, 04:50 Xuefu Z, <usxu...@gmail.com> wrote: > > > > >> > > > > >> > +1. LGTM > > > > >> > > > > > >> > On Tue, Sep 24, 2019 at 6:09 AM Terry Wang <zjuwa...@gmail.com> > > > > wrote: > > > > >> > > > > > >> > > +1 > > > > >> > > > > > > >> > > Best, > > > > >> > > Terry Wang > > > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > 在 2019年9月24日,上午10:42,Kurt Young <ykt...@gmail.com> 写道: > > > > >> > > > > > > > >> > > > +1 > > > > >> > > > > > > > >> > > > Best, > > > > >> > > > Kurt > > > > >> > > > > > > > >> > > > > > > > >> > > > On Tue, Sep 24, 2019 at 2:30 AM Bowen Li < > bowenl...@gmail.com > > > > > > > >> wrote: > > > > >> > > > > > > > >> > > >> Hi all, > > > > >> > > >> > > > > >> > > >> I'd like to start a voting thread for FLIP-57 [1], which > > we've > > > > >> reached > > > > >> > > >> consensus in [2]. > > > > >> > > >> > > > > >> > > >> This voting will be open for minimum 3 days till 6:30pm > UTC, > > > Sep > > > > >> 26. > > > > >> > > >> > > > > >> > > >> Thanks, > > > > >> > > >> Bowen > > > > >> > > >> > > > > >> > > >> [1] > > > > >> > > >> > > > > >> > > >> > > > > >> > > > > > > >> > > > > > >> > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-57%3A+Rework+FunctionCatalog > > > > >> > > >> [2] > > > > >> > > >> > > > > >> > > >> > > > > >> > > > > > > >> > > > > > >> > > > > > > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-57-Rework-FunctionCatalog-td32291.html#a32613 > > > > >> > > >> > > > > >> > > > > > > >> > > > > > > >> > > > > > >> > -- > > > > >> > Xuefu Zhang > > > > >> > > > > > >> > "In Honey We Trust!" > > > > >> > > > > > >> > > > > > > > > > > > > > > > > > -- > Best regards! > Rui Li >