I'm not sure how much benefit #1 can bring us. If users just want to try
out temporary functions, they can create temporary system functions which
don't require a catalog/DB. IIUC, the main reason why we allow temporary
catalog function is to let users override permanent catalog functions.
Therefore a temporary function in a non-existing catalog won't serve that
purpose. Besides, each session is provided with a default catalog and DB.
So even if users simply want to create some catalog functions they can
forget about after the session, wouldn't the default catalog/DB be enough
for such experiments?

On Thu, Sep 26, 2019 at 4:38 AM Bowen Li <bowenl...@gmail.com> wrote:

> Re 1) As described in the FLIP, a temp function lookup will first make sure
> the db exists. If the db doesn't exist, a lazy drop is triggered to remove
> that temp function.
>
> I agree Hive doesn't handle it consistently, and we are not copying Hive.
>
> IMHO, allowing registering temp functions in nonexistent catalog/db is
> hacky and problematic. For instance, "SHOW FUNCTIONS" would list system
> functions and functions in the current catalog/db, since users cannot
> designate a nonexistent catalog/db as current ones, how can they list
> functions in nonexistent catalog/db? They may end up never knowing what
> temp functions they've created unless trying out with queries or we
> introducing some more nonstandard SQL statements. The same applies to other
> temp objects like temp tables.
>
> Re 2) A standalone FunctionIdentifier sounds good to me
>
> On Wed, Sep 25, 2019 at 4:46 AM Dawid Wysakowicz <
> wysakowicz.da...@gmail.com>
> wrote:
>
> > Ad. 1
> > I wouldn't say it is hacky.
> > Moreover, how do you want ensure that the dB always exists when a
> temporary
> > object is used?( in this particular case function). Do you want to query
> > for the database existence whenever e.g a temporary function is used? I
> > think important aspect here is that the database can be dropped from
> > external system, not just flink or a different flink session.
> >
> > E.g in case of hive, you cannot create a temporary table in a database
> that
> > does not exist, that's true. But if you create a temporary table in a
> > database and drop that database from a different session, you can still
> > query the previously created temporary table from the original session.
> It
> > does not sound like a consistent behaviour to me. Why don't we make this
> > behaviour of not binding a temporary objects to the lifetime of a
> database
> > explicit as part of the temporary objects contract? In the end they exist
> > in different layers. Permanent objects & databases in a catalog (in case
> of
> > hive megastore) whereas temporary objects in flink sessions. That's also
> > true for the original hive client. The temporary objects live in the hive
> > client whereas databases are created in the metastore.
> >
> > Ad.2
> > I'm open for suggestions here. The one thing I wanted to achieve here is
> so
> > that we do not change the contract of ObjectIdentifier. One important
> thing
> > to remember here is that we need the function identifier to be part of
> the
> > FunctionDefinition object and not only as the result of the function
> > lookup. At some point we want to be able to store QueryOperations in the
> > catalogs. They can contain function calls within which we need to have
> the
> > identifier.
> >
> > I agree my initial suggestion is over complicated. How about we have just
> > the FunctionIdentifier as top level class without making the
> > ObjectIdentifier extend from it? I think it's pretty much the same what
> you
> > suggested. The only difference is that it would be a top level class
> with a
> > more descriptive name.
> >
> >
> > On Wed, 25 Sep 2019, 13:57 Bowen Li, <bowenl...@gmail.com> wrote:
> >
> > > Sorry, I missed some parts of the solution. The complete alternative is
> > the
> > > following, basically having separate APIs in FunctionLookup for
> ambiguous
> > > and precise function lookup since planner is able to tell which API to
> > call
> > > with parsed queries, and have a unified result:
> > >
> > > ```
> > > class FunctionLookup {
> > >
> > > Optional<Result> lookupAmbiguousFunction(String name);
> > >
> > >
> > > Optional<Result> lookupPreciseFunction(ObjectIdentifier oi);
> > >
> > >
> > > class Result {
> > >     Optional<ObjectIdentifier>  getObjectIdentifier() { ... }
> > >     String getName() { ... }
> > >     // ...
> > > }
> > >
> > > }
> > > ```
> > >
> > > Thanks,
> > > Bowen
> > >
> > >
> > > On Tue, Sep 24, 2019 at 9:42 PM Bowen Li <bowenl...@gmail.com> wrote:
> > >
> > > > Hi Dawid,
> > > >
> > > > Re 1): I agree making it easy for users to run experiments is
> > important.
> > > > However, I'm not sure allowing users to register temp functions in
> > > > nonexistent catalog/db is the optimal way. It seems a bit hacky, and
> > > breaks
> > > > the current contract between Flink and users that catalog/db must be
> > > valid
> > > > in order to operate on.
> > > >
> > > > How about we instead focus on making it convenient to create
> catalogs?
> > > > Users actually can already do it with ease via program or SQL CLI
> yaml
> > > file
> > > > for an in-memory catalog which has neither extra dependency nor
> > external
> > > > connections. What we can further improve is DDL for catalogs, and I
> > > raised
> > > > it in discussion of [FLIP 69 - Flink SQL DDL Enhancement] driven by
> > Terry
> > > > now.
> > > >
> > > > In that case, if users would like to experiment via SQL, they can
> > easily
> > > > create an in memory catalog/database using DDL, then play with temp
> > > > functions.
> > > >
> > > > Re 2): For the assumption, IIUIC, function ObjectIdentifier has not
> > been
> > > > resolved when stack call reaches FunctionCatalog#lookupFunction(),
> but
> > > only
> > > > been parsed?
> > > >
> > > > I agree keeping ObjectIdentifier as-is would be good. I'm ok with the
> > > > suggested classes, though making ObjectIdentifier a subclass of
> > > > FunctionIdentifier seem a bit counter intuitive.
> > > >
> > > > Another potentially simpler way is:
> > > >
> > > > ```
> > > > // in class FunctionLookup
> > > > class Result {
> > > >     Optional<ObjectIdentifier>  getObjectIdentifier() { ... }
> > > >     String getName() { ... }
> > > >     ...
> > > > }
> > > > ```
> > > >
> > > > WDYT?
> > > >
> > > >
> > > >
> > > > On Tue, Sep 24, 2019 at 3:41 PM Dawid Wysakowicz <
> > > > wysakowicz.da...@gmail.com> wrote:
> > > >
> > > >> Hi,
> > > >> I really like the flip and think it clarifies important aspects of
> the
> > > >> system.
> > > >>
> > > >> I have two, I hope small suggestions, which will not take much time
> to
> > > >> agree on.
> > > >>
> > > >> 1. Could we follow the MySQL approach in regards to the existence of
> > > >> cat/db
> > > >> for temporary functions? That means not to check it, so e.g. it's
> > > possible
> > > >> to create a temporary function in a database that does not exist. I
> > > think
> > > >> it's really useful e.g in cases when user wants to perform
> experiments
> > > but
> > > >> does not have access to the db yet or temporarily does not have
> > > connection
> > > >> to a catalog.
> > > >> 2. Could we not change the ObjectIdentifier? Could we not loosen the
> > > >> requirements for all catalog objects such as tables, views, types
> just
> > > for
> > > >> the functions? It's really important later on from e.g the
> > > serializability
> > > >> perspective. The important aspect of the ObjectIdentifier is that we
> > > know
> > > >> that it has been resolved. The suggested changes break that
> > assumption.
> > > >>
> > > >> What do you think about adding an interface FunctionIdentifier {
> > > >>
> > > >> String getName();
> > > >>
> > > >> /**
> > > >>   Return 3-part identifier. Empty in case of a built-in function.
> > > >> */
> > > >> Optional<ObjectIdentifier> getObjectIdentifier()
> > > >> }
> > > >>
> > > >> class ObjectIdentifier implements FunctionIdentifier {
> > > >> Optional<ObjectIdentifier> getObjectIdentifier() {
> > > >>  return Optional.of(this);
> > > >> }
> > > >> }
> > > >>
> > > >> class SystemFunctionIdentifier implements FunctionIdentifier {...}
> > > >>
> > > >> WDYT?
> > > >>
> > > >> On Wed, 25 Sep 2019, 04:50 Xuefu Z, <usxu...@gmail.com> wrote:
> > > >>
> > > >> > +1. LGTM
> > > >> >
> > > >> > On Tue, Sep 24, 2019 at 6:09 AM Terry Wang <zjuwa...@gmail.com>
> > > wrote:
> > > >> >
> > > >> > > +1
> > > >> > >
> > > >> > > Best,
> > > >> > > Terry Wang
> > > >> > >
> > > >> > >
> > > >> > >
> > > >> > > > 在 2019年9月24日,上午10:42,Kurt Young <ykt...@gmail.com> 写道:
> > > >> > > >
> > > >> > > > +1
> > > >> > > >
> > > >> > > > Best,
> > > >> > > > Kurt
> > > >> > > >
> > > >> > > >
> > > >> > > > On Tue, Sep 24, 2019 at 2:30 AM Bowen Li <bowenl...@gmail.com
> >
> > > >> wrote:
> > > >> > > >
> > > >> > > >> Hi all,
> > > >> > > >>
> > > >> > > >> I'd like to start a voting thread for FLIP-57 [1], which
> we've
> > > >> reached
> > > >> > > >> consensus in [2].
> > > >> > > >>
> > > >> > > >> This voting will be open for minimum 3 days till 6:30pm UTC,
> > Sep
> > > >> 26.
> > > >> > > >>
> > > >> > > >> Thanks,
> > > >> > > >> Bowen
> > > >> > > >>
> > > >> > > >> [1]
> > > >> > > >>
> > > >> > > >>
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-57%3A+Rework+FunctionCatalog
> > > >> > > >> [2]
> > > >> > > >>
> > > >> > > >>
> > > >> > >
> > > >> >
> > > >>
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-57-Rework-FunctionCatalog-td32291.html#a32613
> > > >> > > >>
> > > >> > >
> > > >> > >
> > > >> >
> > > >> > --
> > > >> > Xuefu Zhang
> > > >> >
> > > >> > "In Honey We Trust!"
> > > >> >
> > > >>
> > > >
> > >
> >
>


-- 
Best regards!
Rui Li

Reply via email to