@Dawid, do you have any other concerns? If not, I hope we can close the
voting.


On Thu, Sep 26, 2019 at 8:14 PM Rui Li <lirui.fu...@gmail.com> wrote:

> I'm not sure how much benefit #1 can bring us. If users just want to try
> out temporary functions, they can create temporary system functions which
> don't require a catalog/DB. IIUC, the main reason why we allow temporary
> catalog function is to let users override permanent catalog functions.
> Therefore a temporary function in a non-existing catalog won't serve that
> purpose. Besides, each session is provided with a default catalog and DB.
> So even if users simply want to create some catalog functions they can
> forget about after the session, wouldn't the default catalog/DB be enough
> for such experiments?
>
> On Thu, Sep 26, 2019 at 4:38 AM Bowen Li <bowenl...@gmail.com> wrote:
>
> > Re 1) As described in the FLIP, a temp function lookup will first make
> sure
> > the db exists. If the db doesn't exist, a lazy drop is triggered to
> remove
> > that temp function.
> >
> > I agree Hive doesn't handle it consistently, and we are not copying Hive.
> >
> > IMHO, allowing registering temp functions in nonexistent catalog/db is
> > hacky and problematic. For instance, "SHOW FUNCTIONS" would list system
> > functions and functions in the current catalog/db, since users cannot
> > designate a nonexistent catalog/db as current ones, how can they list
> > functions in nonexistent catalog/db? They may end up never knowing what
> > temp functions they've created unless trying out with queries or we
> > introducing some more nonstandard SQL statements. The same applies to
> other
> > temp objects like temp tables.
> >
> > Re 2) A standalone FunctionIdentifier sounds good to me
> >
> > On Wed, Sep 25, 2019 at 4:46 AM Dawid Wysakowicz <
> > wysakowicz.da...@gmail.com>
> > wrote:
> >
> > > Ad. 1
> > > I wouldn't say it is hacky.
> > > Moreover, how do you want ensure that the dB always exists when a
> > temporary
> > > object is used?( in this particular case function). Do you want to
> query
> > > for the database existence whenever e.g a temporary function is used? I
> > > think important aspect here is that the database can be dropped from
> > > external system, not just flink or a different flink session.
> > >
> > > E.g in case of hive, you cannot create a temporary table in a database
> > that
> > > does not exist, that's true. But if you create a temporary table in a
> > > database and drop that database from a different session, you can still
> > > query the previously created temporary table from the original session.
> > It
> > > does not sound like a consistent behaviour to me. Why don't we make
> this
> > > behaviour of not binding a temporary objects to the lifetime of a
> > database
> > > explicit as part of the temporary objects contract? In the end they
> exist
> > > in different layers. Permanent objects & databases in a catalog (in
> case
> > of
> > > hive megastore) whereas temporary objects in flink sessions. That's
> also
> > > true for the original hive client. The temporary objects live in the
> hive
> > > client whereas databases are created in the metastore.
> > >
> > > Ad.2
> > > I'm open for suggestions here. The one thing I wanted to achieve here
> is
> > so
> > > that we do not change the contract of ObjectIdentifier. One important
> > thing
> > > to remember here is that we need the function identifier to be part of
> > the
> > > FunctionDefinition object and not only as the result of the function
> > > lookup. At some point we want to be able to store QueryOperations in
> the
> > > catalogs. They can contain function calls within which we need to have
> > the
> > > identifier.
> > >
> > > I agree my initial suggestion is over complicated. How about we have
> just
> > > the FunctionIdentifier as top level class without making the
> > > ObjectIdentifier extend from it? I think it's pretty much the same what
> > you
> > > suggested. The only difference is that it would be a top level class
> > with a
> > > more descriptive name.
> > >
> > >
> > > On Wed, 25 Sep 2019, 13:57 Bowen Li, <bowenl...@gmail.com> wrote:
> > >
> > > > Sorry, I missed some parts of the solution. The complete alternative
> is
> > > the
> > > > following, basically having separate APIs in FunctionLookup for
> > ambiguous
> > > > and precise function lookup since planner is able to tell which API
> to
> > > call
> > > > with parsed queries, and have a unified result:
> > > >
> > > > ```
> > > > class FunctionLookup {
> > > >
> > > > Optional<Result> lookupAmbiguousFunction(String name);
> > > >
> > > >
> > > > Optional<Result> lookupPreciseFunction(ObjectIdentifier oi);
> > > >
> > > >
> > > > class Result {
> > > >     Optional<ObjectIdentifier>  getObjectIdentifier() { ... }
> > > >     String getName() { ... }
> > > >     // ...
> > > > }
> > > >
> > > > }
> > > > ```
> > > >
> > > > Thanks,
> > > > Bowen
> > > >
> > > >
> > > > On Tue, Sep 24, 2019 at 9:42 PM Bowen Li <bowenl...@gmail.com>
> wrote:
> > > >
> > > > > Hi Dawid,
> > > > >
> > > > > Re 1): I agree making it easy for users to run experiments is
> > > important.
> > > > > However, I'm not sure allowing users to register temp functions in
> > > > > nonexistent catalog/db is the optimal way. It seems a bit hacky,
> and
> > > > breaks
> > > > > the current contract between Flink and users that catalog/db must
> be
> > > > valid
> > > > > in order to operate on.
> > > > >
> > > > > How about we instead focus on making it convenient to create
> > catalogs?
> > > > > Users actually can already do it with ease via program or SQL CLI
> > yaml
> > > > file
> > > > > for an in-memory catalog which has neither extra dependency nor
> > > external
> > > > > connections. What we can further improve is DDL for catalogs, and I
> > > > raised
> > > > > it in discussion of [FLIP 69 - Flink SQL DDL Enhancement] driven by
> > > Terry
> > > > > now.
> > > > >
> > > > > In that case, if users would like to experiment via SQL, they can
> > > easily
> > > > > create an in memory catalog/database using DDL, then play with temp
> > > > > functions.
> > > > >
> > > > > Re 2): For the assumption, IIUIC, function ObjectIdentifier has not
> > > been
> > > > > resolved when stack call reaches FunctionCatalog#lookupFunction(),
> > but
> > > > only
> > > > > been parsed?
> > > > >
> > > > > I agree keeping ObjectIdentifier as-is would be good. I'm ok with
> the
> > > > > suggested classes, though making ObjectIdentifier a subclass of
> > > > > FunctionIdentifier seem a bit counter intuitive.
> > > > >
> > > > > Another potentially simpler way is:
> > > > >
> > > > > ```
> > > > > // in class FunctionLookup
> > > > > class Result {
> > > > >     Optional<ObjectIdentifier>  getObjectIdentifier() { ... }
> > > > >     String getName() { ... }
> > > > >     ...
> > > > > }
> > > > > ```
> > > > >
> > > > > WDYT?
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Sep 24, 2019 at 3:41 PM Dawid Wysakowicz <
> > > > > wysakowicz.da...@gmail.com> wrote:
> > > > >
> > > > >> Hi,
> > > > >> I really like the flip and think it clarifies important aspects of
> > the
> > > > >> system.
> > > > >>
> > > > >> I have two, I hope small suggestions, which will not take much
> time
> > to
> > > > >> agree on.
> > > > >>
> > > > >> 1. Could we follow the MySQL approach in regards to the existence
> of
> > > > >> cat/db
> > > > >> for temporary functions? That means not to check it, so e.g. it's
> > > > possible
> > > > >> to create a temporary function in a database that does not exist.
> I
> > > > think
> > > > >> it's really useful e.g in cases when user wants to perform
> > experiments
> > > > but
> > > > >> does not have access to the db yet or temporarily does not have
> > > > connection
> > > > >> to a catalog.
> > > > >> 2. Could we not change the ObjectIdentifier? Could we not loosen
> the
> > > > >> requirements for all catalog objects such as tables, views, types
> > just
> > > > for
> > > > >> the functions? It's really important later on from e.g the
> > > > serializability
> > > > >> perspective. The important aspect of the ObjectIdentifier is that
> we
> > > > know
> > > > >> that it has been resolved. The suggested changes break that
> > > assumption.
> > > > >>
> > > > >> What do you think about adding an interface FunctionIdentifier {
> > > > >>
> > > > >> String getName();
> > > > >>
> > > > >> /**
> > > > >>   Return 3-part identifier. Empty in case of a built-in function.
> > > > >> */
> > > > >> Optional<ObjectIdentifier> getObjectIdentifier()
> > > > >> }
> > > > >>
> > > > >> class ObjectIdentifier implements FunctionIdentifier {
> > > > >> Optional<ObjectIdentifier> getObjectIdentifier() {
> > > > >>  return Optional.of(this);
> > > > >> }
> > > > >> }
> > > > >>
> > > > >> class SystemFunctionIdentifier implements FunctionIdentifier {...}
> > > > >>
> > > > >> WDYT?
> > > > >>
> > > > >> On Wed, 25 Sep 2019, 04:50 Xuefu Z, <usxu...@gmail.com> wrote:
> > > > >>
> > > > >> > +1. LGTM
> > > > >> >
> > > > >> > On Tue, Sep 24, 2019 at 6:09 AM Terry Wang <zjuwa...@gmail.com>
> > > > wrote:
> > > > >> >
> > > > >> > > +1
> > > > >> > >
> > > > >> > > Best,
> > > > >> > > Terry Wang
> > > > >> > >
> > > > >> > >
> > > > >> > >
> > > > >> > > > 在 2019年9月24日,上午10:42,Kurt Young <ykt...@gmail.com> 写道:
> > > > >> > > >
> > > > >> > > > +1
> > > > >> > > >
> > > > >> > > > Best,
> > > > >> > > > Kurt
> > > > >> > > >
> > > > >> > > >
> > > > >> > > > On Tue, Sep 24, 2019 at 2:30 AM Bowen Li <
> bowenl...@gmail.com
> > >
> > > > >> wrote:
> > > > >> > > >
> > > > >> > > >> Hi all,
> > > > >> > > >>
> > > > >> > > >> I'd like to start a voting thread for FLIP-57 [1], which
> > we've
> > > > >> reached
> > > > >> > > >> consensus in [2].
> > > > >> > > >>
> > > > >> > > >> This voting will be open for minimum 3 days till 6:30pm
> UTC,
> > > Sep
> > > > >> 26.
> > > > >> > > >>
> > > > >> > > >> Thanks,
> > > > >> > > >> Bowen
> > > > >> > > >>
> > > > >> > > >> [1]
> > > > >> > > >>
> > > > >> > > >>
> > > > >> > >
> > > > >> >
> > > > >>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-57%3A+Rework+FunctionCatalog
> > > > >> > > >> [2]
> > > > >> > > >>
> > > > >> > > >>
> > > > >> > >
> > > > >> >
> > > > >>
> > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-57-Rework-FunctionCatalog-td32291.html#a32613
> > > > >> > > >>
> > > > >> > >
> > > > >> > >
> > > > >> >
> > > > >> > --
> > > > >> > Xuefu Zhang
> > > > >> >
> > > > >> > "In Honey We Trust!"
> > > > >> >
> > > > >>
> > > > >
> > > >
> > >
> >
>
>
> --
> Best regards!
> Rui Li
>

Reply via email to