Hi,

I think above are some valid points, and we can adopt the suggestions.

To elaborate a bit on the new SQL syntax, it would imply that, unlike "SHOW
FUNCTION" which only return function names, "SHOW ALL [TEMPORARY]
FUNCTIONS" would return functions' fully qualified names with catalog and
db names.



On Mon, Sep 30, 2019 at 6:38 AM Timo Walther <twal...@apache.org> wrote:

> Hi all,
>
> I support Fabian's arguments. In my opinion, temporary objects should
> just be an additional layer on top of the regular catalog/database
> lookup logic. Thus, a temporary table or function has always highest
> precedence and should be stable within the local session. Otherwise it
> could magically disappear while someone else is performing modifications
> in the catalog.
>
> Furthermore, this feature is very useful for prototyping as users can
> simply express that a catalog/database is present even through they
> might not have access to it currently.
>
> Regards,
> Timo
>
>
> On 30.09.19 14:57, Fabian Hueske wrote:
> > Hi all,
> >
> > Sorry for the late reply.
> >
> > I think it might lead to confusing situations if temporary functions (or
> > any temporary db objects for that matter) are bound to the life cycle of
> an
> > (external) db/catalog.
> > Imaging a situation where you create a temp function in a db in an
> external
> > catalog and use it but at some point it does not work anymore because
> some
> > other dropped the database from the external catalog.
> > Shouldn't temporary objects be only controlled by the owner of a session?
> >
> > I agree that creating temp objects in non-existing db/catalogs sounds a
> bit
> > strange, but IMO the opposite (the db/catalog must exist for a temp
> > function to be created/exist) can have significant implications like the
> > one I described.
> > I think it would be quite easy for users to understand that temporary
> > objects are solely owned by them (and their session).
> > The problem of listing temporary objects could be solved by adding a ALL
> > [TEMPORARY] clause:
> >
> > SHOW ALL FUNCTIONS; could show all functions regardless of the
> > catalog/database including temporary functions.
> > SHOW ALL TEMPORARY FUNCTIONS; could show all temporary functions
> regardless
> > of the catalog/database.
> >
> > Best,
> > Fabian
> >
> > Am Sa., 28. Sept. 2019 um 02:21 Uhr schrieb Bowen Li <
> bowenl...@gmail.com>:
> >
> >> @Dawid, do you have any other concerns? If not, I hope we can close the
> >> voting.
> >>
> >>
> >> On Thu, Sep 26, 2019 at 8:14 PM Rui Li <lirui.fu...@gmail.com> wrote:
> >>
> >>> I'm not sure how much benefit #1 can bring us. If users just want to
> try
> >>> out temporary functions, they can create temporary system functions
> which
> >>> don't require a catalog/DB. IIUC, the main reason why we allow
> temporary
> >>> catalog function is to let users override permanent catalog functions.
> >>> Therefore a temporary function in a non-existing catalog won't serve
> that
> >>> purpose. Besides, each session is provided with a default catalog and
> DB.
> >>> So even if users simply want to create some catalog functions they can
> >>> forget about after the session, wouldn't the default catalog/DB be
> enough
> >>> for such experiments?
> >>>
> >>> On Thu, Sep 26, 2019 at 4:38 AM Bowen Li <bowenl...@gmail.com> wrote:
> >>>
> >>>> Re 1) As described in the FLIP, a temp function lookup will first make
> >>> sure
> >>>> the db exists. If the db doesn't exist, a lazy drop is triggered to
> >>> remove
> >>>> that temp function.
> >>>>
> >>>> I agree Hive doesn't handle it consistently, and we are not copying
> >> Hive.
> >>>> IMHO, allowing registering temp functions in nonexistent catalog/db is
> >>>> hacky and problematic. For instance, "SHOW FUNCTIONS" would list
> system
> >>>> functions and functions in the current catalog/db, since users cannot
> >>>> designate a nonexistent catalog/db as current ones, how can they list
> >>>> functions in nonexistent catalog/db? They may end up never knowing
> what
> >>>> temp functions they've created unless trying out with queries or we
> >>>> introducing some more nonstandard SQL statements. The same applies to
> >>> other
> >>>> temp objects like temp tables.
> >>>>
> >>>> Re 2) A standalone FunctionIdentifier sounds good to me
> >>>>
> >>>> On Wed, Sep 25, 2019 at 4:46 AM Dawid Wysakowicz <
> >>>> wysakowicz.da...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Ad. 1
> >>>>> I wouldn't say it is hacky.
> >>>>> Moreover, how do you want ensure that the dB always exists when a
> >>>> temporary
> >>>>> object is used?( in this particular case function). Do you want to
> >>> query
> >>>>> for the database existence whenever e.g a temporary function is
> >> used? I
> >>>>> think important aspect here is that the database can be dropped from
> >>>>> external system, not just flink or a different flink session.
> >>>>>
> >>>>> E.g in case of hive, you cannot create a temporary table in a
> >> database
> >>>> that
> >>>>> does not exist, that's true. But if you create a temporary table in a
> >>>>> database and drop that database from a different session, you can
> >> still
> >>>>> query the previously created temporary table from the original
> >> session.
> >>>> It
> >>>>> does not sound like a consistent behaviour to me. Why don't we make
> >>> this
> >>>>> behaviour of not binding a temporary objects to the lifetime of a
> >>>> database
> >>>>> explicit as part of the temporary objects contract? In the end they
> >>> exist
> >>>>> in different layers. Permanent objects & databases in a catalog (in
> >>> case
> >>>> of
> >>>>> hive megastore) whereas temporary objects in flink sessions. That's
> >>> also
> >>>>> true for the original hive client. The temporary objects live in the
> >>> hive
> >>>>> client whereas databases are created in the metastore.
> >>>>>
> >>>>> Ad.2
> >>>>> I'm open for suggestions here. The one thing I wanted to achieve here
> >>> is
> >>>> so
> >>>>> that we do not change the contract of ObjectIdentifier. One important
> >>>> thing
> >>>>> to remember here is that we need the function identifier to be part
> >> of
> >>>> the
> >>>>> FunctionDefinition object and not only as the result of the function
> >>>>> lookup. At some point we want to be able to store QueryOperations in
> >>> the
> >>>>> catalogs. They can contain function calls within which we need to
> >> have
> >>>> the
> >>>>> identifier.
> >>>>>
> >>>>> I agree my initial suggestion is over complicated. How about we have
> >>> just
> >>>>> the FunctionIdentifier as top level class without making the
> >>>>> ObjectIdentifier extend from it? I think it's pretty much the same
> >> what
> >>>> you
> >>>>> suggested. The only difference is that it would be a top level class
> >>>> with a
> >>>>> more descriptive name.
> >>>>>
> >>>>>
> >>>>> On Wed, 25 Sep 2019, 13:57 Bowen Li, <bowenl...@gmail.com> wrote:
> >>>>>
> >>>>>> Sorry, I missed some parts of the solution. The complete
> >> alternative
> >>> is
> >>>>> the
> >>>>>> following, basically having separate APIs in FunctionLookup for
> >>>> ambiguous
> >>>>>> and precise function lookup since planner is able to tell which API
> >>> to
> >>>>> call
> >>>>>> with parsed queries, and have a unified result:
> >>>>>>
> >>>>>> ```
> >>>>>> class FunctionLookup {
> >>>>>>
> >>>>>> Optional<Result> lookupAmbiguousFunction(String name);
> >>>>>>
> >>>>>>
> >>>>>> Optional<Result> lookupPreciseFunction(ObjectIdentifier oi);
> >>>>>>
> >>>>>>
> >>>>>> class Result {
> >>>>>>      Optional<ObjectIdentifier>  getObjectIdentifier() { ... }
> >>>>>>      String getName() { ... }
> >>>>>>      // ...
> >>>>>> }
> >>>>>>
> >>>>>> }
> >>>>>> ```
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Bowen
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Sep 24, 2019 at 9:42 PM Bowen Li <bowenl...@gmail.com>
> >>> wrote:
> >>>>>>> Hi Dawid,
> >>>>>>>
> >>>>>>> Re 1): I agree making it easy for users to run experiments is
> >>>>> important.
> >>>>>>> However, I'm not sure allowing users to register temp functions
> >> in
> >>>>>>> nonexistent catalog/db is the optimal way. It seems a bit hacky,
> >>> and
> >>>>>> breaks
> >>>>>>> the current contract between Flink and users that catalog/db must
> >>> be
> >>>>>> valid
> >>>>>>> in order to operate on.
> >>>>>>>
> >>>>>>> How about we instead focus on making it convenient to create
> >>>> catalogs?
> >>>>>>> Users actually can already do it with ease via program or SQL CLI
> >>>> yaml
> >>>>>> file
> >>>>>>> for an in-memory catalog which has neither extra dependency nor
> >>>>> external
> >>>>>>> connections. What we can further improve is DDL for catalogs,
> >> and I
> >>>>>> raised
> >>>>>>> it in discussion of [FLIP 69 - Flink SQL DDL Enhancement] driven
> >> by
> >>>>> Terry
> >>>>>>> now.
> >>>>>>>
> >>>>>>> In that case, if users would like to experiment via SQL, they can
> >>>>> easily
> >>>>>>> create an in memory catalog/database using DDL, then play with
> >> temp
> >>>>>>> functions.
> >>>>>>>
> >>>>>>> Re 2): For the assumption, IIUIC, function ObjectIdentifier has
> >> not
> >>>>> been
> >>>>>>> resolved when stack call reaches
> >> FunctionCatalog#lookupFunction(),
> >>>> but
> >>>>>> only
> >>>>>>> been parsed?
> >>>>>>>
> >>>>>>> I agree keeping ObjectIdentifier as-is would be good. I'm ok with
> >>> the
> >>>>>>> suggested classes, though making ObjectIdentifier a subclass of
> >>>>>>> FunctionIdentifier seem a bit counter intuitive.
> >>>>>>>
> >>>>>>> Another potentially simpler way is:
> >>>>>>>
> >>>>>>> ```
> >>>>>>> // in class FunctionLookup
> >>>>>>> class Result {
> >>>>>>>      Optional<ObjectIdentifier>  getObjectIdentifier() { ... }
> >>>>>>>      String getName() { ... }
> >>>>>>>      ...
> >>>>>>> }
> >>>>>>> ```
> >>>>>>>
> >>>>>>> WDYT?
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Sep 24, 2019 at 3:41 PM Dawid Wysakowicz <
> >>>>>>> wysakowicz.da...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Hi,
> >>>>>>>> I really like the flip and think it clarifies important aspects
> >> of
> >>>> the
> >>>>>>>> system.
> >>>>>>>>
> >>>>>>>> I have two, I hope small suggestions, which will not take much
> >>> time
> >>>> to
> >>>>>>>> agree on.
> >>>>>>>>
> >>>>>>>> 1. Could we follow the MySQL approach in regards to the
> >> existence
> >>> of
> >>>>>>>> cat/db
> >>>>>>>> for temporary functions? That means not to check it, so e.g.
> >> it's
> >>>>>> possible
> >>>>>>>> to create a temporary function in a database that does not
> >> exist.
> >>> I
> >>>>>> think
> >>>>>>>> it's really useful e.g in cases when user wants to perform
> >>>> experiments
> >>>>>> but
> >>>>>>>> does not have access to the db yet or temporarily does not have
> >>>>>> connection
> >>>>>>>> to a catalog.
> >>>>>>>> 2. Could we not change the ObjectIdentifier? Could we not loosen
> >>> the
> >>>>>>>> requirements for all catalog objects such as tables, views,
> >> types
> >>>> just
> >>>>>> for
> >>>>>>>> the functions? It's really important later on from e.g the
> >>>>>> serializability
> >>>>>>>> perspective. The important aspect of the ObjectIdentifier is
> >> that
> >>> we
> >>>>>> know
> >>>>>>>> that it has been resolved. The suggested changes break that
> >>>>> assumption.
> >>>>>>>> What do you think about adding an interface FunctionIdentifier {
> >>>>>>>>
> >>>>>>>> String getName();
> >>>>>>>>
> >>>>>>>> /**
> >>>>>>>>    Return 3-part identifier. Empty in case of a built-in
> >> function.
> >>>>>>>> */
> >>>>>>>> Optional<ObjectIdentifier> getObjectIdentifier()
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> class ObjectIdentifier implements FunctionIdentifier {
> >>>>>>>> Optional<ObjectIdentifier> getObjectIdentifier() {
> >>>>>>>>   return Optional.of(this);
> >>>>>>>> }
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> class SystemFunctionIdentifier implements FunctionIdentifier
> >> {...}
> >>>>>>>> WDYT?
> >>>>>>>>
> >>>>>>>> On Wed, 25 Sep 2019, 04:50 Xuefu Z, <usxu...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> +1. LGTM
> >>>>>>>>>
> >>>>>>>>> On Tue, Sep 24, 2019 at 6:09 AM Terry Wang <
> >> zjuwa...@gmail.com>
> >>>>>> wrote:
> >>>>>>>>>> +1
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Terry Wang
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> 在 2019年9月24日,上午10:42,Kurt Young <ykt...@gmail.com> 写道:
> >>>>>>>>>>>
> >>>>>>>>>>> +1
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Kurt
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, Sep 24, 2019 at 2:30 AM Bowen Li <
> >>> bowenl...@gmail.com
> >>>>>>>> wrote:
> >>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'd like to start a voting thread for FLIP-57 [1], which
> >>>> we've
> >>>>>>>> reached
> >>>>>>>>>>>> consensus in [2].
> >>>>>>>>>>>>
> >>>>>>>>>>>> This voting will be open for minimum 3 days till 6:30pm
> >>> UTC,
> >>>>> Sep
> >>>>>>>> 26.
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Bowen
> >>>>>>>>>>>>
> >>>>>>>>>>>> [1]
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-57%3A+Rework+FunctionCatalog
> >>>>>>>>>>>> [2]
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-57-Rework-FunctionCatalog-td32291.html#a32613
> >>>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Xuefu Zhang
> >>>>>>>>>
> >>>>>>>>> "In Honey We Trust!"
> >>>>>>>>>
> >>>
> >>> --
> >>> Best regards!
> >>> Rui Li
> >>>
>
>

Reply via email to