A minor comment on `useModules(List<String> names)`, would be better to use varargs here to a more fluent API: `useModules("a", "b", "c")`.
Besides, do we also need to add these new methods (useModules, listFullModules) to TableEnvironment? Best, Jark On Wed, 3 Feb 2021 at 18:36, Timo Walther <twal...@apache.org> wrote: > Thanks for the nice summary Jane. The summary looks great. Some minor > feedback: > > - Remove the `used` column for SHOW MODULES. It will always show true. > > - `List<Pair<String, Boolean>> listFullModules()` is a very long > signature. And `Pair` should be avoided in code because it is not very > descriptive. How about creating a POJO (static inner class of > ModuleManager) called `ModuleEntry` or similar. > > Otherwise +1 for the proposal. > > Regards, > Timo > > On 03.02.21 11:24, Jane Chan wrote: > > Hi everyone, > > > > I did a summary on the Jira issue page [1] since the discussion has > > achieved a consensus. If there is anything missed or not corrected, > please > > let me know. > > > > [1] https://issues.apache.org/jira/browse/FLINK-21045# > > > > Best, > > Jane > > > > > > > > > > > > On Wed, Feb 3, 2021 at 1:33 PM Jark Wu <imj...@gmail.com> wrote: > > > >> Hi Jane, > >> > >> Yes. I think we should fail fast. > >> > >> Best, > >> Jark > >> > >> On Wed, 3 Feb 2021 at 12:06, Jane Chan <qingyue....@gmail.com> wrote: > >> > >>> Hi everyone, > >>> > >>> Thanks for the discussion to make this improvement plan clearer. > >>> > >>> Hi, @Jark, @Rui, and @Timo, I'm collecting the final discussion > summaries > >>> now and want to confirm one thing that for the statement `USE MODULES x > >> [, > >>> y, z, ...]`, if the module name list contains an unexsited module, > shall > >> we > >>> #1 fail the execution for all of them or #2 enabled the rest modules > and > >>> return a warning to users? My personal preference goes to #1 for > >>> simplicity. What do you think? > >>> > >>> Best, > >>> Jane > >>> > >>> On Tue, Feb 2, 2021 at 3:53 PM Timo Walther <twal...@apache.org> > wrote: > >>> > >>>> +1 > >>>> > >>>> @Jane Can you summarize our discussion in the JIRA issue? > >>>> > >>>> Thanks, > >>>> Timo > >>>> > >>>> > >>>> On 02.02.21 03:50, Jark Wu wrote: > >>>>> Hi Timo, > >>>>> > >>>>>> Another question is whether a LOAD operation also adds the module to > >>> the > >>>>> enabled list by default? > >>>>> > >>>>> I would like to add the module to the enabled list by default, the > >> main > >>>>> reasons are: > >>>>> 1) Reordering is an advanced requirement, adding modules needs > >>> additional > >>>>> USE statements with "core" module > >>>>> sounds too burdensome. Most users should be satisfied with only > >> LOAD > >>>>> statements. > >>>>> 2) We should keep compatible for TableEnvironment#loadModule(). > >>>>> 3) We are using the LOAD statement instead of CREATE, so I think it's > >>>> fine > >>>>> that it does some implicit things. > >>>>> > >>>>> Best, > >>>>> Jark > >>>>> > >>>>> On Tue, 2 Feb 2021 at 00:48, Timo Walther <twal...@apache.org> > >> wrote: > >>>>> > >>>>>> Not the module itself but the ModuleManager should handle this case, > >>>> yes. > >>>>>> > >>>>>> Regards, > >>>>>> Timo > >>>>>> > >>>>>> > >>>>>> On 01.02.21 17:35, Jane Chan wrote: > >>>>>>> +1 to Jark's proposal > >>>>>>> > >>>>>>> To make it clearer, will `module#getFunctionDefinition()` > >> return > >>>> empty > >>>>>>> suppose the module is loaded but not enabled? > >>>>>>> > >>>>>>> Best, > >>>>>>> Jane > >>>>>>> > >>>>>>> On Mon, Feb 1, 2021 at 10:02 PM Timo Walther <twal...@apache.org> > >>>> wrote: > >>>>>>> > >>>>>>>> +1 to Jark's proposal > >>>>>>>> > >>>>>>>> I like the difference between just loading and actually enabling > >>> these > >>>>>>>> modules. > >>>>>>>> > >>>>>>>> @Rui: I would use the same behavior as catalogs here. You cannot > >>>> `USE` a > >>>>>>>> catalog without creating it before. > >>>>>>>> > >>>>>>>> Another question is whether a LOAD operation also adds the module > >> to > >>>> the > >>>>>>>> enabled list by default? > >>>>>>>> > >>>>>>>> Regards, > >>>>>>>> Timo > >>>>>>>> > >>>>>>>> On 01.02.21 13:52, Rui Li wrote: > >>>>>>>>> If `USE MODULES` implies unloading modules that are not listed, > >>> does > >>>> it > >>>>>>>>> also imply loading modules that are not previously loaded, > >>> especially > >>>>>>>> since > >>>>>>>>> we're mapping modules by name now? > >>>>>>>>> > >>>>>>>>> On Mon, Feb 1, 2021 at 8:20 PM Jark Wu <imj...@gmail.com> wrote: > >>>>>>>>> > >>>>>>>>>> I agree with Timo that the USE implies the specified modules are > >>> in > >>>>>> use > >>>>>>>> in > >>>>>>>>>> the specified order and others are not used. > >>>>>>>>>> This would be easier to know what's the result list and order > >>> after > >>>>>> the > >>>>>>>> USE > >>>>>>>>>> statement. > >>>>>>>>>> That means: if current modules in order are x, y, z. And `USE > >>>> MODULES > >>>>>>>> z, y` > >>>>>>>>>> means current modules in order are z, y. > >>>>>>>>>> > >>>>>>>>>> But I would like to not unload the unmentioned modules in the > >> USE > >>>>>>>>>> statement. Because it seems strange that USE > >>>>>>>>>> will implicitly remove modules. In the above example, the user > >> may > >>>>>> type > >>>>>>>> the > >>>>>>>>>> wrong modules list using USE by mistake > >>>>>>>>>> and would like to declare the list again, the user has to > >>> create > >>>>>> the > >>>>>>>>>> module again with some properties he may don't know. Therefore, > >> I > >>>>>>>> propose > >>>>>>>>>> the USE statement just specifies the current module lists and > >>>> doesn't > >>>>>>>>>> unload modules. > >>>>>>>>>> Besides that, we may need a new syntax to list all the modules > >>>>>> including > >>>>>>>>>> not used but loaded. > >>>>>>>>>> We can introduce SHOW FULL MODULES for this purpose with an > >>>> additional > >>>>>>>>>> `used` column. > >>>>>>>>>> > >>>>>>>>>> For example: > >>>>>>>>>> > >>>>>>>>>> Flink SQL> list modules: > >>>>>>>>>> ----------- > >>>>>>>>>> | modules | > >>>>>>>>>> ----------- > >>>>>>>>>> | x | > >>>>>>>>>> | y | > >>>>>>>>>> | z | > >>>>>>>>>> ----------- > >>>>>>>>>> Flink SQL> USE MODULES z, y; > >>>>>>>>>> Flink SQL> show modules: > >>>>>>>>>> ----------- > >>>>>>>>>> | modules | > >>>>>>>>>> ----------- > >>>>>>>>>> | z | > >>>>>>>>>> | y | > >>>>>>>>>> ----------- > >>>>>>>>>> Flink SQL> show FULL modules; > >>>>>>>>>> ------------------- > >>>>>>>>>> | modules | used | > >>>>>>>>>> ------------------- > >>>>>>>>>> | z | true | > >>>>>>>>>> | y | true | > >>>>>>>>>> | x | false | > >>>>>>>>>> ------------------- > >>>>>>>>>> Flink SQL> USE MODULES z, y, x; > >>>>>>>>>> Flink SQL> show modules; > >>>>>>>>>> ----------- > >>>>>>>>>> | modules | > >>>>>>>>>> ----------- > >>>>>>>>>> | z | > >>>>>>>>>> | y | > >>>>>>>>>> | x | > >>>>>>>>>> ----------- > >>>>>>>>>> > >>>>>>>>>> What do you think? > >>>>>>>>>> > >>>>>>>>>> Best, > >>>>>>>>>> Jark > >>>>>>>>>> > >>>>>>>>>> On Mon, 1 Feb 2021 at 19:02, Jane Chan <qingyue....@gmail.com> > >>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> Hi Timo, thanks for the discussion. > >>>>>>>>>>> > >>>>>>>>>>> It seems to reach an agreement regarding #3 that <1> Module > >> name > >>>>>> should > >>>>>>>>>>> better be a simple identifier rather than a string literal. <2> > >>>>>>>> Property > >>>>>>>>>>> `type` is redundant and should be removed, and mapping will > >> rely > >>> on > >>>>>> the > >>>>>>>>>>> module name because loading a module multiple times just using > >> a > >>>>>>>>>> different > >>>>>>>>>>> module name doesn't make much sense. <3> We should migrate to > >> the > >>>>>> newer > >>>>>>>>>> API > >>>>>>>>>>> rather than the deprecated `TableFactory` class. > >>>>>>>>>>> > >>>>>>>>>>> Regarding #1, I think the point lies in whether changing the > >>>>>> resolution > >>>>>>>>>>> order implies an `unload` operation explicitly (i.e., users > >> could > >>>>>> sense > >>>>>>>>>>> it). What do others think? > >>>>>>>>>>> > >>>>>>>>>>> Best, > >>>>>>>>>>> Jane > >>>>>>>>>>> > >>>>>>>>>>> On Mon, Feb 1, 2021 at 6:41 PM Timo Walther < > >> twal...@apache.org> > >>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>>> IMHO I would rather unload the not mentioned modules. The > >>>> statement > >>>>>>>>>>>> expresses `USE` that implicilty implies that the other modules > >>> are > >>>>>>>> "not > >>>>>>>>>>>> used". What do others think? > >>>>>>>>>>>> > >>>>>>>>>>>> Regards, > >>>>>>>>>>>> Timo > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> On 01.02.21 11:28, Jane Chan wrote: > >>>>>>>>>>>>> Hi Jark and Rui, > >>>>>>>>>>>>> > >>>>>>>>>>>>> Thanks for the discussions. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Regarding #1, I'm fine with `USE MODULES` syntax, and > >>>>>>>>>>>>> > >>>>>>>>>>>>>> It can be interpreted as "setting the current order of > >>> modules", > >>>>>>>>>> which > >>>>>>>>>>>> is > >>>>>>>>>>>>>> similar to "setting the current catalog" for `USE CATALOG`. > >>>>>>>>>>>>>> > >>>>>>>>>>>>> I would like to confirm that the unmentioned modules remain > >> in > >>>> the > >>>>>>>>>> same > >>>>>>>>>>>>> relative order? E.g., if there are three loaded modules `X`, > >>> `Y`, > >>>>>>>>>> `Z`, > >>>>>>>>>>>> then > >>>>>>>>>>>>> `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Regarding #3, I'm fine with mapping modules purely by name, > >>> and I > >>>>>>>>>> think > >>>>>>>>>>>>> Jark raised a good point on making the module name a simple > >>>>>>>>>> identifier > >>>>>>>>>>>>> instead of a string literal. For backward compatibility, > >> since > >>> we > >>>>>>>>>>> haven't > >>>>>>>>>>>>> supported this syntax yet, the affected users are those who > >>>> defined > >>>>>>>>>>>> modules > >>>>>>>>>>>>> in the YAML configuration file. Maybe we can eliminate the > >>> 'type' > >>>>>>>>>> from > >>>>>>>>>>>> the > >>>>>>>>>>>>> 'requiredContext' to make it optional. Thus the proposed > >>> mapping > >>>>>>>>>>>> mechanism > >>>>>>>>>>>>> could use the module name to lookup the suitable factory, > >> and > >>> in > >>>>>> the > >>>>>>>>>>>>> meanwhile updating documentation to encourage users to > >> simplify > >>>>>> their > >>>>>>>>>>>> YAML > >>>>>>>>>>>>> configuration. And in the long run, we can deprecate the > >>> 'type'. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Best, > >>>>>>>>>>>>> Jane > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Mon, Feb 1, 2021 at 4:19 PM Rui Li <lirui.fu...@gmail.com > >>> > >>>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> Thanks Jane for starting the discussion. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Regarding #1, I also prefer `USE MODULES` syntax. It can be > >>>>>>>>>>> interpreted > >>>>>>>>>>>> as > >>>>>>>>>>>>>> "setting the current order of modules", which is similar to > >>>>>> "setting > >>>>>>>>>>> the > >>>>>>>>>>>>>> current catalog" for `USE CATALOG`. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Regarding #3, I'm fine to map modules purely by name > >> because I > >>>>>> think > >>>>>>>>>>> it > >>>>>>>>>>>>>> satisfies all the use cases we have at hand. But I guess we > >>> need > >>>>>> to > >>>>>>>>>>> make > >>>>>>>>>>>>>> sure we're backward compatible, i.e. users don't need to > >>> change > >>>>>>>>>> their > >>>>>>>>>>>> yaml > >>>>>>>>>>>>>> files to configure the modules. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <imj...@gmail.com> > >>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Thanks Jane for the summary and starting the discussion in > >>> the > >>>>>>>>>>> mailing > >>>>>>>>>>>>>>> list. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Here are my thoughts: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> 1) syntax to reorder modules > >>>>>>>>>>>>>>> I agree with Rui Li it would be quite useful if we can have > >>>> some > >>>>>>>>>>> syntax > >>>>>>>>>>>>>> to > >>>>>>>>>>>>>>> reorder modules. > >>>>>>>>>>>>>>> I slightly prefer `USE MODULES x, y, z` than `RELOAD > >> MODULES > >>> x, > >>>>>> y, > >>>>>>>>>>> z`, > >>>>>>>>>>>>>>> because USE has a more sense of effective and specifying > >>>>>> ordering, > >>>>>>>>>>> than > >>>>>>>>>>>>>>> RELOAD. > >>>>>>>>>>>>>>> From my feeling, RELOAD just means we unregister and > >>>> register > >>>>>>>>>> x,y,z > >>>>>>>>>>>>>> modules > >>>>>>>>>>>>>>> again, > >>>>>>>>>>>>>>> it sounds like other registered modules are still in use > >> and > >>> in > >>>>>> the > >>>>>>>>>>>>>> order. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> 3) mapping modules purely by name > >>>>>>>>>>>>>>> This can definitely improve the usability of loading > >> modules, > >>>>>>>>>> because > >>>>>>>>>>>>>>> the 'type=' property > >>>>>>>>>>>>>>> looks really redundant. We can think of this as a syntax > >>> sugar > >>>>>> that > >>>>>>>>>>> the > >>>>>>>>>>>>>>> default type value is the module name. > >>>>>>>>>>>>>>> And we can support to specify 'type=' property in the > >> future > >>> to > >>>>>>>>>> allow > >>>>>>>>>>>>>>> multiple modules for one module type. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Besides, I would like to mention one more change, that the > >>>> module > >>>>>>>>>>> name > >>>>>>>>>>>>>>> proposed in FLIP-68 is a string literal. > >>>>>>>>>>>>>>> But I think we are all on the same page to change it into a > >>>>>> simple > >>>>>>>>>>>>>>> (non-compound) identifier. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> LOAD/UNLOAD MODULE 'core' > >>>>>>>>>>>>>>> ==> > >>>>>>>>>>>>>>> LOAD/UNLOAD MODULE core > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>> Jark > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On Sat, 30 Jan 2021 at 04:00, Jane Chan < > >>> qingyue....@gmail.com > >>>>> > >>>>>>>>>>> wrote: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Hi everyone, > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> I would like to start a discussion on FLINK-21045 [1] > >> about > >>>>>>>>>>> supporting > >>>>>>>>>>>>>>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first > >>>>>> proposed > >>>>>>>>>> by > >>>>>>>>>>>>>>>> FLIP-68 [2] as following. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> -- load a module with the given name and append it to the > >>> end > >>>> of > >>>>>>>>>> the > >>>>>>>>>>>>>>> module > >>>>>>>>>>>>>>>> list > >>>>>>>>>>>>>>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', > >>> ...)] > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> --unload a module by name from the module list and other > >>>> modules > >>>>>>>>>>>> remain > >>>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>> the same relative positions > >>>>>>>>>>>>>>>> UNLOAD MODULE 'name' > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> After a round of discussion on the Jira ticket, it seems > >>> some > >>>>>>>>>>>>>> unanswered > >>>>>>>>>>>>>>>> questions need more opinions and suggestions. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> 1. The way to redefine resolution order easily > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Rui Li suggested introducing `USE MODULES` and > >>> adding > >>>>>>>> similar > >>>>>>>>>>>>>>>> functionality to the API because > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> 1) It's very tedious to unload old modules just to > >>>> reorder > >>>>>>>>>> them. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> 2) Users may not even know how to "re-load" an old > >>> module > >>>>>> if it > >>>>>>>>>>> was > >>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>> initially loaded by the user, e.g. don't know which type > >> to > >>>>>> use. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Jane Chan wondered that module is not like the > >>> catalog > >>>>>> which > >>>>>>>>>>> has > >>>>>>>>>>>> a > >>>>>>>>>>>>>>>> concept of namespace could specify, and `USE` sounds like > >> a > >>>>>>>>>>>>>>>> mutual-exclusive concept. > >>>>>>>>>>>>>>>> Maybe `RELOAD MODULES` can express upgrading the > >>>>>> priority of > >>>>>>>>>>> the > >>>>>>>>>>>>>>> loaded > >>>>>>>>>>>>>>>> module(s). > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax > >>>>>>>>>>>>>>>> Jark Wu and Nicholas Jiang proposed to use > >>>> `CREATE/DROP > >>>>>>>>>> MODULE` > >>>>>>>>>>>>>>> instead > >>>>>>>>>>>>>>>> of `LOAD/UNLOAD MODULE` because > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> 1) From a pure SQL user's perspective, maybe > `CREATE > >>>>>> MODULE + > >>>>>>>>>> USE > >>>>>>>>>>>>>>>> MODULE` > >>>>>>>>>>>>>>>>> is easier to use rather than `LOAD/UNLOAD`. > >>>>>>>>>>>>>>>>> 2) This will be very similar to what the catalog > >> used > >>>> now. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Timo Walther would rather stick to the agreed > design > >>>>>> because > >>>>>>>>>>>>>>>> loading/unloading modules is a concept known from kernels > >>> etc. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> 3. Simplify the module design by mapping modules purely by > >>>> name > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> LOAD MODULE geo_utils > >>>>>>>>>>>>>>>> LOAD MODULE hive WITH ('version'='2.1') -- no dedicated > >>>>>>>>>>>>>>> 'type='/'module=' > >>>>>>>>>>>>>>>> but allow only 1 module to be loaded parameterized > >>>>>>>>>>>>>>>> UNLOAD hive > >>>>>>>>>>>>>>>> USE MODULES hive, core > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Please find more details in the reference link. Looking > >>>> forward > >>>>>> to > >>>>>>>>>>>> your > >>>>>>>>>>>>>>>> feedback. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-21045# > >>>>>>>>>>>>>>>> < > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>> > >>>>>> > >>>> > >>> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> [2] > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>> > >>>>>> > >>>> > >>> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>> Jane > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> -- > >>>>>>>>>>>>>> Best regards! > >>>>>>>>>>>>>> Rui Li > >>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>> > >>>> > >>>> > >>> > >> > > > >