If something like the yaml file is the way to go and achieve such
motivation, we would cover that with current design.

On Tue, Oct 1, 2019 at 12:05 Bowen Li <bowenl...@gmail.com> wrote:

> Hi Timo, Dawid,
>
> I've added the suggested SQL and related changes to TableEnvironment API
> and other classes to the google doc. Also removed "USE MODULE" and its
> APIs. Will update FLIP wiki once we have a consensus.
>
> W.r.t. descriptor approach, my gut feeling is similar to Dawid's. Besides,
> I feel yaml file would be a better solution to persist serializable state
> of an environment as the file itself is in serializable format already.
> Though yaml file only serves SQL CLI at this moment, we may be able to
> extend its reach to Table API and allow users to load/offload a
> TableEnvironment from/to yaml files, as something like "TableEnvironment
> tEnv = TableEnvironment.loadFromYaml(<file_path>)" and
> "tEnv.offloadToYaml(<file_path>)" to restore and persist state, and try to
> make yaml file more expressive.
>
>
> On Tue, Oct 1, 2019 at 6:47 AM Dawid Wysakowicz <dwysakow...@apache.org>
> wrote:
>
>> Hi Timo, Bowen,
>>
>> Unfortunately I did not have enough time to go through all the
>> suggestions in details so I can not comment on the whole FLIP.
>>
>> I just wanted to give my opinion on the "descriptor approach in
>> loadModule" part. I am not sure if we need it here. We might be
>> overthinking this a bit. It definitely makes sense for objects like
>> TableSource/TableSink etc. as they are logical definitions that nearly
>> always have to be persisted in a Catalog. I'm not sure if we really need
>> the same for a whole session. If we need a resume session feature, the
>> way to go would probably be to keep the session in memory on the server
>> side. I fear we will never be able to serialize the whole session
>> entirely (temporary objects, objects derived from DataStream etc.)
>>
>> I think it is ok to use instances for objects like Catalogs or Modules
>> and have an overlay on top of that that can create instances from
>> properties.
>>
>> Best,
>>
>> Dawid
>>
>> On 01/10/2019 11:28, Timo Walther wrote:
>> > Hi Bowen,
>> >
>> > thanks for your response.
>> >
>> > Re 2) I also don't have a better approach for this issue. It is
>> > similar to changing the general TableConfig between two statements. It
>> > would be good to add your explanation to the design document.
>> >
>> > Re 3) It would be interesting to know about which "core" functions we
>> > are actually talking about. Also for the overriding built-in functions
>> > that we discussed in the other FLIP. But I'm fine with leaving it to
>> > the user for now. How about we just introduce loadModule(),
>> > unloadModule() methods instead of useModules()? This would ensure that
>> > users don't forget to add the core module when adding an additional
>> > module and they need to explicitly call "unloadModule('core')".
>> >
>> > Re 4) Every table environment feature should also be designed with SQL
>> > statements in mind to verify the concept. SQL is also more popular
>> > that Java/Scala API or YAML file. I would like to add it to 1.10 for
>> > marking the feature as complete.
>> >
>> > SHOW MODULES -> sounds good to me, we should add a listModules():
>> > List<String> method to table environment
>> >
>> > LOAD MODULE 'hive' [WITH ('prop'='myProp', ...)] --> we should add a
>> > loadModule() method to table environment
>> >
>> > UNLOAD MODULE 'hive' --> we should add a unloadModule() method to
>> > table environment
>> >
>> > I would not introduce `USE MODULES 'x' 'y' 'z'` for simplicity and
>> > concise API. Users need to load the module anyway with properties.
>> > They can also load them "in order" immediately. CREATE TABLE can also
>> > not create multiple tables but only one at a time in that order.
>> >
>> > One thing that came to my mind, shall we use a descriptor approach for
>> > loadModule()? The past has shown that passing instances causes
>> > problems when persisting objects. That's why we also want to get rid
>> > of registerTableSource. I could image that users might want to persist
>> > a table environment's state for later use in the future. Even though
>> > this is future work, we should already keep such use cases in mind
>> > when adding new API methods. What do you think?
>> >
>> > Regards,
>> > Timo
>> >
>> >
>> > On 30.09.19 23:17, Bowen Li wrote:
>> >> Hi Timo,
>> >>
>> >> Re 1) I agree. I renamed the title to "Extend Core Table System with
>> >> Pluggable Modules" and all internal references
>> >>
>> >> Re 2) First, I'll rename the API to useModules(). The design doesn't
>> >> forbid
>> >> users to call useModules() multi times. Objects in modules are loaded
>> on
>> >> demand instead of eagerly, so there won't be inconsistency. Users
>> >> have to
>> >> be fully aware of the consequences of resetting modules as that might
>> >> cause
>> >> that some objects can not be referenced anymore or resolution order
>> >> of some
>> >> objects changes.
>> >>
>> >> Re 3) Yes, we'd leave that to users.
>> >>
>> >> Another approach can be to have a non-optional "Core" module for all
>> >> objects that cannot be overrode like "CAST" and "AS" functions, and
>> >> have an
>> >> optional "ExtendedCore" module for other replaceable built-in objects.
>> >> "Core" should be positioned at the 1st in module list all the time.
>> >>
>> >> I'm fine with either solution.
>> >>
>> >> Re 4) It may sound like a nice-to-have advanced feature for 1.10, but
>> we
>> >> can surely fully discuss it for the sake of feature completeness.
>> >>
>> >> Unlike other configs, the order of modules would matter in Flink, and
>> it
>> >> implies the LOAD/UNLOAD commands would not be equal in operation
>> >> positions.
>> >> IIUYC, LOAD MODULE 'x' would be interpreted as appending x to the end
>> of
>> >> module list, and UNLOAD MODULE 'x' would be interpreted as removing x
>> >> from
>> >> any position in the list?
>> >>
>> >> I'm thinking of the following list of commands:
>> >>
>> >> SHOW MODULES - list modules in order
>> >> LOAD MODULE 'hive' [WITH ('prop'='myProp', ...)] - load and append the
>> >> module to end of the module list
>> >> UNLOAD MODULE 'hive' - remove the module from module list, and other
>> >> modules remain the same relative positions
>> >> USE MODULES 'x' 'y' 'z' (wondering can parser take "'x' 'y' 'z'"?),
>> >> or USE
>> >> MODULES 'x,y,z' - to reorder module list completely
>> >>
>> >
>>
>>

Reply via email to