If something like the yaml file is the way to go and achieve such motivation, we would cover that with current design.
On Tue, Oct 1, 2019 at 12:05 Bowen Li <bowenl...@gmail.com> wrote: > Hi Timo, Dawid, > > I've added the suggested SQL and related changes to TableEnvironment API > and other classes to the google doc. Also removed "USE MODULE" and its > APIs. Will update FLIP wiki once we have a consensus. > > W.r.t. descriptor approach, my gut feeling is similar to Dawid's. Besides, > I feel yaml file would be a better solution to persist serializable state > of an environment as the file itself is in serializable format already. > Though yaml file only serves SQL CLI at this moment, we may be able to > extend its reach to Table API and allow users to load/offload a > TableEnvironment from/to yaml files, as something like "TableEnvironment > tEnv = TableEnvironment.loadFromYaml(<file_path>)" and > "tEnv.offloadToYaml(<file_path>)" to restore and persist state, and try to > make yaml file more expressive. > > > On Tue, Oct 1, 2019 at 6:47 AM Dawid Wysakowicz <dwysakow...@apache.org> > wrote: > >> Hi Timo, Bowen, >> >> Unfortunately I did not have enough time to go through all the >> suggestions in details so I can not comment on the whole FLIP. >> >> I just wanted to give my opinion on the "descriptor approach in >> loadModule" part. I am not sure if we need it here. We might be >> overthinking this a bit. It definitely makes sense for objects like >> TableSource/TableSink etc. as they are logical definitions that nearly >> always have to be persisted in a Catalog. I'm not sure if we really need >> the same for a whole session. If we need a resume session feature, the >> way to go would probably be to keep the session in memory on the server >> side. I fear we will never be able to serialize the whole session >> entirely (temporary objects, objects derived from DataStream etc.) >> >> I think it is ok to use instances for objects like Catalogs or Modules >> and have an overlay on top of that that can create instances from >> properties. >> >> Best, >> >> Dawid >> >> On 01/10/2019 11:28, Timo Walther wrote: >> > Hi Bowen, >> > >> > thanks for your response. >> > >> > Re 2) I also don't have a better approach for this issue. It is >> > similar to changing the general TableConfig between two statements. It >> > would be good to add your explanation to the design document. >> > >> > Re 3) It would be interesting to know about which "core" functions we >> > are actually talking about. Also for the overriding built-in functions >> > that we discussed in the other FLIP. But I'm fine with leaving it to >> > the user for now. How about we just introduce loadModule(), >> > unloadModule() methods instead of useModules()? This would ensure that >> > users don't forget to add the core module when adding an additional >> > module and they need to explicitly call "unloadModule('core')". >> > >> > Re 4) Every table environment feature should also be designed with SQL >> > statements in mind to verify the concept. SQL is also more popular >> > that Java/Scala API or YAML file. I would like to add it to 1.10 for >> > marking the feature as complete. >> > >> > SHOW MODULES -> sounds good to me, we should add a listModules(): >> > List<String> method to table environment >> > >> > LOAD MODULE 'hive' [WITH ('prop'='myProp', ...)] --> we should add a >> > loadModule() method to table environment >> > >> > UNLOAD MODULE 'hive' --> we should add a unloadModule() method to >> > table environment >> > >> > I would not introduce `USE MODULES 'x' 'y' 'z'` for simplicity and >> > concise API. Users need to load the module anyway with properties. >> > They can also load them "in order" immediately. CREATE TABLE can also >> > not create multiple tables but only one at a time in that order. >> > >> > One thing that came to my mind, shall we use a descriptor approach for >> > loadModule()? The past has shown that passing instances causes >> > problems when persisting objects. That's why we also want to get rid >> > of registerTableSource. I could image that users might want to persist >> > a table environment's state for later use in the future. Even though >> > this is future work, we should already keep such use cases in mind >> > when adding new API methods. What do you think? >> > >> > Regards, >> > Timo >> > >> > >> > On 30.09.19 23:17, Bowen Li wrote: >> >> Hi Timo, >> >> >> >> Re 1) I agree. I renamed the title to "Extend Core Table System with >> >> Pluggable Modules" and all internal references >> >> >> >> Re 2) First, I'll rename the API to useModules(). The design doesn't >> >> forbid >> >> users to call useModules() multi times. Objects in modules are loaded >> on >> >> demand instead of eagerly, so there won't be inconsistency. Users >> >> have to >> >> be fully aware of the consequences of resetting modules as that might >> >> cause >> >> that some objects can not be referenced anymore or resolution order >> >> of some >> >> objects changes. >> >> >> >> Re 3) Yes, we'd leave that to users. >> >> >> >> Another approach can be to have a non-optional "Core" module for all >> >> objects that cannot be overrode like "CAST" and "AS" functions, and >> >> have an >> >> optional "ExtendedCore" module for other replaceable built-in objects. >> >> "Core" should be positioned at the 1st in module list all the time. >> >> >> >> I'm fine with either solution. >> >> >> >> Re 4) It may sound like a nice-to-have advanced feature for 1.10, but >> we >> >> can surely fully discuss it for the sake of feature completeness. >> >> >> >> Unlike other configs, the order of modules would matter in Flink, and >> it >> >> implies the LOAD/UNLOAD commands would not be equal in operation >> >> positions. >> >> IIUYC, LOAD MODULE 'x' would be interpreted as appending x to the end >> of >> >> module list, and UNLOAD MODULE 'x' would be interpreted as removing x >> >> from >> >> any position in the list? >> >> >> >> I'm thinking of the following list of commands: >> >> >> >> SHOW MODULES - list modules in order >> >> LOAD MODULE 'hive' [WITH ('prop'='myProp', ...)] - load and append the >> >> module to end of the module list >> >> UNLOAD MODULE 'hive' - remove the module from module list, and other >> >> modules remain the same relative positions >> >> USE MODULES 'x' 'y' 'z' (wondering can parser take "'x' 'y' 'z'"?), >> >> or USE >> >> MODULES 'x,y,z' - to reorder module list completely >> >> >> > >> >>