Re: [DISCUSS] Restructure hudi-utilities module

2020-03-09 Thread Balaji Varadarajan
+1 on Vinoth's suggestion to wait until the lower level (write-client) is refactored and reorganized first. We can then look at Data-Source and DeltaStreamer to work out how best to organize them. Balaji.V On Sunday, March 8, 2020, 11:06:13 PM PDT, Vinoth Chandar wrote: >> make

Re: [DISCUSS] Restructure hudi-utilities module

2020-03-09 Thread Vinoth Chandar
>> make delta streamer an engine-agnostic part so that Spark and Flink can share some common logic. If we make the change at the Write Client level to make it engine-agnostic, it should help with most of the cases. I believe there will be Spark-specific pieces in the Source abstraction since

Re: [DISCUSS] Restructure hudi-utilities module

2020-03-04 Thread vino yang
Hi guys, My original thought is to make delta streamer an engine-agnostic part so that Spark and Flink can share some common logic. >> I am not sure the ROI is there for renaming to hudi-deltastreamer and pulling this out. Every time we change a module name Actually, here my suggestion is to move

Re: [DISCUSS] Restructure hudi-utilities module

2020-03-04 Thread Vinoth Chandar
I am not sure the ROI is there for renaming to hudi-deltastreamer and pulling this out. Every time we change a module name, it's a breaking change, and I would prefer that we reserve those for really pressing issues, or let the natural course of development get us there. Regarding how multi framework

Re: [DISCUSS] Restructure hudi-utilities module

2020-03-04 Thread Gary Li
+1. hudi-delta gives me the feeling that it has something to do with other frameworks... I’d vote for another name: hudi-deltastreamer, hudi-streamer, or hudi-stream. On Wed, Mar 4, 2020 at 2:29 AM vino yang wrote: > Hi folks, > > Currently, it seems the content of hudi-utilities looks a bit

[DISCUSS] Restructure hudi-utilities module

2020-03-04 Thread vino yang
Hi folks, Currently, it seems the content of hudi-utilities looks a bit mixed. To summarize, there are two aspects, listed below: - delta streamer and its relevant packages, e.g. deltastreamer, sources, schema, transform; these packages serve delta streamer. - Some utility