Re: [DISCUSS] Decouple Hudi and Spark (HudiLink / approach)

2019-08-05 Thread Vinoth Chandar
Great discussions! Responded on the original thread on decoupling. Let's continue there? On Mon, Aug 5, 2019 at 1:39 AM Semantic Beeng wrote: > "design is more important. When we have a clear idea, it is not too late > to create an issue" > > 100% with Vino > > > On August 5, 2019 at 2:50 AM

Re: [DISCUSS] Decouple Hudi and Spark (HudiLink / approach)

2019-08-05 Thread vino yang
Hi Taher, IMO, let's listen to more comments; after all, this discussion took place over the weekend. Then we can hear Vinoth's and the community's comments and suggestions. I personally think that design is more important. When we have a clear idea, it is not too late to create an issue. I am

Re: [DISCUSS] Decouple Hudi and Spark (HudiLink / approach)

2019-08-05 Thread taher koitawala
If everyone agrees that we should decouple Hudi and Spark to enable processing engine abstraction, should I open a Jira ticket for that? On Sun, Aug 4, 2019 at 6:59 PM taher koitawala wrote: > If anyone wants to see a Flink Streaming pipeline here is a really small > and basic Flink pipeline. >

Re: [DISCUSS] Decouple Hudi and Spark (HudiLink / approach)

2019-08-04 Thread taher koitawala
If anyone wants to see a Flink Streaming pipeline, here is a really small and basic one: https://github.com/taherk77/FlinkHudi/tree/master/FlinkHudiExample/src/main/java/com/flink/hudi/example Consider users playing a game across multiple platforms and we only get the timestamp,

Re: [DISCUSS] Decouple Hudi and Spark (HudiLink / approach)

2019-08-04 Thread vino yang
Hi Nick, Thank you for your more detailed thoughts. I fully agree with your thoughts about HudiLink, which should also be part of the long-term planning of the Hudi ecosystem. *But I found that the angle of our thinking and the starting point are not consistent. I pay more attention to the