Hi Vinoth,

+1, sounds good to me. Working in parallel can increase efficiency.
Best,
Vino

taher koitawala <taher...@gmail.com> wrote on Fri, Aug 2, 2019 at 1:00 AM:

> Agreed. Scope should be 1 and 2. We will take on 1 while you take on 2.
>
> On Thu, Aug 1, 2019, 9:54 PM Vinay Patil <vinay18.pa...@gmail.com> wrote:
>
> > Hi Vinoth,
> >
> > Thank you for proposing this plan. Let's keep the scope to 1 & 2; as part
> > of v1, let's start with point 1, and you guys can tackle point 2 in
> > parallel.
> >
> > Excited to be a part of this development.
> >
> > Regards,
> > Vinay Patil
> >
> > On Thu, 1 Aug 2019, 21:49 Vinoth Chandar, <vin...@apache.org> wrote:
> >
> > > Here are my thoughts..
> > >
> > > Last time, when Flink was brought up, we dug into the use case and
> > > realized that having Flink/Beam support for windowing on
> > > physical/arrival time (hoodie_commit_time) would be valuable, and
> > > that's why Flink was being proposed.
> > >
> > > I would like to separate two aspects that I feel are intermingled here.
> > >
> > > 1) Writing datasets using Flink: today, the hoodie-spark-datasource and
> > > the deltastreamer tool both use Spark to write Hudi datasets. It would
> > > be nice if we could do this as part of a Flink job as well.
> > > 2) Querying Hudi datasets using Flink: we can build great
> > > streaming-style pipelines on top of Hudi, since it provides the
> > > _hoodie_commit_time arrival-time watermarks. Nick & I are trying to
> > > flesh this out more with motivating use cases and make the case for
> > > doing this.
> > >
> > > Now, a question for the folks driving HUDI-184: is the scope 1, 2, or
> > > 1 & 2? My suggestion would be to tackle 1 in HUDI-184; Nick and I can
> > > tackle 2 in parallel.
> > >
> > > This is exciting work :). Hope we can get past the current release and
> > > jar fixes and get to this.. ha ha.
> > >
> > > /thanks/vinoth
> > >
> > > On Wed, Jul 31, 2019 at 6:01 AM Semantic Beeng <n...@semanticbeeng.com>
> > > wrote:
> > >
> > > > All,
> > > >
> > > > @vc and I have been mulling this over for a while and are working on
> > > > some material to start this.
> > > >
> > > > But:
> > > >
> > > > 1. We want to start with requirements, right?
> > > >
> > > > Last time we discussed this, we asked for use cases, needs, etc.
> > > >
> > > > There are some here:
> > > > https://cwiki.apache.org/confluence/display/HUDI/Hudi+for+Continuous+Deep+Analytics
> > > >
> > > > Taher - any news on that example application about trade
> > > > reconciliation, please?
> > > >
> > > > 2. I will push for us to also drive this with proper architecture
> > > > decisions, to map the choices in a principled way.
> > > >
> > > > This will also help users make sense of the fit with their
> > > > architectures. See https://adr.github.io
> > > >
> > > > As an architect, I consider technology-to-technology integrations a
> > > > bad idea.
> > > >
> > > > This reminds me of the M-to-N (point-to-point) integrations in
> > > > enterprise systems.
> > > >
> > > > Examples:
> > > >
> > > > 1. https://github.com/alibaba/flink-ai-extended/tree/master/flink-ml-tensorflow
> > > > 2. https://github.com/yahoo/TensorFlowOnSpark
> > > >
> > > > And now imagine Hudi hard-linked to Flink.
> > > >
> > > > Someone trying to use both Spark and TF for ML, and Flink for data
> > > > sliding, would be in a tough spot trying to reconcile them.
> > > >
> > > > And surely there would be quite a few library version conflicts, too.
> > > >
> > > > Instead, we need to seek some abstractions in between them to
> > > > decouple.
> > > >
> > > > Hence, the more use cases and design examples you provide, the
> > > > better. :-)
> > > >
> > > > @vc - thoughts?
> > > >
> > > > Kind regards,
> > > >
> > > > Nick
> > > >
> > > > On July 31, 2019 at 8:06 AM Vinoth Chandar <vin...@apache.org> wrote:
> > > >
> > > > >> First of all, we should agree on the plan.
> > > > +100. This will be a very involved process.. if we can get a plan
> > > > agreed upon, then we can start scoping the subtasks..
> > > >
> > > > On Wed, Jul 31, 2019 at 2:11 AM Vinay Patil <vinay18.pa...@gmail.com>
> > > > wrote:
> > > >
> > > > Hi Guys,
> > > >
> > > > Add me to this as well; I missed out on it last time.
> > > >
> > > > Regards,
> > > > Vinay Patil