More for my own edification, how does the recently introduced timeline
service play into the delta writer components?

On Fri, Aug 2, 2019 at 7:53 AM vino yang <yanghua1...@gmail.com> wrote:

> Hi Suneel,
>
> Thank you for your suggestion; let me clarify.
>
>
> *The context of this email is that we are evaluating how to implement a
> Stream Delta writer based on Flink.*
> The discussion among Taher, Vinay, and me covered only some trivial
> details while preparing the document, and that discussion also took
> place over email.
>
> Before we have a first draft, discussing the details on the mailing
> list may confuse others and easily drift off topic. Our initial plan
> was to open community discussion and review once a draft of the
> document was available to the community.
>
> Best,
> Vino
>
> Suneel Marthi <smar...@apache.org> wrote on Fri, Aug 2, 2019 at 10:37 PM:
>
> > Please keep all discussions on the mailing lists here - no offline
> > discussions, please.
> >
> > On Fri, Aug 2, 2019 at 10:22 AM vino yang <yanghua1...@gmail.com> wrote:
> >
> > > Hi guys,
> > >
> > > Currently, Taher, Vinay, and I are working on issue HUDI-184. [1]
> > >
> > > As a first step, we are discussing the design doc.
> > >
> > > After diving into the code, we listed some classes relevant to the
> > > Spark delta writer.
> > >
> > >    - module: hoodie-utilities
> > >
> > > com.uber.hoodie.utilities.deltastreamer.HoodieDeltaStreamer
> > > com.uber.hoodie.utilities.deltastreamer.DeltaSyncService
> > > com.uber.hoodie.utilities.deltastreamer.SourceFormatAdapter
> > > com.uber.hoodie.utilities.schema.SchemaProvider
> > > com.uber.hoodie.utilities.transform.Transformer
> > >
> > >    - module: hoodie-client
> > >
> > > com.uber.hoodie.HoodieWriteClient (to commit compaction)
> > >
> > >
> > > In fact, *hoodie-utilities* depends on *hoodie-client*; however,
> > > *hoodie-client* is not a pure Hudi component either, since it also
> > > depends on the Spark libraries.
> > >
> > > So I propose that Hudi provide a pure hoodie-client decoupled from
> > > Spark, on which both the Flink and Spark modules would then depend.
> > >
> > > Moreover, based on an earlier discussion [2], we all agree that Spark
> > > is not the only possible engine for Hudi; it could also be Flink or
> > > Beam.
> > >
> > > IMO, we should decouple Hudi from Spark at the project level,
> > > including but not limited to module splitting and renaming.
> > >
> > > I am not sure whether this requires a HIP to drive it.
> > >
> > > We should first listen to the opinions of the community. Any ideas and
> > > suggestions are welcome and appreciated.
> > >
> > > Best,
> > > Vino
> > >
> > > [1]: https://issues.apache.org/jira/browse/HUDI-184?filter=-1
> > > [2]:
> > > https://lists.apache.org/api/source.lua/1533de2d4cd4243fa9e8f8bf057ffd02f2ac0bec7c7539d8f72166ea@%3Cdev.hudi.apache.org%3E
> > >
> >
>
