Hi Rui,

Thanks for asking. The design for the Flink integration can be found here:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=141724520

Please ping me if you have any questions.

At 2020-09-28 20:43:22, "Rui Li" <[email protected]> wrote:
>Hello,
>
>Very excited to see the on-going efforts for Flink integration. I wonder
>whether there's a design doc for this feature? I would like to learn more
>and hopefully to make some contributions.
>
>On Fri, Sep 25, 2020 at 6:27 AM nishith agarwal <[email protected]> wrote:
>
>> Yes, we have some ideas around schema evolution and have discussed with
>> Balaji before as well. I'm going to put these thoughts down and share them
>> on the cWiki for all of us to jam. Realistically, I don't think we can hit
>> it in 0.7.0. We already have a pretty strong list of items for 0.7.0.
>>
>> Spark 3 SQL syntax like MERGE will definitely boost usability!
>>
>> Thanks,
>> Nishith
>>
>> On Thu, Sep 24, 2020 at 3:22 PM Vinoth Chandar <[email protected]> wrote:
>>
>> > On schema evolution, Nishith and Balaji were both thinking about this.
>> > Maybe there is a proposal in the works?
>> > I would guess we will not be able to hit it in 0.7.0 though. Maybe by the
>> > end of year/0.8.0?
>> >
>> > Tanu, thanks for the kind words! def, if we pull together, we will reach
>> > there sooner. Looking forward to more contributions! :)
>> >
>> > >We were actually thinking of moving to Spark 3.0 but thought it’s too
>> > >early with 0.6 release. Is 0.6 not fully tested with Spark 3.0 ?
>> > That's correct. There is a PR already open for this. We expect this to be
>> > fixed in 0.6.1 shortly and we will unlock Spark 3.0 support.
>> >
>> > 0.7.0 will bring Spark 3 SQL syntax like MERGE etc. (Other systems that
>> > have had this either had an unfair head start or built ahead with Spark 3
>> > in mind. :))
>> > We will close this gap down.
>> >
>> > On Wed, Sep 23, 2020 at 6:25 PM Raymond Xu <[email protected]>
>> > wrote:
>> >
>> > > +1 on the full schema evolution support. May I know which ticket this
>> > > is related to? thanks.
>> > >
>> > > On Wed, Sep 23, 2020 at 5:20 AM leesf <[email protected]> wrote:
>> > >
>> > > > Thanks Vinoth, also we would consider supporting full schema
>> > > > evolution (such as dropping some fields) in Hudi in 0.7.0, since
>> > > > right now Hudi follows Avro schema compatibility.
>> > > >
>> > > > On Wed, Sep 23, 2020 at 12:38 PM tanu dua <[email protected]> wrote:
>> > > >
>> > > > > Thanks Vinoth. These are really exciting items and hats off to you
>> > > > > and team in pushing the releases swiftly and improving the
>> > > > > framework all the time. I hope someday I will start contributing
>> > > > > once I get free from my major deliverables and have more
>> > > > > understanding of the nitty-gritty details of Hudi.
>> > > > >
>> > > > > You have mentioned Spark 3.0 support in the next release. We were
>> > > > > actually thinking of moving to Spark 3.0 but thought it’s too early
>> > > > > with 0.6 release. Is 0.6 not fully tested with Spark 3.0 ?
>> > > > >
>> > > > > On Wed, 23 Sep 2020 at 8:25 AM, Vinoth Chandar <[email protected]>
>> > > > > wrote:
>> > > > >
>> > > > > > Hello all,
>> > > > > >
>> > > > > > Pursuant to our conversation around release planning, I am happy
>> > > > > > to share the initial set of proposals for the next minor/major
>> > > > > > releases (minor release ofc can go out based on time)
>> > > > > >
>> > > > > > *Next Minor version 0.6.1 (with stuff that did not make it to
>> > > > > > 0.6.0..)*
>> > > > > > Flink/Writer common refactoring for Flink
>> > > > > > Small file handling support w/o caching
>> > > > > > Spark3 Support
>> > > > > > Remaining bootstrap items
>> > > > > > Completing bulk_insertV2 (sort mode, de-dup etc)
>> > > > > > Full list here :
>> > > > > > https://issues.apache.org/jira/projects/HUDI/versions/12348168
>> > > > > >
>> > > > > > *0.7.0 with major new features*
>> > > > > > RFC-15: metadata, range index (w/ spark support), bloom index
>> > > > > > (eliminate file listing, query pruning, improve bloom index perf)
>> > > > > > RFC-08: Record Index (to solve global index scalability/perf)
>> > > > > > RFC-18/19: Clustering/Insert overwrite
>> > > > > > Spark 3 based datasource rewrite (structured streaming
>> > > > > > sink/source, DELETE/MERGE)
>> > > > > > Incremental Query on logs (Hive, Spark)
>> > > > > > Parallel writing support
>> > > > > > Redesign of marker files for S3
>> > > > > > Stretch: ORC, PrestoSQL Support
>> > > > > >
>> > > > > > Full list here :
>> > > > > > https://issues.apache.org/jira/projects/HUDI/versions/12348721
>> > > > > >
>> > > > > > Please chime in with your thoughts. If you would like to commit
>> > > > > > to contributing a feature towards a release, please do so by
>> > > > > > marking *`Fix Version/s`* field with that release number.
>> > > > > >
>> > > > > > Thanks
>> > > > > > Vinoth
>
>--
>Cheers,
>Rui Li
