Hey Shawn, Rahil,
Thanks for raising this issue. These are good suggestions; I would
recommend simplifying the code structure of Hudi Spark incrementally and
gradually making the code less coupled with the Spark engine.
> Identify breaking changes introduced by the new Spark version and patch
This is a good topic, thanks for raising it. Overall, our reliance on
Spark classes/APIs that are declared experimental is an issue on paper. But
there are few other ways to get the right performance without relying on
them. This has been the tricky issue, IMO. Thoughts?
I'll review the code.
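One common way to contain the experimental-API risk discussed above is to hide those calls behind a version-agnostic adapter, so that only one small module has to change per Spark upgrade. The sketch below is purely illustrative; the names (`SparkAdapter`, `Spark3Adapter`) and the example method are hypothetical, not Hudi's actual API, and no real Spark call is made so the snippet stays self-contained:

```scala
// Hypothetical sketch: isolate experimental Spark APIs behind a
// version-agnostic adapter trait. Only the concrete adapter for a
// given Spark version would touch experimental classes.

// The engine-facing contract the rest of the codebase depends on.
trait SparkAdapter {
  def sparkVersion: String
  // Stands in for a call that, in a real build, would use an
  // experimental/internal Spark API (e.g. a Catalyst internal).
  def parsePartitionValue(raw: String): Int
}

// One concrete adapter per supported Spark major/minor line.
class Spark3Adapter extends SparkAdapter {
  override def sparkVersion: String = "3.x"
  override def parsePartitionValue(raw: String): Int = raw.trim.toInt
}

object SparkAdapterSupport {
  // Resolve the adapter once; a real implementation might load a
  // version-specific class reflectively based on the runtime
  // Spark version.
  lazy val adapter: SparkAdapter = new Spark3Adapter

  def main(args: Array[String]): Unit = {
    println(adapter.parsePartitionValue(" 42 "))
  }
}
```

With this shape, an upgrade to a new Spark version that breaks an experimental API would mean adding one new adapter class rather than patching call sites throughout the codebase.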
Thanks, Shawn, for writing this. I would also like to add to the Spark
discussion.
Currently, I think our integration with Spark is too tight, which brings up
serious issues when upgrading.
I will describe one example (though there are many more): one area is that
we extend Spark's
Hi Hudi developers,
I am writing to discuss the current code structure of
hudi-spark-datasource and to propose a more scalable approach for supporting
multiple Spark versions. The current structure involves common code shared
across several Spark versions, such as hudi-spark-common,