Hi Rahul,

On the specific scenario: if you could raise a GH support issue with the steps/stacktrace, we can certainly help out.
On the first part, we have relied on Avro schema evolution/compatibility thus far, where you null out the old columns (which is very cheap for Parquet storage anyway). For tools like DeltaStreamer, this is enforced by external schema registries. However, you are right that the Spark DataFrame path may need some more work. Happy to work through this with you on a ticket as well.

Thanks,
Vinoth

On Mon, Dec 7, 2020 at 12:50 PM Rahul Narayanan <[email protected]> wrote:
> ---------- Forwarded message ---------
> From: Rahul Narayanan <[email protected]>
> Date: Thu, Dec 3, 2020 at 11:46 AM
> Subject: Schema evolution in hudi
> To: [email protected] <[email protected]>
>
> Hi Team,
>
> We are interested in adding new columns, and possibly removing some
> columns, in our dataset in the future. I have read that Hudi supports
> schema evolution if it is backward compatible. To do a PoC, I tried
> writing a Spark DataFrame to Hudi using a schema, but it is failing. How
> can I write a Spark DataFrame to Hudi while specifying the schema
> explicitly?
>
> Thanks in advance
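To make the "null out the old columns" approach above concrete, here is a minimal sketch of a backward-compatible Avro schema evolution (the record name and field names are hypothetical, for illustration only). A new column is added as a nullable union with a default, so old records can still be read; a "removed" column stays in the schema as a nullable field and is simply written as null going forward, which costs almost nothing in Parquet:

```
{
  "type": "record",
  "name": "Customer",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "email", "type": ["null", "string"], "default": null},
    {"name": "phone", "type": ["null", "string"], "default": null}
  ]
}
```

Here `phone` is the newly added column; `email` is the "removed" one, now nulled out rather than dropped. Dropping a field outright (or adding one without a default) breaks backward compatibility under Avro's schema resolution rules, which is why the nullable-with-default pattern is used instead.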
