Re: Schema evolution in hudi

Balaji Varadarajan Tue, 08 Dec 2020 00:58:04 -0800

 Hi Rahul, 
Dropping a column is not backwards compatible. As a workaround for now, you 
can keep the column in the data frame as an empty dummy column. I have opened 
a JIRA (https://issues.apache.org/jira/browse/HUDI-1440) to provide an option 
to override the schema.
Balaji.V
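
A minimal sketch of the workaround above, in plain Python rather than Spark so 
that it runs on its own. The helper name pad_missing_columns, the column names, 
and the choice of None as the "empty" value are all assumptions made for 
illustration; they are not part of Hudi. The idea is only that every column the 
table already has stays present in the incoming data, even when there is 
nothing to put in it.

```python
def pad_missing_columns(rows, table_columns):
    """Return copies of rows in which every existing table column is present.

    Hypothetical helper: columns missing from a row are filled with None
    (an assumed stand-in for the "empty" dummy value).
    """
    padded = []
    for row in rows:
        # Existing table columns first, in table order; None where missing.
        new_row = {col: row.get(col) for col in table_columns}
        # Genuinely new columns are kept and appended at the end.
        for col, value in row.items():
            if col not in new_row:
                new_row[col] = value
        padded.append(new_row)
    return padded


# Hypothetical table that already has three columns.
table_columns = ["id", "name", "city"]

# Incoming batch where "city" has been removed.
incoming = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

padded = pad_missing_columns(incoming, table_columns)
for row in padded:
    print(row)
```

With a Spark data frame the same effect is usually achieved by adding the 
column back with withColumn and a null literal cast to the column's original 
type before writing.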


    On Monday, December 7, 2020, 08:04:20 PM PST, Rahul Narayanan 
<[email protected]> wrote:  
 
 Hi Balaji,
Adding a new column works, but when I try to remove a column by writing a 
new data frame with that column removed, it does not work.


On Mon, Dec 7, 2020 at 2:51 PM Balaji Varadarajan <[email protected]> 
wrote:

 Hi Rahul,
With a Spark data frame, the schema is inferred automatically. If you write a 
data frame whose schema is backwards compatible (e.g. with a new column 
added at the end), it should work seamlessly. 
Are you seeing any problems with this approach?
Thanks,
Balaji.V
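
To make the compatibility rule above concrete, here is a small sketch in plain 
Python. It is a deliberately simplified check and not the one Hudi or Avro 
actually performs: it assumes a schema is just a list of (name, type) pairs and 
treats a change as compatible only if every existing column is still present 
with the same type. The function name compatibility_problems and the example 
columns are hypothetical.

```python
def compatibility_problems(old_schema, new_schema):
    """List reasons why new_schema is not backward compatible with old_schema.

    Simplified, assumed rule: every old column must still exist with the
    same type. Newly added columns are always allowed.
    """
    new_types = dict(new_schema)
    problems = []
    for name, old_type in old_schema:
        if name not in new_types:
            problems.append(f"column '{name}' was dropped")
        elif new_types[name] != old_type:
            problems.append(
                f"column '{name}' changed type from {old_type} "
                f"to {new_types[name]}"
            )
    return problems


# Hypothetical existing table schema.
old = [("id", "long"), ("name", "string")]

# Case 1: a new column added at the end.
added = old + [("city", "string")]

# Case 2: an existing column removed.
dropped = [("id", "long")]

print(compatibility_problems(old, added))
print(compatibility_problems(old, dropped))
```

An empty list means the new schema passes this simplified check. Real schema 
resolution has more rules than this, for example around type promotion and 
default values, so treat the sketch as an illustration of the idea only.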
    On Wednesday, December 2, 2020, 10:16:52 PM PST, Rahul Narayanan 
<[email protected]> wrote:  
 
 Hi Team,
We would like to add new columns, and possibly remove some columns, in our 
dataset in the future. I have read that Hudi supports schema evolution as long 
as it is backward compatible. As a POC, I tried writing a Spark data frame to 
Hudi with an explicit schema, but it is failing. How can I write a Spark data 
frame to Hudi while specifying the schema explicitly?
Thanks in advance
