[GitHub] [hudi] vinothchandar commented on issue #8018: [SUPPORT] why is the schema evolution done while not setting hoodie.schema.on.read.enable

via GitHub Thu, 09 Mar 2023 06:18:06 -0800


vinothchandar commented on issue #8018:
URL: https://github.com/apache/hudi/issues/8018#issuecomment-1462141331


   +1 on @kazdy's notes above on ASR. Hudi has always supported some automatic 
schema evolution to deal with streaming data, similar to what the Kafka/Schema 
Registry model achieves. The reason was that users found it inconvenient to 
coordinate pausing pipelines and doing manual maintenance/backfills when, 
say, new columns were added. What we call full schema evolution/schema-on-read 
is orthogonal; it just allows more backward-incompatible evolutions to go 
through as well. 
   
   Now on 0.13, I think the reconcile flag simply allows for skipping some 
columns in the incoming write (partial-write scenarios), and Hudi reconciles 
this with the table schema while still respecting the automatic schema 
evolution. I think this is what 0.13 changes. 
https://hudi.apache.org/releases/release-0.13.0#schema-handling-in-write-path 
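
To make the reconcile behaviour described above concrete, here is a rough toy model in plain Python. It is not Hudi code and does not call any Hudi API. The function name `reconcile`, the column names, and the padding-with-`None` rule are all assumptions made for illustration, based only on the description in this comment: columns missing from the incoming batch are filled in from the table schema, and new columns in the batch are still added.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def reconcile(table_columns: List[str], batch: List[Record]) -> Tuple[List[str], List[Record]]:
    """Toy model (hypothetical, not Hudi's implementation) of reconciling
    an incoming batch with the table schema."""
    # Collect columns the batch introduces that the table does not have yet.
    # This stands in for the automatic schema evolution (new columns added).
    new_columns: List[str] = []
    for record in batch:
        for name in record:
            if name not in table_columns and name not in new_columns:
                new_columns.append(name)

    # Keep every table column, then append the newly seen ones.
    merged_columns = list(table_columns) + new_columns

    # Pad each record: columns skipped by the writer become None (null).
    padded = [{name: record.get(name) for name in merged_columns} for record in batch]
    return merged_columns, padded


# Hypothetical table schema and a partial incoming write.
table_columns = ["id", "name", "city"]
batch = [
    {"id": 1, "name": "a"},             # skips "city"
    {"id": 2, "name": "b", "age": 30},  # skips "city", adds "age"
]

merged_columns, padded = reconcile(table_columns, batch)
print(merged_columns)
print(padded)
```

In this sketch the table keeps `city` even though the batch omitted it, and `age` is appended as a new column. How the real `hoodie.datasource.write.reconcile.schema` flag treats types, nullability, and column order is not covered here; check the release notes linked above for that.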
   
   >We also don't understand how exactly 
hoodie.datasource.write.reconcile.schema and hoodie.avro.schema.validate work 
in either case. Specific examples would help here for both flags.
   
   Note to @nfarah86 and @nsivabalan to cover this in the schema docs page that 
is being worked on now. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
