Hi Syed,

Typically, I have seen the Confluent/Avro schema registry used as the source of truth, with the Hive schema being just a translation of it. That's how the hudi-hive sync works as well. Have you considered making fields optional in the Avro schema, so that even if the source data is missing some of them, they come through as nulls? In both places where I have dealt with this, it was made to work using the schema evolution rules Avro supports, and by enforcing things like not deleting fields, not changing field order, etc.
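To make the "optional fields" idea concrete: in Avro, an optional field is declared as a union with "null" and given a null default, so a reader using the newer schema fills in null for records written before the field existed. Below is a minimal pure-Python sketch of that fill-with-default resolution behavior (this is not an actual Avro library; the record and field names are hypothetical):

```python
import json

# A reader schema where the newer field is optional: a union with
# "null" plus a null default. "Customer" and "email" are made-up names.
reader_schema = json.loads("""
{
  "type": "record",
  "name": "Customer",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "email", "type": ["null", "string"], "default": null}
  ]
}
""")

def decode_with_defaults(record, schema):
    """Mimic Avro schema resolution: fields absent from the incoming
    record take the default declared in the reader schema."""
    out = {}
    for field in schema["fields"]:
        if field["name"] in record:
            out[field["name"]] = record[field["name"]]
        elif "default" in field:
            out[field["name"]] = field["default"]
        else:
            raise ValueError("missing field with no default: " + field["name"])
    return out

# An older writer did not know about "email"; the reader fills in null.
old_record = {"id": "42"}
print(decode_with_defaults(old_record, reader_schema))
# → {'id': '42', 'email': None}
```

The same null-default trick is what makes "not deleting fields, only adding optional ones" a safe evolution policy.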
Hope that at least helps a bit.

Thanks,
Vinoth

On Sun, Dec 29, 2019 at 11:55 PM Syed Abdul Kather <[email protected]> wrote:
> Hi Team,
>
> We pull data from Kafka generated by Debezium. The schema is maintained
> in the schema registry by the Confluent framework as the data is
> populated.
>
> *Problem Statement:*
>
> All additions/deletions of columns are tracked in the schema registry.
> When running the Hudi pipeline, we use a custom schema provider that
> pulls the latest schema from the schema registry as well as from the
> Hive metastore, and we create an uber schema (so that columns missing
> from the schema registry are pulled from the Hive metastore). Is there
> a better approach to solving this problem?
>
> Thanks and Regards,
> S SYED ABDUL KATHER
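For reference, the "uber schema" merge described in the quoted question can be sketched roughly as below. This is a hypothetical illustration, not the poster's actual code: field lists stand in for full Avro schemas, registry fields take precedence, and metastore-only columns are appended as nullable so newer records still decode cleanly:

```python
# Hypothetical sketch: merge the latest schema-registry field list with
# the Hive metastore field list into one "uber" schema.
def merge_uber_schema(registry_fields, metastore_fields):
    merged = list(registry_fields)          # registry is the source of truth
    known = {f["name"] for f in registry_fields}
    for field in metastore_fields:
        if field["name"] not in known:
            # Columns only the metastore knows about become optional
            # (union with null, null default) so records written with
            # the newer registry schema still resolve.
            merged.append({"name": field["name"],
                           "type": ["null", field["type"]],
                           "default": None})
    return merged

registry = [{"name": "id", "type": "string"}]
metastore = [{"name": "id", "type": "string"},
             {"name": "legacy_col", "type": "int"}]
print(merge_uber_schema(registry, metastore))
```

As the reply above suggests, if the registry schema only ever evolves by adding optional fields, this merge step becomes unnecessary, since the registry schema alone can decode every record.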
