Hi Syed,

Typically, I have seen the Confluent/Avro schema registry used as the source of truth, with the Hive schema being just a translation of it. That's how the hudi-hive sync works as well. Have you considered making fields optional in the Avro schema, so that even if the source data is missing some of them, they come through as nulls? In both places where I have dealt with this, it was made to work using the schema evolution rules Avro supports, and by enforcing things like not deleting fields, not changing field order, etc.
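To make the "optional fields" idea concrete: in Avro, an optional field is declared as a union with "null" and given a null default, so a reader using the newer schema fills in null for records written before the field existed. Below is a minimal pure-Python sketch of that fill-with-default resolution behavior (this is not an actual Avro library; the record and field names are hypothetical):

```python
import json

# A reader schema where the newer field is optional: a union with
# "null" plus a null default. "Customer" and "email" are made-up names.
reader_schema = json.loads("""
{
  "type": "record",
  "name": "Customer",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "email", "type": ["null", "string"], "default": null}
  ]
}
""")

def decode_with_defaults(record, schema):
    """Mimic Avro schema resolution: fields absent from the incoming
    record take the default declared in the reader schema."""
    out = {}
    for field in schema["fields"]:
        if field["name"] in record:
            out[field["name"]] = record[field["name"]]
        elif "default" in field:
            out[field["name"]] = field["default"]
        else:
            raise ValueError("missing field with no default: " + field["name"])
    return out

# An older writer did not know about "email"; the reader fills in null.
old_record = {"id": "42"}
print(decode_with_defaults(old_record, reader_schema))
# → {'id': '42', 'email': None}
```

The same null-default trick is what makes "not deleting fields, only adding optional ones" a safe evolution policy.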
Hope that at least helps a bit.

Thanks,
Vinoth

On Sun, Dec 29, 2019 at 11:55 PM Syed Abdul Kather <[email protected]> wrote:
> Hi Team,
>
> We pull data from Kafka generated by Debezium. The schema is maintained
> in the schema registry by the Confluent framework as the data is
> populated.
>
> *Problem Statement:*
>
> All additions/deletions of columns are tracked in the schema registry.
> When running the Hudi pipeline, we use a custom schema provider that
> pulls the latest schema from the schema registry as well as from the
> Hive metastore, and we create an uber schema (so that columns missing
> from the schema registry are pulled from the Hive metastore). Is there
> a better approach to solving this problem?
>
> Thanks and Regards,
> S SYED ABDUL KATHER
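For reference, the "uber schema" merge described in the quoted question can be sketched roughly as below. This is a hypothetical illustration, not the poster's actual code: field lists stand in for full Avro schemas, registry fields take precedence, and metastore-only columns are appended as nullable so newer records still decode cleanly:

```python
# Hypothetical sketch: merge the latest schema-registry field list with
# the Hive metastore field list into one "uber" schema.
def merge_uber_schema(registry_fields, metastore_fields):
    merged = list(registry_fields)          # registry is the source of truth
    known = {f["name"] for f in registry_fields}
    for field in metastore_fields:
        if field["name"] not in known:
            # Columns only the metastore knows about become optional
            # (union with null, null default) so records written with
            # the newer registry schema still resolve.
            merged.append({"name": field["name"],
                           "type": ["null", field["type"]],
                           "default": None})
    return merged

registry = [{"name": "id", "type": "string"}]
metastore = [{"name": "id", "type": "string"},
             {"name": "legacy_col", "type": "int"}]
print(merge_uber_schema(registry, metastore))
```

As the reply above suggests, if the registry schema only ever evolves by adding optional fields, this merge step becomes unnecessary, since the registry schema alone can decode every record.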
