>>multiple places like Athena, Redshift Spectrum, on prem Presto, Hive etc
I was under the impression that most of them can already read the schema
from the Hive metastore? That's at least true of Presto. The Glue catalog in
AWS also adheres to the Hive metastore protocol/APIs, so Hudi (IIRC) can
register to the Glue catalog using the same mechanism.

I am supportive if we can establish that we need to support additional
metastores explicitly.
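
For illustration only, the "update just the affected partitions" idea raised
further down this thread could be sketched roughly as below. This is a
minimal sketch, not Hudi's own sync tooling: the function names, the table
name, the partition field and the S3 path are all hypothetical, and the code
only builds Hive-style DDL strings without talking to any metastore.

```python
from typing import Iterable, List


def partitions_to_add(written: Iterable[str], registered: Iterable[str]) -> List[str]:
    """Return partitions touched by the latest write that the catalog does not know yet."""
    known = set(registered)
    return sorted(p for p in set(written) if p not in known)


def add_partition_ddl(table: str, base_path: str, partition_field: str,
                      partitions: Iterable[str]) -> List[str]:
    """Build one ALTER TABLE ... ADD PARTITION statement per missing partition."""
    root = base_path.rstrip("/")
    return [
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({partition_field}='{p}') LOCATION '{root}/{p}'"
        for p in partitions
    ]


if __name__ == "__main__":
    # Hypothetical inputs: partitions written by the last commit, and
    # partitions already registered in the second catalog.
    written = ["2020-01-05", "2020-01-06"]
    registered = ["2020-01-04", "2020-01-05"]
    missing = partitions_to_add(written, registered)
    for stmt in add_partition_ddl("orders", "s3://example-bucket/orders", "dt", missing):
        print(stmt)
```

The statements would still have to be submitted to each catalog by whatever
client that catalog expects; that part is deliberately left out here.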

On Mon, Jan 6, 2020 at 5:13 AM Syed Abdul Kather <[email protected]> wrote:

> Hi Vinoth,
>
> As discussed by Puru, please advise on supporting multiple
> metastores, or whether there is a better way.
>             Thanks and Regards,
>         S SYED ABDUL KATHER
>
>
>
> On Mon, Jan 6, 2020 at 3:47 PM Purushotham Pushpavanthar <
> [email protected]> wrote:
>
> > Hi Vinoth,
> >
> > Since the *schema registry* is the source of truth and the Hive metastore
> > is a translation of it, having an option to update multiple metastores in
> > Hudi would help in this case.
> > Similar to what Syed mentioned, the same Hudi dataset can be exposed in
> > multiple places like Athena, Redshift Spectrum, on prem Presto, Hive etc,
> > where the datasets' metadata is not shared with each other.
> >
> > Regards,
> > Purushotham Pushpavanth
> >
> >
> >
> > On Wed, 1 Jan 2020 at 00:46, Vinoth Chandar <[email protected]> wrote:
> >
> > > Can one of the aws folks please chime in here? IIRC I saw some tweets
> > > mentioning Hudi/Athena support is in the works.
> > > Not sure myself.
> > >
> > > On Sun, Dec 29, 2019 at 11:33 PM Syed Abdul Kather <[email protected]> wrote:
> > >
> > > > Hi Team,
> > > >
> > > > We have built a "CDC pipeline with Apache Hudi and Debezium". It
> > > > works very well in our production.
> > > >
> > > > But we have an in-house Ambari cluster with a Hive metastore for all
> > > > ETL purposes, and Athena for all analytics purposes. To make the Hudi
> > > > table work on Athena, we preserve only the latest version and create
> > > > the table in Parquet format.
> > > >
> > > > Right now the Hive metastore gets updated by Hudi itself. But to keep
> > > > the Athena metastore in sync, we have written a separate script. That
> > > > does not look like the right approach, as only the affected
> > > > partitions need to be updated on the Athena side.
> > > >
> > > > Please suggest the right approach here.
> > > >
> > > >             Thanks and Regards,
> > > >         S SYED ABDUL KATHER
> > > >
> > >
> >
>
