Hi Kabeer,

I plan to do an incremental query PoC. My use case includes:

1. Load one big Hive table located on HDFS into Hudi as a history table (I 
think I should use HiveSyncTool?)
2. Sink streaming data from Kafka into Hudi as a real-time table (using 
HoodieDeltaStreamer?)
3. Join the two tables to get the incremental metrics (Spark SQL? A rough 
sketch of all three steps follows below.)
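
To make this concrete, here is a rough spark-shell-style Scala sketch of 
steps 1 and 3 as I currently imagine them. The paths, table names, and 
key/partition columns are placeholders, and the option names are my reading 
of the Hudi docs, so please correct anything that is wrong:

import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder()
  .appName("hudi-incremental-poc")
  // Hudi requires Kryo serialization in Spark
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .enableHiveSupport()
  .getOrCreate()

// Step 1: one-off bulk load of the existing Hive table into a Hudi history table.
val historyPath = "hdfs:///data/hudi/history_table" // placeholder path
spark.table("warehouse.big_hive_table")             // placeholder source table
  .write.format("org.apache.hudi")
  .option("hoodie.table.name", "history_table")
  .option("hoodie.datasource.write.operation", "bulk_insert")
  .option("hoodie.datasource.write.recordkey.field", "id")     // placeholder key column
  .option("hoodie.datasource.write.precombine.field", "ts")    // placeholder ordering column
  .option("hoodie.datasource.write.partitionpath.field", "dt") // placeholder partition column
  .mode(SaveMode.Overwrite)
  .save(historyPath)

// Step 2 would run separately: HoodieDeltaStreamer submitted via spark-submit,
// reading from Kafka and writing a second Hudi table at, say, hdfs:///data/hudi/rt_table.

// Step 3: incremental pull of new commits from the real-time table, joined to history.
val rtIncrement = spark.read.format("org.apache.hudi")
  .option("hoodie.datasource.query.type", "incremental")
  .option("hoodie.datasource.read.begin.instanttime", "20191002000000") // last commit I processed
  .load("hdfs:///data/hudi/rt_table")
rtIncrement.createOrReplaceTempView("rt_increment")

spark.read.format("org.apache.hudi").load(historyPath).createOrReplaceTempView("history")
val metrics = spark.sql(
  """SELECT h.id, h.ts, r.ts AS new_ts
    |FROM history h JOIN rt_increment r ON h.id = r.id""".stripMargin)
metrics.show()

Does that look like the right shape for an incremental pipeline?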

My questions:

1. For deployment, do I just copy the Hudi packages to the client servers?
2. Does Hudi require access to the Hive Metastore? My company restricts 
access to the Hive Metastore. Can Hudi use Hive JDBC to get the metadata 
instead? (See the config sketch after this list.)
3. What is HoodieTableMeta used for, and where is it saved?
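
On question 2: from the configuration docs it looks like Hive sync might go 
through a HiveServer2 JDBC URL instead of talking to the metastore directly. 
If so, I would expect to enable it with options roughly like these (again, 
the option names are my reading of the docs, and the host and database values 
are placeholders; I have not verified this):

// Hypothetical extra options on the history-table write from the sketch
// above, to sync the resulting Hudi table into Hive over JDBC.
spark.table("warehouse.big_hive_table") // same placeholder source as above
  .write.format("org.apache.hudi")
  .option("hoodie.table.name", "history_table")
  // plus the same write options (recordkey, precombine, partitionpath) as above
  .option("hoodie.datasource.hive_sync.enable", "true")
  .option("hoodie.datasource.hive_sync.use_jdbc", "true")
  .option("hoodie.datasource.hive_sync.jdbc_url",
    "jdbc:hive2://hiveserver2-host:10000")                        // placeholder host
  .option("hoodie.datasource.hive_sync.database", "warehouse")    // placeholder database
  .option("hoodie.datasource.hive_sync.table", "history_table")
  .option("hoodie.datasource.hive_sync.partition_fields", "dt")   // placeholder partition column
  .mode(SaveMode.Append)
  .save(historyPath)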


Best,
Qian
On Oct 2, 2019, 2:59 PM -0700, Kabeer Ahmed <[email protected]>, wrote:
> Qian
>
> Welcome!
> Are you able to tell us a bit more about your use case? Eg: type of the 
> project, industry, complexity of the pipeline that you plan to write (eg: 
> pulling data from external APIs like New York taxi dataset and writing them 
> into Hive for analysis) etc.
> This will give us a bit more context.
> Thanks
> Kabeer.
>
> On Oct 2 2019, at 10:55 pm, Vinoth Chandar <[email protected]> wrote:
> > edit:
> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=113709185#Frequentlyaskedquestions(FAQ)-HowisaHudijobdeployed?
> > with the ? at the end
> >
> > On Wed, Oct 2, 2019 at 2:54 PM Vinoth Chandar <[email protected]> wrote:
> > > Hi Qian,
> > > Welcome! Does
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=113709185#Frequentlyaskedquestions(FAQ)-HowisaHudijobdeployed?
> > > help?
> > >
> > >
> > > On Wed, Oct 2, 2019 at 10:18 AM Qian Wang <[email protected]> wrote:
> > > > Hi,
> > > > I am new to Apache Hudi. Currently I am working on a PoC using Hudi;
> > > > can anyone give me some documents on how to deploy Apache Hudi?
> > > > Thanks.
> > > >
> > > > Best,
> > > > Eric
> > >
> >
> >
>
