Re: How to deploy Hudi

Jaimin Shah Tue, 22 Oct 2019 00:26:53 -0700

Hi Vinoth,
   As you said, Hudi can use Hive JDBC to talk to the metastore. Was this
functionality added after version 0.4.5? I ask because I am getting this
error:

Exception in thread "main"
com.uber.hoodie.com.beust.jcommander.ParameterException: Was passed main
parameter '--use-jdbc' but no main parameter was defined in your arg class


Thanks,
Jaimin
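
One way to check whether a particular bundle jar knows about a flag such as
'--use-jdbc' is to search its class files for the literal option name.
JCommander option names are declared in annotations that are kept in the
compiled class, so they should show up as plain strings inside the .class
entries. The sketch below is a rough check under that assumption, not an
official Hudi tool; the function name and the jar path in the usage line are
hypothetical.

```python
import zipfile


def classes_mentioning(jar_path, flag):
    """Return the .class entries in a jar whose bytes contain the flag text."""
    needle = flag.encode("utf-8")
    hits = []
    # A jar is a zip archive; entries are decompressed by jar.read().
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if name.endswith(".class") and needle in jar.read(name):
                hits.append(name)
    return hits


# Hypothetical usage; replace the path with the bundle jar actually deployed:
# print(classes_mentioning("hoodie-hive-bundle.jar", "--use-jdbc"))
```

An empty result would suggest that the jar's argument class does not declare
the option, which would be consistent with the ParameterException above.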

On Thu, 3 Oct 2019 at 10:56, Vinoth Chandar <[email protected]> wrote:

> Hi Qian,
>
> You are right on the choice of tools for 2 and 3. But for 1, if you want to
> do a 1-time bulk load, you can look into the options in the migration guide
> http://hudi.apache.org/migration_guide.html (HiveSyncTool is orthogonal to
> this; it simply registers a Hudi dataset with the Hive metastore).
>
> On your questions
> 1. You need the appropriate hudi bundle jar to write data
> http://hudi.apache.org/writing_data.html . For reading also, there are
> similar instructions depending on query engine and yes, you would copy a
> bundle jar and install it.
> 2. You can choose to use Hudi without HiveMetastore and it will give you
> access to ReadOptimized and Incremental Views (not the realtime view, which
> needs Hive at the moment). Hudi can use Hive JDBC to talk to the metastore,
> if that's what you are asking.
> 3. Hudi saves metadata in a special .hoodie folder on your DFS itself. It is
> used for building features like incremental pull.
>
> Hope that helps
>
>
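
Point 3 above can be made concrete with a small sketch. It assumes the
table's .hoodie folder has been copied or mounted to a local path, and that
completed commits appear there as files named <instant time>.commit with
fixed-width timestamps. Both are assumptions for illustration, and the
function name is hypothetical. Under those assumptions, incremental pull
starts from the question of which commits landed after a given instant.

```python
import os


def commits_after(hoodie_dir, begin_instant):
    """List completed commit instants in hoodie_dir newer than begin_instant."""
    instants = []
    for name in os.listdir(hoodie_dir):
        stem, ext = os.path.splitext(name)
        # Assumed layout: "<instant>.commit" marks a completed commit.
        # Fixed-width timestamps compare correctly as plain strings.
        if ext == ".commit" and stem > begin_instant:
            instants.append(stem)
    return sorted(instants)


# Hypothetical usage against a local copy of a table's metadata folder:
# print(commits_after("/tmp/my_table/.hoodie", "20191002000000"))
```

Files with other suffixes (in-flight or clean actions, for example) are
ignored by this sketch, so it only reports commits assumed to be complete.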
> On Wed, Oct 2, 2019 at 3:12 PM Qian Wang <[email protected]> wrote:
>
> > Hi Kabeer,
> >
> > I plan to do an incremental query PoC. My use case includes:
> >
> > 1. Load one big Hive table located in HDFS into Hudi as a history table (I
> > think I should use HiveSyncTool)
> > 2. Sink streaming data from Kafka to Hudi as a real-time table (use
> > HoodieDeltaStreamer?)
> > 3. Join the two tables to get the incremental metrics (Spark SQL?)
> >
> > My questions:
> >
> > 1. Do I just copy the Hudi packages to the server client for deployment?
> > 2. Does Hudi require access to HiveMetastore? My company restricts
> > access to HiveMetastore. Can Hudi use Hive JDBC to get metadata?
> > 3. What is the HoodieTableMeta used for? Where is the HoodieTableMeta
> > saved?
> >
> >
> > Best,
> > Qian
> > On Oct 2, 2019, 2:59 PM -0700, Kabeer Ahmed <[email protected]>, wrote:
> > > Qian
> > >
> > > Welcome!
> > > Are you able to tell us a bit more about your use case? Eg: type of the
> > > project, industry, complexity of the pipeline that you plan to write (eg:
> > > pulling data from external APIs like New York taxi dataset and writing
> > > them into Hive for analysis) etc.
> > > This will give us a bit more context.
> > > Thanks
> > > Kabeer.
> > >
> > > On Oct 2 2019, at 10:55 pm, Vinoth Chandar <[email protected]> wrote:
> > > > edit:
> > > >
> > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=113709185#Frequentlyaskedquestions(FAQ)-HowisaHudijobdeployed?
> > > > with the ? at the end
> > > >
> > > > On Wed, Oct 2, 2019 at 2:54 PM Vinoth Chandar <[email protected]> wrote:
> > > > > Hi Qian,
> > > > > Welcome! Does
> > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=113709185#Frequentlyaskedquestions(FAQ)-HowisaHudijobdeployed?
> > > > > help ?
> > > > >
> > > > >
> > > > > On Wed, Oct 2, 2019 at 10:18 AM Qian Wang <[email protected]> wrote:
> > > > > > Hi,
> > > > > > I am new to Apache Hudi. I am currently working on a PoC using
> > > > > > Hudi. Can anyone point me to some documentation on how to deploy
> > > > > > Apache Hudi? Thanks.
> > > > > >
> > > > > > Best,
> > > > > > Eric
> > > > >
> > > >
> > > >
> > >
> >
>
