Hi Qian,
You are right on the choice of tools for 2 and 3. But for 1, if you want to
do a one-time bulk load, you can look into the options in the migration guide
http://hudi.apache.org/migration_guide.html (HiveSyncTool is orthogonal to
this; it simply registers a Hudi dataset with the Hive metastore).
On yo
https://issues.apache.org/jira/browse/HUDI-288 tracks this
On Tue, Oct 1, 2019 at 10:17 AM Vinoth Chandar wrote:
>
> I think this has come up before.
>
> +1 to the point pratyaksh mentioned. I would like to add a few more
>
> - Schema could be fetched dynamically from a registry based on
>
Hi Kabeer,
I plan to do an incremental query PoC. My use case includes:
1. Load one big Hive table located in HDFS into Hudi as a history table (I think
I should use HiveSyncTool)
2. Sink streaming data from Kafka to Hudi as a real-time table (use
HoodieDeltaStreamer?)
3. Join both of two table get
Awesome!
On Wed, Oct 2, 2019 at 3:01 PM Gautam Nayak wrote:
> Thanks Vinoth for the tip. We were able to fix the issue as our Spark
> cluster(2.2.0) bundled both spark-streaming-kafka-0-8 and
> spark-streaming-kafka-0-10 jars. Getting rid of spark-streaming-kafka-0-10
> jars from the cluster reso
Thanks, Vinoth, for the tip. We were able to fix the issue, which arose because
our Spark cluster (2.2.0) bundled both the spark-streaming-kafka-0-8 and
spark-streaming-kafka-0-10 jars. Removing the spark-streaming-kafka-0-10 jars
from the cluster resolved the ClassCastException.
On Oct 1, 2019, at 10:25 AM, Vinoth C
Qian
Welcome!
Are you able to tell us a bit more about your use case? E.g., the type of
project, the industry, and the complexity of the pipeline that you plan to write
(e.g., pulling data from external APIs like the New York taxi dataset and
writing it into Hive for analysis).
This will give us a bit more c
edit:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=113709185#Frequentlyaskedquestions(FAQ)-HowisaHudijobdeployed?
with the ? at the end
On Wed, Oct 2, 2019 at 2:54 PM Vinoth Chandar wrote:
> Hi Qian,
>
> Welcome! Does
> https://cwiki.apache.org/confluence/pages/viewpage.actio
Hi Qian,
Welcome! Does
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=113709185#Frequentlyaskedquestions(FAQ)-HowisaHudijobdeployed?
help ?
On Wed, Oct 2, 2019 at 10:18 AM Qian Wang wrote:
> Hi,
>
> I am new to Apache Hudi. Currently I am working on a PoC using Hudi and
> any
Hi,
I am new to Apache Hudi. Currently I am working on a PoC using Hudi. Could anyone
give me some documentation on how to deploy Apache Hudi? Thanks.
Best,
Eric
Thanks, Thomas, for the review. I have created a new PR to address your comments:
https://github.com/apache/incubator-hudi/pull/935
Please review when you get a chance.
Thanks,
Balaji.V
On Wednesday, October 2, 2019, 09:26:49 AM PDT, Thomas Weise wrote:
I looked at the PR and I see a dist
This week I have limited internet access and would not be able to help
much.
On Wed, Oct 2, 2019 at 13:26 Thomas Weise wrote:
> I looked at the PR and I see a disturbing number of LICENSE file
> repetitions in it. There should be no need for that as LICENSE can be
> included automatically by the
I looked at the PR and I see a disturbing number of LICENSE file
repetitions in it. There should be no need for that as LICENSE can be
included automatically by the ASF parent pom (or similar project specific
solution):
https://github.com/apache/maven-apache-parent/blob/master/pom.xml#L308
Please
Based on some conversations I had with Flink folks, including Hudi's very
own mentor Thomas, it seems future-proof to look into supporting the Flink
streaming APIs. The batch APIs, IIUC, will move towards converging with the
streaming APIs, which matches Hudi's model anyway.
From Hudi's perspective, foll