Hi luoqi:

Building on partition support [1], I want to introduce a FileFormatSink to
cover the streaming exactly-once and partition-related logic for the Flink
file connectors and the Hive connector. You can take a look.

[1] 
https://docs.google.com/document/d/15R3vZ1R_pAHcvJkRx_CWleXgl08WL3k_ZpnWSdzP7GY/edit?usp=sharing

Best,
Jingsong Lee


------------------------------------------------------------------
From:Bowen Li <bowenl...@gmail.com>
Send Time: Friday, September 6, 2019, 05:21
To:Qi Luo <luoqi...@gmail.com>
Cc:user <user@flink.apache.org>; snake.fly318 <snake.fly...@gmail.com>; 
lichang.bd <lichang...@gmail.com>
Subject:Re: Streaming write to Hive

Hi, 

I'm not sure if there's one yet. Feel free to create one if not.
On Wed, Sep 4, 2019 at 11:28 PM Qi Luo <luoqi...@gmail.com> wrote:

Hi Bowen,

Thank you for the information! Streaming write to Hive is a very common use 
case for our users. Is there any open issue for this to which we can try 
contributing?

+Yufei and Chang who are also interested in this.

Thanks,
Qi
On Thu, Sep 5, 2019 at 12:16 PM Bowen Li <bowenl...@gmail.com> wrote:
Hi Qi,

With 1.9 out of the box, I'm afraid not. You can make HiveTableSink implement 
AppendStreamTableSink (an empty interface for now) so it can be picked up in a 
streaming job. Also, streaming requires checkpointing, and the Hive sink doesn't 
do that yet. There might be other tweaks you need to make.
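
To illustrate why checkpointing matters for an exactly-once sink, here is a 
minimal plain-Java sketch. It is not the HiveTableSink or StreamingFileSink 
implementation and uses no Flink or Hive classes. The class 
StagedPartitionSink and all of its methods are hypothetical; two method names 
simply echo Flink's checkpoint callbacks (snapshotState and 
notifyCheckpointComplete). The assumption is that records stay invisible until 
the checkpoint covering them completes, and that a partition is registered only 
at that point.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical sketch of a checkpoint-gated sink; not a Flink or Hive API. */
class StagedPartitionSink {
    // Records written since the last snapshot, stored as {partition, record}.
    private List<String[]> inProgress = new ArrayList<>();
    // Batches that were snapshotted but whose checkpoint is not yet confirmed.
    private final TreeMap<Long, List<String[]>> pending = new TreeMap<>();
    // What downstream readers are allowed to see.
    private final List<String> committedRecords = new ArrayList<>();
    private final Set<String> committedPartitions = new LinkedHashSet<>();

    void write(String partition, String record) {
        inProgress.add(new String[] {partition, record});
    }

    /** Called when a checkpoint is taken: freeze the current batch under its id. */
    void snapshotState(long checkpointId) {
        pending.put(checkpointId, inProgress);
        inProgress = new ArrayList<>();
    }

    /** Called once the checkpoint is durable: publish every batch up to that id. */
    void notifyCheckpointComplete(long checkpointId) {
        Map<Long, List<String[]>> ready = pending.headMap(checkpointId, true);
        for (List<String[]> batch : ready.values()) {
            for (String[] entry : batch) {
                committedPartitions.add(entry[0]); // stands in for "add Hive partition"
                committedRecords.add(entry[1]);
            }
        }
        ready.clear(); // the view is backed by the map, so this removes published batches
    }

    /** On failure, drop data that no checkpoint covers; the source would replay it. */
    void discardInProgress() {
        inProgress = new ArrayList<>();
    }

    List<String> committedRecords() {
        return Collections.unmodifiableList(committedRecords);
    }

    Set<String> committedPartitions() {
        return Collections.unmodifiableSet(committedPartitions);
    }

    public static void main(String[] args) {
        StagedPartitionSink sink = new StagedPartitionSink();
        sink.write("dt=2019-09-04", "a");
        sink.snapshotState(1L);
        sink.write("dt=2019-09-05", "b"); // arrives after the snapshot
        sink.notifyCheckpointComplete(1L);
        System.out.println(sink.committedRecords());
        System.out.println(sink.committedPartitions());
    }
}
```

In this sketch, a record written after a snapshot is never published unless a 
later checkpoint covers it, which is the property a streaming Hive sink would 
need before making a partition visible.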

It's on our list for 1.10, not high priority though.

Bowen
On Wed, Sep 4, 2019 at 2:23 AM Qi Luo <luoqi...@gmail.com> wrote:
Hi guys,

In Flink 1.9, HiveTableSink was added to support writing to Hive, but it only 
supports batch mode. StreamingFileSink can write to HDFS in streaming mode, but 
it has no Hive-related functionality (e.g. adding Hive partitions).

Is there an easy way to do streaming writes to Hive (with an exactly-once 
guarantee)?

Thanks,
Qi
