BTW, I'm adding the user@ mailing list since this is a user question and
should be asked there.

The dev@ mailing list is only for discussions of Flink development. Please see
https://flink.apache.org/community.html#mailing-lists
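
Also, regarding the last question in your original mail: count-based
windows are in fact supported on keyed streams via countWindow(). Here is
a minimal sketch against the DataStream API; the topic name, key
extractor, Kafka properties, and the Hive-writing sink are placeholders
you would replace with your own:

    StreamExecutionEnvironment env =
        StreamExecutionEnvironment.getExecutionEnvironment();

    // Kafka source; topic name and kafkaProps are illustrative
    FlinkKafkaConsumer<String> source =
        new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), kafkaProps);

    env.addSource(source)
       .keyBy(record -> extractKey(record)) // countWindow requires a keyed stream
       .countWindow(20_000)                 // fires every 20,000 records per key,
                                            // independent of time
       .reduce((a, b) -> a + "\n" + b)      // or a WindowFunction that batches records
       .addSink(new MyHiveBatchSink());     // placeholder for your Hive-writing sink

    env.execute("kafka-to-hive");

One caveat: a pure count window only fires when the count is reached, so
a key that stops receiving records will never flush. In practice you
would usually combine a count trigger with a processing-time timeout.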

On Wed, Jul 3, 2019 at 12:34 PM Bowen Li <bowenl...@gmail.com> wrote:

> Hi Youssef,
>
> You need to provide more background context:
>
> - Which Hive sink are you using? We are working on an official Hive sink
> for the community, which will be released in 1.9. Did you develop yours in
> house?
> - What do you mean by the 1st, 2nd, and 3rd windows? Do you mean the
> parallel instances of the same operator, or do you have 3 windowing
> operations chained?
> - What does your Hive table look like? E.g., is it partitioned or
> non-partitioned? If partitioned, how many partitions do you have? Are you
> writing in static-partition or dynamic-partition mode? What format? How
> large?
> - What does your sink do? Is each parallel instance writing to multiple
> partitions or to a single partition/table? Is it only appending data or
> upserting?
>
> On Wed, Jul 3, 2019 at 1:38 AM Youssef Achbany <
> youssef.achb...@euranova.eu> wrote:
>
>> Dear all,
>>
>> I'm working on a big project, and one of the challenges is to read Kafka
>> topics and copy them via Hive commands into Hive managed tables in order
>> to enable Hive's ACID properties.
>>
>> I tried it, but I have an issue with back pressure:
>> - The first window read 20,000 events and wrote them to Hive tables.
>> - The second, third, ... windows send only 100 events each, because the
>> write to Hive takes more time than the read from a Kafka topic. But
>> writing 100 events or 50,000 events takes roughly the same time in Hive.
>>
>> Has someone already built this source and sink? Could you help with
>> this, or do you have some tips?
>> It also seems that defining a window size based on the number of events
>> instead of time is not possible. Is that true?
>>
>> Thank you for your help.
>>
>> Youssef
>>
>> --
>> ♻ Be green, keep it on the screen
>>
>
