Hi, could you share the SQL you written for your original purpose, not the
one you attached ProcessFunction for debugging?

Best,
Kurt


On Tue, Mar 17, 2020 at 3:08 AM Dominik Wosiński <wos...@gmail.com> wrote:

> Actually, I just put this process function there for debugging purposes.
> My main goal is to join the E & C using the Temporal Table function, but I
> have observed exactly the same behavior i.e. when the parallelism was > 1
> there was no output and when I was setting it to 1 then the output was
> generated. So, I have switched to process function to see whether the
> watermarks are reaching this stage.
>
> Best Regards,
> Dom.
>
> pon., 16 mar 2020 o 19:46 Theo Diefenthal <
> theo.diefent...@scoop-software.de> napisał(a):
>
>> Hi Dominik,
>>
>> I had the same once with a custom processfunction. My processfunction
>> buffered the data for a while and then output it again. As the proces
>> function can do anything with the data (transforming, buffering,
>> aggregating...), I think it's just not safe for flink to reason about the
>> watermark of the output.
>>
>> I solved all my issues by calling `assignTimestampsAndWatermarks`
>> directly post to the (co-)process function.
>>
>> Best regards
>> Theo
>>
>> ------------------------------
>> *Von: *"Dominik Wosiński" <wos...@gmail.com>
>> *An: *"user" <user@flink.apache.org>
>> *Gesendet: *Montag, 16. März 2020 16:55:18
>> *Betreff: *Issues with Watermark generation after join
>>
>> Hey,
>> I have noticed a weird behavior with a job that I am currently working
>> on. I have 4 different streams from Kafka, lets call them A, B, C and D.
>> Now the idea is that first I do SQL Join of A & B based on some field, then
>> I create append stream from Joined A&B, let's call it E. Then I need to
>> assign timestamps to E since it is a result of joining and Flink can't
>> figure out the timestamps.
>>
>> Next, I union E & C, to create some F stream. Then finally I connect E &
>> C using `keyBy` and CoProcessFunction. Now the issue I am facing is that if
>> I try to, it works fine if I enforce the parallelism of E to be 1 by
>> invoking *setParallelism*. But if parallelism is higher than 1, for the
>> same data - the watermark is not progressing correctly. I can see that 
>> *CoProcessFunction
>> *methods are invoked and that data is produced, but the Watermark is
>> never progressing for this function. What I can see is that watermark is
>> always equal to (0 - allowedOutOfOrderness). I can see that timestamps are
>> correctly extracted and when I add debug prints I can actually see that
>> Watermarks are generated for all streams, but for some reason, if the
>> parallelism is > 1 they will never progress up to connect function. Is
>> there anything that needs to be done after SQL joins that I don't know of
>> ??
>>
>> Best Regards,
>> Dom.
>>
>

Reply via email to