Hi Fabian,
Actually, I found a JIRA issue for a similar problem:
https://issues.apache.org/jira/browse/BEAM-3225. This is something similar
to what I am facing too.
I have 4 Kafka topics as input sources. They are read using a GlobalWindow
and processing-time triggers, and are further joined based on a common
Hi,
Could you maybe post your pipeline code? That way I could have a look.
Best,
Aljoscha
> On 8. Dec 2017, at 12:31, Fabian Hueske wrote:
>
> Hmm, I see...
> I'm running out of ideas.
>
> You might be right with your assumption about a bug in the Beam Flink runner.
> In
Hmm, I see...
I'm running out of ideas.
You might be right with your assumption about a bug in the Beam Flink
runner. In this case, this would be an issue for the Beam project which
hosts the Flink runner.
But it might also be an issue on the Flink side.
Maybe Aljoscha (in CC), one of the
Hi Nishu,
the data loss might be caused by the fact that processing-time triggers do
not fire when the program terminates.
So, if your program has records stored in a window and the program terminates
because the input was fully consumed, the window operator won't process the
remaining windows but
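Fabian's point can be illustrated with a small, self-contained simulation. This is plain Python with an injected clock, not Beam or Flink code; the class and tick values are made up for the sketch:

```python
class ProcessingTimeWindow:
    """Toy model of a window whose buffer is emitted only when a
    processing-time trigger fires. The clock is injected so the
    example is deterministic."""

    def __init__(self, interval, clock):
        self.interval = interval
        self.clock = clock
        self.buffer = []
        self.emitted = []
        self.next_fire = clock() + interval

    def process(self, element):
        self.buffer.append(element)
        # The trigger only fires once enough processing time has passed.
        if self.clock() >= self.next_fire:
            self.emitted.extend(self.buffer)
            self.buffer.clear()
            self.next_fire = self.clock() + self.interval

# Fake processing-time clock: advances one tick per call.
ticks = iter(range(1000))
clock = lambda: next(ticks)

window = ProcessingTimeWindow(interval=5, clock=clock)
for i in range(12):
    window.process(i)

# The input is now fully consumed. The trigger fired twice (emitting
# elements 0-9), but 10 and 11 are still buffered: if the job simply
# terminates here, they are never emitted.
print(window.emitted)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(window.buffer)   # [10, 11]
```

The last elements only survive if something drains the buffer on shutdown, which is exactly what a processing-time trigger does not guarantee.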
Hi Fabian,
The program runs until I manually stop it. The trigger is also firing as
expected, because I read the entire data after the trigger fires to see
what data is captured, and pass that data over to GroupByKey as input.
It's using a Global window, so I accumulate the entire data each time the
Hi,
Thanks for your inputs.
I am reading Kafka topics in Global windows and have defined some
ProcessingTime triggers. Hence there are no late records.
The program performs a join between multiple Kafka topics. The
transformation sequence consists of something like:
1. Read
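The joined-topics setup described in this thread could be sketched roughly as follows. This is a toy stdlib illustration, not the Beam API; the topic names, keys, and records are invented:

```python
from collections import defaultdict

# Two keyed sources accumulated in global per-key state; a join result
# is produced whenever both sides have seen the key.
state = {"orders": defaultdict(list), "users": defaultdict(list)}

def on_record(topic, key, value):
    state[topic][key].append(value)
    other = "users" if topic == "orders" else "orders"
    # Emit joined pairs once both sides have data for this key.
    return [(key, value, v) if topic == "orders" else (key, v, value)
            for v in state[other][key]]

results = []
results += on_record("users", "k1", "alice")
results += on_record("orders", "k1", "order-1")  # joins with alice
results += on_record("orders", "k2", "order-2")  # no user yet -> nothing
results += on_record("users", "k2", "bob")       # joins with order-2
print(results)  # [('k1', 'order-1', 'alice'), ('k2', 'order-2', 'bob')]
```

Because the state is global and accumulating, every firing sees all data observed so far, which matches the "accumulate entire data each time" description above.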
Nishu
You might consider a side output with metrics, at least after the window. I
would suggest having that to catch data skew or partition skew in all Flink
jobs, and amend if needed.
Chen
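Chen's suggestion amounts to counting records per key after the window and flagging keys that hold a disproportionate share of the data. A minimal stdlib sketch (the function name and threshold are made up, and this is not a Flink API):

```python
from collections import Counter

def detect_skew(records, threshold=0.5):
    """Return keys whose share of the records exceeds `threshold`."""
    counts = Counter(key for key, _ in records)
    total = sum(counts.values())
    return {k: c for k, c in counts.items() if c / total > threshold}

# One "hot" key holding 80% of the data would be flagged as skewed.
records = [("hot", i) for i in range(8)] + [("cold", 0), ("cold", 1)]
print(detect_skew(records))  # {'hot': 8}
```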
On Thu, Dec 7, 2017 at 9:48 AM Fabian Hueske wrote:
> Is it possible that the data is
Is it possible that the data is dropped due to being late, i.e., records
with timestamps behind the current watermark?
What kind of operations does your program consist of?
Best, Fabian
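The late-data scenario Fabian asks about can be illustrated with a toy stdlib snippet (not a Flink API; the record structure and watermark value are invented): records whose event timestamp is at or behind the current watermark would be treated as late and dropped instead of being assigned to a window.

```python
def split_late(records, watermark):
    """Partition records into on-time and late, relative to a watermark."""
    on_time = [r for r in records if r["ts"] > watermark]
    late = [r for r in records if r["ts"] <= watermark]
    return on_time, late

records = [{"ts": 100, "v": "a"}, {"ts": 90, "v": "b"}, {"ts": 120, "v": "c"}]
on_time, late = split_late(records, watermark=95)
print(on_time)  # the records with ts 100 and 120
print(late)     # the ts-90 record would be silently dropped
```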
2017-12-07 10:20 GMT+01:00 Sendoh :
> I would recommend to also print the count of
I would recommend also printing the counts of the input and output of each
operator by using an Accumulator.
Cheers,
Sendoh
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
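Sendoh's idea of per-operator input/output counts can be sketched in plain Python (this is a toy wrapper, not the Flink Accumulator API; the counter names are made up): wrapping each stage so it counts records in and out makes it easy to spot the stage where data goes missing.

```python
def counted(name, fn, counters):
    """Wrap an operator `fn` so it counts its input and output records."""
    def wrapper(elements):
        counters[name + ".in"] = counters.get(name + ".in", 0) + len(elements)
        out = fn(elements)
        counters[name + ".out"] = counters.get(name + ".out", 0) + len(out)
        return out
    return wrapper

counters = {}
parse = counted("parse", lambda xs: [x for x in xs if x is not None], counters)
parse([1, None, 2, 3])
print(counters)  # {'parse.in': 4, 'parse.out': 3} -> one record lost at parse
```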
Hi,
I am running a streaming pipeline (written in the Beam framework) with Flink.
The *operator sequence* is -> read the JSON data, parse the JSON string to an
object, and group the objects based on a common key. I noticed that the
GroupByKey operator throws away some data in between, and hence I don't get
all
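The operator sequence described above can be sketched in plain, stdlib Python (this is not the Beam pipeline itself; the field names and sample records are invented):

```python
import json
from collections import defaultdict

# Read JSON data (here: a few hard-coded lines standing in for Kafka).
lines = ['{"key": "a", "v": 1}', '{"key": "b", "v": 2}', '{"key": "a", "v": 3}']

# Parse each JSON string to an object.
parsed = [json.loads(line) for line in lines]

# Group the objects based on a common key (the GroupByKey step).
grouped = defaultdict(list)
for obj in parsed:
    grouped[obj["key"]].append(obj["v"])

print(dict(grouped))  # {'a': [1, 3], 'b': [2]}
```

In this in-memory version every record survives grouping, which is why the thread focuses on windowing and trigger semantics as the likely cause of the missing data rather than the grouping itself.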