That is indeed strange. Would you be able to provide some debugging
information, e.g. how many message get acked for each checkpoint?
What is the parallelism of your job?
Thanks,
Max
On 12.09.18 12:57, Encho Mishinev wrote:
Hello Max,
Thanks for the answer. My guess was that they are acknowledged at
completion of Flink's checkpoints, but wanted to make sure since that
doesn't explain my problem.
Whenever a subscription is nearly empty the job gets slower overall and
the Flink's checkpoints start taking much more time (thrice or more)
even though their state is much smaller, and of course, there always
seem to be messages cycling over and over again.
If you have any clue at all why this might be, let me know.
Thanks for the help,
Encho
On Tue, Sep 11, 2018 at 1:45 PM Maximilian Michels <[email protected]
<mailto:[email protected]>> wrote:
Hey Encho,
The Flink Runner acknowledges messages through PubSubIO's
`CheckpointMark#finalizeCheckpoint()` method.
The Flink Runner wraps the PubSubIO source via the
UnboundedSourceWrapper. When Flink takes a checkpoint of the running
Beam streaming job, the wrapper will retrieve the CheckpointMarks from
the PubSubIO source.
When the Checkpoint is completed, there is a callback which informs the
wrapper (`notifyCheckpointComplete()`) and calls `finalizeCheckpoint()`
on all the generated CheckpointMarks.
Hope that helps debugging your problem. I don't have an explanation why
this doesn't work for the last records in your PubSub queue. It
shouldn't make a difference for how the Flink Runner does checkpointing.
Best,
Max
On 10.09.18 18:17, Encho Mishinev wrote:
> Hello,
>
> I am using Flink runner with Apache Beam 2.6.0. I was wondering
if there
> is information on when exactly the runner acknowledges a pubsub
message
> when reading from PubsubIO?
>
> My problem is that whenever there are a few messages left in a
> subscription my streaming job never really seems to acknowledge them
> all. For example is a subscription has 100,000,000 messages in
total,
> the job will go through about 99,990,000 and then keep reading
the last
> few thousand and seemingly never acknowledge them.
>
> Some clarity on when the acknowledgement happens in the pipeline
might
> help me debug this problem.
>
> Thanks!