Hi,
our use case is that the data sources are independent: we are using Flink to
ingest data from Kafka sources, do a bit of filtering, and then write it to
S3.
Since we ingest from multiple Kafka sources, and they are independent, we
consider them all optional. Even if just one Kafka is up and running,
Hi Bariša,
The way I see it, you either
- need data from all sources because you are doing some
joint processing. In that case stopping the pipeline is usually the
right thing to do.
- consume streams from multiple servers that are not combined and hence
could be processed in independent
Hi Bariša,
Could you share the reason why your data processing pipeline should keep
running when one Kafka source is down?
It seems that every one of the multiple Kafka sources is optional for the
data processing logic, because any Kafka source could be the one that is
down.
Best regards,
Jing
I think you could try a custom source for that: even when one of the
Kafka sources is down, the operator keeps running (it just does nothing). The only
trouble is that you need to manage checkpointing and a few other things yourself.
But the good news is that you can copy the implementation of
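
The "keep running but do nothing" behaviour suggested above can be sketched
in plain Python, independent of Flink. This is only an illustration of the
idea, not Flink's source API: the names OptionalSource, poll_all and
retry_after_s are hypothetical, the fetch callable stands in for a real Kafka
consumer, and offset/checkpoint handling (which, as noted above, you would
have to manage yourself) is left out.

```python
import time
from typing import Callable, List, Optional


class OptionalSource:
    """Hypothetical wrapper: a source that yields nothing while its broker is down."""

    def __init__(
        self,
        name: str,
        fetch: Callable[[], List[str]],   # stand-in for a real Kafka poll
        retry_after_s: float = 5.0,       # assumed back-off before retrying
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.name = name
        self._fetch = fetch
        self._retry_after_s = retry_after_s
        self._clock = clock
        self._down_until: Optional[float] = None  # None means "believed healthy"

    def poll(self) -> List[str]:
        now = self._clock()
        # While backing off, do nothing instead of failing the whole job.
        if self._down_until is not None and now < self._down_until:
            return []
        try:
            records = self._fetch()
        except ConnectionError:
            # Broker unreachable: remember when to try again, emit nothing.
            self._down_until = now + self._retry_after_s
            return []
        self._down_until = None
        return records


def poll_all(sources: List[OptionalSource]) -> List[str]:
    """Collect records from every source that is currently reachable."""
    out: List[str] = []
    for source in sources:
        out.extend(source.poll())
    return out
```

With this shape, a failing source costs one connection attempt per back-off
interval, while the healthy sources keep delivering records.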
Hi,
we are running a Flink job with multiple Kafka sources connected to
different Kafka servers.
The problem we are facing is that when one of the Kafka servers is down, the Flink job
starts restarting.
Is there any way for Flink to pause processing of the Kafka source that is down,
and yet continue processing