Fine-grained task recovery

Stanislav Borissov Mon, 14 Dec 2020 08:24:17 -0800

Hi,

I'm running a simple, "embarrassingly parallel" ETL-type job. I noticed that
a failure in one subtask causes the entire job to restart. Even with the
region failover strategy, all subtasks of the failed task and of the tasks
connected to it are restarted. Is there any way to limit the restart to only
the single subtask that failed, so all other subtasks stay alive and keep
working?
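
For reference, on a self-managed Flink cluster the region failover strategy
mentioned above is normally selected in flink-conf.yaml with the setting
below. This is only an illustration of a self-managed setup; whether Kinesis
Data Analytics lets users change this key is an open assumption here.

    # flink-conf.yaml (self-managed cluster assumed)
    # "region" restarts only the pipelined region containing the failed task;
    # "full" restarts every task in the job.
    jobmanager.execution.failover-strategy: region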


For context, I use Flink 1.11 on AWS Kinesis Data Analytics, so some of the
configuration is not under my control:
<https://docs.aws.amazon.com/kinesisanalytics/latest/java/reference-flink-settings.title.html>

Thanks
