Hi,

We have been using JdbcIO for data ingestion from Oracle. It works fine most days, but on some days (usually when the DB is busy due to high user activity) a couple of our jobs, which normally take ~15-20 minutes, get stuck in Dataflow, with stages indicating that stragglers were detected. These jobs then remain blocked for days until we forcefully kill them from Dataflow, after which they succeed on being restarted. So I was wondering whether there is any way to determine that a job is in this kind of blocked state, so that it can be safely restarted.
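One direction we have been considering (just a rough sketch, not an official Dataflow API, and the function names and the 3x-median heuristic are our own assumptions): derive a per-job "stuck" threshold from that job's own historical runtimes, then periodically compare each active job's elapsed wall-clock time against it before deciding to cancel and resubmit.

```python
from datetime import datetime, timedelta, timezone
from statistics import median

def stuck_threshold_min(historical_runtimes_min, factor=3.0, floor_min=30.0):
    """Per-job threshold in minutes: a multiple of the median historical
    runtime, with a floor so very short jobs are not flagged too eagerly.
    The factor/floor values are arbitrary starting points, not recommendations."""
    return max(factor * median(historical_runtimes_min), floor_min)

def is_likely_stuck(started_at, historical_runtimes_min, now=None):
    """True if the job has been running far longer than its own history suggests."""
    now = now or datetime.now(timezone.utc)
    elapsed_min = (now - started_at) / timedelta(minutes=1)
    return elapsed_min > stuck_threshold_min(historical_runtimes_min)

# Example: a job that normally takes ~15-20 min but has been running 2 hours.
start = datetime.now(timezone.utc) - timedelta(hours=2)
print(is_likely_stuck(start, [15, 17, 20, 18, 16]))  # → True
```

The start times and historical runtimes would have to come from the Dataflow monitoring API or from listing jobs with gcloud; a flagged job could then be cancelled and resubmitted by an external scheduler, which keeps the timeout per-job without baking a single global value into the pipeline itself.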
I see the experiment --experiments=max_workflow_runtime_walltime_seconds=600, which kills the Dataflow job outright after the specified number of elapsed seconds. However, it would be hard to find an optimal timeout for each job, so I've been wondering whether there are better options.

Thanks,
Varun Rauthan