Hi,

We have been using JdbcIO for data ingestion from Oracle. It works fine most days, but on some days (usually when the DB is busy due to high user activity) a couple of our jobs, which normally take ~15-20 minutes, get stuck in Dataflow, with stages indicating that stragglers were detected. These jobs then remain blocked for days until we forcefully kill them from Dataflow, after which they succeed on being restarted. So I was wondering whether there is any way to determine that a job is in this kind of blocked state, so that it can be safely restarted.
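One direction we have been considering (just a rough sketch, not an official Dataflow API, and the function names and the 3x-median heuristic are our own assumptions): derive a per-job "stuck" threshold from that job's own historical runtimes, then periodically compare each active job's elapsed wall-clock time against it before deciding to cancel and resubmit.

```python
from datetime import datetime, timedelta, timezone
from statistics import median

def stuck_threshold_min(historical_runtimes_min, factor=3.0, floor_min=30.0):
    """Per-job threshold in minutes: a multiple of the median historical
    runtime, with a floor so very short jobs are not flagged too eagerly.
    The factor/floor values are arbitrary starting points, not recommendations."""
    return max(factor * median(historical_runtimes_min), floor_min)

def is_likely_stuck(started_at, historical_runtimes_min, now=None):
    """True if the job has been running far longer than its own history suggests."""
    now = now or datetime.now(timezone.utc)
    elapsed_min = (now - started_at) / timedelta(minutes=1)
    return elapsed_min > stuck_threshold_min(historical_runtimes_min)

# Example: a job that normally takes ~15-20 min but has been running 2 hours.
start = datetime.now(timezone.utc) - timedelta(hours=2)
print(is_likely_stuck(start, [15, 17, 20, 18, 16]))  # → True
```

The start times and historical runtimes would have to come from the Dataflow monitoring API or from listing jobs with gcloud; a flagged job could then be cancelled and resubmitted by an external scheduler, which keeps the timeout per-job without baking a single global value into the pipeline itself.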
I see the experiment --experiments=max_workflow_runtime_walltime_seconds=600, which kills the Dataflow job outright after the specified number of elapsed seconds. However, it would be hard to find an optimal timeout for each job, so I've been wondering whether there are better options.

Thanks,
Varun Rauthan