ppawel commented on issue #35126:
URL: https://github.com/apache/beam/issues/35126#issuecomment-2967156720

   I think this is also causing workers to restart every ~1h in streaming 
jobs. The symptom is:
   
   ```
   Status service is unhealthy: UNAVAILABLE: SDK harness sdk-0-0 has not responded to a status request in 1h5m25.731689364s which is longer than the threshold of 1h
   ```
   
   Shortly after, the autoscaler says it is raising the number of workers to 
1, so I guess it thinks it lost the unhealthy worker. This causes the job 
backlog to build up, because starting a new worker takes 1-2 minutes.
   
   @scwhittle Are there plans to release a minor update to Beam 2.65 that 
includes your patch?

