o-nikolas commented on issue #27069:
URL: https://github.com/apache/airflow/issues/27069#issuecomment-1289712319

> We should either have more relaxed expectations of the heartbeat frequency (for example accept and communicate that heartbeats coming with 100% delay are still ok) or make sure heartbeats are generated with very good regularity independent from the load of the system. I think regularity is hard to achieve in the case of Python due to multi-processing/GIL and the fact that heartbeats use the database while being generated - this combination makes it nearly impossible to guarantee that heartbeats will come with great regularity. So I'd say relaxing it and looking at what the consequence is, is likely the better way.

Yupp, agreed on the latter. We hear this a lot from customers who are worried about scheduler heartbeat delay. And while it is a good indicator of load, that's about it (unless it stops entirely, then you have a problem :sweat_smile:). So if we "rebrand" the heartbeat to be just that, a load indicator, users will hopefully be less concerned when it falls behind while the environment is being pushed hard.

> But since this is often difficult to do and might require huge investment, if we can assess that we can safely increase such timers, we can do it as "good-enough". There is never a "0-1" answer, this is left to the judgment of the person proposing such a PR.

Indeed, I don't have the time to invest in this particular area at the moment (which is why it's good to document these things in issues, because someone else might!). So if this becomes a nuisance, then I agree that we could just bump the limit as a stopgap solution, but not resolve this issue (which will represent the more long-term solution).

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
