Hi, Now that we have our Airflow metadata database on AWS (Aurora) + ProxySQL, we sometimes run into database disconnect issues for which the Celery stack will just retry (3 times), but Airflow would just give up right away. It would do so on basically any operations requiring database connection such as task heartbeating or a simple *Variable.Get()*.
In our use-case it would be very much beneficial to implement a database retry on critical functions (same as mentioned above at least) and I was wondering what would be people's opinions about this. I could not figure out any reasons why not to do that, but there might be some? The new "connection pooling pinging" feature (*1.10.6*) does not help since pooling is not active on the workers (I assumed this is passed on to the *airflow run* command) Cheers, Emmanuel -- GetYourGuide AG Stampfenbachstrasse 48 8006 Zürich Switzerland <https://www.facebook.com/GetYourGuide> <https://twitter.com/GetYourGuide> <https://www.instagram.com/getyourguide/> <https://www.linkedin.com/company/getyourguide-ag> <http://www.getyourguide.com>