Yuvaraj created AIRFLOW-3405: -------------------------------- Summary: Task instance fail intermittently due to MySQL error Key: AIRFLOW-3405 URL: https://issues.apache.org/jira/browse/AIRFLOW-3405 Project: Apache Airflow Issue Type: Improvement Environment: MySQL, Redhat Linux Reporter: Yuvaraj
Dags are getting failed intermittently due to below error. OperationalError: (_mysql_exceptions.OperationalError) (1040, 'Too many connections') [2018-11-25 12:24:16,952] - Heartbeat time limited exceeded! We have max_connections defined as 2000 in DB. Below are the setting in cfg. sql_alchemy_pool_size = 1980 sql_alchemy_pool_recycle = 3600 As per DBA, The airflow scheduler keeps opening connections to the database, these connections are mostly idle, they get reset whenever the scheduler restarts but with max_connections at 2000 and scheduler holding on to 1600 of these, other apps trying to connect might start running out of connections. How do we remediate these idle connections. What should be the optimal value for these configs and max_connections that to be set at DB. Consider we need to build a large environment serving 500+ definitions with 10000+ runs per day. Need suggestions... -- This message was sent by Atlassian JIRA (v7.6.3#76005)