sweetpythoncode opened a new issue, #44837:
URL: https://github.com/apache/airflow/issues/44837
### Apache Airflow version
2.10.3
### If "Other Airflow 2 version" selected, which one?
2.8.1
### What happened?
`[2024-10-08, 16:38:05 UTC] {{taskinstance.py:2698}} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1094, in _commit_impl
    self.engine.dialect.do_commit(self.connection)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 686, in do_commit
    dbapi_connection.commit()
psycopg2.OperationalError: SSL connection has been closed unexpectedly
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 434, in _execute_task
    with create_session() as session:
  File "/usr/local/lib/python3.11/contextlib.py", line 144, in __exit__
    next(self.gen)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/utils/session.py", line 39, in create_session
    session.commit()
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1454, in commit
    self._transaction.commit(_to_root=self.future)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 839, in commit
    trans.commit()
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2469, in commit
    self._do_commit()
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2659, in _do_commit
    self._connection_commit_impl()
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2630, in _connection_commit_impl
    self.connection._commit_impl()
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1096, in _commit_impl
    self._handle_dbapi_exception(e, None, None, None, None)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2134, in _handle_dbapi_exception
    util.raise_(
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
    raise exception
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1094, in _commit_impl
    self.engine.dialect.do_commit(self.connection)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 686, in do_commit
    dbapi_connection.commit()
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) SSL connection has been closed unexpectedly
(Background on this error at: https://sqlalche.me/e/14/e3q8)`
The Airflow task itself completed successfully, but it failed on exit, and
only after ~30 minutes of runtime. If the task runs for less than that, there
are no issues.
**If I run the same task on 2.4.3, everything works.**
### What you think should happen instead?
The task finishes without any errors.
### How to reproduce
MWAA: 2.8.1, 2.10.1.
Run an otherwise successful task for 30+ minutes.
More details:
https://repost.aws/questions/QU9VhJrACSSdy7zJku4pVO9Q/mwaa-psycopg2-operationalerror-ssl-connection-has-been-closed-unexpectedly
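The reproduction steps above could be sketched as a DAG with a single long-running task. This is a minimal sketch under assumptions: the DAG id, task id, and the `run_for` helper are hypothetical names, and the task simply sleeps for 35 minutes without touching the metadata database. The report does not confirm that a plain sleep is enough to trigger the error on every MWAA environment.

```python
import time
from datetime import datetime


def run_for(duration_seconds, sleep=time.sleep, step_seconds=60):
    """Stay busy for duration_seconds without using the metadata DB.

    Returns the total number of seconds requested from sleep().
    """
    slept = 0
    while slept < duration_seconds:
        chunk = min(step_seconds, duration_seconds - slept)
        sleep(chunk)
        slept += chunk
    return slept


try:
    # Only importable inside an Airflow environment.
    from airflow import DAG
    from airflow.operators.python import PythonOperator
except ImportError:
    DAG = None

if DAG is not None:
    with DAG(
        dag_id="repro_long_running_task",  # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule=None,  # trigger manually
        catchup=False,
    ) as dag:
        PythonOperator(
            task_id="sleep_35_minutes",  # hypothetical name
            python_callable=run_for,
            op_kwargs={"duration_seconds": 35 * 60},
        )
```

Triggering the DAG manually and checking the task log after it exits should show whether the commit on exit fails.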
### Operating System
AWS MWAA
### Versions of Apache Airflow Providers
apache-airflow-providers-amazon 8.16.0
apache-airflow-providers-celery 3.5.1
apache-airflow-providers-common-io 1.2.0
apache-airflow-providers-common-sql 1.10.0
apache-airflow-providers-databricks 6.0.0
apache-airflow-providers-ftp 3.7.0
apache-airflow-providers-google 10.13.1
apache-airflow-providers-http 4.8.0
apache-airflow-providers-imap 3.5.0
apache-airflow-providers-postgres 5.10.0
apache-airflow-providers-sendgrid 3.4.0
apache-airflow-providers-sftp 4.8.1
apache-airflow-providers-slack 8.5.1
apache-airflow-providers-sqlite 3.7.0
apache-airflow-providers-ssh 3.10.0
apache-airflow-providers-trino 5.6.0
astronomer-cosmos 1.7.1
### Deployment
Amazon (AWS) MWAA
### Deployment details
Version: 2.8.1+
Size: mw1.large
### Anything else?
During the investigation, I checked:
1. A new MWAA cluster.
2. The latest MWAA version.
3. These options:
```
database.sql_alchemy_pool_size 20
database.sql_alchemy_pool_recycle 300
database.sql_alchemy_max_overflow 20
database.sql_alchemy_pool_pre_ping true
```
I also tried sql-alchemy-connect-args and all other options related to the
engine and keepalive timeouts, but it looks like MWAA blocks this parameter.
The issue is reported in the Slack channel →
https://apache-airflow.slack.com/archives/CCRR5EBA7/p1733820955101179
Overall issue report →
https://apache-airflow.slack.com/archives/CCRR5EBA7/p1732187849345499
re:Post question →
https://repost.aws/questions/QU9VhJrACSSdy7zJku4pVO9Q/mwaa-psycopg2-operationalerror-ssl-connection-has-been-closed-unexpectedly
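For context, this is roughly what the connect args mentioned above look like on a self-managed deployment. It is a sketch under assumptions: the module name and the values are hypothetical, and only the four `keepalives*` keys are real libpq connection parameters that psycopg2 passes through. Per the report, MWAA does not appear to accept this setting, so it is not a confirmed fix.

```python
# Hypothetical module on a self-managed deployment, e.g. airflow_local_settings.py.
# The values are illustrative and are not taken from the report.
KEEPALIVE_CONNECT_ARGS = {
    "keepalives": 1,            # enable client-side TCP keepalives
    "keepalives_idle": 60,      # seconds of inactivity before the first probe
    "keepalives_interval": 10,  # seconds between unanswered probes
    "keepalives_count": 5,      # unanswered probes before the connection is dropped
}


def to_conninfo(args):
    """Render the arguments as a libpq keyword/value connection string fragment."""
    return " ".join(f"{key}={value}" for key, value in args.items())


print(to_conninfo(KEEPALIVE_CONNECT_ARGS))
```

Airflow's `[database] sql_alchemy_connect_args` option takes the import path of such a dictionary, for example `airflow_local_settings.KEEPALIVE_CONNECT_ARGS` with the hypothetical module above.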
Here is an RDS dump of the keepalive settings for 2.4.3 and 2.8.1 (it's the same on both environments):
```
[2024-12-11, 09:23:19 UTC] {{logging_mixin.py:137}} INFO -
tcp_keepalives_count: 2
[2024-12-11, 09:23:19 UTC] {{logging_mixin.py:137}} INFO -
tcp_keepalives_idle: 300
[2024-12-11, 09:23:19 UTC] {{logging_mixin.py:137}} INFO -
tcp_keepalives_interval: 30
```
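A dump like the one above can be collected from inside a task, and the values can be turned into an approximate dead-connection detection time. This is a minimal sketch, not the script used for the report: the function names are hypothetical, and `cursor` is assumed to be any DB-API cursor connected to the metadata database (for example from psycopg2). The estimate uses the usual TCP keepalive rule of idle time plus interval times probe count.

```python
def read_keepalive_settings(cursor):
    """Read the server-side TCP keepalive settings via pg_settings."""
    cursor.execute(
        "SELECT name, setting FROM pg_settings "
        "WHERE name LIKE 'tcp_keepalives_%' ORDER BY name"
    )
    return {name: int(setting) for name, setting in cursor.fetchall()}


def dead_peer_detection_seconds(settings):
    """Approximate seconds until an unresponsive peer is considered dead."""
    return (
        settings["tcp_keepalives_idle"]
        + settings["tcp_keepalives_interval"] * settings["tcp_keepalives_count"]
    )


# Values from the dump above: idle=300, interval=30, count=2.
reported = {
    "tcp_keepalives_count": 2,
    "tcp_keepalives_idle": 300,
    "tcp_keepalives_interval": 30,
}
print(dead_peer_detection_seconds(reported))  # 300 + 30 * 2 = 360
```

These are server-side settings, so the estimate only covers how long the database takes to notice an unresponsive client.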
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)