jonathanleek opened a new issue, #50514:
URL: https://github.com/apache/airflow/issues/50514
### Apache Airflow Provider(s)
snowflake
### Versions of Apache Airflow Providers
We’ve identified an issue with the SnowflakeSqlApiHook used by the
SnowflakeSqlApiOperator in the Apache Airflow Snowflake provider. In certain
cases, the initial request that submits a query to Snowflake’s SQL API
succeeds, but polling the status endpoint then fails with a RemoteDisconnected
error because the remote Snowflake endpoint forcibly closes the connection.
The likely cause is reuse of a stale connection or another transient
TCP-level issue.
Summary of Issue
• The initial POST to submit the query succeeds.
• The subsequent GET to the statementStatusUrl fails with:
  `RemoteDisconnected('Remote end closed connection without response')`
  The failure is raised at the application level and results in the task being
  marked as failed.
• Retrying at the task/operator level is not viable when:
  • The task is considered successful from Airflow’s point of view
    (query submitted).
  • The query is not idempotent (e.g., it modifies data), so
    re-running it would cause issues.
Snowflake confirmed this is a client-side (Airflow hook) implementation gap:
• The polling request should retry on connection-level failures
(like RemoteDisconnected), possibly with exponential backoff.
• This would avoid needing to retry the entire query from the
start.
• The current implementation makes a single attempt to poll the
status endpoint, which is fragile in cloud network environments.
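As a rough illustration of the retry behaviour suggested above, here is a minimal sketch in Python. It is not the provider's implementation: `poll_with_retry`, `fetch_status`, and the retry parameters are hypothetical names chosen for this example, and the set of exceptions treated as transient is an assumption. Only the status poll is retried, so the query itself is never resubmitted.

```python
import random
import time
from http.client import RemoteDisconnected

# Connection-level errors treated as transient (an assumption for this sketch).
TRANSIENT_ERRORS = (RemoteDisconnected, ConnectionResetError, TimeoutError)


def poll_with_retry(fetch_status, max_attempts=5, base_delay=1.0,
                    max_delay=30.0, sleep=time.sleep):
    """Call fetch_status(), retrying transient connection errors with backoff.

    fetch_status is a hypothetical zero-argument callable that performs the
    GET against the statement status URL and returns the parsed response.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch_status()
        except TRANSIENT_ERRORS:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            # Exponential backoff: base_delay * 2^(attempt - 1), capped at
            # max_delay, plus up to 10% jitter to avoid synchronized retries.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 10))
```

If the status request goes through the `requests` library, the same failure surfaces wrapped in `requests.exceptions.ConnectionError`, so that exception would need to be added to the tuple of transient errors.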
Why This Matters
• Ensures robustness of asynchronous query execution using the
SQL API.
• Avoids failed tasks due to transient connection pool or network
issues.
• Aligns with best practices in client design for polling APIs
over HTTP.
### Apache Airflow version
2.9.3
### Operating System
Debian
### Deployment
Astronomer
### Deployment details
_No response_
### What happened
The query is successfully sent to Snowflake and executed, as confirmed on the
Snowflake side, but the task is marked as failed because a TCP error causes
the connection to be dropped during status polling.
### What you think should happen instead
_No response_
### How to reproduce
1. Submit a long-running query.
2. Kill the TCP connection during polling.
3. The logs will show something like:
`RemoteDisconnected('Remote end closed connection without response')
...
raise ValueError({"status": "error", "message": str(e)})`
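For anyone who wants to see the underlying exception without a Snowflake account, the following is a small local simulation using only the Python standard library. It is a sketch under stated assumptions: the throwaway server, the `/status` path, and the function names are made up for this example and do not involve the Snowflake hook at all. The server reads the request and then closes the socket without replying, which is the condition that produces `RemoteDisconnected` on the client.

```python
import http.client
import socket
import threading


def serve_one_abrupt_close(server_sock):
    """Accept one connection, read the request, then close without replying."""
    conn, _ = server_sock.accept()
    conn.recv(65536)  # consume the request so the close is a clean FIN
    conn.close()


def reproduce_remote_disconnected():
    """Return the RemoteDisconnected raised by the client, or None."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    worker = threading.Thread(
        target=serve_one_abrupt_close, args=(server_sock,), daemon=True
    )
    worker.start()

    client = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        client.request("GET", "/status")  # placeholder path for this sketch
        client.getresponse()  # the server never answers, so this raises
    except http.client.RemoteDisconnected as exc:
        return exc
    finally:
        client.close()
        worker.join(timeout=5)
        server_sock.close()
    return None


if __name__ == "__main__":
    print(repr(reproduce_remote_disconnected()))
```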
### Anything else
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)