jonathanleek opened a new issue, #50514:
URL: https://github.com/apache/airflow/issues/50514

   ### Apache Airflow Provider(s)
   
   snowflake
   
   ### Versions of Apache Airflow Providers
   
   We’ve identified an issue with the SnowflakeSqlApiHook used by the SnowflakeSqlApiOperator in the Apache Airflow Snowflake provider. In certain cases, the initial request that submits a query to Snowflake’s SQL API succeeds, but polling the status endpoint then fails with a RemoteDisconnected error because the remote Snowflake endpoint forcibly closes the connection. This is likely due to reuse of a stale connection or another transient TCP-level issue.
   
   Summary of Issue
   - The initial POST to submit the query succeeds.
   - The subsequent GET to the statementStatusUrl fails with `RemoteDisconnected('Remote end closed connection without response')`. The failure is raised at the application level and results in the task being marked as failed.
   - Retrying at the task/operator level is not viable when:
     - The task is considered successful from Airflow’s point of view (query submitted).
     - The query is not idempotent (e.g., it modifies data), so re-running it would cause issues.
   
   Snowflake confirmed this is a client-side (Airflow hook) implementation gap:
   - The polling request should retry on connection-level failures (like RemoteDisconnected), possibly with exponential backoff.
   - This would avoid needing to retry the entire query from the start.
   - The current implementation makes a single attempt to poll the status endpoint, which is fragile in cloud network environments.
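
A minimal sketch of the retry behaviour suggested above, for illustration only. It assumes the status check can be passed in as a callable; `poll_with_backoff`, `fetch_status`, `RETRYABLE_ERRORS`, and the default limits are hypothetical names and values, not part of the provider. A `requests`-based hook would likely also need to catch `requests.exceptions.ConnectionError`, which wraps the underlying `RemoteDisconnected`.

```python
import random
import time
from http.client import RemoteDisconnected
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumption: these are the errors worth treating as transient.
# RemoteDisconnected is already a ConnectionError subclass; it is
# listed explicitly only to make the intent obvious.
RETRYABLE_ERRORS = (RemoteDisconnected, ConnectionError, TimeoutError)


def poll_with_backoff(
    fetch_status: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch_status, retrying only on connection-level failures.

    Only the status poll is retried; the query itself is never resubmitted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch_status()
        except RETRYABLE_ERRORS:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            # Exponential backoff capped at max_delay, plus some jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

Because only the GET on the status URL is repeated, a non-idempotent query is not executed a second time. The `sleep` parameter exists so the helper can be exercised in tests without real delays.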
   
   Why This Matters
   - Ensures robustness of asynchronous query execution using the SQL API.
   - Avoids failed tasks due to transient connection pool or network issues.
   - Aligns with best practices in client design for polling APIs over HTTP.
   
   ### Apache Airflow version
   
   2.9.3
   
   ### Operating System
   
   Debian
   
   ### Deployment
   
   Astronomer
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   The query is successfully sent to Snowflake and executed, as confirmed on the Snowflake side, but the task is marked as failed because a TCP-level error causes the connection to be dropped during status polling.
   
   ### What you think should happen instead
   
   _No response_
   
   ### How to reproduce
   
   1. Create a long-running query.
   2. Kill the TCP connection during polling.
   3. The logs will show something like:
      `RemoteDisconnected('Remote end closed connection without response')`
      ...
      `raise ValueError({"status": "error", "message": str(e)})`
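
If killing a real TCP connection is impractical, the failure mode can be approximated in isolation. The sketch below is a stand-in, not the hook's actual code: `fake_status_request` and `poll_once` are hypothetical names, and the wrapping into `ValueError` only mirrors the pattern visible in the log line above.

```python
from http.client import RemoteDisconnected


def fake_status_request():
    """Hypothetical stand-in for the GET on the status URL.

    Simulates the server closing the connection without a response.
    """
    raise RemoteDisconnected("Remote end closed connection without response")


def poll_once(request_status):
    """Single-attempt poll that wraps any error, as the logged traceback suggests."""
    try:
        return request_status()
    except Exception as e:
        raise ValueError({"status": "error", "message": str(e)})


try:
    poll_once(fake_status_request)
except ValueError as err:
    # With a single attempt, one dropped connection is enough to fail.
    failure = err.args[0]
    print(failure)
```

With no retry around the status request, a single simulated disconnect surfaces as the wrapped `ValueError`, even though nothing about the query itself has failed.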
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

