blcksrx opened a new pull request, #61528:
URL: https://github.com/apache/airflow/pull/61528

   ## spark_submit_hook: resolve connection master URL construction
   
   ### Root Cause
   
   When a Spark connection is created from a URI (e.g., 
`spark://spark-master:7077`), the `_parse_from_uri` method in `connection.py` 
extracts the scheme (`spark`) and stores it as `conn_type`, while 
`conn.host` contains only the hostname (`spark-master`), without the protocol prefix.
   
   However, `_resolve_connection` in `SparkSubmitHook` assumed that every connection 
was created via the UI, where `conn_type` is always `spark` and `conn.host` 
contains the full master URL including its protocol (e.g., `spark://host`, 
`k8s://https://host`).
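
To illustrate the mismatch, here is a minimal sketch that approximates the URI split with the standard library's `urlsplit`. It is not Airflow's `_parse_from_uri`; the helper name `split_connection_uri` and the returned dictionary keys are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlsplit


def split_connection_uri(uri: str) -> dict:
    """Hypothetical stand-in for the URI parsing step (not Airflow's code)."""
    parts = urlsplit(uri)
    return {
        "conn_type": parts.scheme,  # the scheme, without "://"
        "host": parts.hostname,     # bare hostname, no protocol prefix
        "port": parts.port,         # port as an integer, or None
    }


parsed = split_connection_uri("spark://spark-master:7077")

# Building the master URL from host and port alone loses the protocol,
# which matches the "Output (Before)" behaviour described in this PR.
naive_master = f"{parsed['host']}:{parsed['port']}"
print(parsed)
print(naive_master)
```

The point of the sketch is only that the scheme and the hostname end up in separate fields, so any code that expects the host to carry the protocol sees an incomplete master URL.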
   
   ### Impact
   
   When a connection was defined through an environment variable or in URI 
format, the resolved master URL was missing its protocol prefix:
   
   | Input | Output (Before) | Output (After) |
   |-------|-----------------|----------------|
   | `spark://spark-master:7077` | `spark-master:7077` | `spark://spark-master:7077` |
   | `k8s://https://k8s-host:443` | `https://k8s-host:443` | `k8s://https://k8s-host:443` |
   
   ### Fix
   
   Updated `_resolve_connection` to properly handle connections created from 
URI by using `conn_type` to reconstruct the protocol prefix when `conn.host` 
doesn't already contain `://`.
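
The rule described above can be sketched roughly as follows. This is an illustrative approximation, not the code in this PR: the function name `build_master_url` and its parameters are hypothetical, and the sketch covers only the "prepend `conn_type` when the host has no `://`" rule, so it does not handle special masters such as `yarn` or `local`, nor nested schemes parsed from a `k8s://https://...` URI.

```python
from typing import Optional


def build_master_url(conn_type: str, host: str, port: Optional[int] = None) -> str:
    """Hypothetical helper: rebuild a master URL from connection fields."""
    if "://" in host:
        # UI-style connection: the host already carries the full protocol.
        master = host
    else:
        # URI-style connection: the scheme was stored in conn_type instead.
        master = f"{conn_type}://{host}"
    if port:
        master = f"{master}:{port}"
    return master


# Connection created from a URI: host is a bare hostname.
from_uri = build_master_url("spark", "spark-master", 7077)

# Connection created via the UI: host already includes the protocol.
from_ui = build_master_url("spark", "k8s://https://k8s-host", 443)
```

Checking for `://` in the host keeps UI-created connections unchanged, so only URI-style connections get the prefix added.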
   
   related: #56453


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
