ArthurKretzer opened a new issue, #55433:
URL: https://github.com/apache/airflow/issues/55433

   ### Apache Airflow Provider(s)
   
   weaviate
   
   ### Versions of Apache Airflow Providers
   
   apache-airflow-providers-weaviate: 3.2.2
   
   ### Apache Airflow version
   
   3.0.5
   
   ### Operating System
   
   Ubuntu 22.05 (WSL2)
   
   ### Deployment
   
   Astronomer
   
   ### Deployment details
   
   Using Astronomer dev deployment locally.
   
   `astro dev start`
   
   ### What happened
   
   
   The WeaviateHook always uses port 80 for non-HTTPS connections, ignoring 
the port attribute specified in the Airflow Connection settings.
   
   When trying to connect to a Weaviate instance running on a custom port 
(e.g., 8080) without HTTPS, the hook attempts to connect to port 80, resulting 
in a Connection refused error.
   
   The error log clearly shows the wrong port being used:
   
   ```
   ERROR - Error testing Weaviate connection: Connection to Weaviate failed. 
Details: Error: [Errno 111] Connection refused. 
   Is Weaviate running and reachable at http://weaviate:80?
   ```
   
   
   ### What you think should happen instead
   
   The WeaviateHook should respect the port field defined in the Airflow 
Connection.
   
   Given a connection configured with host: "weaviate" and port: 8080, the hook 
should attempt to connect to http://weaviate:8080.
   
   
   ### How to reproduce
   
   Create a Weaviate connection in Airflow with the following configuration 
(specifically, a non-standard port and use_https set to false):
   
   ```json
   {
         "connection_id": "weaviate_default",
         "conn_type": "weaviate",
         "description": "Weaviate vector DB",
         "host": "weaviate",
         "login": null,
         "schema": null,
         "port": 8080,
         "password": null,
         "extra": "{\"use_https\": false, \"grpc_host\": \"weaviate:8080\", \"grpc_port\": 50051, \"grpc_secure\": false}"
   }
   ```
   
   Use the WeaviateHook to test the connection from within an Airflow 
environment (e.g., airflow kerberos or a simple Python script):
   
   ```python
   from airflow.providers.weaviate.hooks.weaviate import WeaviateHook
   
   hook = WeaviateHook(conn_id="weaviate_default")
   hook.test_connection() 
   ```
   
   Observe the logs: they show a connection error and a message indicating 
that the hook is trying to connect to port 80 instead of 8080.
   
   ### Anything else
   
   The root cause of this bug is located in the get_conn method of the 
WeaviateHook 
(`airflow/providers/weaviate/src/airflow/providers/weaviate/hooks/weaviate.py`).
   
   The specific line is line 144:
   `http_port=conn.port or 443 if http_secure else 80,`
   
   Because `or` binds more tightly than a conditional expression in Python, 
this line is evaluated as
   `(conn.port or 443) if http_secure else 80`.
   When http_secure is False, the expression always resolves to 80, completely 
ignoring the value of conn.port.
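
The grouping can be checked in isolation, without Airflow or Weaviate. The sketch below uses plain stand-in variables (`port` and `http_secure` are illustrative names, not the hook's own code) with the values from this report:

```python
# Stand-in values mirroring the report: a custom port and use_https disabled.
port = 8080
http_secure = False

# Expression as written on the reported line.
buggy = port or 443 if http_secure else 80

# Explicit grouping that matches how Python parses the expression above.
grouped = (port or 443) if http_secure else 80

# Intended grouping: the conditional only supplies the fallback port.
intended = port or (443 if http_secure else 80)

print(buggy, grouped, intended)  # prints: 80 80 8080
```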
   
   **Suggested Fix:**
   
   The logic should be corrected by adding parentheses so that the conditional 
expression only supplies the fallback used when conn.port is unset:
   
   **From**
   `http_port=conn.port or 443 if http_secure else 80,`
   
   **To**
   `http_port=conn.port or (443 if http_secure else 80),`
   
   Or, more explicitly:
   
   `http_port = conn.port if conn.port else (443 if http_secure else 80)`
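
As a rough sketch of how the corrected logic could be exercised in a unit test, the port selection can be pulled into a small function. `resolve_http_port` is a hypothetical helper written for illustration only; it is not part of the provider:

```python
from typing import Optional


def resolve_http_port(port: Optional[int], http_secure: bool) -> int:
    """Hypothetical helper: prefer the connection's port, else a scheme default."""
    if port:
        # A port set on the Airflow Connection always wins.
        return port
    # No port configured: fall back to the default for the scheme.
    return 443 if http_secure else 80


# The four combinations of "port set / unset" and "HTTPS on / off".
cases = {
    (8080, False): resolve_http_port(8080, False),
    (8080, True): resolve_http_port(8080, True),
    (None, False): resolve_http_port(None, False),
    (None, True): resolve_http_port(None, True),
}
print(cases)
```

Note that a port of 0 is treated as unset here, which is also how the `conn.port or ...` form behaves.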
   
   Interestingly, the logic for grpc_port in the same method is correct, as it 
uses extras.pop("grpc_port", ...) which correctly prioritizes the user-defined 
value from the extra field. The http_port logic fails to do the same for the 
standard port field.
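
For comparison, the `dict.pop(key, default)` pattern mentioned above can be sketched on a plain dictionary. The `extras` contents come from the connection in this report; the fallback values of 443 and 80 are an assumption made for illustration and may not match the provider's actual defaults:

```python
# Extras as given in the reported connection.
extras = {"use_https": False, "grpc_port": 50051, "grpc_secure": False}

grpc_secure = extras.pop("grpc_secure", False)
# pop() returns the user-supplied value when the key is present...
grpc_port = extras.pop("grpc_port", 443 if grpc_secure else 80)

# ...and only falls back to the default when the key is missing.
# The fallback values here are illustrative assumptions.
fallback_port = {}.pop("grpc_port", 443 if grpc_secure else 80)

print(grpc_port, fallback_port)  # prints: 50051 80
```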
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

