uranusjr commented on code in PR #31833:
URL: https://github.com/apache/airflow/pull/31833#discussion_r1280397351


##########
airflow/www/extensions/init_wsgi_middlewares.py:
##########
@@ -37,8 +38,11 @@ def _root_app(env: WSGIEnvironment, resp: StartResponse) -> 
Iterable[bytes]:
 
 def init_wsgi_middleware(flask_app: Flask) -> None:
     """Handle X-Forwarded-* headers and base_url support."""
+    webserver_base_url = conf.get_mandatory_value("webserver", "BASE_URL", 
fallback="")
+    if webserver_base_url.endswith("/"):
+        raise AirflowConfigException("webserver.base_url conf cannot have a 
trailing slash.")

Review Comment:
   This is due to how web servers rewrite URLs when you deploy a webapp under a 
prefix. Say you deploy under `https://mydomain/airflow`. When you visit 
`https://mydomain/airflow/home` in the browser (as an example; any client 
applies), the proxy server (e.g. NGINX) would strip the prefix, and forward the 
rest of the URL to the app, so in Python code (e.g. Airflow) we can just handle 
`/home` without needing to consider what prefix the entirety of Airflow is 
deployed under.
   
   Now you may say, hey, why can’t either the proxy server, the gateway 
protocol (WSGI in Python’s case), or the web framework (Flask for Airflow’s 
case) be smarter and just strip or add the slash in between as appriproate? And 
you would be right! Some of them actually can do this (NGINX has 
`merge_slashes`, for example), but not everyone does since the prefix 
_technically_ should never have a trailing slash (the slash is _technically_ 
significant in a URL), and that extra check is not free. So it’s better to 
avoid fighting the tools and just stick to the technically correct 
configuration.



##########
airflow/www/extensions/init_wsgi_middlewares.py:
##########
@@ -37,8 +38,11 @@ def _root_app(env: WSGIEnvironment, resp: StartResponse) -> 
Iterable[bytes]:
 
 def init_wsgi_middleware(flask_app: Flask) -> None:
     """Handle X-Forwarded-* headers and base_url support."""
+    webserver_base_url = conf.get_mandatory_value("webserver", "BASE_URL", 
fallback="")
+    if webserver_base_url.endswith("/"):
+        raise AirflowConfigException("webserver.base_url conf cannot have a 
trailing slash.")

Review Comment:
   This is due to how web servers rewrite URLs when you deploy a webapp under a 
prefix. Say you deploy under `https://mydomain/airflow`. When you visit 
`https://mydomain/airflow/home` in the browser (as an example; any client 
applies), the proxy server (e.g. NGINX) would strip the prefix, and forward the 
rest of the URL to the app, so in Python code (e.g. Airflow) we can just handle 
`/home` without needing to consider what prefix the entirety of Airflow is 
deployed under.
   
   Now you may say, hey, why can’t either the proxy server, the gateway 
protocol (WSGI in Python’s case), or the web framework (Flask for Airflow’s 
case) be smarter and just strip or add the slash in between as appriproate? And 
you would be right! Some of them actually can do this (NGINX has 
`merge_slashes`, for example), but not everyone does since the prefix 
_technically_ should never have a trailing slash (the slash is _technically_ 
significant in a URL), and that extra check is not free. So it’s better to 
avoid fighting the tools and just stick to the technically correct 
configuration.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to