uranusjr commented on code in PR #31833:
URL: https://github.com/apache/airflow/pull/31833#discussion_r1280397351
##########
airflow/www/extensions/init_wsgi_middlewares.py:
##########
@@ -37,8 +38,11 @@ def _root_app(env: WSGIEnvironment, resp: StartResponse) ->
Iterable[bytes]:
def init_wsgi_middleware(flask_app: Flask) -> None:
"""Handle X-Forwarded-* headers and base_url support."""
+ webserver_base_url = conf.get_mandatory_value("webserver", "BASE_URL",
fallback="")
+ if webserver_base_url.endswith("/"):
+ raise AirflowConfigException("webserver.base_url conf cannot have a
trailing slash.")
Review Comment:
This is due to how web servers rewrite URLs when you deploy a webapp under a
prefix. Say you deploy under `https://mydomain/airflow`. When you visit
`https://mydomain/airflow/home` in the browser (as an example; any client
applies), the proxy server (e.g. NGINX) would strip the prefix, and forward the
rest of the URL to the app, so in Python code (e.g. Airflow) we can just handle
`/home` without needing to consider what prefix the entirety of Airflow is
deployed under.
Now you may say, hey, why can’t either the proxy server, the gateway
protocol (WSGI in Python’s case), or the web framework (Flask for Airflow’s
case) be smarter and just strip or add the slash in between as appriproate? And
you would be right! Some of them actually can do this (NGINX has
`merge_slashes`, for example), but not everyone does since the prefix
_technically_ should never have a trailing slash (the slash is _technically_
significant in a URL), and that extra check is not free. So it’s better to
avoid fighting the tools and just stick to the technically correct
configuration.
##########
airflow/www/extensions/init_wsgi_middlewares.py:
##########
@@ -37,8 +38,11 @@ def _root_app(env: WSGIEnvironment, resp: StartResponse) ->
Iterable[bytes]:
def init_wsgi_middleware(flask_app: Flask) -> None:
"""Handle X-Forwarded-* headers and base_url support."""
+ webserver_base_url = conf.get_mandatory_value("webserver", "BASE_URL",
fallback="")
+ if webserver_base_url.endswith("/"):
+ raise AirflowConfigException("webserver.base_url conf cannot have a
trailing slash.")
Review Comment:
This is due to how web servers rewrite URLs when you deploy a webapp under a
prefix. Say you deploy under `https://mydomain/airflow`. When you visit
`https://mydomain/airflow/home` in the browser (as an example; any client
applies), the proxy server (e.g. NGINX) would strip the prefix, and forward the
rest of the URL to the app, so in Python code (e.g. Airflow) we can just handle
`/home` without needing to consider what prefix the entirety of Airflow is
deployed under.
Now you may say, hey, why can’t either the proxy server, the gateway
protocol (WSGI in Python’s case), or the web framework (Flask for Airflow’s
case) be smarter and just strip or add the slash in between as appriproate? And
you would be right! Some of them actually can do this (NGINX has
`merge_slashes`, for example), but not everyone does since the prefix
_technically_ should never have a trailing slash (the slash is _technically_
significant in a URL), and that extra check is not free. So it’s better to
avoid fighting the tools and just stick to the technically correct
configuration.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]