gemini-code-assist[bot] commented on code in PR #35849:
URL: https://github.com/apache/beam/pull/35849#discussion_r2268400604


##########
sdks/python/apache_beam/yaml/yaml_provider.py:
##########
@@ -1531,12 +1531,15 @@ def _join_url_or_filepath(base, path):
     return path
   base_scheme = urllib.parse.urlparse(base, '').scheme
   path_scheme = urllib.parse.urlparse(path, base_scheme).scheme
-  if path_scheme != base_scheme:
+  if path_scheme != base_scheme or path.startswith(path_scheme + "://"):
+    # path has its own scheme or path is an absolute path
     return path
-  elif base_scheme and base_scheme in urllib.parse.uses_relative:
-    return urllib.parse.urljoin(base, path)
   else:
-    return FileSystems.join(FileSystems.split(base)[0], path)
+    # path is a relative path
+    if base_scheme and base_scheme in urllib.parse.uses_relative:
+      return urllib.parse.urljoin(base, path)
+    else:
+      return FileSystems.join(FileSystems.split(base)[0], path)

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   The current logic correctly fixes the issue with absolute paths. However, 
the logic for detecting an absolute URL is a bit complex and could be 
simplified for better readability and robustness.
   
   A more direct way to check if a path is an absolute URL is to see if it has 
a scheme, using `urllib.parse.urlparse(path).scheme`. This avoids the need to 
compare with `base_scheme` and the `startswith` check, which can be brittle.
   
   Consider simplifying the logic as follows. This handles absolute URLs, 
absolute file paths (via `FileSystems.join`), and relative paths correctly and 
more clearly.
   
   ```python
     if urllib.parse.urlparse(path).scheme:
       return path
   
     base_scheme = urllib.parse.urlparse(base).scheme
     if base_scheme and base_scheme in urllib.parse.uses_relative:
       return urllib.parse.urljoin(base, path)
     else:
       return FileSystems.join(FileSystems.split(base)[0], path)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to