Abacn commented on code in PR #37270:
URL: https://github.com/apache/beam/pull/37270#discussion_r2676841262


##########
sdks/java/container/license_scripts/pull_licenses_java.py:
##########
@@ -58,86 +58,90 @@ def pull_from_url(file_name, url, dep, no_list, 
use_cache=False):
     if url == 'skip':
         return
 
-    # Replace file path with absolute path to manual licenses
-    if url.startswith('file://{}'):
-        url = url.format(manual_license_path)
-        logging.info('Replaced local file URL with {url} for 
{dep}'.format(url=url, dep=dep))
-
-    # Take into account opensource.org changes that cause 404 on licenses
-    if 'opensource.org' in url and url.endswith('-license.php'):
-        url = url.replace('-license.php', '')
-
-    if use_cache:
-        md5sum = hashlib.md5(url.encode()).hexdigest()
-        if md5sum not in CACHED_LICENSES:
-            pulled_file_name = cached_license_path + "/" + md5sum
-            logging.info(f"Requested license {url} not in cache. Pulling it 
into {pulled_file_name}")
+    # Split the url string by commas in case of multiple/dual licenses
+    urls_to_try = [u.strip() for u in url.split(',')]

Review Comment:
   consider parsing url in upstream caller? pull_from_url is not only used by 
pulling license but also sources and notice. It now assumes "," in url means 
splitter for all cases (might not be true)
   
   that is here: 
https://github.com/apache/beam/blob/e50241b581a61ff8821d9713eeaffe68f030e8da/sdks/java/container/license_scripts/pull_licenses_java.py#L225
   
   also there is a non-technical question that whether we need to pull all 
licenses or the one that we prefer to use. There is a docstring:
   
   
https://github.com/apache/beam/blob/e50241b581a61ff8821d9713eeaffe68f030e8da/sdks/java/container/license_scripts/pull_licenses_java.py#L186
   
   noting a "moduleLicense" field specifies the license type. In the case of 
dual license, what value does this field show?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to