uranusjr commented on code in PR #33675:
URL: https://github.com/apache/airflow/pull/33675#discussion_r1303669950


##########
scripts/in_container/run_provider_yaml_files_check.py:
##########
@@ -319,14 +319,13 @@ def check_duplicates_in_integrations_names_of_hooks_sensors_operators(yaml_files
         yaml_files.items(), ["sensors", "operators", "hooks", "triggers"]
     ):
         resource_data = provider_data.get(resource_type, [])
-        current_integrations = [r.get("integration-name", "") for r in resource_data]
-        if len(current_integrations) != len(set(current_integrations)):
-            for integration in current_integrations:
-                if current_integrations.count(integration) > 1:
-                    errors.append(
-                        f"Duplicated content of '{resource_type}/integration-name/{integration}' "
-                        f"in file: {yaml_file_path}"
-                    )
+        count_integrations = Counter(r.get("integration-name", "") for r in resource_data)
+        for integration, count in count_integrations.items():
+            if count > 1:
+                errors.append(
+                    f"Duplicated content of '{resource_type}/integration-name/{integration}' "
+                    f"in file: {yaml_file_path}"
+                )

Review Comment:
   I wonder if this should just use `["integration-name"]` instead of `.get()`. With the `""` default, a missing key would emit a nonsensical message anyway…
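
To make the trade-off concrete, here is a minimal sketch of the two lookups. The sample `resource_data` and both helper names are hypothetical and not taken from the PR; only the `Counter` expression mirrors the diff above.

```python
from collections import Counter

# Hypothetical sample data: two entries are missing "integration-name".
resource_data = [
    {"integration-name": "Amazon S3"},
    {"python-modules": ["example.module"]},
    {"python-modules": ["example.other"]},
]


def find_duplicates_with_get(resources):
    """Mirror the diff: missing keys collapse into the empty string."""
    counts = Counter(r.get("integration-name", "") for r in resources)
    return [name for name, count in counts.items() if count > 1]


def find_duplicates_with_subscript(resources):
    """Suggested alternative: a missing key raises KeyError immediately."""
    counts = Counter(r["integration-name"] for r in resources)
    return [name for name, count in counts.items() if count > 1]


# With .get(), the two incomplete entries are reported as a "duplicate"
# whose name is empty, so the error message would carry no useful name.
print(find_duplicates_with_get(resource_data))

# With subscripting, the missing key surfaces as an explicit failure.
try:
    find_duplicates_with_subscript(resource_data)
except KeyError as exc:
    print(f"missing key: {exc}")
```

Whether failing fast is acceptable here presumably depends on whether a missing `integration-name` is already rejected by an earlier validation step, which this sketch does not model.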



