gemini-code-assist[bot] commented on code in PR #35753: URL: https://github.com/apache/beam/pull/35753#discussion_r2245972899
########## sdks/python/apache_beam/yaml/yaml_ml.py: ########## @@ -32,37 +29,14 @@ from apache_beam.yaml import options from apache_beam.yaml.yaml_utils import SafeLineLoader - -def _list_submodules(package): - """ - Lists all submodules within a given package. - """ - submodules = [] - for _, module_name, _ in pkgutil.walk_packages( - package.__path__, package.__name__ + '.'): - if 'test' in module_name: - continue - submodules.append(module_name) - return submodules - - try: from apache_beam.ml.transforms import tft from apache_beam.ml.transforms.base import MLTransform # TODO(robertwb): Is this all of them? - _transform_constructors = {} + _transform_constructors = tft.__dict__ Review Comment:  Using `tft.__dict__` to populate `_transform_constructors` is risky as it includes not only the transform classes but also everything else in the `tft` module's namespace (e.g., imported modules, constants, base classes). This could lead to a `TypeError` at runtime if a YAML file specifies a `type` that corresponds to a non-callable member from the dictionary. A safer approach would be to build the dictionary of constructors from `tft.__all__`, which is the idiomatic way to define a module's public API. ```python _transform_constructors = {name: getattr(tft, name) for name in tft.__all__ if hasattr(tft, name)} ``` ########## sdks/python/apache_beam/ml/transforms/base.py: ########## @@ -216,31 +212,21 @@ # Get all columns for this item for col in columns: - if isinstance(item, dict): - result.append(item[col]) + result.append(item[col]) return result def _dict_output_fn( columns: Sequence[str], - batch: Sequence[Union[Dict[str, Any], beam.Row]], - embeddings: Sequence[Any]) -> list[Union[dict[str, Any], beam.Row]]: + batch: Sequence[Dict[str, Any]], + embeddings: Sequence[Any]) -> List[Dict[str, Any]]: Review Comment:  Consider adding a type hint for the return value of `_dict_output_fn` for better code clarity and maintainability. ```python def _dict_output_fn( columns: Sequence[str], batch: Sequence[Dict[str, Any]], embeddings: Sequence[Any]) -> List[Dict[str, Any]]: ``` ########## sdks/python/apache_beam/ml/transforms/base.py: ########## @@ -183,13 +183,9 @@ def append_transform(self, transform: BaseOperation): """ -def _dict_input_fn( - columns: Sequence[str], batch: Sequence[Union[Dict[str, Any], - beam.Row]]) -> List[str]: +def _dict_input_fn(columns: Sequence[str], + batch: Sequence[Dict[str, Any]]) -> List[str]: Review Comment:  Consider adding a type hint for the return value of `_dict_input_fn` for better code clarity and maintainability. ```python def _dict_input_fn(columns: Sequence[str], batch: Sequence[Dict[str, Any]]) -> List[str]: ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@beam.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org