Re: [PR] Support inferring schemas from Python dataclasses [beam]

via GitHub Mon, 09 Mar 2026 10:39:22 -0700


gemini-code-assist[bot] commented on code in PR #37728:
URL: https://github.com/apache/beam/pull/37728#discussion_r2906963168



##########
sdks/python/apache_beam/typehints/row_type.py:
##########
@@ -127,6 +125,29 @@ def from_user_type(
           field_options=field_options,
           field_descriptions=field_descriptions)
 
+    if match_is_dataclass(user_type):
+      fields = [(field.name, field.type)
+                for field in dataclasses.fields(user_type)]
+
+      field_descriptions = getattr(user_type, '_field_descriptions', None)
+
+      if _user_type_is_generated(user_type):
+        return RowTypeConstraint.from_fields(
+            fields,
+            schema_id=getattr(user_type, _BEAM_SCHEMA_ID),
+            schema_options=schema_options,
+            field_options=field_options,
+            field_descriptions=field_descriptions)
+
+      # TODO(https://github.com/apache/beam/issues/22125): Add user API for
+      # specifying schema/field options
+      return RowTypeConstraint(
+          fields=fields,
+          user_type=user_type,
+          schema_options=schema_options,
+          field_options=field_options,
+          field_descriptions=field_descriptions)

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   This logic for handling dataclasses is almost identical to the preceding
   logic for `NamedTuple`. The primary difference lies in how the `fields`
   list is populated. This duplication could make future maintenance more
   difficult.
   
   To improve maintainability, consider refactoring this method to
   consolidate the common logic. You could determine the `fields` based on
   whether the `user_type` is a `NamedTuple` or a `dataclass`, and then have
   a single, shared block of code to construct and return the
   `RowTypeConstraint`. This would make the function more DRY (Don't Repeat
   Yourself).
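
   A minimal, standalone sketch of the consolidation suggested above. It does not import Beam: `fields_of`, `from_user_type_sketch`, and the `_hypothetical_schema_id` attribute are illustrative names that do not come from the PR, and the returned dicts are stand-ins for `RowTypeConstraint`. It also assumes that a "generated" type can be recognized by the presence of a schema id attribute, which may differ from what `_user_type_is_generated` actually checks.

```python
import dataclasses
import typing

# Hypothetical attribute name; stands in for whatever _BEAM_SCHEMA_ID holds.
SCHEMA_ID_ATTR = '_hypothetical_schema_id'


def fields_of(user_type):
  """Return (name, type) pairs for a dataclass or NamedTuple class, else None."""
  if not isinstance(user_type, type):
    return None
  if dataclasses.is_dataclass(user_type):
    return [(f.name, f.type) for f in dataclasses.fields(user_type)]
  if issubclass(user_type, tuple) and hasattr(user_type, '_fields'):
    hints = typing.get_type_hints(user_type)
    return [(name, hints.get(name, typing.Any)) for name in user_type._fields]
  return None


def from_user_type_sketch(user_type):
  """Only field extraction branches on the kind of type; the rest is shared."""
  fields = fields_of(user_type)
  if fields is None:
    return None

  # Shared by both branches, mirroring the keyword arguments in the diff.
  common = {
      'fields': fields,
      'field_descriptions': getattr(user_type, '_field_descriptions', None),
  }

  schema_id = getattr(user_type, SCHEMA_ID_ATTR, None)
  if schema_id is not None:
    # Stand-in for RowTypeConstraint.from_fields(..., schema_id=...).
    return {'kind': 'generated', 'schema_id': schema_id, **common}
  # Stand-in for RowTypeConstraint(..., user_type=user_type).
  return {'kind': 'user', 'user_type': user_type, **common}


@dataclasses.dataclass
class Point:
  x: int
  y: float


class Pair(typing.NamedTuple):
  a: str
  b: int
```

   With this shape, supporting another record-like type would only mean adding a branch to `fields_of`; the construction block stays untouched.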



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
