damccorm commented on code in PR #36251:
URL: https://github.com/apache/beam/pull/36251#discussion_r2382912644
##########
sdks/python/apache_beam/pipeline.py:
##########
@@ -354,7 +354,7 @@ def _replace_if_needed(self, original_transform_node):
if replacement_transform is original_transform_node.transform:
return
replacement_transform.side_inputs = tuple(
- original_transform_node.transform.side_inputs)
+ getattr(original_transform_node.transform, 'side_inputs', ()))
Review Comment:
> The nested transform misses many fields (check the rest of my PR), which
are not needed when the transform is nested. My PR can make sure any future
nested transform should work.
I agree your PR works. But it is quite messy - for example, we check for the
existence of a side_inputs property 3 times when the object is always a
`PTransform` object. It seems much cleaner to just guarantee that this property
will always exist on `PTransform` objects. This also means that if we use these
properties elsewhere (now or in the future), we don't need to do more of these
kinds of checks.
It seem reasonable to me that `PTransform` should have these fields in all
cases. An alternative would be `PTransform` providing some functions to get
these properties if they exist.
##########
sdks/python/apache_beam/pipeline.py:
##########
@@ -354,7 +354,7 @@ def _replace_if_needed(self, original_transform_node):
if replacement_transform is original_transform_node.transform:
return
replacement_transform.side_inputs = tuple(
- original_transform_node.transform.side_inputs)
+ getattr(original_transform_node.transform, 'side_inputs', ()))
Review Comment:
Regardless, this is a minor code quality issue and not a correctness one. It
doesn't need to block the PR if you disagree.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]