chamikaramj commented on a change in pull request #11210:
URL: https://github.com/apache/beam/pull/11210#discussion_r421958329
##########
File path: sdks/python/apache_beam/io/gcp/experimental/spannerio_test.py
##########
@@ -499,6 +499,7 @@ def test_batch_byte_size(
     # and each bach should contains 25 mutations.
     res = (
         p | beam.Create(mutation_group)
+        | 'combine to list' >> beam.combiners.ToList()
Review comment:
Why do we need to combine the elements into a list to run the test?
##########
File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
##########
@@ -1008,31 +1007,30 @@ def _reset_count(self):
     self._cells = 0
   def process(self, element):
-    mg_info = element.info
+    for elem in element:
Review comment:
Can you clarify? Why would Dataflow need the elements to be combined
into a list? All runners should be able to operate on a PCollection of
mutation groups.
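To illustrate the point, here is a minimal sketch in plain Python (not the actual spannerio code; `BatchBySize`, `finish`, and `batch_all` are hypothetical names, and elements are assumed to be dicts with a `byte_size` key). It shows size-based batching where `process` receives one element at a time, the way a runner invokes a DoFn, so no upstream `ToList()` is required:

```python
# Hypothetical sketch: plain Python standing in for a batching DoFn.
class BatchBySize:
  """Batches elements by total byte size, one element per process() call."""

  def __init__(self, max_batch_bytes):
    self._max_batch_bytes = max_batch_bytes
    self._batch = []
    self._size_in_bytes = 0

  def process(self, element):
    # Called once per element; returns a completed batch or None.
    flushed = None
    byte_size = element['byte_size']
    if self._batch and self._size_in_bytes + byte_size > self._max_batch_bytes:
      # Adding this element would exceed the limit, so emit the current batch.
      flushed = self._batch
      self._batch = []
      self._size_in_bytes = 0
    self._batch.append(element)
    self._size_in_bytes += byte_size
    return flushed

  def finish(self):
    # Emit whatever is left over (roughly the role of finish_bundle).
    remaining, self._batch = self._batch, []
    self._size_in_bytes = 0
    return remaining


def batch_all(elements, max_batch_bytes):
  """Feeds elements one by one, as a runner would, and collects the batches."""
  batcher = BatchBySize(max_batch_bytes)
  batches = []
  for element in elements:
    flushed = batcher.process(element)
    if flushed:
      batches.append(flushed)
  last = batcher.finish()
  if last:
    batches.append(last)
  return batches
```

In a real pipeline the runner splits the PCollection into bundles, so the batches would be formed per bundle rather than over the whole input; the sketch ignores that detail.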
##########
File path: sdks/python/apache_beam/io/gcp/experimental/spannerio.py
##########
@@ -1008,31 +1007,30 @@ def _reset_count(self):
     self._cells = 0
   def process(self, element):
-    mg_info = element.info
+    for elem in element:
+      mg_info = elem.info
+      if mg_info['byte_size'] + self._size_in_bytes > \
Review comment:
This seems like a change to the implementation that is not part of what
this PR is meant to address. It should probably be a separate PR with a JIRA.