Re: [PR] [Integration] Expose length-aware batching in all ModelHandler subclasses [beam]

via GitHub Tue, 24 Mar 2026 20:52:48 -0700


gemini-code-assist[bot] commented on code in PR #37945:
URL: https://github.com/apache/beam/pull/37945#discussion_r2985574367



##########
sdks/python/apache_beam/ml/inference/base_test.py:
##########
@@ -2319,6 +2319,234 @@ def test_batching_kwargs_none_values_omitted(self):
     self.assertEqual(kwargs['min_batch_size'], 5)
 
 
+class PaddingReportingStringModelHandler(base.ModelHandler[str, str,
+                                                           FakeModel]):
+  """Reports each element with the max length of the batch it ran in."""
+  def load_model(self):
+    return FakeModel()
+
+  def run_inference(self, batch, model, inference_args=None):
+    max_len = max(len(s) for s in batch)
+    return [f'{s}:{max_len}' for s in batch]
+
+
+class RunInferenceLengthAwareBatchingTest(unittest.TestCase):
+  """End-to-end tests for PR2 length-aware batching in RunInference."""
+  def test_run_inference_with_length_aware_batch_elements(self):
+    handler = PaddingReportingStringModelHandler(
+        min_batch_size=2,
+        max_batch_size=2,
+        max_batch_duration_secs=60,
+        batch_length_fn=len,
+        batch_bucket_boundaries=[5])
+
+    examples = ['a', 'cccccc', 'bb', 'ddddddd']
+    with TestPipeline('FnApiRunner') as p:
+      results = (
+          p
+          | beam.Create(examples, reshuffle=False)
+          | base.RunInference(handler))
+      assert_that(results, equal_to(['a:2', 'bb:2', 'cccccc:7', 'ddddddd:7']))
+
+
+class HandlerBucketingKwargsForwardingTest(unittest.TestCase):

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   The test methods in this class are highly repetitive. To improve 
maintainability and reduce duplication, consider parameterizing them, either 
with the `parameterized` library or with `unittest.TestCase.subTest` in a loop 
over a list of handler configurations. Each configuration could specify the 
handler class, its specific `__init__` arguments, and any necessary mocks or 
setup.
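
A minimal sketch of the `subTest` variant follows. It is an illustration only: `FakeHandlerA`, `FakeHandlerB`, `HANDLER_CONFIGS`, and their constructor arguments are hypothetical stand-ins, not the handlers touched by this PR. The `batch_elements_kwargs()` method on the fakes is modeled on the method of the same name on `base.ModelHandler`, but how the real subclasses store and forward these arguments is an assumption here.

```python
import unittest


class FakeHandlerA:
  """Hypothetical stand-in for one ModelHandler subclass."""
  def __init__(self, model_uri, **batching_kwargs):
    self.model_uri = model_uri
    self._batching_kwargs = batching_kwargs

  def batch_elements_kwargs(self):
    return self._batching_kwargs


class FakeHandlerB:
  """Hypothetical stand-in with a different __init__ signature."""
  def __init__(self, model_name, device='cpu', **batching_kwargs):
    self.model_name = model_name
    self.device = device
    self._batching_kwargs = batching_kwargs

  def batch_elements_kwargs(self):
    return self._batching_kwargs


# One entry per handler: (class, handler-specific __init__ arguments).
# A real list would name the actual handler classes and add any mocks.
HANDLER_CONFIGS = [
    (FakeHandlerA, {'model_uri': 'unused'}),
    (FakeHandlerB, {'model_name': 'unused', 'device': 'cpu'}),
]


class HandlerBucketingKwargsForwardingSketch(unittest.TestCase):
  def test_bucketing_kwargs_forwarded(self):
    for handler_cls, init_kwargs in HANDLER_CONFIGS:
      # subTest reports each handler separately, so one failing
      # configuration does not hide the results of the others.
      with self.subTest(handler=handler_cls.__name__):
        handler = handler_cls(
            batch_length_fn=len,
            batch_bucket_boundaries=[5],
            **init_kwargs)
        kwargs = handler.batch_elements_kwargs()
        self.assertIs(kwargs['batch_length_fn'], len)
        self.assertEqual(kwargs['batch_bucket_boundaries'], [5])
```

With this shape, covering another handler means adding one entry to the configuration list rather than writing another test method. Handlers that need patched imports or other setup would need an extra field per entry, which is not shown here.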



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
