Re: [PR] Fix flaky MatrixPowerTest.test_basics by reading all generated shards [beam]

via GitHub Thu, 21 May 2026 06:32:50 -0700


gemini-code-assist[bot] commented on code in PR #38585:
URL: https://github.com/apache/beam/pull/38585#discussion_r3281537493



##########
sdks/python/apache_beam/examples/matrix_power_test.py:
##########
@@ -51,9 +52,11 @@ def test_basics(self):
         '--input_matrix=%s --input_vector=%s --exponent=%d --output=%s.result' %
         (matrix_path, vector_path, self.EXPONENT, vector_path)).split())
     # Parse result file and compare.
-    with open(vector_path + '.result-00000-of-00001') as result_file:
-      results = result_file.read()
-      self.assertEqual(sorted(self.EXPECTED_OUTPUT), sorted(results))
+    results = []
+    for path in glob.glob(vector_path + '.result*'):
+      with open(path) as result_file:
+        results.append(result_file.read())
+    self.assertEqual(sorted(self.EXPECTED_OUTPUT), sorted(''.join(results)))

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   Using `glob.glob` correctly addresses the flakiness caused by multiple 
shards, but the current implementation silently skips the loop if no files are 
found (e.g., if the pipeline fails to produce any output). The test would then 
fail with a less descriptive assertion error that compares the expected output 
to an empty result. An explicit assertion that shards were found makes the test 
easier to debug by clearly indicating when no output was generated.
   
   ```suggestion
       shard_paths = glob.glob(vector_path + '.result*')
       self.assertTrue(
           shard_paths,
           'No output shards found matching prefix: %s.result' % vector_path)
       results = []
       for path in shard_paths:
         with open(path) as result_file:
           results.append(result_file.read())
       self.assertEqual(sorted(self.EXPECTED_OUTPUT), sorted(''.join(results)))
   ```
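
The same check can also be sketched as a standalone helper, which may make the failure mode easier to see outside the test class. This is only an illustration under stated assumptions: `read_all_shards` is a hypothetical name and not part of Beam or this PR, and the shard file names in the demo are made up to mimic the `-SSSSS-of-NNNNN` suffix seen in the diff.

```python
import glob
import os
import tempfile


def read_all_shards(prefix):
  """Return the concatenated text of every file whose path starts with prefix.

  Hypothetical helper for illustration; raises AssertionError when no shard
  matches, so a missing output is reported directly instead of surfacing as a
  comparison against an empty result.
  """
  # glob.escape guards against glob metacharacters in the prefix itself.
  shard_paths = sorted(glob.glob(glob.escape(prefix) + '*'))
  if not shard_paths:
    raise AssertionError('No output shards found matching prefix: %s' % prefix)
  contents = []
  for path in shard_paths:
    with open(path) as shard_file:
      contents.append(shard_file.read())
  return ''.join(contents)


if __name__ == '__main__':
  with tempfile.TemporaryDirectory() as tmp_dir:
    prefix = os.path.join(tmp_dir, 'vector.result')
    # Simulate a runner that wrote two shards instead of one.
    for suffix, text in (('-00000-of-00002', 'ab\n'), ('-00001-of-00002', 'cd\n')):
      with open(prefix + suffix, 'w') as shard_file:
        shard_file.write(text)
    print(repr(read_all_shards(prefix)))
```

Sorting the shard paths is a choice made for this sketch so the concatenation order is deterministic. The test in the PR does not need it, because it sorts the joined contents before comparing.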



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
