tomstepp commented on code in PR #36112:
URL: https://github.com/apache/beam/pull/36112#discussion_r2353700818


##########
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Redistribute.java:
##########
@@ -157,8 +168,16 @@ public boolean getAllowDuplicates() {
 
     @Override
     public PCollection<T> expand(PCollection<T> input) {
-      return input
-          .apply("Pair with random key", ParDo.of(new AssignShardFn<>(numBuckets)))
+      PCollection<KV<Integer, T>> sharded;
+      if (deterministicSharding) {
+        sharded =
+            input.apply(
+                "Pair with deterministic key",
+                ParDo.of(new AssignDeterministicShardFn<T>(numBuckets)));

Review Comment:
   Do you have any tips on how to do this? I reviewed the existing 
RedistributeTest, but I'm not sure how to set the offsets for elements or for 
the ProcessContext. If we can set them, then we can verify that repeated 
elements with the same offset get the same key.
   
   Or are you thinking of just verifying that setting withDeterministicSharding 
uses the new shard fn? I'm not sure how to verify that yet, but I can look into 
it some more.
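
One possible shape for the first check, as a minimal sketch only: if the key computation can be factored into a pure helper, the "same offset gets the same key" property can be asserted without constructing a ProcessContext. Everything here is an assumption made for illustration. `DeterministicShardSketch`, `shardFor`, and the modulo mapping are hypothetical names and logic, not the actual AssignDeterministicShardFn from this PR, and offsets are modeled as plain `long` values.

```java
// Hypothetical stand-in for the key computation; NOT the PR's AssignDeterministicShardFn.
class DeterministicShardSketch {

  // Maps an offset to a bucket in [0, numBuckets). The modulo mapping is an
  // assumption for illustration; the real fn may derive its key differently.
  static int shardFor(long offset, int numBuckets) {
    if (numBuckets <= 0) {
      throw new IllegalArgumentException("numBuckets must be positive");
    }
    // floorMod keeps the result non-negative even for negative inputs.
    return (int) Math.floorMod(offset, (long) numBuckets);
  }

  public static void main(String[] args) {
    int numBuckets = 4;
    // The repeated 5L models two elements that carry the same offset.
    long[] offsets = {0L, 1L, 5L, 5L, 1_000_003L, -7L};
    for (long offset : offsets) {
      int first = shardFor(offset, numBuckets);
      int second = shardFor(offset, numBuckets);
      if (first != second) {
        throw new AssertionError("offset " + offset + " mapped to two different keys");
      }
      if (first < 0 || first >= numBuckets) {
        throw new AssertionError("key " + first + " is outside [0, " + numBuckets + ")");
      }
    }
    System.out.println("all offsets mapped to stable keys");
  }
}
```

This only exercises the key computation in isolation. Whether the real shard fn can be split this way, and how a test would attach offsets to elements in a pipeline, is still the open question above.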



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@beam.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
