tomstepp commented on code in PR #36112:
URL: https://github.com/apache/beam/pull/36112#discussion_r2353700818


##########
sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Redistribute.java:
##########
@@ -157,8 +168,16 @@ public boolean getAllowDuplicates() {

   @Override
   public PCollection<T> expand(PCollection<T> input) {
-    return input
-        .apply("Pair with random key", ParDo.of(new AssignShardFn<>(numBuckets)))
+    PCollection<KV<Integer, T>> sharded;
+    if (deterministicSharding) {
+      sharded =
+          input.apply(
+              "Pair with deterministic key",
+              ParDo.of(new AssignDeterministicShardFn<T>(numBuckets)));

Review Comment:
   Do you have any tips for how to do this? I reviewed the existing RedistributeTest, but I'm not sure how to set the offsets for elements or the ProcessContext. If we can set them, we can verify that repeated elements with the same offset get the same key. Or are you thinking of just verifying that setting withDeterministicSharding wires in the new shard fn? I'm not sure how to verify that yet either, but I can look into it some more.
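   One option I could try, assuming the new shard fn derives its key purely from the element value and numBuckets rather than from a source offset (I haven't confirmed that against the PR's AssignDeterministicShardFn internals): a DoFn-level check that duplicate elements in a bundle come out with the same key. DoFnTester is deprecated but still available in sdks-java-core for a quick unit test; the test class name below is just a placeholder.

   ```java
   import static org.junit.Assert.assertEquals;

   import java.util.List;
   import org.apache.beam.sdk.transforms.DoFnTester;
   import org.apache.beam.sdk.values.KV;
   import org.junit.Test;

   public class AssignDeterministicShardFnTest {

     @Test
     public void sameElementGetsSameShardKey() throws Exception {
       // AssignDeterministicShardFn is the new DoFn from this PR; the constructor
       // argument mirrors the numBuckets usage in the diff above (assumed here).
       DoFnTester<String, KV<Integer, String>> tester =
           DoFnTester.of(new AssignDeterministicShardFn<String>(10));

       // Feed the same element twice in one bundle and check both copies land on
       // the same shard key, i.e. the sharding is deterministic per element.
       List<KV<Integer, String>> out = tester.processBundle("element-a", "element-a");

       assertEquals(out.get(0).getKey(), out.get(1).getKey());
     }
   }
   ```

   If the key actually depends on source offsets or other ProcessContext state, this wouldn't cover it, and a TestPipeline-based test (grouping keys per element value and asserting with PAssert) would probably be needed instead.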
########## sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Redistribute.java: ########## @@ -157,8 +168,16 @@ public boolean getAllowDuplicates() { @Override public PCollection<T> expand(PCollection<T> input) { - return input - .apply("Pair with random key", ParDo.of(new AssignShardFn<>(numBuckets))) + PCollection<KV<Integer, T>> sharded; + if (deterministicSharding) { + sharded = + input.apply( + "Pair with deterministic key", + ParDo.of(new AssignDeterministicShardFn<T>(numBuckets))); Review Comment: Do you have any tips for how to do this? I reviewed the existing RedistributeTest, but not sure how to set the offsets for elements or ProcessContext. If we can set it then we can verify repeated elements of the same offset get the same key. Or are you thinking of just verifying that setting withDeterministicSharding using the new shard fn? Not sure how to verify yet, but can look into that one some more. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@beam.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org