[GitHub] [beam] dmvk commented on a change in pull request #13353: [BEAM-11267] Remove unecessary reshuffle for stateful ParDo after key…
dmvk commented on a change in pull request #13353:
URL: https://github.com/apache/beam/pull/13353#discussion_r524545989

## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTranslationContext.java

@@ -84,6 +85,17 @@ public void setOutputDataStream(PValue value, DataStream set) {
 }
 }

+ void setProducer(T value, PTransform producer) {
+if (!producers.containsKey(value)) {

Review comment:
👍 makes sense

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [beam] dmvk commented on a change in pull request #13353: [BEAM-11267] Remove unecessary reshuffle for stateful ParDo after key…
dmvk commented on a change in pull request #13353:
URL: https://github.com/apache/beam/pull/13353#discussion_r524545316

## File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/WorkItemKeySelector.java

@@ -49,6 +52,6 @@ public ByteBuffer getKey(WindowedValue> value) thro
 @Override
 public TypeInformation getProducedType() {
-return new GenericTypeInfo<>(ByteBuffer.class);
+return new CoderTypeInformation<>(FlinkKeyUtils.ByteBufferCoder.of(), pipelineOptions.get());

Review comment:
Not sure if this is necessary. I wanted to ensure that the new "reinterpreted partitioning" is compatible with the one used by GBK / Combine. The idea was that if the partitioning is not compatible, it may cause state-partitioning glitches (e.g. you wouldn't have local state for a key group you need).

On second thought, Flink selects the target partition (key group) based on the "pojo hash code" (not on the binary representation), so the previous version was probably compatible enough 🤔

@mxm WDYT?
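
To illustrate the "pojo hash code" point, here is a minimal sketch, not the runner's code. `KeyGroupSketch`, `encodeKey` and `keyGroupFor` are hypothetical names, and the modulo below is a simplified stand-in: as far as I know, Flink additionally scrambles the hash before taking the modulo, which does not change the argument. The assumption being shown is only that the key group is a function of `key.hashCode()` and the max parallelism, so two `ByteBuffer` keys with equal content land in the same key group regardless of which `TypeInformation` later serializes them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class KeyGroupSketch {

  // Hypothetical stand-in for the runner's key encoding: wrap the UTF-8 bytes of a key.
  static ByteBuffer encodeKey(String key) {
    return ByteBuffer.wrap(key.getBytes(StandardCharsets.UTF_8));
  }

  // Simplified key-group assignment that depends only on key.hashCode().
  // Assumption: the real assignment also mixes the hash first; omitted here for brevity.
  static int keyGroupFor(Object key, int maxParallelism) {
    return Math.floorMod(key.hashCode(), maxParallelism);
  }

  public static void main(String[] args) {
    int maxParallelism = 128;

    // Two independently created buffers for the same logical key,
    // e.g. one produced upstream by GBK and one by the stateful ParDo.
    ByteBuffer fromGbk = encodeKey("user-42");
    ByteBuffer fromStatefulParDo = encodeKey("user-42");

    // ByteBuffer.equals/hashCode are defined over the remaining bytes,
    // so both keys are equal and map to the same key group.
    System.out.println(fromGbk.equals(fromStatefulParDo));
    System.out.println(
        keyGroupFor(fromGbk, maxParallelism) == keyGroupFor(fromStatefulParDo, maxParallelism));
  }
}
```

Under that assumption, swapping `GenericTypeInfo` for `CoderTypeInformation` changes how the key is serialized, not which key group it is routed to.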