scwhittle commented on pull request #13862: URL: https://github.com/apache/beam/pull/13862#issuecomment-785808463
I'm not exactly sure why this isn't triggering already, but I believe the issue can occur when the IdTracker writes empty maps here: https://github.com/apache/beam/blob/3bb232fb098700de408f574585dfe74bbaff7230/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateInternals.java#L767

If a read is then performed and the value is not cached, the encoded value is non-null (MapCoder encodes the size), so MapCoder decodes it to an EmptyMap: https://github.com/apache/beam/blob/7c43ab6a8df9b23caa7321fddff9a032a71908f6/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/MapCoder.java#L111

Since the code that creates the maps only checks whether null was returned, the EmptyMap may then be inserted into: https://github.com/apache/beam/blob/3bb232fb098700de408f574585dfe74bbaff7230/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillStateInternals.java#L702

I think the IdTracker should delete the values when the maps are empty instead of caching empty maps.
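To make the suspected failure mode concrete, here is a minimal, self-contained sketch. It does not use Beam or the real `WindmillStateInternals` code: the class `EmptyMapSketch`, the `store` map standing in for persisted state, and every method name are hypothetical, and the only behavior assumed is the one described in the comment, namely that decoding a size of 0 yields the immutable `Collections.emptyMap()`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class EmptyMapSketch {
  // Hypothetical stand-in for the decode path: an encoded size of 0 yields
  // the shared immutable empty map; entries are omitted for brevity.
  static Map<String, Long> decode(int encodedSize) {
    if (encodedSize == 0) {
      return Collections.emptyMap();
    }
    return new HashMap<>();
  }

  // Suspected current behavior: an empty map is still written, so the
  // stored value is non-null (only its size, 0, is modeled here).
  static void writeAlways(Map<String, Integer> store, String key, Map<String, Long> value) {
    store.put(key, value.size());
  }

  // Proposed behavior: delete the stored value when the map is empty.
  static void writeOrDelete(Map<String, Integer> store, String key, Map<String, Long> value) {
    if (value.isEmpty()) {
      store.remove(key);
    } else {
      store.put(key, value.size());
    }
  }

  // Uncached read: returns null only when nothing is stored under the key.
  static Map<String, Long> read(Map<String, Integer> store, String key) {
    Integer encodedSize = store.get(key);
    return encodedSize == null ? null : decode(encodedSize);
  }

  // Map creation that only checks for null, as described in the comment.
  static Map<String, Long> orCreate(Map<String, Long> fromState) {
    return fromState == null ? new HashMap<>() : fromState;
  }

  // Reports whether the map accepts an insert.
  static boolean canInsert(Map<String, Long> map) {
    try {
      map.put("tag", 1L);
      return true;
    } catch (UnsupportedOperationException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    Map<String, Integer> store = new HashMap<>();
    writeAlways(store, "ids", new HashMap<>());
    System.out.println(canInsert(orCreate(read(store, "ids")))); // false

    Map<String, Integer> fixedStore = new HashMap<>();
    writeOrDelete(fixedStore, "ids", new HashMap<>());
    System.out.println(canInsert(orCreate(read(fixedStore, "ids")))); // true
  }
}
```

In this sketch, deleting the value on write means the later read returns null, so the null-only check falls through to creating a fresh mutable map. Whether the real code path behaves the same way would need to be confirmed against the linked sources.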
