[ https://issues.apache.org/jira/browse/FLINK-18687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Caizhi Weng closed FLINK-18687.
-------------------------------
    Resolution: Duplicate

> ProjectionCodeGenerator#generateProjectionExpression should remove for loop optimization
> ----------------------------------------------------------------------------------------
>
>                 Key: FLINK-18687
>                 URL: https://issues.apache.org/jira/browse/FLINK-18687
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Runtime
>    Affects Versions: 1.11.0
>            Reporter: Caizhi Weng
>            Priority: Critical
>             Fix For: 1.12.0
>
>
> If too many fields of the same type are projected,
> {{ProjectionCodeGenerator#generateProjectionExpression}} currently performs
> a "for loop optimization": instead of generating code separately for each
> field, it squashes the fields into a single for loop.
>
> However, if the indices of the fields with the same type are not
> contiguous, this optimization will not write fields in ascending index
> order. This is not acceptable because {{BinaryWriter}}s expect users to
> write fields in ascending index order (that is to say, we *have to* first
> write field 0, then field 1, then...); otherwise the variable-length areas
> of two binary rows holding the same data might differ. Although we can use
> the {{getXX}} methods of {{BinaryRow}} to read the fields correctly, states
> for streaming jobs compare state keys by their binary bits, not by the
> contents of the keys. So we need to make sure that the binary bits of two
> binary rows are the same whenever the rows contain the same data.
>
> What's worse, as the current implementation of
> {{ProjectionCodeGenerator#generateProjectionExpression}} uses a Scala
> {{HashMap}}, the key order of the map might be different on different
> workers; even if the projection does not meet the condition to be
> optimized, it will still be affected by this bug.
>
> I suggest simply removing this optimization. If we still wanted to keep
> it, we would have to make sure that the fields of the same type have
> contiguous indices, which is a very strict and rare condition.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
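
To make the write-order problem described in the issue concrete, here is a minimal sketch in Java. It does not use Flink's actual {{BinaryRow}} or {{BinaryWriter}} classes; {{WriteOrderDemo}}, {{writeRow}} and {{readField}} are hypothetical names, and the row layout (one offset/length pair per field, followed by a variable-length area) is a simplified assumption made only for illustration. It shows that two rows holding the same field values can be read back identically and still differ byte for byte when the fields are written in different orders.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class WriteOrderDemo {

    // Hypothetical toy layout (NOT Flink's real format): the fixed part holds
    // an (offset, length) int pair per field; the payload bytes are appended
    // to a variable-length area in the order in which fields are written.
    static byte[] writeRow(String[] fields, int[] writeOrder) {
        int fixedSize = fields.length * 8;
        ByteBuffer fixed = ByteBuffer.allocate(fixedSize);
        ByteArrayOutputStream variable = new ByteArrayOutputStream();
        for (int pos : writeOrder) {
            byte[] data = fields[pos].getBytes(StandardCharsets.UTF_8);
            fixed.putInt(pos * 8, fixedSize + variable.size()); // offset of payload
            fixed.putInt(pos * 8 + 4, data.length);             // length of payload
            variable.write(data, 0, data.length);
        }
        byte[] var = variable.toByteArray();
        byte[] row = Arrays.copyOf(fixed.array(), fixedSize + var.length);
        System.arraycopy(var, 0, row, fixedSize, var.length);
        return row;
    }

    // Reads a field through its offset/length pair, similar in spirit to a getter.
    static String readField(byte[] row, int pos) {
        ByteBuffer buf = ByteBuffer.wrap(row);
        int offset = buf.getInt(pos * 8);
        int length = buf.getInt(pos * 8 + 4);
        return new String(row, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String[] fields = {"a", "bb", "ccc"};
        // Ascending index order, which the writer contract expects.
        byte[] ascending = writeRow(fields, new int[] {0, 1, 2});
        // Order that could result from grouping fields 0 and 2 into one loop.
        byte[] grouped = writeRow(fields, new int[] {0, 2, 1});

        boolean sameContents = true;
        for (int i = 0; i < fields.length; i++) {
            sameContents &= readField(ascending, i).equals(readField(grouped, i));
        }
        System.out.println("same contents: " + sameContents);
        System.out.println("same bytes:    " + Arrays.equals(ascending, grouped));
    }
}
```

In this toy layout the getters agree on every field, yet the raw bytes differ because the payloads land in the variable-length area in write order. Anything that compares rows by their bytes, as the issue says state keys are compared, would treat the two rows as different.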