ahmedabu98 commented on issue #25669:
URL: https://github.com/apache/beam/issues/25669#issuecomment-1484210381

   Update: we found that this problem is caused by line 342 in the 
`Schema::sorted` method linked below, which copies the field encoding positions 
of the unsorted Schema over to the newly generated sorted Schema. The resulting 
mismatch between encoding positions and actual field indices in the sorted 
Schema leads to this issue. When RowCoder encodes or decodes a field, it looks 
up the encoding position of that field. Here we see RowCoder using the encoding 
positions of the pre-sorted Schema to decode a Row that has the sorted Schema, 
which produces what looks like field misplacement.
   
   
https://github.com/apache/beam/blob/3a1b64106657346f436bc9011d894ca90408a69e/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java#L333-L345
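
   To illustrate the kind of mismatch described above, here is a minimal, 
self-contained sketch. It does not use Beam: it is a toy model in which a 
"schema" is a list of field names plus a name-to-encoding-position map, and a 
"row on the wire" is a list of values ordered by encoding position. The class 
`EncodingPositionDemo`, its methods, and the field names are all hypothetical 
and chosen only for illustration; the real RowCoder logic is more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT Beam's Schema or RowCoder implementation.
public class EncodingPositionDemo {

  // Default encoding positions: each field's position equals its index.
  static Map<String, Integer> defaultPositions(List<String> fields) {
    Map<String, Integer> positions = new LinkedHashMap<>();
    for (int i = 0; i < fields.size(); i++) {
      positions.put(fields.get(i), i);
    }
    return positions;
  }

  // "Encode": place each field's value at that field's encoding position.
  static List<String> encode(
      List<String> fields, Map<String, Integer> positions, Map<String, String> row) {
    String[] wire = new String[fields.size()];
    for (String field : fields) {
      wire[positions.get(field)] = row.get(field);
    }
    return Arrays.asList(wire);
  }

  // "Decode": read each field's value from that field's encoding position.
  static Map<String, String> decode(
      List<String> fields, Map<String, Integer> positions, List<String> wire) {
    Map<String, String> row = new LinkedHashMap<>();
    for (String field : fields) {
      row.put(field, wire.get(positions.get(field)));
    }
    return row;
  }

  public static void main(String[] args) {
    // Hypothetical configuration fields, in the order inference happened to produce.
    List<String> unsorted = Arrays.asList("topic", "bootstrapServers", "format");
    List<String> sorted = new ArrayList<>(unsorted);
    Collections.sort(sorted); // [bootstrapServers, format, topic]

    Map<String, String> row = new LinkedHashMap<>();
    row.put("bootstrapServers", "localhost:9092");
    row.put("format", "AVRO");
    row.put("topic", "my-topic");

    // One side encodes with positions that match the sorted schema.
    List<String> wire = encode(sorted, defaultPositions(sorted), row);

    // The other side decodes with positions copied from the unsorted schema.
    Map<String, Integer> stale = defaultPositions(unsorted);
    System.out.println(decode(sorted, stale, wire));
    // prints {bootstrapServers=AVRO, format=my-topic, topic=localhost:9092}

    // Decoding with positions that match the sorted schema round-trips correctly.
    System.out.println(decode(sorted, defaultPositions(sorted), wire));
    // prints {bootstrapServers=localhost:9092, format=AVRO, topic=my-topic}
  }
}
```

   In this toy model, the values are intact but land under the wrong field 
names whenever the two sides disagree on encoding positions, which matches the 
"field misplacement" symptom. Regenerating positions from the sorted field 
order, rather than copying them from the unsorted schema, removes the 
disagreement here.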
   
   
   Context: TypedSchemaTransformProvider uses a class to represent the 
configuration of the SchemaTransform. The configuration schema is inferred from 
its class using `AutoValueSchema`. Unfortunately, the ordering of fields in the 
generated configuration schema is not guaranteed, so as a workaround we are 
sorting the fields alphabetically.


