ahmedabu98 commented on issue #25669: URL: https://github.com/apache/beam/issues/25669#issuecomment-1484210381
Update: we found that this problem is caused by line 342 of the `Schema::sorted` method linked below, which copies the field encoding positions of the unsorted Schema over to the sorted Schema it generates. The resulting sorted Schema has encoding positions that no longer match its actual field indices, and that mismatch leads to this issue. When RowCoder encodes or decodes a field, it looks up the encoding position of that field. Here, RowCoder uses the encoding positions of the pre-sorted Schema to decode a Row that has the sorted Schema, which shows up as what looks like field misplacement.

https://github.com/apache/beam/blob/3a1b64106657346f436bc9011d894ca90408a69e/sdks/java/core/src/main/java/org/apache/beam/sdk/schemas/Schema.java#L333-L345

Context: TypedSchemaTransformProvider uses a class to represent the configuration of the SchemaTransform. The configuration schema is inferred from that class using `AutoValueSchema`. Unfortunately, the ordering of fields in the generated configuration schema is not guaranteed, so as a workaround we sort the fields alphabetically.
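To make the mismatch concrete, here is a minimal toy model of it. This is not Beam's `Schema` or `RowCoder` code: the class `EncodingPositionSketch` and its helpers `defaultPositions`, `encode`, and `decode` are hypothetical names for illustration, and it assumes a coder that simply places each field's value at that field's encoding position.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, not Beam's actual Schema/RowCoder classes.
public class EncodingPositionSketch {

    // Default encoding positions: each field's position equals its index in the schema.
    static Map<String, Integer> defaultPositions(List<String> fields) {
        Map<String, Integer> positions = new LinkedHashMap<>();
        for (int i = 0; i < fields.size(); i++) {
            positions.put(fields.get(i), i);
        }
        return positions;
    }

    // Encode: write each field's value into the slot given by its encoding position.
    static Object[] encode(List<String> fields, Map<String, Integer> positions,
                           Map<String, Object> row) {
        Object[] wire = new Object[fields.size()];
        for (String field : fields) {
            wire[positions.get(field)] = row.get(field);
        }
        return wire;
    }

    // Decode: read each field's value back from the slot given by its encoding position.
    static Map<String, Object> decode(List<String> fields, Map<String, Integer> positions,
                                      Object[] wire) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (String field : fields) {
            row.put(field, wire[positions.get(field)]);
        }
        return row;
    }

    public static void main(String[] args) {
        List<String> unsorted = Arrays.asList("c", "a", "b");
        List<String> sorted = Arrays.asList("a", "b", "c");

        // Positions copied over from the unsorted schema: {c=0, a=1, b=2}.
        Map<String, Integer> stale = defaultPositions(unsorted);
        // Positions that match the sorted schema's own field order: {a=0, b=1, c=2}.
        Map<String, Integer> fresh = defaultPositions(sorted);

        Map<String, Object> row = new LinkedHashMap<>();
        row.put("a", "A");
        row.put("b", "B");
        row.put("c", "C");

        // The row is laid out according to the sorted schema's field order.
        Object[] wire = encode(sorted, fresh, row);

        // Decoding with the copied positions misplaces every field.
        System.out.println(decode(sorted, stale, wire)); // {a=B, b=C, c=A}
        // Decoding with positions that match the sorted schema round-trips correctly.
        System.out.println(decode(sorted, fresh, wire)); // {a=A, b=B, c=C}
    }
}
```

In this toy model the misplacement disappears once the positions are recomputed from the sorted field order instead of being copied. Whether that is the right change for `Schema::sorted` itself is a separate question that this sketch does not settle.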
