vinishjail97 opened a new pull request, #17759: URL: https://github.com/apache/hudi/pull/17759
### Describe the issue this Pull Request addresses

There has been a change in behavior for SparkHoodieTableFileIndex since 0.14.1. The StructType(partitionFields) it returns no longer carries the full path of each partition field, which causes data validation failures. This behavior was changed as part of this PR: https://github.com/apache/hudi/pull/9863/changes

### Summary and Changelog

If a table has a nested partition column whose leaf name conflicts with another top-level field, the partitionedSchema passed to the new file group reader is incorrect.

When I tried reverting the previous change, I found another issue: we rely on `HoodieSchemaConversionUtils.convertStructTypeToHoodieSchema` to get requestedSchema in buildReaderWithPartitionValues, but this fails because HoodieSchema doesn't like dots in the names.

I'm looking for guidance or feedback on how to read nested partition columns through the Parquet reader.

### Impact

High

### Risk Level

High

### Documentation Update

None.

### Contributor's checklist

- [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
- [ ] Enough context is provided in the sections above
- [ ] Adequate tests were added if applicable
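
To make the leaf-name collision from the Summary concrete, here is a minimal toy sketch. It is not Hudi or Spark code: a plain dict stands in for a row, the field names `country` and `address.country` are hypothetical, and the two resolver functions are illustrative helpers only. The name check at the end assumes HoodieSchema follows Avro-style naming rules (letters, digits, and underscores only), which is one plausible reading of "doesn't like dots in the names" rather than something confirmed above.

```python
import re

# Hypothetical row: a top-level `country` and a nested `address.country`
# that share the same leaf name. The table is partitioned on the nested one.
RECORD = {
    "id": 1,
    "country": "US",
    "address": {"country": "IN", "zip": "560001"},
}


def resolve_full_path(record, dotted_name):
    """Walk the record one path segment at a time (full-path behavior)."""
    node = record
    for part in dotted_name.split("."):
        node = node[part]
    return node


def resolve_leaf_only(record, dotted_name):
    """Keep only the last path segment and look it up at the top level.

    This models a partition schema that has lost the full path: the leaf
    name silently matches the unrelated top-level field.
    """
    leaf = dotted_name.split(".")[-1]
    return record[leaf]


# Assumption: Avro-style names, i.e. [A-Za-z_] followed by [A-Za-z0-9_]*.
AVRO_STYLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_avro_style_name(name):
    """Return True when the whole name matches the Avro-style pattern."""
    return AVRO_STYLE_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    print(resolve_full_path(RECORD, "address.country"))   # nested value
    print(resolve_leaf_only(RECORD, "address.country"))   # top-level value
    print(is_valid_avro_style_name("address.country"))    # dotted name check
```

Under these assumptions the two failure modes pull in opposite directions: dropping the path makes the name schema-safe but ambiguous, while keeping the dotted path is unambiguous but is rejected by a schema type with Avro-style name rules.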
