MarquisC opened a new pull request, #109: URL: https://github.com/apache/iceberg-python/pull/109
We're using PyIceberg to read Iceberg tables stored in S3 as Parquet. Our column names take the form `id:foo` and `diagnostic:bar`, using `:` as a delimiter that helps us do some programmatic maintenance on our side. In the Parquet files the column names are sanitized, in this case `:` -> `_x3A`, so when we attempt to scan or read the data, the table schema doesn't match the physical column names that PyArrow sees. This first pass is a naive fix that I have tested and that works, but I'm looking for guidance on where you all want this logic to live, and I'm happy to move it there instead.
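For illustration, the substitution described above can be sketched as follows. This is a minimal sketch and not the code in this PR: `sanitize_column_name` and `physical_to_logical` are hypothetical names, and the rule used here (every character other than an ASCII letter, digit, or underscore becomes `_x` followed by its uppercase hex code point) is an assumption extrapolated from the `:` -> `_x3A` case.

```python
def sanitize_column_name(name: str) -> str:
    """Hypothetical helper: escape characters that are not ASCII letters,
    digits, or underscores as `_x` + uppercase hex code point.

    Assumption: this mirrors the `:` -> `_x3A` substitution seen in the
    Parquet files. Other cases (for example a leading digit) are not handled.
    """
    parts = []
    for ch in name:
        if ch.isascii() and (ch.isalnum() or ch == "_"):
            parts.append(ch)  # character is kept as-is
        else:
            parts.append(f"_x{ord(ch):X}")  # e.g. ':' (code point 58) -> '_x3A'
    return "".join(parts)


# Logical column names as they appear in the table schema.
logical_names = ["id:foo", "diagnostic:bar", "plain_name"]

# Map each physical (sanitized) name back to its logical name, so that the
# columns read from the file can be matched against the table schema.
physical_to_logical = {sanitize_column_name(n): n for n in logical_names}

for physical, logical in physical_to_logical.items():
    print(f"{physical} -> {logical}")
```

A lookup like `physical_to_logical` is one way to reconcile the two sets of names; where that translation belongs in PyIceberg is the open question of this PR.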
