gowerc opened a new issue #11935: URL: https://github.com/apache/arrow/issues/11935
Hi,

I was planning to file this as a feature request on the JIRA board, but first I wanted to check whether there is a particular reason this feature doesn't already exist (it could just be my lack of knowledge of the Parquet standard).

The current implementations for reading and writing Parquet (at least the R and Python ones) let you attach arbitrary metadata to your dataset. That metadata can be written and read by the same language but cannot be read by other languages (i.e. attributes set in R don't appear to be easily accessible via the Python implementation, and vice versa). Is there a reason for this? Would it be viable to request that at least simple metadata (i.e. numerics and strings) have some way of being shared between the different implementations?

As an example use case, we often assign longer variable labels to columns to describe what each column is for, e.g. `TRT01P` = `First planned treatment`. It would be great if we could attach this metadata to the Parquet object in a way that other languages reading the object can also access.
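To make the request concrete, here is a minimal sketch of the kind of round trip I have in mind, written against `pyarrow`. It assumes that schema-level metadata set with `Table.replace_schema_metadata` ends up in the Parquet file's key-value metadata, and that storing the labels as a JSON string keeps them readable from any language with a JSON parser. The key name `column_labels` and all four helper functions are made up for this example and are not an existing Arrow or Parquet convention.

```python
import json

# Hypothetical key name chosen for this sketch; not an Arrow or Parquet convention.
LABELS_KEY = b"column_labels"


def encode_labels(labels, existing=None):
    """Merge JSON-encoded column labels into a bytes -> bytes metadata dict."""
    merged = dict(existing or {})  # keep whatever metadata is already there
    merged[LABELS_KEY] = json.dumps(labels, sort_keys=True).encode("utf-8")
    return merged


def decode_labels(metadata):
    """Recover the label mapping, or an empty dict if none was stored."""
    raw = (metadata or {}).get(LABELS_KEY)
    return json.loads(raw.decode("utf-8")) if raw is not None else {}


def write_with_labels(table, path, labels):
    """Write a pyarrow Table to Parquet with labels in the schema metadata."""
    import pyarrow.parquet as pq  # lazy import: the helpers above need only the stdlib

    tagged = table.replace_schema_metadata(
        encode_labels(labels, table.schema.metadata)
    )
    pq.write_table(tagged, path)


def read_labels(path):
    """Read only the schema of a Parquet file and return its stored labels."""
    import pyarrow.parquet as pq

    return decode_labels(pq.read_schema(path).metadata)
```

With this, `write_with_labels(table, "adsl.parquet", {"TRT01P": "First planned treatment"})` followed by `read_labels("adsl.parquet")` should return the same mapping in Python (the file name is only an illustration). Whether the R implementation exposes that same key without extra work is exactly the part I am unsure about.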
