gowerc opened a new issue #11935:
URL: https://github.com/apache/arrow/issues/11935


   Hi,
   
   I was planning to make this into a feature request on the JIRA board, but 
before I did, I wanted to check whether there is a particular reason why this 
feature doesn't already exist (it could just be my lack of knowledge of the 
Parquet standard).
   
   The current implementations of reading and writing Parquet (at least the R 
and Python implementations) allow you to assign arbitrary metadata to your 
dataset, which can be written and read by the same language but cannot be read by 
other languages (i.e. attributes set in R don't appear to be easily accessible 
via the Python implementation, and vice versa). Is there a reason for this? 
Would it be viable to request that at least simple metadata (i.e. numerics and 
strings) have some way of being shared between the different implementations?
   
   As an example use case, we often assign longer variable labels to columns in 
order to describe what the column is for, e.g.
   `TRT01P` = `First planned treatment`. It would be great if we could attach 
this metadata within the Parquet object in a way that other languages reading 
the object can also access. 
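
   To make the request concrete, here is a minimal sketch of one possible 
workaround on the Python side, assuming the labels are stored as a JSON string 
in the file-level key/value metadata. The key name `column_labels` and all four 
helper functions are hypothetical, not an existing Arrow or Parquet convention; 
only `pq.write_table`, `pq.read_schema` and `Table.replace_schema_metadata` are 
pyarrow calls.

```python
import json

# Hypothetical key name chosen for this sketch; not an Arrow or Parquet convention.
LABELS_KEY = b"column_labels"


def encode_labels(labels):
    """Pack {column: label} into the bytes key/value form used by schema metadata."""
    return {LABELS_KEY: json.dumps(labels).encode("utf-8")}


def decode_labels(metadata):
    """Recover {column: label} from key/value metadata; empty dict if absent."""
    raw = (metadata or {}).get(LABELS_KEY)
    return json.loads(raw.decode("utf-8")) if raw is not None else {}


def write_with_labels(table, path, labels):
    """Write a pyarrow Table, merging the labels into its schema-level metadata."""
    import pyarrow.parquet as pq  # imported here so the helpers above work without pyarrow

    merged = {**(table.schema.metadata or {}), **encode_labels(labels)}
    pq.write_table(table.replace_schema_metadata(merged), path)


def read_labels(path):
    """Read only the schema of a Parquet file and decode the labels from it."""
    import pyarrow.parquet as pq

    return decode_labels(pq.read_schema(path).metadata)
```

   With pyarrow installed, 
`write_with_labels(table, "adsl.parquet", {"TRT01P": "First planned treatment"})` 
followed by `read_labels("adsl.parquet")` should return the same mapping (the 
file name is just an illustration). Because the value is plain JSON text rather 
than a language-specific serialization, any reader that exposes the file's 
key/value metadata should be able to parse it, but both sides would still have 
to agree on the key name, which is the part this issue asks about.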



