zeevm opened a new issue, #2394:
URL: https://github.com/apache/arrow-rs/issues/2394

   
   A field with "REPEATED" repetition and no "LIST" annotation is read as a 
primitive instead of as a list.
   
   To reproduce: create a file with a top-level field whose schema looks like:
   
   `REPEATED BYTE_ARRAY vals (UTF8);`
   
   and write lists of strings (i.e. with repetition levels of 0 and 1).
   
   This should be read as a list of strings, as specified in 
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#lists
   
   > This does not affect repeated fields that are not annotated: A repeated 
field that is neither contained by a LIST- or MAP-annotated group nor annotated 
by LIST or MAP should be interpreted as a required list of required elements 
where the element type is the type of the field.
   
   Instead, it is read as a field of single string values, where the strings 
comprising a logical list are read as distinct rows.
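
To make the expected result concrete, here is a minimal Python sketch (not arrow-rs or pyarrow code) of how the levels of such a column would be assembled into lists. The function name `assemble_repeated` and the sample data are hypothetical. It assumes a top-level repeated primitive field, so the maximum repetition level and the maximum definition level are both 1.

```python
from typing import List, Sequence


def assemble_repeated(
    values: Sequence[str],
    rep_levels: Sequence[int],
    def_levels: Sequence[int],
) -> List[List[str]]:
    """Group the values of a top-level repeated field into one list per row.

    Assumed level semantics for this simple case:
      rep level 0 -> the entry starts a new row
      rep level 1 -> the entry continues the current row's list
      def level 1 -> a value is present
      def level 0 -> no value (the row's list is empty)
    """
    rows: List[List[str]] = []
    remaining = iter(values)
    for rep, definition in zip(rep_levels, def_levels):
        if rep == 0:
            rows.append([])  # a new row begins
        if definition == 1:
            rows[-1].append(next(remaining))
    return rows


# Hypothetical column data: three rows, the second one an empty list.
values = ["a", "b", "c"]
rep_levels = [0, 1, 0, 0]
def_levels = [1, 1, 0, 1]

expected = assemble_repeated(values, rep_levels, def_levels)
print(expected)  # [['a', 'b'], [], ['c']]
```

The behavior reported here corresponds to ignoring the repetition levels, which would yield one row per value ("a", "b", "c") rather than one list per row.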
   
   The same file is read correctly by pyarrow.

