[ https://issues.apache.org/jira/browse/ARROW-3247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17046482#comment-17046482 ]
Brian Hulette commented on ARROW-3247:
--------------------------------------

Thanks Micah, could you link those here? When searching for parquet and maps this is all I could find.

> [Python] Support spark parquet array and map types
> --------------------------------------------------
>
> Key: ARROW-3247
> URL: https://issues.apache.org/jira/browse/ARROW-3247
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Python
> Reporter: Martin Durant
> Priority: Minor
> Labels: parquet
>
> As far as I understand, there is already some support for nested
> array/dict/structs in arrow. However, spark Map and List types are structured
> one level deeper (I believe to allow for both NULL and empty entries).
> Surprisingly, fastparquet can load these. I do not know the plan for
> arbitrary nested object support, but it should be made clear.
>
> Schema of spark-generated file from the fastparquet test suite:
> {code:java}
> - spark_schema:
> | - map_op_op: MAP, OPTIONAL
> |   - key_value: REPEATED
> |   | - key: BYTE_ARRAY, UTF8, REQUIRED
> |     - value: BYTE_ARRAY, UTF8, OPTIONAL
> | - map_op_req: MAP, OPTIONAL
> |   - key_value: REPEATED
> |   | - key: BYTE_ARRAY, UTF8, REQUIRED
> |     - value: BYTE_ARRAY, UTF8, REQUIRED
> | - map_req_op: MAP, REQUIRED
> |   - key_value: REPEATED
> |   | - key: BYTE_ARRAY, UTF8, REQUIRED
> |     - value: BYTE_ARRAY, UTF8, OPTIONAL
> | - map_req_req: MAP, REQUIRED
> |   - key_value: REPEATED
> |   | - key: BYTE_ARRAY, UTF8, REQUIRED
> |     - value: BYTE_ARRAY, UTF8, REQUIRED
> | - arr_op_op: LIST, OPTIONAL
> |   - list: REPEATED
> |     - element: BYTE_ARRAY, UTF8, OPTIONAL
> | - arr_op_req: LIST, OPTIONAL
> |   - list: REPEATED
> |     - element: BYTE_ARRAY, UTF8, REQUIRED
> | - arr_req_op: LIST, REQUIRED
> |   - list: REPEATED
> |     - element: BYTE_ARRAY, UTF8, OPTIONAL
>   - arr_req_req: LIST, REQUIRED
>     - list: REPEATED
>       - element: BYTE_ARRAY, UTF8, REQUIRED
> {code}
> (please forgive that some of this has already been mentioned elsewhere; this
> is one of the entries in the list at
> [https://github.com/dask/fastparquet/issues/374] as a feature that is useful
> in fastparquet)

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
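
The "one level deeper" remark in the quoted description can be illustrated with a small sketch. It assumes the usual Parquet repetition/definition-level encoding and uses a hypothetical helper, `shred_optional_list`, which is not part of Arrow or fastparquet. For a column shaped like `arr_op_op` (OPTIONAL list, REPEATED group, OPTIONAL element) the extra level is what lets a NULL list, an empty list, and a NULL element each get a distinct definition level:

```python
from typing import List, Optional, Tuple

# (repetition level, definition level, value)
Triple = Tuple[int, int, Optional[str]]


def shred_optional_list(rows: List[Optional[List[Optional[str]]]]) -> List[Triple]:
    """Hypothetical helper (illustration only): encode rows of a column shaped
    like arr_op_op as (repetition, definition, value) triples."""
    out: List[Triple] = []
    for row in rows:
        if row is None:
            out.append((0, 0, None))  # the list itself is NULL
        elif len(row) == 0:
            out.append((0, 1, None))  # the list is defined but empty
        else:
            for i, item in enumerate(row):
                rep = 0 if i == 0 else 1  # 0 starts a new row, 1 continues the list
                if item is None:
                    out.append((rep, 2, None))  # slot exists, element is NULL
                else:
                    out.append((rep, 3, item))  # element is present
    return out


triples = shred_optional_list([None, [], ["a", None]])
```

Without the intermediate REPEATED group, the NULL list and the empty list would share a definition level and could not be told apart on read.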