[ https://issues.apache.org/jira/browse/ARROW-14596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17441214#comment-17441214 ]
Joris Van den Bossche commented on ARROW-14596:
-----------------------------------------------

[~TomScheffers] that's indeed a current limitation of the new implementation. Work is underway to enable this: the basic feature was just merged for C++ (ARROW-13987) and still has to be exposed in Python (ARROW-11259). Hopefully this will be possible in the next release, 7.0.0.

> [Python] parquet.read_table nested fields in columns does not work for use_legacy_dataset=False
> -----------------------------------------------------------------------------------------------
>
>                 Key: ARROW-14596
>                 URL: https://issues.apache.org/jira/browse/ARROW-14596
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Tom Scheffers
>            Priority: Critical
>             Fix For: 7.0.0
>
>
> Reading a nested field does not work with use_legacy_dataset=False.
> This works:
>
> {code:python}
> import pyarrow.parquet as pq
> t = pq.read_table(
>     source=filename,  # path to the Parquet file
>     columns=['store_key', 'properties.country'],
>     use_legacy_dataset=True,
> ).to_pandas()
> {code}
> This does not work (for the same parquet file):
>
> {code:python}
> import pyarrow.parquet as pq
> t = pq.read_table(
>     source=filename,  # path to the Parquet file
>     columns=['store_key', 'properties.country'],
>     use_legacy_dataset=False,
> ).to_pandas()
> {code}
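
Until nested field references are supported in the new implementation, one possible stopgap is to read the whole struct column and pick the nested field afterwards. The sketch below is not part of the issue or of pyarrow itself: `top_level_columns` and `read_nested` are hypothetical helper names, and it assumes that a dot in a requested name always separates a struct column from one of its direct children (a single level of nesting, no literal dots in column names).

```python
def top_level_columns(columns):
    """Map dotted names such as 'properties.country' to their top-level
    column ('properties'), preserving order and dropping duplicates."""
    roots = []
    for name in columns:
        root = name.split(".", 1)[0]
        if root not in roots:
            roots.append(root)
    return roots


def read_nested(path, columns):
    """Read the enclosing struct columns in full, then select the
    requested fields from the flattened table."""
    # Imported here so that top_level_columns stays usable without pyarrow.
    import pyarrow.parquet as pq

    table = pq.read_table(path, columns=top_level_columns(columns))
    # Table.flatten() expands each struct column into one column per child,
    # named '<parent>.<child>', so the dotted names can be selected directly.
    return table.flatten().select(columns)


# For the column list used in the issue:
print(top_level_columns(["store_key", "properties.country"]))
```

With the file from the issue this would be called as `read_nested(filename, ['store_key', 'properties.country']).to_pandas()`. Note that the result is shaped differently from the legacy reader: the nested field comes back as a top-level column named `properties.country`, not as a `properties` struct containing only `country`. The whole struct is also read from disk, so the I/O saving of a true nested projection is lost.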