[ https://issues.apache.org/jira/browse/ARROW-10344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17217425#comment-17217425 ]
Gert Hulselmans commented on ARROW-10344:
-----------------------------------------

I was using Feather v2, but had to switch back to v1, due to a metadata bug when writing Feather v2 files:
https://issues.apache.org/jira/browse/ARROW-10056

Is this pandas metadata very useful to have in my case? My feather files just contain one string column (row indices) and for the rest I have just columns of int16, int32, float32 (all other columns have the same type in one feather file).

> [Python] Get all columns names (or schema) from Feather file, before loading whole Feather file
> ------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-10344
>                 URL: https://issues.apache.org/jira/browse/ARROW-10344
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Python
>    Affects Versions: 1.0.1
>            Reporter: Gert Hulselmans
>            Priority: Major
>
> Is there a way to get all column names (or schema) from a Feather file before loading the full Feather file?
> My Feather files are big (like 100GB) and the names of the columns are different per analysis and can't be hard coded.
> {code:python}
> import pyarrow.feather as feather
> # Code here to check which columns are in the feather file.
> ...
> my_columns = ...
> # Result is pandas.DataFrame
> read_df = feather.read_feather('/path/to/file', columns=my_columns)
> # Result is pyarrow.Table
> read_arrow = feather.read_table('/path/to/file', columns=my_columns)
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)