[ https://issues.apache.org/jira/browse/ARROW-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408384#comment-16408384 ]
Dhruv Madeka edited comment on ARROW-2332 at 3/21/18 6:32 PM:
--------------------------------------------------------------

As mentioned in the GitHub issue, IMHO this requires a few steps:

* In `FeatherReader`, separate the extraction of the table from the call that converts it with `to_pandas`.
* Create a `FeatherDataset` class that takes a list of Feather files and creates a table for each one.
* Validate that the schemas of all the files match. The `validate_schema` for the `ParquetDataset` seems to work; maybe we can create an abstract dataset class and inherit from it.
* Call `concat_tables` on the extracted tables and return the result as a pandas DataFrame.

was (Author: madeka):
As mentioned in the Github issue - IMHO this requires a few steps.

* In `FeatherReader`, separate the extraction of the table from the call to convert it `to_pandas`
* Create a `FeatherDataset` class which takes a list of featherfiles and creates a table for each one
* Validate that the schemas for each of the files match, the `validate_schema` for the `ParquetDataset` seems to work. Maybe we can create an abstract dataset class and inherit from there

> [Python] Provide API for reading multiple Feather files
> -------------------------------------------------------
>
>                 Key: ARROW-2332
>                 URL: https://issues.apache.org/jira/browse/ARROW-2332
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Python
>            Reporter: Wes McKinney
>            Priority: Major
>             Fix For: 0.10.0
>
>
> See discussion in
> https://github.com/wesm/feather/issues/273#issuecomment-374093374

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)