[ https://issues.apache.org/jira/browse/ARROW-8210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joris Van den Bossche updated ARROW-8210:
-----------------------------------------
    Description:
While testing duplicate column names, I ran into multiple issues:

* Factory fails if there are duplicate columns, even for a single file
* In addition, we should also fix and/or test that factory works for duplicate columns if the schemas are equal
* Once a Dataset with duplicated columns is created, scanning without any column projection fails

> [C++][Dataset] Handling of duplicate columns in Dataset factory and scanning
> ----------------------------------------------------------------------------
>
>                 Key: ARROW-8210
>                 URL: https://issues.apache.org/jira/browse/ARROW-8210
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, C++ - Dataset
>            Reporter: Joris Van den Bossche
>            Priority: Major
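
For illustration only, the behaviour asked for in the three points of the description can be modelled with a small standard-library Python sketch. It is not Arrow's implementation and does not use the Arrow API: a schema is represented here as a plain list of `(name, type)` pairs, and the helpers `duplicate_names`, `unify_schemas` and `project` are hypothetical names introduced for this example. The assumption is that identical schemas can be accepted as-is even when they contain duplicate names, and that a scan without projection can select fields by position rather than by name.

```python
from collections import Counter
from typing import List, Optional, Tuple

# Hypothetical stand-in for an Arrow field: (column name, type name).
Field = Tuple[str, str]
Schema = List[Field]


def duplicate_names(schema: Schema) -> List[str]:
    """Return the sorted column names that occur more than once."""
    counts = Counter(name for name, _ in schema)
    return sorted(name for name, n in counts.items() if n > 1)


def unify_schemas(schemas: List[Schema]) -> Schema:
    """Toy unification (assumption, not Arrow's algorithm).

    Identical schemas are accepted unchanged, duplicates included.
    Differing schemas are merged by name, which is only well defined
    when no schema contains duplicate names.
    """
    if not schemas:
        raise ValueError("need at least one schema")
    first = schemas[0]
    if all(s == first for s in schemas[1:]):
        return list(first)

    merged = {}
    for schema in schemas:
        dups = duplicate_names(schema)
        if dups:
            raise ValueError(f"cannot merge by name, duplicate columns: {dups}")
        for name, typ in schema:
            if merged.setdefault(name, typ) != typ:
                raise ValueError(f"conflicting types for column {name!r}")
    return list(merged.items())


def project(schema: Schema, columns: Optional[List[str]] = None) -> List[int]:
    """Return the field positions to scan.

    Without a projection every position is selected by index, so
    duplicate names cause no ambiguity. A by-name projection must
    match exactly one field.
    """
    if columns is None:
        return list(range(len(schema)))
    indices = []
    for col in columns:
        matches = [i for i, (name, _) in enumerate(schema) if name == col]
        if len(matches) != 1:
            raise KeyError(f"column {col!r} matches {len(matches)} fields")
        indices.append(matches[0])
    return indices
```

In this model a single schema such as `[("a", "int64"), ("a", "int64")]` passes `unify_schemas` unchanged, two equal copies of it unify to the same schema, and `project(schema)` with no columns returns every position. Only a by-name lookup of the duplicated column is rejected as ambiguous.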