[ https://issues.apache.org/jira/browse/ARROW-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16225799#comment-16225799 ]
Phillip Cloud edited comment on ARROW-1754 at 10/30/17 11:29 PM:
-----------------------------------------------------------------

I think we should solve this by always making the index column names follow the pattern for unnamed columns, namely {{__index_level_<N>__}}, and by changing {{index_columns}} to be a list of dictionaries, each mapping the raw Arrow column name to either {{None}} or the actual column name. I'll update the pandas metadata spec accordingly.

> [Python] Fix buggy Parquet roundtrip when an index name is the same as a column name
> ------------------------------------------------------------------------------------
>
>                 Key: ARROW-1754
>                 URL: https://issues.apache.org/jira/browse/ARROW-1754
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.7.1
>            Reporter: Wes McKinney
>            Assignee: Phillip Cloud
>             Fix For: 0.8.0
>
>
> See upstream report
> https://stackoverflow.com/questions/47013052/issue-with-pyarrow-when-loading-parquet-file-where-index-has-redundant-column
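
For illustration, the naming scheme proposed in the comment could be sketched roughly as follows. This is a minimal sketch in plain Python, not pyarrow's actual implementation: the helper names {{index_level_name}} and {{describe_index_columns}} are hypothetical, and the exact layout of the {{index_columns}} entries is an assumption based only on the wording of the comment.

```python
def index_level_name(level):
    # Hypothetical helper: storage name for the N-th index level,
    # following the __index_level_<N>__ pattern named in the comment.
    return "__index_level_{}__".format(level)


def describe_index_columns(index_names):
    # Hypothetical helper: build one dictionary per index level, mapping the
    # raw Arrow column name to the original index name (None if unnamed).
    return [
        {index_level_name(level): name}
        for level, name in enumerate(index_names)
    ]


# An index named "a" alongside a data column that is also named "a":
data_columns = ["a"]
index_columns = describe_index_columns(["a"])

# The storage name of the index level differs from every data column name,
# so the two can no longer collide in the Arrow schema.
stored_names = [raw for entry in index_columns for raw in entry]
collisions = set(stored_names) & set(data_columns)
```

Under these assumptions the index level is always stored under a generated name, and the original name (or {{None}}) is carried only in the metadata, which is what would let a reader restore it on the way back to pandas.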