[ https://issues.apache.org/jira/browse/ARROW-3651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16668434#comment-16668434 ]
Armin Berres commented on ARROW-3651:
-------------------------------------

Not sure, but maybe pandas should behave differently here as well and create a {{DatetimeIndex}}, since the complete index consists of {{Timestamp}} objects? Adding {{df.columns = pd.to_datetime(df.columns)}} to the code above mitigates the problem.

> [Python] Datetimes from non-DateTimeIndex cannot be deserialized
> ----------------------------------------------------------------
>
>                 Key: ARROW-3651
>                 URL: https://issues.apache.org/jira/browse/ARROW-3651
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.11.1
>            Reporter: Armin Berres
>            Priority: Major
>
> Given an index that contains datetimes but is not a DatetimeIndex, writing the
> file works but reading it back fails.
> {code:python}
> import pandas as pd
> import pyarrow as pa
> import pyarrow.parquet as pq
>
> df = pd.DataFrame(1, index=pd.MultiIndex.from_arrays([[1,2],[3,4]]),
>                   columns=[pd.to_datetime("2018/01/01")])
> # the columns index is no longer a DatetimeIndex
> df = df.reset_index().set_index(['level_0', 'level_1'])
> table = pa.Table.from_pandas(df)
> pq.write_table(table, 'test.parquet')
> pq.read_pandas('test.parquet').to_pandas()
> {code}
> results in
> {code}
> KeyError                                  Traceback (most recent call last)
> ~/venv/mpptool/lib/python3.7/site-packages/pyarrow/pandas_compat.py in _pandas_type_to_numpy_type(pandas_type)
>     676     try:
> --> 677         return _pandas_logical_type_map[pandas_type]
>     678     except KeyError:
> KeyError: 'datetime'
> {code}
> The created schema:
> {code}
> 2018-01-01 00:00:00: int64
> level_0: int64
> level_1: int64
> metadata
> --------
> {b'pandas': b'{"index_columns": ["level_0", "level_1"], "column_indexes": [{"n'
>             b'ame": null, "field_name": null, "pandas_type": "datetime", "nump'
>             b'y_type": "object", "metadata": null}], "columns": [{"name": "201'
>             b'8-01-01 00:00:00", "field_name": "2018-01-01 00:00:00", "pandas_'
>             b'type": "int64", "numpy_type": "int64", "metadata": null}, {"name'
>             b'": "level_0", "field_name": "level_0", "pandas_type": "int64", "'
>             b'numpy_type": "int64", "metadata": null}, {"name": "level_1", "fi'
>             b'eld_name": "level_1", "pandas_type": "int64", "numpy_type": "int'
>             b'64", "metadata": null}], "pandas_version": "0.23.4"}'}
> {code}