Joris Van den Bossche created ARROW-6780:
--------------------------------------------
Summary: [C++][Parquet] Support DurationType in writing/reading parquet
Key: ARROW-6780
URL: https://issues.apache.org/jira/browse/ARROW-6780
Project: Apache Arrow
Issue Type: Improvement
Reporter: Joris Van den Bossche
Writing a table with a duration column to Parquet is currently not supported:
{code}
In [37]: table = pa.table({'a': pa.array([1, 2], pa.duration('s'))})
In [39]: table
Out[39]:
pyarrow.Table
a: duration[s]
In [41]: pq.write_table(table, 'test_duration.parquet')
...
ArrowNotImplementedError: Unhandled type for Arrow to Parquet schema conversion: duration[s]
{code}
There is no direct mapping from Arrow's duration type to a Parquet logical type.
Parquet does have an INTERVAL type, but it corresponds more closely to Arrow's
interval types (YEAR_MONTH or DAY_TIME) than to duration.
However, duration values could be stored as plain integers, and the duration
type (including its unit) could be restored on read from the serialized Arrow schema.
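
The idea above can be sketched without Arrow or Parquet at all. The following is a minimal, library-independent illustration using only the Python standard library: durations are written as plain integers, and a small serialized schema records that the column is logically a duration with a given unit. The names `encode_durations`, `decode_durations`, and `_US_PER_UNIT`, and the JSON layout of the schema, are hypothetical and chosen for this example only; they are not the Arrow or Parquet API, and the actual implementation may differ.

```python
import json
from datetime import timedelta

# Microseconds per unit. Hypothetical helper table for this sketch only.
_US_PER_UNIT = {"s": 1_000_000, "ms": 1_000, "us": 1}


def encode_durations(values, unit):
    """Turn timedeltas into plain integers plus a serialized schema string."""
    scale = _US_PER_UNIT[unit]
    # timedelta // timedelta yields an int count of microseconds.
    ints = [(v // timedelta(microseconds=1)) // scale for v in values]
    # Assumed schema layout: records the logical type and its unit.
    schema = json.dumps({"logical_type": "duration", "unit": unit})
    return ints, schema


def decode_durations(ints, schema):
    """Restore timedeltas from stored integers using the serialized schema."""
    meta = json.loads(schema)
    if meta.get("logical_type") != "duration":
        # No duration annotation: hand back the raw integers unchanged.
        return list(ints)
    scale = _US_PER_UNIT[meta["unit"]]
    return [timedelta(microseconds=i * scale) for i in ints]


original = [timedelta(seconds=1), timedelta(seconds=2)]
stored, stored_schema = encode_durations(original, "s")
restored = decode_durations(stored, stored_schema)
```

Here `stored` holds only integers, which any integer column can carry, while `stored_schema` plays the role of the serialized Arrow schema: a reader that does not understand it still sees valid integers, and a reader that does can recover the duration type and unit.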