Joris Van den Bossche created ARROW-6780:
--------------------------------------------

             Summary: [C++][Parquet] Support DurationType in writing/reading parquet
                 Key: ARROW-6780
                 URL: https://issues.apache.org/jira/browse/ARROW-6780
             Project: Apache Arrow
          Issue Type: Improvement
            Reporter: Joris Van den Bossche


Writing a table with a duration column to Parquet is currently not supported:

{code}
In [37]: table = pa.table({'a': pa.array([1, 2], pa.duration('s'))}) 

In [39]: table
Out[39]: 
pyarrow.Table
a: duration[s]

In [41]: pq.write_table(table, 'test_duration.parquet')
...
ArrowNotImplementedError: Unhandled type for Arrow to Parquet schema conversion: duration[s]
{code}

There is no direct mapping to a Parquet logical type. Parquet has an INTERVAL 
type, but it corresponds more closely to Arrow's interval types (YEAR_MONTH or 
DAY_TIME) than to a duration.

However, the duration values could be stored as plain integers, and the 
duration type could be restored from the serialized Arrow schema when reading 
the file back in.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
