Max Burke created ARROW-11324:
---------------------------------

             Summary: [Rust] Querying datetime data in DataFusion with an 
embedded timezone always fails
                 Key: ARROW-11324
                 URL: https://issues.apache.org/jira/browse/ARROW-11324
             Project: Apache Arrow
          Issue Type: Bug
          Components: Rust - DataFusion
            Reporter: Max Burke


We have a large number (hundreds of thousands) of Parquet files with embedded 
Arrow schemas whose time-valued columns have the type 
Timestamp(TimeUnit::Nanosecond, Some("UTC")).

 

One of the changes between Arrow 2 and Arrow 3 was to make the Parquet loader 
prefer the embedded Arrow schema over the one derived from the Parquet columns. 

 

But because DataFusion hardcodes the timezone field of the Timestamp variant 
as None, we can't load any of our data after this upgrade. A query such as the 
one below fails with a plan error:



{{SELECT * FROM parquet_table WHERE ("timestamp" >= 
to_timestamp('2010-03-24T13:00:00.000000Z') AND "timestamp" <= 
to_timestamp('2010-03-25T00:00:00.000000Z')) ORDER BY timestamp ASC NULLS 
LAST;}}
{{Plan("\'Timestamp(Nanosecond, Some(\"UTC\")) >= Timestamp(Nanosecond, None)\' 
can\'t be evaluated because there isn\'t a common type to coerce the types 
to")}}
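
To illustrate the mismatch, here is a minimal, self-contained sketch. It does 
not use the arrow or datafusion crates: TimeUnit and DataType below are 
simplified stand-ins, and strict_common_type / relaxed_common_type are 
hypothetical names, not DataFusion's actual coercion functions. The strict 
rule reproduces the failure above; the relaxed rule is only one possible 
behaviour (a timezone-less timestamp adopts the other side's timezone), and 
whether that is semantically right for DataFusion is an open question.

```rust
// Simplified stand-ins for Arrow's types (assumption: not the real crate definitions).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TimeUnit {
    Millisecond,
    Microsecond,
    Nanosecond,
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum DataType {
    // Unit plus an optional timezone name, e.g. Some("UTC").
    Timestamp(TimeUnit, Option<String>),
}

// Strict rule: two types share a common type only if they are identical,
// so Some("UTC") vs None yields no common type.
fn strict_common_type(lhs: &DataType, rhs: &DataType) -> Option<DataType> {
    if lhs == rhs {
        Some(lhs.clone())
    } else {
        None
    }
}

// Hypothetical relaxed rule: units must match; a missing timezone adopts the
// other side's timezone; two different named timezones still do not coerce.
fn relaxed_common_type(lhs: &DataType, rhs: &DataType) -> Option<DataType> {
    match (lhs, rhs) {
        (DataType::Timestamp(lu, ltz), DataType::Timestamp(ru, rtz)) if lu == ru => {
            let tz = match (ltz, rtz) {
                (Some(l), Some(r)) if l != r => return None,
                (Some(tz), _) | (None, Some(tz)) => Some(tz.clone()),
                (None, None) => None,
            };
            Some(DataType::Timestamp(*lu, tz))
        }
        _ => None,
    }
}
```

With the column type Timestamp(Nanosecond, Some("UTC")) on one side and the 
to_timestamp result Timestamp(Nanosecond, None) on the other, the strict rule 
returns None (no common type), while the relaxed rule returns the column's 
type.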

 

Any ideas/thoughts? 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
