handmadecode opened a new pull request, #2907:
URL: https://github.com/apache/drill/pull/2907

   # [DRILL-8492](https://issues.apache.org/jira/browse/DRILL-8492): Read Parquet microsecond columns as bigint
   
   ## Description
   
   Two new configuration options, `store.parquet.reader.time_micros_as_int64` 
and `store.parquet.reader.timestamp_micros_as_int64`, have been added.
   
   When the option for a column type is `true`, Parquet columns of that type 
(`time_micros` or `timestamp_micros`) are returned as their original 64-bit 
integer values instead of time or timestamp values truncated to milliseconds.
   
   The implementation closely follows how the existing option 
`store.parquet.reader.int96_as_timestamp` is handled.
   
   Both new options default to `false`, which preserves the existing 
behaviour.
   
   ## Documentation
   
   The new options can be set in `drill-module.conf` or through the Web UI's 
_Options_ page.
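   
   As a minimal sketch, the options could presumably also be toggled from SQL. 
This assumes they are registered as ordinary Drill system options, which their 
presence on the Web UI's _Options_ page suggests but which is not confirmed 
here:
   
   ```sql
   -- Assumption: both options are exposed as system options, so the standard
   -- ALTER SYSTEM syntax applies to them.
   ALTER SYSTEM SET `store.parquet.reader.time_micros_as_int64` = true;
   ALTER SYSTEM SET `store.parquet.reader.timestamp_micros_as_int64` = true;
   
   -- Return both options to their default value (false).
   ALTER SYSTEM RESET `store.parquet.reader.time_micros_as_int64`;
   ALTER SYSTEM RESET `store.parquet.reader.timestamp_micros_as_int64`;
   ```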
   
   ## Testing
   
   Unit tests have been added to 
`org.apache.drill.exec.store.parquet.TestMicrosecondColumns`.
   
   Note that microsecond columns can still be compared with time or timestamp 
literals when the option for reading the column as a 64-bit integer is `true`, 
by converting the value explicitly:
   
   `SELECT * FROM file.parquet WHERE TO_TIME(time_micros_column/1000) > 
'09:32:58.174';`
   and
   `SELECT * FROM file.parquet WHERE 
TO_TIMESTAMP(timestamp_micros_column/1000000) > '2024-04-26 13:17:41.421';`

