parthchandra opened a new pull request, #1265:
URL: https://github.com/apache/datafusion-comet/pull/1265
Notable changes:
1. There are three scan implementations:

| Name | Description | Operator | Underlying implementation |
| --- | --- | --- | --- |
| native_comet | original, supports primitive types (default) | CometScan | Comet |
| native_datafusion | POC1, supports (or will support) complex types | CometNativeScan | DataFusion |
| native_iceberg_compat | POC2, supports (or will support) complex types, exposes API for Iceberg | Comet Scan | DataFusion |

The scan implementation can be selected by setting either the **_conf_**
`spark.comet.scan.impl` or the **_environment variable_**
`COMET_PARQUET_SCAN_IMPL`.
2. The plan compatibility suites generate a different plan for each scan
implementation. As a result, we now have three sets of expected plans, one per
scan implementation.
3. We now use the Spark session timezone instead of UTC when reading
timestamp fields, so that they can be compared with literal timestamps
(Spark apparently applies the session timezone to those automatically).
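
For item 1, a minimal sketch of the two selection mechanisms. The conf key, the environment variable, and the value `native_datafusion` come from the description above. The `SCAN_IMPL_CONF` variable and the `spark-shell` command line are illustrative assumptions; the command is only printed, and the settings needed to load Comet itself are omitted. Which setting wins when both are present is not stated here.

```shell
# Option A: the environment variable named in this PR description.
export COMET_PARQUET_SCAN_IMPL=native_datafusion

# Option B: the Spark conf key named in this PR description.
# SCAN_IMPL_CONF is a hypothetical helper variable, not part of Comet.
SCAN_IMPL_CONF="spark.comet.scan.impl=native_datafusion"

# Print (rather than run) a hypothetical launch command using the conf.
echo spark-shell --conf "$SCAN_IMPL_CONF"
```

Substitute `native_comet` or `native_iceberg_compat` to choose one of the other implementations from the table.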