andygrove opened a new issue, #3110:
URL: https://github.com/apache/datafusion-comet/issues/3110

   ## What is the problem the feature request solves?
   
   > **Note:** This issue was generated with AI assistance. The specification 
details have been extracted from Spark documentation and may need verification.
   
   Comet does not currently support Spark's internal `PreciseTimestampConversion` expression, so queries whose plans contain it (most notably time-window aggregations) fall back to Spark's JVM execution instead of running natively on DataFusion.
   
   `PreciseTimestampConversion` is an internal Spark Catalyst expression that converts TimestampType to Long and back without losing precision during time-window operations. Spark already stores TimestampType values internally as a Long count of microseconds since the Unix epoch, so the expression preserves microsecond-level precision by passing that internal value through unchanged and only changing the reported data type.
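
   For orientation, here is a simplified sketch of the Catalyst expression, paraphrased from `org.apache.spark.sql.catalyst.expressions.PreciseTimestampConversion` (member details vary across Spark versions):

   ```scala
   import org.apache.spark.sql.catalyst.expressions.{ExpectsInputTypes, Expression, UnaryExpression}
   import org.apache.spark.sql.types.{AbstractDataType, DataType}

   // Simplified, paraphrased sketch (not the exact Spark source): validates the
   // input against `fromType`, reports `toType` as the result type, and passes
   // the underlying value through untouched.
   case class PreciseTimestampConversion(
       child: Expression,
       fromType: DataType,
       toType: DataType)
     extends UnaryExpression with ExpectsInputTypes {

     override def inputTypes: Seq[AbstractDataType] = Seq(fromType)
     override def dataType: DataType = toType
     // Identity on Spark's internal Long-of-microseconds representation;
     // null-intolerant per the edge-case notes below.
     override protected def nullSafeEval(input: Any): Any = input
     // doGenCode: see the codegen sketch under "Edge Cases" below.
   }
   ```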
   
   Supporting this expression would allow more Spark workloads to benefit from 
Comet's native acceleration.
   
   ## Describe the potential solution
   
   ### Spark Specification
   
   **Syntax:**
   This is an internal expression not directly exposed in SQL or DataFrame API. 
It is automatically generated during time windowing operations.
   
   **Arguments:**
   | Argument | Type | Description |
   |----------|------|-------------|
   | child | Expression | The input expression to be converted |
   | fromType | DataType | The source data type for conversion |
   | toType | DataType | The target data type for conversion |
   
   **Return Type:** Returns the data type specified by the `toType` parameter, 
typically either TimestampType or LongType depending on conversion direction.
   
   **Supported Data Types:**
   Supports conversion between TimestampType and LongType while preserving 
microsecond precision for time windowing operations.
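
   Spark's internal representation of a TimestampType value is already a Long count of microseconds since the Unix epoch, which is why the round trip can be lossless. A small plain-Scala illustration (variable names are illustrative; no Spark required):

   ```scala
   import java.time.Instant

   // A TimestampType value is physically a Long of microseconds since the epoch,
   // so Timestamp -> Long -> Timestamp round-trips without loss.
   val t = Instant.parse("2024-01-01T12:34:56.123456Z")
   val micros: Long = t.getEpochSecond * 1000000L + t.getNano / 1000L
   val back = Instant.ofEpochSecond(micros / 1000000L, (micros % 1000000L) * 1000L)
   assert(back == t) // full microsecond precision preserved
   ```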
   
   **Edge Cases:**
   - **Null handling**: Expression is null-intolerant (`nullIntolerant = 
true`), meaning null inputs produce null outputs
   - **Type safety**: Input types are validated against the specified 
`fromType` through `ExpectsInputTypes` trait
   - **Precision preservation**: Maintains full microsecond precision during 
timestamp conversions
   - **Code generation**: Always uses the code-generation path, emitting a direct value assignment with no arithmetic (see the sketch after this list)
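
   The direct value assignment can be made concrete. Continuing the expression sketch above, the codegen path, paraphrased from Spark's implementation and subject to version differences, looks roughly like this:

   ```scala
   import org.apache.spark.sql.catalyst.expressions.codegen.Block._
   import org.apache.spark.sql.catalyst.expressions.codegen.{CodeGenerator, CodegenContext, ExprCode}

   // The generated Java merely aliases the child's null flag and value; no
   // arithmetic is performed, which is why microsecond precision is preserved.
   override protected def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
     val eval = child.genCode(ctx)
     ev.copy(code = eval.code +
       code"""boolean ${ev.isNull} = ${eval.isNull};
          |${CodeGenerator.javaType(dataType)} ${ev.value} = ${eval.value};
        """)
   }
   ```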
   
   **Examples:**
   ```sql
   -- This expression is not directly accessible in SQL
   -- It is automatically used internally during time window operations
   SELECT window(timestamp_col, '1 hour') FROM events;
   ```
   
   ```scala
   // Not directly accessible in the DataFrame API; the analyzer inserts
   // PreciseTimestampConversion while rewriting the window() call.
   import org.apache.spark.sql.functions.window
   import spark.implicits._ // for the $"timestamp" column syntax

   df.groupBy(window($"timestamp", "1 hour")).count()
   ```
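
   One way to observe the expression is to print the analyzed plan of a windowed aggregation; the analyzer's TimeWindowing rule introduces `PreciseTimestampConversion` nodes there. A sketch, assuming an active `SparkSession` named `spark` (column and variable names are illustrative):

   ```scala
   import org.apache.spark.sql.functions.window

   // Build a small DataFrame with a timestamp column and inspect the analyzed
   // plan; the window() rewrite wraps the timestamp arithmetic in
   // PreciseTimestampConversion.
   val events = spark.range(100).selectExpr("timestamp_seconds(id) AS ts")
   val windowed = events.groupBy(window(events("ts"), "1 hour")).count()
   println(windowed.queryExecution.analyzed)
   // Expect PreciseTimestampConversion(...) in the printed window expressions.
   ```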
   
   ### Implementation Approach
   
   See the [Comet guide on adding new 
expressions](https://datafusion.apache.org/comet/contributor-guide/adding_a_new_expression.html)
 for detailed instructions.
   
   1. **Scala Serde**: Add an expression handler in `spark/src/main/scala/org/apache/comet/serde/` (a hedged sketch follows this list)
   2. **Register**: Add to appropriate map in `QueryPlanSerde.scala`
   3. **Protobuf**: Add message type in `native/proto/src/proto/expr.proto` if 
needed
   4. **Rust**: Implement in `native/spark-expr/src/` (check if DataFusion has 
built-in support first)
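
   Because the expression is an identity on the physical value, steps 1 and 2 may be all that is needed. A rough sketch follows; the `CometExpressionSerde` trait, the `convert` signature, and the `QueryPlanSerde.exprToProtoInternal` helper are taken from the pattern described in the contributor guide and may not match the current codebase exactly:

   ```scala
   // Hypothetical sketch only: the CometExpressionSerde trait, the convert()
   // signature, and QueryPlanSerde.exprToProtoInternal follow the contributor
   // guide's pattern and may differ from the current Comet codebase.
   package org.apache.comet.serde

   import org.apache.spark.sql.catalyst.expressions.{Attribute, Expression, PreciseTimestampConversion}

   object CometPreciseTimestampConversion extends CometExpressionSerde {
     override def convert(
         expr: Expression,
         inputs: Seq[Attribute],
         binding: Boolean): Option[ExprOuterClass.Expr] = {
       val conv = expr.asInstanceOf[PreciseTimestampConversion]
       // The conversion is a no-op on the underlying microsecond Long, so one
       // option is to serialize only the child and let the reported type change
       // on the Spark side; the native plan then needs no new kernel.
       QueryPlanSerde.exprToProtoInternal(conv.child, inputs, binding)
     }
   }
   ```

   If a pure pass-through proves insufficient (for example, if the native plan needs an explicit type change), steps 3 and 4 with a dedicated protobuf message and a small Rust implementation are the fallback.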
   
   
   ## Additional context
   
   **Difficulty:** Medium
   **Spark Expression Class:** 
`org.apache.spark.sql.catalyst.expressions.PreciseTimestampConversion`
   
   **Related:**
   - TimeWindow expressions for windowing operations
   - UnaryExpression base class for single-input expressions
   - ExpectsInputTypes trait for type validation
   - TimestampType and LongType data types
   
   ---
   *This issue was auto-generated from Spark reference documentation.*
   

