andygrove commented on PR #3498:
URL: https://github.com/apache/datafusion-comet/pull/3498#issuecomment-3891775129
> Thanks for adding crc32 support @rafafrdz!
>
> I think `SparkCrc32` from `datafusion-spark` needs to be registered in the session context. If you look at `register_datafusion_spark_function()` in `native/core/src/execution/jni_api.rs`, you'll see that `SparkSha1` and `SparkSha2` are both explicitly imported and registered there, but `SparkCrc32` isn't yet. Without that, the fallback `registry.udf("crc32")` lookup won't find the function. Something like this should do it:
>
> ```rust
> use datafusion::logical_expr::ScalarUDF;
> use datafusion_spark::function::hash::crc32::SparkCrc32;
> // ...
> session_ctx.register_udf(ScalarUDF::new_from_impl(SparkCrc32::default()));
> ```
>
> For the tests, Comet now has a SQL file-based testing framework (`CometSqlFileTestSuite`) that's preferred for expression tests. Could you add `crc32.sql` there, following the instructions in https://datafusion.apache.org/comet/contributor-guide/sql-file-tests.html?
Tests currently fail, as expected, because the function isn't registered. @rafafrdz, how did you test this locally?
```
- hash functions *** FAILED *** (386 milliseconds)
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 3753.0 failed 1 times, most recent failure: Lost task 2.0 in stage 3753.0 (TID 12906) (localhost executor driver): org.apache.comet.CometNativeException: Error from DataFusion: There is no UDF named "crc32" in the registry. Use session context `register_udf` function to register a custom UDF.
```
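For reference, a minimal sketch of the missing registration, assuming it follows the existing `SparkSha1`/`SparkSha2` pattern described above (the helper name `register_crc32` and the use of `Default` are illustrative assumptions, not the actual code in `jni_api.rs`):

```rust
use datafusion::error::Result;
use datafusion::logical_expr::ScalarUDF;
use datafusion::prelude::SessionContext;
use datafusion_spark::function::hash::crc32::SparkCrc32;

// Sketch only: mirrors the registration pattern used for SparkSha1/SparkSha2
// in native/core/src/execution/jni_api.rs.
fn register_crc32(session_ctx: &SessionContext) -> Result<()> {
    session_ctx.register_udf(ScalarUDF::new_from_impl(SparkCrc32::default()));

    // Once registered, the fallback registry lookup that currently fails
    // with "There is no UDF named \"crc32\"" should succeed:
    let _crc32 = session_ctx.udf("crc32")?;
    Ok(())
}
```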
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]