Hey Jaro,

It is neither written in Java nor a UDF, but dask-sql [1] (Python-based) has
some examples where we extend DataFusion with custom grammar, CREATE MODEL
for example. In a nutshell, you want to write some Rust code that extends
the DataFusion parser and then performs any binding logic required when
your custom UDF statement is encountered. The processing chain is a little
lengthy to follow, but you can see where it starts here [2]. The
`DaskParser` holds the DataFusion parser itself as a member. Happy to give
more details; I just wanted to give you a place to start looking.
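
To make the shape of that pattern a bit more concrete, here is a rough,
std-only sketch. It is not dask-sql or DataFusion code: `InnerParser` is a
stand-in for the real DataFusion parser, and `ExtendedParser`, `Statement`,
`parse_create_function` and `strip_keyword` are hypothetical names made up
for illustration. The hand-rolled string splitting is only there to keep the
example self-contained; a real implementation would work on the wrapped
parser's token stream instead.

```rust
// Sketch only: every type and function here is hypothetical.

#[derive(Debug, PartialEq)]
enum Statement {
    /// Anything the wrapped parser understands, kept as raw SQL here.
    Standard(String),
    /// The custom statement handled by the wrapper itself.
    CreateFunction {
        name: String,
        params: Vec<(String, String)>, // (parameter name, type)
        return_type: String,
        body: String,
    },
}

/// Stand-in for the real DataFusion parser (assumption, not its API).
struct InnerParser;

impl InnerParser {
    fn parse(&self, sql: &str) -> Result<Statement, String> {
        Ok(Statement::Standard(sql.trim().to_string()))
    }
}

/// Wrapper that keeps the inner parser as a member and intercepts
/// the custom statement before delegating everything else.
struct ExtendedParser {
    inner: InnerParser,
}

impl ExtendedParser {
    fn parse(&self, sql: &str) -> Result<Statement, String> {
        let trimmed = sql.trim();
        // Simplification: expects exactly one space in "CREATE FUNCTION".
        match strip_keyword(trimmed, "CREATE FUNCTION") {
            Ok(rest) => parse_create_function(rest),
            Err(_) => self.inner.parse(trimmed),
        }
    }
}

/// Case-insensitively strips `keyword` from the start of `input`.
fn strip_keyword<'a>(input: &'a str, keyword: &str) -> Result<&'a str, String> {
    match input.get(..keyword.len()) {
        Some(head) if head.eq_ignore_ascii_case(keyword) => Ok(&input[keyword.len()..]),
        _ => Err(format!("expected keyword {}", keyword)),
    }
}

/// Parses `<name> (<p>: <type>, ...) RETURN <type> AS '<body>'`.
fn parse_create_function(rest: &str) -> Result<Statement, String> {
    let (name, rest) = rest.split_once('(').ok_or("expected '(' after function name")?;
    let name = name.trim();
    if name.is_empty() {
        return Err("expected a function name".to_string());
    }
    let (params, rest) = rest.split_once(')').ok_or("expected ')' after parameters")?;
    let params = params
        .split(',')
        .map(str::trim)
        .filter(|p| !p.is_empty())
        .map(|p| {
            p.split_once(':')
                .map(|(n, t)| (n.trim().to_string(), t.trim().to_string()))
                .ok_or_else(|| format!("expected 'name: type', got '{}'", p))
        })
        .collect::<Result<Vec<_>, String>>()?;

    let rest = strip_keyword(rest.trim(), "RETURN")?;
    let (return_type, rest) = rest
        .trim_start()
        .split_once(char::is_whitespace)
        .ok_or("expected a return type followed by AS")?;
    let rest = strip_keyword(rest.trim_start(), "AS")?;
    let body = rest
        .trim()
        .strip_prefix('\'')
        .and_then(|b| b.strip_suffix('\''))
        .ok_or("expected a single-quoted SQL body")?;

    Ok(Statement::CreateFunction {
        name: name.to_string(),
        params,
        return_type: return_type.to_string(),
        body: body.to_string(),
    })
}
```

Calling `ExtendedParser { inner: InnerParser }.parse(...)` on a CREATE
FUNCTION string yields the `CreateFunction` variant, which is where the
binding logic (registering the function) would hook in; any other SQL goes
straight to the inner parser untouched.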

Thanks,
Jeremy Dyer

[1] - https://github.com/dask-contrib/dask-sql
[2] - https://github.com/dask-contrib/dask-sql/blob/main/dask_planner/src/parser.rs#L385

On Thu, Jan 12, 2023 at 10:36 AM Jaroslaw Nowosad <yare...@gmail.com> wrote:

> Hi all,
>
> I had a task to investigate how to extend DataFusion to add UDFs written in
> plain SQL.
> Reason behind it: there is quite a large number of SQL UDFs in existing Java
> (Spark) solutions; however, we are starting to move into the Rust ecosystem,
> and DataFusion/Arrow/Ballista looks like the proper way.
>
> Question:
> Could I get some pointers on how to extend DF to add "CREATE FUNCTION AAA
> (p1: int, p2: int) RETURN INT AS '<sql style body here>'"?
>
> I saw some rewrite propositions, extending SQL parser with a new command or
> creating separate parser/dialect.
>
> Best Regards,
> Jaro
>
