[ https://issues.apache.org/jira/browse/CALCITE-4564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated CALCITE-4564:
------------------------------------
    Labels: pull-request-available  (was: )

> Initialization context for non-static user-defined functions (UDFs)
> -------------------------------------------------------------------
>
>                 Key: CALCITE-4564
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4564
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Assignee: Julian Hyde
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I propose to allow user-defined functions (UDFs) to read from an
> initialization context during construction. The initialization context would
> be a new Java {{interface UdfInitializer}} that provides, among other things,
> a type factory and the values of the arguments to the function call whose
> values are literals.
> The purpose of this feature is to allow functions to do more work at
> initialization time and less work on each invocation. Suppose I wanted to
> write a UDF {{regexMatch(pattern, string)}} that matches Java regular
> expressions. If {{pattern}} is a literal, I would like to create an instance
> of the function object that calls {{Pattern.compile(pattern)}} in its
> constructor and stores the resulting {{Pattern}} object as a field. Each
> invocation of the function can use that {{Pattern}} object, and does not have
> to pay the cost of compilation.
> In order to use this feature, a UDF class would have a public constructor
> with a single argument that is a {{UdfInitializer}}. The method that invokes
> the function, conventionally called {{eval}}, must be non-static.
> This feature is optional. A UDF that has a public constructor with zero
> arguments (which is the current contract for non-static UDFs) will continue
> to work.
> [class MyPlusFunction|https://github.com/apache/calcite/blob/4bc916619fd286b2c0cc4d5c653c96a68801d74e/core/src/test/java/org/apache/calcite/util/Smalls.java#L429]
> is an example of this kind of UDF.
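The {{regexMatch}} example above might be sketched as follows. This is only an illustration under stated assumptions: the issue names {{UdfInitializer}} but does not define its methods, so the single {{literalValue(int)}} method below is hypothetical (the proposal also mentions a type factory, which is omitted here), and {{RegexMatchFunction}} is an invented class name.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical shape of the proposed initialization context.
 * The issue does not specify its methods; this one is an assumption.
 */
interface UdfInitializer {
  /** Returns the value of argument {@code ordinal} if it is a literal, else null. */
  Object literalValue(int ordinal);
}

/** Sketch of a non-static UDF: regexMatch(pattern, string). */
class RegexMatchFunction {
  /** Pre-compiled pattern, or null if the pattern argument was not a literal. */
  private final Pattern compiled;

  /** Public single-argument constructor, as the proposal requires. */
  public RegexMatchFunction(UdfInitializer initializer) {
    Object pattern = initializer.literalValue(0);
    this.compiled = pattern instanceof String
        ? Pattern.compile((String) pattern)  // pay the compilation cost once
        : null;
  }

  /** Non-static method invoked once per row. */
  public boolean eval(String pattern, String s) {
    // Fall back to per-call compilation when the pattern was not a literal.
    Pattern p = compiled != null ? compiled : Pattern.compile(pattern);
    return p.matcher(s).matches();
  }
}
```

When the pattern is a literal, each call to {{eval}} reuses the stored {{Pattern}}; otherwise the function behaves like a UDF with no initialization context and compiles the pattern on every call.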
> This feature would apply to all UDFs, including table functions (i.e. those
> whose arguments are tables or which return tables) and aggregate functions.
> The initialization context would not affect the type-derivation aspects of
> the function. The return type, operand types, and so forth will already have
> been derived at validate time, and are complete well before any code is
> generated or executed. If you want to control type derivation, you should
> create your own sub-class of {{SqlOperator}}, as today.
> There are some implementation challenges:
> * The code generator will need to generate an instance of {{UdfInitializer}}
> for each UDF call that occurs in the query. Some data structures that are
> readily available at validate time (e.g. {{RexCall}}) are not easily
> re-created at run time, so we should be conservative about what information
> is available via {{UdfInitializer}}.
> * The code generator must ensure that those instances are constructed exactly
> once during the execution of the query; those instances should not be
> variables in the {{execute}} method, but should instead be fields, or perhaps
> static fields, in the generated class.
> * This functionality needs to work through both the interpreter ({{Bindable}}
> convention) and generated code ({{Enumerable}} convention).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
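
The second implementation challenge (constructing each UDF instance exactly once per query) could look roughly like the sketch below. It is not Calcite's actual generated code: {{CountingUdf}} and {{GeneratedQuerySketch}} are hypothetical names, the UDF uses a zero-argument constructor to keep the example self-contained, and the construction counter exists only to make the "exactly once" property observable.

```java
/** Hypothetical UDF that records how many times it has been constructed. */
class CountingUdf {
  static int constructions = 0;

  public CountingUdf() {
    constructions++;
  }

  /** Non-static method invoked once per row. */
  public int eval(int x) {
    return x + 1;
  }
}

/**
 * Hand-written stand-in for a generated class. The UDF instance is held
 * in a field, not in a local variable of execute(), so it is built once
 * when the class is instantiated rather than once per row.
 */
class GeneratedQuerySketch {
  private final CountingUdf udf0 = new CountingUdf();

  int[] execute(int[] rows) {
    int[] out = new int[rows.length];
    for (int i = 0; i < rows.length; i++) {
      out[i] = udf0.eval(rows[i]);  // reuses the single instance
    }
    return out;
  }
}
```

Creating the UDF inside the loop body, or as a local in {{execute}} that is re-entered per row, would repeat any expensive initialization work; holding it in a field avoids that.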