westonpace commented on pull request #12590:
URL: https://github.com/apache/arrow/pull/12590#issuecomment-1084979289
This is for function registration so I don't think there will be any dynamic
content here. For example, we currently have "scalar" and we know we will need
"aggregate". There may be other classes of UDF that we add but each time we do
so it will be intentional and accompanied by a code change. I don't think we're
looking into dynamically adding new classes of UDFs.
Function options & state are a slightly different story. We do want to
support dynamic function option content. However, from the C++ perspective,
both of these things will just be `PyObject*`. For example, maybe a user
defines a custom datetime formatting UDF and they want to take the format
pattern and locale in as objects. I think we could do it this way...
```
class UdfOptions(object):
    def __init__(self):
        pass

class CustomDateFormatOptions(UdfOptions):
    def __init__(self, format, locale):
        self.format = format
        self.locale = locale
```
Function registration remains unchanged. Later, when they call their UDF we
would do something like...
```
pc.call_function("custom_date_format", [timestamps_arr],
                 CustomDateFormatOptions(my_format, my_locale))
```
Then the Python layer could check whether the options object extends
`UdfOptions` and, if it does, pass it to some kind of `CUdfOptions`:
```
struct UdfOptions {
  PyObject* options_obj;
};
```
...then we'd probably need some logic when we are actually calling the
PyFunc to grab the PyObject out of the options so that hopefully the UDF itself
could look something like...
```
def custom_date_format(arr, options):
    ...
```
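To make the flow above concrete, here is a rough, Python-only sketch of the check, wrap, unwrap, and call steps. It is only an illustration under assumptions: `FlatUdfOptions`, `wrap_options`, and `call_udf` are hypothetical names standing in for the C++ struct and the binding logic, a plain list of `datetime` values stands in for an Arrow timestamp array, and the locale is carried along but not applied. None of this is existing pyarrow API.

```python
from datetime import datetime


class UdfOptions(object):
    """Marker base class, as in the snippet earlier in this comment."""
    def __init__(self):
        pass


class CustomDateFormatOptions(UdfOptions):
    def __init__(self, format, locale):
        self.format = format
        self.locale = locale


class FlatUdfOptions(object):
    """Hypothetical Python stand-in for the flat C++ struct."""
    def __init__(self, options_obj):
        self.options_obj = options_obj


def wrap_options(options):
    """Hypothetical helper: wrap only objects that extend UdfOptions."""
    if isinstance(options, UdfOptions):
        return FlatUdfOptions(options)
    return options


def call_udf(func, args, options):
    """Hypothetical call path: unwrap the stored object, then call the UDF."""
    wrapped = wrap_options(options)
    if isinstance(wrapped, FlatUdfOptions):
        return func(*args, wrapped.options_obj)
    return func(*args, wrapped)


def custom_date_format(arr, options):
    # A plain list of datetimes stands in for an Arrow timestamp array.
    # The locale is carried along but not applied in this sketch.
    return [value.strftime(options.format) for value in arr]


formatted = call_udf(custom_date_format,
                     [[datetime(2022, 3, 31)]],
                     CustomDateFormatOptions("%Y-%m-%d", "en_US"))
```

The point of the sketch is that the wrapper never inspects the options; it only holds one opaque object and hands it back to the UDF, which is why a single field is enough.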
In summary, I think a flat struct should be sufficient for any cases we need
to tackle.