westonpace commented on pull request #12590:
URL: https://github.com/apache/arrow/pull/12590#issuecomment-1084979289


   This is for function registration so I don't think there will be any dynamic 
content here.  For example, we currently have "scalar" and we know we will need 
"aggregate".  There may be other classes of UDF that we add but each time we do 
so it will be intentional and accompanied by a code change.  I don't think we're 
looking into dynamically adding new classes of UDFs.
   
   Function options & state are a slightly different story.  We do want to 
support dynamic function option content.  However, from the C++ perspective, 
both of these things will just be `PyObject*`.  For example, maybe a user 
defines a custom datetime formatting UDF and they want to take the format 
pattern and locale in as objects.  I think we could do it this way...
   
   ```
   class UdfOptions(object):
     def __init__(self):
       pass
   
   class CustomDateFormatOptions(UdfOptions):
     def __init__(self, format, locale):
       self.format = format
       self.locale = locale
   ```
   
   Function registration remains unchanged.  Later, when they call their UDF we 
would do something like...
   
   ```
   pc.call_function("custom_date_format", [timestamps_arr],
                    CustomDateFormatOptions(my_format, my_locale))
   ```
   
   Then the Python layer could check whether the options object extends 
`UdfOptions` and, if it does, wrap it in some kind of `CUdfOptions`:
   
   ```
   // Flat wrapper: a single opaque reference to the user's Python options object.
   struct UdfOptions {
     PyObject* options_obj;
   };
   ```
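   
   The Python-side check could be sketched roughly as follows. `CUdfOptionsStandIn` and `wrap_options` are hypothetical names; the stand-in class only imitates the flat C++ struct in pure Python so the idea can be run without any bindings.

```python
class UdfOptions(object):
    """Base class that user-defined option classes extend."""


class CUdfOptionsStandIn(object):
    # Hypothetical pure-Python stand-in for the flat C++ struct:
    # it holds one opaque reference to the user's options object.
    def __init__(self, options_obj):
        self.options_obj = options_obj


def wrap_options(options):
    # Only objects extending UdfOptions are treated as UDF options;
    # anything else is returned unchanged for the existing code path.
    if isinstance(options, UdfOptions):
        return CUdfOptionsStandIn(options)
    return options
```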
   
   ...then we'd probably need some logic when we are actually calling the 
PyFunc to grab the PyObject out of the options so that hopefully the UDF itself 
could look something like...
   
   ```
   def custom_date_format(arr, options):
     ...
   ```
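   
   To make the last step concrete, here is a minimal pure-Python sketch of unwrapping the options and handing them to the UDF. It assumes a plain list of `datetime` values instead of an Arrow array, uses a dict as a stand-in for the C++ struct, and `call_udf` is a hypothetical helper, not an Arrow function.

```python
from datetime import datetime


class UdfOptions(object):
    pass


class CustomDateFormatOptions(UdfOptions):
    def __init__(self, format, locale):
        self.format = format
        self.locale = locale


def custom_date_format(arr, options):
    # The UDF sees the user's original Python options object.
    # The locale is carried along but left unused in this sketch.
    return [ts.strftime(options.format) for ts in arr]


def call_udf(func, args, wrapped_options):
    # Hypothetical dispatch step: pull the Python object back out of the
    # flat wrapper (a dict here, standing in for the C++ struct).
    return func(*args, wrapped_options["options_obj"])


timestamps = [datetime(2022, 3, 31, 12, 0)]
wrapped = {"options_obj": CustomDateFormatOptions("%Y-%m-%d", "en_US")}
result = call_udf(custom_date_format, [timestamps], wrapped)
```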
   
   In summary, I think a flat struct should be sufficient for any cases we need 
to tackle.

