findepi commented on issue #8051:
URL: https://github.com/apache/datafusion/issues/8051#issuecomment-2348186197

   > The only downside I see is that the "state" is not going to be serialised 
if it has to be distributed in systems like ballista.
   
   That's because we don't use expressions.
   What about using `simplify` all the way down? The `PrecompiledRegexpMatch` 
could be dumped as a byte buffer (varbinary) into an expression and then cast 
back to `PrecompiledRegexpMatch`. This works only as long as it is a flat 
structure. It won't work for anything that holds pointers internally and 
therefore requires actual serialization.
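
To illustrate the "flat structure" caveat, here is a minimal sketch of a pointer-free state round-tripping through a byte buffer. `FlatState` and its fields are hypothetical and are not the real layout of `PrecompiledRegexpMatch`. A real compiled regex holds heap allocations internally, so it would not survive this kind of copy.

```rust
/// Hypothetical pointer-free state. Every field is a plain integer,
/// so the value can be copied into a byte buffer and rebuilt from it.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FlatState {
    pub flags: u32,
    pub min_len: u64,
}

impl FlatState {
    /// 4 bytes for `flags` + 8 bytes for `min_len`.
    pub const ENCODED_LEN: usize = 12;

    /// Encode the fields in little-endian order (the "varbinary" payload).
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut buf = Vec::with_capacity(Self::ENCODED_LEN);
        buf.extend_from_slice(&self.flags.to_le_bytes());
        buf.extend_from_slice(&self.min_len.to_le_bytes());
        buf
    }

    /// Rebuild the state from the payload; returns `None` on a length mismatch.
    pub fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != Self::ENCODED_LEN {
            return None;
        }
        let mut flags = [0u8; 4];
        flags.copy_from_slice(&bytes[0..4]);
        let mut min_len = [0u8; 8];
        min_len.copy_from_slice(&bytes[4..12]);
        Some(FlatState {
            flags: u32::from_le_bytes(flags),
            min_len: u64::from_le_bytes(min_len),
        })
    }
}
```

A struct containing a `String`, `Vec`, or `Box` could not be handled this way, because the bytes would only capture a pointer that is meaningless in another process.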
   
   Alternatively, we can avoid all this complexity at the cost of a different, 
conceptually simpler kind of complexity.
   Imagine that `ScalarUDF` invoke gets an option to create a thread-local 
scratch space that it can reuse across all invocations. That would make reusing 
a compiled pattern easier, without having to serialize it in the plan.
   The downside is that the implementation would need to explicitly check 
whether the pattern is still the same on every invocation (one equality check 
per batch).
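
A rough sketch of the thread-local scratch space idea, using only the standard library. Everything here is an assumption for illustration: `CompiledPattern` is a stand-in for a real compiled regex (it only does substring matching), and `invoke_batch` is a hypothetical function, not DataFusion's actual `ScalarUDF` API. The compile counter exists only to make the reuse observable.

```rust
use std::cell::{Cell, RefCell};

/// Hypothetical stand-in for an expensive-to-build compiled regex.
pub struct CompiledPattern {
    source: String,
}

impl CompiledPattern {
    /// Pretend this is the expensive step; count how often it runs.
    fn compile(pattern: &str) -> Self {
        COMPILE_COUNT.with(|c| c.set(c.get() + 1));
        CompiledPattern { source: pattern.to_string() }
    }

    /// Simplified matching: substring search instead of a real regex.
    fn is_match(&self, value: &str) -> bool {
        value.contains(&self.source)
    }
}

thread_local! {
    /// Number of compilations performed on this thread (for illustration).
    static COMPILE_COUNT: Cell<usize> = Cell::new(0);
    /// Per-thread scratch space holding the last compiled pattern.
    static SCRATCH: RefCell<Option<CompiledPattern>> = RefCell::new(None);
}

/// Hypothetical per-batch entry point: one equality check per batch,
/// recompiling only when the pattern differs from the cached one.
pub fn invoke_batch(pattern: &str, values: &[&str]) -> Vec<bool> {
    SCRATCH.with(|cell| {
        let mut slot = cell.borrow_mut();
        // Take the cached value out; reuse it only if the pattern is unchanged.
        let compiled = match slot.take() {
            Some(cached) if cached.source == pattern => cached,
            _ => CompiledPattern::compile(pattern),
        };
        let out: Vec<bool> = values.iter().map(|v| compiled.is_match(v)).collect();
        // Put the compiled pattern back for the next batch on this thread.
        *slot = Some(compiled);
        out
    })
}

/// Compilations performed so far on the current thread.
pub fn compile_count() -> usize {
    COMPILE_COUNT.with(|c| c.get())
}
```

Nothing is stored in the plan, so nothing needs to be serialized. The trade-off is that each thread compiles the pattern at least once.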

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

