findepi commented on issue #8051: URL: https://github.com/apache/datafusion/issues/8051#issuecomment-2348186197
> The only downside I see is that the "state" is not going to be serialised if it has to be distributed in systems like ballista.

That's because we don't use expressions. What about using simplify all the way down? The `PrecompiledRegexpMatch` could be dumped as a bytes buffer (varbinary) into an expression and then cast back to `PrecompiledRegexpMatch`. This works as long as it's a flat structure. It won't work when the structure has pointers internally and requires actual serialization.

Alternatively, we can avoid all this complexity, at the cost of a different kind of complexity that is conceptually simpler. Let's imagine ScalarUDF invoke gets an option to create a thread-local scratch space that it can reuse across all invocations. That would make reusing a compiled pattern easier, without having to serialize it in the plan. The downside is that the implementation would need to explicitly check on every invocation whether the pattern is still the same (one equality check per batch).
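
To make the thread-local idea concrete, here is a rough sketch using only the standard library. All names in it (`CompiledPattern`, `SCRATCH`, `invoke_batch`, `compile_count`) are hypothetical and not part of DataFusion. `CompiledPattern` is a stand-in that does a substring search rather than a real regex, and the compile counter exists only to make the reuse observable.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for something expensive to build, e.g. a compiled regex.
/// Here it only does a substring search.
struct CompiledPattern {
    needle: String,
}

impl CompiledPattern {
    fn compile(pattern: &str) -> Self {
        // Count compilations per thread so that reuse is observable.
        COMPILE_COUNT.with(|c| *c.borrow_mut() += 1);
        CompiledPattern { needle: pattern.to_string() }
    }

    fn is_match(&self, value: &str) -> bool {
        value.contains(&self.needle)
    }
}

thread_local! {
    // Number of compilations performed on this thread (illustration only).
    static COMPILE_COUNT: RefCell<usize> = RefCell::new(0);
    // The scratch space: the last pattern seen on this thread and its compiled form.
    static SCRATCH: RefCell<Option<(String, CompiledPattern)>> = RefCell::new(None);
}

/// Hypothetical per-batch entry point: one pattern, many values.
fn invoke_batch(pattern: &str, values: &[&str]) -> Vec<bool> {
    SCRATCH.with(|scratch| {
        let mut slot = scratch.borrow_mut();

        // The equality check, done once per batch rather than once per row.
        let reuse = matches!(&*slot, Some((cached, _)) if cached.as_str() == pattern);
        if !reuse {
            *slot = Some((pattern.to_string(), CompiledPattern::compile(pattern)));
        }

        let compiled = match &*slot {
            Some((_, compiled)) => compiled,
            None => unreachable!("slot was filled above"),
        };
        values.iter().map(|v| compiled.is_match(v)).collect()
    })
}

/// Number of compilations performed so far on the current thread.
fn compile_count() -> usize {
    COMPILE_COUNT.with(|c| *c.borrow())
}
```

Calling `invoke_batch` twice in a row with the same pattern on one thread compiles once. A different pattern replaces the cached entry. Nothing here has to be serialized into the plan, since each thread on each node rebuilds its own scratch state on first use.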
