danielhumanmod commented on PR #2831: URL: https://github.com/apache/datafusion-comet/pull/2831#issuecomment-3693765968
> Is there a way to be confident that an arbitrary user-provided regular expression, which Spark expects to be in Java format, is:
>
> 1. valid in the intended native version
> 2. has the same semantics 😅
>
> e.g.
>
> * [[CH] fallback for unsupported regex in re2 incubator-gluten#7866](https://github.com/apache/incubator-gluten/issues/7866)
> * https://docs.rs/regex/latest/regex/ doesn't support backreferences--in this case at least you could get a plan-time error, assuming the pattern argument is constant

Good point! I share the same concern about semantic differences between Java's regex engine and the native implementation. That is exactly why `isSupportedPattern` is currently hardcoded to return `false`. We are taking a conservative approach here: the feature is effectively disabled until we can prove via testing (as mentioned by Andy: https://github.com/apache/datafusion-comet/pull/2831#discussion_r2640735448)
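
To make the plan-time idea from the quoted comment concrete, here is a minimal sketch of what a conservative check on a constant pattern could look like. It is not Comet's `isSupportedPattern`; the class and method names (`RegexSupportCheck`, `isConservativelySupported`) are hypothetical. It assumes only two facts about the Rust `regex` crate: it supports neither backreferences nor look-around. The list is not exhaustive, and passing this check does not show that the two engines agree on semantics, so it could only complement the testing described above.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

/** Hypothetical helper for illustration only; not part of Comet. */
final class RegexSupportCheck {
    private RegexSupportCheck() {}

    /**
     * Returns true only if the pattern is a valid Java regex and contains none of
     * the constructs this sketch knows the Rust regex crate rejects
     * (backreferences and look-around). A true result does NOT prove that both
     * engines give the same matches.
     */
    static boolean isConservativelySupported(String pattern) {
        if (pattern == null) {
            return false;
        }
        try {
            Pattern.compile(pattern); // must at least be valid in Java
        } catch (PatternSyntaxException e) {
            return false;
        }
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '\\') {
                if (i + 1 >= pattern.length()) {
                    return false; // dangling escape; defensive, Java rejects it above
                }
                char next = pattern.charAt(i + 1);
                // \1..\9 is a numbered backreference, \k<name> a named one.
                if ((next >= '1' && next <= '9') || next == 'k') {
                    return false;
                }
                i++; // skip the escaped character
            } else if (c == '(') {
                // Look-ahead and look-behind. Character-class context is not
                // tracked, so e.g. "[(?=]" is rejected too; that errs on the
                // side of falling back.
                if (pattern.startsWith("(?=", i) || pattern.startsWith("(?!", i)
                        || pattern.startsWith("(?<=", i) || pattern.startsWith("(?<!", i)) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

With a check along these lines, a pattern such as `(a)\1` or `foo(?=bar)` would be rejected at plan time and the expression would stay on the Spark side, while a named group such as `(?<year>\d{4})` would pass this particular filter.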
