[ 
https://issues.apache.org/jira/browse/FLINK-39649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-39649:
-----------------------------------
    Labels: pull-request-available  (was: )

> REGEXP_EXTRACT plan-time validation and hot-path log cleanup
> ------------------------------------------------------------
>
>                 Key: FLINK-39649
>                 URL: https://issues.apache.org/jira/browse/FLINK-39649
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / API, Table SQL / Planner, Table SQL / Runtime
>            Reporter: Ramin Gharib
>            Priority: Major
>              Labels: pull-request-available
>
> SqlFunctionUtils.regexpExtract compiles the regex for every record and calls 
> LOG.error whenever compilation throws PatternSyntaxException. When the pattern 
> is a string literal, it is already known at planning time, so an invalid 
> pattern can be rejected there.
> h3. Reproducer
>  
> {code:sql}
> SELECT REGEXP_EXTRACT(payload, '(', 1) FROM src;
> {code}
>  
> The pattern '(' contains an unclosed group, so it is not a valid regex. The 
> job nevertheless plans successfully, and the runtime logs one stack trace per 
> processed record.
> h3. Fix
>  # Add RegexpExtractInputTypeStrategy, which compiles a literal regex during 
> inferInputTypes and rejects an invalid one via callContext.fail(...).
>  # Route BuiltInFunctionDefinitions.REGEXP_EXTRACT through it.
>  # Update SqlFunctionUtils.regexpExtract to use REGEXP_PATTERN_CACHE and 
> silently return null on compile failure, with no LOG.error on the hot path.
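
The fix steps above can be sketched in plain Java as follows. This is a minimal illustration and not the code from the pull request: RegexpExtractSketch, validateLiteral and PATTERN_CACHE are hypothetical names, the map is a simplified stand-in for REGEXP_PATTERN_CACHE, and the Flink-specific pieces (the input type strategy and callContext.fail) are approximated by a method that returns an error message.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class RegexpExtractSketch {

    // Hypothetical stand-in for REGEXP_PATTERN_CACHE; the real cache may differ.
    private static final Map<String, Pattern> PATTERN_CACHE = new ConcurrentHashMap<>();

    /**
     * Plan-time check (assumed shape): compile the literal once and report a
     * problem up front. Returns null when the literal is a valid regex.
     */
    static String validateLiteral(String regexLiteral) {
        try {
            Pattern.compile(regexLiteral);
            return null;
        } catch (PatternSyntaxException e) {
            return "Invalid regular expression for REGEXP_EXTRACT: " + e.getDescription();
        }
    }

    /**
     * Runtime path (assumed shape): reuse a cached Pattern and return null,
     * without logging, when the regex cannot be compiled.
     */
    static String regexpExtract(String str, String regex, int extractIndex) {
        if (str == null || regex == null || extractIndex < 0) {
            return null;
        }
        Pattern pattern = PATTERN_CACHE.get(regex);
        if (pattern == null) {
            try {
                pattern = Pattern.compile(regex);
            } catch (PatternSyntaxException e) {
                return null; // no per-record logging on the hot path
            }
            PATTERN_CACHE.put(regex, pattern);
        }
        Matcher matcher = pattern.matcher(str);
        if (matcher.find() && extractIndex <= matcher.groupCount()) {
            return matcher.group(extractIndex);
        }
        return null;
    }

    public static void main(String[] args) {
        // The reproducer's literal is rejected by the plan-time check.
        System.out.println(validateLiteral("("));
        // A valid pattern extracts the requested group.
        System.out.println(regexpExtract("order-123", "(\\d+)", 1));
        // A non-literal invalid pattern reaching the runtime yields null.
        System.out.println(regexpExtract("order-123", "(", 1));
    }
}
```

In this sketch an invalid pattern is not cached, so a pattern that arrives per record (for example from a column) and fails to compile would still be recompiled each time; whether the actual change caches failures is not stated in the issue.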



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
