[ 
https://issues.apache.org/jira/browse/CALCITE-5910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752541#comment-17752541
 ] 

Julian Hyde commented on CALCITE-5910:
--------------------------------------

+1 to calling out exactly which flavor of regular expressions is supported by 
functions such as REGEXP_CONTAINS. (I wish there were a 
[standard|https://xkcd.com/927/], but sadly there isn't.)

I have heard that RE2 is equivalent to Java regexp (the same set of valid 
expressions, and the same semantics for those expressions), but that the two 
handle invalid expressions differently (Java regexp will sometimes fail to 
match rather than giving an error). Can anyone confirm or deny that? If you 
deny it, please give a regex that is valid in one but not the other.
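
As a possible starting point for that check, here is a minimal sketch that tests only the Java side. The candidate expressions (a backreference, a lookahead, a possessive quantifier) are constructs that the RE2 syntax documentation lists as not supported. The RE2 half of the comparison is taken from that documentation and is not exercised by this code. The class name and the helper {{compilesInJava}} are hypothetical, introduced for illustration only.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class RegexFlavorCheck {
  /** Returns true if java.util.regex accepts the expression. */
  static boolean compilesInJava(String regex) {
    try {
      Pattern.compile(regex);
      return true;
    } catch (PatternSyntaxException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    // Constructs that RE2's syntax documentation marks as unsupported
    // (assumption: not verified here against an actual RE2 build).
    String[] candidates = {
      "(a)\\1",   // backreference
      "a(?=b)",   // positive lookahead
      "a*+"       // possessive quantifier
    };
    for (String regex : candidates) {
      System.out.println(regex + " compiles in Java: " + compilesInJava(regex));
    }
  }
}
```

If the same strings are rejected by the RE2 implementation in use, they would be examples of expressions valid in Java but not in RE2.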

> Add REGEXP_EXTRACT and REGEXP_SUBSTR functions (enabled in BigQuery library)
> ----------------------------------------------------------------------------
>
>                 Key: CALCITE-5910
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5910
>             Project: Calcite
>          Issue Type: Task
>            Reporter: Jerin John
>            Assignee: Jerin John
>            Priority: Major
>              Labels: pull-request-available
>
> Add support for 
> [REGEXP_EXTRACT|https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#regexp_extract]
>  and 
> [REGEXP_SUBSTR|https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#regexp_substr]
>  functions from BigQuery.
> *{{REGEXP_EXTRACT(value, regexp[, position[, occurrence]])}}*
> Returns the substring in {{value}} that matches the regular expression 
> {{{}regexp{}}}. Returns {{NULL}} if there is no match.
>  * If the regular expression contains a capturing group ({{{}(...){}}}), and 
> there is a match for that capturing group, that match is returned. If there 
> are multiple matches for a capturing group, the last match is returned.
>  * If {{position}} is specified, the search starts at this position in 
> {{{}value{}}}, otherwise it starts at the beginning of {{{}value{}}}.
>  * If {{occurrence}} is specified, the search returns a specific occurrence 
> of the {{regexp}} in {{{}value{}}}, otherwise returns the first match.
>  
> *{{REGEXP_SUBSTR(value, regexp[, position[, occurrence]])}}*
> Synonym for {{REGEXP_EXTRACT}}.
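
The semantics described above can be sketched on top of {{java.util.regex}} as follows. This is an illustration under stated assumptions, not the Calcite implementation: the class and method names are hypothetical, {{position}} and {{occurrence}} are treated as 1-based, a pattern with more than one capturing group is rejected, and Java regexp is used in place of RE2.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexpExtractSketch {
  /**
   * Hypothetical helper mirroring REGEXP_EXTRACT(value, regexp, position, occurrence).
   * Returns null when there is no match.
   */
  static String regexpExtract(String value, String regexp, int position, int occurrence) {
    if (position < 1 || occurrence < 1) {
      throw new IllegalArgumentException("position and occurrence must be >= 1");
    }
    if (position - 1 > value.length()) {
      return null; // start position lies beyond the end of the value
    }
    Matcher m = Pattern.compile(regexp).matcher(value);
    if (m.groupCount() > 1) {
      // Assumption: at most one capturing group is allowed.
      throw new IllegalArgumentException("at most one capturing group is supported");
    }
    // Find the first match at or after the 0-based start index,
    // then advance until the requested occurrence is reached.
    boolean found = m.find(position - 1);
    for (int i = 1; found && i < occurrence; i++) {
      found = m.find();
    }
    if (!found) {
      return null;
    }
    // With a capturing group, return what it captured; otherwise the whole match.
    return m.groupCount() == 1 ? m.group(1) : m.group();
  }

  public static void main(String[] args) {
    System.out.println(regexpExtract("foo@example.com", "@([a-z.]+)", 1, 1));        // example.com
    System.out.println(regexpExtract("Hello Helloo and Hellooo", "H?ello+", 1, 2)); // Helloo
    System.out.println(regexpExtract("Hello Helloo and Hellooo", "H?ello+", 1, 4)); // null
  }
}
```

Under these assumptions, {{REGEXP_SUBSTR}} would simply delegate to the same helper.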



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
