[ https://issues.apache.org/jira/browse/FLINK-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Timo Walther updated FLINK-10281:
---------------------------------
    Description: 
For example, with a regular expression that matches word characters ("\w") or digits ("\d"):
{code:java}
testAllApis(
  // OK: the method receives 'foo([\w]+)'.
  "foothebar".regexExtract("foo([\\w]+)", 1),
  // Fails: the method receives 'foo([\\w]+)' and returns "".
  // Passing 'foo([\\w]+)' instead results in a compile error.
  "'foothebar'.regexExtract('foo([\\\\w]+)', 1)",
  // OK: the method receives 'foo([\w]+)', but four backslashes must be passed.
  "REGEX_EXTRACT('foothebar', 'foo([\\\\w]+)', 1)",
  "thebar"
)
{code}
The "similar to" function has the same issue.
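
The difference between the passing and failing cases can be reproduced with
plain java.util.regex, outside of Flink. The sketch below is only an
illustration: the class RegexEscapeDemo and the helper extract are
hypothetical names, and returning "" on a failed match is an assumption that
mirrors the behavior reported above, not Flink's implementation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexEscapeDemo {

    // Hypothetical helper: returns the requested group, or "" when the
    // pattern does not match (assumed to mirror the reported behavior).
    static String extract(String input, String regex, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(group) : "";
    }

    public static void main(String[] args) {
        // One level of Java escaping: the regex engine sees foo([\w]+).
        String oneLevel = "foo([\\w]+)";
        // Two levels of escaping: the regex engine sees foo([\\w]+),
        // a character class of a literal backslash and the letter 'w'.
        String twoLevels = "foo([\\\\w]+)";

        System.out.println(extract("foothebar", oneLevel, 1));  // prints "thebar"
        System.out.println(extract("foothebar", twoLevels, 1)); // prints an empty line
    }
}
```

The second pattern finds nothing because the character after "foo" is 't',
which is neither a backslash nor 'w'.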

 

Update:

In the past, proper escaping of quotes was not possible in the Table API, and
SQL and the SQL Client were not standard compliant.

Due to FLINK-8301, backslashes were interpreted in SQL literals; however, they
should only be used in SQL `U&'\1234'` literals. For the Table API, the new
logic relies on Java/Scala escaping and uses duplicated quotes to escape
quotes in expression strings. For SQL, we rely on Unicode string literals
with or without the UESCAPE clause. The SQL Client previously used backslashes
to escape new lines. The SQL Client now allows unescaped new lines and uses
';' to terminate statements, similar to other SQL clients.
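
As a rough illustration of the duplicated-quote rule for expression strings,
the sketch below decodes a single-quoted literal by collapsing each '' into
one quote and leaving backslashes untouched. The class QuoteEscapeSketch and
the method literalValue are hypothetical names for this example only; this is
not Flink's expression parser, and it does not reject malformed literals that
contain a lone quote.

```java
class QuoteEscapeSketch {

    // Hypothetical helper: takes a literal including its surrounding single
    // quotes and returns the value it denotes. Only '' is treated as an
    // escape; backslashes are passed through unchanged.
    static String literalValue(String literal) {
        int n = literal.length();
        if (n < 2 || literal.charAt(0) != '\'' || literal.charAt(n - 1) != '\'') {
            throw new IllegalArgumentException("not a single-quoted literal: " + literal);
        }
        String body = literal.substring(1, n - 1);
        return body.replace("''", "'");
    }

    public static void main(String[] args) {
        // A duplicated quote yields one quote in the value.
        System.out.println(literalValue("'It''s'"));       // prints "It's"
        // Only Java escaping applies, so the value is foo([\w]+).
        System.out.println(literalValue("'foo([\\w]+)'")); // prints "foo([\w]+)"
    }
}
```

Under this scheme a regular expression needs only the escaping that the Java
or Scala string literal itself requires.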

 

> Table function parse regular expression contains backslash failed
> -----------------------------------------------------------------
>
>                 Key: FLINK-10281
>                 URL: https://issues.apache.org/jira/browse/FLINK-10281
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>            Reporter: vinoyang
>            Assignee: vinoyang
>            Priority: Major
>              Labels: pull-request-available


