[ https://issues.apache.org/jira/browse/FLINK-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Timo Walther updated FLINK-10281:
---------------------------------
Description:
For example, a regular expression that matches word characters ("\w") or digits ("\d"):
{code:java}
testAllApis(
  // OK: the method receives 'foo([\w]+)'
  "foothebar".regexExtract("foo([\\w]+)", 1),
  // Fails: the method receives 'foo([\\w]+)' and returns "";
  // passing 'foo([\\w]+)' instead causes a compile error
  "'foothebar'.regexExtract('foo([\\\\w]+)', 1)",
  // OK: the method receives 'foo([\w]+)', but four '\' must be passed
  "REGEX_EXTRACT('foothebar', 'foo([\\\\w]+)', 1)",
  "thebar"
)
{code}
The "similar to" function has the same issue.
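The difference between the two pattern strings can be reproduced with plain java.util.regex, independent of Flink. The following is only a minimal sketch: the class RegexEscapeDemo and its extract helper are hypothetical stand-ins for regexExtract, not Flink code, and they assume that a non-matching pattern yields an empty string, as reported above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexEscapeDemo {

    // Hypothetical stand-in for regexExtract (an assumption, not Flink's implementation):
    // returns the requested group, or "" when the pattern does not match.
    static String extract(String input, String regex, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(group) : "";
    }

    public static void main(String[] args) {
        // The Java literal "foo([\\w]+)" is the regex foo([\w]+):
        // the character class contains the word-character shorthand \w.
        System.out.println("[" + extract("foothebar", "foo([\\w]+)", 1) + "]");

        // The Java literal "foo([\\\\w]+)" is the regex foo([\\w]+):
        // the character class now contains a literal backslash and the letter 'w',
        // so nothing after "foo" matches in "foothebar".
        System.out.println("[" + extract("foothebar", "foo([\\\\w]+)", 1) + "]");
    }
}
```

The first call extracts "thebar", while the second finds no match. This is consistent with the empty result described for the expression-string variant, where the method ends up receiving the pattern with two backslashes.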
Update:
In the past, proper escaping of quotes was not possible in the Table API, and
SQL and the SQL Client were not standard compliant.
Due to FLINK-8301, backslashes were interpreted in SQL literals; however, they
should only be used in SQL `U&'\1234'` literals. For the Table API, the new
logic relies on Java/Scala escaping and uses doubled quotes to escape quotes
in expression strings. For SQL, we rely on unicode string literals with or
without the UESCAPE clause. The SQL Client previously used backslashes to
escape new lines. It now allows unescaped new lines and uses ';' to terminate
statements, similar to other SQL clients.
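The doubled-quote rule for expression strings can be illustrated with a small sketch. The class QuoteEscapeDemo and its unquote helper are hypothetical and are not part of Flink's parser. They only assume the behavior described in the update: a doubled quote inside a single-quoted literal stands for one quote, and backslashes are passed through unchanged.

```java
public class QuoteEscapeDemo {

    // Hypothetical helper (an assumption for illustration, not Flink's parser):
    // strips the enclosing single quotes and collapses each doubled quote into one.
    // Backslashes are deliberately left untouched.
    static String unquote(String literal) {
        int last = literal.length() - 1;
        if (literal.length() < 2 || literal.charAt(0) != '\'' || literal.charAt(last) != '\'') {
            throw new IllegalArgumentException("not a single-quoted literal: " + literal);
        }
        return literal.substring(1, last).replace("''", "'");
    }

    public static void main(String[] args) {
        // A doubled quote inside the literal becomes a single quote.
        System.out.println(unquote("'It''s'"));
        // The backslash reaches the function as written in the literal,
        // so only the Java/Scala level of escaping remains.
        System.out.println(unquote("'foo([\\w]+)'"));
    }
}
```

Under these assumptions, a pattern written in an expression string needs only the two backslashes required by the Java/Scala string literal itself, instead of four.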
was:
For example, a regular expression that matches word characters ("\w") or digits ("\d"):
{code:java}
testAllApis(
  // OK: the method receives 'foo([\w]+)'
  "foothebar".regexExtract("foo([\\w]+)", 1),
  // Fails: the method receives 'foo([\\w]+)' and returns "";
  // passing 'foo([\\w]+)' instead causes a compile error
  "'foothebar'.regexExtract('foo([\\\\w]+)', 1)",
  // OK: the method receives 'foo([\w]+)', but four '\' must be passed
  "REGEX_EXTRACT('foothebar', 'foo([\\\\w]+)', 1)",
  "thebar"
)
{code}
The "similar to" function has the same issue.
> Table function parse regular expression contains backslash failed
> -----------------------------------------------------------------
>
> Key: FLINK-10281
> URL: https://issues.apache.org/jira/browse/FLINK-10281
> Project: Flink
> Issue Type: Bug
> Components: Table API & SQL
> Reporter: vinoyang
> Assignee: vinoyang
> Priority: Major
> Labels: pull-request-available