[ 
https://issues.apache.org/jira/browse/HIVE-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084596#comment-17084596
 ] 

Krisztian Kasa commented on HIVE-19064:
---------------------------------------

The non-standard functionality allows string literals to be enclosed in double 
quotes, for example:
{code:java}
SELECT "This is a string" FROM t;
{code}
This functionality is enabled by the setting 
*hive.support.quoted.identifiers=column*.

The following query is a simplified version of the query in test *quote2.q*:
{code:java}
set hive.support.quoted.identifiers=column;
CREATE TABLE t (c1 int);
SELECT "a\"" FROM t;
{code}
With this patch, which enables the SQL standard functionality, this query fails:
{code:java}
org.apache.hadoop.hive.ql.parse.ParseException: line 3:10 cannot recognize input near 'a' '""' 'FROM' in selection target
{code}
Cause: in order to enable double-quoted identifiers, the *QuotedIdentifier* 
rule was extended in *HiveLexer.g*:
{code:java}
fragment
QuotedIdentifier
    :
    {allowQuotedId() == Quotation.BACKTICKS}? ('`'  ( '``' | ~('`') )* '`')
        { setText(StringUtils.replace(getText().substring(1, getText().length() - 1), "``", "`")); }
    | {allowQuotedId() == Quotation.STANDARD}? ('\"'  ( '\"\"' | ~('\"') )* '\"')
        { setText(StringUtils.replace(getText().substring(1, getText().length() - 1), "\"\"", "\"")); }
    ;
{code}
According to the SQL standard, the new rule escapes a double quote inside an 
identifier by doubling the double quote character, for example:
{code:java}
set hive.support.quoted.identifiers=standard;
CREATE TABLE t ("col0 ""Zero"" " int);
SELECT "col0 ""Zero"" " FROM t;
{code}
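
To make the effect of the lexer action concrete, here is a minimal standalone sketch of the text transformation it performs: strip the enclosing quote characters, then collapse each doubled quote into a single one. The class and method names are hypothetical, and plain String.replace is used instead of StringUtils.replace so the sketch has no dependencies; it is an illustration, not the code in the patch.

```java
final class QuotedIdentifierText {
    // Hypothetical helper mirroring the lexer action shown above:
    // drop the first and last character (the enclosing quotes), then
    // replace every doubled quote character with a single one.
    static String unquote(String token, char quote) {
        String q = String.valueOf(quote);
        String body = token.substring(1, token.length() - 1);
        return body.replace(q + q, q);
    }

    public static void main(String[] args) {
        // Identifier from the standard-mode example: "col0 ""Zero"" "
        System.out.println("[" + unquote("\"col0 \"\"Zero\"\" \"", '"') + "]");
        // Backtick-quoted identifier with an escaped backtick: `a``b`
        System.out.println("[" + unquote("`a``b`", '`') + "]");
    }
}
```

With this transformation the column in the example above would end up with the name col0 "Zero" followed by one trailing space.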
When the parser reads the first double quote character while parsing the query 
*SELECT "a\"" FROM t;* in *column* mode, it first tries to apply the 
*QuotedIdentifier* rule. The semantic predicate 
{code}
{allowQuotedId() == Quotation.STANDARD}?
{code}
in the rule prevents applying the subrule 
{code}
('\"'  ( '\"\"' | ~('\"') )* '\"')
{code}
which makes sense since we want to treat the text
{code}
"a\""
{code}
as a String literal.
However, the parser doesn't rewind the input stream to the *"* character; it 
reads the next character, which is 'a'. It does not fall back to the 
StringLiteral rule, which would accept the text.
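
The intended behaviour can be sketched outside of ANTLR as a small hand-written scanner: the quotation mode is checked before any input is consumed, so a double quote is scanned either as an identifier (doubled quote as escape) or as a string literal (backslash as escape). Everything below is a hypothetical illustration and not the generated lexer. In particular, it assumes that *column* mode corresponds to Quotation.BACKTICKS, and it unescapes the string literal inline, which the real StringLiteral rule may not do.

```java
final class QuoteModeSketch {
    // Only the two values that appear in the grammar above.
    enum Quotation { BACKTICKS, STANDARD }

    // Scans the double-quoted token that starts at index 0 of input and
    // returns "IDENTIFIER:<text>" or "STRING:<text>" depending on the mode.
    static String classify(String input, Quotation mode) {
        if (input.isEmpty() || input.charAt(0) != '"') {
            throw new IllegalArgumentException("input must start with a double quote");
        }
        // The mode is decided before consuming anything, so no rewind is needed.
        boolean identifier = (mode == Quotation.STANDARD);
        StringBuilder text = new StringBuilder();
        int i = 1;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (identifier) {
                if (c == '"') {
                    // A doubled quote is an escaped quote; a single one ends the token.
                    if (i + 1 < input.length() && input.charAt(i + 1) == '"') {
                        text.append('"');
                        i += 2;
                        continue;
                    }
                    return "IDENTIFIER:" + text;
                }
            } else {
                // String literal: a backslash escapes the next character.
                if (c == '\\' && i + 1 < input.length()) {
                    text.append(input.charAt(i + 1));
                    i += 2;
                    continue;
                }
                if (c == '"') {
                    return "STRING:" + text;
                }
            }
            text.append(c);
            i++;
        }
        throw new IllegalArgumentException("unterminated quoted token");
    }

    public static void main(String[] args) {
        // The failing query fragment, scanned as a string literal in column mode.
        System.out.println(classify("\"a\\\"\" FROM t", Quotation.BACKTICKS)); // prints STRING:a"
        // The standard-mode example, scanned as a delimited identifier.
        System.out.println(classify("\"col0 \"\"Zero\"\" \" FROM t", Quotation.STANDARD));
    }
}
```

The point of the sketch is only the order of decisions: the choice between identifier and string literal happens while the scanner is still positioned on the opening quote.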


> Add mode to support delimited identifiers enclosed within double quotation
> --------------------------------------------------------------------------
>
>                 Key: HIVE-19064
>                 URL: https://issues.apache.org/jira/browse/HIVE-19064
>             Project: Hive
>          Issue Type: Improvement
>          Components: Parser, SQL
>    Affects Versions: 3.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Krisztian Kasa
>            Priority: Major
>         Attachments: HIVE-19064.01.patch, HIVE-19064.02.patch, 
> HIVE-19064.03.patch, HIVE-19064.4.patch
>
>
> As per SQL standard. Hive currently uses `` (backticks). Default will 
> continue being backticks, but we will support identifiers within double 
> quotation via configuration parameter.
> This issue will also extend support for arbitrary char sequences, e.g., 
> containing {{~ ! @ # $ % ^ & * () , < >}}, in database and table names. 
> Currently, special characters are only supported for column names.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
