[ https://issues.apache.org/jira/browse/HIVE-19064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17084596#comment-17084596 ]
Krisztian Kasa commented on HIVE-19064:
---------------------------------------

The non-standard functionality enables string literals enclosed in double quotes, like
{code:java}
SELECT "This is a string" FROM t;
{code}
To enable this functionality, the setting *hive.support.quoted.identifiers=column* is used.

The following query is a simplified version of the query in test *quote2.q*:
{code:java}
set hive.support.quoted.identifiers=column;
CREATE TABLE t (c1 int);
SELECT "a\"" FROM t;
{code}
With this patch, which enables the SQL standard functionality, this query fails:
{code:java}
org.apache.hadoop.hive.ql.parse.ParseException: line 3:10 cannot recognize input near 'a' '""' 'FROM' in selection target
{code}
Cause: in order to enable double-quoted identifiers, the *QuotedIdentifier* rule was extended in *HiveLexer.g*:
{code:java}
fragment
QuotedIdentifier
    :
    {allowQuotedId() == Quotation.BACKTICKS}? ('`' ( '``' | ~('`') )* '`')
        { setText(StringUtils.replace(getText().substring(1, getText().length() -1 ), "``", "`")); }
    | {allowQuotedId() == Quotation.STANDARD}? ('\"' ( '\"\"' | ~('\"') )* '\"')
        { setText(StringUtils.replace(getText().substring(1, getText().length() -1 ), "\"\"", "\"")); }
    ;
{code}
According to the SQL standard, the new rule escapes a double quote by doubling the double quote character, like:
{code:java}
set hive.support.quoted.identifiers=standard;
CREATE TABLE t ("col0 ""Zero"" " int);
SELECT "col0 ""Zero"" " FROM t;
{code}
When the parser reads the first double quote character while parsing the query *SELECT "a\"" FROM t;* in *column* mode, it first tries to apply the *QuotedIdentifier* rule. The semantic predicate
{code}
{allowQuotedId() == Quotation.STANDARD}?
{code}
in the rule prevents applying the subrule
{code}
('\"' ( '\"\"' | ~('\"') )* '\"')
{code}
which makes sense, since we want to treat the text
{code}
"a\""
{code}
as a string literal. However, the parser doesn't rewind the input stream to the *"* character but reads the next one, which is 'a'.
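For illustration only, the two escaping conventions discussed above can be mimicked outside the lexer in plain Java. This is a minimal sketch, not Hive code: the class and method names are hypothetical, the regular expression is an assumed approximation of the STANDARD subrule, and String.replace stands in for the StringUtils.replace call in the grammar action. It does not model how ANTLR evaluates the semantic predicate or consumes the input stream.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive: mimics the STANDARD branch of QuotedIdentifier.
class QuotedIdentifierSketch {

    // Assumed regex counterpart of: '"' ( '""' | ~'"' )* '"'
    private static final Pattern STANDARD = Pattern.compile("\"(?:\"\"|[^\"])*\"");

    // True if the whole text is a well-formed double-quoted identifier
    // in which embedded double quotes are escaped by doubling.
    static boolean isStandardQuoted(String text) {
        return STANDARD.matcher(text).matches();
    }

    // Same transformation as the grammar action: strip the enclosing quotes,
    // then collapse each doubled double quote into a single one.
    static String unquoteStandard(String text) {
        return text.substring(1, text.length() - 1).replace("\"\"", "\"");
    }

    public static void main(String[] args) {
        String identifier = "\"col0 \"\"Zero\"\" \"";   // "col0 ""Zero"" "
        String literal = "\"a\\\"\"";                   // "a\""

        System.out.println(isStandardQuoted(identifier));
        System.out.println(unquoteStandard(identifier));
        System.out.println(isStandardQuoted(literal));
    }
}
```

Under these assumptions, the text "col0 ""Zero"" " is accepted as a whole and names the column col0 "Zero" (with a trailing space), while "a\"" is not accepted as a whole, because the backslash has no special meaning in the doubled-quote form. It is only valid as a backslash-escaped string literal.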
The parser does not fall back to the StringLiteral rule, which would accept the text.

> Add mode to support delimited identifiers enclosed within double quotation
> --------------------------------------------------------------------------
>
>                 Key: HIVE-19064
>                 URL: https://issues.apache.org/jira/browse/HIVE-19064
>             Project: Hive
>          Issue Type: Improvement
>          Components: Parser, SQL
>    Affects Versions: 3.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Krisztian Kasa
>            Priority: Major
>         Attachments: HIVE-19064.01.patch, HIVE-19064.02.patch,
>                      HIVE-19064.03.patch, HIVE-19064.4.patch
>
>
> As per SQL standard. Hive currently uses `` (backticks). Default will
> continue being backticks, but we will support identifiers within double
> quotation via configuration parameter.
> This issue will also extend support for arbitrary char sequences, e.g.,
> containing {{~ ! @ # $ % ^ & * () , < >}}, in database and table names.
> Currently, special characters are only supported for column names.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)