[ 
https://issues.apache.org/jira/browse/IMPALA-10051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17198583#comment-17198583
 ] 

ASF subversion and git services commented on IMPALA-10051:
----------------------------------------------------------

Commit 3ef77566286c0077b89c0b8ce529ea9985018dd6 in impala's branch 
refs/heads/master from Tamas Mate
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=3ef7756 ]

IMPALA-10051: impala-shell exits with ValueError with WITH clauses

When a query contains a WITH clause, impala-shell tries to identify
whether it is a DML query so that it can later provide appropriate
result messages. Previously, shlex was used to create tokens and the
query type was assessed from them. However, shlex can misinterpret
query strings in which whitespace characters are mixed with quotes,
because it splits the string on whitespace characters. In some
scenarios this raises 'ValueError: No closing quotation'.

This change moves the tokenization from shlex to sqlparse.
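
The failure mode described above can be reproduced with the standard
library alone. The sketch below is illustrative and is not the
impala-shell code: try_split is a hypothetical helper, and the
strip_comments() call seen in the reported traceback is left out. It
assumes the two query strings from the issue report, which differ only
in the whitespace after the commas.

```python
import shlex

# Queries from the issue report; the only difference is the whitespace
# after the commas in the regexp_replace arguments.
WORKING = 'with select regexp_replace(column_name, "[a-zA-Z]", "+ ");'
FAILING = 'with select regexp_replace(column_name,"[a-zA-Z]","+ ");'


def try_split(query):
    """Hypothetical helper: return shlex tokens, or the error text on failure."""
    try:
        # Same call shape as in the traceback: non-POSIX whitespace splitting.
        return shlex.split(query, posix=False)
    except ValueError as e:
        return "ValueError: {0}".format(e)


if __name__ == "__main__":
    print(try_split(WORKING))
    print(try_split(FAILING))
```

In non-POSIX mode a quote in the middle of a word does not open a
quoted section, so in the second query everything up to the space
inside "+ " is read as one word. The next token then begins with the
closing quote, which shlex treats as the start of a new quoted string
that is never terminated.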

Testing:
 - Added a unit test covering queries that mix whitespace
   and quoted strings

Change-Id: I442d3bc65b90a55c73c847948d5179a8586d71ad
Reviewed-on: http://gerrit.cloudera.org:8080/16389
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> impala-shell exits with ValueError with WITH clauses
> ----------------------------------------------------
>
>                 Key: IMPALA-10051
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10051
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Clients
>    Affects Versions: Impala 4.0
>            Reporter: Tamas Mate
>            Assignee: Tamas Mate
>            Priority: Major
>
> Some strings can cause shlex to throw an exception in WITH clauses, for 
> example in a regexp_replace. This should be handled more gracefully and 
> correctly.
> Working query (impala-shell forwards the query for analysis):
> {code:java}
> impala-shell.sh -q 'with select regexp_replace(column_name, "[a-zA-Z]", "+ ");'
> {code}
> The same query fails with a ValueError when the spaces between the 
> arguments of regexp_replace are removed:
> {code:java}
> tmate@tmate-box:~/Projects/Impala$ impala-shell.sh -q 'with select regexp_replace(column_name,"[a-zA-Z]","+ ");'
> Starting Impala Shell with no authentication using Python 2.7.16
> Warning: live_progress only applies to interactive shell sessions, and is being skipped for now.
> Opened TCP connection to localhost:21000
> Connected to localhost:21000
> Server version: impalad version 4.0.0-SNAPSHOT DEBUG (build b29cb4ca82a4f05ea7dc0eadc330a64fbe685ef0)
> Traceback (most recent call last):
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1973, in <module>
>     impala_shell_main()
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1927, in impala_shell_main
>     if execute_queries_non_interactive_mode(options, query_options):
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1731, in execute_queries_non_interactive_mode
>     shell.execute_query_list(queries))
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1564, in execute_query_list
>     if self.onecmd(q) is CmdStatus.ERROR:
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 675, in onecmd
>     return func(arg)
>   File "/home/tmate/Projects/Impala/shell/impala_shell.py", line 1276, in do_with
>     tokens = shlex.split(strip_comments(query.lstrip()), posix=False)
>   File "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py", line 279, in split
>     return list(lex)
>   File "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py", line 269, in next
>     token = self.get_token()
>   File "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py", line 96, in get_token
>     raw = self.read_token()
>   File "/home/tmate/Projects/Impala/toolchain/toolchain-packages-gcc7.5.0/python-2.7.16/lib/python2.7/shlex.py", line 172, in read_token
>     raise ValueError, "No closing quotation"
> ValueError: No closing quotation
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
