GitHub user zhaohm3 opened a pull request:
https://github.com/apache/hive/pull/29
Fix Hive throws ParseException while parsing Grouping-Sets clauses
Currently, when Hive parses a GROUPING SETS clause that contains a
parenthesized set of two or more expressions, the first element of that set
must be a simple, unqualified Identifier. Otherwise, Hive throws a
ParseException during the parse stage. As a result, Hive throws a
ParseException while parsing the following HQL statements:
drop table test;
create table test(tc1 int, tc2 int, tc3 int);
explain select test.tc1, test.tc2 from test group by test.tc1, test.tc2
grouping sets(test.tc1, (test.tc1, test.tc2));
explain select tc1+tc2, tc2 from test group by tc1+tc2, tc2 grouping
sets(tc2, (tc1 + tc2, tc2));
drop table test;
The following log excerpt shows the ParseException stack trace:
2015-01-07 09:53:34,718 INFO [main]: log.PerfLogger
(PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=Driver.run
from=org.apache.hadoop.hive.ql.Driver>
2015-01-07 09:53:34,719 INFO [main]: log.PerfLogger
(PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=TimeToSubmit
from=org.apache.hadoop.hive.ql.Driver>
2015-01-07 09:53:34,721 INFO [main]: ql.Driver
(Driver.java:checkConcurrency(158)) - Concurrency mode is disabled, not
creating a lock manager
2015-01-07 09:53:34,721 INFO [main]: log.PerfLogger
(PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=compile
from=org.apache.hadoop.hive.ql.Driver>
2015-01-07 09:53:34,724 INFO [main]: log.PerfLogger
(PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=parse
from=org.apache.hadoop.hive.ql.Driver>
2015-01-07 09:53:34,724 INFO [main]: parse.ParseDriver
(ParseDriver.java:parse(185)) - Parsing command: explain select test.tc1,
test.tc2 from test group by test.tc1, test.tc2 grouping sets(test.tc1,
(test.tc1, test.tc2))
2015-01-07 09:53:34,734 ERROR [main]: ql.Driver
(SessionState.java:printError(545)) - FAILED: ParseException line 1:105 missing
) at ',' near '<EOF>'
line 1:116 extraneous input ')' expecting EOF near '<EOF>'
org.apache.hadoop.hive.ql.parse.ParseException: line 1:105 missing ) at
',' near '<EOF>'
line 1:116 extraneous input ')' expecting EOF near '<EOF>'
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:210)
at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:166)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:404)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:322)
at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:975)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1040)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:911)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:901)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:268)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:220)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:423)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:792)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:686)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:625)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
2015-01-07 09:53:34,745 INFO [main]: log.PerfLogger
(PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=compile
start=1420595614721 end=1420595614745 duration=24
from=org.apache.hadoop.hive.ql.Driver>
2015-01-07 09:53:34,745 INFO [main]: log.PerfLogger
(PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=releaseLocks
from=org.apache.hadoop.hive.ql.Driver>
2015-01-07 09:53:34,746 INFO [main]: log.PerfLogger
(PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=releaseLocks
start=1420595614745 end=1420595614746 duration=1
from=org.apache.hadoop.hive.ql.Driver>
2015-01-07 09:53:34,746 INFO [main]: log.PerfLogger
(PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=releaseLocks
from=org.apache.hadoop.hive.ql.Driver>
2015-01-07 09:53:34,746 INFO [main]: log.PerfLogger
(PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=releaseLocks
start=1420595614746 end=1420595614746 duration=0
from=org.apache.hadoop.hive.ql.Driver>
However, Hive does not throw a ParseException when handling the following
HQL statements:
drop table test;
create table test(tc1 int, tc2 int, tc3 int);
explain select tc1, test.tc2 from test group by tc1, test.tc2 grouping
sets(tc1, (tc1, test.tc2));
explain select tc1+tc2, tc1 from test group by tc1+tc2, tc1 grouping
sets(tc1, (tc1, tc1 + tc2));
explain select test.tc1, test.tc1 + test.tc2 from test group by
test.tc1, test.tc1 + test.tc2 grouping sets(test.tc1, (test.tc1), (test.tc1 +
test.tc2));
drop table test;
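Judging only from the passing statements above, the two failing statements
can probably be worked around, until the fix is applied, by making sure every
multi-element set starts with an unqualified Identifier. The rewrites below
are a sketch based on that observation. They have not been run against a
specific Hive release, and they assume that reordering the elements inside a
set does not change the result:

```sql
create table test(tc1 int, tc2 int, tc3 int);

-- Failing form: grouping sets(test.tc1, (test.tc1, test.tc2))
-- Assumed workaround: use the unqualified tc1 consistently, so the
-- multi-element set starts with a simple Identifier (same shape as the
-- first passing statement).
explain select tc1, test.tc2 from test group by tc1, test.tc2 grouping
sets(tc1, (tc1, test.tc2));

-- Failing form: grouping sets(tc2, (tc1 + tc2, tc2))
-- Assumed workaround: put the simple Identifier tc2 first inside the set
-- (same shape as the second passing statement).
explain select tc1+tc2, tc2 from test group by tc1+tc2, tc2 grouping
sets(tc2, (tc2, tc1 + tc2));

drop table test;
```

The GROUP BY keys are left textually identical to the expressions used in the
sets, as in the passing statements.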
For more details, visit:
https://issues.apache.org/jira/secure/attachment/12691831/Fix-Hive-ParseException-of-Grouping-Sets.htm
or https://www.zybuluo.com/Spongcer/note/61369
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/zhaohm3/hive HIVE-9336
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/hive/pull/29.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #29
----
commit 8b6c607c9e01b8efbfd616a4da6117b16e2e6bb5
Author: zhaohm3 <[email protected]>
Date: 2015-01-13T03:02:39Z
Fix Hive throws ParseException while parsing Grouping-Sets clauses
----