Nicolas Lalevée created HIVE-3250:
-------------------------------------
Summary: ArrayIndexOutOfBoundsException in ColumnPrunerProcFactory$ColumnPrunerSelectProc
Key: HIVE-3250
URL: https://issues.apache.org/jira/browse/HIVE-3250
Project: Hive
Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Nicolas Lalevée
I have a query that does not select some of the fields produced by its subquery,
and the optimizer fails to prune them, with the following stack trace:
{noformat}
FAILED: Hive Internal Error: java.lang.ArrayIndexOutOfBoundsException(-1)
java.lang.ArrayIndexOutOfBoundsException: -1
at java.util.ArrayList.get(ArrayList.java:324)
at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory$ColumnPrunerSelectProc.process(ColumnPrunerProcFactory.java:397)
at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
at org.apache.hadoop.hive.ql.optimizer.ColumnPruner$ColumnPrunerWalker.walk(ColumnPruner.java:143)
at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
at org.apache.hadoop.hive.ql.optimizer.ColumnPruner.transform(ColumnPruner.java:106)
at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7306)
at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889)
at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:191)
at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:187)
{noformat}
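The trace only shows that ArrayList.get was called with index -1; it does not show where the -1 came from. One common way to end up there is to pass the result of List.indexOf straight to get when the element is absent. The sketch below illustrates that pattern only. It is an assumption, not a diagnosis of ColumnPrunerProcFactory, and the class, method, and column names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration, not Hive code: how a failed name lookup
// can turn into ArrayList.get(-1).
public class NegativeIndexSketch {

    // Unchecked lookup: indexOf returns -1 when the name is absent,
    // and get(-1) then throws an IndexOutOfBoundsException (on older
    // JDKs the ArrayIndexOutOfBoundsException subclass).
    static String lookupUnchecked(List<String> outputColumns, String wanted) {
        int idx = outputColumns.indexOf(wanted);
        return outputColumns.get(idx);
    }

    // Checked lookup: treat a missing name as "no such column"
    // instead of indexing with -1.
    static String lookupChecked(List<String> outputColumns, String wanted) {
        int idx = outputColumns.indexOf(wanted);
        return idx < 0 ? null : outputColumns.get(idx);
    }

    public static void main(String[] args) {
        // Assumed column list, loosely modelled on the subquery in this report.
        List<String> cols = new ArrayList<>(Arrays.asList("userid", "urls", "user_lid"));

        System.out.println("checked lookup: " + lookupChecked(cols, "explodedUrls"));
        try {
            lookupUnchecked(cols, "explodedUrls");
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unchecked lookup failed: " + e);
        }
    }
}
```

If the pruner does something similar with a column that has already been dropped, a guard like the one in lookupChecked would avoid the crash, but that would need to be confirmed against the code at ColumnPrunerProcFactory.java:397.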
The failing query, reduced to the part that fails:
{noformat}
SELECT explodedUrls FROM
(
  SELECT userid, array(named_struct('date', count(*))) AS urls, count(*) AS user_lid FROM
  (
    SELECT * FROM NicoPageViewEvent WHERE day > '20130801'
  ) pve
  GROUP BY userid
) userViewData
LATERAL VIEW s_explode_pageflow(userViewData.urls) userViewDataLateralView AS explodedUrls
{noformat}
Adding the other fields to the outer SELECT makes it work:
{noformat}
SELECT userid, explodedUrls, user_lid FROM
(
  SELECT userid, array(named_struct('date', count(*))) AS urls, count(*) AS user_lid FROM
  (
    SELECT * FROM NicoPageViewEvent WHERE day > '20130801'
  ) pve
  GROUP BY userid
) userViewData
LATERAL VIEW s_explode_pageflow(userViewData.urls) userViewDataLateralView AS explodedUrls
{noformat}
s_explode_pageflow is a custom function that takes an array of structs and
splits it into arrays of structs.