Geoffrey Cleaves created HIVE-9291:
--------------------------------------
Summary: Hive error when grouping by a TIMESTAMP column on a table stored as ORC with TBLPROPERTIES ('transactional'='true')
Key: HIVE-9291
URL: https://issues.apache.org/jira/browse/HIVE-9291
Project: Hive
Issue Type: Bug
Affects Versions: 0.14.0
Environment: Hortonworks Sandbox HDP 2.2
Reporter: Geoffrey Cleaves
I am unable to run a query that groups by a timestamp column when the
underlying table is stored as ORC and created with TBLPROPERTIES
('transactional'='true'). If I omit ('transactional'='true') when creating
the table, the same GROUP BY query runs correctly.
(Additionally, Pig does not read tables created with TBLPROPERTIES
('transactional'='true').)
h3. Error output
hive> select to_date(createdat), count( * ) from entrance_t
> group by to_date(createdat);
Query ID = root_20150107131414_f6739293-a87f-4c05-8100-b86ae060be3a
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Job = job_1420194485920_0106, Tracking URL = http://sandbox.hortonworks.com:8088/proxy/application_1420194485920_0106/
Kill Command = /usr/hdp/2.2.0.0-2041/hadoop/bin/hadoop job -kill job_1420194485920_0106
Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 1
2015-01-07 13:14:50,082 Stage-1 map = 0%, reduce = 0%
2015-01-07 13:15:30,154 Stage-1 map = 100%, reduce = 100%
Ended Job = job_1420194485920_0106 with errors
Error during job, obtaining debugging information...
Examining task ID: task_1420194485920_0106_m_000000 (and more) from job job_1420194485920_0106
Task with the most failures(4):
-----
Task ID:
task_1420194485920_0106_m_000001
URL:
http://sandbox.hortonworks.com:8088/taskdetails.jsp?jobid=job_1420194485920_0106&tipid=task_1420194485920_0106_m_000001
-----
Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:185)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:176)
... 8 more
Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
at org.apache.hadoop.hive.ql.exec.vector.expressions.LongToStringUnaryUDF.evaluate(LongToStringUnaryUDF.java:57)
at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:91)
at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:315)
at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:859)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:138)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
... 9 more
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 3 Reduce: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec
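The innermost frames of the trace above (LongToStringUnaryUDF, VectorHashKeyWrapperBatch, VectorGroupByOperator) belong to Hive's vectorized execution path. One possible way to narrow the problem down, which was not tried in this report and is offered only as an assumption, is to rerun the same query with vectorization turned off through the hive.vectorized.execution.enabled setting:

```sql
-- Assumption: the failure is tied to vectorized execution.
-- This is a diagnostic step, not a confirmed fix.
set hive.vectorized.execution.enabled=false;

-- Same query as in the error output above.
select to_date(createdat), count(*) from entrance_t
group by to_date(createdat);
```

If the query succeeds with this setting, that would point at the vectorized evaluation of to_date() on the transactional table rather than at GROUP BY itself.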
h3. Problem table creation
create table entrance_t
(createdAt timestamp, ip string)
clustered by (createdAt) into 3 buckets STORED AS orc *TBLPROPERTIES
('transactional'='true')*;
insert into table entrance_t select createdat, ip from ad_server LIMIT 10;
h3. No problem table creation
create table entrance_t
(createdAt timestamp, ip string)
clustered by (createdAt) into 3 buckets STORED AS orc;
insert into table entrance_t select createdat, ip from ad_server LIMIT 10;