[https://issues.apache.org/jira/browse/HIVE-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13826719#comment-13826719]
Remus Rusanu commented on HIVE-5845:
------------------------------------
Hello Ashutosh,
I’ve looked at this, and my opinion is that the problem is with Orc’s
VectorizedSerde.serialize. Despite the fact that we’re writing an OrcStruct
field, the created OrcSerde object is given the passed-in object inspector,
which describes the input struct, instead of the OrcStructInspector that
should be used with the created OrcStruct.
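To make the mismatch concrete, here is a minimal, self-contained sketch (not actual Hive code; `getStructFieldData`, `OrcStructLike`, and all other names are illustrative stand-ins): an inspector that assumes rows are `Object[]` fails with exactly the `OrcStruct cannot be cast to [Ljava.lang.Object;` style of error when handed a row stored in a different container.

```java
import java.util.Arrays;
import java.util.List;

public class InspectorMismatch {
    // Stand-in for a StandardStructObjectInspector-style accessor:
    // it assumes every row is laid out as an Object[].
    static Object getStructFieldData(Object row, int fieldIndex) {
        return ((Object[]) row)[fieldIndex]; // ClassCastException if row is not Object[]
    }

    // Stand-in for OrcStruct: it keeps its fields in its own container,
    // so the Object[]-based inspector above cannot read it.
    static class OrcStructLike {
        private final List<Object> fields;
        OrcStructLike(Object... fields) { this.fields = Arrays.asList(fields); }
        Object getFieldValue(int i) { return fields.get(i); }
    }

    public static void main(String[] args) {
        // Correct pairing: Object[] row with the Object[]-based inspector.
        Object[] plainRow = { "a", 1 };
        System.out.println(getStructFieldData(plainRow, 1));

        // Wrong pairing: the struct-shaped row with the same inspector.
        OrcStructLike orcRow = new OrcStructLike("a", 1);
        try {
            getStructFieldData(orcRow, 1);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: row layout does not match inspector");
        }
    }
}
```

The patch below tries to fix this by handing the serde its own inspector (the one matching the rows it actually produces) rather than the caller’s.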
I tried this patch:
diff --git ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java
index d765353..c4268c1 100644
--- ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java
+++ ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcSerde.java
@@ -143,9 +143,9 @@ public SerDeStats getSerDeStats() {
   public Writable serializeVector(VectorizedRowBatch vrg, ObjectInspector objInspector)
       throws SerDeException {
     if (vos == null) {
-      vos = new VectorizedOrcSerde(objInspector);
+      vos = new VectorizedOrcSerde(getObjectInspector());
     }
-    return vos.serialize(vrg, objInspector);
+    return vos.serialize(vrg, getObjectInspector());
   }
However, with this fix I’m hitting other (very familiar…) cast exceptions:
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.TimestampWritable cannot be cast to java.sql.Timestamp
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.JavaTimestampObjectInspector.getPrimitiveJavaObject(JavaTimestampObjectInspector.java:39)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TimestampTreeWriter.write(WriterImpl.java:1172)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.write(WriterImpl.java:1349)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:1962)
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:78)
    at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.processOp(VectorFileSinkOperator.java:159)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:129)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.ByteWritable cannot be cast to org.apache.hadoop.io.IntWritable
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableIntObjectInspector.get(WritableIntObjectInspector.java:36)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$IntegerTreeWriter.write(WriterImpl.java:762)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.write(WriterImpl.java:1349)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:1962)
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:78)
    at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.processOp(VectorFileSinkOperator.java:159)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:129)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
Before I go and hack through code I’m only vaguely familiar with (the Orc
serdes), do you have someone at HW with more experience in this area who could
take a look too?
It seems that the Orc writer expects Java primitive types where the vector
file sink produces Writables instead… I’m afraid that if I ‘fix’ this one way,
some other place will break.
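To illustrate what I mean, here is a minimal, self-contained sketch (not Hive code; `IntWritableLike`, `getJavaInt`, and `getWritableInt` are illustrative stand-ins for the two inspector families): a “java” inspector unwraps a boxed primitive, a “writable” inspector unwraps the Writable wrapper, and pairing a Writable value with the java-side inspector throws exactly this kind of cast exception.

```java
public class WritableMismatch {
    // Stand-in for org.apache.hadoop.io.IntWritable: a wrapper around an int.
    static class IntWritableLike {
        final int value;
        IntWritableLike(int value) { this.value = value; }
    }

    // Stand-in for a JavaIntObjectInspector-style accessor: expects a boxed Integer.
    static int getJavaInt(Object o) {
        return (Integer) o; // ClassCastException if o is a Writable wrapper
    }

    // Stand-in for a WritableIntObjectInspector-style accessor: expects the wrapper.
    static int getWritableInt(Object o) {
        return ((IntWritableLike) o).value;
    }

    public static void main(String[] args) {
        Object fromVectorSink = new IntWritableLike(42);

        // Correct pairing: Writable value, writable-side inspector.
        System.out.println(getWritableInt(fromVectorSink));

        // Wrong pairing: Writable value, java-side inspector, as in the trace above.
        try {
            getJavaInt(fromVectorSink);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: Writable given to a java-type inspector");
        }
    }
}
```

So a one-sided change (swapping which inspector is passed) just moves the failure to whichever layer still assumes the other value representation.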
Thanks,
~Remus
From: Ashutosh Chauhan (JIRA) [mailto:[email protected]]
Sent: Tuesday, November 19, 2013 1:11 AM
To: Remus Rusanu
Subject: [jira] [Commented] (HIVE-5845) CTAS failed on vectorized code path
Ashutosh Chauhan<https://issues.apache.org/jira/secure/ViewProfile.jspa?name=ashutoshc> commented on an issue:
Re: CTAS failed on vectorized code path<https://issues.apache.org/jira/browse/HIVE-5845>
Stack-trace:
Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.ql.io.orc.OrcStruct cannot be cast to [Ljava.lang.Object;
    at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldData(StandardStructObjectInspector.java:173)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.write(WriterImpl.java:1349)
    at org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:1962)
    at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:78)
    at org.apache.hadoop.hive.ql.exec.vector.VectorFileSinkOperator.processOp(VectorFileSinkOperator.java:159)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:129)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:91)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:827)
    at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:43)
Following query fails:
create table store_sales_2 stored as orc as select * from alltypesorc;
This message was sent by Atlassian JIRA (v6.1#6144-sha1:2e50328)
> CTAS failed on vectorized code path
> -----------------------------------
>
> Key: HIVE-5845
> URL: https://issues.apache.org/jira/browse/HIVE-5845
> Project: Hive
> Issue Type: Bug
> Reporter: Ashutosh Chauhan
> Assignee: Remus Rusanu
>
> Following query fails:
> create table store_sales_2 stored as orc as select * from alltypesorc;