How did you create "test.txt" as an ORC file?
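If test.txt is an ordinary delimited text file, LOAD DATA only moves it into the table's directory without rewriting it, so the ORC reader ends up parsing plain text as ORC and fails with exactly this kind of protobuf error. A minimal sketch of the usual pattern, staging through a TEXTFILE table and letting Hive write the ORC data (the staging table name is my own choice, and I assume test.txt holds space-delimited "id name" rows):

-- Stage the delimited text file in a plain TEXTFILE table first
CREATE TABLE person_text (id INT, name STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
  STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH 'test.txt' INTO TABLE person_text;

-- INSERT ... SELECT runs a job that writes real ORC files
-- into the ORC table's directory
INSERT OVERWRITE TABLE person SELECT id, name FROM person_text;

After that, SELECT * FROM person should be reading valid ORC files instead of the raw text file.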
On Thu, Sep 19, 2013 at 5:34 PM, Savant, Keshav <keshav.c.sav...@fisglobal.com> wrote:

> Hi All,
>
> We have set up Apache Hive 0.11.0 on a Hadoop cluster (Apache Hadoop
> 0.20.203.0). Hive returns the expected results when tables are stored as
> TextFile.
>
> However, Hive 0.11.0's new ORC (Optimized Row Columnar) format throws an
> exception when we run SELECT queries on tables stored as ORC.
>
> Stack trace of the exception:
>
> 2013-09-19 20:33:38,095 ERROR CliDriver (SessionState.java:printError(386)) - Failed with exception java.io.IOException:com.google.protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either than the input has been truncated or that an embedded message misreported its own length.
> java.io.IOException: com.google.protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either than the input has been truncated or that an embedded message misreported its own length.
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:544)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.pushRow(FetchOperator.java:488)
>         at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:136)
>         at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1412)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:271)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
>         at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:756)
>         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: com.google.protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either than the input has been truncated or that an embedded message misreported its own length.
>         at com.google.protobuf.InvalidProtocolBufferException.truncatedMessage(InvalidProtocolBufferException.java:49)
>         at com.google.protobuf.CodedInputStream.readRawBytes(CodedInputStream.java:754)
>         at com.google.protobuf.CodedInputStream.readBytes(CodedInputStream.java:294)
>         at com.google.protobuf.UnknownFieldSet$Builder.mergeFieldFrom(UnknownFieldSet.java:484)
>         at com.google.protobuf.GeneratedMessage$Builder.parseUnknownField(GeneratedMessage.java:438)
>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$PostScript$Builder.mergeFrom(OrcProto.java:10129)
>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$PostScript$Builder.mergeFrom(OrcProto.java:9993)
>         at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:300)
>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$PostScript.parseFrom(OrcProto.java:9970)
>         at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.<init>(ReaderImpl.java:193)
>         at org.apache.hadoop.hive.ql.io.orc.OrcFile.createReader(OrcFile.java:56)
>         at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:168)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:432)
>         at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:508)
>
> These are the steps that lead to the above exception:
>
>   - SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
>   - CREATE TABLE person(id INT, name STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' STORED AS ORC tblproperties ("orc.compress"="Snappy");
>   - LOAD DATA LOCAL INPATH 'test.txt' INTO TABLE person;
>   - Executing: SELECT * FROM person;
>
> Result:
>
> Failed with exception java.io.IOException:com.google.protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either than the input has been truncated or that an embedded message misreported its own length.
>
> We have also added the codec property to core-site.xml on our Hadoop cluster, along with the other configuration settings:
>
> <property>
>   <name>io.compression.codecs</name>
>   <value>org.apache.hadoop.io.compress.SnappyCodec</value>
> </property>
>
> The following new jars were added, with their placements:
>
>   1. Placed a new jar at $HIVE_HOME/lib/config-1.0.0.jar
>   2. Placed a new jar for the metastore connection at $HIVE_HOME/lib/mysql-connector-java-5.1.17-bin.jar
>   3. Moved jackson-core-asl-1.8.8.jar from $HIVE_HOME/lib to $HADOOP_HOME/lib
>   4. Moved jackson-mapper-asl-1.8.8.jar from $HIVE_HOME/lib to $HADOOP_HOME/lib
>
> Please suggest the possible cause of, and a solution to, this issue we are facing with ORC format tables.
>
> Thanks,
> Keshav
--
Nitin Pawar