Hi Zhan, thanks very much. Please find the code below.

Working code, loading data from a path created by a Hive table using the
Hive console, outside of Spark:

DataFrame df =
hiveContext.read().format("orc").load("/hdfs/path/to/hive/table/partition");

Non-working code, loading data from Hive tables created inside Spark using
hiveContext.sql insert-into-partition queries:

DataFrame df =
hiveContext.read().format("orc").load("/hdfs/path/to/hive/table/partition/created/by/spark");

As you can see, the call is the same in both cases; the second one just
tries to load ORC data created by Spark.
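
One thing that may be worth checking, offered only as a rough sketch: the
exception quoted further down names a file called part_tiny.txt and reports
an invalid postscript. ORC files begin with the three magic bytes "ORC", so
a quick test on a local copy of a suspect file (for example one fetched with
hdfs dfs -get) can show whether it is an ORC file at all. The class name
OrcMagicCheck and the method looksLikeOrc are hypothetical names made up for
this illustration; the check reads a local file rather than HDFS, and it
does not validate the postscript itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of Spark or Hive.
public class OrcMagicCheck {
    // ORC files begin with the 3-byte magic "ORC".
    private static final byte[] MAGIC = "ORC".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the local file starts with the ORC magic bytes. */
    public static boolean looksLikeOrc(Path file) throws IOException {
        byte[] head = new byte[MAGIC.length];
        try (InputStream in = Files.newInputStream(file)) {
            int read = 0;
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    return false; // file is shorter than the magic
                }
                read += n;
            }
        }
        return Arrays.equals(head, MAGIC);
    }

    public static void main(String[] args) throws IOException {
        // Each argument is a path to a local copy of a file from the partition directory.
        for (String arg : args) {
            boolean orc = looksLikeOrc(Paths.get(arg));
            System.out.println(arg + ": " + (orc ? "ORC magic found" : "not an ORC file"));
        }
    }
}
```

If a file in the partition directory fails this check, the directory
probably contains non-ORC data (such as a plain text file), and the ORC
reader would be expected to reject it regardless of who wrote the table.
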
On Sep 30, 2015 11:22 AM, "Zhan Zhang" <zzh...@hortonworks.com> wrote:

> Hi Umesh,
>
> The likely reason is that Hive and Spark do not use the same
> OrcInputFormat. Newer Hive versions have a NewOrcInputFormat, but it is
> not in Spark because of backward compatibility (it is not available in
> hive-0.12).
> Could you post the code that works and the code that does not?
>
> Thanks.
>
> Zhan Zhang
>
> On Sep 29, 2015, at 10:05 PM, Umesh Kacha <umesh.ka...@gmail.com> wrote:
>
> Hi, I can read/load ORC data created by a Hive table into a dataframe. Why
> does it throw a Malformed ORC exception when I try to load data created by
> hiveContext.sql into a dataframe?
> On Sep 30, 2015 2:37 AM, "Hortonworks" <zzh...@hortonworks.com> wrote:
>
>> You can try using data frames for both the read and the write.
>>
>> Thanks
>>
>> Zhan Zhang
>>
>>
>> Sent from my iPhone
>>
>> On Sep 29, 2015, at 1:56 PM, Umesh Kacha <umesh.ka...@gmail.com> wrote:
>>
>> Hi Zhan, thanks for the response. The table is created using Spark
>> hiveContext.sql, and the data is also inserted into the table using
>> hiveContext.sql (insert into a partitioned table). When I load the ORC
>> data into a dataframe, I load a particular partition's data, stored in a
>> path such as /user/xyz/Hive/xyz.db/sparktable/partition1=abc
>>
>> Regards,
>> Umesh
>> On Sep 30, 2015 02:21, "Hortonworks" <zzh...@hortonworks.com> wrote:
>>
>>> How was the table generated, by Hive or by Spark?
>>>
>>> If you generate the table using Hive but read it with a data frame, it
>>> may have some compatibility issues.
>>>
>>> Thanks
>>>
>>> Zhan Zhang
>>>
>>>
>>> Sent from my iPhone
>>>
>>> > On Sep 29, 2015, at 1:47 PM, unk1102 <umesh.ka...@gmail.com> wrote:
>>> >
>>> > Hi, I have a Spark job which creates Hive tables in ORC format with
>>> > partitions. It works well: I can read the data back from the Hive
>>> > table using the Hive console. But if I try to further process the ORC
>>> > files generated by the Spark job by loading them into a dataframe, I
>>> > get the following exception:
>>> > Caused by: java.io.IOException: Malformed ORC file
>>> > hdfs://localhost:9000/user/hive/warehouse/partorc/part_tiny.txt.
>>> > Invalid postscript.
>>> >
>>> > DataFrame df = hiveContext.read().format("orc").load("to/path");
>>> >
>>> > Please guide.
>>> >
>>> >
>>> >
>>> > --
>>> > View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Hive-ORC-Malformed-while-loading-into-spark-data-frame-tp24876.html
>>> > Sent from the Apache Spark User List mailing list archive at
>>> Nabble.com <http://nabble.com/>.
>>> >
>>> >
>>> >
>>>
>>> --
>>> CONFIDENTIALITY NOTICE
>>> NOTICE: This message is intended for the use of the individual or entity
>>> to
>>> which it is addressed and may contain information that is confidential,
>>> privileged and exempt from disclosure under applicable law. If the reader
>>> of this message is not the intended recipient, you are hereby notified
>>> that
>>> any printing, copying, dissemination, distribution, disclosure or
>>> forwarding of this communication is strictly prohibited. If you have
>>> received this communication in error, please contact the sender
>>> immediately
>>> and delete it from your system. Thank You.
>>>
>>
>
>
>
