Hi Zhan,

Thanks very much. Please find the code below.

Working code, loading data from a path created by a Hive table using the Hive console, outside of Spark:

    DataFrame df = hiveContext.read().format("orc").load("/hdfs/path/to/hive/table/partition");

Non-working code, loading data from Hive tables created inside Spark using hiveContext.sql "insert into partition" queries:

    DataFrame df = hiveContext.read().format("orc").load("/hdfs/path/to/hive/table/partition/created/by/spark");

As you can see, the code is the same in both cases; the second call is just trying to load ORC data created by Spark.

On Sep 30, 2015 11:22 AM, "Zhan Zhang" <zzh...@hortonworks.com> wrote:

> Hi Umesh,
>
> The potential reason is that Hive and Spark do not use the same
> OrcInputFormat. In the new Hive version there is NewOrcInputFormat, but it is
> not in Spark because of backward compatibility (which is not available in
> hive-0.12).
> Do you mind posting the code that works and the code that does not work for you?
>
> Thanks.
>
> Zhan Zhang
>
> On Sep 29, 2015, at 10:05 PM, Umesh Kacha <umesh.ka...@gmail.com> wrote:
>
> Hi, I can read/load ORC data created by a Hive table into a dataframe. Why is
> it throwing a Malformed ORC exception when I try to load data created by
> hiveContext.sql into a dataframe?
> On Sep 30, 2015 2:37 AM, "Hortonworks" <zzh...@hortonworks.com> wrote:
>
>> You can try to use a data frame for both read and write.
>>
>> Thanks
>>
>> Zhan Zhang
>>
>>
>> Sent from my iPhone
>>
>> On Sep 29, 2015, at 1:56 PM, Umesh Kacha <umesh.ka...@gmail.com> wrote:
>>
>> Hi Zhan, thanks for the response. The table is created using Spark
>> hiveContext.sql, and data is inserted into the table also using
>> hiveContext.sql (insert into a partitioned table). When I try to load ORC
>> data into a dataframe, I am loading a particular partition's data stored in
>> a path such as
>> /user/xyz/Hive/xyz.db/sparktable/partition1=abc
>>
>> Regards,
>> Umesh
>> On Sep 30, 2015 02:21, "Hortonworks" <zzh...@hortonworks.com> wrote:
>>
>>> How was the table generated, by Hive or by Spark?
>>>
>>> If you generate the table using Hive but read it by data frame, it may have
>>> some compatibility issue.
>>>
>>> Thanks
>>>
>>> Zhan Zhang
>>>
>>>
>>> Sent from my iPhone
>>>
>>> > On Sep 29, 2015, at 1:47 PM, unk1102 <umesh.ka...@gmail.com> wrote:
>>> >
>>> > Hi, I have a Spark job which creates Hive tables in ORC format with
>>> > partitions. It works well; I can read the data back from the Hive table
>>> > using the Hive console. But if I try to further process the ORC files
>>> > generated by the Spark job by loading them into a dataframe, I get the
>>> > following exception:
>>> > Caused by: java.io.IOException: Malformed ORC file
>>> > hdfs://localhost:9000/user/hive/warehouse/partorc/part_tiny.txt. Invalid
>>> > postscript.
>>> >
>>> > DataFrame df = hiveContext.read().format("orc").load(to/path);
>>> >
>>> > Please guide.
>>> >
>>> >
>>> >
>>> > --
>>> > View this message in context:
>>> > http://apache-spark-user-list.1001560.n3.nabble.com/Hive-ORC-Malformed-while-loading-into-spark-data-frame-tp24876.html
>>> > Sent from the Apache Spark User List mailing list archive at
>>> > Nabble.com <http://nabble.com/>.
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> > For additional commands, e-mail: user-h...@spark.apache.org
>>> >
>>>
>>> --
>>> CONFIDENTIALITY NOTICE
>>> NOTICE: This message is intended for the use of the individual or entity to
>>> which it is addressed and may contain information that is confidential,
>>> privileged and exempt from disclosure under applicable law. If the reader
>>> of this message is not the intended recipient, you are hereby notified that
>>> any printing, copying, dissemination, distribution, disclosure or
>>> forwarding of this communication is strictly prohibited. If you have
>>> received this communication in error, please contact the sender immediately
>>> and delete it from your system. Thank You.
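
A note for anyone debugging the same "Malformed ORC file ... Invalid postscript" exception: the file named in the stack trace above is part_tiny.txt, which may simply be a plain-text file sitting in the table directory rather than ORC data. That is only a guess from the file name, not something confirmed in this thread. ORC files start with the three ASCII bytes "ORC", so one quick sanity check is to copy a suspect file to local disk (for example with hdfs dfs -get) and look at its first bytes. The sketch below is a minimal, standard-library-only Java example. The class name OrcMagicCheck and the method startsWithOrcMagic are hypothetical names chosen for illustration, not part of Spark or Hive. It only inspects the header, so a positive result does not prove the file is fully valid ORC.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper (not a Spark or Hive API): checks whether a local file
// begins with the ORC magic bytes.
public class OrcMagicCheck {
    // ORC files begin with the three ASCII bytes "ORC".
    private static final byte[] ORC_MAGIC = "ORC".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the first three bytes of the file are "ORC". */
    public static boolean startsWithOrcMagic(Path file) throws IOException {
        byte[] head = new byte[ORC_MAGIC.length];
        try (InputStream in = Files.newInputStream(file)) {
            int read = 0;
            while (read < head.length) {
                int n = in.read(head, read, head.length - read);
                if (n < 0) {
                    return false; // file is shorter than the magic itself
                }
                read += n;
            }
        }
        return Arrays.equals(head, ORC_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        // Each argument is assumed to be a path to a file already copied
        // from HDFS to the local file system.
        for (String arg : args) {
            Path p = Paths.get(arg);
            String verdict = startsWithOrcMagic(p) ? "ORC header found" : "no ORC header";
            System.out.println(p + " -> " + verdict);
        }
    }
}
```

Run it against the local copies of the files in the partition directory, e.g. java OrcMagicCheck part_tiny.txt. Any file reported without an ORC header is a candidate for what the ORC reader is rejecting, and is worth moving out of the directory before loading it with format("orc").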