Re: Parquet file generated by Spark, but not readable by Hive

2017-06-13 Thread Yong Zhang
The issue was caused by the data, and indeed a type mismatch between the Hive schema and Spark. It is fixed now. Without that kind of data, the problem isn't triggered, which is why only some brands were affected. Thanks for taking a look at this problem. Yong From: ayan guha
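For the archive, a purely illustrative sketch of the kind of mismatch Yong describes (the column name, types, and paths are assumptions, not taken from the thread): Hive declares a narrower type than the one Spark infers from the data, so only partitions whose data exercises the wider type fail to read.

    import org.apache.spark.sql.hive.HiveContext

    val hc = new HiveContext(sc)  // sc: an existing SparkContext (Spark 1.6 API)

    // Hive side (illustrative): the column is declared INT.
    hc.sql("""CREATE EXTERNAL TABLE sales (cnt INT)
              PARTITIONED BY (brand STRING)
              STORED AS PARQUET
              LOCATION '/data/sales'""")

    // Spark side (illustrative): if some brands' data holds values outside
    // Int range, schema inference widens cnt to LongType, so those brands'
    // parquet files carry int64. Only those partitions then fail in Hive,
    // matching the "only some brands" observation above.
    df.printSchema()  // df: the DataFrame being written, as in the messages below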

Re: Parquet file generated by Spark, but not readable by Hive

2017-06-12 Thread ayan guha
Try setting the following param: conf.set("spark.sql.hive.convertMetastoreParquet","false") On Tue, Jun 13, 2017 at 3:34 PM, Angel Francisco Orta < angel.francisco.o...@gmail.com> wrote: > Hello, > > Do you use df.write, or do you do it with hivecontext.sql("insert into ...")? > > Angel. > > On 12 Jun.
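For reference, a minimal sketch of where that setting goes in Spark 1.6 (the context and table names are assumptions): with spark.sql.hive.convertMetastoreParquet set to false, Spark falls back to the Hive SerDe instead of its native parquet support when reading metastore parquet tables, which can sidestep incompatibilities between Spark-written files and Hive's reader.

    import org.apache.spark.sql.hive.HiveContext

    val hc = new HiveContext(sc)  // sc: an existing SparkContext

    // Fall back to the Hive SerDe for metastore parquet tables instead of
    // Spark's built-in parquet reader/writer.
    hc.setConf("spark.sql.hive.convertMetastoreParquet", "false")

    // Subsequent reads of the metastore table go through the Hive SerDe.
    val rows = hc.sql("SELECT * FROM sales WHERE brand = 'a'")  // table name hypothetical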

Re: Parquet file generated by Spark, but not readable by Hive

2017-06-12 Thread Angel Francisco Orta
Hello, Do you use df.write, or do you do it with hivecontext.sql("insert into ...")? Angel. On Jun 12, 2017 at 11:07 PM, "Yong Zhang" wrote: > We are using Spark *1.6.2* as ETL to generate a parquet file for one > dataset, partitioned by "brand" (which is a string to
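The two write paths Angel is distinguishing, as a hedged sketch (df, hc, and the sales/staging names are assumed, not from the thread):

    // Path 1: the DataFrame API writes parquet files straight to HDFS;
    // Hive only learns about them when partitions are registered afterwards.
    df.write.mode("overwrite").partitionBy("brand").parquet("/data/sales")

    // Path 2: insert through HiveContext so Hive itself owns the write.
    hc.setConf("hive.exec.dynamic.partition", "true")
    hc.setConf("hive.exec.dynamic.partition.mode", "nonstrict")
    df.registerTempTable("staging")  // Spark 1.6 API
    // Assumes brand is the last column of staging, since dynamic-partition
    // inserts require the partition column(s) to come last in the SELECT.
    hc.sql("INSERT OVERWRITE TABLE sales PARTITION (brand) SELECT * FROM staging")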

Parquet file generated by Spark, but not readable by Hive

2017-06-12 Thread Yong Zhang
We are using Spark 1.6.2 as ETL to generate a parquet file for one dataset, partitioned by "brand" (which is a string representing the brand in this dataset). After the partition folders are generated in HDFS, like "brand=a", we add the partitions in Hive. The Hive version is 1.2.1 (In
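A sketch of the workflow described above, with hypothetical paths and table names (Spark 1.6.2 writing, Hive 1.2.1 reading):

    // Spark side: partitionBy lays out HDFS folders such as
    // /data/sales/brand=a/ with parquet part-files inside.
    df.write.partitionBy("brand").parquet("/data/sales")

    // Hive side: register each generated folder as a partition of an
    // external parquet table so Hive can query it.
    hc.sql("ALTER TABLE sales ADD IF NOT EXISTS PARTITION (brand='a') " +
           "LOCATION '/data/sales/brand=a'")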