Parquet file when are you loading these file? can you please share the code where you are passing parquet file to spark?.
On 8 June 2015 at 16:39, Cheng Lian <lian.cs....@gmail.com> wrote: > Are you appending the joined DataFrame whose PolicyType is string to an > existing Parquet file whose PolicyType is int? The exception indicates that > Parquet found a column with conflicting data types. > > Cheng > > > On 6/8/15 5:29 PM, bipin wrote: > >> Hi I get this error message when saving a table: >> >> parquet.io.ParquetDecodingException: The requested schema is not >> compatible >> with the file schema. incompatible types: optional binary PolicyType >> (UTF8) >> != optional int32 PolicyType >> at >> >> parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.incompatibleSchema(ColumnIOFactory.java:105) >> at >> >> parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visit(ColumnIOFactory.java:97) >> at parquet.schema.PrimitiveType.accept(PrimitiveType.java:386) >> at >> >> parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visitChildren(ColumnIOFactory.java:87) >> at >> >> parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visit(ColumnIOFactory.java:61) >> at parquet.schema.MessageType.accept(MessageType.java:55) >> at >> parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:148) >> at >> parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:137) >> at >> parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:157) >> at >> >> parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:107) >> at >> >> parquet.hadoop.InternalParquetRecordWriter.<init>(InternalParquetRecordWriter.java:94) >> at >> parquet.hadoop.ParquetRecordWriter.<init>(ParquetRecordWriter.java:64) >> at >> >> parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:282) >> at >> >> parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:252) >> at >> org.apache.spark.sql.parquet.ParquetRelation2.org >> $apache$spark$sql$parquet$ParquetRelation2$$writeShard$1(newParquet.scala:667) >> at >> >> org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689) >> at >> >> org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689) >> at >> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) >> at org.apache.spark.scheduler.Task.run(Task.scala:64) >> at >> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203) >> at >> >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >> at >> >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >> at java.lang.Thread.run(Thread.java:745) >> >> I joined two tables both loaded from parquet file, the joined table when >> saved throws this error. I could not find anything about this error. Could >> this be a bug ? >> >> >> >> -- >> View this message in context: >> http://apache-spark-user-list.1001560.n3.nabble.com/Error-in-using-saveAsParquetFile-tp23204.html >> Sent from the Apache Spark User List mailing list archive at Nabble.com. >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org >> For additional commands, e-mail: user-h...@spark.apache.org >> >> >> > > --------------------------------------------------------------------- > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org > For additional commands, e-mail: user-h...@spark.apache.org > > -- Hi, Find my attached resume. I have total around 7 years of work experience. I worked for Amazon and Expedia in my previous assignments and currently I am working with start- up technology company called Insideview in hyderabad. Regards Jeetendra