Re: Error in using saveAsParquetFile
Cheng you were right. It works when I remove the field from either one. I should have checked the types beforehand. What confused me is that Spark attempted to join it and midway threw the error. It isn't quite there yet. Thanks for the help. On Mon, Jun 8, 2015 at 8:29 PM Cheng Lian lian.cs@gmail.com wrote: I suspect that Bookings and Customerdetails both have a PolicyType field, one is string and the other is an int. Cheng On 6/8/15 9:15 PM, Bipin Nag wrote: Hi Jeetendra, Cheng I am using following code for joining val Bookings = sqlContext.load(/home/administrator/stageddata/Bookings) val Customerdetails = sqlContext.load(/home/administrator/stageddata/Customerdetails) val CD = Customerdetails. where($CreatedOn 2015-04-01 00:00:00.0). where($CreatedOn 2015-05-01 00:00:00.0) //Bookings by CD val r1 = Bookings. withColumnRenamed(ID,ID2) val r2 = CD. join(r1,CD.col(CustomerID) === r1.col(ID2),left) r2.saveAsParquetFile(/home/administrator/stageddata/BOOKING_FULL); @Cheng I am not appending the joined table to an existing parquet file, it is a new file. @Jitender I have a rather large parquet file and it also contains some confidential data. Can you tell me what you need to check in it. Thanks On 8 June 2015 at 16:47, Jeetendra Gangele gangele...@gmail.com wrote: Parquet file when are you loading these file? can you please share the code where you are passing parquet file to spark?. On 8 June 2015 at 16:39, Cheng Lian lian.cs@gmail.com wrote: Are you appending the joined DataFrame whose PolicyType is string to an existing Parquet file whose PolicyType is int? The exception indicates that Parquet found a column with conflicting data types. Cheng On 6/8/15 5:29 PM, bipin wrote: Hi I get this error message when saving a table: parquet.io.ParquetDecodingException: The requested schema is not compatible with the file schema. incompatible types: optional binary PolicyType (UTF8) != optional int32 PolicyType at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.incompatibleSchema(ColumnIOFactory.java:105) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visit(ColumnIOFactory.java:97) at parquet.schema.PrimitiveType.accept(PrimitiveType.java:386) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visitChildren(ColumnIOFactory.java:87) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visit(ColumnIOFactory.java:61) at parquet.schema.MessageType.accept(MessageType.java:55) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:148) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:137) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:157) at parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:107) at parquet.hadoop.InternalParquetRecordWriter.init(InternalParquetRecordWriter.java:94) at parquet.hadoop.ParquetRecordWriter.init(ParquetRecordWriter.java:64) at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:282) at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:252) at org.apache.spark.sql.parquet.ParquetRelation2.org $apache$spark$sql$parquet$ParquetRelation2$$writeShard$1(newParquet.scala:667) at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689) at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) at org.apache.spark.scheduler.Task.run(Task.scala:64) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) I joined two tables both loaded from parquet file, the joined table when saved throws this error. I could not find anything about this error. Could this be a bug ? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Error-in-using-saveAsParquetFile-tp23204.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org -- Hi, Find my attached resume. I have total around 7 years of work experience. I worked for Amazon and Expedia in my
Re: Error in using saveAsParquetFile
Yeah, this does look confusing. We are trying to improve the error reporting by catching similar issues at the end of the analysis phase and give more descriptive error messages. Part of the work can be found here: https://github.com/apache/spark/blob/0902a11940e550e85a53e110b490fe90e16ddaf4/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala Cheng On 6/9/15 2:51 PM, Bipin Nag wrote: Cheng you were right. It works when I remove the field from either one. I should have checked the types beforehand. What confused me is that Spark attempted to join it and midway threw the error. It isn't quite there yet. Thanks for the help. On Mon, Jun 8, 2015 at 8:29 PM Cheng Lian lian.cs@gmail.com mailto:lian.cs@gmail.com wrote: I suspect that Bookings and Customerdetails both have a PolicyType field, one is string and the other is an int. Cheng On 6/8/15 9:15 PM, Bipin Nag wrote: Hi Jeetendra, Cheng I am using following code for joining val Bookings = sqlContext.load(/home/administrator/stageddata/Bookings) val Customerdetails = sqlContext.load(/home/administrator/stageddata/Customerdetails) val CD = Customerdetails. where($CreatedOn 2015-04-01 00:00:00.0). where($CreatedOn 2015-05-01 00:00:00.0) //Bookings by CD val r1 = Bookings. withColumnRenamed(ID,ID2) val r2 = CD. join(r1,CD.col(CustomerID) === r1.col(ID2),left) r2.saveAsParquetFile(/home/administrator/stageddata/BOOKING_FULL); @Cheng I am not appending the joined table to an existing parquet file, it is a new file. @Jitender I have a rather large parquet file and it also contains some confidential data. Can you tell me what you need to check in it. Thanks On 8 June 2015 at 16:47, Jeetendra Gangele gangele...@gmail.com mailto:gangele...@gmail.com wrote: Parquet file when are you loading these file? can you please share the code where you are passing parquet file to spark?. On 8 June 2015 at 16:39, Cheng Lian lian.cs@gmail.com mailto:lian.cs@gmail.com wrote: Are you appending the joined DataFrame whose PolicyType is string to an existing Parquet file whose PolicyType is int? The exception indicates that Parquet found a column with conflicting data types. Cheng On 6/8/15 5:29 PM, bipin wrote: Hi I get this error message when saving a table: parquet.io http://parquet.io.ParquetDecodingException: The requested schema is not compatible with the file schema. incompatible types: optional binary PolicyType (UTF8) != optional int32 PolicyType at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.incompatibleSchema(ColumnIOFactory.java:105) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visit(ColumnIOFactory.java:97) at parquet.schema.PrimitiveType.accept(PrimitiveType.java:386) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visitChildren(ColumnIOFactory.java:87) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visit(ColumnIOFactory.java:61) at parquet.schema.MessageType.accept(MessageType.java:55) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:148) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:137) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:157) at parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:107) at parquet.hadoop.InternalParquetRecordWriter.init(InternalParquetRecordWriter.java:94) at parquet.hadoop.ParquetRecordWriter.init(ParquetRecordWriter.java:64) at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:282) at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:252) at org.apache.spark.sql.parquet.ParquetRelation2.org http://org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$writeShard$1(newParquet.scala:667) at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689) at
Re: Error in using saveAsParquetFile
Parquet file when are you loading these file? can you please share the code where you are passing parquet file to spark?. On 8 June 2015 at 16:39, Cheng Lian lian.cs@gmail.com wrote: Are you appending the joined DataFrame whose PolicyType is string to an existing Parquet file whose PolicyType is int? The exception indicates that Parquet found a column with conflicting data types. Cheng On 6/8/15 5:29 PM, bipin wrote: Hi I get this error message when saving a table: parquet.io.ParquetDecodingException: The requested schema is not compatible with the file schema. incompatible types: optional binary PolicyType (UTF8) != optional int32 PolicyType at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.incompatibleSchema(ColumnIOFactory.java:105) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visit(ColumnIOFactory.java:97) at parquet.schema.PrimitiveType.accept(PrimitiveType.java:386) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visitChildren(ColumnIOFactory.java:87) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visit(ColumnIOFactory.java:61) at parquet.schema.MessageType.accept(MessageType.java:55) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:148) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:137) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:157) at parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:107) at parquet.hadoop.InternalParquetRecordWriter.init(InternalParquetRecordWriter.java:94) at parquet.hadoop.ParquetRecordWriter.init(ParquetRecordWriter.java:64) at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:282) at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:252) at org.apache.spark.sql.parquet.ParquetRelation2.org $apache$spark$sql$parquet$ParquetRelation2$$writeShard$1(newParquet.scala:667) at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689) at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) at org.apache.spark.scheduler.Task.run(Task.scala:64) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) I joined two tables both loaded from parquet file, the joined table when saved throws this error. I could not find anything about this error. Could this be a bug ? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Error-in-using-saveAsParquetFile-tp23204.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org -- Hi, Find my attached resume. I have total around 7 years of work experience. I worked for Amazon and Expedia in my previous assignments and currently I am working with start- up technology company called Insideview in hyderabad. Regards Jeetendra
Re: Error in using saveAsParquetFile
Are you appending the joined DataFrame whose PolicyType is string to an existing Parquet file whose PolicyType is int? The exception indicates that Parquet found a column with conflicting data types. Cheng On 6/8/15 5:29 PM, bipin wrote: Hi I get this error message when saving a table: parquet.io.ParquetDecodingException: The requested schema is not compatible with the file schema. incompatible types: optional binary PolicyType (UTF8) != optional int32 PolicyType at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.incompatibleSchema(ColumnIOFactory.java:105) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visit(ColumnIOFactory.java:97) at parquet.schema.PrimitiveType.accept(PrimitiveType.java:386) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visitChildren(ColumnIOFactory.java:87) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visit(ColumnIOFactory.java:61) at parquet.schema.MessageType.accept(MessageType.java:55) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:148) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:137) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:157) at parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:107) at parquet.hadoop.InternalParquetRecordWriter.init(InternalParquetRecordWriter.java:94) at parquet.hadoop.ParquetRecordWriter.init(ParquetRecordWriter.java:64) at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:282) at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:252) at org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$writeShard$1(newParquet.scala:667) at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689) at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) at org.apache.spark.scheduler.Task.run(Task.scala:64) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) I joined two tables both loaded from parquet file, the joined table when saved throws this error. I could not find anything about this error. Could this be a bug ? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Error-in-using-saveAsParquetFile-tp23204.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: Error in using saveAsParquetFile
Hi Jeetendra, Cheng I am using following code for joining val Bookings = sqlContext.load(/home/administrator/stageddata/Bookings) val Customerdetails = sqlContext.load(/home/administrator/stageddata/Customerdetails) val CD = Customerdetails. where($CreatedOn 2015-04-01 00:00:00.0). where($CreatedOn 2015-05-01 00:00:00.0) //Bookings by CD val r1 = Bookings. withColumnRenamed(ID,ID2) val r2 = CD. join(r1,CD.col(CustomerID) === r1.col(ID2),left) r2.saveAsParquetFile(/home/administrator/stageddata/BOOKING_FULL); @Cheng I am not appending the joined table to an existing parquet file, it is a new file. @Jitender I have a rather large parquet file and it also contains some confidential data. Can you tell me what you need to check in it. Thanks On 8 June 2015 at 16:47, Jeetendra Gangele gangele...@gmail.com wrote: Parquet file when are you loading these file? can you please share the code where you are passing parquet file to spark?. On 8 June 2015 at 16:39, Cheng Lian lian.cs@gmail.com wrote: Are you appending the joined DataFrame whose PolicyType is string to an existing Parquet file whose PolicyType is int? The exception indicates that Parquet found a column with conflicting data types. Cheng On 6/8/15 5:29 PM, bipin wrote: Hi I get this error message when saving a table: parquet.io.ParquetDecodingException: The requested schema is not compatible with the file schema. incompatible types: optional binary PolicyType (UTF8) != optional int32 PolicyType at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.incompatibleSchema(ColumnIOFactory.java:105) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visit(ColumnIOFactory.java:97) at parquet.schema.PrimitiveType.accept(PrimitiveType.java:386) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visitChildren(ColumnIOFactory.java:87) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visit(ColumnIOFactory.java:61) at parquet.schema.MessageType.accept(MessageType.java:55) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:148) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:137) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:157) at parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:107) at parquet.hadoop.InternalParquetRecordWriter.init(InternalParquetRecordWriter.java:94) at parquet.hadoop.ParquetRecordWriter.init(ParquetRecordWriter.java:64) at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:282) at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:252) at org.apache.spark.sql.parquet.ParquetRelation2.org $apache$spark$sql$parquet$ParquetRelation2$$writeShard$1(newParquet.scala:667) at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689) at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) at org.apache.spark.scheduler.Task.run(Task.scala:64) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) I joined two tables both loaded from parquet file, the joined table when saved throws this error. I could not find anything about this error. Could this be a bug ? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Error-in-using-saveAsParquetFile-tp23204.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org -- Hi, Find my attached resume. I have total around 7 years of work experience. I worked for Amazon and Expedia in my previous assignments and currently I am working with start- up technology company called Insideview in hyderabad. Regards Jeetendra
Re: Error in using saveAsParquetFile
I suspect that Bookings and Customerdetails both have a PolicyType field, one is string and the other is an int. Cheng On 6/8/15 9:15 PM, Bipin Nag wrote: Hi Jeetendra, Cheng I am using following code for joining val Bookings = sqlContext.load(/home/administrator/stageddata/Bookings) val Customerdetails = sqlContext.load(/home/administrator/stageddata/Customerdetails) val CD = Customerdetails. where($CreatedOn 2015-04-01 00:00:00.0). where($CreatedOn 2015-05-01 00:00:00.0) //Bookings by CD val r1 = Bookings. withColumnRenamed(ID,ID2) val r2 = CD. join(r1,CD.col(CustomerID) === r1.col(ID2),left) r2.saveAsParquetFile(/home/administrator/stageddata/BOOKING_FULL); @Cheng I am not appending the joined table to an existing parquet file, it is a new file. @Jitender I have a rather large parquet file and it also contains some confidential data. Can you tell me what you need to check in it. Thanks On 8 June 2015 at 16:47, Jeetendra Gangele gangele...@gmail.com mailto:gangele...@gmail.com wrote: Parquet file when are you loading these file? can you please share the code where you are passing parquet file to spark?. On 8 June 2015 at 16:39, Cheng Lian lian.cs@gmail.com mailto:lian.cs@gmail.com wrote: Are you appending the joined DataFrame whose PolicyType is string to an existing Parquet file whose PolicyType is int? The exception indicates that Parquet found a column with conflicting data types. Cheng On 6/8/15 5:29 PM, bipin wrote: Hi I get this error message when saving a table: parquet.io http://parquet.io.ParquetDecodingException: The requested schema is not compatible with the file schema. incompatible types: optional binary PolicyType (UTF8) != optional int32 PolicyType at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.incompatibleSchema(ColumnIOFactory.java:105) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visit(ColumnIOFactory.java:97) at parquet.schema.PrimitiveType.accept(PrimitiveType.java:386) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visitChildren(ColumnIOFactory.java:87) at parquet.io.ColumnIOFactory$ColumnIOCreatorVisitor.visit(ColumnIOFactory.java:61) at parquet.schema.MessageType.accept(MessageType.java:55) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:148) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:137) at parquet.io.ColumnIOFactory.getColumnIO(ColumnIOFactory.java:157) at parquet.hadoop.InternalParquetRecordWriter.initStore(InternalParquetRecordWriter.java:107) at parquet.hadoop.InternalParquetRecordWriter.init(InternalParquetRecordWriter.java:94) at parquet.hadoop.ParquetRecordWriter.init(ParquetRecordWriter.java:64) at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:282) at parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:252) at org.apache.spark.sql.parquet.ParquetRelation2.org http://org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$writeShard$1(newParquet.scala:667) at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689) at org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61) at org.apache.spark.scheduler.Task.run(Task.scala:64) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:745) I joined two tables both loaded from parquet file, the joined table when saved throws this error. I could not find anything about this error. Could this be a bug ? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Error-in-using-saveAsParquetFile-tp23204.html Sent from the Apache Spark User List mailing list archive at