I ran into this same issue. The only workaround seems to be coercing the DataFrame's schema back into the right state. You have to round-trip the DataFrame through an RDD, which adds some overhead, but otherwise this worked for me:
val newDF = sqlContext.createDataFrame(
  origDF.rdd,
  new StructType(
    origDF.schema.map(sf => new StructField(sf.name, sf.dataType, false)).toArray))

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Nullable-is-true-for-the-schema-of-parquet-data-tp22837p22840.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
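One caveat for anyone copy-pasting: mapping over `origDF.schema` only flips `nullable` on the top-level fields; fields nested inside a struct keep their original nullability. The core transformation can be sketched standalone like this — note the `StructField`/`StructType` case classes below are hypothetical stand-ins so the snippet runs without Spark; real code would import `org.apache.spark.sql.types._` instead:

```scala
// Hypothetical stand-ins for Spark's schema types (assumption: real code
// imports org.apache.spark.sql.types._ rather than defining these).
case class StructField(name: String, dataType: String, nullable: Boolean)
case class StructType(fields: Seq[StructField])

object NonNullableSchema {
  // Same idea as the workaround above: rebuild the schema with every
  // top-level field copied to nullable = false.
  def forceNonNullable(schema: StructType): StructType =
    StructType(schema.fields.map(_.copy(nullable = false)))

  def main(args: Array[String]): Unit = {
    val schema = StructType(Seq(
      StructField("id", "LongType", nullable = true),
      StructField("name", "StringType", nullable = true)))
    val fixed = forceNonNullable(schema)
    println(fixed.fields.forall(!_.nullable)) // prints "true"
  }
}
```

With the real Spark types, `sf.copy(nullable = false)` also preserves the field's metadata, which the explicit `new StructField(sf.name, sf.dataType, false)` constructor call drops.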