Oops, I just saw the link. It is not actually only for Spark 2.0.
To be clear, https://issues.apache.org/jira/browse/SPARK-15393 was a bit different from your case (it was about writing an empty data frame with empty partitions). That was caused by https://github.com/apache/spark/pull/12855, which was reverted. I described your case in the comments on that JIRA.

2016-06-15 10:26 GMT+09:00 Hyukjin Kwon <gurwls...@gmail.com>:

> Yeah, I met this case before. I guess this is related to
> https://issues.apache.org/jira/browse/SPARK-15393.
>
> 2016-06-15 8:46 GMT+09:00 antoniosi <antonio...@gmail.com>:
>
>> I tried the following code in both Spark 1.5.1 and Spark 1.6.0:
>>
>> import org.apache.spark.sql.types.{
>>   StructType, StructField, StringType, IntegerType}
>> import org.apache.spark.sql.Row
>>
>> val schema = StructType(
>>   StructField("k", StringType, true) ::
>>   StructField("v", IntegerType, false) :: Nil)
>>
>> val df = sqlContext.createDataFrame(sc.emptyRDD[Row], schema)
>> df.write.save("hdfs://xxx")
>>
>> Both 1.5.1 and 1.6.0 save only the _SUCCESS file. They do not save any
>> _metadata files.
>>
>> In 1.6.0, it also gives the following error:
>>
>> 16/06/14 16:29:27 WARN ParquetOutputCommitter: could not write summary file for hdfs://xxx
>> java.lang.NullPointerException
>>   at org.apache.parquet.hadoop.ParquetFileWriter.mergeFooters(ParquetFileWriter.java:456)
>>   at org.apache.parquet.hadoop.ParquetFileWriter.writeMetadataFile(ParquetFileWriter.java:420)
>>   at org.apache.parquet.hadoop.ParquetOutputCommitter.writeMetaDataFile(ParquetOutputCommitter.java:58)
>>   at org.apache.parquet.hadoop.ParquetOutputCommitter.commitJob(ParquetOutputCommitter.java:48)
>>   at org.apache.spark.sql.execution.datasources.BaseWriterContainer.commitJob(WriterContainer.scala:230)
>>   at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelation.scala:151)
>>   at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1.apply(InsertIntoHadoopFsRelation.scala:108)
>>   at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1.apply(InsertIntoHadoopFsRelation.scala:108)
>>
>> I do not get this exception in version 1.5.1, though.
>>
>> I see this bug, https://issues.apache.org/jira/browse/SPARK-15393, but
>> it is for Spark 2.0. Is there the same bug in Spark 1.5.1 and 1.6?
>>
>> Is there a way we could save an empty dataframe properly?
>>
>> Thanks.
>>
>> Antonio.
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Writing-empty-Dataframes-doesn-t-save-any-metadata-files-in-Spark-1-5-1-and-1-6-tp27169.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.