On 30 Sep 2018, at 19:37, Jacek Laskowski
<[email protected]<mailto:[email protected]>> wrote:
scala> spark.range(1).write.saveAsTable("demo")
2018-09-30 17:44:27 WARN ObjectStore:568 - Failed to get database global_temp, returning NoSuchObjectException
2018-09-30 17:44:28 ERROR FileOutputCommitter:314 - Mkdirs failed to create file:/user/hive/warehouse/demo/_temporary/0
2018-09-30 17:44:28 ERROR Utils:91 - Aborting task
java.io.IOException: Mkdirs failed to create file:/user/hive/warehouse/demo/_temporary/0/_temporary/attempt_20180930174428_0000_m_000007_0 (exists=false, cwd=file:/Users/jacek/dev/apps/spark-2.3.2-bin-hadoop2.7)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:455)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:440)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:911)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:892)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:789)
at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:241)
at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:342)
at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:302)
at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetOutputWriter.scala:37)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anon$1.newInstance(ParquetFileFormat.scala:151)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.newOutputWriter(FileFormatWriter.scala:367)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:378)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$d
The java.io.File.mkdirs() API just returns false when it can't create a
directory, for any reason, without any explanation. Causes include: the
directory already existing, a file sitting at that path, no disk space, no
permissions, etc.
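A minimal sketch of the problem, paste-able into the same spark-shell (the
path is the one from your log; the exact exception you'd get back is just an
example):

  import java.io.File
  import java.nio.file.{Files, Paths}

  // File.mkdirs() collapses every failure mode into a single boolean:
  val ok = new File("/user/hive/warehouse/demo/_temporary/0").mkdirs()
  println(ok)  // false -- no hint whether it's permissions, a file in the way, ...

  // The NIO equivalent at least throws a descriptive exception:
  Files.createDirectories(Paths.get("/user/hive/warehouse/demo/_temporary/0"))
  // e.g. java.nio.file.AccessDeniedException: /user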
Here I believe it's trying to create the parent dir, gets back false,
ChecksumFileSystem checks whether the parent already exists (it doesn't), and
raises the IOE.
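Paraphrasing from memory rather than the actual Hadoop source (the method
name below is hypothetical, but the shape matches the exists=/cwd= detail in
your trace):

  import java.io.{File, IOException}

  // hypothetical sketch of the local FS create() path, not the real code:
  def ensureParentDir(f: File): Unit = {
    val parent = f.getParentFile
    // mkdirs() came back false; if the parent still isn't a directory, give up
    if (parent != null && !parent.mkdirs() && !parent.isDirectory) {
      throw new IOException(s"Mkdirs failed to create $parent " +
        s"(exists=${parent.exists}, cwd=${new File(".").getAbsolutePath})")
    }
  }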
Possible causes I can see then are (quick checks sketched below):
* permissions
* space
* wrong path
Is this running across a cluster? If so, the fact that it's using file:// is
unusual, unless you've got some shared NFS-mounted storage.
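For the local-filesystem case, some ad-hoc java.io probes can narrow down
which of the three causes it is (paths below are the ones from your log;
nothing Spark-specific):

  import java.io.File

  val warehouse = new File("/user/hive/warehouse")
  val top = new File("/user")
  val fsRoot = new File("/")

  // wrong path: does the directory, or even its first component, exist?
  println(s"warehouse exists=${warehouse.exists}, /user exists=${top.exists}")

  // permissions: can this user write there? (false if it doesn't exist)
  println(s"writable=${warehouse.canWrite}")

  // space: usable bytes on the partition
  println(s"usableSpace=${fsRoot.getUsableSpace}")

Given the cwd is under /Users/jacek this looks like a Mac, where /user
doesn't normally exist and creating it at the root needs elevated privileges,
so path/permissions would be my first guess. If that's it, pointing
spark.sql.warehouse.dir at somewhere writable should unblock saveAsTable.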