Jia Yu created SEDONA-494:
-----------------------------
Summary: Raster data source cannot write to HDFS
Key: SEDONA-494
URL: https://issues.apache.org/jira/browse/SEDONA-494
Project: Apache Sedona
Issue Type: Bug
Reporter: Jia Yu
h2. Reproduce
When running the following code:
var df = spark.read.format("binaryFile").load("/user/spark/raster/input.tif")
df.write.format("raster").mode(org.apache.spark.sql.SaveMode.Overwrite).save("output")
Only a _SUCCESS file is found in the output path.
The HDFS audit.log shows that the TIFF file was created, but there is no
'rename' command for it.
The executor log contains:
"SparkHadoopMapRedUtil: No need to commit output of task because
needsTaskCommit=false: attempt_xxx
BasicWriteTaskStatsTracker: Expected 1 files, but only saw 0. This could be due
to the output format not writing empty files, or files being not immediately
visible in the filesystem."
h2. Solution:
In "org.apache.spark.sql.sedona_sql.io.raster.RasterFileFormat.scala", changing
val out = hfs.create(new Path(Paths.get(savePath, new Path(rasterFilePath).getName).toString))
=>
val out = hfs.create(new Path(savePath, new Path(rasterFilePath).getName))
solves the problem.
Paths.get should not be used to build paths for Hadoop FileSystem
implementations.
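A minimal sketch of one likely way Paths.get breaks such a path: it treats the string as a local file path and, on a Unix-like JVM, collapses the "//" after the scheme, so the URI loses its authority. The host "namenode:8020" and the object name PathsGetDemo are hypothetical and only for illustration; the real savePath is whatever Spark passes to the writer. Only the JDK is used here, so it runs without Hadoop on the classpath.

```scala
import java.net.URI
import java.nio.file.Paths

object PathsGetDemo {
  // Hypothetical values for illustration only.
  val savePath = "hdfs://namenode:8020/user/spark/output"
  val fileName = "input.tif"

  // java.nio interprets savePath as a local path and removes the
  // redundant slash, so "hdfs://" becomes "hdfs:/" on a Unix-like JVM.
  def mangled: String = Paths.get(savePath, fileName).toString

  def main(args: Array[String]): Unit = {
    println(mangled)
    // The original string carries an authority (host:port) ...
    println(new URI(savePath).getAuthority)
    // ... the mangled one does not, so it no longer names the same location.
    println(new URI(mangled).getAuthority)
  }
}
```

On Windows, Paths.get is expected to reject the ":" characters instead of rewriting them, which is a further reason to build the path with Hadoop's Path(parent, child) constructor as proposed above.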