Kristin Cowalcijk created SEDONA-632:
----------------------------------------
Summary: Don't use a conventional output committer when writing
raster files using {{df.write.format("raster")}}
Key: SEDONA-632
URL: https://issues.apache.org/jira/browse/SEDONA-632
Project: Apache Sedona
Issue Type: Improvement
Reporter: Kristin Cowalcijk
Fix For: 1.6.1
Writing a large number of raster files to a distributed file system or object
store is very slow, because the output committer has to move every file from
its temporary location to its target location. Users see that all tasks have
completed, but the driver appears stuck in the commit phase.
We'll add an option {{useDirectCommitter}} to the raster format. By default
{{useDirectCommitter}} is {{true}}, and the raster format will use a direct
committer that writes raster files to their target locations directly. Users
can manually set it to {{false}} if they want the original behavior:
{code:python}
df.write.format("raster").option("useDirectCommitter", "false").save("/target/location")
{code}
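To illustrate why skipping the move step helps, here is a toy model on the local file system. It is only a sketch and is not Sedona's or Spark's committer code: the function names {{write_with_rename_commit}} and {{write_direct}} and the fake tile data are made up for illustration. The first function stages files under a temporary directory and moves them at commit time; the second writes each file straight to its target location. On many object stores a move is implemented as a copy followed by a delete, so the per-file cost of the commit loop is much higher there than in this local example.

```python
import shutil
import tempfile
from pathlib import Path


def write_with_rename_commit(target: Path, tiles: dict) -> int:
    """Toy model of a conventional committer: stage files, then move them."""
    staging = target / "_temporary"
    staging.mkdir(parents=True, exist_ok=True)
    # "Task" phase: every file is first written under the staging directory.
    for name, data in tiles.items():
        (staging / name).write_bytes(data)
    # "Commit" phase: files are moved one by one into the target directory.
    moves = 0
    for staged in sorted(staging.iterdir()):
        shutil.move(str(staged), str(target / staged.name))
        moves += 1
    staging.rmdir()
    return moves


def write_direct(target: Path, tiles: dict) -> int:
    """Toy model of a direct committer: write files to their final location."""
    target.mkdir(parents=True, exist_ok=True)
    for name, data in tiles.items():
        (target / name).write_bytes(data)
    return 0  # nothing is left to move at commit time


# Hypothetical stand-ins for raster files.
tiles = {f"tile_{i}.tif": b"fake raster bytes" for i in range(5)}

with tempfile.TemporaryDirectory() as root:
    renamed_dir = Path(root) / "renamed"
    direct_dir = Path(root) / "direct"
    moves_renamed = write_with_rename_commit(renamed_dir, tiles)
    moves_direct = write_direct(direct_dir, tiles)
    # Both strategies should leave the same set of files in the target.
    same_output = sorted(p.name for p in renamed_dir.iterdir()) == sorted(
        p.name for p in direct_dir.iterdir()
    )

print(moves_renamed, moves_direct, same_output)
```

Both strategies produce the same files; the difference is that the staged variant performs one extra move per file during the commit, which is the phase where the driver appears stuck.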
--
This message was sent by Atlassian Jira
(v8.20.10#820010)