Kristin Cowalcijk created SEDONA-632:
----------------------------------------

             Summary: Don't use a conventional output committer when writing 
raster files using {{df.write.format("raster")}}
                 Key: SEDONA-632
                 URL: https://issues.apache.org/jira/browse/SEDONA-632
             Project: Apache Sedona
          Issue Type: Improvement
            Reporter: Kristin Cowalcijk
             Fix For: 1.6.1


Writing a large number of raster files to a distributed file system or object 
store is very slow, because the output committer has to move every file from 
its temporary location to its target location. Users will see that all tasks 
have completed while the driver appears stuck in the commit phase.

We'll add an option {{useDirectCommitter}} to the raster format. By default 
{{useDirectCommitter}} is {{true}}, and the raster format will use a direct 
committer that writes raster files to their target locations directly. Users 
can manually set it to {{false}} if they want the original behavior:

{code:python}
df.write.format("raster").option("useDirectCommitter", "false").save("/target/location")
{code}
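
To illustrate the difference between the two commit strategies described above, here is a minimal local-filesystem sketch. It is not Sedona's implementation: the function names {{staged_commit}} and {{direct_commit}}, the {{_temporary}} directory name and the tile file names are all assumptions made for illustration. It only shows that the staged approach needs one extra move per output file at commit time, while the direct approach needs none.

```python
import os
import tempfile


def staged_commit(target_dir, outputs):
    """Conventional style (illustrative): write under a temporary directory,
    then move every file to its final location in a separate commit step."""
    staging_dir = os.path.join(target_dir, "_temporary")  # hypothetical name
    os.makedirs(staging_dir, exist_ok=True)
    for name, data in outputs.items():
        with open(os.path.join(staging_dir, name), "wb") as f:
            f.write(data)

    # Commit phase: one move per file, performed serially by a single process.
    moves = 0
    for name in outputs:
        os.replace(os.path.join(staging_dir, name),
                   os.path.join(target_dir, name))
        moves += 1
    os.rmdir(staging_dir)
    return moves


def direct_commit(target_dir, outputs):
    """Direct style (illustrative): write each file straight to its target
    location, so the commit step has nothing left to move."""
    os.makedirs(target_dir, exist_ok=True)
    for name, data in outputs.items():
        with open(os.path.join(target_dir, name), "wb") as f:
            f.write(data)
    return 0


# Hypothetical raster tiles: file name -> bytes.
outputs = {f"tile_{i}.tiff": bytes([i]) * 16 for i in range(5)}

with tempfile.TemporaryDirectory() as root:
    staged_dir = os.path.join(root, "staged")
    direct_dir = os.path.join(root, "direct")
    staged_moves = staged_commit(staged_dir, outputs)
    direct_moves = direct_commit(direct_dir, outputs)
    staged_files = sorted(os.listdir(staged_dir))
    direct_files = sorted(os.listdir(direct_dir))

print(staged_moves, direct_moves, staged_files == direct_files)
```

On a local disk each move is cheap, but on many object stores a rename is typically implemented as a copy followed by a delete, so the per-file cost in the commit phase grows with file count and size. The usual trade-off of writing directly is that a failed job may leave partial output at the target location.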




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
