kbendick opened a new issue #4007:
URL: https://github.com/apache/iceberg/issues/4007
As of Iceberg 0.13.0, the Spark stored procedures `remove_orphan_files` and
`expire_snapshots` both support controlling the parallelism of deletes via a
parameter `max_concurrent_deletes`.
By default, when this parameter is not set, each action deletes its files
serially in the current thread.
When set, the parameter is used to instantiate a thread pool of the given size,
which is passed to the method `executeDeleteWith(ExecutorService executorService)`,
so that the deletes run in a worker thread pool.
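
To illustrate the difference between the two modes, here is a minimal JDK-only sketch. It is not Iceberg's implementation: the class `ParallelDeleteSketch`, the helper `deleteAll`, and the sample file names are hypothetical, and the delete function only records paths instead of touching storage. It assumes that a null executor stands for the default serial behaviour and that a fixed pool sized by `max_concurrent_deletes` stands for the parallel one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

public class ParallelDeleteSketch {

  // Hypothetical helper (not an Iceberg API): applies deleteFunc to every path.
  // A null executor means "delete serially in the calling thread".
  static void deleteAll(List<String> paths, Consumer<String> deleteFunc, ExecutorService executor)
      throws InterruptedException, ExecutionException {
    if (executor == null) {
      for (String path : paths) {
        deleteFunc.accept(path);
      }
      return;
    }

    // Submit one delete task per file, then wait for all of them to finish.
    List<Future<?>> pending = new ArrayList<>();
    for (String path : paths) {
      pending.add(executor.submit(() -> deleteFunc.accept(path)));
    }
    for (Future<?> future : pending) {
      future.get(); // rethrows a failed delete as ExecutionException
    }
  }

  public static void main(String[] args) throws Exception {
    // Made-up file names, for illustration only.
    List<String> paths = List.of("a.parquet", "b.parquet", "c.parquet", "d.parquet");

    // Stand-in for the value passed as max_concurrent_deletes.
    int maxConcurrentDeletes = 4;
    ExecutorService pool = Executors.newFixedThreadPool(maxConcurrentDeletes);

    // Thread-safe collection, since deletes may run on several worker threads.
    ConcurrentLinkedQueue<String> deleted = new ConcurrentLinkedQueue<>();
    try {
      deleteAll(paths, deleted::add, pool);
    } finally {
      pool.shutdown();
    }
    System.out.println("deleted " + deleted.size() + " files");
  }
}
```

With a real object store each delete is a network round trip, which is why running them on a pool rather than one after another can shorten the action considerably.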
Without this parameter, file deletes can take significantly longer, so we
should document it for both procedures.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]