mao-liu commented on issue #6563: URL: https://github.com/apache/paimon/issues/6563#issuecomment-3918053175
> I took a slightly different approach and added conditional writes to the snapshot file upload instead, so that no separate lock was needed. I do like how the storage-based Lock implementation would reduce the number of retries though. I wonder how frequent your commits would have to be before lock contention became an issue vs retries taking more time. Might be worth a benchmark to figure out the best performance.

@tub I'm quite curious about how you're approaching updates to the `snapshot/LATEST` file. This file is overwritten on each successful commit; are conditional-write semantics applied when updating it (e.g. matching on ETag)?

I ended up implementing this lock in user code, as it was easy enough to implement an object lock following the interface, and it gives us benefits such as having locks around schema updates too.

> I wonder how frequent your commits would have to be before lock contention became an issue vs retries taking more time.

In my observations thus far, the global committer has some automatic retries when encountering lock contention (options `commit.max-retries`, `commit.max-retry-wait`), and the job facing lock contention is okay to wait until other processes release the lock. I guess the severity of the issue would depend on how long the global committer takes while holding the lock, which in turn depends on table configuration and data volumes...
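
To make the ETag question concrete, here is a minimal sketch of what a compare-and-swap update of `snapshot/LATEST` could look like. It is only an illustration under stated assumptions: `InMemoryObjectStore`, `PreconditionFailed`, and `commit_snapshot` are hypothetical names, the store is an in-memory stand-in rather than a real object-store client, and none of this is Paimon's actual commit path.

```python
import hashlib
import threading


class PreconditionFailed(Exception):
    """Raised when a conditional write does not match the stored object."""


class InMemoryObjectStore:
    """Hypothetical stand-in for an object store with conditional writes."""

    def __init__(self):
        self._objects = {}  # key -> (body, etag)
        self._mutex = threading.Lock()

    def get(self, key):
        """Return (body, etag), or (None, None) if the key does not exist."""
        with self._mutex:
            return self._objects.get(key, (None, None))

    def put(self, key, body, if_match=None, if_none_match=False):
        """Write body, failing if the requested precondition does not hold."""
        with self._mutex:
            _, current_etag = self._objects.get(key, (None, None))
            if if_none_match and current_etag is not None:
                raise PreconditionFailed(f"{key} already exists")
            if if_match is not None and current_etag != if_match:
                raise PreconditionFailed(f"{key} changed since it was read")
            etag = hashlib.sha256(body).hexdigest()
            self._objects[key] = (body, etag)
            return etag


def commit_snapshot(store, max_retries=5):
    """Advance snapshot/LATEST by one using compare-and-swap on the ETag."""
    key = "snapshot/LATEST"
    for _ in range(max_retries):
        body, etag = store.get(key)
        next_id = 1 if body is None else int(body) + 1
        payload = str(next_id).encode()
        try:
            if etag is None:
                # First commit: only succeed if nobody created the file yet.
                store.put(key, payload, if_none_match=True)
            else:
                # Later commits: only succeed if the file is unchanged.
                store.put(key, payload, if_match=etag)
            return next_id
        except PreconditionFailed:
            continue  # another writer won the race; re-read and retry
    raise RuntimeError("gave up after too many conflicting commits")
```

In this sketch a writer that loses the race re-reads the file and retries, which is the retry cost being weighed against lock contention in the quoted comment.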
