mao-liu commented on issue #6563:
URL: https://github.com/apache/paimon/issues/6563#issuecomment-3918053175

   > I took a slightly different approach and added conditional writes to the 
snapshot file upload instead, so that no separate lock was needed. I do like 
how the storage-based Lock implementation would reduce the number of retries 
though. I wonder how frequent your commits would have to be before lock 
contention became an issue vs retries taking more time. Might be worth a 
benchmark to figure out the best performance.
   
   @tub I'm quite curious about how you're approaching updates to the 
`snapshot/LATEST` file. This file is overwritten on each successful commit; 
are conditional-write semantics applied when updating it (e.g. matching on 
the ETag)?
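   
   To make the question concrete, here is a minimal sketch of the kind of 
ETag-matched overwrite I have in mind. `InMemoryStore`, `PreconditionFailed` 
and `advance_latest` are hypothetical names used only for illustration; they 
are not Paimon or object-store SDK APIs, and the store is an in-memory 
stand-in for a backend that supports `If-Match`-style conditional writes.

```python
import hashlib


class PreconditionFailed(Exception):
    """Raised when the supplied ETag no longer matches the stored object."""


class InMemoryStore:
    """Hypothetical stand-in for an object store with If-Match semantics."""

    def __init__(self):
        self._objects = {}  # key -> (etag, body)

    def get(self, key):
        """Return the (etag, body) pair currently stored under key."""
        return self._objects[key]

    def put(self, key, body, if_match=None):
        """Write body; if if_match is given, fail unless it equals the current ETag."""
        current = self._objects.get(key)
        current_etag = current[0] if current else None
        if if_match is not None and if_match != current_etag:
            raise PreconditionFailed(key)
        etag = hashlib.md5(body).hexdigest()
        self._objects[key] = (etag, body)
        return etag


def advance_latest(store, new_snapshot_id, key="snapshot/LATEST"):
    """Overwrite the pointer only if nobody changed it since we read it."""
    etag, body = store.get(key)
    if int(body) >= new_snapshot_id:
        return False  # the pointer already moved past our snapshot
    try:
        store.put(key, str(new_snapshot_id).encode(), if_match=etag)
        return True
    except PreconditionFailed:
        return False  # lost the race; the caller decides whether to retry
```

   With semantics like these, two committers that both read the same ETag 
cannot both succeed in overwriting the file.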
   
   I ended up implementing this lock in user code, as it was easy enough to 
implement an object lock following the interface, and it gives us benefits 
such as having locks around schema updates too.
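   
   Roughly, the idea looks like the sketch below. This is not our actual 
code and not Paimon's lock interface: `LockStore`, `ObjectLock` and 
`run_with_lock` are hypothetical names, and the store is an in-memory 
stand-in for a backend that offers an atomic create-if-absent write.

```python
import time


class LockStore:
    """Hypothetical object store exposing atomic create-if-absent and delete."""

    def __init__(self):
        self._keys = set()

    def put_if_absent(self, key):
        """Create the marker object; return False if it already exists."""
        if key in self._keys:
            return False
        self._keys.add(key)
        return True

    def delete(self, key):
        """Remove the marker object if present."""
        self._keys.discard(key)


class ObjectLock:
    """Runs an action while holding a marker object as a lock."""

    def __init__(self, store, key, retries=5, wait_seconds=0.01):
        self.store = store
        self.key = key
        self.retries = retries
        self.wait_seconds = wait_seconds

    def run_with_lock(self, action):
        """Acquire the lock, run action, and always release afterwards."""
        for _ in range(self.retries):
            if self.store.put_if_absent(self.key):
                try:
                    return action()
                finally:
                    self.store.delete(self.key)
            time.sleep(self.wait_seconds)  # lock is held elsewhere; wait and retry
        raise TimeoutError(f"could not acquire lock {self.key}")
```

   Because the action is just a callable, the same lock can wrap a commit or 
a schema update. A real implementation would also need some form of expiry so 
that a crashed holder does not block everyone forever; that is left out here.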
   
   > I wonder how frequent your commits would have to be before lock contention 
became an issue vs retries taking more time.
   
   In my observations so far, the global committer retries automatically when 
it encounters lock contention (options `commit.max-retries`, 
`commit.max-retry-wait`), and the job facing lock contention can simply wait 
until other processes release the lock. I guess the severity of the issue 
would depend on how long the global committer holds the lock, which in turn 
depends on table configuration and data volumes...
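
   As a rough illustration of how a retry limit and a wait cap bound the time 
a contended job spends waiting, here is a generic capped-backoff loop. It is 
only a sketch under my own assumptions: `commit_with_retries` and its 
parameters are hypothetical, and the doubling schedule is not taken from 
Paimon's implementation.

```python
import time


def commit_with_retries(try_commit, max_retries=10, min_wait=0.01,
                        max_wait=10.0, sleep=time.sleep):
    """Call try_commit until it succeeds; return the number of retries used.

    try_commit returns True on success and False when the lock is contended.
    The wait doubles after each failed attempt but never exceeds max_wait.
    """
    wait = min_wait
    for attempt in range(max_retries + 1):
        if try_commit():
            return attempt
        if attempt < max_retries:
            sleep(wait)
            wait = min(wait * 2, max_wait)
    raise RuntimeError(f"commit failed after {max_retries} retries")
```

   The worst-case wait is bounded by the retry limit times the wait cap, so 
whether that is acceptable comes down to how long the lock holder typically 
keeps the lock.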


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

