Davis-Zhang-Onehouse commented on PR #18814:
URL: https://github.com/apache/hudi/pull/18814#issuecomment-4746603916

   Good catch @yihua — yes, the previous release hashed the base path after 
only `s3aToS3(...)`, with no trailing-slash handling, so for a caller passing 
`s3://bucket/table` the lock key was `hash("s3://bucket/table")`.
   
   Appending a trailing slash (the original version of this PR) would therefore 
have changed the derived key for that common no-trailing-slash case, orphaning 
in-flight locks on upgrade — exactly the drift we are trying to prevent.
   
   I have updated the PR to canonicalize **without** a trailing slash: 
`normalizeBasePathForLocking` now trims whitespace, applies `s3aToS3`, and 
strips all trailing slashes (it no longer appends one). Consequences:
   
   - **Common case (no trailing slash):** `"s3://bucket/table"` → 
`"s3://bucket/table"` — identical to the prior release, so the 
hash/partition-key/znode is unchanged and there is **no behavior change on 
upgrade**.
   - **Trailing-slash / multi-slash / whitespace / `s3a` variants:** all 
collapse onto that same canonical no-slash form, which is the intended fix 
(writers that disagree only on these benign details now share one lock).
   
   Updated `TestFSUtils#testNormalizeBasePathForLocking` and the 
`ITTestDynamoDBBasedLockProvider` assertion accordingly. PTAL.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to