Davis-Zhang-Onehouse opened a new pull request, #18814: URL: https://github.com/apache/hudi/pull/18814
### Describe the issue this Pull Request addresses The implicit-key lock providers — `DynamoDBBasedImplicitPartitionKeyLockProvider` and `ZookeeperBasedImplicitBasePathLockProvider` — hash the hudi table base path to derive a DynamoDB partition key / Zookeeper znode for the table-level lock. The hash function (XXH64) is avalanche: any byte-level difference in the input produces a completely different output. Today the only normalization applied before hashing is `s3aToS3` (`s3a://` → `s3://`). Trailing slashes, repeated trailing slashes, and surrounding whitespace pass through unchanged. When two writers for the same hudi table disagree on those benign formatting details — for example, one engine supplies `"s3://bucket/table"` while another supplies `"s3://bucket/table/"` — they end up acquiring **different** lock rows / znodes and lose mutual exclusion, even though they are targeting the same table. Two writers that should serialize can then write concurrently and corrupt the hudi timeline. ### Summary and Changelog Introduce `FSUtils.normalizeBasePathForLocking(String)` as the single source of truth for canonicalization before hashing: 1. Reject null / empty / whitespace-only basePath. 2. `trim()` surrounding whitespace. 3. Apply existing `s3aToS3` (case-insensitive `s3a://` → `s3://`). 4. Strip all trailing `/` then add exactly one. Wire it into both implicit-key lock providers: - `DynamoDBBasedImplicitPartitionKeyLockProvider` — also gains a small public static `derivePartitionKey(String)` so the formula is testable without a DynamoDB client. - `ZookeeperBasedImplicitBasePathLockProvider` — `getLockBasePath(String)` now normalizes before hashing. Both providers also log the normalized form alongside the original input, making it easier to diagnose any future drift. **Deliberately not normalized:** - **Inner consecutive slashes** (`s3://b//x` vs `s3://b/x`) — these can resolve to a legitimate S3 key; rewriting them could collide unrelated tables. - **Scheme case beyond `s3a`** (e.g. `S3://`) — never observed in practice and out of scope. - **URL encoding** — Hudi does not re-encode paths internally. ### Impact - **Behavior:** Implicit-key lock providers will produce the same lock for a hudi table regardless of trailing-slash, multi-slash, surrounding whitespace, or `s3a`/`s3` scheme variations in the base path supplied to the `LockConfiguration`. No public API signature change. - **Performance:** Negligible — one extra `trim()` and a short backward scan over the path string per `getDynamoDBPartitionKey` / `getLockBasePath` call. - **Compatibility (rollout caveat):** Locks taken under the previous (no-trailing-slash or whitespace-sensitive) form effectively orphan at deploy time, since the new code looks at a different lock row / znode for the same logical table. Operators upgrading should coordinate the deploy across all writers that share a hudi table, or briefly quiesce writers while rolling out. Stale DynamoDB rows can be left to expire naturally (heartbeat-based) or cleaned up manually. ### Risk Level medium The lock-key derivation changes for an existing canonical input form (trailing-slash callers will see the same hash as before, since `"s3://b/x/"` → strip → `"s3://b/x"` → add `/` → `"s3://b/x/"`). The risk is concentrated on callers who previously supplied the basePath without a trailing slash — those locks will key under a new row / znode after this change, which is the entire point of the fix but does mean any in-flight lock under the old key is dropped at deploy time. Mitigation: - Existing `TestZookeeperBasedLockProvider` continues to pass with no changes. - New tests cover the invariant directly: trailing slash, multi-slash, whitespace, and `s3a`/`s3` variants all produce the same key. - Local Maven runs: `TestFSUtils` (13 cases), `TestDynamoDBBasedImplicitPartitionKeyLockProvider` (4 cases), `TestZookeeperBasedImplicitBasePathLockProvider` (4 cases) all pass. ### Documentation Update none — no user-facing config or website change. The new helper is internal. ### Contributor's checklist - [x] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute) - [x] Enough context is provided in the sections above - [x] Adequate tests were added if applicable -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
