On Thu, 12 Feb 2026 22:33:15 +0100 Robin Jarry <[email protected]> wrote:
> When rte_hash_add_key_data() is called with an existing key, the old > data pointer is silently overwritten. With RCU-protected readers still > potentially accessing the old data, the application cannot safely free > it since the old pointer is never returned. > > This series provides two ways to address the problem: > > * Automatic: when RCU is configured with a free_key_data_func callback, > rte_hash_add_key_data() now automatically defers freeing the old data > on overwrite. No caller changes are needed. > > * Explicit: a new rte_hash_replace_key_data() API returns the previous > data pointer via an output parameter, giving the caller full control > over the old pointer's lifecycle. No automatic deferred free is > performed in that case. > > Robin Jarry (2): > hash: free replaced data on overwrite when RCU is configured > hash: add replace API returning old data on overwrite > > app/test/test_hash.c | 214 +++++++++++++++++++++++++ > doc/guides/rel_notes/release_26_03.rst | 12 ++ > lib/hash/rte_cuckoo_hash.c | 146 +++++++++++++---- > lib/hash/rte_hash.h | 75 ++++++++- > 4 files changed, 417 insertions(+), 30 deletions(-) Subtle change, so fed to AI and got. Here's the deep dive review. The two most substantive findings: 1. Fragile overwrite/delete discriminator in __hash_rcu_qsbr_free_resource — The dequeue callback uses old_data != NULL to distinguish overwrite entries from delete entries, but the enqueue side already sets key_idx = 0 as a sentinel for overwrites. Using key_idx == 0 would be more robust, because if an application legitimately stores NULL data and overwrites the key, the current code will fall through to the delete path and try to recycle key index 0 (the dummy slot). 2. Leaked old data on rte_rcu_qsbr_dq_enqueue failure — The enqueue failure in __hash_rcu_auto_free_old_data logs an error but the old data becomes permanently unreachable. This is consistent with the existing delete-path pattern, but it's a real resource leak on error. The overall design is solid — the split between automatic RCU free (patch 1) for existing users and explicit replace API (patch 2) for fine-grained control is clean, and the interaction between the two is correctly ordered in the overwrite: label. Long review. # DPDK Hash Library Patch Series Review **Series**: hash: free replaced data on overwrite / add replace API **Author**: Robin Jarry <[email protected]> **Patches**: 2 **Date**: 2026-02-12 --- ## Patch 1/2: hash: free replaced data on overwrite when RCU is configured ### Summary When `rte_hash_add_key_data()` overwrites an existing key, the old data pointer was silently lost. With RCU-protected readers still potentially accessing the old data, applications had no safe way to free it. This patch adds automatic deferred freeing of the old data pointer via the RCU defer queue when a `free_key_data_func` callback is configured. ### Correctness Analysis #### **Error — Potential use of NULL data treated as "no overwrite" (Medium confidence ~60%)** In `__rte_hash_add_key_with_hash()`, the overwrite detection relies on: ```c overwrite: if (saved_old_data != NULL) __hash_rcu_auto_free_old_data(h, saved_old_data); return ret; ``` If an application legitimately stores `NULL` as the data pointer for a key (which `rte_hash_add_key_data()` permits — `data` is `void *` with no restriction against NULL), and then overwrites that key with new data, `saved_old_data` will be `NULL` and the overwrite path will silently skip the auto-free. This is probably fine for the auto-free case (freeing NULL is a no-op anyway), but it means the code cannot distinguish "no overwrite happened" from "overwrite happened, old data was NULL." This becomes more significant in Patch 2/2 where `old_data` is exposed to the caller — see below. #### **Error — `__hash_rcu_qsbr_free_resource` overwrite detection uses `old_data != NULL` sentinel (Medium-high confidence ~70%)** In the dequeue callback: ```c /* Overwrite case: free old data only, do not recycle slot */ if (rcu_dq_entry.old_data != NULL) { h->hash_rcu_cfg->free_key_data_func( h->hash_rcu_cfg->key_data_ptr, rcu_dq_entry.old_data); return; } ``` This uses `old_data != NULL` to distinguish overwrite entries from delete entries. But overwrite entries are enqueued with `key_idx = 0` as the sentinel: ```c struct __rte_hash_rcu_dq_entry rcu_dq_entry = { .key_idx = 0, .ext_bkt_idx = EMPTY_SLOT, .old_data = old_data_val, }; ``` The `key_idx == 0` check would be a more robust sentinel since key index 0 is the reserved dummy entry in the cuckoo hash. Using `old_data != NULL` means that if someone stores NULL data and then overwrites, the entry flows through to the delete path which will try to access `keys[0]` — the dummy slot. The consequences depend on what's in the dummy slot, but it's at minimum semantically wrong. **Recommendation**: check `key_idx == 0` instead of `old_data != NULL` as the overwrite discriminator: ```c if (rcu_dq_entry.key_idx == 0) { /* Overwrite case */ if (rcu_dq_entry.old_data != NULL) h->hash_rcu_cfg->free_key_data_func(...); return; } ``` #### **Error — Missing error check on `rte_rcu_qsbr_dq_enqueue` (Low-medium confidence ~55%)** In `__hash_rcu_auto_free_old_data()`: ```c if (rte_rcu_qsbr_dq_enqueue(h->dq, &rcu_dq_entry) != 0) HASH_LOG(ERR, "Failed to push QSBR FIFO"); ``` The enqueue failure is logged but the old data pointer is leaked. The overwrite has already happened (the data in the hash table has been updated), so by the time we reach this point, the old data is unreachable. If enqueue fails, the old data will never be freed. This is a resource leak on a (presumably rare) error path. The existing delete path in `__rte_hash_del_key_with_hash` has the same pattern, so this is consistent with existing code, but it's worth noting that a failed enqueue here means a permanent memory leak of the old data. #### **Warning — Behavioral change in the ext-table duplicate-check path** In the extended-table section of `__rte_hash_add_key_with_hash`, the original code jumped to `failure:` when a duplicate was found (which unlocked and returned). The new code jumps to `overwrite:` after explicitly unlocking: ```c ret = search_and_update(h, data, key, prim_bkt, short_sig, &saved_old_data); if (ret != -1) { enqueue_slot_back(h, cached_free_slots, slot_id); __hash_rw_writer_unlock(h); goto overwrite; } ``` This is correct — the unlock happens before the goto, and `overwrite:` doesn't attempt to unlock. Just confirming this is safe. The change from `goto failure` to explicit unlock + `goto overwrite` is necessary and correct. ### Test Coverage The test `test_hash_rcu_qsbr_replace_auto_free` is well-structured: - Creates hash with RW_CONCURRENCY_LF and RCU QSBR in DQ mode - Registers a pseudo reader - Tests overwrite + reclaim, then delete + reclaim - Verifies the callback receives the correct old data pointers - Proper cleanup in the `end:` label The cleanup path correctly handles the case where `rcu.v` might be NULL (skipping thread offline/unregister). `rte_hash_free(NULL)` and `rte_free(NULL)` are both safe. ### Documentation Release notes entry is appropriate and well-written. The API doc updates to `rte_hash.h` clearly document the new behavior. --- ## Patch 2/2: hash: add replace API returning old data on overwrite ### Summary Adds `rte_hash_replace_key_data()` and `rte_hash_replace_key_with_hash_data()` — variants of the add APIs that return the previous data pointer via an output parameter. When the caller provides `old_data`, no automatic RCU deferred free is performed; the caller takes ownership. ### Correctness Analysis #### **Error — `saved_old_data` initialized to NULL but not reset between paths (Medium confidence ~65%)** In `__rte_hash_add_key_with_hash`, `saved_old_data` is initialized to `NULL` at declaration. The function has multiple `search_and_update` calls, each of which may or may not set `saved_old_data`. If a first `search_and_update` call doesn't find the key (returns -1), `saved_old_data` stays NULL. A later call that does find the key will set it. This is correct. However, consider the path where `rte_hash_cuckoo_insert_mw` returns 1 (key found during insert): ```c ret = rte_hash_cuckoo_insert_mw(h, prim_bkt, sec_bkt, key, data, short_sig, slot_id, &ret_val, &saved_old_data); if (ret == 0) return slot_id - 1; else if (ret == 1) { enqueue_slot_back(h, cached_free_slots, slot_id); ret = ret_val; goto overwrite; } ``` On the `ret == 0` path (fresh insert, no overwrite), the function returns directly without going through `overwrite:`. So `old_data` is never set for fresh inserts via this path. But `rte_hash_replace_key_with_hash_data` initializes `*old_data = NULL` before the call, so this is safe — the caller sees NULL for fresh inserts. This is correct. #### **Warning — New experimental APIs missing `__rte_experimental` annotation validation** The new functions `rte_hash_replace_key_data()` and `rte_hash_replace_key_with_hash_data()` are correctly marked `__rte_experimental` in the header and use `RTE_EXPORT_EXPERIMENTAL_SYMBOL` in the `.c` file. This follows DPDK conventions. #### **Info — API design: `old_data` is required (non-NULL) for replace functions** `rte_hash_replace_key_with_hash_data` validates `old_data != NULL`: ```c RETURN_IF_TRUE(((h == NULL) || (key == NULL) || (old_data == NULL)), -EINVAL); ``` This is a reasonable design choice — if you don't want old_data, use the existing `rte_hash_add_key_data()` instead. The semantics are clean: replace always tells you what was there before. ### Test Coverage `test_hash_replace_key_data` covers: - Overwrite path: verifies `old_data` returns the previous value - Lookup after replace: verifies the new data is stored correctly - Fresh insert via replace: verifies `old_data` is set to NULL The test intentionally sets `old_data = (void *)0xdeadbeef` before the fresh-insert replace to verify it gets zeroed. Good practice. **Note**: The test doesn't cover the `rte_hash_replace_key_with_hash_data()` variant (with precomputed hash). Since it's a thin wrapper that delegates to the same internal function, the coverage is adequate, but a dedicated test for the hash variant would be thorough. ### Documentation Release notes and API documentation are comprehensive and clear. The doxygen comments properly document the ownership semantics (caller owns old pointer, no auto-RCU-free). --- ## Cross-Patch Observations ### Overall Architecture Assessment The design is clean and well-thought-out: 1. **Patch 1** adds automatic RCU deferred free as a "zero-effort" improvement for existing users who already have `free_key_data_func` configured. 2. **Patch 2** adds an explicit replace API for users who need precise control over the old data's lifecycle. 3. The two features interact correctly: when `old_data` is provided (replace API), automatic RCU free is suppressed; when `old_data` is NULL (existing add API), automatic RCU free kicks in. The priority of the `overwrite:` label logic in the final version is: ```c overwrite: if (old_data != NULL) // caller wants ownership (replace API) *old_data = saved_old_data; else if (saved_old_data != NULL) // auto-free via RCU (add API) __hash_rcu_auto_free_old_data(h, saved_old_data); return ret; ``` This is the correct priority ordering. ### Key Finding Summary | # | Severity | Patch | Finding | |---|----------|-------|---------| | 1 | Error | 1/2 | `old_data != NULL` sentinel in `__hash_rcu_qsbr_free_resource` is fragile; use `key_idx == 0` instead to discriminate overwrite vs delete entries | | 2 | Error | 1/2 | Failed `rte_rcu_qsbr_dq_enqueue` leaks old data (consistent with existing code pattern but worth noting) | | 3 | Info | 2/2 | No test for `rte_hash_replace_key_with_hash_data()` variant specifically |

