On Tue, Apr 7, 2026, at 9:21 AM, Jeff Layton wrote:
> Add the data structures, allocation helpers, and callback operations
> needed for directory delegation CB_NOTIFY support:
>
> - struct nfsd_notify_event: carries fsnotify events for CB_NOTIFY
> - struct nfsd4_cb_notify: per-delegation state for notification handling
> - Union dl_cb_fattr with dl_cb_notify in nfs4_delegation since a
>   delegation is either a regular file delegation or a directory
>   delegation, never both
>
> Refactor alloc_init_deleg() into a common __alloc_init_deleg() base
> with a pluggable sc_free callback, and add alloc_init_dir_deleg() which
> allocates the page array and notify4 buffer needed for CB_NOTIFY
> encoding.
>
> Add skeleton nfsd4_cb_notify_ops with done/release handlers that will
> be filled in when the notification path is wired up.
>
> Signed-off-by: Jeff Layton <[email protected]>

> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 4afe7e68fb51..b2b8c454fc0f 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c

> @@ -3381,6 +3440,30 @@ nfsd4_cb_getattr_release(struct nfsd4_callback *cb)
>       nfs4_put_stid(&dp->dl_stid);
>  }
> 
> +static int
> +nfsd4_cb_notify_done(struct nfsd4_callback *cb,
> +                             struct rpc_task *task)
> +{
> +     switch (task->tk_status) {
> +     case -NFS4ERR_DELAY:
> +             rpc_delay(task, 2 * HZ);
> +             return 0;
> +     default:
> +             return 1;
> +     }
> +}
> +
> +static void
> +nfsd4_cb_notify_release(struct nfsd4_callback *cb)
> +{
> +     struct nfsd4_cb_notify *ncn =
> +                     container_of(cb, struct nfsd4_cb_notify, ncn_cb);
> +     struct nfs4_delegation *dp =
> +                     container_of(ncn, struct nfs4_delegation, dl_cb_notify);
> +
> +     nfs4_put_stid(&dp->dl_stid);
> +}
> +
>  static const struct nfsd4_callback_ops nfsd4_cb_recall_any_ops = {
>       .done           = nfsd4_cb_recall_any_done,
>       .release        = nfsd4_cb_recall_any_release,

When a client responds with NFS4ERR_DELAY, the RPC layer retries after
2 seconds (rpc_delay(task, 2 * HZ)). On retry, prepare() is called again,
but ncn_evt_cnt is already 0 because the first prepare() drained it.
prepare() then returns false, which destroys the callback.

Events arriving during the retry window are dropped because
nfsd4_run_cb_notify() returns early when NFSD4_CALLBACK_RUNNING is set.
After the callback is destroyed, future events can queue a new CB_NOTIFY,
but the window's events are lost.

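To make the sequence concrete, here is a minimal userspace sketch of it.
This is not kernel code: struct cb_model, model_event(), model_prepare()
and model_delay_retry() are hypothetical names, and the model assumes
that an event arriving while the callback is running is simply discarded,
as described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the CB_NOTIFY retry path. */
struct cb_model {
	bool running;	/* stands in for NFSD4_CALLBACK_RUNNING */
	int evt_cnt;	/* stands in for ncn_evt_cnt */
	int dropped;	/* events discarded while the callback was running */
};

/* Models nfsd4_run_cb_notify(): return early if already running. */
static void model_event(struct cb_model *m)
{
	if (m->running) {
		m->dropped++;	/* assumption: the event is not queued */
		return;
	}
	m->evt_cnt++;
	m->running = true;
}

/* Models prepare(): drain queued events, return false if there are none. */
static bool model_prepare(struct cb_model *m)
{
	if (m->evt_cnt == 0)
		return false;
	m->evt_cnt = 0;
	return true;
}

/* Returns how many events are lost across one NFS4ERR_DELAY retry. */
static int model_delay_retry(int events_in_window)
{
	struct cb_model m = {0};
	bool ok;

	model_event(&m);		/* first event queues the callback */
	ok = model_prepare(&m);		/* first prepare drains the queue */
	assert(ok);
	(void)ok;

	/* Client replies NFS4ERR_DELAY; callback stays running for 2s. */
	for (int i = 0; i < events_in_window; i++)
		model_event(&m);

	/* Retry: nothing queued, so the callback is torn down. */
	if (!model_prepare(&m))
		m.running = false;

	return m.dropped;
}
```

In this model, model_delay_retry(3) returns 3: every event that arrives
during the retry window is lost.
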
The result is that the client misses notifications. Does this impact
behavioral correctness or spec compliance? Is there a way for that
client to detect the loss and recover?


-- 
Chuck Lever
