Dear Neil,

I was running it on arm64, may that be the reason?

Regards,
Richard

On May 27, 2024 4:02:32 AM GMT+02:00, NeilBrown <ne...@suse.de> wrote:
>On Sun, 26 May 2024, Richard Kojedzinszky wrote:
>> Dear Neil,
>> 
>> According to my quick tests, your patch seems to fix this bug. Could you 
>> also manage to try my attached code, could you also reproduce the bug?
>
>Thanks for testing.
>
>I can run your test code but it isn't triggering the bug (90 minutes so
>far).  Possibly a different compiler used for the kernel, possibly
>hardware differences (I'm running under qemu).  Bugs related to barriers
>(which this one seems to be) need just the right circumstances to
>trigger so they can be hard to reproduce on a different system.
>
>I've made some cosmetic improvements to the patch and will post it to
>the NFS maintainers.
>
>Thanks again,
>NeilBrown
>
>
>> 
>> Thanks,
>> Richard
>> 
>> 2024-05-24 07:29 időpontban Richard Kojedzinszky ezt írta:
>> > Dear Neil,
>> > 
>> > I've applied your patch, and since then there are no lockups. Before 
>> > that my application reported a lockup in a minute or two, now it has 
>> > been running for half an hour, and still running.
>> > 
>> > Thanks,
>> > Richard
>> > 
>> > 2024-05-24 01:31 időpontban NeilBrown ezt írta:
>> >> On Fri, 24 May 2024, Richard Kojedzinszky wrote:
>> >>> Dear devs,
>> >>> 
>> >>> I am attaching a stripped down version of the little program which
>> >>> triggers the bug very quickly, in a few minutes in my test lab. It
>> >>> turned out that a single NFS mountpoint is enough. Just start the
>> >>> program giving it the NFS mount as first argument. It will chdir 
>> >>> there,
>> >>> and do file operations, which will trigger a lockup in a few minutes.
>> >> 
>> >> I couldn't get the go code to run.  But then it is a long time since I
>> >> played with go and I didn't try very hard.
>> >> If you could provide simple instructions and a list of package
>> >> dependencies that I need to install (on Debian), I can give it a try.
>> >> 
>> >> Or you could try this patch.  It might help, but I don't have high
>> >> hopes.  It adds some memory barriers and fixes a bug which would cause 
>> >> a
>> >> problem if memory allocation failed (but memory allocation never 
>> >> fails).
>> >> 
>> >> NeilBrown
>> >> 
>> >> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>> >> index ac505671efbd..5bcc0d14d519 100644
>> >> --- a/fs/nfs/dir.c
>> >> +++ b/fs/nfs/dir.c
>> >> @@ -1804,7 +1804,7 @@ __nfs_lookup_revalidate(struct dentry *dentry, 
>> >> unsigned int flags,
>> >>   } else {
>> >>           /* Wait for unlink to complete */
>> >>           wait_var_event(&dentry->d_fsdata,
>> >> -                        dentry->d_fsdata != NFS_FSDATA_BLOCKED);
>> >> +                        smp_load_acquire(&dentry->d_fsdata) != 
>> >> NFS_FSDATA_BLOCKED);
>> >>           parent = dget_parent(dentry);
>> >>           ret = reval(d_inode(parent), dentry, flags);
>> >>           dput(parent);
>> >> @@ -2508,7 +2508,7 @@ int nfs_unlink(struct inode *dir, struct dentry 
>> >> *dentry)
>> >>   spin_unlock(&dentry->d_lock);
>> >>   error = nfs_safe_remove(dentry);
>> >>   nfs_dentry_remove_handle_error(dir, dentry, error);
>> >> - dentry->d_fsdata = NULL;
>> >> + smp_store_release(&dentry->d_fsdata, NULL);
>> >>   wake_up_var(&dentry->d_fsdata);
>> >>  out:
>> >>   trace_nfs_unlink_exit(dir, dentry, error);
>> >> @@ -2616,7 +2616,7 @@ nfs_unblock_rename(struct rpc_task *task, struct 
>> >> nfs_renamedata *data)
>> >>  {
>> >>   struct dentry *new_dentry = data->new_dentry;
>> >> 
>> >> - new_dentry->d_fsdata = NULL;
>> >> + smp_store_release(&new_dentry->d_fsdata, NULL);
>> >>   wake_up_var(&new_dentry->d_fsdata);
>> >>  }
>> >> 
>> >> @@ -2717,6 +2717,10 @@ int nfs_rename(struct mnt_idmap *idmap, struct 
>> >> inode *old_dir,
>> >>   task = nfs_async_rename(old_dir, new_dir, old_dentry, new_dentry,
>> >>                           must_unblock ? nfs_unblock_rename : NULL);
>> >>   if (IS_ERR(task)) {
>> >> +         if (must_unlock) {
>> >> +                 smp_store_release(&new_dentry->d_fsdata, NULL);
>> >> +                 wake_up_var(&new_dentry->d_fsdata);
>> >> +         }
>> >>           error = PTR_ERR(task);
>> >>           goto out;
>> >>   }
>> 
>

Reply via email to