Re: [Cluster-devel] [PATCH dlm/next 1/4] fs: dlm: replace sanity checks with WARN_ON

2022-02-16 Thread Andreas Gruenbacher
There's also an unnecessary INIT_LIST_HEAD() in send_op().
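For reference, list_add_tail() unconditionally rewrites the entry's link
pointers, so initializing op->list immediately before adding it has no
effect. This is essentially the include/linux/list.h definition (comment
added for illustration):

	static inline void list_add_tail(struct list_head *new, struct list_head *head)
	{
		/* links "new" in between head->prev and head, overwriting
		 * whatever new->next and new->prev contained before */
		__list_add(new, head->prev, head);
	}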

Andreas

---
 fs/dlm/plock.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index ce1af7986e16..ff439d780cb1 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -67,7 +67,6 @@ static void dlm_release_plock_op(struct plock_op *op)
 static void send_op(struct plock_op *op)
 {
set_version(&op->info);
-   INIT_LIST_HEAD(&op->list);
spin_lock(&ops_lock);
list_add_tail(&op->list, &send_list);
spin_unlock(&ops_lock);
-- 
2.33.1



Re: [Cluster-devel] [PATCH dlm/next 1/4] fs: dlm: replace sanity checks with WARN_ON

2022-02-16 Thread Andreas Gruenbacher
On Wed, Feb 16, 2022 at 5:16 PM Alexander Aring  wrote:
>
> Hi,
>
> On Wed, Feb 16, 2022 at 11:08 AM Andreas Gruenbacher
>  wrote:
> >
> > On Wed, Feb 16, 2022 at 4:53 PM Alexander Aring  wrote:
> > > There are several sanity checks and recover handling if they occur in
> > > the dlm plock handling. They should never occur otherwise we have a bug
> > > in the code. To make such bugs more visible we remove the recover
> > > handling and add a WARN_ON() on those sanity checks.
> > >
> > > Signed-off-by: Alexander Aring 
> > > ---
> > >  fs/dlm/plock.c | 32 
> > >  1 file changed, 4 insertions(+), 28 deletions(-)
> > >
> > > diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> > > index a10d2bcfe75a..55fba2f0234f 100644
> > > --- a/fs/dlm/plock.c
> > > +++ b/fs/dlm/plock.c
> > > @@ -157,13 +157,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 
> > > number, struct file *file,
> > > goto out;
> > > }
> > >
> > > -   spin_lock(&ops_lock);
> > > -   if (!list_empty(&op->list)) {
> > > -   log_error(ls, "dlm_posix_lock: op on list %llx",
> > > - (unsigned long long)number);
> > > -   list_del(&op->list);
> > > -   }
> > > -   spin_unlock(&ops_lock);
> > > +   WARN_ON(!list_empty(&op->list));
> >
> > Why don't those checks need the ops_lock spin lock anymore?
> > Why does it make sense to get rid of the list_del calls?
>
> My understanding is that those list_del() calls try to recover
> something if a sanity check hits. The list_emptry() check should
> always be true at this point no matter if lock is held or not.
> Therefore no lock is required here to do some sanity checking.

I don't immediately see what, other than the spin lock, would
guarantee a consistent memory view. In other words, without taking the
spin lock, 'list_empty(&op->list)' might still be true on one CPU even
though another CPU has already added 'op' to a list. So please, when
changing the locking somewhere, explain why the change is correct
beyond just stating that the locking isn't needed.
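
For instance, keeping the check inside the critical section would preserve
that guarantee while still dropping the recovery path; a minimal sketch
(not part of the posted patch, names as in fs/dlm/plock.c):

	spin_lock(&ops_lock);
	/* op should have been removed from send_list/recv_list by now;
	 * holding ops_lock makes that removal visible to this CPU */
	WARN_ON(!list_empty(&op->list));
	spin_unlock(&ops_lock);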

Thanks,
Andreas

> If those checks hit there is a bug and trying to recover from it makes
> no sense from my point of view.



[Cluster-devel] [REPORT] kernel BUG at fs/ext4/inode.c:2620 - page_buffers()

2022-02-16 Thread Lee Jones
Good afternoon,

After recently receiving a bug report from Syzbot [0] which was raised
specifically against the Android v5.10 kernel, I spent some time
trying to get to the crux.  Firstly I reproduced the issue on the
reported kernel, then did the same using the latest release kernel
v5.16.

The full kernel trace can be seen below at [1].

I managed to seemingly bisect the issue down to commit:

  60263d5889e6d ("iomap: fall back to buffered writes for invalidation failures")

However, it appears to be the belief of the filesystem community that
this commit is likely not the true cause of the issue and should
therefore not be reverted.

It took quite some time, but I managed to strip the reported kernel
config (which was very large) down to x86_defconfig plus only a few
additional config symbols.  Most of them are platform (qemu in this
case) options; the other is KASAN, which is required to successfully
reproduce this:

  CONFIG_HYPERVISOR_GUEST=y  
  CONFIG_PARAVIRT=y  
  CONFIG_PARAVIRT_DEBUG=y
  CONFIG_PARAVIRT_SPINLOCKS=y
  CONFIG_KASAN=y

The (most likely non-optimised) qemu command currently being used is:

qemu-system-x86_64 -smp 8 -m 16G -enable-kvm -cpu max,migratable=off -no-reboot \
    -kernel ${BUILDDIR}/arch/x86/boot/bzImage -nographic \
    -hda ${IMAGEDIR}/wheezy-auto-repro.img \
    -chardev stdio,id=char0,mux=on,logfile=serial.out,signal=off \
    -serial chardev:char0 -mon chardev=char0 \
    -append "root=/dev/sda rw console=ttyS0"

Darrick seems to suggest that:

  "The BUG report came from page_buffers failing to find any buffer heads
   attached to the page."

If the reproducer, also massively stripped down from the original
report, would be of any use to you, it can be found further down at
[2].

I don't know how true this is, but it is my current belief that user-space
should not be able to force the kernel to BUG.  This seems to be a
temporary DoS issue.  So although it is not a critically serious security
problem involving memory leakage or data corruption, it could
potentially cause a nuisance if not rectified.

Any well meaning help with this would be gratefully received.

Kind regards,
Lee

[0] https://syzkaller.appspot.com/bug?extid=41c966bf0729a530bd8d

[1]
[   15.200920] Page cache invalidation failure on direct I/O.  Possible data corruption due to collision with buffered I/O!
[   15.215877] File: /syzkaller.IsS3Yc/0/bus PID: 1497 Comm: repro
[   16.718970] Page cache invalidation failure on direct I/O.  Possible data corruption due to collision with buffered I/O!
[   16.734250] File: /syzkaller.IsS3Yc/5/bus PID: 1512 Comm: repro
[   17.013871] Page cache invalidation failure on direct I/O.  Possible data corruption due to collision with buffered I/O!
[   17.028193] File: /syzkaller.IsS3Yc/6/bus PID: 1515 Comm: repro
[   17.320498] Page cache invalidation failure on direct I/O.  Possible data corruption due to collision with buffered I/O!
[   17.336115] File: /syzkaller.IsS3Yc/7/bus PID: 1518 Comm: repro
[   17.617921] Page cache invalidation failure on direct I/O.  Possible data corruption due to collision with buffered I/O!
[   17.633063] File: /syzkaller.IsS3Yc/8/bus PID: 1521 Comm: repro
[   18.527260] Page cache invalidation failure on direct I/O.  Possible data corruption due to collision with buffered I/O!
[   18.544236] File: /syzkaller.IsS3Yc/11/bus PID: 1530 Comm: repro
[   18.810347] Page cache invalidation failure on direct I/O.  Possible data corruption due to collision with buffered I/O!
[   18.824721] File: /syzkaller.IsS3Yc/12/bus PID: 1533 Comm: repro
[   19.099315] Page cache invalidation failure on direct I/O.  Possible data corruption due to collision with buffered I/O!
[   19.114151] File: /syzkaller.IsS3Yc/13/bus PID: 1536 Comm: repro
[   19.403882] Page cache invalidation failure on direct I/O.  Possible data corruption due to collision with buffered I/O!
[   19.418467] File: /syzkaller.IsS3Yc/14/bus PID: 1539 Comm: repro
[   19.703934] Page cache invalidation failure on direct I/O.  Possible data corruption due to collision with buffered I/O!
[   19.718400] File: /syzkaller.IsS3Yc/15/bus PID: 1542 Comm: repro
[   26.533129] [ cut here ]
[   26.540473] WARNING: CPU: 1 PID: 1612 at fs/ext4/inode.c:3576 ext4_set_page_dirty+0xaf/0xc0
[   26.553171] Modules linked in:
[   26.557354] CPU: 1 PID: 1612 Comm: repro Not tainted 5.16.0+ #169
[   26.565238] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[   26.576182] RIP: 0010:ext4_set_page_dirty+0xaf/0xc0
[   26.583077] Code: 4c 89 ff e8 e3 86 e7 ff 49 f7 07 00 20 00 00 74 19 4c 89 ff 5b 41 5e 41 5f e9 8d 05 f0 ff 48 83 c0 ff 48 89 c3 e9 76 ff ff ff <0f> 0b eb e3 48 83 c0 ff 48 89 c3 eb 9e 0f 0b eb b8 55 48 89 e5 41
[   26.607402] RSP: 0018:88810f4ffa10 EFLAGS: 0001024

Re: [Cluster-devel] [PATCH dlm/next 1/4] fs: dlm: replace sanity checks with WARN_ON

2022-02-16 Thread Alexander Aring
Hi,

On Wed, Feb 16, 2022 at 11:25 AM Alexander Aring  wrote:
>
> Hi,
>
> On Wed, Feb 16, 2022 at 11:16 AM Alexander Aring  wrote:
> >
> > Hi,
> >
> > On Wed, Feb 16, 2022 at 11:08 AM Andreas Gruenbacher
> >  wrote:
> > >
> > > On Wed, Feb 16, 2022 at 4:53 PM Alexander Aring  
> > > wrote:
> > > > There are several sanity checks and recover handling if they occur in
> > > > the dlm plock handling. They should never occur otherwise we have a bug
> > > > in the code. To make such bugs more visible we remove the recover
> > > > handling and add a WARN_ON() on those sanity checks.
> > > >
> > > > Signed-off-by: Alexander Aring 
> > > > ---
> > > >  fs/dlm/plock.c | 32 
> > > >  1 file changed, 4 insertions(+), 28 deletions(-)
> > > >
> > > > diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> > > > index a10d2bcfe75a..55fba2f0234f 100644
> > > > --- a/fs/dlm/plock.c
> > > > +++ b/fs/dlm/plock.c
> > > > @@ -157,13 +157,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 
> > > > number, struct file *file,
> > > > goto out;
> > > > }
> > > >
> > > > -   spin_lock(&ops_lock);
> > > > -   if (!list_empty(&op->list)) {
> > > > -   log_error(ls, "dlm_posix_lock: op on list %llx",
> > > > - (unsigned long long)number);
> > > > -   list_del(&op->list);
> > > > -   }
> > > > -   spin_unlock(&ops_lock);
> > > > +   WARN_ON(!list_empty(&op->list));
> > >
> > > Why don't those checks need the ops_lock spin lock anymore?
> > > Why does it make sense to get rid of the list_del calls?
> >
> > My understanding is that those list_del() calls try to recover
> > something if a sanity check hits. The list_emptry() check should
> > always be true at this point no matter if lock is held or not.
>
> s/true/false, but !list_empty() is true :)

forget that, originally it was correct

- Alex



Re: [Cluster-devel] [PATCH dlm/next 1/4] fs: dlm: replace sanity checks with WARN_ON

2022-02-16 Thread Alexander Aring
Hi,

On Wed, Feb 16, 2022 at 11:16 AM Alexander Aring  wrote:
>
> Hi,
>
> On Wed, Feb 16, 2022 at 11:08 AM Andreas Gruenbacher
>  wrote:
> >
> > On Wed, Feb 16, 2022 at 4:53 PM Alexander Aring  wrote:
> > > There are several sanity checks and recover handling if they occur in
> > > the dlm plock handling. They should never occur otherwise we have a bug
> > > in the code. To make such bugs more visible we remove the recover
> > > handling and add a WARN_ON() on those sanity checks.
> > >
> > > Signed-off-by: Alexander Aring 
> > > ---
> > >  fs/dlm/plock.c | 32 
> > >  1 file changed, 4 insertions(+), 28 deletions(-)
> > >
> > > diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> > > index a10d2bcfe75a..55fba2f0234f 100644
> > > --- a/fs/dlm/plock.c
> > > +++ b/fs/dlm/plock.c
> > > @@ -157,13 +157,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 
> > > number, struct file *file,
> > > goto out;
> > > }
> > >
> > > -   spin_lock(&ops_lock);
> > > -   if (!list_empty(&op->list)) {
> > > -   log_error(ls, "dlm_posix_lock: op on list %llx",
> > > - (unsigned long long)number);
> > > -   list_del(&op->list);
> > > -   }
> > > -   spin_unlock(&ops_lock);
> > > +   WARN_ON(!list_empty(&op->list));
> >
> > Why don't those checks need the ops_lock spin lock anymore?
> > Why does it make sense to get rid of the list_del calls?
>
> My understanding is that those list_del() calls try to recover
> something if a sanity check hits. The list_emptry() check should
> always be true at this point no matter if lock is held or not.

s/true/false, but !list_empty() is true :)

 - Alex



Re: [Cluster-devel] [PATCH dlm/next 1/4] fs: dlm: replace sanity checks with WARN_ON

2022-02-16 Thread Alexander Aring
Hi,

On Wed, Feb 16, 2022 at 11:08 AM Andreas Gruenbacher
 wrote:
>
> On Wed, Feb 16, 2022 at 4:53 PM Alexander Aring  wrote:
> > There are several sanity checks and recover handling if they occur in
> > the dlm plock handling. They should never occur otherwise we have a bug
> > in the code. To make such bugs more visible we remove the recover
> > handling and add a WARN_ON() on those sanity checks.
> >
> > Signed-off-by: Alexander Aring 
> > ---
> >  fs/dlm/plock.c | 32 
> >  1 file changed, 4 insertions(+), 28 deletions(-)
> >
> > diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> > index a10d2bcfe75a..55fba2f0234f 100644
> > --- a/fs/dlm/plock.c
> > +++ b/fs/dlm/plock.c
> > @@ -157,13 +157,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 
> > number, struct file *file,
> > goto out;
> > }
> >
> > -   spin_lock(&ops_lock);
> > -   if (!list_empty(&op->list)) {
> > -   log_error(ls, "dlm_posix_lock: op on list %llx",
> > - (unsigned long long)number);
> > -   list_del(&op->list);
> > -   }
> > -   spin_unlock(&ops_lock);
> > +   WARN_ON(!list_empty(&op->list));
>
> Why don't those checks need the ops_lock spin lock anymore?
> Why does it make sense to get rid of the list_del calls?

My understanding is that those list_del() calls try to recover
something if a sanity check hits. The list_empty() check should
always be true at this point, no matter whether the lock is held or not.
Therefore no lock is required here to do some sanity checking.
If those checks hit, there is a bug, and trying to recover from it makes
no sense from my point of view.

- Alex



Re: [Cluster-devel] [PATCH dlm/next 1/4] fs: dlm: replace sanity checks with WARN_ON

2022-02-16 Thread Andreas Gruenbacher
On Wed, Feb 16, 2022 at 4:53 PM Alexander Aring  wrote:
> There are several sanity checks and recover handling if they occur in
> the dlm plock handling. They should never occur otherwise we have a bug
> in the code. To make such bugs more visible we remove the recover
> handling and add a WARN_ON() on those sanity checks.
>
> Signed-off-by: Alexander Aring 
> ---
>  fs/dlm/plock.c | 32 
>  1 file changed, 4 insertions(+), 28 deletions(-)
>
> diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
> index a10d2bcfe75a..55fba2f0234f 100644
> --- a/fs/dlm/plock.c
> +++ b/fs/dlm/plock.c
> @@ -157,13 +157,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 
> number, struct file *file,
> goto out;
> }
>
> -   spin_lock(&ops_lock);
> -   if (!list_empty(&op->list)) {
> -   log_error(ls, "dlm_posix_lock: op on list %llx",
> - (unsigned long long)number);
> -   list_del(&op->list);
> -   }
> -   spin_unlock(&ops_lock);
> +   WARN_ON(!list_empty(&op->list));

Why don't those checks need the ops_lock spin lock anymore?
Why does it make sense to get rid of the list_del calls?

> rv = op->info.rv;
>
> @@ -190,13 +184,7 @@ static int dlm_plock_callback(struct plock_op *op)
> struct plock_xop *xop = (struct plock_xop *)op;
> int rv = 0;
>
> -   spin_lock(&ops_lock);
> -   if (!list_empty(&op->list)) {
> -   log_print("dlm_plock_callback: op on list %llx",
> - (unsigned long long)op->info.number);
> -   list_del(&op->list);
> -   }
> -   spin_unlock(&ops_lock);
> +   WARN_ON(!list_empty(&op->list));
>
> /* check if the following 2 are still valid or make a copy */
> file = xop->file;
> @@ -289,13 +277,7 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 
> number, struct file *file,
> send_op(op);
> wait_event(recv_wq, (op->done != 0));
>
> -   spin_lock(&ops_lock);
> -   if (!list_empty(&op->list)) {
> -   log_error(ls, "dlm_posix_unlock: op on list %llx",
> - (unsigned long long)number);
> -   list_del(&op->list);
> -   }
> -   spin_unlock(&ops_lock);
> +   WARN_ON(!list_empty(&op->list));
>
> rv = op->info.rv;
>
> @@ -343,13 +325,7 @@ int dlm_posix_get(dlm_lockspace_t *lockspace, u64 
> number, struct file *file,
> send_op(op);
> wait_event(recv_wq, (op->done != 0));
>
> -   spin_lock(&ops_lock);
> -   if (!list_empty(&op->list)) {
> -   log_error(ls, "dlm_posix_get: op on list %llx",
> - (unsigned long long)number);
> -   list_del(&op->list);
> -   }
> -   spin_unlock(&ops_lock);
> +   WARN_ON(!list_empty(&op->list));
>
> /* info.rv from userspace is 1 for conflict, 0 for no-conflict,
>-ENOENT if there are no locks on the file */
> --
> 2.31.1
>

Thanks,
Andreas



[Cluster-devel] [PATCH dlm/next 4/4] fs: dlm: improve plock logging if interrupted

2022-02-16 Thread Alexander Aring
This patch changes the log level from debug to info for the case where a
plock is removed because the wait was interrupted. Additionally, the
message now states that the plock entity was removed, to let the user
know what is happening.

If a pending plock cannot be found on a dev_write(), the message now
signals that it might have been removed because of a wait interruption.

Before this patch there might be a "dev_write no op ..." info message,
and users could only guess that the plock had been removed earlier
because the wait was interrupted. To make that connection clear, both
messages are logged at the same log level.

Both messages are logged at the info level because this should not
happen often, and when it does happen it should be clear why the op was
not found.

Signed-off-by: Alexander Aring 
---
 fs/dlm/plock.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index 757d9013788a..ce1af7986e16 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -161,11 +161,12 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
 
rv = wait_event_interruptible(recv_wq, (op->done != 0));
if (rv == -ERESTARTSYS) {
-   log_debug(ls, "%s: wait killed %llx", __func__,
- (unsigned long long)number);
spin_lock(&ops_lock);
list_del(&op->list);
spin_unlock(&ops_lock);
+   log_print("%s: wait interrupted %x %llx, op removed",
+ __func__, ls->ls_global_id,
+ (unsigned long long)number);
dlm_release_plock_op(op);
do_unlock_close(ls, number, file, fl);
goto out;
@@ -443,8 +444,8 @@ static ssize_t dev_write(struct file *file, const char __user *u, size_t count,
else
wake_up(&recv_wq);
} else
-   log_print("dev_write no op %x %llx", info.fsid,
- (unsigned long long)info.number);
+   log_print("%s: no op %x %llx - may got interrupted?", __func__,
+ info.fsid, (unsigned long long)info.number);
return count;
 }
 
-- 
2.31.1



[Cluster-devel] [PATCH dlm/next 2/4] fs: dlm: cleanup plock_op vs plock_xop

2022-02-16 Thread Alexander Aring
Lately, the casting back and forth between plock_op and plock_xop and
the list handling involved showed some issues which were hard to see.
This patch removes the "plock_xop" structure and introduces a
"struct plock_async_data". This structure is set in "struct plock_op"
in the case of asynchronous lock handling, which is what the original
"plock_xop" was made for. There is no longer any need to cast pointers
around to reach the additional fields in the asynchronous case. As a
disadvantage, another allocation is introduced, but it is only needed in
the asynchronous case, which is currently only used in combination with
nfs lockd.
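
To make the difference concrete, here is a rough sketch of the access
pattern before and after this change (condensed, not a verbatim excerpt
from the diff below):

	/* before: only a cast distinguishes an async op, and the cast is
	 * valid only if the op really was allocated as a plock_xop */
	struct plock_xop *xop = (struct plock_xop *)op;
	if (op->callback)
		file = xop->file;

	/* after: the async state sits behind an explicit, separately
	 * allocated pointer; a NULL check replaces the implicit cast */
	if (op->data)
		file = op->data->file;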

Signed-off-by: Alexander Aring 
---
 fs/dlm/plock.c | 77 ++
 1 file changed, 46 insertions(+), 31 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index 55fba2f0234f..4e60dd657cb6 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -19,20 +19,20 @@ static struct list_head recv_list;
 static wait_queue_head_t send_wq;
 static wait_queue_head_t recv_wq;
 
-struct plock_op {
-   struct list_head list;
-   int done;
-   struct dlm_plock_info info;
-   int (*callback)(struct file_lock *fl, int result);
-};
-
-struct plock_xop {
-   struct plock_op xop;
+struct plock_async_data {
void *fl;
void *file;
struct file_lock flc;
+   int (*callback)(struct file_lock *fl, int result);
 };
 
+struct plock_op {
+   struct list_head list;
+   int done;
+   struct dlm_plock_info info;
+   /* if set indicates async handling */
+   struct plock_async_data *data;
+};
 
 static inline void set_version(struct dlm_plock_info *info)
 {
@@ -58,6 +58,12 @@ static int check_version(struct dlm_plock_info *info)
return 0;
 }
 
+static void dlm_release_plock_op(struct plock_op *op)
+{
+   kfree(op->data);
+   kfree(op);
+}
+
 static void send_op(struct plock_op *op)
 {
set_version(&op->info);
@@ -101,22 +107,21 @@ static void do_unlock_close(struct dlm_ls *ls, u64 number,
 int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
   int cmd, struct file_lock *fl)
 {
+   struct plock_async_data *op_data;
struct dlm_ls *ls;
struct plock_op *op;
-   struct plock_xop *xop;
int rv;
 
ls = dlm_find_lockspace_local(lockspace);
if (!ls)
return -EINVAL;
 
-   xop = kzalloc(sizeof(*xop), GFP_NOFS);
-   if (!xop) {
+   op = kzalloc(sizeof(*op), GFP_NOFS);
+   if (!op) {
rv = -ENOMEM;
goto out;
}
 
-   op = &xop->xop;
op->info.optype = DLM_PLOCK_OP_LOCK;
op->info.pid= fl->fl_pid;
op->info.ex = (fl->fl_type == F_WRLCK);
@@ -125,22 +130,32 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
op->info.number = number;
op->info.start  = fl->fl_start;
op->info.end= fl->fl_end;
+   /* async handling */
if (fl->fl_lmops && fl->fl_lmops->lm_grant) {
+   op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
+   if (!op_data) {
+   dlm_release_plock_op(op);
+   rv = -ENOMEM;
+   goto out;
+   }
+
/* fl_owner is lockd which doesn't distinguish
   processes on the nfs client */
op->info.owner  = (__u64) fl->fl_pid;
-   op->callback= fl->fl_lmops->lm_grant;
-   locks_init_lock(&xop->flc);
-   locks_copy_lock(&xop->flc, fl);
-   xop->fl = fl;
-   xop->file   = file;
+   op_data->callback = fl->fl_lmops->lm_grant;
+   locks_init_lock(&op_data->flc);
+   locks_copy_lock(&op_data->flc, fl);
+   op_data->fl = fl;
+   op_data->file   = file;
+
+   op->data = op_data;
} else {
op->info.owner  = (__u64)(long) fl->fl_owner;
}
 
send_op(op);
 
-   if (!op->callback) {
+   if (!op->data) {
rv = wait_event_interruptible(recv_wq, (op->done != 0));
if (rv == -ERESTARTSYS) {
log_debug(ls, "dlm_posix_lock: wait killed %llx",
@@ -148,7 +163,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
spin_lock(&ops_lock);
list_del(&op->list);
spin_unlock(&ops_lock);
-   kfree(xop);
+   dlm_release_plock_op(op);
do_unlock_close(ls, number, file, fl);
goto out;
}
@@ -167,7 +182,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
  (unsigned long long)number);
}
 

[Cluster-devel] [PATCH dlm/next 1/4] fs: dlm: replace sanity checks with WARN_ON

2022-02-16 Thread Alexander Aring
The dlm plock handling contains several sanity checks with recovery
handling for conditions that should never occur; if they do, there is a
bug in the code. To make such bugs more visible, remove the recovery
handling and add a WARN_ON() to those sanity checks.

Signed-off-by: Alexander Aring 
---
 fs/dlm/plock.c | 32 
 1 file changed, 4 insertions(+), 28 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index a10d2bcfe75a..55fba2f0234f 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -157,13 +157,7 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
goto out;
}
 
-   spin_lock(&ops_lock);
-   if (!list_empty(&op->list)) {
-   log_error(ls, "dlm_posix_lock: op on list %llx",
- (unsigned long long)number);
-   list_del(&op->list);
-   }
-   spin_unlock(&ops_lock);
+   WARN_ON(!list_empty(&op->list));
 
rv = op->info.rv;
 
@@ -190,13 +184,7 @@ static int dlm_plock_callback(struct plock_op *op)
struct plock_xop *xop = (struct plock_xop *)op;
int rv = 0;
 
-   spin_lock(&ops_lock);
-   if (!list_empty(&op->list)) {
-   log_print("dlm_plock_callback: op on list %llx",
- (unsigned long long)op->info.number);
-   list_del(&op->list);
-   }
-   spin_unlock(&ops_lock);
+   WARN_ON(!list_empty(&op->list));
 
/* check if the following 2 are still valid or make a copy */
file = xop->file;
@@ -289,13 +277,7 @@ int dlm_posix_unlock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
send_op(op);
wait_event(recv_wq, (op->done != 0));
 
-   spin_lock(&ops_lock);
-   if (!list_empty(&op->list)) {
-   log_error(ls, "dlm_posix_unlock: op on list %llx",
- (unsigned long long)number);
-   list_del(&op->list);
-   }
-   spin_unlock(&ops_lock);
+   WARN_ON(!list_empty(&op->list));
 
rv = op->info.rv;
 
@@ -343,13 +325,7 @@ int dlm_posix_get(dlm_lockspace_t *lockspace, u64 number, struct file *file,
send_op(op);
wait_event(recv_wq, (op->done != 0));
 
-   spin_lock(&ops_lock);
-   if (!list_empty(&op->list)) {
-   log_error(ls, "dlm_posix_get: op on list %llx",
- (unsigned long long)number);
-   list_del(&op->list);
-   }
-   spin_unlock(&ops_lock);
+   WARN_ON(!list_empty(&op->list));
 
/* info.rv from userspace is 1 for conflict, 0 for no-conflict,
   -ENOENT if there are no locks on the file */
-- 
2.31.1



[Cluster-devel] [PATCH dlm/next 3/4] fs: dlm: rearrange async condition return

2022-02-16 Thread Alexander Aring
This patch moves the return of FILE_LOCK_DEFERRED to the point where the
asynchronous request is set up, instead of checking again afterwards
whether the request was asynchronous.

Signed-off-by: Alexander Aring 
---
 fs/dlm/plock.c | 27 +--
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
index 4e60dd657cb6..757d9013788a 100644
--- a/fs/dlm/plock.c
+++ b/fs/dlm/plock.c
@@ -149,26 +149,25 @@ int dlm_posix_lock(dlm_lockspace_t *lockspace, u64 number, struct file *file,
op_data->file   = file;
 
op->data = op_data;
+
+   send_op(op);
+   rv = FILE_LOCK_DEFERRED;
+   goto out;
} else {
op->info.owner  = (__u64)(long) fl->fl_owner;
}
 
send_op(op);
 
-   if (!op->data) {
-   rv = wait_event_interruptible(recv_wq, (op->done != 0));
-   if (rv == -ERESTARTSYS) {
-   log_debug(ls, "dlm_posix_lock: wait killed %llx",
- (unsigned long long)number);
-   spin_lock(&ops_lock);
-   list_del(&op->list);
-   spin_unlock(&ops_lock);
-   dlm_release_plock_op(op);
-   do_unlock_close(ls, number, file, fl);
-   goto out;
-   }
-   } else {
-   rv = FILE_LOCK_DEFERRED;
+   rv = wait_event_interruptible(recv_wq, (op->done != 0));
+   if (rv == -ERESTARTSYS) {
+   log_debug(ls, "%s: wait killed %llx", __func__,
+ (unsigned long long)number);
+   spin_lock(&ops_lock);
+   list_del(&op->list);
+   spin_unlock(&ops_lock);
+   dlm_release_plock_op(op);
+   do_unlock_close(ls, number, file, fl);
goto out;
}
 
-- 
2.31.1



[Cluster-devel] [PATCH dlm/next 0/4] fs: dlm: cleanup plock code

2022-02-16 Thread Alexander Aring
Hi,

this patch series cleans up some of the plock code handling and removes
the plock_op vs plock_xop casting which has caused trouble in the past.
It also adds a logging improvement to make it visible when a plock op is
not found because the wait was interrupted earlier.

This follow-up cleanup patch series should be applied on top of the
previous stable patch "[PATCH dlm/next] fs: dlm: fix plock invalid read".

Thanks.

- Alex

Alexander Aring (4):
  fs: dlm: replace sanity checks with WARN_ON
  fs: dlm: cleanup plock_op vs plock_xop
  fs: dlm: rearrange async condition return
  fs: dlm: improve plock logging if interrupted

 fs/dlm/plock.c | 137 +++--
 1 file changed, 64 insertions(+), 73 deletions(-)

-- 
2.31.1



[Cluster-devel] [PATCH dlm/next] fs: dlm: fix plock invalid read

2022-02-16 Thread Alexander Aring
This patch fixes an invalid read reported by KASAN. An unlock allocates a
"struct plock_op" and a following send_op() appends it to the global
send_list data structure. In some cases a subsequent dev_read() moves it
to recv_list, and dev_write() then casts it to "struct plock_xop" and
accesses fields which are only available in that structure. At this point
an invalid read happens by accessing those fields.

To fix this issue, the "callback" field is moved to "struct plock_op" to
indicate whether a cast to "plock_xop" is allowed, and the additional
"plock_xop" handling is only done if it is set.

Example of the KASAN output which showed the invalid read:

[ 2064.296453] 
==
[ 2064.304852] BUG: KASAN: slab-out-of-bounds in dev_write+0x52b/0x5a0 [dlm]
[ 2064.306491] Read of size 8 at addr 88800ef227d8 by task dlm_controld/7484
[ 2064.308168]
[ 2064.308575] CPU: 0 PID: 7484 Comm: dlm_controld Kdump: loaded Not tainted 5.14.0+ #9
[ 2064.310292] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 2064.311618] Call Trace:
[ 2064.312218]  dump_stack_lvl+0x56/0x7b
[ 2064.313150]  print_address_description.constprop.8+0x21/0x150
[ 2064.314578]  ? dev_write+0x52b/0x5a0 [dlm]
[ 2064.315610]  ? dev_write+0x52b/0x5a0 [dlm]
[ 2064.316595]  kasan_report.cold.14+0x7f/0x11b
[ 2064.317674]  ? dev_write+0x52b/0x5a0 [dlm]
[ 2064.318687]  dev_write+0x52b/0x5a0 [dlm]
[ 2064.319629]  ? dev_read+0x4a0/0x4a0 [dlm]
[ 2064.320713]  ? bpf_lsm_kernfs_init_security+0x10/0x10
[ 2064.321926]  vfs_write+0x17e/0x930
[ 2064.322769]  ? __fget_light+0x1aa/0x220
[ 2064.323753]  ksys_write+0xf1/0x1c0
[ 2064.324548]  ? __ia32_sys_read+0xb0/0xb0
[ 2064.325464]  do_syscall_64+0x3a/0x80
[ 2064.326387]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 2064.327606] RIP: 0033:0x7f807e4ba96f
[ 2064.328470] Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 39 87 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 31 44 89 c7 48 89 44 24 08 e8 7c 87 f8 ff 48
[ 2064.332902] RSP: 002b:7ffd50cfe6e0 EFLAGS: 0293 ORIG_RAX: 0001
[ 2064.334658] RAX: ffda RBX: 55cc3886eb30 RCX: 7f807e4ba96f
[ 2064.336275] RDX: 0040 RSI: 7ffd50cfe7e0 RDI: 0010
[ 2064.337980] RBP: 7ffd50cfe7e0 R08:  R09: 0001
[ 2064.339560] R10: 55cc3886eb30 R11: 0293 R12: 55cc3886eb80
[ 2064.341237] R13: 55cc3886eb00 R14: 55cc3886f590 R15: 0001
[ 2064.342857]
[ 2064.343226] Allocated by task 12438:
[ 2064.344057]  kasan_save_stack+0x1c/0x40
[ 2064.345079]  __kasan_kmalloc+0x84/0xa0
[ 2064.345933]  kmem_cache_alloc_trace+0x13b/0x220
[ 2064.346953]  dlm_posix_unlock+0xec/0x720 [dlm]
[ 2064.348811]  do_lock_file_wait.part.32+0xca/0x1d0
[ 2064.351070]  fcntl_setlk+0x281/0xbc0
[ 2064.352879]  do_fcntl+0x5e4/0xfe0
[ 2064.354657]  __x64_sys_fcntl+0x11f/0x170
[ 2064.356550]  do_syscall_64+0x3a/0x80
[ 2064.358259]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 2064.360745]
[ 2064.361511] Last potentially related work creation:
[ 2064.363957]  kasan_save_stack+0x1c/0x40
[ 2064.365811]  __kasan_record_aux_stack+0xaf/0xc0
[ 2064.368100]  call_rcu+0x11b/0xf70
[ 2064.369785]  dlm_process_incoming_buffer+0x47d/0xfd0 [dlm]
[ 2064.372404]  receive_from_sock+0x290/0x770 [dlm]
[ 2064.374607]  process_recv_sockets+0x32/0x40 [dlm]
[ 2064.377290]  process_one_work+0x9a8/0x16e0
[ 2064.379357]  worker_thread+0x87/0xbf0
[ 2064.381188]  kthread+0x3ac/0x490
[ 2064.383460]  ret_from_fork+0x22/0x30
[ 2064.385588]
[ 2064.386518] Second to last potentially related work creation:
[ 2064.389219]  kasan_save_stack+0x1c/0x40
[ 2064.391043]  __kasan_record_aux_stack+0xaf/0xc0
[ 2064.393303]  call_rcu+0x11b/0xf70
[ 2064.394885]  dlm_process_incoming_buffer+0x47d/0xfd0 [dlm]
[ 2064.397694]  receive_from_sock+0x290/0x770 [dlm]
[ 2064.399932]  process_recv_sockets+0x32/0x40 [dlm]
[ 2064.402180]  process_one_work+0x9a8/0x16e0
[ 2064.404388]  worker_thread+0x87/0xbf0
[ 2064.406124]  kthread+0x3ac/0x490
[ 2064.408021]  ret_from_fork+0x22/0x30
[ 2064.409834]
[ 2064.410599] The buggy address belongs to the object at 88800ef22780
[ 2064.410599]  which belongs to the cache kmalloc-96 of size 96
[ 2064.416495] The buggy address is located 88 bytes inside of
[ 2064.416495]  96-byte region [88800ef22780, 88800ef227e0)
[ 2064.422045] The buggy address belongs to the page:
[ 2064.424635] page:b6bef8bc refcount:1 mapcount:0 mapping: index:0x0 pfn:0xef22
[ 2064.428970] flags: 0xfc200(slab|node=0|zone=1|lastcpupid=0x1f)
[ 2064.432515] raw: 000fc200 ead68b80 00140014 888001041780
[ 2064.436110] raw:  80200020 0001
[ 2064.439813] page dumped because: kasan: bad access detected
[ 2064.442548]
[ 2064.443310] Memory state around the buggy address:
[ 2064.445988]  88800ef22680: 00 00 00 00 00 00 00