Re: [PATCH -next] ipc: use GFP_ATOMIC under spin lock

2020-04-29 Thread Matthew Wilcox
On Wed, Apr 29, 2020 at 07:22:13AM +0200, Manfred Spraul wrote:
> Hello everyone,
> 
> On 4/28/20 1:14 PM, Matthew Wilcox wrote:
> > On Tue, Apr 28, 2020 at 03:47:36AM +, Wei Yongjun wrote:
> > > The function ipc_id_alloc() is called from ipc_addid(), in which
> > > a spin lock is held, so we should use GFP_ATOMIC instead.
> > > 
> > > Fixes: de5738d1c364 ("ipc: convert ipcs_idr to XArray")
> > > Signed-off-by: Wei Yongjun 
> > I see why you think that, but it's not true.  Yes, we hold a spinlock, but
> > the spinlock is in an object which is not reachable from any other CPU.
> 
> Is it really allowed for spin_lock()/spin_unlock() to happen on different
> CPUs?
> 
> CPU1: spin_lock()
> 
> CPU1: schedule() -> sleeps
> 
> CPU2: -> schedule() returns
> 
> CPU2: spin_unlock().

I think that is allowed, but I'm not an expert in the implementations.

> > Converting to GFP_ATOMIC is completely wrong.
> 
> What is your solution proposal?
> 
> xa_store() also takes a gfp_t flag, so even splitting _alloc() and _store()
> won't help:
> 
>     xa_alloc(,entry=NULL,)
>     new->seq = ...
>     spin_lock();
>     xa_store(,entry=new,GFP_KERNEL);

While it takes GFP flags, it won't do any allocation if it's overwriting
an allocated entry.
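
To make that concrete, here is a minimal, hypothetical sketch of the principle (made-up names, and xa_reserve() stands in for the cyclic allocation used in the patch below): once the slot exists, the later store only overwrites it, never calls the allocator, and is therefore safe inside a spinlocked region.

#include <linux/xarray.h>

static DEFINE_XARRAY(example_xa);       /* hypothetical array, not ids->ipcs */

static int example_store(void *object)
{
        int err;

        /* May allocate internal nodes, so a sleeping GFP_KERNEL is fine here. */
        err = xa_reserve(&example_xa, 0, GFP_KERNEL);
        if (err)
                return err;

        /*
         * The slot is already allocated, so this store never calls the
         * allocator and the GFP argument is effectively unused.
         */
        xa_store(&example_xa, 0, object, GFP_NOWAIT);
        return 0;
}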

diff --git a/ipc/util.c b/ipc/util.c
index 0f6b0e0aa17e..b929ab0072a5 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -19,8 +19,8 @@
  *
  * General sysv ipc locking scheme:
  * rcu_read_lock()
- *  obtain the ipc object (kern_ipc_perm) by looking up the id in an idr
- * tree.
+ *  obtain the ipc object (kern_ipc_perm) by looking up the id in an
+ * xarray.
  * - perform initial checks (capabilities, auditing and permission,
  *   etc).
  * - perform read-only operations, such as INFO command, that
@@ -209,14 +209,14 @@ static inline int ipc_id_alloc(struct ipc_ids *ids, struct kern_ipc_perm *new)
u32 idx;
int err;
 
+   xa_lock(&ids->ipcs);
+
if (get_restore_id(ids) < 0) {
int max_idx;
 
max_idx = max(ids->in_use*3/2, ipc_min_cycle);
max_idx = min(max_idx, ipc_mni) - 1;
 
-   xa_lock(&ids->ipcs);
-
err = __xa_alloc_cyclic(&ids->ipcs, &idx, NULL,
XA_LIMIT(0, max_idx), &ids->next_idx,
GFP_KERNEL);
@@ -224,24 +224,31 @@ static inline int ipc_id_alloc(struct ipc_ids *ids, struct kern_ipc_perm *new)
ids->seq++;
if (ids->seq >= ipcid_seq_max())
ids->seq = 0;
+   err = 0;
}
 
-   if (err >= 0) {
+   if (!err) {
new->seq = ids->seq;
new->id = (new->seq << ipcmni_seq_shift()) + idx;
-   /* xa_store contains a write barrier */
-   __xa_store(&ids->ipcs, idx, new, GFP_KERNEL);
}
-
-   xa_unlock(&ids->ipcs);
} else {
new->id = get_restore_id(ids);
new->seq = ipcid_to_seqx(new->id);
idx = ipcid_to_idx(new->id);
-   err = xa_insert(&ids->ipcs, idx, new, GFP_KERNEL);
+   err = __xa_insert(&ids->ipcs, idx, NULL, GFP_KERNEL);
set_restore_id(ids, -1);
}
 
+   /*
+* Hold the spinlock so that nobody else can access the object
+* once they can find it.  xa_store contains a write barrier so
+* all previous stores to 'new' will be visible.
+*/
+   spin_lock(&new->lock);
+   if (!err)
+   __xa_store(&ids->ipcs, idx, new, GFP_NOWAIT);
+   xa_unlock(&ids->ipcs);
+
if (err == -EBUSY)
return -ENOSPC;
if (err < 0)
@@ -255,7 +262,7 @@ static inline int ipc_id_alloc(struct ipc_ids *ids, struct kern_ipc_perm *new)
  * @new: new ipc permission set
  * @limit: limit for the number of used ids
  *
- * Add an entry 'new' to the ipc ids idr. The permissions object is
+ * Add an entry 'new' to the ipc ids. The permissions object is
  * initialised and the first free entry is set up and the index assigned
  * is returned. The 'new' entry is returned in a locked state on success.
  *
@@ -270,7 +277,7 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)
kgid_t egid;
int idx;
 
-   /* 1) Initialize the refcount so that ipc_rcu_putref works */
+   /* Initialize the refcount so that ipc_rcu_putref works */
refcount_set(&new->refcount, 1);
 
if (limit > ipc_mni)
@@ -279,12 +286,7 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)
if (ids->in_use >= limit)
return -ENOSPC;
 
-   /*
-* 2) Hold the spinlock so that nobody else can access the object
-* once they can find it
-*/
spin_lock_init(&new->lock);
-   spin_lock(&new->lock);

Re: [PATCH -next] ipc: use GFP_ATOMIC under spin lock

2020-04-28 Thread Manfred Spraul

Hello everyone,

On 4/28/20 1:14 PM, Matthew Wilcox wrote:

On Tue, Apr 28, 2020 at 03:47:36AM +, Wei Yongjun wrote:

The function ipc_id_alloc() is called from ipc_addid(), in which
a spin lock is held, so we should use GFP_ATOMIC instead.

Fixes: de5738d1c364 ("ipc: convert ipcs_idr to XArray")
Signed-off-by: Wei Yongjun 

I see why you think that, but it's not true.  Yes, we hold a spinlock, but
the spinlock is in an object which is not reachable from any other CPU.


Is it really allowed for spin_lock()/spin_unlock() to happen on
different CPUs?


CPU1: spin_lock()

CPU1: schedule() -> sleeps

CPU2: -> schedule() returns

CPU2: spin_unlock().



Converting to GFP_ATOMIC is completely wrong.


What is your solution proposal?

xa_store() also takes a gfp_t flag, so even splitting _alloc() and
_store() won't help:


    xa_alloc(,entry=NULL,)
    new->seq = ...
    spin_lock();
    xa_store(,entry=new,GFP_KERNEL);

--

    Manfred




Re: [PATCH -next] ipc: use GFP_ATOMIC under spin lock

2020-04-28 Thread Matthew Wilcox
On Tue, Apr 28, 2020 at 05:14:20PM -0700, Andrew Morton wrote:
> On Tue, 28 Apr 2020 04:14:03 -0700 Matthew Wilcox  wrote:
> 
> > On Tue, Apr 28, 2020 at 03:47:36AM +, Wei Yongjun wrote:
> > > The function ipc_id_alloc() is called from ipc_addid(), in which
> > > a spin lock is held, so we should use GFP_ATOMIC instead.
> > > 
> > > Fixes: de5738d1c364 ("ipc: convert ipcs_idr to XArray")
> > > Signed-off-by: Wei Yongjun 
> > 
> > I see why you think that, but it's not true.  Yes, we hold a spinlock, but
> > the spinlock is in an object which is not reachable from any other CPU.
> > So it's not possible to deadlock.
> 
> um, then why are we taking it?

The lock has to be held by the time 'new' is findable because 'new' is
not completely constructed at that point.  The finder will try to acquire
the spinlock before looking at the uninitialised fields, so it's safe.
But it's not a common idiom we use at all.
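
As a rough illustration of that idiom (hypothetical names, and plain RCU pointer publication instead of the ipc id lookup): the publisher holds the object's spinlock across the moment it becomes visible, so a finder that locks before reading the fields can never observe them uninitialised.

#include <linux/rcupdate.h>
#include <linux/spinlock.h>

struct obj {
        spinlock_t lock;
        int state;                      /* only valid once fully initialised */
};

static struct obj __rcu *published;     /* hypothetical lookup point */

static void obj_publish(struct obj *new)
{
        spin_lock_init(&new->lock);
        spin_lock(&new->lock);                  /* lock it before anyone can find it */
        rcu_assign_pointer(published, new);     /* now visible to other CPUs */
        new->state = 1;                         /* finders block on ->lock until we finish */
        spin_unlock(&new->lock);
}

static int obj_find(void)
{
        struct obj *obj;
        int state = -1;

        rcu_read_lock();
        obj = rcu_dereference(published);
        if (obj) {
                spin_lock(&obj->lock);          /* waits for obj_publish() to complete */
                state = obj->state;
                spin_unlock(&obj->lock);
        }
        rcu_read_unlock();
        return state;
}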

> >  This probably confuses all kinds
> > of automated checkers,
> 
> A big fat code comment would reduce the email traffic?

I think I can rewrite this to take the spinlock after doing the allocation.


Re: [PATCH -next] ipc: use GFP_ATOMIC under spin lock

2020-04-28 Thread Andrew Morton
On Tue, 28 Apr 2020 04:14:03 -0700 Matthew Wilcox  wrote:

> On Tue, Apr 28, 2020 at 03:47:36AM +, Wei Yongjun wrote:
> > The function ipc_id_alloc() is called from ipc_addid(), in which
> > a spin lock is held, so we should use GFP_ATOMIC instead.
> > 
> > Fixes: de5738d1c364 ("ipc: convert ipcs_idr to XArray")
> > Signed-off-by: Wei Yongjun 
> 
> I see why you think that, but it's not true.  Yes, we hold a spinlock, but
> the spinlock is in an object which is not reachable from any other CPU.
> So it's not possible to deadlock.

um, then why are we taking it?

>  This probably confuses all kinds
> of automated checkers,

A big fat code comment would reduce the email traffic?

> and I should probably rewrite the code to not
> acquire the new spinlock until we're already holding the xa_lock.
> 



Re: [PATCH -next] ipc: use GFP_ATOMIC under spin lock

2020-04-28 Thread Matthew Wilcox
On Tue, Apr 28, 2020 at 03:47:36AM +, Wei Yongjun wrote:
> The function ipc_id_alloc() is called from ipc_addid(), in which
> a spin lock is held, so we should use GFP_ATOMIC instead.
> 
> Fixes: de5738d1c364 ("ipc: convert ipcs_idr to XArray")
> Signed-off-by: Wei Yongjun 

I see why you think that, but it's not true.  Yes, we hold a spinlock, but
the spinlock is in an object which is not reachable from any other CPU.
So it's not possible to deadlock.  This probably confuses all kinds
of automated checkers, and I should probably rewrite the code to not
acquire the new spinlock until we're already holding the xa_lock.

Converting to GFP_ATOMIC is completely wrong.
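
For what it's worth, a hedged sketch of that restructuring (hypothetical names, not the eventual patch): take xa_lock first, let __xa_alloc() do any sleeping allocation while the object is still unreachable -- it may drop and reacquire xa_lock internally -- and only then take the new object's spinlock and store the pointer, which needs no further allocation.

#include <linux/types.h>
#include <linux/spinlock.h>
#include <linux/xarray.h>

static DEFINE_XARRAY_ALLOC(objs);       /* hypothetical, stands in for ids->ipcs */

struct obj {
        spinlock_t lock;
        u32 id;
};

/* Returns 0 with new->lock held on success, mirroring ipc_addid()'s contract. */
static int obj_insert(struct obj *new)
{
        u32 idx;
        int err;

        spin_lock_init(&new->lock);

        xa_lock(&objs);
        /*
         * Reserve a free index with a NULL entry.  __xa_alloc() may drop and
         * reacquire xa_lock to allocate nodes with GFP_KERNEL; that is fine
         * because 'new' is not findable by anyone yet.
         */
        err = __xa_alloc(&objs, &idx, NULL, xa_limit_31b, GFP_KERNEL);
        if (!err) {
                new->id = idx;
                /*
                 * Lock 'new' before it becomes findable, then publish it.
                 * The slot already exists, so no allocation happens here.
                 */
                spin_lock(&new->lock);
                __xa_store(&objs, idx, new, GFP_NOWAIT);
        }
        xa_unlock(&objs);

        return err;
}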