Re: [Intel-gfx] [PATCH v4.5 1/8] cgroup: Allow registration and lookup of cgroup private data (v3)

2018-03-22 Thread Chris Wilson
Quoting Matt Roper (2018-03-21 23:23:37)
> There are cases where other parts of the kernel may wish to store data
> associated with individual cgroups without building a full cgroup
> controller.  Let's add interfaces to allow them to register and lookup
> this private data for individual cgroups.
> 
> A kernel system (e.g., a driver) that wishes to register private data
> for a cgroup should start by obtaining a unique private data key by
> calling cgroup_priv_getkey().  It may then associate private data
> with a cgroup by calling cgroup_priv_install(cgrp, key, ref) where 'ref'
> is a pointer to a kref field inside the private data structure.  The
> private data may later be looked up by calling cgroup_priv_get(cgrp,
> key) to obtain a new reference to the private data.  Private data may be
> unregistered via cgroup_priv_release(cgrp, key).
> 
> If a cgroup is removed, the reference count for all private data objects
> will be decremented.
> 
> v2:  Significant overhaul suggested by Tejun, Alexei, and Roman
>  - Rework interface to make consumers obtain an ida-based key rather
>than supplying their own arbitrary void*
>  - Internal implementation now uses per-cgroup radixtrees which should
>allow much faster lookup than the previous hashtable approach
>  - Private data is registered via kref, allowing a single private data
>structure to potentially be assigned to multiple cgroups.
> 
> v3:
>  - Spare warning fixes (kbuild test robot)
> 
> Cc: Tejun Heo 
> Cc: Alexei Starovoitov 
> Cc: Roman Gushchin 
> Cc: cgro...@vger.kernel.org
> Signed-off-by: Matt Roper 
> 
> fixup! cgroup: Allow registration and lookup of cgroup private data (v2)
> ---
>  include/linux/cgroup-defs.h |  10 +++
>  include/linux/cgroup.h  |   7 ++
>  kernel/cgroup/cgroup.c  | 183 
> +++-
>  3 files changed, 198 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
> index 9f242b876fde..9086d963cc0a 100644
> --- a/include/linux/cgroup-defs.h
> +++ b/include/linux/cgroup-defs.h
> @@ -427,6 +427,16 @@ struct cgroup {
> /* used to store eBPF programs */
> struct cgroup_bpf bpf;
>  
> +   /*
> +* cgroup private data registered by other non-controller parts of the
> +* kernel.  Insertions are protected by privdata_lock, lookups by
> +* rcu_read_lock().
> +*/
> +   struct radix_tree_root privdata;
> +
> +   /* Protect against concurrent insertions/deletions to privdata */
> +   spinlock_t privdata_lock;
> +
> /* ids of the ancestors at each level including self */
> int ancestor_ids[];
>  };
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 473e0c0abb86..63d22dfa00bd 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -833,4 +833,11 @@ static inline void put_cgroup_ns(struct cgroup_namespace 
> *ns)
> free_cgroup_ns(ns);
>  }
>  
> +/* cgroup private data handling */
> +int cgroup_priv_getkey(void (*free)(struct kref *));
> +void cgroup_priv_destroykey(int key);
> +int cgroup_priv_install(struct cgroup *cgrp, int key, struct kref *ref);
> +struct kref *cgroup_priv_get(struct cgroup *cgrp, int key);
> +void cgroup_priv_release(struct cgroup *cgrp, int key);
> +
>  #endif /* _LINUX_CGROUP_H */
> diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
> index 8cda3bc3ae22..3268a21e8158 100644
> --- a/kernel/cgroup/cgroup.c
> +++ b/kernel/cgroup/cgroup.c
> @@ -81,8 +81,15 @@ EXPORT_SYMBOL_GPL(css_set_lock);
>  #endif
>  
>  /*
> - * Protects cgroup_idr and css_idr so that IDs can be released without
> - * grabbing cgroup_mutex.
> + * ID allocator for cgroup private data keys; the ID's allocated here will
> + * be used to index all per-cgroup radix trees.  The radix tree built into
> + * the IDR itself will store a key-specific function to be passed to 
> kref_put.
> + */
> +static DEFINE_IDR(cgroup_priv_idr);

Fun two radixtrees, one to hold the (*free)() and the other the void*.

Would not just a
struct cgroup_privdata {
struct rcu_head rcu;
void (*free)(struct kref *);
}
suffice for subclassing? (Also this is a prime candidate for switching
to XArray.)

> +struct kref *
> +cgroup_priv_get(struct cgroup *cgrp, int key)
> +{
> +   struct kref *ref;
> +
> +   WARN_ON(cgrp->root != _dfl_root);
> +   WARN_ON(key == 0);
> +
> +   rcu_read_lock();
> +
> +   ref = radix_tree_lookup(>privdata, key);
> +   if (ref)
> +   kref_get(ref);

This is not safe, you need

if (ref && !kref_get_unless_zero(ref))
ref = NULL;

otherwise the cgroup_priv_release() may drop the last reference prior to
you obtaining yours. Also requires the privdata to be RCU protected.
-Chris
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org

[Intel-gfx] [PATCH v4.5 1/8] cgroup: Allow registration and lookup of cgroup private data (v3)

2018-03-21 Thread Matt Roper
There are cases where other parts of the kernel may wish to store data
associated with individual cgroups without building a full cgroup
controller.  Let's add interfaces to allow them to register and lookup
this private data for individual cgroups.

A kernel system (e.g., a driver) that wishes to register private data
for a cgroup should start by obtaining a unique private data key by
calling cgroup_priv_getkey().  It may then associate private data
with a cgroup by calling cgroup_priv_install(cgrp, key, ref) where 'ref'
is a pointer to a kref field inside the private data structure.  The
private data may later be looked up by calling cgroup_priv_get(cgrp,
key) to obtain a new reference to the private data.  Private data may be
unregistered via cgroup_priv_release(cgrp, key).

If a cgroup is removed, the reference count for all private data objects
will be decremented.

v2:  Significant overhaul suggested by Tejun, Alexei, and Roman
 - Rework interface to make consumers obtain an ida-based key rather
   than supplying their own arbitrary void*
 - Internal implementation now uses per-cgroup radixtrees which should
   allow much faster lookup than the previous hashtable approach
 - Private data is registered via kref, allowing a single private data
   structure to potentially be assigned to multiple cgroups.

v3:
 - Spare warning fixes (kbuild test robot)

Cc: Tejun Heo 
Cc: Alexei Starovoitov 
Cc: Roman Gushchin 
Cc: cgro...@vger.kernel.org
Signed-off-by: Matt Roper 

fixup! cgroup: Allow registration and lookup of cgroup private data (v2)
---
 include/linux/cgroup-defs.h |  10 +++
 include/linux/cgroup.h  |   7 ++
 kernel/cgroup/cgroup.c  | 183 +++-
 3 files changed, 198 insertions(+), 2 deletions(-)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 9f242b876fde..9086d963cc0a 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -427,6 +427,16 @@ struct cgroup {
/* used to store eBPF programs */
struct cgroup_bpf bpf;
 
+   /*
+* cgroup private data registered by other non-controller parts of the
+* kernel.  Insertions are protected by privdata_lock, lookups by
+* rcu_read_lock().
+*/
+   struct radix_tree_root privdata;
+
+   /* Protect against concurrent insertions/deletions to privdata */
+   spinlock_t privdata_lock;
+
/* ids of the ancestors at each level including self */
int ancestor_ids[];
 };
diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 473e0c0abb86..63d22dfa00bd 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -833,4 +833,11 @@ static inline void put_cgroup_ns(struct cgroup_namespace 
*ns)
free_cgroup_ns(ns);
 }
 
+/* cgroup private data handling */
+int cgroup_priv_getkey(void (*free)(struct kref *));
+void cgroup_priv_destroykey(int key);
+int cgroup_priv_install(struct cgroup *cgrp, int key, struct kref *ref);
+struct kref *cgroup_priv_get(struct cgroup *cgrp, int key);
+void cgroup_priv_release(struct cgroup *cgrp, int key);
+
 #endif /* _LINUX_CGROUP_H */
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 8cda3bc3ae22..3268a21e8158 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -81,8 +81,15 @@ EXPORT_SYMBOL_GPL(css_set_lock);
 #endif
 
 /*
- * Protects cgroup_idr and css_idr so that IDs can be released without
- * grabbing cgroup_mutex.
+ * ID allocator for cgroup private data keys; the ID's allocated here will
+ * be used to index all per-cgroup radix trees.  The radix tree built into
+ * the IDR itself will store a key-specific function to be passed to kref_put.
+ */
+static DEFINE_IDR(cgroup_priv_idr);
+
+/*
+ * Protects cgroup_idr, css_idr, and cgroup_priv_idr so that IDs can be
+ * released without grabbing cgroup_mutex.
  */
 static DEFINE_SPINLOCK(cgroup_idr_lock);
 
@@ -1839,6 +1846,8 @@ static void init_cgroup_housekeeping(struct cgroup *cgrp)
INIT_LIST_HEAD(>cset_links);
INIT_LIST_HEAD(>pidlists);
mutex_init(>pidlist_mutex);
+   INIT_RADIX_TREE(>privdata, GFP_NOWAIT);
+   spin_lock_init(>privdata_lock);
cgrp->self.cgroup = cgrp;
cgrp->self.flags |= CSS_ONLINE;
cgrp->dom_cgrp = cgrp;
@@ -4578,6 +4587,8 @@ static void css_release_work_fn(struct work_struct *work)
container_of(work, struct cgroup_subsys_state, destroy_work);
struct cgroup_subsys *ss = css->ss;
struct cgroup *cgrp = css->cgroup;
+   struct radix_tree_iter iter;
+   void __rcu **slot;
 
mutex_lock(_mutex);
 
@@ -4617,6 +4628,12 @@ static void css_release_work_fn(struct work_struct *work)
 NULL);
 
cgroup_bpf_put(cgrp);
+
+   /* Drop reference on any private data */
+   rcu_read_lock();
+