Get rid of preempt_disable() and preempt_enable() when the
preload is done for split purpose. The reason is that calling
spin_lock() with preemption disabled is forbidden on a
CONFIG_PREEMPT_RT kernel, where spinlocks are converted to
sleeping locks.
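
For reference, below is the pattern that becomes invalid (condensed
from the lines this patch removes). On PREEMPT_RT, spin_lock() may
sleep, so it must not be called between preempt_disable() and
preempt_enable():

	preempt_disable();
	if (!__this_cpu_read(ne_fit_preload_node)) {
		preempt_enable();
		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
		preempt_disable();

		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
			if (pva)
				kmem_cache_free(vmap_area_cachep, pva);
		}
	}

	spin_lock(&vmap_area_lock);	/* a sleeping lock on PREEMPT_RT */
	preempt_enable();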

Therefore, with this change we no longer guarantee that a CPU is
preloaded; instead, we minimize the cases when it is not.
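
The reworked path keeps preemption enabled and resolves the race
under the spinlock instead (condensed from the lines this patch
adds; the task may migrate between this_cpu_read() and spin_lock(),
which is why a preload hit cannot be guaranteed):

	pva = NULL;

	if (!this_cpu_read(ne_fit_preload_node))
		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);

	spin_lock(&vmap_area_lock);

	/* Install the spare object, or drop it if the slot is occupied. */
	if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
		kmem_cache_free(vmap_area_cachep, pva);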

For example, I ran a special test case that follows the preload
pattern and path. 20 "unbind" threads run it, and each does
1000000 allocations. On average, a CPU was not preloaded only 3.5
times per 1000000 allocations. So it can happen, but the number is
negligible.

V1 -> V2:
  - move __this_cpu_cmpxchg check when spin_lock is taken,
    as proposed by Andrew Morton
  - add more explanation with regard to preloading
  - adjust and move some comments

Fixes: 82dd23e84be3 ("mm/vmalloc.c: preload a CPU with one object for split purpose")
Reviewed-by: Steven Rostedt (VMware) <rost...@goodmis.org>
Signed-off-by: Uladzislau Rezki (Sony) <ure...@gmail.com>
---
 mm/vmalloc.c | 50 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 33 insertions(+), 17 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e92ff5f7dd8b..f48cd0711478 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -969,6 +969,19 @@ adjust_va_to_fit_type(struct vmap_area *va,
                        * There are a few exceptions though, as an example it is
                         * a first allocation (early boot up) when we have "one"
                         * big free space that has to be split.
+                        *
+                        * Also we can hit this path in case of regular "vmap"
+                        * allocations, if "this" current CPU was not preloaded.
+                        * See the comment in alloc_vmap_area() for why. If so,
+                        * GFP_NOWAIT is used instead to get an extra object for
+                        * split purpose. That is rare and most of the time it
+                        * does not occur.
+                        *
+                        * What happens if an allocation fails? Basically, an
+                        * "overflow" path is triggered to purge lazily freed
+                        * areas in order to free some memory, then the "retry"
+                        * path is triggered to repeat one more time. See more
+                        * details in the alloc_vmap_area() function.
                         */
                        lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
                        if (!lva)
@@ -1078,31 +1091,34 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 
 retry:
        /*
-        * Preload this CPU with one extra vmap_area object to ensure
-        * that we have it available when fit type of free area is
-        * NE_FIT_TYPE.
+        * Preload this CPU with one extra vmap_area object. It is used
+        * when fit type of free area is NE_FIT_TYPE. Please note, it
+        * does not guarantee that an allocation occurs on a CPU that
+        * is preloaded; instead we minimize the cases when it is not.
+        * A miss can happen due to migration, since there is a race
+        * window until the spinlock below is taken.
         *
         * The preload is done in non-atomic context, thus it allows us
         * to use more permissive allocation masks to be more stable under
-        * low memory condition and high memory pressure.
+        * low memory condition and high memory pressure. In the rare
+        * case when this CPU is not preloaded, GFP_NOWAIT is used.
         *
-        * Even if it fails we do not really care about that. Just proceed
-        * as it is. "overflow" path will refill the cache we allocate from.
+        * Set "pva" to NULL here, because of the "retry" path.
         */
-       preempt_disable();
-       if (!__this_cpu_read(ne_fit_preload_node)) {
-               preempt_enable();
-               pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
-               preempt_disable();
+       pva = NULL;
 
-               if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
-                       if (pva)
-                               kmem_cache_free(vmap_area_cachep, pva);
-               }
-       }
+       if (!this_cpu_read(ne_fit_preload_node))
+               /*
+                * Even if it fails we do not really care about that.
+                * Just proceed as it is. If needed "overflow" path
+                * will refill the cache we allocate from.
+                */
+               pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
 
        spin_lock(&vmap_area_lock);
-       preempt_enable();
+
+       if (pva && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva))
+               kmem_cache_free(vmap_area_cachep, pva);
 
        /*
         * If an allocation fails, the "vend" address is
-- 
2.20.1
