> >
> > I don't think that replacing direct function calls with indirect function
> > calls is a great suggestion with the current state of play around branch
> > prediction.
> >
> > I'd suggest:
> >
> > rcu_lock_acquire(&rcu_callback_map);
> >
> >
> > Not an emergency, but did you look into replacing this "if" statement
> > with an array of pointers to functions implementing the legs of the
> > "if" statement? If nothing else, this would greatly reduced indentation.
>
> I don't think that replacing direct function calls with indirect
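For illustration, a sketch of the alternative being discussed; the names are
invented for the sketch and this is not the actual kfree_rcu() code. An array
of function pointers removes the deep "if" nesting, but turns direct calls
into indirect ones, which is what the branch-prediction concern above is
about (retpolines make indirect calls expensive):

typedef void (*leg_fn_t)(void *obj);

static void free_bulk_leg(void *obj) { /* bulk-array path */ }
static void free_head_leg(void *obj) { /* rcu_head path */ }

static leg_fn_t legs[] = { free_bulk_leg, free_head_leg };

static void dispatch(unsigned int i, void *obj)
{
	legs[i](obj);	/* indirect call replacing one "if" leg */
}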
> > + // Handle the first two channels.
> > + for (i = 0; i < FREE_N_CHANNELS; i++) {
> > + for (; bkvhead[i]; bkvhead[i] = bnext) {
> > + bnext = bkvhead[i]->next;
> > + debug_rcu_bhead_unqueue(bkvhead[i]);
> > +
> > +
Hello, Ingo.
> > Greetings,
> >
> > FYI, we noticed the following commit (built with gcc-9):
> >
> > commit: 0acd9a0ded80c986ccc9588ba2703436769ead74 ("Revert "mm/vmalloc:
> > modify struct vmap_area to reduce its size"")
> > https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git WIP.fixes
>
> On Thu, Jun 04, 2020 at 03:42:55PM +0200, Uladzislau Rezki wrote:
> > On Thu, Jun 04, 2020 at 12:23:20PM +0200, Peter Enderborg wrote:
> > > The count and scan can be separated in time. There is a fair chance
> > > that all the work is already done when the scan starts
> @@ kfree_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> break;
> }
>
> - return freed;
> + return freed == 0 ? SHRINK_STOP : freed;
> }
>
The loop will be stopped anyway sooner or later, but sooner is better :)
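For context, a sketch of the scan callback around the change above (only the
return statement comes from the diff; the rest is assumed scaffolding):

static unsigned long
kfree_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
{
	unsigned long freed = 0;

	/* ... drain the per-CPU caches, adding to "freed" ... */

	/*
	 * Returning SHRINK_STOP instead of 0 tells the shrinker core
	 * not to call this callback again during the current reclaim
	 * pass, so the scan loop is stopped sooner.
	 */
	return freed == 0 ? SHRINK_STOP : freed;
}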
from bottom to upper levels using a regular list
instead, because the nodes are also linked among each other.
It is faster and avoids recursion.
Signed-off-by: Uladzislau Rezki (Sony)
---
mm/vmalloc.c | 42 --
1 file changed, 8 insertions(+), 34 deletions
unctions does not make
sense and is redundant.
Reuse "built in" functionality to the macros. So the
code size gets reduced.
Signed-off-by: Uladzislau Rezki (Sony)
---
mm/vmalloc.c | 25 ++---
1 file changed, 6 insertions(+), 19 deletions(-)
diff --git a/mm/vmalloc
in fact.
Therefore do it only once, when the VA points to the final merged
area, after all manipulations: merging/removing/inserting.
Signed-off-by: Uladzislau Rezki (Sony)
---
mm/vmalloc.c | 25 ++---
1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/mm/vmalloc.c b/mm
From: "Joel Fernandes (Google)"
kfree_rcu()'s debug_objects logic uses the address of the object's
embedded rcu_head to queue/unqueue. Instead of this, make use of the
object's address itself as preparation for future headless kfree_rcu()
support.
Reviewed-by: Uladzislau Rezki
remove the '->initialized' check.
Cc: Sebastian Andrzej Siewior
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Uladzislau Rezki (Sony)
---
kernel/rcu/tree.c | 13 ++---
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index c1f
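A sketch of the static per-CPU initializer such a change implies (the
krc/kfree_rcu_cpu names are taken from the rest of the series; whether the
lock is raw here depends on the related patches):

static DEFINE_PER_CPU(struct kfree_rcu_cpu, krc) = {
	/* Statically initialized, usable before any init function runs. */
	.lock = __RAW_SPIN_LOCK_UNLOCKED(krc.lock),
};

With the lock initialized at build time, a runtime '->initialized' check
before taking it becomes unnecessary.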
if there are
any channels in the pending state after a detach attempt.
Fixes: 34c881745549e ("rcu: Support kfree_bulk() interface in kfree_rcu()")
Acked-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
---
kernel/rcu/tree.c | 9 ++---
1 file changed, 6 insertions(+), 3
Signed-off-by: Uladzislau Rezki (Sony)
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Joel Fernandes (Google)
---
mm/list_lru.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 4d5294c39bba..42c95bcb53ca 100644
--- a/mm/list_lru.c
pages per CPU.
Signed-off-by: Uladzislau Rezki (Sony)
---
.../admin-guide/kernel-parameters.txt | 8 +++
kernel/rcu/tree.c | 66 +--
2 files changed, 70 insertions(+), 4 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt
b
for the right
kind of pointer.
It also prepares us for future headless support for vmalloc
and SLAB objects. Such objects cannot be queued on a linked
list and are instead placed directly into an array.
Signed-off-by: Uladzislau Rezki (Sony)
Signed-off-by: Joel Fernandes (Google)
Reviewed-by: Joel
a synchronize_rcu() call. Note that for tiny-RCU, any
call to synchronize_rcu() is actually a quiescent state, therefore
it does nothing.
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
Signed-off-by: Joel Fernandes (Google)
Co-developed-by: Joel Fernandes (Google)
---
include
We can simplify the KFREE_BULK_MAX_ENTR macro and get rid of the
magic numbers which were used to make the structure exactly
one page in size.
Suggested-by: Boqun Feng
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
Signed-off-by: Joel Fernandes (Google)
---
kernel/rcu
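A sketch of what the simplified macro can look like, assuming a page-sized
bulk structure with a small header (the field names are assumptions):

struct kfree_rcu_bulk_data {
	unsigned long nr_records;
	struct kfree_rcu_bulk_data *next;
	void *records[];	/* flexible array fills the rest of the page */
};

/*
 * No magic numbers: the capacity is derived from PAGE_SIZE and the
 * header size, which keeps the structure exactly one page.
 */
#define KFREE_BULK_MAX_ENTR \
	((PAGE_SIZE - sizeof(struct kfree_rcu_bulk_data)) / sizeof(void *))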
.
Reviewed-by: Joel Fernandes (Google)
Co-developed-by: Joel Fernandes (Google)
Signed-off-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
---
include/linux/rcupdate.h | 14 +++---
include/linux/rcutiny.h | 2 +-
include/linux/rcutree.h | 2 +-
include/trace/events
Introduce helpers to lock and unlock per-cpu "kfree_rcu_cpu"
structures. That will make kfree_call_rcu() more readable
and prevent programming errors.
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
---
kernel/rcu/tree.c | 31 +++--
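A sketch of such helpers, matching the shape described above (the exact
field and variable names are assumptions):

static inline struct kfree_rcu_cpu *
krc_this_cpu_lock(unsigned long *flags)
{
	struct kfree_rcu_cpu *krcp;

	local_irq_save(*flags);	/* For safely calling this_cpu_ptr(). */
	krcp = this_cpu_ptr(&krc);
	spin_lock(&krcp->lock);

	return krcp;
}

static inline void
krc_this_cpu_unlock(struct kfree_rcu_cpu *krcp, unsigned long flags)
{
	spin_unlock(&krcp->lock);
	local_irq_restore(flags);
}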
le) (3):
rcu/tree: Keep kfree_rcu() awake during lock contention
rcu/tree: Skip entry into the page allocator for PREEMPT_RT
rcu/tree: Make debug_objects logic independent of rcu_head
Sebastian Andrzej Siewior (1):
rcu/tree: Use static initializer for krc.lock
Uladzislau Rezki (Sony) (12):
Replace kfree() with kvfree() in rcu_reclaim_tiny().
This makes it possible to release either SLAB or vmalloc
objects after a GP.
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
---
kernel/rcu/tiny.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff
throughput.
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
---
lib/test_vmalloc.c | 103 +
1 file changed, 95 insertions(+), 8 deletions(-)
diff --git a/lib/test_vmalloc.c b/lib/test_vmalloc.c
index 8bbefcaddfe8
(ptr);
Note that the headless usage (example b) can only be used in code
that can sleep. This is enforced by the CONFIG_DEBUG_ATOMIC_SLEEP
option.
Co-developed-by: Joel Fernandes (Google)
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
---
include/linux/rcupdate.h
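To make the two forms concrete, a usage sketch (struct and field names are
illustrative only):

struct foo {
	struct rcu_head rcu;
	unsigned char payload[64];
};

static void example(struct foo *p, struct foo *q)
{
	/* a) Double argument: usable from atomic context; the object
	 *    carries an rcu_head. */
	kvfree_rcu(p, rcu);

	/* b) Single argument, head-less: no rcu_head required, but the
	 *    caller must be able to sleep; CONFIG_DEBUG_ATOMIC_SLEEP
	 *    enforces this, as noted above. */
	kvfree_rcu(q);
}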
of object.
struct test_kvfree_rcu {
struct rcu_head rcu;
unsigned char array[100];
};
struct test_kvfree_rcu *p;
p = kvmalloc(10 * PAGE_SIZE, GFP_KERNEL);
if (p)
kvfree_rcu(p, rcu);
Signed-off-by: Uladzislau Rezki (Sony)
Co-developed-by: Joel Fernandes (Google
st place.
Cc: Sebastian Andrzej Siewior
Reviewed-by: Uladzislau Rezki
Co-developed-by: Uladzislau Rezki
Signed-off-by: Uladzislau Rezki
Signed-off-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
---
kernel/rcu/tree.c | 12
1 file changed, 12 insertions(+
converting the spinlock to a raw
spinlock.
Vetting all code paths, there is no reason to believe that the raw
spinlock will hurt RT latencies as it is not held for a long time.
Cc: bige...@linutronix.de
Cc: Uladzislau Rezki
Reviewed-by: Uladzislau Rezki
Signed-off-by: Joel Fernandes (Google)
>
> I actually found it in the RT 4.4 kernel; I thought this was on newer RT
> kernels as well (is that not true anymore?). But yes, it was exactly what
> Peter said.
>
I see it also in 5.6.4 linux-rt-devel:
#ifdef CONFIG_PREEMPT_RT
...
# define get_local_ptr(var) ({ \
	migrate_disable(); \
	this_cpu_ptr(var); })
> >
> > Please see full log here:
> > ftp://vps418301.ovh.net/incoming/include_mm_h_output.txt
> >
> > I can fix it by adding the kvfree() declaration to the rcutiny.h also:
> > extern void kvfree(const void *addr);
> >
> > which seems weird to me. Also it can be fixed if I move it to the
Hello, Paul, Joel.
> > Move inlined kvfree_call_rcu() function out of the
> > header file. This step is a preparation for head-less
> > support.
> >
> > Reviewed-by: Joel Fernandes (Google)
> > Signed-off-by: Uladzislau Rezki (Sony)
> > ---
> >
> > b) Double argument (with rcu_head)
> > We treat this case as if it were called from atomic context, even though
> > it might not be. Why we consider such a case atomic: we simply assume so.
> > The reason is to keep it simple, because it is not possible to detect
> > whether
> > a current
On Mon, May 04, 2020 at 01:16:41PM -0700, Paul E. McKenney wrote:
> On Mon, May 04, 2020 at 09:51:28PM +0200, Uladzislau Rezki wrote:
> > > > > Since we don't care about traversing backwards, isn't it better to
> > > > > use llist
> > > > > fo
> > > Since we don't care about traversing backwards, isn't it better to use
> > > llist
> > > for this usecase?
> > >
> > > I think Vlad is using locking as we're also tracking the size of the
> > > llist to
> > > know when to free pages. This tracking could suffer from the lost-update
> > >
On Mon, May 04, 2020 at 08:24:37AM -0700, Paul E. McKenney wrote:
> On Mon, May 04, 2020 at 02:43:23PM +0200, Uladzislau Rezki wrote:
> > On Fri, May 01, 2020 at 02:27:49PM -0700, Paul E. McKenney wrote:
> > > On Tue, Apr 28, 2020 at 10:58:48PM +0200, Uladzislau Rezki (Sony) wr
> >
> > For the single argument we can drop the lock before entering the page
> > allocator. Because it follows the might_sleep() annotation, we avoid
> > a situation where the spinlock (an rt-mutex on RT) is taken from atomic context.
> >
> > Since the lock is dropped the current context can be
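A fragment sketching the pattern described above; need_page, can_sleep, and
the helper names are assumptions, not the actual kernel text:

	unsigned long flags;
	struct kfree_rcu_cpu *krcp;
	struct kfree_rcu_bulk_data *bnode;

	krcp = krc_this_cpu_lock(&flags);
	if (need_page && can_sleep) {
		/*
		 * Drop the lock before entering the page allocator: this
		 * path follows a might_sleep() annotation, so sleeping is
		 * legal here, and the spinlock (an rt-mutex on RT) is
		 * never taken from atomic context.
		 */
		krc_this_cpu_unlock(krcp, flags);
		bnode = (struct kfree_rcu_bulk_data *)__get_free_page(GFP_KERNEL);
		krcp = krc_this_cpu_lock(&flags);
	}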
> > @@ -3072,21 +3105,34 @@ static inline bool queue_kfree_rcu_work(struct
> > kfree_rcu_cpu *krcp)
> > krwp = &(krcp->krw_arr[i]);
> >
> > /*
> > -* Try to detach bhead or head and attach it over any
> > +* Try to detach bkvhead or head and
> > >
> > > If we are not doing single-pointer allocation, then that would also
> > > eliminate
> > > entering the low-level page allocator for single-pointer allocations.
> > >
> > > Or did you mean entry into the allocator for the full-page allocations
> > > related to the pointer array for
On Fri, May 01, 2020 at 03:39:09PM -0700, Paul E. McKenney wrote:
> On Tue, Apr 28, 2020 at 10:58:58PM +0200, Uladzislau Rezki (Sony) wrote:
> > Update the kvfree_call_rcu() with head-less support, it
> > means an object without any rcu_head structure can be
> &g
> > > On Tue, Apr 28, 2020 at 10:58:59PM +0200, Uladzislau Rezki (Sony) wrote:
> > > > > From: "Joel Fernandes (Google)"
> > > > >
> > > > > Handle cases where the object being kvfree_rcu()'d is not aligned
> > >
On Sun, May 03, 2020 at 08:27:00PM -0400, Joel Fernandes wrote:
> On Fri, May 01, 2020 at 04:06:38PM -0700, Paul E. McKenney wrote:
> > On Tue, Apr 28, 2020 at 10:59:01PM +0200, Uladzislau Rezki (Sony) wrote:
> > > Make a kvfree_call_rcu() function to support head-less
&
On Fri, May 01, 2020 at 04:03:59PM -0700, Paul E. McKenney wrote:
> On Tue, Apr 28, 2020 at 10:59:00PM +0200, Uladzislau Rezki (Sony) wrote:
> > Move inlined kvfree_call_rcu() function out of the
> > header file. This step is a preparation for head-less
> > support.
>
On Fri, May 01, 2020 at 03:25:24PM -0700, Paul E. McKenney wrote:
> On Tue, Apr 28, 2020 at 10:58:49PM +0200, Uladzislau Rezki (Sony) wrote:
> > Document the rcutree.rcu_min_cached_objs sysfs kernel parameter.
> >
> > Signed-off-by: Uladzislau Rezki (Sony)
>
> Could y
On Fri, May 01, 2020 at 02:27:49PM -0700, Paul E. McKenney wrote:
> On Tue, Apr 28, 2020 at 10:58:48PM +0200, Uladzislau Rezki (Sony) wrote:
> > Cache some extra objects per-CPU. During the reclaim process
> > some pages are cached instead of being released, by linking them
> > into th
> >
> > - local_irq_save(*flags); // For safely calling this_cpu_ptr().
> > + local_irq_save(*flags); /* For safely calling this_cpu_ptr(). */
>
> And here as well. ;-)
>
OK. For me it works either way. I can stick to "//" :)
--
Vlad Rezki
Introduce two helpers to lock and unlock access to the
per-cpu "kfree_rcu_cpu" structure. The reason is to make
the kfree_call_rcu() function more readable.
Signed-off-by: Uladzislau Rezki (Sony)
---
kernel/rcu/tree.c | 31 +++
1 file changed, 23
ot subject to such
conversions.
Vetting all code paths, there is no reason to believe that the raw
spinlock will be held for a long time, so PREEMPT_RT should not suffer from
lengthy acquisitions of the lock.
Cc: bige...@linutronix.de
Cc: Uladzislau Rezki
Reviewed-by: Uladzislau Rezki
Signed-off-by: Joel
remove the '->initialized' check.
Signed-off-by: Sebastian Andrzej Siewior
Signed-off-by: Uladzislau Rezki (Sony)
---
kernel/rcu/tree.c | 15 +++
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index bc6c2bc8fa32..89e9ca3f4e3e 100644
--- a/k
We can simplify the KFREE_BULK_MAX_ENTR macro and get rid of the
magic numbers which were used to make the structure exactly
one page in size.
Suggested-by: Boqun Feng
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
Signed-off-by: Joel Fernandes (Google)
---
kernel/rcu
From: "Joel Fernandes (Google)"
Simple cleanup of comments in the kfree_rcu() code to keep it consistent
with the majority of commenting styles.
Reviewed-by: Uladzislau Rezki
Signed-off-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
---
kernel/rcu/t
)
Signed-off-by: Uladzislau Rezki (Sony)
---
include/linux/rcupdate.h | 38 ++
1 file changed, 34 insertions(+), 4 deletions(-)
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 51b26ab02878..d15d46db61f7 100644
--- a/include/linux/rcupdate.h
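A sketch of how one macro can dispatch on one vs. two arguments (the helper
names are assumptions based on the description):

#define kvfree_rcu(...) KVFREE_GET_MACRO(__VA_ARGS__,	\
	kvfree_rcu_arg_2, kvfree_rcu_arg_1)(__VA_ARGS__)

/* Picks the name that matches the number of arguments passed in. */
#define KVFREE_GET_MACRO(_1, _2, NAME, ...) NAME

/*
 * kvfree_rcu_arg_2(ptr, rhf) would expand to the rcu_head-based path,
 * kvfree_rcu_arg_1(ptr) to the head-less path (bodies omitted here).
 */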
Replace kfree() with kvfree() in rcu_reclaim_tiny().
So it becomes possible to release either SLAB memory
or vmalloc memory after a GP.
Signed-off-by: Uladzislau Rezki (Sony)
---
kernel/rcu/tiny.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/rcu/tiny.c b/kernel/rcu
kvfree_rcu {
struct rcu_head rcu;
unsigned char array[100];
};
struct test_kvfree_rcu *p;
p = kvmalloc(10 * PAGE_SIZE, GFP_KERNEL);
if (p)
kvfree_rcu(p, rcu);
Signed-off-by: Uladzislau Rezki (Sony)
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Joel Fernand
in
pending state after a detach attempt, just reschedule
the monitor work.
Signed-off-by: Uladzislau Rezki (Sony)
---
kernel/rcu/tree.c | 9 ++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 1487af8e11e8..0762ac06f0b7 100644
--- a/kernel
Rename __is_kfree_rcu_offset to __is_kvfree_rcu_offset.
All RCU paths now use kvfree() instead of kfree(), hence
the rename.
Signed-off-by: Uladzislau Rezki (Sony)
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Joel Fernandes (Google)
---
include/linux/rcupdate.h | 6 +++---
kernel/rcu
occurs.
A parameter reflecting the minimum number of pages allowed
to be cached per CPU is exposed via sysfs. It is read-only
and named "rcu_min_cached_objs".
Signed-off-by: Uladzislau Rezki (Sony)
---
kernel/rcu/tree.c | 64 ---
1 file c
Rename the kvfree_rcu() function to kvfree_rcu_local(). The aim is
to introduce a public API whose name would conflict with this one, so we
temporarily rename it and remove it in a later commit.
Cc: linux...@kvack.org
Cc: Andrew Morton
Cc: r...@vger.kernel.org
Signed-off-by: Uladzislau Rezki
. Also please note that for tiny-RCU any
call to synchronize_rcu() is actually a quiescent
state, therefore (a) does nothing.
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
---
kernel/rcu/tiny.c | 157 +-
1 file changed, 156
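A sketch of what the tiny-RCU head-less path can look like, under the
assumption that the single-argument form carries the object pointer in the
func slot (an assumption of this sketch, not stated above):

void kvfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
{
	if (head) {
		call_rcu(head, func);	/* double-argument path */
		return;
	}

	/*
	 * Head-less path: on tiny-RCU, synchronize_rcu() is itself a
	 * quiescent state, so case (a) reduces to freeing the object.
	 */
	might_sleep();
	synchronize_rcu();
	kvfree((void *)func);
}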
Document the rcutree.rcu_min_cached_objs sysfs kernel parameter.
Signed-off-by: Uladzislau Rezki (Sony)
---
Documentation/admin-guide/kernel-parameters.txt | 8
1 file changed, 8 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt
b/Documentation/admin-guide
Cc: Sebastian Andrzej Siewior
Reviewed-by: Uladzislau Rezki
Co-developed-by: Uladzislau Rezki
Signed-off-by: Uladzislau Rezki
Signed-off-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
---
kernel/rcu/tree.c | 12
1 file changed, 12 insertions(+)
diff --git
From: "Joel Fernandes (Google)"
In order to prepare for future changes for headless RCU support, make the
debug_objects handling in kfree_rcu use the final 'pointer' value of the
object, instead of depending on the head.
Signed-off-by: Uladzislau Rezki (Sony)
Signed-off-by: Joel
steps could
be applied:
a) wait until a grace period has elapsed;
b) direct inlining of the kvfree() call.
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
Signed-off-by: Joel Fernandes (Google)
Co-developed-by: Joel Fernandes (Google)
---
kernel/rcu/tree.c | 102
rcu/tree: Skip entry into the page allocator for PREEMPT_RT
rcu/tree: Use consistent style for comments
rcu/tree: Simplify debug_objects handling
rcu/tree: Make kvfree_rcu() tolerate any alignment
Sebastian Andrzej Siewior (1):
rcu/tree: Use static initializer for krc.lock
Uladzislau Rezki (Sony) (18):
The reason is that it is now capable of freeing
vmalloc() memory.
Do the same with the __kfree_rcu() macro: it becomes
__kvfree_rcu(), for the same reason as pointed out
above.
Signed-off-by: Uladzislau Rezki (Sony)
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Joel Fernandes (Google
Morton
Cc: r...@vger.kernel.org
Signed-off-by: Uladzislau Rezki (Sony)
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Joel Fernandes (Google)
---
mm/list_lru.c | 13 +++--
1 file changed, 3 insertions(+), 10 deletions(-)
diff --git a/mm/list_lru.c b/mm/list_lru.c
index
From: "Joel Fernandes (Google)"
Handle cases where the object being kvfree_rcu()'d is not aligned on
2-byte boundaries.
Signed-off-by: Uladzislau Rezki (Sony)
Signed-off-by: Joel Fernandes (Google)
---
kernel/rcu/tree.c | 9 ++---
1 file changed, 6 insertions(+), 3 deletion
Move inlined kvfree_call_rcu() function out of the
header file. This step is a preparation for head-less
support.
Reviewed-by: Joel Fernandes (Google)
Signed-off-by: Uladzislau Rezki (Sony)
---
include/linux/rcutiny.h | 6 +-
kernel/rcu/tiny.c | 6 ++
2 files changed, 7
of the
performance throughput and its impact.
Signed-off-by: Uladzislau Rezki (Sony)
---
lib/test_vmalloc.c | 103 +
1 file changed, 95 insertions(+), 8 deletions(-)
diff --git a/lib/test_vmalloc.c b/lib/test_vmalloc.c
index 8bbefcaddfe8..ec73561cda2e
Rename rcu_invoke_kfree_callback to rcu_invoke_kvfree_callback.
Do the same with the second trace event: rcu_kfree_callback
becomes rcu_kvfree_callback. The reason is to be aligned with
the kvfree notation.
Signed-off-by: Uladzislau Rezki (Sony)
Reviewed-by: Joel Fernandes (Google)
Signed-off
the preparation patch for head-less objects
support. When an object is head-less we cannot queue
it onto any list; instead a pointer is placed directly
into an array.
Signed-off-by: Uladzislau Rezki (Sony)
Signed-off-by: Joel Fernandes (Google)
Reviewed-by: Joel Fernandes (Google)
Co-developed
[ 305.108944] All test took CPU4=297630721 cycles
[ 305.196406] All test took CPU5=297548736 cycles
[ 305.288602] All test took CPU6=297092392 cycles
[ 305.381088] All test took CPU7=297293597 cycles
The patched variant is ~14%-23% better.
Signed-off-b
> > >
> > > This is explaining what but it doesn't say why. I would go with
> > > "
> > > Allocation functions should comply with the given gfp_mask as much as
> > > possible. The preallocation code in alloc_vmap_area doesn't follow that
> > > pattern and it is using a hardcoded GFP_KERNEL.
> > alloc_vmap_area() is given a gfp_mask for the page allocator.
> > Let's respect that mask and consider it even in the case when
> > doing regular CPU preloading, i.e. where a context can sleep.
>
> This is explaining what but it doesn't say why. I would go with
> "
> Allocation functions
On Wed, Oct 16, 2019 at 01:07:22PM +0200, Michal Hocko wrote:
> On Wed 16-10-19 11:54:38, Uladzislau Rezki (Sony) wrote:
> > When fit type is NE_FIT_TYPE there is a need in one extra object.
> > Usually the "ne_fit_preload_node" per-CPU variable has it and
> >
Hello, Michal.
Sorry for the late reply. See my comments enclosed below:
> On Wed 16-10-19 11:54:36, Uladzislau Rezki (Sony) wrote:
> > Some background. Preemption was disabled before to guarantee
> > that a preloaded object is available for the CPU it was stored for.
>
hen it can occur, how often, under
which conditions, and what happens if GFP_NOWAIT fails.
Signed-off-by: Uladzislau Rezki (Sony)
---
mm/vmalloc.c | 13 +
1 file changed, 13 insertions(+)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 593bf554518d..2290a0d270e4 100644
--- a/mm
one object for split
purpose")
Reviewed-by: Steven Rostedt (VMware)
Acked-by: Sebastian Andrzej Siewior
Acked-by: Daniel Wagner
Signed-off-by: Uladzislau Rezki (Sony)
---
mm/vmalloc.c | 37 -
1 file changed, 20 insertions(+), 17 deletions(-)
diff
alloc_vmap_area() is given a gfp_mask for the page allocator.
Let's respect that mask and consider it even in the case when
doing regular CPU preloading, i.e. where a context can sleep.
Signed-off-by: Uladzislau Rezki (Sony)
---
mm/vmalloc.c | 8
1 file changed, 4 insertions(+), 4
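A sketch of the idea: restrict the caller's mask to the reclaim-relevant
bits instead of hardcoding GFP_KERNEL at the preload allocation site in
alloc_vmap_area() (the shape of that site is assumed here):

	/*
	 * Preload one extra vmap_area object for the split case, honoring
	 * the gfp_mask handed to alloc_vmap_area() rather than a
	 * hardcoded GFP_KERNEL.
	 */
	pva = kmem_cache_alloc_node(vmap_area_cachep,
				    gfp_mask & GFP_RECLAIM_MASK, node);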
> > > > > > :* The preload is done in non-atomic context, thus it allows us
> > > > > > :* to use more permissive allocation masks to be more stable
> > > > > > under
> > > > > > :* low memory condition and high memory pressure.
> > > > > > :*
> > > > > > :* Even if it fails
On Mon, Oct 14, 2019 at 03:13:08PM +0200, Michal Hocko wrote:
> On Fri 11-10-19 00:33:18, Uladzislau Rezki (Sony) wrote:
> > Get rid of preempt_disable() and preempt_enable() when the
> > preload is done for splitting purpose. The reason is that
> > calling spin_lock() wi
On Fri, Oct 11, 2019 at 04:55:15PM -0700, Andrew Morton wrote:
> On Thu, 10 Oct 2019 17:17:49 +0200 Uladzislau Rezki wrote:
>
> > > > :* The preload is done in non-atomic context, thus it allows us
> > > > :* to use more permissive allo
dt (VMware)
Signed-off-by: Uladzislau Rezki (Sony)
---
mm/vmalloc.c | 50 +-
1 file changed, 33 insertions(+), 17 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e92ff5f7dd8b..f48cd0711478 100644
--- a/mm/vmalloc.c
+++ b/mm/vmal
> >
> > A few questions about the resulting alloc_vmap_area():
> >
> > : static struct vmap_area *alloc_vmap_area(unsigned long size,
> > : unsigned long align,
> > : unsigned long vstart, unsigned long vend,
> > : int
le.
Fixes: 82dd23e84be3 ("mm/vmalloc.c: preload a CPU with one object for split
purpose")
Signed-off-by: Uladzislau Rezki (Sony)
---
mm/vmalloc.c | 17 -
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e92ff5f7dd8b..2ed6fef8
Hello, Daniel.
On Wed, Oct 09, 2019 at 08:05:39AM +0200, Daniel Wagner wrote:
> On Tue, Oct 08, 2019 at 06:04:59PM +0200, Uladzislau Rezki wrote:
> > > so, we do not guarantee it; instead we minimize the number of allocations
> > > with the GFP_NOWAIT flag. For example on my
On Fri, Oct 04, 2019 at 01:20:38PM -0400, Joel Fernandes wrote:
> On Tue, Oct 01, 2019 at 01:27:02PM +0200, Uladzislau Rezki wrote:
> [snip]
> > > > I have just a small question related to workloads and performance
> > > > evaluation.
> > > > Are you awa
On Mon, Oct 07, 2019 at 11:44:20PM +0200, Uladzislau Rezki wrote:
> On Mon, Oct 07, 2019 at 07:36:44PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2019-10-07 18:56:11 [+0200], Uladzislau Rezki wrote:
> > > Actually there is a high lock contention on vmap_area_lock, because
On Mon, Oct 07, 2019 at 07:36:44PM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-10-07 18:56:11 [+0200], Uladzislau Rezki wrote:
> > Actually there is a high lock contention on vmap_area_lock, because it
> > is still global. You can have a look at last slide:
On Mon, Oct 07, 2019 at 06:34:43PM +0200, Daniel Wagner wrote:
> On Mon, Oct 07, 2019 at 06:23:30PM +0200, Uladzislau Rezki wrote:
> > Hello, Daniel, Sebastian.
> >
> > > > On Fri, Oct 04, 2019 at 06:30:42PM +0200, Sebastian Andrzej Siewior
> > > > wrote:
Hello, Daniel, Sebastian.
> > On Fri, Oct 04, 2019 at 06:30:42PM +0200, Sebastian Andrzej Siewior wrote:
> > > On 2019-10-04 18:20:41 [+0200], Uladzislau Rezki wrote:
> > > > If we have migrate_disable/enable, then, i think preempt_enable/disable
> > > > s
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index a3c70e275f4e..9fb7a16f42ae 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -690,8 +690,19 @@ merge_or_add_vmap_area(struct vmap_area *va,
> struct list_head *next;
> struct rb_node **link;
> struct rb_node *parent;
> +
>
> You could have been migrated to another CPU while
> memory has been allocated.
>
It is true that we can migrate, since we allow preemption
while allocating. But it does not really matter on which CPU an
allocation occurs, nor whether we migrate or not.
If we land on another CPU or still stay on
On Fri, Oct 04, 2019 at 05:37:28PM +0200, Sebastian Andrzej Siewior wrote:
> If you post something that is related to PREEMPT_RT please keep tglx and
> me in Cc.
>
> On 2019-10-03 11:09:06 [+0200], Daniel Wagner wrote:
> > Replace preempt_enable() and preempt_disable() with the vmap_area_lock
> >
really make sense.
>
> Fixes: 82dd23e84be3 ("mm/vmalloc.c: preload a CPU with one object for
> split purpose")
> Cc: Uladzislau Rezki (Sony)
> Signed-off-by: Daniel Wagner
> ---
> mm/vmalloc.c | 9 +++--
> 1 file changed, 3 insertions(+), 6 deletions(-)
On Wed, Oct 02, 2019 at 11:23:06AM +1000, Daniel Axtens wrote:
> Hi,
>
> >>/*
> >> * Find a place in the tree where VA potentially will be
> >> * inserted, unless it is merged with its sibling/siblings.
> >> @@ -741,6 +752,10 @@ merge_or_add_vmap_area(struct vmap_area *va,
> >>
> > Hello, Joel.
> >
> > First of all, thank you for improving it. I also noticed high pressure
> > on the RCU machinery while performing some vmalloc tests when a kfree_rcu()
> > flood occurred. Therefore I got rid of using kfree_rcu() there.
>
> Replying a bit late due to an overseas conference
Hello, Daniel.
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index a3c70e275f4e..9fb7a16f42ae 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -690,8 +690,19 @@ merge_or_add_vmap_area(struct vmap_area *va,
> struct list_head *next;
> struct rb_node **link;
> struct rb_node
> Recently a discussion about stability and performance of a system
> involving a high rate of kfree_rcu() calls surfaced on the list [1]
> which led to another discussion how to prepare for this situation.
>
> This patch adds basic batching support for kfree_rcu(). It is "basic"
> because we do
>
> I think it would be sufficient to call RBCOMPUTE(node, true) on every
> node and check the return value ?
>
Yes, that is enough for sure. The only thing I was thinking about was making it
public, because checking the tree for MAX is generic for every user which
uses RB_DECLARE_CALLBACKS_MAX
On Sun, Aug 11, 2019 at 05:39:23PM -0700, Michel Lespinasse wrote:
> On Sun, Aug 11, 2019 at 11:46 AM Uladzislau Rezki (Sony)
> wrote:
> > RB_DECLARE_CALLBACKS_MAX defines its own callback to update the
> > augmented subtree information after a node is modified. It make
> On Sun, Aug 11, 2019 at 11:46 AM Uladzislau Rezki (Sony)
> wrote:
> >
> > Recently there was introduced RB_DECLARE_CALLBACKS_MAX template.
> > One of the callback, to be more specific *_compute_max(), calculates
> > a maximum scalar value of node a
the code more transparent.
Signed-off-by: Uladzislau Rezki (Sony)
---
include/linux/rbtree_augmented.h | 40 +-
tools/include/linux/rbtree_augmented.h | 40 +-
2 files changed, 40 insertions(+), 40 deletions(-)
diff --git
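For reference, a typical use of the template being discussed (the node type
here is illustrative; the macro generates the augment callbacks that keep
subtree_max up to date):

struct mynode {
	struct rb_node rb;
	unsigned long val;
	unsigned long subtree_max;	/* max of "val" over the subtree */
};

#define NODE_VAL(node) ((node)->val)

RB_DECLARE_CALLBACKS_MAX(static, mynode_augment,
			 struct mynode, rb,
			 unsigned long, subtree_max, NODE_VAL)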