https://bugs.kde.org/show_bug.cgi?id=511717

--- Comment #32 from Philippe Waroquiers <[email protected]> ---
(In reply to Philippe Waroquiers from comment #31)
> The mystery does not clarify :(.
The mystery now clarifies a little bit :).

> The fault is due to GDB asking to read "reading memory 0x31002000 size 8"
> This range of memory is marked as addressable(rw) by the aspacemgr trace:
> --397751:1: aspacem 152: anon 0030000000-0033146fff     49m rw---
> 
> What is strange is that we have two running threads for which valgrind
> indicates overlapping client stacks:
Ubuntu 25.10 has switched to a newer glibc, 2.42.
It looks like this glibc version no longer uses mprotect with PROT_NONE
for the stack guard page of a thread.
Instead, it uses madvise(0x703945bfd000, 4096, MADV_GUARD_INSTALL) = 0
This instructs the kernel to mark the memory range as not accessible,
but without an explicit map range covering it
(this was done to optimise applications that have many threads).
So, several threads might have their stack in the same map segment.
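
To make the mechanism concrete, here is a minimal sketch of installing such a
guard page on an anonymous rw mapping. It is not glibc's code. The helper name
map_with_guard is hypothetical, and the fallback #define values (102 and 103)
are the ones I believe the Linux asm-generic UAPI headers use, given here only
in case the installed libc headers predate them.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumed values from the Linux asm-generic UAPI headers, for libc
   headers that do not define them yet. */
#ifndef MADV_GUARD_INSTALL
#define MADV_GUARD_INSTALL 102
#endif
#ifndef MADV_GUARD_REMOVE
#define MADV_GUARD_REMOVE 103
#endif

/* Hypothetical helper: map npages of anonymous rw memory and try to turn
   the lowest page into a guard page with madvise.
   Returns  1 if the guard was installed,
            0 if the kernel refused the advice (e.g. an older kernel),
           -1 if mmap itself failed.
   On a non-negative return, *out points to the start of the mapping. */
int map_with_guard(size_t npages, void **out)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    void *p = mmap(NULL, npages * (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    *out = p;

    /* No mprotect is involved: the protection of the mapping is not
       changed, only the first page is flagged as a guard. */
    if (madvise(p, (size_t)page, MADV_GUARD_INSTALL) == 0)
        return 1;
    return 0;
}
```

When the call returns 1, the first page faults on access while the rest of the
mapping stays usable, which is the situation described above.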

This new mechanism has several consequences on valgrind:
* the valgrind address space manager believes that the guard page is readable
  and writable, while any access to this memory will cause a SEGV.
* the logic to guess the client stack in ML_(guess_and_register_stack) assumes
  that a thread stack has its own segment.
  This is not the case anymore, as valgrind does not "see" the guard page,
  and the kernel proc map also does not indicate that some pages do not
  actually respect the rw protection of the segment they are in.
* the above explains why the gdbserver read memory still causes a SEGV even
  when the valgrind address space manager indicates that the memory is
  readable.
* I am not too sure of the consequences of valgrind incorrectly guessing the
  client stacks.
  pub_core_stacks.h indicates that this is needed to detect stack switches.
  It is also used to translate a thread SP into the stack limits. Such stack
  limits are used, among others, in m_stacktrace.c.

There are several ways we can fix this.
One (possibly too ugly?) way is to make the madvise syscall fail in
PRE(sys_madvise) when the advice arg is MADV_GUARD_INSTALL.
The glibc code falls back on the mprotect technique when the madvise
MADV_GUARD_INSTALL fails.
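
A rough, self-contained model of this first approach follows. It is neither
valgrind's syscall wrapper nor glibc's code: filter_madvise_advice,
setup_guard and enum guard_kind are hypothetical names, and returning EINVAL
is an assumption (any failure that triggers the glibc fallback would do).

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Assumed value from the Linux asm-generic UAPI headers. */
#ifndef MADV_GUARD_INSTALL
#define MADV_GUARD_INSTALL 102
#endif

/* Hypothetical stand-in for the check proposed for the madvise wrapper:
   returns -EINVAL when the advice must be refused, 0 otherwise. */
int filter_madvise_advice(int advice)
{
    return advice == MADV_GUARD_INSTALL ? -EINVAL : 0;
}

enum guard_kind { GUARD_NONE, GUARD_MADVISE, GUARD_MPROTECT };

/* Illustrative caller modelling the fallback described above: try the
   madvise guard first, and use mprotect(PROT_NONE) if that fails.
   The filter plays the role of the tool refusing the syscall, so with the
   filter in place the madvise branch is never taken. */
enum guard_kind setup_guard(void *addr, size_t len)
{
    if (filter_madvise_advice(MADV_GUARD_INSTALL) == 0
        && madvise(addr, len, MADV_GUARD_INSTALL) == 0)
        return GUARD_MADVISE;

    /* Fallback: the guard becomes a distinct PROT_NONE range, which is
       visible as its own entry in the process map. */
    if (mprotect(addr, len, PROT_NONE) == 0)
        return GUARD_MPROTECT;

    return GUARD_NONE;
}
```

With the filter refusing the advice, the guard ends up as an ordinary
PROT_NONE range again, which is what the existing stack-guessing logic
expects.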

A second way would be to modify the address space manager so that it
understands the concept of a MADV_GUARD_INSTALL page (the madvise system call
would then need to inform the address space manager of the calls to
MADV_GUARD_INSTALL and MADV_GUARD_REMOVE).
This is likely cleaner than the previous solution, but it likely has quite
some impact on the concept of map segments and on the relationship between
the valgrind maintained map and the /proc map.
The semantics of MADV_GUARD_INSTALL and MADV_GUARD_REMOVE seem however not
trivial to emulate.
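
As a very rough illustration of the bookkeeping this second approach would
need, the sketch below keeps a side table of guard ranges that is updated on
install/remove and queried before trusting a segment's rw permission. All
names (GuardTable, guard_install, guard_remove, guard_contains) are
hypothetical and nothing here is taken from the valgrind address space
manager. It deliberately ignores the parts that make the real semantics hard,
for instance what happens to a guard when its range is later unmapped or
remapped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_GUARDS 64

typedef struct { uintptr_t start, end; } GuardRange;   /* [start, end) */
typedef struct { GuardRange r[MAX_GUARDS]; size_t n; } GuardTable;

static bool push_range(GuardTable *t, uintptr_t start, uintptr_t end)
{
    if (t->n == MAX_GUARDS)
        return false;
    t->r[t->n].start = start;
    t->r[t->n].end = end;
    t->n++;
    return true;
}

/* Record [start, end) as guarded (to be called on MADV_GUARD_INSTALL). */
bool guard_install(GuardTable *t, uintptr_t start, uintptr_t end)
{
    return start < end && push_range(t, start, end);
}

/* Forget guards inside [start, end) (to be called on MADV_GUARD_REMOVE).
   A recorded range that is only partly covered is trimmed or split.
   Returns false, leaving the table unchanged, if a split does not fit. */
bool guard_remove(GuardTable *t, uintptr_t start, uintptr_t end)
{
    GuardTable out = { .n = 0 };
    for (size_t i = 0; i < t->n; i++) {
        GuardRange g = t->r[i];
        if (g.end <= start || g.start >= end) {        /* no overlap */
            if (!push_range(&out, g.start, g.end))
                return false;
            continue;
        }
        if (g.start < start && !push_range(&out, g.start, start))
            return false;                              /* left remainder */
        if (g.end > end && !push_range(&out, end, g.end))
            return false;                              /* right remainder */
    }
    *t = out;
    return true;
}

/* True if addr lies in a recorded guard range, i.e. an access would fault
   even though the enclosing segment is marked rw. */
bool guard_contains(const GuardTable *t, uintptr_t addr)
{
    for (size_t i = 0; i < t->n; i++)
        if (addr >= t->r[i].start && addr < t->r[i].end)
            return true;
    return false;
}
```

A caller such as the gdbserver memory read could consult guard_contains
before touching client memory that the segment list reports as readable.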
