Re: [PATCH 3/3] elf: Add _dl_find_eh_frame function

2021-11-23 Thread Adhemerval Zanella via Gcc-patches



On 17/11/2021 10:40, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
> 
>> However the code is somewhat complex and I would like to have some feedback
>> on whether GCC will be willing to accept this change (I assume it would
>> require this code to be merged into glibc beforehand).
> 
> There's a long review queue on the GCC side due to the stage1 close.
> It may still be considered for GCC 12.  Jakub has also requested that
> we hold off committing the glibc side until the GCC side is reviewed.
> 
> I'll flesh out the commit message and NEWS entry once we have agreed
> upon the interface.
> 
>>> new file mode 100644
>>> index 00..c7313c122d
>>> --- /dev/null
>>> +++ b/elf/dl-find_eh_frame.c
> 
>>> +/* Data for the main executable.  There is usually a large gap between
>>> +   the main executable and initially loaded shared objects.  Record
>>> +   the main executable separately, to increase the chance that the
>>> +   range for the non-closeable mappings below covers only the shared
>>> +   objects (and not also the gap between main executable and shared
>>> +   objects).  */
>>> +static uintptr_t _dl_eh_main_map_start attribute_relro;
>>> +static struct dl_eh_frame_info _dl_eh_main_info attribute_relro;
>>> +
>>> +/* Data for initally loaded shared objects that cannot be unlaoded.
>>
>> s/initally/initially and s/unlaoded/unloaded.
> 
> Fixed.
> 
>>
>>> +   The mapping base addresses are stored in address order in the
>>> +   _dl_eh_nodelete_mappings_bases array (containing
>>> +   _dl_eh_nodelete_mappings_size elements).  The EH data for a base
>>> +   address is stored in the parallel _dl_eh_nodelete_mappings_infos.
>>> +   These arrays are not modified after initialization.  */
>>> +static uintptr_t _dl_eh_nodelete_mappings_end attribute_relro;
>>> +static size_t _dl_eh_nodelete_mappings_size attribute_relro;
>>> +static uintptr_t *_dl_eh_nodelete_mappings_bases attribute_relro;
>>> +static struct dl_eh_frame_info *_dl_eh_nodelete_mappings_infos
>>> +  attribute_relro;
>>> +
>>> +/* Mappings created by dlopen can go away with dlclose, so a data
>>> +   dynamic data structure with some synchronization is needed.
>>
>> This sounds strange ("a data dynamic data").
> 
> I dropped the first data.
> 
>>
>>> +   Individual segments are similar to the _dl_eh_nodelete_mappings
>>
>> Maybe use _dl_eh_nodelete_mappings_*, because '_dl_eh_nodelete_mappings'
>> itself is not defined anywhere.
> 
> Right.
> 
>>> +   Adding new elements to this data structure is another source of
>>> +   quadratic behavior for dlopen.  If the other causes of quadratic
>>> +   behavior are eliminated, a more complicated data structure will be
>>> +   needed.  */
>>
>> This worries me, especially since we have reports that Python and other
>> dynamic environments use a lot of plugins and generate a lot of dlopen()
>> calls.  What kind of performance implications do you foresee here?
> 
> The additional overhead is not disproportionate to the other sources of
> quadratic behavior.  With 1,000 dlopen'ed objects, overall run-time
> seems to be comparable to the strcmp time required for soname matching, for
> example, and is quite difficult to measure.  So we could fix the
> performance regression if we used a hash table for that …
> 
> It's just an undesirable complexity class.  The implementation is not
> actually slow because it's a mostly-linear copy (although a backwards
> one).  Other parts of dlopen involve pointer chasing and are much
> slower.

Right, I agree this probably won't cause performance issues;
I was just curious whether you have any numbers about it.

> 
>>> +/* Allocate an empty segment that is at least SIZE large.  PREVIOUS */
>>
>> What does this PREVIOUS refer to?
> 
> Oops, it's now:
> 
> /* Allocate an empty segment that is at least SIZE large.  PREVIOUS
>    points to the chain of previously allocated segments and can be
>    NULL.  */
> 
>>> +/* Update the version to reflect that an update is happening.  This
>>> +   does not change the bit that controls the active segment chain.
>>> +   Returns the index of the currently active segment chain.  */
>>> +static inline unsigned int
>>> +_dl_eh_mappings_begin_update (void)
>>> +{
>>> +  unsigned int v
>>> +    = __atomic_wide_counter_fetch_add_relaxed (&_dl_eh_loaded_mappings_version,
>>> +                                               2);
>>
>> Why use an 'unsigned int' for the wide counter here?
> 
> Because …
> 
>>> +  /* Subsequent stores to the TM data must not be reordered before the
>>> +     store above with the version update.  */
>>> +  atomic_thread_fence_release ();
>>> +  return v & 1;
>>> +}
> 
> … we only need the lower bit.

Ack, I guess it won't matter to the compiler.

> 
>>> +  /* Other initially loaded objects.  */
>>> +  if (pc >= *_dl_eh_nodelete_mappings_bases
>>> +  && pc < _dl_eh_nodelete_mappings_end)
>>> +{
>>> +  size_t idx = _dl_eh_find_lower_bound (pc,
>>> +_dl_eh_nodelete_mappings_base

Re: [PATCH 3/3] elf: Add _dl_find_eh_frame function

2021-11-18 Thread Florian Weimer via Gcc-patches
* Jakub Jelinek:

> dl_iterate_phdr is declared in link.h and without the _ prefix, shouldn't
> dl_find_eh_frame follow suit and be declared in the same header and
> also without the prefix?

We need to use the _ prefix due to this bug:

  dl_iterate_phdr namespace violation 
  

Not sure about moving to <link.h>.  The interface is a bit like dladdr,
and that lives in <dlfcn.h>.

> Also, shouldn't the DL_FIND_EH_FRAME_DBASE macro on the other side have
> __ prefix?  We have one DL_* macro, DL_CALL_FCT, so perhaps it is fine
> for -D_GNU_SOURCE, but various other projects do use macros with DL_*
> prefix, like boost or python.

 is not covered by standards, and we rarely use _GNU_SOURCE
conditionals in such headers.  The _ prefix is only needed because of
external linkage.

Thanks,
Florian



Re: [PATCH 3/3] elf: Add _dl_find_eh_frame function

2021-11-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 03, 2021 at 05:28:02PM +0100, Florian Weimer wrote:
> --- /dev/null
> +++ b/bits/dlfcn_eh_frame.h
> @@ -0,0 +1,33 @@
> +/* System dependent definitions for find unwind information using ld.so.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#ifndef _DLFCN_H
> +# error "Never use <bits/dlfcn_eh_frame.h> directly; include <dlfcn.h> instead."
> +#endif
> +
> +/* This implementation does not use a DBASE pointer argument in
> +   _dl_find_eh_frame.  */
> +#define DL_FIND_EH_FRAME_DBASE 0
> +
> +__BEGIN_DECLS
> +/* If PC points into an object that has a PT_GNU_EH_FRAME segment,
> +   return the pointer to the start of that segment in memory.  If no
> +   corresponding object exists or the object has no such segment,
> +   returns NULL.  */
> +void *_dl_find_eh_frame (void *__pc) __THROW;
> +__END_DECLS
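
For comparison, the semantics described in the comment above can be
approximated with the existing dl_iterate_phdr interface.  This is only a
sketch under stated assumptions: slow_find_eh_frame, struct eh_lookup and
eh_lookup_callback are hypothetical names that do not appear in the patch,
and the linear walk over all loaded objects is the kind of lookup the new
function is meant to speed up.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical closure type for the callback below.  */
struct eh_lookup
{
  uintptr_t pc;     /* Address to look up.  */
  void *eh_frame;   /* Result: start of PT_GNU_EH_FRAME, or NULL.  */
};

/* Called once per loaded object.  Returning nonzero stops the walk.  */
static int
eh_lookup_callback (struct dl_phdr_info *info, size_t size, void *closure)
{
  (void) size;
  assert (closure != NULL);
  struct eh_lookup *lookup = closure;
  const ElfW(Phdr) *eh_phdr = NULL;
  int contains_pc = 0;

  for (size_t i = 0; i < info->dlpi_phnum; ++i)
    {
      const ElfW(Phdr) *phdr = &info->dlpi_phdr[i];
      if (phdr->p_type == PT_LOAD)
        {
          /* Check whether PC falls into this loadable segment.  */
          uintptr_t start = info->dlpi_addr + phdr->p_vaddr;
          if (lookup->pc >= start && lookup->pc - start < phdr->p_memsz)
            contains_pc = 1;
        }
      else if (phdr->p_type == PT_GNU_EH_FRAME)
        eh_phdr = phdr;
    }

  if (!contains_pc)
    return 0;   /* Not this object; keep iterating.  */
  if (eh_phdr != NULL)
    lookup->eh_frame = (void *) (info->dlpi_addr + eh_phdr->p_vaddr);
  return 1;     /* Found the object containing PC; stop.  */
}

/* Returns NULL if no object contains PC or the object has no
   PT_GNU_EH_FRAME segment.  */
static void *
slow_find_eh_frame (void *pc)
{
  struct eh_lookup lookup = { .pc = (uintptr_t) pc, .eh_frame = NULL };
  dl_iterate_phdr (eh_lookup_callback, &lookup);
  return lookup.eh_frame;
}
```

Every call to slow_find_eh_frame visits objects one by one until it finds
the one containing PC, so the cost grows with the number of loaded objects.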

dl_iterate_phdr is declared in link.h and without the _ prefix, shouldn't
dl_find_eh_frame follow suit and be declared in the same header and
also without the prefix?
Also, shouldn't the DL_FIND_EH_FRAME_DBASE macro on the other side have
__ prefix?  We have one DL_* macro, DL_CALL_FCT, so perhaps it is fine
for -D_GNU_SOURCE, but various other projects do use macros with DL_*
prefix, like boost or python.

Jakub



Re: [PATCH 3/3] elf: Add _dl_find_eh_frame function

2021-11-18 Thread Florian Weimer via Gcc-patches
* Jakub Jelinek:

> On Wed, Nov 03, 2021 at 05:28:02PM +0100, Florian Weimer wrote:
>> This function is similar to __gnu_Unwind_Find_exidx as used on arm.
>> It can be used to speed up the libgcc unwinder.
>
> I'm a little bit worried that this trades the speed of exceptions for
> speed of dlopen/dlclose and extra memory use in each process.
> I admit I haven't been paying close attention to how many shared libraries
> apps typically link against and how many dlopen/dlclose calls they do
> in the last decade and a half, but I'd think more applications don't use
> exceptions than do use them, and many of those that do use them
> don't use them for really exceptional cases, so speeding those up
> is a good thing.

dlopen has many sources of quadratic behavior already, and many involve
chasing pointers.  The new data structure is very compact, so the new
work during dlopen does not show up prominently in profiles.

> So I wonder: could this overhead be incurred lazily?  When _dl_find_eh_frame
> is called for the first time, just take the rtld lock and prepare everything
> you currently populate from process startup and on every dlopen/dlclose
> before that first call, and only from then on keep it updated on
> dlopen/dlclose.

I think it's possible to do this lazily (except the memory allocation).
But I don't want to do this unless we have performance numbers that
suggest it is actually required.

> Thus, for the expected majority of apps that aren't using exceptions at all
> nothing would change for dlopen/dlclose overhead, while all but the first
> _dl_find_eh_frame would be faster and with no locking?

One thing I'd like to do is to use the data structure in
_dl_find_dso_for_object, and that is actually called during dlopen to
determine the caller DSO.  _dl_find_dso_for_object can show up in
profiles with a lot of dlopen calls, particularly if an object loaded
later calls dlopen, so that the current implementation takes more time
to find the object.  _dl_find_dso_for_object is also used in dlsym,
although we skip it if the caller passes an explicit handle (but
RTLD_DEFAULT, RTLD_NEXT, etc. definitely need it).

We can also replace the soname and file identity lookup with a hash
table.  *That* will definitely recover any losses from
_dl_find_eh_frame_update.  In my profiles strcmp always shows up higher
than _dl_find_eh_frame_update.

Thanks,
Florian



Re: [PATCH 3/3] elf: Add _dl_find_eh_frame function

2021-11-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Nov 03, 2021 at 05:28:02PM +0100, Florian Weimer wrote:
> This function is similar to __gnu_Unwind_Find_exidx as used on arm.
> It can be used to speed up the libgcc unwinder.

I'm a little bit worried that this trades the speed of exceptions for
speed of dlopen/dlclose and extra memory use in each process.
I admit I haven't been paying close attention to how many shared libraries
apps typically link against and how many dlopen/dlclose calls they do
in the last decade and a half, but I'd think more applications don't use
exceptions than do use them, and many of those that do use them
don't use them for really exceptional cases, so speeding those up
is a good thing.
So I wonder: could this overhead be incurred lazily?  When _dl_find_eh_frame
is called for the first time, just take the rtld lock and prepare everything
you currently populate from process startup and on every dlopen/dlclose
before that first call, and only from then on keep it updated on
dlopen/dlclose.
Thus, for the expected majority of apps that aren't using exceptions at all
nothing would change for dlopen/dlclose overhead, while all but the first
_dl_find_eh_frame would be faster and with no locking?
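
To make the proposal concrete, here is a rough sketch of such lazy
initialization.  Everything in it is a hypothetical stand-in rather than
glibc code: loader_lock plays the role of the rtld lock, loaded_objects
models what dlopen/dlclose track anyway, and table_entries stands for the
lookup data that _dl_find_eh_frame would consult.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* All names here are hypothetical stand-ins, not glibc internals.  */
static pthread_mutex_t loader_lock = PTHREAD_MUTEX_INITIALIZER;
static int loaded_objects;        /* Loader state, protected by loader_lock.  */
static atomic_bool table_ready;   /* Set once the first lookup has happened.  */
static atomic_int table_entries;  /* Stand-in for the lookup data.  */

/* Rebuild the lookup data from the loader state.  The caller must hold
   loader_lock.  */
static void
rebuild_table_locked (void)
{
  assert (loaded_objects >= 0);
  atomic_store_explicit (&table_entries, loaded_objects,
                         memory_order_release);
}

/* Models dlopen (DELTA > 0) or dlclose (DELTA < 0).  The lookup data is
   only maintained once some lookup has asked for it.  */
static void
object_count_changed (int delta)
{
  pthread_mutex_lock (&loader_lock);
  loaded_objects += delta;
  if (atomic_load_explicit (&table_ready, memory_order_relaxed))
    rebuild_table_locked ();
  pthread_mutex_unlock (&loader_lock);
}

/* Models the lookup function.  The first call pays for the initial build
   under the lock; later calls only read.  */
static int
lookup_entry_count (void)
{
  if (!atomic_load_explicit (&table_ready, memory_order_acquire))
    {
      pthread_mutex_lock (&loader_lock);
      if (!atomic_load_explicit (&table_ready, memory_order_relaxed))
        {
          rebuild_table_locked ();
          atomic_store_explicit (&table_ready, true, memory_order_release);
        }
      pthread_mutex_unlock (&loader_lock);
    }
  return atomic_load_explicit (&table_entries, memory_order_acquire);
}
```

In this model a process that never performs a lookup never builds the
table, which is the saving being asked about.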

Jakub



Re: [PATCH 3/3] elf: Add _dl_find_eh_frame function

2021-11-17 Thread Florian Weimer via Gcc-patches
* Adhemerval Zanella via Libc-alpha:

> However the code is somewhat complex and I would like to have some feedback
> on whether GCC will be willing to accept this change (I assume it would
> require this code to be merged into glibc beforehand).

There's a long review queue on the GCC side due to the stage1 close.
It may still be considered for GCC 12.  Jakub has also requested that
we hold off committing the glibc side until the GCC side is reviewed.

I'll flesh out the commit message and NEWS entry once we have agreed
upon the interface.

>> new file mode 100644
>> index 00..c7313c122d
>> --- /dev/null
>> +++ b/elf/dl-find_eh_frame.c

>> +/* Data for the main executable.  There is usually a large gap between
>> +   the main executable and initially loaded shared objects.  Record
>> +   the main executable separately, to increase the chance that the
>> +   range for the non-closeable mappings below covers only the shared
>> +   objects (and not also the gap between main executable and shared
>> +   objects).  */
>> +static uintptr_t _dl_eh_main_map_start attribute_relro;
>> +static struct dl_eh_frame_info _dl_eh_main_info attribute_relro;
>> +
>> +/* Data for initally loaded shared objects that cannot be unlaoded.
>
> s/initally/initially and s/unlaoded/unloaded.

Fixed.

>
>> +   The mapping base addresses are stored in address order in the
>> +   _dl_eh_nodelete_mappings_bases array (containing
>> +   _dl_eh_nodelete_mappings_size elements).  The EH data for a base
>> +   address is stored in the parallel _dl_eh_nodelete_mappings_infos.
>> +   These arrays are not modified after initialization.  */
>> +static uintptr_t _dl_eh_nodelete_mappings_end attribute_relro;
>> +static size_t _dl_eh_nodelete_mappings_size attribute_relro;
>> +static uintptr_t *_dl_eh_nodelete_mappings_bases attribute_relro;
>> +static struct dl_eh_frame_info *_dl_eh_nodelete_mappings_infos
>> +  attribute_relro;
>> +
>> +/* Mappings created by dlopen can go away with dlclose, so a data
>> +   dynamic data structure with some synchronization is needed.
>
> This sounds strange ("a data dynamic data").

I dropped the first data.

>
>> +   Individual segments are similar to the _dl_eh_nodelete_mappings
>
> Maybe use _dl_eh_nodelete_mappings_*, because '_dl_eh_nodelete_mappings'
> itself is not defined anywhere.

Right.

>> +   Adding new elements to this data structure is another source of
>> +   quadratic behavior for dlopen.  If the other causes of quadratic
>> +   behavior are eliminated, a more complicated data structure will be
>> +   needed.  */
>
> This worries me, especially since we have reports that Python and other
> dynamic environments use a lot of plugins and generate a lot of dlopen()
> calls.  What kind of performance implications do you foresee here?

The additional overhead is not disproportionate to the other sources of
quadratic behavior.  With 1,000 dlopen'ed objects, overall run-time
seems to be comparable to the strcmp time required for soname matching, for
example, and is quite difficult to measure.  So we could fix the
performance regression if we used a hash table for that …

It's just an undesirable complexity class.  The implementation is not
actually slow because it's a mostly-linear copy (although a backwards
one).  Other parts of dlopen involve pointer chasing and are much
slower.
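
As an illustration of why this is quadratic in complexity yet cheap in
practice, here is a minimal sketch of inserting into a sorted array by
copying elements backwards.  It is not the code from the patch;
sorted_insert and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: insert BASE into the
   sorted array BASES, which currently holds *COUNT elements.  The caller
   guarantees that there is room for one more element.  */
static void
sorted_insert (uintptr_t *bases, size_t *count, uintptr_t base)
{
  assert (bases != NULL && count != NULL);
  size_t i = *count;
  /* Walk backwards from the end, shifting larger elements up by one
     slot.  This is a contiguous copy without any pointer chasing.  */
  while (i > 0 && bases[i - 1] > base)
    {
      bases[i] = bases[i - 1];
      --i;
    }
  bases[i] = base;
  ++*count;
}
```

Each insertion moves at most *COUNT elements, so N insertions cost O(N^2)
element moves in the worst case, but every move is a sequential memory
access.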

>> +/* Allocate an empty segment that is at least SIZE large.  PREVIOUS */
>
> What does this PREVIOUS refer to?

Oops, it's now:

/* Allocate an empty segment that is at least SIZE large.  PREVIOUS
   points to the chain of previously allocated segments and can be
   NULL.  */

>> +/* Update the version to reflect that an update is happening.  This
>> +   does not change the bit that controls the active segment chain.
>> +   Returns the index of the currently active segment chain.  */
>> +static inline unsigned int
>> +_dl_eh_mappings_begin_update (void)
>> +{
>> +  unsigned int v
>> +    = __atomic_wide_counter_fetch_add_relaxed (&_dl_eh_loaded_mappings_version,
>> +                                               2);
>
> Why use an 'unsigned int' for the wide counter here?

Because …

>> +  /* Subsequent stores to the TM data must not be reordered before the
>> +     store above with the version update.  */
>> +  atomic_thread_fence_release ();
>> +  return v & 1;
>> +}

… we only need the lower bit.
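
To show how a version counter whose low bit selects the active copy can
work, here is a simplified sketch using C11 atomics.  It is not the code
from the patch.  begin_update mirrors the quoted function, while
end_update, write_value, read_value and the two-element copies array are
assumptions made for illustration, and writers are assumed to be
serialized by a lock elsewhere.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint64_t mappings_version;  /* Low bit: index of active copy.  */
static _Atomic uintptr_t copies[2];        /* Two copies of the protected datum.  */

/* Adding 2 tells readers that an update has started without changing
   which copy is active.  Returns the index of the active copy.  */
static unsigned int
begin_update (void)
{
  uint64_t v = atomic_fetch_add_explicit (&mappings_version, 2,
                                          memory_order_relaxed);
  /* Later stores to the inactive copy must not be reordered before the
     version update.  */
  atomic_thread_fence (memory_order_release);
  return v & 1;
}

/* Assumed counterpart: adding 1 flips the low bit and thereby publishes
   the copy that the writer has just filled in.  */
static void
end_update (void)
{
  atomic_thread_fence (memory_order_release);
  atomic_fetch_add_explicit (&mappings_version, 1, memory_order_relaxed);
}

static void
write_value (uintptr_t value)
{
  unsigned int active = begin_update ();
  assert (active <= 1);
  /* Only the inactive copy is modified.  */
  atomic_store_explicit (&copies[!active], value, memory_order_relaxed);
  end_update ();
}

/* Lock-free read: retry if the version changed while reading.  */
static uintptr_t
read_value (void)
{
  for (;;)
    {
      uint64_t start = atomic_load_explicit (&mappings_version,
                                             memory_order_acquire);
      uintptr_t value = atomic_load_explicit (&copies[start & 1],
                                              memory_order_relaxed);
      atomic_thread_fence (memory_order_acquire);
      if (atomic_load_explicit (&mappings_version, memory_order_relaxed)
          == start)
        return value;
    }
}
```

Since only v & 1 is ever used as an index, narrowing the counter value to
unsigned int loses nothing that matters for selecting the copy.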

>> +  /* Other initially loaded objects.  */
>> +  if (pc >= *_dl_eh_nodelete_mappings_bases
>> +      && pc < _dl_eh_nodelete_mappings_end)
>> +    {
>> +      size_t idx = _dl_eh_find_lower_bound (pc,
>> +                                            _dl_eh_nodelete_mappings_bases,
>> +                                            _dl_eh_nodelete_mappings_size);
>> +      const struct dl_eh_frame_info *info
>> +        = _dl_eh_nodelete_mappings_infos + idx;
>
> Isn't it UB if idx is not a valid one?

idx is always valid here.
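
A sketch of the lookup may help here.  The body of _dl_eh_find_lower_bound
is not part of the quoted excerpt, so the version below assumes the usual
lower-bound semantics (index of the first base that is not less than PC);
find_lower_bound and find_mapping_index are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed semantics: return the index of the first element of the sorted
   array BASES that is >= PC, or SIZE if there is no such element.  */
static size_t
find_lower_bound (uintptr_t pc, const uintptr_t *bases, size_t size)
{
  size_t lo = 0;
  size_t hi = size;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (bases[mid] < pc)
        lo = mid + 1;
      else
        hi = mid;
    }
  return lo;
}

/* Return the index of the mapping with the largest base that is <= PC.
   The caller must already have checked PC >= bases[0], as the range
   check in the quoted code does.  */
static size_t
find_mapping_index (uintptr_t pc, const uintptr_t *bases, size_t size)
{
  assert (size > 0 && pc >= bases[0]);
  size_t idx = find_lower_bound (pc, bases, size);
  if (idx < size && bases[idx] == pc)
    return idx;     /* PC is exactly a mapping base.  */
  return idx - 1;   /* idx >= 1 here because pc > bases[0].  */
}
```

Under that assumption, the preceding range check is what guarantees that
the fallback to the previous element never underflows.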

>> +      bool match;
>> +      if (idx < _dl_eh_nodelete_mappings_size
>> +          && pc == _dl_eh_nodelete_mappings_base

Re: [PATCH 3/3] elf: Add _dl_find_eh_frame function

2021-11-16 Thread Adhemerval Zanella via Gcc-patches



On 03/11/2021 13:28, Florian Weimer via Gcc-patches wrote:
> This function is similar to __gnu_Unwind_Find_exidx as used on arm.
> It can be used to speed up the libgcc unwinder.

Apart from the terse patch description, the design seems OK for accomplishing
lock-free reads and updates.  There are some questions and remarks below,
and I still need to review the tests.

However the code is somewhat complex and I would like to have some feedback
on whether GCC will be willing to accept this change (I assume it would
require this code to be merged into glibc beforehand).

> ---
>  NEWS  |   4 +
>  bits/dlfcn_eh_frame.h |  33 +
>  dlfcn/Makefile|   2 +-
>  dlfcn/dlfcn.h |   2 +
>  elf/Makefile  |  31 +-
>  elf/Versions  |   3 +
>  elf/dl-close.c|   4 +
>  elf/dl-find_eh_frame.c| 864 ++
>  elf/dl-find_eh_frame.h|  90 ++
>  elf/dl-find_eh_frame_slow.h   |  55 ++
>  elf/dl-libc_freeres.c |   2 +
>  elf/dl-open.c |   5 +
>  elf/rtld.c|   7 +
>  elf/tst-dl_find_eh_frame-mod1.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod2.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod3.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod4.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod5.c   |  11 +
>  elf/tst-dl_find_eh_frame-mod6.c   |  11 +
>  elf/tst-dl_find_eh_frame-mod7.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod8.c   |  10 +
>  elf/tst-dl_find_eh_frame-mod9.c   |  10 +
>  elf/tst-dl_find_eh_frame-threads.c| 237 +
>  elf/tst-dl_find_eh_frame.c| 179 
>  include/atomic_wide_counter.h |  14 +
>  include/bits/dlfcn_eh_frame.h |   1 +
>  include/link.h|   3 +
>  manual/Makefile   |   2 +-
>  manual/dynlink.texi   |  69 ++
>  manual/libdl.texi |  10 -
>  manual/probes.texi|   2 +-
>  manual/threads.texi   |   2 +-
>  sysdeps/i386/bits/dlfcn_eh_frame.h|  34 +
>  sysdeps/mach/hurd/i386/ld.abilist |   1 +
>  sysdeps/nios2/bits/dlfcn_eh_frame.h   |  34 +
>  sysdeps/unix/sysv/linux/aarch64/ld.abilist|   1 +
>  sysdeps/unix/sysv/linux/alpha/ld.abilist  |   1 +
>  sysdeps/unix/sysv/linux/arc/ld.abilist|   1 +
>  sysdeps/unix/sysv/linux/arm/be/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/arm/le/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/csky/ld.abilist   |   1 +
>  sysdeps/unix/sysv/linux/hppa/ld.abilist   |   1 +
>  sysdeps/unix/sysv/linux/i386/ld.abilist   |   1 +
>  sysdeps/unix/sysv/linux/ia64/ld.abilist   |   1 +
>  .../unix/sysv/linux/m68k/coldfire/ld.abilist  |   1 +
>  .../unix/sysv/linux/m68k/m680x0/ld.abilist|   1 +
>  sysdeps/unix/sysv/linux/microblaze/ld.abilist |   1 +
>  .../unix/sysv/linux/mips/mips32/ld.abilist|   1 +
>  .../sysv/linux/mips/mips64/n32/ld.abilist |   1 +
>  .../sysv/linux/mips/mips64/n64/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/nios2/ld.abilist  |   1 +
>  .../sysv/linux/powerpc/powerpc32/ld.abilist   |   1 +
>  .../linux/powerpc/powerpc64/be/ld.abilist |   1 +
>  .../linux/powerpc/powerpc64/le/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/riscv/rv32/ld.abilist |   1 +
>  sysdeps/unix/sysv/linux/riscv/rv64/ld.abilist |   1 +
>  .../unix/sysv/linux/s390/s390-32/ld.abilist   |   1 +
>  .../unix/sysv/linux/s390/s390-64/ld.abilist   |   1 +
>  sysdeps/unix/sysv/linux/sh/be/ld.abilist  |   1 +
>  sysdeps/unix/sysv/linux/sh/le/ld.abilist  |   1 +
>  .../unix/sysv/linux/sparc/sparc32/ld.abilist  |   1 +
>  .../unix/sysv/linux/sparc/sparc64/ld.abilist  |   1 +
>  sysdeps/unix/sysv/linux/x86_64/64/ld.abilist  |   1 +
>  sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist |   1 +
>  64 files changed, 1795 insertions(+), 16 deletions(-)
>  create mode 100644 bits/dlfcn_eh_frame.h
>  create mode 100644 elf/dl-find_eh_frame.c
>  create mode 100644 elf/dl-find_eh_frame.h
>  create mode 100644 elf/dl-find_eh_frame_slow.h
>  create mode 100644 elf/tst-dl_find_eh_frame-mod1.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod2.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod3.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod4.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod5.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod6.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod7.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod8.c
>  create mode 100644 elf/tst-dl_find_eh_frame-mod9.c
>  create mode 100644 elf/tst-d

[PATCH 3/3] elf: Add _dl_find_eh_frame function

2021-11-03 Thread Florian Weimer via Gcc-patches
This function is similar to __gnu_Unwind_Find_exidx as used on arm.
It can be used to speed up the libgcc unwinder.
---
 NEWS  |   4 +
 bits/dlfcn_eh_frame.h |  33 +
 dlfcn/Makefile|   2 +-
 dlfcn/dlfcn.h |   2 +
 elf/Makefile  |  31 +-
 elf/Versions  |   3 +
 elf/dl-close.c|   4 +
 elf/dl-find_eh_frame.c| 864 ++
 elf/dl-find_eh_frame.h|  90 ++
 elf/dl-find_eh_frame_slow.h   |  55 ++
 elf/dl-libc_freeres.c |   2 +
 elf/dl-open.c |   5 +
 elf/rtld.c|   7 +
 elf/tst-dl_find_eh_frame-mod1.c   |  10 +
 elf/tst-dl_find_eh_frame-mod2.c   |  10 +
 elf/tst-dl_find_eh_frame-mod3.c   |  10 +
 elf/tst-dl_find_eh_frame-mod4.c   |  10 +
 elf/tst-dl_find_eh_frame-mod5.c   |  11 +
 elf/tst-dl_find_eh_frame-mod6.c   |  11 +
 elf/tst-dl_find_eh_frame-mod7.c   |  10 +
 elf/tst-dl_find_eh_frame-mod8.c   |  10 +
 elf/tst-dl_find_eh_frame-mod9.c   |  10 +
 elf/tst-dl_find_eh_frame-threads.c| 237 +
 elf/tst-dl_find_eh_frame.c| 179 
 include/atomic_wide_counter.h |  14 +
 include/bits/dlfcn_eh_frame.h |   1 +
 include/link.h|   3 +
 manual/Makefile   |   2 +-
 manual/dynlink.texi   |  69 ++
 manual/libdl.texi |  10 -
 manual/probes.texi|   2 +-
 manual/threads.texi   |   2 +-
 sysdeps/i386/bits/dlfcn_eh_frame.h|  34 +
 sysdeps/mach/hurd/i386/ld.abilist |   1 +
 sysdeps/nios2/bits/dlfcn_eh_frame.h   |  34 +
 sysdeps/unix/sysv/linux/aarch64/ld.abilist|   1 +
 sysdeps/unix/sysv/linux/alpha/ld.abilist  |   1 +
 sysdeps/unix/sysv/linux/arc/ld.abilist|   1 +
 sysdeps/unix/sysv/linux/arm/be/ld.abilist |   1 +
 sysdeps/unix/sysv/linux/arm/le/ld.abilist |   1 +
 sysdeps/unix/sysv/linux/csky/ld.abilist   |   1 +
 sysdeps/unix/sysv/linux/hppa/ld.abilist   |   1 +
 sysdeps/unix/sysv/linux/i386/ld.abilist   |   1 +
 sysdeps/unix/sysv/linux/ia64/ld.abilist   |   1 +
 .../unix/sysv/linux/m68k/coldfire/ld.abilist  |   1 +
 .../unix/sysv/linux/m68k/m680x0/ld.abilist|   1 +
 sysdeps/unix/sysv/linux/microblaze/ld.abilist |   1 +
 .../unix/sysv/linux/mips/mips32/ld.abilist|   1 +
 .../sysv/linux/mips/mips64/n32/ld.abilist |   1 +
 .../sysv/linux/mips/mips64/n64/ld.abilist |   1 +
 sysdeps/unix/sysv/linux/nios2/ld.abilist  |   1 +
 .../sysv/linux/powerpc/powerpc32/ld.abilist   |   1 +
 .../linux/powerpc/powerpc64/be/ld.abilist |   1 +
 .../linux/powerpc/powerpc64/le/ld.abilist |   1 +
 sysdeps/unix/sysv/linux/riscv/rv32/ld.abilist |   1 +
 sysdeps/unix/sysv/linux/riscv/rv64/ld.abilist |   1 +
 .../unix/sysv/linux/s390/s390-32/ld.abilist   |   1 +
 .../unix/sysv/linux/s390/s390-64/ld.abilist   |   1 +
 sysdeps/unix/sysv/linux/sh/be/ld.abilist  |   1 +
 sysdeps/unix/sysv/linux/sh/le/ld.abilist  |   1 +
 .../unix/sysv/linux/sparc/sparc32/ld.abilist  |   1 +
 .../unix/sysv/linux/sparc/sparc64/ld.abilist  |   1 +
 sysdeps/unix/sysv/linux/x86_64/64/ld.abilist  |   1 +
 sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist |   1 +
 64 files changed, 1795 insertions(+), 16 deletions(-)
 create mode 100644 bits/dlfcn_eh_frame.h
 create mode 100644 elf/dl-find_eh_frame.c
 create mode 100644 elf/dl-find_eh_frame.h
 create mode 100644 elf/dl-find_eh_frame_slow.h
 create mode 100644 elf/tst-dl_find_eh_frame-mod1.c
 create mode 100644 elf/tst-dl_find_eh_frame-mod2.c
 create mode 100644 elf/tst-dl_find_eh_frame-mod3.c
 create mode 100644 elf/tst-dl_find_eh_frame-mod4.c
 create mode 100644 elf/tst-dl_find_eh_frame-mod5.c
 create mode 100644 elf/tst-dl_find_eh_frame-mod6.c
 create mode 100644 elf/tst-dl_find_eh_frame-mod7.c
 create mode 100644 elf/tst-dl_find_eh_frame-mod8.c
 create mode 100644 elf/tst-dl_find_eh_frame-mod9.c
 create mode 100644 elf/tst-dl_find_eh_frame-threads.c
 create mode 100644 elf/tst-dl_find_eh_frame.c
 create mode 100644 include/bits/dlfcn_eh_frame.h
 create mode 100644 manual/dynlink.texi
 delete mode 100644 manual/libdl.texi
 create mode 100644 sysdeps/i386/bits/dlfcn_eh_frame.h
 create mode 100644 sysdeps/nios2/bits/dlfcn_eh_frame.h

diff --git a/NEWS b/NEWS
index 82b7016aef..68c9c21458 100644
--- a/NEWS
+++ b/NEWS
@@ -64,6 +64,10 @@ Major new features:
   to be used by compilers for optimizing usage of 'memcmp' when its
   return value is only used for its boolean status.
 
+* The function _dl_find_eh_frame has be