Re: Update on the glibc segfault issue on Alpha

2023-01-05 Thread John Paul Adrian Glaubitz

Hi Adhemveral!

On 1/4/23 13:00, Adhemerval Zanella Netto wrote:

I can confirm that this patch fixes the problem for me. I have uploaded a 
manually built glibc
package with the patch applied to Debian »unreleased« so that the buildds can 
resume building
packages again.

Would it be possible to backport this patch to 2.36 once it has been merged 
with the current
development tree? The reason is that I don't think that Debian is going to 
switch to anything
beyond 2.36 anytime soon due to the upcoming feature freeze in January.


I have sent a more detailed patch [1] and I will probably backport to all 
affected releases
once it is accepted.


Great, thanks a lot! This means, the fix will eventually be available in Debian 
and we won't have
to carry the patch for a long time.

Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: Update on the glibc segfault issue on Alpha

2023-01-04 Thread Adhemerval Zanella Netto



On 04/01/23 06:25, John Paul Adrian Glaubitz wrote:
> Hi Adhemerval!
> 
> On 1/3/23 13:09, Adhemerval Zanella Netto wrote:
>> Thanks, this commits helps narrow down the issue.  The 73fc4e28b9464f0e 
>> refactor did not
>> add the GL(dl_phdr) and GL(dl_phnum) for static case, relying on the 
>> __ehdr_start symbol
>> to get the correct values.
>>
>> The issue is for some archs, alpha for instance, the hidden weak reference 
>> is not making
>> the static linker to define the __ehdr_start address correctly: it is being 
>> set to 0 and
>> thus GL(dl_phdr) and GL(dl_phnum) are set to invalid values.
>>
>> And I am not sure if the hidden weak __ehdr_start does work on all 
>> architectures, so I
>> think it would be safer to just restore the previous behavior to setup 
>> GL(dl_phdr) and
>> GL(dl_phnum) for static and we can simplify __ehdr_start fallback case to 
>> not use
>> a weak ref (as for PIE). I am checking if the following patch trigger any 
>> regression,
>> at least for alpha it fixes the static failures:
>>
>> diff --git a/csu/libc-start.c b/csu/libc-start.c
>> index 543560f36c..63a3eceaea 100644
>> --- a/csu/libc-start.c
>> +++ b/csu/libc-start.c
>> @@ -271,18 +271,10 @@ LIBC_START_MAIN (int (*main) (int, char **, char ** 
>> MAIN_AUXVEC_DECL),
>>    So we can set up _dl_phdr and _dl_phnum even without any
>>    information from auxv.  */
>>
>> -  extern const ElfW(Ehdr) __ehdr_start
>> -# if BUILD_PIE_DEFAULT
>> -   __attribute__ ((visibility ("hidden")));
>> -# else
>> -   __attribute__ ((weak, visibility ("hidden")));
>> -  if (&__ehdr_start != NULL)
>> -# endif
>> -    {
>> -  assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
>> -  GL(dl_phdr) = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
>> -  GL(dl_phnum) = __ehdr_start.e_phnum;
>> -    }
>> +  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
>> +  assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
>> +  GL(dl_phdr) = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
>> +  GL(dl_phnum) = __ehdr_start.e_phnum;
>>   }
>>
>>     __tunables_init (__environ);
>> diff --git a/sysdeps/unix/sysv/linux/dl-parse_auxv.h 
>> b/sysdeps/unix/sysv/linux/dl-parse_auxv.h
>> index bf9374371e..5913c9d6e5 100644
>> --- a/sysdeps/unix/sysv/linux/dl-parse_auxv.h
>> +++ b/sysdeps/unix/sysv/linux/dl-parse_auxv.h
>> @@ -56,6 +56,10 @@ void _dl_parse_auxv (ElfW(auxv_t) *av, dl_parse_auxv_t 
>> auxv_values)
>>     if (GLRO(dl_sysinfo_dso) != NULL)
>>   GLRO(dl_sysinfo) = auxv_values[AT_SYSINFO];
>>   #endif
>> +#ifndef SHARED
>> +  GL(dl_phdr) = (void*) auxv_values[AT_PHDR];
>> +  GL(dl_phnum) = auxv_values[AT_PHENT];
>> +#endif
>>
>>     DL_PLATFORM_AUXV
>>   }
> 
> I can confirm that this patch fixes the problem for me. I have uploaded a 
> manually built glibc
> package with the patch applied to Debian »unreleased« so that the buildds can 
> resume building
> packages again.
> 
> Would it be possible to backport this patch to 2.36 once it has been merged 
> with the current
> development tree? The reason is that I don't think that Debian is going to 
> switch to anything
> beyond 2.36 anytime soon due to the upcoming feature freeze in January.

I have sent a more detailed patch [1] and I will probably backport to all 
affected releases
once it is accepted.



Re: Update on the glibc segfault issue on Alpha

2023-01-04 Thread John Paul Adrian Glaubitz

Hi Adhemerval!

On 1/3/23 13:09, Adhemerval Zanella Netto wrote:

Thanks, this commits helps narrow down the issue.  The 73fc4e28b9464f0e 
refactor did not
add the GL(dl_phdr) and GL(dl_phnum) for static case, relying on the 
__ehdr_start symbol
to get the correct values.

The issue is for some archs, alpha for instance, the hidden weak reference is 
not making
the static linker to define the __ehdr_start address correctly: it is being set 
to 0 and
thus GL(dl_phdr) and GL(dl_phnum) are set to invalid values.

And I am not sure if the hidden weak __ehdr_start does work on all 
architectures, so I
think it would be safer to just restore the previous behavior to setup 
GL(dl_phdr) and
GL(dl_phnum) for static and we can simplify __ehdr_start fallback case to not 
use
a weak ref (as for PIE). I am checking if the following patch trigger any 
regression,
at least for alpha it fixes the static failures:

diff --git a/csu/libc-start.c b/csu/libc-start.c
index 543560f36c..63a3eceaea 100644
--- a/csu/libc-start.c
+++ b/csu/libc-start.c
@@ -271,18 +271,10 @@ LIBC_START_MAIN (int (*main) (int, char **, char ** 
MAIN_AUXVEC_DECL),
   So we can set up _dl_phdr and _dl_phnum even without any
   information from auxv.  */

-  extern const ElfW(Ehdr) __ehdr_start
-# if BUILD_PIE_DEFAULT
-   __attribute__ ((visibility ("hidden")));
-# else
-   __attribute__ ((weak, visibility ("hidden")));
-  if (&__ehdr_start != NULL)
-# endif
-{
-  assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
-  GL(dl_phdr) = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
-  GL(dl_phnum) = __ehdr_start.e_phnum;
-}
+  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
+  assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
+  GL(dl_phdr) = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
+  GL(dl_phnum) = __ehdr_start.e_phnum;
  }

__tunables_init (__environ);
diff --git a/sysdeps/unix/sysv/linux/dl-parse_auxv.h 
b/sysdeps/unix/sysv/linux/dl-parse_auxv.h
index bf9374371e..5913c9d6e5 100644
--- a/sysdeps/unix/sysv/linux/dl-parse_auxv.h
+++ b/sysdeps/unix/sysv/linux/dl-parse_auxv.h
@@ -56,6 +56,10 @@ void _dl_parse_auxv (ElfW(auxv_t) *av, dl_parse_auxv_t 
auxv_values)
if (GLRO(dl_sysinfo_dso) != NULL)
  GLRO(dl_sysinfo) = auxv_values[AT_SYSINFO];
  #endif
+#ifndef SHARED
+  GL(dl_phdr) = (void*) auxv_values[AT_PHDR];
+  GL(dl_phnum) = auxv_values[AT_PHENT];
+#endif

DL_PLATFORM_AUXV
  }


I can confirm that this patch fixes the problem for me. I have uploaded a 
manually built glibc
package with the patch applied to Debian »unreleased« so that the buildds can 
resume building
packages again.

Would it be possible to backport this patch to 2.36 once it has been merged 
with the current
development tree? The reason is that I don't think that Debian is going to 
switch to anything
beyond 2.36 anytime soon due to the upcoming feature freeze in January.

Thanks,
Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913



Re: Update on the glibc segfault issue on Alpha

2023-01-03 Thread Florian Weimer
* Adhemerval Zanella Netto:

> diff --git a/csu/libc-start.c b/csu/libc-start.c
> index 543560f36c..63a3eceaea 100644
> --- a/csu/libc-start.c
> +++ b/csu/libc-start.c
> @@ -271,18 +271,10 @@ LIBC_START_MAIN (int (*main) (int, char **, char ** 
> MAIN_AUXVEC_DECL),
>   So we can set up _dl_phdr and _dl_phnum even without any
>   information from auxv.  */
>
> -  extern const ElfW(Ehdr) __ehdr_start
> -# if BUILD_PIE_DEFAULT
> -   __attribute__ ((visibility ("hidden")));
> -# else
> -   __attribute__ ((weak, visibility ("hidden")));
> -  if (&__ehdr_start != NULL)
> -# endif
> -{
> -  assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
> -  GL(dl_phdr) = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
> -  GL(dl_phnum) = __ehdr_start.e_phnum;
> -}
> +  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
> +  assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
> +  GL(dl_phdr) = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
> +  GL(dl_phnum) = __ehdr_start.e_phnum;

There's a separate thread that e.ph_off is not actually correct in this
context because it's a file offset that doesn't necessarily match the
virtual memory offset:

  [PATCH 0/1] __libc_start_main() now uses auxv to obtain phdr's address
  [BZ #29864]
  

I think this needs to be cleaned up so that the static and dynamic cases
are aligned.  That is, we probably want to do the equivalent of

  GL(dl_phdr) = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
  GL(dl_phnum) = __ehdr_start.e_phnum;

in common code.  Ideally, we don't use global variables for that because
in both cases, we only briefly need these variables.

Thanks,
Florian



Re: Update on the glibc segfault issue on Alpha

2023-01-03 Thread Adhemerval Zanella Netto



On 03/01/23 10:29, Florian Weimer wrote:
> * Adhemerval Zanella Netto:
> 
>> diff --git a/csu/libc-start.c b/csu/libc-start.c
>> index 543560f36c..63a3eceaea 100644
>> --- a/csu/libc-start.c
>> +++ b/csu/libc-start.c
>> @@ -271,18 +271,10 @@ LIBC_START_MAIN (int (*main) (int, char **, char ** 
>> MAIN_AUXVEC_DECL),
>>   So we can set up _dl_phdr and _dl_phnum even without any
>>   information from auxv.  */
>>
>> -  extern const ElfW(Ehdr) __ehdr_start
>> -# if BUILD_PIE_DEFAULT
>> -   __attribute__ ((visibility ("hidden")));
>> -# else
>> -   __attribute__ ((weak, visibility ("hidden")));
>> -  if (&__ehdr_start != NULL)
>> -# endif
>> -{
>> -  assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
>> -  GL(dl_phdr) = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
>> -  GL(dl_phnum) = __ehdr_start.e_phnum;
>> -}
>> +  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
>> +  assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
>> +  GL(dl_phdr) = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
>> +  GL(dl_phnum) = __ehdr_start.e_phnum;
> 
> There's a separate thread that e.ph_off is not actually correct in this
> context because it's a file offset that doesn't necessarily match the
> virtual memory offset:> 
>   [PATCH 0/1] __libc_start_main() now uses auxv to obtain phdr's address
>   [BZ #29864]
>   


Yes, but this is a fallback case where the kernel does not provide AT_PHDR
and AT_PHNUM.  As I added on the thread, I think it is better to use the
kernel provided value since they will always present and it leverages a
kernel fix for a similar issue [1].

> 
> I think this needs to be cleaned up so that the static and dynamic cases
> are aligned.  That is, we probably want to do the equivalent of
> 
>   GL(dl_phdr) = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
>   GL(dl_phnum) = __ehdr_start.e_phnum;
> 

I think we should use the __ehdr_start only as fallback.

> in common code.  Ideally, we don't use global variables for that because
> in both cases, we only briefly need these variables.

Right, this make sense.

> 
> Thanks,
> Florian
> 

[1] https://sourceware.org/pipermail/libc-alpha/2022-December/143942.html



Re: Update on the glibc segfault issue on Alpha

2023-01-03 Thread Adhemerval Zanella Netto



On 02/01/23 14:37, John Paul Adrian Glaubitz wrote:
> Hello!
> 
> On 1/2/23 12:44, John Paul Adrian Glaubitz wrote:
>> Adhemerval from glibc upstream is aware of the problem but he has not found 
>> yet a solution
>> to this issue as it needs to be debugged further. I will try to bisect which 
>> particular
>> introduced the regression and file a new upstream bug report.
>>
>> During our discussion, Adhemerval pointed out that this change [3] might be 
>> the culprit
>> but I have not been able to verify this yet.
> 
> My latest bisecting has shown this change to be the responsible change, 
> CC'ing the author:
> 
> commit 73fc4e28b9464f0e13edc719a5372839970e7ddb (refs/bisect/bad)
> Author: Florian Weimer 
> Date:   Mon Feb 28 11:50:41 2022 +0100
> 
>     Linux: Consolidate auxiliary vector parsing (redo)
>         And optimize it slightly.
>         This is commit 8c8510ab2790039e58995ef3a22309582413d3ff revised.
>         In _dl_aux_init in elf/dl-support.c, use an explicit loop
>     and -fno-tree-loop-distribute-patterns to avoid memset.
>         Reviewed-by: Szabolcs Nagy 

Thanks, this commits helps narrow down the issue.  The 73fc4e28b9464f0e 
refactor did not
add the GL(dl_phdr) and GL(dl_phnum) for static case, relying on the 
__ehdr_start symbol
to get the correct values.

The issue is for some archs, alpha for instance, the hidden weak reference is 
not making
the static linker to define the __ehdr_start address correctly: it is being set 
to 0 and
thus GL(dl_phdr) and GL(dl_phnum) are set to invalid values.

And I am not sure if the hidden weak __ehdr_start does work on all 
architectures, so I
think it would be safer to just restore the previous behavior to setup 
GL(dl_phdr) and 
GL(dl_phnum) for static and we can simplify __ehdr_start fallback case to not 
use 
a weak ref (as for PIE). I am checking if the following patch trigger any 
regression,
at least for alpha it fixes the static failures:

diff --git a/csu/libc-start.c b/csu/libc-start.c
index 543560f36c..63a3eceaea 100644
--- a/csu/libc-start.c
+++ b/csu/libc-start.c
@@ -271,18 +271,10 @@ LIBC_START_MAIN (int (*main) (int, char **, char ** 
MAIN_AUXVEC_DECL),
  So we can set up _dl_phdr and _dl_phnum even without any
  information from auxv.  */

-  extern const ElfW(Ehdr) __ehdr_start
-# if BUILD_PIE_DEFAULT
-   __attribute__ ((visibility ("hidden")));
-# else
-   __attribute__ ((weak, visibility ("hidden")));
-  if (&__ehdr_start != NULL)
-# endif
-{
-  assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
-  GL(dl_phdr) = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
-  GL(dl_phnum) = __ehdr_start.e_phnum;
-}
+  extern const ElfW(Ehdr) __ehdr_start attribute_hidden;
+  assert (__ehdr_start.e_phentsize == sizeof *GL(dl_phdr));
+  GL(dl_phdr) = (const void *) &__ehdr_start + __ehdr_start.e_phoff;
+  GL(dl_phnum) = __ehdr_start.e_phnum;
 }

   __tunables_init (__environ);
diff --git a/sysdeps/unix/sysv/linux/dl-parse_auxv.h 
b/sysdeps/unix/sysv/linux/dl-parse_auxv.h
index bf9374371e..5913c9d6e5 100644
--- a/sysdeps/unix/sysv/linux/dl-parse_auxv.h
+++ b/sysdeps/unix/sysv/linux/dl-parse_auxv.h
@@ -56,6 +56,10 @@ void _dl_parse_auxv (ElfW(auxv_t) *av, dl_parse_auxv_t 
auxv_values)
   if (GLRO(dl_sysinfo_dso) != NULL)
 GLRO(dl_sysinfo) = auxv_values[AT_SYSINFO];
 #endif
+#ifndef SHARED
+  GL(dl_phdr) = (void*) auxv_values[AT_PHDR];
+  GL(dl_phnum) = auxv_values[AT_PHENT];
+#endif

   DL_PLATFORM_AUXV
 }




Re: Update on the glibc segfault issue on Alpha

2023-01-02 Thread John Paul Adrian Glaubitz

Hello!

On 1/2/23 12:44, John Paul Adrian Glaubitz wrote:

Adhemerval from glibc upstream is aware of the problem but he has not found yet 
a solution
to this issue as it needs to be debugged further. I will try to bisect which 
particular
introduced the regression and file a new upstream bug report.

During our discussion, Adhemerval pointed out that this change [3] might be the 
culprit
but I have not been able to verify this yet.


My latest bisecting has shown this change to be the responsible change, CC'ing 
the author:

commit 73fc4e28b9464f0e13edc719a5372839970e7ddb (refs/bisect/bad)
Author: Florian Weimer 
Date:   Mon Feb 28 11:50:41 2022 +0100

Linux: Consolidate auxiliary vector parsing (redo)

And optimize it slightly.

This is commit 8c8510ab2790039e58995ef3a22309582413d3ff revised.

In _dl_aux_init in elf/dl-support.c, use an explicit loop

and -fno-tree-loop-distribute-patterns to avoid memset.

Reviewed-by: Szabolcs Nagy 


Adrian

--
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913