Sam James wrote:
> > It appears that the obstack gnulib module is the culprit.
I replied:
> Therefore, if it is a bug in gnulib, it is also a bug in glibc.
Sam was right. I was wrong. It is a bug in the 'obstack' gnulib module.
The scope
---------
It is visible on CPUs that are
- big-endian,
- 64-bit,
- not allowing unaligned accesses (this excludes powerpc64),
thus only on sparc64.
It is visible only on glibc systems. So, only on Linux/glibc/sparc64.
The workaround
--------------
Is attached.
The explanation
---------------
The code in tree.c does not set the alignment explicitly; it uses the
DEFAULT_ALIGNMENT, which is 16. ('long', 'double', 'void *' all have
an __alignof__ of 8, and 'long double' has an __alignof__ of 16).
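A minimal sketch of how such a default alignment can be derived from the types listed above. This is not the actual definition in obstack.c; the union name and the two helper functions are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical union of the types mentioned above.  The alignment of a
   union is the largest alignment of any of its members.  */
union roundup_types
{
  long l;
  double d;
  long double ld;
  void *p;
};

/* Alignment that satisfies every type in the union.  */
size_t
default_alignment (void)
{
  return _Alignof (union roundup_types);
}

/* The mask is the alignment minus one; this only makes sense when the
   alignment is a power of two.  */
size_t
default_alignment_mask (void)
{
  size_t a = default_alignment ();
  assert ((a & (a - 1)) == 0);
  return a - 1;
}
```

With the __alignof__ values quoted above (8, 8, 8 and 16), default_alignment () would be 16 and the mask 0xf. Other platforms can give different values.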
The code in tree.c is meant to use the code from gnulib's obstack.h
and obstack.c. obstack.c is compiled to libgnu_la-obstack.o, which is
then included in tp/Texinfo/XS/.libs/Parsetexi.so. Perl loads this
shared library via dlopen().
glibc and Parsetexi.so each define the obstack ABI symbols (_obstack_begin,
_obstack_begin_1, _obstack_newchunk, _obstack_free, etc.). Which of
the symbols "wins" at run time depends on the flags passed to dlopen().
Apparently perl does not pass the RTLD_DEEPBIND flag. Therefore the
symbols from glibc take precedence. So, tree.c uses
- the obstack.h *macros* from gnulib, but
- the obstack *functions* from glibc.
The obstack facilities from gnulib and from glibc are nearly the same.
They use a 'struct obstack' with very similar layout. The only relevant
difference is that (on a 64-bit platform)
- gnulib's struct obstack has a member
unsigned long alignment_mask;
- glibc's struct obstack instead has two members
unsigned int alignment_mask;
unsigned int __padding;
(The padding is there because the next member is a pointer, which
has alignment 8, see above.)
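To make the layout difference concrete, here is a sketch of what happens when the same 8 bytes are written through the glibc layout and read through the gnulib layout on a big-endian machine. The struct and function names are invented for this example, and the big-endian byte order is spelled out by hand so that the result is the same on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Only the members that differ, with fixed-width types (hypothetical names). */
struct glibc_part  { uint32_t alignment_mask; uint32_t padding; };
struct gnulib_part { uint64_t alignment_mask; };

/* Store a 32-bit value in big-endian byte order.  */
static void
put_be32 (unsigned char *p, uint32_t v)
{
  p[0] = (unsigned char) (v >> 24);
  p[1] = (unsigned char) (v >> 16);
  p[2] = (unsigned char) (v >> 8);
  p[3] = (unsigned char) v;
}

/* Load a 64-bit value in big-endian byte order.  */
static uint64_t
get_be64 (const unsigned char *p)
{
  uint64_t v = 0;
  for (int i = 0; i < 8; i++)
    v = (v << 8) | p[i];
  return v;
}

/* Write two 32-bit members, then read the same bytes as one 64-bit member.  */
uint64_t
mask_seen_through_gnulib_layout (uint32_t alignment_mask, uint32_t padding)
{
  unsigned char bytes[8];

  /* Both layouts occupy the same 8 bytes of the struct.  */
  assert (sizeof (struct glibc_part) == sizeof (struct gnulib_part));

  put_be32 (bytes, alignment_mask);   /* unsigned int alignment_mask;  */
  put_be32 (bytes + 4, padding);      /* unsigned int __padding;       */
  return get_be64 (bytes);            /* unsigned long alignment_mask; */
}
```

With alignment_mask = DEFAULT_ALIGNMENT - 1 = 0xf and a padding of 0, the function returns 0xf00000000. On a little-endian machine the two layouts would agree as long as the padding is 0, which is consistent with the scope described above.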
So, what happens is that during initialization of the obstack (variable
'obs_element' in tree.c) the glibc function _obstack_begin sets the
alignment_mask to DEFAULT_ALIGNMENT - 1, that is 0xf. The __padding member
is apparently 0.
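Then, in the function alloc_element of tree.c, obstack_alloc - which is a
macro! - uses the gnulib struct layout and reads alignment_mask as a single
64-bit member. On a big-endian CPU, 0xf lands in the upper 32 bits and the
zero padding in the lower 32 bits, so the macro sees the value 0xf00000000.
__PTR_ALIGN with an alignment mask of 0xf00000000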
does not change the bits 31..0 of the pointer. Since tree.c also contains
a call
obstack_alloc (&obs_element, sizeof (int))
a subsequent obstack_alloc returns a value that is ≡4 mod 8. tree.c then
attempts to access a pointer there, and this crashes with SIGBUS since the
address is not ≡0 mod 8.
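The arithmetic can be checked with a small sketch. It assumes that, on a 64-bit platform, the rounding amounts to (address + mask) & ~mask; the function names and the example base address 0x1000 are made up, and addresses are modelled as plain 64-bit integers.

```c
#include <assert.h>
#include <stdint.h>

/* Round ADDR up using MASK, in the style of the obstack alignment macro.  */
uint64_t
align_up (uint64_t addr, uint64_t mask)
{
  return (addr + mask) & ~mask;
}

/* Where the next object starts after SIZE bytes were allocated at BASE.  */
uint64_t
next_object_start (uint64_t base, uint64_t size, uint64_t mask)
{
  assert (size > 0);
  return align_up (base + size, mask);
}
```

With the intended mask, next_object_start (0x1000, 4, 0xf) gives 0x1010. With the misread mask, next_object_start (0x1000, 4, 0xf00000000) gives 0x1004: only bits 35..32 are affected, so the 4 bytes of the int allocation leave the next object at an address that is ≡4 mod 8.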
The root cause
--------------
Gnulib generally uses idioms for overriding functions that are safe to use
in shared libraries and will avoid collisions. This is the business with
REPLACE_FOO=1
and
#define foo rpl_foo
and so on.
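A stripped-down sketch of that idiom, for a hypothetical function 'foo' (real Gnulib headers are generated and more elaborate). The stringify helpers exist only to show which identifier the compiler ends up seeing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Roughly what the header provides when REPLACE_FOO=1.  */
#define foo rpl_foo

/* Because of the #define, this declares and defines rpl_foo.  The object
   file exports 'rpl_foo', which cannot collide with a 'foo' in libc.  */
size_t foo (const char *s);

size_t
foo (const char *s)
{
  assert (s != NULL);
  return strlen (s);
}

/* Helpers that turn the macro-expanded identifier into a string.  */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x) STRINGIFY_1 (x)

const char *
foo_actual_name (void)
{
  return STRINGIFY (foo);
}
```

Here foo_actual_name () returns "rpl_foo", and every caller that includes the header calls rpl_foo as well, so macros and functions always come from the same implementation.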
But the Gnulib module 'obstack' has never been updated to use these idioms.
It is still in its 1997 state and uses a clunky _OBSTACK_INTERFACE_VERSION
mechanism.
Bruno
--- texinfo-7.1/tp/Texinfo/XS/gnulib/lib/obstack.h.bak 2023-08-13 22:10:03.000000000 +0200
+++ texinfo-7.1/tp/Texinfo/XS/gnulib/lib/obstack.h 2023-11-14 20:30:55.584463250 +0100
@@ -164,6 +164,12 @@
 # endif
 #endif
 
+#define _obstack_begin rpl_obstack_begin
+#define _obstack_newchunk rpl_obstack_newchunk
+#define _obstack_allocated_p rpl_obstack_allocated_p
+#define _obstack_free rpl_obstack_free
+#define _obstack_memory_used rpl_obstack_memory_used
+
 #ifdef __cplusplus
 extern "C" {
 #endif