------- Comment From i...@de.ibm.com 2021-08-11 14:45 EDT-------
Oh, sorry, I forgot that you can't use the IBM BZ links. These are the
diffs I had in mind:

1.
#ifndef Z_IFUNC_ASM
local
#endif
-unsigned long (*(crc32_z_ifunc(void)))(unsigned long, const unsigned char FAR *, z_size_t)
+unsigned long (*(crc32_z_ifunc(
+#ifdef __s390__
+unsigned long hwcap
+#else
+void
+#endif
+)))(unsigned long, const unsigned char FAR *, z_size_t)
{
#if _ARCH_PWR8==1
#if defined(__BUILTIN_CPU_SUPPORTS__)

2.
#endif
#endif /* _ARCH_PWR8 */

+#ifdef HAVE_S390X_VX
+    if (hwcap & HWCAP_S390_VX)
+        return s390_crc32_vx;
+#endif
+
/* return a function pointer for optimized arches here */

3.
static unsigned long ZEXPORT (*crc32_func)(unsigned long, const unsigned char FAR *, z_size_t) = NULL;

if (!crc32_func)
-        crc32_func = crc32_z_ifunc();
+        crc32_func = crc32_z_ifunc(
+#ifdef __s390__
+            getauxval(AT_HWCAP)
+#endif
+        );
return (*crc32_func)(crc, buf, len);
}

I think the latest debdiff still misses the second one (the getauxval()
call is still there):

++#if defined(__s390__) && defined(S390_CRC32_VX)
++    if (getauxval(AT_HWCAP) & HWCAP_S390_VX)
++        return s390_crc32_vx;
++#endif
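
To make the difference concrete, here is a minimal, portable sketch of the pattern the three diffs aim for: the resolver receives hwcap as an argument and only tests a bit, instead of calling getauxval() itself. All demo_* names and the DEMO_HWCAP_VX value are made up for illustration (the real code uses HWCAP_S390_VX and s390_crc32_vx), and the "vector" variant here simply forwards to the generic one, so treat this as an assumption-laden model rather than the actual patch:

```c
#include <stddef.h>

/* Hypothetical capability bit; stands in for HWCAP_S390_VX. */
#define DEMO_HWCAP_VX 0x800UL

typedef unsigned long (*demo_crc32_fn)(unsigned long, const unsigned char *, size_t);

/* Portable bitwise CRC-32 (reflected polynomial 0xEDB88320). */
static unsigned long demo_crc32_generic(unsigned long crc,
                                        const unsigned char *buf, size_t len)
{
    crc = ~crc & 0xFFFFFFFFUL;
    while (len--) {
        crc ^= *buf++;
        for (int k = 0; k < 8; k++)
            crc = (crc >> 1) ^ (0xEDB88320UL & (0UL - (crc & 1UL)));
    }
    return ~crc & 0xFFFFFFFFUL;
}

/* Stand-in for the vectorized routine; a real one would use SIMD. */
static unsigned long demo_crc32_vx(unsigned long crc,
                                   const unsigned char *buf, size_t len)
{
    return demo_crc32_generic(crc, buf, len);
}

/*
 * Resolver in the style of diffs 1 and 2: hwcap is passed in by the
 * caller, so no external function is called from inside the resolver.
 */
static demo_crc32_fn demo_crc32_resolve(unsigned long hwcap)
{
    if (hwcap & DEMO_HWCAP_VX)
        return demo_crc32_vx;
    return demo_crc32_generic;
}
```

In the non-ifunc fallback path (diff 3), the caller would obtain the value itself, e.g. via getauxval(AT_HWCAP), and hand it to the resolver; that is the only place where getauxval() should remain.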

A couple of nits:
- changelog: it's dfltcc, not dltss (yeah, the name is super weird :-))
- "/* Work around a probable bug in glibc when invoking getauxval in the ifunc */" -
  I don't think it's actually a bug, since
  https://sourceware.org/glibc/wiki/GNU_IFUNC says that:

When LD_BIND_NOW=1 or -Wl,z,now is in effect symbols must be immediately
resolved at startup. In cases where an external function call depends
needs to be made that may fail if such a call has not been initialized
yet (PLT-based relocation which is processed later). For example calling
strlen in an IFUNC resolver built with -Wl,z,now may lead to a segfault
because the PLT is not yet resolved. This may work on x86_64 where the
R_*_IRELATIVE relocations happen in DT_JMPREL after the DT_REL*
relocations, but that is no guarantee that it will work on AArch64,
PPC64, or other architectures that are slightly different. Such
fundamental limitations may be lifted at a future point.

What's surprising is rather that, given this paragraph, the code
nevertheless works and there doesn't seem to be an obvious glibc commit
that lifts this limitation.

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to zlib in Ubuntu.
https://bugs.launchpad.net/bugs/1932010

Title:
  [21.10 FEAT] zlib CRC32 optimization for s390x

Status in Ubuntu on IBM z Systems:
  In Progress
Status in zlib package in Ubuntu:
  In Progress

Bug description:
  Use SIMD instructions to accelerate the zlib CRC32 implementation.
  Business value: Performance improvement in database area

  The zlib CRC32 optimization for IBM Z is currently being discussed for inclusion into zlib-ng:
  https://github.com/zlib-ng/zlib-ng/pull/912

  The patch for zlib will be based on that.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1932010/+subscriptions

