gnulib-tool.py speeds up continuous integrations

2024-05-05 Thread Bruno Haible
gnulib-tool is used in many CI jobs. Just adding 'python3' to the
prerequisites of such a job makes it run faster. Here are the execution
times for a single run, before and after adding 'python3', for those
CIs that I maintain or co-maintain, in minutes and seconds.

                                                              Before  After

https://gitlab.com/gnulib/gnulib-ci/-/pipelines               30:     11:
https://gitlab.com/gnu-gettext/ci-distcheck/-/pipelines       36:     32:
https://gitlab.com/gnu-poke/ci-distcheck/-/pipelines          18:40   18:24
https://gitlab.com/gnu-libunistring/ci-distcheck/-/pipelines  11:25   09:16
https://gitlab.com/gnu-diffutils/ci-distcheck/-/pipelines     07:21   06:27
https://gitlab.com/gnu-grep/ci-distcheck/-/pipelines          06:51   06:08
https://gitlab.com/gnu-m4/ci-distcheck/-/pipelines            06:46   05:44
https://gitlab.com/gnu-sed/ci-distcheck/-/pipelines           05:28   04:39
https://gitlab.com/gnu-gzip/ci-distcheck/-/pipelines          04:16   03:58
https://gitlab.com/gnu-libffcall/ci-distcheck/-/pipelines     01:50   01:42
https://gitlab.com/gnu-libsigsegv/ci-distcheck/-/pipelines    00:45   00:42

Bruno






Re: gnulib-tool problem with off64_t and Emacs

2024-05-05 Thread Bruno Haible
Paul Eggert wrote:
> I had problems updating Emacs to use gnulib commit 
> fde88b711c9b1df5b142444ac7b0bc2aa8892d3a along with emacs commit 
> fd859fbea2e9d13e76db1c5295d9ddd1c5955d83 (these are the same commits as 
> I mentioned earlier today). I reproduced it like this:
> 
> cd emacs
> admin/merge-gnulib
> 
> The resulting lib/gnulib.mk.in had a line:
> 
> -e 's|@''HAVE_OFF64_T''@|$(HAVE_OFF64_T)|g' \
> 
> but there was no "HAVE_OFF64_T = @HAVE_OFF64_T@" line

I reproduce it.

But when I run
  admin/merge-gnulib
a second time, it adds the line:

$ diff -u lib/gnulib.mk.in~ lib/gnulib.mk.in
--- lib/gnulib.mk.in~   2024-05-05 15:17:51.168952913 +0200
+++ lib/gnulib.mk.in    2024-05-05 15:28:29.973871297 +0200
@@ -808,6 +808,7 @@
 HAVE_MODULES = @HAVE_MODULES@
 HAVE_NANOSLEEP = @HAVE_NANOSLEEP@
 HAVE_NATIVE_COMP = @HAVE_NATIVE_COMP@
+HAVE_OFF64_T = @HAVE_OFF64_T@
 HAVE_OPENAT = @HAVE_OPENAT@
 HAVE_OPENDIR = @HAVE_OPENDIR@
 HAVE_OS_H = @HAVE_OS_H@

The order in which gnulib-tool creates the files seems to be correct:
lib/gnulib.mk.in is generated last.

  ...
  Replacing file m4/nstrftime.m4 (non-gnulib code backed up in m4/nstrftime.m4~) !!
  Copying file m4/off64_t.m4
  Replacing file m4/off_t.m4 (non-gnulib code backed up in m4/off_t.m4~) !!
  ...
  Replacing file m4/xattr.m4 (non-gnulib code backed up in m4/xattr.m4~) !!
  Replacing file m4/zzgnulib.m4 (non-gnulib code backed up in m4/zzgnulib.m4~) !!
  Creating m4/gnulib-cache.m4
  Updating m4/gnulib-comp.m4 (backup in m4/gnulib-comp.m4~)
  Updating lib/gnulib.mk.in (backup in lib/gnulib.mk.in~)
  Finished.

The problem is that the new file m4/off64_t.m4 is not yet reflected
in aclocal.m4 at the moment
  autoconf -t ...
is run. This patch fixes it.

I'm not including the fix in the old gnulib-tool.sh. Please document in
emacs/admin/merge-gnulib that Python 3 is now needed as a prerequisite.


2024-05-05  Bruno Haible  

gnulib-tool.py: Regenerate aclocal.m4 before using 'autoconf -t ...'.
Reported by Paul Eggert in
.
* pygnulib/GLImport.py (GLImport): New field m4dirs.
(GLImport.__init__): Accept an additional m4dirs argument.
(GLImport.execute): Regenerate aclocal.m4 before creating the library
Makefile.
* pygnulib/main.py (main): Pass the guessed_m4dirs to GLImport.

diff --git a/pygnulib/GLImport.py b/pygnulib/GLImport.py
index a11da0e63d..e67cbbe5a2 100644
--- a/pygnulib/GLImport.py
+++ b/pygnulib/GLImport.py
@@ -23,6 +23,7 @@
 import subprocess as sp
 from .constants import (
 DIRS,
+UTILS,
 MODES,
 TESTS,
 cleaner,
@@ -61,6 +62,7 @@ class GLImport:
 is a very good choice.'''
 
 mode: int
+m4dirs: list[str]
 config: GLConfig
 cache: GLConfig
 emitter: GLEmiter
@@ -68,19 +70,23 @@ class GLImport:
 moduletable: GLModuleTable
 makefiletable: GLMakefileTable
 
-def __init__(self, config: GLConfig, mode: int) -> None:
+def __init__(self, config: GLConfig, mode: int, m4dirs: list[str]) -> None:
 '''Create GLImport instance.
-The first variable, mode, must be one of the values of the MODES dict
-object, which is accessible from constants module. The second one, config,
-must be a GLConfig object.'''
+config - must be a GLConfig object.
+mode - must be one of the values of the constants.MODES values.
+m4dirs - list of all directories that contain relevant .m4 files.'''
 if type(config) is not GLConfig:
 raise TypeError('config must have GLConfig type, not %s'
 % repr(config))
-if type(mode) is int and MODES['import'] <= mode <= MODES['update']:
-self.mode = mode
-else:  # if mode is not int or is not 0-3
+if not (type(mode) is int and MODES['import'] <= mode <= MODES['update']):
 raise TypeError('mode must be 0 <= mode <= 3, not %s'
 % repr(mode))
+if type(m4dirs) is not list:
+raise TypeError('m4dirs must be a list of strings, not %s'
+% repr(m4dirs))
+
+self.mode = mode
+self.m4dirs = m4dirs
 
 # config contains the configuration, as specified through command-line
 # parameters.
@@ -1209,6 +1215,17 @@ def execute(self, filetable: GLFileTable, transformers: dict[str, tuple[re.Patte
 if os.path.isfile(tmpfile):
 os.remove(tmpfile)
 
+if self.config['gnu_make']:
+# Regenerate aclocal.m4.
+# This is needed because the next step may run 'autoconf -t' and
+# the preceding steps may have added new *.m4 files (which need to
+# be reflected in aclocal.m4 before 'autoconf -t' is run).
+aclocal_args = []
+for dir in self.m4dirs:
+aclocal_args.append('-I')
+

Re: endian.h

2024-05-05 Thread Bruno Haible
> > plus functions or macros:
> > 
> >  uint16_t be16toh (uint16_t);
> >  uint16_t htobe16 (uint16_t);
> > 
> > I could try to work on that if it seems useful to anyone else.

For the implementation of these functions, maybe the existing
Gnulib module 'byteswap' is interesting.

Bruno
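
For illustration only, here is a minimal sketch of how be16toh and
htobe16 could be layered on a byte-swap primitive. All sketch_* names
are hypothetical and chosen to avoid clashing with system headers. The
local swap helper stands in for bswap_16 from <byteswap.h>, which the
'byteswap' module provides. The sketch assumes the compiler predefines
__BYTE_ORDER__ (GCC and Clang do) and falls back to assuming a
little-endian host otherwise.

```c
#include <stdint.h>

/* Hypothetical stand-in for bswap_16 from <byteswap.h>:
   exchange the two bytes of a 16-bit value.  */
static inline uint16_t
sketch_bswap16 (uint16_t x)
{
  return (uint16_t) ((x << 8) | (x >> 8));
}

/* Assumption: GCC/Clang-style predefined byte-order macros.
   Without them, this sketch simply assumes a little-endian host.  */
#if (defined __BYTE_ORDER__ && defined __ORDER_BIG_ENDIAN__ \
     && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
# define SKETCH_HOST_IS_BIG_ENDIAN 1
#else
# define SKETCH_HOST_IS_BIG_ENDIAN 0
#endif

/* Big-endian 16-bit value to host order.  */
static inline uint16_t
sketch_be16toh (uint16_t x)
{
  return SKETCH_HOST_IS_BIG_ENDIAN ? x : sketch_bswap16 (x);
}

/* Host order to big-endian 16-bit value.  The operation is its own
   inverse, so the body is the same as above.  */
static inline uint16_t
sketch_htobe16 (uint16_t x)
{
  return SKETCH_HOST_IS_BIG_ENDIAN ? x : sketch_bswap16 (x);
}
```

The 32-bit and 64-bit variants would follow the same pattern with the
corresponding swap primitives.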






Re: header file substitutes

2024-05-05 Thread Bruno Haible
Hi Collin,

> IIRC in <stdbit.h> there is:
> 
>  #define __STDC_ENDIAN_LITTLE__ /* Unique constant */
>  #define __STDC_ENDIAN_BIG__ /* Unique constant */
>  #define __STDC_ENDIAN_NATIVE__ /* __STDC_ENDIAN_LITTLE__ or __STDC_ENDIAN_BIG__ */

You can work on this, once Paul has created an 'stdbit' module with the
3 sets of functions, as mentioned in the other mail, I would say.

> But, I think the next POSIX revision has <endian.h> like Glibc which I
> prefer.

Preference or not — both ISO C and POSIX are relevant for us. If ISO C
has endianness macros in <stdbit.h> and POSIX has endianness macros in
<endian.h>, we will implement both.

> So these defines:
> 
> #define LITTLE_ENDIAN /* Unique constant */
> #define BIG_ENDIAN /* Unique constant */
> #define BYTE_ORDER /* LITTLE_ENDIAN or BIG_ENDIAN */
> 
> plus functions or macros:
> 
>  uint16_t be16toh (uint16_t);
>  uint16_t htobe16 (uint16_t);
> 
> I could try to work on that if it seems useful to anyone else.

Indeed, I see that this is scheduled for inclusion in the next POSIX
revision: https://www.austingroupbugs.net/view.php?id=162

It is useful to do this already now, before the next POSIX revision is
official, because when it becomes official it will contain *many* new
functionalities. (And this new functionality is unlikely to change.
It's been sitting in the waiting line since 2011 (!).)

Bruno






Re: obsolete vs. deprecated

2024-05-05 Thread Bruno Haible
Paul Eggert wrote:
> Eventually this should replace Gnulib's count-leading-zeros, 
> count-trailing-zeros, and count-one-bits modules, which should be marked 
> as obsolescent once we have a standard (and nicer) way to get that 
> information.

Please mark these modules as deprecated, not obsolete.

An *obsolete* module is one which gnulib-tool will omit when it is
used merely as a dependency [1] — which is not what we want here.

A *deprecated* module is one for which the user needs to do some action
(such as, use another module instead and update the #include statements). [2]

Bruno

[1] https://www.gnu.org/software/gnulib/manual/html_node/Module-description.html
[2] https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob_plain;f=NEWS






Re: gnulib-tool sh+py mismatch when updating Emacs

2024-05-05 Thread Collin Funk
On 5/4/24 10:54 PM, Paul Eggert wrote:
>> I was using Autoconf 2.72 on my system.
>
> Ah, I am using Autoconf 2.71, which is what's in /usr/bin/autoconf on Fedora 
> 40. It comes from the autoconf-2.71-10.fc40.noarch package, which is the 
> current version for Fedora 40.
> 
> Occasionally I use bleeding-edge autoconf (even past 2.72) but that's only to 
> test Autoconf, not other packages.

When Bruno was creating the test suite for gnulib-tool.py we had some
issues with our gperf and bison --version's being different leading to
different files being generated.

I probably installed Autoconf 2.72 because I expected different
versions there to cause issues.

I found the commit where the sorting in that file changed:


https://git.savannah.gnu.org/cgit/autoconf.git/commit/?id=c2ab755698db245898a4cc89149eb5df256e4bd0

>> -diff -r $diff_options --exclude=__pycache__ -q . "$tmp" >/dev/null ||
>> +diff -r $diff_options --exclude=__pycache__ --exclude=autom4te.cache -q . "$tmp" >/dev/null ||
> 
> Looks good to me, thanks. (I didn't bother testing it on my old slow machine.)

I've applied the attached patch.

Collin

From 987535a15d4d2902818661feb6d6b363e4d7af2b Mon Sep 17 00:00:00 2001
From: Collin Funk 
Date: Sat, 4 May 2024 23:46:02 -0700
Subject: [PATCH] gnulib-tool: Ignore autom4te.cache when using
 GNULIB_TOOL_IMPL=sh+py.

Reported by Paul Eggert in:
.

* gnulib-tool: Don't compare the autom4te.cache directory since requests
are not sorted in Autoconf version 2.71 and below.
---
 ChangeLog   | 8 ++++++++
 gnulib-tool | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 4b5b9bad06..f8e20e06c6 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2024-05-04  Collin Funk  
+
+	gnulib-tool: Ignore autom4te.cache when using GNULIB_TOOL_IMPL=sh+py.
+	Reported by Paul Eggert in:
+	.
+	* gnulib-tool: Don't compare the autom4te.cache directory since requests
+	are not sorted in Autoconf version 2.71 and below.
+
 2024-05-04  Bruno Haible  
 
 	readutmp, boot-time: Work around a Cygwin 3.5.3 bug.
diff --git a/gnulib-tool b/gnulib-tool
index 56c4473318..789fe916a8 100755
--- a/gnulib-tool
+++ b/gnulib-tool
@@ -209,7 +209,7 @@ case "$GNULIB_TOOL_IMPL" in
 else
   diff_options=
 fi
-diff -r $diff_options --exclude=__pycache__ -q . "$tmp" >/dev/null ||
+diff -r $diff_options --exclude=__pycache__ --exclude=autom4te.cache -q . "$tmp" >/dev/null ||
   func_fatal_error "gnulib-tool.py produced different files than gnulib-tool.sh! Compare `pwd` and $tmp."
 # Compare the two outputs.
 diff -q "$tmp-sh-out" "$tmp-py-out" >/dev/null ||
-- 
2.45.0



Re: header file substitutes

2024-05-05 Thread Collin Funk
On 5/4/24 11:07 PM, Paul Eggert wrote:
>> Since  seems reasonably portable,
> 
> I assume you mean ? There's no  on my Ubuntu system.

No, sorry maybe I worded my original email awkwardly. :)

I think all of the BSDs have  which define:

#define LITTLE_ENDIAN 1234
#define BIG_ENDIAN 4321
#define BYTE_ORDER /* One of those macros.  */

From what I can tell, they also had ntohl and friends there. I think
that glibc added the be32toh macros and extended that header, while
also moving it to .

My history might be a bit wrong there. The headers predate me by a few
years. :)

I was trying to say that a macro sequence like this could do most of
the heavy lifting:

#if HAVE_ENDIAN_H
#  include <endian.h>
#elif HAVE_SYS_ENDIAN_H
#  include <sys/endian.h>
#endif

Since the BSDs and Solaris should define the same stuff in
.

> Although that's a start, we'll need more of course. Here's what I have so far 
> for my prototype stdbit.in.h, but it needs more work. This sort of thing used 
> to be even trickier (see gl_BIGENDIAN and gl_MULTIARCH) but I hope we can 
> dispense with that complexity nowadays (by using something like the following 
> complexity instead :-).

Looks good to me. You got more of the compiler stuff than I would
have.

Collin



Re: header file substitutes

2024-05-05 Thread Paul Eggert

On 2024-05-04 15:33, Collin Funk wrote:
> But I don't think C23 has the conversion macros:
>
>  /* big endian 32 to host.  */
>  uint32_t be32toh (uint32_t);
>  /* little endian 32 to host.  */
>  uint32_t le32toh (uint32_t);


Yes, those might be a good reason for a Gnulib endian module, to support 
endian.h GNU-style. Ideally it would be implemented by appealing to 
stdbit.h when that's helpful.
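
As a rough sketch of that idea (not the eventual Gnulib code), the
conversions could consult the <stdbit.h> endianness macros when that
header is available and otherwise decode the value byte by byte, which
works regardless of host byte order. The sketch_* names are
hypothetical, and the __has_include probe is an assumption about the
compiler.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: the compiler supports __has_include.  If <stdbit.h>
   is missing, the byte-wise fallback below is used.  */
#ifdef __has_include
# if __has_include (<stdbit.h>)
#  include <stdbit.h>
# endif
#endif

/* Hypothetical helper: reverse the four bytes of a 32-bit value.  */
static inline uint32_t
sketch_bswap32 (uint32_t x)
{
  return ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8)
         | ((x & 0x00FF0000u) >> 8) | ((x & 0xFF000000u) >> 24);
}

/* Big-endian 32-bit value to host order.  */
static inline uint32_t
sketch_be32toh (uint32_t x)
{
#if (defined __STDC_ENDIAN_NATIVE__ && defined __STDC_ENDIAN_BIG__ \
     && defined __STDC_ENDIAN_LITTLE__)
  if (__STDC_ENDIAN_NATIVE__ == __STDC_ENDIAN_BIG__)
    return x;
  if (__STDC_ENDIAN_NATIVE__ == __STDC_ENDIAN_LITTLE__)
    return sketch_bswap32 (x);
#endif
  /* Fallback: interpret the object representation as big-endian.  */
  unsigned char b[4];
  memcpy (b, &x, sizeof b);
  return ((uint32_t) b[0] << 24) | ((uint32_t) b[1] << 16)
         | ((uint32_t) b[2] << 8) | (uint32_t) b[3];
}

/* Little-endian 32-bit value to host order.  */
static inline uint32_t
sketch_le32toh (uint32_t x)
{
#if (defined __STDC_ENDIAN_NATIVE__ && defined __STDC_ENDIAN_BIG__ \
     && defined __STDC_ENDIAN_LITTLE__)
  if (__STDC_ENDIAN_NATIVE__ == __STDC_ENDIAN_LITTLE__)
    return x;
  if (__STDC_ENDIAN_NATIVE__ == __STDC_ENDIAN_BIG__)
    return sketch_bswap32 (x);
#endif
  /* Fallback: interpret the object representation as little-endian.  */
  unsigned char b[4];
  memcpy (b, &x, sizeof b);
  return ((uint32_t) b[3] << 24) | ((uint32_t) b[2] << 16)
         | ((uint32_t) b[1] << 8) | (uint32_t) b[0];
}
```

The byte-wise fallback also covers hosts whose native order is neither
big-endian nor little-endian.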




> Since  seems reasonably portable,


I assume you mean ? There's no  on my Ubuntu system.


>  $ echo | gcc -dM -E - | grep 'ENDIAN'
>  #define __ORDER_LITTLE_ENDIAN__ 1234
>  #define __FLOAT_WORD_ORDER__ __ORDER_LITTLE_ENDIAN__
>  #define __ORDER_PDP_ENDIAN__ 3412
>  #define __ORDER_BIG_ENDIAN__ 4321
>  #define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__


Although that's a start, we'll need more of course. Here's what I have 
so far for my prototype stdbit.in.h, but it needs more work. This sort 
of thing used to be even trickier (see gl_BIGENDIAN and gl_MULTIARCH) 
but I hope we can dispense with that complexity nowadays (by using 
something like the following complexity instead :-).


  /* Define the native endianness.  Prefer predefined macros to #include
 directives, to avoid namespace pollution.  */

  /* GCC and Clang define __BYTE_ORDER__ etc.
 ARM compilers define __BIG_ENDIAN etc.
 Oracle Studio defines __SUNPRO_C etc.
 Some platforms work only on little-endian platforms.  */
  #if (defined __BYTE_ORDER__ && defined __ORDER_BIG_ENDIAN__ \
   && defined __ORDER_LITTLE_ENDIAN__)
  # define __STDC_ENDIAN_BIG__ __ORDER_BIG_ENDIAN__
  # define __STDC_ENDIAN_LITTLE__ __ORDER_LITTLE_ENDIAN__
  # define __STDC_ENDIAN_NATIVE__ __BYTE_ORDER__
  #elif (defined __SUNPRO_C ? defined __sparc \
 : 0)
  # define __STDC_ENDIAN_NATIVE__ __STDC_ENDIAN_BIG__
  #elif ((defined __BYTE_ORDER__ && defined __ORDER_BIG_ENDIAN__ \
          && defined __ORDER_LITTLE_ENDIAN__) \
 ? __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ \
 : defined __SUNPRO_C ? !defined __sparc \
 : (defined _WIN32 || defined __CYGWIN__ || defined __EMX__ \
|| defined __MSDOS__ || defined __DJGPP__) ? 1 \
 : 0)
  # define __STDC_ENDIAN_NATIVE__ __STDC_ENDIAN_LITTLE__
  #else
  # ifdef __has_include
  #  if __has_include ()
  #   include 
  #  endif
  # endif
  # if defined __BIG_ENDIAN && defined __LITTLE_ENDIAN && defined __BYTE_ORDER
  #  define __STDC_ENDIAN_BIG__ __BIG_ENDIAN
  #  define __STDC_ENDIAN_LITTLE__ __LITTLE_ENDIAN
  #  define __STDC_ENDIAN_NATIVE__ __BYTE_ORDER
  # endif
  # if defined BIG_ENDIAN && defined LITTLE_ENDIAN && defined BYTE_ORDER
  #  define __STDC_ENDIAN_BIG__ BIG_ENDIAN
  #  define __STDC_ENDIAN_LITTLE__ LITTLE_ENDIAN
  #  define __STDC_ENDIAN_NATIVE__ BYTE_ORDER
  # endif
  #endif
  #ifndef __STDC_ENDIAN_BIG__
  # define __STDC_ENDIAN_BIG__ 4321
  #endif
  #ifndef __STDC_ENDIAN_LITTLE__
  # define __STDC_ENDIAN_LITTLE__ 1234
  #endif
  #ifndef __STDC_ENDIAN_NATIVE__
  /* __STDC_ENDIAN_NATIVE__ is not defined on this platform.
 If this doesn't suffice for you, please email a fix
 to .  */
  #endif
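
A compile-time detection like the one above can be cross-checked at
run time. The following is a small sketch of such a check, not part of
the prototype: it compares the byte order reported by the GCC/Clang
predefined macros with the order observed by inspecting the bytes of a
known 32-bit value. All sketch_* names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

enum sketch_endian { SKETCH_LITTLE, SKETCH_BIG, SKETCH_OTHER };

/* Determine the byte order by looking at how a known value is
   laid out in memory.  */
static enum sketch_endian
sketch_runtime_endian (void)
{
  const uint32_t probe = 0x01020304u;
  unsigned char b[4];
  memcpy (b, &probe, sizeof b);
  if (b[0] == 0x04 && b[1] == 0x03 && b[2] == 0x02 && b[3] == 0x01)
    return SKETCH_LITTLE;
  if (b[0] == 0x01 && b[1] == 0x02 && b[2] == 0x03 && b[3] == 0x04)
    return SKETCH_BIG;
  return SKETCH_OTHER;
}

/* Report the byte order known at compile time.  Assumption: the
   compiler predefines __BYTE_ORDER__ etc.; if it does not, this
   sketch has no compile-time information and defers to the probe.  */
static enum sketch_endian
sketch_compiletime_endian (void)
{
#if (defined __BYTE_ORDER__ && defined __ORDER_LITTLE_ENDIAN__ \
     && defined __ORDER_BIG_ENDIAN__)
# if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  return SKETCH_LITTLE;
# elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  return SKETCH_BIG;
# else
  return SKETCH_OTHER;
# endif
#else
  return sketch_runtime_endian ();
#endif
}
```

A unit test could then assert that sketch_compiletime_endian () equals
sketch_runtime_endian () on every platform where it is built.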