In GLModuleSystem.py Makefile.am snippets are searched twice for the file names stored in 'lib_SOURCES += ...' variable. Instead of writing the regular expression twice (differently) I have moved it to a private variable & function.
Indirectly this also fixes two issues. The first is the same issue as yesterdays patch. Variables with a trailing comment, e.g. lib_SOURCES += file.c file.h # comment are not matched. As far as I can tell none of the module descriptions have this comment style but gnulib-tool.sh handles it (line 3453-3466). So gnulib-tool.py should too. The second issue I have mentioned previously, I forget where. In some nested loops we have "break" statements that are ineffective. Unlike shell, in Python we can't use "break 2" to exit the outer loop. In this case we had 3 loops nested, something like this: for module in modules: for value in lib_SOURCES_var: for file_name in value: if not file_name.endswith('.h'): condition = True break So the break only stops the current loop 'lib_SOURCES' match of a snippet. But we should stop looping entirely if even one module description sets condition to True. Since the use of a private functions removes a level of nesting, I have added an 'if condition: break'. Not too messy and since the outer loop goes over all modules (can be 1000+) it seems worth it. Collin
>From 26d7603143116626cacba37b988fa26cc7dba1b8 Mon Sep 17 00:00:00 2001 From: Collin Funk <collin.fu...@gmail.com> Date: Sun, 2 Jun 2024 22:32:39 -0700 Subject: [PATCH] gnulib-tool.py: Refactor duplicated regular expressions. * pygnulib/GLModuleSystem.py (_LIB_SOURCES_PATTERN): New variable. (_extract_lib_SOURCES): New function. (GLModule.getAutomakeSnippet_Unconditional): Use the new function. (GLModuleTable.add_dummy): Likewise. Add a second break statement to stop unnecessary looping. --- ChangeLog | 9 +++++++++ pygnulib/GLModuleSystem.py | 40 ++++++++++++++++++++++++-------------- 2 files changed, 34 insertions(+), 15 deletions(-) diff --git a/ChangeLog b/ChangeLog index 269293ca79..e0ed08a9af 100644 --- a/ChangeLog +++ b/ChangeLog @@ -1,3 +1,12 @@ +2024-06-02 Collin Funk <collin.fu...@gmail.com> + + gnulib-tool.py: Refactor duplicated regular expressions. + * pygnulib/GLModuleSystem.py (_LIB_SOURCES_PATTERN): New variable. + (_extract_lib_SOURCES): New function. + (GLModule.getAutomakeSnippet_Unconditional): Use the new function. + (GLModuleTable.add_dummy): Likewise. Add a second break statement to + stop unnecessary looping. + 2024-06-02 Bruno Haible <br...@clisp.org> c-strtod, c-strtof, c-strtold: Fix link error on AIX. diff --git a/pygnulib/GLModuleSystem.py b/pygnulib/GLModuleSystem.py index 5b8d331dd6..5d67c5b6df 100644 --- a/pygnulib/GLModuleSystem.py +++ b/pygnulib/GLModuleSystem.py @@ -40,6 +40,21 @@ from .GLFileSystem import GLFileSystem +# Regular expression matching a 'lib_SOURCES += ...' variable with the file +# names in capture group 1. +_LIB_SOURCES_PATTERN = re.compile(r'^lib_SOURCES[\t ]*\+=[\t ]*(.+?)[\t ]*(?:#|$)', re.MULTILINE) + + +def _extract_lib_SOURCES(snippet: str) -> list[str]: + '''Return the file names specified in the lib_SOURCES variable of + a Makefile.am snippet.''' + lines = [ line.group(1) + for line in re.finditer(_LIB_SOURCES_PATTERN, snippet) ] + return [ file_name + for line in lines + for file_name in line.split() ] + + #=============================================================================== # Define GLModuleSystem class #=============================================================================== @@ -571,13 +586,8 @@ def getAutomakeSnippet_Unconditional(self) -> str: # Synthesize an EXTRA_DIST augmentation. snippet = self.getAutomakeSnippet_Conditional() snippet = combine_lines(snippet) - pattern = re.compile(r'^lib_SOURCES[\t ]*\+=[\t ]*(.*)$', re.MULTILINE) - mentioned_files = set(pattern.findall(snippet)) - if mentioned_files: - # Get all the file names from 'lib_SOURCES += ...'. - mentioned_files = { filename - for line in mentioned_files - for filename in line.split() } + # Get all the file names from 'lib_SOURCES += ...'. + mentioned_files = _extract_lib_SOURCES(snippet) all_files = self.getFiles() lib_files = filter_filelist('\n', all_files, 'lib/', '', 'lib/', '') @@ -985,14 +995,14 @@ def add_dummy(self, modules: list[GLModule]) -> list[GLModule]: # Extract the value of unconditional "lib_SOURCES += ..." augmentations. snippet = combine_lines(snippet) snippet = self.remove_if_blocks(snippet) - pattern = re.compile(r'^lib_SOURCES[\t ]*\+=[\t ]*([^#]+?)$', re.MULTILINE) - for matching_rhs in re.finditer(pattern, snippet): - files = matching_rhs.group(1).split() - for file in files: - # Ignore .h files since they are not compiled. - if not file.endswith('.h'): - have_lib_sources = True - break + file_names = _extract_lib_SOURCES(snippet) + for file_name in file_names: + # Ignore .h files since they are not compiled. + if not file_name.endswith('.h'): + have_lib_sources = True + break + if have_lib_sources: + break # Add the dummy module, to make sure the library will be non-empty. if not have_lib_sources: dummy = self.modulesystem.find('dummy') -- 2.45.1