The func_transitive_closure function in the shell implementation can take a lot of time. So I wondered whether in the Python implementation, there is room for speedup at this place as well. And indeed, there is.
Profiling the test-create-testdir-4.sh unit test, I saw this profiler output: + 43843917 function calls (43842963 primitive calls) in 201.018 seconds + + Ordered by: internal time + + ncalls tottime percall cumtime percall filename:lineno(function) + 11 181.248 16.477 181.248 16.477 {built-in method posix.waitpid} + 25009065 3.666 0.000 3.666 0.000 GLModuleSystem.py:225(__eq__) + 97005 2.783 0.000 6.664 0.000 GLModuleSystem.py:184(__init__) + 7891 1.504 0.000 4.472 0.001 GLModuleSystem.py:915(<listcomp>) + 107597 1.056 0.000 1.056 0.000 {built-in method io.open} + 327094 1.043 0.000 1.509 0.000 posixpath.py:338(normpath) + 316696 0.857 0.000 1.108 0.000 posixpath.py:71(join) + 194049 0.610 0.000 0.610 0.000 {method 'read' of '_io.BufferedReader' objects} + 246396 0.567 0.000 0.567 0.000 {built-in method posix.stat} + 97013 0.454 0.000 0.587 0.000 codecs.py:681(__init__) + 316505 0.428 0.000 3.048 0.000 constants.py:269(joinpath) + 97013 0.416 0.000 2.005 0.000 codecs.py:871(open) + 1583 0.354 0.000 16.804 0.011 GLModuleSystem.py:814(transitive_closure) + 1 0.299 0.299 0.903 0.903 GLModuleSystem.py:954(<listcomp>) + 97007 0.284 0.000 0.986 0.000 codecs.py:451(read) + 97005 0.240 0.000 10.376 0.000 GLModuleSystem.py:99(find) + 44061 0.228 0.000 10.806 0.000 GLModuleSystem.py:537(getDependenciesWithConditions) + 2786508 0.212 0.000 0.212 0.000 {method 'append' of 'list' objects} + 102296 0.193 0.000 1.353 0.000 GLFileSystem.py:77(lookup) + 489239 0.190 0.000 0.271 0.000 GLModuleSystem.py:257(__hash__) + 1 0.178 0.178 200.996 200.996 main.py:120(main) So, I changed a list to a set in two places. As expected, this nearly annihilates the 25 million GLModule.__eq__ invocations: + 19010970 function calls (19010016 primitive calls) in 195.304 seconds + + Ordered by: internal time + + ncalls tottime percall cumtime percall filename:lineno(function) + 11 181.039 16.458 181.039 16.458 {built-in method posix.waitpid} + 97005 2.770 0.000 6.676 0.000 GLModuleSystem.py:184(__init__) + 107597 1.064 0.000 1.064 0.000 {built-in method io.open} + 327094 1.016 0.000 1.486 0.000 posixpath.py:338(normpath) + 316696 0.857 0.000 1.111 0.000 posixpath.py:71(join) + 194049 0.585 0.000 0.585 0.000 {method 'read' of '_io.BufferedReader' objects} + 246396 0.556 0.000 0.556 0.000 {built-in method posix.stat} + 97013 0.454 0.000 0.597 0.000 codecs.py:681(__init__) + 316505 0.424 0.000 3.026 0.000 constants.py:269(joinpath) + 97013 0.414 0.000 2.023 0.000 codecs.py:871(open) + 1583 0.331 0.000 12.204 0.008 GLModuleSystem.py:814(transitive_closure) + 97007 0.284 0.000 0.962 0.000 codecs.py:451(read) + 97005 0.243 0.000 10.375 0.000 GLModuleSystem.py:99(find) + 44061 0.224 0.000 10.795 0.000 GLModuleSystem.py:537(getDependenciesWithConditions) + 2786514 0.213 0.000 0.213 0.000 {method 'append' of 'list' objects} + 504489 0.197 0.000 0.277 0.000 GLModuleSystem.py:257(__hash__) + 102296 0.190 0.000 1.343 0.000 GLFileSystem.py:77(lookup) + 1 0.187 0.187 195.283 195.283 main.py:120(main) The speed improvement is also measurable without a profiler: Before: $ GNULIB_TOOL_IMPL=py time ./test-create-testdir-4.sh real 192.48s user 131.35s sys 51.53s After: $ GNULIB_TOOL_IMPL=py time ./test-create-testdir-4.sh real 189.47s user 128.31s sys 51.64s 2024-04-11 Bruno Haible <br...@clisp.org> gnulib-tool.py: Optimize module set lookups. * gnulib-tool.py (profiler_args): New variable. * pygnulib/GLModuleSystem.py (GLModuleTable.transitive_closure): Turn handledmodules into a set. (GLModuleTable.transitive_closure_separately): For the 'in' test, use a set variable main_modules_set. diff --git a/gnulib-tool.py b/gnulib-tool.py index bb2837a3d5..cdcd316909 100755 --- a/gnulib-tool.py +++ b/gnulib-tool.py @@ -144,4 +144,7 @@ else func_fatal_error "python3 not found; try setting GNULIB_TOOL_IMPL=sh" fi -exec python3 "$gnulib_dir/.gnulib-tool.py" "$@" +profiler_args= +# For profiling, cf. <https://docs.python.org/3/library/profile.html>. +#profiler_args="-m cProfile -s tottime" +exec python3 $profiler_args "$gnulib_dir/.gnulib-tool.py" "$@" diff --git a/pygnulib/GLModuleSystem.py b/pygnulib/GLModuleSystem.py index 01378259d9..d258fd903f 100644 --- a/pygnulib/GLModuleSystem.py +++ b/pygnulib/GLModuleSystem.py @@ -829,7 +829,7 @@ class GLModuleTable: # module on the input list has been processed, it is added to the # "handled list", so we can avoid to process it again. inc_all_tests = self.inc_all_direct_tests - handledmodules = [] + handledmodules = set() inmodules = modules outmodules = [] if self.config['conddeps']: @@ -910,11 +910,11 @@ class GLModuleTable: self.addConditional(module, depmodule, True) else: # if not conditional self.addUnconditional(depmodule) - handledmodules = sorted(set(handledmodules + inmodules_this_round)) + handledmodules = handledmodules.union(inmodules_this_round) # Remove handledmodules from inmodules. - inmodules = [module - for module in inmodules - if module not in handledmodules] + inmodules = [ module + for module in inmodules + if module not in handledmodules ] inmodules = sorted(set(inmodules)) inc_all_tests = self.inc_all_indirect_tests modules = sorted(set(outmodules)) @@ -950,10 +950,11 @@ class GLModuleTable: main_modules = self.transitive_closure(basemodules) self.config.setInclTestCategory(TESTS['tests'], saved_inctests) # Determine tests-related module list. + main_modules_set = set(main_modules) tests_modules = \ [ m for m in finalmodules - if not (m in main_modules and m.getApplicability() == 'main') ] + if not (m in main_modules_set and m.getApplicability() == 'main') ] # Note: Since main_modules is (hopefully) a subset of finalmodules, this # ought to be the same as # [ m