On 2022-Apr-08, Andres Freund wrote:

> I just realized that the second find is pretty expensive compared to the
> first.
>
> time find "$sourcetree" -type d \( \( -name CVS -prune \) -o \( -name .git -prune \) -o -print \) | grep -v "$sourcetree/doc/src/sgml/\+" > /dev/null
> real 0m0.019s
> user 0m0.008s
> sys 0m0.017s
>
> second:
> time find "$sourcetree" -name Makefile -print -o -name GNUmakefile -print | grep -v "$sourcetree/doc/src/sgml/images/" > /dev/null
>
> real 0m0.118s
> user 0m0.071s
> sys 0m0.053s

Hmm, ISTM that time can be reduced a bit with -prune,

time find "$sourcetree" \( -name .git -prune \) -o -name Makefile -print -o -name GNUmakefile -print | grep -v "$sourcetree/doc/src/sgml/images/" > /dev/null

I thought it might work to do away with the grep and use find's -path
instead to prune that subdir, but "time" shows almost no difference for
me:

time find "$sourcetree" \( -name .git -prune \) -o \( -path '*doc/src/sgml/images' -prune \) -o -name Makefile -print -o -name GNUmakefile -print > /dev/null

Maybe find's -path is equally expensive. Still, that seems a good
change anyway. (The times are lower on my system than those you show.)

> I think we could just obsolete the second find, by checking for the existence
> of Makefile / GNUmakefile in the first loop...

Hmm, if that's going to require one more shell command per dir, it
sounds more expensive. It's worth trying, I guess.

> The invocation of ln -s is quite measurable - looks like it's mostly the
> process startup overhead (on linux, at least). Doing a ln --version > /dev/null
> each iteration takes about the same time as actually creating the symlinks.

Is this running with some locale settings enabled? Maybe we can save
some time by making sure we're under LC_ALL=C or something like that, to
avoid searching for translation files.

-- 
Álvaro Herrera        PostgreSQL Developer — https://www.EnterpriseDB.com/
"Razors and monkeys should always be kept apart" (Germán Poo)
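
PS: a rough sketch of the single-pass idea, in case it helps the discussion. It is not taken from the actual script: the temporary tree, the `list` file and the `count` variable are made up for illustration. It walks the directories once (pruning .git and the images subdir with find rather than grep) and checks for Makefile / GNUmakefile with `[ -f ... ]`, which is a builtin in the common shells and so should not cost an extra process per directory.

```shell
#!/bin/sh
# Hypothetical demo tree; a real script would use its own "$sourcetree".
sourcetree=$(mktemp -d)
list=$(mktemp)
mkdir -p "$sourcetree/src/backend" \
         "$sourcetree/.git/objects" \
         "$sourcetree/doc/src/sgml/images"
touch "$sourcetree/src/Makefile" \
      "$sourcetree/src/backend/GNUmakefile" \
      "$sourcetree/.git/Makefile" \
      "$sourcetree/doc/src/sgml/images/Makefile"

# One directory walk: .git and the images subdir are pruned by find itself.
find "$sourcetree" \( -name .git -prune \) \
     -o \( -path '*doc/src/sgml/images' -prune \) \
     -o -type d -print > "$list"

count=0
# Read from a file (not a pipe) so that $count survives the loop.
while IFS= read -r dir; do
    for mf in Makefile GNUmakefile; do
        # "[" is a shell builtin here, so no process is started per test.
        if [ -f "$dir/$mf" ]; then
            count=$((count + 1))
        fi
    done
done < "$list"

echo "makefiles found: $count"
rm -rf "$sourcetree" "$list"
```

Whether this is actually faster than the second find would need to be measured with "time" on a real tree; the sketch only shows that the existence check need not start a new process.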