I redid my patch series to include just the things I'm sure are bugfixes for git HEAD. I also redid their commit messages with line wrapping. Will send in a quoting cleanup patch later, with more stuff in one patch, when I'm ready to sign off on it.
I've played around with my prefetch idea, and now I'm happier with it, having explored the possibilities a bit. It'd be neat if some people could test this on their own systems, esp. ones with any kind of slow hard drive or slow CPU, or a crufty /etc/bash_completion.d with a lot of crap in it.

Sending an email chock-full of everything that's worth writing down, just so it's in the list archives in case anyone ever wants it. I didn't want to stick most of this into a git commit or a bug comment.

It seems to be hard to get Linux to really drop filesystem caches. That or my SATA hard drive's internal cache is big enough that it doesn't need to seek around to get the requested data when all you're doing is time . ./bash_completion between dropping caches. That might be more likely, since it's faster than the first time running it after not doing so for a while, but nowhere near as fast as with hot cache.

Test system:

    Intel Core2 Duo E6600 (2.4GHz, 2 non-hyperthreaded cores)
    5GB RAM
    /etc on ext4 (noatime) on a WD10EARS-00M (fairly old green-power magnetic HD)
    /usr/local/src/bash-completion on xfs on the same HD

One point of interest with these results is that the very FIRST test of no-prefetch is FAR higher than any of the others. IDK how to get my system back to a state where it will take that long again. If anyone has any suggestions, that'd be great. The hd churn from grepping a linux kernel tree helps some, but it's not enough.

    alias churn_hd="grep -r --include '*.c' xxxxx /usr/local/src/linux/ubuntu-trusty/ >/dev/null"
    alias dropcache="sync; echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null; sleep 6"
    # the sleep 6 seconds is to give things time to settle down right
    # after dropping caches.  Otherwise you get re-reading of
    # constantly-accessed data contending with bash_completion, which is
    # realistic but less repeatable.

no prefetch:

    $ for i in {0..4};do dropcache ; time BASH_COMPLETION_DISABLE_PREFETCH=1 . ./bash_completion;churn_hd; done 2>&1 | grep '^real' --line-buffered | perl -ne '/m(.*)s/ and $tot+=$1 and ++$c; print; END { print " avg: ", $tot/$c, "\n"; }'
    real 0m1.383s
    real 0m0.605s
    real 0m0.893s
    real 0m0.797s
    real 0m0.677s
     avg (of last 4 only): 0.743

My first version of prefetch: just cat.

prefetch with:

    ( exec cat $glob &>/dev/null </dev/null )& disown $!

    for i in {0..4}; do dropcache; time . ./bash_completion; churn_hd; done 2>&1 | grep '^real'
    real 0m0.761s 0m0.557s 0m0.821s 0m0.558s 0m0.749s   (collapsed for readability)
     avg: 0.6892

with hot cache I get:

     avg: 0.0906

Trying to get fancier: fork off cat, and then access all the inodes of the files we will want, to get a deeper read queue depth.

prefetch with:

    ( shopt -s failglob  # cat doesn't run at all with no matches, even if nullglob is on.
      # I was worried that cat could get stuck reading from stdin,
      # but ( cat & ) redirects stdin from /dev/null because subshells don't have job control.
      exec &>/dev/null   # quash output of cat and any bash failglob errors
      cat $glob &        # contents
      true $glob/        # inodes.  just expanding this glob will stat(2), no ls -dL needed.
    ) &
    disown $!            # don't pollute job control

    real 0m1.431s 0m0.665s 0m0.594s 0m0.713s 0m0.570s   avg (of last 4 only): 0.6355
    real 0m0.701s 0m0.557s 0m0.713s 0m0.594s 0m0.797s   avg: 0.6724
    real 0m0.594s 0m0.977s 0m0.558s 0m0.737s 0m0.593s   avg: 0.6918
    real 0m0.893s 0m0.570s 0m0.749s 0m0.606s 0m0.569s   avg: 0.6774
    avg of averages: 0.669s (still excluding the 1.4sec outlier)

observations from strace:

    $ for i in {0..4}; do sync; dropcache; (ls -dLF /; cat /dev/null; ) > /dev/null; strace -o prefetch.$i.glob-inode.cat-bg.strace -tt -s 256 -f -e trace='!rt_sigprocmask,rt_sigaction' \
        bash -c "time . ./bash_completion"; churn_hd; done 2>&1 | ...
    real 0m1.135s 0m0.591s 0m0.590s 0m0.697s 0m0.588s   avg: 0.6165 (excluding the outlier).

Faster I think because the clock doesn't start until bash has loaded, so that gets libc and all that cached.
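For anyone who wants to reproduce the averages quoted with the timings above without perl, here is a minimal sketch in shell and awk. The function name avg_real is hypothetical (it is not part of bash_completion or of the patch), and unlike the perl one-liner it also counts the minutes field.

```shell
# avg_real (hypothetical helper): read `time` output on stdin and print the
# mean of the "real" lines in seconds.
avg_real() {
    awk '/^real/ {
             # field 2 looks like 0m0.605s: split minutes from seconds
             split($2, t, "m")
             sub(/s$/, "", t[2])
             total += t[1] * 60 + t[2]
             n++
         }
         END { if (n) printf "%.4f\n", total / n }'
}

# The last four no-prefetch runs from the first table:
printf 'real\t0m0.605s\nreal\t0m0.893s\nreal\t0m0.797s\nreal\t0m0.677s\n' | avg_real   # prints 0.7430
```

To drop a cold-start outlier the way the tables do, filter the first line out before piping into the function.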
In the cached case, strace bash -c 'time . ./bash_completion' gives:

    real 0m0.299s
    user 0m0.103s
    sys  0m0.064s

cat's read() system calls weren't always returning at the same time as bash's. While bash was chewing on some of the bigger files, or esp. ones that made it stat(2) other things to look for files from some of the scripts sourced (e.g. grub did a lot of stuff), the cat process got ahead of bash and was able to get files into cache so bash's read(2) returned right away when it got there. This is exactly the behaviour I was trying to get from a prefetch process.

IDK if drop_caches isn't clearing inode caches or something, but all the stat(2) system calls from bash in the prefetch subshell happen with no delay. Or maybe almost all the relevant inodes are near each other on disk, and got read together in a block? Anyway, once one is done, it's just boom, sequence of stat calls while the other processes are stuck on something.

I had been doing inode prefetch by running ls -dL on the glob, but that always slowed things down, by maybe 0.08s, compared to just cat prefetch, regardless of whether I forked cat & and exec'ed ls, or vice versa. But I think the main problem was just ls finding all its libraries and stuff, and the startup overhead.

Expanding $glob/ to make bash stat everything to see if it's a directory is a pretty neat idea, IMHO. And it might help in a case where not all the inodes load together. If they did all load together, just exec cat in the background subshell would get it done as part of opening the files. I'm leaving it in on the theory that inodes might be near each other even if they don't all come in in the same disk read, so it's better to prefetch all the stat info before reading file contents.

On cygwin, the extra CPU time from the stat system calls, even in a background process, might be a bad thing. cygwin stat is really slow, last I heard. And so is fork. cygwin users might want to set BASH_COMPLETION_DISABLE_PREFETCH=1, if this patch makes it in.
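The trailing-slash glob trick described above can be seen in isolation with a throwaway directory. This is only a sketch: the directory and file names are hypothetical, and it shows the filtering effect of the expansion, not the disk-cache effect, which needs strace and a cold cache to observe.

```shell
# Hypothetical scratch directory with two plain files and one subdirectory.
demo=$(mktemp -d)
touch "$demo/file1" "$demo/file2"
mkdir "$demo/subdir"

# A trailing slash on the pattern keeps only the entries that are
# directories, so the shell has to examine every candidate the glob
# produced (the stat(2) calls observed under strace in the email).
set -- "$demo"/*/
dir_matches=$#
first_match=${1#"$demo"/}

rm -rf "$demo"
echo "$dir_matches directory match(es): $first_match"   # prints: 1 directory match(es): subdir/
```

Only the side effect of the expansion matters for prefetching, which is why the email passes the result to true and throws it away.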
On a single-core CPU, it could be a very small slowdown. cat is doing a decent amount of system calls. Copying RAM around isn't a big deal: the files are all very small, none big enough for cat to need more than one size=65536 read. Esp. since write(2) to /dev/null just returns without doing anything, there's no extra copying there.

There is a fadvise(1), which would be perfect. (POSIX_FADV_WILLNEED would do exactly what cat does, but without copying the data to userspace, or writing it. It blocks until the data is cached, if the readahead queue fills up.) However, the current implementation is written in perl, so it's not useful to start it for a handful of small files. It probably takes about as much disk IO to start it as bash_completion load-time does total.

I tested having the prefetch thread wait to finish stat(2)ing all the files before running cat, and it performs essentially the same. I think I'll go with this version, since if there is a significant amount of disk activity from the inodes, that + cat's read requests could make an annoying hiccup of disk load that might interfere with something else the user had running. Prob. better to err on the side of being less aggressive with read queue depth. Also, this way forks only once.

    ( # prefetch with
      shopt -s failglob  # don't even run cat if no matches
      exec &>/dev/null
      true $glob/
      exec cat $glob
    ) &
    disown $!

    real 0m1.143s 0m0.605s 0m0.677s 0m0.725s 0m0.546s   avg: 0.63825 (last 4 only)
    real 0m0.689s 0m0.558s 0m0.785s 0m0.606s 0m0.833s   avg: 0.6942
    real 0m0.629s 0m0.773s 0m0.581s 0m0.749s 0m0.594s   avg: 0.6652
    average of averages: 0.666s

This is the version in the patch I'm attaching.

I also tested with (ls -dLF /; cat /dev/null; ) >/dev/null ahead of the timed part, to get some essential stuff cached again.

    prefetch: (...; true $glob/
                    exec cat $glob ) & disown $!

    $ for i in {0..4}; do sync; dropcache; (ls -dLF /; cat /dev/null; ) >/dev/null; time . ./bash_completion; churn_hd; done 2>&1 | ...
    real 0m1.033s 0m0.472s 0m0.581s 0m0.545s 0m0.749s   avg: 0.58675 (last 4)
    real 0m0.533s 0m0.689s 0m0.509s 0m0.665s 0m0.533s   avg: 0.5858
    real 0m0.665s 0m0.521s 0m0.713s 0m0.533s 0m0.617s   avg: 0.6098
    real 0m0.485s 0m0.605s 0m0.545s 0m0.628s 0m0.581s   avg: 0.5688
    avg of averages: 0.588s

no-prefetch with that ls and cat outside the timed part:

    $ for i in {0..4}; do sync; dropcache; (ls -dLF /; cat /dev/null; ) >/dev/null; time BASH_COMPLETION_DISABLE_PREFETCH=1 . ./bash_completion; churn_hd; done 2>&1 | grep '^real' --line-buffered | ...
    real 0m1.195s real 0m0.641s real 0m0.700s real 0m0.665s real 0m0.701s   avg: 0.67675 (only last 4)
    real 0m0.617s real 0m0.706s real 0m0.617s real 0m0.700s real 0m0.641s   avg: 0.6562
    real 0m0.821s real 0m0.629s real 0m0.845s real 0m0.653s real 0m0.917s   avg: 0.773
    real 0m0.677s real 0m0.773s real 0m0.605s real 0m1.169s real 0m0.617s   avg: 0.685 (excluding outlier)
    avg of averages: 0.698s

So, in this case prefetch is saving 0.11s, out of 0.7, or a 15% speedup. Well that's not as good as I thought it was doing, but it's pretty decent.

It's a good thing bash doesn't support inline assembly, or I'd be at this for weeks... :P Seriously though, a lot of people are stuck waiting for bash for a second or so, and speeding it up a bit is worth putting effort into, IMO.

-- 
#define X(x,y) x##y
Peter Cordes ;  e-mail: X(peter@cor , des.ca)

"The gods confound the man who first found out how to distinguish the hours!
 Confound him, too, who in this place set up a sundial, to cut and hack
 my day so wretchedly into small pieces!" -- Plautus, 200 BC
From b703f34d537104a98b60c98f92c263e61077e9ed Mon Sep 17 00:00:00 2001
From: Peter Cordes <pe...@cordes.ca>
Date: Wed, 3 Dec 2014 13:59:31 -0400
Subject: [PATCH 1/3] _longopt: fix parsing --help output that has -- in the description

This fixes parsing of things like grep --help:
  -r, --recursive           like --directories=recurse
using \([^-]\|-[^-]\)* instead of .* at the front of the pattern makes
the greedy match at the front stop at the first --, rather than getting
--directories= from the -r line.

Also move option completion ahead of the logic that checks previous
arg to see if this arg should be limited to a file or directory.  Too
smart for its own good in such a naive function, crossed up by things
like ls --directory or grep --files-with-matches.

The sed in the case $prev block doesn't need this, because it puts
$prev into the pattern, and it's already presumably a valid option.  It
will get the right --option wherever it is in the line containing it.

Also turns out that bash sorts and uniquifies the results itself, so
sort -u isn't needed.

Still not perfect, misses --silent from the help output line:
  -q, --quiet, --silent     suppress all normal output
Could maybe loop over the --matches in sed, now that we have a
sufficiently non-greedy regex to match things in front of --options,
but then you'd need a full-blown sed program with pattern and hold
space... yuck.  Or maybe use awk?  Or hardcode a pattern that can match
up to 3 long options on one line?

This also breaks on commands with weird --help output, like if they
for some reason have --something BEFORE an option name.  You could
start to work around that, with another group like --[^-A-Za-z0-9] to
match a -- that isn't at the start of an option, but that's just
gratuitously unreadable.  Do that for more robustness if anyone ever
turns it into a sed program that loops over --option matches on a
single line.
---
 bash_completion | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/bash_completion b/bash_completion
index 55c9e48661028cee71301fd244c12d516303d437..a1dbb48e8bb69be538cb6ae9bb4485b6fa4ec698 100644
--- a/bash_completion
+++ b/bash_completion
@@ -1777,6 +1777,20 @@ _longopt()
     local cur prev words cword split
     _init_completion -s || return
 
+    # Check for options first: some programs have options like
+    # --directory=recursive that don't take directory args
+    # It's more likely the user knows what they're doing,
+    # for this naive --help parsing function.
+    if [[ "$cur" == -* ]]; then
+        COMPREPLY=( $( compgen -W "$( LC_ALL=C $1 --help 2>&1 | \
+            sed -ne 's/\([^-]\|-[^-]\)*\(--[-A-Za-z0-9]\{1,\}=\{0,1\}\).*/\2/p' )" \
+            -- "$cur" ) )
+        # initial part of that regex matches only up to before the first --,
+        # to avoid tripping on " -r, --recursive like --directory=recursive" in grep --help, for example.
+        [[ $COMPREPLY == *= ]] && compopt -o nospace
+        return 0
+    fi
+
     case "${prev,,}" in
         --help|--usage|--version)
             return 0
@@ -1807,12 +1821,7 @@ _longopt()
 
     $split && return 0
 
-    if [[ "$cur" == -* ]]; then
-        COMPREPLY=( $( compgen -W "$( LC_ALL=C $1 --help 2>&1 | \
-            sed -ne 's/.*\(--[-A-Za-z0-9]\{1,\}=\{0,1\}\).*/\1/p' | sort -u )" \
-            -- "$cur" ) )
-        [[ $COMPREPLY == *= ]] && compopt -o nospace
-    elif [[ "$1" == @(mk|rm)dir ]]; then
+    if [[ "$1" == @(mk|rm)dir ]]; then
         _filedir -d
     else
         _filedir
-- 
2.1.3
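The effect of the regex change in patch 1/3 can be checked outside of bash_completion. The sketch below runs the old and the new sed expression over two lines of help text quoted in the commit message. The variable names are hypothetical, and it assumes GNU sed, because \| inside a basic regular expression is a GNU extension.

```shell
# Two help lines taken from the commit message above.
help_text='  -r, --recursive           like --directories=recurse
  -q, --quiet, --silent     suppress all normal output'

# Old pattern: the greedy .* runs to the LAST "--" on each line.
old_opts=$(printf '%s\n' "$help_text" |
    sed -ne 's/.*\(--[-A-Za-z0-9]\{1,\}=\{0,1\}\).*/\1/p')

# New pattern: the prefix can only consume non-dash characters or a single
# dash followed by a non-dash, so it stops at the FIRST "--".
new_opts=$(printf '%s\n' "$help_text" |
    sed -ne 's/\([^-]\|-[^-]\)*\(--[-A-Za-z0-9]\{1,\}=\{0,1\}\).*/\2/p')

printf 'old:\n%s\nnew:\n%s\n' "$old_opts" "$new_opts"
# old: --directories= and --silent
# new: --recursive and --quiet (still misses --silent, as the message says)
```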
From bede4c22106bdd601718019861ac8f017139069c Mon Sep 17 00:00:00 2001
From: Peter Cordes <pe...@cordes.ca>
Date: Wed, 3 Dec 2014 14:00:58 -0400
Subject: [PATCH 2/3] upstart support for service completion

initctl list works for unprivileged users.

Wasn't sure what file to check to detect that upstart was present, but
/sbin should always be mounted, and upstart itself provides
/sbin/upstart-dbus-bridge, and it's not a conffile in /etc that someone
could move if they wanted to on their local system.  And it's
absolutely not going to have a name conflict with anything from
another package. :)

I think it's important to check that the system is using an upstart
init, so you don't run initctl when completing in a root shell on
another kind of system, and maybe do something like generating system
log messages.
---
 bash_completion | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/bash_completion b/bash_completion
index a1dbb48e8bb69be538cb6ae9bb4485b6fa4ec698..fd3bc41aaa160fdb3437db419d034c2be045ccc4 100644
--- a/bash_completion
+++ b/bash_completion
@@ -1137,6 +1137,10 @@ _services()
     COMPREPLY+=( $( systemctl list-units --full --all 2>/dev/null | \
         awk '$1 ~ /\.service$/ { sub("\\.service$", "", $1); print $1 }' ) )
 
+    if [[ -x /sbin/upstart-dbus-bridge ]]; then
+        COMPREPLY+=( $( initctl list 2>/dev/null | cut -d' ' -f1 ) )
+    fi
+
     COMPREPLY=( $( compgen -W '${COMPREPLY[@]#${sysvdirs[0]}/}' -- "$cur" ) )
 }
 
-- 
2.1.3
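To see what the cut in patch 2/3 extracts without needing an upstart system, the sketch below feeds it canned text. The sample is an assumption: it imitates the general shape of initctl list output (job name first, then an optional instance and the state), and the job names are only illustrative. The sort -u is added here just to make the demo output stable; the patch leaves deduplication to bash.

```shell
# Hypothetical sample shaped like `initctl list` output.
sample_initctl_list='rsyslog start/running, process 512
network-interface (lo) start/running
network-interface (eth0) start/running
ssh stop/waiting'

# Same extraction as the patch: the job name is the first space-separated field.
job_names=$(printf '%s\n' "$sample_initctl_list" | cut -d' ' -f1 | LC_ALL=C sort -u)

printf '%s\n' "$job_names"
# prints network-interface, rsyslog and ssh, one per line
```

Jobs with several instances show up once per instance in the raw list, which is harmless for completion because duplicate candidates collapse.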
From ba68b737e6ccf0be3d8a5ab729eb3ab5e04fd2c1 Mon Sep 17 00:00:00 2001
From: Peter Cordes <pe...@cordes.ca>
Date: Wed, 3 Dec 2014 14:02:38 -0400
Subject: [PATCH 3/3] speed up loading the compat dir with disk prefetch

Fork off a prefetch thread to make sure the HD isn't sitting idle
while there's still data we're going to need.

tail(1) might spend less CPU copying stuff around in RAM (it would
make fewer system calls writing /dev/null), but POSIX tail only takes
one arg.  There's a fadvise(1) which would be perfect if it was
standard, and not written in perl!

I'm seeing a moderate speedup for this change, about 15% on Linux 3.13
with a magnetic HD on an idle system, after a
  echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null
And no slowdown with hot caches. (dual core CPU)

Other changes:

Had to move the necessary stuff up near the top of the file.

Was able to greatly simplify the loop over BASH_COMPLETION_COMPAT_DIR
by using the glob in the first place, instead of ls and then filtering.

Took out the check for [[ -r $i ]] before sourcing.  If you have files
in /etc/bash_completion.d that aren't readable, you might not even
notice if bash_completion silently ignores them.  It's not like
anything else uses the directory, so don't be too quiet when there is
a problem.

I've even seen packages put completions in subdirectories (e.g.
unison).  Could change from -f to -e to get warnings for that.  A
package could legitimately have a helper function or something in a
subdir, though.
---
 bash_completion | 48 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/bash_completion b/bash_completion
index fd3bc41aaa160fdb3437db419d034c2be045ccc4..f2b7db58877c6800fe4e9a46d6d84dcab80ea013 100644
--- a/bash_completion
+++ b/bash_completion
@@ -47,9 +47,41 @@ readonly BASH_COMPLETION_COMPAT_DIR
 #
 _blacklist_glob='@(acroread.sh)'
 
+# Glob for matching various backup files.
+#
+_backup_glob='@(#*#|*@(~|.@(bak|orig|rej|swp|dpkg*|rpm@(orig|new|save))))'
+
 # Turn on extended globbing and programmable completion
 shopt -s extglob progcomp
 
+# source (or prefetch from disk) compat completion directory definitions
+_load_compat_dir()
+{
+    [[ -d $BASH_COMPLETION_COMPAT_DIR ]] || return
+    local i glob="$BASH_COMPLETION_COMPAT_DIR/!($_backup_glob|Makefile*|$_blacklist_glob)"
+
+    if [[ $1 == prefetch ]]; then
+        if [[ ! $BASH_COMPLETION_DISABLE_PREFETCH ]]; then
+            ( # fork a background subshell to let main continue ASAP
+                exec &>/dev/null
+                true $glob/  # inodes.  expanding this glob will stat(2)
+                exec cat $glob  # contents
+            ) &
+            disown $!
+        fi
+    else
+        for i in $glob; do
+            # If there are unreadable files, user probably wants to know,
+            # so don't check -r
+            [[ -f $i ]] && . "$i"
+        done
+    fi
+}
+# called again near the end of this file, and then unset
+_load_compat_dir prefetch
+
+
+
 # A lot of the following one-liners were taken directly from the
 # completion examples provided with the bash 2.04 source distribution
 
@@ -1105,10 +1137,6 @@ _gids()
     fi
 }
 
-# Glob for matching various backup files.
-#
-_backup_glob='@(#*#|*@(~|.@(bak|orig|rej|swp|dpkg*|rpm@(orig|new|save))))'
-
 # Complete on xinetd services
 #
 _xinetd_services()
@@ -1999,16 +2027,8 @@ _xfunc()
     "$@"
 }
 
-# source compat completion directory definitions
-if [[ -d $BASH_COMPLETION_COMPAT_DIR && -r $BASH_COMPLETION_COMPAT_DIR && \
-    -x $BASH_COMPLETION_COMPAT_DIR ]]; then
-    for i in $(LC_ALL=C command ls "$BASH_COMPLETION_COMPAT_DIR"); do
-        i=$BASH_COMPLETION_COMPAT_DIR/$i
-        [[ ${i##*/} != @($_backup_glob|Makefile*|$_blacklist_glob) \
-            && -f $i && -r $i ]] && . "$i"
-    done
-fi
-unset i _blacklist_glob
+_load_compat_dir source
+unset _blacklist_glob _load_compat_dir
 
 # source user completion file
 [[ ${BASH_SOURCE[0]} != ~/.bash_completion && -r ~/.bash_completion ]] \
-- 
2.1.3
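The two-phase pattern of patch 3/3 (start a background read early, source the files later) can be tried on a scratch directory. This is a simplified sketch, not the patch itself: load_dir, demo_dir and the demo_* variables are hypothetical names, and the extglob filter and the $glob/ inode step are left out to keep it short.

```shell
# Hypothetical stand-in for /etc/bash_completion.d and its contents.
demo_dir=$(mktemp -d)
printf 'demo_a=1\n' > "$demo_dir/a"
printf 'demo_b=2\n' > "$demo_dir/b"

# load_dir DIR prefetch|source  (hypothetical helper)
load_dir() {
    local f
    if [ "$2" = prefetch ]; then
        (   # background subshell so the caller continues right away
            exec >/dev/null 2>&1 </dev/null
            exec cat "$1"/*      # reading the files pulls them into the page cache
        ) &
        # disown is a bash builtin; ignore its absence in other shells
        disown $! 2>/dev/null || true
    else
        for f in "$1"/*; do
            if [ -f "$f" ]; then . "$f"; fi
        done
    fi
}

load_dir "$demo_dir" prefetch   # start warming the cache early
# ... the rest of a long startup script would run here ...
load_dir "$demo_dir" source     # by now the reads are hopefully cache hits

# Removing the directory may race with the background cat; that is harmless
# here because its output and errors are discarded.
rm -rf "$demo_dir"
echo "demo_a=$demo_a demo_b=$demo_b"   # prints demo_a=1 demo_b=2
```

Whether the prefetch helps depends on how much work sits between the two calls; on tiny files with a hot cache, as in this demo, there is nothing to gain.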