Package: grep
Version: 2.6.3-3
Severity: normal
*** FILE
//my comments on lines starting with //

//MAINTAINER(S) MAY WISH TO INCREASE BUG PRIORITY based on bug scope
//and impact (it may cause things to quite unexpectedly fail or
//consume excessive resources and time, where such was not the case
//before)

//bug may - or may not - be related to (or "same"?) as bug 503658

//under at least certain not-too-unusual circumstances, grep RE
//performance is abysmal, e.g.:
$ time grep '^\(.\)\(.\).\2\1$' /usr/share/dict/words | wc -l
16

real    1m7.503s
user    1m7.432s
sys     0m0.012s
$
//top(1) also shows us excessive CPU consumption:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3193 mpaoli    20   0  7756 1136  692 R 97.8  0.2   0:24.09 grep
//I did also use strace(1) - didn't seem to show anything particularly
//unusual - seems the bug consumes excess CPU (is quite CPU bound),
//but no obvious excessive system calls or unusual delays on any
//system calls noted in strace(1) output

//however, when we add the -i option the performance for the above
//becomes quite reasonable:
$ time grep -i '^\(.\)\(.\).\2\1$' /usr/share/dict/words | wc -l
19

real    0m0.582s
user    0m0.580s
sys     0m0.004s
$
//likewise performance is fine if we use LC_ALL=C
$ time LC_ALL=C grep '^\(.\)\(.\).\2\1$' /usr/share/dict/words | wc -l
16

real    0m0.390s
user    0m0.392s
sys     0m0.000s
$
//bug is also present if we explicitly use LC_ALL=en_US.UTF-8
$ time LC_ALL=en_US.UTF-8 grep '^\(.\)\(.\).\2\1$' /usr/share/dict/words | wc -l
16

real    1m5.347s
user    1m5.320s
sys     0m0.008s
$
//bug appears to NOT be present in other common BRE utilities, e.g.
//sed(1), ex(1), ed(1):
$ time sed -ne '/^\(.\)\(.\).\2\1$/p' /usr/share/dict/words | wc -l
16

real    0m0.267s
user    0m0.256s
sys     0m0.012s
$ time ex /usr/share/dict/words << \__EOF__ | wc -l
> g/^\(.\)\(.\).\2\1$/p
> q
> __EOF__
16

real    0m1.004s
user    0m0.920s
sys     0m0.020s
$ time ed /usr/share/dict/words << \__EOF__ | wc -l
> g/^\(.\)\(.\).\2\1$/p
> q
> __EOF__
931708
16

real    0m0.300s
user    0m0.292s
sys     0m0.008s
$
//for the examples above, most any relatively similar file could be used
//instead of /usr/share/dict/words, I specifically used (in case it
//matters):
$ dpkg -S /usr/share/dict/words
diversion by dictionaries-common from: /usr/share/dict/words
diversion by dictionaries-common to: /usr/share/dict/words.pre-dictionaries-common
wamerican, dictionaries-common: /usr/share/dict/words
$ dpkg -l dictionaries-common | tail -n 1
ii  dictionaries-common  1.5.17  Common utilities for spelling dictionary tools
$
//and locale information (unless/except where explicitly shown set
//differently above)
$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
$
//bug is NOT present in old stable:
$ cat /etc/debian_version
5.0.9
$ time grep '^\(.\)\(.\).\2\1$' /usr/share/dict/words | wc -l
16

real    0m0.925s
user    0m0.896s
sys     0m0.000s
$
//even if we explicitly set LC_ALL=en_US.UTF-8, bug still not present in
//old stable:
$ time LC_ALL=en_US.UTF-8 grep '^\(.\)\(.\).\2\1$' /usr/share/dict/words | wc -l
16

real    0m0.825s
user    0m0.808s
sys     0m0.000s
$
//also bug not present in old stable with en_US.utf8
$ locale -a | fgrep -i en_us.utf
en_US.utf8
$ time LC_ALL=en_US.utf8 grep '^\(.\)\(.\).\2\1$' /usr/share/dict/words | wc -l
16

real    0m0.814s
user    0m0.812s
sys     0m0.000s
$

-- System Information:
Debian Release: 6.0.3
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-5-amd64 (SMP w/1 CPU core)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages grep depends on:
ii  dpkg          1.15.8.11       Debian package management system
ii  install-info  4.13a.dfsg.1-6  Manage installed documentation in
ii  libc6         2.11.2-10       Embedded GNU C Library: Shared lib

grep recommends no packages.

Versions of packages grep suggests:
ii  libpcre3      8.02-1.1        Perl 5 Compatible Regular Expressi
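
//for convenience, the locale comparison shown in this report could be
//scripted roughly as follows; this is only a sketch under assumptions:
//the names PATTERN, WORDS, count_matches and time_locale are made up
//for illustration, and date +%s (whole seconds) stands in for time,
//which is coarse but enough to tell about a minute from under a second

```shell
#!/bin/sh
# Sketch: compare match counts and wall-clock time for the report's BRE
# under several locales. All names here are illustrative only.

PATTERN='^\(.\)\(.\).\2\1$'

# Print the number of lines in file $2 that match PATTERN under locale $1.
count_matches() {
    LC_ALL="$1" grep -c -- "$PATTERN" "$2"
}

# Print "<locale> <matches> <elapsed>s" for locale $1 and file $2.
# Elapsed time has one-second resolution (date +%s).
time_locale() {
    start=$(date +%s)
    n=$(count_matches "$1" "$2")
    end=$(date +%s)
    echo "$1 $n $((end - start))s"
}

# Run the comparison only when WORDS is set and names a readable file.
if [ -n "${WORDS:-}" ] && [ -r "$WORDS" ]; then
    for loc in C en_US.UTF-8; do
        time_locale "$loc" "$WORDS"
    done
fi
```

//e.g. saved under some name such as repro.sh (hypothetical) and run as:
//WORDS=/usr/share/dict/words sh repro.sh
//if the bug is present, the en_US.UTF-8 line should report a far larger
//elapsed time than the C line while the match counts agree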