printf builtin retains timezone from un-set TZ environment variable
Configuration Information [Automatically generated, do not change]: Machine: x86_64 OS: darwin17.7.0 Compiler: clang Compilation CFLAGS: -g -O2 -Wno-parentheses -Wno-format-security uname output: Darwin f638f97d 18.6.0 Darwin Kernel Version 18.6.0: Thu Apr 25 23:16:27 PD$ Machine Type: x86_64-apple-darwin17.7.0 Bash Version: 5.0 Patch Level: 7 Release Status: release *Description:* I've noticed that a timezone explicitly set in the TZ environment variable can get cached (if any date-using command/builtin is run) and then show up (at least) in dates produced by the printf builtin's "%(...)T" formats even after the TZ variable is unset. I played around with this for a bit and identified two ways to get the timezone to re-set: - Run an external command. I have yet to find an external command that didn't correct it, so I suspect this behavior is intrinsic to the API calls made when running an external command. It's also possible I'm too unimaginative to find an external command that *doesn't* invoke date APIs :). - Set the TZ variable inline and run any builtin (i.e., `TZ=:/etc/localtime :`). *Repeat-By:* tzcache.sh echo $BASH_VERSION export TZ=Antarctica/South_Pole printf "TZ: $TZ\n\t%(%+)T\n" unset TZ # still prints the now-unset timezone printf "TZ: $TZ\n\t%(%+)T\n" # to reset it # run any external command # or any builtin with an inline TZ setting TZ=:/etc/localtime : printf "TZ: $TZ\n\t%(%+)T\n" $ env -i $(which bash) --norc --noprofile tzcache.sh 5.0.7(1)-release TZ: Antarctica/South_Pole Sun Jul 14 05:21:49 NZST 2019 TZ: Sun Jul 14 05:21:49 NZST 2019 TZ: Sat Jul 13 12:21:49 CDT 2019 Cheers, Travis
Re: Combination of "eval set -- ..." and $() command substitution is slow
Dear Chet Ramey, thanks for the explanations, I assume you were able to reproduce the issue. First a warning that I forgot to include in the previous email: the quoted excerpts of the gprof tables correspond to a slightly modified test script in which I tried to magnify the overhead in order to more easily spot performance differences. Therefore, the absolute values displayed, e.g. number of function calls, are exaggerated. The relative values (and hopefully your conclusions) remain largely the same, though. Sorry for any potential confusion. A few short comments prompted by your reply (probably not very useful, feel free to ignore them). Chet Ramey: >> - $() seems generally slightly slower than ``, but becomes pathologically >> so when preceded with "eval set -- ...". > > It is slightly slower -- POSIX requires that the shell parse the contents > of $(...) to determine that it's a valid script as part of finding the > closing `)'. The rules for finding the closing "`" don't have that > requirement. > >> - "eval set -- ..." itself doesn't seem slow at all, but obviously it has >> side-effects not captured by the "time" measurement tool. > > What happens is you end up with a 4900-character command string that you > have to parse multiple times. But that's not the worst of it. Since this statement ought to run exactly once, naïvely I would expect that by "multiple times" you really mean at most "twice": once for the top-level script, another time "inside" the eval "sub-script". > The gprof output provides a clue. > > >> case 1 1 0 (pathological): >>% cumulative self self total >> time seconds secondscalls us/call us/call name >> 38.89 0.21 0.2128890 7.27 7.27 set_line_mbstate > > set_line_mbstate() runs through each command line before parsing, creating > a bitmap that indicates whether each element is a single-byte character or > part of a multi-byte character. The scanner uses this to determine whether > a shell metacharacter should act as a delimiter or get skipped over as part > of a multibyte character. For a single run with args `1 1 0', it gets > called around 7300 times, with around 2400 of them for the 4900-character > string with all the arguments. > > When you're in a multibyte locale (en_US.UTF-8 is one such), each one of > those characters requires a call to mbrlen/mbrtowc. So that ends up being > 2400 * 4900 calls to mbrlen. I am indeed using an UTF-8 locale, but I tested also with export LC_ALL=C and the behaviour did not change, I should have mentioned that. Also, it occurs to me that since all shell metacharacters are surely 7-bit ASCII, and all UTF-8 code units are strictly and deliberately non-ASCII 8-bit bytes, at least in the case of UTF-8 it may not be necessary to construct such bitmap: there cannot be a shell-metacharacter byte part of a UTF-8 sequence representing something else, so there's no need for the distinction. Of course, this observation applies only to such specially crafted multibyte encodings as UTF-8 (which nonetheless is surely by far the most common). Furthermore, it does not explain the issue at hand. > There is something happening here -- there's no way there should be that > many calls to set_line_mbstate(), Notice that there are almost as many calls (only 2 fewer) in case "0 1 0" (in which eval is not used) yet in that case the performance is not harmed. Quoting from the previous email: case 0 1 0: 2.08 0.32 0.012 0.00 0.00 set_line_mbstate case 1 1 0 (pathological): 38.89 0.21 0.2128890 7.27 7.27 set_line_mbstate OTOH, all other test cases show less than 100 calls (these numbers correspond to a profiling run of the given script, unmodified): $ grep -m1 set_line_mbstate gmon*.table gmon.out.0-0-X.table: 0.00 0.00 0.00 88 0.00 0.00 set_line_mbstate gmon.out.0-1-0.table: 0.00 0.13 0.00 7288 0.00 0.00 set_line_mbstate gmon.out.0-1-1.table: 0.00 0.18 0.00 88 0.00 0.00 set_line_mbstate gmon.out.1-0-X.table: 0.00 0.01 0.00 90 0.00 0.00 set_line_mbstate gmon.out.1-1-0.table: 37.50 0.06 0.06 7290 8.23 8.23 set_line_mbstate gmon.out.1-1-1.table: 0.00 0.07 0.00 90 0.00 0.00 set_line_mbstate > even when you have to save and restore > the input line because you have to parse the contents of $(). There must > be some combination of the effect of `eval' on the line bitmap and the > long string. I'll see what I can figure out. > > Chet Cheers.
Re: background, pipeline, hang
On Fri, Jul 12, 2019 at 02:26:52PM -0400, Chet Ramey wrote: > Date: Fri, 12 Jul 2019 14:26:52 -0400 > From: Chet Ramey > To: Mike Jonkmans , bug-bash@gnu.org, > b...@packages.debian.org > Cc: chet.ra...@case.edu > Subject: Re: background, pipeline, hang > Reply-To: chet.ra...@case.edu > > On 7/12/19 10:00 AM, Mike Jonkmans wrote: > > > Bash Version: 4.4 > > Patch Level: 20 > > Release Status: release > > > > Description: > > A pipeline, with two or more pipes, hangs when a background job is > > running. > > Only when the background job finishes, the shell prompts again. > > Without the background job, the shell prompts immediately. > > > > Repeat-By: > > : | : | : ## immediately finishes > > sleep 10 & > > : | : | : ## this hangs (^C, ^\, ^Z do not help) until the sleep > > finishes > > Sorry, I can't reproduce this. > > Chet Ramey, UTech, CWRUc...@case.eduhttp://tiswww.cwru.edu/~chet/ Thanks for looking, though. With the tip of Greg Wooledge (use --norc --noprofile), from another post, i was able to research this further. This runs for two seconds: bash --norc --noprofile -ic "trap 'tput hs' DEBUG; sleep 2 & :|:|:" While this immediately finishes: bash --norc --noprofile -ic "trap ':' DEBUG; sleep 2 & :|:|:" I also found something in the (ubuntu)changelog, that might be related. Quotes from "apt-get changelog bash": bash (4.3-6) unstable; urgency=medium [...] * Apply upstream patches 001 - 008: [...] - A change to save state while running the DEBUG trap caused pipelines to hang on systems which need process group synchronization while building pipelines. It seems to have to do with the duration of the command executed in the DEBUG trap. As this also exhibits an unexpected, longer duration: bash --norc --noprofile -ic "trap 'sleep 1; echo $SECONDS' DEBUG; sleep 5 & :|:|:"; echo $SECONDS Compared to: bash --norc --noprofile -ic "trap 'echo $SECONDS' DEBUG; sleep 5 & :|:|:"; echo $SECONDS So this combination is problematic: - trap on DEBUG where the command takes some time - something running in the background - a pipeline with two or more pipes Hope you can make something of this. Regards, Mike
Re: background, pipeline, hang
On Fri, Jul 12, 2019 at 10:14:19AM -0400, Greg Wooledge wrote: > Date: Fri, 12 Jul 2019 10:14:19 -0400 > From: Greg Wooledge > To: Mike Jonkmans > Cc: bug-bash@gnu.org, b...@packages.debian.org > Subject: Re: background, pipeline, hang > > On Fri, Jul 12, 2019 at 04:00:57PM +0200, Mike Jonkmans wrote: > > uname output: Linux sint 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 > > 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux > > (Why are you reporting problems on an Ubuntu system on a debian.org > mailing list?) Valid point. I used bashbug. The email addresses were already filled in. Should i leave out b...@packages.debian.org ? > > Repeat-By: > > : | : | : ## immediately finishes > > sleep 10 & > > : | : | : ## this hangs (^C, ^\, ^Z do not help) until the sleep > > finishes > > I cannot reproduce this. I tried bash 4.4 on Debian 10 (self-built), > the standard bash 4.4 package (version 4.4-5) on Debian 9, and the standard > bash 5.0 package (version 5.0-4) on Debian 10. Thanks for trying. > You might want to try running a new instance of bash with --norc --noprofile > to remove any dot file settings from the picture. This is a very good tip. It might be useful to have this mentioned somewhere in bashbug. I have narrowed it down thanks to this. Will use another post for that. Thanks.