document that read built-in can't return zero-length string in the middle of input

2024-01-10 Thread ilya Basin
Hi,
I needed to read 16 bytes from a binary file and tried to replace a hexdump
call with the read built-in. I expected that with "-N1", if a NUL character is
encountered, bash would assign an empty string. However, there's no indication
that a NUL character was there; bash simply assigns the next non-NUL character
to the variable.
Example:

$ printf 'a\0c' | { LC_ALL=C; read -r -N1 a; read -r -N1 b; read -r -N1 c; 
echo "a=$a"; echo "b=$b"; echo "c=$c"; }

Expected:
a=a
b=
c=c

Actual:
a=a
b=c
c=

That's questionable, but fine with me. Yet I couldn't find this in the man 
page. Can we document it?
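
A possible workaround, sketched here under the assumption that NUL can be
used as the read delimiter: with -d '' and -n1, a read that succeeds but
leaves the variable empty should mean a NUL byte was consumed. The helper
name dump_bytes is made up for illustration, and only ASCII input is
considered.

```shell
# Sketch: read stdin one byte at a time, reporting NUL bytes explicitly.
# dump_bytes is a hypothetical helper name, not something bash provides.
dump_bytes() {
    local LC_ALL=C ch
    # -d '' makes NUL the delimiter; -n1 stops after one character.
    # A successful read that leaves ch empty means the delimiter (NUL) was read.
    while IFS= read -r -d '' -n1 ch; do
        if [[ -z $ch ]]; then
            printf '00 '
        else
            # A leading quote makes printf emit the character's numeric value.
            printf '%02x ' "'$ch"
        fi
    done
    printf '\n'
}

printf 'a\0c' | dump_bytes
```

For the 'a\0c' input above this should print "61 00 63", i.e. the NUL shows
up in its original position instead of being dropped.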



Re: bash-4-2 issue

2024-01-10 Thread Grisha Levit
On Wed, Jan 10, 2024 at 5:33 PM Grisha Levit wrote:
> I'm not sure this is fixed. In all versions, including 4.2 [...]
>
> $ bash -m -c 'trap /usr/bin/true DEBUG; :|:'
> bash: child setpgid (49581 to 49579): Operation not permitted

Correction: versions prior to 4.3 did not respect the -m flag at invocation,
so the command should be:

bash -c 'set -m; trap /usr/bin/true DEBUG; :|:'
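
To compare several builds, the reproducer can be wrapped in a small check.
This is only a sketch: has_setpgid_error is a made-up helper name, and it
assumes /usr/bin/true exists and that the message contains "child setpgid"
as in the output quoted above.

```shell
# Sketch: report whether a given bash binary prints the setpgid error.
# has_setpgid_error is a hypothetical helper; the command is the one above.
has_setpgid_error() {
    local shell=${1:-bash}
    # Merge stderr into stdout so grep can see the shell's diagnostic.
    "$shell" -c 'set -m; trap /usr/bin/true DEBUG; :|:' 2>&1 |
        grep -q 'child setpgid'
}

if has_setpgid_error "${1:-bash}"; then
    echo "error reproduced"
else
    echo "error not seen"
fi
```

Passing the path of a freshly built binary as the first argument would let
the same check serve as a pass/fail step when testing individual commits.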



Re: bash-4-2 issue

2024-01-10 Thread Grisha Levit
On Mon, Jan 8, 2024 at 7:04 AM Sam Kappen via Bug reports for the GNU
Bourne Again SHell wrote:
> We see that bash throws the "Operation not permitted" error when doing
> chained pipe operation
> along with a debug trap.
>
> We set a debug trap here "my_debug" to save the terminal commands entered.
> The GNU bash, version used is  4.2.
>
> root@freescale-p2020ds:~/dir#  ls -l | grep a | grep b | grep c
> -sh: child setpgid (4238 to 4232): Operation not permitted
>
>
> root@freescale-p2020ds:~/dir# trap
> trap -- '' TSTP
> trap -- '' TTIN
> trap -- '' TTOU
> trap -- 'my_debug' DEBUG
> root@freescale-p2020ds:~/dir#
>
> Platform: Linux 3.10 kernel on PPC target.
>
> It seems setpgid is failing because the process group of the pipeline does
> not exist at that time.
>
> This issue is not seen on bash version 4.4.

I'm not sure this is fixed. In all versions, including 4.2, 4.4, 5.2, and the
current devel version, I see what seems to be the same error, triggered by a
pipeline when job control is enabled and the DEBUG trap executes an external
command.

$ bash -m -c 'trap /usr/bin/true DEBUG; :|:'
bash: child setpgid (49581 to 49579): Operation not permitted

And an accompanying warning if Bash is built with -DDEBUG:

DEBUG warning: delete_job (0 pgrp 49579): js.c_reaped (-1) < 0
ndel = 2 js.j_ndead = 0

> I am looking to figure out the particular fix that fixed this issue from
> the above commit and to backport to bash4-2 version.

In general you want to bisect using commits in the devel branch:
https://git.savannah.gnu.org/cgit/bash.git/log/?h=devel



Re: Bash 5.2.0: Memory leak with $(

2024-01-10 Thread Grisha Levit
On Mon, Jan 8, 2024, 12:26  wrote:

> Do any of the other six patches in that report also apply to Bash 5.2?
>

Yes, all but the one for the `kv' builtin, which did not exist yet. See
attached.

From 711ab85262884f2b91f09eceb9aefd0e2426ce67 Mon Sep 17 00:00:00 2001
From: Grisha Levit 
Date: Sat, 3 Jun 2023 16:51:26 -0400
Subject: [PATCH] various leaks

Found mostly by normal usage running a no-bash-malloc build with clang's
LeakSanitizer enabled. So far seems to provide very accurate results.

* arrayfunc.c
- quote_compound_array_word: make sure to free VALUE
- bind_assoc_var_internal: if assigning to a dynamic variable, make sure
  to free the key (usually assoc_insert would do it)

* bashline.c
- bash_command_name_stat_hook: free original *NAME if we are going to
  change what it points to (what the callers seem to expect)

* builtins/evalstring.c
- parse_and_execute: make sure to dispose of the parsed command
  resulting from a failed function import attempt
- open_redir_file: if we did not get a pointer to pass back the expanded
  filename, make sure to free the name

* examples/loadables/stat.c
- loadstat: bind_assoc_variable does not free its VALUE argument so make
  sure to do it

* subst.c
- param_expand: free temp1 value for codepaths that don't do it
---
 arrayfunc.c               | 6 +++++-
 bashline.c                | 1 +
 builtins/evalstring.c     | 4 ++++
 examples/loadables/stat.c | 1 +
 subst.c                   | 2 ++
 5 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/arrayfunc.c b/arrayfunc.c
index 2c05d15b..8ba64084 100644
--- a/arrayfunc.c
+++ b/arrayfunc.c
@@ -208,7 +208,10 @@ bind_assoc_var_internal (entry, hash, key, value, flags)
   newval = make_array_variable_value (entry, 0, key, value, flags);
 
   if (entry->assign_func)
-    (*entry->assign_func) (entry, newval, 0, key);
+    {
+      (*entry->assign_func) (entry, newval, 0, key);
+      FREE (key);
+    }
   else
     assoc_insert (hash, key, newval);
 
@@ -985,6 +988,7 @@ quote_compound_array_word (w, type)
   if (t != w+ind)
     free (t);
   strcpy (nword + i, value);
+  free (value);
 
   return nword;
 }
diff --git a/bashline.c b/bashline.c
index c85b05b6..bd7548cc 100644
--- a/bashline.c
+++ b/bashline.c
@@ -1928,6 +1928,7 @@ bash_command_name_stat_hook (name)
   result = search_for_command (cname, 0);
   if (result)
     {
+      FREE (*name);
       *name = result;
       return 1;
     }
diff --git a/builtins/evalstring.c b/builtins/evalstring.c
index df3dd68e..20c6a4a7 100644
--- a/builtins/evalstring.c
+++ b/builtins/evalstring.c
@@ -461,6 +461,8 @@ parse_and_execute (string, from_file, flags)
 		  should_jump_to_top_level = 0;
 		  last_result = last_command_exit_value = EX_BADUSAGE;
 		  set_pipestatus_from_exit (last_command_exit_value);
+		  dispose_command(command);
+		  global_command = (COMMAND *)NULL;
 		  reset_parser ();
 		  break;
 		}
@@ -762,6 +764,8 @@ open_redir_file (r, fnp)
 
   if (fnp)
     *fnp = fn;
+  else
+    free (fn);
   return fd;
 }
 
diff --git a/examples/loadables/stat.c b/examples/loadables/stat.c
index 1e60e7b6..ed5c9764 100644
--- a/examples/loadables/stat.c
+++ b/examples/loadables/stat.c
@@ -349,6 +349,7 @@ loadstat (vname, var, fname, flags, fmt, sp)
       key = savestring (arraysubs[i]);
       value = statval (i, fname, flags, fmt, sp);
       v = bind_assoc_variable (var, vname, key, value, ASS_FORCE);
+      free (value);
     }
   return 0;
 }
diff --git a/subst.c b/subst.c
index 1ac6eb2d..ff0602da 100644
--- a/subst.c
+++ b/subst.c
@@ -10727,6 +10727,7 @@ comsub:
 	{
 	  chk_atstar (temp, quoted, pflags, quoted_dollar_at_p, contains_dollar_at);
 	  tdesc = parameter_brace_expand_word (temp, SPECIAL_VAR (temp, 0), quoted, pflags, 0);
+	  free (temp1);
 	  if (tdesc == &expand_wdesc_error || tdesc == &expand_wdesc_fatal)
 		return (tdesc);
 	  ret = tdesc;
@@ -10739,6 +10740,7 @@ comsub:
 	{
 	  set_exit_status (EXECUTION_FAILURE);
 	  report_error (_("%s: invalid variable name for name reference"), temp);
+	  free (temp1);
 	  return (&expand_wdesc_error);	/* XXX */
 	}
 	  else
-- 
2.43.0



Re: completion very slow with gigantic list

2024-01-10 Thread Eric Wong
"Dale R. Worley" wrote:
> A priori, it isn't surprising.  But the question becomes "What
> algorithmic improvement to bash would make this work faster?" and then
> "Who will write this code?"

I'll try to take a look at it in a few months if I run out of
things to do and nobody beats me to it.  I've already got a lot
on my plate and hit this on my way to other things.



Re: Bash 5.2.21 segfaults when I feed it garbage

2024-01-10 Thread Grisha Levit
On Mon, Jan 8, 2024 at 4:41 PM Chet Ramey wrote:
> I think there's a simpler
> way to fix it in parse_compound_assignment and parse_string_to_word_list
> directly, and that change will be in the next devel branch push.

Rewriting the original report as:

bash <<<'((X=([))'

even after the last fix, there's still a similar issue with input like:

bash <<<'((X=([))]'

=================================================================
ERROR: AddressSanitizer: heap-use-after-free on address 0x000107f00cbc
at pc 0x000104b083ec bp 0x00016b3506e0 sp 0x00016b3506d8
READ of size 4 at 0x000107f00cbc thread T0
#0 0x104b083e8 in shell_getc parse.y:2712
#1 0x104b01908 in read_token parse.y:3516
#2 0x104ae47c0 in yylex parse.y:2995

0x000107f00cbc is located 60 bytes inside of 64-byte region
[0x000107f00c80,0x000107f00cc0)
freed by thread T0 here:
#0 0x105e1f380 in wrap_free+0x98
#1 0x104aec0f0 in pop_string parse.y:2042
#2 0x104b095d0 in shell_getc parse.y:2753
#3 0x104b15030 in read_token_word parse.y:5604
#4 0x104b047ec in read_token parse.y:3712
#5 0x104b28afc in parse_compound_assignment parse.y:6971
#6 0x104b13a28 in read_token_word parse.y:5543
#7 0x104b047ec in read_token parse.y:3712
#8 0x104ae47c0 in yylex parse.y:2995

previously allocated by thread T0 here:
#0 0x105e1f244 in wrap_malloc+0x94
#1 0x104ec5b40 in xmalloc xmalloc.c:107
#2 0x104aea90c in push_string parse.y:1981
#3 0x104b0cf58 in parse_dparen parse.y:4837
#4 0x104b02d50 in read_token parse.y:3635
#5 0x104ae47c0 in yylex parse.y:2995

SUMMARY: AddressSanitizer: heap-use-after-free parse.y:2712 in shell_getc



Re: completion very slow with gigantic list

2024-01-10 Thread Dale R. Worley
Eric Wong writes:
> Hi, I noticed bash struggles with gigantic completion lists
> (100k items of ~70 chars each)

A priori, it isn't surprising.  But the question becomes "What
algorithmic improvement to bash would make this work faster?" and then
"Who will write this code?"

Dale



[PATCH 8/8] doc/bash.1: fix erroneous escape sequences

2024-01-10 Thread G. Branden Robinson
troff:doc/bash.1:10090: warning: ignoring escape character before '+'
troff:doc/bash.1:11896: warning: ignoring escape character before 'P'
---
 doc/bash.1 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/bash.1 b/doc/bash.1
index 35c076f0..9d44a6d4 100644
--- a/doc/bash.1
+++ b/doc/bash.1
@@ -10087,7 +10087,7 @@ .SH "SHELL BUILTIN COMMANDS"
 .TP
 \fBset \-o\fP
 .TP
-\fBset \+o\fP
+\fBset +o\fP
 .PD
 Without options, display the name and value of each shell variable
 in a format that can be reused as input
@@ -11893,7 +11893,7 @@ .SH "SHELL COMPATIBILITY MODE"
 arithmetic expressions used as indexed array subscripts can be
 expanded more than once
 .IP \(bu
-\fBtest \-v\fP, when given an argument of \fBA[@]\fP, where \fBA\P is
+\fBtest \-v\fP, when given an argument of \fBA[@]\fP, where \fBA\fP is
 an existing associative array, will return true if the array has any set
 elements.
 Bash-5.2 will look for and report on a key named \fB@\fP.
-- 
2.30.2


signature.asc
Description: PGP signature


[PATCH 7/8] doc/bash.1: make code displays more portable

2024-01-10 Thread G. Branden Robinson
1.  Use `EX`/`EE` extension.

groff_man(7):
     .EX
     .EE    Begin and end example.  After .EX, filling is disabled and a
            constant‐width (monospaced) font is selected.  Calling .EE
            enables filling and restores the previous font.

            .EX and .EE are extensions introduced in Ninth Edition Unix.
            Documenter’s Workbench, Heirloom Doctools, and Plan 9
            troffs, and mandoc (since 1.12.2) also support them.
            Solaris troff does not.  See subsection “Use of extensions”
            in groff_man_style(7).

If the man(7) implementation doesn't support these, no harm is
done--only groff complains about undefined macros.  Calling one is a
no-op.  Consequently, invoke `nf` and `fi` within `EX` and `EE`, which
is redundant on systems that support the latter, but still gets the
desired effect elsewhere--but do this only for multi-line code displays;
it's not necessary for a one-liner.  Using the macros also avoids the
`CW` font portability problem.

2.  Use a relative inset for the "_completion_loader" code display, so
it looks "displayed" even if the font family can't be changed.

3.  Adapt the "_completion_loader" code display to varying line lengths
in the output device.  Indent function body by two spaces, not four.
Use dirty *roff tricks to make the long lines wrap if the device is
narrow (less than 80 ens width, which likely means a user-configured
terminal or AT&T nroff, where the line length defaults to 65 ens).
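
To try the display technique in isolation, here is a rough sketch that
emits a minimal page using EX/EE around nf/fi together with the
width-dependent break taken from the patch below. The helper name
ex_demo_page and the page name "exdemo" are invented for this example; the
100n line length is just an arbitrary "wide" value.

```shell
# Sketch: emit a minimal page using EX/EE plus nf/fi and the width-dependent
# line break from the patch. ex_demo_page is a hypothetical helper name.
ex_demo_page() {
    # The quoted delimiter keeps every backslash in the roff source literal.
    cat <<'EOF'
.TH exdemo 1
.SH Description
.RS
.EX
.nf
complete \-D \-F _completion_loader \c
.if \n(LL<80n \{\
\e
.br
.ti +4n
.\}
\-o bashdefault \-o default
.fi
.EE
.RE
EOF
}

# Format at a narrow and at a wide line length, if groff is installed.
if command -v groff >/dev/null 2>&1; then
    ex_demo_page | groff -rLL=65n -man -T ascii | cat -s
    ex_demo_page | groff -rLL=100n -man -T ascii | cat -s
else
    ex_demo_page   # no formatter available; show the source instead
fi
```

Comparing the two renderings should show whether the long command line is
broken with a trailing backslash only on the narrow device.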

This fixes the last of these warnings:

troff:./doc/bash.1:7422: warning: cannot select font 'CW'

...and no lines should be overset even on a legacy Unix system.

$ groff -rLL=65n -man -T ascii -P -cbou ./doc/bash.1 | wc -L
65
---
 doc/bash.1 | 34 ++++++++++++++++++++++++----------
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/doc/bash.1 b/doc/bash.1
index fff8a817..35c076f0 100644
--- a/doc/bash.1
+++ b/doc/bash.1
@@ -365,8 +365,9 @@ .SH INVOCATION
 behaves as if the following command were executed:
 .PP
 .RS
-.if t \f(CWif [ \-n "$BASH_ENV" ]; then . "$BASH_ENV"; fi\fP
-.if n if [ \-n "$BASH_ENV" ]; then . "$BASH_ENV"; fi
+.EX
+if [ \-n "$BASH_ENV" ]; then . "$BASH_ENV"; fi
+.EE
 .RE
 .PP
 but the value of the
@@ -1356,8 +1357,9 @@ .SH PARAMETERS
 argument, running
 .PP
 .RS
-.if t \f(CWdeclare \-n ref=$1\fP
+.EX
 .if n declare \-n ref=$1
+.EE
 .RE
 .PP
 inside the function creates a nameref variable \fBref\fP whose value is
@@ -7419,17 +7421,29 @@ .SS Programmable Completion
 file corresponding to the name of the command, the following default
 completion function would load completions dynamically:
 .PP
-\f(CW_completion_loader()
-.br
+.RS
+.EX
+.nf
+_completion_loader()
 {
+  . "/etc/bash_completion.d/$1.sh" \c
+.if \n(LL<80n \{\
+\e
 .br
-. "/etc/bash_completion.d/$1.sh" >/dev/null 2>&1 && return 124
-.br
+.ti +4n
+.\}
+>/dev/null 2>&1 && return 124
 }
+complete \-D \-F _completion_loader \c
+.if \n(LL<80n \{\
+\e
 .br
-complete \-D \-F _completion_loader \-o bashdefault \-o default
-.br
-\fP
+.ti +4n
+.\}
+\-o bashdefault \-o default
+.fi
+.EE
+.RE
 .SH HISTORY
 When the
 .B \-o history
-- 
2.30.2



[PATCH 6/8] doc/bash.1: use page-local `FN` macro for file name

2024-01-10 Thread G. Branden Robinson
The page defines it; might as well use it.
---
 doc/bash.1 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/bash.1 b/doc/bash.1
index 8943e01e..fff8a817 100644
--- a/doc/bash.1
+++ b/doc/bash.1
@@ -2107,7 +2107,7 @@ .SS Shell Variables
 This variable expands to a 32-bit pseudo-random number each time it is
 referenced.
 The random number generator is not linear on systems that support
-\f(CW/dev/urandom\fP
+.FN /dev/urandom
 or
 .IR \%arc4random (3),
 so each returned number has no relationship to the numbers preceding it.
-- 
2.30.2



[PATCH 5/8] doc/bash.1: add man page cross references

2024-01-10 Thread G. Branden Robinson
Cross-reference arc4random(3) and stty(1) man pages.  Protect the former
from hyphenation.

Also break an input line after a sentence.
---
 doc/bash.1 | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/doc/bash.1 b/doc/bash.1
index f532d628..8943e01e 100644
--- a/doc/bash.1
+++ b/doc/bash.1
@@ -2105,9 +2105,12 @@ .SS Shell Variables
 .TP
 .B SRANDOM
 This variable expands to a 32-bit pseudo-random number each time it is
-referenced. The random number generator is not linear on systems that
-support \f(CW/dev/urandom\fP or \fIarc4random\fP, so each returned number
-has no relationship to the numbers preceding it.
+referenced.
+The random number generator is not linear on systems that support
+\f(CW/dev/urandom\fP
+or
+.IR \%arc4random (3),
+so each returned number has no relationship to the numbers preceding it.
 The random number generator cannot be seeded, so assignments to this
 variable have no effect.
 If
@@ -6834,8 +6837,7 @@ .SS Commands for Changing Text
 .TP
 .B \fIend\-of\-file\fP (usually C\-d)
 The character indicating end-of-file as set, for example, by
-.if t \f(CWstty\fP.
-.if n ``stty''.
+.IR stty (1).
 If this character is read when there are no characters
 on the line, and point is at the beginning of the line, readline
 interprets it as the end of input and returns
-- 
2.30.2



[PATCH 4/8] doc/bash.1: make quoted trailing spaces unbreakable

2024-01-10 Thread G. Branden Robinson
By luck, at present, input like

times, as necessary, to indicate multiple
levels of indirection.  The default is
.Q "+\ " .

does not get set as

times, as necessary, to indicate multiple
levels of indirection.  The default is “+
”.

by any of groff {1.22.4,1.23.0,git}, mandoc, Heirloom Doctools, or DWB
3.3 nroffs using their default line lengths...but it could, without this
patch.

In my opinion, it is not necessary to make _internal_ spaces in a
quotation unbreakable.  Thus I leave cases like

.Q "bind \-x"

alone.
---
 doc/bash.1 | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/bash.1 b/doc/bash.1
index b1257639..f532d628 100644
--- a/doc/bash.1
+++ b/doc/bash.1
@@ -2624,14 +2624,14 @@ .SS Shell Variables
 .SM
 .B PROMPTING
 below) and used as the primary prompt string.  The default value is
-.Q "\es\-\ev\e$ " .
+.Q "\es\-\ev\e$\ " .
 .TP
 .B PS2
 The value of this parameter is expanded as with
 .SM
 .B PS1
 and used as the secondary prompt string.  The default is
-.Q "> " .
+.Q ">\ " .
 .TP
 .B PS3
 The value of this parameter is used as the prompt for the
@@ -2653,7 +2653,7 @@ .SS Shell Variables
 .B PS4
 is replicated multiple times, as necessary, to indicate multiple
 levels of indirection.  The default is
-.Q "+ " .
+.Q "+\ " .
 .TP
 .B SHELL
 This variable expands to the full pathname to the shell.
-- 
2.30.2



[PATCH 3/8] doc/bash.1: define and use "Q" quotation macro

2024-01-10 Thread G. Branden Robinson
...instead of assuming the availability of a font named `CW`, and using
inconsistent quotation conventions when rendering to terminals with
nroff(1).

This resolves 25 instances of the following warning from groff 1.23.0.

troff:./doc/bash.1:360: warning: cannot select font 'CW'

To extract the most salient points from a lengthy discussion on Perl's
GitHub site[1]:

1.  "CW" is not a portable font name.
2.  There is no portable way to ask troff whether a font is available.

Further, and this is more a matter of taste and opinion, some
typographical authorities deprecate changes of font family in running
text.  (This is similar to typographers' and technical writers' advice
not to change the typeface often, for instance with frequent use of
italics and, especially, bold.  But this latter horse is probably locked
_out_ of the barn for man pages.)  Contrast with use of different font
families for "displays", that is, text that is set off by vertical
separation (and often indentation) from its surroundings.

With a macro, it is more straightforward to tackle another portability
problem: the availability of special character escape sequences for
doubled typographer's quotes.  groff has supported these for 30+ years;
and mandoc and Heirloom Doctools do as well.  But they are not portable
to System V or DWB troff, let alone the Seventh Edition Unix troff that
BSD employed until replacing it with groff in the early 1990s.

Therefore, define a quotation macro as follows:

A.  Bracket the first argument with `lq` and `rq` special character
escape sequences if the formatter is groff;
B.  ...with `` and '' on a non-groff troff typesetter;
C.  ...with " and " on a non-groff nroff program.
D.  Accept a second argument for abutting, trailing punctuation,
similarly to man(7)'s font alternation macros (and groff's `ME`,
`UE`, and `MR` extensions).

Furthermore:
E.  Drop font style changes, no longer necessary to mark the text now
that quotation is reliable.
F.  Use dummy character escape sequence \& after quoted, word-ending
dots to avoid erroneous insertion of inter-sentence space.[2]  (See
particularly bash(1)'s subsection "Pathname Expansion".)
G.  Fix outright missing closing quotation in description of "enable"
built-in.
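
As a quick way to see points A through D in action, the sketch below emits
a tiny page that defines the macro exactly as in the diff further down and
calls it once with trailing punctuation. The helper name q_demo_page and
the page name "qdemo" are invented for this example.

```shell
# Sketch: emit a tiny man page that defines and calls the Q macro.
# q_demo_page is a hypothetical helper; the macro body is copied from the patch.
q_demo_page() {
    # The quoted delimiter keeps every backslash in the roff source literal.
    cat <<'EOF'
.TH qdemo 1
.de Q
.ie \n(.g \(lq\\$1\(rq\\$2
.el \{
.  if t ``\\$1''\\$2
.  if n "\\$1"\\$2
.\}
..
.SH Name
qdemo \- exercise the Q macro
.SH Description
The bottom-most element is
.Q main .
EOF
}

# Render with whatever nroff is installed; otherwise just show the source.
if command -v nroff >/dev/null 2>&1; then
    q_demo_page | nroff -man | cat -s
else
    q_demo_page
fi
```

Running the same input through different formatters (groff, mandoc, a
legacy nroff) is one way to check which branch of the macro each takes.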

mandoc maintainer Ingo Schwarze gravely warns man page authors against
defining macros, but (1) mandoc _does_ support them if they're not too
complex, like this [I tested]; (2) he'd rather that any given man page
be written in mdoc(7) instead, a heavy lift for a document of bash(1)'s
mass even if such were desired by its maintainer and/or user community;
(3) at some point, portability demands of legacy systems militate for
such things (and all legacy troffs are considerably more flexible than
mandoc(1)); and (4) an alternative approach, perhaps using autoconf(1)
to check the host's troff(1), nroff(1), and man(7) for features and
constructing a man page at build time using soelim(1), is conceivable,
and I'm willing to assist with that if it sounds promising.

Exhibits of visible change (UTF-8, which groff and mandoc produce,
follows):

-ement (the one with the highest index) is "main".  This variable ex‐
+ement (the one with the highest index) is “main”.  This variable ex‐

-will write the trace output generated when set -x is enabled to that
+will write the trace output generated when “set -x” is enabled to

-mand.  The default value is ‘‘’’.
+mand.  The default value is “”.

- ‘‘.’’  at the start of a name or immediately following a slash must be
+ “.” at the start of a name or immediately following a slash must be matched

[The foregoing item was rendered with left adjustment; observe that the
quoted dot is no longer treated as ending a sentence.]

-name, as if the command were ‘‘enable -f name name .  The return
+name, as if the command were “enable -f  name name”.  The return

[1] https://github.com/Perl/perl5/issues/21239
[2] https://www.gnu.org/software/groff/manual/groff.html.node/Sentences.html
---
 doc/bash.1 | 234 ++---
 1 file changed, 131 insertions(+), 103 deletions(-)

diff --git a/doc/bash.1 b/doc/bash.1
index ed67e4b0..b1257639 100644
--- a/doc/bash.1
+++ b/doc/bash.1
@@ -43,7 +43,15 @@
 .\" but Sun doesn't seem to like that very much.
 .\"
 .de FN
-\fI\|\\$1\|\fP
+\%\fI\|\\$1\|\fP
+..
+.\" quotation macro
+.de Q
+.ie \n(.g \(lq\\$1\(rq\\$2
+.el \{
+.  if t ``\\$1''\\$2
+.  if n "\\$1"\\$2
+.\}
 ..
 .SH NAME
 bash \- GNU Bourne-Again SHell
@@ -1868,8 +1876,7 @@ .SS Shell Variables
 The element with index 0 is the name of any currently-executing
 shell function.
 The bottom-most element (the one with the highest index) is
-.if t \f(CW"main"\fP.
-.if n "main".
+.Q main .
 This variable exists only when a shell function is executing.
 Assignments to
 .SM
@@ -2009,8 +2016,7 @@ .SS Shell Variables
 .TP
 .B RE

[PATCH 2/8] doc/bash.1: fix unescaped hyphens

2024-01-10 Thread G. Branden Robinson
---
 doc/bash.1 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/bash.1 b/doc/bash.1
index 8c2fa229..ed67e4b0 100644
--- a/doc/bash.1
+++ b/doc/bash.1
@@ -6132,7 +6132,7 @@ .SS "Readline Variables"
 treated specially by the kernel's terminal driver to their  
 readline equivalents.
 These override the default readline bindings described here.
-Type \f(CWstty -a\fP at a bash prompt to see your current terminal settings,
+Type \f(CWstty \-a\fP at a bash prompt to see your current terminal settings,
 including the special control characters (usually \fBcchars\fP).
 .TP
 .B blink\-matching\-paren (Off)
@@ -7397,7 +7397,7 @@ .SS Programmable Completion
 .br
 }
 .br
-complete \-D \-F _completion_loader -o bashdefault -o default
+complete \-D \-F _completion_loader \-o bashdefault \-o default
 .br
 \fP
 .SH HISTORY
-- 
2.30.2



[PATCH 0/8] doc/bash.1: silence groff warnings, fix style issues

2024-01-10 Thread G. Branden Robinson
With this series of changes, bash(1) formats quiescently for me using
groff 1.23.0 with the following command line.

nroff -ww -rCHECKSTYLE=1 -man -z doc/bash.1

Regards,
Branden



[PATCH 1/8] doc/bash.1: use consistent inter-paragraph spacing

2024-01-10 Thread G. Branden Robinson
Historically in man(7), the inter-paragraph spacing (equivalently, the
spacing before section and subsection headings, and the value of the PD
register) is 0.4v (or four tenths of a "vee", the distance between
vertically adjacent text baselines) on typesetters, and 1v on terminals
(that is, a blank line).[1]

1.  Replace `sp` requests with `PP` calls for paragraphs that should not
be (specially) indented.

2.  Replace `sp` requests with `IP` calls for each paragraph that
continues a discussion begun with a tagged paragraph (`TP`) and
therefore should be indented the same as its predecessor.

3.  Add `PD` calls to restore normal inter-paragraph spacing when
separating them inside doubly-nested tagged paragraphs where the
outer layer of tagged paragraphs disables inter-paragraph spacing
with `PD 0`.  (The difficulty/tedium of managing such things, as
well as the "presentationalism" of the `PD` macro is why
groff_man_style(7) deprecates it.  But it does have the virtue of
being portable to all man(7) implementations thanks to its presence
in Seventh Edition Unix man(7) (1979).)

I would like to say that this commit produces no change in nroff-mode
output on any implementation, but that's not quite the case.

Replacing `.sp 0.5` with `PP` caused the amount of inter-paragraph space
to round up to 1v on groff and mandoc.  This measurement "floors" on
nroff traditionally, rounding to zero, as the following command shows.

  printf '.TH foo 1\n.SH Name\na\n.sp 0.5\nb\n' | nroff -man | cat -s

(Use any nroff you like, or simply "mandoc".)

* groff Git
* groff 1.23.0
* groff 1.22.4

The following three implementations show progressively greater, but
arguably not alarming, differences.

* mandoc 1.14.6

As above, plus:

Incorrectly suppresses the restored (non-zero) paragraph spacing at the
following points.

$ sed -n 800,805p doc/bash.1
in decreasing order of precedence:
.IP
.RS
.PD 0
.TP
.B ( \fIexpression\fP )
$ sed -n 8303,8308p doc/bash.1
builtin is invoked.
.IP
.RS
.PD 0
.TP 8
\fB\-o\fP \fIcomp-option\fP

* Heirloom Doctools

As above, plus:

a.  Appears to _fix_ a font selection problem when output piped through
ul(1) from the Debian "bsdextrautils" 2.36.1 package; the following
lines are no longer set in a "dim" style on xterm(1).

if [ -n "$BASH_ENV" ]; then . "$BASH_ENV"; fi

declare -n ref=$1

b.  The same lines were already getting a blank line rendered after (but
not before) them, unlike groff(1) and mandoc(1).

c.  The following line is now also styled consistently with its
successors, instead of being set in a "dim" style.

BASH_VERSINFO[0]    The major version number (the release).

(When not piped through "less -R", "BASH_VERSINFO[", "]", and
"release" appear in reverse video rather than a dim font.)

d.  The blank line that should render in the following is incorrectly
suppressed.

$ sed -n 2671,2675p doc/bash.1
braces denote optional portions.
.PP
.RS
.PD 0
.TP 10

e.  The "(expression)" and "-o comp-option" lines mentioned under mandoc
above are set with an _extra_ blank line, instead of mandoc's none.

f.  A blank line no longer renders at the paragraph break in the
following.  This is correct because at this point in the page (where
paragraph tagging is 3 levels deep), the configured inter-paragraph
spacing is zero.  (Whether that is _intended_, and what style rules
are supposed to apply at various levels of tag nesting, are not
clear to me.)

$ sed -n 10286,10291p doc/bash.1
.B xtrace
Same as
.BR \-x .
.PP
If
.B \-o

* Documenter's Workbench 3.3

As Heirloom, plus:

g.  Because the number of lines in the formatted output changes, the
pagination changes as well (as it does with groff(1) when the
`-rcR=0` option is used when rendering man pages).

h.  The foregoing furthermore changes the parity of line adjustment.
(In other words, spaces that "justify" the line are sometimes
inserted from the opposite end.)

[1] Here is my support for my claim, thanks to TUHS.

I elided literal bell characters from System III sources; the resulting
syntax is invalid.  (It was an effort to cope with the prospect of macro
arguments containing embedded double-quotes, but the means of such
embedding was so esoteric in AT&T troff that few users acquired the
skill.)

--
HISTORY/MAN/1979-01-v7/tmac.an:.de PD
HISTORY/MAN/1979-01-v7/tmac.an-.if t .nr )P .4v
HISTORY/MAN/1979-01-v7/tmac.an-.if n .nr )P 1v
HISTORY/MAN/1979-01-v7/tmac.an-.if !"\\$1"" .nr )P \\$1v
--
HISTORY/MAN/1980-03-3BSD/tmac.an.new:.de PD
HISTORY/MAN/1980-03-3BSD/tmac.an.new-.if t .nr )P .4v
HISTORY/MAN/1980-03-3BSD/tmac.an.new-.if n .nr )P 1v
HISTORY/MAN/1980-03-3BSD/tmac.an.new-.if !"\\$1"" .nr )P \\$1v
--
HISTORY/MAN/1980-06-SystemIII/an.src:.dePD
HISTORY/MAN/1980-06-SystemIII/an.src-.ift .nr PD .4v
HISTORY/MAN/1980-06-SystemIII/an.src-.ifn .nr PD 1v
HISTORY/MAN/1980-06-SystemIII/an.src-.if!\\$1 .nr PD \\$1v
--
HISTOR

completion very slow with gigantic list

2024-01-10 Thread Eric Wong
Hi, I noticed bash struggles with gigantic completion lists
(100k items of ~70 chars each)

It's reproducible with both LANG+LC_ALL set to en_US.UTF-8 and C,
so it's not just locales slowing things down.

This happens on the up-to-date `devel' branch
(commit 584a2b4c9e11bd713030916d9d832602891733d7),
but I first noticed this on Debian oldstable (5.1.4)

strcoll and strlen seem to be at the top of profiles, and
mregister_free when building devel with default options...
ltrace reveals it's doing strlen repeatedly on the entire
list (100k items * 70 chars each = ~7MB).
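
For a rough feel of the scale, here is a sketch that times compgen -W over
a synthetic list of 70-character words. It assumes the completion is built
with compgen -W, which may not be where the time actually goes (sorting and
display happen elsewhere). make_wordlist and time_compgen are made-up
helper names.

```shell
# Sketch: time compgen -W over a synthetic word list.
# make_wordlist and time_compgen are hypothetical helpers for illustration.
make_wordlist() {
    # Print $1 words of 70 characters each, all starting with "a".
    awk -v n="$1" 'BEGIN {
        pad = sprintf("%060d", 0)
        for (i = 0; i < n; i++) printf "a%s%09d\n", pad, i
    }'
}

time_compgen() {
    local n=${1:-1000} wordlist matches
    wordlist=$(make_wordlist "$n")
    # TIMEFORMAT controls the output of the time keyword; %R is elapsed seconds.
    local TIMEFORMAT='compgen -W: %R s'
    time matches=$(compgen -W "$wordlist" -- a | wc -l)
    # Arithmetic expansion strips any padding wc may add.
    echo "items=$n matches=$((matches))"
}

time_compgen 1000   # raise toward 100000 to approach the reported case
```

The timing line goes to stderr, so the item and match counts can still be
captured separately from stdout.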

Sidenote: I'm not really sure what one would do with ~100K
completion candidates, but I managed to hit that case when
attempting completion for an NNTP group + IMAP mailbox listing.

Standalone reproducer here:
---8<--
# bash struggles with giant completion list (100K items of ~70 chars each)
# Usage:
#   . giant_complete.bash
#   giant_complete a # watch CPU usage spike
#
# derived from lei-completion.bash in https://80x24.org/public-inbox.git
# There could be something wrong in my code, too, since I'm not
# familiar with writing completions...

_giant_complete() {
# generate a giant list:
local wordlist="$(awk