Bug#1101705: unzstd.1: Some remarks and a patch with editorial changes for this man page

Bjarni Ingi Gislason Sun, 30 Mar 2025 08:51:20 -0700

Package: zstd
Version: 1.5.7+dfsg-1
Severity: minor
Tags: patch

   * What led up to the situation?


     Checking for defects with a new version

test-[g|n]roff -mandoc -t -K utf8 -rF0 -rHY=0 -rCHECKSTYLE=10 -ww -z < "man page"

  [Use "grep -e ' $' -e '\\~$' <file>" to find obvious trailing spaces.]

  ["test-groff" is a script in the repository for "groff"; it is not shipped
(local copy and "troff" slightly changed by me).]

  [The fate of "test-nroff" was decided in groff bug #55941.]

   * What was the outcome of this action?

tbl:"<stdin>:7: error: invalid column classifier '\'
tbl:"<stdin>:7: error: giving up on this table region
an.tmac:<stdin>:8: warning: tbl preprocessor failed, or it or soelim was not 
run; table(s) likely not rendered (TE macro called with TW register undefined)


   * What outcome did you expect instead?

     No output (no warnings).

-.-

  General remarks and further material, if a diff-file exists, are in the
attachments.


-- System Information:
Debian Release: trixie/sid
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: amd64 (x86_64)

Kernel: Linux 6.12.20-amd64 (SMP w/2 CPU threads; PREEMPT)
Locale: LANG=is_IS.iso88591, LC_CTYPE=is_IS.iso88591 (charmap=ISO-8859-1), 
LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: sysvinit (via /sbin/init)

Versions of packages zstd depends on:
ii  libc6       2.41-6
ii  libgcc-s1   14.2.0-19
ii  liblz4-1    1.10.0-4
ii  liblzma5    5.6.4-1
ii  libstdc++6  14.2.0-19
ii  zlib1g      1:1.3.dfsg+really1.3.1-1+b1

zstd recommends no packages.

zstd suggests no packages.

-- no debconf information

Input file is unzstd.1

Output from "mandoc -T lint  unzstd.1": (shortened list)

      1 input text line longer than 80 bytes: A dictionary ID is a...
      1 input text line longer than 80 bytes: Additionally, this c...
      1 input text line longer than 80 bytes: Benchmark file(s) us...
      1 input text line longer than 80 bytes: Beyond compression l...
      1 input text line longer than 80 bytes: Bigger hash tables c...
      1 input text line longer than 80 bytes: Bigger hash tables u...
      1 input text line longer than 80 bytes: Compress\. This is t...
      1 input text line longer than 80 bytes: Determine \fBoverlap...
      1 input text line longer than 80 bytes: Display information ...
      1 input text line longer than 80 bytes: Employing environmen...
      1 input text line longer than 80 bytes: For \fBZSTD_btopt\fR...
      1 input text line longer than 80 bytes: For \fBZSTD_fast\fR,...
      1 input text line longer than 80 bytes: Higher numbers of bi...
      1 input text line longer than 80 bytes: If input directory c...
      1 input text line longer than 80 bytes: In most places where...
      1 input text line longer than 80 bytes: In situations where ...
      1 input text line longer than 80 bytes: It is possible to co...
      1 input text line longer than 80 bytes: Larger bucket sizes ...
      1 input text line longer than 80 bytes: Larger search length...
      1 input text line longer than 80 bytes: Larger values will i...
      1 input text line longer than 80 bytes: Limit dictionary to ...
      1 input text line longer than 80 bytes: Limit the amount of ...
      1 input text line longer than 80 bytes: More searches increa...
      2 input text line longer than 80 bytes: Multiply the integer...
      1 input text line longer than 80 bytes: Note 1: this mode is...
      1 input text line longer than 80 bytes: Note 2: this mode is...
      1 input text line longer than 80 bytes: Note that RFC8878 re...
      2 input text line longer than 80 bytes: Note: If \fBwindowLo...
      1 input text line longer than 80 bytes: Note: \fB\-\-long\fR...
      1 input text line longer than 80 bytes: Note: for level 19, ...
      1 input text line longer than 80 bytes: Note: up to level 15...
      1 input text line longer than 80 bytes: Same as cover but wi...
      1 input text line longer than 80 bytes: Select parameters fo...
      1 input text line longer than 80 bytes: Selects segments of ...
      1 input text line longer than 80 bytes: Since dictionary com...
      1 input text line longer than 80 bytes: Source files are pre...
      1 input text line longer than 80 bytes: Specify the frequenc...
      2 input text line longer than 80 bytes: Specify the maximum ...
      2 input text line longer than 80 bytes: Specify the size of ...
      1 input text line longer than 80 bytes: Test the integrity o...
      1 input text line longer than 80 bytes: The \fBzstd\fR CLI p...
      1 input text line longer than 80 bytes: The \fIoptions\fR ar...
      1 input text line longer than 80 bytes: The \fIzstandard\fR ...
      1 input text line longer than 80 bytes: The following parame...
      1 input text line longer than 80 bytes: The higher number of...
      1 input text line longer than 80 bytes: The minimum \fIclog\...
      1 input text line longer than 80 bytes: The minimum \fIhlog\...
      1 input text line longer than 80 bytes: The minimum \fIovlog...
      1 input text line longer than 80 bytes: There are 9 strategi...
      1 input text line longer than 80 bytes: They can both be ove...
      1 input text line longer than 80 bytes: This is also used du...
      1 input text line longer than 80 bytes: Unless \fB\-\-stdout...
      1 input text line longer than 80 bytes: Use FILEs as trainin...
      1 input text line longer than 80 bytes: Use \fB#\fR compress...
      1 input text line longer than 80 bytes: Use \fIFILES\fR as a...
      1 input text line longer than 80 bytes: Use legacy dictionar...
      1 input text line longer than 80 bytes: When compressing a s...
      1 input text line longer than 80 bytes: When compressing, th...
      1 input text line longer than 80 bytes: When decompressing, ...
      1 input text line longer than 80 bytes: When invoked via a \...
      1 input text line longer than 80 bytes: do not store the ori...
      1 invalid character in tbl layout: B
      1 invalid character in tbl layout: O
      3 invalid character in tbl layout: [
     12 invalid character in tbl layout: \
      1 invalid character in tbl layout: f
      1 invalid character in tbl layout: o
      1 invalid character in tbl layout: z
      1 skipping paragraph macro: sp after SH
      1 tbl line starts with span
      1 tbl without any data cells
      1 unknown font, skipping request: TS FILE\fR]
      1 unknown font, skipping request: TS FILE\fR] [\-o \fIOUTPUT\-FILE\fR]
      1 unknown font, skipping request: TS fIINPUT\-FILE\fR] [\-o 
\fIOUTPUT\-FILE\fR]
      1 unknown font, skipping request: TS fIOPTIONS\fR] [\-    
\fIINPUT\-FILE\fR] [\-o \fIOUTPUT\-FILE\fR]
      1 unknown font, skipping request: TS fIOUTPUT\-FILE\fR]
      1 unknown font, skipping request: TS fR]
      1 unknown font, skipping request: TS fR] [\-      \fIINPUT\-FILE\fR] [\-o 
\fIOUTPUT\-FILE\fR]
      1 unknown font, skipping request: TS fR] [\-o \fIOUTPUT\-FILE\fR]

-.-.

Output from "test-nroff -mandoc -t -ww -z unzstd.1": (shortened list)

      1 giving up on this table region
      1 invalid column classifier '\'
      1 tbl preprocessor failed, or it or soelim was not run; table(s) likely 
not rendered (TE macro called with TW register undefined)

-.-.

Change '-' (\-) to '\(en' (en-dash) for a (numeric) range.
GNU gnulib has recently (2023-06-18) updated its
"build_aux/update-copyright" to recognize "\(en" in man pages.

unzstd.1:72:\fB\-#\fR: selects \fB#\fR compression level [1\(en19] (default: 
3)\. Higher compression levels \fIgenerally\fR produce higher compression ratio 
at the expense of speed and memory\. A rough rule of thumb is that compression 
speed is expected to be divided by 2 every 2 levels\. Technically, each level 
is mapped to a set of advanced parameters (that can also be modified 
individually, see below)\. Because the compressor's behavior highly depends on 
the content to compress, there's no guarantee of a smooth progression from one 
level to another\.
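
A minimal sketch of this change with sed, assuming that a range is
written as digit, "\-", digit.
The function name "to_en_dash" is only illustrative; the pattern would
also touch other digit\-digit strings (dates, for example), so the
result should be reviewed.

```shell
# Replace "\-" between two digits with the en-dash escape "\(en".
to_en_dash() {
  sed 's/\([0-9]\)\\-\([0-9]\)/\1\\(en\2/g'
}

printf '%s\n' 'compression level [1\-19] (default: 3)\.' | to_en_dash
```

A string like "32\-bit" is left alone, as no digit follows the hyphen.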

-.-.

Change (or include a "FIXME" paragraph about) misused SI (metric)
numeric prefixes (or names) to the binary ones, like Ki (kibi), Mi
(mebi), Gi (gibi), or Ti (tebi), if indicated.
If the metric prefixes are correct, add the definitions or an
explanation to avoid misunderstanding.

208:The minimum \fIhlog\fR is 6 (64 entries / 256 B) and the maximum is 30 (1B 
entries / 4 GiB)\.
215:The minimum \fIclog\fR is 6 (64 entries / 256 B) and the maximum is 29 
(512M entries / 2 GiB) on 32\-bit platforms and 30 (1B entries / 4 GiB) on 
64\-bit platforms\.

-.-.

Add a (no-break, "\ " or "\~") space between a number and a unit,
as these are not one entity.

215:The minimum \fIclog\fR is 6 (64 entries / 256 B) and the maximum is 29 
(512\~Mi entries / 2 GiB) on 32\-bit platforms and 30 (1B entries / 4 GiB) on 
64\-bit platforms\.
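
A sketch of how the space could be changed mechanically, assuming only
the binary byte units KiB, MiB, GiB and TiB are of interest.
The function name "nobreak_units" is hypothetical; other units (a plain
"B", for example) would need their own pattern.

```shell
# Turn the space between a digit and a binary byte unit into "\~".
nobreak_units() {
  sed 's/\([0-9]\) \([KMGT]iB\)/\1\\~\2/g'
}

printf '%s\n' 'the maximum is 30 (1B entries / 4 GiB)\.' | nobreak_units
```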

-.-.

Strings longer than 3/4 of a standard line length (80).

Use "\:" to split the string at the end of an output line, for example a
long URL (web address)

285 \fB\-\-zstd\fR=wlog=23,clog=23,hlog=22,slog=6,mml=3,tlen=48,strat=6
341 \fB\-\-train\-fastcover[=k#,d=#,f=#,steps=#,split=#,accel=#]\fR

-.-.

Wrong distance (not two spaces) between sentences in the input file.

  Separate the sentences and subordinate clauses; each begins on a new
line.  See man-pages(7) ("Conventions for source file layout") and
"info groff" ("Input Conventions").

  The best procedure is to always start a new sentence on a new line,
at least, if you are typing on a computer.

Remember coding: Only one command ("sentence") on each (logical) line.

E-mail: Easier to quote exactly the relevant lines.

Generally: Easier to edit the sentence.

Patches: Less unaffected text.

Searching for two adjacent words is easier when they belong to the same line
and the same phrase.

  The amount of space between sentences in the output can then be
controlled with the ".ss" request.

Mark a final abbreviation point as such by suffixing it with "\&".

Some sentences (etc.) do not begin on a new line.

Split (sometimes) lines after a punctuation mark; before a conjunction.

  Lines with only one (or two) space(s) between sentences could be split,
so that the later sentences begin on a new line.

Use

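#!/usr/bin/sh
# Split input lines after a sentence-ending period.
# GNU sed is assumed for "\n" in the replacement.
# A line starting with "." (a roff request) is printed as is;
# processing then continues with the line that follows it.

sed -e '/^\./n' \
-e 's/\([[:alpha:]]\)\.  */\1.\n/g' "$1"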

to split lines after a sentence period.
Check result with the difference between the formatted outputs.
See also the attachment "general.bugs"

[List of affected lines removed.]

-.-.

Split lines longer than 80 characters into two or more lines.
Appropriate break points are the end of a sentence and a subordinate
clause; after punctuation marks.
Add "\:" to split the string for the output, "\<newline>" in the source.  

[List of affected lines removed.]


Longest line is number 323 with 874 characters
Selects segments of size \fIk\fR with highest score to put in the dictionary\. 
The score of a segment is computed by the sum of the frequencies of all the 
subsegments of size \fId\fR\. Generally \fId\fR should be in the range [6, 8], 
occasionally up to 16, but the algorithm will run faster with d <= \fI8\fR\. 
Good values for \fIk\fR vary widely based on the input data, but a safe range 
is [2 * \fId\fR, 2000]\. If \fIsplit\fR is 100, all input samples are used for 
both training and testing to find optimal \fId\fR and \fIk\fR to build 
dictionary\. Supports multithreading if \fBzstd\fR is compiled with threading 
support\. Having \fIshrink\fR enabled takes a truncated dictionary of minimum 
size and doubles in size until compression ratio of the truncated dictionary is 
at most \fIshrinkDictMaxRegression%\fR worse than the compression ratio of the 
largest dictionary\.
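
A sketch for listing such lines with awk, assuming the limit of 80 from
above.
The function name "long_lines" is hypothetical; awk counts characters
in the current locale, while "mandoc -T lint" reports bytes, so the two
can differ for non-ASCII input.

```shell
# Print "line number: length" for every input line longer than 80.
long_lines() {
  awk -v max=80 'length($0) > max { print NR ": " length($0) }' "$@"
}

# Demonstration: the second line is 81 characters long.
{ printf '%s\n' 'short line'; printf '%081d\n' 0; } | long_lines
```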

-.-.

Remove the backslash (\) in front of a period (.) that is to be printed
as such and cannot become a control character in the first column of a line.
Use "\&" to protect a period that could.
The backslashes are a sign that the man page was transformed from another
source file with a program whose name is NOT mentioned in a comment.

[List of affected lines removed.]

-.-.

Use the name of units in text; use symbols in tables and
calculations.
The rule is to have a (no-break, \~) space between a number and
its units (see "www.bipm.org/en/publications/si-brochure")

208:The minimum \fIhlog\fR is 6 (64 entries / 256 B) and the maximum is 30 (1B 
entries / 4 GiB)\.
215:The minimum \fIclog\fR is 6 (64 entries / 256 B) and the maximum is 29 
(512M entries / 2 GiB) on 32\-bit platforms and 30 (1B entries / 4 GiB) on 
64\-bit platforms\.
369:\fB\-i#\fR: minimum evaluation time, in seconds (default: 3s), benchmark 
mode only

-.-.

Put a parenthetical sentence or phrase on a separate line,
if it is not part of code.
See man-pages(7), item "semantic newline".

[List of affected lines removed.]

-.-.

Use a character "\(->" instead of plain "->" or "\->".

382:\fBOutput Format:\fR CompressionLevel#Filename: InputSize \-> OutputSize 
(CompressionRatio), CompressionSpeed, DecompressionSpeed
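
A sketch of this replacement with sed, assuming that only the escaped
form "\->" occurs in the file (a plain "->" would need a separate
rule).
The function name "to_arrow" is only illustrative.

```shell
# Replace the two-character arrow "\->" with the groff glyph "\(->".
to_arrow() {
  sed 's/\\->/\\(->/g'
}

printf '%s\n' 'InputSize \-> OutputSize' | to_arrow
```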

-.-.

Only one space character follows a possible end of sentence
(a punctuation mark that can end a sentence).

[List of affected lines removed.]


-.-.

Use thousand markers to make large numbers easy to read

302:Limit dictionary to specified size (default: 112640 bytes)\. As usual, 
quantities are expressed in bytes by default, and it's possible to employ 
suffixes (like \fBKB\fR or \fBMB\fR) to specify larger values\.
316:A dictionary ID is a locally unique ID\. The decoder will use this value to 
verify it is using the right dictionary\. By default, zstd will create a 
4\-bytes random number ID\. It's possible to provide an explicit number ID 
instead\. It's up to the dictionary manager to not assign twice the same ID to 
2 different dictionaries\. Note that short numbers have an advantage: an ID < 
256 will only need 1 byte in the compressed frame header, and an ID < 65536 
will only need 2 bytes\. This compares favorably to 4 bytes default\.
318:Note that RFC8878 reserves IDs less than 32768 and greater than or equal to 
2\e^31, so they should not be used in public\.
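
The grouping can be sketched with a sed loop, assuming a comma as the
thousand marker (as in the "1,024" already used in this man page).
The function name "group_thousands" is hypothetical.
The loop changes every run of four or more digits, so it would also
touch years and a name like "RFC8878"; it is meant for single numbers,
not for whole files.

```shell
# Insert a comma before each group of three trailing digits,
# repeating until no run of four digits is left.
group_thousands() {
  sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/' -e ta
}

printf '%s\n' 112640 65536 32768 | group_thousands
```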

-.-.

Put a subordinate sentence (after a comma) on a new line.

[List of affected lines removed.]

-.-.

Remove quotes when there is a printable
but no space character between them
and the quotes are not for emphasis (markup),
for example as an argument to a macro.

unzstd.1:1:.TH "ZSTD" "1" "October 2024" "zstd 1.5.6" "User Commands"
unzstd.1:2:.SH "NAME"
unzstd.1:4:.SH "SYNOPSIS"
unzstd.1:15:.SH "DESCRIPTION"
unzstd.1:19:.IP "\(bu" 4
unzstd.1:21:.IP "\(bu" 4
unzstd.1:23:.IP "\(bu" 4
unzstd.1:25:.IP "\(bu" 4
unzstd.1:27:.IP "\(bu" 4
unzstd.1:34:.IP "\(bu" 4
unzstd.1:36:.IP "\(bu" 4
unzstd.1:41:.SH "OPTIONS"
unzstd.1:71:.IP "\(bu" 4
unzstd.1:73:.IP "\(bu" 4
unzstd.1:75:.IP "\(bu" 4
unzstd.1:77:.IP "\(bu" 4
unzstd.1:79:.IP "\(bu" 4
unzstd.1:85:.IP "\(bu" 4
unzstd.1:87:.IP "\(bu" 4
unzstd.1:91:.IP "\(bu" 4
unzstd.1:95:.IP "\(bu" 4
unzstd.1:97:.IP "\(bu" 4
unzstd.1:107:.IP "\(bu" 4
unzstd.1:109:.IP "\(bu" 4
unzstd.1:111:.IP "\(bu" 4
unzstd.1:113:.IP "\(bu" 4
unzstd.1:115:.IP "\(bu" 4
unzstd.1:121:.IP "\(bu" 4
unzstd.1:123:.IP "\(bu" 4
unzstd.1:125:.IP "\(bu" 4
unzstd.1:127:.IP "\(bu" 4
unzstd.1:129:.IP "\(bu" 4
unzstd.1:131:.IP "\(bu" 4
unzstd.1:133:.IP "\(bu" 4
unzstd.1:135:.IP "\(bu" 4
unzstd.1:137:.IP "\(bu" 4
unzstd.1:139:.IP "\(bu" 4
unzstd.1:141:.IP "\(bu" 4
unzstd.1:143:.IP "\(bu" 4
unzstd.1:145:.IP "\(bu" 4
unzstd.1:147:.IP "\(bu" 4
unzstd.1:151:.IP "\(bu" 4
unzstd.1:153:.IP "\(bu" 4
unzstd.1:155:.IP "\(bu" 4
unzstd.1:157:.IP "\(bu" 4
unzstd.1:159:.IP "\(bu" 4
unzstd.1:161:.IP "\(bu" 4
unzstd.1:163:.IP "\(bu" 4
unzstd.1:165:.IP "\(bu" 4
unzstd.1:167:.IP "\(bu" 4
unzstd.1:188:.SS "\-\-zstd[=options]:"
unzstd.1:282:.SS "Example"
unzstd.1:286:.SS "\-B#:"
unzstd.1:360:.SH "BENCHMARK"
unzstd.1:362:.IP "\(bu" 4
unzstd.1:364:.IP "\(bu" 4
unzstd.1:366:.IP "\(bu" 4
unzstd.1:368:.IP "\(bu" 4
unzstd.1:370:.IP "\(bu" 4
unzstd.1:372:.IP "\(bu" 4
unzstd.1:374:.IP "\(bu" 4
unzstd.1:376:.IP "\(bu" 4
unzstd.1:389:.SH "BUGS"
unzstd.1:391:.SH "AUTHOR"
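
A sketch of how these quotes could be removed with sed, assuming that
only request lines (beginning with a period) are to be changed and that
quotes are properly paired.
The function name "unquote_args" is only illustrative; arguments that
contain a space, and empty arguments (""), keep their quotes.

```shell
# On request lines, drop the quotes around arguments without a space.
unquote_args() {
  sed '/^\./s/"\([^" ]\{1,\}\)"/\1/g'
}

printf '%s\n' '.SH "NAME"' '.IP "\(bu" 4' \
  '.TH "ZSTD" "1" "October 2024" "zstd 1.5.6" "User Commands"' |
  unquote_args
```

The result should be checked by comparing the formatted output before
and after the change.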

-.-.

Section headings (.SH and .SS) do not need their arguments quoted.

2:.SH "NAME"
4:.SH "SYNOPSIS"
15:.SH "DESCRIPTION"
39:.SS "Concatenation with \.zst Files"
41:.SH "OPTIONS"
42:.SS "Integer Suffixes and Special Values"
50:.SS "Operation Mode"
70:.SS "Operation Modifiers"
170:.SS "gzip Operation Modifiers"
178:.SS "Environment Variables"
186:.SH "ADVANCED COMPRESSION OPTIONS"
188:.SS "\-\-zstd[=options]:"
282:.SS "Example"
286:.SS "\-B#:"
288:.SH "DICTIONARY BUILDER"
360:.SH "BENCHMARK"
385:.SH "SEE ALSO"
389:.SH "BUGS"
391:.SH "AUTHOR"

-.-.

Put a (long) web address on a new line to reduce the possibility of
splitting the address between two output lines.
Or inhibit hyphenation with "\%" in front of the name.


388:The \fIzstandard\fR format is specified in Y\. Collet, "Zstandard 
Compression and the 'application/zstd' Media Type", 
https://www\.ietf\.org/rfc/rfc8878\.txt, Internet RFC 8878 (February 2021)\.
390:Report bugs at: https://github\.com/facebook/zstd/issues

-.-.

Output from "test-groff -mandoc -t -K utf8 -rF0 -rHY=0 -rCHECKSTYLE=10 -ww -z":

tbl:"<stdin>:7: error: invalid column classifier '\'
tbl:"<stdin>:7: error: giving up on this table region
an.tmac:<stdin>:8: warning: tbl preprocessor failed, or it or soelim was not 
run; table(s) likely not rendered (TE macro called with TW register undefined)

-.-.

Additionally:

Size of the circumflex (^), used as an exponent sign, increased by two points
in troff-mode, to make it more visible.

-.-

Generally:

Split (sometimes) lines after a punctuation mark; before a conjunction.

--- unzstd.1    2025-03-30 01:54:53.014040667 +0000
+++ unzstd.1.new        2025-03-30 02:43:00.272422207 +0000
@@ -2,10 +2,7 @@
 .SH "NAME"
 \fBzstd\fR \- zstd, zstdmt, unzstd, zstdcat \- Compress or decompress \.zst 
files
 .SH "SYNOPSIS"
-.TS
-allbox;
-\fBzstd\fR [\fIOPTIONS\fR] [\- \fIINPUT\-FILE\fR] [\-o \fIOUTPUT\-FILE\fR]
-.TE
+\fBzstd\fR [\fIOPTIONS\fR] [\- | \fIINPUT\-FILE\fR] [\-o \fIOUTPUT\-FILE\fR]
 .P
 \fBzstdmt\fR is equivalent to \fBzstd \-T0\fR
 .P
@@ -43,10 +40,10 @@ It is possible to concatenate multiple \
 In most places where an integer argument is expected, an optional suffix is 
supported to easily indicate large integers\. There must be no space between 
the integer and the suffix\.
 .TP
 \fBKiB\fR
-Multiply the integer by 1,024 (2\e^10)\. \fBKi\fR, \fBK\fR, and \fBKB\fR are 
accepted as synonyms for \fBKiB\fR\.
+Multiply the integer by 1,024 (2\s+2^\s-210)\. \fBKi\fR, \fBK\fR, and \fBKB\fR 
are accepted as synonyms for \fBKiB\fR\.
 .TP
 \fBMiB\fR
-Multiply the integer by 1,048,576 (2\e^20)\. \fBMi\fR, \fBM\fR, and \fBMB\fR 
are accepted as synonyms for \fBMiB\fR\.
+Multiply the integer by 1,048,576 (2\s+2^\s-220)\. \fBMi\fR, \fBM\fR, and 
\fBMB\fR are accepted as synonyms for \fBMiB\fR\.
 .SS "Operation Mode"
 If multiple operation mode options are given, the last one takes effect\.
 .TP
@@ -69,7 +66,7 @@ Use \fIFILES\fR as a training set to cre
 Display information related to a zstd compressed file, such as size, ratio, 
and checksum\. Some of these fields may not be available\. This command's 
output can be augmented with the \fB\-v\fR modifier\.
 .SS "Operation Modifiers"
 .IP "\(bu" 4
-\fB\-#\fR: selects \fB#\fR compression level [1\-19] (default: 3)\. Higher 
compression levels \fIgenerally\fR produce higher compression ratio at the 
expense of speed and memory\. A rough rule of thumb is that compression speed 
is expected to be divided by 2 every 2 levels\. Technically, each level is 
mapped to a set of advanced parameters (that can also be modified individually, 
see below)\. Because the compressor's behavior highly depends on the content to 
compress, there's no guarantee of a smooth progression from one level to 
another\.
+\fB\-#\fR: selects \fB#\fR compression level [1\(en19] (default: 3)\. Higher 
compression levels \fIgenerally\fR produce higher compression ratio at the 
expense of speed and memory\. A rough rule of thumb is that compression speed 
is expected to be divided by 2 every 2 levels\. Technically, each level is 
mapped to a set of advanced parameters (that can also be modified individually, 
see below)\. Because the compressor's behavior highly depends on the content to 
compress, there's no guarantee of a smooth progression from one level to 
another\.
 .IP "\(bu" 4
 \fB\-\-ultra\fR: unlocks high compression levels 20+ (maximum 22), using a lot 
more memory\. Note that decompression will also require more memory when using 
these levels\.
 .IP "\(bu" 4
@@ -205,14 +202,14 @@ Specify the maximum number of bits for a
 .IP
 Bigger hash tables cause fewer collisions which usually makes compression 
faster, but requires more memory during compression\.
 .IP
-The minimum \fIhlog\fR is 6 (64 entries / 256 B) and the maximum is 30 (1B 
entries / 4 GiB)\.
+The minimum \fIhlog\fR is 6 (64 entries / 256 B) and the maximum is 30 (1\~Gi 
entries / 4 GiB)\.
 .TP
 \fBchainLog\fR=\fIclog\fR, \fBclog\fR=\fIclog\fR
 Specify the maximum number of bits for the secondary search structure, whose 
form depends on the selected \fBstrategy\fR\.
 .IP
 Higher numbers of bits increases the chance to find a match which usually 
improves compression ratio\. It also slows down compression speed and increases 
memory requirements for compression\. This option is ignored for the 
\fBZSTD_fast\fR \fBstrategy\fR, which only has the primary hash table\.
 .IP
-The minimum \fIclog\fR is 6 (64 entries / 256 B) and the maximum is 29 (512M 
entries / 2 GiB) on 32\-bit platforms and 30 (1B entries / 4 GiB) on 64\-bit 
platforms\.
+The minimum \fIclog\fR is 6 (64 entries / 256 B) and the maximum is 29 
(512\~Mi entries / 2 GiB) on 32\-bit platforms and 30 (1\~Gi entries / 4 GiB) 
on 64\-bit platforms\.
 .TP
 \fBsearchLog\fR=\fIslog\fR, \fBslog\fR=\fIslog\fR
 Specify the maximum number of searches in a hash chain or a binary tree using 
logarithmic scale\.
@@ -299,7 +296,7 @@ Since dictionary compression is mostly e
 Dictionary saved into \fBFILE\fR (default name: dictionary)\.
 .TP
 \fB\-\-maxdict=#\fR
-Limit dictionary to specified size (default: 112640 bytes)\. As usual, 
quantities are expressed in bytes by default, and it's possible to employ 
suffixes (like \fBKB\fR or \fBMB\fR) to specify larger values\.
+Limit dictionary to specified size (default: 112,640 bytes)\. As usual, 
quantities are expressed in bytes by default, and it's possible to employ 
suffixes (like \fBKB\fR or \fBMB\fR) to specify larger values\.
 .TP
 \fB\-#\fR
 Use \fB#\fR compression level during training (optional)\. Will generate 
statistics more tuned for selected compression level, resulting in a 
\fIsmall\fR compression ratio improvement for this level\.
@@ -315,7 +312,7 @@ In situations where the training set is
 \fB\-\-dictID=#\fR
 A dictionary ID is a locally unique ID\. The decoder will use this value to 
verify it is using the right dictionary\. By default, zstd will create a 
4\-bytes random number ID\. It's possible to provide an explicit number ID 
instead\. It's up to the dictionary manager to not assign twice the same ID to 
2 different dictionaries\. Note that short numbers have an advantage: an ID < 
256 will only need 1 byte in the compressed frame header, and an ID < 65536 
will only need 2 bytes\. This compares favorably to 4 bytes default\.
 .IP
-Note that RFC8878 reserves IDs less than 32768 and greater than or equal to 
2\e^31, so they should not be used in public\.
+Note that RFC8878 reserves IDs less than 32768 and greater than or equal to 
2\s+2^\s-231, so they should not be used in public\.
 .TP
 \fB\-\-train\-cover[=k#,d=#,steps=#,split=#,shrink[=#]]\fR
 Select parameters for the default dictionary builder algorithm named cover\. 
If \fId\fR is not specified, then it tries \fId\fR = 6 and \fId\fR = 8\. If 
\fIk\fR is not specified, then it tries \fIsteps\fR values in the range [50, 
2000]\. If \fIsteps\fR is not specified, then the default value of 40 is used\. 
If \fIsplit\fR is not specified or split <= 0, then the default value of 100 is 
used\. Requires that \fId\fR <= \fIk\fR\. If \fIshrink\fR flag is not used, 
then the default value for \fIshrinkDict\fR of 0 is used\. If \fIshrink\fR is 
not specified, then the default value for \fIshrinkDictMaxRegression\fR of 1 is 
used\.
@@ -341,7 +338,7 @@ Examples:
 \fB\-\-train\-fastcover[=k#,d=#,f=#,steps=#,split=#,accel=#]\fR
 Same as cover but with extra parameters \fIf\fR and \fIaccel\fR and different 
default value of split If \fIsplit\fR is not specified, then it tries 
\fIsplit\fR = 75\. If \fIf\fR is not specified, then it tries \fIf\fR = 20\. 
Requires that 0 < \fIf\fR < 32\. If \fIaccel\fR is not specified, then it tries 
\fIaccel\fR = 1\. Requires that 0 < \fIaccel\fR <= 10\. Requires that \fId\fR = 
6 or \fId\fR = 8\.
 .IP
-\fIf\fR is log of size of array that keeps track of frequency of subsegments 
of size \fId\fR\. The subsegment is hashed to an index in the range 
[0,2^\fIf\fR \- 1]\. It is possible that 2 different subsegments are hashed to 
the same index, and they are considered as the same subsegment when computing 
frequency\. Using a higher \fIf\fR reduces collision but takes longer\.
+\fIf\fR is log of size of array that keeps track of frequency of subsegments 
of size \fId\fR\. The subsegment is hashed to an index in the range 
[0,2\s+2^\s-2\fIf\fR \- 1]\. It is possible that 2 different subsegments are 
hashed to the same index, and they are considered as the same subsegment when 
computing frequency\. Using a higher \fIf\fR reduces collision but takes 
longer\.
 .IP
 Examples:
 .IP

  Any program (person), that produces man pages, should check the output
for defects by using (both groff and nroff)

[gn]roff -mandoc -t -ww -b -z -K utf8 <man page>

  The same goes for man pages that are used as an input.

  For a style guide use

  mandoc -T lint

-.-

  Any "autogenerator" should check its products with the above mentioned
'groff', 'mandoc', and additionally with 'nroff ...'.

  It should also check its input files for too long (> 80) lines.

  This is just a simple quality control measure.

  To get a better man page, the "autogenerator" may have to be corrected,
and so may the source file and any additional file.

  Common defects:

  Not removing trailing spaces (in input and output).
  The reason for these trailing spaces should be found and eliminated.

  "git" has a "tool" to point out whitespace,
see for example "git-apply(1)" and "git-config(1)".

  Not beginning each input sentence on a new line.
Line length and patch size should thus be reduced.

  The script "reportbug" uses 'quoted-printable' encoding when a line is
longer than 1024 characters in an 'ascii' file.

  See man-pages(7), item "semantic newline".

-.-

The difference between the formatted output of the original and patched file
can be seen with:

  nroff -mandoc <file1> > <out1>
  nroff -mandoc <file2> > <out2>
  diff -d -u <out1> <out2>

and for groff, using

"printf '%s\n%s\n' '.kern 0' '.ss 12 0' | groff -mandoc -Z - "

instead of 'nroff -mandoc'

  Add the option '-t', if the file contains a table.

  Read the output from 'diff -d -u ...' with 'less -R' or similar.

-.-.

  If 'man' (man-db) is used to check the manual for warnings,
the following must be set:

  The option "--warnings=w"

  The environmental variable:

export MAN_KEEP_STDERR=yes (or any non-empty value)

  or

  (produce only warnings):

export MANROFFOPT="-ww -b -z"

export MAN_KEEP_STDERR=yes (or any non-empty value)

-.-
