http://qa.mandrakesoft.com/show_bug.cgi?id=4212





------- Additional Comments From [EMAIL PROTECTED]  2003-18-07 18:38 -------
Created an attachment (id=538)
 --> (http://qa.mandrakesoft.com/attachment.cgi?id=538&action=view)
The patch


-- 
Configure bugmail: http://qa.mandrakesoft.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


------- Reminder: -------
assigned_to: [EMAIL PROTECTED]
status: UNCONFIRMED
creation_date: 
description: 
In Mandrake 9.1, there's an installer option "Use Unicode by default".

This option causes UTF-8 versions of locales to be used, which finally moves the
distribution in the direction of unifying diverse character set encoding
standards into a single, well-known and well understood encoding: UTF-8.
However, there is a problem with some UTF-8 locales with regard to man pages.
The problem exhibits itself in hyphens (e.g. in the option names) being
displayed incorrectly and being unsearchable (the "minus" character from the
keyboard doesn't match them).

This happens because the groff utility used for formatting pages (when called
from the nroff shell script) renders the "\-" sequence in the source input as
Unicode character "0x2212" (minus sign) and the "-" character as Unicode
character "0x2010" (hyphen), instead of the backward-compatible hyphen-minus
(which has code "0x002D" for compatibility with ASCII).

The minus sign "0x2212" isn't handled properly by either the less viewer or
the output terminal. As a result, it's displayed with a leading garbage
character and can't be typed on the keyboard when searching in the manual page
(so that, e.g., it isn't possible to search for the "-h" option when reading the
manual for ls).
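
To make the mismatch concrete, here is a minimal sketch in POSIX shell that
prints the UTF-8 bytes of the three characters discussed above and shows that
an ASCII search pattern does not match a line containing "0x2212". The variable
and function names (hyphen_minus, hyphen, minus, show_bytes) are hypothetical
and chosen only for this illustration.

```shell
# Code points named in the report, written as UTF-8 octal escapes
# (POSIX printf understands \ooo escapes but not \u).
hyphen_minus='\055'        # U+002D, the ASCII "-" typed on the keyboard
hyphen='\342\200\220'      # U+2010 HYPHEN
minus='\342\210\222'       # U+2212 MINUS SIGN

# Print the bytes of one character as a hex string.
show_bytes() {
    printf "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

show_bytes "$hyphen_minus"   # prints 2d
show_bytes "$hyphen"         # prints e28090
show_bytes "$minus"          # prints e28892

# A line such as the utf8 device might produce for "\-h": U+2212 followed by "h".
# Counting matches of the ASCII pattern "-h" gives 0 (grep exits non-zero on
# no match, hence the "|| true").
printf "${minus}h\n" | grep -c -e '-h' || true
```

The single keyboard byte 2d never occurs inside the three-byte sequence for
"0x2212", which is why a search for "-h" finds nothing.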

Among others, the "en_US.UTF-8" locale is affected by this bug. On the other
hand, some other locales (e.g. "pl") aren't affected because the nroff wrapper
has a quick hack which switches from UTF-8 to legacy encodings (like ISO-8859-2)
for those locales, since man pages are still encoded in non-UTF-8 charsets. See
the source of the /usr/bin/nroff script for details.

The problem is solved by modifying groff's font descriptions for the utf8 device
so that the standard, ASCII-compatible "0x002D" character code is used instead
of "0x2212" for the hyphen sequence ("\-").

The font settings for utf8 device are in the
/usr/share/groff/1.18.1/font/devutf8/ directory, in the files R (for regular
text), B (for bold), I and BI (for italic and bold-italic respectively).
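
Before patching, it may help to see which entries in those four files mention
the code in question. The following is only a sketch: find_code is a
hypothetical helper, and it assumes the code is spelled in the files the same
way as in this report (e.g. "0x2212"). It does not modify anything.

```shell
# Hypothetical helper: list the lines in the R, B, I and BI font descriptions
# that mention a given character code.
#   $1 = font directory
#   $2 = code as it is assumed to be written in the files (e.g. 0x2212)
find_code() {
    for f in R B I BI; do
        if [ -f "$1/$f" ]; then
            # The extra /dev/null makes grep prefix each match with the file name.
            grep -n -e "$2" "$1/$f" /dev/null || true
        fi
    done
}

# Directory taken from the report; nothing is printed if it does not exist.
find_code /usr/share/groff/1.18.1/font/devutf8 0x2212
```

Each output line has the form file:line-number:entry, so the same command can
be rerun after patching to confirm that no entry refers to "0x2212" any more.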

I'm attaching a patch that makes the change.
Test whether the patch applies cleanly by running:
# cd /
# patch -p1 --dry-run < path/to/patch/devutf8_hyphen.patch
Apply the patch:
# cd /
# patch -p1 < path/to/patch/devutf8_hyphen.patch

Test by executing "man ls" and "man mount" in the en_US.UTF-8 locale. All
hyphens should be OK, and you should be able to search for an option, e.g. "-v".
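
The same check can be scripted rather than done by eye. This is a rough sketch,
not part of the patch: has_minus_sign and has_ascii_option are hypothetical
helpers, and the commented usage line assumes that the en_US.UTF-8 locale is
installed and that "col -b" is available to strip the overstrike sequences man
may emit when its output goes to a pipe.

```shell
# U+2212 MINUS SIGN as UTF-8 bytes (octal escapes, since POSIX printf has no \u).
MINUS_SIGN=$(printf '\342\210\222')

# Hypothetical helper: succeed if the text on stdin still contains U+2212.
has_minus_sign() {
    LC_ALL=C grep -q -e "$MINUS_SIGN"
}

# Hypothetical helper: succeed if the ASCII option string $1 occurs on stdin.
has_ascii_option() {
    LC_ALL=C grep -q -e "$1"
}

# Self-check on two sample lines: one as rendered before the fix, one after.
printf '%sv\n' "$MINUS_SIGN" | has_minus_sign && echo "before: U+2212 present"
printf '%s\n' '-v' | has_ascii_option '-v' && echo "after: -v is searchable"

# Possible use against a real page (assumptions as stated above):
#   LC_ALL=en_US.UTF-8 man ls | col -b | has_ascii_option '-v' && echo "ok"
```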

Please test it and, if you find it correct, apply it and (if needed) forward it
to the groff maintainers (http://www.gnu.org/directory/GNU/groff.html).
