http://qa.mandrakesoft.com/show_bug.cgi?id=4212





------- Additional Comments From [EMAIL PROTECTED]  2003-22-07 15:33 -------
I ran into the same kind of trouble. My locale is fr_FR.UTF-8.
There is another problem with the special French characters that are colored in
the man pages. That's because grotty emits the color information in its old
format, which uses the backspace character to overstrike text. less doesn't
handle this correctly when a character takes more than one byte to encode.
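
The byte counts involved can be reproduced by hand. This is only a minimal
sketch: the helper name overstrike_bold is made up for illustration, and it
merely mimics the "character, backspace, character" pattern of the old format,
it does not call grotty.

```shell
# Hypothetical helper: emit one character in overstrike "bold" form,
# i.e. the character, a backspace, and the character again.
overstrike_bold() {
    printf '%s\b%s' "$1" "$1"
}

# "e acute" written as its two UTF-8 bytes (0xC3 0xA9) so this script
# itself stays pure ASCII.
E_ACUTE=$(printf '\303\251')

overstrike_bold e | wc -c | tr -d ' '            # prints 3
overstrike_bold "$E_ACUTE" | wc -c | tr -d ' '   # prints 5
```

A viewer that assumes one byte on each side of the backspace sees the expected
3 bytes for "e", but 5 bytes for the accented character.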

The solution: use the normal coloring format. This can be changed in the
man.config file by removing the "-c" from the NROFF arguments. I haven't noticed
any drawbacks with other locales and viewers, so I think it's fine.

At the same time, the previous problem (the special "-") can also be solved with
a simple modification of the man.config file: add -U to the less arguments to
let the terminal handle the special characters.
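
The two man.config changes described above could be sketched roughly as
follows. This is not the attached patch: the helper name fix_man_config and the
sample NROFF and PAGER lines are assumptions made for illustration, and the
real lines on a given system may look different.

```shell
# Hypothetical helper: reads man.config-style text on stdin and writes the
# adjusted text to stdout. It never touches a real file.
# Rules 1 and 2 drop a standalone "-c" from the NROFF line (in the middle or
# at the end of the line); rule 3 appends "-U" to the PAGER line.
fix_man_config() {
    sed \
        -e '/^NROFF[[:space:]]/s/[[:space:]]-c[[:space:]]/ /' \
        -e '/^NROFF[[:space:]]/s/[[:space:]]-c$//' \
        -e '/^PAGER[[:space:]]/s/$/ -U/'
}

# Sample input only; the exact lines in a real man.config are an assumption.
printf 'NROFF\t/usr/bin/nroff -c -mandoc\nPAGER\t/usr/bin/less -isr\n' | fix_man_config
```

The PAGER rule appends -U every time it runs, so the sketch is meant to be
applied once to a copy of the file and the result reviewed by hand.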

The patch is attached.

-- 
Configure bugmail: http://qa.mandrakesoft.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


------- Reminder: -------
assigned_to: [EMAIL PROTECTED]
status: UNCONFIRMED
creation_date: 
description: 
In Mandrake 9.1, there's an installer option "Use Unicode by default".

This option causes UTF-8 versions of locales to be used, which finally moves the
distribution in the direction of unifying diverse character set encoding
standards into a single, well-known and well understood encoding: UTF-8.
However, there is a problem with some UTF-8 locales with regard to man pages.
The problem exhibits itself in hyphens (e.g. in the option names) being
displayed incorrectly and being unsearchable (the "minus" character from the
keyboard doesn't match them).

This is due to the fact that the groff utility that's used for formatting pages
(when called from the nroff shell script) formats the "\-" sequence in the
source input as Unicode character "0x2212" (minus sign), and the "-" character
as Unicode character "0x2010" (hyphen), instead of the backward-compatible
hyphen-minus (which has code "0x002D" for compatibility with ASCII).
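
For reference, the three characters differ at the byte level once encoded as
UTF-8. The sketch below just dumps those bytes; the helper name hex_bytes is
made up for illustration.

```shell
# Hypothetical helper: print the bytes arriving on stdin as lowercase hex.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
    echo
}

printf '\342\210\222' | hex_bytes   # U+2212 as UTF-8: prints e28892
printf '\342\200\220' | hex_bytes   # U+2010 as UTF-8: prints e28090
printf '%s' - | hex_bytes           # U+002D:          prints 2d
```

Only the last one is the single byte that the "-" key on the keyboard produces.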

The "0x2212" character isn't handled properly by either the less viewer or the
output terminal. As a result, it's displayed with a leading garbage character
and can't be typed from the keyboard when searching in the manual page (so
that, e.g., it isn't possible to search for the "-h" option when reading the
manual for ls).
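
The search failure can be imitated outside of less with a plain byte-for-byte
match. This is a rough sketch under assumptions: the helper name
contains_literal and the sample option line are invented, and grep -F stands in
for the viewer's search.

```shell
# Hypothetical helper: succeed if the text in $1 contains the string in $2,
# compared literally.
contains_literal() {
    printf '%s\n' "$1" | grep -F -q -e "$2"
}

# The same made-up option line, once with ASCII "-" (0x2D) and once with
# U+2212 written as its UTF-8 bytes.
ASCII_LINE='  -h  print sizes in human readable format'
UTF8_LINE=$(printf '  \342\210\222h  print sizes in human readable format')

contains_literal "$ASCII_LINE" '-h' && echo "ASCII hyphen-minus: found"
contains_literal "$UTF8_LINE" '-h' || echo "U+2212 minus sign: not found"
```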

Among others, the "en_US.UTF-8" locale is affected by this bug. OTOH, some
other locales (e.g. "pl") aren't affected because the nroff wrapper has a quick
hack which switches from UTF-8 to legacy encodings (like ISO-8859-2) for those
locales, since man pages are still encoded in non-UTF-8 charsets. See the
source of the /usr/bin/nroff script for details.

The problem is solved by modifying groff's font descriptions for the utf8 device
so that the standard, ASCII-compatible "0x002D" character code is used instead
of "0x2212" for the hyphen sequence ("\-").

The font settings for the utf8 device are in the
/usr/share/groff/1.18.1/font/devutf8/ directory, in the files R (for regular
text), B (for bold), I and BI (for italic and bold-italic respectively).
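
The kind of change involved can be sketched as a one-line substitution. This is
not the attached patch: the helper name remap_minus is made up, and the sample
line only assumes that a font file entry starts with the glyph name "\-" and
ends with its character code.

```shell
# Hypothetical helper: on stdin, rewrite a line that starts with "\-" and
# ends with code 0x2212 so that it ends with 0x002D instead. Other lines
# pass through unchanged.
remap_minus() {
    sed 's/^\(\\-[[:space:]].*\)0x2212$/\10x002D/'
}

# Sample line only; the column layout of the real font files is an assumption.
printf '\\-\t24\t0\t0x2212\n' | remap_minus
```

Running it over copies of the R, B, I and BI files and diffing the result
against the originals would show exactly which lines such a substitution
touches before anything is installed.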

I'm attaching a patch that does the change.
Test if the patch will apply cleanly by doing:
# cd /
# patch -p1 --dry-run < path/to/patch/devutf8_hyphen.patch
Apply the patch:
# cd /
# patch -p1 < path/to/patch/devutf8_hyphen.patch

Test by executing "man ls" and "man mount" in the en_US.UTF-8 locale. All
hyphens should be OK, and you should be able to search for an option, e.g. "-v".

Please test it and, if you find it to be correct, apply it and (if needed)
forward it to the groff maintainers (http://www.gnu.org/directory/GNU/groff.html).
