http://qa.mandrakesoft.com/show_bug.cgi?id=4212
------- Additional Comments From [EMAIL PROTECTED] 2003-18-07 18:38 -------
Created an attachment (id=538) --> (http://qa.mandrakesoft.com/attachment.cgi?id=538&action=view)
The patch

------- Reminder: -------
assigned_to: [EMAIL PROTECTED]
status: UNCONFIRMED
creation_date:
description:
In Mandrake 9.1, the installer has an option "Use Unicode by default". This option selects the UTF-8 versions of locales, which finally moves the distribution toward replacing diverse character set encodings with a single, well-known and well-understood one: UTF-8.

However, some UTF-8 locales have a problem with man pages. Hyphens (e.g. in option names) are displayed incorrectly and cannot be searched for (the "minus" character typed on the keyboard doesn't match them). The cause is that the groff utility, which formats the pages (when called from the nroff shell script), outputs the "\-" sequence in the source input as Unicode character "0x2212" (MINUS SIGN) and the "-" character as Unicode character "0x2010" (HYPHEN), instead of the backward-compatible ASCII hyphen-minus "0x002D". The "0x2212" character isn't handled properly by either the less viewer or the output terminal. As a result it is displayed with a leading garbage character and can't be typed when searching in the manual page (so it isn't possible, for example, to search for the "-h" option when reading the manual for ls).

Among others, the "en_US.UTF-8" locale is affected by this bug. OTOH, some other locales (e.g.
"pl") aren't influenced by it because the nroff wrapper has a quick hack which switches from UTF-8 to legacy encodings (like ISO-8859-2) for those locales, since man pages are still encoded in non-UTF8 charsets. See the source of /usr/bin/nroff script for details. The problem is solved by modifying groff's font descriptions for the utf8 device so that the standard, ASCII-compatible "0x002D" character code is used instead of "0x2212" for the hyphen sequence ("\-"). The font settings for utf8 device are in the /usr/share/groff/1.18.1/font/devutf8/ directory, in the files R (for regular text), B (for bold), I and BI (for italic an bold-italic respectively). I'm attaching a patch that does the change. Test if the patch will apply cleanly by doing: # cd / # patch -p1 --dry-run < path/to/patch/devutf8_hyphen.patch Apply the patch: # cd / # patch -p1 < path/to/patch/devutf8_hyphen.patch Test by executing "man ls" and "man mount" in the en_US.UTF-8 locale. All hyphens should me ok, you should be able to search for an option e.g. "-v". Please, test it and, if you find it to be correct, apply and (if needed) forward to groff maintainers (http://www.gnu.org/directory/GNU/groff.html).