DJ Lucas wrote:
> DJ Lucas wrote:
>> Thank you again for the detailed critique, suggestions and examples.
>> You've been a great help. I'll have another go at it using your text above.
>>
>
> OK, I think this is almost the final...
>
> http://www.linuxfromscratch.org/~dj/LFS-MANDB/chapter06/man-db.html
>
> Some packages provide non-English manual pages. They are displayed
> correctly only if their location and encoding matches the expectation
> of the "man" program. However, different Linux distributions have
> different policies (expressed in the choice of the man program, its
> configuration and patches applied to it) concerning the character
> encoding in which manual pages are stored in the filesystem.
>
> E.g., Debian previously required Russian manual pages to be encoded
> in KOI8-R and to be placed in /usr/share/man/ru. Now, in addition,
> their man program (Man-DB) searches for UTF-8 encoded Russian manual
> pages in /usr/share/man/ru.UTF-8. On the other hand, Fedora uses
> UTF-8 encoded manual pages exclusively. Russian manual pages are
> found in /usr/share/man/ru and their man program doesn't acknowledge
> /usr/share/man/ru.UTF-8. Many other distributions ignore the problem
> completely, leaving the end user with a mix of readable and
> unreadable manual pages, and even worse yet, unreadable error
> messages when a suitable manual page is not found.

"ignore the problem" => which problem? The text suggests that many
distributions ignore the fact that different distributions have
different policies. Some other wording is needed. Maybe: "Many other
distributions ignore the need for a consistent policy, leaving the user
with ..."?

"a mix of readable and unreadable manual pages" - yes, very well
spotted, better than I formulated on this list! However, there is a very
low-priority wish: some people will misinterpret the word "unreadable"
as "no way to make the man program access this file" instead of "man
reads this file and displays garbage". Here a picture would be worth a
thousand words, but pictures are not in the current LFS tradition.

"and, even worse yet, unreadable error messages" => no, unreadable pages
are worse.
And the unreadable error messages follow from a bug in the "man" program
(it uses the obsolete catgets interface instead of gettext), not from
misplaced or misencoded manual pages, so let's not mention them.

> Disagreement about the expected encoding of manual pages amongst
> distribution vendors, has led to confusion for upstream package
> maintainers. One package may contain UTF-8 manual pages, while
> another ships with manual pages in legacy encodings. Man-DB uses a
> built-in table (see below) to find the correct search directory for
> manual pages based on the user's locale settings.

No, it doesn't look into the table in this case. See add_nls_manpath()
in
http://www.chiark.greenend.org.uk/~cjwatson/bzr/man-db/trunk/src/manp.c

It iterates over all subdirectories and tests whether the subdirectory
is for the user's language, completely disregarding the encoding. IOW,
all of /usr/share/man/ru{,.KOI8-R,.CP1251,.UTF-8} are searched in all of
the ru_RU.KOI8-R, ru_RU.CP1251 (unofficial, has to be localedef'ed
manually) and ru_RU.UTF-8 locales.

The rest is OK.

> ...I have a couple of questions:

<snip already answered questions>

> Was this an LFS
> only problem in that we didn't pass '+lang none' to Man's build?

It is a problem for all distributions that don't pass '+lang none'. No
distribution known to me passes '+lang none'. Fedora converts error
messages so that they look right in UTF-8 locales, but this makes them
incorrect in legacy locales.

> Also, I think the one line paragraph above the table can be removed
> completely since the table is explained in the paragraph above that, but
> I'm not sure.

Yes, remove it.

--
Alexander E. Patrakov
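
To illustrate the add_nls_manpath() behaviour described above (every
subdirectory for the user's language is searched, whatever its encoding
suffix), here is a rough sketch. It is not the code from manp.c: the
function names language_of() and nls_subdirs() are made up for this
example, and it assumes that only the language part of the name is
compared, so territory handling in the real code may differ.

```python
def language_of(name):
    """Return the language part of a locale or directory name.

    Strips the encoding (".UTF-8"), modifier ("@euro") and territory
    ("_RU") parts, so "ru_RU.KOI8-R", "ru.UTF-8" and "ru" all give "ru".
    """
    for sep in (".", "@", "_"):
        name = name.split(sep, 1)[0]
    return name


def nls_subdirs(subdirs, locale):
    """Pick every subdirectory whose language matches the locale's language.

    The encoding suffix is ignored on both sides (hypothetical helper,
    only an approximation of what man-db does).
    """
    lang = language_of(locale)
    return [d for d in subdirs if language_of(d) == lang]


# Example directory listing of /usr/share/man (invented for illustration).
subdirs = ["de", "ru", "ru.KOI8-R", "ru.CP1251", "ru.UTF-8", "man1"]

for locale in ("ru_RU.KOI8-R", "ru_RU.CP1251", "ru_RU.UTF-8"):
    print(locale, nls_subdirs(subdirs, locale))
```

With this listing, all three Russian locales select the same four ru*
directories, which is why pages in the wrong encoding can end up being
displayed.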
