"Douglas A. Tutty" <[EMAIL PROTECTED]> writes: > > What gets me is when a man page is written in english and "'" gets > translated as "?", as in can?t or "'" is a square white blob (on a > regular VT). Why couldn't whoever wrote it in english have used the > standard english "'" glyph instead of a UTF thingy?
The problem isn't the manpage author, it's your setup. Specifically, you're using a locale that sports UTF-8 encoding, but you're using a terminal/font combination that is not capable of correctly rendering UTF-8-encoded common typographical symbols used for English language text, like the right single quote / apostrophe. If you use a locale based on ASCII encoding instead, those manpages will render more correctly (for example, substituting the unsightly ASCII vertical apostrophe for its more urbane cousin or writing (C) in place of the copyright symbol). See the bottom of this post if LANG=C isn't good enough for you.

Unlike some people here, I couldn't give a σθιτ if you, S. Keeling, or anyone else wants to use UTF-8 or not---I'm not on any crusade---but an environment variable setting of "LANG=en_US.UTF-8" is basically an announcement to applications that your terminal is UTF-8 capable. You don't have to run a UTF-8-capable terminal if you don't want to, but you shouldn't lie to your applications and then whine about those damn foreigners writing manpages incorrectly (just a joke, just a joke).

In truth, if you look at the manpage source, you'll probably find that the manpage authors *have* used the ASCII "'" character for apostrophes and right single quotes. That's because this is the encoding convention used in the typesetting language "roff" in which manpages are written. You write `stuff like this' knowing that a correctly configured manpage rendering pipeline will convert those ASCII backticks and apostrophes into the correct English typographical symbols (if the manpage is being printed or being displayed on a sophisticated terminal) or at least do the best it can (if it's being delivered to an ASCII-only terminal). If manpage writers were really on the ball, they'd use \(lqleft and right double-quotes\(rq too, but you don't see too much of that.

To clarify further, there's nothing English about "'". If it's anything, it's ASCII, not English.
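
For what it's worth, here is a small Python sketch of the mismatch described above. It is only an illustration and not what "man" or groff actually run: it assumes the rendered page contains the typographic apostrophe (U+2019) and shows what a UTF-8 byte stream, a give-up-and-print-"?" fallback, and a friendlier ASCII substitution would each produce. The helper names are made up for this example.

```python
# Illustration only: these helpers are hypothetical, not part of man or groff.

RIGHT_SINGLE_QUOTE = "\u2019"  # typographic apostrophe / right single quote
ASCII_APOSTROPHE = "'"         # U+0027, the typewriter-style vertical tick


def as_utf8_bytes(text: str) -> bytes:
    """Bytes that would be sent to the terminal under a UTF-8 locale."""
    return text.encode("utf-8")


def lossy_ascii(text: str) -> str:
    """What a program that gives up on non-ASCII characters might print."""
    return text.encode("ascii", errors="replace").decode("ascii")


def friendly_ascii(text: str) -> str:
    """A nicer fallback: substitute the plain ASCII apostrophe."""
    return text.replace(RIGHT_SINGLE_QUOTE, ASCII_APOSTROPHE)


word = "can" + RIGHT_SINGLE_QUOTE + "t"
print(as_utf8_bytes(word))   # b'can\xe2\x80\x99t'
print(lossy_ascii(word))     # can?t
print(friendly_ascii(word))  # can't
```

A terminal that doesn't understand UTF-8 sees those three bytes (0xE2 0x80 0x99) as junk, which would explain the blob; the second line is the "can?t" case from the original complaint.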
I'm not sure that the ASCII standard actually specifies what printable characters, including "'", are supposed to look like, but in most fonts with ASCII-compatible encoding, the "'" character is rendered as an undirected, typewriter-style apostrophe, like a vertical tickmark, and I believe this is pretty much universally accepted as the "correct" rendering of this character, among those who care about these things. In particular, it is *not* the character used in typeset English text as an apostrophe or right single quote. It's rarely used in English text at all, except in historically ASCII contexts like email and computer plain text files. It's about as un-English as you can get. It's very ASCII, though.

Anyway, to really take a stand on this UTF-8 crap and announce to the world that 7 bits were good enough for cavemen so, by God, they're good enough for you too, you can simply use a preexisting ASCII-only locale (like LANG=C) or you can generate one. Add this line to "/etc/locale.gen":

    en_US ANSI_X3.4-1968

run "/usr/sbin/locale-gen" as root, and find some way to set "LANG=en_US" or "LC_ALL=en_US". ANSI_X3.4-1968 is another name for ASCII, so your new "en_US" locale shouldn't bother you with heretical characters. Some applications will still give up and print a "?" for non-ASCII characters, but "man" should do an excellent job displaying a pure ASCII rendering of your manpages for you.

-- 
Kevin Buhr <[EMAIL PROTECTED]>