Hi,

On Mon, 2008-01-28 at 12:29:35 +0000, Colin Watson wrote:
> On Mon, Dec 31, 2007 at 02:37:48PM +0000, Colin Watson wrote:
> > On Sun, Dec 30, 2007 at 10:28:12PM -0800, Russ Allbery wrote:
> > > Colin Watson <[EMAIL PROTECTED]> writes:
> > > > I propose that policy should standardise that we move to using UTF-8 as
> > > > the source encoding for all manual pages since it clearly makes sense to
> > > > do so.
> [...]
> > Right. Here's an update; I think I've captured most of the discussion in
> > the thread so far. The following patch could in principle be applied
> > now, given seconds. Wordsmithing welcome, as I'm aware that this is a
> > rather dense recommendation; I'm also looking for seconds for this
> > proposal.
>
> I'm also looking for at least one more second for this proposal.
>
> --- orig/policy.sgml
> +++ mod/policy.sgml
> @@ -8521,6 +8521,37 @@
> be present in the future.
> </footnote>
> </p>
> +
> + <p>
> + Manual pages in locale-specific subdirectories of
> + <file>/usr/share/man</file> should use either UTF-8 or the usual
> + legacy encoding for that language (normally the one corresponding
> + to the shortest relevant locale name in
> + <file>/usr/share/i18n/SUPPORTED</file>). For example, pages under
> + <file>/usr/share/man/fr</file> should use either UTF-8 or
> + ISO-8859-1.<footnote><prgn>man</prgn> will automatically detect
> + whether UTF-8 is in use. In future, all manual pages will be
> + required to use UTF-8.</footnote>
> + </p>
> +
> + <p>
> + A country name (e.g. <file>de_DE</file>) should not be included in
> + the subdirectory name unless it indicates a significant difference
> + in the language, as this excludes speakers of the language in
> + other countries.<footnote>At the time of writing, Chinese and
> + Portuguese are the main languages with such differences, so
> + <file>pt_BR</file>, <file>zh_CN</file>, and <file>zh_TW</file> are
> + all allowed.</footnote>
> + </p>
> +
> + <p>
> + Due to limitations in current implementations, all characters
> + in the manual page source should be representable in the usual
> + legacy encoding for that language, even if the file is
> + actually encoded in UTF-8. Safe alternative ways to write many
> + characters outside that range may be found in
> + <manref name="groff_char" section="7">.
> + </p>
> </sect>

Seconded.

regards,
guillem
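
For anyone who wants to check a page against the proposed wording, here is a minimal sketch in Python. It is not how man detects UTF-8; the function name, the return labels, and the default of ISO-8859-1 (taken from the French example in the patch) are assumptions made for illustration only. It tries a strict UTF-8 decode first, then checks that every character is also representable in the legacy encoding, as the third paragraph of the patch asks.

```python
def check_manpage_encoding(data: bytes, legacy: str = "iso-8859-1") -> str:
    """Classify raw manual page bytes against the proposed policy text.

    Hypothetical helper: the name and the return labels are illustrative
    and do not come from man or from the policy patch.
    """
    try:
        # Strict decode: any invalid byte sequence raises UnicodeDecodeError.
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so check whether it decodes as the legacy encoding.
        try:
            data.decode(legacy)
        except UnicodeDecodeError:
            return "invalid"
        return "legacy"
    try:
        # Valid UTF-8: every character should still fit the legacy encoding.
        text.encode(legacy)
    except UnicodeEncodeError:
        return "utf-8-unrepresentable"
    return "utf-8"
```

In practice one would call it as `check_manpage_encoding(open(path, "rb").read())`. Note that plain ASCII pages are reported as `"utf-8"`, since ASCII is a subset of both encodings, and that a single-byte encoding such as ISO-8859-1 accepts any byte sequence, so the `"invalid"` result is only reachable with stricter legacy encodings.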