Re: Bug#865713: Declaring a charset of UTF-8 for policy files

2017-06-24 Paul Hardy
On Sat, Jun 24, 2017 at 8:07 PM, Paul Hardy wrote: > On Sat, Jun 24, 2017 at 7:12 PM, Paul Wise wrote: >> On Sun, Jun 25, 2017 at 8:54 AM, Simon McVittie wrote: >> >>> For what it's worth, I agree that declaring the correct charset in HTTP >>> metadata is a better solution than prepending U+FEFF

Re: Bug#865713: Declaring a charset of UTF-8 for policy files

2017-06-24 Paul Hardy
On Sat, Jun 24, 2017 at 8:13 PM, Russ Allbery wrote: > > That's one of the things that confuses me a bit -- why not just use the > existing HTML files? ... > > I assume you're looking at: > > https://www.debian.org/doc/devel-manuals#policy I did a StartPage search for "debian upgrade checkli

Re: Bug#865713: Declaring a charset of UTF-8 for policy files

2017-06-24 Paul Wise
On Sat, 2017-06-24 at 20:48 -0700, Russ Allbery wrote: > Can't we just set the character set for the text files that come from > Debian Policy?  At least with Apache you can set character sets with > whatever granularity you want. Doesn't look like there are any files within the Debian Policy dir

Re: Bug#865713: Declaring a charset of UTF-8 for policy files

2017-06-24 Russ Allbery
Paul Wise writes: > On Sat, 2017-06-24 at 20:07 -0700, Paul Hardy wrote: >> 2) Set the HTTP headers for charset="UTF-8" > FYI, there are 1018 non-UTF-8 out of 2605 total *.txt files on the > Debian website and 9 non-UTF-8 out of 1102 total *.txt files in the > Debian archive mirrors. It seems fe
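A count like the one quoted above (non-UTF-8 files out of all *.txt files) can be approximated with a short script. This is only a sketch in Python, assuming a local copy of the tree to scan; the function names `is_valid_utf8` and `count_non_utf8` are hypothetical, and nothing here is claimed to be the method actually used in the thread.

```python
from pathlib import Path
from typing import Tuple


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (plain ASCII also passes)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def count_non_utf8(root: str, pattern: str = "*.txt") -> Tuple[int, int]:
    """Return (non_utf8, total) for regular files under root matching pattern."""
    total = 0
    non_utf8 = 0
    for path in Path(root).rglob(pattern):
        if not path.is_file():
            continue
        total += 1
        if not is_valid_utf8(path.read_bytes()):
            non_utf8 += 1
    return non_utf8, total
```

Note that a file passing this check is not guaranteed to have been written as UTF-8; it only means the bytes are a valid UTF-8 sequence.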

Re: Bug#865713: Declaring a charset of UTF-8 for policy files

2017-06-24 Paul Wise
On Sat, 2017-06-24 at 20:07 -0700, Paul Hardy wrote: > Three possibilities seem to exist, and I am fine with any one being chosen: > > 1) Use the UTF-8 signature in UTF-8 text files If this triggers browsers to use the right encoding, it seems reasonable to add it in the situation where the file

Re: Bug#865713: Declaring a charset of UTF-8 for policy files

2017-06-24 Russ Allbery
Paul Hardy writes: > If using the UTF-8 signature in a document is too aesthetically > distasteful (and I don't disagree), and if setting the HTTP header to > denote a UTF-8 charset is not a universal solution because it will only > have effect on Debian's servers, would a tool that converted such

Re: Bug#865713: Declaring a charset of UTF-8 for policy files

2017-06-24 Paul Hardy
On Sat, Jun 24, 2017 at 7:12 PM, Paul Wise wrote: > On Sun, Jun 25, 2017 at 8:54 AM, Simon McVittie wrote: > >> For what it's worth, I agree that declaring the correct charset in HTTP >> metadata is a better solution than prepending U+FEFF ZERO WIDTH NO-BREAK >> SPACE >> (aka the "byte-order mark

Re: Bug#865713: Declaring a charset of UTF-8 for policy files

2017-06-24 Paul Wise
On Sun, Jun 25, 2017 at 8:54 AM, Simon McVittie wrote: > For what it's worth, I agree that declaring the correct charset in HTTP > metadata is a better solution than prepending U+FEFF ZERO WIDTH NO-BREAK SPACE > (aka the "byte-order mark") in the file content. Forcing every text file to UTF-8 isn
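The two approaches contrasted here, an in-file UTF-8 signature (U+FEFF encoded as the bytes EF BB BF) versus a charset declared in HTTP metadata, can be illustrated with a small Python sketch. The helper names are hypothetical, and the header value shown is just the generic form of a Content-Type with a charset parameter, not the configuration deployed on Debian's servers.

```python
import codecs


def has_utf8_signature(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 encoding of U+FEFF."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_signature(data: bytes) -> bytes:
    """Remove a leading UTF-8 signature, if present; otherwise return data unchanged."""
    if has_utf8_signature(data):
        return data[len(codecs.BOM_UTF8):]
    return data


def content_type_for_text(charset: str = "UTF-8") -> str:
    """Build a Content-Type value that declares the charset out of band."""
    return f"text/plain; charset={charset}"
```

With the charset declared in the header, the file content itself stays free of the three signature bytes; Python's "utf-8-sig" codec can be used when reading files that may or may not carry the signature.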

Re: Bug#865713: Declaring a charset of UTF-8 for policy files

2017-06-24 Simon McVittie
On Sat, 24 Jun 2017 at 15:04:41 -0700, Russ Allbery wrote: > Stéphane Blondon writes: > > pabs added such configuration a few days ago for Apache configuration: > > https://anonscm.debian.org/cgit/mirror/dsa-puppet.git/commit/?id=5bcf8431d6b375d211a29f9d2c338e4400332e1a > > Paul, does this resolve

Re: Bug#865713: Declaring a charset of UTF-8 for policy files

2017-06-24 Russ Allbery
Stéphane Blondon writes: > pabs added such configuration a few days ago for Apache configuration: > https://anonscm.debian.org/cgit/mirror/dsa-puppet.git/commit/?id=5bcf8431d6b375d211a29f9d2c338e4400332e1a > The reason is a bad display in some browsers for UTF-8 encoded txt files. > The start of thi

Re: Bug#865713: Declaring a charset of UTF-8 for policy files

2017-06-24 Stéphane Blondon
On 24/06/2017 at 20:44, Russ Allbery wrote: > debian-www folks, is there a way to declare UTF-8 as the charset for all > the *.txt files that originate from the debian-policy package and are > served by www.debian.org? I can guarantee that all the text files shipped > as part of the Policy packa

Re: Bug#865713: Declaring a charset of UTF-8 for policy files

2017-06-24 Russ Allbery
debian-www, not debian-web. Colin Watson writes: > On Fri, Jun 23, 2017 at 11:49:20PM -0700, Russ Allbery wrote: >> I'm still a bit dubious about this, since I don't believe editors and >> generators normally add it, but given how we generate the text versions >> of the documents, it's relativel