On Sat, 2017-06-24 at 20:07 -0700, Paul Hardy wrote:

> Three possibilities seem to exist, and I am fine with any one being chosen:
>
> 1) Use the UTF-8 signature in UTF-8 text files

If this triggers browsers to use the right encoding, it seems reasonable to add it in the situation where the files could be served by any web server on the Internet. Right now all the mirrors of www.debian.org are on Debian-controlled servers, but there are many non-UTF-8 text files, so using the UTF-8 signature seems better.

> 2) Set the HTTP headers for charset="UTF-8"

FYI, there are 1018 non-UTF-8 out of 2605 total *.txt files on the Debian website and 9 non-UTF-8 out of 1102 total *.txt files in the Debian archive mirrors. It seems feasible to convert the files in the Debian archive to UTF-8, but it doesn't seem feasible to do that for www.debian.org.

pabs@mirror-anu:/srv/static.debian.org/mirrors/www.debian.org/cur$ find -iname '*.txt' | wc -l
2605
pabs@mirror-anu:/srv/static.debian.org/mirrors/www.debian.org/cur$ find -iname '*.txt' -print0 | xargs -0 isutf8 | wc -l
1018
pabs@mirror-anu:/srv/mirrors/debian$ find -iname '*.txt' | wc -l
1102
pabs@mirror-anu:/srv/mirrors/debian$ find -iname '*.txt' -print0 | xargs -0 isutf8 | wc -l
9

> 3) Convert UTF-8 text files to HTML documents for web display

Sounds like this is already done.

-- 
bye,
pabs

https://wiki.debian.org/PaulWise
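
For anyone wanting to repeat the counts above on a machine without isutf8 (it comes from moreutils), here is a rough sketch that uses iconv as the validator instead. The function name count_non_utf8 is made up for illustration, and iconv may not agree with isutf8 on every edge case, so treat the numbers as approximate.

```shell
#!/bin/sh
# Hypothetical helper (not from the original mail): print how many *.txt
# files under the directory "$1" fail UTF-8 validation.
# Assumption: iconv exits non-zero when its input is not valid UTF-8.
count_non_utf8() {
    find "$1" -type f -iname '*.txt' -exec sh -c '
        for f in "$@"; do
            # Emit one marker line per invalid file; the file name itself
            # is not printed, so odd characters in names cannot skew wc.
            iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1 || echo x
        done
    ' sh {} + | wc -l | tr -d ' '
}
```

Usage would be something like `count_non_utf8 /srv/mirrors/debian`, run from anywhere, since the directory is passed explicitly rather than taken from the current working directory.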