Dear Apache developers,

I sent the following three months ago, but since I got no response, and 2.0.49 has now been rolled without the patch, I am resubmitting it for your attention:
The default httpd.conf includes the lines

    AddCharset ISO-8859-1 .iso8859-1 .latin1
    AddCharset ISO-8859-2 .iso8859-2 .latin2 .cen
    AddCharset ISO-8859-3 .iso8859-3 .latin3
    AddCharset ISO-8859-4 .iso8859-4 .latin4
    AddCharset ISO-8859-5 .iso8859-5 .latin5 .cyr .iso-ru
    AddCharset ISO-8859-6 .iso8859-6 .latin6 .arb
    AddCharset ISO-8859-7 .iso8859-7 .latin7 .grk
    AddCharset ISO-8859-8 .iso8859-8 .latin8 .heb
    AddCharset ISO-8859-9 .iso8859-9 .latin9 .trk

However, a quick look at http://www.iana.org/assignments/character-sets shows that calling the non-Latin charsets ISO-8859-N by the name latinN is wrong. For example, latin8 is ISO-8859-14, or iso-celtic, and certainly not ISO-8859-8, which is just Hebrew! Similarly, latin6 is ISO-8859-10, and not ISO-8859-6, which is Arabic! Finally, latin5 is ISO-8859-9, Turkish, and not ISO-8859-5, which is Cyrillic. latin1-4 are ok, and I didn't find latin7 in this reference at all.

I suggest httpd.conf should be fixed accordingly. To make my point clearer, here is the patch:

--- httpd-2.0.48/docs/conf/httpd-std.conf.in.~20031011014743~ 2003-10-11 03:47:43.000000000 +0200
+++ httpd-2.0.48/docs/conf/httpd-std.conf.in 2003-12-15 18:47:07.000000000 +0200
@@ -797,11 +797,15 @@
 AddCharset ISO-8859-2 .iso8859-2 .latin2 .cen
 AddCharset ISO-8859-3 .iso8859-3 .latin3
 AddCharset ISO-8859-4 .iso8859-4 .latin4
-AddCharset ISO-8859-5 .iso8859-5 .latin5 .cyr .iso-ru
-AddCharset ISO-8859-6 .iso8859-6 .latin6 .arb
-AddCharset ISO-8859-7 .iso8859-7 .latin7 .grk
-AddCharset ISO-8859-8 .iso8859-8 .latin8 .heb
-AddCharset ISO-8859-9 .iso8859-9 .latin9 .trk
+AddCharset ISO-8859-5 .iso8859-5 .cyr .iso-ru
+AddCharset ISO-8859-6 .iso8859-6 .arb
+AddCharset ISO-8859-7 .iso8859-7 .grk
+AddCharset ISO-8859-8 .iso8859-8 .heb
+AddCharset ISO-8859-9 .iso8859-9 .latin5 .trk
+AddCharset ISO-8859-10 .iso8859-10 .latin6
+AddCharset ISO-8859-13 .iso8859-13 .latin7
+AddCharset ISO-8859-14 .iso8859-14 .latin8
+AddCharset ISO-8859-15 .iso8859-15 .latin9
 AddCharset ISO-2022-JP .iso2022-jp .jis
 AddCharset ISO-2022-KR .iso2022-kr .kis
 AddCharset ISO-2022-CN .iso2022-cn .cis

I have also included latin7 and latin9, which for some reason are absent from IANA, but appear as standard in the FSF's "free recode".

BTW, instead of inventing new charset abbreviations like .cyr, .arb, .grk, .heb, I would personally prefer using the IANA (RFC 1345) aliases: .cyrillic, .arabic, .greek, .hebrew, in the same way we use .latin1, .latin2, etc., but this is a matter of opinion, not bug-fix patching.

Best,

Zvi.

-- 
Dr. Zvi Har'El      mailto:[EMAIL PROTECTED]             Department of Mathematics
tel:+972-54-227607  icq:179294841                        Technion - Israel Institute of Technology
fax:+972-4-8293388  http://www.math.technion.ac.il/~rl/  Haifa 32000, ISRAEL
"If you can't say somethin' nice, don't say nothin' at all." -- Thumper (1942)
                                  Friday, 27 Adar 5764, 19 March 2004, 6:53PM
