On December 26, 2004 at 13:04, Jeff Breidenbach wrote:

> Unfortunately, while MHonArc is fine at dealing with UTF-8 messages,
> it will choke on a UTF-8 configuration file. So the Serbian

Yes and no, depending on where the multi-byte encoding occurs. But in
general, it is wise to avoid multi-byte sequences when possible. I
believe a warning about this is somewhere in the MHonArc docs.

> localization will need to be converted to ISO 10646 numerical
> character references. I don't personally know how to do a
> UTF-8 -> ISO 10646 conversion, but hopefully you can figure it
> out or maybe someone on gossip knows how to do it.

All that is needed is the Unicode code point value of each character;
use that value as the numeric character reference. You can write a
Perl script using unpack to map the UTF-8 sequences into character
entity references. Take a look at MHonArc's MHonArc::CharEnt module
for one implementation that does this (note: use the version in the
latest snapshot build, since it contains a fix to the invocation of
unpack for Perl versions >= 5.6).

--ewh

_______________________________________________
Discussion list for The Mail Archive
Gossip@jab.org
http://jab.org/cgi-bin/mailman/listinfo/gossip
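
For illustration, the code point mapping described in the reply can be sketched in a few lines. This is only a rough sketch, written in Python rather than as the Perl/unpack script suggested above, and it is not how MHonArc::CharEnt does it. The function names to_numeric_refs and convert_utf8_bytes are made up for this example. It assumes the input is valid UTF-8 and that decimal references of the form &#NNNN; are what is wanted.

```python
def to_numeric_refs(text: str) -> str:
    """Replace each non-ASCII character with a decimal numeric
    character reference built from its Unicode code point."""
    # ord() returns the code point; ASCII characters pass through as-is.
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


def convert_utf8_bytes(raw: bytes) -> str:
    """Hypothetical helper: decode UTF-8 bytes, then apply the mapping."""
    return to_numeric_refs(raw.decode("utf-8"))


if __name__ == "__main__":
    # "Датум" is used here purely as sample Cyrillic input.
    print(to_numeric_refs("Датум"))
```

Python's standard codecs give the same result without a hand-written loop: text.encode("ascii", "xmlcharrefreplace").decode("ascii"). Either way, the output is pure ASCII, so no multi-byte sequences remain in the converted text.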