On 23 Aug 2005, at 14:51, John Gardiner Myers wrote:

> Matt Sergeant wrote:
>> Wasn't there unicode normalisation in the original email parser that
>> I submitted to the project (that Theo turned into the current parser)?
>> Certainly it would make sense to use that if you could. It works very
>> well on a very large set of test data.
>
> That code only deals with MIME-labeled charsets. It has no provision
> for charset detection.

Really? I must have written that later in my local version of the code.
I can probably provide some code for charset detection - it's fairly
simple once you have the heuristics figured out.
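
For illustration only, a heuristic of this kind might look like the Python sketch below. It is not the code mentioned above: the function names, the fallback charset order, and the choice of NFC normalisation are all assumptions made for the example.

```python
import unicodedata

# Hypothetical fallback order for unlabeled 8-bit text; the thread does not
# say which charsets a real detector should try.
FALLBACK_CHARSETS = ("windows-1252", "iso-8859-1")


def guess_charset(raw: bytes) -> str:
    """Guess a charset for unlabeled bytes using a few simple heuristics."""
    # A UTF-8 byte order mark is the strongest hint available.
    if raw.startswith(b"\xef\xbb\xbf"):
        return "utf-8-sig"
    # Pure 7-bit data is valid in nearly every charset, so report ASCII.
    if all(byte < 0x80 for byte in raw):
        return "us-ascii"
    # Non-ASCII bytes that pass strict UTF-8 decoding are very likely UTF-8.
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # Otherwise take the first legacy charset that decodes without error.
    for name in FALLBACK_CHARSETS:
        try:
            raw.decode(name)
            return name
        except UnicodeDecodeError:
            continue
    # iso-8859-1 maps every byte value, so it is a safe last resort.
    return "iso-8859-1"


def decode_and_normalise(raw: bytes) -> str:
    """Decode with the guessed charset, then apply NFC normalisation."""
    text = raw.decode(guess_charset(raw))
    return unicodedata.normalize("NFC", text)
```

A real detector would likely weigh more evidence (byte frequencies, language hints, the declared but wrong charset), so treat the ordering here as a starting point rather than a recommendation.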
Matt.