Hi,

John Darrington wrote:
> Given a text file, it will attempt to guess the natural language in
> which it was written. I'm sure it would be fairly simple to modify it to
> guess the charset. If you point me to a reasonably large set of example
> files, I'll see what I can do.

You could use your existing samples, which hopefully include a fair number
of non-ASCII characters: recode them to UTF-8, and then re-encode them in a
few different charsets to get test files. German text would typically be in
latin-1 (ISO-8859-1), latin-9 (ISO-8859-15), or one of the Windows- or
Mac-specific charsets for Western or Central Europe.

-- 
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | [EMAIL PROTECTED]
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
--
Dimensions will always be expressed in the least usable term.
EXAMPLE: Velocity will be expressed in furlongs per fortnight.
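
The re-encoding step suggested in the reply above could be sketched roughly
as follows. This is only an illustration in Python, not anything from the
original tool: the function name make_charset_samples and the list of
candidate encodings are assumptions, and the encoding names are Python's
codec aliases for latin-1, ISO-8859-15, the Windows code pages for Western
and Central Europe, and Mac Roman.

```python
# Hypothetical helper: the name and the candidate list are illustrative
# assumptions, not part of the tool discussed in this thread.
CANDIDATE_ENCODINGS = ["latin-1", "iso8859-15", "cp1252", "cp1250", "mac_roman"]


def make_charset_samples(text, encodings=CANDIDATE_ENCODINGS):
    """Return {encoding: bytes} for each encoding that can represent text."""
    samples = {}
    for enc in encodings:
        try:
            samples[enc] = text.encode(enc)
        except UnicodeEncodeError:
            # The text has a character this charset cannot represent; skip it.
            continue
    return samples


# Start from a sample that has already been recoded to UTF-8.
utf8_sample = "Grüße aus Nürnberg".encode("utf-8")
samples = make_charset_samples(utf8_sample.decode("utf-8"))
```

Each value in samples is the same text as raw bytes in one charset, so the
set can serve as labelled training or test data for a charset guesser.
Encodings that cannot represent a given sample are simply left out.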