Keith -- could you possibly supply an example of "a properly encoded utf-8 string" from which it can be unambiguously determined whether the string "sang" is an English word (the past tense of "sing") or a Vietnamese word meaning "to", "posh" or "knowingly" in English ? Could you also paste that string into Richard Ishida's Unicode String Analyser :
http://rishida.net/tools/analysestring/ and let us know what information it returns ?

Philip Taylor
--------
Keith J. Schultz wrote:

> Unfortunately, for efficiency reasons, utf-8 strings are not properly
> encoded and programs assume a particular language, to save space.

--
Windows 8 ? Just say "no".

--------------------------------------------------
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex