Thank you all for your responses to my request. I expected a few pointers and instead I received something much better. I am in awe of how much some of you know about these matters.

Now, since you are being so helpful, I want to narrow down the issue of charset support in a couple of ways and ask some further questions that are more practical and specific to the environment I am working in.

It is very much our interest and intention to promote the use of Unicode in applications generally and email clients in particular. Now, Mark Keasling wrote, "If SHIFT_JIS, EUC-JP and ISO-2022-JP aren't supported you'd have a difficult time entering the Japanese market." But to what extent is such support going to be expected (or demanded)? Thanks to our use of c-client, we have no problem handling incoming messages with text in these charsets. But would someone developing a Japanese-market email client be content to work internally using only Unicode, or would they want to handle the data in those other charsets directly? We have two apparent options for incoming text: (1) converting every text part to UTF-8 while keeping a record of the original charset, or (2) giving the developer the option of reading text in either UTF-8 or the original charset. We would prefer (1) but are quite ready to provide (2) if necessary. And is there a (3)?
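
To make the two incoming options concrete, here is a minimal sketch in Python. It is only an illustration of the idea, not our c-client-based code; the names TextPart, import_text_part and read_text are hypothetical, and it assumes the charset label on the message is one that Python's codec registry recognizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextPart:
    """Hypothetical container for one decoded text part."""
    utf8: bytes             # the text converted to UTF-8 (option 1)
    original_charset: str   # the charset label exactly as it arrived
    original: bytes         # the untouched bytes, kept only so option 2 is possible


def import_text_part(raw: bytes, charset: str) -> TextPart:
    """Convert an incoming text part to UTF-8 and record where it came from.

    Raises LookupError for an unknown charset label and
    UnicodeDecodeError if the bytes are not valid in that charset.
    """
    text = raw.decode(charset)
    return TextPart(utf8=text.encode("utf-8"),
                    original_charset=charset,
                    original=raw)


def read_text(part: TextPart, want_original: bool = False) -> bytes:
    """Option 2: let the caller choose UTF-8 or the original encoding."""
    return part.original if want_original else part.utf8
```

Under option (1) alone, the original field and the want_original switch would simply be dropped; the cost of offering (2) is mainly keeping the original bytes around.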

I assume that a parallel situation applies to outgoing messages. Are there in fact any differences there? We can either (1) require that the application supply all outgoing text in UTF-8, and then convert it to any requested charset before the message is sent, or (2) accept outgoing text in UTF-8 and convert it as requested, while also accepting text in any other labeled charset, in which case we treat it as binary data and send it unconverted.
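
The outgoing side could be sketched the same way. Again this is a hedged Python illustration rather than our actual code; prepare_outgoing and its parameters are hypothetical names. It follows option (2): UTF-8 input is converted to the requested charset, and anything else is passed through untouched under its own label.

```python
import codecs
from typing import Optional, Tuple


def _is_utf8(label: str) -> bool:
    # codecs.lookup normalizes aliases such as "UTF8" or "utf_8" to "utf-8".
    return codecs.lookup(label).name == "utf-8"


def prepare_outgoing(body: bytes,
                     source_charset: str,
                     target_charset: Optional[str] = None) -> Tuple[bytes, str]:
    """Return (bytes to send, charset label to put on the message)."""
    if not _is_utf8(source_charset):
        # Text in any other labeled charset is treated as binary data:
        # no conversion and no validation.
        return body, source_charset
    if target_charset is None or _is_utf8(target_charset):
        return body, "UTF-8"
    # Raises UnicodeEncodeError if the text cannot be represented
    # in the requested charset.
    text = body.decode("utf-8")
    return text.encode(target_charset), target_charset
```

Option (1) would be the same function with the first branch replaced by an error, so that only UTF-8 input is accepted.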

Again, many thanks for your generous assistance,

Pete Maclean

