Re: RFC: Charset Conversion Routines

Alexey Melnikov Tue, 24 Feb 2009 03:20:50 -0800

Bron Gondwana wrote:

d) Whitespace compression.  I'm currently mapping all
  whitespace to ' ' instead of '', and then either stripping
  all ' ' from the string, or only outputting them if the
  previous character on the output string was not a space.
  Rob tells me that there are some issues with Asian charsets
  and space not having any meaning - how best to handle?

I think no matter what you do with whitespace compression, it might not work for some languages. So I wouldn't worry too much about this, as long as this procedure is optional (or can be controlled by a configuration option or a client).
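
For reference, the two behaviours described in (d) could be sketched roughly as below. This is only an illustration, not the code under discussion: the function name ws_compress, the in-place signature and the flag value are assumptions, and it recognises ASCII whitespace only. Characters such as U+3000 (ideographic space) pass through untouched, which is one reason the step should stay optional.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bit value; the thread names the flag but not its value. */
#define CHARSET_STRIPSPACE (1 << 1)

/* ASCII whitespace only; multi-byte whitespace is not recognised. */
static int is_ascii_space(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\n' ||
           c == '\r' || c == '\f' || c == '\v';
}

/*
 * Compress whitespace in a NUL-terminated string, in place.
 * With CHARSET_STRIPSPACE set, all whitespace is dropped.
 * Otherwise every whitespace character is mapped to ' ' and is only
 * written if the previous output character was not already a space.
 * Returns the new length.
 */
size_t ws_compress(char *s, int flags)
{
    assert(s != NULL);

    size_t orig_len = strlen(s);
    char *out = s;

    for (const char *in = s; *in; in++) {
        if (is_ascii_space((unsigned char)*in)) {
            if (flags & CHARSET_STRIPSPACE)
                continue;               /* drop all whitespace */
            if (out > s && out[-1] == ' ')
                continue;               /* collapse a run to one space */
            *out++ = ' ';
        } else {
            *out++ = *in;
        }
    }
    *out = '\0';

    size_t len = (size_t)(out - s);
    assert(len <= orig_len);            /* compression never grows the string */
    return len;
}
```

With flags set to 0 a run such as "a \t b" becomes "a b"; with CHARSET_STRIPSPACE it becomes "ab".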

e) Interfaces, interfaces, interfaces.  At the moment we have:

* charset_compilepat - for use in:
 * charset_searchstring
 * charset_searchfile
* charset_decode_mimebody - and
 * charset_encode_mimebody
* charset_extractfile

My current implementation that I'm working on uses "int flags"
as an extra parameter to each of these, allowing CHARSET_CANON
and CHARSET_STRIPSPACE to be passed down to the translation
layer.

This looks sensible.

Another alternative is to implement whitespace compression in another function, layered on top of the charset API.
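
To make the two options concrete, here is a rough sketch. Everything prefixed demo_ is hypothetical and merely stands in for the real charset routines, whose signatures are not shown in this thread; the flag value is also an assumption. The first variant passes "int flags" down to the conversion call, the second leaves the conversion call untouched and applies whitespace stripping as a separate step on its output.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bit value; the thread names the flag but not its value. */
#define CHARSET_STRIPSPACE (1 << 1)

/* Stand-in for the translation layer: returns an unchanged copy of the
 * input. The caller frees the result. */
static char *demo_convert(const char *in)
{
    assert(in != NULL);
    size_t n = strlen(in) + 1;
    char *out = malloc(n);
    if (out)
        memcpy(out, in, n);
    return out;
}

/* Separate post-processing step: remove whitespace in place
 * (as classified by isspace() in the current locale). */
void demo_strip_space(char *s)
{
    char *out = s;
    for (; *s; s++)
        if (!isspace((unsigned char)*s))
            *out++ = *s;
    *out = '\0';
}

/* Option 1: the flag travels with the conversion call. */
char *demo_convert_flags(const char *in, int flags)
{
    char *out = demo_convert(in);
    if (out && (flags & CHARSET_STRIPSPACE))
        demo_strip_space(out);
    return out;
}

/* Option 2: conversion keeps its signature; callers that want
 * stripping call this wrapper instead. */
char *demo_convert_stripped(const char *in)
{
    char *out = demo_convert(in);
    if (out)
        demo_strip_space(out);
    return out;
}
```

The trade-off in this sketch is that option 1 changes every call site to carry the extra parameter, while option 2 only touches the callers that actually want stripping, at the cost of a second pass over the converted text.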

Would people be happy with that as an interface?  It's
somewhat invasive, needing changes through lots of imap/*.c and
sieve/*.c files.

Bron.
