Hello

On Tue, Nov 06, 2007 at 02:00:44PM +0100, Tomasz Sterna wrote:
> On Mon, 05-11-2007 at 16:23 +0100, Tomasz Sterna wrote:
> > Alternatively we could invent binary-2-utf mapping which has less
> > overhead than BASE64.
> 
> Simplest that comes to mind:
> Let's take first 256 allowable UTF-8 characters and assign them to 256
> values of a single byte.
> That would be less than 33% BASE64 overhead.
> 
> But I'm sure one of the more knowledgeable in the UTF internals would
> come up with better mapping.

If you want to map every byte to a single character (for simplicity),
then you cannot come up with anything better, since the characters at
the beginning of the code space have the shortest UTF-8 encodings and
the encoded size grows with the code point.
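
A rough way to check the sizes is the Python sketch below. It assumes
that "allowable" means the XML 1.0 Char production (#x9, #xA, #xD, then
#x20 upwards), that byte values are uniformly distributed, and that
escaping of characters such as < and & is ignored. The names xml_chars,
ALPHABET and encode are made up for this illustration.

```python
import base64

def xml_chars(count):
    """First `count` code points allowed by XML 1.0 (assumed meaning of 'allowable')."""
    allowed = [0x09, 0x0A, 0x0D]   # the only control characters XML 1.0 permits
    cp = 0x20                      # everything from space upwards is allowed
    while len(allowed) < count:
        allowed.append(cp)
        cp += 1
    return [chr(c) for c in allowed]

# Hypothetical mapping: byte value i -> i-th allowed character.
ALPHABET = xml_chars(256)

def encode(data: bytes) -> str:
    """Map each byte to one character of ALPHABET."""
    return "".join(ALPHABET[b] for b in data)

# Every byte value exactly once stands in for uniformly distributed data.
raw = bytes(range(256))
mapped_len = len(encode(raw).encode("utf-8"))   # 99 one-byte + 157 two-byte chars
b64_len = len(base64.b64encode(raw))

print("mapped:", mapped_len, "bytes, overhead", mapped_len / len(raw) - 1)
print("base64:", b64_len, "bytes, overhead", b64_len / len(raw) - 1)
```

Only 99 of the first 256 allowed characters fit in one UTF-8 byte; the
remaining 157 take two, so real data that is mostly low byte values
would do better than this uniform case and high byte values worse.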

But how would the amount of data transferred change if the stream were
UTF-7? Most of it is namespaces, which contain only ASCII. Then you
have base64 data, and most of the text transferred is usually ASCII
too. This could be quite simple to add as a stream feature.
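
One way to get a feeling for the answer is to measure it. The sketch
below uses Python's built-in "utf-7" codec, which may make different
choices about optional characters than another UTF-7 encoder would. The
sample strings and the helper name encoded_sizes are only illustrative
assumptions, not real traffic.

```python
def encoded_sizes(text: str) -> dict:
    """Return the encoded length in bytes of `text` for UTF-8 and UTF-7."""
    return {codec: len(text.encode(codec)) for codec in ("utf-8", "utf-7")}

# Illustrative samples only; real stream content will differ.
samples = {
    "namespace": "http://etherx.jabber.org/streams",
    "base64 payload": "SGVsbG8gd29ybGQ=",
    "non-ASCII text": "Zażółć gęślą jaźń",
}

for label, text in samples.items():
    sizes = encoded_sizes(text)
    print(f"{label}: utf-8={sizes['utf-8']} utf-7={sizes['utf-7']}")
```

Note that UTF-7 has to escape a literal '+' as '+-', so base64 payloads
that contain '+' grow slightly, and non-ASCII text is wrapped in its own
base64 runs.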

-- 
Anyone who goes to a psychiatrist ought to have his head examined.
                -- Samuel Goldwyn

Michal 'vorner' Vaner
