In the case of GIF versus JPG, which are usually regarded as "lossless" versus "lossy", please note that there is no "original", in the sense of a stream of bytes. Why not? Because an image is not a stream of bytes. Period. What is being compressed here is a rectangular array of pixels, and that is what is being restored when the image is "viewed". I am not aware of ANY use of the GIF format to compress an arbitrary byte stream.

So, by analogy, if the XYZ compression format (I made that up) claims to compress a sequence of Unicode characters, as opposed to an arbitrary byte stream, and can later reconstruct that sequence of characters exactly, then I argue that it has every right to be called "lossless", in the same manner that GIF is called "lossless", because there is no original byte stream to preserve.
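
To make the distinction concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the function names are hypothetical, and I am supposing that the made-up XYZ format normalizes to NFC and then deflates the UTF-8 bytes with zlib. It only shows that a round trip can change the bytes while leaving the text canonically equivalent.

```python
import unicodedata
import zlib


def xyz_compress(text: str) -> bytes:
    """Hypothetical XYZ compressor: normalize to NFC (assumed), then deflate."""
    normalized = unicodedata.normalize("NFC", text)
    return zlib.compress(normalized.encode("utf-8"))


def xyz_decompress(blob: bytes) -> str:
    """Inverse of xyz_compress: inflate and decode back to a string."""
    return zlib.decompress(blob).decode("utf-8")


# Decomposed input: 'e' followed by U+0301 COMBINING ACUTE ACCENT, twice.
original = "re\u0301sume\u0301"
restored = xyz_decompress(xyz_compress(original))

# The byte streams differ, because NFC composed each pair into U+00E9.
bytes_equal = original.encode("utf-8") == restored.encode("utf-8")

# The two strings are still canonically equivalent under Unicode.
canon_equal = (
    unicodedata.normalize("NFC", original)
    == unicodedata.normalize("NFC", restored)
)

print("byte-for-byte identical:", bytes_equal)
print("canonically equivalent: ", canon_equal)
```

Whether that second property is enough to earn the word "lossless" is exactly the question on the table.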

Jill



> -----Original Message-----
> From: Doug Ewell [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, November 25, 2003 7:09 PM
> To: Unicode Mailing List; UnicoRe Mailing List
> Subject: Re: Compression through normalization
>
>
> Here's a summary of the responses so far:
>
> * Philippe Verdy and Jill Ramonsky say YES, a compressor can
> normalize, because it knows it is operating on Unicode character data
> and can take advantage of Unicode properties.
>
> * Peter Kirk and Mark Shoulson say NO, it can't, because all the
> compressor really knows about is the byte stream, so it must be
> preserved byte-for-byte.
>
> * I'm still not sure, but I'm leaning toward NO.
