On Tue, Feb 03, 2009 at 10:56:13PM +0000, Duncan Coutts wrote:
> > > Thanks to suggestions from Duncan Coutts, it's possible to call
> > > hSetEncoding even on buffered read Handles, and the right thing
> > > happens. So we can read from text streams that include multiple
> > > encodings, such as an HTTP response or email message, without having
> > > to turn buffering off (though there is a penalty for switching
> > > encodings on a buffered Handle, as the IO system has to do some
> > > re-decoding to figure out where it should start reading from again).
> >
> > Sounds useful, but is this the bit that causes the 30% performance hit?
>
> No. You only pay that penalty if you switch encoding. The standard case
> has no extra cost.

I'm confused. I thought the standard case was conversion to the system's
local encoding? How is that different than selecting the same encoding
manually? There always has to be *some* conversion from a 32-bit Char to
the system's selection, right? What exactly do we have to do to avoid
the penalty?

> No, I think that's 30% for latin1. The cost is not really the character
> conversion but the copying from a byte buffer via iconv to a char
> buffer.

Don't we already have to copy between a byte buffer and a char buffer,
since read() and write() use a byte buffer?

> > 30% slower is a big deal, especially since we're not all that speedy now.
>
> Bear in mind that's talking about the [Char] interface, and nobody using
> that is expecting great performance. We already have an API for getting

Yes, I know, but it's still the most convenient interface, and making it
suck more isn't cool -- though there are certainly big wins here.

-- John

_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
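
For reference, the mid-stream hSetEncoding case quoted at the top of this
message might look roughly like the sketch below. It is only an illustration,
not code from the patch under discussion: the helper name readMixed, the
temporary file, and the sample bytes are all made up for the example, and it
assumes a GHC whose System.IO exports hSetEncoding, latin1 and utf8. The
first line of the file is Latin-1 and the second is UTF-8, and both are read
from the same block-buffered Handle.

```haskell
import System.Directory (getTemporaryDirectory, removeFile)
import System.IO

-- Hypothetical helper: read one Latin-1 line, then one UTF-8 line,
-- from a single buffered read Handle, switching encoding in between.
readMixed :: FilePath -> IO (String, String)
readMixed path = do
  h <- openFile path ReadMode
  hSetBuffering h (BlockBuffering Nothing)  -- keep buffering on
  hSetEncoding h latin1
  header <- hGetLine h                      -- decoded as Latin-1
  hSetEncoding h utf8                       -- switch on the buffered Handle
  body <- hGetLine h                        -- decoded as UTF-8
  hClose h
  return (header, body)

main :: IO ()
main = do
  tmpDir <- getTemporaryDirectory
  (path, w) <- openBinaryTempFile tmpDir "mixed.txt"
  -- Binary handle: each Char below is written as a single raw byte.
  -- Line 1 is "caf" plus byte E9 (Latin-1 e-acute);
  -- line 2 is "na", bytes C3 AF (UTF-8 i-diaeresis), then "ve".
  hPutStr w "caf\xE9\nna\xC3\xAFve\n"
  hClose w
  (header, body) <- readMixed path
  removeFile path
  -- Compare against the expected code points rather than printing the
  -- text, so the result does not depend on the terminal's encoding.
  print (header == "caf\xE9", body == "na\xEFve")
```

The second hSetEncoding call is where the re-decoding penalty mentioned in
the quoted text would be paid; a program that never switches encoding never
reaches that path.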