At 14:15 +0100 8/21/03, Nicholas Clark wrote:
On Wed, Aug 20, 2003 at 07:19:42PM -0400, Benjamin Goldberg wrote:
 > Leopold Toetsch wrote:
 > > But these could be converted to utf32 as soon as they are seen.
 > For a long string, that could be quite a bit of bloat.
Jarkko's view is that the combined hit of the size of the extra code to skip
along the variable length encoding, the time taken to execute that code,
(and I guess the cache misses it creates) is greater than the gain from
saving space.

Indeed. I think available memory has increased more than fourfold since the first regexp engine that could only handle 1-byte ASCII, so relatively speaking I don't think the bloat is an issue. Just don't run regexps on 256 MByte strings when your machine has less than 1 GByte of RAM ;-)
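
To make the tradeoff concrete, here is a minimal Python sketch, not anything from Parrot or the Perl regexp engine. The helper names `utf8_char_offset` and `utf32_char_offset` and the sample string are made up for illustration. It shows that finding the Nth character in UTF-8 means scanning every byte before it, while in UTF-32 it is a single multiplication, at the cost of 4 bytes per character.

```python
def utf8_char_offset(buf: bytes, index: int) -> int:
    """Byte offset of the index-th character in UTF-8 data (linear scan)."""
    seen = 0
    for pos, byte in enumerate(buf):
        # Continuation bytes look like 10xxxxxx; anything else starts a character.
        if byte & 0xC0 != 0x80:
            if seen == index:
                return pos
            seen += 1
    raise IndexError(index)


def utf32_char_offset(index: int) -> int:
    """Byte offset of the index-th character in UTF-32 data (constant time)."""
    return 4 * index


# Hypothetical sample text: mostly ASCII with a few 2-byte characters.
text = "naïve café" * 1000
utf8 = text.encode("utf-8")
utf32 = text.encode("utf-32-le")  # "-le" variant, so no byte order mark is added

print(len(text), len(utf8), len(utf32))
print(utf8_char_offset(utf8, 3), utf32_char_offset(3))
```

For pure 1-byte data the UTF-32 copy is exactly four times the size, which is where the 256 MByte versus 1 GByte figure comes from.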



Liz
