>>>>> "JH" == Jarkko Hietaniemi <[EMAIL PROTECTED]> writes:

>> 2) An attached table of attributes and ranges to which they apply?
>> Uses less memory for sparse attributes, but means that it's hard work
>> every time we have to interrogate or shuffle characters as we need to
>> check all the ranges each time to see if the characters we are
>> manipulating have metadata.

JH> I believe this alternative has been discussed once in a while.  Which
JH> ranges an operation affects is a log(N) operation on the character
JH> position (binary search), and the ranges can also be kept sorted among
JH> themselves on (primary key start position, secondary key end
JH> position), so that finding out the victim ranges is also a log(N).
JH> Admittedly, log(N) tends to be larger than 1, and certainly larger
JH> than 0 :-)  Also, using UTF-8 (or any variable length encoding) is
JH> a pain since you can't any more just happily offset to the data.

JH> One could also implement SVs as balanced trees, splitting and merging
JH> as the scalar grows and shrinks.

I'd offer the possibility that there are two (or perhaps more) different
problems here.

One is the current bunch of bytes (a string, or an executable to be
twiddled). The other, which is what attributes on strings seem to be,
is structured data.

Squeezing attributes onto a buffer seems to be shoehorning a more
general problem onto a specific implementation.

Getting an efficient representation of a meaningful structure should be
done with a new data type. (I'm thinking of representing COBOL
records/data, or even XML documents.)

<chaim>
-- 
Chaim Frenkel                                  Nonlinear Knowledge, Inc.
[EMAIL PROTECTED]                                        +1-718-236-0183
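
For illustration only, the attached range table discussed in the quoted text might be sketched roughly as follows. This is not how Perl stores anything; the class name RangeAttributes and its methods are hypothetical, positions are taken to be character indices rather than byte offsets (which sidesteps the UTF-8 concern), and the ranges are assumed not to overlap so that a single binary search is enough to find the one range covering a position.

```python
from bisect import bisect_right


class RangeAttributes:
    """Hypothetical side table: non-overlapping character ranges -> attributes."""

    def __init__(self):
        self._starts = []  # start positions, kept sorted for binary search
        self._ranges = []  # parallel list of (start, end, attrs); end is exclusive

    def add(self, start, end, attrs):
        """Attach attrs to the characters in [start, end)."""
        if start >= end:
            raise ValueError("empty or inverted range")
        i = bisect_right(self._starts, start)
        # Assumption of this sketch: ranges never overlap, so reject any
        # new range that collides with its neighbour on either side.
        if i > 0 and self._ranges[i - 1][1] > start:
            raise ValueError("overlaps previous range")
        if i < len(self._ranges) and self._ranges[i][0] < end:
            raise ValueError("overlaps next range")
        self._starts.insert(i, start)
        self._ranges.insert(i, (start, end, attrs))

    def lookup(self, pos):
        """Return the attributes covering pos, or None, via binary search."""
        i = bisect_right(self._starts, pos) - 1
        if i >= 0:
            start, end, attrs = self._ranges[i]
            if pos < end:
                return attrs
        return None


table = RangeAttributes()
table.add(10, 20, {"bold": True})
table.add(0, 5, {"lang": "en"})
print(table.lookup(3))   # character 3 lies inside [0, 5)
print(table.lookup(7))   # character 7 lies in the gap between ranges
```

The lookup is the log(N) step; inserting into a Python list is still linear, which is roughly where the balanced-tree suggestion would come in. Allowing overlapping ranges would need something closer to an interval tree than a single sorted list.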