On Thursday, 28 June 2012 at 09:58:02 UTC, Roman D. Boiko wrote:
Pedantically speaking, it is possible to index a string with
about 50-51% memory overhead to get random access in O(1) time.
The best-performing algorithms can do random access in about 35-50
nanoseconds per operation for strings up to tens of megabytes.
For bigger strings (tested up to 1GB), or when other
memory-intensive calculations run at the same time,
random access takes up to 200 nanoseconds because of the cost of
resolving memory accesses.
This would support both random access to characters by their code
point index in a string and determining code point index by code
unit index.
If only the former is needed, space overhead decreases to 25% for
1K strings and <15% for 16K-1G strings (sizes measured in code
units; for wstring the byte count is twice that). Strings
up to 2^64 code units would be supported.
Dropping the reverse mapping would also improve access speed
significantly (by 10% for small strings and roughly a factor of
two for large ones).
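
To make the two lookups concrete, here is a minimal sketch of a sampled index. It is not the structure behind the numbers above, just the simplest thing I can think of that has the same shape. Assumptions: the string is valid UTF-8 held as bytes (so a code unit is one byte), and CodePointIndex, step, unit_offset, cp_index and char_at are names made up for this illustration. One table stores the code unit offset of every step-th code point, the other stores how many code points start before every step-th code unit. Each lookup jumps to a checkpoint and then scans a bounded number of code units, which is constant time for a fixed step.

```python
class CodePointIndex:
    """Sampled index over UTF-8 code units (hypothetical sketch).

    Assumes `data` is valid UTF-8. `step` is the sampling interval
    used for both tables.
    """

    def __init__(self, data: bytes, step: int = 64):
        self.data = data
        self.step = step
        # Code unit offset of code points 0, step, 2*step, ...
        self.cp_to_unit = []
        # Number of code points starting before code units 0, step, 2*step, ...
        self.unit_to_cp = []
        count = 0
        for offset, byte in enumerate(data):
            if offset % step == 0:
                self.unit_to_cp.append(count)
            if (byte & 0xC0) != 0x80:  # not a continuation byte
                if count % step == 0:
                    self.cp_to_unit.append(offset)
                count += 1
        self.length = count  # total number of code points

    def unit_offset(self, cp_index: int) -> int:
        """Code unit offset at which code point `cp_index` starts."""
        if not 0 <= cp_index < self.length:
            raise IndexError(cp_index)
        offset = self.cp_to_unit[cp_index // self.step]
        remaining = cp_index % self.step
        # Walk forward over at most step - 1 code points.
        while remaining:
            offset += 1
            if (self.data[offset] & 0xC0) != 0x80:
                remaining -= 1
        return offset

    def cp_index(self, unit_offset: int) -> int:
        """Index of the code point containing code unit `unit_offset`."""
        if not 0 <= unit_offset < len(self.data):
            raise IndexError(unit_offset)
        block = unit_offset // self.step
        count = self.unit_to_cp[block]
        # Count code point starts from the checkpoint up to the target unit.
        for offset in range(block * self.step, unit_offset + 1):
            if (self.data[offset] & 0xC0) != 0x80:
                count += 1
        return count - 1

    def char_at(self, cp_index: int) -> str:
        """Decode the code point at `cp_index`."""
        start = self.unit_offset(cp_index)
        end = start + 1
        while end < len(self.data) and (self.data[end] & 0xC0) == 0x80:
            end += 1
        return self.data[start:end].decode("utf-8")
```

If only lookup by code point index is needed, the unit_to_cp table and cp_index can be dropped, which is the cheaper case described above. A larger step shrinks both tables at the price of a longer scan per lookup.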