On Thursday, 28 June 2012 at 10:02:59 UTC, Roman D. Boiko wrote:
On Thursday, 28 June 2012 at 09:58:02 UTC, Roman D. Boiko wrote:
Pedantically speaking, it is possible to index a string with about 50-51% memory overhead to get random access in 0(1) time. Best-performing algorithms can do random access in about 35-50 nanoseconds per operation for strings up to tens of megabytes. For bigger strings (tested up to 1GB) or when some other memory-intensive calculations are performed simultaneously, random access takes up to 200 nanoseconds due to memory-access resolution process.

Just a remark, indexing would take O(N) operations and N/B memory transfers where N = str.length and B is size of cache buffer.

That being said, I would be against switching from string representation as arrays. Such switch would hardly help us solve any problems of practical importance better (by a significant degree) than they have to be solved with current design.

However, a struct could be created for indexing which I mentioned in two previous posts to give efficient random access for narrow strings (and arbitrary variable-length data stored consequently in arrays) without any significant overhead.

Respective algorithms are called Rank and Select, and there exist many variations of them (with different trade-offs, but some of them are arguably better than others).

I have investigated this question quite deeply in the last two weeks, because similar algorithms would be useful in my DCT project. If nobody else will implement them before me, I will eventually do that myself. It is just a matter of finding some free time, likely a week or two.

Reply via email to