On 2012-03-12 17:24:56 +0000, Jonathan M Davis said:

The problem is that sort requires a random access range, and narrow string
(string and wstring) aren't, because they're variable length encoded. I'm not
sure that strings _can_ be efficiently sorted, because of that, but maybe
there's a sorting algorthm that could do it reasonably efficiently,

I'd certainly think that'd be posible. (Might, in fact, be a nice problem for the students in my algorithms class ;)

I guess I'm just tripped up by the fact that char[] is treated as an "almost-string", and hence is "infected" by the variable-length encoding of string (i.e., const char[]).

This all makes sense, for sure -- at least as a practical tradeoff or the like. (After all, UTF-8 is a super-elegant solution to a very difficult problem.) It's just so easy to assume that T[] is a random-access range. It seems so *obvious*. And it is true for any type T except the variable-length chars... :)

In my case, I was just using char literals (and strings) as an easy way of encoding test cases for a template class, and the sorting (+ uniq) was just a way of removing duplicates. (Could've used a hash, of course.) So I was really just treating it as a ubyte[]. Slightly Evil[tm], I guess.

and we could
special case sort for narrow strings to use that one, but it's a while since I
messed with sorting algorithms, so I don't remember all of their
characteristics off of the top of my head. Certainly, with how sort is currenly
implemented, it can't handle any range which isn't a random access range.

No, I get that. I was just assuming that any T[] could be treated as a random access range if I really wanted to ;)

--
Magnus Lie Hetland
http://hetland.org

Reply via email to