On 01/05/2009, at 12:49 AM, Wojciech Kaczmarek wrote:

On Thu, Apr 30, 2009 at 16:56, Brian Candler <[email protected]> wrote:
On Thu, Apr 30, 2009 at 02:23:17PM +0100, Brian Candler wrote:
(5) Strangely, doc id keys in _all_docs appear to behave differently; perhaps they are compared as ASCII rather than by UCA. See script 3
below.

And this has just had me tearing my hair out for the last half hour: a
search for

   _all_docs?startkey="_design/"&endkey="_design/ZZZZ"

did not match some of my documents, e.g. _design/c000. Now I realise that
almost certainly this is because Z comes before c in ASCII collation.

Is this intentional behaviour? If so I will change the Wiki so it recommends

   _all_docs?startkey="_design/"&endkey="_design/~"
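
To illustrate the range problem described above, here is a minimal Python sketch. It assumes doc ids are compared by raw code point (which matches ASCII order for these keys); the helper in_key_range and the sample doc ids are hypothetical, and no CouchDB API is involved.

```python
# Assumption: doc ids are compared by raw code point, which is the same as
# ASCII order for these keys. Python's str comparison behaves that way.

def in_key_range(doc_id, startkey, endkey):
    """Return True if doc_id falls inside the inclusive [startkey, endkey] range."""
    return startkey <= doc_id <= endkey

# Hypothetical doc ids, for illustration only.
doc_ids = ["_design/Abc", "_design/c000", "_design/zzz", "other"]

# 'Z' (0x5A) sorts before 'c' (0x63), so _design/c000 falls outside this range.
with_zzzz = [d for d in doc_ids if in_key_range(d, "_design/", "_design/ZZZZ")]

# '~' (0x7E) sorts after every ASCII letter and digit, so it catches the rest.
with_tilde = [d for d in doc_ids if in_key_range(d, "_design/", "_design/~")]

print(with_zzzz)
print(with_tilde)
```

Under that assumption, only ids whose first character after "_design/" sorts at or before 'Z' survive the ZZZZ range, while the '~' range keeps every ASCII-named design document in the sample.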

Isn't it better to use "\u9999" as the ending marker?


\u9999 isn't the final Unicode collation point: firstly, it's not the last value in a 16-bit space; secondly, Unicode isn't 16 bits; and finally, Unicode collation is locale-dependent.
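
The first two points can be checked with a short Python sketch. It compares by code point only, which is an assumption made for illustration and not how UCA or any locale-specific collation orders strings; the helper in_prefix_range and the sample doc id are hypothetical.

```python
import sys

END_MARKER = "\u9999"

# U+9999 is well below the top of the 16-bit range (U+FFFF) ...
assert ord(END_MARKER) == 0x9999
assert ord(END_MARKER) < 0xFFFF

# ... and Unicode itself extends past 16 bits, up to U+10FFFF.
assert sys.maxunicode == 0x10FFFF

def in_prefix_range(doc_id, prefix, end_marker):
    """Code-point check: is doc_id within [prefix, prefix + end_marker]?"""
    return prefix <= doc_id <= prefix + end_marker

# Hypothetical doc id starting with U+AC00 (a Hangul syllable), which sorts
# after U+9999 by code point and so is missed by the "\u9999" end marker.
missed = "_design/\uac00"
print(in_prefix_range(missed, "_design/", END_MARKER))

# An ASCII-named design document is still matched.
print(in_prefix_range("_design/c000", "_design/", END_MARKER))
```

The third point, locale dependence, is not modelled here: a real collator may order the same strings differently from one locale to another, so no fixed end marker is safe in general.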

I've previously argued that the only way to do this correctly is to allow a prefix search defined over all JSON values: http://mail-archives.apache.org/mod_mbox/couchdb-dev/200901.mbox/%[email protected]%3e

Antony Blakey
--------------------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787

Only two things are infinite, the universe and human stupidity, and I'm not sure about the former.
 -- Albert Einstein
