On 29/12/2008, at 3:26 PM, Chris Anderson wrote:
I almost suggested giving an option for inclusive and exclusive
interval ends, basically, < / > vs <= / >= control from the client.
But then thinking about Maximillian's proposal (of defaulting to an
exclusive right end) I began to wonder if offering *only* the
interval-style he suggests, would satisfy both precision maths, and
newbie expectations.
My concern right now is prefix searching, e.g. paging through
startkey='rs' endkey='rs\uFFF8'
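For concreteness, a minimal Python sketch of how a client builds that range today. The helper name prefix_params and the choice of sentinel are illustrative assumptions, not anything CouchDB provides; the only thing taken as given is that view key parameters are sent JSON-encoded.

```python
import json
from urllib.parse import urlencode

def prefix_params(prefix, sentinel="\ufff8"):
    """Build startkey/endkey for the 'prefix plus high sentinel' range.

    Hypothetical helper: the client appends a high code point to the
    prefix and hopes no real key sorts after it.
    """
    return {
        "startkey": json.dumps(prefix),
        "endkey": json.dumps(prefix + sentinel),
    }

params = prefix_params("rs")
query_string = urlencode(params)  # ready to append to a view URL
print(query_string)
```

The weakness is visible in the signature: the client has to pick the sentinel, and the right choice depends on the collation in force on the server.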
It would be good to have a prefix-test mode that would be applicable
to the 'final' string component of a key, à la SQL's "LIKE 'rs%'". This
would eliminate the need for the 'rs\uFFF8' hack.
Something like endkey_succ=<key> which would be equivalent to a
non-inclusive endkey=succ(<key>) where succ(x) is the first key value wrt
the view collation algorithm that wouldn't satisfy x <= <key>. The
essential characteristic being that succ(x) doesn't need to be
calculated by the client.
I'm not suggesting endkey_succ as the syntactic mechanism.
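To illustrate the difference between the sentinel hack and a real prefix test, here is a rough Python sketch. It assumes keys are plain strings ordered by code point (Python's default, not ICU collation), and prefix_rows is a hypothetical stand-in for what the server would do.

```python
from bisect import bisect_left

def prefix_rows(sorted_keys, prefix):
    """Return the keys that start with prefix.

    Assumes sorted_keys is in code point order, under which all keys
    sharing a prefix are contiguous.
    """
    start = bisect_left(sorted_keys, prefix)
    rows = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # first key past the prefix block: succ(prefix) or later
        rows.append(key)
    return rows

keys = sorted(["rr", "rs", "rsa", "rs\ufff9", "rt"])

# The client-side hack: inclusive range ending at prefix + U+FFF8.
hack_rows = [k for k in keys if "rs" <= k <= "rs\ufff8"]

# The prefix test: no sentinel, nothing for the client to compute.
exact_rows = prefix_rows(keys, "rs")
```

Here the hack silently drops the key 'rs\ufff9', while the prefix test returns it. Under ICU collation the client's guess at a safe sentinel is even less reliable.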
In my opinion the ICU collation
driver is configured sanely, and I feel comfortable delegating to ICU.
It's a good library for our cause. I would absolutely love to see test
cases that indicated where CouchDB can improve on this front.
I'd like to be able to turn on normalization for all sorting. I could
normalise all documents, and all key values, but given that CouchDB
has ICU, this would be a lot more convenient and reliable if it was a
server-provided feature.
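As a small illustration of why this matters, a Python sketch of the client-side workaround. The helper normalize_key and the choice of NFC are my assumptions for the example; the point is only that two encodings of the same text compare unequal until normalised.

```python
import unicodedata

def normalize_key(key, form="NFC"):
    """Normalise a string key before emitting it or querying with it."""
    return unicodedata.normalize(form, key)

composed = "caf\u00e9"     # 'é' as a single precomposed code point
decomposed = "cafe\u0301"  # 'e' followed by a combining acute accent

same_raw = composed == decomposed
same_normalized = normalize_key(composed) == normalize_key(decomposed)
```

Every writer and every reader has to remember to apply this, which is why doing it once on the server would be more reliable.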
I imagine some might like to enable correct ordering of French
accents: http://unicode.org/reports/tr10/#French_Accents, which is a
specific instance of a linguistic tailoring as described here:
http://unicode.org/reports/tr10/#Linguistic_Features. I suggest that a
Couch instance and/or an individual db might want to specify a Unicode
locale from e.g. http://unicode.org/cldr/
There's been a suggestion of raw Unicode code point ordering as a
collation configuration parameter, specifiable in design docs.
That's not valid unicode. I think it's a bad idea.
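A quick Python sketch of what raw code point ordering does to ordinary words, since Python's default string sort is exactly that. The word list is made up for the example.

```python
# 'é' is written as the precomposed code point U+00E9 to keep this explicit.
words = ["apple", "Zebra", "\u00e9clair", "banana"]

# sorted() on str compares code point by code point.
by_code_point = sorted(words)

# 'Z' (U+005A) sorts before every lowercase letter, and 'é' (U+00E9)
# sorts after 'z' (U+007A), so the result is not alphabetical.
print(by_code_point)
```

Uppercase sorts ahead of all lowercase and accented letters land after 'z', which is not an order any user would expect from a sorted view.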
Maybe
the next logical step is a configuration member, for design docs,
which could optionally specify the ICU configuration.
Specified in a hierarchic manner: system / db. I hesitate to include
'view' because there are a number of view-like things that don't have
configuration (_all_docs), and for completeness you would then want to
deal with propagating a particular configuration through all of the
design-doc-driven facilities. IMO, just the system & db would be enough.
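A minimal Python sketch of the system / db resolution I have in mind. The setting names ("locale", "normalization") and the function effective_collation are purely hypothetical, not existing CouchDB configuration.

```python
# Hypothetical system-wide defaults.
SYSTEM_COLLATION = {"locale": "root", "normalization": False}

def effective_collation(system_config, db_config):
    """Resolve collation settings: db values override system values."""
    merged = dict(system_config)
    # Only settings the db actually specifies replace the system default.
    merged.update({k: v for k, v in db_config.items() if v is not None})
    return merged

french_db = effective_collation(SYSTEM_COLLATION, {"locale": "fr"})
plain_db = effective_collation(SYSTEM_COLLATION, {})
```

With only two levels, everything inside a db (views, _all_docs and so on) sees the same collation, so there is nothing to propagate.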
Antony Blakey
-------------
CTO, Linkuistics Pty Ltd
Ph: 0438 840 787
A Man may make a Remark –
In itself – a quiet thing
That may furnish the Fuse unto a Spark
In dormant nature – lain –
Let us divide – with skill –
Let us discourse – with care –
Powder exists in Charcoal –
Before it exists in Fire –
-– Emily Dickinson 913 (1865)