From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com] On 
Behalf Of Keean Schupke
Sent: Friday, March 18, 2011 8:17 PM

>> On 18 March 2011 19:29, Pablo Castro <pablo.cas...@microsoft.com> wrote:
>>
>> From: keean.schu...@googlemail.com [mailto:keean.schu...@googlemail.com] On 
>> Behalf Of Keean Schupke
>> Sent: Friday, March 18, 2011 1:53 AM
>>
>> >> See my proposal in another thread. The basic idea is to copy BDB. Have a 
>> >> primary index that is based on an integer, something primitive and fast. 
>> >> Allow secondary indexes which use a callback to generate a binary index 
>> >> key. IDB shifts the complexity out into a library. Common use cases can 
>> >> be provided (a hash of all fields in the object, internationalised 
>> >> bidirectional lexicographic etc...), but the user is free to write their 
>> >> own for less usual cases (for example indexing by the last word in a name 
>> >> string to order by surname).
I agree with Jeremy's comments on the other thread for this. Having the 
callback mechanism definitely sounds interesting but there are a ton of common 
cases that we can solve by just taking a language identifier. I'm not sure we 
want to make people work hard to get something that's already supported in most 
systems. The idea of having a callback to compute the index value feels 
incremental to this, so we could take it on later without disrupting the 
explicit international collation stuff.
>>
>> The idea would be to provide pre-defined implementations of the callback for 
>> common use cases, then it is just as simple to register a callback as set 
>> any other option. All this means to the API is you pass a function instead 
>> of a string. It also is better for modularity as all the code relating to 
>> the sort order is kept in the callback functions.
>>
>> The difference comes down to something like:
>>
>> index.set_order_lexicographic('us');
>>
>> vs
>>
>> index.set_order_method(order_lexicographic('us'));
>>
>> So rather than just setting a property as in the first case, where presumably 
>> all the ordering code is mixed in with the indexing code, the second case 
>> encapsulates all the ordering code in the function returned from the 
>> execution of order_lexicographic('us'). This function would represent a 
>> mapping from the object being indexed to a binary blob that is the actual 
>> stored index data.
>>
>> So doing it this way does not necessarily make things harder, and it 
>> improves the encapsulation, the type-safety, and the flexibility of the API.

Yep, we talked about supporting callbacks already in the other threads and in 
this one. As I mentioned before, I think this is incremental to the basic 
feature of taking a collation name. I do realize you can just pass a 
pre-implemented function, but that opens the door to a bunch of things we'd 
need to handle, including possibly storing code in the database (such 
that proper updates don't depend on each page re-registering all the index 
callbacks), handling scripts with the appropriate context to run during index 
updates, etc.  I would much rather have basic functionality in place and then 
expand as needed once we have users using the API.

Thanks
-pablo

