Have you considered using sem:uuid() in MarkLogic 7? You are much 
better off with a statistically unique ID than with an explicit 
uniqueness check, which costs time and massively reduces concurrency.
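
As a rough sketch of what I mean (not a tested implementation), the 
function from your mail could drop the retry loop and rely on a UUID 
instead. I am assuming sem:uuid-string(), the string-returning sibling of 
sem:uuid(), is the convenient form here, and that the sem prefix is 
already bound in your environment; if it is not, import 
/MarkLogic/semantics.xqy. The local: prefix and the "order" argument are 
purely illustrative; the URI layout is copied from your example.

```xquery
xquery version "1.0-ml";

(: Sketch only: mints an ID from a UUID instead of a checked random string.
   The function name and the /idregistry/ URI layout mirror the quoted
   example; they are illustrative, not a MarkLogic convention. :)
declare function local:create-new-id-doc($id-root as xs:string) as xs:string
{
  (: sem:uuid-string() returns a UUID as xs:string; no uniqueness lookup :)
  let $id  := $id-root || "-" || sem:uuid-string()
  let $uri := "/idregistry/id-" || $id
  let $_ :=
    xdmp:document-insert($uri,
      <registered-id>
        <id>{ $id }</id>
        <created>{ fn:current-dateTime() }</created>
      </registered-id>)
  return $id
};

(: "order" is a hypothetical ID root :)
local:create-new-id-doc("order")
```

Because nothing is read or locked to check for an existing ID, concurrent 
callers do not serialize on each other; the trade-off is that uniqueness 
is statistical rather than enforced.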

John

On 04/06/2014 18:01, Ron Hitchens wrote:
>
>     I'm working on a project, one aspect of which requires minting unique IDs 
> and assuring that no two documents with the same ID wind up in the database.  
> I know how to accomplish this using locks (I'm pretty sure) but any such 
> implementation is awkward and prone to subtle edge case errors, and can be 
> difficult to test.
>
>     It seems to me that this is something that MarkLogic could do much more 
> reliably and quickly than any user-level code.  The thought that occurred to 
> me is a variation on range indexes which only allow a single instance of any 
> given value.
>
>     Conventional range indexes work by creating term lists that look like 
> this (see Jason Hunter's ML Architecture paper), where each term list 
> contains an element (or attribute) value and a list of fragment IDs where 
> that term exists.
>
> aardvark | 23, 135, 469, 611
> ant      | 23, 469, 558, 611, 750
> baboon   | 53, 97, 469, 621
> etc...
>
>     A range index like this, but one that allows only a single fragment ID 
> in each list, would ensure that no two documents in the database contain a 
> given element with the same value.  That is, attempting to add a 
> second document with the same element or attribute value would cause an 
> exception.  And being a range index, it would provide a fast lexicon of all 
> the current unique values in the DB.
>
>     Such an index would look something like this:
>
> abc3vk34 | 17
> bkx46lkd | 52
> bz1d34nm | 37
> etc...
>
>     Usage could be something like this:
>
> declare function create-new-id-doc ($id-root as xs:string) as xs:string
> {
>      try {
>          let $id := $id-root || "-" || mylib:random-string(8)
>          let $uri := "/idregistry/id-" || $id
>          let $_ :=
>              xdmp:document-insert ($uri,
>                  <registered-id>
>                      <id>{ $id }</id>
>                      <created>{ fn:current-dateTime() }</created>
>                  </registered-id>)
>          return $id
>      } catch ($e) {
>          create-new-id-doc ($id-root)
>      }
> };
>
>     This doesn't require that I write any (possibly buggy) mutual exclusion 
> code and I can be confident that once the xdmp:document-insert succeeds that 
> the ID is unique in the database and that the type (as configured for the 
> range index) is correct.
>
>     Any love for Unique Value Range Indexes in the next version of MarkLogic?
>
> ---
> Ron Hitchens {r...@overstory.co.uk}  +44 7879 358212
>
> _______________________________________________
> General mailing list
> General@developer.marklogic.com
> http://developer.marklogic.com/mailman/listinfo/general
>


-- 
John Snelson, Lead Engineer                    http://twitter.com/jpcs
MarkLogic Corporation                         http://www.marklogic.com