RE: [MarkLogic Dev General] Require separate search constraints to be satisfied within same doc subtree

Geert Josten Fri, 02 Apr 2010 01:15:51 -0700

Hi Karl,

In short: the search criteria should be applied to each book individually 
rather than to all the books in a subject document taken together. Searching 
is document-based by default, which explains the behaviour you observed. You 
are looking for the 'searchable-expression' option, which selects the subnodes 
within each document to which the criteria are applied. In your case it should 
look like:


<searchable-expression>/books/book</searchable-expression>

More details can be found here: 
http://developer.marklogic.com/pubs/4.1/apidocs/SearchAPI.html#search:search
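
For reference, here is a sketch of how the complete options node might look, 
combining the two constraints from your message with the searchable-expression. 
I have not run it against your data. It assumes the elements are in no 
namespace, as in your sports.xml, and that the search runs filtered so that 
each book node is checked against the whole query.

```xml
<options xmlns="http://marklogic.com/appservices/search">
    <!-- Match and return individual book elements instead of whole documents. -->
    <searchable-expression>/books/book</searchable-expression>
    <constraint name="author-firstname">
        <word>
            <element ns="" name="firstname" />
        </word>
    </constraint>
    <constraint name="title">
        <word>
            <element ns="" name="title" />
        </word>
    </constraint>
</options>
```

With these options, author-firstname:james title:bowling should no longer 
match, because no single book satisfies both constraints, while your first two 
queries should each return the matching book rather than the whole document.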

Kind regards,
Geert


drs. G.P.H. (Geert) Josten
Consultant


Daidalos BV
Hoekeindsehof 1-4
2665 JZ Bleiswijk

T +31 (0)10 850 1200
F +31 (0)10 850 1199

mailto:[email protected]
http://www.daidalos.nl/

KvK 27164984


> From: [email protected]
> [mailto:[email protected]] On Behalf Of
> Karl Erisman
> Sent: Friday, 2 April 2010 1:57
> To: General Mark Logic Developer Discussion
> Subject: [MarkLogic Dev General] Require separate search
> constraints to be satisfied within same doc subtree
>
> Suppose you have a database that stores information about
> books in a book store.  There is one document per subject.
> Each document contains the books categorized under that
> subject area.  For example, here is sports.xml:
>
>     <books subject="sports">
>         <book id='1'>
>             <author>
>                 <firstname>James</firstname>
>                 <lastname>Johnson</lastname>
>             </author>
>             <title>Running by Moonlight</title>
>             <pages>220</pages>
>         </book>
>         <book id='2'>
>             <author>
>                 <firstname>Marie</firstname>
>                 <lastname>Franklin</lastname>
>             </author>
>             <title>Optimum Nutrition for Peak Bowling Performance</title>
>             <pages>2</pages>
>         </book>
>     </books>
>
> If the following options node is used:
>
>     <options xmlns="http://marklogic.com/appservices/search">
>         <constraint name="author-firstname">
>             <word>
>                 <element ns="" name="firstname" />
>             </word>
>         </constraint>
>         <constraint name="title">
>             <word>
>                 <element ns="" name="title" />
>             </word>
>         </constraint>
>     </options>
>
> ...then these queries will both return sports.xml, as desired:
>
>     author-firstname:james title:running
>     author-firstname:marie title:bowling
>
> However, using that approach, the following query would
> erroneously return sports.xml because, though both
> constraints are satisfied, they are not satisfied within the
> same book (author-firstname from book #1 and title from book #2):
>
>     author-firstname:james title:bowling
>
> Now, I know that setting fragment roots at <book> could work
> in this particular case, but I do not want to use fragmentation.
>
> One (annoyingly complicated) possibility: define a custom
> constraint for the book search that would be called like this:
>
>     book-query:"firstname:james title:bowling"
>
> (book-query is a constraint with multiple related sub-queries
> in a "phrase").
>
> This would require writing a constraint module that would
> have to parse the $right part ("firstname:james
> title:bowling") itself and generate a cts:query.  I would
> have to write such a library module for every set of
> constraints that needs to match in a common subtree (in the
> example above, <book> is the only common subtree, but real
> data may have several similar situations per document).  The
> constraint modules might use the Search API themselves to
> separately find matches for each sub-query, then post-process
> the results to filter out documents without matches to all
> criteria occurring within a common ancestor.  For example,
> the custom constraint module implementing the book-query
> would retrieve result sequences A and B for firstname and
> title, respectively.  Then post-processing would find that A
> and B both contain sports.xml.  However, because the
> search:match nodes for the sub-query constraints do not share
> a common <book> ancestor (/books/book[1]/author/firstname vs.
> /books/book[2]/title), sports.xml does NOT match.
>
> I certainly hope there's a better approach.  This one seems
> overly complicated (to put it mildly).  It's computationally
> expensive since max-matches would have to be set to a large
> number to get all matches and in the worst case (large number
> of matches, no common ancestor), all matches would have to be
> examined (while searching for a common ancestor).
>
> Am I missing some key way to use the Search API that would be
> just the ticket for this?
>
> Thanks,
> Karl
> _______________________________________________
> General mailing list
> [email protected]
> http://xqzone.com/mailman/listinfo/general
>