Geert,

Sorry, I forgot the examples are in the presentation, which I haven't managed 
to get online yet. The steps for thesaurus expansion would look something like 
this:

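(: Assumes the thsr prefix is bound to http://marklogic.com/xdmp/thesaurus
   and the exprun prefix to the library module from my GitHub repo. :)
declare function local:do-thsr-expand(
  $q as item()
) as item()
{
  (: one word-query per thesaurus term, combined into a single or-query :)
  let $q-thsr :=
    cts:or-query(doc("/my-thesaurus.xml")//thsr:entry/thsr:term/cts:word-query(.))
  (: build runs from the unnested and-queries, expand them, resolve the result :)
  let $runs := exprun:create-runs(exprun:unnest-ands($q))
  let $expanded := exprun:thsr-expand-runs($runs, $q-thsr)
  return exprun:resolve-runs($expanded)
};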
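
Calling it could look roughly like the sketch below. This is only an illustration under assumptions: the search terms are made up, the input is assumed to be a cts:query (the exprun functions may expect a different form of the query, so check the repo), and the result is assumed to be a cts:query that can be passed straight to cts:search.

```xquery
(: Main-module body, placed after the local:do-thsr-expand declaration. :)
(: Assumption: the function accepts and returns a cts:query. :)
(: The terms "heart" and "attack" are hypothetical example input. :)
let $q := cts:and-query((
  cts:word-query("heart"),
  cts:word-query("attack")
))
let $expanded := local:do-thsr-expand($q)
(: run the expanded query and keep the first ten matching documents :)
return cts:search(fn:doc(), $expanded)[1 to 10]
```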

-Will

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Geert Josten
Sent: Monday, November 05, 2012 10:45 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Searching using language features..

Hi Will,

Looks interesting. Could you give some brief samples of how to call/use this 
code? Couldn't find a main module..

Kind regards,
Geert

> -----Original Message-----
> From: [email protected] [mailto:general- 
> [email protected]] On Behalf Of Will Thompson
> Sent: Monday, November 5, 2012 22:41
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Searching using language features..
>
> Geert,
>
> Regarding 2), there is thsr:expand(), which integrates well into the search
> libraries, but has its limitations. I gave a presentation at the last MarkLogic
> World that included an example of thesaurus expansion beyond what's provided
> in the thsr library, specifically multi-word expansion. The code is available in my
> github repo: https://github.com/wthoolihan/MLUC-2012-Examples. If you have
> any questions, let me know.
>
> -Will
>
>
> -----Original Message-----
> From: [email protected] [mailto:general- 
> [email protected]] On Behalf Of Geert Josten
> Sent: Monday, November 05, 2012 2:21 AM
> To: MarkLogic Developer Discussion
> Subject: [MarkLogic Dev General] Searching using language features..
>
> Hi,
>
> Several language-support-related questions this time. Most have been asked
> before, but I had trouble putting all the answers together. So, I'm just going to
> ask them once more:
>
> 1) Others have asked before, but is there a trick to ignore language in queries,
> and get results for all languages, without doing an or-query for all languages you
> are interested in?
>
> 2) MarkLogic has stemming support, but there is also a library to use thesauri.
> What is the best way to integrate that into the search library if I would like to
> use thesauri to expand search terms before doing the actual search? Or other
> similar code that would be able to expand a term into a list of all kinds of
> synonyms (or related terms)..
>
> 3) Stopwords: to my knowledge there are no built-in language-specific lists of
> stop words like 'the'. I know I can find stop words by searching for the top
> number of values (or words) and take the most common ones up to some
> threshold (and perhaps synthesize static lists from that). But what is the most
> efficient way to eliminate those from a search string? I have some code of my
> own in which I tokenize and eliminate with xqy dynamically, on each call, but
> perhaps someone knows a smarter trick?
>
> Cheers,
> Geert
>
>
> M.Sc. G.P.H. (Geert) Josten
> Senior Developer
>
>
> Dayon B.V.
> Delftechpark 37b
> 2628 XJ Delft
> The Netherlands
>
> T +31 (0)88 26 82 570
>
> [email protected]
> www.dayon.nl
>
> The information sent in or with this e-mail message originates from Dayon BV
> and is intended solely for the addressee. If you have received this message
> unintentionally, we ask that you delete it. No rights can be derived from this
> message.
> _______________________________________________
> General mailing list
> [email protected]
> http://developer.marklogic.com/mailman/listinfo/general
