On Tue, 12 Apr 2016 07:10:46 -0700, Gontla Praveen  
<praveenkumargontla...@gmail.com> wrote:

> Hi Mary,
>
> Why does advanced stemming need to be enabled? Is there a specific reason for that?

Not everyone needs or wants advanced stemming: it does more work (so it is
slightly slower) and produces larger indexes.
For some languages, the slight increase in recall is not worth that cost for
many use cases.

>
> What is the difference between basic stemming and advanced stemming?

Basic stemming only indexes the preferred stem for each token (typically,  
the shortest one). Advanced stemming indexes all possible stems.

Completing the picture:
* decompounding is like advanced stemming, but with additional indexing
for the components of compounds. This principally applies to German and
similar languages that form long noun compounds as single words.
* you can also turn stemming off entirely; this is principally useful where
you are searching over non-linguistic content
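
For reference, the four levels above (off, basic, advanced, decompounding)
correspond to the database's "stemmed searches" setting. A minimal sketch of
switching a database to advanced stemming through the Admin API follows. The
database name "Documents" is only a placeholder, not something taken from this
thread, and the same setting can also be changed in the Admin UI.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: "Documents" is a placeholder database name; substitute your own. :)
let $db := xdmp:database("Documents")
let $config := admin:get-configuration()
(: Accepted values: "off", "basic", "advanced", "decompounding". :)
let $config := admin:database-set-stemmed-searches($config, $db, "advanced")
return admin:save-configuration($config)
```

Existing content generally has to be reindexed before a change like this
affects search results, so expect some indexing work after saving it.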

//Mary

>
> Thanks,
> Praveen.
>
> On Thu, Mar 31, 2016 at 12:58 PM, Mary Holstege
> <mary.holst...@marklogic.com> wrote:
>
>>
>> Do you have advanced stemming enabled? With basic stemming, only the
>> first stem returned from cts:stem is indexed and used for matching in
>> search.
>>
>> //Mary
>>
>>
>> On 03/31/2016 03:00 AM, Debin, Infant Jerald (LNG-CON) wrote:
>>
>> Hi Team,
>>
>>
>>
>> For the French term *“disparu”*, the corresponding French stem
>> *“disparaître”* is not recognized when performing a search.
>>
>>
>>
>> *Example:*
>>
>>
>>
>> *Query:*
>>
>>
>>
>> let $text := <text xml:lang="fr">avec la rupture de septembre 1997, cette
>> disparues situation fait disparaître la justification. Les services fournis
>> disparu par la demanderesse l'ont été dans l'attente d'une
>> rémunération,</text>
>> return
>>   cts:highlight($text,
>>     cts:query(
>>       <cts:word-query>
>>         <cts:text xml:lang="fr">disparu</cts:text>
>>         <cts:option>case-insensitive</cts:option>
>>         <cts:option>diacritic-insensitive</cts:option>
>>         <cts:option>punctuation-insensitive</cts:option>
>>       </cts:word-query>),
>>     <h1>{$cts:text}</h1>)
>>
>>
>>
>> *Result:*
>>
>>
>>
>> “disparaître” is not recognized or highlighted, as shown below:
>>
>>
>>
>> <text xml:lang="fr">avec la rupture de septembre 1997, cette  
>> <h1>disparues</h1> situation fait disparaître la justification. Les  
>> services fournis <h1>disparu</h1> par la demanderesse l'ont été dans  
>> l'attente d'une rémunération,</text>
>>
>>
>>
>> Below is the result of cts:stem,
>>
>>
>>
>> cts:stem("disparu","fr")
>>
>>
>>
>> disparu
>>
>> disparaître
>>
>>
>>
>> Please advise on this issue.
>>
>>
>>
>> Thanks and Regards,
>>
>>
>>
>> Debin
>>
>> Mob: +91-9789826001
>>
>>
>>
>>
>> _______________________________________________
>> General mailing list
>> gene...@developer.marklogic.com
>> Manage your subscription at:  
>> http://developer.marklogic.com/mailman/listinfo/general
>>

