Hello.

I have a question about morphology.
Currently I'm storing multiple forms of words, i.e. if the word 'N' in the
sequence 'M N K' leads to two normal forms 'Nf1' and 'Nf2', then I'm storing
'Mf Nf1 Nf2 Kf'.
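
To make the storage scheme concrete, here is a minimal Python sketch of it.
The NORMAL_FORMS table and the function name are hypothetical, made up for
this example; a real setup would get the normal forms from a morphological
analyzer rather than from a hard-coded dictionary.

```python
# Hypothetical lookup table: surface form -> list of its normal forms.
NORMAL_FORMS = {
    "M": ["Mf"],
    "N": ["Nf1", "Nf2"],  # one surface form, two normal forms
    "K": ["Kf"],
}


def expand_to_normal_forms(text):
    """Replace each token with all of its normal forms, keeping token order."""
    stored = []
    for token in text.split():
        # Tokens missing from the table are kept unchanged
        # (an assumption of this sketch, not part of the scheme itself).
        stored.extend(NORMAL_FORMS.get(token, [token]))
    return " ".join(stored)


print(expand_to_normal_forms("M N K"))  # Mf Nf1 Nf2 Kf
```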

That allows me to match a document whether the user enters any form of Nf1
or any form of Nf2. Maybe I'm wrong in this concept; if so, please advise
how you manage morphology (of Russian, for instance, or of any language
where, for example, N1, N2, N3, N4, N5 can be forms of the word Nx and
N1, N2, N6, N7, N8 can be forms of Ny). When the user enters N2 (or N1), we
should search for documents containing Nx as well as documents containing
Ny, because one form (N1) can lead to multiple normal forms (Nx and Ny).

So I've chosen this scheme and am storing the text as described above.
When I'm looking for words and need to take the distance between them into
account, I'm using the Lucene syntax "A B"~distance. Unfortunately, if A
leads to the forms A1 and A2, I have to split this into
+("A1 B"~dist "A2 B"~dist), and the number of clauses grows multiplicatively
with the number of normal forms of each term.
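
For illustration, the expansion I am doing by hand looks roughly like the
Python sketch below. The function name and its arguments are hypothetical;
it only builds the query string and does not call Lucene or Solr.

```python
from itertools import product


def expand_proximity_query(term_forms, dist):
    """Build one proximity clause per combination of normal forms.

    term_forms is a list with one entry per query term, each entry being
    the list of normal forms for that term.
    """
    clauses = [
        '"{}"~{}'.format(" ".join(combo), dist)
        for combo in product(*term_forms)
    ]
    return "+(" + " ".join(clauses) + ")"


print(expand_proximity_query([["A1", "A2"], ["B"]], 3))
# +("A1 B"~3 "A2 B"~3)
```

The number of clauses is the product of the number of normal forms per term,
so three terms with two normal forms each already give eight clauses.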

Can I search within a distance using something like (+(A1 A2) +(B))~dist?
I heard that dismax can handle the distance between words while ignoring
quotes. Could you advise on this?

Thanks in advance.
-- 
View this message in context: 
http://www.nabble.com/morphology-and-queryPrase-tp18412375p18412375.html
Sent from the Solr - User mailing list archive at Nabble.com.
