See below...

On 3/27/07, daveburns <[EMAIL PROTECTED]> wrote:


Hi,
afriad I'm a noobie at Luncene but read Otis/Eriks book and was hoping
someone can answer a quick question on the AliasAnalyzer (Chap 4). I want
to
build a search for names (Companies/surname, firstname etc) but need to
match thing s like
Robert= bob, bobby, rob etc (or margaret=peggy etc).

1) The problem is my search returns all Robert or Bob hit's if I search on
bob but if I search on Robert only Robert hits come back. I though the
index
would automatically provide my required 2-way matching?



Lucene only gives you back what you put into the index. In this case,
did you index "robert" when the token was "bob"? In other words,
did you index all the variants of 'robert' every time you found any variant?

Let's say the three variants are 'bob' 'rob' and 'robert'. Whenever you
encounter any of those three variants, you need to add the *other*
two to your index as well as the original variant.


2) The second problem is to do with Company names. Can I create the above
alias analyzer to match I.B.M=IBM=International Business Machines? i.e.
multiple words to a single word.


Well, if you make your own query analyzer (subclassing the relevant one),
you
can always substitute the one-word version for the three word version. Be
careful about doing it the other way unless you make very sure it's a
phrase.
That is, tokenizing "International Business Machines" would give you
three token and you'd be searching on the OR of them (assuming default
queryparser behavior). So you'd be better off having your analyzer turn
"international business machines" into IBM. You might be able to get
away with enclosing multiple-word entries in quotes...


Best
Erick


Thanks,

Dave.
--
View this message in context:
http://www.nabble.com/Synonyms-and-Aliases-query-tf3473040.html#a9692225
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Reply via email to