Look at Lucene's svn repository under contrib/analyzers and you'll see a KeywordTokenizer and corresponding KeywordAnalyzer you can use.
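To show why Document 1 probably matches, here is a rough plain-Java sketch. It does not use Lucene, and the class and method names are made up for illustration. It assumes that LowerCaseTokenizer splits the field value at every non-letter character, so "test mario" is indexed as the two terms "test" and "mario", and the prefix query mario* then finds the second term. A keyword-style tokenizer keeps the whole value as one term instead. The lowercasing in keywordTokens is added here to match what you want; KeywordTokenizer itself does not change case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Plain-Java illustration only: these helpers are hypothetical, not Lucene classes.
class KeywordVsLetterDemo {

    // Mimics a lowercasing letter tokenizer: split at every non-letter
    // character and lowercase each piece.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Mimics a keyword-style tokenizer: the whole value becomes one token.
    // Lowercasing is an addition made for this example.
    static List<String> keywordTokens(String text) {
        return Collections.singletonList(text.toLowerCase(Locale.ROOT));
    }

    // A prefix query matches a document if any indexed term starts with the prefix.
    static boolean matchesPrefix(List<String> tokens, String prefix) {
        for (String token : tokens) {
            if (token.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String doc0 = "mario test";
        String doc1 = "test mario";
        System.out.println(matchesPrefix(letterTokens(doc0), "mario"));  // true
        System.out.println(matchesPrefix(letterTokens(doc1), "mario"));  // true
        System.out.println(matchesPrefix(keywordTokens(doc0), "mario")); // true
        System.out.println(matchesPrefix(keywordTokens(doc1), "mario")); // false
    }
}
```

With the keyword-style version only Document 0 matches mario*, which is the single result the test expects.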
Erik
On Feb 18, 2005, at 5:44 PM, Luke Shannon wrote:
I have created an Analyzer that I think should just convert to lower
case and add synonyms in the index (it is at the end of the email).
The problem is, after running it I get one more result than I was expecting
(Document 1 should not be there):
Running testNameCombination1, expecting: 1 result
The query: +(type:138) +(name:mario*) returned 2
Start Listing documents:
Document: 0 contains: Name: Text<name:mario test> Desc: Text<desc:this is test from mario>
Document: 1 contains: Name: Text<name:test mario> Desc: Text<desc:retro>
End Listing documents
Those same 2 documents in Luke look like this:
Document 0 Text<name:mario test> Text<desc:this is test from mario>
Document 1 Text<name:test mario> Text<desc:retro>
That looks correct to me. The query shouldn't match Document 1.
The analyzer used on this field is below and is applied like so:
//set the default
PerFieldAnalyzerWrapper analyzer = new PerFieldAnalyzerWrapper(new SynonymAnalyzer(new FBSynonymEngine()));
//the analyzer for the name field (only converts to lower case and adds synonyms)
analyzer.addAnalyzer("name", new KeywordSynonymAnalyzer(new FBSynonymEngine()));
Any help would be appreciated.
Thanks,
Luke
import org.apache.lucene.analysis.*;
import java.io.Reader;

public class KeywordSynonymAnalyzer extends Analyzer {
    private SynonymEngine engine;

    public KeywordSynonymAnalyzer(SynonymEngine engine) {
        this.engine = engine;
    }

    public TokenStream tokenStream(String fieldName, Reader reader) {
        TokenStream result = new SynonymFilter(new LowerCaseTokenizer(reader), engine);
        return result;
    }
}
Luke Shannon | Software Developer FutureBrand Toronto
207 Queen's Quay, Suite 400 Toronto, ON, M5J 1A7 416 642 7935 (office)
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]