Hi,

We are using doc.add(Field.Text("keywords", keywords)); to add keywords to the document, where keywords is a comma-separated string of keywords. Lucene seems to tokenize multi-word keywords such as MAIN BOARD into separate tokens (i.e. MAIN and BOARD); tokenization appears to be based on both commas and spaces. So if we search for "MAIN BOARD", documents with keywords like "MAIN LOGIC", "MAIN PARTS", etc. also show up.
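To make the behaviour concrete, here is a small plain-Java sketch of what we observe. It does not use Lucene at all and is not Lucene's actual tokenizer; the class name, the method names, and the sample keyword string are made up purely for illustration. The first method mimics splitting on commas and spaces, and the second shows the comma-only split we would like, where a multi-word keyword stays a single term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: approximates the splitting we observe,
// it is NOT how Lucene implements Field.Text or any analyzer.
class KeywordSplitDemo {

    // Splits on commas and whitespace, so "MAIN BOARD" becomes MAIN and BOARD.
    static List<String> splitOnCommaAndSpace(String keywords) {
        List<String> tokens = new ArrayList<>();
        for (String part : keywords.split("[,\\s]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toUpperCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Splits on commas only, so each multi-word keyword stays one term.
    static List<String> splitOnCommaOnly(String keywords) {
        List<String> terms = new ArrayList<>();
        for (String part : keywords.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                terms.add(trimmed.toUpperCase(Locale.ROOT));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        String keywords = "MAIN BOARD, MAIN LOGIC, POWER SUPPLY";
        // prints [MAIN, BOARD, MAIN, LOGIC, POWER, SUPPLY]
        System.out.println(splitOnCommaAndSpace(keywords));
        // prints [MAIN BOARD, MAIN LOGIC, POWER SUPPLY]
        System.out.println(splitOnCommaOnly(keywords));
    }
}
```

With the first kind of split, a search for MAIN BOARD has the token MAIN in common with MAIN LOGIC, which would explain why those documents also match.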
If someone searches for "MAIN BOARD", we want to get only the documents that have the keyword "MAIN BOARD". How can we do this?

To achieve this we used doc.add(Field.Keyword("keywords", keywords)); for indexing. When searching we cannot use the standard analyzer, because it splits the query wherever a keyword contains a space, so we wrote a KeywordAnalyzer (it simply returns the whole input as one single token), given below.

    import java.io.IOException;
    import java.io.Reader;

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.TokenStream;

    /**
     * Tokenizes the entire stream as a single token.
     */
    public class KeywordAnalyzer extends Analyzer {
        public TokenStream tokenStream(String fieldName, final Reader reader) {
            return new TokenStream() {
                private boolean done;
                private final char[] buffer = new char[1024];

                public Token next() throws IOException {
                    if (!done) {
                        done = true;
                        // Read the whole input into one string.
                        StringBuffer text = new StringBuffer();
                        int length;
                        while ((length = reader.read(buffer)) != -1) {
                            text.append(buffer, 0, length);
                        }
                        String term = text.toString();
                        // Emit the upper-cased input as the only token.
                        return new Token(term.toUpperCase(), 0, term.length());
                    }
                    return null; // no more tokens
                }
            };
        }
    }

This solves the problem described above, but we are no longer able to do wildcard searches like MAIN*, etc.

We need both pieces of functionality, i.e.

1. If the user searches for MAIN BOARD, they should get only documents that contain MAIN BOARD, and not MAIN LOGIC, MAIN, MAIN PART, etc.
2. The user should be able to do wildcard searches like MAIN*, etc. and get the desired documents.

Please let us know how we should do the indexing, and which analyzer we should use for searching.

Thanks,
Rahul...