Fw: Urgent : Specific search problem with whitespace analyzer

Krishnendra Nandi Tue, 21 Nov 2006 21:01:35 -0800

Hi,

I am running a "field:text" style search using my own analyzer, which behaves 
like WhitespaceAnalyzer. The code for my own analyzer and tokenizer classes 
follows.



// WhitespaceAnalyzerMaestro.java
package com.hewitt.itk.maestro.support.service.simplesearch;

import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;

/** An Analyzer that uses WhitespaceTokenizer. */

public final class WhitespaceAnalyzerMaestro extends Analyzer {
  public TokenStream tokenStream(String fieldName, Reader reader) {
    return new WhitespaceTokenizerMaestro(reader);
  }
} 



// WhitespaceTokenizerMaestro.java
package com.hewitt.itk.maestro.support.service.simplesearch;

import java.io.Reader;

import org.apache.lucene.analysis.WhitespaceTokenizer;

/** A WhitespaceTokenizerMaestro is a tokenizer that divides text at whitespace.
 * Adjacent sequences of non-whitespace characters form tokens. */

public class WhitespaceTokenizerMaestro extends WhitespaceTokenizer {
  /** Construct a new WhitespaceTokenizerMaestro. */
  public WhitespaceTokenizerMaestro(Reader in) {
    super(in);
  }

  /** Collects only characters which do not satisfy
   * {@link Character#isWhitespace(char)}. */
  protected boolean isTokenChar(char c) {
    return !Character.isWhitespace(c);
  }

  /** Lowercases each character before it is added to the token.
   * Reassigning c inside isTokenChar would not change the token text. */
  protected char normalize(char c) {
    return Character.toLowerCase(c);
  }
}



I have modified the tokenizer class by making it return characters in 
lower case.
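
The lowercasing described above can be sketched in plain Java, without Lucene on the classpath. This is only an illustration: the class and method names are hypothetical, and the loop is a simplified stand-in for what a character-based tokenizer does. The point it shows is that a predicate returning a boolean cannot alter the character that ends up in the token, so the lowercasing has to happen where the character is appended.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-JDK sketch; not Lucene code. All names here are hypothetical. */
class LowercasingWhitespaceSketch {

  /** Decides whether a character belongs to a token; it cannot change the character. */
  static boolean isTokenChar(char c) {
    return !Character.isWhitespace(c);
  }

  /** Transforms a character before it is appended to the current token. */
  static char normalize(char c) {
    return Character.toLowerCase(c);
  }

  /** Splits text at whitespace and lowercases each emitted character. */
  static List<String> tokenize(String text) {
    List<String> tokens = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    for (int i = 0; i < text.length(); i++) {
      char c = text.charAt(i);
      if (isTokenChar(c)) {
        current.append(normalize(c));
      } else if (current.length() > 0) {
        // Whitespace ends the current token.
        tokens.add(current.toString());
        current.setLength(0);
      }
    }
    if (current.length() > 0) {
      tokens.add(current.toString());
    }
    return tokens;
  }

  public static void main(String[] args) {
    System.out.println(tokenize("Test  ISSUE Title")); // prints [test, issue, title]
  }
}
```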

My search query is ISSUE_TITLE:test, where ISSUE_TITLE is the field in which 
test is to be searched.

Following is my code snippet which is doing the search:

BooleanQuery masterQuery = new BooleanQuery();

masterQuery.add(
    MultiFieldQueryParser.parse(searchQuery, fields, analyzer),
    REQUIRED,
    PROHIBITED);

Here searchQuery is ISSUE_TITLE:test, fields is an array of field names that 
includes ISSUE_TITLE, and analyzer is an instance of 
WhitespaceAnalyzerMaestro (shown above).

When I run the search, the masterQuery built by the code above has the 
following value:
+(ISSUE_TITLE:test* ISSUE_TITLE:test* ISSUE_TITLE:test* ISSUE_TITLE:test* 
ISSUE_TITLE:test* ISSUE_TITLE:test* ISSUE_TITLE:test* ISSUE_TITLE:test* 
ISSUE_TITLE:test* ISSUE_TITLE:test* ISSUE_TITLE:test* ISSUE_TITLE:test* 
ISSUE_TITLE:test* ISSUE_TITLE:test* ISSUE_TITLE:test* ISSUE_TITLE:test* 
ISSUE_TITLE:test* ISSUE_TITLE:test* ISSUE_TITLE:test* ISSUE_TITLE:test* 
ISSUE_TITLE:test* ISSUE_TITLE:test* ISSUE_TITLE:test* ISSUE_TITLE:test* 
ISSUE_TITLE:test* ISSUE_TITLE:test* ISSUE_TITLE:test* ISSUE_TITLE:test* 
ISSUE_TITLE:test* ISSUE_TITLE:test*)

which I do not think is correct. Does MultiFieldQueryParser not support 
WhitespaceAnalyzer?
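
One possible reading of the repeated clauses, offered as an assumption rather than a confirmed diagnosis: if the multi-field parser parses the query string once per entry in fields and combines the results as optional clauses, then a query that already names its field yields the same clause on every pass. Under that assumption, the 30 identical clauses above would correspond to a fields array of 30 entries, independent of the analyzer. The sketch below models this in plain Java; the class, the methods, and the field names other than ISSUE_TITLE are hypothetical and are not Lucene's implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-JDK model of per-field query expansion; hypothetical, not Lucene code. */
class MultiFieldExpansionSketch {

  /** Builds one clause for a default field; an explicit "field:" prefix in the query wins. */
  static String parseForField(String query, String defaultField) {
    return query.indexOf(':') >= 0 ? query : defaultField + ":" + query;
  }

  /** Builds one clause per entry in fields, as a multi-field parser is assumed to do. */
  static List<String> expand(String query, String[] fields) {
    List<String> clauses = new ArrayList<>();
    for (String field : fields) {
      clauses.add(parseForField(query, field));
    }
    return clauses;
  }

  public static void main(String[] args) {
    // ISSUE_DESC and ISSUE_OWNER are made-up field names for illustration.
    String[] fields = {"ISSUE_TITLE", "ISSUE_DESC", "ISSUE_OWNER"};
    // Unqualified query: one distinct clause per field.
    System.out.println(expand("test*", fields));
    // Field-qualified query: the same clause repeated once per field.
    System.out.println(expand("ISSUE_TITLE:test*", fields));
  }
}
```

If this model matches the parser's behaviour, passing the field-qualified query to a single-field parser, or passing the unqualified term to the multi-field parser, would avoid the duplicates.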

Please help.

Regards
Krishnendra Nandi

 
