I'm using my own analyzer so I can interject HTMLStripCharFilter as described
in a previous thread.
private static Analyzer htmlStripAnalyzer = new ReusableAnalyzerBase() {
@Override
protected TokenStreamComponents createComponents(
final String fieldName, final Reader reader) {
return new TokenStreamComponents(new
StandardTokenizer(Version.LUCENE_30,
new HTMLStripCharFilter(CharReader.get(reader))));
}
};
So should I replace StandardTokenizer with LowerCaseTokenizer above? Then
would I override nextToken() and stem or is there a better way to use something
like SnowballFilter?
----- Original Message ----
From: Erick Erickson <[email protected]>
To: [email protected]
Sent: Thu, April 29, 2010 3:30:09 PM
Subject: Re: Highlighter usage
What analyzer are you using at index time? My guess is something
like WhitespaceAnalyzer that doesn't stem or change case..... Try
a different analyzer, SimpleAnalyzer comes to mind....
HTH
Erick
On Thu, Apr 29, 2010 at 4:21 PM, Justin <[email protected]> wrote:
> I'm trying to use Highlighter with QueryScorer after reading:
>
> https://issues.apache.org/jira/browse/LUCENE-1685
>
> The problem is: I'm not getting a result unless my the query term is an
> exact match. Am I missing filters? Is there a more complete example of how
> this should work?
>
>
> String content = "Global Climate Change affects us all";
>
> String field = "content";
> BooleanQuery query = new BooleanQuery();
>
> // Unstemmed, matched case works
> //query.add(new TermQuery(new Term(field, "Climate")),
> BooleanClause.Occur.MUST);
>
> // Stemmed, lowercase doesn't work
> query.add(new TermQuery(new Term(field, "climat")),
> BooleanClause.Occur.MUST);
> query.add(new TermQuery(new Term(field, "affect")),
> BooleanClause.Occur.MUST);
>
> Highlighter highlighter = new Highlighter(new QueryScorer(query,
> field));
> highlighter.setMaxDocCharsToAnalyze(500000);
>
> TokenStream ts = htmlStripAnalyzer.tokenStream(field, new
> StringReader(content));
> ts = new CachingTokenFilter(ts);
>
> System.out.println(highlighter.getBestFragment(ts, content));
>
>
> Thanks for any feedback,
> Justin
>
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]