Thanks Simon,
Unfortunately, I'm using Lucene 3.0.1 and CharTermAttribute doesn't seem to
have been introduced until 3.1.0. Similarly my version of Lucene does not have
a BooleanQuery.addClause(BooleanClause) method. Maybe you meant
BooleanQuery.add(BooleanClause).
In any case, most of what you're doing there, I'm just not familiar with.
Seems very low level. I've never had to use TokenStreams to build a query
before and I'm not really sure what is going on there. Also, I don't know what
PositionIncrementAttribute is or how it would be used to create a PhraseQuery.
The way I'm currently creating PhraseQuerys is very straightforward and
intuitive. E.g. to search for the phrase "foo bar" I'd build the query like this:
PhraseQuery phraseQuery = new PhraseQuery();
phraseQuery.add(new Term("title", "foo"));
phraseQuery.add(new Term("title", "bar"));
Is there really no easier way to associate the correct analyzer with these
types of queries?
Bill
-----Original Message-----
From: Simon Willnauer [mailto:[email protected]]
Sent: Friday, August 03, 2012 3:43 AM
To: [email protected]; Bill Chesky
Subject: Re: Analyzer on query question
On Thu, Aug 2, 2012 at 11:09 PM, Bill Chesky
<[email protected]> wrote:
> Hi,
>
> I understand that generally speaking you should use the same analyzer on
> querying as was used on indexing. In my code I am using the SnowballAnalyzer
> on index creation. However, on the query side I am building up a complex
> BooleanQuery from other BooleanQuerys and/or PhraseQuerys on several fields.
> None of these require specifying an analyzer anywhere. This is causing some
> odd results, I think, because a different analyzer (or no analyzer?) is being
> used for the query.
>
> Question: how do I build my boolean and phrase queries using the
> SnowballAnalyzer?
>
> One thing I did that seemed to kind of work was to build my complex query
> normally then build a snowball-analyzed query using a QueryParser
> instantiated with a SnowballAnalyzer. To do this, I simply pass the string
> value of the complex query to the QueryParser.parse() method to get the new
> query. Something like this:
>
> // build a complex query from other BooleanQuerys and PhraseQuerys
> BooleanQuery fullQuery = buildComplexQuery();
> QueryParser parser = new QueryParser(Version.LUCENE_30, "title", new
> SnowballAnalyzer(Version.LUCENE_30, "English"));
> Query snowballAnalyzedQuery = parser.parse(fullQuery.toString());
>
> TopScoreDocCollector collector = TopScoreDocCollector.create(10000, true);
> indexSearcher.search(snowballAnalyzedQuery, collector);
you can just use the analyzer directly like this:
Analyzer analyzer = new SnowballAnalyzer(Version.LUCENE_30, "English");
TokenStream stream = analyzer.tokenStream("title", new StringReader(fullQuery.toString()));
CharTermAttribute termAttr = stream.addAttribute(CharTermAttribute.class);
stream.reset();
BooleanQuery q = new BooleanQuery();
while(stream.incrementToken()) {
q.addClause(new BooleanClause(new TermQuery(new Term("title", termAttr.toString())), Occur.MUST));
}
you also have access to the token positions if you want to create
phrase queries etc. just add a PositionIncrementAttribute like this:
PositionIncrementAttribute posAttr = stream.addAttribute(PositionIncrementAttribute.class);
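To show what the position increments are for, here is a minimal sketch in plain Java with no Lucene dependency. Each token reports how far it sits from the previous token, and summing those increments gives the absolute position of each term in the phrase. The class name PhrasePositions and the method absolutePositions are hypothetical, made up for this illustration. In Lucene the increments would come from PositionIncrementAttribute.getPositionIncrement(), and as far as I know the resulting positions can be handed to PhraseQuery.add(Term, int).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Lucene.
class PhrasePositions {

    /**
     * Converts per-token position increments into absolute positions.
     * An increment of 1 means "directly after the previous token";
     * a larger increment means tokens in between were dropped
     * (for example by a stop-word filter).
     */
    static List<Integer> absolutePositions(int[] increments) {
        List<Integer> positions = new ArrayList<>();
        int position = -1; // a first increment of 1 then lands on position 0
        for (int increment : increments) {
            position += increment;
            positions.add(position);
        }
        return positions;
    }
}
```

For "foo bar" the increments would be {1, 1}, giving positions 0 and 1. If the analyzer removed a stop word between the two terms, the second token would carry an increment of 2, giving positions 0 and 2, so the phrase query keeps the gap instead of treating the terms as adjacent.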
pls. doublecheck the code it's straight from the top of my head.
simon
>
> Like I said, this seems to kind of work but it doesn't feel right. Does this
> make sense? Is there a better way?
>
> thanks in advance,
>
> Bill