I was having a problem with wildcard and prefix queries not returning hits on
a stemmed field. As far as I can tell, the parser does not run wildcard or
prefix terms through the analyzer, so the unstemmed query text often fails to
match the stemmed terms in the index. To solve this I overrode QueryParser to
hold a HashMap of stemmed field name -> unstemmed field name, and used that
map when constructing WildcardQueries and PrefixQueries. Now I index a stemmed
version of a field and an unstemmed version, and this QueryParser switches
between them exactly when it should.
I hope this helps someone. Here is the code:
import java.util.HashMap;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.queryParser.*;
/**
 * This is a QueryParser that will fall back to the unstemmed
 * field for WildcardQueries and PrefixQueries.
 * @author bdc34 a cornell dot edu
 */
public class VitroQueryParser extends QueryParser {

    /**
     * Map from stemmed field names to the names of fields with the
     * same terms but unstemmed.
     */
    HashMap<String, String> stemmedToUnstemmed;

    public VitroQueryParser(String f, Analyzer a) { super(f, a); }
    public VitroQueryParser(CharStream stream) { super(stream); }
    public VitroQueryParser(QueryParserTokenManager tm) { super(tm); }

    /**
     * Sets the map from stemmed field name to the name of the
     * field that holds the unstemmed version of the same terms.
     */
    public void setStemmedToUnstemmed(
            HashMap<String, String> stemmedToUnstemmed) {
        this.stemmedToUnstemmed = stemmedToUnstemmed;
    }

    /**
     * Attempts to get the name of the field holding the unstemmed
     * data for the given stemmedField. Returns stemmedField
     * if there is no mapping in stemmedToUnstemmed.
     */
    public String getUnstemmed(String stemmedField) {
        if (stemmedField == null ||
                stemmedToUnstemmed == null ||
                !stemmedToUnstemmed.containsKey(stemmedField))
            return stemmedField;
        else
            return stemmedToUnstemmed.get(stemmedField);
    }

    // Prefix queries are built against the unstemmed field, if one is mapped.
    @Override
    protected org.apache.lucene.search.Query getPrefixQuery(String field,
                                                            String termStr)
            throws ParseException {
        return super.getPrefixQuery(getUnstemmed(field), termStr);
    }

    // Wildcard queries are built against the unstemmed field, if one is mapped.
    @Override
    protected org.apache.lucene.search.Query getWildcardQuery(String field,
                                                              String termStr)
            throws ParseException {
        return super.getWildcardQuery(getUnstemmed(field), termStr);
    }
}
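
In case it helps to see the field switch in isolation, here is a minimal
standalone sketch of the same lookup-with-fallback that getUnstemmed does.
It uses only java.util, so it runs without Lucene. The class name
FieldFallbackSketch and the field names "contents", "contents_unstemmed"
and "title" are made up for illustration; they are not part of Lucene or
of the parser above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical standalone helper; not part of Lucene or VitroQueryParser. */
public class FieldFallbackSketch {

    /**
     * Returns the unstemmed field mapped to stemmedField, or stemmedField
     * itself when the field or the map is null or no mapping exists.
     */
    public static String unstemmedFor(Map<String, String> stemmedToUnstemmed,
                                      String stemmedField) {
        if (stemmedField == null || stemmedToUnstemmed == null) {
            return stemmedField;
        }
        // getOrDefault falls back to the stemmed name when the key is absent.
        return stemmedToUnstemmed.getOrDefault(stemmedField, stemmedField);
    }

    public static void main(String[] args) {
        Map<String, String> fields = new HashMap<>();
        fields.put("contents", "contents_unstemmed"); // assumed field names

        System.out.println(unstemmedFor(fields, "contents")); // mapped field
        System.out.println(unstemmedFor(fields, "title"));    // no mapping
        System.out.println(unstemmedFor(null, "title"));      // no map set
    }
}
```

With the real parser the map would be handed over through
setStemmedToUnstemmed before parsing, and both the stemmed and the
unstemmed field have to be present in the index for the switch to find
anything.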
--
Brian Caruso
Programmer/Analyst
Albert R. Mann Library
Cornell University
Ithaca, NY 14853