Hi, I asked this question a month ago on lucene-user and was referred here.
I have content being analyzed in Solr using these tokenizers and filters:

<fieldType name="text_standard" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="0" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="0"
            splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="0" generateNumberParts="1"
            catenateWords="1" catenateNumbers="1" catenateAll="0"
            splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>

Basically, I want to be able to search against this index from Lucene in one of my background search applications. My main reason for using Lucene rather than Solr here is that I use the highlighter to keep track of exactly which terms were found, which feeds my own scoring system, and I always collect the whole set of matching documents. I've experimented with boosts, but they weren't fine-grained enough and I wasn't able to create an effective score threshold (would writing my own scorer be a better idea?).

Is it possible to use this analyzer from Lucene, or at least re-create it in code?

Thanks.