Hi - Just looking at synonyms, and had a couple of questions.
1) For some of my synonyms, it seems to make sense to simply replace the original word with the other (e.g. "theatre" => "theater", so searches for either will find either). For others, I want to add an alternate term while preserving the original (e.g. "cirque" => "circus", so searches for "circus" find Cirque du Soleil, but searches for "cirque" only match "cirque", not "circus").

I was thinking that the best way to do this was with two different synonym filters. The replace filter would be used at both index and query time, the add filter only at index time. Does doing this with two synonym filters make sense?

Section from my schema.xml:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_replace.txt" ignoreCase="true" expand="false" includeOrig="false"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_add.txt" ignoreCase="true" expand="false" includeOrig="true"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" words="stopwords.txt"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_replace.txt" ignoreCase="true" expand="false" includeOrig="false"/>
    <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
  </analyzer>
</fieldType>

2) For this to work, I need to use "includeOrig". It appears that "includeOrig" is hard-coded to false in SynonymFilterFactory. Is there any reason for this? It's pretty easy to change (diff below); is there any reason this should not be supported?

Thanks,
Tom

Diffing vs. my local copy of 1.2, but it appears to be the same in HEAD.

--- src/java/org/apache/solr/analysis/SynonymFilterFactory.java
+++ src/java/org/apache/solr/analysis/SynonymFilterFactory.java (working copy)
@@ -37,6 +37,7 @@
     ignoreCase = getBoolean("ignoreCase",false);
     expand = getBoolean("expand",true);
+    includeOrig = getBoolean("includeOrig",false);
     if (synonyms != null) {
       List<String> wlist=null;
@@ -57,8 +58,9 @@
   private SynonymMap synMap;
   private boolean ignoreCase;
   private boolean expand;
+  private boolean includeOrig;
-  private static void parseRules(List<String> rules, SynonymMap map, String mappingSep, String synSep, boolean ignoreCase, boolean expansion) {
+  private void parseRules(List<String> rules, SynonymMap map, String mappingSep, String synSep, boolean ignoreCase, boolean expansion) {
     int count=0;
     for (String rule : rules) {
       // To use regexes, we need an expression that specifies an odd number of chars.
@@ -88,7 +90,6 @@
       }
     }
-    boolean includeOrig=false;
     for (List<String> fromToks : source) {
       count++;
       for (List<String> toToks : target) {
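
P.S. To make the behaviour I'm after in (1) concrete, here is a rough standalone sketch in plain Java. It does not use any Solr or Lucene classes: SynonymSketch, the two maps (standing in for synonyms_replace.txt and synonyms_add.txt) and the whitespace tokenizer are all made up for illustration, and stop words and stemming are left out. It only models the intended asymmetry: "replace" rules run at index and query time, "add" rules run at index time only.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustrative only: a toy model of the two-filter setup, not Solr code.
class SynonymSketch {
    // Hypothetical stand-in for synonyms_replace.txt (original token is dropped).
    static final Map<String, String> REPLACE = Map.of("theatre", "theater");
    // Hypothetical stand-in for synonyms_add.txt (original token is kept).
    static final Map<String, String> ADD = Map.of("cirque", "circus");

    // Crude stand-in for the tokenizer plus LowerCaseFilter.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // "Replace" filter: emit the synonym instead of the matched token.
    static List<String> replaceFilter(List<String> in) {
        List<String> out = new ArrayList<>();
        for (String t : in) {
            out.add(REPLACE.getOrDefault(t, t));
        }
        return out;
    }

    // "Add" filter: emit the matched token and its synonym side by side.
    static List<String> addFilter(List<String> in) {
        List<String> out = new ArrayList<>();
        for (String t : in) {
            out.add(t);
            String synonym = ADD.get(t);
            if (synonym != null) {
                out.add(synonym);
            }
        }
        return out;
    }

    // Index-time chain: replace filter, then add filter.
    static Set<String> indexTerms(String doc) {
        return new HashSet<>(addFilter(replaceFilter(tokenize(doc))));
    }

    // Query-time chain: replace filter only.
    static List<String> queryTerms(String query) {
        return replaceFilter(tokenize(query));
    }

    // A document matches when every query term is among its indexed terms.
    static boolean matches(String query, String doc) {
        return indexTerms(doc).containsAll(queryTerms(query));
    }

    public static void main(String[] args) {
        System.out.println(matches("theatre", "Old Globe Theater"));   // true
        System.out.println(matches("theater", "Royal Theatre"));       // true
        System.out.println(matches("circus", "Cirque du Soleil"));     // true
        System.out.println(matches("cirque", "Ringling Bros Circus")); // false
    }
}
```

The last two lines are the case I care about: "circus" finds the Cirque document because the add filter indexed both terms, while "cirque" does not find the circus document because nothing expands it at query time.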