Oh, one more thing. I wasn't suggesting that you *remove* WordDelimiterFilterFactory from the query chain, just that you should be more selective about the options. Look at the differences in the options in the example schema for a place to start....
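As a rough sketch of what I mean (written from memory, so treat the option values as assumptions to check against the example schema that ships with your Solr version, not as a verified fix): the query-side chain could keep WordDelimiterFilterFactory, place it ahead of LowerCaseFilterFactory so the case changes are still visible to it, and turn the catenate options off. The other filters from your chain are left out here only for brevity.

```xml
<!-- Sketch only: query-side analyzer with WordDelimiterFilterFactory ahead of
     LowerCaseFilterFactory and catenation switched off.
     The option values are assumptions to test against your own data. -->
<analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <!-- Runs before lowercasing, so splitOnCaseChange still sees case changes -->
  <filter class="solr.WordDelimiterFilterFactory"
          splitOnCaseChange="1"
          generateWordParts="1" generateNumberParts="1"
          catenateWords="0" catenateNumbers="0" catenateAll="0"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
```

Compare the tokens on admin/analysis before and after the change, and re-index if you also touch the index-side chain.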
Best
Erick

On Wed, Nov 9, 2011 at 12:33 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> Regarding <1>. Take a look at admin/analysis and see the tokenization just
> to check.
>
> Oh, and one more thing...
> putting <LowerCaseFilterFactory> in front of <WordDelimiterFilterFactory>
> kind of defeats the purpose of WordDelimiterFilterFactory. One of the
> things WDDF does is split on case change and you're removing the case
> changes before WDDF gets hold of it.
>
> Best
> Erick
>
> On Tue, Nov 8, 2011 at 9:40 PM, Ellery Leung <elleryle...@be-o.com> wrote:
>> Thanks Erick, here are my responses:
>>
>> 1. Yes. What I want to achieve is that when the index is filtered with
>> EdgeNGram and the query is not, I can search on a partial string.
>> 2. Good suggestion, will test it.
>> 3. OK.
>> 4. Thank you.
>> 5/6. Will remove the synonyms and WordDelimiterFilterFactory from the
>> query chain.
>> 7. Will look at that using Luke. By the way, this is the first time I
>> have seen that there is a tool for that. Thank you.
>> 8. Yes.
>>
>> Will check that again, thank you.
>>
>> -----Original Message-----
>> From: Erick Erickson [mailto:erickerick...@gmail.com]
>> Sent: November 8, 2011, 9:52 PM
>> To: solr-user@lucene.apache.org; elleryle...@be-o.com
>> Subject: Re: Weird: Solr Search result and Analysis Result not match?
>>
>> Several things:
>>
>> 1> You don't have EdgeNGramFilterFactory in your query analysis chain,
>> is this intentional?
>> 2> You have a LOT of stuff going on here, you might try making your
>> analysis chain simpler and adding stuff back in until you see the error.
>> Don't forget to re-index!
>> 3> Analysis doesn't take into account query *parsing*, so it's
>> possible to get a false sense of assurance when the analysis page
>> matches your expectations.
>> 4> Even though nothing jumps out at me except the Edge.... factory,
>> nice job of including information.
>> 5> It's unusual to expand synonyms both at query and index time,
>> usually one or the other with index time preferred.
>> 6> Same with WordDelimiterFilterFactory. If you put all the variants
>> in the index, you don't need to put all the variants in the query
>> and vice-versa.
>> 7> Take a look at your actual contents, perhaps using Luke to ensure
>> that what you expect to be in your index actually is.
>> 8> You did re-index after your latest changes to your schema, right <G>?
>>
>> All of this is a way of saying that I don't quite see what the problem
>> is, but at least there are some avenues to explore.
>>
>> Best
>> Erick
>>
>> On Mon, Nov 7, 2011 at 9:29 PM, Ellery Leung <elleryle...@be-o.com> wrote:
>>> Hi all.
>>>
>>> I am using Solr 3.4 under Win 7.
>>>
>>> In schema there is a multivalue field indexed in this way:
>>>
>>> ==========================
>>> Schema:
>>> ==========================
>>>
>>> <field name="myEvent" type="myCustomText" multiValued="true" indexed="true"
>>>        stored="true" omitNorms="true"/>
>>>
>>> <fieldType name="myCustomText" class="solr.TextField"
>>>            positionIncrementGap="100">
>>>   <analyzer type="index">
>>>     <charFilter class="solr.MappingCharFilterFactory"
>>>                 mapping="../../filters/filter-mappings.txt"/>
>>>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>>     <filter class="solr.TrimFilterFactory"/>
>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>     <filter class="solr.SynonymFilterFactory"
>>>             synonyms="../../filters/filter-synonyms.txt" ignoreCase="true"
>>>             expand="true"/>
>>>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>>>     <filter class="solr.WordDelimiterFilterFactory"
>>>             splitOnCaseChange="1" splitOnNumerics="1" stemEnglishPossessive="1"
>>>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>>             catenateNumbers="1" catenateAll="0" preserveOriginal="1"/>
>>>     <filter class="solr.PhoneticFilterFactory"
>>>             encoder="DoubleMetaphone" inject="true"/>
>>>     <filter class="solr.PorterStemFilterFactory"/>
>>>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
>>>             maxGramSize="50" side="front"/>
>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>>   </analyzer>
>>>   <analyzer type="query">
>>>     <charFilter class="solr.MappingCharFilterFactory"
>>>                 mapping="../../filters/filter-mappings.txt"/>
>>>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>>>     <tokenizer class="solr.StandardTokenizerFactory"/>
>>>     <filter class="solr.TrimFilterFactory"/>
>>>     <filter class="solr.LowerCaseFilterFactory"/>
>>>     <filter class="solr.SynonymFilterFactory"
>>>             synonyms="../../filters/filter-synonyms.txt" ignoreCase="true"
>>>             expand="true"/>
>>>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>>>     <filter class="solr.WordDelimiterFilterFactory"
>>>             splitOnCaseChange="1" splitOnNumerics="1" stemEnglishPossessive="1"
>>>             generateWordParts="0" generateNumberParts="1" catenateWords="1"
>>>             catenateNumbers="1" catenateAll="0" preserveOriginal="1"/>
>>>     <filter class="solr.PhoneticFilterFactory"
>>>             encoder="DoubleMetaphone"/>
>>>     <filter class="solr.PorterStemFilterFactory"/>
>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>>   </analyzer>
>>> </fieldType>
>>>
>>> ==========================
>>> Actual index:
>>> ==========================
>>>
>>> <arr name="myEvent">
>>>   <str>2284e2</str>
>>>   <str>2284e4</str>
>>>   <str>2284e5</str>
>>>   <str>1911e2</str>
>>> </arr>
>>>
>>> ==========================
>>> Question:
>>> ==========================
>>>
>>> Now when I do a search like this:
>>>
>>> myEvent:1911e2
>>>
>>> This should match the 4th item. Now on "Full Interface", it does not return
>>> any result. But on "analysis", matches are highlighted.
>>>
>>> By using Debug: the parsedquery is:
>>>
>>> MultiPhraseQuery(myEvent:"(1911e2 1911) (A e) 2")
>>>
>>> Parsedquery_toString:
>>>
>>> myEvent:"(1911e2 1911) (A e) 2"
>>>
>>> Can anyone please help me on this?