RE: Weird: Solr Search result and Analysis Result not match?

Ellery Leung Tue, 08 Nov 2011 18:40:47 -0800

Thanks Erick, here are my responses:

1. Yes.  What I want to achieve is partial-string search: the index is filtered
with EdgeNGram while the query is not, so a partial query string can match.
2. Good suggestion, will test it.
3. ok
4. Thank you
5/6. Will remove the synonyms and WordDelimiterFilterFactory from the query
analyzer.
7. Will look at that using Luke.  By the way, this is the first time I have
seen that there is a tool for that.  Thank you.
8. Yes.


Will check that again, thank you.

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: November 8, 2011 9:52 PM
To: solr-user@lucene.apache.org; elleryle...@be-o.com
Subject: Re: Weird: Solr Search result and Analysis Result not match?

Several things:

1> You don't have EdgeNGramFilterFactory in your query analysis chain.
     Is this intentional?
2> You have a LOT of stuff going on here. You might try making your analysis
     chain simpler and adding stuff back in until you see the error. Don't
     forget to re-index!
3> Analysis doesn't take into account query *parsing*, so it's possible to
     get a false sense of assurance when the analysis page matches your
     expectations.
4> Nothing jumps out at me except the EdgeNGram factory, but nice job of
     including information.
5> It's unusual to expand synonyms both at query and index time. Usually
     it's one or the other, with index time preferred.
6> Same with WordDelimiterFilterFactory. If you put all the variants in the
     index, you don't need to put all the variants in the query, and vice
     versa.
7> Take a look at your actual contents, perhaps using Luke, to ensure that
     what you expect to be in your index actually is.
8> You did re-index after your latest changes to your schema, right <G>?

All of this is a way of saying that I don't quite see what the problem is,
but at least there are some avenues to explore.
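
As a rough illustration of points 1, 2, 5 and 6, a stripped-down field type
might look like the sketch below. It is a starting point under stated
assumptions, not the poster's actual schema: the name myPrefixText is
hypothetical, and the synonym, word-delimiter, phonetic and stemming filters
are deliberately left out so they can be added back one at a time.
EdgeNGramFilterFactory appears only in the index analyzer, so the query term
is compared as typed against the indexed prefixes.

```xml
<!-- Hypothetical simplified field type; the name "myPrefixText" is an
     assumption made for illustration only. -->
<fieldType name="myPrefixText" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Index time only: emit every front prefix of each token. -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
            maxGramSize="50" side="front"/>
  </analyzer>
  <analyzer type="query">
    <!-- No n-gramming here: the query token is left whole. -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With a chain like this, a query for 1911 on a field of this type would be
expected to match a stored value of 1911e2, because the prefix 1911 should be
one of the indexed grams. Re-index after any change to the field type.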

Best
Erick

On Mon, Nov 7, 2011 at 9:29 PM, Ellery Leung <elleryle...@be-o.com> wrote:
> Hi all.
>
>
>
> I am using Solr 3.4 under Win 7.
>
>
>
> In the schema there is a multivalued field indexed this way:
>
> ==========================
>
> Schema:
>
> ==========================
>
> <field name="myEvent" type="myCustomText" multiValued="true" indexed="true"
>        stored="true" omitNorms="true"/>
>
> <fieldType name="myCustomText" class="solr.TextField"
>            positionIncrementGap="100">
>   <analyzer type="index">
>     <charFilter class="solr.MappingCharFilterFactory"
>                 mapping="../../filters/filter-mappings.txt"/>
>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.TrimFilterFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SynonymFilterFactory"
>             synonyms="../../filters/filter-synonyms.txt" ignoreCase="true"
>             expand="true"/>
>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             splitOnCaseChange="1" splitOnNumerics="1"
>             stemEnglishPossessive="1" generateWordParts="1"
>             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
>             catenateAll="0" preserveOriginal="1"/>
>     <filter class="solr.PhoneticFilterFactory"
>             encoder="DoubleMetaphone" inject="true"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>     <filter class="solr.EdgeNGramFilterFactory" minGramSize="1"
>             maxGramSize="50" side="front"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <charFilter class="solr.MappingCharFilterFactory"
>                 mapping="../../filters/filter-mappings.txt"/>
>     <charFilter class="solr.HTMLStripCharFilterFactory"/>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.TrimFilterFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SynonymFilterFactory"
>             synonyms="../../filters/filter-synonyms.txt" ignoreCase="true"
>             expand="true"/>
>     <filter class="solr.ASCIIFoldingFilterFactory"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             splitOnCaseChange="1" splitOnNumerics="1"
>             stemEnglishPossessive="1" generateWordParts="0"
>             generateNumberParts="1" catenateWords="1" catenateNumbers="1"
>             catenateAll="0" preserveOriginal="1"/>
>     <filter class="solr.PhoneticFilterFactory"
>             encoder="DoubleMetaphone"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> ==========================
>
> Actual index:
>
> ==========================
>
> <arr name="myEvent">
>
> <str>2284e2</str>
>
> <str>2284e4</str>
>
> <str>2284e5</str>
>
> <str>1911e2</str>
>
> </arr>
>
>
>
> ==========================
>
> Question:
>
> ==========================
>
> Now when I do a search like this:
>
>
>
> myEvent:1911e2
>
>
>
> This should match the 4th item.  However, the "Full Interface" search does
> not return any result, while on the "analysis" page the matches are
> highlighted.
>
>
>
> By using Debug: the parsedquery is:
>
>
>
> MultiPhraseQuery(myEvent:"(1911e2 1911) (A e) 2")
>
>
>
> Parsedquery_toString:
>
>
>
> myEvent:"(1911e2 1911) (A e) 2"
>
>
>
> Can anyone please help me with this?
>
>
