Glad your problem isn't one any longer. Yeah, there are a lot
of nooks and crannies that one gets used to with Solr!

I'd estimate that, between learning how to read the debug
output and using the analysis page, 80-90% of the
"my search isn't working" questions on the list can be answered,
but it takes a while to get comfortable with those tools (and
to even know they exist!)...
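
For readers who have not used the debug output before, here is a minimal sketch of building such a request in Python. The host, port, and core name are hypothetical placeholders; `debug=query` is the parameter discussed further down this thread, and `q`, `fq`, and `wt` are ordinary Solr request parameters. Treat it as an illustration, not a recommended client.

```python
from urllib.parse import urlencode


def debug_query_url(base_url: str, fq: str, q: str = "*:*") -> str:
    """Build a /select URL that asks Solr to include parsed-query debug info."""
    params = {
        "q": q,            # main query; match-all by default
        "fq": fq,          # the filter query under investigation
        "debug": "query",  # ask for the parsed query in the response
        "wt": "json",      # response format
    }
    # urlencode percent-encodes the colon and quotes and turns spaces into '+'
    return f"{base_url}/select?{urlencode(params)}"


url = debug_query_url(
    "http://localhost:8080/solr/collection1",  # hypothetical host and core name
    'description:"fatty acid-binding protein"',
)
print(url)
```

The parsed query in the debug section of the response shows which terms Solr actually searches for, which can then be compared with what the analysis page reports for the indexed text.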

Best
Erick

On Wed, Sep 24, 2014 at 6:57 AM, aaguilar <antelmo.aguilar...@nd.edu> wrote:

> Hello Erick,
>
> Just wanted to let you know that I made the change you suggested and
> everything works as expected.  Also, thanks for letting me know about the
> Analysis page in Solr.  I did not know about it and I have found it very
> useful.
>
> Thanks!
>
> On Mon, Sep 22, 2014 at 5:41 PM, Antelmo Aguilar <antelmo.aguilar...@nd.edu> wrote:
>
> > Hello Erick,
> >
> > Thank you so much for your help.  That makes perfect sense.  I will make
> > the changes you suggest and let you know how it goes.
> >
> > Thanks!
> >
> > On Mon, Sep 22, 2014 at 4:12 PM, Erick Erickson [via Lucene] <ml-node+s472066n4160547...@n3.nabble.com> wrote:
> >
> >> Your index-time and query-time analysis chains are defined very
> >> differently. Omitting the WordDelimiterFilterFactory from the
> >> query-time analysis chain will lead to endless problems.
> >>
> >> With the definition you have, the terms in the index and their
> >> term positions are shown below. This is available from the
> >> admin/analysis page if you click the "verbose" checkbox, although I
> >> admit it's kind of hard to read:
> >> 1        2               3          4
> >> fatty    acid-binding    binding    protein
> >>          acid
> >>
> >> But at query time, this is how they're being analyzed:
> >> 1        2               3
> >> fatty    acid-binding    protein
> >>
> >> So searching for "fatty acid-binding protein" requires that the tokens
> >> "fatty", "acid-binding" and "protein" appear in term positions 1, 2, 3
> >> rather than where they actually are (1, 2, 4). Searching for "fatty
> >> acid-binding protein"~1 would actually find this; the "~1" means allow
> >> one gap in there.
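
To make the position mismatch concrete, here is a small Python sketch. It is a toy model, not Lucene's actual phrase-matching algorithm: the `INDEXED` table simply copies the index-time positions listed above, and `phrase_matches` is a hypothetical helper that lets each query term drift from its expected position as long as the total drift stays within the slop.

```python
from typing import Dict, List

# Index-time terms and positions as listed above for "fatty acid-binding protein".
INDEXED: Dict[str, List[int]] = {
    "fatty": [1],
    "acid-binding": [2],
    "acid": [2],
    "binding": [3],
    "protein": [4],
}


def phrase_matches(index: Dict[str, List[int]], query_terms: List[str], slop: int = 0) -> bool:
    """Toy phrase check: term i is expected at start + i; the summed
    displacement of all terms must not exceed `slop`."""
    if not query_terms:
        return False

    def search(i: int, start: int, used: int) -> bool:
        if i == len(query_terms):
            return True  # every query term was placed within the slop budget
        for pos in index.get(query_terms[i], []):
            cost = abs(pos - (start + i))
            if used + cost <= slop and search(i + 1, start, used + cost):
                return True
        return False

    # Try every position of the first term as the start of the phrase.
    return any(search(1, start, 0) for start in index.get(query_terms[0], []))


# Query-time tokens (whitespace split only, no WordDelimiterFilterFactory).
query = ["fatty", "acid-binding", "protein"]
print(phrase_matches(INDEXED, query))                       # exact phrase
print(phrase_matches(INDEXED, query, slop=1))               # phrase with ~1
print(phrase_matches(INDEXED, ["fatty", "acid-binding"]))   # shorter phrase
```

In this model the exact three-term phrase fails because "protein" sits at position 4 rather than 3, while the shorter phrase and the ~1 variant both match, which is the behaviour reported in the thread.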
> >>
> >> HOWEVER, that's the least of your problems. WordDelimiterFilterFactory
> >> will _also_ "split on intra-word delimiters (all non alpha-numeric
> >> characters)". While that doesn't really say so explicitly, that will
> >> have the effect of removing punctuation. So searching for "fatty
> >> acid-binding protein."~1 (note the period) will fail since the token
> >> will include the period.
> >>
> >> I'd _really_ advise you to use the WordDelimiterFilterFactory
> >> settings at both index and query time that are included in the stock
> >> Solr release for, say, text_en_splitting, or even a single analyzer
> >> like text_en_splitting_tight.
> >>
> >> Best,
> >> Erick
> >>
> >> On Mon, Sep 22, 2014 at 6:33 AM, aaguilar <[hidden email]> wrote:
> >>
> >> > Hello Erick.
> >> >
> >> > Below is the information you requested.   Thanks for your help!
> >> >
> >> > <fieldType name="text_ws_finer" class="solr.TextField" positionIncrementGap="100">
> >> >   <analyzer type="index">
> >> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >> >     <filter class="solr.WordDelimiterFilterFactory" splitOnNumerics="0"
> >> >             splitOnCaseChange="0" generateWordParts="1" generateNumberParts="0"
> >> >             catenateWords="0" catenateNumbers="0" catenateAll="0"
> >> >             preserveOriginal="1"/>
> >> >     <filter class="solr.StopFilterFactory"/>
> >> >     <filter class="solr.LowerCaseFilterFactory"/>
> >> >   </analyzer>
> >> >   <analyzer type="query">
> >> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >> >     <filter class="solr.LowerCaseFilterFactory"/>
> >> >   </analyzer>
> >> > </fieldType>
> >> >
> >> >
> >> > <field name="description" type="text_ws_finer" indexed="true" stored="true"/>
> >> >
> >> > On Fri, Sep 19, 2014 at 7:36 PM, Erick Erickson [via Lucene] <[hidden email]> wrote:
> >> >
> >> >> Hmmm, I'd have to see the schema definition for your description
> >> >> field. For this, the admin/analysis page is very helpful. Here's my
> >> >> guess:
> >> >>
> >> >> Your analysis chain doesn't break the incoming tokens up quite the way
> >> >> you think it does. Thus you have tokens in your index like
> >> >> 'protein,' (notice the comma) and 'protein-like' rather than just
> >> >> 'protein'. However, I can't quite reconcile this with your statement:
> >> >> "Another weird thing is that if I used description:"fatty
> >> >> acid-binding" AND description:"protein" ..."
> >> >>
> >> >> so I'm at something of a loss. If you paste in your schema definition
> >> >> for the 'description' field _and_ the corresponding <fieldType>
> >> >> definition I can give it a quick whirl.
> >> >>
> >> >> Best,
> >> >> Erick
> >> >>
> >> >> On Fri, Sep 19, 2014 at 11:53 AM, aaguilar <[hidden email]> wrote:
> >> >>
> >> >> > Hello Erick,
> >> >> >
> >> >> > Thanks for the response.  I tried adding debug=True to the query, but I
> >> >> > do not know exactly what I am looking for in the output.  Would it be
> >> >> > possible for you to look at the results?  I would really appreciate it.  I
> >> >> > attached two files: one is with the filter query description:"fatty
> >> >> > acid-binding" and the other is with the filter query description:"fatty
> >> >> > acid-binding protein".  If you look at the file that has the results for
> >> >> > description:"fatty acid-binding", you can see that the hits do have "fatty
> >> >> > acid-binding protein" and nothing in between.  I really appreciate any help
> >> >> > you can provide.
> >> >> >
> >> >> > Thank you
> >> >> >
> >> >> > On Fri, Sep 19, 2014 at 2:03 PM, Erick Erickson [via Lucene] <[hidden email]> wrote:
> >> >> >
> >> >> >> Your very best friend here is attaching &debug=query to the URL and
> >> >> >> looking at the parsed query results. Upon occasion there's some
> >> >> >>
> >> >> >> One possible explanation is that the description field has something like
> >> >> >> "fatty acid-binding some words protein", in which case your query
> >> >> >> "fatty acid-binding protein" would fail, but "fatty acid-binding
> >> >> >> protein"~4 would succeed.
> >> >> >>
> >> >> >> The other possibility is that your query parsing isn't quite doing
> >> >> >> what you think, but adding &debug=query should help there.
> >> >> >>
> >> >> >> Best,
> >> >> >> Erick
> >> >> >>
> >> >> >> On Fri, Sep 19, 2014 at 8:10 AM, aaguilar <[hidden email]> wrote:
> >> >> >>
> >> >> >> > Hello All,
> >> >> >> >
> >> >> >> > I recently came across a problem when I tried using description:"fatty
> >> >> >> > acid-binding protein" as a filter query when doing a query through the
> >> >> >> > query interface for Solr in the Tomcat server.  Using that filter query
> >> >> >> > did not give me any results at all; however, if I used description:"fatty
> >> >> >> > acid-binding" as the filter query, it would give me the results I wanted.
> >> >> >> >
> >> >> >> > The thing is that some of the results I got back from Solr did have the
> >> >> >> > words "fatty acid-binding protein" in the description field.  So I really
> >> >> >> > do not know what might be causing the issue of Solr not being able to find
> >> >> >> > those hits.
> >> >> >> >
> >> >> >> > Another weird thing is that if I used description:"fatty acid-binding" AND
> >> >> >> > description:"protein" as the filter query when doing a query, it gave me the
> >> >> >> > results I anticipated (with some extra results that did not have the exact
> >> >> >> > phrase "fatty acid-binding protein").  Does anyone have an idea as to what
> >> >> >> > might be happening?  Just in case this is helpful, the version of Solr we
> >> >> >> > are using is 4.0.0.2012.10.06.03.04.33.  I appreciate any help anyone can
> >> >> >> > provide.
> >> >> >> >
> >> >> >> > Thanks!
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >> > --
> >> >> >> > View this message in context:
> >> >> >> > http://lucene.472066.n3.nabble.com/Issue-Adding-Filter-Query-tp4159990.html
> >> >> >> > Sent from the Solr - User mailing list archive at Nabble.com.
> >> >> >
> >> >> >
> >> >> > fatty_acid-binding_protein.xml (1K) <http://lucene.472066.n3.nabble.com/attachment/4160048/0/fatty_acid-binding_protein.xml>
> >> >> > fatty_acid-binding.xml (63K) <http://lucene.472066.n3.nabble.com/attachment/4160048/1/fatty_acid-binding.xml>
