Re: SOLR Exact phrase search issue
Heck, Charlie, it explains 90% of the problems I’ve personally had with programming in general over my entire career...

> On Jul 15, 2020, at 5:08 AM, Charlie Hull wrote:
>
> On 14/07/2020 12:48, Erick Erickson wrote:
>> This is almost certainly a mismatch between what you think is
>> happening and what you’ve actually told Solr to do ;).
> That's a great one-line explanation of 90% of the issues people face with
> Solr :-)
>
> Charlie
>>
>> Best,
>> Erick
>>
>>> On Jul 14, 2020, at 7:05 AM, Villalba Sans, Raúl wrote:
>>>
>>> Hello,
>>>
>>> We have an app that uses SOLR as search engine. We have detected incorrect
>>> behavior for which we find no explanation. If we perform a search with the
>>> phrase "Què t’hi jugues" we do not receive any results, although we know
>>> that there is a result that contains this phrase. However, if we search for
>>> "Què t’hi" or for "t’hi jugues" we do find results, including "Què t’hi
>>> jugues". We attach screenshots of the search tool and the xml of the
>>> results. We would greatly appreciate it if you could lend a hand in trying
>>> to find a solution or identify the cause of the problem.
>>>
>>> Search 1 – “Què t’hi jugues”
>>>
>>> Search 2 – “Què t’hi”
>>>
>>> Search 3 – “t’hi jugues”
>>>
>>> Best regards,
>>>
>>> Raül Villalba Sans
>>> Delivery Centers – Centros de Producción
>>> Parque de Gardeny, Edificio 28
>>> 25071 Lleida, España
>>> T +34 973 193 580
>
> --
> Charlie Hull
> OpenSource Connections, previously Flax
>
> tel/fax: +44 (0)8700 118334
> mobile: +44 (0)7767 825828
> web: www.o19s.com
Re: SOLR Exact phrase search issue
On 14/07/2020 12:48, Erick Erickson wrote:
> This is almost certainly a mismatch between what you think is
> happening and what you’ve actually told Solr to do ;).

That's a great one-line explanation of 90% of the issues people face with Solr :-)

Charlie

> Best,
> Erick
>
>> On Jul 14, 2020, at 7:05 AM, Villalba Sans, Raúl wrote:
>>
>> Hello,
>>
>> We have an app that uses SOLR as search engine. We have detected incorrect
>> behavior for which we find no explanation. If we perform a search with the
>> phrase "Què t’hi jugues" we do not receive any results, although we know
>> that there is a result that contains this phrase. However, if we search for
>> "Què t’hi" or for "t’hi jugues" we do find results, including "Què t’hi
>> jugues". We attach screenshots of the search tool and the xml of the
>> results. We would greatly appreciate it if you could lend a hand in trying
>> to find a solution or identify the cause of the problem.
>>
>> Search 1 – “Què t’hi jugues”
>>
>> Search 2 – “Què t’hi”
>>
>> Search 3 – “t’hi jugues”
>>
>> Best regards,
>>
>> Raül Villalba Sans
>> Delivery Centers – Centros de Producción
>> Parque de Gardeny, Edificio 28
>> 25071 Lleida, España
>> T +34 973 193 580

--
Charlie Hull
OpenSource Connections, previously Flax

tel/fax: +44 (0)8700 118334
mobile: +44 (0)7767 825828
web: www.o19s.com
Re: SOLR Exact phrase search issue
This is usually a result of either indexing or querying not quite doing what you expect. The screenshots don’t help with diagnosis: they show only the results, not why you got them. So here’s what I do to try to figure out why:

1> Add &debug=query to the query, or check the “debugQuery” checkbox on the admin UI. In particular, look at the “parsed query” in the results. Is it what you expect?

2> Use the Admin/Analysis page to see how the fields you’re searching against are tokenized. Sometimes your analysis chain produces unexpected results. <1> will show you this for querying, but not indexing.

3> Try turning on highlighting. You have not shown, for instance, that "Què t’hi jugues" all appears in a single field. It’s conceivable that you’re not searching that field at all and are matching "t’hi jugues" or "Què t’hi" in a different field than you expect.

4> Another thing that fools people is that the analysis chain may break up “t’hi” into “t” and “hi”, which may then match in unexpected places.

5> Are any of these stopwords? The admin/analysis page will show you.

6> Finally, try attaching &debug=true&explainOther=id:id_of_doc_you_expect. That will show you how the document you expect was scored, whether or not it’s included in numFound. It’s intended exactly for answering the question “why didn’t my search return a doc I _know_ it should?”

Seems like a lot of places to look, but <1> is certainly the first place I’d look.

This is almost certainly a mismatch between what you think is happening and what you’ve actually told Solr to do ;).

Best,
Erick

> On Jul 14, 2020, at 7:05 AM, Villalba Sans, Raúl wrote:
>
> Hello,
>
> We have an app that uses SOLR as search engine. We have detected incorrect
> behavior for which we find no explanation. If we perform a search with the
> phrase "Què t’hi jugues" we do not receive any results, although we know that
> there is a result that contains this phrase. However, if we search for "Què
> t’hi" or for "t’hi jugues" we do find results, including "Què t’hi jugues".
> We attach screenshots of the search tool and the xml of the results. We would
> greatly appreciate it if you could lend a hand in trying to find a solution
> or identify the cause of the problem.
>
> Search 1 – “Què t’hi jugues”
>
> Search 2 – “Què t’hi”
>
> Search 3 – “t’hi jugues”
>
> Best regards,
>
> Raül Villalba Sans
> Delivery Centers – Centros de Producción
>
> Parque de Gardeny, Edificio 28
> 25071 Lleida, España
> T +34 973 193 580
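
Steps 1> and 6> in Erick's reply amount to adding a couple of request parameters. A minimal sketch of building those URLs follows, assuming a local Solr instance; the host, the core name "mycore", the field name "title" and the document id "doc42" are all hypothetical placeholders, not values from the thread.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: adjust host and core name to your own installation.
SOLR_SELECT = "http://localhost:8983/solr/mycore/select"


def debug_url(phrase, field, expected_doc_id=None):
    """Build a select URL that asks Solr for debug output on a phrase query."""
    # Step 1: debug=query returns the parsed query alongside the results.
    params = {"q": f'{field}:"{phrase}"', "debug": "query"}
    if expected_doc_id is not None:
        # Step 6: full debug output plus an explanation for one specific document.
        params["debug"] = "true"
        params["explainOther"] = f"id:{expected_doc_id}"
    return f"{SOLR_SELECT}?{urlencode(params)}"


# "title" and "doc42" are illustrative assumptions only.
step1 = debug_url("Què t’hi jugues", "title")
step6 = debug_url("Què t’hi jugues", "title", expected_doc_id="doc42")
print(step1)
print(step6)
```

Opening the first URL and comparing the parsed query with what you intended is the quickest check; the second is only needed when a specific document is missing from the results.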
Phrase search issue with XMLPayload? Is it the better solution?
I have a project that involves words extracted by OCR. Each page has words, and each word has its geometry so that a blinking highlight can be shown to the end user. I've been trying to represent this document structure in XML, using the field 'fulltext_st':

  <document>
    <page top='111' bottom='222' right='333' left='444' word='foo' num='1'>foo</page>
    <page top='211' bottom='322' right='833' left='944' word='bar' num='1'>bar</page>
    <page top='311' bottom='422' right='733' left='144' word='baz' num='1'>baz</page>
    <page top='411' bottom='522' right='633' left='244' word='qux' num='1'>qux</page>
  </document>

I can get all terms in my search result with their payloads. But if I search using a phrase query, I can't fetch any result. Example:

  search?q=foo          1
  search?q=foo+bar      1 1
  /search?q="foo bar"   *nothing*

I was wondering whether XMLPayload supports this sort of thing (phrase search), or whether there is a better way to index a document with many pages and one rectangle (the graphical word geometry) for each term.

Thank you in advance.

--
View this message in context: http://old.nabble.com/Phrase-search-issue-with-XMLPayload--Is-it-the-better-solution--tp27018815p27018815.html
Sent from the Solr - User mailing list archive at Nabble.com.
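
For readers who want to reproduce the structure described in this message, here is a rough sketch that generates the same kind of XML from a list of OCR word boxes. It only illustrates the document layout quoted above; the tuple layout and the function name build_document are assumptions made for the example, and nothing here says how the XML payload field itself analyzes the text.

```python
import xml.etree.ElementTree as ET

# Word boxes copied from the example message:
# (word, top, bottom, right, left, page number).
words = [
    ("foo", 111, 222, 333, 444, 1),
    ("bar", 211, 322, 833, 944, 1),
    ("baz", 311, 422, 733, 144, 1),
    ("qux", 411, 522, 633, 244, 1),
]


def build_document(word_boxes):
    """Serialize word boxes as <document><page .../>...</document> (hypothetical helper)."""
    doc = ET.Element("document")
    for word, top, bottom, right, left, num in word_boxes:
        attrs = {
            "top": str(top),
            "bottom": str(bottom),
            "right": str(right),
            "left": str(left),
            "word": word,
            "num": str(num),
        }
        page = ET.SubElement(doc, "page", attrs)
        page.text = word  # the element text repeats the word, as in the message
    return ET.tostring(doc, encoding="unicode")


xml_payload = build_document(words)
print(xml_payload)
```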
Re: Phrase Search Issue
Amit,

Append &debugQuery=true to the search request URL and you'll see how your query string was interpreted.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
> From: dabboo
> To: solr-user@lucene.apache.org
> Sent: Thursday, May 21, 2009 3:48:45 AM
> Subject: Re: Phrase Search Issue
>
> This problem is related to the default operator in dismax. Currently OR is
> the default operator, and it behaves perfectly fine. I have changed the
> default operator in schema.xml to AND, and I have also changed the minimum
> match to 100%.
>
> But it seems that AND as the default operator doesn't work with dismax.
> Please advise.
>
> Thanks,
> Amit Garg
>
> dabboo wrote:
> >
> > Hi,
> >
> > I am facing one issue in phrase query. I am entering 'Top of the world' as
> > my search criteria. I am expecting it to return all the records in which
> > one field contains all these words in any order.
> >
> > But it is treating the query as OR and returning all the records that
> > contain any of these words. I am doing this using a dismax request.
> >
> > I would appreciate it if somebody could provide me some pointers.
> >
> > Thanks,
> > Amit Garg
> >
>
> --
> View this message in context:
> http://www.nabble.com/Phrase-Search-Issue-tp23648813p23649189.html
> Sent from the Solr - User mailing list archive at Nabble.com.
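
As a rough illustration of the request discussed in this thread, the sketch below builds a dismax query that asks for every term to match (mm=100%) and turns on debugQuery as Otis suggests. The endpoint and the field names "title" and "description" are hypothetical, and how dismax is selected (defType versus a dedicated request handler) depends on the Solr version and configuration, so treat this as an assumption-laden example rather than the poster's actual setup.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: adjust to your own Solr installation.
SOLR_SELECT = "http://localhost:8983/solr/select"


def dismax_all_terms_url(user_query, query_fields):
    """Build a dismax request that requires all terms and returns debug output."""
    params = {
        "defType": "dismax",            # assumes the parser is chosen per request
        "q": user_query,
        "qf": " ".join(query_fields),   # fields to search (illustrative names)
        "mm": "100%",                   # minimum-should-match: every clause required
        "debugQuery": "true",           # shows how the query string was interpreted
    }
    return f"{SOLR_SELECT}?{urlencode(params)}"


url = dismax_all_terms_url("Top of the world", ["title", "description"])
print(url)
```

The parsed query in the debug section of the response shows whether the minimum-match setting was actually applied to the request.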
Re: Phrase Search Issue
This problem is related to the default operator in dismax. Currently OR is the default operator, and it behaves perfectly fine. I have changed the default operator in schema.xml to AND, and I have also changed the minimum match to 100%.

But it seems that AND as the default operator doesn't work with dismax. Please advise.

Thanks,
Amit Garg

dabboo wrote:
>
> Hi,
>
> I am facing one issue in phrase query. I am entering 'Top of the world' as
> my search criteria. I am expecting it to return all the records in which
> one field contains all these words in any order.
>
> But it is treating the query as OR and returning all the records that
> contain any of these words. I am doing this using a dismax request.
>
> I would appreciate it if somebody could provide me some pointers.
>
> Thanks,
> Amit Garg
>

--
View this message in context: http://www.nabble.com/Phrase-Search-Issue-tp23648813p23649189.html
Sent from the Solr - User mailing list archive at Nabble.com.
Phrase Search Issue
Hi,

I am facing one issue in phrase query. I am entering 'Top of the world' as my search criteria. I am expecting it to return all the records in which one field contains all these words in any order.

But it is treating the query as OR and returning all the records that contain any of these words. I am doing this using a dismax request.

I would appreciate it if somebody could provide me some pointers.

Thanks,
Amit Garg

--
View this message in context: http://www.nabble.com/Phrase-Search-Issue-tp23648813p23648813.html
Sent from the Solr - User mailing list archive at Nabble.com.