RE: How to stop Solr tokenising search terms with spaces
Hi Ahmet,

We have gone for the NGram solution. Thanks.

Regards,
Dinesh Babu.

-----Original Message-----
From: Ahmet Arslan [mailto:iori...@yahoo.com.INVALID]
Sent: 08 December 2014 15:27
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

Hi,

Maybe you have omitTermFreqAndPositions=true set for your fields? Positions are necessary for phrase queries to work.

Ahmet
Re: How to stop Solr tokenising search terms with spaces
If possible, please post your field type for others to see the final solution. Thanks!

-- Jack Krupansky

-----Original Message-----
From: Dinesh Babu
Sent: Wednesday, December 10, 2014 9:54 AM
To: solr-user@lucene.apache.org; Ahmet Arslan
Subject: RE: How to stop Solr tokenising search terms with spaces

Hi Ahmet,

We have gone for the NGram solution. Thanks.

Regards,
Dinesh Babu.
RE: How to stop Solr tokenising search terms with spaces
But my requirement is A* B* to be A* B*. A* OR B* won't meet my requirement. We have chosen the NGram solution and it is working for our requirement at the moment. Thanks for your input and help, Yonik.

Regards,
Dinesh Babu.

-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: 08 December 2014 17:58
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

On Mon, Dec 8, 2014 at 12:01 PM, Erik Hatcher erik.hatc...@gmail.com wrote:
> debug output tells a lot. Looks like in the last two examples that the second part (Viewpoint*) is NOT parsed with the complex phrase parser - the whitespace thwarts it.

Actually, it looks like it is, but you're not telling the complex phrase parser to put the two clauses in a phrase. You need the quotes. Even for complexphrase parser A* B* is the same as A* OR B*

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
Re: How to stop Solr tokenising search terms with spaces
On Tue, Dec 9, 2014 at 12:49 PM, Dinesh Babu dinesh.b...@pb.com wrote:
> But my requirement is A* B* to be A* B*. A* OR B* won't meet my requirement.

The syntax is what it is... With the complexphrase parser, if you want a phrase, you need to surround the clauses with double quotes: "A* B*"

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data

> We have chosen the NGram solution and it is working for our requirement at the moment. Thanks for your input and help, Yonik.
>
> Regards,
> Dinesh Babu.
Re: How to stop Solr tokenising search terms with spaces
What's the parsed query? debug=true

On Dec 8, 2014, at 02:50, Dinesh Babu dinesh.b...@pb.com wrote:
> I just tried your suggestion
>
> {!complexphrase}displayName:RVN Viewpoint users
>
> Even the above did not work. Am I missing any configuration changes for this parser to work?
>
> Regards,
> Dinesh Babu.
Re: How to stop Solr tokenising search terms with spaces
On Mon, Dec 8, 2014 at 2:50 AM, Dinesh Babu dinesh.b...@pb.com wrote:
> I just tried your suggestion
>
> {!complexphrase}displayName:RVN Viewpoint users
>
> Even the above did not work. Am I missing any configuration changes for this parser to work?

What is the fieldType of displayName? The complexphrase query parser is only for text fields (those that index each word as a separate term).

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data

> Regards,
> Dinesh Babu.
>
> -----Original Message-----
> From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
> Sent: 07 December 2014 20:49
> To: solr-user@lucene.apache.org
> Subject: Re: How to stop Solr tokenising search terms with spaces
>
> On Sun, Dec 7, 2014 at 3:18 PM, Dinesh Babu dinesh.b...@pb.com wrote:
>> Thanks Yonik. This does not seem to work for me. This is what I did
>>
>> 1) q=displayName:rvn* brings me two records (a) RVN Viewpoint Users and (b) RVN Project Admins
>>
>> 2) {!complexphrase}"RVN*" -- Unknown query type "org.apache.lucene.search.PrefixQuery" found in phrase query string "RVN*"
>
> Looks like you found a bug in this part... a prefix query being quoted when it doesn't need to be.
>
>> 3) {!complexphrase}RVN V* -- Does not bring any result back.
>
> This type of query should work (and does for me). Is it because the default search field does not have these terms, and you didn't specify a different field to search? Try this: {!complexphrase}displayName:RVN V*
>
> -Yonik
> http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
RE: How to stop Solr tokenising search terms with spaces
Thanks a lot Jack. Will try this solution.

Regards,
Dinesh Babu.

-----Original Message-----
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: 07 December 2014 20:38
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

Thanks for the clarification. You may be able to get by using an ngram filter at index time - but not at query time. Then Tom would be indexed at position 0 as to, om, and tom, and Hanks would be indexed at position 1 as ha, an, nk, ks, han, ank, nks, hank, anks, and hanks, permitting all of your queries, as unquoted terms or quoted simple phrases, such as "to ank".

Use the standard tokenizer combined with the NGramFilterFactory and lower case filter, but only use the ngram filter at index time. See:
http://lucene.apache.org/core/4_10_2/analyzers-common/org/apache/lucene/analysis/ngram/NGramFilterFactory.html

But be aware that use of the ngram filter dramatically increases the index size, so don't use it on large text fields, just short text fields like names.

-- Jack Krupansky

-----Original Message-----
From: Dinesh Babu
Sent: Sunday, December 7, 2014 2:58 PM
To: solr-user@lucene.apache.org
Subject: RE: How to stop Solr tokenising search terms with spaces

Hi Alex,

My requirement is that I should be able to search for a person, for example Tom Hanks, by either

1) the whole of first name (Tom)
2) or partial first name with prefix (To)
3) or partial first name without prefix (om)
4) or the whole of surname (Hanks)
5) or partial surname with prefix (Han)
6) or partial surname without prefix (ank)
7) or the whole name (Tom Hanks)
8) or partial first name with or without prefix and partial surname with or without prefix (To Han, om ank)
9) All of the above as case insensitive search

Thanks in advance for your help

Regards,
Dinesh Babu.
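Jack's index-time n-gram suggestion can be sketched in a few lines of Python. This is only a toy model of the idea, not Lucene's NGramFilterFactory: the gram sizes (2 to 5) are an assumption inferred from his Tom/Hanks example, and the function names are made up for illustration. Each word keeps its position, every gram of the word is indexed at that position, and the query side is only lower-cased and split on whitespace.

```python
def ngrams(token, min_gram=2, max_gram=5):
    """Return all substrings of token with length min_gram..max_gram, shortest first."""
    token = token.lower()
    grams = []
    for size in range(min_gram, min(max_gram, len(token)) + 1):
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


def index_name(name, min_gram=2, max_gram=5):
    """Index-time analysis: map each gram to the word positions it occurs at."""
    postings = {}
    for position, word in enumerate(name.split()):
        for gram in ngrams(word, min_gram, max_gram):
            postings.setdefault(gram, set()).add(position)
    return postings


def matches_phrase(postings, query):
    """Query-time analysis (no n-grams): lower-cased words must sit at consecutive positions."""
    words = query.lower().split()
    if not words:
        return False
    starts = postings.get(words[0], set())
    return any(
        all(start + offset in postings.get(word, set())
            for offset, word in enumerate(words))
        for start in starts
    )


if __name__ == "__main__":
    postings = index_name("Tom Hanks")
    for query in ["Tom", "To", "om", "Hanks", "Han", "ank", "Tom Hanks", "To Han", "om ank"]:
        print(query, matches_phrase(postings, query))
```

With these assumed sizes a one-letter query such as "T" would not match, because no gram shorter than two characters is indexed. A smaller minimum gram size would cover that at the cost of an even larger index.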
RE: How to stop Solr tokenising search terms with spaces
Hi Yonik,

It is a text field (all our search fields are of type text). Very unlucky for me that it is not working. Will try the NGram solution provided by Jack.

Regards,
Dinesh Babu.

-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: 08 December 2014 13:25
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

On Mon, Dec 8, 2014 at 2:50 AM, Dinesh Babu dinesh.b...@pb.com wrote:
> I just tried your suggestion
>
> {!complexphrase}displayName:RVN Viewpoint users
>
> Even the above did not work. Am I missing any configuration changes for this parser to work?

What is the fieldType of displayName? The complexphrase query parser is only for text fields (those that index each word as a separate term).

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
Re: How to stop Solr tokenising search terms with spaces
Hi,

Maybe you have omitTermFreqAndPositions=true set for your fields? Positions are necessary for phrase queries to work.

Ahmet

On Monday, December 8, 2014 5:20 PM, Dinesh Babu dinesh.b...@pb.com wrote:
> Hi Yonik,
>
> It is a text field (all our search fields are of type text). Very unlucky for me that it is not working. Will try the NGram solution provided by Jack.
>
> Regards,
> Dinesh Babu.
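Ahmet's point that phrase queries need positions can be illustrated with a toy inverted index. This is a simplified model written for this thread, not Solr's actual index format, and how a real Solr version reacts to a phrase query on a field without positions may differ (it may raise an error instead of returning nothing). The `with_positions` flag is a hypothetical stand-in for the effect of omitTermFreqAndPositions=true.

```python
def build_index(docs, with_positions=True):
    """Map term -> {doc_id: [positions]}; positions stay empty when they are omitted."""
    index = {}
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            positions = index.setdefault(term, {}).setdefault(doc_id, [])
            if with_positions:
                positions.append(position)
    return index


def phrase_docs(index, phrase):
    """Return ids of documents where the phrase terms occur at consecutive positions."""
    terms = phrase.lower().split()
    if not terms:
        return set()
    hits = set()
    for doc_id, starts in index.get(terms[0], {}).items():
        for start in starts:
            if all(start + k in index.get(term, {}).get(doc_id, [])
                   for k, term in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits


if __name__ == "__main__":
    docs = {1: "RVN Viewpoint Users", 2: "Viewpoint RVN Admins"}
    print(phrase_docs(build_index(docs), "rvn viewpoint"))
    print(phrase_docs(build_index(docs, with_positions=False), "rvn viewpoint"))
```

Single-term lookups still work without positions, since the term-to-document mapping is intact. Only the ordering information that a phrase needs is gone.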
RE: How to stop Solr tokenising search terms with spaces
Hi Erik,

1. With search phrase in quotes

{!complexphrase}displayName:"RVN Viewpoint*"

"debug": {
  "rawquerystring": "{!complexphrase}displayName:\"RVN Viewpoint*\"",
  "querystring": "{!complexphrase}displayName:\"RVN Viewpoint*\"",
  "parsedquery": "ComplexPhraseQuery(\"RVN Viewpoint*\")",
  "parsedquery_toString": "\"RVN Viewpoint*\"",
  "QParser": "ComplexPhraseQParser"
}

2. Tried with search phrase not in quotes. This brings results back, but only those starting with viewpoint, and does not bring rvn viewpoint

{!complexphrase}displayName:RVN Viewpoint*

"debug": {
  "rawquerystring": "{!complexphrase}displayName:RVN Viewpoint*",
  "querystring": "{!complexphrase}displayName:RVN Viewpoint*",
  "parsedquery": "displayName:rvn displayName:viewpoint*",
  "parsedquery_toString": "displayName:rvn displayName:viewpoint*",
  "QParser": "ComplexPhraseQParser"
}

3. {!complexphrase}displayName:RVN* Viewpoint*

"debug": {
  "rawquerystring": "{!complexphrase}displayName:RVN* Viewpoint*",
  "querystring": "{!complexphrase}displayName:RVN* Viewpoint*",
  "parsedquery": "displayName:rvn* displayName:viewpoint*",
  "parsedquery_toString": "displayName:rvn* displayName:viewpoint*",
  "QParser": "ComplexPhraseQParser"
}

Regards,
Dinesh Babu.

-----Original Message-----
From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: 08 December 2014 09:40
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

What's the parsed query? debug=true
Re: How to stop Solr tokenising search terms with spaces
debug output tells a lot. Looks like in the last two examples that the second part (Viewpoint*) is NOT parsed with the complex phrase parser - the whitespace thwarts it.

I’d recommend doing something like this to test that parser out to avoid the “meta” parsing issue.

q={!complexphrase v=$qq}&qq=the expression

Erik

On Dec 8, 2014, at 11:14 AM, Dinesh Babu dinesh.b...@pb.com wrote:
> Hi Erik,
>
> 1. With search phrase in quotes
>
> {!complexphrase}displayName:"RVN Viewpoint*"
>
> "debug": {
>   "rawquerystring": "{!complexphrase}displayName:\"RVN Viewpoint*\"",
>   "querystring": "{!complexphrase}displayName:\"RVN Viewpoint*\"",
>   "parsedquery": "ComplexPhraseQuery(\"RVN Viewpoint*\")",
>   "parsedquery_toString": "\"RVN Viewpoint*\"",
>   "QParser": "ComplexPhraseQParser"
> }
>
> 2. Tried with search phrase not in quotes. This brings results back, but only those starting with viewpoint, and does not bring rvn viewpoint
>
> {!complexphrase}displayName:RVN Viewpoint*
>
> "debug": {
>   "rawquerystring": "{!complexphrase}displayName:RVN Viewpoint*",
>   "querystring": "{!complexphrase}displayName:RVN Viewpoint*",
>   "parsedquery": "displayName:rvn displayName:viewpoint*",
>   "parsedquery_toString": "displayName:rvn displayName:viewpoint*",
>   "QParser": "ComplexPhraseQParser"
> }
>
> 3. {!complexphrase}displayName:RVN* Viewpoint*
>
> "debug": {
>   "rawquerystring": "{!complexphrase}displayName:RVN* Viewpoint*",
>   "querystring": "{!complexphrase}displayName:RVN* Viewpoint*",
>   "parsedquery": "displayName:rvn* displayName:viewpoint*",
>   "parsedquery_toString": "displayName:rvn* displayName:viewpoint*",
>   "QParser": "ComplexPhraseQParser"
> }
>
> Regards,
> Dinesh Babu.
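Erik's suggestion of passing the expression through a separate parameter is easiest to get right when the request is built programmatically, so that quotes, asterisks, and spaces are URL-encoded rather than typed by hand. The following is a minimal sketch using only Python's standard library. The host, port, and core name in `SOLR_SELECT_URL` are placeholders, and the helper names are made up for illustration. Only the `q`, `qq`, and `debug` parameters come from the thread.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical endpoint: adjust host, port, and core name to your installation.
SOLR_SELECT_URL = "http://localhost:8983/solr/collection1/select"


def complexphrase_params(expression):
    """Request parameters that hand `expression` to the complexphrase parser via $qq."""
    return {
        "q": "{!complexphrase v=$qq}",  # local-params dereference, as Erik suggested
        "qq": expression,               # the whole expression, quotes included
        "debug": "true",                # ask Solr to return the parsed query
    }


def build_url(expression):
    """Return the full request URL with every parameter value URL-encoded."""
    return SOLR_SELECT_URL + "?" + urlencode(complexphrase_params(expression))


if __name__ == "__main__":
    # The double quotes are part of the expression: they ask for a phrase.
    print(build_url('displayName:"RVN V*"'))
```

Keeping the expression in its own parameter means the whitespace inside it is never seen by the outer parser, which is the "meta" parsing issue Erik mentions.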
RE: How to stop Solr tokenising search terms with spaces
Thanks Erik

Regards,
Dinesh Babu.

-----Original Message-----
From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: 08 December 2014 17:02
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

debug output tells a lot. Looks like in the last two examples that the second part (Viewpoint*) is NOT parsed with the complex phrase parser - the whitespace thwarts it.

I’d recommend doing something like this to test that parser out to avoid the “meta” parsing issue.

q={!complexphrase v=$qq}&qq=the expression

Erik
Re: How to stop Solr tokenising search terms with spaces
On Mon, Dec 8, 2014 at 12:01 PM, Erik Hatcher erik.hatc...@gmail.com wrote:
> debug output tells a lot. Looks like in the last two examples that the second part (Viewpoint*) is NOT parsed with the complex phrase parser - the whitespace thwarts it.

Actually, it looks like it is, but you're not telling the complex phrase parser to put the two clauses in a phrase. You need the quotes. Even for complexphrase parser A* B* is the same as A* OR B*

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
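The difference Yonik describes between unquoted A* B* (an OR of two prefix clauses) and a quoted phrase of prefixes can be shown with a toy matcher over the two display names mentioned in this thread. This is a simplified illustration in plain Python, not how the complexphrase parser is implemented. Both function names are hypothetical, and the unquoted case assumes the default operator is OR.

```python
def tokens(text):
    """Lower-case and split on whitespace, roughly what a simple text field does."""
    return text.lower().split()


def matches_any_prefix(name, prefixes):
    """Unquoted A* B*: true if any token starts with any of the prefixes."""
    return any(token.startswith(prefix.lower())
               for token in tokens(name)
               for prefix in prefixes)


def matches_prefix_phrase(name, prefixes):
    """Quoted "A* B*": consecutive tokens must start with the prefixes, in order."""
    toks = tokens(name)
    wanted = [prefix.lower() for prefix in prefixes]
    span = len(wanted)
    return any(
        all(token.startswith(prefix)
            for token, prefix in zip(toks[i:i + span], wanted))
        for i in range(len(toks) - span + 1)
    )


if __name__ == "__main__":
    names = ["RVN Viewpoint Users", "RVN Project Admins"]
    print([n for n in names if matches_any_prefix(n, ["RVN", "V"])])
    print([n for n in names if matches_prefix_phrase(n, ["RVN", "V"])])
```

The OR form accepts both names because each contains a token starting with "rvn". The phrase form accepts only the name where a token starting with "v" directly follows one starting with "rvn".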
RE: How to stop Solr tokenising search terms with spaces
Hi Alex,

My requirement is that I should be able to search for a person, for example Tom Hanks, by any of the following:

1) the whole first name (Tom)
2) a partial first name that is a prefix (To)
3) a partial first name that is not a prefix (om)
4) the whole surname (Hanks)
5) a partial surname that is a prefix (Han)
6) a partial surname that is not a prefix (ank)
7) the whole name (Tom Hanks)
8) a partial first name and a partial surname, each with or without the prefix (To Han, om ank)
9) all of the above as case-insensitive searches

Thanks in advance for your help.

Regards,
Dinesh Babu.

-Original Message-
From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
Sent: 07 December 2014 01:20
To: solr-user
Subject: Re: How to stop Solr tokenising search terms with spaces

There is no spoon. And there is no phrase search, certainly not as one approach that fits all.

What is actually happening is that you seem to want both phrase and prefix search. In your original question you did not explain the second part, so you were given a solution for the first one. To get the second part, you now need to put some sort of NGram into the index-type analyzer chain.

But the problem is, you need to be very clear on what you want there. Do you want:

1) Major Hanks
2) Major Ha
3) Hanks Ma (swapped)
4) Hanks random text Major (swapped and apart)
5) Ha Ma (prefix on both words)
6) ha ma (lower case searches too)

Or only some of those? Each of these things has implications and trade-offs. Once you know what you want to find, we can help you get there.

Regards,
Alex.
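One way to read the nine requirements in the message above is that every word of the query must be a case-insensitive substring of some word in the stored name. The following is a minimal sketch of that reading in plain Python, not Solr code; the function name `matches` is hypothetical, and the sketch deliberately ignores word order, which the requirements do not spell out.

```python
def matches(name: str, query: str) -> bool:
    """Return True if every query word is a case-insensitive substring
    of at least one word in the name. Word order is not checked."""
    name_words = name.lower().split()
    query_words = query.lower().split()
    # An empty query matches nothing.
    return bool(query_words) and all(
        any(q in word for word in name_words) for q in query_words
    )


if __name__ == "__main__":
    # The example queries from requirements 1) to 9), plus one non-match.
    for q in ["Tom", "To", "om", "Hanks", "Han", "ank",
              "Tom Hanks", "To Han", "om ank", "tOM hAN", "Tim"]:
        print(f"{q!r}: {matches('Tom Hanks', q)}")
```

A reference function like this can serve as a test oracle when comparing candidate analyzer chains against the requirements.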
RE: How to stop Solr tokenising search terms with spaces
Hi Jack,

Reproducing the email that specifies my requirement.

My requirement is that I should be able to search for a person, for example Tom Hanks, by any of the following:

1) the whole first name (Tom)
2) a partial first name that is a prefix (To)
3) a partial first name that is not a prefix (om)
4) the whole surname (Hanks)
5) a partial surname that is a prefix (Han)
6) a partial surname that is not a prefix (ank)
7) the whole name (Tom Hanks)
8) a partial first name and a partial surname, each with or without the prefix (To Han, om ank)
9) all of the above as case-insensitive searches

Thanks in advance for your help.

Regards,
Dinesh Babu.

-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com]
Sent: 07 December 2014 02:04
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

AFAIK, partial word matching is not a common use case. Could you provide a citation that shows otherwise?

Solr does provide a simple mechanism for phrase search - just place your phrase in quotes. If you wish to do something more complex, then of course the solution may be more complex. The starting point would be for you to provide a more complete description of your use case, which is clearly not simple phrase search.

Your most recent messages suggested that you want to match on partial words, but... you need to be more specific - don't make us try to guess your requirements. Feeding us partial requirements, one at a time, is not particularly effective.

Finally, are you really trying to match names within arbitrary text, or do you have a field that simply contains a complete name? Again, this comes back to providing us with more specific requirements. My guess, from your mention of LDAP, is that the field would contain only a name, but... that's me guessing when you need to be specific.

Once this distinction is cleared up, we can then focus on solutions that work either for arbitrary text or single value fields.

-- Jack Krupansky
RE: How to stop Solr tokenising search terms with spaces
Thanks Yonik. This does not seem to work for me. This is what I did:

1) q=displayName:rvn* brings me two records: (a) RVN Viewpoint Users and (b) RVN Project Admins
2) {!complexphrase}RVN* -- Unknown query type "org.apache.lucene.search.PrefixQuery" found in phrase query string "RVN*"
3) {!complexphrase}RVN V* -- Does not bring any result back.
4) {!complexphrase}RVN Viewpoint* -- Does not bring any result back.

Do I need to make any configuration changes to get this working?

Regards,
Dinesh Babu.

-Original Message-
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: 07 December 2014 03:30
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

On Sat, Dec 6, 2014 at 7:17 PM, Dinesh Babu dinesh.b...@pb.com wrote:
> Just curious, why solr does not provide a simple mechanism to do a phrase search ?

Simple phrase queries:
q=field1:"Hanks Major"

Phrase queries with wildcards / partial matches are a different story... they are complex:
q={!complexphrase}"hanks ma*"

See more examples here: http://heliosearch.org/solr-4-8-features/

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
Re: How to stop Solr tokenising search terms with spaces
Thanks for the clarification. You may be able to get by using an ngram filter at index time - but not at query time. Then Tom would be indexed at position 0 as to, om, and tom, and Hanks would be indexed at position 1 as ha, an, nk, ks, han, ank, nks, hank, anks, and hanks, permitting all of your queries, as unquoted terms or quoted simple phrases, such as "to ank".

Use the standard tokenizer combined with the NGramFilterFactory and lower case filter, but only use the ngram filter at index time. See:
http://lucene.apache.org/core/4_10_2/analyzers-common/org/apache/lucene/analysis/ngram/NGramFilterFactory.html

But be aware that use of the ngram filter dramatically increases the index size, so don't use it on large text fields, just short text fields like names.

-- Jack Krupansky

-Original Message-
From: Dinesh Babu
Sent: Sunday, December 7, 2014 2:58 PM
To: solr-user@lucene.apache.org
Subject: RE: How to stop Solr tokenising search terms with spaces

Hi Alex,

My requirement is that I should be able to search for a person, for example Tom Hanks, by any of the following:

1) the whole first name (Tom)
2) a partial first name that is a prefix (To)
3) a partial first name that is not a prefix (om)
4) the whole surname (Hanks)
5) a partial surname that is a prefix (Han)
6) a partial surname that is not a prefix (ank)
7) the whole name (Tom Hanks)
8) a partial first name and a partial surname, each with or without the prefix (To Han, om ank)
9) all of the above as case-insensitive searches

Thanks in advance for your help.

Regards,
Dinesh Babu.
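Jack's index-time ngram suggestion above can be simulated in a few lines. This is only a sketch in plain Python, not Solr's analyzer: the gram sizes 2 to 5 are an assumption chosen to reproduce the terms listed in his message, and the helper names `ngrams`, `index_name` and `phrase_match` are hypothetical.

```python
def ngrams(word: str, min_size: int = 2, max_size: int = 5) -> list[str]:
    """All lowercase substrings of `word` with length min_size..max_size,
    shortest first (the order is for readability only)."""
    word = word.lower()
    return [
        word[i:i + n]
        for n in range(min_size, max_size + 1)
        for i in range(len(word) - n + 1)
    ]


def index_name(name: str) -> list[set[str]]:
    """Index-time analysis: one set of grams per word position."""
    return [set(ngrams(w)) for w in name.split()]


def phrase_match(index: list[set[str]], query: str) -> bool:
    """Query-time analysis only lowercases and splits (no ngram filter).
    The query terms must hit consecutive positions, as in a phrase query."""
    terms = query.lower().split()
    if not terms:
        return False
    return any(
        all(term in index[start + k] for k, term in enumerate(terms))
        for start in range(len(index) - len(terms) + 1)
    )


if __name__ == "__main__":
    idx = index_name("Tom Hanks")
    print(ngrams("Tom"))    # grams stored at position 0
    print(ngrams("Hanks"))  # grams stored at position 1
    for q in ["to ank", "ank", "Tom Hanks", "hanks tom"]:
        print(f"{q!r}: {phrase_match(idx, q)}")
```

The sketch also makes the index-size warning concrete: a five-letter word already produces ten terms at these sizes.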
Re: How to stop Solr tokenising search terms with spaces
On Sun, Dec 7, 2014 at 3:18 PM, Dinesh Babu dinesh.b...@pb.com wrote:
> Thanks Yonik. This does not seem to work for me. This is what I did:
> 1) q=displayName:rvn* brings me two records: (a) RVN Viewpoint Users and (b) RVN Project Admins
> 2) {!complexphrase}RVN* -- Unknown query type "org.apache.lucene.search.PrefixQuery" found in phrase query string "RVN*"

Looks like you found a bug in this part... a prefix query being quoted when it doesn't need to be.

> 3) {!complexphrase}RVN V* -- Does not bring any result back.

This type of query should work (and does for me). Is it because the default search field does not have these terms, and you didn't specify a different field to search? Try this:

{!complexphrase}displayName:RVN V*

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
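Yonik's advice above is to name the field explicitly instead of relying on the default search field. As a hedged illustration, the request parameter could be built as below. The helper name `complexphrase_params` is hypothetical, and the sketch assumes the documented form of the parser in which the phrase is double-quoted after the field name; the archived text of this thread shows the queries without quotes.

```python
from urllib.parse import urlencode


def complexphrase_params(field: str, phrase: str) -> dict[str, str]:
    """Build the q parameter for a complexphrase query on an explicit field.
    Backslashes and double quotes in the phrase are escaped; wildcards
    such as * are left intact so the parser can expand them."""
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    return {"q": f'{{!complexphrase}}{field}:"{escaped}"'}


if __name__ == "__main__":
    params = complexphrase_params("displayName", "RVN V*")
    print(params["q"])
    # URL-encoded form, ready to append to a select URL.
    print(urlencode(params))
```

Encoding the parameter with `urlencode` avoids hand-escaping the braces, quotes and spaces in the query string.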
RE: How to stop Solr tokenising search terms with spaces
I just tried your suggestion:

{!complexphrase}displayName:RVN Viewpoint users

Even the above did not work. Am I missing any configuration changes for this parser to work?

Regards,
Dinesh Babu.

-Original Message-
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: 07 December 2014 20:49
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

On Sun, Dec 7, 2014 at 3:18 PM, Dinesh Babu dinesh.b...@pb.com wrote:
> Thanks Yonik. This does not seem to work for me. This is what I did:
> 1) q=displayName:rvn* brings me two records: (a) RVN Viewpoint Users and (b) RVN Project Admins
> 2) {!complexphrase}RVN* -- Unknown query type "org.apache.lucene.search.PrefixQuery" found in phrase query string "RVN*"

Looks like you found a bug in this part... a prefix query being quoted when it doesn't need to be.

> 3) {!complexphrase}RVN V* -- Does not bring any result back.

This type of query should work (and does for me). Is it because the default search field does not have these terms, and you didn't specify a different field to search? Try this:

{!complexphrase}displayName:RVN V*

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
RE: How to stop Solr tokenising search terms with spaces
Just curious, why does Solr not provide a simple mechanism to do a phrase search? It is a very common use case, and it is very surprising that there is no straightforward way to do it in Solr; at least I have not found one after so much research.

Regards,
Dinesh

-Original Message-
From: Dinesh Babu [mailto:dinesh.b...@pb.com]
Sent: 05 December 2014 17:29
To: solr-user@lucene.apache.org
Subject: RE: How to stop Solr tokenising search terms with spaces

Hi Erik,

Probably I celebrated too soon. When I tested {!field} it seemed to work, because the query happened to be on data that made it look like it was working. Using the example that I originally mentioned, searching for Tom Hanks Major:

1) If I search {!field f=displayName}: Hanks Major, it works
2) If I provide a partial word, {!field f=displayName}: Hanks Ma, it does not work

Is this how {!field} is designed to work?

Also, I tried without and with escaping the space as you suggested. It has the same issue:

1) q= field1:Hanks Major , it works
2) q= field1:Hanks Maj , it does not work

Regards,
Dinesh Babu.

-Original Message-
From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: 05 December 2014 16:44
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

But also, to spell out the more typical way to do that:

q=field1:"…" OR field2:"…"

The nice thing about {!field} is that the value doesn't have to have quotes and deal with escaping issues, but if you just want phrase queries and quote/escaping isn't a hassle maybe that's cleaner for you.

Erik
Re: How to stop Solr tokenising search terms with spaces
There is no spoon. And there is no phrase search, certainly not as one approach that fits all.

What is actually happening is that you seem to want both phrase and prefix search. In your original question you did not explain the second part, so you were given a solution for the first one. To get the second part, you now need to put some sort of NGram into the index-type analyzer chain.

But the problem is, you need to be very clear on what you want there. Do you want:

1) Major Hanks
2) Major Ha
3) Hanks Ma (swapped)
4) Hanks random text Major (swapped and apart)
5) Ha Ma (prefix on both words)
6) ha ma (lower case searches too)

Or only some of those? Each of these things has implications and trade-offs. Once you know what you want to find, we can help you get there.

Regards,
Alex.

P.s. If you are not sure what I am talking about with the analyzer chain, may I recommend my own book: http://www.amazon.ca/Instant-Apache-Solr-Indexing-Data-ebook/dp/B00D85K9XC It seems to be on sale right now.

Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On 6 December 2014 at 19:17, Dinesh Babu dinesh.b...@pb.com wrote:
> Just curious, why solr does not provide a simple mechanism to do a phrase search ? It is a very common use case and it is very surprising that there is no straight forward, at least I have not found one after so much research, way to do it in Solr.
>
> Regards,
> Dinesh
Re: How to stop Solr tokenising search terms with spaces
AFAIK, partial word matching is not a common use case. Could you provide a citation that shows otherwise?

Solr does provide a simple mechanism for phrase search - just place your phrase in quotes. If you wish to do something more complex, then of course the solution may be more complex. The starting point would be for you to provide a more complete description of your use case, which is clearly not simple phrase search.

Your most recent messages suggested that you want to match on partial words, but... you need to be more specific - don't make us try to guess your requirements. Feeding us partial requirements, one at a time, is not particularly effective.

Finally, are you really trying to match names within arbitrary text, or do you have a field that simply contains a complete name? Again, this comes back to providing us with more specific requirements. My guess, from your mention of LDAP, is that the field would contain only a name, but... that's me guessing when you need to be specific.

Once this distinction is cleared up, we can then focus on solutions that work either for arbitrary text or single value fields.

-- Jack Krupansky

-Original Message-
From: Dinesh Babu
Sent: Saturday, December 6, 2014 7:17 PM
To: solr-user@lucene.apache.org
Subject: RE: How to stop Solr tokenising search terms with spaces

Just curious, why solr does not provide a simple mechanism to do a phrase search ? It is a very common use case and it is very surprising that there is no straight forward, at least I have not found one after so much research, way to do it in Solr.

Regards,
Dinesh
Re: How to stop Solr tokenising search terms with spaces
On Sat, Dec 6, 2014 at 7:17 PM, Dinesh Babu dinesh.b...@pb.com wrote:
> Just curious, why solr does not provide a simple mechanism to do a phrase search ?

Simple phrase queries:
q=field1:"Hanks Major"

Phrase queries with wildcards / partial matches are a different story... they are complex:
q={!complexphrase}"hanks ma*"

See more examples here: http://heliosearch.org/solr-4-8-features/

-Yonik
http://heliosearch.org - native code faceting, facet functions, sub-facets, off-heap data
Re: How to stop Solr tokenising search terms with spaces
try using {!field} instead of {!prefix}. {!field} will create a phrase query (or term query if it's just one term) after analysis. [it also could construct other query types if the analysis overlaps tokens, but maybe not relevant here]

Also note that you can use multiple of these expressions if needed:

q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val}

where
f1_val=field 1 value
f2_val=field2 value

Erik

On Dec 5, 2014, at 10:45 AM, Dinesh Babu dinesh.b...@pb.com wrote:
> Hi,
>
> We are using Solr 4.10.2 to store user names from LDAP. I want Solr not to tokenise a search term that has a space in it. E.g., if there is a user by the name Tom Hanks Major, then:
>
> 1) When I query for Tom Hanks Major, I don't want Solr to break this search phrase up and search for the individual words (i.e., Tom, Hanks, Major), but to search for the whole phrase and get me the Tom Hanks Major user.
> 2) Also, if I query for Hanks Major, I should get the Tom Hanks Major user back.
>
> We used !prefix, but that does not allow scenario 2. Also, !prefix restricts the search to one field and can't search on multiple fields. Any solutions?
>
> Regards,
> Dinesh Babu.
RE: How to stop Solr tokenising search terms with spaces
One more quick question Erik,

If I want to search on multiple fields using {!field}, do we have a query similar to what {!prefix} has:

q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val}

where
f1_val=field 1 value
f2_val=field2 value

Regards,
Dinesh Babu.

-Original Message-
From: Dinesh Babu
Sent: 05 December 2014 16:26
To: solr-user@lucene.apache.org
Subject: RE: How to stop Solr tokenising search terms with spaces

Thanks a lot Erik. {!field} seems to solve our issue. Much appreciate your help.

Regards,
Dinesh Babu.
RE: How to stop Solr tokenising search terms with spaces
Thanks a lot Erik. {!field} seems to solve our issue. I much appreciate your help.

Regards,
Dinesh Babu.

-Original Message-
From: Erik Hatcher [mailto:erik.hatc...@gmail.com]
Sent: 05 December 2014 16:00
To: solr-user@lucene.apache.org
Subject: Re: How to stop Solr tokenising search terms with spaces

try using {!field} instead of {!prefix}. {!field} will create a phrase query (or term query if it's just one term) after analysis. [it also could construct other query types if the analysis overlaps tokens, but maybe not relevant here]

Also note that you can use multiple of these expressions if needed:

q={!prefix f=field1 v=$f1_val} OR {!prefix f=field2 v=$f2_val}

where
f1_val=field 1 value
f2_val=field2 value

Erik
Re: How to stop Solr tokenising search terms with spaces
But also, to spell out the more typical way to do that:

q=field1:"…" OR field2:"…"

The nice thing about {!field} is that the value doesn't have to be quoted, so there are no escaping issues to deal with. But if you just want phrase queries, and quoting/escaping isn't a hassle, maybe that's cleaner for you.

Erik
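The quoting and escaping Erik refers to can be sketched as follows. This is a minimal illustration, assuming the standard Lucene query syntax in which a backslash and a double quote must be backslash-escaped inside a quoted phrase; the helper names `quote_phrase` and `phrase_query` are hypothetical and not part of any Solr client library.

```python
def quote_phrase(value):
    """Wrap a value in double quotes for use as a phrase query.

    Backslashes are escaped first so that the backslashes added
    for the double quotes are not escaped a second time.
    """
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def phrase_query(fields, value):
    """Build field1:"..." OR field2:"..." for the given fields."""
    phrase = quote_phrase(value)
    return " OR ".join(f"{field}:{phrase}" for field in fields)


print(phrase_query(["field1", "field2"], "Tom Hanks Major"))
```

The resulting string still has to be URL-encoded before it is sent as the `q` parameter.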
Re: How to stop Solr tokenising search terms with spaces
Dinesh - indeed. You can compose arbitrarily complex queries using what have been termed "nested queries" like this. It used to be q=_query_:"{!…}..." OR _query_:"{!…}…", but the _query_ trick isn't strictly necessary now (though care has to be taken to make sure these complex nested expressions parse as you expect).

See slide #12 here: http://www.slideshare.net/erikhatcher/sa-22830939

Erik
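As a rough sketch of the nested-query form Erik describes, the snippet below builds one `_query_:"{!field ...}"` clause per field and keeps each value in its own request parameter. It uses the older `_query_` spelling because that form is stated explicitly in the message; the helper name `nested_field_query` and the generated parameter names (`f1_val`, `f2_val`, following the naming used earlier in the thread) are assumptions for illustration only.

```python
from urllib.parse import urlencode


def nested_field_query(values_by_field):
    """Return request params with one nested {!field} clause per field.

    Each value is stored under a generated parameter name (f1_val,
    f2_val, ...) and referenced from the clause with v=$name, so the
    value itself never needs quoting or escaping inside q.
    """
    params = {}
    clauses = []
    for index, (field, value) in enumerate(values_by_field.items(), start=1):
        ref = f"f{index}_val"
        params[ref] = value
        clauses.append(f'_query_:"{{!field f={field} v=${ref}}}"')
    params["q"] = " OR ".join(clauses)
    return params


params = nested_field_query({"field1": "Tom Hanks Major", "field2": "Hanks Major"})
print(params["q"])
print(urlencode(params))
```

Whether the clauses combine as intended depends on the query parser in use, which is the parsing caveat Erik raises, so it is worth checking the parsed query in Solr's debug output.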
RE: How to stop Solr tokenising search terms with spaces
Hi Erik,

Probably I celebrated too soon. When I tested {!field} it seemed to work, but only because the data I queried happened to make it look that way. Using the example I originally mentioned, searching for Tom Hanks Major:

1) If I search {!field f=displayName}: Hanks Major, it works.
2) If I provide a partial word, {!field f=displayName}: Hanks Ma, it does not work.

Is this how {!field} is designed to work?

I also tried both with and without escaping the space, as you suggested. It has the same issue:

1) q=field1:Hanks Major works.
2) q=field1:Hanks Maj does not work.

Regards, Dinesh Babu.
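The behaviour Dinesh reports is what one would expect from a phrase query over whole-word tokens: every query term has to equal an indexed term, so a partial last word matches nothing. The toy model below illustrates the difference between exact phrase matching and matching the last term as a prefix. It is only a sketch: the whitespace-and-lowercase tokeniser is an assumption standing in for the real analysis chain, and none of this is Solr's actual implementation.

```python
def tokenize(text):
    """Toy analyser (an assumption): lowercase and split on whitespace."""
    return text.lower().split()


def phrase_matches(indexed_value, query):
    """True if the query tokens appear consecutively as whole tokens."""
    doc, terms = tokenize(indexed_value), tokenize(query)
    if not terms:
        return False
    return any(
        doc[start:start + len(terms)] == terms
        for start in range(len(doc) - len(terms) + 1)
    )


def phrase_prefix_matches(indexed_value, query):
    """Like phrase_matches, but the last query term may be a prefix."""
    doc, terms = tokenize(indexed_value), tokenize(query)
    if not terms:
        return False
    *leading, last = terms
    for start in range(len(doc) - len(terms) + 1):
        window = doc[start:start + len(terms)]
        if window[:-1] == leading and window[-1].startswith(last):
            return True
    return False


name = "Tom Hanks Major"
print(phrase_matches(name, "Hanks Major"))      # whole tokens, in order
print(phrase_matches(name, "Hanks Ma"))         # "ma" is not an indexed token
print(phrase_prefix_matches(name, "Hanks Ma"))  # last term treated as a prefix
```

Partial-word matching therefore needs either a query type that allows a wildcard on the final term or an index that stores word fragments, which is the direction the later messages in this thread take with the complexphrase parser and NGrams.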