RE: Returning results for multi-word search term
I used "copyField" to create a text version of the field that I wanted to search on, and I am now getting the results I was looking for. Thanks for all your help.

~~~
William Kevin Miller
ECS Federal, Inc.
USPS/MTSC
(405) 573-2158

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Tuesday, July 11, 2017 10:54 AM
To: solr-user
Subject: Re: Returning results for multi-word search term

The admin/analysis page is your friend here. Hover over the light gray abbreviations (like "ST") and you'll see which analysis chain component does the transformation (Standard Tokenizer in this case). I almost always turn off the "verbose" checkbox, BTW.

In general you only want string types for things where the entire field must be considered for matching. Do _not_ fall into the habit of indexing a string field and then searching for words with *something* as though it were a SQL query, as that's just horribly inefficient.

Best,
Erick
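The copyField fix described in this message can be sketched as a Solr Schema API payload. This is only an illustration under stated assumptions, not the poster's actual schema: the field names `title_s` and `title_txt` are hypothetical, and `text_general` is assumed to be available because it ships in the example schemas. The sketch only builds the JSON; posting it to the core's schema endpoint and reindexing are left out.

```python
import json


def copyfield_commands(source, dest, dest_type="text_general"):
    """Build a Schema API payload that adds a tokenized field and copies
    an existing string field into it.

    The field names passed in are hypothetical; "text_general" is the
    tokenized type found in Solr's example schemas.
    """
    return {
        # New tokenized field. stored=False because the original string
        # field already holds the value used for display.
        "add-field": {
            "name": dest,
            "type": dest_type,
            "indexed": True,
            "stored": False,
        },
        # Copy the raw source value into the new field at index time,
        # which is why existing documents have to be reindexed.
        "add-copy-field": {"source": source, "dest": [dest]},
    }


payload = copyfield_commands("title_s", "title_txt")
print(json.dumps(payload, indent=2))
```

With a schema change like this, the quoted phrase query would be directed at the tokenized copy (here `title_txt`) while the string field stays available for exact matching.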
Re: Returning results for multi-word search term
The admin/analysis page is your friend here. Hover over the light gray abbreviations (like "ST") and you'll see which analysis chain component does the transformation (Standard Tokenizer in this case). I almost always turn off the "verbose" checkbox, BTW.

In general you only want string types for things where the entire field must be considered for matching. Do _not_ fall into the habit of indexing a string field and then searching for words with *something* as though it were a SQL query, as that's just horribly inefficient.

Best,
Erick

On Tue, Jul 11, 2017 at 6:46 AM, Miller, William K - Norman, OK - Contractor <william.k.mil...@usps.gov.invalid> wrote:
> I do have my fields as strings not text, so I am going to play around with
> using the "text". If I continue to have problems, I will post the additional
> information you are requesting.
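The analysis screen mentioned above can also be reached over HTTP, which is handy for comparing field types side by side. A minimal sketch, assuming a local Solr at `http://localhost:8983/solr`, a hypothetical core named `mycore`, and that the field analysis handler is reachable at `/analysis/field` on that installation:

```python
from urllib.parse import urlencode


def analysis_url(base, core, field_type, index_text, query_text):
    """Build a URL for Solr's field analysis handler.

    base and core are placeholders for a local installation.
    """
    params = {
        "analysis.fieldtype": field_type,   # type whose analysis chain to run
        "analysis.fieldvalue": index_text,  # text as it would be indexed
        "analysis.query": query_text,       # text as it would be queried
        "wt": "json",
    }
    return f"{base}/{core}/analysis/field?{urlencode(params)}"


url = analysis_url(
    "http://localhost:8983/solr", "mycore",
    "string", "Paddle Arm Spring", "Paddle Arm",
)
print(url)
```

Running the same request with a text type instead of `string` should show the value broken into separate tokens rather than kept as one term.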
RE: Returning results for multi-word search term
I do have my fields as strings not text, so I am going to play around with using the "text". If I continue to have problems, I will post the additional information you are requesting.

~~~
William Kevin Miller
ECS Federal, Inc.
USPS/MTSC
(405) 573-2158

-----Original Message-----
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Tuesday, July 11, 2017 8:34 AM
To: solr-user@lucene.apache.org
Subject: Re: Returning results for multi-word search term

The answer to your question is going to be less about the query structure and more about the type of field you're using and any defined analysis for that type.

With a schema field type that is properly configured, the query you are trying to use (with quotes) should work, as long as it is being directed specifically to the correct field as Erick mentioned. Note that if you change your schema to make this work, you will need to entirely reindex.

Nearly any of the "text" field types included in the example schemas will do the job. The "string" or "strings" types in the examples will NOT work, because they do not break the text into multiple tokens (search terms). I think you're probably trying to search a field that is using one of the latter types.

Can you share your schema and the name of the field that you are querying? Your followup message tells us the version of Solr (6.5.1), so the most likely filename for the schema will be "managed-schema" with no extension. If there have been significant changes from an example in your solrconfig.xml file, it would be a good idea to share that too.

Thanks,
Shawn
Re: Returning results for multi-word search term
On 7/10/2017 1:02 PM, Miller, William K - Norman, OK - Contractor wrote:
> I am trying to return results when using a multi-word term. I am
> using "Paddle Arm" as my search term (including the quotes). I know
> that the field that I am querying against has these words together.
> If I run the query using Paddle* Arm* I get the following results, but
> I want to get only the last two. I have looked at Fuzzy Searches, but
> I don't feel that will work, and I have looked at Proximity Searches,
> but I get no results back with those whether I use 0, 1, or 10.
> How can I structure my query to get the last items in the list below?
>
> Paddle Assembly
> Paddle
> Paddle
> Paddle Pneumatic Piping
> Paddle
> Paddle Assembly
> Paddle
> Paddle Assembly
> Paddle to Bucket Offset Check
> Paddle to Bucket Wall
> Paddle to Bucket Offset
> Paddle
> Paddle Assembly Troubleshooting
> Paddle Assembly Troubleshooting
> Paddle Air Pressure
> Paddle Assembly
> Paddle
> Paddle Stop Adjustment
> Paddle Stop
> Paddle Assembly
> Paddle Assembly
> Paddle Vacuum Holes
> Paddle Position
> Paddle Detection Sensor Adjustment
> Paddle Assembly
> Paddle
> Paddle Assembly
> Paddle Stop
> Paddle Assembly
> Paddle Assembly
> Paddle
> Paddle Assembly
> Paddle Assembly
> Paddle Rotary Actuator
> Paddle Removal and Replacement
> Paddle Assembly
> Paddle Removal and Replacement
> Paddle Seal Removal and Replacement
> Paddle Location
> Paddle Location
> Paddle Removal Location
> Paddle/Belt Speed for Photoeye Inputs
> Paddle Arm Spring, Upper Paddle Arm, and Lower Paddle Arm
> Paddle Arm Spring, Upper Paddle Arm, and Lower Paddle Arm

The answer to your question is going to be less about the query structure and more about the type of field you're using and any defined analysis for that type.

With a schema field type that is properly configured, the query you are trying to use (with quotes) should work, as long as it is being directed specifically to the correct field as Erick mentioned. Note that if you change your schema to make this work, you will need to entirely reindex.

Nearly any of the "text" field types included in the example schemas will do the job. The "string" or "strings" types in the examples will NOT work, because they do not break the text into multiple tokens (search terms). I think you're probably trying to search a field that is using one of the latter types.

Can you share your schema and the name of the field that you are querying? Your followup message tells us the version of Solr (6.5.1), so the most likely filename for the schema will be "managed-schema" with no extension. If there have been significant changes from an example in your solrconfig.xml file, it would be a good idea to share that too.

Thanks,
Shawn
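The string-versus-text distinction above can be illustrated with a toy model. This is a deliberately simplified sketch and not how Lucene is implemented: a "string" field is modeled as indexing the whole value as one term, a "text" field as indexing lowercased words, and a phrase matches when its terms appear consecutively. The function names are made up for the illustration.

```python
def index_tokens(value, tokenized):
    """Return the terms a field would index in this toy model.

    tokenized=False mimics a "string" field (one term, the whole value);
    tokenized=True mimics a "text" field (lowercased words). Real Solr
    analysis chains do more than this.
    """
    if not tokenized:
        return [value]
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in value)
    return cleaned.split()


def phrase_matches(phrase, value, tokenized):
    """True if the phrase's terms appear consecutively in the field's terms."""
    terms = index_tokens(value, tokenized)
    wanted = index_tokens(phrase, tokenized)
    n = len(wanted)
    return any(terms[i:i + n] == wanted for i in range(len(terms) - n + 1))


title = "Paddle Arm Spring, Upper Paddle Arm, and Lower Paddle Arm"
print(phrase_matches("Paddle Arm", title, tokenized=False))             # False
print(phrase_matches("Paddle Arm", title, tokenized=True))              # True
print(phrase_matches("Paddle Arm", "Paddle Assembly", tokenized=True))  # False
```

In the untokenized case the only indexed term is the entire title, so the two-word phrase can only match a value that is exactly "Paddle Arm", which is consistent with the phrase query returning nothing on a string field.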
Re: Returning results for multi-word search term
Well, one issue is that Paddle* Arm* has an implicit OR between the terms. Try

+Paddle* +Arm*

That'll reduce the documents found, although it would find "Paddle robotic armature" (no such thing, just sayin').

Although another possibility is that you're really sending

some_field:Paddle* Arm*

which is parsed as

some_field:Paddle* default_search_field:Arm*

"Paddle Arm" should find the last two. I suspect you're using "string" type for the field you're searching against rather than a text-based field that tokenizes. You must show us the fieldType of the field and the results of adding &debug=query to the URL to have a hope of saying anything more.

And if you really need phrases and wildcards, see the Complex Phrase Query Parser here: https://lucene.apache.org/solr/guide/6_6/other-parsers.html. But before going there, I'd figure out what's up with not being able to search "Paddle Arm" as a phrase; it should certainly do what you're asking given the right field definition.

Best,
Erick

On Mon, Jul 10, 2017 at 12:10 PM, Miller, William K - Norman, OK - Contractor <william.k.mil...@usps.gov.invalid> wrote:
> I forgot to mention that I am using Solr 6.5.1 and I am indexing XML files.
> My Solr server is running on a Linux OS.
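The field-qualification pitfall and the debug output discussed in this reply can be checked with a request like the one built below. A minimal sketch, assuming a local Solr at `http://localhost:8983/solr`, a hypothetical core `mycore`, and a hypothetical tokenized field `title_txt`. Quoting the phrase keeps both words bound to the named field, and `debug=query` asks Solr to report how the query was parsed.

```python
from urllib.parse import urlencode


def select_url(base, core, field, phrase, debug=True):
    """Build a /select URL that runs a quoted phrase against one field."""
    # The quotes make this a phrase query, so both words apply to `field`
    # instead of the second word falling through to the default field.
    params = {"q": f'{field}:"{phrase}"', "wt": "json"}
    if debug:
        # debug=query includes the parsed query in the response
        params["debug"] = "query"
    return f"{base}/{core}/select?{urlencode(params)}"


url = select_url("http://localhost:8983/solr", "mycore", "title_txt", "Paddle Arm")
print(url)
```

The parsed query in the debug section of the response is the quickest way to confirm which field each term was actually searched against.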
RE: Returning results for multi-word search term
I forgot to mention that I am using Solr 6.5.1 and I am indexing XML files. My Solr server is running on a Linux OS.

~~~
William Kevin Miller
ECS Federal, Inc.
USPS/MTSC
(405) 573-2158
Returning results for multi-word search term
I am trying to return results when using a multi-word term. I am using "Paddle Arm" as my search term (including the quotes). I know that the field that I am querying against has these words together. If I run the query using Paddle* Arm* I get the following results, but I want to get only the last two. I have looked at Fuzzy Searches, but I don't feel that will work, and I have looked at Proximity Searches, but I get no results back with those whether I use 0, 1, or 10. How can I structure my query to get the last items in the list below?

Paddle Assembly
Paddle
Paddle
Paddle Pneumatic Piping
Paddle
Paddle Assembly
Paddle
Paddle Assembly
Paddle to Bucket Offset Check
Paddle to Bucket Wall
Paddle to Bucket Offset
Paddle
Paddle Assembly Troubleshooting
Paddle Assembly Troubleshooting
Paddle Air Pressure
Paddle Assembly
Paddle
Paddle Stop Adjustment
Paddle Stop
Paddle Assembly
Paddle Assembly
Paddle Vacuum Holes
Paddle Position
Paddle Detection Sensor Adjustment
Paddle Assembly
Paddle
Paddle Assembly
Paddle Stop
Paddle Assembly
Paddle Assembly
Paddle
Paddle Assembly
Paddle Assembly
Paddle Rotary Actuator
Paddle Removal and Replacement
Paddle Assembly
Paddle Removal and Replacement
Paddle Seal Removal and Replacement
Paddle Location
Paddle Location
Paddle Removal Location
Paddle/Belt Speed for Photoeye Inputs
Paddle Arm Spring, Upper Paddle Arm, and Lower Paddle Arm
Paddle Arm Spring, Upper Paddle Arm, and Lower Paddle Arm

~~~
William Kevin Miller
ECS Federal, Inc.
USPS/MTSC
(405) 573-2158