Re: Special suggestions requirement
Is there anything you cannot do with Solr? :-)

Thanks a lot, Erick! I only had to use . instead of ? as the wildcard, e.g.

...:8983/solr/terms?terms.fl=fieldname&terms.limit=100&terms.prefix=abcd&terms.regex.flag=case_insensitive&terms=true&terms.regex=abcd..

Adding terms.sort=index even lets me sort the way I need.

Thanks,
Alexander
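For readers who want to reproduce the request above, here is a minimal sketch that assembles the same parameters. The host localhost, the handler path /solr/terms, the field name fieldname and the helper build_terms_url are placeholders for illustration; the sketch also assumes the entered prefix is purely alphanumeric, so it needs no regex escaping.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_terms_url(base_url, field, prefix, extra_chars=2, limit=100):
    """Build a TermsComponent URL asking for terms of length len(prefix) + extra_chars."""
    params = [
        ("terms", "true"),
        ("terms.fl", field),
        ("terms.prefix", prefix),
        # One '.' per extra character; this relies on the regex being
        # applied to the whole term (an assumption of this sketch).
        ("terms.regex", prefix + "." * extra_chars),
        ("terms.regex.flag", "case_insensitive"),
        # Return terms in index (lexicographic) order rather than by count.
        ("terms.sort", "index"),
        ("terms.limit", str(limit)),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical host and handler path; adjust to the actual installation.
url = build_terms_url("http://localhost:8983/solr/terms", "fieldname", "abcd")
query = parse_qs(urlsplit(url).query)
print(url)
```

Using urlencode keeps the & separators and any escaping consistent, which is easy to get wrong when the URL is typed by hand.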
Re: Special suggestions requirement
Would it work to use TermsComponent with wildcards? Something like
terms.regex=ABCD42??...

see: http://wiki.apache.org/solr/TermsComponent/

Best
Erick
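A note on the wildcard syntax in the terms.regex suggestion: terms.regex takes a regular expression, where ? is a quantifier rather than a single-character wildcard, and . is the character that matches any one character. The sketch below uses Python's re module as a stand-in for the regex engine Solr uses, and assumes the pattern is applied to the whole term; the sample terms are made up.

```python
import re

# Hypothetical indexed terms (for example, edge n-grams of part numbers).
terms = ["ABCD42", "ABCD420", "ABCD4201", "ABCD4299", "ABCD4301", "ABCD420155"]


def matching_terms(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [t for t in candidates if re.fullmatch(pattern, t)]


# '2??' means "an optional 2, matched lazily", so no extra characters are consumed.
glob_style = matching_terms("ABCD42??", terms)

# Each '.' matches exactly one character, giving fixed-length results.
regex_style = matching_terms("ABCD42..", terms)

print(glob_style)
print(regex_style)
```

In this sketch the ?? pattern only matches the bare term ABCD42, while the .. pattern returns the eight-character terms that start with ABCD42.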
Re: Special suggestions requirement
I could be crazy, but it sounds to me like you need a trie, not a search index:
http://en.wikipedia.org/wiki/Trie

But in any case, what you want to do should be achievable. It seems like you need to do EdgeNgrams and facet on the results, where facet.counts > 1 to exclude the actual part numbers, since each of those would be distinct.

I'm on the train right now, so I can't test this. :\

Michael Della Bitta

Appinions | 18 East 41st St., Suite 1806 | New York, NY 10017
www.appinions.com
Where Influence Isn't a Game
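The edge n-gram plus facet idea can be simulated outside Solr to see what it would return. This is only an in-memory sketch, not Solr code: edge_ngrams and suggest are hypothetical helpers, the part numbers are made up, and a Counter stands in for facet counts (one count per part number that contains the gram).

```python
from collections import Counter


def edge_ngrams(value, min_len=1):
    """Return every leading substring of value, shortest first."""
    return [value[:n] for n in range(min_len, len(value) + 1)]


def suggest(part_numbers, prefix, extra_chars=2, min_count=2):
    """Count edge n-grams per part number and keep fixed-length grams under prefix."""
    counts = Counter()
    for pn in set(part_numbers):
        counts.update(edge_ngrams(pn))
    wanted = len(prefix) + extra_chars
    return sorted(
        gram
        for gram, count in counts.items()
        if gram.startswith(prefix) and len(gram) == wanted and count >= min_count
    )


parts = ["ABCD2000", "ABCD2001", "ABCD2100", "ABCD2101", "ABCD2200"]
shared = suggest(parts, "ABCD")
everything = suggest(parts, "ABCD", min_count=1)
print(shared)
print(everything)
```

In this sketch, requiring a count above 1 also drops ABCD22, because only one part number backs it. Filtering on the gram length already excludes the full part numbers, so a minimum count of 1 keeps every available combination.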
Re: Special suggestions requirement
In this case, we're storing the overall value length and sorting on that, then alphabetically.

Also, how are your queries fashioned? If you're doing a prefix query, everything that matches it should score the same. If you're only doing a prefix query, you might need to add a term for exact matches as well to get them to show up.

Michael Della Bitta

On Wed, Aug 1, 2012 at 9:58 PM, Lochschmied, Alexander <alexander.lochschm...@vishay.com> wrote:
> Is there a way to offer distinct, alphabetically sorted, fixed length options?
>
> I am trying to suggest part numbers and I'm currently trying to do it with the spellchecker component. Let's say ABCD was entered and we have indexed part numbers like
>
> ABCD
> ABCD2000
> ABCD2100
> ABCD2200
> ...
>
> I would like to have 2 characters suggested always, so for ABCD, it should suggest
>
> ABCD00
> ABCD20
> ABCD21
> ABCD22
> ...
>
> No smart sorting is needed, just alphabetical sorting. The problem is that, for example, 00 (or ABCD00) may not be suggested currently as it doesn't score high enough. But we are really trying to get all distinct values starting from the smallest (up to a certain number of suggestions).
>
> I was already looking at the custom comparator class option. But this would probably not work as I would need more information to implement it there (like at least the currently entered search term, ABCD in the example).
>
> Thanks,
> Alexander
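The length-then-alphabetical ordering Michael describes can be illustrated with a two-part sort key. This is a plain Python sketch over made-up values, not how the ordering would be configured in Solr, where the length would be stored in its own field and used as the first sort criterion.

```python
# Hypothetical suggestion values of mixed length.
values = ["ABCD2100", "ABCD", "ABCD22", "ABCD20", "ABCD2000"]

# Sort by overall length first, then alphabetically within each length.
ordered = sorted(values, key=lambda v: (len(v), v))

print(ordered)
```

Shorter values come first, so an exact match on the entered text sorts ahead of longer completions.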
Re: Special suggestions requirement
Even with a prefix query, I do not get ABCD02 or any ABCD02... back. BTW: EdgeNGramFilterFactory is used on the field we are getting the suggestions/spellchecks from.

I think the problem is that there are a lot of different part numbers starting with ABCD and every part number has the same length. I showed only 4 in the example but there might be thousands. Here are some full part number examples that might be in the index:

ABCD110040
ABCD00
ABCD99
ABCD155500
...

I'm looking for a way to make Solr return a distinct list of fixed-length substrings of them, e.g. if ABCD is entered, I would need

ABCD00
ABCD01
ABCD02
ABCD03
...
ABCD99

Then, if the user chose ABCD42 from the suggestions, I would need

ABCD4201
ABCD4202
ABCD4203
...
ABCD4299

and so on.

I would be able to do some post-processing if needed, or adjust the schema or indexing process. But the key functionality I need from Solr is returning a distinct set of those suggestions where only the last two characters change. All of the available combinations of those last two characters must be considered, though. I need to show alphanumerically sorted suggestions, the smallest value first.

Thanks,
Alexander
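To pin down the requirement, here is a small sketch of the desired output computed by post-processing a plain list of part numbers. It describes the expected result only, not a Solr feature; fixed_length_suggestions and the sample part numbers are hypothetical.

```python
def fixed_length_suggestions(part_numbers, entered, extra_chars=2, limit=None):
    """Return distinct, sorted prefixes that extend entered by extra_chars characters."""
    wanted = len(entered) + extra_chars
    distinct = {
        pn[:wanted]
        for pn in part_numbers
        if pn.startswith(entered) and len(pn) >= wanted
    }
    ordered = sorted(distinct)  # plain lexicographic order, smallest value first
    return ordered if limit is None else ordered[:limit]


parts = ["ABCD4202", "ABCD0001", "ABCD4201", "ABCD0002", "ABCD4299", "XYZ10000"]
first_step = fixed_length_suggestions(parts, "ABCD")
second_step = fixed_length_suggestions(parts, "ABCD42")
print(first_step)
print(second_step)
```

Entering ABCD yields the distinct six-character prefixes, and choosing ABCD42 from them yields the next level, which mirrors the step-by-step narrowing described above.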