Re: Solr's suggester results
Here is my problem statement; I would really appreciate your feedback.

1. Thousands of PDFs with a large amount of content are indexed in Solr.
2. I am using AnalyzingInfixSuggester for the suggestions.

Q. The SuggestComponent returns the entire content of the field in its suggestions. How can I make the suggester return only part of the field's content instead of the entire content, which in my scenario is quite long?

Thanks in advance.
PD
Re: Solr's suggester results
I'm using the FreeTextLookupFactory in my implementation now. Yes, it can now suggest part of the field, starting from the middle of the content.

I read that this implementation is able to consider the previous tokens when making suggestions. However, when I enter a search phrase, it seems to consider only the last token and none of the previous ones.

For example, when I search for http://localhost:8983/edm/collection1/suggest?suggest.q=trouble free, it gives me suggestions based on the word 'free' only, and not on 'trouble free'.

This is my configuration:

In solrconfig.xml:

FreeTextLookupFactory
suggester_freetext_dir
DocumentDictionaryFactory
Suggestion
suggestType
5
false
false
json
true
true
10
mySuggester
suggest

In schema.xml

Is there anything I configured wrongly? I've set ngrams to 5, which should mean it considers up to the previous 5 tokens entered?

Regards,
Edwin

On 17 June 2015 at 22:12, Alessandro Benedetti wrote:

> Edwin,
> Spellcheck is one thing, the suggester is another.
>
> If you need to provide auto-suggestions to your users, the suggester is
> the right thing to use.
> But I really doubt it is useful to select the entire content as the
> suggester field; it is going to be quite expensive.
>
> In that case I would again really suggest you take a look at the article
> I quoted and the general Solr documentation.
>
> It is possible to suggest part of the field.
> You can use the FreeText suggester with a proper analysis selected.
>
> Cheers
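To make the question above concrete, here is a toy n-gram sketch in Python of what "considering the previous tokens" could mean: the last typed token is completed using the longest known context of preceding tokens, falling back to shorter contexts when nothing matches. This is not Lucene's FreeTextSuggester code and does not explain the behaviour reported above; the function names, the sample sentences, and the back-off rule are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_ngram_model(sentences, max_n=3):
    """Count which token follows each context of 0..max_n-1 preceding tokens.

    Hypothetical helper for illustration only; not part of Solr or Lucene.
    """
    model = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            for n in range(max_n):
                if i - n < 0:
                    break
                context = tuple(tokens[i - n:i])
                model[context][token] += 1
    return model


def suggest(model, typed, max_n=3, limit=5):
    """Complete the last typed token, preferring the longest matching context."""
    tokens = typed.lower().split()
    if not tokens:
        return []
    *previous, prefix = tokens
    # Try the longest available context first, then back off to shorter ones.
    for n in range(min(len(previous), max_n - 1), -1, -1):
        context = tuple(previous[len(previous) - n:])
        candidates = [
            (token, count)
            for token, count in model.get(context, {}).items()
            if token.startswith(prefix)
        ]
        if candidates:
            candidates.sort(key=lambda pair: (-pair[1], pair[0]))
            return [" ".join(context + (token,)) for token, _ in candidates[:limit]]
    return []


# Made-up sample data, not taken from the thread.
model = build_ngram_model(
    ["trouble free operation", "free shipping", "free trial", "free trial"]
)
print(suggest(model, "trouble fr"))  # ['trouble free']
print(suggest(model, "cheap fr"))    # ['free'] (unknown context, so it backs off)
```

In this toy model, a suggester that only ever used the empty context would behave like the report above: 'trouble free' would be treated the same as 'free'.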
Re: Solr's suggester results
Edwin,
Spellcheck is one thing, the suggester is another.

If you need to provide auto-suggestions to your users, the suggester is the right thing to use.
But I really doubt it is useful to select the entire content as the suggester field; it is going to be quite expensive.

In that case I would again really suggest you take a look at the article I quoted and the general Solr documentation.

It is possible to suggest part of the field.
You can use the FreeText suggester with a proper analysis selected.

Cheers

2015-06-17 6:14 GMT+01:00 Zheng Lin Edwin Yeo:

> Yes, I've looked at that before, but I was told that the newer version of
> Solr has its own suggester and does not need to use the spellchecker anymore?
>
> So it's not necessary to use the spellchecker inside the suggester anymore?
>
> Regards,
> Edwin
Re: Solr's suggester results
Yes, I've looked at that before, but I was told that the newer version of Solr has its own suggester and does not need to use the spellchecker anymore?

So it's not necessary to use the spellchecker inside the suggester anymore?

Regards,
Edwin

On 17 June 2015 at 11:56, Erick Erickson wrote:

> Have you looked at spellchecker? Because that sounds much more like
> what you're asking about than suggester.
>
> Spell checking is more what you're asking for; have you even looked at
> that after it was suggested?
>
> bq: Also, when I do a search, it shouldn't be returning whole fields,
> but just return a portion of the sentence
>
> This is what highlighting is built for.
>
> Really, I recommend you take the time to familiarize yourself with the
> whole search space and Solr. The excellent book here:
>
> http://www.amazon.com/Solr-Action-Trey-Grainger/dp/1617291021/ref=sr_1_1?ie=UTF8&qid=1434513284&sr=8-1&keywords=apache+solr&pebp=1434513287267&perid=0YRK508J0HJ1N3BAX20E
>
> will give you the grounding you need to get the most out of Solr.
>
> Best,
> Erick
Re: Solr's suggester results
Have you looked at spellchecker? Because that sounds much more like what you're asking about than suggester.

Spell checking is more what you're asking for; have you even looked at that after it was suggested?

bq: Also, when I do a search, it shouldn't be returning whole fields, but just return a portion of the sentence

This is what highlighting is built for.

Really, I recommend you take the time to familiarize yourself with the whole search space and Solr. The excellent book here:

http://www.amazon.com/Solr-Action-Trey-Grainger/dp/1617291021/ref=sr_1_1?ie=UTF8&qid=1434513284&sr=8-1&keywords=apache+solr&pebp=1434513287267&perid=0YRK508J0HJ1N3BAX20E

will give you the grounding you need to get the most out of Solr.

Best,
Erick

On Tue, Jun 16, 2015 at 8:27 PM, Zheng Lin Edwin Yeo wrote:

> The long content comes from when I tried to index PDF files. As some PDF
> files have a lot of words in the content, it leads to the "UTF8 encoding
> is longer than the max length 32766" error.
>
> I think the problem is that the content size of the PDF file exceeds
> 32766 characters?
>
> What I'm trying to accomplish is to index documents of any size (even
> those with very large contents) and build the suggester from there.
> Also, when I do a search, it shouldn't be returning whole fields, but
> just return a portion of the sentence.
>
> Regards,
> Edwin
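As a minimal sketch of the highlighting suggestion above, the Python snippet below builds a search URL that asks Solr for short highlighted snippets instead of whole field values. `hl`, `hl.fl`, `hl.snippets` and `hl.fragsize` are standard Solr highlighting parameters; the `/select` handler path, the `content` field name and the helper function are assumptions that are not taken from this thread.

```python
from urllib.parse import urlencode


def highlight_query_url(base_url, query, field, snippets=1, fragsize=100):
    """Build a Solr query URL that requests highlighted fragments of `field`.

    Hypothetical helper; base_url and field must match your own setup.
    """
    params = {
        "q": query,
        "wt": "json",
        "hl": "true",             # turn highlighting on
        "hl.fl": field,           # field(s) to take snippets from
        "hl.snippets": snippets,  # maximum number of snippets per field
        "hl.fragsize": fragsize,  # approximate snippet size in characters
    }
    return f"{base_url}?{urlencode(params)}"


# Assumed handler path and field name, for illustration only.
url = highlight_query_url(
    "http://localhost:8983/edm/collection1/select", "trouble free", "content"
)
print(url)
```

The snippets then come back in the `highlighting` section of the response, keyed by document id, while the suggester is left to deal with short values only.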
Re: Solr's suggester results
The long content comes from when I tried to index PDF files. As some PDF files have a lot of words in the content, it leads to the "UTF8 encoding is longer than the max length 32766" error.

I think the problem is that the content size of the PDF file exceeds 32766 characters?

What I'm trying to accomplish is to index documents of any size (even those with very large contents) and build the suggester from there. Also, when I do a search, it shouldn't be returning whole fields, but just return a portion of the sentence.

Regards,
Edwin

On 16 June 2015 at 23:02, Erick Erickson wrote:

> The suggesters are built to return whole fields. You _might_
> be able to add multiple fragments to a multiValued
> entry and get fragments; I haven't tried that, though,
> and I suspect that you'd actually get the same thing.
>
> This is an XY problem IMO. Please describe exactly what
> you're trying to accomplish, with examples, rather than
> continuing to pursue this path. It sounds like you want
> spellcheck or similar. The _point_ behind the
> suggesters is that they handle multiple-word suggestions
> by returning the whole field. So putting long text fields
> into them is not going to work.
>
> Best,
> Erick
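The error text quoted above speaks of the UTF8 encoding of a single term, so the limit is counted in bytes per term rather than in characters per document. A small Python sketch of that distinction follows; the constant is the number from the error message, while the helper names and sample strings are made up for illustration.

```python
MAX_TERM_BYTES = 32766  # limit quoted in the error message above


def utf8_length(term):
    """Length of a term in bytes once encoded as UTF-8."""
    return len(term.encode("utf-8"))


def oversized_terms(terms, limit=MAX_TERM_BYTES):
    """Return the terms whose UTF-8 encoding exceeds the limit."""
    return [term for term in terms if utf8_length(term) > limit]


# 20000 accented characters are fewer than 32766 characters,
# but each one needs two bytes in UTF-8.
accented = "\u00e9" * 20000
print(len(accented), utf8_length(accented))  # 20000 40000

# A long text kept as one single term is too big, while the same text
# split on whitespace yields only small terms.
document = "word " * 10000
print(len(oversized_terms([document])))        # 1
print(len(oversized_terms(document.split())))  # 0
```

Under that reading, a field whose analysis keeps the whole document as one term hits the limit no matter how the document is sized in characters, while a tokenized field does not.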
Re: Solr's suggester results
The suggesters are built to return whole fields. You _might_ be able to add multiple fragments to a multiValued entry and get fragments; I haven't tried that, though, and I suspect that you'd actually get the same thing.

This is an XY problem IMO. Please describe exactly what you're trying to accomplish, with examples, rather than continuing to pursue this path. It sounds like you want spellcheck or similar. The _point_ behind the suggesters is that they handle multiple-word suggestions by returning the whole field. So putting long text fields into them is not going to work.

Best,
Erick

On Tue, Jun 16, 2015 at 1:46 AM, Alessandro Benedetti wrote:

> in line:
>
> 2015-06-16 4:43 GMT+01:00 Zheng Lin Edwin Yeo:
>
>> Thanks Benedetti,
>>
>> I've changed to the AnalyzingInfixLookup approach, and it is able to
>> start searching from the middle of the field.
>>
>> However, is it possible to make the suggester show only part of the
>> content of the field (like 2 or 3 fields after), instead of the entire
>> content/sentence, which can be quite long?
>
> I assume you use "fields" in place of tokens.
> The answer is yes. I already said that in my previous mail; I invite you
> to read the answers and the linked documentation carefully!
>
> Regarding the excessive dimensions of the tokens: this is weird, what
> are you trying to autocomplete?
> I really doubt it would be useful for a user to see super long
> autocompleted terms.
>
> Cheers
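The untested multiValued idea mentioned above would need the long text cut into short fragments before indexing. The Python sketch below shows one hypothetical way to do that on the client side; the sentence-splitting regex, the `max_tokens` value and the function name are assumptions, and whether the suggester would then return individual fragments is exactly what Erick says he has not verified.

```python
import re


def split_into_fragments(text, max_tokens=8):
    """Split text into sentence-based fragments of at most max_tokens tokens.

    Hypothetical pre-processing step; each fragment would become one value
    of a multiValued field.
    """
    fragments = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        tokens = sentence.split()
        for start in range(0, len(tokens), max_tokens):
            fragments.append(" ".join(tokens[start:start + max_tokens]))
    return fragments


text = (
    "This is a testing rich text documents to test the uploading of files "
    "to Solr. Second sentence here."
)
for fragment in split_into_fragments(text, max_tokens=5):
    print(fragment)
```

With `max_tokens=5` the sample text is cut into four fragments, none longer than five tokens.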
Re: Solr's suggester results
in line:

2015-06-16 4:43 GMT+01:00 Zheng Lin Edwin Yeo:

> Thanks Benedetti,
>
> I've changed to the AnalyzingInfixLookup approach, and it is able to start
> searching from the middle of the field.
>
> However, is it possible to make the suggester show only part of the
> content of the field (like 2 or 3 fields after), instead of the entire
> content/sentence, which can be quite long?

I assume you use "fields" in place of tokens.
The answer is yes. I already said that in my previous mail; I invite you to read the answers and the linked documentation carefully!

Regarding the excessive dimensions of the tokens: this is weird, what are you trying to autocomplete?
I really doubt it would be useful for a user to see super long autocompleted terms.

Cheers

> Regards,
> Edwin
>
> On 15 June 2015 at 17:33, Alessandro Benedetti wrote:
>
> > ehehe Edwin, I think you should read again the document I linked some
> > time ago:
> >
> > http://lucidworks.com/blog/solr-suggester/
> >
> > The suggester you used is not meant to provide infix suggestions.
> > The fuzzy suggester works on a fuzzy basis, with the *starting* terms
> > of a field's content.
> >
> > What you are looking for is actually one of the infix suggesters,
> > for example the AnalyzingInfixLookup approach.
> >
> > When working with suggesters it is important first to make a distinction:
> >
> > 1) Returning the full content of the field (AnalyzingInfix or Fuzzy)
> > 2) Returning token(s) (Free Text Suggester)
> >
> > Then the second difference is:
> >
> > 1) Infix suggestions (from the "middle" of the field content)
> > 2) Classic suggester (from the beginning of the field content)
> >
> > With that clarified, it will be quite simple to work with suggesters.
> >
> > Cheers
> >
> > 2015-06-15 9:28 GMT+01:00 Zheng Lin Edwin Yeo:
> >
> > > I've indexed a rich-text document with the following content:
> > >
> > > This is a testing rich text documents to test the uploading of files to
> > > Solr
> > >
> > > When I tried to use the suggester, it returned the entire content of
> > > the field once I entered suggest?q=t. However, when I tried to search
> > > for q='rich', I didn't get any results.
> > >
> > > This is my current configuration for the suggester:
> > >
> > > mySuggester
> > > FuzzyLookupFactory
> > > DocumentDictionaryFactory
> > > Suggestion
> > > suggestType
> > > true
> > > false
> > >
> > > startup="lazy" >
> > >
> > > json
> > > true
> > >
> > > true
> > > 10
> > > mySuggester
> > >
> > > suggest
> > >
> > > Is it possible to allow the suggester to return something even from the
> > > middle of the sentence, and also not to return the entire sentence if
> > > the sentence is long? Perhaps it should just suggest the next 2 or 3
> > > fields, and return more fields as the user types.
> > >
> > > For example,
> > > When the user types 'this', it should return 'This is a testing'
> > > When the user types 'this is a testing', it should return 'This is a
> > > testing rich text documents'.
> > >
> > > Regards,
> > > Edwin

--
Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
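The behaviour asked for in the quoted example ('this' should give 'This is a testing', and typing more should reveal more) can be written down as a toy Python function. It only illustrates the requested output: it is not how any Solr suggester works internally, and the function name and the `extra=3` default are assumptions chosen so that the thread's two examples come out as stated.

```python
def suggest_next_tokens(content, typed, extra=3):
    """Return the typed phrase as found in content plus the next `extra` tokens.

    Matching is case-insensitive, may start anywhere in the content (infix),
    and treats the last typed token as a prefix. Toy illustration only.
    """
    tokens = content.split()
    lowered = [token.lower() for token in tokens]
    query = typed.lower().split()
    if not query:
        return None
    for start in range(len(tokens) - len(query) + 1):
        window = lowered[start:start + len(query)]
        # Earlier tokens must match exactly; the last one may be incomplete.
        if window[:-1] == query[:-1] and window[-1].startswith(query[-1]):
            end = start + len(query) + extra
            return " ".join(tokens[start:end])
    return None


content = "This is a testing rich text documents to test the uploading of files to Solr"
print(suggest_next_tokens(content, "this"))               # This is a testing
print(suggest_next_tokens(content, "this is a testing"))  # This is a testing rich text documents
print(suggest_next_tokens(content, "rich"))               # rich text documents to
```

The third call shows the infix case from the distinction above: the match starts in the middle of the field content, and only a few following tokens are returned instead of the whole field.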
Re: Solr's suggester results
Also, is there a way to overcome the long content problem? I'm getting this error when I've indexed large rich-text documents and tried to build the suggester.

{
  "responseHeader":{
    "status":500,
    "QTime":47},
  "error":{
    "msg":"Document contains at least one immense term in field=\"exacttext\" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[32, 10, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32]...', original message: bytes can be at most 32766 in length; got 139402",
    "trace":"java.lang.IllegalArgumentException: Document contains at least one immense term in field=\"exacttext\" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[32, 10, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32, 10, 32, 32]...', original message: bytes can be at most 32766 in length; got 139402\r\n\tat org.apache.lucene.index.DefaultIndexingChain$PerField.invert(DefaultIndexingChain.java:667)\r\n\tat org.apache.lucene.index.DefaultIndexingChain.processField(DefaultIndexingChain.java:344)\r\n\tat org.apache.lucene.index.DefaultIndexingChain.processDocument(DefaultIndexingChain.java:300)\r\n\tat org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:232)\r\n\tat org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:458)\r\n\tat org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1350)\r\n\tat org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1138)\r\n\tat org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.add(AnalyzingInfixSuggester.java:381)\r\n\tat 
org.apache.lucene.search.suggest.analyzing.AnalyzingInfixSuggester.build(AnalyzingInfixSuggester.java:310)\r\n\tat org.apache.lucene.search.suggest.Lookup.build(Lookup.java:193)\r\n\tat org.apache.solr.spelling.suggest.SolrSuggester.build(SolrSuggester.java:163)\r\n\tat org.apache.solr.handler.component.SuggestComponent.prepare(SuggestComponent.java:179)\r\n\tat org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:196)\r\n\tat org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)\r\n\tat org.apache.solr.core.SolrCore.execute(SolrCore.java:1984)\r\n\tat org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:829)\r\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:446)\r\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:220)\r\n\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\r\n\tat org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82)\r\n\tat org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294)\r\n\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\r\n\tat org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82)\r\n\tat org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294)\r\n\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)\r\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)\r\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)\r\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)\r\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)\r\n\tat 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)\r\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)\r\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)\r\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)\r\n\tat org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)\r\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)\r\n\tat org.eclipse.jetty.server.Server.handle(Server.java:368)\r\n\tat org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)\r\n\tat org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)\r\n\tat org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)\r\n\tat org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(Abstra
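The error above reports one term of 139402 bytes against a limit of 32766 bytes, and the byte prefix it prints (32 and 10) is nothing but spaces and line feeds. That suggests the whole extracted document text, leading whitespace included, is being handed over as a single term in the field the error names, "exacttext". A minimal sketch of a client-side pre-processing step, assuming the text can be cleaned before it is sent to Solr: it collapses whitespace runs, splits the text into sentence-like pieces, and flags any piece that would still exceed the limit. The function names and the sentence-splitting regex are illustrative assumptions, not part of Solr or Lucene.

```python
import re
from typing import List

# Limit quoted in the error message above (UTF-8 bytes per term).
MAX_TERM_BYTES = 32766


def utf8_len(text: str) -> int:
    """Length of the text in UTF-8 bytes, the unit the error message uses."""
    return len(text.encode("utf-8"))


def split_for_suggester(content: str) -> List[str]:
    """Collapse whitespace runs, then split into sentence-like pieces."""
    normalized = re.sub(r"\s+", " ", content).strip()
    if not normalized:
        return []
    # Naive split after '.', '!' or '?' followed by whitespace (an assumption).
    return [p for p in re.split(r"(?<=[.!?])\s+", normalized) if p]


def oversized(pieces: List[str]) -> List[str]:
    """Pieces that would still be rejected as immense terms."""
    return [p for p in pieces if utf8_len(p) > MAX_TERM_BYTES]


# The prefix printed in the error decodes to whitespace only.
ERROR_PREFIX = bytes([32, 10, 32, 10, 32, 32, 10, 32, 32, 10,
                      32, 32, 10, 32, 32, 10, 32, 32, 10, 32,
                      32, 10, 32, 32, 10, 32, 32, 10, 32, 32]).decode("utf-8")

if __name__ == "__main__":
    sample = " \n \n  \nThis is a test. Another sentence!\n\n"
    print(split_for_suggester(sample))
    print(repr(ERROR_PREFIX.strip()))
```

Shorter pieces also address the earlier concern in this thread that suggesting the entire content of a field is expensive and not very useful to the user.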
Re: Solr's suggester results
Thanks Benedetti,

I've changed to the AnalyzingInfixLookup approach, and it is able to start searching from the middle of the field.

However, is it possible to make the suggester show only part of the content of the field (like 2 or 3 fields after), instead of the entire content/sentence, which can be quite long?

Regards,
Edwin

On 15 June 2015 at 17:33, Alessandro Benedetti wrote:

> What you are looking for is actually one of the Infix Suggesters.
> For example the AnalyzingInfixLookup approach.
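The behaviour asked for in this thread (typing 'this' should give 'This is a testing', and typing more should reveal more) can be sketched outside Solr as a small function over whitespace tokens. This is only a toy illustration of the desired output, not how any Solr suggester is implemented; the function name `suggest_next_tokens` and the `extra` parameter are hypothetical.

```python
from typing import Optional


def suggest_next_tokens(content: str, typed: str, extra: int = 3) -> Optional[str]:
    """Return the typed phrase as found in content plus the next `extra` tokens.

    All typed tokens except the last must match a token exactly
    (case-insensitive); the last one may be a prefix of a token,
    since the user may still be typing it.
    """
    tokens = content.split()
    typed_tokens = typed.lower().split()
    if not typed_tokens:
        return None
    n = len(typed_tokens)
    lowered = [t.lower() for t in tokens]
    for i in range(len(tokens) - n + 1):
        window = lowered[i:i + n]
        if window[:-1] == typed_tokens[:-1] and window[-1].startswith(typed_tokens[-1]):
            return " ".join(tokens[i:i + n + extra])
    return None


if __name__ == "__main__":
    doc = ("This is a testing rich text documents to test "
           "the uploading of files to Solr")
    print(suggest_next_tokens(doc, "this"))               # This is a testing
    print(suggest_next_tokens(doc, "this is a testing"))  # This is a testing rich text documents
    print(suggest_next_tokens(doc, "rich"))               # rich text documents to
```

Matching from the middle of the content ('rich') and returning only a few tokens are handled together here, which are the two requirements raised in the question.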
Re: Solr's suggester results
ehehe Edwin, I think you should read again the document I linked some time ago:

http://lucidworks.com/blog/solr-suggester/

The suggester you used is not meant to provide infix suggestions. The fuzzy suggester works on a fuzzy basis, matching against the *starting* terms of the field content.

What you are looking for is actually one of the infix suggesters, for example the AnalyzingInfixLookup approach.

When working with suggesters, it is important to first make a distinction:

1) Returning the full content of the field (AnalyzingInfix or Fuzzy)
2) Returning token(s) (Free Text Suggester)

Then the second difference is:

1) Infix suggestions (from the "middle" of the field content)
2) Classic suggester (from the beginning of the field content)

Once that is clear, it will be quite simple to work with suggesters.

Cheers

2015-06-15 9:28 GMT+01:00 Zheng Lin Edwin Yeo :

> I've indexed a rich-text documents with the following content:
>
> This is a testing rich text documents to test the uploading of files to
> Solr
>
> When I tried to use the suggestion, it return me the entire field in the
> content once I enter suggest?q=t. However, when I tried to search for
> q='rich', I don't get any results returned.
>
> This is my current configuration for the suggester:
>
> mySuggester
> FuzzyLookupFactory
> DocumentDictionaryFactory
> Suggestion
> suggestType
> true
> false
>
> json
> true
>
> true
> 10
> mySuggester
>
> suggest
>
> Is it possible to allow the suggester to return something even from the
> middle of the sentence, and also not to return the entire sentence if the
> sentence. Perhaps it should just suggest the next 2 or 3 fields, and to
> return more fields as the users type.
>
> For example,
> When user type 'this', it should return 'This is a testing'
> When user type 'this is a testing', it should return 'This is a testing
> rich text documents'.
>
> Regards,
> Edwin

--
Benedetti Alessandro
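The two distinctions drawn in this message (what is returned, and where matching starts) can be illustrated with a toy model in plain Python. It is a sketch under simplifying assumptions, namely whitespace tokens and case-insensitive prefix comparison, and it does not reproduce the behaviour of any Solr lookup implementation; all three function names are hypothetical.

```python
from typing import List


def classic_prefix(fields: List[str], typed: str) -> List[str]:
    """Full field content, matched only from the beginning of the field."""
    q = typed.lower()
    return [f for f in fields if f.lower().startswith(q)]


def infix(fields: List[str], typed: str) -> List[str]:
    """Full field content, matched at the start of any token in the field."""
    q = typed.lower()
    return [f for f in fields
            if any(tok.lower().startswith(q) for tok in f.split())]


def token_suggestions(fields: List[str], typed: str) -> List[str]:
    """Matching tokens only (deduplicated, in order of first appearance)."""
    q = typed.lower()
    seen: List[str] = []
    for f in fields:
        for tok in f.split():
            if tok.lower().startswith(q) and tok not in seen:
                seen.append(tok)
    return seen


if __name__ == "__main__":
    fields = ["This is a testing rich text documents to test "
              "the uploading of files to Solr"]
    print(classic_prefix(fields, "t"))      # the whole field
    print(classic_prefix(fields, "rich"))   # []
    print(infix(fields, "rich"))            # the whole field
    print(token_suggestions(fields, "te"))  # ['testing', 'text', 'test']
```

In this model, 't' matches from the beginning of the field while 'rich' only matches once infix matching is used, which mirrors the behaviour reported in the quoted question.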