I'm using the FreeTextLookupFactory in my implementation now, and yes, it can now suggest part of the field, starting from the middle of the content.
I read that this implementation is able to consider the previous tokens when making suggestions. However, when I enter a search phrase, it seems to consider only the last token and none of the previous ones. For example, when I query http://localhost:8983/edm/collection1/suggest?suggest.q=trouble free, the suggestions are based on the word 'free' only, and not on 'trouble free'.

This is my configuration.

In solrconfig.xml:

  <searchComponent name="suggest" class="solr.SuggestComponent">
    <lst name="suggester">
      <str name="lookupImpl">FreeTextLookupFactory</str>
      <str name="indexPath">suggester_freetext_dir</str>
      <str name="dictionaryImpl">DocumentDictionaryFactory</str>
      <str name="field">Suggestion</str>
      <str name="suggestFreeTextAnalyzerFieldType">suggestType</str>
      <str name="ngrams">5</str>
      <str name="buildOnStartup">false</str>
      <str name="buildOnCommit">false</str>
    </lst>
  </searchComponent>

  <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy" >
    <lst name="defaults">
      <str name="wt">json</str>
      <str name="indent">true</str>
      <str name="suggest">true</str>
      <str name="suggest.count">10</str>
      <str name="suggest.dictionary">mySuggester</str>
    </lst>
    <arr name="components">
      <str>suggest</str>
    </arr>
  </requestHandler>

In schema.xml:

  <fieldType name="suggestType" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[^a-zA-Z0-9]" replacement=" " />
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.ShingleFilterFactory" maxShingleSize="5" outputUnigrams="true"/>
      <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    </analyzer>
  </fieldType>

Is there anything I configured wrongly? I've set ngrams to 5, which I understand to mean that it should consider up to the previous 5 tokens entered?
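For comparison, here is a variant of the configuration above that might be worth testing. It rests on two assumptions rather than a confirmed diagnosis. First, as far as the Lucene FreeTextSuggester Javadoc describes it, the suggester builds its own n-gram model from the analyzed tokens and predicts from the last ngrams-1 tokens of the request, so a ShingleFilterFactory inside the analyzer field type may get in the way of that. Second, the handler asks for suggest.dictionary=mySuggester, while the suggester definition has no name entry. The field type name suggestTypeNoShingle is hypothetical; everything else is copied from the configuration above.

```xml
<!-- solrconfig.xml: same suggester as above, with an explicit name so that
     suggest.dictionary=mySuggester has something to resolve to -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">FreeTextLookupFactory</str>
    <str name="indexPath">suggester_freetext_dir</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">Suggestion</str>
    <!-- hypothetical field type name, defined in the schema.xml fragment below -->
    <str name="suggestFreeTextAnalyzerFieldType">suggestTypeNoShingle</str>
    <str name="ngrams">5</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>

<!-- schema.xml: same analysis chain as suggestType, minus the shingle filter
     (assumption: the suggester does the n-gram grouping on its own) -->
<fieldType name="suggestTypeNoShingle" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[^a-zA-Z0-9]" replacement=" " />
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
  </analyzer>
</fieldType>
```

Since buildOnStartup and buildOnCommit are both false, the suggester would need to be rebuilt after such a change, for example by sending one request with suggest.build=true, before comparing the results for 'trouble free'.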
Regards,
Edwin

On 17 June 2015 at 22:12, Alessandro Benedetti <benedetti.ale...@gmail.com> wrote:

> Edwin,
> Spellcheck is one thing, the Suggester is another.
>
> If you need to provide auto-suggestions to your users, the suggester is the
> right thing to use.
> But I really doubt it is useful to select the entire content as the
> suggester field; it is going to be quite expensive.
>
> In that case I would again really suggest you take a look at the article
> I quoted and the general Solr documentation.
>
> It is possible to suggest part of the field.
> You can use the FreeText suggester with a proper analysis selected.
>
> Cheers
>
> 2015-06-17 6:14 GMT+01:00 Zheng Lin Edwin Yeo <edwinye...@gmail.com>:
>
> > Yes, I've looked at that before, but I was told that the newer version of
> > Solr has its own suggester and does not need to use the spellchecker
> > anymore?
> >
> > So it's not necessary to use the spellchecker inside the suggester anymore?
> >
> > Regards,
> > Edwin
> >
> > On 17 June 2015 at 11:56, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> > > Have you looked at spellchecker? Because that sounds much more like
> > > what you're asking about than suggester.
> > >
> > > Spell checking is more what you're asking for, have you even looked at
> > > that after it was suggested?
> > >
> > > bq: Also, when I do a search, it shouldn't be returning whole fields,
> > > but just to return a portion of the sentence
> > >
> > > This is what highlighting is built for.
> > >
> > > Really, I recommend you take the time to do some familiarization with the
> > > whole search space and Solr. The excellent book here:
> > >
> > > http://www.amazon.com/Solr-Action-Trey-Grainger/dp/1617291021/ref=sr_1_1?ie=UTF8&qid=1434513284&sr=8-1&keywords=apache+solr&pebp=1434513287267&perid=0YRK508J0HJ1N3BAX20E
> > >
> > > will give you the grounding you need to get the most out of Solr.
> > >
> > > Best,
> > > Erick
> > >
> > > On Tue, Jun 16, 2015 at 8:27 PM, Zheng Lin Edwin Yeo
> > > <edwinye...@gmail.com> wrote:
> > > > The long content is from when I tried to index PDF files. As some PDF
> > > > files have a lot of words in the content, it leads to the *UTF8 encoding
> > > > is longer than the max length 32766* error.
> > > >
> > > > I think the problem is that the content size of the PDF file exceeds
> > > > 32766 characters?
> > > >
> > > > What I'm trying to accomplish is to index documents of any size (even
> > > > those with very large contents), and build the suggester from there.
> > > > Also, when I do a search, it shouldn't be returning whole fields,
> > > > but just to return a portion of the sentence.
> > > >
> > > > Regards,
> > > > Edwin
> > > >
> > > > On 16 June 2015 at 23:02, Erick Erickson <erickerick...@gmail.com> wrote:
> > > >
> > > >> The suggesters are built to return whole fields. You _might_
> > > >> be able to add multiple fragments to a multiValued
> > > >> entry and get fragments, I haven't tried that though
> > > >> and I suspect that actually you'd get the same thing.
> > > >>
> > > >> This is an XY problem IMO. Please describe exactly what
> > > >> you're trying to accomplish, with examples, rather than
> > > >> continue to pursue this path. It sounds like you want
> > > >> spellcheck or similar. The _point_ behind the
> > > >> suggesters is that they handle multiple-word suggestions
> > > >> by returning the whole field. So putting long text fields
> > > >> into them is not going to work.
> > > >>
> > > >> Best,
> > > >> Erick
> > > >>
> > > >> On Tue, Jun 16, 2015 at 1:46 AM, Alessandro Benedetti
> > > >> <benedetti.ale...@gmail.com> wrote:
> > > >> > in line:
> > > >> >
> > > >> > 2015-06-16 4:43 GMT+01:00 Zheng Lin Edwin Yeo <edwinye...@gmail.com>:
> > > >> >
> > > >> >> Thanks Benedetti,
> > > >> >>
> > > >> >> I've changed to the AnalyzingInfixLookup approach, and it is able to
> > > >> >> start searching from the middle of the field.
> > > >> >>
> > > >> >> However, is it possible to make the suggester show only part of the
> > > >> >> content of the field (like 2 or 3 fields after), instead of the entire
> > > >> >> content/sentence, which can be quite long?
> > > >> >>
> > > >> >
> > > >> > I assume you use "fields" in the place of tokens.
> > > >> > The answer is yes. I already said that in my previous mail; I invite you
> > > >> > to read carefully the answers and the documentation linked!
> > > >> >
> > > >> > Regarding the excessive size of the tokens: this is weird, what are you
> > > >> > trying to autocomplete?
> > > >> > I really doubt it would be useful for a user to see super long
> > > >> > autocompleted terms.
> > > >> >
> > > >> > Cheers
> > > >> >
> > > >> >> Regards,
> > > >> >> Edwin
> > > >> >>
> > > >> >> On 15 June 2015 at 17:33, Alessandro Benedetti <benedetti.ale...@gmail.com>
> > > >> >> wrote:
> > > >> >>
> > > >> >> > ehehe Edwin, I think you should read again the document I linked some
> > > >> >> > time ago:
> > > >> >> >
> > > >> >> > http://lucidworks.com/blog/solr-suggester/
> > > >> >> >
> > > >> >> > The suggester you used is not meant to provide infix suggestions.
> > > >> >> > The fuzzy suggester works on a fuzzy basis, with the *starting* terms
> > > >> >> > of a field's content.
> > > >> >> >
> > > >> >> > What you are looking for is actually one of the Infix Suggesters,
> > > >> >> > for example the AnalyzingInfixLookup approach.
> > > >> >> >
> > > >> >> > When working with Suggesters it is important first to make a
> > > >> >> > distinction:
> > > >> >> >
> > > >> >> > 1) Returning the full content of the field (AnalyzingInfix or Fuzzy)
> > > >> >> >
> > > >> >> > 2) Returning token(s) (Free Text Suggester)
> > > >> >> >
> > > >> >> > Then the second difference is:
> > > >> >> >
> > > >> >> > 1) Infix suggestions (from the "middle" of the field content)
> > > >> >> > 2) Classic suggester (from the beginning of the field content)
> > > >> >> >
> > > >> >> > Once that is clarified, it will be quite simple to work with suggesters.
> > > >> >> >
> > > >> >> > Cheers
> > > >> >> >
> > > >> >> > 2015-06-15 9:28 GMT+01:00 Zheng Lin Edwin Yeo <edwinye...@gmail.com>:
> > > >> >> >
> > > >> >> > > I've indexed a rich-text document with the following content:
> > > >> >> > >
> > > >> >> > > This is a testing rich text documents to test the uploading of
> > > >> >> > > files to Solr
> > > >> >> > >
> > > >> >> > > When I tried to use the suggester, it returned the entire field
> > > >> >> > > content once I entered suggest?q=t. However, when I tried to search
> > > >> >> > > for q='rich', I didn't get any results returned.
> > > >> >> > >
> > > >> >> > > This is my current configuration for the suggester:
> > > >> >> > >
> > > >> >> > > <searchComponent name="suggest" class="solr.SuggestComponent">
> > > >> >> > >   <lst name="suggester">
> > > >> >> > >     <str name="name">mySuggester</str>
> > > >> >> > >     <str name="lookupImpl">FuzzyLookupFactory</str>
> > > >> >> > >     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
> > > >> >> > >     <str name="field">Suggestion</str>
> > > >> >> > >     <str name="suggestAnalyzerFieldType">suggestType</str>
> > > >> >> > >     <str name="buildOnStartup">true</str>
> > > >> >> > >     <str name="buildOnCommit">false</str>
> > > >> >> > >   </lst>
> > > >> >> > > </searchComponent>
> > > >> >> > >
> > > >> >> > > <requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy" >
> > > >> >> > >   <lst name="defaults">
> > > >> >> > >     <str name="wt">json</str>
> > > >> >> > >     <str name="indent">true</str>
> > > >> >> > >
> > > >> >> > >     <str name="suggest">true</str>
> > > >> >> > >     <str name="suggest.count">10</str>
> > > >> >> > >     <str name="suggest.dictionary">mySuggester</str>
> > > >> >> > >   </lst>
> > > >> >> > >   <arr name="components">
> > > >> >> > >     <str>suggest</str>
> > > >> >> > >   </arr>
> > > >> >> > > </requestHandler>
> > > >> >> > >
> > > >> >> > > Is it possible to allow the suggester to return something even from
> > > >> >> > > the middle of the sentence, and also not to return the entire
> > > >> >> > > sentence if the sentence is long? Perhaps it should just suggest the
> > > >> >> > > next 2 or 3 fields, and return more fields as the user types.
> > > >> >> > >
> > > >> >> > > For example:
> > > >> >> > > When the user types 'this', it should return 'This is a testing'.
> > > >> >> > > When the user types 'this is a testing', it should return 'This is a
> > > >> >> > > testing rich text documents'.
> > > >> >> > >
> > > >> >> > > Regards,
> > > >> >> > > Edwin
> > > >> >> > >
> > > >> >> >
> > > >> >> > --
> > > >> >> > --------------------------
> > > >> >> >
> > > >> >> > Benedetti Alessandro
> > > >> >> > Visiting card : http://about.me/alessandro_benedetti
> > > >> >> >
> > > >> >> > "Tyger, tyger burning bright
> > > >> >> > In the forests of the night,
> > > >> >> > What immortal hand or eye
> > > >> >> > Could frame thy fearful symmetry?"
> > > >> >> >
> > > >> >> > William Blake - Songs of Experience -1794 England