Re: Problem with words that are almost similar
2009/12/18 Steinar Asbjørnsen

> What I've done so far is to add both restaurant and restaurering to
> protwords.txt.
> I've also re-fed a single document (with the keyword "restaurering") to
> check that it no longer appears in a search result for "restaurant".
> Do I have to re-feed every document in the index?
> Or restart so that Solr re-reads the protwords.txt file (this is on a
> different installation (prod) than the one I restarted earlier (dev))?

Documents already in the index have already gone through the analysis step, so terms such as restaurering were stemmed when they were indexed. Therefore, you'll need to re-index all documents that contain the words you have specified in protwords.txt.

--
Regards,
Shalin Shekhar Mangar.
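As a rough illustration of the re-indexing step, the sketch below shows two XML update messages that could be posted, one after the other, to Solr's update handler. The field names "id" and "text" and all values are hypothetical and not taken from the thread. It also assumes the schema defines a unique key, so that re-adding a document replaces the old copy, and that Solr has been restarted or reloaded so the edited protwords.txt is in effect.

```xml
<!-- Message 1: re-add an affected document so it is analyzed again.
     "id", "text" and the values are assumptions for illustration only. -->
<add>
  <doc>
    <field name="id">doc-42</field>
    <field name="text">Restaurering av gamle bygninger</field>
  </doc>
</add>

<!-- Message 2 (sent as a separate request): make the changes searchable. -->
<commit/>
```

Every document that contains a protected word would need the same treatment; re-feeding only one document fixes that document alone.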
Re: Problem with words that are almost similar
On 17 Dec 2009, at 13:48, Shalin Shekhar Mangar wrote:

> 2009/12/17 Steinar Asbjørnsen
>
>> On 17 Dec 2009, at 12:42, Shalin Shekhar Mangar wrote:
>>
>>> For specific cases like this, you can add the word to a file and specify
>>> it in schema, for example:
>>>
>>> … protected="protwords.txt"/>
>>
>> Ty Shalin.
>>
>> This is my schema.xml file
>> [element names missing in the archived copy]
>>
>> … words="stopwords.txt" enablePositionIncrements="true"/>
>> … generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>   catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>> … protected="protwords.txt"/>
>>
>> … ignoreCase="true" expand="true"/>
>> … words="stopwords.txt"/>
>> … generateWordParts="1" generateNumberParts="1" catenateWords="0"
>>   catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>> … protected="protwords.txt"/>
>>
>> I added restaurant and restaurering to protwords.txt, restarted Tomcat, but
>> no dice.
>> Do I need to use the SnowballPorterFilterFactory?
>> And do I need to reindex the documents?
>
> Actually EnglishPorterFilterFactory is the same as
> SnowballPorterFilterFactory with language="English". Both will work. You
> will need to re-index the documents.

What I've done so far is to add both restaurant and restaurering to protwords.txt.
I've also re-fed a single document (with the keyword "restaurering") to check that it no longer appears in a search result for "restaurant".

Do I have to re-feed every document in the index?
Or restart so that Solr re-reads the protwords.txt file (this is on a different installation (prod) than the one I restarted earlier (dev))?

Steinar
Re: Problem with words that are almost similar
2009/12/17 Steinar Asbjørnsen

> On 17 Dec 2009, at 12:42, Shalin Shekhar Mangar wrote:
>
> > For specific cases like this, you can add the word to a file and specify
> > it in schema, for example:
> >
> > … protected="protwords.txt"/>
>
> Ty Shalin.
>
> This is my schema.xml file
> [element names missing in the archived copy]
>
> … words="stopwords.txt" enablePositionIncrements="true"/>
> … generateWordParts="1" generateNumberParts="1" catenateWords="1"
>   catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> … protected="protwords.txt"/>
>
> … ignoreCase="true" expand="true"/>
> … words="stopwords.txt"/>
> … generateWordParts="1" generateNumberParts="1" catenateWords="0"
>   catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
> … protected="protwords.txt"/>
>
> I added restaurant and restaurering to protwords.txt, restarted Tomcat, but
> no dice.
> Do I need to use the SnowballPorterFilterFactory?
> And do I need to reindex the documents?

Actually EnglishPorterFilterFactory is the same as SnowballPorterFilterFactory with language="English". Both will work. You will need to re-index the documents.

--
Regards,
Shalin Shekhar Mangar.
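To make the equivalence concrete, here is a hedged sketch of the two filter declarations as they might appear inside an analyzer in schema.xml. The class names and the protected attribute come from the thread; the exact form of each line is an assumption, since the original markup was not preserved. Only one of the two would be used.

```xml
<!-- Option A: the English Porter stemmer, skipping words listed in protwords.txt -->
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>

<!-- Option B: the Snowball stemmer set to English, described in the reply above as equivalent -->
<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
```

Whichever form is chosen, it should appear in both the index-time and the query-time analyzer of the field type, so that a protected word is left unstemmed on both sides.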
Re: Problem with words that are almost similar
On 17 Dec 2009, at 12:42, Shalin Shekhar Mangar wrote:

> 2009/12/17 Steinar Asbjørnsen
>
>> Hi all.
>>
>> I have a delicate problem when it comes to two words that are rather
>> similar in the way they are typed, but when it comes to the meaning of the
>> word they are completely different.
>> The actual words are restaurant (as in restaurant) and restaurering (as in
>> restoration).
>>
>> Solr seems to think these words are similar enough to present hits on both
>> of them in the same search result.
>> Obviously this is not desirable.
>>
>> Is there a way to take care of such specific cases without disabling Solr
>> functionality for stemming and/or plurals?
>> Or would I need to disable stemming to make this special case disappear?
>
> For specific cases like this, you can add the word to a file and specify it
> in schema, for example:
>
> … protected="protwords.txt"/>

Ty Shalin.

This is my schema.xml file [not preserved in the archived copy]

I added restaurant and restaurering to protwords.txt, restarted Tomcat, but no dice.
Do I need to use the SnowballPorterFilterFactory?
And do I need to reindex the documents?

Steinar
Re: Problem with words that are almost similar
2009/12/17 Steinar Asbjørnsen

> Hi all.
>
> I have a delicate problem when it comes to two words that are rather
> similar in the way they are typed, but when it comes to the meaning of the
> word they are completely different.
> The actual words are restaurant (as in restaurant) and restaurering (as in
> restoration).
>
> Solr seems to think these words are similar enough to present hits on both
> of them in the same search result.
> Obviously this is not desirable.
>
> Is there a way to take care of such specific cases without disabling Solr
> functionality for stemming and/or plurals?
> Or would I need to disable stemming to make this special case disappear?

For specific cases like this, you can add the word to a file and specify it in schema, for example:

… protected="protwords.txt"/>

--
Regards,
Shalin Shekhar Mangar.
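The example in this message lost its markup in the archive. As a minimal sketch of the idea, a field type with a stemmer that honors a protected-words file might look like the following. The tokenizer, the lowercase filter, and the choice of SnowballPorterFilterFactory are assumptions made for illustration; only the protected="protwords.txt" attribute is certain from the thread. The file protwords.txt would sit in the Solr conf directory and list restaurant and restaurering, one word per line.

```xml
<!-- Hypothetical field type; the analyzer chain is an assumption, not the poster's schema. -->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Words listed in protwords.txt pass through the stemmer unchanged. -->
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>
```

With both words protected, neither is reduced to a shared stem, so a query for one should no longer match documents that contain only the other once those documents have been re-indexed.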
Problem with words that are almost similar
Hi all.

I have a delicate problem when it comes to two words that are rather similar in the way they are typed, but when it comes to the meaning of the word they are completely different.
The actual words are restaurant (as in restaurant) and restaurering (as in restoration).

Solr seems to think these words are similar enough to present hits on both of them in the same search result.
Obviously this is not desirable.

Is there a way to take care of such specific cases without disabling Solr functionality for stemming and/or plurals?
Or would I need to disable stemming to make this special case disappear?

I'm using the dismax query handler.
The field I'm querying against is of type text.

Any help is appreciated :)

Regards,
Steinar