Re: Problem with words thats amost similar

2009-12-18 Thread Shalin Shekhar Mangar
2009/12/18 Steinar Asbjørnsen 

>
> What I've done so far is to add both restaurant and restaurering to
> protwords.txt.
> I've also re-fed a single document (with the keyword "restaurering") to
> check that it no longer appears in a search result for "restaurant".
> Do I have to re-feed every document in the index?
> Or restart so that Solr re-reads the protwords.txt file (this is on a
> different installation (prod) than the one I restarted earlier (dev))?
>
>
Documents already in the index went through the analysis step when they were
added, so terms such as restaurering were stemmed at that point. Therefore,
you'll need to re-index all documents that contain the words you have
specified in protwords.txt.
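
To see why a restart alone is not enough, here is a minimal Python sketch. It is not Solr code: the `analyze`, `build_index` and `search` helpers and the two-suffix "stemmer" are made up for illustration, and the real Porter stemmer applies different rules. It assumes the same protected list is used at index time and at query time.

```python
from collections import defaultdict

# Hypothetical stand-in for the stemming filter; NOT the real Porter rules.
def analyze(text, protected):
    terms = []
    for token in text.lower().split():
        if token not in protected:
            for suffix in ("ering", "ant"):
                if token.endswith(suffix):
                    token = token[: -len(suffix)]
                    break
        terms.append(token)
    return terms

def build_index(docs, protected):
    """Map each analyzed term to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text, protected):
            index[term].add(doc_id)
    return index

def search(index, query, protected):
    """Return ids of documents matching any analyzed query term."""
    hits = set()
    for term in analyze(query, protected):
        hits |= index.get(term, set())
    return hits

docs = {1: "restaurant guide", 2: "restaurering av hus"}
protwords = {"restaurant", "restaurering"}

# Index built before protwords.txt was edited: both words share one term.
old_index = build_index(docs, protected=set())
before = search(old_index, "restaurant", protected=set())

# New protected list at query time, but the old index is still in place.
stale = search(old_index, "restaurant", protected=protwords)

# After re-indexing with the protected list.
new_index = build_index(docs, protected=protwords)
after = search(new_index, "restaurant", protected=protwords)
```

In this toy model `before` contains both documents, `stale` is empty because the old index still holds only the stemmed term, and `after` contains document 1 alone.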
-- 
Regards,
Shalin Shekhar Mangar.


Re: Problem with words thats amost similar

2009-12-18 Thread Steinar Asbjørnsen
On 17 Dec 2009, at 13:48, Shalin Shekhar Mangar wrote:

> 2009/12/17 Steinar Asbjørnsen 
> 
>> On 17 Dec 2009, at 12:42, Shalin Shekhar Mangar wrote:
>> 
>>> For specific cases like this, you can add the word to a file and specify it
>>> in schema, for example:
>>> 
>>> <filter class="solr.SnowballPorterFilterFactory" language="English"
>>> protected="protwords.txt"/>
>> 
>> Ty Shalin.
>> 
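>> This is my schema.xml file
>> 
>> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>>   <analyzer type="index">
>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>             words="stopwords.txt" enablePositionIncrements="true"/>
>>     <filter class="solr.WordDelimiterFilterFactory"
>>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.EnglishPorterFilterFactory"
>>             protected="protwords.txt"/>
>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>   </analyzer>
>>   <analyzer type="query">
>>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>             ignoreCase="true" expand="true"/>
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>             words="stopwords.txt"/>
>>     <filter class="solr.WordDelimiterFilterFactory"
>>             generateWordParts="1" generateNumberParts="1" catenateWords="0"
>>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.EnglishPorterFilterFactory"
>>             protected="protwords.txt"/>
>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>   </analyzer>
>> </fieldType>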
>> 
>> I added restaurant and restaurering to protwords.txt, restarted Tomcat, but
>> no dice.
>> Do I need to use the SnowballPorterFilterFactory?
>> And do I need to reindex the documents?
>> 
>> 
> Actually EnglishPorterFilterFactory is the same as
> SnowballPorterFilterFactory with language="English". Both will work. You
> will need to re-index the documents.

What I've done so far is to add both restaurant and restaurering to 
protwords.txt.
I've also re-fed a single document (with the keyword "restaurering") to check
that it no longer appears in a search result for "restaurant".
Do I have to re-feed every document in the index?
Or restart so that Solr re-reads the protwords.txt file (this is on a different
installation (prod) than the one I restarted earlier (dev))?

Steinar

Re: Problem with words thats amost similar

2009-12-17 Thread Shalin Shekhar Mangar
2009/12/17 Steinar Asbjørnsen 

> On 17 Dec 2009, at 12:42, Shalin Shekhar Mangar wrote:
>
> > For specific cases like this, you can add the word to a file and specify it
> > in schema, for example:
> >
> > <filter class="solr.SnowballPorterFilterFactory" language="English"
> > protected="protwords.txt"/>
>
> Ty Shalin.
>
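> This is my schema.xml file
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory"
>             protected="protwords.txt"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt"/>
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="0"
>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.EnglishPorterFilterFactory"
>             protected="protwords.txt"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>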
>
>
> I added restaurant and restaurering to protwords.txt, restarted Tomcat, but
> no dice.
> Do I need to use the SnowballPorterFilterFactory?
> And do I need to reindex the documents?
>
>
Actually EnglishPorterFilterFactory is the same as
SnowballPorterFilterFactory with language="English". Both will work. You
will need to re-index the documents.
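
The effect of a protected-words list on a stemming filter can be sketched roughly as follows. This is an illustration only: `toy_stem` and `TOY_SUFFIXES` are hypothetical names, and the suffix rules are a deliberately crude stand-in for the Porter/Snowball algorithm, not a reimplementation of it.

```python
# Toy illustration only: this is NOT the Porter/Snowball algorithm.
# TOY_SUFFIXES and toy_stem are hypothetical names made up for this sketch.
TOY_SUFFIXES = ("ering", "ant", "ing", "s")

def toy_stem(token, protected=frozenset()):
    """Strip the first matching suffix unless the token is protected."""
    if token in protected:
        return token
    for suffix in TOY_SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

words = ("restaurant", "restaurering")

# Without protection, both words collapse to the same term.
unprotected = {w: toy_stem(w) for w in words}

# A protwords.txt-like list: one word per line.
protwords = frozenset("restaurant\nrestaurering".split())
protected = {w: toy_stem(w, protwords) for w in words}
```

With these toy rules, both words reduce to the same term when unprotected, which is why they match each other in a search, and both pass through unchanged once they are listed as protected.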

-- 
Regards,
Shalin Shekhar Mangar.


Re: Problem with words thats amost similar

2009-12-17 Thread Steinar Asbjørnsen
On 17 Dec 2009, at 12:42, Shalin Shekhar Mangar wrote:

> 2009/12/17 Steinar Asbjørnsen 
> 
>> Hi all.
>> 
>> I have a delicate problem when it comes to two words that are rather
>> similar in the way they are typed, but when it comes to the meaning of the
>> word they are completely different.
>> The actual words are restaurant (as in restaurant) and restaurering (as in
>> restoration).
>> 
>> Solr seems to think these words are similar enough to present hits on both
>> of them in the same search result.
>> Obviously this is not desirable.
>> 
>> Is there a way to take care of such specific cases without disabling Solr
>> functionality for stemming and/or plurals?
>> Or would I need to disable stemming to make this special case disappear?
>> 
>> 
> For specific cases like this, you can add the word to a file and specify it
> in schema, for example:
> 
> <filter class="solr.SnowballPorterFilterFactory" language="English"
> protected="protwords.txt"/>

Ty Shalin.

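This is my schema.xml file

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EnglishPorterFilterFactory"
            protected="protwords.txt"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>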


I added restaurant and restaurering to protwords.txt, restarted Tomcat, but no 
dice.
Do I need to use the SnowballPorterFilterFactory?
And do I need to reindex the documents?

Steinar

Re: Problem with words thats amost similar

2009-12-17 Thread Shalin Shekhar Mangar
2009/12/17 Steinar Asbjørnsen 

> Hi all.
>
> I have a delicate problem when it comes to two words that are rather
> similar in the way they are typed, but when it comes to the meaning of the
> word they are completely different.
> The actual words are restaurant (as in restaurant) and restaurering (as in
> restoration).
>
> Solr seems to think these words are similar enough to present hits on both
> of them in the same search result.
> Obviously this is not desirable.
>
> Is there a way to take care of such specific cases without disabling Solr
> functionality for stemming and/or plurals?
> Or would I need to disable stemming to make this special case disappear?
>
>
For specific cases like this, you can add the word to a file and specify it
in schema, for example:

<filter class="solr.SnowballPorterFilterFactory" language="English"
protected="protwords.txt"/>

-- 
Regards,
Shalin Shekhar Mangar.


Problem with words thats amost similar

2009-12-17 Thread Steinar Asbjørnsen
Hi all.

I have a delicate problem when it comes to two words that are rather similar in 
the way they are typed, but when it comes to the meaning of the word they are 
completely different.
The actual words are restaurant (as in restaurant) and restaurering (as in 
restoration).

Solr seems to think these words are similar enough to present hits on both of 
them in the same search result.
Obviously this is not desirable.

Is there a way to take care of such specific cases without disabling Solr
functionality for stemming and/or plurals?
Or would I need to disable stemming to make this special case disappear?

I'm using the dismax query handler.
The field I'm querying against is of type text.

Any help is appreciated :)

Regards,
Steinar