RE : Re: RE : Re: RE : Re: problem understanding the hits.score

2007-11-11 Thread Jamal H Tandina
Thank you for your reply. The thing is, I am trying to implement a weight for a word when indexing HTML web pages. The formula is like: *50% + Weight(word in doc d) = *20% + *10% + ... The code is: = doc.add(new Field(url,

RE : Re: problem understanding the hits.score

2007-11-02 Thread Jamal H Tandina
Thank you for your reply. How can I change the DefaultSimilarity for indexing and searching? Do you have an example, or a URL, showing how to set the Similarity? http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/search/Similarity.html Thanks again Ion

Re: RE : Re: problem understanding the hits.score

2007-11-02 Thread Ion Badita
That is already in the similarity formula, in the tf term: documents that have more occurrences of a given term receive a higher score. Jamal H Tandina wrote: If you want to give priority to documents that are larger, like z1, you should change the DefaultSimilarity (at index time), more
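
To make the tf point concrete, here is a minimal sketch of the arithmetic only. It assumes the tf factor of DefaultSimilarity is the square root of the term frequency; the class name TfDemo is made up for illustration, and the code does not use Lucene itself.

```java
// Standalone sketch: mirrors only the arithmetic of the tf factor
// (assumed here to be sqrt(freq)); it does not call Lucene.
class TfDemo {
    // More occurrences give a higher factor, but the growth is sublinear.
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    public static void main(String[] args) {
        float[] freqs = {1f, 4f, 9f, 16f};
        for (float f : freqs) {
            System.out.println("freq " + f + " -> tf " + tf(f));
        }
    }
}
```

Under that assumption, a term occurring 16 times contributes four times the factor of a term occurring once, not sixteen times.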

Re: RE : Re: problem understanding the hits.score

2007-11-02 Thread Ion Badita
For your specific problem you need to change the DefaultSimilarity only at index time, because the lengthNorm is written to the index when it is created. So... first you'll need to extend the DefaultSimilarity and override the lengthNorm() method with the one suggested in the previous reply; then
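
The "extend and override" step might look roughly like the sketch below. To keep it runnable without Lucene on the classpath, StandInDefaultSimilarity is a local stand-in for Lucene's DefaultSimilarity, and FlatLengthSimilarity is a hypothetical override that simply ignores field length; it is not the replacement suggested in the thread, which is not shown here.

```java
// Stand-in for org.apache.lucene.search.DefaultSimilarity, so this sketch
// compiles and runs on its own. Only lengthNorm() is reproduced.
class StandInDefaultSimilarity {
    public float lengthNorm(String fieldName, int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }
}

// Hypothetical subclass for illustration: every field gets the same norm,
// so longer documents are no longer penalized.
class FlatLengthSimilarity extends StandInDefaultSimilarity {
    @Override
    public float lengthNorm(String fieldName, int numTerms) {
        return 1.0f;
    }
}

class OverrideDemo {
    public static void main(String[] args) {
        StandInDefaultSimilarity byDefault = new StandInDefaultSimilarity();
        StandInDefaultSimilarity flat = new FlatLengthSimilarity();
        int numTerms = 400;
        System.out.println("default norm: " + byDefault.lengthNorm("body", numTerms));
        System.out.println("flat norm:    " + flat.lengthNorm("body", numTerms));
    }
}
```

With real Lucene, the subclass would extend DefaultSimilarity directly and an instance would be handed to the writer before adding documents (IndexWriter has a setSimilarity(Similarity) method in the 2.x line). Since the norm is stored in the index, existing documents would need to be reindexed for the change to take effect.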

Re: RE : Re: problem understanding the hits.score

2007-11-02 Thread Erick Erickson
I strongly recommend against this. Simple word counts are a poor measure of relevance. Which is why Lucene doesn't score that way. Do you have an example showing why the default scoring is inadequate or is this just an assumption? It would be helpful if you gave us some idea of what you're trying

RE : Re: problem understanding the hits.score

2007-11-02 Thread Jamal H Tandina
If you want to give priority to documents that are larger, like z1, you should change the DefaultSimilarity (at index time), more exactly the method: public float lengthNorm(String fieldName, int numTerms) { return (float)(1.0 / Math.sqrt(numTerms)); } to something like this
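
To see why the quoted default favors shorter fields, the sketch below evaluates the same formula for a few field sizes. It reproduces only the arithmetic of the lengthNorm() quoted above; the class name LengthNormDemo and the sample sizes are made up for illustration.

```java
// Standalone sketch: same formula as the quoted lengthNorm(), without Lucene.
class LengthNormDemo {
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        // Sample field sizes (number of terms), chosen arbitrarily.
        int[] sizes = {1, 4, 100, 10000};
        for (int n : sizes) {
            System.out.println(n + " terms -> norm " + lengthNorm(n));
        }
    }
}
```

The norm shrinks as the field grows, so with the default a match in a 100-term field is weighted ten times more than the same match in a 10000-term field.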