Thank you again, Ivan (and sorry for the silence; I was away for the last few 
days).
I built the jar with Maven. The problem I have now is a compilation 
failure caused by the @Override annotation in NormRemovalSimilarity.java 
("*method does not override or implement a method from a supertype*"). When I 
comment out that line, the jar builds successfully, but I think the new 
decodeNormValue function then does not override the original one (as expected!). 
Indeed, when I search my contents field, which has similarity=my_similarity, 
the explanation of the score is: 

...
                        {
                           "value": 0.25,
                           "description": "fieldNorm(doc=0)"
                        }
...

I suppose that under the new similarity, the value should be 1.0, shouldn't 
it?
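
A minimal sketch of why commenting out @Override could let the build pass while the score explanation stays unchanged. The classes below are hypothetical stand-ins, not the Lucene or Elasticsearch classes, and the byte versus long parameter types are an assumption used only to illustrate a signature mismatch. If a subclass method does not match the supertype's signature exactly, Java treats it as a separate overload, so code that calls the supertype method never reaches it.

```java
// Hypothetical stand-ins for illustration only; these are NOT the Lucene classes.
class OverrideDemo {

    static class BaseSimilarity {
        // Assumed supertype signature: the stored norm arrives as a long.
        public float decodeNormValue(long norm) {
            return 0.25f;
        }

        // The scoring code only ever calls the long-based method.
        public float fieldNorm(long storedNorm) {
            return decodeNormValue(storedNorm);
        }
    }

    static class OverloadedSimilarity extends BaseSimilarity {
        // No @Override: this compiles, but the byte parameter makes it an
        // overload, so fieldNorm() never calls it.
        public float decodeNormValue(byte norm) {
            return 1.0f;
        }
    }

    static class OverridingSimilarity extends BaseSimilarity {
        // Same signature as the supertype, so @Override is accepted
        // and fieldNorm() dispatches here.
        @Override
        public float decodeNormValue(long norm) {
            return 1.0f;
        }
    }

    public static void main(String[] args) {
        System.out.println(new OverloadedSimilarity().fieldNorm(124L)); // prints 0.25
        System.out.println(new OverridingSimilarity().fieldNorm(124L)); // prints 1.0
    }
}
```

If this is what is happening here, the fix would be to match the decodeNormValue signature declared by the Lucene version bundled with the installed Elasticsearch, and to keep @Override so the compiler checks it.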
Cheers,
Patrick

On Thursday, April 3, 2014 at 12:15:15 UTC-4, Ivan Brusic wrote:
>
> I added a simple Maven pom to the gist: 
> https://gist.github.com/brusic/9786587#file-pom-xml
>
> The easiest thing to do is to download Maven (if you do not have it) and let 
> it take care of the dependencies. It will build a jar if you simply execute: 
> mvn package
>
> Since Elasticsearch already comes bundled with the correct jars, you can 
> also add those to your classpath instead. I think you only need Lucene 
> core, which is in $ES_HOME/lib/lucene-core-4-?-?.jar (substitute the 
> question marks with the correct version). I am not on Elasticsearch, so I do 
> not know offhand which version of Lucene is packaged.
>
> -- 
> Ivan
>
>
> On Thu, Apr 3, 2014 at 7:44 AM, geantbrun <agin.p...@gmail.com> wrote:
>
>> Ivan,
>> Sorry, but I realize (I am completely new to Java) that I skipped the Java 
>> compile step (I simply put the .java files in a jar file with jar cf). The 
>> problem now is that executing:
>>
>> javac NormRemovalSimilarity.java -classpath ./elasticsearch-1.1.0.jar
>>
>> generates errors, the first one being:
>>
>> package org.apache.lucene.search.similarities does not exist
>>
>> Googled it but found nothing. Any idea?
>> Patrick
>>
>> P.S. I installed elasticsearch following the easy 
>> way <https://gist.github.com/wingdspur/2026107> (dpkg the deb file)
>>
>> On Thursday, April 3, 2014 at 09:16:02 UTC-4, geantbrun wrote:
>>
>>> Thanks again for your great help, Ivan. It does not work for me. When I 
>>> replace NormRemovalSimilarityProvider with BM25SimilarityProvider (or 
>>> simply with BM25), it works. Could I have put my jar file in the 
>>> wrong directory (usr/share/elasticsearch/lib)? Is it necessary to 
>>> *register* the new classes I define somewhere before restarting the service?
>>> Cheers,
>>> Patrick
>>>
>>> On Wednesday, April 2, 2014 at 17:47:46 UTC-4, Ivan Brusic wrote:
>>>>
>>>> Are you using a full class name? I have no problems with 
>>>>
>>>> curl -XPOST 'http://localhost:9200/sim/' -d '
>>>> {
>>>>  "settings" : {
>>>>    "similarity" : {
>>>>     "my_similarity" : {
>>>>      "type" : "org.elasticsearch.index.similarity.NormRemovalSimilarityProvider"
>>>>     }   
>>>>   }
>>>>  },
>>>>  "mappings" : {
>>>>   "post" : {
>>>>    "properties" : {
>>>>     "id" : { "type" : "long", "store" : "yes", "precision_step" : "0" },
>>>>     "name" : { "type" : "string", "store" : "yes", "index" : "analyzed"},
>>>>     "contents" : { "type" : "string", "store" : "no", "index" : "analyzed", "similarity" : "my_similarity"}
>>>>    }
>>>>   }
>>>>  }
>>>> }
>>>> '
>>>>
>>>>
>>>>
>>>> On Wed, Apr 2, 2014 at 12:03 PM, geantbrun <agin.p...@gmail.com> wrote:
>>>>
>>>>> In order to better understand the error, I copied your 
>>>>> NormRemovalSimilarity and NormRemovalSimilarityProvider code snippets into 
>>>>> usr/share/elasticsearch/lib. I put these 2 files in a jar named 
>>>>> NormRemovalSimilarity.jar. After restarting the elasticsearch service, I 
>>>>> tried to create the index with the same mapping as before (except that I 
>>>>> put "type" : "NormRemoval" in the settings of my_similarity).
>>>>>
>>>>> The result is the same: 
>>>>> {"error":"IndexCreationException[[exbd] failed to create index]; 
>>>>> nested: NoClassSettingsException[Failed to load class setting [type] 
>>>>> with value [NormRemoval]]; nested: 
>>>>> ClassNotFoundException[org.elasticsearch.index.similarity.normremoval.NormRemovalSimilarityProvider]; 
>>>>> ","status":500}]
>>>>>
>>>>> I deleted the jar file just to see if the error is the same: yes it 
>>>>> is. It's like the new similarity is never found or loaded. Is it still 
>>>>> working without modifications on your side?
>>>>> Cheers,
>>>>> Patrick
>>>>>
>>>>>
>>>>> On Wednesday, April 2, 2014 at 00:31:44 UTC-4, Ivan Brusic wrote:
>>>>>>
>>>>>> It has been a while since I used a custom similarity, but what you 
>>>>>> have looks right. Can you try a full class name instead? 
>>>>>> Use org.elasticsearch.index.similarity.tfCappedSimilarityProvider. 
>>>>>> According to the error, it is looking for 
>>>>>> org.elasticsearch.index.similarity.tfcappedsimilarity.tfCappedSimilaritySimilarityProvider.
>>>>>>
>>>>>> -- 
>>>>>> Ivan
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 1, 2014 at 7:00 AM, geantbrun <agin.p...@gmail.com> wrote:
>>>>>>
>>>>>>> Sure.
>>>>>>>
>>>>>>> {
>>>>>>>  "settings" : {
>>>>>>>   "index" : {
>>>>>>>    "similarity" : {
>>>>>>>     "my_similarity" : {
>>>>>>>      "type" : "tfCappedSimilarity"
>>>>>>>     }
>>>>>>>    }
>>>>>>>   }
>>>>>>>  },
>>>>>>>  "mappings" : {
>>>>>>>   "post" : {
>>>>>>>    "properties" : {
>>>>>>>     "id" : { "type" : "long", "store" : "yes", "precision_step" : "0" },
>>>>>>>     "name" : { "type" : "string", "store" : "yes", "index" : "analyzed"},
>>>>>>>     "contents" : { "type" : "string", "store" : "no", "index" : "analyzed", "similarity" : "my_similarity"}
>>>>>>>    }
>>>>>>>   }
>>>>>>>  }
>>>>>>> }
>>>>>>>
>>>>>>> If I replace tfCappedSimilarity with tfCapped in the mapping, the 
>>>>>>> error is the same except that the provider is referred to as 
>>>>>>> tfCappedSimilarityProvider and not as 
>>>>>>> tfCappedSimilaritySimilarityProvider.
>>>>>>> Cheers,
>>>>>>> Patrick
>>>>>>>
>>>>>>>
>>>>>>> On Monday, March 31, 2014 at 17:13:24 UTC-4, Ivan Brusic wrote:
>>>>>>>>
>>>>>>>> Can you also post your mapping where you defined the similarity?
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> Ivan
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Mar 31, 2014 at 10:36 AM, geantbrun <agin.p...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I realize that I probably have to define the similarity property 
>>>>>>>>> of my field as "my_similarity" (and not as "tfCappedSimilarity") and 
>>>>>>>>> define my_similarity in the settings as being of type 
>>>>>>>>> tfCappedSimilarity. When I do that, I get the following error at 
>>>>>>>>> index/mapping creation:
>>>>>>>>>
>>>>>>>>> {"error":"IndexCreationException[[exbd] failed to create index]; 
>>>>>>>>> nested: NoClassSettingsException[Failed to load class setting 
>>>>>>>>> [type] with value [tfCappedSimilarity]]; nested: 
>>>>>>>>> ClassNotFoundException[org.elasticsearch.index.similarity.tfcappedsimilarity.tfCappedSimilaritySimilarityProvider]; 
>>>>>>>>> ","status":500}]
>>>>>>>>>
>>>>>>>>> Note that the provider is referred to in the error as 
>>>>>>>>> tfCappedSimilaritySimilarityProvider ("Similarity" repeated 2 
>>>>>>>>> times). Is that normal?
>>>>>>>>> Patrick
>>>>>>>>>
>>>>>>>>> On Monday, March 31, 2014 at 13:06:00 UTC-4, geantbrun wrote:
>>>>>>>>>
>>>>>>>>>> Hi Ivan,
>>>>>>>>>> I followed your instructions but it does not seem to work; I must 
>>>>>>>>>> have made a mistake somewhere. I created the jar file from the 
>>>>>>>>>> following two Java files. Could you tell me if they are OK?
>>>>>>>>>>
>>>>>>>>>> tfCappedSimilarity.java
>>>>>>>>>> ***************************
>>>>>>>>>> package org.elasticsearch.index.similarity;
>>>>>>>>>>
>>>>>>>>>> import org.apache.lucene.search.similarities.DefaultSimilarity;
>>>>>>>>>> import org.elasticsearch.common.logging.ESLogger;
>>>>>>>>>> import org.elasticsearch.common.logging.Loggers;
>>>>>>>>>>  
>>>>>>>>>> public class tfCappedSimilarity extends DefaultSimilarity {
>>>>>>>>>>
>>>>>>>>>>         private ESLogger logger;
>>>>>>>>>>
>>>>>>>>>>         public tfCappedSimilarity() {
>>>>>>>>>>                 logger = Loggers.getLogger(getClass());
>>>>>>>>>>         }
>>>>>>>>>>
>>>>>>>>>>         /**
>>>>>>>>>>          * Capped tf value
>>>>>>>>>>          */
>>>>>>>>>>         @Override
>>>>>>>>>>         public float tf(float freq) {
>>>>>>>>>>                 return (float)Math.sqrt(Math.min(9, freq));
>>>>>>>>>>         }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> tfCappedSimilarityProvider.java
>>>>>>>>>> *************************************
>>>>>>>>>> package org.elasticsearch.index.similarity;
>>>>>>>>>>
>>>>>>>>>> import org.elasticsearch.common.inject.Inject;
>>>>>>>>>> import org.elasticsearch.common.inject.assistedinject.Assisted;
>>>>>>>>>> import org.elasticsearch.common.settings.Settings;
>>>>>>>>>>
>>>>>>>>>> public class tfCappedSimilarityProvider extends AbstractSimilarityProvider {
>>>>>>>>>>
>>>>>>>>>>         private tfCappedSimilarity similarity;
>>>>>>>>>>
>>>>>>>>>>         @Inject
>>>>>>>>>>         public tfCappedSimilarityProvider(@Assisted String name, @Assisted Settings settings) {
>>>>>>>>>>                 super(name);
>>>>>>>>>>                 this.similarity = new tfCappedSimilarity();
>>>>>>>>>>         }
>>>>>>>>>>
>>>>>>>>>>         /**
>>>>>>>>>>          * {@inheritDoc}
>>>>>>>>>>          */
>>>>>>>>>>         @Override
>>>>>>>>>>         public tfCappedSimilarity get() {
>>>>>>>>>>                 return similarity;
>>>>>>>>>>         }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> In my mapping, I define the similarity property of my field as 
>>>>>>>>>> tfCappedSimilarity. Is that OK?
>>>>>>>>>>
>>>>>>>>>> What makes me say that it does not work: I insert a doc with a 
>>>>>>>>>> word repeated 16 times in my field. When I do a search with that 
>>>>>>>>>> word, the result shows a tf of 4 (square root of 16) and not 3 
>>>>>>>>>> (square root of the cap, 9) as I was expecting. Is there a way to 
>>>>>>>>>> know whether the similarity was loaded or not (maybe in a log file)?
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>> Patrick
>>>>>>>>>>
>>>>>>>>>> On Wednesday, March 26, 2014 at 17:16:36 UTC-4, Ivan Brusic wrote:
>>>>>>>>>>>
>>>>>>>>>>> I updated my gist to illustrate the SimilarityProvider that goes 
>>>>>>>>>>> along with it. Similarities are easier to add to Elasticsearch than 
>>>>>>>>>>> most plugins. You just need to compile the two files into a jar and 
>>>>>>>>>>> then add that jar to Elasticsearch's classpath ($ES_HOME/lib most 
>>>>>>>>>>> likely). The code will scan for every SimilarityProvider defined and 
>>>>>>>>>>> load it.
>>>>>>>>>>>
>>>>>>>>>>> You then map the similarity to a field: 
>>>>>>>>>>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#_configuring_similarity_per_field
>>>>>>>>>>>
>>>>>>>>>>> Note that you cannot change the similarity of a field 
>>>>>>>>>>> dynamically.
>>>>>>>>>>>
>>>>>>>>>>> Ivan
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 26, 2014 at 12:49 PM, geantbrun <agin.p...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Britta is looping over words that are passed as parameters. 
>>>>>>>>>>>> It's easy to implement her script for a simple query, but what 
>>>>>>>>>>>> about boolean queries? In my understanding (but I could be wrong, 
>>>>>>>>>>>> of course), I would have to parse the query to call the script 
>>>>>>>>>>>> with each sub-clause. Am I wrong?
>>>>>>>>>>>>
>>>>>>>>>>>> I prefer your custom similarity alternative. Again, sorry for 
>>>>>>>>>>>> the silly question (newbie!) but where do you put your Java file? 
>>>>>>>>>>>> Is it the only thing that is needed (except for the modification 
>>>>>>>>>>>> in the mapping)?
>>>>>>>>>>>> cheers,
>>>>>>>>>>>> Patrick
>>>>>>>>>>>>
>>>>>>>>>>>> On Wednesday, March 26, 2014 at 11:58:52 UTC-4, Ivan Brusic wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am still on a version of Elasticsearch that does not have 
>>>>>>>>>>>>> access to the new scoring capabilities, so I cannot test out any 
>>>>>>>>>>>>> scripts. The non-normalized term frequency should be the line:
>>>>>>>>>>>>> tf = _index[field][word].tf()
>>>>>>>>>>>>>
>>>>>>>>>>>>> If that is the case, you could substitute that line with 
>>>>>>>>>>>>> something like:
>>>>>>>>>>>>> tf = Math.min(10, _index[field][word].tf())
>>>>>>>>>>>>>
>>>>>>>>>>>>> As I stated before, I am used to using Similarities, so I find 
>>>>>>>>>>>>> that approach easier. Here is a custom similarity that I used in 
>>>>>>>>>>>>> Elasticsearch (it removes any norms that are indexed):
>>>>>>>>>>>>> https://gist.github.com/brusic/9786587
>>>>>>>>>>>>>
>>>>>>>>>>>>> The second part would be the tf() method, which you would need to 
>>>>>>>>>>>>> implement instead of the decodeNormValue method I used.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ivan
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>  -- 
>>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>>> Groups "elasticsearch" group.
>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>>> send an email to elasticsearc...@googlegroups.com.
>>>>>>>>> To view this discussion on the web visit 
>>>>>>>>> https://groups.google.com/d/msgid/elasticsearch/6370b4dc-8243-4aea-918a-e4e4e9588aaf%40googlegroups.com.
>>>>>>>>>
>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/57c7df18-aea1-4b8c-98ce-9ee8e25a738d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
