If you enable explanations, you can see the rationale behind Lucene's
scoring:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-explain.html
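As a minimal sketch, the request body could look like the one below, built
here as a Python dict. The field name search_field and the query text come
from your message; the index name "cities" in the comment is hypothetical.

```python
import json

# Search body with explanations enabled: "explain": true asks Elasticsearch
# to attach a score breakdown to every hit.
explain_body = {
    "explain": True,
    "query": {"match": {"search_field": "Paris"}},
}

# Send this JSON as the body of a _search request, for example
# POST /cities/_search ("cities" is a hypothetical index name).
print(json.dumps(explain_body, indent=2))
```

Each hit in the response should then carry an _explanation section that
breaks its score down into the individual factors.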

You are probably correct that the array length is influencing the
scoring. By default, Lucene applies length normalization, which scores
fields with fewer terms higher. You can disable norms on the field:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#norms
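A mapping along these lines is a rough sketch of what that could look like,
again as a Python dict. It assumes the Elasticsearch 1.x syntax for string
fields; the type name "city" is hypothetical, and newer versions spell the
setting differently, so check the documentation for your version.

```python
import json

# Mapping fragment that disables norms on search_field, using the
# Elasticsearch 1.x syntax for a string field (an assumption about the
# version in use).
mapping = {
    "city": {  # hypothetical type name
        "properties": {
            "search_field": {
                "type": "string",
                "norms": {"enabled": False},
            }
        }
    }
}

print(json.dumps(mapping, indent=2))
```

Reindexing after changing the mapping is probably the simplest way to make
sure the setting applies to documents that are already indexed.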

You can fine-tune the scoring more effectively once you learn how to read
Lucene's explanations. It is difficult at first, but it is a useful skill.
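With the default TF-IDF similarity, the two factors in an explanation that
differ between your two documents are tf and fieldNorm. The sketch below
approximates them as sqrt(freq) and 1/sqrt(number of tokens). It assumes the
analyzer splits the hyphenated names into three tokens each, which gives 5
tokens for Paris (CA) and 11 for Paris (FR), and it ignores idf, queryNorm
and the lossy one-byte encoding of norms, so the real numbers will differ.

```python
import math

def tf_times_norm(freq, field_length):
    """Approximate classic Lucene factors: tf = sqrt(freq), norm = 1/sqrt(length)."""
    return math.sqrt(freq) / math.sqrt(field_length)

# Assumed token counts after analysis (hyphenated names split on "-").
paris_ca = tf_times_norm(freq=2, field_length=5)
paris_fr = tf_times_norm(freq=4, field_length=11)

print("Paris (CA):", round(paris_ca, 3))
print("Paris (FR):", round(paris_fr, 3))
print("CA outscores FR with norms:", paris_ca > paris_fr)

# With norms disabled only tf remains, so the higher frequency wins.
print("FR outscores CA without norms:", math.sqrt(4) > math.sqrt(2))
```

Under these assumptions the shorter field wins despite having fewer
occurrences of "Paris", which matches what you are seeing.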

Cheers,

Ivan


On Tue, Jul 1, 2014 at 1:02 AM, Pierrick Boutruche <pboutru...@octo.com>
wrote:

> Up ? Any ideas ?
>
> On Monday, June 30, 2014 at 17:48:54 UTC+2, Pierrick Boutruche wrote:
>
>> Hi everyone,
>>
>> I'm building a small geocoder on my own. My goal is to be able to
>> retrieve a big city or a country from an input string. This string can be
>> mistyped, so I indexed the geonames cities5000 data (cities > 5000 inhab.)
>> and crossed this data with country & admin data. So I got an index of
>> 46000 cities with country, admin & pop.
>>
>> I created a search_field in which I put country, admin & city name +
>> alternate names provided in cities5000 file.
>>
>> I want to search for a string within this array.
>>
>> Currently, I'm just searching with a MatchQuery, like "Paris" in
>> "search_field". Unfortunately, the first result is Paris... in Canada...
>>
>> Still, the "search_field" data is this one, for Paris (CA) and Paris (FR):
>>
>> [u'Paris', u'Paris', u'Canada', u'Ontario', u'Ontario']
>>
>> [u'Paris', u'Paris', u'France', u'\xcele-de-France', u'Ile-de-France',
>> u'Paris', u'Paris']
>>
>> I don't understand why Paris, CA is first, 'cause there's so much more
>> "Paris" in the second one...
>>
>>
>> Anyway, is there any way to make the number of occurrences of the
>> "my_query" terms make the difference? Because with alternate names, there
>> will be so many more "Paris" that it has to count.
>>
>> Actually I think the array length matters in the scoring and I don't want
>> it to... I thought of a custom query score, but I don't think I'm able to
>> get the query term in the script query.
>>
>>
>> Any ideas ?
>>
>>
>> Thanks !
>>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/0f1e6aec-697c-46fc-882e-d8927783fab5%40googlegroups.com
> <https://groups.google.com/d/msgid/elasticsearch/0f1e6aec-697c-46fc-882e-d8927783fab5%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
> For more options, visit https://groups.google.com/d/optout.
>
