Re: Custom Query variables ?

2014-07-02 Thread joergpra...@gmail.com
For geo search, a good approach is to respect the searcher's preference by
using a locale, so I suggest adding a locale "fr" filter to the search.
Alternatively, add an origin to the query and order all cities by geo
distance from that origin. For a country search, the origin could be the
capital city...
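
Both suggestions above can be sketched as a single search body. This is only
an illustration: the `country_code` and `location` (geo_point) fields are
assumptions that do not appear in the thread, `build_geo_query` is a
hypothetical helper, and the `bool`/`filter` syntax assumes a reasonably
recent Elasticsearch version.

```python
import json


def build_geo_query(text, origin, country_code=None):
    """Build a search body: match on search_field, optionally filter by
    country, and order hits by distance from an origin point."""
    bool_query = {"must": [{"match": {"search_field": text}}]}
    if country_code is not None:
        # Assumed field: a not-analyzed country code stored on each city.
        bool_query["filter"] = [{"term": {"country_code": country_code}}]
    return {
        "query": {"bool": bool_query},
        "sort": [
            # Assumed field: "location" mapped as geo_point.
            {"_geo_distance": {"location": origin, "order": "asc", "unit": "km"}},
            "_score",
        ],
    }


# Origin set to the approximate coordinates of Paris, France.
body = build_geo_query("Paris", {"lat": 48.8566, "lon": 2.3522}, country_code="FR")
print(json.dumps(body, indent=2))
```

For a country search, the same helper could be called with the capital's
coordinates as the origin and no country filter.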

Jörg



-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAKdsXoGUyH0-GBjqAYOrMvuEL_ERA82MMdGEK2GHCBEmOcGOFg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Custom Query variables ?

2014-07-02 Thread Ivan Brusic
If you enable explanations, you can see Lucene's rationale behind the
scoring:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-explain.html

You are probably correct that the array length is influencing the
scoring. By default, Lucene uses length normalization, which scores fields
with fewer terms higher. You can disable norms on the field:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-core-types.html#norms

You can fine-tune the scoring better once you learn how to read Lucene's
explanations. It is difficult at first, but it is a useful skill.
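
As a rough sketch of the two suggestions above, the request fragments might
look like the following. The field definitions are assumptions about the
mapping syntax: the first matches the 1.x-era `string` type, the second the
later `text` type, so check the documentation for the version in use. Only
the `search_field` name comes from the thread.

```python
import json

# Assumed 1.x-era field definition: norms disabled on a string field.
field_legacy = {"type": "string", "norms": {"enabled": False}}

# Assumed equivalent for later versions, where string became text.
field_recent = {"type": "text", "norms": False}

# Search body asking Elasticsearch to attach a score explanation to each hit.
search_body = {
    "explain": True,
    "query": {"match": {"search_field": "Paris"}},
}

print(json.dumps({"search_field": field_legacy}, indent=2))
print(json.dumps({"search_field": field_recent}, indent=2))
print(json.dumps(search_body, indent=2))
```

With `explain` enabled, each hit carries an `_explanation` tree whose
fieldNorm entry (under the classic similarity) shows how much the field
length contributed.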

Cheers,

Ivan





Re: Custom Query variables ?

2014-07-01 Thread Pierrick Boutruche
Bump. Any ideas?




Custom Query variables ?

2014-06-30 Thread Pierrick Boutruche
Hi everyone, 

I'm building a small geocoder on my own. My goal is to retrieve a big city
or a country from an input string. The string can be mistyped, so I indexed
the geonames cities5000 data (cities with more than 5000 inhabitants) and
joined it with country and admin data. That gives me an index of 46000
cities with country, admin and population.

I created a search_field that holds the country, admin and city names plus
the alternate names provided in the cities5000 file.

I want to search for a string within this array.

Currently, I'm just searching with a MatchQuery, for example "Paris" in
"search_field". Unfortunately, the first result is Paris... in Canada...

Yet this is the "search_field" data for Paris (CA) and Paris (FR):

[u'Paris', u'Paris', u'Canada', u'Ontario', u'Ontario']

[u'Paris', u'Paris', u'France', u'\xcele-de-France', u'Ile-de-France', 
u'Paris', u'Paris']

I don't understand why Paris, CA comes first, because there are many more
occurrences of "Paris" in the second one...


Anyway, is there any way to make the number of occurrences of the
"my_query" terms make the difference? With alternate names, there will be
so many more occurrences of Paris that it has to count.

Actually, I think the array length matters in the scoring and I don't want
it to... I thought of a custom score query, but I don't think I can access
the query term in the script.
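
The suspicion about array length can be checked by hand. The sketch below
assumes Lucene's classic TF-IDF similarity, where tf = sqrt(freq) and the
length norm = 1/sqrt(number of terms in the field). It ignores idf (identical
for both documents) and the lossy single-byte encoding of norms. `tokenize`
is a crude, hypothetical stand-in for the standard analyzer (lowercase, split
on non-word characters), and `tf_times_norm` is likewise a hypothetical
helper.

```python
import math
import re


def tokenize(values):
    """Lowercase each value and split it on non-word characters."""
    tokens = []
    for value in values:
        tokens.extend(t for t in re.split(r"\W+", value.lower()) if t)
    return tokens


def tf_times_norm(values, term):
    """Return sqrt(freq) / sqrt(field length) for one term in one field."""
    tokens = tokenize(values)
    freq = tokens.count(term.lower())
    return math.sqrt(freq) / math.sqrt(len(tokens))


paris_ca = ["Paris", "Paris", "Canada", "Ontario", "Ontario"]
paris_fr = ["Paris", "Paris", "France", "Île-de-France", "Ile-de-France",
            "Paris", "Paris"]

score_ca = tf_times_norm(paris_ca, "Paris")  # 2 matches in 5 terms
score_fr = tf_times_norm(paris_fr, "Paris")  # 4 matches in 11 terms

print(round(score_ca, 3), round(score_fr, 3))
```

Under these assumptions the shorter Canadian entry comes out slightly ahead
(about 0.632 against 0.603), even though the French entry contains twice as
many occurrences of "Paris".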


Any ideas?


Thanks!
