Re: Score depending on position in the term on the field

2014-01-24 Thread simonw

Maybe this is useful in your situation?

http://www.elasticsearch.org/blog/you-complete-me/

simon

On Thursday, January 23, 2014 4:15:05 PM UTC+1, Nikolay Chankov wrote:

 Well, it's really strange that position is not taken into account. With
 many results, and especially with very similar data (user names), the
 hits are not ordered by string position in any way.

 Would anyone share how they implement autocomplete? Feeling really stupid :(

 On Thursday, January 23, 2014 10:56:30 AM UTC, Nikolay Chankov wrote:

 I just checked the Facebook suggestion style and I would imagine something
 like this: when you start typing, terms which start with the phrase/word
 are at the top, while terms which just contain the phrase/word are at
 the bottom. But I could be wrong that the order works this way :)
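
The ordering described above (prefix matches first, "contains" matches after) can be sketched outside Elasticsearch as a plain re-ranking step. This is only an illustration of the idea, not how Facebook or Elasticsearch does it; `rank_suggestions` and the sample names are made up for the example.

```python
def rank_suggestions(query, names):
    """Order names so prefix matches come first, then other substring matches."""
    q = query.lower()
    starts, contains = [], []
    for name in names:
        lowered = name.lower()
        if lowered.startswith(q):
            starts.append(name)        # name begins with the typed text
        elif q in lowered:
            contains.append(name)      # typed text appears later in the name
    # Alphabetical order inside each tier keeps the output deterministic.
    return sorted(starts, key=str.lower) + sorted(contains, key=str.lower)


# Hypothetical sample data, not taken from the thread.
names = ["Mary Johnson", "John Smith", "Banjo Club", "Johan Rask", "Peter Pan"]
ranked = rank_suggestions("joh", names)
print(ranked)
```

Names that do not contain the typed text at all are dropped, which matches the usual autosuggest behaviour.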


 On Thursday, January 23, 2014 10:08:25 AM UTC, Nikolay Chankov wrote:

 Hi Johan,

 I've already seen this suggestion, but it's not really useful for me,
 since the index contains various documents of different types. For
 example, I have users, but I also have venues and events. The last two have
 only one name, while a user can have two names.

 In general I am trying to build an autosuggest feature on a site. When
 you start typing 'Joh', the suggestions should first be users starting
 with 'Joh', but there could be some venues starting with 'Joh' as well,
 so the more you type, the more specific the results become. I am also
 reading about suggesters, and I will probably implement a solution
 which is more like Google autosuggest, rather than displaying the first
 few results of the search itself.
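
For reference, a suggester-based setup could look roughly like the following. This is a hedged sketch that only builds the JSON bodies; the `suggest` field name, the `user` type and the `autosuggest` key are assumptions made for the example, and the exact request shape should be checked against the completion suggester documentation for the Elasticsearch version in use.

```python
import json

# Assumed mapping: a dedicated "suggest" field of type "completion"
# next to the normal "name" field. Field and type names are hypothetical.
mapping = {
    "mappings": {
        "user": {
            "properties": {
                "name": {"type": "string"},
                "suggest": {"type": "completion"},
            }
        }
    }
}


def suggest_request(prefix, field="suggest", size=5):
    """Build a completion-suggester request body for the typed prefix."""
    return {
        "autosuggest": {  # arbitrary label chosen for this suggestion
            "text": prefix,
            "completion": {"field": field, "size": size},
        }
    }


body = json.dumps(suggest_request("Joh"), indent=2)
print(body)
```

The completion suggester matches on prefixes of the indexed inputs, so it covers the "starts with" case but not the "contains" case.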

 I've also seen text scoring in scripts, which could be a solution as well:
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/modules-advanced-scripting.html

 I just wanted to ask if there is a common way to score the position of
 the term in a field, but apparently there is no such way. :)

 Thanks for the help




Re: Score depending on position in the term on the field

2014-01-23 Thread Nikolay Chankov

Hi Johan,

Thanks for the reply.

I would agree that it's OK if I am searching for a common term like
'venue', 'club' or 'bar', but when it comes to user names, it makes sense to
score the position in the field too. When you search in the field
user.name and type 'Jo', you would expect to see users with the first
name Joe, Johan or John first, rather than users with 'Jo' in the family
name.

And especially when you search a user name field, you don't expect the name
to occur more than once in that field.

On Wednesday, January 22, 2014 9:30:19 PM UTC, Johan Rask wrote:

Lucene will calculate your score based on a scoring formula. I am pretty
sure that the location of the word is not part of this formula, but rather
how common the word is in your sentence. I.e. multiple occurrences of
'venue' should increase the score, and adding other words to your sentence
should decrease it.
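
To make the point above concrete, here is a simplified sketch of the classic Lucene TF-IDF ingredients (term frequency, inverse document frequency, field-length norm). It leaves out query normalisation, coordination factors and norm encoding, so the numbers will not match Elasticsearch's `_score`; it only shows that token position never enters the calculation. The function name is made up for the example.

```python
import math


def classic_score(term, doc_tokens, all_docs):
    """Simplified TF-IDF score: sqrt(tf) * idf^2 * 1/sqrt(field length)."""
    freq = doc_tokens.count(term)
    if freq == 0:
        return 0.0
    doc_freq = sum(1 for tokens in all_docs if term in tokens)
    tf = math.sqrt(freq)                                  # more occurrences, higher score
    idf = 1.0 + math.log(len(all_docs) / (doc_freq + 1))  # rarer terms weigh more
    length_norm = 1.0 / math.sqrt(len(doc_tokens))        # longer fields score lower
    return tf * idf * idf * length_norm


# The three test documents from the thread, lowercased and split on whitespace.
docs = [text.lower().split() for text in (
    "is the name Venue of that one",
    "is the name of that one Venue",
    "Venue is the name of that one",
)]
scores = [classic_score("venue", tokens, docs) for tokens in docs]
print(scores)
```

All three documents contain 'venue' once and have the same length, so they receive identical scores regardless of where the word appears.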

Hope this helps. I am pretty sure there is detailed info about this in the
Lucene docs.

Kind regards /Johan

On Wednesday, January 22, 2014 at 13:10:00 UTC+1, Nikolay Chankov wrote:

I have been playing with Elasticsearch, and I noticed something:

If I search for a word in a string, the _score is the same no matter where
the word is placed. I have prepared a test case:

# Start from a clean index with a single string field "name".
curl -XDELETE 'http://localhost:9200/test_search'
curl -XPUT 'http://localhost:9200/test_search/' -d '
{
  "mappings": {
    "test_record": {
      "properties": {
        "name": {
          "type": "string"
        }
      }
    }
  }
}'

# Three documents with "Venue" in the middle, at the end and at the start.
curl -XPUT 'http://localhost:9200/test_search/test_record/1' -d '{
  "name": "is the name Venue of that one"
}'

curl -XPUT 'http://localhost:9200/test_search/test_record/2' -d '{
  "name": "is the name of that one Venue"
}'

curl -XPUT 'http://localhost:9200/test_search/test_record/3' -d '{
  "name": "Venue is the name of that one"
}'

# Search for "venue" across all fields.
curl -XGET 'http://localhost:9200/test_search/_search' -d '{
  "query": {
    "bool": {
      "must": [],
      "must_not": [],
      "should": [
        {
          "query_string": {
            "default_field": "_all",
            "query": "venue"
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 10
}'

The question is: how can I get a different score based on the position of
the word 'venue' in the text? When I search, I would expect the results to
be ordered 3, 1, 2, while now they come back in insertion order: 1, 2, 3.

Any hint will be much appreciated
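
One way to get the expected 3, 1, 2 order without touching scoring is to re-rank the returned hits on the client by the position of the first matching token. This is a minimal sketch under that assumption, not an Elasticsearch feature; `first_position` is a hypothetical helper and the tokenisation is a plain whitespace split.

```python
def first_position(term, text):
    """Index of the first token equal to term, or the token count if absent."""
    tokens = text.lower().split()
    wanted = term.lower()
    return tokens.index(wanted) if wanted in tokens else len(tokens)


# The three test documents from the test case, keyed by document id.
hits = {
    1: "is the name Venue of that one",
    2: "is the name of that one Venue",
    3: "Venue is the name of that one",
}

# Earlier matches sort first; ties keep their original order.
order = sorted(hits, key=lambda doc_id: first_position("venue", hits[doc_id]))
print(order)
```

This only reorders the page of results already fetched, so it suits small autosuggest lists better than deep result sets.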

To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/d2d79908-0131-4220-941c-eb31e51b9667%40googlegroups.com.

Re: Score depending on position in the term on the field

2014-01-23 Thread Johan Rask

Hi again,

Can you explain in more detail what you are trying to accomplish?

I think that the only way you can solve this is to split the name into
multiple fields and then boost individual fields in your query. Not sure
if that's possible for you.
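
A rough sketch of that idea, assuming the user name has been split into two hypothetical fields, `first_name` and `last_name`: a `multi_match` query can boost one field with the `field^boost` syntax. The snippet only builds the request body; it shows the boosting part, not prefix matching.

```python
import json


def name_query(text, first_boost=3.0):
    """Build a multi_match query that weights first_name above last_name."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                # "first_name" and "last_name" are assumed field names.
                "fields": ["first_name^%g" % first_boost, "last_name"],
            }
        }
    }


body = json.dumps(name_query("Johan"), indent=2)
print(body)
```

With this shape, a hit in `first_name` contributes more to the score than the same hit in `last_name`, so first-name matches tend to rank higher.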

/Johan
