Hi!
I tried to reproduce this, but changing the result order by modifying
the boosts works perfectly for me:
require 'rubygems'
require 'ferret'
include Ferret

# Field-level boosts: a match in :city weighs more than one in :district.
fi = Index::FieldInfos.new
fi.add_field :country_code
fi.add_field :city, :boost => 8
fi.add_field :district, :boost => 7

i = Ferret::I.new :field_infos => fi
i << { :country_code => 'de', :city => 'Berlin' }
i << { :country_code => 'de', :city => 'Seedorf', :district => 'Berlin' }

# Print every hit together with its score.
i.search_each 'berlin, de' do |hit, score|
  puts "#{i[hit][:country_code]} #{i[hit][:district]} #{i[hit][:city]} Score: #{score}"
end
This outputs:
de Berlin Score: 0.841327428817749
de Berlin Seedorf Score: 0.740611553192139
Swapping the boost values (city:7, district:8) also changes the result
sorting.
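
To make the effect of swapping the boosts easier to see, here is a toy
model in plain Ruby. It is only a sketch and not Ferret's actual scoring
formula, which also takes term frequency, inverse document frequency and
field length into account. The helpers toy_score and toy_rank are made up
for this illustration and are not part of Ferret.

```ruby
# Toy model only: NOT Ferret's real scoring. It just adds up the boost of
# every field whose value equals one of the query terms.
DOCS = [
  { :country_code => 'de', :city => 'Berlin' },
  { :country_code => 'de', :city => 'Seedorf', :district => 'Berlin' }
].freeze

# Hypothetical helper: sum the boosts of all matching fields.
# Fields without an explicit boost count as 1.0.
def toy_score(doc, terms, boosts)
  doc.sum do |field, value|
    terms.include?(value.downcase) ? boosts.fetch(field, 1.0) : 0.0
  end
end

# Hypothetical helper: documents ordered by descending toy score.
def toy_rank(terms, boosts)
  DOCS.sort_by { |doc| -toy_score(doc, terms, boosts) }
end

terms = %w[berlin de]
city_first     = toy_rank(terms, { :city => 8.0, :district => 7.0 })
district_first = toy_rank(terms, { :city => 7.0, :district => 8.0 })

puts "city boosted higher:     top hit has city #{city_first.first[:city]}"
puts "district boosted higher: top hit has city #{district_first.first[:city]}"
```

In this simplified model the document with 'Berlin' in the city field
comes first as long as city has the higher boost, and the Seedorf document
takes over once district is boosted higher.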
Any more info on other circumstances that might cause your problems?
Jens
On Wed, Jul 11, 2007 at 02:24:33PM +0200, Andreas Korth wrote:
> Hi!
>
> I thought I understood Ferret's query scoring and how to tweak
> results using boost values. What I currently experience however,
> leaves me completely baffled.
>
> Perhaps someone can shed some light on the scoring algorithm, because
> asking Ferret to "explain" the score for a particular document isn't
> as informative as I thought. Actually, it confuses me even more.
>
> Here's what I got:
>
> I'm indexing locations (addresses) in Ferret using the following fields:
>
> street, zipcode, district, city, county, state, country_code
>
> Addresses are stored in different precisions, i.e. not all of the
> fields contain values depending on the location's accuracy. Here are
> two examples:
>
> 1. Berlin, Germany:
>
> country_code: de
> city: Berlin
>
> 2. The district 'Berlin' in a town called 'Seedorf':
>
> country_code: de
> city: Seedorf
> district: Berlin
>
> When querying for "berlin, de", document #2 is ranked higher
> (probably due to its natural position in the index). Since I want the
> less accurate locations to rank higher, I added boost values. In the
> example above, assume that city has a boost of 8 and district has a
> boost of 7.
>
> With this little adjustment the first document should rank higher
> since the term 'berlin' appears in the city field. As you might
> suspect, this is not what happens. And I consider this a bug.
>
> Then I went and set the document boost to be 8 for countries and 1
> for streets. This doesn't help either.
>
> The ranking of other results changes slightly but nothing seems to be
> consistent with the boost settings. Perhaps the boost settings and
> the results are related in some way. But it's definitely not a
> logical relation.
>
> I'm thankful for any hint on how to achieve a proper ranking.
>
> Thanks!
> Andy
>
>
> _______________________________________________
> Ferret-talk mailing list
> [email protected]
> http://rubyforge.org/mailman/listinfo/ferret-talk
>
--
Jens Krämer
webit! Gesellschaft für neue Medien mbH
Schnorrstraße 76 | 01069 Dresden
Telefon +49 351 46766-0 | Telefax +49 351 46766-66
[EMAIL PROTECTED] | www.webit.de
Amtsgericht Dresden | HRB 15422
GF Sven Haubold, Hagen Malessa