-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I am aware that the corpus is a bit small (but nicer for presentation
purposes). Still, it surprised me that I found no way, even when playing
with the parameters, to get at least one common word from the set. (It
wasn't intended to be usable, just presentable.)

I will play around a bit more and add some documents. Thanks for the hints.
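
To convince myself of the explanation quoted below, here is a rough
plain-Ruby sketch of the filtering as I understand it. It is not the
acts_as_ferret or Ferret code: the stop word list, the sample documents
and the helper names (doc_freqs, candidate_terms, build_query) are all
made up for illustration, and the real query syntax may differ.

```ruby
require 'set'

# Hypothetical stand-in for a noise-word list; Ferret ships its own.
STOP_WORDS = Set.new(%w[the a an of and is in to])

# Split a text into lowercase word tokens, each listed once.
def unique_terms(text)
  text.downcase.scan(/[a-z]+/).uniq
end

# Count in how many documents each term occurs (document frequency).
def doc_freqs(docs)
  freqs = Hash.new(0)
  docs.each do |text|
    unique_terms(text).each { |term| freqs[term] += 1 }
  end
  freqs
end

# Keep only the terms of one document that are not stop words and that
# occur in at least min_doc_freq documents of the whole corpus.
def candidate_terms(doc, docs, min_doc_freq)
  freqs = doc_freqs(docs)
  terms = unique_terms(doc).reject { |t| STOP_WORDS.include?(t) }
  terms.select { |t| freqs[t] >= min_doc_freq }
end

# Assumed query shape: the surviving terms plus an exclusion of the
# source record. With no surviving terms only the exclusion remains.
def build_query(id, terms)
  (terms + ["-id:#{id}"]).join(' ')
end

# Six tiny made-up documents, mirroring the size of my test set.
DOCS = [
  'The ferret is a small mammal',
  'Ruby is a programming language',
  'The Rhine is a river in Europe',
  'Jazz is a genre of music',
  'The mammal order of rodents',
  'A river of music'
]

puts build_query(1, candidate_terms(DOCS[0], DOCS, 5))
puts build_query(1, candidate_terms(DOCS[0], DOCS, 2))
```

With a threshold of 5 nothing survives for the first document, so only
the exclusion clause is left, which matches the query I saw. Lowering
the threshold to 2 lets "mammal" through.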

Greetings
Florian Gilcher


Jens Kraemer wrote:
> Hi,
> 
> first of all, 6 documents is not really a corpus to judge the usability
> of more_like_this - by default it will only consider terms occurring in
> at least 5 documents to be of any relevance (:min_doc_freq option). So
> if you have very different documents where the only common words are
> filtered out as noise words, you'll end up without any terms to use
> for finding similar documents, which would lead to the query you
> mentioned. 
> 
> However more_like_this should indeed return an empty result set in this
> case ;-)
> 
> Besides that, you should store term vectors (give :term_vector => :yes
> for the fields you want to use more_like_this on in your call to
> acts_as_ferret), this will speed up the search for relevant terms.
> 
> 
> Jens
> 
> 
> On Tue, Jul 17, 2007 at 12:11:55PM +0200, Florian Gilcher wrote:
> Hi,
> 
> I have the following problem:
> 
> I created a fairly simple sample project to try out acts_as_ferret and
> present the results.
> 
> The test set is relatively simple: I have extracts from 6
> Wikipedia articles about several topics, which are copied into a model
> that has two fields: title and text. This works quite well, until I try
> to use #more_like_this, which returns all of the other articles, even if
> they have nothing to do with the active article. I debugged a bit and
> found out that the query built by #more_like_this is nothing more than
> "-id:<id of the active record>".
> (so the _result_ is correct)
> 
> To try that out on the console, I used:
> 
> entry = Entry.find(1)
> entry.more_like_this(:field_names => ['text'])
> 
> Either I'm doing something entirely wrong or there is a bug. ;) Before
> filing a ticket, I want to rule out the first case.
> 
> Ferret version is 0.11.4, aaf version is the current stable version
> (trunk didn't work either).
> 
> I uploaded the demo project together with a dump of the Database to:
> 
> Project: http://putstuff.putfile.com/95477/8752808
> Dump: http://putstuff.putfile.com/95479/6169502
> 
> Thanks in advance.
> Florian Gilcher
> 
> P.S.: There is another minor bug. Although #more_like_this does set a
> default option for :field_names (line #35), this option leads to a crash
> in #retrieve_terms. The default option is nil and #retrieve_terms thus
> tries to call #each on nil. (line #113)
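
One way such a crash could be avoided is to normalize the option before
iterating over it. This is only a sketch under my own assumptions:
normalize_field_names is a hypothetical helper, not something that
exists in acts_as_ferret, and I have not checked how the plugin
actually determines its default fields.

```ruby
# Hypothetical helper, not part of acts_as_ferret: turn the
# :field_names option into an array of strings that is always safe to
# iterate over. nil falls back to the given default fields, and a
# single name is wrapped in an array.
def normalize_field_names(field_names, default_fields)
  names = Array(field_names)                  # Array(nil) => []
  names = Array(default_fields) if names.empty?
  names.map(&:to_s)
end

# Calling #each on the result no longer fails when the option is nil.
normalize_field_names(nil, [:title, :text]).each { |name| puts name }
```

Array() is used here because it maps nil to an empty array and leaves
existing arrays untouched, so the caller does not need a separate nil
check.
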
_______________________________________________
Ferret-talk mailing list
[email protected]
http://rubyforge.org/mailman/listinfo/ferret-talk

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGnLDa8RlGMqQ8m7oRAvfwAJ9Tf3n8doy/EzkDS/Q4Mgf+WNTZZwCeMCnu
75or+J8oDXojyqO4oUzt3IY=
=uhKz
-----END PGP SIGNATURE-----