I updated the first section in main() with this:
store = build_index()
searcher = PyLucene.IndexSearcher(store)
foo = searcher.getSimilarity()
print foo
bar = SimilaritySansTF()
print bar
searcher.setSimilarity(bar)
foo = searcher.getSimilarity()
print foo
parser = PyLucene.QueryParser('_all_', PyLucene.StandardAnalyzer())
It's a sanity check to ensure that I'm creating an object of type
SimilaritySansTF, and that the setSimilarity() call worked.
The first three lines of output are:
---
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]
---
Now I'm thoroughly baffled. The bar variable is assigned directly from the
constructor call SimilaritySansTF(), and yet when I print bar, it still
identifies itself as type DefaultSimilarity.
I'm new to Python. Am I misunderstanding how inheritance works here?
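For comparison, here is a minimal pure-Python sketch of what I expected to
happen. It does not touch PyLucene at all: BaseSimilarity, SansTF and
MissingSelf are hypothetical stand-ins, and the square root in the base tf()
is only a placeholder. It uses print() calls so it also runs on newer Python
versions.

```python
# Pure-Python stand-ins; none of these classes come from PyLucene.
class BaseSimilarity(object):
    def tf(self, freq):
        # Placeholder base behaviour: score grows with term frequency.
        return float(freq) ** 0.5


class SansTF(BaseSimilarity):
    def tf(self, freq):
        # Override: ignore term frequency entirely.
        return 1.0


class MissingSelf(BaseSimilarity):
    def tf(freq):
        # Declared without 'self', like the tf() in my test script.
        return 1.0


sim = SansTF()
print(type(sim).__name__)               # name of the subclass, not the base
print(isinstance(sim, BaseSimilarity))  # still an instance of the base class
print(sim.tf(4))                        # the overridden method is the one called

try:
    MissingSelf().tf(4)
except TypeError as exc:
    # The instance is passed implicitly as the first argument, so the call
    # supplies one argument more than a method declared without 'self' accepts.
    print('TypeError: %s' % exc)
```

In plain Python the subclass instance reports its own class name and the
override is called. The sketch also suggests that tf(freq) in my script should
probably be tf(self, freq), although that alone would not explain why print bar
shows the wrong type. Whether PyLucene's wrapped classes follow the same rules
is the part I can't tell.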
-ofer
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Ofer Nave
> Sent: Thursday, March 15, 2007 3:46 PM
> To: list: pylucene-dev
> Subject: [pylucene-dev] why isn't my custom Similarity object
> changing the behavior?
>
> I'm just now starting to play with the scoring algorithm.
> The first change I want to make is to have the score ignore
> term frequency. I created this test script to validate my
> understanding of the API, but my custom Similarity class
> doesn't seem to affect the tf values in the output, and I
> can't figure out why. I've looked at the docs, the scoring
> page on the lucene site, and various archived posts, and I
> don't see anything I've done wrong.
>
> The print statement in tf() was to test if the overridden
> method is even getting called. It's not.
>
> ---
> import PyLucene
>
> def main():
>     store = build_index()
>     searcher = PyLucene.IndexSearcher(store)
>     searcher.setSimilarity(SimilaritySansTF())
>     parser = PyLucene.QueryParser('_all_',
>                                   PyLucene.StandardAnalyzer())
>
>     query = parser.parse('foo')
>     hits = searcher.search(query)
>
>     for i, doc in hits:
>         print '[%02d] %s (%0.2f)' % (i, doc.get('_all_'),
>                                      hits.score(i))
>         print '\t%s' % (searcher.explain(query, hits.id(i)))
>
> def build_index():
>     store = PyLucene.RAMDirectory()
>     writer = PyLucene.IndexWriter(store,
>                                   PyLucene.StandardAnalyzer(), True)
>
>     doc = PyLucene.Document()
>     doc.add(PyLucene.Field('_all_', 'foo bar bar',
>                            PyLucene.Field.Store.YES,
>                            PyLucene.Field.Index.TOKENIZED))
>     writer.addDocument(doc)
>
>     doc = PyLucene.Document()
>     doc.add(PyLucene.Field('_all_', 'foo foo bar',
>                            PyLucene.Field.Store.YES,
>                            PyLucene.Field.Index.TOKENIZED))
>     writer.addDocument(doc)
>
>     doc = PyLucene.Document()
>     doc.add(PyLucene.Field('_all_', 'foo bar',
>                            PyLucene.Field.Store.YES,
>                            PyLucene.Field.Index.TOKENIZED))
>     writer.addDocument(doc)
>
>     writer.optimize()
>     writer.close()
>
>     return store
>
> class SimilaritySansTF(PyLucene.DefaultSimilarity):
>     def tf(freq):
>         print 'freak out!'
>         return 1
>
> main()
> ---
>
> -ofer
>
> _______________________________________________
> pylucene-dev mailing list
> [email protected]
> http://lists.osafoundation.org/mailman/listinfo/pylucene-dev