On Tue, 18 Apr 2006, [EMAIL PROTECTED] wrote:

In PyLucene 1.9 (haven't checked the new 2.0 stuff yet), Hits.__iter__ uses a HitsEnumeration object that returns each document in the hits list. Would it make more sense for __iter__ to be implemented with the Hits.iterator() method (also not exposed in PyLucene 1.9)? Doing this would be a bit more consistent with what a Java Lucene user would expect, and it lets us get the score for each document as well as the document itself, using the clean Python looping syntax.

Currently:

hits = searcher.search(query)
for i in range(hits.length()):
    print hits.doc(i)
    print hits.score(i)

That loop works, but it is not how this is typically done in PyLucene. Since version 1.0, you've been able to write instead:

for i, doc in hits:
    print doc
    print hits.score(i)

This is documented in PyLucene's README file and is used in all the sample code.


Using a more Java-Lucene-ish way:

hits = searcher.search(query)
for hit in hits:
    print hit.getDocument()
    print hit.getScore()

Anyhow, I'd like it more that way...

In 2.0, I reconsidered this and realized that the new Java Lucene HitIterator was redundant with the way it's been done in PyLucene before, so I didn't change the interface to be more Java-Lucene-ish.

This being Python, I think it should be possible to create a tuple object with some additional methods on it so that you could say either:

for hit in hits:
    print hit.getDocument()
    print hit.getScore()

   or

for i, doc in hits:
    print doc
    print hits.score(i)
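
A rough sketch of how such a tuple-with-methods could look, in plain Python. This is not PyLucene code: `Hit` and `FakeHits` are hypothetical names made up for illustration, and `FakeHits` only imitates the `length()`, `doc(i)` and `score(i)` calls used above so that the idea can be tried without a search index.

```python
class Hit(tuple):
    """Hypothetical (index, document) pair that also carries Java-style accessors."""

    def __new__(cls, index, doc, score):
        # The tuple itself stays (index, doc), so "for i, doc in hits" still unpacks.
        self = super(Hit, cls).__new__(cls, (index, doc))
        # The score is kept as an extra attribute, outside the tuple contents.
        self._score = score
        return self

    def getDocument(self):
        return self[1]

    def getScore(self):
        return self._score


class FakeHits(object):
    """Stand-in for a Hits object (an assumption, not the real PyLucene class)."""

    def __init__(self, docs, scores):
        self._docs = list(docs)
        self._scores = list(scores)

    def length(self):
        return len(self._docs)

    def doc(self, i):
        return self._docs[i]

    def score(self, i):
        return self._scores[i]

    def __iter__(self):
        # Yield one Hit per result; each Hit supports both looping styles.
        for i in range(self.length()):
            yield Hit(i, self.doc(i), self.score(i))
```

With something like this, `for hit in hits:` and `for i, doc in hits:` would both work on the same object, since each `Hit` is still a two-element tuple underneath.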

Let me see ....

Andi..

_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev