On Tue, 18 Apr 2006, Andi Vajda wrote:
On Tue, 18 Apr 2006, [EMAIL PROTECTED] wrote:
In PyLucene 1.9 (haven't checked the new 2.0 stuff yet), Hits.__iter__ uses
a HitsEnumeration object that returns each document in the hits list.
Would it make more sense for __iter__ to be implemented with the
Hits.iterator() method (also not exposed in PyLucene 1.9)? Doing this
would be a bit more consistent with what a Java Lucene user would expect,
and it lets us get the score for each document as well as the document
itself, using the clean Python looping syntax.
currently:
hits = searcher.search(query)
for i in range(hits.length()):
    print hits.doc(i)
    print hits.score(i)
This looping syntax works, but it is not how it is typically done in
PyLucene. Since version 1.0, you've been able to write this instead:
for i, doc in hits:
    print doc
    print hits.score(i)
This is documented in PyLucene's README file and is used in all the sample
code.
Using a more Java-Lucene-ish way:
hits = searcher.search(query)
for hit in hits:
    print hit.getDocument()
    print hit.getScore()
Anyhow, I'd like it more that way...
In 2.0, I reconsidered this and realized that the new Java Lucene HitIterator
was redundant with the way it's been done in PyLucene before, so I didn't
change the interface to be more Java-Lucene-ish.
This being Python, I think it should be possible to create a tuple object
with some additional methods on it so that you could say either:
for hit in hits:
    print hit.getDocument()
    print hit.getScore()
or
for i, doc in hits:
    print doc
    print hits.score(i)
I made the requested change. Both of the forms above now work.
I just uploaded a 2.0rc1-7 source tarball to pylucene.osafoundation.org which
includes this.
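
For anyone wondering how one object can serve both loops, here is a minimal
pure-Python sketch of the tuple-with-methods idea. It is not PyLucene's actual
implementation: the names Hit and FakeHits, the _score attribute, and the
sample documents and scores are all made up for illustration. FakeHits only
mimics the doc(i), score(i) and length() calls used in this thread.

```python
class Hit(tuple):
    """Hypothetical (index, document) pair that also remembers its score."""

    def __new__(cls, index, doc, score):
        # The tuple holds only (index, doc), so "for i, doc in hits"
        # can still unpack it into two names.
        self = tuple.__new__(cls, (index, doc))
        # Assumption: the score is kept as an extra instance attribute.
        self._score = score
        return self

    def getDocument(self):
        return self[1]

    def getScore(self):
        return self._score


class FakeHits(object):
    """Hypothetical stand-in for a Hits object, backed by two plain lists."""

    def __init__(self, docs, scores):
        self._docs = list(docs)
        self._scores = list(scores)

    def length(self):
        return len(self._docs)

    def doc(self, i):
        return self._docs[i]

    def score(self, i):
        return self._scores[i]

    def __iter__(self):
        # Yield one Hit per result, in rank order.
        for i in range(self.length()):
            yield Hit(i, self._docs[i], self._scores[i])


hits = FakeHits(["doc-a", "doc-b"], [0.9, 0.4])

# Java-Lucene-ish form: one object per hit, with accessor methods.
by_method = [(hit.getDocument(), hit.getScore()) for hit in hits]

# Existing PyLucene form: unpack the index and the document.
by_unpack = [(doc, hits.score(i)) for i, doc in hits]
```

Because the tuple itself contains only the index and the document, the
two-name unpacking form keeps working, while the score rides along as an
extra attribute for the method-style form.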
Andi..