On 8/31/06, Kevin Ollivier <[EMAIL PROTECTED]> wrote:
> One thing I'd like to do with my indexes is provide a browsable list
> of various metadata fields, such as Subject, so that users could
> click on any subject in the index and get a list of documents which
> have that subject.

I do something similar. I found that using a MatchAllDocsQuery was
indeed too slow. Based on the Lucene in Action examples, I found that
using a term enumerator was much faster: on my index of over a million
documents, it took just a few seconds. Adapting the LIA example for
distance sorting, try this:

fieldName = 'subject'
uniqueFieldValues = set()
# Position the enumerator at the first term of the field.
enumerator = reader.terms(Term(fieldName, ""))
if reader.numDocs() > 0:
    termDocs = reader.termDocs()
    try:
        while True:
            term = enumerator.term()
            if term is None:
                raise RuntimeError("no terms in field %s" % fieldName)
            # Terms are ordered by field, so a new field means we are done.
            if term.field() != fieldName:
                break
            termDocs.seek(enumerator)
            # One matching document is enough to record the value.
            if termDocs.next():
                uniqueFieldValues.add(term.text())
            if not enumerator.next():
                break
    finally:
        termDocs.close()

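In case it helps to see why the enumerator is so much cheaper than
visiting every document, here is a toy model in plain Python. It is
only a sketch and not the PyLucene API: ToyIndex, unique_values and
docs_for are names I made up for illustration. The assumption is that
the index keeps its terms sorted by (field, text), so listing the
values of one field means seeking to (field, "") and walking forward
until the field changes, without touching any stored documents.

```python
from bisect import bisect_left


class ToyIndex(object):
    """Toy stand-in for an inverted index (hypothetical, not PyLucene)."""

    def __init__(self):
        self._postings = {}  # (field, text) -> set of document ids
        self._terms = []     # sorted (field, text) keys, rebuilt lazily
        self._dirty = False

    def add(self, doc_id, field, text):
        """Record that document doc_id has value text in field."""
        self._postings.setdefault((field, text), set()).add(doc_id)
        self._dirty = True

    def _sorted_terms(self):
        if self._dirty:
            self._terms = sorted(self._postings)
            self._dirty = False
        return self._terms

    def unique_values(self, field):
        """Walk the sorted term list, like the term enumerator loop."""
        terms = self._sorted_terms()
        # Seek to the first term at or after (field, "").
        pos = bisect_left(terms, (field, ""))
        values = []
        while pos < len(terms) and terms[pos][0] == field:
            values.append(terms[pos][1])
            pos += 1
        return values

    def docs_for(self, field, text):
        """Return the ids of documents having this field value."""
        return sorted(self._postings.get((field, text), ()))


index = ToyIndex()
index.add(1, "subject", "lucene")
index.add(2, "subject", "python")
index.add(3, "subject", "lucene")
index.add(3, "author", "kevin")

subjects = index.unique_values("subject")
lucene_docs = index.docs_for("subject", "lucene")
```

Here unique_values plays the role of the enumerator loop (it builds
the browsable list), and docs_for is the lookup you would do when a
user clicks on one of the subjects.
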
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev