Chris Hostetter wrote:
I don't really know what a "term matrix" is, but when you ask about
"weight", is it possible you are just looking for the TermDoc.freq() of the
term/doc pair?
Thank you Chris,
that was also my first idea. I wanted to get the document frequency
indexreader.docFreq(term)
and the term frequency
termdoc.freq()
to calculate the term weight myself.
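A minimal sketch of that manual calculation, assuming the formulas of Lucene's DefaultSimilarity (tf = sqrt(freq), idf = ln(numDocs / (docFreq + 1)) + 1) and ignoring length norms and boosts. The class and method names are made up for illustration, and the inputs stand in for the values returned by termdoc.freq(), indexreader.docFreq(term) and indexreader.numDocs():

```java
public class ManualTermWeight {
    // Term frequency factor, assumed to follow DefaultSimilarity: sqrt(freq).
    public static float tf(int freq) {
        return (float) Math.sqrt(freq);
    }

    // Inverse document frequency, assumed to follow DefaultSimilarity:
    // ln(numDocs / (docFreq + 1)) + 1.
    public static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Simplified weight of one term in one document: tf * idf.
    // Field norms, boosts and query normalization are deliberately left out.
    public static float weight(int freq, int docFreq, int numDocs) {
        return tf(freq) * idf(docFreq, numDocs);
    }

    public static void main(String[] args) {
        // Term occurs 4 times in the document and in 9 of 10 documents.
        System.out.println(weight(4, 9, 10)); // prints 2.0
    }
}
```

These formulas are hard-coded copies of the default ones, so they fall out of step as soon as a custom Similarity is used for searching.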
If I change the scoring by subclassing the Similarity class, I also have to
change my own code for the term weight calculation. The better way
would be to use the same scoring engine for both the single term weight and
the ranking of search results.
It seems that there is no simple function to ask for the weight of a term
in a document directly. So I decided not to iterate over the documents of a
term or the terms of a document. Instead, I iterate over the terms of the
index, search for each term, iterate over the result documents, and use the
score as my term weight for the document-term matrix:
    TermEnum terms = indexreader.terms();
    while (terms.next()) {
        Term term = terms.term();
        // write the term to the external document-term matrix
        Hits hits = indexsearcher.search(new TermQuery(term));
        for (int i = 0; i < hits.length(); i++) {
            Document doc = hits.doc(i);
            // write the document id (key, URL or index number) to the
            // document-term matrix
            float weight = hits.score(i);
            // write the term weight to the document-term matrix
        }
    }
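
The "write ... to the document-term matrix" comments in the loop above could be backed by a simple sparse structure. This is only a sketch with plain Java collections. The class name DocumentTermMatrix and its methods are hypothetical, not part of Lucene, and the document key is assumed to be a string such as a URL:

```java
import java.util.Map;
import java.util.TreeMap;

public class DocumentTermMatrix {
    // term text -> (document key -> weight); a missing cell means weight 0
    private final Map<String, Map<String, Float>> rows = new TreeMap<>();

    // Store the weight of one term in one document.
    public void put(String term, String docKey, float weight) {
        rows.computeIfAbsent(term, t -> new TreeMap<>()).put(docKey, weight);
    }

    // Look up a weight; returns 0 for a term/document pair never stored.
    public float get(String term, String docKey) {
        Map<String, Float> row = rows.get(term);
        if (row == null) {
            return 0f;
        }
        Float weight = row.get(docKey);
        return weight == null ? 0f : weight;
    }

    public static void main(String[] args) {
        DocumentTermMatrix matrix = new DocumentTermMatrix();
        // In the loop above these values would come from term.text(),
        // a stored key field of the document, and hits.score(i).
        matrix.put("lucene", "doc1", 0.8f);
        matrix.put("lucene", "doc2", 0.3f);
        System.out.println(matrix.get("lucene", "doc1"));
        System.out.println(matrix.get("lucene", "doc3"));
    }
}
```

Storing only the non-zero cells keeps the matrix small, since most terms occur in only a few documents.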
Sören