Chris Hostetter wrote:
I don't really know what a "term matrix" is, but when you ask about
"weight", is it possible you are just looking for the TermDocs.freq() of the
term/doc pair?

Thank you Chris,

that was also my first idea. I wanted to get the document frequency
        indexreader.docFreq(term)
and the term frequency
        termdoc.freq()
to calculate the term weight myself.
But if I change the scoring by subclassing the Similarity class, I have to change my term weight calculation as well. It would be better to use the same scoring engine for both the single term weights and the ranking of search results.
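
For illustration, the manual calculation could look roughly like the sketch below. It assumes the tf and idf factors of Lucene's DefaultSimilarity (tf = sqrt(freq), idf = ln(numDocs / (docFreq + 1)) + 1) and leaves out norms, boosts and query normalization. The class and method names are made up for this example and are not part of Lucene.

```java
// Sketch only: a hand-rolled tf-idf weight, mirroring the tf and idf
// factors of Lucene's DefaultSimilarity. Names here are hypothetical.
public class TermWeightSketch {

    // Term frequency factor: square root of the raw count in the document
    // (the value termdoc.freq() would return).
    public static float tf(int freq) {
        return (float) Math.sqrt(freq);
    }

    // Inverse document frequency factor: docFreq is the value
    // indexreader.docFreq(term) would return, numDocs the index size.
    public static float idf(int docFreq, int numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Combined weight of one term in one document (no norms or boosts).
    public static float weight(int freq, int docFreq, int numDocs) {
        return tf(freq) * idf(docFreq, numDocs);
    }

    public static void main(String[] args) {
        // A term occurring 4 times in a document and in 9 of 10 documents.
        System.out.println(weight(4, 9, 10));
    }
}
```

This is exactly the duplication described above: once a custom Similarity changes tf() or idf(), such a copy has to be changed too.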

It seems that there is no simple function that returns the weight of a term in a document directly. So I decided not to iterate over the documents of a term or the terms of a document. Instead, I iterate over the terms of the index, search for each term, iterate over the resulting documents, and use the score as the term weight in the document term matrix:

TermEnum terms=indexreader.terms();
while(terms.next()) {
  Term term=terms.term();
  // write the term to the external document term matrix
  Hits hits=indexsearcher.search(new TermQuery(term));
  for(int i=0; i<hits.length(); i++) {
    Document doc=hits.doc(i);
    // write the document id (key, URL or index number) to the document term matrix
    float weight=hits.score(i);
    // write the term weight to the document term matrix
  }
}
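
The "document term matrix" that the comments in the loop write to could, for example, be kept in memory as a sparse map. The following is only a sketch under that assumption: DocTermMatrix and its methods are hypothetical names, not part of Lucene, and the document key is whatever identifier is chosen (key, URL or index number).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: a sparse document term matrix. Cells that were never
// written are treated as weight 0.
public class DocTermMatrix {

    // term text -> (document key -> weight)
    private final Map<String, Map<String, Float>> rows = new TreeMap<>();

    // Store the weight of a term in a document, e.g. hits.score(i).
    public void put(String term, String docKey, float weight) {
        rows.computeIfAbsent(term, t -> new HashMap<>()).put(docKey, weight);
    }

    // Look up a weight; returns 0 if the term does not occur in the document.
    public float get(String term, String docKey) {
        Map<String, Float> row = rows.get(term);
        if (row == null) {
            return 0f;
        }
        Float weight = row.get(docKey);
        return weight == null ? 0f : weight;
    }
}
```

Inside the inner loop this would be a single call such as matrix.put(term.text(), key, weight), where key is the chosen document identifier.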

Sören
