Hello,
I would like to extract and store the document term matrix externally. I
iterate the terms and the documents for each term:
TermEnum terms=IndexReader.terms();
while(terms.next()) {
TermDocs docs=IndexReader.termDocs(terms.term());
while(docs.next()) {
//s
: To: java-user@lucene.apache.org
: Subject: How to get Term Weights (document term matrix)?
:
: Hello,
:
: I would like to extract and store the document term matrix externally. I
: iterate the terms and the documents for each term:
: TermEnum terms=IndexReader.terms();
: while(t
Chris Hostetter wrote:
I don't really know what a "term matrix" is, but when you ask about
"weight", is it possible you are just looking for the TermDocs.freq() of the
term/doc pair?
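To make the term/doc frequency idea concrete, here is a minimal sketch in plain Java that does not use Lucene at all. It assumes the documents are already tokenized into lists of strings, and the class name TermFrequencyMatrix and its build method are made up for illustration. In a real index the same pairs would come from walking the TermEnum and reading doc() and freq() from each TermDocs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene: builds a sparse document term
// matrix of raw term frequencies from documents that are already tokenized.
final class TermFrequencyMatrix {

    // Returns term -> (document number -> number of times the term occurs there).
    static Map<String, Map<Integer, Integer>> build(List<List<String>> docs) {
        Map<String, Map<Integer, Integer>> matrix = new TreeMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            for (String term : docs.get(docId)) {
                // Create the row for this term on first sight, then count the occurrence.
                matrix.computeIfAbsent(term, t -> new TreeMap<>())
                      .merge(docId, 1, Integer::sum);
            }
        }
        return matrix;
    }

    public static void main(String[] args) {
        // Assumed sample data: two tiny, already tokenized documents.
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("lucene", "index", "lucene"),
            Arrays.asList("index", "term"));

        // The outer loop plays the role of TermEnum, the inner loop of TermDocs.
        for (Map.Entry<String, Map<Integer, Integer>> row : build(docs).entrySet()) {
            for (Map.Entry<Integer, Integer> cell : row.getValue().entrySet()) {
                // One line per non-zero cell: term, document number, frequency.
                System.out.println(row.getKey() + "\t" + cell.getKey() + "\t" + cell.getValue());
            }
        }
    }
}
```

Only non-zero cells are stored, which matches what iterating the postings gives you. If "weight" is meant to be something like tf-idf rather than the raw count, it would have to be computed from these frequencies in a separate step.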
Thank you Chris,
that was also my first idea. I wanted to get the document frequency
indexreader.docFreq(
: It seems that there is no simple function to ask the weight for a term
: in a document directly. So I decide not to iterate the documents of a
As I said: it depends on what you mean by "term weight" ...
: term or the terms of a document. I'm iterating the terms of the index,
: searching for th
Chris Hostetter wrote:
You really, *REALLY* don't want to be doing this using the "Hits" class
like in your example ...
1) this will re-execute your search behind the scenes many many times
2) the scores returned by "Hits" are pseudo-normalized ... they will be
meaningless for any sort