Dear all,

We have a linguistics project running here and we want to use Lucene for the information retrieval. Rather than just searching for specific terms, we want to build frequency lists and detect co-occurrences of terms.

What we need is something like the following functionality (I give what I think the resulting API could look like):

1. IndexSearcher.search(query) (already implemented)
2. Hits.getLength() (already implemented)
3. for (...) Hits.doc(i).getTerms() or Hits.doc(i).getTerms(Field) (required)
4. (For each returned document, the frequency of each term. This is much the same as 3, or could it be retrieved together with the term list?)

In other words, when I get a Hits object back, I want to get, for every document in it, the terms it contains and their frequencies.

Sure, I could look each document up and parse it again. But if the first query produces, say, 20,000 hits, I would have to reparse those 20,000 documents, although that parsing was already done when the index was created.

So I wanted to ask whether there is a way, using the existing classes (or at least some of them plus a few new ones), to retrieve this information: which terms a single document is assigned to.

Thanks a lot for any help or hint.

Sincerely,
Chantal
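
P.S. To make the request more concrete, here is a minimal sketch of the behaviour I have in mind. It is not Lucene code and uses no Lucene classes: TermFrequencySketch and its methods addDocument, search, getTerms and frequencyList are hypothetical names, and the simple lower-casing tokenizer is an assumption. It only illustrates the idea of storing a per-document term/frequency map at index time, so that the terms of each hit can be read back without reparsing the document.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only, NOT the Lucene API.
class TermFrequencySketch {
    // Inverted index: term -> ids of the documents that contain it.
    private final Map<String, List<Integer>> postings = new HashMap<>();
    // Forward index: document id -> (term -> frequency in that document).
    private final List<Map<String, Integer>> docTerms = new ArrayList<>();

    // Tokenize once, at index time, and remember the per-document counts.
    int addDocument(String text) {
        int docId = docTerms.size();
        Map<String, Integer> freqs = new TreeMap<>();
        // Assumed tokenizer: lower-case, split on anything but letters/digits.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.isEmpty()) {
                continue;
            }
            freqs.merge(token, 1, Integer::sum);
        }
        docTerms.add(freqs);
        for (String term : freqs.keySet()) {
            postings.computeIfAbsent(term, k -> new ArrayList<>()).add(docId);
        }
        return docId;
    }

    // Single-term "query": returns the ids of the matching documents.
    List<Integer> search(String term) {
        List<Integer> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? Collections.<Integer>emptyList() : hits;
    }

    // The requested getTerms(): terms and frequencies of one document,
    // read from the stored map instead of reparsing the text.
    Map<String, Integer> getTerms(int docId) {
        return Collections.unmodifiableMap(docTerms.get(docId));
    }

    // Frequency list over a whole result set, summed across all hits.
    Map<String, Integer> frequencyList(List<Integer> hits) {
        Map<String, Integer> total = new TreeMap<>();
        for (int docId : hits) {
            for (Map.Entry<String, Integer> e : getTerms(docId).entrySet()) {
                total.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        TermFrequencySketch index = new TermFrequencySketch();
        index.addDocument("the cat sat on the mat");
        index.addDocument("the dog barked");
        index.addDocument("a bird sang");

        List<Integer> hits = index.search("the");
        System.out.println("hits: " + hits);
        System.out.println("frequency list: " + index.frequencyList(hits));
    }
}
```

The cost of this approach is extra storage per document, which is the trade-off against reparsing 20,000 hits for every query.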