Yes, the term frequency vector is exactly what I needed. Thanks!
-James
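
For anyone finding this thread later, here is a rough, Lucene-free sketch of what a term frequency vector holds: a sorted array of a document's distinct terms and a parallel array of counts. The class name TermVectorSketch and the crude lowercase-and-split tokenizer are assumptions made for illustration only. They are not part of the Lucene API, and a real analyzer may tokenize differently or drop stop words.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not a Lucene type.
public class TermVectorSketch {
    public final String[] terms; // distinct terms, sorted
    public final int[] freqs;    // freqs[i] = number of occurrences of terms[i]

    public TermVectorSketch(String text) {
        // Crude stand-in for an analyzer: lowercase, split on non-alphanumerics.
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        // TreeMap iterates keys and values in the same sorted order,
        // so the two arrays stay parallel.
        terms = counts.keySet().toArray(new String[0]);
        freqs = new int[terms.length];
        int i = 0;
        for (int count : counts.values()) {
            freqs[i++] = count;
        }
    }

    // Frequency of one term in this document, or 0 if it never occurs.
    public int frequencyOf(String term) {
        int idx = Arrays.binarySearch(terms, term.toLowerCase(Locale.ROOT));
        return idx >= 0 ? freqs[idx] : 0;
    }

    public static void main(String[] args) {
        TermVectorSketch doc =
                new TermVectorSketch("The cat sat. The dog sat on the cat.");
        System.out.println(doc.frequencyOf("cat"));  // 2
        System.out.println(doc.frequencyOf("the"));  // 3
        System.out.println(doc.frequencyOf("bird")); // 0
    }
}
```

The lookup is the same idea as matching a query term against getTerms() and reading the count at the same position in getTermFrequencies(), just without an index behind it.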
Ajay Lakhani wrote:
>
> Hi James,
>
> Try this:
>
> Searcher searcher = new IndexSearcher(dir);
> QueryParser parser = new QueryParser("content", new StandardAnalyzer());
> Query query = parser.parse(queryString);
>
> // Collect the terms in the query. extractTerms() expects a rewritten
> // query, so wildcard/prefix/fuzzy queries need query.rewrite(reader) first.
> Set queryTerms = new HashSet();
> query.extractTerms(queryTerms);
>
> Hits hits = searcher.search(query);
>
> IndexReader reader = IndexReader.open(dir);
>
> for (int i = 0; i < hits.length(); i++) {
>     Document d = hits.doc(i);
>     System.out.println("id is " + d.get("id"));
>     System.out.println("title is " + d.get("title"));
>
>     // Null unless "content" was indexed with term vectors
>     // (e.g. Field.TermVector.YES).
>     TermFreqVector tfv = reader.getTermFreqVector(hits.id(i), "content");
>     if (tfv == null) {
>         continue;
>     }
>     int[] freqs = tfv.getTermFrequencies(); // parallel to tfv.getTerms()
>
>     // for each term in the query
>     for (Iterator iter = queryTerms.iterator(); iter.hasNext();) {
>         Term term = (Term) iter.next();
>
>         // indexOf() returns -1 if the term is not in this document's vector
>         int j = tfv.indexOf(term.text());
>         if (j >= 0) {
>             System.out.println("frequency of term [" + term.text() + "] is " + freqs[j]);
>         }
>     }
> }
>
> reader.close();
> searcher.close();
>
> Let me know if this helps.
> Cheers
> AJ
>
> 2008/7/10 Karl Wettin <[EMAIL PROTECTED]>:
>
>> Maybe you are looking for the document's TermFreqVector?
>>
>>
>> karl
>>
>> On 9 Jul 2008, at 15:49, jnance wrote:
>>
>>
>>> Hi,
>>>
>>> I am indexing lots of text files and need to see how many times a certain
>>> word comes up in each text file. Right now I have this method for
>>> "search":
>>>
>>> static void search(Searcher searcher, String queryString)
>>>         throws ParseException, IOException {
>>>     QueryParser parser = new QueryParser("content", new StandardAnalyzer());
>>>     Query query = parser.parse(queryString);
>>>     Hits hits = searcher.search(query);
>>>
>>>     int hitCount = hits.length();
>>>     if (hitCount == 0) {
>>>         System.out.println("0 documents contain the word \"" + queryString + ".\"");
>>>     }
>>>     else {
>>>         System.out.println(hitCount + " documents contain the word \"" + queryString + ".\"");
>>>     }
>>> }
>>>
>>> This tells me how many documents contain the word I'm looking for... but how
>>> do I get it to tell me how many times the word occurs within that document?
>>>
>>> Thanks,
>>>
>>> James
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Searching-for-instances-within-a-document-tp18362075p18362075.html
>>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>>> For additional commands, e-mail: [EMAIL PROTECTED]
>>>
>>>
>>
>>
>>
>
>
--
View this message in context:
http://www.nabble.com/Searching-for-instances-within-a-document-tp18362075p18381743.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.