http://mail-archives.apache.org/mod_mbox/lucene-java-user/201010.mbox/%3c128
7065863.4cb7110774...@netmail.pipex.net%3e will give you a better idea of
what I'm moving towards.

It's all a bit grey at the moment so further investigation is inevitable.

I expect that a combination of MySQL database storage and Lucene indexing is
going to be the end result.



-----Original Message-----
From: Grant Ingersoll [mailto:gsing...@apache.org] 
Sent: 20 Oct 2010 21 20
To: java-user@lucene.apache.org
Subject: Re: Using a TermFreqVector to get counts of all words in a document


On Oct 20, 2010, at 2:53 PM, Martin O'Shea wrote:

> Uwe
> 
> Thanks - I figured that bit out. I'm a Lucene 'newbie'.
> 
> What I would like to know though is if it is practical to search a single
> document of one field simply by doing this:
> 
> IndexReader trd = IndexReader.open(index);
>        TermFreqVector tfv = trd.getTermFreqVector(docId, "title");
>        String[] terms = tfv.getTerms();
>        int[] freqs = tfv.getTermFrequencies();
>        for (int i = 0; i < tfv.getTerms().length; i++) {
>            System.out.println("Term " + terms[i] + " Freq: " + freqs[i]);
>        }
>        trd.close();
> 
> where docId is set to 0.
> 
> The code works but can this be improved upon at all?
> 
> My situation is where I don't want to calculate the number of documents
with
> a particular string. Rather I want to get counts of individual words in a
> field in a document. So I can concatenate the strings before passing it to
> Lucene.

Can you describe the bigger problem you are trying to solve?  This looks
like a classic XY problem: http://people.apache.org/~hossman/#xyproblem

What you are doing above will work OK for what you describe (up to the
"passing it to Lucene" part), but you probably should explore the use of the
TermVectorMapper which provides a callback mechanism (similar to a SAX
parser) that will allow you to build your data structures on the fly instead
of having to serialize them into two parallel arrays and then loop over
those arrays to create some other structure.


--------------------------
Grant Ingersoll
http://www.lucidimagination.com


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org

Reply via email to