=?iso-8859-1?Q?R=E9f=2E_=3A_Re=3A_IndexSearcher_and_number_of_occurence?=

Bertrand VENZAL Thu, 13 Jan 2005 07:18:16 -0800

Hi,

Thanks for your quick answer, I understood wot u meant by using the
indexSearcher to get the termFreqVector. But, you use an int as an id to
find the termFrequency so I suppose that it is the position number in the
IndexReader vector.
My problem is : during the indexing phase, I can store the id, but if a
document is deleted and recreated later on (like in an update), this will
change my vector and all the id's previously set will be no more correct.
Am i right on this point ? or am i missing something ...

thanks ...
|------------------------+------------------------+------------------------|
|                        |   Erik Hatcher         |                        |
|                        |   <[EMAIL PROTECTED]|                        |
|                        |   ns.com>              |      Pour :            |
|                        |   Envoyé par :         |                   "Luce|
|                        |   lucene-user-return-12|                   ne   |
|                        |   431-bertrand.venzal=c|                   Users|
|                        |   [EMAIL PROTECTED]|                   List"|
|                        |   e.org                |                   <luce|
|                        |                        |                   ne-us|
|                        |   13/01/2005 11:28     |                   [EMAIL 
PROTECTED]|
|                        |   Veuillez répondre à  |                   karta|
|                        |   "Lucene Users List"  |                   .apac|
|                        |                        |                   he.or|
|                        |                        |                   g>   |
|                        |                        |                        |
|                        |                        |                        |
|                        |                        |      cc :              |
|                        |                        |                        |
|                        |                        |                        |
|                        |                        |                        |
|                        |                        |                        |
|                        |                        |                        |
|                        |                        |      Objet :           |
|                        |                        |                   Re:  |
|                        |                        |                   Index|
|                        |                        |                   Searc|
|                        |                        |                   her  |
|                        |                        |                   and  |
|                        |                        |                   numbe|
|                        |                        |                   r of |
|                        |                        |                   occur|
|                        |                        |                   ence |
|                        |                        |                        |
|                        |                        |                        |
|------------------------+------------------------+------------------------|










On Jan 13, 2005, at 5:03 AM, Bertrand VENZAL wrote:

>
>
> Hi all,
>
> Im quite new in this mailing list. I ve many difficulties to find the
> number of a word (occurence) in a document, I need to use indexSearcher
> because of the query but the score returning is not wot i m looking
> for.
> I found in the mailing List the class TermDoc but it seems to work only
> with indexReader.
>
> If anyone can give a hand of this one, I will appreciate ...

Perhaps this technique is what you're looking for.... set the field(s)
you're interested in capturing frequency on to be vectored.  You'll see
that flag as additional overloaded methods on the Field.  You'll still
need to use an IndexReader, but that is no problem.  Construct an
IndexReader and use it to construct the IndexSearcher that you'll also
use.  Here's some snippets of code:

                // During indexing, "subject" field was added like this:
    doc.add(Field.UnStored("subject", subject, true));

... // now during searching...

    IndexReader reader = IndexReader.open(directory);

    ...
    // from your Hits, get the document id
    int id = hits.doc(i);

    TermFreqVector vector =
        reader.getTermFreqVector(id, "subject");

Now read up on the TermFreqVector API to get at the frequency of a
specific term.

                Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]







---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
=?iso-8859-1?Q?R=E9f=2E_=3A_Re=3A_IndexSearcher_and_number_of_occurence?=

Reply via email to