@All: Elaborating on the problem.
The phrase is being indexed as a single token.
I have a Gene tag in the XML document, which looks like this:
<Gene>brain natriuretic peptide </Gene>
This phrase is also present in the abstract text of the same document.
The code is:
doc.add(new Field("Gene", geneName, Field.Store.YES,
        Field.Index.ANALYZED, Field.TermVector.YES));
doc.add(new Field("Token", abstractText.toString().toLowerCase(),
        Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.YES));
When I retrieve all tokens and genes for a given doc and calculate
the tf for each of them, a null pointer exception is thrown,
for example with Term = brain natriuretic peptide:
TermDocs termDocs = indexReader.termDocs(term);
termDocs.next();
double tf = termDocs.freq();
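
To make clear what count I am after: the tf of the phrase should be the number of places in the abstract where its tokens occur one after another. A minimal sketch in plain Java, independent of Lucene. The class and method names are hypothetical, and the lowercase-and-split-on-whitespace tokenizer is only an assumption standing in for whatever the analyzer actually produces.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PhraseFrequency {

    // Lowercase and split on whitespace (an assumption, not the real analyzer).
    static List<String> tokenize(String text) {
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Count the positions where the phrase tokens appear consecutively in the text.
    static int phraseFrequency(String text, String phrase) {
        List<String> tokens = tokenize(text);
        List<String> wanted = tokenize(phrase);
        if (wanted.isEmpty() || wanted.size() > tokens.size()) {
            return 0;
        }
        int count = 0;
        for (int i = 0; i + wanted.size() <= tokens.size(); i++) {
            if (tokens.subList(i, i + wanted.size()).equals(wanted)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String abstractText =
            "Brain natriuretic peptide was measured and brain natriuretic peptide was elevated";
        // The trailing space mirrors the content of the Gene tag.
        System.out.println(phraseFrequency(abstractText, "brain natriuretic peptide ")); // prints 2
    }
}
```

With the tokens indexed separately, a lookup of the whole phrase as one term finds nothing, whereas the consecutive-position count above gives the value I expect.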
Regards,
Hrishi
Grant Ingersoll-6 wrote:
>
> When do you detect that they are phrases? During indexing or during
> search?
>
> On Jan 8, 2010, at 5:16 AM, hrishim wrote:
>
>>
>> Hi.
>> I have phrases like brain natriuretic peptide indexed as a single token
>> using Lucene.
>> When I calculate the term frequency for the same, the count is 0 since
>> the tokens from the text are indexed separately, i.e. brain, natriuretic,
>> peptide.
>> Is there a way to solve this problem and get the term frequency for the
>> entire phrase?
>>
>> Regards,
>> Hrishi
>> --
>> View this message in context:
>> http://old.nabble.com/Term-Frequency-for-phrases-tp27073866p27073866.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
>>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem using Solr/Lucene:
> http://www.lucidimagination.com/search
>
--
View this message in context:
http://old.nabble.com/Term-Frequency-for-phrases-tp27073866p27079648.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.