@All : Elaborating the problem The phrase is being indexed as a single token ... I have a Gene tag in the xml document which is like <Gene>brain natriuretic peptide </Gene> This phrase is present in the abstract text for the given document . Code is as :
doc.add(new Field("Gene", geneName, Field.Store.YES, Field.Index.ANALYZED,Field.TermVector.YES)); doc.add(new Field("Token", abstractText.toString().toLowerCase(), Field.Store.YES, Field.Index.ANALYZED,Field.TermVector.YES)); When I retrieve all tokens as well as genes for a given doc and calculate the tf for each of these , a null exception is thrown . Term = brain natriuretic peptide TermDocs termDocs = indexReader.termDocs(term); termDocs.next(); double tf = termDocs.freq(); Regards, Hrishi Grant Ingersoll-6 wrote: > > When do you detect that they are phrases? During indexing or during > search? > > On Jan 8, 2010, at 5:16 AM, hrishim wrote: > >> >> Hi . >> I have phrases like brain natriuretic peptide indexed as a single token >> using Lucene. >> When I calculate the term frequency for the same the count is 0 since >> the >> tokens from the text are indexed separately i.e. brain , natriuretic , >> peptide. >> Is there a way to solve this problem and get the term frequency for the >> entire phrase ? >> >> Regards, >> Hrishi >> -- >> View this message in context: >> http://old.nabble.com/Term-Frequency-for-phrases-tp27073866p27073866.html >> Sent from the Lucene - Java Users mailing list archive at Nabble.com. >> >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> > > -------------------------- > Grant Ingersoll > http://www.lucidimagination.com/ > > Search the Lucene ecosystem using Solr/Lucene: > http://www.lucidimagination.com/search > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > > -- View this message in context: http://old.nabble.com/Term-Frequency-for-phrases-tp27073866p27079648.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org