Get TermVectors for query hits only

2009-07-13 Thread Walter Ravenek

Hi all,

When I use the TermVectorComponent, I receive term vectors containing 
all tokens in the documents that meet my search criteria. I would like 
to get the offsets for just those terms in the documents that match 
the search criteria. My documents are about 200 K each and are in XML. 
If I had just the offsets for the hits, I could easily implement my 
own highlighting on the client side.


Does anyone know how to go about doing this?



Re: Get TermVectors for query hits only

2009-07-13 Thread Walter Ravenek

Thanks Grant,

I think I get the idea.


Grant Ingersoll wrote:
I seem to recall that the Highlighter in Solr is pluggable, so you may 
want to work at that level instead of on the client side.  Otherwise, 
you would likely have to implement your own TermVectorMapper and add 
that capability to the TermVectorComponent, which then feeds your client.


For an example of using TermVectorMapper that comes close to your 
problem without solving it exactly, see 
http://www.lucidimagination.com/blog/2009/05/26/accessing-words-around-a-positional-match-in-lucene/ 
Note that the example works at the Lucene level.
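
The filtering idea behind a custom mapper can be sketched roughly as 
follows. This is a plain-Java stand-in, not Lucene's TermVectorMapper 
API: the class QueryTermOffsetCollector, its Offset type, and the 
map(...) signature are hypothetical names chosen for illustration. It 
also assumes the query terms have already been analyzed the same way 
as the indexed tokens, since term vectors hold analyzed terms.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical collector that keeps offsets only for terms in the query.
 * It mimics the per-term callback style of a term vector mapper but does
 * not depend on Lucene or Solr.
 */
class QueryTermOffsetCollector {

    /** Start (inclusive) and end (exclusive) character offsets of one occurrence. */
    static final class Offset {
        final int start;
        final int end;

        Offset(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    private final Set<String> queryTerms;
    private final Map<String, List<Offset>> hits = new LinkedHashMap<>();

    /** queryTerms is assumed to contain already-analyzed terms. */
    QueryTermOffsetCollector(Set<String> queryTerms) {
        this.queryTerms = queryTerms;
    }

    /** Called once per term in the vector; terms outside the query are dropped. */
    void map(String term, List<Offset> offsets) {
        if (queryTerms.contains(term)) {
            hits.computeIfAbsent(term, k -> new ArrayList<>()).addAll(offsets);
        }
    }

    /** Offsets collected so far, keyed by query term. */
    Map<String, List<Offset>> getHits() {
        return hits;
    }
}
```

With something along these lines, only the offsets for the matching 
terms would need to be sent to the client, rather than the full term 
vector of each 200 K document.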



--
Grant Ingersoll
http://www.lucidimagination.com/




