Sebastian,

that sounds like an exciting project.



> We've found the argument "TokenGroup" in method "highlightTerm"
> implemented in SimpleHtmlFormatter. "TokenGroup" provides the method
> "getPayload()", but the returned value is always "NULL". 
> 
No, Token provides this method, not TokenGroup. But that might not be the
actual problem.

Hm, since this approach is very specialized, I would suggest doing
something easier.
You already have tools to retrieve each word and its position from the
image, right?

What if you added a field to schema.xml that holds a preprocessed input
string?

That is, you would have two fields:
"page's text" and "page's text's word-positions".

The "page's text's word-positions" field needs preprocessing outside of
Solr, where you add the coordinates of the words.

This preprocessing will be a little bit tricky.
If the 10th word is "Solr" and the 30th word is too, you do not want to
have "solr" twice with different coordinates.
Instead, you want to store both coordinates under the single term "solr".
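
A minimal sketch of that preprocessing step, in Python, just to illustrate
the grouping. The input shape (word, x, y, width, height), the function name
build_position_field and the "term|x,y,w,h;x,y,w,h" token layout are all
assumptions of mine, not anything Solr prescribes. It also assumes the field
is tokenized on whitespace only, so each token survives analysis intact.

```python
def build_position_field(words):
    """Group word boxes by lowercased term and emit one token per term.

    words: iterable of (text, x, y, width, height) in reading order.
    The "term|x,y,w,h;x,y,w,h" layout is a hypothetical format.
    """
    grouped = {}  # dicts keep insertion order (Python 3.7+)
    for text, x, y, w, h in words:
        term = text.lower()
        grouped.setdefault(term, []).append((x, y, w, h))

    tokens = []
    for term, boxes in grouped.items():
        # Join the four numbers of a box with ",", and boxes with ";".
        coords = ";".join(",".join(str(v) for v in box) for box in boxes)
        tokens.append(f"{term}|{coords}")
    return " ".join(tokens)


if __name__ == "__main__":
    page_words = [
        ("Solr", 10, 20, 40, 12),
        ("is", 55, 20, 15, 12),
        ("solr", 10, 60, 40, 12),
    ]
    print(build_position_field(page_words))
```

With this layout, "solr" appears once in the field and carries both boxes.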

On the Solr side, you can then add this preprocessed string to a field
with TermVectors enabled.
If your query hits the page, you will get back all the coordinates you
need.
Unfortunately, the highlighting itself must be done on the client side.
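
For the client side, a rough sketch of how the coordinates for the query
terms could be looked up again. It assumes the client receives the
preprocessed field content as a whitespace-separated string of
"term|x,y,w,h;x,y,w,h" tokens (a hypothetical layout, not a Solr format);
the function names are made up for illustration, and drawing the boxes on
the image is left to whatever viewer you use.

```python
def parse_position_field(field_value):
    """Turn 'term|x,y,w,h;x,y,w,h ...' into {term: [(x, y, w, h), ...]}."""
    index = {}
    for token in field_value.split():
        term, _, coords = token.partition("|")
        boxes = [
            tuple(int(v) for v in box.split(","))
            for box in coords.split(";")
            if box
        ]
        index[term] = boxes
    return index


def boxes_for_query(field_value, query_terms):
    """Return the boxes to highlight for each (lowercased) query term."""
    index = parse_position_field(field_value)
    return {t.lower(): index.get(t.lower(), []) for t in query_terms}


if __name__ == "__main__":
    field = "solr|10,20,40,12;10,60,40,12 is|55,20,15,12"
    print(boxes_for_query(field, ["Solr"]))
```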

Hope this helps
- Mitch
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-1-4-Image-Highlighting-and-Payloads-tp919266p919342.html
Sent from the Solr - User mailing list archive at Nabble.com.
