Hi MFM,

This comes down to a preprocessing step that you would have to do before handing the text to Lucene, although I suppose you might be able to identify the question/answer pairs during analysis and use the TeeTokenFilter and the SinkTokenizer. Once you have extracted them, you can add them as fields on a Document. I know that's not a great help, but there is not much Lucene can do here because the grouping is application specific.

Document/field wise, I would probably have:
Document
   question
   answer

Then, when you search in the question field, you can also retrieve the answer from the same hit, as long as the answer field is stored.
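
To make the layout above concrete, here is a rough sketch of the preprocessing side in plain Java. It deliberately does not call Lucene: the class QaPairing, its two methods, and the sample rows are all made up for illustration, and each Map stands in for one Lucene Document with a "question" and an "answer" field. It also assumes you have already pulled each table row out of the Word file as a two-cell array, which is the application-specific part.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene. Each Map below plays the role of
// one Document; in a real indexer you would add the two values as fields on
// a Lucene Document instead of putting them in a Map.
class QaPairing {

    // Turn two-cell table rows (question, answer) into one field map per row.
    static List<Map<String, String>> toDocs(List<String[]> rows) {
        List<Map<String, String>> docs = new ArrayList<>();
        for (String[] row : rows) {
            if (row == null || row.length < 2) {
                continue; // not a question/answer pair, so skip it
            }
            Map<String, String> doc = new LinkedHashMap<>();
            doc.put("question", row[0].trim());
            doc.put("answer", row[1].trim());
            docs.add(doc);
        }
        return docs;
    }

    // Naive stand-in for a search on the question field: a case-insensitive
    // substring match that returns the answer stored alongside each hit.
    static List<String> answersFor(List<Map<String, String>> docs, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<String> answers = new ArrayList<>();
        for (Map<String, String> doc : docs) {
            if (doc.get("question").toLowerCase(Locale.ROOT).contains(needle)) {
                answers.add(doc.get("answer"));
            }
        }
        return answers;
    }

    public static void main(String[] args) {
        // Made-up sample rows, standing in for cells extracted from a table.
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"What is the warranty period?", "Two years."});
        rows.add(new String[] {"Who do I contact for support?", "The help desk."});

        List<Map<String, String>> docs = toDocs(rows);
        System.out.println(answersFor(docs, "warranty"));
    }
}
```

The point is only that the pairing happens before indexing: once question and answer travel together in one document, a match on the question gives you the answer for free.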

-Grant

On Mar 24, 2009, at 4:04 PM, MFM wrote:


I have been able to successfully index and search text from structured
documents like PDF and MS Word. I am having a really hard time trying to
figure out how to group the indexed strings together. For example, if my
document had a question and answer in a table, the search will return the
text of the question based on the keyword. How would I group or associate
the question and answer as part of the indexing? I have tried using POI to
read through the MS Word file and group them, but then it gets really
intense, with a lot of pattern matching.

Thanks
MFM
--
View this message in context: 
http://www.nabble.com/question-about-grouping-text-tp22682433p22682433.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org


