On Fri, Nov 20, 2009 at 9:13 PM, javaxmlsoapdev <vika...@yahoo.com> wrote:
> did you extend DIH to do this work? can you share code samples. I have
> similar requirement where I need to index database records and each record
> has a column with document path so need to create another index for
> documents (we allow users to search both index separately) in parallel with
> reading some meta data of documents from database as well. I have all sorts
> of different document formats to index. fyi; I am on solr 1.4.0. Any
> pointers would be appreciated.

He did not extend DIH for this. He extracted the text from his documents, saved it into files, and used XPathEntityProcessor (you can use PlainTextEntityProcessor) to index them.

I don't know much about ExtractingRequestHandler, but if you want to use DIH, you'll have to extend it to add Tika support. You may want to look at a couple of open issues:

1. https://issues.apache.org/jira/browse/SOLR-1358
2. https://issues.apache.org/jira/browse/SOLR-1583

-- 
Regards,
Shalin Shekhar Mangar.
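
For reference, a minimal sketch of what the file-based approach could look like in a DIH data-config.xml on Solr 1.4. This is not the configuration from the original setup. The JDBC driver and URL, the table and column names (records, id, title, doc_path), the target field name (content), and the convention that the extracted text sits next to each document with ".txt" appended are all assumptions made for illustration. It also feeds a single index, whereas the question asks for two.

```xml
<dataConfig>
  <!-- Hypothetical database connection; substitute your own driver and URL. -->
  <dataSource name="db" type="JdbcDataSource"
              driver="org.postgresql.Driver"
              url="jdbc:postgresql://localhost/mydb"
              user="solr" password="secret"/>
  <!-- Reads the text files that were extracted ahead of time. -->
  <dataSource name="files" type="FileDataSource" encoding="UTF-8"/>

  <document>
    <!-- Outer entity: one Solr document per database row. -->
    <entity name="rec" dataSource="db"
            query="SELECT id, title, doc_path FROM records">
      <field column="id" name="id"/>
      <field column="title" name="title"/>

      <!-- Inner entity: loads the pre-extracted text for this row.
           Assumes the text was saved as "<doc_path>.txt". -->
      <entity name="text" processor="PlainTextEntityProcessor"
              dataSource="files" url="${rec.doc_path}.txt">
        <!-- PlainTextEntityProcessor exposes the file body as "plainText". -->
        <field column="plainText" name="content"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

The nested entity runs once per database row, so the metadata columns and the file contents end up in the same Solr document. Extracting the text from the various document formats still has to happen outside DIH in this setup.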