Re: How to use DataImportHandler with ExtractingRequestHandler?

Shalin Shekhar Mangar Tue, 24 Nov 2009 01:55:22 -0800

On Fri, Nov 20, 2009 at 9:13 PM, javaxmlsoapdev <vika...@yahoo.com> wrote:


>
> Did you extend DIH to do this work? Can you share code samples? I have a
> similar requirement: I need to index database records, and each record
> has a column with a document path, so I need to create another index for
> the documents (we allow users to search both indexes separately) in parallel
> with reading some metadata of the documents from the database. I have all
> sorts of different document formats to index. FYI, I am on Solr 1.4.0. Any
> pointers would be appreciated.
>
>
He did not extend DIH for this. He extracted the text from his documents,
saved it into files, and used XPathEntityProcessor to index them (you could
use PlainTextEntityProcessor instead).
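
To make that approach concrete, here is a rough sketch (not his actual
setup) of a small Python helper that writes out a DIH data-config with a
database entity and a nested PlainTextEntityProcessor entity reading the
pre-extracted text file for each row. The data source names ("db", "files"),
the entity names, the target field "text" and the path column name are all
made up for illustration; adjust them to your schema. The "plainText" column
is the one PlainTextEntityProcessor produces.

```python
import xml.etree.ElementTree as ET


def build_data_config(jdbc_url, driver, query, path_column):
    """Return a DIH data-config.xml string with a nested plain-text entity."""
    root = ET.Element("dataConfig")
    # Database source for the metadata rows (driver and URL are placeholders).
    ET.SubElement(root, "dataSource", name="db", type="JdbcDataSource",
                  driver=driver, url=jdbc_url)
    # File source used to read the text files extracted beforehand.
    ET.SubElement(root, "dataSource", name="files", type="FileDataSource")
    doc = ET.SubElement(root, "document")
    # Outer entity: one Solr document per database row.
    row = ET.SubElement(doc, "entity", name="record", dataSource="db",
                        query=query)
    # Inner entity: the text file for that row, located via the path column.
    # Assumption: the column holds the path of the extracted .txt file.
    text = ET.SubElement(row, "entity", name="body",
                         processor="PlainTextEntityProcessor",
                         dataSource="files",
                         url="${record.%s}" % path_column)
    # PlainTextEntityProcessor exposes the file contents as 'plainText'.
    ET.SubElement(text, "field", column="plainText", name="text")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_data_config("jdbc:hsqldb:mem:demo", "org.hsqldb.jdbcDriver",
                            "SELECT id, text_path FROM docs", "text_path"))
```

Save the printed XML as the data-config file referenced by your
DataImportHandler. Note that the column name inside ${record...} may need to
match the case your JDBC driver returns.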

I don't know much about ExtractingRequestHandler, but if you want to use DIH,
you'll have to extend it to add Tika support. You may want to look at a
couple of open issues:

   1. https://issues.apache.org/jira/browse/SOLR-1358
   2. https://issues.apache.org/jira/browse/SOLR-1583

-- 
Regards,
Shalin Shekhar Mangar.
