Re: Data Import Handler Question

Shawn Heisey Thu, 24 Apr 2014 08:48:12 -0700

On 4/24/2014 9:24 AM, Yuval Dotan wrote:

I want to use the DIH component to import data from an old PostgreSQL
DB.
I want to be able to recover from errors and crashes.
If an error occurs, I should be able to restart and continue indexing from
where it stopped.
Is the DIH good enough for my requirements?
If not, is it possible to extend one of its classes to support the
recovery?

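The entity in the Data Import Handler (DIH) config has an "onError" attribute. It accepts "abort" (the default), "skip", or "continue", which controls what happens when a document fails during the import.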


http://wiki.apache.org/solr/DataImportHandler#Schema_for_the_data_config
https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler#UploadingStructuredDataStoreDatawiththeDataImportHandler-EntityProcessors

But honestly, if you want a really robust Java program that indexes to Solr and does precisely what you want, you may be better off writing it yourself using SolrJ and JDBC. DIH is powerful and efficient, but when you write the program yourself, you can do anything you want with your data.

You also have the possibility of resuming an import after a Solr crash. Because DIH is embedded in Solr and doesn't save any kind of state data about an import in progress, that's pretty much impossible with DIH. With a SolrJ program, you'd have to handle that yourself, but it would be *possible*.


https://cwiki.apache.org/confluence/display/solr/Using+SolrJ
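
To give a rough idea of what "handle that yourself" could look like, here is a minimal sketch of a checkpointed import loop. Everything in it is hypothetical: the class name ResumableImport, the Source and Sink interfaces, and the checkpoint file are placeholders I made up, not anything from DIH or SolrJ. It assumes the table has a positive, increasing numeric primary key. In a real program the Source would be a JDBC query along the lines of "WHERE id > ? ORDER BY id LIMIT ?" and the Sink would add the documents through SolrJ and commit.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

/** Hypothetical helper; none of these names come from DIH or SolrJ. */
final class ResumableImport {

    /** Returns up to 'limit' row ids greater than 'afterId', ascending (e.g. via JDBC). */
    interface Source {
        List<Long> fetchIdsAfter(long afterId, int limit) throws Exception;
    }

    /** Indexes one batch and makes it durable (e.g. SolrJ add followed by commit). */
    interface Sink {
        void indexBatch(List<Long> ids) throws Exception;
    }

    /** Reads the last indexed id, or 0 if no checkpoint has been written yet. */
    static long readCheckpoint(Path file) throws IOException {
        if (!Files.exists(file)) {
            return 0L;
        }
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
        return text.isEmpty() ? 0L : Long.parseLong(text);
    }

    /** Writes to a temp file first, then moves it, so a crash mid-write is less likely to leave a truncated checkpoint. */
    static void writeCheckpoint(Path file, long lastId) throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        Files.write(tmp, Long.toString(lastId).getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Imports everything after the stored checkpoint; returns the last id indexed. */
    static long run(Source source, Sink sink, Path checkpoint, int batchSize) throws Exception {
        long lastId = readCheckpoint(checkpoint);
        while (true) {
            List<Long> batch = source.fetchIdsAfter(lastId, batchSize);
            if (batch.isEmpty()) {
                return lastId;
            }
            sink.indexBatch(batch);               // may throw; checkpoint stays at the previous batch
            lastId = batch.get(batch.size() - 1);
            writeCheckpoint(checkpoint, lastId);  // only advance after the batch succeeded
        }
    }
}
```

The checkpoint only moves forward after a batch has been indexed, so a crash means the last batch may be sent again on restart. That is harmless as long as the Solr uniqueKey matches the database key, because re-adding a document with the same key overwrites the earlier copy.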

Thanks,
Shawn
