We do this in production and haven't had any issues.  Ours is a Solr 1.4.1 
installation, which predates the "threads" option in DIH.  We divide the 
index into 8 parts and then run 8 DIH handlers at the same time, indexing 
simultaneously.  While Lucene itself is a bottleneck, we have a lot of data 
sources that DIH has to join, plus transformers, etc., so running multiple 
DIH handlers at once gives us noticeably better throughput.

One annoyance: because of how DIH is designed, you need a separate handler 
set up in solrconfig.xml for each DIH instance you plan to run.  So you have 
to plan in advance how many DIH instances you want to run, which config files 
they'll use, etc.

The other thing is that you want to avoid the scenario where more than one 
DIH handler finishes around the same time and they commit on top of one 
another (or worse, optimize).  Because in our case we split the work into 
equal parts, we just turned off the commit in DIH and do one big commit at 
the end, once all 8 DIH handlers are done running.
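
For anyone who wants to script this, here is a rough sketch of the approach 
described above: start every handler with commit and optimize turned off, 
poll each one until it reports idle, then send a single commit.  This is not 
our actual script.  The core URL and the handler names (/dataimport1 through 
/dataimport8) are made up for illustration, and the clean=false parameter is 
my assumption that you don't want each full-import wiping the index before it 
starts.

```python
import json
import time
import urllib.parse
import urllib.request

# Hypothetical values: adjust to match your own solrconfig.xml.
SOLR_CORE_URL = "http://localhost:8983/solr/core0"
HANDLERS = [f"/dataimport{i}" for i in range(1, 9)]


def import_url(core_url, handler):
    """Build a full-import URL with commit and optimize turned off."""
    params = {
        "command": "full-import",
        "clean": "false",     # assumption: keep the other handlers' documents
        "commit": "false",    # no per-handler commit
        "optimize": "false",  # no per-handler optimize
    }
    return f"{core_url}{handler}?{urllib.parse.urlencode(params)}"


def status_url(core_url, handler):
    """Build the URL that asks one handler for its status as JSON."""
    params = {"command": "status", "wt": "json"}
    return f"{core_url}{handler}?{urllib.parse.urlencode(params)}"


def is_idle(status_response):
    """True when a parsed status response reports the handler as idle."""
    return status_response.get("status") == "idle"


def run_all(core_url=SOLR_CORE_URL, handlers=HANDLERS, poll_seconds=30):
    """Start every handler, wait for all of them, then commit once."""
    for handler in handlers:
        urllib.request.urlopen(import_url(core_url, handler)).read()

    pending = set(handlers)
    while pending:
        time.sleep(poll_seconds)
        for handler in list(pending):
            with urllib.request.urlopen(status_url(core_url, handler)) as resp:
                if is_idle(json.load(resp)):
                    pending.discard(handler)

    # One big commit after the last handler has finished.
    commit = urllib.request.Request(
        f"{core_url}/update",
        data=b"<commit/>",
        headers={"Content-Type": "text/xml"},
    )
    urllib.request.urlopen(commit).read()


if __name__ == "__main__":
    run_all()
```

Note that "idle" only tells you a handler has stopped, not that its import 
succeeded, so you would probably want to inspect the rest of each status 
response before issuing the final commit.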

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311

-----Original Message-----
From: Shawn Heisey [mailto:s...@elyograg.org] 
Sent: Monday, January 09, 2012 1:26 PM
To: solr-user@lucene.apache.org
Subject: Multiple dataimport processes to same core?

Is it safe or advisable to run multiple dataimport handler requests on 
one Solr core simultaneously?

Thanks,
Shawn
