Hi Costi, I'd recommend SolrJ and parallelizing the inserts. It also helps to set reasonable commit intervals.
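A rough sketch of what "parallelize the inserts" could look like: split the documents into batches and submit each batch to a thread pool. In real SolrJ code the `send` callback would wrap `CloudSolrClient.add(batch)` (and you'd let autoCommit handle commits rather than committing per batch); it's stubbed here with a counter so the sketch is self-contained. Batch size and thread count below are placeholder values to tune against your own setup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

public class ParallelIndexer {
    // Split docs into fixed-size batches and hand each batch to a worker
    // thread. In a real SolrJ setup, `send` would call
    // CloudSolrClient.add(batch); here it is a pluggable callback so the
    // sketch runs without a Solr server.
    static <T> int indexInParallel(List<T> docs, int batchSize, int threads,
                                   Consumer<List<T>> send) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        int batches = 0;
        for (int i = 0; i < docs.size(); i += batchSize) {
            List<T> batch = docs.subList(i, Math.min(i + batchSize, docs.size()));
            pool.submit(() -> send.accept(batch));
            batches++;
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
        return batches;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for a day's worth of documents.
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < 1000; i++) docs.add(i);
        AtomicInteger sent = new AtomicInteger();
        int batches = indexInParallel(docs, 100, 4, b -> sent.addAndGet(b.size()));
        System.out.println(batches + " batches, " + sent.get() + " docs");
    }
}
```

Sending batches (rather than one document per request) and keeping a handful of concurrent clients is usually where most of the win comes from; past a point, more threads just contend on the Solr side.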
Just to get a better perspective:
* Why do you want to do a full index every day?
* How much data are we talking about?
* What's your SolrCloud setup like?
* Do you already have some benchmarks which you're not happy with?

On Thu, May 1, 2014 at 1:47 PM, Costi Muraru <costimur...@gmail.com> wrote:

> Hi guys,
>
> What would you say is the fastest way to import data into SolrCloud?
> Our use case: each day, do a single import of a large number of documents.
>
> Should we use SolrJ/DataImportHandler/other? Or perhaps is there a bulk
> import feature in Solr? I came upon this promising link:
> http://wiki.apache.org/solr/UpdateCSV
> Any idea how UpdateCSV compares performance-wise with
> SolrJ/DataImportHandler?
>
> If SolrJ, should we split the data into chunks and start multiple clients
> at once? That way we could perhaps take advantage of the number of
> servers in the SolrCloud configuration?
>
> Either way, after the import is finished, should we do an optimize, a
> commit, or neither (
> http://wiki.solarium-project.org/index.php/V1:Optimize_command)?
>
> Any tips and tricks to perform this process the right way are gladly
> appreciated.
>
> Thanks,
> Costi

--
Anshum Gupta
http://www.anshumgupta.net