If you build your index in Hadoop, read this (it covers Cloudera Search, but as far as I understand it should also work with the Solr Hadoop contrib, available since 4.7):
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Search/latest/Cloudera-Search-User-Guide/csug_batch_index_to_solr_servers_using_golive.html
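If the import does not go through Hadoop, the "split the data in chunks and start multiple clients" idea from the question can be sketched roughly as below. This is only an illustration, not a benchmarked recipe: it assumes documents are sent as JSON arrays to Solr's /update handler, and the URL, collection name, batch size, worker count and all function names are made up for the example.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical endpoint: replace host and collection name with your own.
SOLR_UPDATE_URL = "http://localhost:8983/solr/collection1/update"


def chunks(docs, size):
    """Yield successive lists of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def post_batch(batch, url=SOLR_UPDATE_URL):
    """POST one batch as a JSON array to the update handler, without committing."""
    request = urllib.request.Request(
        url,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def commit(url=SOLR_UPDATE_URL):
    """Issue a single commit once every batch has been sent."""
    with urllib.request.urlopen(url + "?commit=true") as response:
        return response.status


def bulk_import(docs, batch_size=1000, workers=4, send=post_batch):
    """Send batches from several threads; returns one result per batch, in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, chunks(docs, batch_size)))
```

With a running Solr instance, calling bulk_import(docs) followed by one commit() would send everything and make it searchable with a single commit at the end instead of one per batch. The `send` parameter exists only so the batching can be tried without a server.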

On Thu, May 1, 2014 at 1:47 PM, Costi Muraru <costimur...@gmail.com> wrote:

> Hi guys,
>
> What would you say it's the fastest way to import data in SolrCloud?
> Our use case: each day do a single import of a big number of documents.
>
> Should we use SolrJ/DataImportHandler/other? Or perhaps is there a bulk
> import feature in SOLR? I came upon this promising link:
> http://wiki.apache.org/solr/UpdateCSV
> Any idea on how UpdateCSV is performance-wise compared with
> SolrJ/DataImportHandler?
>
> If SolrJ, should we split the data in chunks and start multiple clients at
> once? In this way we could perhaps take advantage of the multitude number
> of servers in the SolrCloud configuration?
>
> Either way, after the import is finished, should we do an optimize or a
> commit or none (
> http://wiki.solarium-project.org/index.php/V1:Optimize_command)?
>
> Any tips and tricks to perform this process the right way are gladly
> appreciated.
>
> Thanks,
> Costi
>