[ https://issues.apache.org/jira/browse/NUTCH-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488744#comment-13488744 ]
Markus Jelsma commented on NUTCH-1480:
--------------------------------------

I think you mean the just-linked issue NUTCH-1377? I just happen to be working on that issue. Using CloudSolrServer will already send the docs to the correct shard. The way it works now is to send millions of records to a single node, which then distributes them again, a waste of IO. That issue will work with this issue, so I may want to push them in together. It's just a matter of returning the correct SolrServer instance and working around the HttpClient issues.

Agreed on deduplication. It would also be hard for this issue to work with dedup because not all indices may be identical.

> SolrIndexer to write to multiple servers.
> -----------------------------------------
>
>                 Key: NUTCH-1480
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1480
>             Project: Nutch
>          Issue Type: Improvement
>          Components: indexer
>            Reporter: Markus Jelsma
>            Assignee: Markus Jelsma
>            Priority: Minor
>             Fix For: 1.6
>
>         Attachments: NUTCH-1480-1.6.1.patch
>
>
> SolrUtils should return an array of SolrServers and read the SolrUrl as a
> comma-delimited list of URLs using Configuration.getString(). SolrWriter
> should be able to handle this list of SolrServers.
> This is useful if you want to send documents to multiple servers if no
> replication is available or if you want to send documents to multiple NOCs.
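
For illustration only, the idea in the issue description (read a comma-delimited list of URLs, then hand every document to each configured server) could be sketched roughly as below. This is not the attached patch. DocSink, MemorySink and MultiServerWriter are hypothetical names invented for this sketch; DocSink merely stands in for a SolrServer so the example runs without SolrJ or Hadoop on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a SolrServer; not part of Nutch or SolrJ. */
interface DocSink {
    void add(String doc);
}

/** In-memory sink used only to make the sketch runnable. */
final class MemorySink implements DocSink {
    final String url;
    final List<String> docs = new ArrayList<>();

    MemorySink(String url) {
        this.url = url;
    }

    @Override
    public void add(String doc) {
        docs.add(doc);
    }
}

/** Hypothetical writer that fans each document out to every sink. */
final class MultiServerWriter {
    private final List<DocSink> sinks;

    MultiServerWriter(List<DocSink> sinks) {
        this.sinks = sinks;
    }

    /** Splits a comma-delimited URL list, trimming whitespace and skipping empty entries. */
    static List<String> parseUrls(String value) {
        List<String> urls = new ArrayList<>();
        if (value == null) {
            return urls;
        }
        for (String part : value.split(",")) {
            String url = part.trim();
            if (!url.isEmpty()) {
                urls.add(url);
            }
        }
        return urls;
    }

    /** Sends the same document to every configured sink. */
    void write(String doc) {
        for (DocSink sink : sinks) {
            sink.add(doc);
        }
    }
}
```

In this sketch every server receives every document, which matches the "multiple servers without replication" use case in the description. It does not attempt the shard-aware routing that CloudSolrServer provides.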