[
https://issues.apache.org/jira/browse/NUTCH-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154354#comment-16154354
]
ASF GitHub Bot commented on NUTCH-1480:
---------------------------------------
lewismc commented on a change in pull request #218: fix for NUTCH-1480
contributed by r0ann3l
URL: https://github.com/apache/nutch/pull/218#discussion_r137121058
##########
File path:
src/plugin/indexer-solr/src/java/org/apache/nutch/indexwriter/solr/SolrIndexWriter.java
##########
@@ -75,19 +70,61 @@
private int totalUpdates = 0;
private boolean delete = false;
+ @Override
public void open(JobConf job, String name) throws IOException {
- solrClients = SolrUtils.getSolrClients(job);
- init(solrClients, job);
+ //Implementation not required
}
- // package protected for tests
- void init(List<SolrClient> solrClients, JobConf job) throws IOException {
- batchSize = job.getInt(SolrConstants.COMMIT_SIZE, 1000);
- solrMapping = SolrMappingReader.getInstance(job);
- delete = job.getBoolean(IndexerMapReduce.INDEXER_DELETE, false);
+ /**
+ * Initializes the internal variables from a given index writer configuration.
+ *
+ * @param parameters Params from the index writer configuration.
+ * @throws IOException Some exception thrown by writer.
+ */
+ @Override
+ public void open(Map<String, String> parameters) throws IOException {
+ String type = parameters.getOrDefault("type", "http");
+
+ String[] urls = StringUtils.getStrings(parameters.get("url"));
+
+ if (urls == null) {
+ String message = "Missing SOLR URL.\n" + describe();
+ LOG.error(message);
+ throw new RuntimeException(message);
+ }
+
+ this.solrClients = new ArrayList<>();
+
+ switch (type) {
+ case "http":
+ for (String url : urls) {
+ solrClients.add(SolrUtils.getHttpSolrClient(url));
+ }
+ break;
+ case "cloud":
+ for (String url : urls) {
+ CloudSolrClient sc = SolrUtils.getCloudSolrClient(url);
+ sc.setDefaultCollection(parameters.get(SolrConstants.COLLECTION));
+ solrClients.add(sc);
+ }
+ break;
+ case "concurrent":
Review comment:
Can you throw an unsupported-operation exception at this stage, and also add a default case?
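To make the request concrete, here is a minimal sketch of what such a switch could look like. It is not the code from the pull request: the class `ClientTypeSketch`, the method `describeClients`, and the exception messages are hypothetical, and plain strings stand in for the Solr clients so the example runs without Nutch or SolrJ on the classpath. The choice of `UnsupportedOperationException` and `IllegalArgumentException` is also an assumption about what "unsupported Exception" means here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the client-selection logic under review; not Nutch code.
final class ClientTypeSketch {

  /** Returns one label per URL, or fails loudly for unsupported or unknown types. */
  static List<String> describeClients(Map<String, String> parameters) {
    String type = parameters.getOrDefault("type", "http");
    String raw = parameters.get("url");
    if (raw == null || raw.trim().isEmpty()) {
      throw new IllegalArgumentException("Missing SOLR URL.");
    }

    List<String> clients = new ArrayList<>();
    switch (type) {
    case "http":
    case "cloud":
      // A real writer would build an HTTP or cloud client here.
      for (String url : raw.split(",")) {
        clients.add(type + ":" + url.trim());
      }
      break;
    case "concurrent":
      // Recognised but not implemented: reject explicitly instead of falling through.
      throw new UnsupportedOperationException(
          "The type \"concurrent\" is not supported yet.");
    default:
      // Any other value is most likely a configuration typo.
      throw new IllegalArgumentException("Unknown type: " + type);
    }
    return clients;
  }
}
```

With this shape, a misconfigured `type` fails when the writer is opened rather than leaving an empty client list that silently indexes nothing.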
> SolrIndexer to write to multiple servers.
> -----------------------------------------
>
> Key: NUTCH-1480
> URL: https://issues.apache.org/jira/browse/NUTCH-1480
> Project: Nutch
> Issue Type: Improvement
> Components: indexer
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Priority: Minor
> Attachments: adding-support-for-sharding-indexer-for-solr.patch,
> NUTCH-1480-1.6.1.patch
>
>
> SolrUtils should return an array of SolrServers and read the SolrUrl as a
> comma-delimited list of URLs using Configuration.getString(). SolrWriter
> should be able to handle this list of SolrServers.
> This is useful if you want to send documents to multiple servers when no
> replication is available, or if you want to send documents to multiple NOCs.
> edit:
> This does not replace NUTCH-1377 but complements it. With NUTCH-1377 this
> issue allows you to index to multiple SolrCloud clusters at the same time.
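The behaviour described in the issue can be sketched roughly as follows. This is only an illustration under stated assumptions: `MultiServerSketch`, `DocSink`, `parseUrls`, and `writeToAll` are hypothetical names, documents are plain strings, and no Solr or Hadoop classes are involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of writing each document to several servers; not Nutch code.
final class MultiServerSketch {

  /** Minimal stand-in for a per-server client that accepts documents. */
  interface DocSink {
    void add(String doc);
  }

  /** Splits a comma-delimited URL list, trimming whitespace and dropping empty entries. */
  static List<String> parseUrls(String commaDelimited) {
    List<String> urls = new ArrayList<>();
    if (commaDelimited == null) {
      return urls;
    }
    for (String part : commaDelimited.split(",")) {
      String url = part.trim();
      if (!url.isEmpty()) {
        urls.add(url);
      }
    }
    return urls;
  }

  /** Sends the same document to every configured sink. */
  static void writeToAll(List<? extends DocSink> sinks, String doc) {
    for (DocSink sink : sinks) {
      sink.add(doc);
    }
  }
}
```

Every sink receives every document, which matches the "no replication available" use case above; sharding documents across servers would need a different dispatch rule.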