[jira] [Commented] (NUTCH-2205) Nutch solrdedup error in solrcloud for larger docs

2017-01-09 Thread Furkan KAMACI (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15812265#comment-15812265
 ] 

Furkan KAMACI commented on NUTCH-2205:
--

[~VictorHu] Do you still get that error? The log says:

bq. No live SolrServers available
 
which suggests that your cluster was down, as [~markus17] pointed out.
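
One way to confirm whether the replicas named in the exception are reachable before re-running solrdedup is to request each core's ping handler. The following is only a minimal sketch under stated assumptions: the class name SolrReplicaPing and its helper methods are hypothetical, the replica URLs are copied from the log quoted below, and it assumes each core's solrconfig.xml enables the /admin/ping handler. It uses only the JDK, not SolrJ, and it checks HTTP reachability only, not the cluster state held in ZooKeeper.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Nutch or SolrJ.
public class SolrReplicaPing {

    // Builds the ping URL for a core URL.
    // Assumption: the core exposes the /admin/ping request handler.
    static String pingUrl(String coreUrl) {
        String base = coreUrl.endsWith("/")
                ? coreUrl.substring(0, coreUrl.length() - 1)
                : coreUrl;
        return base + "/admin/ping?wt=json";
    }

    // Returns true only if the ping handler answers HTTP 200 within the timeout.
    // Any I/O failure (connection refused, timeout, bad URL) counts as "down".
    static boolean isAlive(String coreUrl, int timeoutMillis) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(pingUrl(coreUrl)).openConnection();
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            conn.setRequestMethod("GET");
            return conn.getResponseCode() == HttpURLConnection.HTTP_OK;
        } catch (IOException e) {
            return false;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // Replica URLs as they appear in the reported exception.
        List<String> replicas = Arrays.asList(
                "http://10.192.1.100:8080/solr/myEnterpriseCollection_shard2_replica2",
                "http://10.192.1.101:8080/solr/myEnterpriseCollection_shard1_replica2",
                "http://10.192.1.103:8080/solr/myEnterpriseCollection_shard2_replica1",
                "http://10.192.1.102:8080/solr/myEnterpriseCollection_shard1_replica1");
        for (String replica : replicas) {
            System.out.println((isAlive(replica, 3000) ? "UP   " : "DOWN ") + replica);
        }
    }
}
```

If every replica prints DOWN, the failure is most likely on the Solr side rather than in the Nutch job, which would be consistent with the diagnosis above.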

> Nutch solrdedup error in solrcloud for larger docs 
> ---
>
> Key: NUTCH-2205
> URL: https://issues.apache.org/jira/browse/NUTCH-2205
> Project: Nutch
>  Issue Type: Bug
>  Components: indexer
>Affects Versions: 2.3
> Environment: CentOS 6.5, JDK 1.7.0_75, Tomcat 8.0.9, Hadoop 
> 2.5.2, ZooKeeper 3.4.6, HBase 0.98.8, Solr 4.8.1, Nutch 2.3.1
>Reporter: VictorHu
> Fix For: 2.5
>
>
> When the number of Solr docs is larger than 9000, the Nutch solrdedup job 
> fails. This is the log: 
> http://10.192.1.100:8080/solr/myEnterpriseCollection_shard2_replica2
> 16/01/25 17:02:38 INFO solr.SolrDeleteDuplicates: SolrDeleteDuplicates: 
> starting...
> 16/01/25 17:02:38 INFO solr.SolrDeleteDuplicates: SolrDeleteDuplicates: Solr 
> url: http://10.192.1.100:8080/solr/myEnterpriseCollection_shard2_replica2
> 16/01/25 17:02:39 INFO client.RMProxy: Connecting to ResourceManager at 
> master.Itble/10.192.1.100:8032
> 16/01/25 17:02:43 INFO mapreduce.JobSubmitter: number of splits:1
> 16/01/25 17:02:44 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
> job_1453104806095_0162
> 16/01/25 17:02:44 INFO impl.YarnClientImpl: Submitted application 
> application_1453104806095_0162
> 16/01/25 17:02:44 INFO mapreduce.Job: The url to track the job: 
> http://master.Itble:8088/proxy/application_1453104806095_0162/
> 16/01/25 17:02:44 INFO mapreduce.Job: Running job: job_1453104806095_0162
> 16/01/25 17:02:54 INFO mapreduce.Job: Job job_1453104806095_0162 running in 
> uber mode : false
> 16/01/25 17:02:54 INFO mapreduce.Job:  map 0% reduce 0%
> 16/01/25 17:03:02 INFO mapreduce.Job: Task Id : 
> attempt_1453104806095_0162_m_00_0, Status : FAILED
> Error: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: 
> org.apache.solr.client.solrj.SolrServerException: No live SolrServers 
> available to handle this 
> request:[http://10.192.1.100:8080/solr/myEnterpriseCollection_shard2_replica2,
>  http://10.192.1.101:8080/solr/myEnterpriseCollection_shard1_replica2, 
> http://10.192.1.103:8080/solr/myEnterpriseCollection_shard2_replica1]
> at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:554)
> at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
> at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
> at 
> org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:91)
> at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
> at 
> org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat.createRecordReader(SolrDeleteDuplicates.java:291)
> at 
> org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:492)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:735)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> 16/01/25 17:03:12 INFO mapreduce.Job: Task Id : 
> attempt_1453104806095_0162_m_00_1, Status : FAILED
> Error: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: 
> org.apache.solr.client.solrj.SolrServerException: No live SolrServers 
> available to handle this 
> request:[http://10.192.1.100:8080/solr/myEnterpriseCollection_shard2_replica2,
>  http://10.192.1.101:8080/solr/myEnterpriseCollection_shard1_replica2, 
> http://10.192.1.103:8080/solr/myEnterpriseCollection_shard2_replica1, 
> http://10.192.1.102:8080/solr/myEnterpriseCollection_shard1_replica1]
> at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:554)
> at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
> at 
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
> at 
> org.apache.solr.client.solrj.request.QueryRequest.process(QueryRequest.java:91)
> at org.apache.solr.client.solrj.SolrServer.query(SolrServer.java:301)
> at 
> 

[jira] [Commented] (NUTCH-2205) Nutch solrdedup error in solrcloud for larger docs

2016-01-25 Thread Markus Jelsma (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15114991#comment-15114991
 ] 

Markus Jelsma commented on NUTCH-2205:
--

This looks like your cluster was down, not a Nutch error.

> Nutch solrdedup error in solrcloud for larger docs 
> ---
>
> Key: NUTCH-2205
> URL: https://issues.apache.org/jira/browse/NUTCH-2205
> Project: Nutch
>  Issue Type: Bug
>  Components: indexer
>Affects Versions: 2.3
> Environment: CentOS 6.5, JDK 1.7.0_75, Tomcat 8.0.9, Hadoop 
> 2.5.2, ZooKeeper 3.4.6, HBase 0.98.8, Solr 4.8.1, Nutch 2.3.1
>Reporter: VictorHu
> Fix For: 2.4
>
>