Thank you, Markus - they are indeed set to 1024 for the hdfs user. We'll reconfigure limits.conf and try again.
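
For reference, what we're planning to add to /etc/security/limits.conf is roughly the following - the 65536 value is just our own starting point, not something recommended in this thread:

    # raise the open-file limit for the user that runs the HDFS datanode
    hdfs  soft  nofile  65536
    hdfs  hard  nofile  65536

We'll check it took effect with "ulimit -n" in a fresh login shell as the hdfs user, then restart the datanodes so the running processes pick up the new limit. If the datanode is started as a system service, the limit may also need to be raised in the service's own configuration.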
-Joe

On Tue, Jun 21, 2016 at 10:38 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:

> Hello Joseph,
>
> Your datanodes are in a bad state, you probably overwhelmed it when
> indexing. Check your max open files on those nodes. Usual default of 1024
> is way too low.
>
> Markus
>
>
> -----Original message-----
> > From: Joseph Obernberger <joseph.obernber...@gmail.com>
> > Sent: Monday 20th June 2016 19:36
> > To: solr-user@lucene.apache.org
> > Subject: All Datanodes are Bad
> >
> > Anyone ever seen an error like this? We are running using HDFS for the
> > index. At the time of the error, we are doing a lot of indexing.
> >
> > Two errors:
> > java.io.IOException: All datanodes DatanodeInfoWithStorage[172.16.100.220:50010,DS-4b806395-0661-4a70-a32b-deef82a85359,DISK] are bad. Aborting...
> >   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1357)
> >   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1119)
> >   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:622)
> >
> > and
> >
> > auto commit error...:org.apache.solr.common.SolrException: java.io.IOException: All datanodes DatanodeInfoWithStorage[172.16.100.220:50010,DS-4b806395-0661-4a70-a32b-deef82a85359,DISK] are bad. Aborting...
> >   at org.apache.solr.update.HdfsTransactionLog.close(HdfsTransactionLog.java:321)
> >   at org.apache.solr.update.TransactionLog.decref(TransactionLog.java:510)
> >   at org.apache.solr.update.UpdateLog.addOldLog(UpdateLog.java:372)
> >   at org.apache.solr.update.UpdateLog.postCommit(UpdateLog.java:668)
> >   at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:658)
> >   at org.apache.solr.update.CommitTracker.run(CommitTracker.java:217)
> >   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> >   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
> >   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
> >   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >   at java.lang.Thread.run(Thread.java:745)
> > Caused by: java.io.IOException: All datanodes DatanodeInfoWithStorage[172.16.100.220:50010,DS-4b806395-0661-4a70-a32b-deef82a85359,DISK] are bad. Aborting...
> >   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1357)
> >   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1119)
> >   at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:622)
> >
> > On the client side doing the indexing, we see:
> >
> > org.apache.solr.client.solrj.impl.CloudSolrClient$RouteException: Error from server at http://deimos:9100/solr/UNCLASS: java.io.IOException: All datanodes DatanodeInfoWithStorage[172.16.100.220:50010,DS-f0e14105-9557-4a59-8918-43724aaa8346,DISK] are bad. Aborting...
> >   at org.apache.solr.client.solrj.impl.CloudSolrClient.directUpdate(CloudSolrClient.java:632)
> >   at org.apache.solr.client.solrj.impl.CloudSolrClient.sendRequest(CloudSolrClient.java:981)
> >   at org.apache.solr.client.solrj.impl.CloudSolrClient.requestWithRetryOnStaleState(CloudSolrClient.java:870)
> >   at org.apache.solr.client.solrj.impl.CloudSolrClient.request(CloudSolrClient.java:806)
> >   at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:149)
> >   at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:106)
> >   at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:71)
> >   at org.apache.solr.client.solrj.SolrClient.add(SolrClient.java:85)
> >   .
> >   .
> >   .
> > Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://deimos:9100/solr/UNCLASS: java.io.IOException: All datanodes DatanodeInfoWithStorage[172.16.100.220:50010,DS-f0e14105-9557-4a59-8918-43724aaa8346,DISK] are bad. Aborting...
> >   at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:576)
> >   at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:240)
> >   at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:229)
> >   at org.apache.solr.client.solrj.impl.LBHttpSolrClient.doRequest(LBHttpSolrClient.java:372)
> >   at org.apache.solr.client.solrj.impl.LBHttpSolrClient.request(LBHttpSolrClient.java:325)
> >   at org.apache.solr.client.solrj.impl.CloudSolrClient$2.call(CloudSolrClient.java:607)
> >   at org.apache.solr.client.solrj.impl.CloudSolrClient$2.call(CloudSolrClient.java:604)
> >   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> >   at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:231)
> >
> > Thanks!
> >
> > -Joe
> >
>