Hi,

Glassfish 3.1.2.2, Solr 4.5, ZooKeeper 3.4.5

We have set up a SolrCloud cluster with 4 Solr nodes and 3 ZooKeeper instances. I start the cluster for the first time with bootstrap_conf=true, and all the nodes start properly. I then create cores (with the same name) on all 4 instances. I can add multiple cores on each of the instances; logically I have 5 collections. When I index documents, Solr automatically creates 4 copies of the index, one for each instance, in the appropriate SolrHome directory.

Everything works properly until I restart the Solr cluster. As soon as I restart it, it throws the error below and none of the collections work properly:

ERROR - 2013-10-31 19:23:24.411; org.apache.solr.core.CoreContainer; Unable to create core: xyz
org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:834)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:625)
    at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:256)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:557)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:241)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1477)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1589)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
    ... 13 more
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/mnt/emc/app_name/data-refresh/SolrCloud/SolrHome1/solr/xyz/data/index/write.lock
    at org.apache.lucene.store.Lock.obtain(Lock.java:84)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:673)
    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
    at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
    at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:267)
    at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:110)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1440)
    ... 15 more
ERROR - 2013-10-31 19:23:24.420; org.apache.solr.common.SolrException; null:org.apache.solr.common.SolrException: Unable to create core: xyz
    at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:936)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:568)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:241)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:834)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:625)
    at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:256)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:557)
    ... 10 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1477)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1589)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:821)
    ... 13 more
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/mnt/emc/app_name/data-refresh/SolrCloud/SolrHome1/solr/xyz/data/index/write.lock
    at org.apache.lucene.store.Lock.obtain(Lock.java:84)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:673)
    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
    at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
    at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:267)
    at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:110)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1440)
    ... 15 more
INFO - 2013-10-31 19:23:24.421; org.apache.solr.servlet.SolrDispatchFilter; user.dir=/usr/wbol/glassfish3/glassfish/nodes/localhost-domain1/SolrCloud_01/config
INFO - 2013-10-31 19:23:24.421; org.apache.solr.servlet.SolrDispatchFilter; SolrDispatchFilter.init() done
ERROR - 2013-10-31 19:23:24.556; org.apache.solr.update.SolrIndexWriter; SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
ERROR - 2013-10-31 19:23:24.558; org.apache.solr.update.SolrIndexWriter; Error closing IndexWriter, trying rollback
java.lang.NullPointerException
    at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:962)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:923)
    at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:885)
    at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:132)
    at org.apache.solr.update.SolrIndexWriter.finalize(SolrIndexWriter.java:185)
    at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
    at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:83)
    at java.lang.ref.Finalizer.access$100(Finalizer.java:14)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:160)
WARN - 2013-10-31 19:23:24.912; org.apache.solr.cloud.LeaderElector; Failed setting watch
org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /collections/xyz/leader_elect/shard1/election/234764442573733967-core_node2-n_0000000005
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1151)
    at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:252)
    at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:249)
    at org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
    at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:249)
    at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:117)
    at org.apache.solr.cloud.LeaderElector.access$000(LeaderElector.java:55)
    at org.apache.solr.cloud.LeaderElector$1.process(LeaderElector.java:129)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
-------------------------------------------

It continuously tries to recover but never succeeds. It also deletes collection xyz from ZooKeeper.

Some points to mention:

1. I have removed dataDir from solrconfig.xml, as suggested by Shaun here: http://lucene.472066.n3.nabble.com/Solr-4-3-0-Shard-instances-using-incorrect-data-directory-on-machine-boot-td4063799.html
2. I have provided an absolute dataDir path in the core.properties file (see https://issues.apache.org/jira/browse/SOLR-4878).
3. The instanceDir in each SolrHome has the same name for every core/collection, for example:
   SolrHome1/solr/xyz/conf
   SolrHome1/solr/xyz/data
   SolrHome1/solr/xyz/core.properties
   SolrHome1/solr/pqr/conf
   SolrHome1/solr/pqr/data
   SolrHome1/solr/pqr/core.properties
   SolrHome2/solr/xyz/conf
   SolrHome2/solr/xyz/data
   SolrHome2/solr/xyz/core.properties
   SolrHome2/solr/pqr/conf
   SolrHome2/solr/pqr/data
   SolrHome2/solr/pqr/core.properties
   ...
4. The 4 SolrHome directories, one for each instance, are on a single shared drive, but in different directories.
5. All my collections and cores share the same solrconfig.xml.

I have been stuck on this problem for a long time. Please help.

Thanks,
Kaustubh
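
P.S. Because every core.properties carries an absolute dataDir and the instance directories have identical names across the SolrHomes, one thing worth ruling out is two cores resolving to the same data directory (only one IndexWriter can hold write.lock on a given index). The following is a minimal Python sketch of such a check, not something taken from our deployment: the function names are made up for illustration, it assumes plain key=value lines in core.properties, and it assumes a missing dataDir defaults to "data" under the core's instance directory. A shared dataDir is only one possible cause of the lock timeout.

```python
import os
import sys
from collections import defaultdict


def read_properties(path):
    """Parse simple key=value lines from a Java-style .properties file."""
    props = {}
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.strip()
            # Skip blank lines and comment lines.
            if not line or line.startswith(("#", "!")):
                continue
            key, sep, value = line.partition("=")
            if sep:
                props[key.strip()] = value.strip()
    return props


def find_shared_data_dirs(solr_homes):
    """Return {resolved dataDir: [core.properties paths]} for every
    data directory that more than one core resolves to."""
    users = defaultdict(list)
    for home in solr_homes:
        for dirpath, _dirnames, filenames in os.walk(home):
            if "core.properties" not in filenames:
                continue
            prop_file = os.path.join(dirpath, "core.properties")
            # Assumption: a missing dataDir means "data" under the instance dir.
            data_dir = read_properties(prop_file).get("dataDir", "data")
            # os.path.join keeps an absolute dataDir as-is and resolves a
            # relative one against the core's instance directory.
            resolved = os.path.normpath(os.path.join(dirpath, data_dir))
            users[resolved].append(prop_file)
    return {d: files for d, files in users.items() if len(files) > 1}


if __name__ == "__main__":
    # Usage: python check_datadirs.py SolrHome1 SolrHome2 ...
    shared = find_shared_data_dirs(sys.argv[1:])
    for data_dir, files in sorted(shared.items()):
        print("dataDir used by more than one core:", data_dir)
        for prop_file in files:
            print("   ", prop_file)
    if not shared:
        print("No dataDir is shared between cores.")
```

Run with the four SolrHome directories as arguments, it prints each data directory that more than one core points at, along with the core.properties files involved.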