Hi,
we're running tests against a stand-alone Solr instance. The tests create
Solr cores from multiple applications using CoreAdmin (via SolrJ).
We recently upgraded from 8.4.1 to 8.6.3, and we now sometimes see a
LockObtainFailedException for a lock held by the same JVM, after which
Solr is broken and runs into NullPointerExceptions for simple CoreAdmin
STATUS requests. We then have to restart Solr. I've never seen this with
8.4.1 or previous releases.
This bug is quite severe for us because it breaks our system tests with
Solr, and we fear that it may also happen in production. Is this a known
bug?
Our applications use a CoreAdmin STATUS request to check whether a core
exists, followed by a CREATE request if the core does not exist. With
multiple applications and bad timing, two concurrent CREATE requests
for the same core are of course still possible. Solr 8.4.1 rejected
duplicate requests and logged ERRORs but kept working correctly [1]. I
can still see the same log messages in 8.6.3 ("Core with name ...
already exists" or "Error CREATEing SolrCore ... Could not create a new
core in ... as another core is already defined there") - but sometimes
also the following error, after which Solr is broken:
2020-10-27 00:29:25.350 ERROR (qtp2029754983-24) [ ] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Error CREATEing SolrCore 'blueprint_acgqqafsogyc_comments': Unable to create core [blueprint_acgqqafsogyc_comments] Caused by: Lock held by this virtual machine: /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1312)
    at org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:95)
    at org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:367)
    ...
Caused by: org.apache.solr.common.SolrException: Unable to create core [blueprint_acgqqafsogyc_comments]
    at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1408)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1273)
    ... 47 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1071)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:906)
    at org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1387)
    ... 48 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2184)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2308)
    at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1130)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:1012)
    ... 50 more
Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held by this virtual machine: /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock
    at org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:139)
    at org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41)
    at org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45)
    at org.apache.lucene.store.FilterDirectory.obtainLock(FilterDirectory.java:105)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:785)
    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:126)
    at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:100)
    at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:261)
    at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:135)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2145)
2020-10-27 00:29:25.353 INFO (qtp2029754983-19) [ ] o.a.s.s.HttpSolrCall [admin] webapp=null path=/admin/cores params={core=blueprint_acgqqafsogyc_comments&action=STATUS&indexInfo=false&wt=javabin&version=2} status=500 QTime=0
2020-10-27 00:29:25.353 ERROR (qtp2029754983-19) [ ] o.a.s.s.HttpSolrCall null:org.apache.solr.common.SolrException: Error handling 'STATUS' action
    at org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:372)
    at org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.call(CoreAdminHandler.java:397)
    at org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:181)
    ...
Caused by: java.lang.NullPointerException
    at org.apache.solr.core.SolrCore.getInstancePath(SolrCore.java:333)
    at org.apache.solr.handler.admin.CoreAdminOperation.getCoreStatus(CoreAdminOperation.java:329)
    at org.apache.solr.handler.admin.StatusOp.execute(StatusOp.java:54)
    at org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:367)
Any ideas? Were there any recent changes since 8.4.1 that could have
caused this?
Thank you,
Andreas
[1] BTW, it would be nice if Solr supported an atomic request
"create core if not exists" so that I wouldn't need separate
STATUS/CREATE requests and could avoid ERRORs in the log. But this is
not critical, and not the problem here. (I asked for this years ago:
https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201606.mbox/%3C5762B310.60600%40coremedia.com%3E
)
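
For reference, the STATUS-then-CREATE sequence described above can be sketched roughly as follows. This is a simplified illustration, not our actual code: CoreBootstrap and ensureCore are hypothetical names, and the exists/create callbacks stand in for the SolrJ calls (for example CoreAdminRequest.getStatus and CoreAdminRequest.createCore). The main method uses an in-memory set in place of a real Solr server. It treats a failed CREATE as a lost race if the core exists afterwards; it does not work around the lock problem reported here.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Hypothetical helper, not part of SolrJ: a tolerant "create core if absent".
final class CoreBootstrap {

    /**
     * Returns true if this call created the core, false if it already existed
     * (including the case where a concurrent caller won the race).
     */
    static boolean ensureCore(String name,
                              Predicate<String> exists,
                              Consumer<String> create) {
        if (exists.test(name)) {
            return false;            // STATUS reports that the core is already there
        }
        try {
            create.accept(name);     // CREATE; may fail if another client was faster
            return true;
        } catch (RuntimeException e) {
            // Re-check: if the core exists now, treat the failure as a lost race.
            if (exists.test(name)) {
                return false;
            }
            throw e;                 // anything else is a real failure; propagate it
        }
    }

    public static void main(String[] args) {
        // In-memory stand-in for a server that rejects duplicate CREATE requests.
        Set<String> cores = ConcurrentHashMap.newKeySet();
        Consumer<String> create = n -> {
            if (!cores.add(n)) {
                throw new IllegalStateException("Core with name '" + n + "' already exists");
            }
        };
        System.out.println(ensureCore("demo", cores::contains, create)); // prints true
        System.out.println(ensureCore("demo", cores::contains, create)); // prints false
    }
}
```

Because the check and the create are still two separate requests, the server can still receive two concurrent CREATEs for the same core; the sketch only makes the client tolerant of the rejection that 8.4.1 produced.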