Re: [dspace-tech] Re: Solr lock obtained timeout

2019-05-10 Thread Mark H. Wood
On Mon, May 06, 2019 at 11:50:15AM +0300, Alan Orth wrote:
> One Solr parameter that seems like it should help is writeLockTimeout. The
> default in solrconfig.xml is 1000ms (1 sec). I tried setting it to 5000 and
> 1, neither of which helped. Should I try to bump this setting up to
> some obscenely high value? Could there be another factor here, like JVM
> heap size or settings? Happy to discuss experiences others are having.

That is the amount of time that Solr will wait for another thread to
unlock the index.  If the lock is stale, there is no other thread to
unlock it, and no amount of waiting would help.
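
For reference, the setting in question lives in the <indexConfig> block of solrconfig.xml; raising it only helps when a live writer will eventually release the lock. A sketch of the stock Solr 4.x layout (element placement can vary between versions):

```xml
<!-- solrconfig.xml: how long an IndexWriter waits for write.lock
     before giving up. No value helps if the lock file is stale. -->
<indexConfig>
  <writeLockTimeout>1000</writeLockTimeout>
  <lockType>native</lockType>
</indexConfig>
```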

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu

-- 
All messages to this mailing list should adhere to the DuraSpace Code of 
Conduct: https://duraspace.org/about/policies/code-of-conduct/
--- 
You received this message because you are subscribed to the Google Groups 
"DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to dspace-tech+unsubscr...@googlegroups.com.
To post to this group, send email to dspace-tech@googlegroups.com.
Visit this group at https://groups.google.com/group/dspace-tech.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/dspace-tech/20190510185754.GA5429%40IUPUI.Edu.
For more options, visit https://groups.google.com/d/optout.




Re: [dspace-tech] Re: Solr lock obtained timeout

2019-05-10 Thread Mark H. Wood
On Mon, May 06, 2019 at 11:50:15AM +0300, Alan Orth wrote:
> I'm still experiencing sporadic occurrences of the "Error opening new
> searcher" issue with the Solr statistics core. In our case Tomcat always
> shuts down cleanly, and even so, manually deleting the write locks before
> Tomcat startup doesn't seem to help. Thinking about this a bit more, it
> sounds plausible to me that high load or slow disks could indeed cause a
> timeout waiting for a write lock with large Solr cores. Our server has an
> SSD, but is indeed busy, and we have nine years of statistics with hundreds
> of millions of events.
> 
> One Solr parameter that seems like it should help is writeLockTimeout. The
> default in solrconfig.xml is 1000ms (1 sec). I tried setting it to 5000 and
> 1, neither of which helped. Should I try to bump this setting up to
> some obscenely high value? Could there be another factor here, like JVM
> heap size or settings? Happy to discuss experiences others are having.

It may be time to ask on the Solr users' mailing list
(solr-u...@lucene.apache.org).  They may at least know better than I do,
for example, which classes' logging would be most useful to turn up.
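
For anyone following along, the classes involved in index locking live under org.apache.lucene.store, so a first pass at turning up logging might look like this. This is a sketch for the log4j.properties-based setup that Solr 4.x ships with; adjust it to however your logging is actually wired:

```properties
# Turn on DEBUG for the code paths involved in obtaining the index
# write lock (Solr 4.x uses a log4j.properties-based configuration).
log4j.logger.org.apache.lucene.store=DEBUG
log4j.logger.org.apache.solr.core.SolrCore=DEBUG
log4j.logger.org.apache.solr.core.CoreContainer=DEBUG
```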

> P.S. I've been running Solr 4.10.4 in DSpace 5.8 production for over one
> month now (versus 4.10.2 in DSpace pom.xml).

That's useful to know.  Thanks!

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu





Re: [dspace-tech] Re: Solr lock obtained timeout

2019-05-06 Thread Alan Orth
Dear list,

I'm still experiencing sporadic occurrences of the "Error opening new
searcher" issue with the Solr statistics core. In our case Tomcat always
shuts down cleanly, and even so, manually deleting the write locks before
Tomcat startup doesn't seem to help. Thinking about this a bit more, it
sounds plausible to me that high load or slow disks could indeed cause a
timeout waiting for a write lock with large Solr cores. Our server has an
SSD, but is indeed busy, and we have nine years of statistics with hundreds
of millions of events.

One Solr parameter that seems like it should help is writeLockTimeout. The
default in solrconfig.xml is 1000ms (1 sec). I tried setting it to 5000 and
1, neither of which helped. Should I try to bump this setting up to
some obscenely high value? Could there be another factor here, like JVM
heap size or settings? Happy to discuss experiences others are having.

Thanks,

P.S. I've been running Solr 4.10.4 in DSpace 5.8 production for over one
month now (versus 4.10.2 in DSpace pom.xml).

Re: [dspace-tech] Re: Solr lock obtained timeout

2019-03-21 Thread Alan Orth
Thanks for the response, Mark.

I have started testing Solr 4.10.4 in a few development environments and will
share my notes after some time. Regarding the lock files, I might be
barking up the wrong tree because it seems like Solr never deletes these: I
did a quick check by adding a `find /dspace -type f -iname '*.lock'` before
and after the Tomcat service starts and stops, and the lock files are
always there. I'm not sure what to make of that...
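
On the question of whether those leftover lock files are actually held: since Lucene's NativeFSLock is an OS-level file lock, you can probe it from outside the JVM. A minimal sketch, assuming Linux, where Java's FileLock maps to POSIX fcntl record locks; the /dspace/solr path is from our setup:

```python
import fcntl
import os

def find_stale_write_locks(root):
    """Walk `root` looking for Lucene write.lock files that no live
    process is holding.

    NativeFSLock takes an OS-level lock on the file (POSIX fcntl
    record locks on Linux), so if we can grab an exclusive
    non-blocking lock ourselves, nothing else holds it and the file
    is just leftovers from an unclean exit.
    """
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if "write.lock" not in filenames:
            continue
        path = os.path.join(dirpath, "write.lock")
        with open(path, "a") as fh:
            try:
                fcntl.lockf(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError:
                continue  # a live process (Solr) still holds it
            fcntl.lockf(fh, fcntl.LOCK_UN)
        stale.append(path)
    return sorted(stale)

if __name__ == "__main__":
    for p in find_stale_write_locks("/dspace/solr"):
        print("stale:", p)
```

A lock that this reports as stale should be safe to delete while Tomcat is down; a lock it cannot acquire is genuinely held by a running writer, and deleting it would invite index corruption.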

Regards,


-- 
Alan Orth
alan.o...@gmail.com
https://picturingjordan.com
https://englishbulgaria.net
https://mjanja.ch
"In heaven all the interesting people are missing." ―Friedrich Nietzsche



[dspace-tech] Re: Solr lock obtained timeout

2019-03-19 Thread Mark H. Wood
On Tuesday, March 19, 2019 at 11:35:47 AM UTC-4, Alan Orth wrote:
>
> Dear list,
>
> In the last few months we've been having issues with Solr throwing "Error 
> creating core" errors one out of every two times we start Tomcat. This 
> results in our statistics from previous years being inaccessible. For 
> example, from the Solr log yesterday:
>
> 2019-03-18 12:32:39,799 ERROR org.apache.solr.core.CoreContainer @ Error 
> creating core [statistics-2018]: Error opening new searcher 
> ... 
> Caused by: org.apache.solr.common.SolrException: Error opening new 
> searcher 
> at 
> org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1565) 
> at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1677) 
> at org.apache.solr.core.SolrCore.(SolrCore.java:845) 
> ... 31 more 
> Caused by: org.apache.lucene.store.LockObtainFailedException: Lock obtain 
> timed out: NativeFSLock@/dspace/solr/statistics-2018/data/index/write.lock
>
> Some old Stack Overflow posts recommend setting Solr's address space to 
> unlimited (ulimit -v unlimited), but the issue still occurs seemingly 
> randomly for us.
>


I think that raising ulimit would only help if Tomcat's JVM memory settings 
add up to more than the OS was configured to give the process.  Those 
recommendations may have been for stand-alone Solr running in its own Jetty 
instance.

 

> Now I usually just try to restart Tomcat again, or sometimes I shut it 
> down cleanly, delete all the Solr write locks, and then start Tomcat back 
> up. It's starting to feel a bit superstitious... maybe I should try to kill 
> a chicken.
>
 

> We are running DSpace 5.8 with Tomcat 7.0.93 on Ubuntu 16.04. We upgraded 
> from DSpace 5.5 in late 2018 and 2019 was the first year that the yearly 
> stats-util sharding completed successfully (fixed in DSpace 5.7). I believe 
> our problems are related to the existence of these shards. It feels like 
> there is some kind of *race condition* because it only happens every so 
> often and it's not always the same core that Solr is refusing to create. 
> it's statistics-2018, statistics-2015, etc.
>


So, sometimes Solr doesn't remove all of its locks when shut down.  Does it 
log ([DSpace]/log/solr*.log) anything interesting at shutdown? Does Tomcat?

 

> On this note, is there any reason we are still using Solr 4.10.2 with 
> DSpace 5.x and 6.x? The Solr project issued two bug fix releases in that 
> series—4.10.3 and 4.10.4—and there are about fifty bug fixes in those 
> releases, some of which address memory leaks and shard handling. See the 
> change logs for 4.10.3¹ and 4.10.4². As an experiment I just bumped the 
> version to 4.10.3 in my test environment and DSpace starts up... so that's 
> a good sign. I will do more testing of ingests, indexing, etc and report 
> back.
>
>

I think the reason is that nobody went looking for newer releases.  Your 
experiment is definitely worth trying, and your results will be interesting.
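
For anyone else trying the bump, the whole change should be the version properties in the DSpace parent POM. A sketch, assuming the DSpace 5.x property names; verify them against your own source tree before relying on this:

```xml
<!-- [dspace-source]/pom.xml: property names as in DSpace 5.x; check
     your tree, since they may differ between releases. -->
<properties>
  <solr.version>4.10.4</solr.version>
  <lucene.version>4.10.4</lucene.version>
</properties>
```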
