OK, I created the collection from scratch based on the config.

Unfortunately, it does not improve. The index just keeps growing and
growing, except when I stop Solr: during startup the unnecessary index
files are then purged. Even with the previous config this did not happen
in older Solr versions (definitely not in 8.2, possibly in 8.3, but it
definitely happens in 8.4).

Reproduction is simple: just load documents into the index. Even during
the first load I observe a significant index size increase (roughly
fourfold) that is then reduced after a restart.
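
For reference, this is roughly how I load and then measure (a sketch;
the collection name and paths are placeholders for my setup):

  # load documents and commit
  curl -X POST -H 'Content-Type: application/json' \
    'http://localhost:8983/solr/mycollection/update?commit=true' \
    --data-binary @docs.json

  # check the on-disk index size afterwards
  du -sh /var/solr/data/mycollection_shard1_replica_n1/data/index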

I observe, though, that during metadata updates (i.e. atomic updates)
the index roughly doubles in size (nowhere near what the update itself
would justify) and then shrinks only slightly (by a few megabytes,
nothing compared to the full size the index has reached by then).
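
By atomic update I mean a partial document update like the following
(the field name and id are just illustrative):

  curl -X POST -H 'Content-Type: application/json' \
    'http://localhost:8983/solr/mycollection/update?commit=true' \
    --data-binary '[{"id":"doc1","metadata_field":{"set":"new value"}}]'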

At the moment it looks to me like it is caused by the Solr version,
because the config did not change (we keep all configs versioned; I
checked). However, maybe I am overlooking something.

Furthermore, it seems that after segment merges the old segments are not
deleted until restart (but again, this is speculation).
I suspect not many people have observed this, because the only ways it
would be noticed are 1) they index a collection completely from scratch
and see huge index file consumption, or 2) they update their collection
a lot and hit a disk space limit (which in some cases may take a while).
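
Following Erick's suggestion below, a simple way anyone can check is to
compare the directory listing before shutdown and after startup (the
path is a placeholder):

  # before shutting down Solr
  ls -l /var/solr/data/mycollection_shard1_replica_n1/data/index > before.txt
  # after starting Solr again
  ls -l /var/solr/data/mycollection_shard1_replica_n1/data/index > after.txt
  diff before.txt after.txt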

I created a JIRA: https://issues.apache.org/jira/browse/SOLR-14202

Please let me know if I can test anything else.

On Tue, Jan 21, 2020 at 10:58 PM Jörn Franke <jornfra...@gmail.com> wrote:

> After testing the update?commit=true suggestion I now face an error:
> "Maximum lock count exceeded". Strange, this is the first time I see
> this in the log files, and it happens when doing commit=true:
> java.lang.Error: Maximum lock count exceeded
>     at
> java.base/java.util.concurrent.locks.ReentrantReadWriteLock$Sync.fullTryAcquireShared(ReentrantReadWriteLock.java:535)
>     at
> java.base/java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryAcquireShared(ReentrantReadWriteLock.java:494)
>     at
> java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1368)
>     at
> java.base/java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.tryLock(ReentrantReadWriteLock.java:882)
>     at
> org.apache.solr.update.DefaultSolrCoreState.lock(DefaultSolrCoreState.java:179)
>     at
> org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:124)
>     at
> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:658)
>     at
> org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:102)
>     at
> org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
>     at
> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalCommit(DistributedUpdateProcessor.java:1079)
>     at
> org.apache.solr.update.processor.DistributedZkUpdateProcessor.processCommit(DistributedZkUpdateProcessor.java:220)
>     at
> org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:160)
>     at
> org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
>     at
> org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
>     at
> org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
>     at
> org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
>     at
> org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
>     at
> org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
>     at
> org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
>     at
> org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:68)
>     at
> org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
>     at
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:62)
>     at
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:211)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2596)
>     at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:799)
>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:578)
>     at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:419)
>     at
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:351)
>     at
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
>     at
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
>     at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
>     at
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>     at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>     at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
>     at
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1711)
>     at
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
>     at
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1347)
>     at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
>     at
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
>     at
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1678)
>     at
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
>     at
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1249)
>     at
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
>     at
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
>     at
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:152)
>     at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>     at
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>     at
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>     at org.eclipse.jetty.server.Server.handle(Server.java:505)
>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:370)
>     at
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:267)
>     at
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)
>     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
>     at
> org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:427)
>     at
> org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:321)
>     at
> org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:159)
>     at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
>     at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
>     at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)
>     at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)
>     at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)
>     at
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)
>     at
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)
>     at
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:781)
>     at
> org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:917)
>     at java.base/java.lang.Thread.run(Thread.java:834)
>
> On Tue, Jan 21, 2020 at 10:51 PM Jörn Franke <jornfra...@gmail.com> wrote:
>
>> The only weird thing I see is that, for instance, I have
>> <maxTime>${solr.autoCommit.maxTime:15000}</maxTime> and similar entries.
>> It looks like a template gone wrong, but it was not caused by any
>> internal development; it must have come from a Solr version.
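>>
>> For reference, ${solr.autoCommit.maxTime:15000} is Solr's property
>> substitution syntax ("use the solr.autoCommit.maxTime system property
>> if set, otherwise fall back to 15000 ms"), so the surrounding block
>> presumably looks something like this sketch:
>>
>>   <autoCommit>
>>     <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>>     <openSearcher>false</openSearcher>
>>   </autoCommit>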
>>
>> On Tue, Jan 21, 2020 at 10:49 PM Jörn Franke <jornfra...@gmail.com>
>> wrote:
>>
>>> It is, by the way, a Linux system, and autoSoftCommit is set to -1.
>>> However, openSearcher is indeed set to false. An explicit commit
>>> (commit=true) is sent after doing all the updates, but the index is not
>>> shrinking. The files are not disappearing during shutdown, but they
>>> disappear after starting up again.
>>>
>>> On Tue, Jan 21, 2020 at 4:04 PM Jörn Franke <jornfra...@gmail.com>
>>> wrote:
>>>
>>>> Thanks for the answer, I will look into it - it is a possible
>>>> explanation.
>>>>
>>>> > Am 20.01.2020 um 14:30 schrieb Erick Erickson <
>>>> erickerick...@gmail.com>:
>>>> >
>>>> > Jörn:
>>>> >
>>>> > The only thing I can think of that _might_ cause this (I’m not all
>>>> that familiar with the code) is if your solrconfig settings never open a
>>>> searcher. Either you need to be sure openSearcher is set to true in the
>>>> autocommit section in solrconfig.xml, or that autoSoftCommit is set to
>>>> something other than -1. Real Time Get requires access to all segments and
>>>> it takes a new searcher being opened to release them. Actually, a very
>>>> quick test would be to submit 
>>>> “http://host:port/solr/collection/update?commit=true”
>>>> and see if the index shrinks as a result. You don’t need to change
>>>> solrconfig.xml for that test.
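>>>> > For instance, the autocommit section might look like this (a
>>>> > sketch; pick a maxTime that suits your indexing load):
>>>> >
>>>> >   <autoCommit>
>>>> >     <maxTime>15000</maxTime>
>>>> >     <openSearcher>true</openSearcher>
>>>> >   </autoCommit>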
>>>> >
>>>> > If you are opening a new searcher, this is very concerning. There
>>>> shouldn’t be anything else you have to set to prevent the index from
>>>> growing. Could you check one thing? Compare the directory listing of the
>>>> data/index directory just before you shut down Solr and then just after.
>>>> What I’m interested in is whether some subset of files disappears when you
>>>> shut down Solr. This assumes you’re running on a *nix system, if Windows
>>>> you may have to start Solr again to see the difference.
>>>> >
>>>> > So if you open a searcher and still see the problem, I can try to
>>>> reproduce it. Can you share your solrconfig file or at least the autocommit
>>>> and cache portions?
>>>> >
>>>> > Best,
>>>> > Erick
>>>> >
>>>> >> On Jan 20, 2020, at 5:40 AM, Jörn Franke <jornfra...@gmail.com>
>>>> wrote:
>>>> >>
>>>> >> From what I see, it basically duplicates the index files but does
>>>> not delete the old ones.
>>>> >> It uses caffeine cache.
>>>> >>
>>>> >> What I observe is that there is an exception during shutdown for
>>>> the collection that was updated - timeout waiting for all directory ref
>>>> counts to be released - gave up waiting on CacheDir.
>>>> >>
>>>> >>>> Am 20.01.2020 um 11:26 schrieb Jörn Franke <jornfra...@gmail.com>:
>>>> >>>
>>>> >>> Sorry, I missed a line - it is not the tlog that is growing but the
>>>> /data/index folder - until restart, when it seems to be purged.
>>>> >>>
>>>> >>>> Am 20.01.2020 um 10:47 schrieb Jörn Franke <jornfra...@gmail.com>:
>>>> >>>>
>>>> >>>> Hi,
>>>> >>>>
>>>> >>>> I have a test system here with Solr 8.4 (but this is also
>>>> reproducible in older Solr versions), which has an index that is growing
>>>> and growing - until the SolrCloud instance is restarted, at which point
>>>> it is reduced to the expected normal size.
>>>> >>>> The collection is configured to auto-commit after 15000 ms. I
>>>> suspect the index growth comes from the usage of atomic updates, but I
>>>> would expect that, thanks to the auto-commit, it does not grow all the
>>>> time.
>>>> >>>> After the atomic updates a commit is done in any case.
>>>> >>>>
>>>> >>>> I don’t see any error messages in the log files, but the growth is
>>>> quite significant, and frequent restarts are of course not a solution.
>>>> >>>>
>>>> >>>> Maybe I am overlooking a tiny configuration issue here?
>>>> >>>>
>>>> >>>> Thank you.
>>>> >>>>
>>>> >>>>
>>>> >>>> Best regards
>>>> >
>>>>
>>>
