Hey Dwane,

Thanks for your email. Gah, I should have mentioned that I had already
applied the patches from the 8.x branches onto the exporter (such as the
thread pooling fix that you mentioned). I still haven't gotten to the
bottom of the "IndexReader is closed" issue. I found that if it was
present on an instance, even calling just
http://ip.address:port/solr/admin/metrics would return that error and 0
metrics, but if I added the following parameter to the call it was all
fine:

&regex=^(?!SEARCHER).*

I'm trying to wrap my head around the relationship between a Solr core
and an index searcher/reader in the code, but it's quite complicated.
Similarly, I'm trying to understand how I could replicate this for
testing purposes, so any guidance/advice on that area would be greatly
appreciated.
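
In case it's useful, here is roughly the SolrJ equivalent of the call I'm
making (just a sketch with the same parameters as the URL above, not what
the exporter does internally):

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrRequest;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

public class MetricsProbe {
  public static void main(String[] args) throws Exception {
    // Base URL of the node to probe; adjust host/port for the affected instance.
    try (SolrClient client =
        new HttpSolrClient.Builder("http://127.0.0.1:8083/solr").build()) {
      ModifiableSolrParams params = new ModifiableSolrParams();
      params.set("group", "core");
      // Excluding SEARCHER.* metrics is the workaround that avoids the
      // "this IndexReader is closed" 500 response.
      params.set("regex", "^(?!SEARCHER).*");
      NamedList<Object> rsp = client.request(
          new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/metrics", params));
      System.out.println(rsp);
    }
  }
}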

Cheers,

On Wed, 6 May 2020 at 21:36, Dwane Hall <dwaneh...@hotmail.com> wrote:

> Hey Richard,
>
> I noticed this issue with the exporter in the 7.x branch. If you look
> through the Solr release notes since then, there have been quite a few
> improvements to the exporter, particularly around thread safety and
> concurrency (and the number of nodes it can monitor). The exporter
> version can run independently of your Solr version, so my advice would
> be to download the most recent Solr release, check and modify the
> exporter start script for its library dependencies, extract these files
> to a separate location, and run that version against your 7.x instance.
> If you have the capacity to upgrade Solr itself, that will save you
> having to maintain the exporter separately. Since making this change the
> exporter has not missed a beat, and we monitor around 100 Solr nodes.
>
> Good luck,
>
> Dwane
> ------------------------------
> *From:* Richard Goodman <richa...@brandwatch.com>
> *Sent:* Tuesday, 5 May 2020 10:22 PM
> *To:* solr-user@lucene.apache.org <solr-user@lucene.apache.org>
> *Subject:* solr core metrics & prometheus exporter - indexreader is closed
>
> Hi there,
>
> I've been playing with the Prometheus exporter for Solr, and have
> created my config and deployed it. So far the node, jetty, and jvm
> groups have been running fine; however, I'm repeatedly getting an issue
> with the core group:
>
> WARN  - 2020-05-05 12:01:24.812; org.apache.solr.prometheus.scraper.Async;
> Error occurred during metrics collection
> java.util.concurrent.ExecutionException:
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
> from server at http://127.0.0.1:8083/solr: Server Error
>
> request:
> http://127.0.0.1:8083/solr/admin/metrics?group=core&wt=json&version=2.2
>         at
>
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> ~[?:1.8.0_141]
>         at
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
> ~[?:1.8.0_141]
>         at
> org.apache.solr.prometheus.scraper.Async.lambda$null$1(Async.java:45)
> ~[solr-prometheus-exporter-7.7.2-SNAPSHOT.jar:7.7.2-SNAPSHOT
> e5d04ab6a061a02e47f9e6df62a3cfa69632987b - jenkins - 2019-11-22 16:23:03]
>         at
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
> ~[?:1.8.0_141]
>         at
> java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
> ~[?:1.8.0_141]
>         at
>
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
> ~[?:1.8.0_141]
>         at
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> ~[?:1.8.0_141]
>         at
>
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> ~[?:1.8.0_141]
>         at
>
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
> ~[?:1.8.0_141]
>         at
>
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
> ~[?:1.8.0_141]
>         at
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> ~[?:1.8.0_141]
>         at
> java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
> ~[?:1.8.0_141]
>         at
>
> org.apache.solr.prometheus.scraper.Async.lambda$waitForAllSuccessfulResponses$3(Async.java:43)
> ~[solr-prometheus-exporter-7.7.2-SNAPSHOT.jar:7.7.2-SNAPSHOT
> e5d04ab6a061a02e47f9e6df62a3cfa69632987b - jenkins - 2019-11-22 16:23:03]
>         at
>
> java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
> ~[?:1.8.0_141]
>         at
>
> java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
> ~[?:1.8.0_141]
>         at
>
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> ~[?:1.8.0_141]
>         at
>
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1595)
> ~[?:1.8.0_141]
>         at
>
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
> ~[solr-solrj-7.7.2-SNAPSHOT.jar:7.7.2-SNAPSHOT
> e5d04ab6a061a02e47f9e6df62a3cfa69632987b - jenkins - 2019-11-22 16:23:11]
>         at
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [?:1.8.0_141]
>         at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [?:1.8.0_141]
>         at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
> Caused by:
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error
> from server at http://127.0.0.1:8083/solr: Server Error
>
> request:
> http://127.0.0.1:8083/solr/admin/metrics?group=core&wt=json&version=2.2
>         at
>
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:643)
> ~[solr-solrj-7.7.2-SNAPSHOT.jar:7.7.2-SNAPSHOT
> e5d04ab6a061a02e47f9e6df62a3cfa69632987b - jenkins - 2019-11-22 16:23:11]
>         at
>
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
> ~[solr-solrj-7.7.2-SNAPSHOT.jar:7.7.2-SNAPSHOT
> e5d04ab6a061a02e47f9e6df62a3cfa69632987b - jenkins - 2019-11-22 16:23:11]
>         at
>
> org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
> ~[solr-solrj-7.7.2-SNAPSHOT.jar:7.7.2-SNAPSHOT
> e5d04ab6a061a02e47f9e6df62a3cfa69632987b - jenkins - 2019-11-22 16:23:11]
>         at
> org.apache.solr.client.solrj.SolrClient.request(SolrClient.java:1260)
> ~[solr-solrj-7.7.2-SNAPSHOT.jar:7.7.2-SNAPSHOT
> e5d04ab6a061a02e47f9e6df62a3cfa69632987b - jenkins - 2019-11-22 16:23:11]
>         at
>
> org.apache.solr.prometheus.scraper.SolrScraper.request(SolrScraper.java:102)
> ~[solr-prometheus-exporter-7.7.2-SNAPSHOT.jar:7.7.2-SNAPSHOT
> e5d04ab6a061a02e47f9e6df62a3cfa69632987b - jenkins - 2019-11-22 16:23:03]
>         at
>
> org.apache.solr.prometheus.scraper.SolrCloudScraper.lambda$metricsForAllHosts$6(SolrCloudScraper.java:121)
> ~[solr-prometheus-exporter-7.7.2-SNAPSHOT.jar:7.7.2-SNAPSHOT
> e5d04ab6a061a02e47f9e6df62a3cfa69632987b - jenkins - 2019-11-22 16:23:03]
>         at
>
> org.apache.solr.prometheus.scraper.SolrScraper.lambda$null$0(SolrScraper.java:81)
> ~[solr-prometheus-exporter-7.7.2-SNAPSHOT.jar:7.7.2-SNAPSHOT
> e5d04ab6a061a02e47f9e6df62a3cfa69632987b - jenkins - 2019-11-22 16:23:03]
>         at
>
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
> ~[?:1.8.0_141]
>
>
> Because of this, I believe the exporter is then reporting the following
> failure:
>
> WARN  - 2020-05-05 12:01:24.825; org.apache.solr.prometheus.scraper.Async;
> Error occurred during metrics collection
> java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError:
> org/jcodings/Encoding
>         at
>
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
> ~[?:1.8.0_141]
>         at
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
> ~[?:1.8.0_141]
>         at
> org.apache.solr.prometheus.scraper.Async.lambda$null$1(Async.java:45)
> ~[solr-prometheus-exporter-7.7.2-SNAPSHOT.jar:7.7.2-SNAPSHOT
> e5d04ab6a061a02e47f9e6df62a3cfa69632987b - jenkins - 2019-11-22 16:23:03]
>         at
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
> ~[?:1.8.0_141]
>         at
> java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
> ~[?:1.8.0_141]
>         at
>
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1374)
> ~[?:1.8.0_141]
>         at
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> ~[?:1.8.0_141]
>         at
>
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> ~[?:1.8.0_141]
>         at
>
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
> ~[?:1.8.0_141]
>         at
>
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
> ~[?:1.8.0_141]
>         at
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> ~[?:1.8.0_141]
>         at
> java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
> ~[?:1.8.0_141]
>         at
>
> org.apache.solr.prometheus.scraper.Async.lambda$waitForAllSuccessfulResponses$3(Async.java:43)
> ~[solr-prometheus-exporter-7.7.2-SNAPSHOT.jar:7.7.2-SNAPSHOT
> e5d04ab6a061a02e47f9e6df62a3cfa69632987b - jenkins - 2019-11-22 16:23:03]
>         at
>
> java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
> ~[?:1.8.0_141]
>         at
>
> java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
> ~[?:1.8.0_141]
>         at
>
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
> ~[?:1.8.0_141]
>         at
>
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1595)
> ~[?:1.8.0_141]
>         at
>
> org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
> ~[solr-solrj-7.7.2-SNAPSHOT.jar:7.7.2-SNAPSHOT
> e5d04ab6a061a02e47f9e6df62a3cfa69632987b - jenkins - 2019-11-22 16:23:11]
>         at
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> [?:1.8.0_141]
>         at
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> [?:1.8.0_141]
>         at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
> Caused by: java.lang.NoClassDefFoundError: org/jcodings/Encoding
>
>
> And when I hit the API directly, I'm met with the following:
>
> {
>   "responseHeader":{
>     "status":500,
>     "QTime":44},
>   "error":{
>     "msg":"this IndexReader is closed",
>     "trace":"org.apache.lucene.store.AlreadyClosedException: this
> IndexReader is closed\n\tat
> org.apache.lucene.index.IndexReader.ensureOpen(IndexReader.java:257)\n\tat
>
> org.apache.lucene.index.StandardDirectoryReader.getVersion(StandardDirectoryReader.java:339)\n\tat
>
> org.apache.lucene.index.FilterDirectoryReader.getVersion(FilterDirectoryReader.java:127)\n\tat
>
> org.apache.lucene.index.FilterDirectoryReader.getVersion(FilterDirectoryReader.java:127)\n\tat
>
> org.apache.solr.search.SolrIndexSearcher.lambda$initializeMetrics$13(SolrIndexSearcher.java:2268)\n\tat
>
> org.apache.solr.metrics.SolrMetricManager$GaugeWrapper.getValue(SolrMetricManager.java:683)\n\tat
>
> org.apache.solr.util.stats.MetricUtils.convertGauge(MetricUtils.java:488)\n\tat
>
> org.apache.solr.util.stats.MetricUtils.convertMetric(MetricUtils.java:274)\n\tat
>
> org.apache.solr.util.stats.MetricUtils.lambda$toMaps$4(MetricUtils.java:213)\n\tat
>
> java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)\n\tat
>
> java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)\n\tat
>
> java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)\n\tat
> java.util.TreeMap$KeySpliterator.forEachRemaining(TreeMap.java:2746)\n\tat
> java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)\n\tat
>
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)\n\tat
>
> java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)\n\tat
>
> java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)\n\tat
> java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)\n\tat
>
> java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)\n\tat
> org.apache.solr.util.stats.MetricUtils.toMaps(MetricUtils.java:211)\n\tat
>
> org.apache.solr.handler.admin.MetricsHandler.handleRequest(MetricsHandler.java:121)\n\tat
>
> org.apache.solr.handler.admin.MetricsHandler.handleRequestBody(MetricsHandler.java:101)\n\tat
>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)\n\tat
>
> org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:736)\n\tat
>
> org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:717)\n\tat
> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:496)\n\tat
>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:395)\n\tat
>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:341)\n\tat
>
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)\n\tat
>
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)\n\tat
>
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\n\tat
>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)\n\tat
>
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1588)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)\n\tat
>
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)\n\tat
>
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)\n\tat
>
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1557)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)\n\tat
>
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)\n\tat
>
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)\n\tat
>
> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)\n\tat
>
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)\n\tat
>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
>
> org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)\n\tat
>
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)\n\tat
> org.eclipse.jetty.server.Server.handle(Server.java:502)\n\tat
> org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)\n\tat
>
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:260)\n\tat
>
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:305)\n\tat
> org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)\n\tat
> org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:118)\n\tat
>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:333)\n\tat
>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:310)\n\tat
>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:168)\n\tat
>
> org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:126)\n\tat
>
> org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:366)\n\tat
>
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:765)\n\tat
>
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:683)\n\tat
> java.lang.Thread.run(Thread.java:748)\n",
>     "code":500}}
>
>
> Because of these errors, when I then go to the endpoint the Prometheus
> exporter exposes for the core metrics, I get 0 metrics back. I have yet
> to dig further into the exporter to determine whether, when some metrics
> cannot be gathered and throw an error, no metrics get recorded at all.
>
> Whilst I was working on SOLR-14325
> <https://issues.apache.org/jira/browse/SOLR-14325>, Andrzej noted that
> the core-level metrics are only reported if there is an open
> SolrIndexSearcher. I started looking at the code for this, but wanted to
> know if anyone else has encountered this issue before; it seems to be
> very frequent on the cluster I am testing with (96 instances, each with
> around 450GB of indexes on disk and 3-way replication).
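>
> To make it concrete: at the raw Lucene level the exception itself is
> trivial to trigger. This is a rough sketch only, and obviously not the
> same as reproducing the core/searcher lifecycle inside Solr, but it is
> the same getVersion() -> ensureOpen() path as in the trace above:
>
> import org.apache.lucene.analysis.standard.StandardAnalyzer;
> import org.apache.lucene.document.Document;
> import org.apache.lucene.index.DirectoryReader;
> import org.apache.lucene.index.IndexWriter;
> import org.apache.lucene.index.IndexWriterConfig;
> import org.apache.lucene.store.AlreadyClosedException;
> import org.apache.lucene.store.RAMDirectory;
>
> public class ClosedReaderRepro {
>   public static void main(String[] args) throws Exception {
>     RAMDirectory dir = new RAMDirectory(); // in-memory toy index
>     try (IndexWriter writer =
>         new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
>       writer.addDocument(new Document());
>       writer.commit();
>     }
>     DirectoryReader reader = DirectoryReader.open(dir);
>     reader.close();
>     try {
>       // Same call path the core metrics gauge hits once the reader is gone
>       reader.getVersion();
>     } catch (AlreadyClosedException e) {
>       System.out.println(e.getMessage()); // "this IndexReader is closed"
>     }
>   }
> }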
>
> I guess this also raises the question of whether there should be a
> better response than a 500 status error when no metrics are available?
>
> Kind regards,
>
> --
>
> Richard Goodman
>


-- 

Richard Goodman    |    Data Infrastructure engineer

richa...@brandwatch.com


