[ https://issues.apache.org/jira/browse/SOLR-7121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523312#comment-14523312 ]
Mark Miller commented on SOLR-7121:
-----------------------------------
{code}
+    if (cc.isZooKeeperAware() && isUnderHeavyLoad(false)) {
+      try {
+        log.info("Bringing {} down due to heavy load", cd.getName());
+        cc.getZkController().publish(cd, Replica.State.DOWN);
+        startHealthPoller(core);
+      } catch (KeeperException | InterruptedException e) {
+        log.error(e.getMessage(), e);
+      }
{code}
What if we are the leader and publish a down state due to overload? Shouldn't
we also give up our leader position?
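
To make the concern concrete, here is a rough sketch of the control flow I have in mind. It is not Solr code: ClusterOps, RecordingClusterOps and OverloadHandler are hypothetical stand-ins invented for illustration, and resigning leadership before publishing the down state is an assumption about ordering, not something the patch does.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the cluster operations involved; NOT the Solr API. */
interface ClusterOps {
    void publishDown(String coreName);
    void giveUpLeadership(String coreName);
}

/** Test double that records calls so their order can be inspected. */
final class RecordingClusterOps implements ClusterOps {
    final List<String> calls = new ArrayList<>();

    @Override
    public void publishDown(String coreName) {
        calls.add("publishDown:" + coreName);
    }

    @Override
    public void giveUpLeadership(String coreName) {
        calls.add("giveUpLeadership:" + coreName);
    }
}

/** Hypothetical handler: an overloaded leader resigns before it is marked down. */
final class OverloadHandler {
    private final ClusterOps ops;

    OverloadHandler(ClusterOps ops) {
        this.ops = ops;
    }

    /** Returns true if the core was taken down. */
    boolean handle(String coreName, boolean underHeavyLoad, boolean isLeader) {
        if (!underHeavyLoad) {
            return false; // healthy core: nothing to publish
        }
        if (isLeader) {
            // Assumed ordering: step down first so another replica can be elected.
            ops.giveUpLeadership(coreName);
        }
        ops.publishDown(coreName);
        return true;
    }
}
```

The point is only that the leader check sits on the same path as the down publish, so a leader never stays registered as leader while advertising itself as down.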
> Solr nodes should go down based on configurable thresholds and not rely on
> resource exhaustion
> ----------------------------------------------------------------------------------------------
>
> Key: SOLR-7121
> URL: https://issues.apache.org/jira/browse/SOLR-7121
> Project: Solr
> Issue Type: New Feature
> Reporter: Sachin Goyal
> Attachments: SOLR-7121.patch, SOLR-7121.patch, SOLR-7121.patch,
> SOLR-7121.patch, SOLR-7121.patch, SOLR-7121.patch, SOLR-7121.patch
>
>
> Currently, there is no way to control when a Solr node goes down.
> If the server is suffering long GC pauses, running too many threads, or just
> getting too many queries because of a bad load balancer, the cores on the
> machine keep serving until they exhaust the machine's resources and
> everything comes to a stall.
> Such a slowly dying core can affect other cores as well by taking a long time
> to serve their distributed queries.
> There should be a way to specify threshold values beyond which the
> targeted core can detect its ill health and proactively go down to recover.
> When the load improves, the core should come up automatically.
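
The behaviour requested above could be sketched roughly as follows. This is only an illustration under stated assumptions: HealthThresholds and CoreHealthGate are hypothetical names, the two metrics (GC pause and thread count) are taken from the description, and the recoveryFraction margin that keeps the core from flapping is an added assumption rather than part of the issue or the patch.

```java
/** Hypothetical configurable limits; names and fields are illustrative only. */
final class HealthThresholds {
    final long maxGcPauseMillis;
    final int maxThreads;
    final double recoveryFraction; // e.g. 0.8: recover only below 80% of each limit

    HealthThresholds(long maxGcPauseMillis, int maxThreads, double recoveryFraction) {
        this.maxGcPauseMillis = maxGcPauseMillis;
        this.maxThreads = maxThreads;
        this.recoveryFraction = recoveryFraction;
    }
}

/** Hypothetical gate deciding whether a core should be down or serving. */
final class CoreHealthGate {
    private final HealthThresholds limits;
    private boolean down = false;

    CoreHealthGate(HealthThresholds limits) {
        this.limits = limits;
    }

    boolean isDown() {
        return down;
    }

    /** Feeds one load sample; returns true if the core should be down afterwards. */
    boolean sample(long gcPauseMillis, int threadCount) {
        if (!down) {
            // Go down as soon as any metric exceeds its configured limit.
            if (gcPauseMillis > limits.maxGcPauseMillis || threadCount > limits.maxThreads) {
                down = true;
            }
        } else {
            // Come back only once every metric is comfortably below its limit.
            boolean gcOk = gcPauseMillis <= limits.maxGcPauseMillis * limits.recoveryFraction;
            boolean threadsOk = threadCount <= limits.maxThreads * limits.recoveryFraction;
            if (gcOk && threadsOk) {
                down = false;
            }
        }
        return down;
    }
}
```

With limits of 500 ms and 200 threads and a recovery fraction of 0.8, a 600 ms pause takes the core down, a later 450 ms sample keeps it down, and it returns once a sample shows at most 400 ms and 160 threads.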