[
https://issues.apache.org/jira/browse/SOLR-7121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sachin Goyal updated SOLR-7121:
-------------------------------
Attachment: SOLR-7121.patch
[~elyograg], here is another patch which removes System.currentTimeMillis().
Most of the important values are already in the configuration and turned off by
default.
{code:xml}
<coreDownThresholds name="thresholds1">
<bool name="goDownIfHighLoad">false</bool>
<str name="coreNameExpression">abc.*</str>
<int name="coreLimitMaxThreads">45</int>
<int name="coreLimitMaxGcMillis">10000</int>
<!-- These 3 options must be specified together and are used as an AND
condition -->
<int name="coreLimitMaxLongQueries">100</int>
<int name="coreLimitLongQueryTime">100</int>
<int name="coreLimitMaxLongQueriesInterval">1000</int>
<!-- These 2 options must be specified together and are used as an AND
condition -->
<int name="coreLimitMax95thPcSelectTime">-1</int>
<int name="coreLimitMax5MinSelectRate">-1</int>
</coreDownThresholds>
{code}
A few options remain hard-coded, as I felt it would be best to leave
those out of the configuration for now. I will wait for the complete review
comments on this patch before converting them to configuration as well.
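
To illustrate how the grouped options in the configuration above could be evaluated as AND conditions, here is a minimal sketch. It is not the code from SOLR-7121.patch: the class and method names are hypothetical, the time values are assumed to be milliseconds, a negative value is assumed to mean "disabled", and the reading of the three long-query options (more than coreLimitMaxLongQueries queries slower than coreLimitLongQueryTime within coreLimitMaxLongQueriesInterval) is inferred from the option names only. The caller passes in the clock value, so the sketch does not depend on System.currentTimeMillis().

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical illustration only; not taken from the attached patch. */
final class CoreDownThresholdSketch {
    // Assumed to correspond to coreLimitMaxLongQueries, coreLimitLongQueryTime
    // and coreLimitMaxLongQueriesInterval. Negative means "disabled" (assumption).
    private final int maxLongQueries;
    private final long longQueryTimeMs;
    private final long intervalMs;

    // Completion times of recent long queries, oldest first.
    private final Deque<Long> longQueryTimes = new ArrayDeque<>();

    public CoreDownThresholdSketch(int maxLongQueries, long longQueryTimeMs, long intervalMs) {
        this.maxLongQueries = maxLongQueries;
        this.longQueryTimeMs = longQueryTimeMs;
        this.intervalMs = intervalMs;
    }

    /** Records a finished query; only queries at or above the long-query time are kept. */
    public void record(long nowMs, long elapsedMs) {
        if (longQueryTimeMs >= 0 && elapsedMs >= longQueryTimeMs) {
            longQueryTimes.addLast(nowMs);
        }
    }

    /** True only if all three options are set and the count within the interval exceeds the limit. */
    public boolean tooManyLongQueries(long nowMs) {
        if (maxLongQueries < 0 || longQueryTimeMs < 0 || intervalMs < 0) {
            return false; // the three options must be specified together
        }
        // Drop entries that have fallen out of the sliding interval.
        while (!longQueryTimes.isEmpty() && nowMs - longQueryTimes.peekFirst() > intervalMs) {
            longQueryTimes.removeFirst();
        }
        return longQueryTimes.size() > maxLongQueries;
    }

    /** AND of the two select limits (95th percentile time and 5-minute rate). */
    public static boolean selectLoadExceeded(double p95SelectTime, double fiveMinSelectRate,
                                             double maxP95SelectTime, double max5MinSelectRate) {
        if (maxP95SelectTime < 0 || max5MinSelectRate < 0) {
            return false; // -1 in the sample config leaves this pair switched off
        }
        return p95SelectTime > maxP95SelectTime && fiveMinSelectRate > max5MinSelectRate;
    }
}
```

With the sample values above (-1 for both select limits), selectLoadExceeded always returns false, which matches the statement that the thresholds are turned off by default.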
> Solr nodes should go down based on configurable thresholds and not rely on
> resource exhaustion
> ----------------------------------------------------------------------------------------------
>
> Key: SOLR-7121
> URL: https://issues.apache.org/jira/browse/SOLR-7121
> Project: Solr
> Issue Type: New Feature
> Reporter: Sachin Goyal
> Attachments: SOLR-7121.patch, SOLR-7121.patch
>
>
> Currently, there is no way to control when a Solr node goes down.
> If the server is having long GC pauses, has too many threads, or is just getting
> too many queries due to a bad load balancer, the cores on the machine keep
> serving until they exhaust the machine's resources and everything comes
> to a stall.
> Such a slowly dying core can affect other cores as well by taking a very long
> time to serve their distributed queries.
> There should be a way to specify threshold values beyond which the
> targeted core can detect its ill health and proactively go down to recover.
> When the load improves, the core should come up automatically.