[ 
https://issues.apache.org/jira/browse/SOLR-17200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter resolved SOLR-17200.
---------------------------------------
    Fix Version/s: main (10.0)
                   9.6.0
       Resolution: Fixed

> "False Positive" Race conditions using "/health?requireHealthyCores=true" 
> near startup
> --------------------------------------------------------------------------------------
>
>                 Key: SOLR-17200
>                 URL: https://issues.apache.org/jira/browse/SOLR-17200
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Chris M. Hostetter
>            Assignee: Chris M. Hostetter
>            Priority: Major
>             Fix For: main (10.0), 9.6.0
>
>         Attachments: SOLR-17200.patch
>
>
> There seem to be at least two possible thread race conditions that can lead 
> {{/health?requireHealthyCores=true}} to return a false positive while 
> {{CoreContainer}} is in the process of starting up.
>  # If the request comes in _after_ {{CoreContainer}} has initialized 
> {{healthCheckHandler}} but _before_ initializing & running the 
> {{coreLoadExecutor}}
>  # A more complex situation where the request comes in _while_ 
> {{coreLoadExecutor}} is loading cores, and all of the cores that have 
> _finished_ initialization are "active" in SolrCloud, but other SolrCores 
> remain to be initialized (and may need recovery)
> In both cases, the root of the issue is that {{requireHealthyCores=true}} 
> works by checking...
> {code:java}
>       Collection<CloudDescriptor> coreDescriptors =
>           coreContainer.getCores().stream()
>               .map(c -> c.getCoreDescriptor().getCloudDescriptor())
>               .collect(Collectors.toList());
>       long unhealthyCores = findUnhealthyCores(coreDescriptors, clusterState);
> {code}
> ...but that means the only {{CloudDescriptor}} s that are checked are the ones 
> that come from _loaded_ cores (which is what {{coreContainer.getCores()}} 
> returns), and any {{currentlyLoadingCores}} (registered by {{CoreContainer}} 
> calling {{solrCores.markCoreAsLoading(cd)}} before starting the 
> {{coreLoadExecutor}} ) are not considered.
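
The second race described above can be reproduced in a toy model. The sketch below is illustrative only: the class, its fields, and its methods are hypothetical stand-ins, not Solr APIs, and the stricter check is just one possible approach, not necessarily what the attached patch does. It assumes cores are registered as "loading" before they finish loading, and it does not model a request that arrives before any core has been registered.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the race; none of these names are real Solr classes or methods. */
class HealthCheckRaceSketch {
    enum ReplicaState { ACTIVE, RECOVERING, DOWN }

    // Cores that have finished loading (stand-in for what getCores() would return).
    private final List<String> loadedCores = new ArrayList<>();
    // Cores registered as loading but not yet finished (stand-in for currentlyLoadingCores).
    private final List<String> loadingCores = new ArrayList<>();
    // Stand-in for the cluster state: the replica state recorded for each core name.
    private final Map<String, ReplicaState> clusterState = new LinkedHashMap<>();

    /** Registers a core as loading, with the state the cluster currently records for it. */
    void markCoreAsLoading(String name, ReplicaState state) {
        loadingCores.add(name);
        clusterState.put(name, state);
    }

    /** Moves a core from the loading list to the loaded list. */
    void finishLoading(String name) {
        if (loadingCores.remove(name)) {
            loadedCores.add(name);
        }
    }

    private long countUnhealthy(List<String> names) {
        return names.stream()
            .filter(n -> clusterState.get(n) != ReplicaState.ACTIVE)
            .count();
    }

    /** Mirrors the quoted check: only loaded cores are inspected. */
    boolean healthyLoadedOnly() {
        return countUnhealthy(loadedCores) == 0;
    }

    /** Hypothetical stricter check: never healthy while any core is still loading. */
    boolean healthyIncludingLoading() {
        return loadingCores.isEmpty() && countUnhealthy(loadedCores) == 0;
    }

    public static void main(String[] args) {
        HealthCheckRaceSketch node = new HealthCheckRaceSketch();
        node.markCoreAsLoading("core1", ReplicaState.ACTIVE);
        node.markCoreAsLoading("core2", ReplicaState.RECOVERING);
        node.finishLoading("core1"); // core2 is still loading and needs recovery

        System.out.println("loaded-only check:       " + node.healthyLoadedOnly());
        System.out.println("including-loading check: " + node.healthyIncludingLoading());
    }
}
```

In this model, with core1 loaded and active while core2 is still loading, the loaded-only check reports healthy even though core2 needs recovery, whereas the stricter check does not report healthy until the loading list is empty and every loaded core is active.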



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org
