[ 
https://issues.apache.org/jira/browse/SOLR-17200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17831516#comment-17831516
 ] 

ASF subversion and git services commented on SOLR-17200:
--------------------------------------------------------

Commit 2def20abfe85d77f1fcd42a24d01c77b9ee07b82 in solr's branch 
refs/heads/main from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=2def20abfe8 ]

SOLR-17200: Fix false positive race condition in  during core loading


> "False Positive" Race conditions using "/health?requireHealthyCores=true" 
> near startup
> --------------------------------------------------------------------------------------
>
>                 Key: SOLR-17200
>                 URL: https://issues.apache.org/jira/browse/SOLR-17200
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Chris M. Hostetter
>            Assignee: Chris M. Hostetter
>            Priority: Major
>         Attachments: SOLR-17200.patch
>
>
> There seem to be at least two possible thread race conditions that can cause 
> {{/health?requireHealthyCores=true}} to return a false positive while 
> {{CoreContainer}} is in the process of starting up.
>  # If the request comes in _after_ {{CoreContainer}} has initialized 
> {{healthCheckHandler}} but _before_ initializing & running the 
> {{coreLoadExecutor}}
>  # A more complex situation where the request comes in _while_ 
> {{coreLoadExecutor}} is loading cores, and all of the cores that have 
> _finished_ initialization are "active" in SolrCloud, but other SolrCores 
> remain to be initialized (and may need recovery)
> In both cases, the root of the issue is that {{requireHealthyCores=true}} 
> works by checking...
> {code:java}
>       Collection<CloudDescriptor> coreDescriptors =
>           coreContainer.getCores().stream()
>               .map(c -> c.getCoreDescriptor().getCloudDescriptor())
>               .collect(Collectors.toList());
>       long unhealthyCores = findUnhealthyCores(coreDescriptors, clusterState);
> {code}
> ..but that means the only {{CloudDescriptor}} s that are checked are the ones 
> that come from _loaded_ cores (which is what {{coreContainer.getCores()}} 
> returns), and any {{currentlyLoadingCores}} (registered by {{CoreContainer}} 
> calling {{solrCores.markCoreAsLoading(cd)}} before starting the 
> {{coreLoadExecutor}} ) are not considered.
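
The two races described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions and not the committed patch: the class {{HealthCheckSketch}}, the {{State}} enum, and the methods {{countUnhealthy}}, {{healthyLoadedOnly}}, and {{healthyIncludingLoading}} are all hypothetical names, and plain core-name strings stand in for Solr's {{CloudDescriptor}} and {{clusterState}}. It shows why inspecting only loaded cores can report "healthy" too early, and how also inspecting still-loading cores would avoid that.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class HealthCheckSketch {
  /** Hypothetical stand-in for the replica states a real check would read from the cluster state. */
  enum State { ACTIVE, RECOVERING, DOWN }

  /** Counts cores that are not ACTIVE; a core with no known state also counts as unhealthy. */
  static long countUnhealthy(Collection<String> coreNames, Map<String, State> states) {
    return coreNames.stream()
        .filter(name -> states.get(name) != State.ACTIVE)
        .count();
  }

  /** Models the check quoted above: only cores that have finished loading are inspected. */
  static boolean healthyLoadedOnly(Set<String> loaded, Map<String, State> states) {
    return countUnhealthy(loaded, states) == 0;
  }

  /** One possible remedy (an assumption, not the actual fix): also inspect cores still loading. */
  static boolean healthyIncludingLoading(
      Set<String> loaded, Set<String> loading, Map<String, State> states) {
    List<String> all =
        Stream.concat(loaded.stream(), loading.stream()).collect(Collectors.toList());
    return countUnhealthy(all, states) == 0;
  }

  public static void main(String[] args) {
    Map<String, State> states = Map.of("core1", State.ACTIVE, "core2", State.RECOVERING);

    // Race 1: the request arrives before any core has finished loading.
    System.out.println(healthyLoadedOnly(Set.of(), states));
    System.out.println(healthyIncludingLoading(Set.of(), Set.of("core1", "core2"), states));

    // Race 2: core1 is loaded and active, core2 is still loading and needs recovery.
    System.out.println(healthyLoadedOnly(Set.of("core1"), states));
    System.out.println(healthyIncludingLoading(Set.of("core1"), Set.of("core2"), states));
  }
}
```

In both scenarios the loaded-only check reports healthy because the cores it cannot see are never counted, while the variant that includes loading cores reports unhealthy until every core is both loaded and active.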



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
