[ 
https://issues.apache.org/jira/browse/SOLR-14210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17070548#comment-17070548
 ] 

Jan Høydahl commented on SOLR-14210:
------------------------------------

Renamed param to {{requireHealthyCores=true}}. Added unit test and CHANGES 
entry.

Still no test for the request param parsing.

> Introduce Node-level status handler for replicas
> ------------------------------------------------
>
>                 Key: SOLR-14210
>                 URL: https://issues.apache.org/jira/browse/SOLR-14210
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: master (9.0), 8.5
>            Reporter: Houston Putman
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> h2. Background
> As was brought up in SOLR-13055, in order to run Solr in a more cloud-native 
> way, we need some additional features around node-level healthchecks.
> {quote}Like in Kubernetes we need 'liveliness' and 'readiness' probe 
> explained in 
> [https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/n]
>  determine if a node is live and ready to serve live traffic.
> {quote}
>  
> However there are issues around kubernetes managing it's own rolling 
> restarts. With the current healthcheck setup, it's easy to envision a 
> scenario in which Solr reports itself as "healthy" when all of its replicas 
> are actually recovering. Therefore kubernetes, seeing a healthy pod would 
> then go and restart the next Solr node. This can happen until all replicas 
> are "recovering" and none are healthy. (maybe the last one restarted will be 
> "down", but still there are no "active" replicas)
> h2. Proposal
> I propose we make an additional healthcheck handler that returns whether all 
> replicas hosted by that Solr node are healthy and "active". That way we will 
> be able to use the [default kubernetes rolling restart 
> logic|https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#update-strategies]
>  with Solr.
> To add on to [Jan's point 
> here|https://issues.apache.org/jira/browse/SOLR-13055?focusedCommentId=16716559&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16716559],
>  this handler should be more friendly for other Content-Types and should use 
> bettter HTTP response statuses.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to