HoustonPutman commented on issue #471:
URL: https://github.com/apache/solr-operator/issues/471#issuecomment-1245949988

   This is a very good callout, so thank you for bringing it up.
   
   We can easily add a PodDisruptionBudget for the entire SolrCloud cluster, 
and the `maxUnavailable` can be populated with the 
`SolrCloud.spec.updateStrategy.managed.maxPodsUnavailable` value. This is a 
pretty good first step and gets us halfway there.
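
   As a rough sketch of that first step (illustrative Python, not the 
operator's actual code), the cluster-wide PDB could be rendered roughly like 
this. The PDB name suffix and the `solr-cloud` selector label are hypothetical 
placeholders; only the `maxPodsUnavailable` value comes from the SolrCloud spec.

```python
def cluster_pdb(cloud_name, max_pods_unavailable, selector_labels):
    """Build a policy/v1 PodDisruptionBudget manifest covering all pods of a SolrCloud.

    max_pods_unavailable may be an int or a percentage string such as "25%";
    the PDB `maxUnavailable` field accepts both forms.
    selector_labels is whatever label set identifies the cloud's pods
    (the caller supplies it; the labels used in the example are hypothetical).
    """
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        # Hypothetical naming scheme, chosen only for this sketch.
        "metadata": {"name": f"{cloud_name}-solrcloud-pdb"},
        "spec": {
            # Copied straight from updateStrategy.managed.maxPodsUnavailable.
            "maxUnavailable": max_pods_unavailable,
            "selector": {"matchLabels": dict(selector_labels)},
        },
    }


if __name__ == "__main__":
    import json

    pdb = cluster_pdb("example", "25%", {"solr-cloud": "example"})
    print(json.dumps(pdb, indent=2))
```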
   
   The other half would be replicating the 
`SolrCloud.spec.updateStrategy.managed.maxShardReplicasUnavailable` 
functionality through PDBs. The managed update code already knows which nodes 
each shard resides on, so it wouldn't be far-fetched to create a PDB for every 
shard, using a custom labelSelector that matches the node-name labels of the 
nodes we know host that shard. Since the operator isn't listening to the 
cluster state in the cloud, it could simply re-check on a schedule (every 
minute or so) and create/update/delete PDBs as needed. The [PodDisruptionBudget 
documentation](https://kubernetes.io/docs/tasks/run-application/configure-pdb/#arbitrary-controllers-and-selectors)
 tells us that we can't use `maxUnavailable` here, as PDBs with custom 
labelSelectors can only use an integer-valued `minAvailable`. That's fine: we 
know the number of nodes that host the shard, so we can always convert between 
the two.
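
   To make the conversion concrete, here is a minimal sketch in Python (the 
operator itself is not written in Python, and none of this is its real code). 
It assumes the shard's pods can be selected through the 
`statefulset.kubernetes.io/pod-name` label that StatefulSets put on their pods, 
and that percentages are rounded down; both are assumptions of this sketch, as 
are the function and PDB names.

```python
def to_min_available(max_unavailable, host_count):
    """Convert a maxUnavailable setting into an integer minAvailable.

    max_unavailable: int, or a percentage string such as "50%".
    host_count: number of pods that host a replica of the shard.
    """
    if isinstance(max_unavailable, str):
        percent = int(max_unavailable.rstrip("%"))
        # Assumption: round the allowed disruptions down (the cautious choice).
        unavailable = host_count * percent // 100
    else:
        unavailable = max_unavailable
    # Clamp so the result always stays within [0, host_count].
    return max(0, min(host_count, host_count - unavailable))


def shard_pdb(name, pod_names, max_unavailable,
              label_key="statefulset.kubernetes.io/pod-name"):
    """Build a per-shard PDB manifest selecting exactly the pods hosting the shard."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},  # hypothetical naming, caller-supplied
        "spec": {
            "minAvailable": to_min_available(max_unavailable, len(pod_names)),
            "selector": {
                "matchExpressions": [
                    {"key": label_key, "operator": "In", "values": sorted(pod_names)}
                ]
            },
        },
    }
```

   For example, a shard hosted on three pods with one replica allowed to be 
unavailable yields `minAvailable: 2`.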
   
   However, there's [another 
rule](https://kubernetes.io/docs/tasks/run-application/configure-pdb/#arbitrary-controllers-and-selectors)
 for PDBs that makes this part of the solution untenable. It specifies that a 
pod can be covered by only one `PodDisruptionBudget`, and for this solution we 
would need a PDB for every shard that lives on that pod, which will almost 
certainly be more than one. (Otherwise the general cluster PDB should be fine 
to use.)
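
   A small sketch of why the per-shard approach runs into this rule: given a 
shard-to-pod layout, count how many per-shard PDBs would select each pod. The 
layout and pod names below are made up for illustration.

```python
from collections import defaultdict


def pdbs_per_pod(shard_hosts):
    """Map each pod to the shards (one PDB per shard) that would select it."""
    selected_by = defaultdict(list)
    for shard, pods in shard_hosts.items():
        for pod in pods:
            selected_by[pod].append(shard)
    return dict(selected_by)


def overlapping_pods(shard_hosts):
    """Return the pods that more than one per-shard PDB would select."""
    return sorted(
        pod for pod, shards in pdbs_per_pod(shard_hosts).items() if len(shards) > 1
    )


if __name__ == "__main__":
    # Hypothetical layout: two shards, each with two replicas, on three pods.
    layout = {
        "shard1": ["solr-0", "solr-1"],
        "shard2": ["solr-1", "solr-2"],
    }
    print(overlapping_pods(layout))  # ['solr-1']
```

   Any pod in that list would be matched by several PDBs at once, which is 
exactly the situation the rule above forbids.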
   
   Hopefully Kubernetes will eventually remove the one-PDB-per-pod limit; then 
we can fully (and without much difficulty) implement shard-level PDBs managed 
by the Solr Operator. In the meantime, we should go ahead and implement the 
per-cluster `PodDisruptionBudget` and fill it with the value used in the 
managed update settings.
   
   Given the limitations, what are your thoughts on moving forward with the 
cluster-level PDB, @joshsouza?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

