radu-gheorghe commented on PR #1650:
URL: https://github.com/apache/solr/pull/1650#issuecomment-1587759700

   Awesome stuff, @HoustonPutman! Unfortunately, there's too much here for me to really understand everything your code does, so I'll add some higher-level comments instead. I hope they're not too dumb; if they are, ignore them, and if they're useful, good 😄 And of course let me know if you have any additional comments/questions.
   
   * Splitting the `OrderedNodePlacementPlugin` changes into their own PR - I don't think so; the changes are too tightly integrated, and splitting them would make things even harder to grok. It looks like that's the direction this PR is going in anyway 🙂 
   * I saw a "parallel" flag, but I think replicas are still moved one at a time, correct? If so, that will likely prove to be a bottleneck down the road. Some ideas to fix it (maybe in a separate PR):
   ** Allow movements to happen in parallel when neither the source nor the destination node is already participating in a rebalancing move. I'd say this is pretty safe to do, because whoever allows/triggers rebalancing presumably accepts that this extra load can land on any node.
   ** Go the Elasticsearch route and limit both the number of concurrent rebalancing moves cluster-wide AND the number of concurrent incoming/outgoing rebalances per node (rough sketch below). The problem, IMO, is the network throughput used by replication: AFAIK, in Solr you can only limit it per core, not per whole node, so one has to keep that in mind to make sure rebalancing doesn't choke the network.
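
   To make that second idea a bit more concrete, here's a rough sketch of the kind of throttle I mean - all names are made up, nothing here is an actual Solr (or Elasticsearch) API:

   ```java
   import java.util.HashMap;
   import java.util.Map;

   // Hypothetical guard limiting concurrent rebalance moves, loosely
   // inspired by Elasticsearch's per-node recovery limits.
   public class RebalanceThrottle {
     private final int maxClusterMoves;  // cap on concurrent moves overall
     private final int maxMovesPerNode;  // cap on in/out moves per node
     private final Map<String, Integer> movesPerNode = new HashMap<>();
     private int clusterMoves = 0;

     public RebalanceThrottle(int maxClusterMoves, int maxMovesPerNode) {
       this.maxClusterMoves = maxClusterMoves;
       this.maxMovesPerNode = maxMovesPerNode;
     }

     // A move may start only if neither the cluster-wide cap nor the
     // per-node caps on source and destination are exhausted.
     public synchronized boolean tryStartMove(String sourceNode, String targetNode) {
       if (clusterMoves >= maxClusterMoves) return false;
       if (movesPerNode.getOrDefault(sourceNode, 0) >= maxMovesPerNode) return false;
       if (movesPerNode.getOrDefault(targetNode, 0) >= maxMovesPerNode) return false;
       clusterMoves++;
       movesPerNode.merge(sourceNode, 1, Integer::sum);
       movesPerNode.merge(targetNode, 1, Integer::sum);
       return true;
     }

     public synchronized void finishMove(String sourceNode, String targetNode) {
       clusterMoves--;
       movesPerNode.merge(sourceNode, -1, Integer::sum);
       movesPerNode.merge(targetNode, -1, Integer::sum);
     }
   }
   ```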
   * Am I reading the code right that we compute all the shard movements that have to be done, then do them all, then check whether we were successful? That seems a bit risky: if rebalancing takes a long time, the cluster can change underneath it (e.g. a node is added or fails, collections are created or removed...), so I guess one would have to cancel the rebalancing call and try again. The current approach has some advantages, but I have a feeling that a call that moves just one shard at a time (or N in parallel), then comes back to see what the next best move is, and so on, would be safer (and play nicer with the Operator loop, too) - see the sketch below. I'm thinking of Elasticsearch again (because all I have is a hammer 😅 ) and how it doesn't rebalance more stuff until all recoveries are done. Meaning that if a node went down, something/someone (Kubernetes or a human operator) could decide to stop the rebalance, replace the node, then resume. I hope I'm not way off here 🙈 
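
   Something like this is what I have in mind - hypothetical interfaces, just sketching the loop, not the PR's actual structure:

   ```java
   import java.util.Optional;

   // Made-up shapes, only to show the control flow; none of these
   // names come from the PR or from Solr's actual APIs.
   interface Move { double improvement(); }

   interface Balancer {
     Optional<Move> computeBestMove();  // best single move for the *current* state
     void execute(Move move);           // move one replica (or N in parallel)
     void waitForRecoveries();          // block until replication catches up
     void refreshClusterState();        // pick up nodes added/failed meanwhile
   }

   class IncrementalRebalance {
     // One move at a time, re-evaluating between moves, so a changed
     // cluster (node added/failed, collection created/removed) is seen
     // before the next move is chosen - and the loop can be stopped
     // cleanly in between moves.
     static void run(Balancer balancer) {
       Optional<Move> best = balancer.computeBestMove();
       while (best.isPresent() && best.get().improvement() > 0) {
         balancer.execute(best.get());
         balancer.waitForRecoveries();
         balancer.refreshClusterState();
         best = balancer.computeBestMove();
       }
     }
   }
   ```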
   * I saw a `maxBalanceSkew` property, but I'm not really sure what it does. Is it the difference in "weight points" between nodes under which we say "don't bother"? And would that threshold mean something different for each placement plugin?
   * The balancing code I saw for the Simple placement plugin seems more complicated than the `minimizeCores` one. And in Affinity we don't seem to really deal with the `withCollection` case, in the sense that we would probably want to move shards of both collections at the same time - we only check whether the destination node already has that collection, correct? Anyway, this isn't meant as criticism, just to "sell" the crazy idea of having the existing placement plugins **not** implement rebalancing and introducing new ones instead, like:
   ** the Simple plugin as you've just implemented it (under a different name)
   ** the minimizeCores plugin as it is? Or maybe it's simple enough to just remain as you've extended it now?
   ** for the Affinity plugin, starting from scratch would avoid dealing (at least initially) with edge cases like `withCollection` or maybe even replica types.
   * Do I get it right that we soft-enforce constraints like AZ-awareness by adding big weights? If so, would it be better down the road to add them as hard constraints, maybe in `canAddReplica`? That way we only make balancing moves that are "legal", e.g. never two copies of the same shard in the same AZ or on the same node, or on a node with too little space, etc. Maybe down the road we can add other constraints. One I find really useful is the total number of shards of a collection on a node: if the cluster shrinks below a certain size, I prefer not to choke the surviving nodes with more replicas and trigger a domino effect. A sketch of what I mean follows this list.
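
   To illustrate that last constraint, a minimal sketch - hypothetical types and signature, not the PR's actual `canAddReplica`:

   ```java
   import java.util.Map;

   // Made-up hard constraint: cap the number of replicas a collection
   // may have on a single node.
   class MaxShardsPerCollectionConstraint {
     private final int maxPerCollectionPerNode;

     MaxShardsPerCollectionConstraint(int maxPerCollectionPerNode) {
       this.maxPerCollectionPerNode = maxPerCollectionPerNode;
     }

     // replicasOnNode: collection name -> number of its replicas already
     // on this node. Reject the placement outright instead of giving it
     // a huge weight, so an "illegal" move is never picked no matter how
     // the weights work out.
     boolean canAddReplica(Map<String, Integer> replicasOnNode, String collection) {
       return replicasOnNode.getOrDefault(collection, 0) < maxPerCollectionPerNode;
     }
   }
   ```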
   
   This last bit on "total shards per collection per node" isn't really about rebalancing, but about placing a new - e.g. recovered - replica. Still, now that we're in the realm of placement plugins, we might as well use weights for "placing" replicas in other contexts. And I think your PR already does that, no? If I simply create a new replica, the placement plugin will put it on the node with the lowest weight, correct?
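
   If I got that right, my mental model is roughly this (made-up types again, just checking my understanding):

   ```java
   import java.util.Comparator;
   import java.util.Map;
   import java.util.Optional;
   import java.util.function.Predicate;

   // Drop the nodes that fail the hard constraints, then put the new
   // replica on the lowest-weight node that survives.
   class LowestWeightPlacement {
     static Optional<String> pickNode(Map<String, Integer> nodeWeights,
                                      Predicate<String> canAddReplica) {
       return nodeWeights.keySet().stream()
           .filter(canAddReplica)
           .min(Comparator.comparingInt(nodeWeights::get));
     }
   }
   ```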

