HoustonPutman commented on PR #1650:
URL: https://github.com/apache/solr/pull/1650#issuecomment-1584658161

   I also want to point out that the balanceReplicas logic is _decent_, it's 
not perfect.
   
   In the affinityPlugin, adding/removing replicas from a Node will change the 
weights of other nodes. This is why we needed the `WeightedNode.addReplica()` 
to return a boolean if other nodes might have lower weights after the addition. 
The `computePlacement()` logic is perfectly fine for other nodes to have their 
weights increased silently, since we check if their weight is out-of-date when 
picking the lowest weight. However nodes that silently have their weights 
decreased will not be at the front of the priorityQueue, to have their weights 
re-checked and computed. So the "entire" list needs to be re-sorted.
   
   However, this is ok in `computePlacements()` because it is only looking for 
nodes with the lowest weights. For `computeBalancing()` it's using a TreeSet to 
find nodes with the lowest and highest weights. Therefore, any other nodes 
having their weights silently change (via a different node using `addReplica()` 
or `removeReplica()`) need to be resorted if their weight goes up or down, for 
optimal placement. Currently we don't do this, because it would likely slow 
down the computation massively for the affinityPlugin when using `spreadDomain` 
or `availabilityZone`. Since for every placement, `x` or `n - x` nodes would 
have to be resorted.
   
   So in the meantime the balancing will be good, but it won't be optimal 
(likely). The `computeBalancing()` loop will continue until there are no 
replicas that can be moved around to make weighting more optimized, so maybe 
I'm over thinking this anyways...
   
   Anyways, I think this is a fine concession to get the feature in Solr. And 
it's something we can make incremental improvements on in the future. (This is 
also why I left a `@lucene.experimental` tag on `WeightedNode`. The 
improvements we make might need different method signatures, like I had to make 
`addReplica()` return a boolean to make `computePlacements` optimal.)
   
   I'll add comments to the `computeBalancing()` method to make this clear to 
future people wanting to work on the code.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org

Reply via email to