Hi, I've had this happen on multiple clusters recently: after a restart, locality dropped from close to (or exactly) 100% down to single digits. The reason is that all regions were completely shuffled and reassigned to random servers. Upon reading the (yet again non-trivial) assignment code, I found that a single missing server triggers a full "recovery" of all servers, which drops all previous assignments and reassigns regions randomly.
This is just terrible! In addition, I had assumed that - at least with the StochasticLoadBalancer - the balancer checks which node holds most of a region's data, locality-wise, and picks that server. But that is _not_ the case! It just spreads everything seemingly randomly across the servers. To me this is a big regression (or an outright bug), given that a single server out of, say, hundreds can trigger this and destroy locality completely. Running a major compaction afterwards is not an option, for many reasons. This used to work better - why the regression? Lars
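For clarity, the behavior I expected is roughly the following. This is a hypothetical sketch, not the actual balancer code; the names (pickServer, the locality map) are made up for illustration. For each region, given a map from server to the fraction of the region's HDFS blocks held locally, pick the live server with the highest fraction:

```java
import java.util.Map;
import java.util.Set;

public class LocalityAwareAssign {

    // Hypothetical helper: given a region's locality per server
    // (server -> fraction of the region's HDFS blocks stored locally),
    // pick the live server holding the most data for that region.
    static String pickServer(Map<String, Double> localityByServer,
                             Set<String> liveServers) {
        String best = null;
        double bestLocality = -1.0;
        for (String server : liveServers) {
            // Servers with no known locality data count as 0.0
            double locality = localityByServer.getOrDefault(server, 0.0);
            if (locality > bestLocality) {
                bestLocality = locality;
                best = server;
            }
        }
        // Returns null only if there are no live servers at all
        return best;
    }

    public static void main(String[] args) {
        Map<String, Double> locality = Map.of("rs1", 0.95, "rs2", 0.03, "rs3", 0.02);

        // All servers up: the region should stay on rs1
        System.out.println(pickServer(locality, Set.of("rs1", "rs2", "rs3")));

        // rs1 is down: only that region's assignment needs to change,
        // and it should go to the best remaining server (rs2),
        // instead of shuffling every region in the cluster
        System.out.println(pickServer(locality, Set.of("rs2", "rs3")));
    }
}
```

The point is that losing one server should only force new assignments for the regions that server held, not a random reassignment of everything.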