ddanielr opened a new issue, #4529: URL: https://github.com/apache/accumulo/issues/4529
**Is your feature request related to a problem? Please describe.**

Scan server file references contain a file path and the uuid of the scan server so that the GC can determine whether a deletion candidate is still in use. As a result, when scan servers are scaled out for user read activity, the number of metadata table entries also increases.

This creates additional load on the metadata tservers for what could be a short-duration action (i.e. scaling in response to high client reads) and could require additional tservers to be started to host metadata tablets. Once those tservers are up, the metadata tablets need to be balanced across the new set of tservers. This could impact other system actions that rely on quick metadata reads and writes, as well as load balancing settings on the tservers (host regex, split point calculation, etc).

Likewise, once scan servers are scaled in, the number of tservers would also be scaled in to conserve compute resources. This scale-in action would cause a number of hosting-related operations to occur on the metadata table, which could impact system performance.

**Describe the solution you'd like**

Once #4528 is complete, the scan server file refs do not need to be managed for the root or metadata tables. At that point, they should be moved to their own table so that client read behavior no longer impacts the tservers hosting the metadata table.

Auto scaling actions can then be constrained to a client's resource group, instead of creating hosting churn on the metadata table, which affects all users. Likewise, it further isolates critical system information (tablet hosting, rfile fencing, etc) from transitory information (scan server references, problem reports, blip markers).

If Accumulo were deployed via a Helm chart, this would also allow tservers to be spun up automatically in a scan server resource group to host the scan server references for that collection of scan servers.
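
As a rough illustration of the idea (not Accumulo code), the toy model below keeps scan server references in their own structure and lets a GC-like check consult only that structure. The names `ScanServerRef`, `ScanRefTable`, and `safe_to_delete`, along with the file paths and UUIDs, are hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ScanServerRef:
    """One reference: a file path plus the uuid of the scan server using it."""
    file_path: str
    server_uuid: str


@dataclass
class ScanRefTable:
    """Hypothetical stand-in for a dedicated scan server reference table."""
    refs: set = field(default_factory=set)

    def add(self, file_path, server_uuid):
        self.refs.add(ScanServerRef(file_path, server_uuid))

    def remove_server(self, server_uuid):
        # Drop every reference held by one scan server, e.g. after scale-in.
        self.refs = {r for r in self.refs if r.server_uuid != server_uuid}

    def referenced_files(self):
        return {r.file_path for r in self.refs}


def safe_to_delete(candidates, table):
    """Return the deletion candidates that no scan server still references."""
    in_use = table.referenced_files()
    return sorted(c for c in candidates if c not in in_use)


# Example: two scan servers each hold one file; a third file is unreferenced.
table = ScanRefTable()
table.add("/t1/F1.rf", "uuid-a")
table.add("/t1/F2.rf", "uuid-b")
candidates = ["/t1/F1.rf", "/t1/F2.rf", "/t1/F3.rf"]

before = safe_to_delete(candidates, table)
table.remove_server("uuid-b")  # scan server scaled in
after = safe_to_delete(candidates, table)
```

In this model, scaling scan servers in or out only changes the contents of the reference structure; nothing that stands in for the metadata table is touched.
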
The GC would need to scan this table instead. However, because it is now a full table, it is easier to validate that all references were found.

**Describe alternatives you've considered**

Thought about managing the metadata table with auto-created splits, but the scale-in operations would still impact metadata interactions.

**Additional context**

The scan server prefix could be dropped in favor of the scan server resource group. This could help increase the rate at which the manager removes old scan server references, since a full resource group could be removed at once instead of checking each individual uuid.

For a Helm chart deployment, the scan server's resource group could also be used to balance a given resource group's scan server references onto that same resource group's tservers.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscr...@accumulo.apache.org.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org