ddanielr opened a new issue, #4529:
URL: https://github.com/apache/accumulo/issues/4529

   **Is your feature request related to a problem? Please describe.**
   
   Scan server file references contain a file path and the uuid of the scan 
server for the GC to determine if a deletion candidate is still in use. 
   So, as scan servers are scaled for user read activity, the number of 
metadata table entries also increases. 
   
   This creates additional load on the metadata tservers for what could be a 
short duration action (i.e. scaling in response to high client reads) and could 
cause the need for additional tservers to be started for hosting metadata 
tablets.
   Once those tservers are up, they need to balance the metadata tablets across 
the new set of tservers. 
   This could impact other system actions that rely on quick metadata reads and 
writes, as well as load balancing settings on the tservers (host regex, split 
point calculation, etc.).  
   
   Likewise, once scan servers are scaled in, the number of tservers would also 
be scaled in to conserve compute resources. 
   This scale-in action would cause a number of hosting related operations to 
occur on the metadata table, which could impact system performance. 
   
   **Describe the solution you'd like**
   
   Once #4528 is complete, scan server file refs no longer need to be 
managed for the root or metadata tables.
   At that point, they should be moved to their own table so that client read 
behavior no longer impacts the tservers hosting the metadata table. 
   
   Auto scaling actions can then be constrained to a client's resource group, 
rather than creating hosting churn on the metadata table, which affects all users. 
   Likewise, it further isolates critical system information (tablet hosting, 
rfile fencing, etc) from transitory information (scan server references, 
problem reports, blip markers). 
   
   If Accumulo is deployed via a Helm chart, this also allows tservers 
to be spun up automatically in a scan server resource group to host the scan 
server references for that collection of scan servers. 
   
   The GC would need to scan this table instead; however, because it is now a 
full table, it's easier to validate that all references were found.
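   
   As a rough illustration of that GC check, here is a minimal sketch that 
models the dedicated table as a plain list of (file path, scan server uuid) 
entries. `ScanRef` and `deletable` are hypothetical names for this example, 
not Accumulo APIs, and the file paths are made up.

```python
from typing import Iterable, List, NamedTuple


class ScanRef(NamedTuple):
    """Hypothetical model of one scan server file reference."""
    file_path: str
    server_uuid: str


def deletable(candidates: Iterable[str], scan_refs: Iterable[ScanRef]) -> List[str]:
    """Return the deletion candidates that no scan server reference points at."""
    # Stand-in for a full scan of the dedicated scan reference table.
    in_use = {ref.file_path for ref in scan_refs}
    return [c for c in candidates if c not in in_use]


# Made-up example data: F1.rf is still referenced by a scan server, F2.rf is not.
refs = [ScanRef("t-001/F1.rf", "uuid-a")]
print(deletable(["t-001/F1.rf", "t-001/F2.rf"], refs))
```

   Since every row in such a table is a scan server reference, the check is a 
single pass over the whole table rather than a prefix range inside the 
metadata table.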
   
   **Describe alternatives you've considered**
   
   Thought about managing the metadata table with auto-created splits, but the 
scale-in operations would always end up impacting metadata interactions. 
   
   **Additional context**
   The scan server prefix could be dropped in favor of the scan server resource 
group. 
   This could help increase the rate at which the manager removes old scan 
server references, allowing a full resource group to be removed as opposed to 
having to check each individual uuid. 
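   
   To make the difference concrete, here is a toy in-memory sketch of 
references keyed by resource group. `ScanRefTable` and its methods are 
hypothetical and only model the idea; they do not reflect an existing Accumulo 
schema or API.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple


class ScanRefTable:
    """Toy stand-in for a scan reference table keyed by resource group (hypothetical)."""

    def __init__(self) -> None:
        # resource group -> {(scan server uuid, file path)}
        self._rows: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)

    def add(self, group: str, server_uuid: str, file_path: str) -> None:
        self._rows[group].add((server_uuid, file_path))

    def remove_server(self, group: str, server_uuid: str) -> int:
        """Per-uuid cleanup: every entry in the group has to be inspected."""
        entries = self._rows.get(group, set())
        stale = {e for e in entries if e[0] == server_uuid}
        entries -= stale
        return len(stale)

    def remove_group(self, group: str) -> int:
        """Group cleanup: drop all references for the group in one step."""
        return len(self._rows.pop(group, set()))

    def files_in_use(self) -> Set[str]:
        return {path for entries in self._rows.values() for _, path in entries}


table = ScanRefTable()
table.add("rg1", "uuid-a", "F1.rf")
table.add("rg1", "uuid-b", "F2.rf")
table.add("rg2", "uuid-c", "F3.rf")
removed = table.remove_group("rg1")  # one operation instead of one per uuid
```

   In a real table, the group-level removal would presumably map to a single 
range delete over the group's rows, but that mapping is an assumption here.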
   
   For a helm chart deployment, the scan server's resource group could also be 
used to balance a given resource group's scan server references on that same 
resource group's tserver.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@accumulo.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
