[GitHub] [accumulo] dlmarion commented on issue #3211: OnDemand Tables: Determine how OnDemand tablets are brought online

via GitHub Tue, 07 Mar 2023 11:28:43 -0800


dlmarion commented on issue #3211:
URL: https://github.com/apache/accumulo/issues/3211#issuecomment-1458706511


   > > > Is there a mechanism to alert on approaching resource limits or 
evaluate the cluster's current size/use before marking a tablet as onDemand?
   > > 
   > > 
   > > Given that we don't know which schedulers might be in use on a cluster, 
I think we just need to emit metrics that can be used for a scheduling system 
to make a determination that more tablet servers are needed. Accumulo doesn't 
do any alerting, that's typically set up by the users of the system - trigger 
an alert when some criteria are met.
   > 
   > 👍 Using an external metric collection system makes sense.
   > 
   > > > Given the possibility of a limited resource footprint, what mechanism 
is going to be used for scheduling the tablet hosting?
   > 
   > > I don't think we should build or use a specific scheduler. 
[KEDA](https://keda.sh/) and 
[HPA](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/)
 exist already and I'm sure the different commercial cloud vendors may supply 
their own solutions as well (Azure has 
[VPA](https://learn.microsoft.com/en-us/azure/aks/vertical-pod-autoscaler) for 
example).
   > 
   > Scheduling was probably the wrong word to use. But I agree on using a 
pre-defined scheduler for k8s resources. I think this question is wrapped into 
the priority discussion.
   > 
   > > > What determines hosting priority of onDemand tables? should this 
exist? should it be a per-client setting or perhaps per table?
   > 
   > > I don't think we have had any discussion of priority. Can you give an 
example of how priority would work?
   > 
   > Please correct any inaccuracy here.
   > 
   > Let's say we have an Accumulo cluster that does not have tserver groups 
implemented and services two clients. Both clients have tables that, 
individually, fit within the cluster's resource footprint but when hosted at 
the same time exceed the current resources.
   >
   
   What does it mean to exceed? I believe that the limitation to a TabletServer 
hosting a Tablet is the amount of memory given to the TabletServer. If a 
TabletServer can host N Tablets, it will still be limited as to how many 
tablets can read/write concurrently based on other properties. If a 
TabletServer can't host N tablets, then it needs more memory, or more 
TabletServers need to be stood up so that the load can be distributed. I would 
say that determining how many Tablets a TabletServer can host would be 
difficult as the TabletMetadata is not the same size for each Tablet. It might 
be possible to use some rough upper bound on the amount of memory required to 
host each Tablet to determine how many TabletServers are needed.
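
   To illustrate the rough-upper-bound idea, here is a minimal sketch, not 
Accumulo code. The function name, the per-tablet bound, and the reserved 
fraction are all assumptions made up for this example; real values would have 
to come from measurement.

```python
import math


def tablet_servers_needed(num_tablets, per_tablet_upper_bound_bytes,
                          tserver_memory_bytes, reserved_fraction=0.25):
    """Estimate how many TabletServers are needed to host num_tablets.

    Hypothetical helper: every parameter is an assumption for illustration.
    reserved_fraction is the share of TabletServer memory set aside for
    everything other than hosting tablets (caches, in-memory maps, etc.).
    """
    if num_tablets < 0:
        raise ValueError("num_tablets must be non-negative")
    if not 0 <= reserved_fraction < 1:
        raise ValueError("reserved_fraction must be in [0, 1)")

    # Memory left for hosting tablets after the reserved share is removed.
    usable_bytes = tserver_memory_bytes * (1 - reserved_fraction)

    # Pessimistic count: assume every tablet costs the upper bound.
    tablets_per_server = int(usable_bytes // per_tablet_upper_bound_bytes)
    if tablets_per_server < 1:
        raise ValueError("a TabletServer cannot host even one tablet")

    return math.ceil(num_tablets / tablets_per_server)


if __name__ == "__main__":
    # 10,000 tablets, 1 MiB assumed per tablet, 4 GiB per TabletServer.
    print(tablet_servers_needed(10_000, 1024**2, 4 * 1024**3))
```

   Because the bound is pessimistic, the result overestimates the number of 
servers, which is the safer direction for a scheduling system to err in.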
    
   > If my understanding is correct, both clients can attempt writes on 
"onDemand" tables that are currently unloaded. Or, both clients could request 
that tables be moved from offline to "onDemand". Each of these actions would 
result in the tablets being marked for assignment and an attempt to load them onto 
tservers via the TabletGroupWatcher.
   > 
   
   Moving the table state from `offline` to `ondemand` would not result in 
tablets being loaded. The tablets for an ondemand table are only loaded when 
the client needs them for live ingest or scans.
   
   > Since there are not enough resources for both tables to be fully hosted, 
will the TabletGroupWatcher discern which "ondemand" tablets should be hosted 
first? Or does it just see a pool of tablets that need to be assigned and will 
do so indiscriminately?
   > 
   The latter
   
   > If it's the latter, since the clients are attempting actions at a table 
level vs a tablet level, does that mean all tablets for a given table need to 
be assigned and loaded before the action can be completed?
   > 
   
   The way that I currently have this wired up, the tablets are loaded when 
they are required by the client. The client tries to determine the location 
(which TabletServer) a tablet is hosted on, so that it can send read or write 
requests. If the tablet is not hosted, and the table is ondemand, it sends a 
request for the tablet to be hosted which incurs some latency cost. So, the 
number of tablets that need to be hosted for an ondemand table is the sum of 
the tablets involved in read and write requests from all of the clients. If you 
do an unbounded scan, then it's going to eventually host the entire table, but 
that's what the user requested of the client.
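
   The flow described above can be sketched as a small simulation. This is 
not the Accumulo client API: `Cluster`, `locate`, `request_hosting`, and 
`resolve_location` are hypothetical names invented for the example, and the 
tserver name is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for cluster state; not an Accumulo class."""
    table_states: dict                           # table -> "online" | "offline" | "ondemand"
    hosted: dict = field(default_factory=dict)   # (table, tablet) -> tserver
    hosting_requests: int = 0

    def locate(self, table, tablet):
        # Returns the hosting tserver, or None if the tablet is not hosted.
        return self.hosted.get((table, tablet))

    def request_hosting(self, table, tablet):
        # In the real system this is where the latency cost is paid.
        self.hosting_requests += 1
        self.hosted[(table, tablet)] = "tserver-1"  # placeholder assignment
        return self.hosted[(table, tablet)]


def resolve_location(cluster, table, tablet):
    """Find where a tablet is hosted, requesting hosting only if needed."""
    location = cluster.locate(table, tablet)
    if location is not None:
        return location
    if cluster.table_states.get(table) == "ondemand":
        return cluster.request_hosting(table, tablet)
    raise RuntimeError(f"tablet {tablet!r} of table {table!r} is not hosted")


if __name__ == "__main__":
    cluster = Cluster({"t1": "ondemand"})
    # Marking the table ondemand loads nothing; tablets load when touched.
    for tablet in ["a", "b", "a"]:
        resolve_location(cluster, "t1", tablet)
    print(sorted(cluster.hosted), cluster.hosting_requests)
```

   Only the tablets that some client actually touches end up hosted, and a 
tablet that is already hosted is not requested again.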
   
   > If so, then as part of bringing tablets online, a priority level (1-x) or 
an "onDemand" request timestamp could be included in the metadata. Then the 
TabletGroupWatcher could ensure all tablets of a specific "onDemand" request 
would be fully hosted prior to assigning a second "onDemand" request's tablets.
   > 
   > Otherwise, wouldn't there be a resource blocking issue where tablets of 
both tables are being assigned, but each table cannot fully be hosted due to 
resource constraints?
   
   If we end up with a mechanism for determining reliably how many tablets a 
TabletServer can host, then I agree we could introduce some priority scheme. 
Without that, and in its current form, the code just does what is asked. In 
#3212 we will create some mechanism for unloading ondemand tablets; I'm not sure 
what that will look like yet. I think there are several possibilities, for 
example: a TabletServer could host N (configurable) ondemand tablets and unload 
them in an LRU fashion.
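
   As one illustration of the LRU possibility, here is a minimal sketch. It is 
not what #3212 will implement; the class name `OnDemandTabletCache` and the 
`max_hosted` parameter are assumptions standing in for the configurable N.

```python
from collections import OrderedDict


class OnDemandTabletCache:
    """Tracks hosted ondemand tablets and evicts the least recently used.

    Hypothetical sketch: max_hosted plays the role of the configurable N.
    """

    def __init__(self, max_hosted):
        if max_hosted < 1:
            raise ValueError("max_hosted must be at least 1")
        self.max_hosted = max_hosted
        self._tablets = OrderedDict()  # oldest access first, newest last

    def touch(self, tablet_id):
        """Record an access; return the tablets that must be unloaded."""
        if tablet_id in self._tablets:
            # Already hosted: just mark it as most recently used.
            self._tablets.move_to_end(tablet_id)
            return []

        self._tablets[tablet_id] = True
        unloaded = []
        while len(self._tablets) > self.max_hosted:
            oldest, _ = self._tablets.popitem(last=False)
            unloaded.append(oldest)
        return unloaded

    def hosted(self):
        """Hosted tablets, least recently used first."""
        return list(self._tablets)


if __name__ == "__main__":
    cache = OnDemandTabletCache(max_hosted=2)
    for tablet in ["a", "b", "a", "c"]:
        print(tablet, "->", cache.touch(tablet))
```

   A count-based limit like this is simple to configure, but it ignores that 
tablets differ in memory cost, so it is only as good as the choice of N.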
   


