[ 
https://issues.apache.org/jira/browse/MESOS-9254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16695121#comment-16695121
 ] 

Chun-Hung Hsiao commented on MESOS-9254:
----------------------------------------

Thought dumps:

1. Change {{reconcileStoragePools}} to reconcile both storage pools and 
preprovisioned volumes, similar to {{reconcileResourceProviderState}}.
2. Invoke {{reconcileStoragePools}} periodically, with a proper default 
interval.
3. Remove the calls to {{reconcileStoragePools}} in {{watchProfiles}} and 
{{applyDestroyDisk}}. These reconciliations could be done together with 2. 
Alternatively, we can keep the call in {{watchProfiles}} to avoid the delay, 
but that needs some coordination between that call and 2.
4. Most importantly, don't drop operations during reconciliations, since this 
leads to a poor user experience. The approach I have in mind is that, in 
{{applyOperation}}, we wait for any ongoing reconciliation to finish before 
checking the resource version and accepting the operation. The invariant here 
is that once an operation has been accepted, it is guaranteed to be applicable 
to the current set of total resources, so the next reconciliation would have to 
wait for these operations to become terminal.
5. Optionally, we can optimistically apply the operation even if the resource 
version does not match after the reconciliation to improve user experience, 
provided that we handle the pipeline dependency properly.
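
To make the coordination in 2 and 4 concrete, here is a rough sketch in Python 
with asyncio. It is a toy model, not SLRP code: {{ProviderSketch}}, 
{{apply_operation}}, {{reconcile}}, {{reconcile_periodically}} and the integer 
resource version are all hypothetical names, and it assumes the resource 
version changes only when a reconciliation actually changes the total 
resources. An incoming operation waits for any ongoing reconciliation instead 
of being dropped, and a reconciliation waits for accepted operations to become 
terminal.

```python
import asyncio


class ProviderSketch:
    """Toy model of the proposed coordination (hypothetical, not Mesos code)."""

    def __init__(self):
        self._cond = asyncio.Condition()
        self._reconciling = False
        self._pending = 0            # accepted operations that are not terminal
        self.resource_version = 0    # assumption: a plain counter, not a UUID
        self.log = []

    async def apply_operation(self, name, version, work):
        async with self._cond:
            # Wait for an ongoing reconciliation rather than dropping the op.
            await self._cond.wait_for(lambda: not self._reconciling)
            if version != self.resource_version:
                self.log.append(f"reject {name}")
                return False
            self._pending += 1
            self.log.append(f"accept {name}")
        try:
            await work()
        finally:
            async with self._cond:
                self._pending -= 1
                self.log.append(f"terminal {name}")
                self._cond.notify_all()
        return True

    async def reconcile(self, resources_changed):
        async with self._cond:
            # Serialize reconciliations, then hold back new operations.
            await self._cond.wait_for(lambda: not self._reconciling)
            self._reconciling = True
            # Accepted operations must become terminal first.
            await self._cond.wait_for(lambda: self._pending == 0)
            if resources_changed:
                self.resource_version += 1
            self.log.append("reconciled")
            self._reconciling = False
            self._cond.notify_all()

    async def reconcile_periodically(self, interval_seconds):
        # Item 2: periodic reconciliation; the interval is a free parameter.
        while True:
            await asyncio.sleep(interval_seconds)
            await self.reconcile(resources_changed=True)


async def demo(resources_changed):
    provider = ProviderSketch()
    disk_created = asyncio.Event()

    async def noop():
        pass

    # CREATE_DISK is accepted and stays non-terminal until the event is set.
    first = asyncio.create_task(
        provider.apply_operation("CREATE_DISK", 0, disk_created.wait))
    await asyncio.sleep(0)
    # The reconciliation starts and waits for CREATE_DISK.
    recon = asyncio.create_task(provider.reconcile(resources_changed))
    await asyncio.sleep(0)
    # DESTROY_DISK arrives mid-reconciliation and is queued, not dropped.
    second = asyncio.create_task(
        provider.apply_operation("DESTROY_DISK", 0, noop))
    await asyncio.sleep(0)
    disk_created.set()
    results = await asyncio.gather(first, recon, second)
    return provider.log, results


if __name__ == "__main__":
    for changed in (False, True):
        print(changed, *asyncio.run(demo(changed)))
```

In this model, the queued operation is accepted when the reconciliation leaves 
the resources unchanged and is rejected only when the version really moved, 
which is the case item 5 would relax.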

> Make SLRP be able to update its volumes and storage pools.
> ----------------------------------------------------------
>
>                 Key: MESOS-9254
>                 URL: https://issues.apache.org/jira/browse/MESOS-9254
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Chun-Hung Hsiao
>            Assignee: Chun-Hung Hsiao
>            Priority: Critical
>              Labels: mesosphere, storage
>
> We should consider making SLRP update its resources periodically, or adding 
> an endpoint to trigger that, for the following reasons:
> 1. Mesos currently assumes all profiles have disjoint storage pools. This is 
> because Mesos models each resource independently. However, in practice an 
> operator can set up, say, two profiles, one for linear volumes and one for 
> raid volumes, and an "LVM" resource provider that can provision both linear 
> and raid volumes. The correlation between the storage pools of the linear and 
> raid profiles would reduce one's pool capacity when a volume of the other 
> type is provisioned. To reflect the actual sizes of correlated storage pools, 
> we need a way to make SLRP update its resources.
> 2. The SLRP now only queries the CSI plugin to report a list of volumes 
> during startup, so if a new device is added, the operator will have to 
> restart the agent to trigger another SLRP startup, which is inconvenient.
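
The correlated-pool problem in point 1 of the description can be illustrated 
with a small arithmetic sketch. Everything here is an assumption made for 
illustration: a single volume group with 100 GiB of raw space, a "raid" 
profile that mirrors data twice, and the hypothetical names 
{{VolumeGroupSketch}} and {{reconcile_storage_pools}}. It only shows why the 
advertised pool sizes go stale until they are queried again.

```python
RAID1_COPIES = 2  # assumption: the "raid" profile keeps two mirrored copies


class VolumeGroupSketch:
    """Toy backend where both profiles draw from the same raw space."""

    def __init__(self, raw_free_gib):
        self.raw_free_gib = raw_free_gib

    def _raw_cost_per_gib(self, profile):
        return RAID1_COPIES if profile == "raid" else 1

    def provision(self, profile, size_gib):
        cost = size_gib * self._raw_cost_per_gib(profile)
        if cost > self.raw_free_gib:
            raise ValueError("not enough raw capacity")
        self.raw_free_gib -= cost

    def get_capacity(self, profile):
        # What a per-profile capacity query would report right now.
        return self.raw_free_gib // self._raw_cost_per_gib(profile)


def reconcile_storage_pools(volume_group, profiles):
    # Re-query every profile so correlated pools are refreshed together.
    return {p: volume_group.get_capacity(p) for p in profiles}


profiles = ["linear", "raid"]
vg = VolumeGroupSketch(raw_free_gib=100)
advertised = reconcile_storage_pools(vg, profiles)

# Provisioning a linear volume also shrinks the raid pool behind the scenes.
vg.provision("linear", 20)
refreshed = reconcile_storage_pools(vg, profiles)

print("advertised:", advertised)
print("refreshed: ", refreshed)
```

Until the second query runs, the raid pool is still advertised at its old 
size, which is the gap a periodic or on-demand update would close.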



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
