> What are the next steps for moving this forward, Jie?

James, the next step is to create a design doc for this.
Looks like we're aligned on the high level approach.

On Wed, Aug 30, 2017 at 11:40 AM, Jie Yu <yujie....@gmail.com> wrote:

> + Gaston and Greg
>
> Who might be working on this.
>
> On Tue, Aug 29, 2017 at 6:59 AM, James DeFelice <ja...@mesosphere.io>
> wrote:
>
>> What are the next steps for moving this forward, Jie? I'm very interested
>> in seeing status updates for operations land sooner rather than later.
>>
>> On Wed, Aug 23, 2017 at 5:55 PM, Gabriel Hartmann <gabr...@mesosphere.io>
>> wrote:
>>
>>> Please can the "reason" be the reason for the failure and NOT the reason
>>> the message was sent, e.g. "RECONCILIATION"
>>>
>>> On Wed, Aug 23, 2017 at 1:58 PM Yan Xu <xuj...@apple.com> wrote:
>>>
>>>> Yeah, a reason for failed operations is probably useful for all resource
>>>> operations. It looks like the task-style status update is still the best
>>>> approach.
>>>>
>>>> ---
>>>> @xujyan <https://twitter.com/xujyan>
>>>>
>>>> On Wed, Aug 23, 2017 at 11:40 AM, Jie Yu <yujie....@gmail.com> wrote:
>>>>
>>>>> We should continue the discussion here:
>>>>>
>>>>> I think I forgot to mention one important reason that I went for the
>>>>> operation-based reconciliation API proposal. For new operations like
>>>>> CREATE_VOLUME/CREATE_BLOCK, not only do we need to know the end result
>>>>> (the resources) if it's successful, we also need to know the failure
>>>>> reason if it fails. For instance, imagine you're creating an EBS volume
>>>>> by talking to a CSI EBS plugin. Surfacing the creation error (e.g.,
>>>>> retryable or not from the CSI plugin) will be useful for the scheduler
>>>>> to determine the next step.
>>>>>
>>>>> I don't think a resources-based reconciliation API can address this.
>>>>> Maybe we can add both if we feel both are useful?
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> - Jie
>>>>>
>>>>> On Wed, Aug 23, 2017 at 11:26 AM, Jie Yu <yujie....@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We had a discussion on a very early proposal (see the attached
>>>>>> slides) on providing feedback for offer operations (e.g., CREATE/DESTROY,
>>>>>> RESERVE/UNRESERVE, etc.) with a bunch of folks from the community. Here
>>>>>> are the notes I captured in the meeting:
>>>>>>
>>>>>> - One alternative approach discussed was to have best-effort
>>>>>>   feedback, and a resources-based reconciliation API allowing frameworks
>>>>>>   to query the resources on a given resource provider or agent. That
>>>>>>   way, we don't necessarily need the status update mechanism for offer
>>>>>>   operations, which causes complexity in the frameworks.
>>>>>> - In the current proposal, do we need agent_id (or resource
>>>>>>   provider id) when performing reconciliation for that operation? The
>>>>>>   reason we require that in the task reconciliation case is that the
>>>>>>   agent might not have re-registered yet during master failover.
>>>>>> - We need to mock up the operator API for this work.
>>>>>> - What's the ordering guarantee for the operations specified in one
>>>>>>   API call?
>>>>>> - Wish list
>>>>>>   - Reservations tied to a framework instead of a role.
>>>>>>   - When a framework is torn down, automatically release resources
>>>>>>     reserved for that framework.
>>>>>>
>>>>>> If I missed anything, please reply to this thread! Thanks!
>>>>>>
>>>>>> https://docs.google.com/presentation/d/1Mef8K3aLIuzcFVc3MnAo64TkjpyTWarYVShtvCN4e48/edit?usp=sharing
>>>>>>
>>>>>> - Jie