+ Gaston and Greg, who might be working on this.

On Tue, Aug 29, 2017 at 6:59 AM, James DeFelice <ja...@mesosphere.io> wrote:

> What are the next steps for moving this forward, Jie? I'm very interested
> in seeing status updates for operations land sooner rather than later.
>
> On Wed, Aug 23, 2017 at 5:55 PM, Gabriel Hartmann <gabr...@mesosphere.io>
> wrote:
>
>> Please can the "reason" be the reason for the failure and NOT the reason
>> the message was sent, e.g. "RECONCILIATION"
>>
>> On Wed, Aug 23, 2017 at 1:58 PM Yan Xu <xuj...@apple.com> wrote:
>>
>>> Yeah, a reason for failed operations is probably useful for all resource
>>> operations. It looks like the task-style status update is still the best
>>> approach.
>>>
>>> ---
>>> @xujyan <https://twitter.com/xujyan>
>>>
>>> On Wed, Aug 23, 2017 at 11:40 AM, Jie Yu <yujie....@gmail.com> wrote:
>>>
>>>> We should continue the discussion here:
>>>>
>>>> I think I forgot to mention one important reason that I went for the
>>>> operation based reconciliation API proposal. For new operations like
>>>> CREATE_VOLUME/CREATE_BLOCK, not only do we need to know the end result
>>>> (the resources) if it's successful, we also need to know the failure
>>>> reason if it fails. For instance, imagine you're creating an EBS volume
>>>> by talking to a CSI EBS plugin. Surfacing the creation error (e.g.,
>>>> retryable or not from the CSI plugin) will be useful for the scheduler
>>>> to determine the next step.
>>>>
>>>> I don't think a resources based reconciliation API can address this.
>>>> Maybe we can add both if we feel both are useful?
>>>>
>>>> Thoughts?
>>>> - Jie
>>>>
>>>> On Wed, Aug 23, 2017 at 11:26 AM, Jie Yu <yujie....@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We had a discussion on a very early proposal (see the attached
>>>>> slides) on providing feedback for offer operations (e.g.,
>>>>> CREATE/DESTROY, RESERVE/UNRESERVE, etc.) with a bunch of folks from
>>>>> the community. Here are the notes I captured in the meeting:
>>>>>
>>>>>    - One alternative approach discussed was to have best effort
>>>>>    feedback, and a resources based reconciliation API allowing a
>>>>>    framework to query the resources on a given resource provider or
>>>>>    agent. That way, we don't necessarily need the status update
>>>>>    mechanism for offer operations, which causes complexity in the
>>>>>    frameworks.
>>>>>    - In the current proposal, do we need agent_id (or resource
>>>>>    provider id) when performing reconciliation for that operation?
>>>>>    The reason we require that in the task reconciliation case is
>>>>>    that the agent might not have re-registered yet during master
>>>>>    failover.
>>>>>    - We need to mock up the operator API for this work.
>>>>>    - What's the order guarantee for the operations specified in one
>>>>>    API call?
>>>>>    - Wish list
>>>>>       - Tie reservations to a framework instead of a role.
>>>>>       - When a framework is torn down, automatically release the
>>>>>       resources reserved for that framework.
>>>>>
>>>>> If I missed anything, please reply to this thread! Thanks!
>>>>>
>>>>> https://docs.google.com/presentation/d/1Mef8K3aLIuzcFVc3MnAo64TkjpyTWarYVShtvCN4e48/edit?usp=sharing
>>>>>
>>>>> - Jie
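
To make the distinction raised in the thread concrete, here is a minimal sketch of a task-style status update for an offer operation that keeps the failure reason separate from the reason the message was sent. Everything in it is hypothetical and for illustration only: the names `OperationStatus`, `OperationState`, `UpdateSource`, `failure_reason`, `retryable`, and `next_step` are assumptions, not part of any existing Mesos API or of the proposal in the slides.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OperationState(Enum):
    # Hypothetical lifecycle states for an offer operation.
    PENDING = "PENDING"
    FINISHED = "FINISHED"
    FAILED = "FAILED"


class UpdateSource(Enum):
    # Hypothetical: why the update message was sent,
    # as opposed to why the operation failed.
    STATE_CHANGE = "STATE_CHANGE"
    RECONCILIATION = "RECONCILIATION"


@dataclass(frozen=True)
class OperationStatus:
    operation_id: str
    state: OperationState
    source: UpdateSource
    # Set only when state is FAILED; describes the failure itself,
    # e.g. an error surfaced by a storage plugin (assumed field).
    failure_reason: Optional[str] = None
    # Whether the failure is worth retrying (assumed field).
    retryable: bool = False


def next_step(status: OperationStatus) -> str:
    """Decide what a scheduler might do with an operation status update."""
    if status.state is OperationState.FINISHED:
        return "use-resources"
    if status.state is OperationState.FAILED:
        return "retry" if status.retryable else "give-up"
    return "wait"


if __name__ == "__main__":
    # A failed volume creation reported in response to a reconciliation
    # request: the source is RECONCILIATION, the failure reason is separate.
    status = OperationStatus(
        operation_id="op-1",
        state=OperationState.FAILED,
        source=UpdateSource.RECONCILIATION,
        failure_reason="volume plugin reported a transient error",
        retryable=True,
    )
    print(next_step(status))
```

In this sketch the scheduler's decision depends only on the state and the failure details, never on `source`, which is one way to read the request that "reason" describe the failure rather than why the message was sent.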