[jira] [Commented] (MESOS-9448) Semantics of RECONCILE_OPERATIONS framework API call are incorrect
[ https://issues.apache.org/jira/browse/MESOS-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16792789#comment-16792789 ]

Vinod Kone commented on MESOS-9448:
-----------------------------------

[~greggomann] Can we close this as a dup of MESOS-9318?

> Semantics of RECONCILE_OPERATIONS framework API call are incorrect
> ------------------------------------------------------------------
>
>                 Key: MESOS-9448
>                 URL: https://issues.apache.org/jira/browse/MESOS-9448
>             Project: Mesos
>          Issue Type: Bug
>          Components: framework, HTTP API, master
>            Reporter: Benjamin Bannier
>            Priority: Major
>
> The typical pattern in the framework HTTP API is that frameworks send calls
> to which the master responds with {{Accepted}} responses and which trigger
> events. The only designed exception to this are {{SUBSCRIBE}} calls to which
> the master responds with an {{Ok}} response containing the assigned framework
> ID. This is even codified in {{src/scheduler.cpp:646ff}},
> {code}
> if (response->code == process::http::Status::OK) {
>   // Only SUBSCRIBE call should get a "200 OK" response.
>   CHECK_EQ(Call::SUBSCRIBE, call.type());
> {code}
> Currently, the handling of {{RECONCILE_OPERATIONS}} calls does not follow
> this pattern. Instead of sending events, the master immediately responds with
> an {{Ok}} and a list of operations. This leads, e.g., to assertion failures in
> the above hard check whenever one uses {{Scheduler::send}} instead of
> {{Scheduler::call}}. One can reproduce this by modifying the existing tests
> in {{src/operation_reconciliation_tests.cpp}},
> {code}
> mesos.send({createCallReconcileOperations(frameworkId, {operation})}); // ADD THIS.
> const Future result =
>   mesos.call({createCallReconcileOperations(frameworkId, {operation})});
> {code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (MESOS-9448) Semantics of RECONCILE_OPERATIONS framework API call are incorrect
[ https://issues.apache.org/jira/browse/MESOS-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16708723#comment-16708723 ]

Benjamin Bannier commented on MESOS-9448:
-----------------------------------------

Thanks for the additional details, [~gkleiman].

With the current approach we expect the master's HTTP handler to have enough information to assemble a response immediately, i.e., without deferring to the agent or resource provider managers. If one wanted to, say, send different operation statuses for operations on (1) active but currently unsubscribed resource providers and on (2) removed resource providers, we would need to sync at least all resource providers ever active in the cluster to the master. Currently, master and agent communicate via {{UpdateSlaveMessage}}, which is about _active providers_ and their operations, but not about all providers (both present and past), which the master would need in order to distinguish a disconnected provider from a removed one. Explicitly communicating that information to the master seems wasteful (after all, a resource provider manager would have this information already) and potentially not scalable (e.g., to many agents with a lot of provider churn).

Currently, the master sends {{OPERATION_UNKNOWN}} for any resource provider it has not yet seen, which is too coarse-grained for frameworks, see MESOS-9318. It seems that the current semantics impose a huge cost on improving that.

All this would seem much simpler in a world where a call to reconcile operations would asynchronously trigger operation status update events from all involved entities (i.e., agents and resource provider managers). Here the master would defer the work to the entity actually managing that state.
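A minimal sketch of the asymmetry described above, not Mesos code: the types {{MasterView}} and {{ManagerView}}, both functions, and every status name are hypothetical and exist only for illustration. It assumes the master tracks only active providers while a resource provider manager also remembers disconnected and removed ones.

```cpp
#include <set>
#include <string>

// Hypothetical status values for this sketch only; they are not the
// operation states defined by Mesos.
enum class Status { Known, Unknown, ProviderDisconnected, ProviderRemoved };

// Assumption: the master only learns about currently active providers.
struct MasterView {
  std::set<std::string> activeProviders;
};

// Assumption: a resource provider manager also remembers providers that
// are disconnected or were removed.
struct ManagerView {
  std::set<std::string> active;
  std::set<std::string> disconnected;
  std::set<std::string> removed;
};

// With only active providers available, every other provider collapses
// into a single coarse-grained "unknown" answer.
Status reconcileOnMaster(const MasterView& master, const std::string& provider) {
  return master.activeProviders.count(provider) > 0 ? Status::Known
                                                    : Status::Unknown;
}

// The entity that owns the state can tell the cases apart.
Status reconcileOnManager(const ManagerView& manager, const std::string& provider) {
  if (manager.active.count(provider) > 0) return Status::Known;
  if (manager.disconnected.count(provider) > 0) return Status::ProviderDisconnected;
  if (manager.removed.count(provider) > 0) return Status::ProviderRemoved;
  return Status::Unknown;
}
```

In this toy model, a disconnected and a removed provider both yield the same answer from {{reconcileOnMaster}}, while {{reconcileOnManager}} distinguishes them without any extra state being synced to the master.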
[jira] [Commented] (MESOS-9448) Semantics of RECONCILE_OPERATIONS framework API call are incorrect
[ https://issues.apache.org/jira/browse/MESOS-9448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707713#comment-16707713 ]

Gastón Kleiman commented on MESOS-9448:
---------------------------------------

These are the intended semantics for {{RECONCILE_OPERATIONS}}: we decided to follow a request/response pattern instead of an event-based pattern like {{RECONCILE}}.

{{send()}} is a {{void}} method, so we had to add the {{call()}} method in order to use this API call.

We should update the description of {{send()}} in {{scheduler.hpp}} and {{scheduler.cpp}} to make it clear that it can't be used to send {{RECONCILE_OPERATIONS}} requests.
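A toy model of the two entry points discussed in this thread, offered as a sketch under stated assumptions rather than the actual scheduler library: {{CallType}}, {{Response}}, {{handle}}, and the free functions {{send}} and {{call}} are hypothetical stand-ins. It also throws {{std::logic_error}} where the real code uses {{CHECK_EQ}}, so that the failure can be observed instead of aborting the process.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the scheduler API types.
enum class CallType { SUBSCRIBE, RECONCILE, RECONCILE_OPERATIONS };

struct Response {
  int code;          // HTTP status code.
  std::string body;  // Payload, empty for "202 Accepted".
};

// Toy master: SUBSCRIBE and RECONCILE_OPERATIONS are answered with
// "200 OK" and a body; everything else gets "202 Accepted".
Response handle(CallType type) {
  switch (type) {
    case CallType::SUBSCRIBE:
      return {200, "framework-id"};
    case CallType::RECONCILE_OPERATIONS:
      return {200, "operation statuses"};
    default:
      return {202, ""};
  }
}

// Fire-and-forget path: the response body is discarded, and only
// SUBSCRIBE is expected to produce "200 OK" (mirrors the quoted check).
void send(CallType type) {
  const Response response = handle(type);
  if (response.code == 200 && type != CallType::SUBSCRIBE) {
    throw std::logic_error("only SUBSCRIBE should get a 200 OK response");
  }
}

// Request/response path: the response is handed back to the caller.
Response call(CallType type) {
  return handle(type);
}
```

Here {{call(CallType::RECONCILE_OPERATIONS)}} returns the response, while {{send(CallType::RECONCILE_OPERATIONS)}} trips the check, which is the failure mode the issue description reports.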