[ https://issues.apache.org/jira/browse/MESOS-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551569#comment-14551569 ]

Benjamin Hindman commented on MESOS-2735:
-----------------------------------------

There are definitely differences in message queue behavior between the two 
approaches, and one of them is significantly safer than the other. There are 
two safety concerns that I can think of, one of which [~jieyu] has addressed 
here, but I'll repeat it to be sure I've understood it properly.

(1) Someone might write a ResourceEstimator that isn't asynchronous, causing 
the slave to "block" while the resource estimator estimates.

(2) The ResourceEstimator might cause a denial of service attack on the slave.

I understand the concern with (1) but I'm not too anxious about it. Why? It 
should be trivial to write a wrapper module that forces ResourceEstimator 
implementations to be asynchronous, either by using `async` as you suggested 
or by providing a version of ResourceEstimator that wraps an actor (a 
libprocess process). We only need to do this once, and other ResourceEstimator 
implementations can then build on it; see the sketch below.
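
To make that concrete, here's a minimal sketch of the `async` flavor. It 
assumes `estimate()` returns a `Future<Resources>`; `computeEstimate()` and 
the class name are hypothetical stand-ins for whatever synchronous (and 
potentially slow) work a module author writes:

```cpp
#include <process/async.hpp>
#include <process/future.hpp>

#include <mesos/resources.hpp>

using mesos::Resources;
using process::Future;

// Hypothetical synchronous (and potentially slow) estimation routine
// written by a module author.
Resources computeEstimate()
{
  return Resources(); // ... expensive measurement work would go here ...
}

class AsyncWrappedResourceEstimator
{
public:
  Future<Resources> estimate()
  {
    // Run the blocking computation on a libprocess worker thread; the
    // slave only ever sees a Future and is never blocked.
    return process::async(computeEstimate);
  }
};
```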

On the other hand, I don't like the behavior of push because of (2). 
Fundamentally, if the slave can't keep up with the rate at which the 
ResourceEstimator is pushing, then we could create a denial of service issue 
in the slave, i.e., it takes a long time to process non-ResourceEstimator 
messages because its queue is full of ResourceEstimator messages. I'm more 
anxious about (2) than (1) because bugs in (2) are harder to find: once you 
fix (1) it stays fixed forever, but any time you update the estimation 
algorithm you can reintroduce (2).

Now, I acknowledge that implementing this as a pull rather than a push makes 
the ResourceEstimator implementation slightly more complicated, but not by 
much. In particular, any ResourceEstimator implementation can trivially use a 
`Queue` internally to keep push semantics, while still providing pull 
semantics externally (see the sketch below). Make sense?
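
Roughly, and assuming libprocess's `Queue` (i.e., `process::Queue<T>` with 
`put()` and a `get()` that returns a future), that could look something like 
the following; the names here are made up:

```cpp
#include <process/future.hpp>
#include <process/queue.hpp>

#include <mesos/resources.hpp>

using mesos::Resources;
using process::Future;
using process::Queue;

class QueueBackedResourceEstimator
{
public:
  // Called by the estimator's own internal logic (a timer, an actor,
  // etc.) whenever a new estimate is available: push semantics inside.
  void push(const Resources& estimate)
  {
    queue.put(estimate);
  }

  // Called by the slave whenever it wants the next estimate: pull
  // semantics outside. The slave controls how fast estimates are
  // consumed, so it can't be flooded.
  Future<Resources> estimate()
  {
    return queue.get();
  }

private:
  Queue<Resources> queue;
};
```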

Finally, one of the advantages of the pull model is that it's easier to reason 
about because we don't have "anonymous" lambdas that cause execution in some 
other random place in the code (i.e., you can easily see in the slave where 
the future returned from `ResourceEstimator::estimate()` gets handled). In 
addition, the ResourceEstimator remains "functional" in the sense that it just 
has to return some value (or a future) from its functions, versus invoking 
some callback that causes something to run somewhere else (and which, in fact, 
may also block, so isn't it safer for the ResourceEstimator to invoke the 
callback in its own `async`?).

The invocation of `ResourceEstimator::estimate()` followed by a `.then` is a 
nice pattern that lets us compose with other things as well (sketched below), 
which is harder to do with lambda-style callbacks and is why we've avoided 
them where we've been able to (in fact, I'm curious: which place in the code 
are you imitating here?).
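
For illustration only (this is not the actual slave code, and the names are 
hypothetical), the pull side could look roughly like this, with the future 
returned by `estimate()` visibly handled in place via `defer`:

```cpp
#include <process/defer.hpp>
#include <process/future.hpp>
#include <process/process.hpp>

#include <stout/lambda.hpp>
#include <stout/nothing.hpp>

#include <mesos/resources.hpp>

using mesos::Resources;
using process::Future;

// Hypothetical minimal estimator interface for this sketch.
class Estimator
{
public:
  virtual ~Estimator() {}
  virtual Future<Resources> estimate() = 0;
};

class SlaveLikeProcess : public process::Process<SlaveLikeProcess>
{
public:
  explicit SlaveLikeProcess(Estimator* _estimator)
    : estimator(_estimator) {}

  void poll()
  {
    // The future returned from estimate() is handled right here;
    // defer() routes the continuation onto this process's own queue,
    // so nothing runs in "some other random place".
    estimator->estimate()
      .then(process::defer(self(), &SlaveLikeProcess::handle, lambda::_1));
  }

private:
  Future<Nothing> handle(const Resources& estimate)
  {
    latest = estimate; // Placeholder for the slave's real handling.
    return Nothing();
  }

  Estimator* estimator;
  Resources latest;
};
```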

> Change the interaction between the slave and the resource estimator from 
> polling to pushing 
> --------------------------------------------------------------------------------------------
>
>                 Key: MESOS-2735
>                 URL: https://issues.apache.org/jira/browse/MESOS-2735
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Jie Yu
>            Assignee: Jie Yu
>              Labels: twitter
>
> This will make the semantics more clear. The resource estimator can control 
> the speed of sending resources estimation to the slave.
> To avoid cyclic dependency, slave will register a callback with the resource 
> estimator and the resource estimator will simply invoke that callback when 
> there's a new estimation ready. The callback will be a defer to the slave's 
> main event queue.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
