> On Nov. 27, 2017, 4:25 p.m., Ilya Pronin wrote:
> > src/master/master.cpp
> > Line 9370 (original), 9322 (patched)
> > <https://reviews.apache.org/r/61473/diff/20/?file=1902101#file1902101line9370>
> >
> >     I would suggest moving this closer to `newTaskState` selection logic 
> > for readability.
> 
> Megha Sharma wrote:
>     I feel it's more readable if we populate the value for `bool unreachable` 
> just before its usage, but I am open to moving it if you feel strongly about 
> it.

Not strongly, but when I was reading this part I was like "This looks weird, why 
do we only check for `TASK_GONE_BY_OPERATOR` here? What about other terminal 
states?" Then I scrolled up and found that `newTaskState` can only be 
`TASK_LOST`, `TASK_UNREACHABLE` or `TASK_GONE_BY_OPERATOR`.
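
To make the point concrete, here is a rough sketch, not the actual 
`master.cpp` code. The names `Transition` and `selectTransition`, the two 
boolean inputs, and the selection rule itself are hypothetical and only serve 
the illustration. The idea is that deriving `unreachable` right next to the 
`newTaskState` selection shows at a glance that only three states are possible.

```cpp
// Hypothetical stand-in for the three states the selection logic can yield.
enum class TaskState { TASK_LOST, TASK_UNREACHABLE, TASK_GONE_BY_OPERATOR };

// Hypothetical result type bundling the selected state with the flag.
struct Transition
{
  TaskState newTaskState;
  bool unreachable;
};

// Hypothetical helper; the inputs and the rule below are assumptions,
// not the master's real logic.
Transition selectTransition(bool partitionAware, bool goneByOperator)
{
  TaskState newTaskState = TaskState::TASK_LOST;

  if (goneByOperator) {
    newTaskState = TaskState::TASK_GONE_BY_OPERATOR;
  } else if (partitionAware) {
    newTaskState = TaskState::TASK_UNREACHABLE;
  }

  // Derived next to the selection: with only three candidate states in
  // view, checking against TASK_GONE_BY_OPERATOR alone is clearly enough.
  bool unreachable = newTaskState != TaskState::TASK_GONE_BY_OPERATOR;

  return {newTaskState, unreachable};
}
```

With the two pieces side by side, nobody has to scroll up to see why other 
terminal states are not checked.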


- Ilya


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61473/#review191674
-----------------------------------------------------------


On Nov. 27, 2017, 4:56 p.m., Megha Sharma wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61473/
> -----------------------------------------------------------
> 
> (Updated Nov. 27, 2017, 4:56 p.m.)
> 
> 
> Review request for mesos, James Peach, Vinod Kone, and Jiang Yan Xu.
> 
> 
> Bugs: MESOS-7215
>     https://issues.apache.org/jira/browse/MESOS-7215
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> The master no longer kills the tasks of non-partition-aware frameworks
> when an unreachable agent re-registers with the master.
> The master used to send a ShutdownFrameworkMessage to the agent
> to kill the tasks of non-partition-aware frameworks, including
> frameworks that are still registered. This was problematic because
> offers from this agent could still go to the same framework, which
> could then launch new tasks. The agent would then receive tasks
> of the same framework and ignore them because it thinks the
> framework is shutting down. The framework is not shutting down, of
> course, so from the master's and the scheduler's perspective the task
> is stuck in STAGING until the next agent reregistration,
> which could happen much later. This commit fixes the problem by
> not shutting down the non-partition-aware frameworks on such an
> agent.
> 
> 
> Diffs
> -----
> 
>   include/mesos/mesos.proto e194093e490741acc552fd3ad328fd710b4b4435 
>   include/mesos/v1/mesos.proto 6fb1139683952877667abbcf8bf84b5b31bcd29e 
>   src/master/http.cpp 10084125deb839a9846a4f64d2e433ff02754c02 
>   src/master/master.hpp a309fc78ee2613762f3d5d22ac7559afc7aac4a3 
>   src/master/master.cpp 2ddd67ada3731803b00883b6a1f32b20c1bb238f 
>   src/tests/partition_tests.cpp e49c474167076b4136a161ed29b11db9a13455a7 
> 
> 
> Diff: https://reviews.apache.org/r/61473/diff/22/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Megha Sharma
> 
>
