> On Oct. 30, 2014, 12:09 a.m., Vinod Kone wrote:
> > looking good. some minor comments.
> > 
> > are you going to follow up with a review for metrics based on source and 
> > reason? that would make this effort super useful.
> 
> Dominic Hamon wrote:
>     yes!
>     
>     i started to fold it into this one but it became too complicated. I need 
> to think carefully about whether to still include aggregates. Ie:
>     
>         master/tasks_lost
>         master/tasks_lost/source_master
>         master/tasks_lost/source_master/reason_task_invalid
>         master/tasks_lost/source_slave
>         master/tasks_lost/source_slave/reason_something
>         master/tasks_lost/source_slave/reason_something_else
>         
>     It could get quite unwieldy but would be backwards compatible and 
> complete.

i like the structure. let's follow up in a different review which adds this as 
dependent.


> On Oct. 30, 2014, 12:09 a.m., Vinod Kone wrote:
> > src/master/master.cpp, line 3646
> > <https://reviews.apache.org/r/26382/diff/9/?file=741857#file741857line3646>
> >
> >     Not sure if REASON_RECONCILIATON_REQUEST is more appropriate here than 
> > REASON_TASK_UNKNOWN.
> 
> Dominic Hamon wrote:
>     i can go either way. left it as explicit for now.

talked w/ benm. i think it's better to be consistent here. the reason (no pun 
intended) being we can easily segregate status updates generated due to 
reconciliation. this will be useful for proper alerting. re-opening the issue, 
here and below.


> On Oct. 30, 2014, 12:09 a.m., Vinod Kone wrote:
> > src/master/master.cpp, line 2725
> > <https://reviews.apache.org/r/26382/diff/9/?file=741857#file741857line2725>
> >
> >     why no reason?
> >     
> >     use REASON_TASK_PENDING
> 
> Dominic Hamon wrote:
>     I went back and forth on this. That's not the reason. The reason is that 
> killTask is called, which is adequately captured by the status TASK_KILLED. 
> The fact that it's pending isn't a reason and I don't think it adds anything 
> useful.
>     
>     The other places we set TASK_KILLED are: executor (source is set), 
> framework is removed (reason is set), from the slave (source is set), 
> executor unregistered (source slave, and reason set).
>     
>     This is the only place that we get TASK_KILLED with no reason, and it's 
> because we don't have one.
>     
>     We could add a reason to the KillTaskMessage but i was hesitant to make 
> further changes to the API.
>     
>     Thoughts?

ok. fair enough. i'll drop this issue and the one in slave.


- Vinod


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26382/#review59070
-----------------------------------------------------------


On Oct. 30, 2014, 3:47 a.m., Dominic Hamon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26382/
> -----------------------------------------------------------
> 
> (Updated Oct. 30, 2014, 3:47 a.m.)
> 
> 
> Review request for mesos, Vinod Kone and Bill Farner.
> 
> 
> Bugs: MESOS-1830 and MESOS-343
>     https://issues.apache.org/jira/browse/MESOS-1830
>     https://issues.apache.org/jira/browse/MESOS-343
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> Added source and reason, enabled TASK_ERROR, and made the changes necessary 
> throughout the codebase.
> 
> 
> Diffs
> -----
> 
>   include/mesos/mesos.proto 168a7a8c35ed1bf3f5bd6d7431b1e511bae7b789 
>   src/common/protobuf_utils.hpp 212d5124b9a4cc58e61719fa7f07a61cd166e834 
>   src/common/protobuf_utils.cpp a9b65e328c4c62bff7fbf5633dda25d742d79019 
>   src/examples/balloon_framework.cpp b05d5679fe2915142907af0b2dc00c6cd76eb9c1 
>   src/examples/java/TestFramework.java 
> bc593d0abfacb00690b1492b2b82c970f4e4de6d 
>   src/examples/low_level_scheduler_libprocess.cpp 
> 7ef5ea78ade4ed856b97009fdfe31281f0a55c17 
>   src/examples/low_level_scheduler_pthread.cpp 
> 6e233a10117a1c7aa669806b5b430e746e227ee5 
>   src/examples/no_executor_framework.cpp 
> f98a0735b9f287e7f1bf98af6c2e9a47ca6a77b2 
>   src/examples/test_framework.cpp 187a611ebfe35cb13ee48aa5eca934cf55f34dea 
>   src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd 
>   src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 
>   src/slave/slave.cpp 96fb5f7385b0762d46d8129f7e43207bd6311644 
>   src/tests/fault_tolerance_tests.cpp 
> a18a41a3e34ff112e04e693447d757403e5013bd 
>   src/tests/master_authorization_tests.cpp 
> 652e80d0d4567b225c6ffb326725ddfde06f7fd3 
>   src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 
>   src/tests/resource_offers_tests.cpp 
> fe66432b1bf75ee25feb73c4bb353e4d4e5b503f 
> 
> Diff: https://reviews.apache.org/r/26382/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Dominic Hamon
> 
>

Reply via email to