Re: Review Request 37531: MESOS-3070 (Master CHECK failure if a framework uses duplicated task id)

2015-09-25 Thread Klaus Ma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37531/
---

(Updated Sept. 26, 2015, 2:52 a.m.)


Review request for mesos, Jie Yu and Vinod Kone.


Changes
---

Merged with the latest code and re-checked for any potential issues. 
I'll add more UT cases for "kill duplicated tasks" and "show duplicated tasks in 
metrics".


Bugs: MESOS-3070
https://issues.apache.org/jira/browse/MESOS-3070


Repository: mesos


Description
---

__Phenomenon:__
The master crashes because of a duplicated task ID.

__Root Cause:__
Task IDs are stored on the slave (agent). If the master fails over, there is a 
time window in which a new slave can launch a task with the same task ID; if the 
slave running the old task then re-registers, the master crashes because of the 
duplicated task ID.

__Solution:__
Store task info in Master::Framework keyed by SlaveID to avoid the duplicate task ID issue.
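
As a rough illustration of the idea (not the actual patch), per-framework task bookkeeping keyed first by slave ID and then by task ID could look like the sketch below. The `SlaveID`, `TaskID`, `Task` and `Framework` types here are simplified, hypothetical stand-ins rather than the real Mesos classes, and `replayFailover` is only a helper that replays the scenario from the root cause.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-ins for the Mesos SlaveID / TaskID / Task types.
using SlaveID = std::string;
using TaskID = std::string;

struct Task {
  TaskID id;
  SlaveID slaveId;
};

// Simplified model of the per-framework bookkeeping in the master.
class Framework {
public:
  // Records a task under (slave ID, task ID). Returns false if that exact
  // pair is already tracked, instead of aborting the process.
  bool addTask(const Task& task) {
    return tasks[task.slaveId].emplace(task.id, task).second;
  }

  // Removes a task and drops the per-slave map once it becomes empty.
  bool removeTask(const SlaveID& slaveId, const TaskID& taskId) {
    auto it = tasks.find(slaveId);
    if (it == tasks.end() || it->second.erase(taskId) == 0) {
      return false;
    }
    if (it->second.empty()) {
      tasks.erase(it);
    }
    return true;
  }

  // Number of slaves that currently hold a task with this ID.
  std::size_t countTask(const TaskID& taskId) const {
    std::size_t n = 0;
    for (const auto& entry : tasks) {
      n += entry.second.count(taskId);
    }
    return n;
  }

private:
  std::map<SlaveID, std::map<TaskID, Task>> tasks;
};

// Replays the failover scenario: the same task ID shows up on two slaves.
// Returns how many slaves hold task "t1" at the end.
std::size_t replayFailover() {
  Framework framework;
  bool added = framework.addTask({"t1", "new-slave"}); // launched after failover
  assert(added);
  added = framework.addTask({"t1", "old-slave"}); // old slave re-registers
  assert(added);
  return framework.countTask("t1");
}
```

With a flat map keyed only by task ID, the second insertion would collide with the first; with the nested map both entries coexist, and a duplicate is only reported for the same task ID on the same slave.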


Diffs (updated)
-

  src/master/http.cpp cd37c91 
  src/master/master.hpp 4bb65f0 
  src/master/master.cpp 6bee4f3 
  src/tests/master_tests.cpp ee24739 

Diff: https://reviews.apache.org/r/37531/diff/


Testing
---

make
make check


Thanks,

Klaus Ma



Re: Review Request 37531: MESOS-3070 (Master CHECK failure if a framework uses duplicated task id)

2015-09-25 Thread Mesos ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37531/#review100737
---


Patch looks great!

Reviews applied: [37531]

All tests passed.

- Mesos ReviewBot





Re: Review Request 37531: MESOS-3070 (Master CHECK failure if a framework uses duplicated task id)

2015-09-04 Thread Klaus Ma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37531/
---

(Updated Sept. 5, 2015, 3:27 a.m.)


Review request for mesos and Vinod Kone.


Changes
---

Add summary & description


Summary (updated)
-

MESOS-3070 (Master CHECK failure if a framework uses duplicated task id)


Bugs: MESOS-3070
https://issues.apache.org/jira/browse/MESOS-3070


Repository: mesos


Description (updated)
---

__Phenomenon:__
The master crashes because of a duplicated task ID.

__Root Cause:__
Task IDs are stored on the slave (agent). If the master fails over, there is a 
time window in which a new slave can launch a task with the same task ID; if the 
slave running the old task then re-registers, the master crashes because of the 
duplicated task ID.

__Solution:__
Store task info in Master::Framework keyed by SlaveID to avoid the duplicate task ID issue.


Diffs
-

  src/master/http.cpp 37d76ee 
  src/master/master.hpp 36c6759 
  src/master/master.cpp 95207d2 
  src/tests/master_tests.cpp 8a6b98b 

Diff: https://reviews.apache.org/r/37531/diff/


Testing
---

make
make check


Thanks,

Klaus Ma



Re: Review Request 37531: MESOS-3070

2015-08-28 Thread Klaus Ma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37531/
---

(Updated Aug. 28, 2015, 9:48 a.m.)


Review request for mesos.


Changes
---

Add UT case


Bugs: MESOS-3070
https://issues.apache.org/jira/browse/MESOS-3070


Repository: mesos


Description
---

MESOS-3070 (Master CHECK failure if a framework uses duplicated task id.)


Diffs (updated)
-

  src/master/http.cpp 37d76ee 
  src/master/master.hpp 36c6759 
  src/master/master.cpp 95207d2 
  src/tests/master_tests.cpp 8a6b98b 

Diff: https://reviews.apache.org/r/37531/diff/


Testing (updated)
---

make
make check


Thanks,

Klaus Ma



Re: Review Request 37531: MESOS-3070

2015-08-20 Thread Klaus Ma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37531/
---

(Updated Aug. 21, 2015, 2:10 a.m.)


Review request for mesos.


Bugs: MESOS-3070
https://issues.apache.org/jira/browse/MESOS-3070


Repository: mesos


Description
---

MESOS-3070 (Master CHECK failure if a framework uses duplicated task id.)


Diffs (updated)
-

  include/mesos/mesos.proto 33e1b28 
  include/mesos/type_utils.hpp dafe1df 
  src/master/master.hpp 0432842 
  src/master/master.cpp 95207d2 
  src/master/validation.cpp ffb7bf0 
  src/messages/messages.proto 8977d8e 
  src/sched/sched.cpp 012af05 
  src/slave/slave.cpp 2a99abc 
  src/tests/master_tests.cpp 8a6b98b 

Diff: https://reviews.apache.org/r/37531/diff/


Testing
---

Draft code diff, will update UT cases later.


Thanks,

Klaus Ma



Re: Review Request 37531: MESOS-3070

2015-08-20 Thread Klaus Ma

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37531/#review95670
---



src/master/master.cpp (line 3218)
https://reviews.apache.org/r/37531/#comment150763

Yes. The TaskID is sent back to the framework via statusUpdate, so the framework 
can use the UUID to kill a task; that case is not covered by the taskTag logic.


- Klaus Ma


On Aug. 17, 2015, 1:28 p.m., Klaus Ma wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/37531/
 ---
 
 (Updated Aug. 17, 2015, 1:28 p.m.)
 
 
 Review request for mesos.
 
 
 Bugs: MESOS-3070
 https://issues.apache.org/jira/browse/MESOS-3070
 
 
 Repository: mesos
 
 
 Description
 ---
 
 MESOS-3070 (Master CHECK failure if a framework uses duplicated task id.)
 
 
 Diffs
 -
 
   include/mesos/mesos.proto 33e1b28f1ccbe227657a14395f81df20e0a9e193 
   include/mesos/type_utils.hpp dafe1df0cb5d0b83ca0579068916fe7fda848f02 
   src/master/master.hpp 0432842d77beba024c7895291ca410964bae96be 
   src/master/master.cpp 95207d24db0aa052eb70c4cc7eb75d0611c365cf 
   src/master/validation.cpp ffb7bf07b8a40d6e14f922eabcf46045462498b5 
   src/messages/messages.proto 8977d8e0f3b16003128b6b9cab556a7b224f083c 
 
 Diff: https://reviews.apache.org/r/37531/diff/
 
 
 Testing
 ---
 
 Draft code diff, will update UT cases later.
 
 
 Thanks,
 
 Klaus Ma
 




Re: Review Request 37531: MESOS-3070

2015-08-17 Thread Guangya Liu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37531/#review95625
---



src/master/master.cpp (line 3218)
https://reviews.apache.org/r/37531/#comment150737

Is this logic still needed? It looks like the taskTag logic should already 
cover this.



src/messages/messages.proto (line 65)
https://reviews.apache.org/r/37531/#comment150736

Please add a comment here explaining what taskTag is.


- Guangya Liu

