Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

2014-08-07 Thread Vinod Kone


> On Aug. 7, 2014, 3:39 p.m., Adam B wrote:
> > Have you done any simulations or performance tests on a test cluster? I'd 
> > love to see some numbers that prove that this is more fair in some scenario 
> > than the previous algorithm. Not sure I can blindly trust a unit test.

Not yet. I'll have to think about how to test "fairness". Intuitively though, 
this is doing more fine-grained resource allocation. Previously we were giving 
roughly the entire cluster's resources to one framework with the smallest share 
and waiting for it decline them in order to give them to another framework. 
With this change, a framework will be allocated enough slaves' resources to 
make its DRF share on par/better than other frameworks. This way multiple 
frameworks would be given "some" resources during every allocation, while still 
honoring DRF. It would also help ameliorate concerns around frameworks holding 
on to offers and not playing well with DRF.


- Vinod


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/#review49901
---


On Aug. 7, 2014, 11:02 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24356/
> ---
> 
> (Updated Aug. 7, 2014, 11:02 p.m.)
> 
> 
> Review request for mesos, Adam B and Benjamin Hindman.
> 
> 
> Bugs: MESOS-1119
> https://issues.apache.org/jira/browse/MESOS-1119
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> The allocator now updates the DRF algorithm after every allocation of a 
> slave's resources. This will make the allocation more granular and fair.
> 
> 
> Diffs
> -
> 
>   src/master/hierarchical_allocator_process.hpp 
> d81082f7fa2a4ffc13f18c232abb8ad2814bebc1 
>   src/tests/allocator_tests.cpp f0226cb3fa4b54f917c0a0f59b986e9115352507 
> 
> Diff: https://reviews.apache.org/r/24356/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 
> GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

2014-08-07 Thread Vinod Kone


> On Aug. 7, 2014, 3:23 p.m., Benjamin Hindman wrote:
> > src/tests/allocator_tests.cpp, lines 405-407
> > 
> >
> > This makes me feel like you're assuming there is going to be a time 
> > delay between when the resources are recovered and the next allocation, 
> > otherwise, especially since you set the refuse seconds to 0, these 
> > resources should be reallocated right away. I think you want to have the 
> > other framework already registered before the first framework declines so 
> > that you're guaranteed that there isn't some weird timing event that makes 
> > this test be flaky. Or if I'm missing something please leave a big comment 
> > explaining why you don't need to do it that way, and any assumptions you're 
> > making.

Added a comment.


- Vinod


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/#review49898
---


On Aug. 7, 2014, 11:02 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24356/
> ---
> 
> (Updated Aug. 7, 2014, 11:02 p.m.)
> 
> 
> Review request for mesos, Adam B and Benjamin Hindman.
> 
> 
> Bugs: MESOS-1119
> https://issues.apache.org/jira/browse/MESOS-1119
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> The allocator now updates the DRF algorithm after every allocation of a 
> slave's resources. This will make the allocation more granular and fair.
> 
> 
> Diffs
> -
> 
>   src/master/hierarchical_allocator_process.hpp 
> d81082f7fa2a4ffc13f18c232abb8ad2814bebc1 
>   src/tests/allocator_tests.cpp f0226cb3fa4b54f917c0a0f59b986e9115352507 
> 
> Diff: https://reviews.apache.org/r/24356/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 
> GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

2014-08-07 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/
---

(Updated Aug. 7, 2014, 11:02 p.m.)


Review request for mesos, Adam B and Benjamin Hindman.


Changes
---

benh's comments.


Bugs: MESOS-1119
https://issues.apache.org/jira/browse/MESOS-1119


Repository: mesos-git


Description
---

The allocator now updates the DRF algorithm after every allocation of a slave's 
resources. This will make the allocation more granular and fair.


Diffs (updated)
-

  src/master/hierarchical_allocator_process.hpp 
d81082f7fa2a4ffc13f18c232abb8ad2814bebc1 
  src/tests/allocator_tests.cpp f0226cb3fa4b54f917c0a0f59b986e9115352507 

Diff: https://reviews.apache.org/r/24356/diff/


Testing
---

make check

make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 
GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1


Thanks,

Vinod Kone



Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

2014-08-07 Thread Adam B

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/#review49901
---


Have you done any simulations or performance tests on a test cluster? I'd love 
to see some numbers that prove that this is more fair in some scenario than the 
previous algorithm. Not sure I can blindly trust a unit test.

- Adam B


On Aug. 6, 2014, 3:56 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24356/
> ---
> 
> (Updated Aug. 6, 2014, 3:56 p.m.)
> 
> 
> Review request for mesos, Adam B and Benjamin Hindman.
> 
> 
> Bugs: MESOS-1119
> https://issues.apache.org/jira/browse/MESOS-1119
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> The allocator now updates the DRF algorithm after every allocation of a 
> slave's resources. This will make the allocation more granular and fair.
> 
> 
> Diffs
> -
> 
>   src/master/hierarchical_allocator_process.hpp 
> c7e689e882e88ef5adb31e72909ef85d8a8c0067 
>   src/tests/allocator_tests.cpp f0226cb3fa4b54f917c0a0f59b986e9115352507 
> 
> Diff: https://reviews.apache.org/r/24356/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 
> GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

2014-08-07 Thread Benjamin Hindman

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/#review49898
---

Ship it!


See my comment in the test below, otherwise this looks good to go.


src/tests/allocator_tests.cpp


Can we still not do Some("role1,role2")? I'm guessing you need 
Option because the compiler can't go from const char* to Option 
right?



src/tests/allocator_tests.cpp


This makes me feel like you're assuming there is going to be a time delay 
between when the resources are recovered and the next allocation, otherwise, 
especially since you set the refuse seconds to 0, these resources should be 
reallocated right away. I think you want to have the other framework already 
registered before the first framework declines so that you're guaranteed that 
there isn't some weird timing event that makes this test be flaky. Or if I'm 
missing something please leave a big comment explaining why you don't need to 
do it that way, and any assumptions you're making.


- Benjamin Hindman


On Aug. 6, 2014, 10:56 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24356/
> ---
> 
> (Updated Aug. 6, 2014, 10:56 p.m.)
> 
> 
> Review request for mesos, Adam B and Benjamin Hindman.
> 
> 
> Bugs: MESOS-1119
> https://issues.apache.org/jira/browse/MESOS-1119
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> The allocator now updates the DRF algorithm after every allocation of a 
> slave's resources. This will make the allocation more granular and fair.
> 
> 
> Diffs
> -
> 
>   src/master/hierarchical_allocator_process.hpp 
> c7e689e882e88ef5adb31e72909ef85d8a8c0067 
>   src/tests/allocator_tests.cpp f0226cb3fa4b54f917c0a0f59b986e9115352507 
> 
> Diff: https://reviews.apache.org/r/24356/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 
> GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

2014-08-06 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/
---

(Updated Aug. 6, 2014, 10:56 p.m.)


Review request for mesos, Adam B and Benjamin Hindman.


Changes
---

rebased. s/resourcesUnused/resourcesRecovered/


Bugs: MESOS-1119
https://issues.apache.org/jira/browse/MESOS-1119


Repository: mesos-git


Description
---

The allocator now updates the DRF algorithm after every allocation of a slave's 
resources. This will make the allocation more granular and fair.


Diffs (updated)
-

  src/master/hierarchical_allocator_process.hpp 
c7e689e882e88ef5adb31e72909ef85d8a8c0067 
  src/tests/allocator_tests.cpp f0226cb3fa4b54f917c0a0f59b986e9115352507 

Diff: https://reviews.apache.org/r/24356/diff/


Testing
---

make check

make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 
GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1


Thanks,

Vinod Kone



Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

2014-08-06 Thread Vinod Kone

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/
---

(Updated Aug. 6, 2014, 10:24 p.m.)


Review request for mesos, Adam B and Benjamin Hindman.


Changes
---

comments. PTAL.


Bugs: MESOS-1119
https://issues.apache.org/jira/browse/MESOS-1119


Repository: mesos-git


Description
---

The allocator now updates the DRF algorithm after every allocation of a slave's 
resources. This will make the allocation more granular and fair.


Diffs (updated)
-

  src/master/hierarchical_allocator_process.hpp 
c7e689e882e88ef5adb31e72909ef85d8a8c0067 
  src/tests/allocator_tests.cpp f0226cb3fa4b54f917c0a0f59b986e9115352507 

Diff: https://reviews.apache.org/r/24356/diff/


Testing
---

make check

make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 
GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1


Thanks,

Vinod Kone



Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

2014-08-06 Thread Vinod Kone


> On Aug. 6, 2014, 6:37 a.m., Adam B wrote:
> > src/master/hierarchical_allocator_process.hpp, line 729
> > 
> >
> > In what order are you iterating through the slaves? What order should 
> > it be?
> 
> Vinod Kone wrote:
> good point. right now it's the order in which the keys are stored in 
> hashet which is undefined. maybe one option is to shuffle the slaves during 
> every allocation for some sort of fairness. or we could sort them by 
> available resources on a slave. not sure which one gels well with the DRF 
> algorithm. i'm open to suggestions here.

briefly chatted w/ benh. decided to go with random shuffling for now with a 
TODO for a smarter algorithm.


- Vinod


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/#review49700
---


On Aug. 6, 2014, 12:49 a.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24356/
> ---
> 
> (Updated Aug. 6, 2014, 12:49 a.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Bugs: MESOS-1119
> https://issues.apache.org/jira/browse/MESOS-1119
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> The allocator now updates the DRF algorithm after every allocation of a 
> slave's resources. This will make the allocation more granular and fair.
> 
> 
> Diffs
> -
> 
>   src/master/hierarchical_allocator_process.hpp 
> 35d1579bbd665dfdd238932902ae16880dadeb66 
>   src/tests/allocator_tests.cpp b920533bbc36a154c02b467673bc7a0d50d65bc7 
> 
> Diff: https://reviews.apache.org/r/24356/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 
> GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

2014-08-06 Thread Vinod Kone


> On Aug. 6, 2014, 6:37 a.m., Adam B wrote:
> > src/master/hierarchical_allocator_process.hpp, line 729
> > 
> >
> > In what order are you iterating through the slaves? What order should 
> > it be?

good point. right now it's the order in which the keys are stored in hashet 
which is undefined. maybe one option is to shuffle the slaves during every 
allocation for some sort of fairness. or we could sort them by available 
resources on a slave. not sure which one gels well with the DRF algorithm. i'm 
open to suggestions here.


- Vinod


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/#review49700
---


On Aug. 6, 2014, 12:49 a.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24356/
> ---
> 
> (Updated Aug. 6, 2014, 12:49 a.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Bugs: MESOS-1119
> https://issues.apache.org/jira/browse/MESOS-1119
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> The allocator now updates the DRF algorithm after every allocation of a 
> slave's resources. This will make the allocation more granular and fair.
> 
> 
> Diffs
> -
> 
>   src/master/hierarchical_allocator_process.hpp 
> 35d1579bbd665dfdd238932902ae16880dadeb66 
>   src/tests/allocator_tests.cpp b920533bbc36a154c02b467673bc7a0d50d65bc7 
> 
> Diff: https://reviews.apache.org/r/24356/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 
> GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>



Re: Review Request 24356: Fixed allocator to do allocations per slave rather than per framework.

2014-08-05 Thread Adam B

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24356/#review49700
---



src/master/hierarchical_allocator_process.hpp


In what order are you iterating through the slaves? What order should it be?



src/master/hierarchical_allocator_process.hpp


Why even have an 'allocatedResources' variable if it's always the same as 
'unreserved'?


- Adam B


On Aug. 5, 2014, 5:49 p.m., Vinod Kone wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24356/
> ---
> 
> (Updated Aug. 5, 2014, 5:49 p.m.)
> 
> 
> Review request for mesos and Benjamin Hindman.
> 
> 
> Bugs: MESOS-1119
> https://issues.apache.org/jira/browse/MESOS-1119
> 
> 
> Repository: mesos-git
> 
> 
> Description
> ---
> 
> The allocator now updates the DRF algorithm after every allocation of a 
> slave's resources. This will make the allocation more granular and fair.
> 
> 
> Diffs
> -
> 
>   src/master/hierarchical_allocator_process.hpp 
> 35d1579bbd665dfdd238932902ae16880dadeb66 
>   src/tests/allocator_tests.cpp b920533bbc36a154c02b467673bc7a0d50d65bc7 
> 
> Diff: https://reviews.apache.org/r/24356/diff/
> 
> 
> Testing
> ---
> 
> make check
> 
> make check GTEST_FILTER="*PerSlaveAllocation*" GTEST_REPEAT=1000 
> GTEST_BREAK_ON_FAILURE=1 GTEST_VERBOSE=1 MESOS_VERBOSE=1 GLOG_v=1
> 
> 
> Thanks,
> 
> Vinod Kone
> 
>