Re: Review Request 68132: Batch '/state' requests on Master.

Alexander Rukletsov Mon, 06 Aug 2018 03:31:34 -0700

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/68132/
-----------------------------------------------------------


(Updated Aug. 6, 2018, 10:30 a.m.)


Review request for mesos, Benno Evers and Benjamin Mahler.


Bugs: MESOS-9122
    https://issues.apache.org/jira/browse/MESOS-9122


Repository: mesos


Description
-------

With this patch, handlers for '/state' requests are no longer
scheduled directly after authorization; instead, they are accumulated
and then scheduled together for later parallel processing.

If there are N '/state' requests in the Master's mailbox and T is the
response time of a single request, this approach blocks the Master
actor only once, for time O(T), instead of for time N*T as was the
case prior to this patch.
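
The accumulate-then-process idea can be sketched in standalone C++ as
follows. This is only an illustration under stated assumptions, not the
code in the patch: `State`, `StateBatcher`, `enqueue` and `processBatch`
are hypothetical names, and `std::async` stands in for whatever parallel
mechanism the Master actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Master's state; not a Mesos type.
struct State {
  int tasks = 0;
};

// Hypothetical batcher: collects read-only handlers and later runs the
// whole batch in parallel against one unchanging view of the state.
class StateBatcher {
public:
  using Handler = std::function<std::string(const State&)>;

  // Called after authorization: queue the handler instead of running it.
  std::future<std::string> enqueue(Handler handler) {
    assert(static_cast<bool>(handler));
    pending_.emplace_back(std::move(handler), std::promise<std::string>());
    return pending_.back().second.get_future();
  }

  // Called once by the owner of `state`. The owner is blocked until all
  // handlers in the batch finish, so nothing mutates `state` meanwhile.
  std::size_t processBatch(const State& state) {
    std::vector<std::pair<Handler, std::promise<std::string>>> batch;
    batch.swap(pending_);

    std::vector<std::future<void>> workers;
    for (auto& item : batch) {
      workers.push_back(std::async(std::launch::async, [&item, &state] {
        item.second.set_value(item.first(state));
      }));
    }

    // Block once for the whole batch rather than once per request.
    for (auto& worker : workers) {
      worker.get();
    }
    return batch.size();
  }

private:
  std::vector<std::pair<Handler, std::promise<std::string>>> pending_;
};
```

A caller would `enqueue` one handler per authorized request and later
call `processBatch` once; every future returned by `enqueue` is ready
by the time `processBatch` returns.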

This batching technique reduces both the time the Master spends
answering '/state' requests and the average request response time
when multiple requests are present in the Master's mailbox. However,
for infrequent '/state' requests that don't accumulate in the Master's
mailbox, the response time might increase due to an added trip
through the mailbox.
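
A rough cost model makes the trade-off concrete. It assumes that,
without batching, the N queued requests are served strictly one after
another; that a batch is processed with ideal parallelism; and that the
extra mailbox trip costs some delay δ (a symbol introduced here for
illustration, not taken from the patch):

```latex
\bar{t}_{\mathrm{seq}} = \frac{1}{N}\sum_{k=1}^{N} kT = \frac{(N+1)\,T}{2},
\qquad
\bar{t}_{\mathrm{batch}} \approx T + \delta
```

Under these assumptions a lone request (N = 1) takes T + δ instead of
T, while for N >= 2 batching gives the lower average whenever
δ < (N - 1) T / 2.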

The change preserves the read-your-writes consistency model.


Diffs (updated)
-----

  src/master/http.cpp d43fbd689598612ec5946b46e2fa5e7f5e22cfa8 
  src/master/master.hpp 209b998db8d2bad7a3812df44f0939458f48eb11 


Diff: https://reviews.apache.org/r/68132/diff/2/

Changes: https://reviews.apache.org/r/68132/diff/1-2/


Testing
-------

`make check` on Mac OS 10.13.5 and various Linux distros.

Ran the `MasterStateQueryLoad_BENCHMARK_Test.v0State` benchmark.

**Setup**
Processor: Intel i7-4980HQ 2.8 GHz with 6 MB on-chip L3 cache and 128 MB L4 cache (Crystal Well)
Total Number of Cores: 4
Total Number of Threads: 8
L2 Cache (per Core): 256 KB

Improvement in average '/flags' response time with batching, without optimization: 62%–70%.
Improvement in average '/flags' response time with batching, with `-O3` optimization: 17%–62%.

**[No batching, no 
optimization](https://dobianchi.files.wordpress.com/2011/11/no-barrique-no-berlusconi.jpg?w=638)**
```
Test setup: 100 agents with a total of 10000 running tasks and 10000 completed 
tasks; 10 '/state' and '/flags' requests will be sent with 200ms interval
Launching 10 '/state' requests in background
Launching 10 '/flags' requests
'/flags' response on average took 1.102349605secs, 10 responses are in 
[2.662342ms, 2.143755433secs]
'/state' response on average took 1.549122019secs, 10 responses are in 
[494.278454ms, 2.633971927secs]

Test setup: 1000 agents with a total of 100000 running tasks and 100000 
completed tasks; 10 '/state' and '/flags' requests will be sent with 200ms 
interval
Launching 10 '/state' requests in background
Launching 10 '/flags' requests
'/flags' response on average took 18.436968137secs, 10 responses are in 
[2.578238ms, 33.210561732secs]
'/state' response on average took 23.916379537secs, 10 responses are in 
[5.170660597secs, 43.008091744secs]
```

**With batching but no optimization**
```
Test setup: 100 agents with a total of 10000 running tasks and 10000 completed 
tasks; 10 '/state' and '/flags' requests will be sent with 200ms interval
Launching 10 '/state' requests in background
Launching 10 '/flags' requests
'/flags' response on average took 417.211022ms, 10 responses are in 
[4.066901ms, 728.045442ms]
'/state' response on average took 830.351291ms, 10 responses are in 
[459.033455ms, 1.208880892secs]

Test setup: 1000 agents with a total of 100000 running tasks and 100000 
completed tasks; 10 '/state' and '/flags' requests will be sent with 200ms 
interval
Launching 10 '/state' requests in background
Launching 10 '/flags' requests
'/flags' response on average took 5.439950928secs, 10 responses are in 
[3.246906ms, 9.343994388secs]
'/state' response on average took 16.764607823secs, 10 responses are in 
[4.980333091secs, 18.461983916secs]
```

**No batching but `-O3` optimization**
```
Test setup: 100 agents with a total of 10000 running tasks and 10000 completed 
tasks; 10 '/state' and '/flags' requests will be sent with 200ms interval
Launching 10 '/state' requests in background
Launching 10 '/flags' requests
'/flags' response on average took 2.396221ms, 10 responses are in [1.628583ms, 
2.816639ms]
'/state' response on average took 113.469574ms, 10 responses are in 
[104.218099ms, 134.477062ms]

Test setup: 1000 agents with a total of 100000 running tasks and 100000 
completed tasks; 10 '/state' and '/flags' requests will be sent with 200ms 
interval
Launching 10 '/state' requests in background
Launching 10 '/flags' requests
'/flags' response on average took 3.892615876secs, 10 responses are in 
[2.480517ms, 7.630934838secs]
'/state' response on average took 5.205245306secs, 10 responses are in 
[1.578161651secs, 8.789315237secs]
```

**Batching and `-O3` optimization**
```
Test setup: 100 agents with a total of 10000 running tasks and 10000 completed 
tasks; 10 '/state' and '/flags' requests will be sent with 200ms interval
Launching 10 '/state' requests in background
Launching 10 '/flags' requests
'/flags' response on average took 1.973573ms, 10 responses are in [1.221193ms, 
2.694713ms]
'/state' response on average took 113.331551ms, 10 responses are in 
[102.593397ms, 142.028555ms]

Test setup: 1000 agents with a total of 100000 running tasks and 100000 
completed tasks; 10 '/state' and '/flags' requests will be sent with 200ms 
interval
Launching 10 '/state' requests in background
Launching 10 '/flags' requests
'/flags' response on average took 1.475842691secs, 10 responses are in 
[2.437217ms, 3.815589561secs]
'/state' response on average took 4.742303751secs, 10 responses are in 
[4.047655443secs, 6.00752698secs]
```
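
The improvement ranges quoted in the summary appear to match the
average '/flags' response times in the logs above. A minimal sketch for
reproducing them is given below; `improvementPercent` and
`printFlagsImprovements` are hypothetical helpers written for this
note, and the numbers are copied from the logs, converted to seconds.

```cpp
#include <cassert>
#include <cstdio>

// Relative improvement, in percent, of `after` over `before`.
// Both values must use the same unit.
double improvementPercent(double before, double after) {
  assert(before > 0.0);
  return (before - after) / before * 100.0;
}

// One pair of average '/flags' response times, in seconds.
struct Sample {
  const char* label;
  double withoutBatching;
  double withBatching;
};

// Prints the improvement for each benchmark configuration.
void printFlagsImprovements() {
  const Sample samples[] = {
    {"no optimization, 100 agents", 1.102349605, 0.417211022},
    {"no optimization, 1000 agents", 18.436968137, 5.439950928},
    {"-O3, 100 agents", 0.002396221, 0.001973573},
    {"-O3, 1000 agents", 3.892615876, 1.475842691},
  };

  for (const Sample& s : samples) {
    std::printf("%s: %.1f%%\n", s.label,
                improvementPercent(s.withoutBatching, s.withBatching));
  }
}
```

Calling `printFlagsImprovements()` yields roughly 62% and 70% for the
unoptimized builds and roughly 17.6% and 62% for the `-O3` builds.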


Thanks,

Alexander Rukletsov
