Re: Review Request 71750: Improved performance of v1 operator API GetAgents call.

2019-11-22 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71750/
---

(Updated Nov. 22, 2019, 6:27 p.m.)


Review request for mesos, Andrei Sekretenko, Greg Mann, and Meng Zhu.


Changes
---

* Used default instead of all cases, per Andrei's suggestion, since it fits 
within our style guide recommendation: 
https://issues.apache.org/jira/browse/MESOS-3754


Bugs: MESOS-10026
https://issues.apache.org/jira/browse/MESOS-10026


Repository: mesos


Description
---

This updates the handling to serialize directly to protobuf or json
from the in-memory v0 state, bypassing expensive intermediate
serialization / de-serialization / object construction / object
destruction.

This initial patch shows the approach that will be used for the
other expensive calls. Note that this type of manual writing is
more brittle and complex, but it can be mostly eliminated if we
keep an up-to-date v1 GetState in memory in the future.

When this approach is applied fully to GetState, it leads to the
following improvement:

Before:
v0 '/state' response took 6.55 secs
v1 'GetState' application/x-protobuf response took 24.08 secs
v1 'GetState' application/json response took 22.76 secs

After:
v0 '/state' response took 8.00 secs
v1 'GetState' application/x-protobuf response took 5.73 secs
v1 'GetState' application/json response took 9.62 secs


Diffs (updated)
-

  src/master/http.cpp 1778664dddf19f9ab6d6c09ec35d64674ae488df 
  src/master/master.hpp 8a140650a016c8afbfb39729eba2b5e78ea81c5f 


Diff: https://reviews.apache.org/r/71750/diff/3/

Changes: https://reviews.apache.org/r/71750/diff/2-3/


Testing
---

make check


Thanks,

Benjamin Mahler



Re: Review Request 71750: Improved performance of v1 operator API GetAgents call.

2019-11-22 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71750/
---

(Updated Nov. 22, 2019, 6:09 p.m.)


Review request for mesos, Andrei Sekretenko, Greg Mann, and Meng Zhu.


Changes
---

* Used NotAcceptable in place of BadRequest
* Added an UNREACHABLE()


Bugs: MESOS-10026
https://issues.apache.org/jira/browse/MESOS-10026


Repository: mesos


Description
---

This updates the handling to serialize directly to protobuf or json
from the in-memory v0 state, bypassing expensive intermediate
serialization / de-serialization / object construction / object
destruction.

This initial patch shows the approach that will be used for the
other expensive calls. Note that this type of manual writing is
more brittle and complex, but it can be mostly eliminated if we
keep an up-to-date v1 GetState in memory in the future.

When this approach is applied fully to GetState, it leads to the
following improvement:

Before:
v0 '/state' response took 6.55 secs
v1 'GetState' application/x-protobuf response took 24.08 secs
v1 'GetState' application/json response took 22.76 secs

After:
v0 '/state' response took 8.00 secs
v1 'GetState' application/x-protobuf response took 5.73 secs
v1 'GetState' application/json response took 9.62 secs


Diffs (updated)
-

  src/master/http.cpp 1778664dddf19f9ab6d6c09ec35d64674ae488df 
  src/master/master.hpp 8a140650a016c8afbfb39729eba2b5e78ea81c5f 


Diff: https://reviews.apache.org/r/71750/diff/2/

Changes: https://reviews.apache.org/r/71750/diff/1-2/


Testing
---

make check


Thanks,

Benjamin Mahler



Re: Review Request 71750: Improved performance of v1 operator API GetAgents call.

2019-11-22 Thread Benjamin Mahler


> On Nov. 21, 2019, 4:45 p.m., Andrei Sekretenko wrote:
> > src/master/http.cpp
> > Lines 2235-2237 (patched)
> > 
> >
> > One more thought after looking at r71751-r71754: it should be possible 
> > to precalculate the size of the would-be `output` and serialize directly 
> > into the "top-level" `CodedOutputStream` (in this patch: the one in 
> > getAgents), without the intermediate serialization into string and copying 
> > it into the "top-level" stream, right?
> > 
> > Given that the plan is to refactor the common parts later, this should 
> > probably be left as one more TODO item...

> One more thought after looking at r71751-r71754: it should be possible to 
> precalculate the size of the would-be output and serialize directly into the 
> "top-level" CodedOutputStream (in this patch: the one in getAgents), without 
> the intermediate serialization into string and copying it into the 
> "top-level" stream, right?

Yep, good observation! I originally wanted to implement it that way (no 
intermediate strings) but it's quite hard since we construct temporary objects 
in many cases (e.g. mesos::master::Response::GetAgents::Agent or pending task 
object). Even without temporary object construction, the logic needed to 
compute the sub-message size is quite complex since it has to have the same 
loops and so on as the actual serialization code. If there was an easy / 
reasonably simple way to do it, would be nice! I was curious how much of a 
performance impact it would have, I guess we'll find out somewhat when we keep 
an up-to-date v1 state object in memory.

If the protobuf encoding allowed fixed size ints for sub-message size (rather 
than mandating varint), then you could directly serialize without knowing size, 
and go back to write the size after.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71750/#review218732
---


On Nov. 12, 2019, 12:03 a.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71750/
> ---
> 
> (Updated Nov. 12, 2019, 12:03 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, and Meng Zhu.
> 
> 
> Bugs: MESOS-10026
> https://issues.apache.org/jira/browse/MESOS-10026
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This updates the handling to serialize directly to protobuf or json
> from the in-memory v0 state, bypassing expensive intermediate
> serialization / de-serialization / object construction / object
> destruction.
> 
> This initial patch shows the approach that will be used for the
> other expensive calls. Note that this type of manual writing is
> more brittle and complex, but it can be mostly eliminated if we
> keep an up-to-date v1 GetState in memory in the future.
> 
> When this approach is applied fully to GetState, it leads to the
> following improvement:
> 
> Before:
> v0 '/state' response took 6.55 secs
> v1 'GetState' application/x-protobuf response took 24.08 secs
> v1 'GetState' application/json response took 22.76 secs
> 
> After:
> v0 '/state' response took 8.00 secs
> v1 'GetState' application/x-protobuf response took 5.73 secs
> v1 'GetState' application/json response took 9.62 secs
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 60765c9b9d6903f6ed94fa8c614055698caad0da 
>   src/master/master.hpp dc45028d2ecfb61bf9ea82d90d2393af648a6023 
> 
> 
> Diff: https://reviews.apache.org/r/71750/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 71750: Improved performance of v1 operator API GetAgents call.

2019-11-21 Thread Andrei Sekretenko


> On Nov. 19, 2019, 6:12 p.m., Andrei Sekretenko wrote:
> > src/master/http.cpp
> > Lines 2242 (patched)
> > 
> >
> > Did you consider allocating the `agent` temporaries outside of the 
> > loops? In my experience, protobuf construction used to be much more 
> > expensive than modification.
> > 
> > (Tried this, observed a ~10% speedup for `v1 'master::call::GetState' 
> > application/x-protobuf` in my setup.)
> 
> Benjamin Mahler wrote:
> Hm.. that's interesting, could certainly look into this for the GetAgents 
> and GetFrameworks code which both use temporaries. The temporaries aren't 
> necessary though, we could eliminate the need to copy entirely, but 
> allocating them outside the loops seems like an easier win.
> 
> Unfortunately the most expensive part in practice (non-pending tasks) are 
> not using temporaries like this. I'll look into a patch at the end of this 
> chain to allocate outside the loops.
> 
> Benjamin Mahler wrote:
> > (Tried this, observed a ~10% speedup for v1 'master::call::GetState' 
> application/x-protobuf in my setup.)
> 
> Curious to see your setup (did you just use a lot of agents instead of a 
> lot of tasks?) and your diff.
> 
> Benjamin Mahler wrote:
> Also did you run with optimizations on? How many runs did you do? Because 
> there's quite a high variance. Just did 10 runs of 
> AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetState/3:
> 
> before: median 5.29 from [4.41, 4.51, 5.07, 5.10, 5.27, 5.32, 5.49, 5.51, 
> 5.70, 6.46]
> after:  median 5.11 from [4.68, 4.79, 4.80, 4.84, 5.01, 5.21, 5.25, 5.42, 
> 6.42, 7.43]
> 
> So based on medians it's 96.6% of the time without the loop change?
> 
> With the following diff applied to this chain to produce 'after':
> 
> ```
> diff --git a/src/master/http.cpp b/src/master/http.cpp
> index 0ca25b6d7..c77788cdf 100644
> --- a/src/master/http.cpp
> +++ b/src/master/http.cpp
> @@ -1480,6 +1480,7 @@ string Master::Http::serializeGetFrameworks(
>// expect a large number of pending tasks, we currently don't
>// bother with the more efficient approach.
>  
> +  mesos::master::Response::GetFrameworks::Framework f;
>foreachvalue (const Framework* framework,
>  master->frameworks.registered) {
>  // Skip unauthorized frameworks.
> @@ -1487,7 +1488,7 @@ string Master::Http::serializeGetFrameworks(
>continue;
>  }
>  
> -mesos::master::Response::GetFrameworks::Framework f = 
> model(*framework);
> +f = model(*framework);
>  writer.WriteTag(
>  WireFormatLite::MakeTag(
>  mesos::master::Response::GetFrameworks
> @@ -1504,7 +1505,7 @@ string Master::Http::serializeGetFrameworks(
>continue;
>  }
>  
> -mesos::master::Response::GetFrameworks::Framework f = 
> model(*framework);
> +f = model(*framework);
>  writer.WriteTag(
>  WireFormatLite::MakeTag(
>  mesos::master::Response::GetFrameworks
> @@ -2970,15 +2971,15 @@ string Master::Http::serializeGetAgents(
>google::protobuf::io::StringOutputStream stream();
>google::protobuf::io::CodedOutputStream writer();
>  
> +  mesos::master::Response::GetAgents::Agent agent;
>foreachvalue (const Slave* slave, master->slaves.registered) {
>  // TODO(bmahler): Consider not constructing the temporary
>  // agent object and instead serialize directly.
> -mesos::master::Response::GetAgents::Agent agent =
> -  protobuf::master::event::createAgentResponse(
> -  *slave,
> -  master->slaves.draining.get(slave->id),
> -  master->slaves.deactivated.contains(slave->id),
> -  approvers);
> +agent = protobuf::master::event::createAgentResponse(
> +*slave,
> +master->slaves.draining.get(slave->id),
> +master->slaves.deactivated.contains(slave->id),
> +approvers);
>  
>  writer.WriteTag(
>  WireFormatLite::MakeTag(
> @@ -2988,14 +2989,15 @@ string Master::Http::serializeGetAgents(
>  agent.SerializeToCodedStream();
>}
>  
> +  SlaveInfo recoveredAgent;
>foreachvalue (const SlaveInfo& slaveInfo, master->slaves.recovered) {
>  // TODO(bmahler): Consider not constructing the temporary
>  // SlaveInfo object and instead serialize directly.
> -SlaveInfo agent = slaveInfo;
> -agent.clear_resources();
> +recoveredAgent = slaveInfo;
> +recoveredAgent.clear_resources();
>  foreach (const Resource& resource, slaveInfo.resources()) {
>if (approvers->approved(resource)) {
> -*agent.add_resources() = resource;
> 

Re: Review Request 71750: Improved performance of v1 operator API GetAgents call.

2019-11-21 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71750/#review218736
---




src/master/http.cpp
Lines 2142 (patched)


(see compiler warning) will trigger UB if, due to some bug in a caller of 
getAgents, control reaches here.


- Andrei Sekretenko


On Nov. 12, 2019, 12:03 a.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71750/
> ---
> 
> (Updated Nov. 12, 2019, 12:03 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, and Meng Zhu.
> 
> 
> Bugs: MESOS-10026
> https://issues.apache.org/jira/browse/MESOS-10026
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This updates the handling to serialize directly to protobuf or json
> from the in-memory v0 state, bypassing expensive intermediate
> serialization / de-serialization / object construction / object
> destruction.
> 
> This initial patch shows the approach that will be used for the
> other expensive calls. Note that this type of manual writing is
> more brittle and complex, but it can be mostly eliminated if we
> keep an up-to-date v1 GetState in memory in the future.
> 
> When this approach is applied fully to GetState, it leads to the
> following improvement:
> 
> Before:
> v0 '/state' response took 6.55 secs
> v1 'GetState' application/x-protobuf response took 24.08 secs
> v1 'GetState' application/json response took 22.76 secs
> 
> After:
> v0 '/state' response took 8.00 secs
> v1 'GetState' application/x-protobuf response took 5.73 secs
> v1 'GetState' application/json response took 9.62 secs
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 60765c9b9d6903f6ed94fa8c614055698caad0da 
>   src/master/master.hpp dc45028d2ecfb61bf9ea82d90d2393af648a6023 
> 
> 
> Diff: https://reviews.apache.org/r/71750/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 71750: Improved performance of v1 operator API GetAgents call.

2019-11-21 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71750/#review218732
---




src/master/http.cpp
Lines 2235-2237 (patched)


One more thought after looking at r71751-r71754: it should be possible to 
precalculate the size of the would-be `output` and serialize directly into the 
"top-level" `CodedOutputStream` (in this patch: the one in getAgents), without 
the intermediate serialization into string and copying it into the "top-level" 
stream, right?

Given that the plan is to refactor the common parts later, this should 
probably be left as one more TODO item...


- Andrei Sekretenko


On Nov. 12, 2019, 12:03 a.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71750/
> ---
> 
> (Updated Nov. 12, 2019, 12:03 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, and Meng Zhu.
> 
> 
> Bugs: MESOS-10026
> https://issues.apache.org/jira/browse/MESOS-10026
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This updates the handling to serialize directly to protobuf or json
> from the in-memory v0 state, bypassing expensive intermediate
> serialization / de-serialization / object construction / object
> destruction.
> 
> This initial patch shows the approach that will be used for the
> other expensive calls. Note that this type of manual writing is
> more brittle and complex, but it can be mostly eliminated if we
> keep an up-to-date v1 GetState in memory in the future.
> 
> When this approach is applied fully to GetState, it leads to the
> following improvement:
> 
> Before:
> v0 '/state' response took 6.55 secs
> v1 'GetState' application/x-protobuf response took 24.08 secs
> v1 'GetState' application/json response took 22.76 secs
> 
> After:
> v0 '/state' response took 8.00 secs
> v1 'GetState' application/x-protobuf response took 5.73 secs
> v1 'GetState' application/json response took 9.62 secs
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 60765c9b9d6903f6ed94fa8c614055698caad0da 
>   src/master/master.hpp dc45028d2ecfb61bf9ea82d90d2393af648a6023 
> 
> 
> Diff: https://reviews.apache.org/r/71750/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 71750: Improved performance of v1 operator API GetAgents call.

2019-11-20 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71750/#review218725
---


Fix it, then Ship it!





src/master/http.cpp
Lines 2141 (patched)


`NotAccepatble()` instead of `BadRequest()`? (Applies to the follow-up 
patches too - sorry, didn't notice earlier.)


- Andrei Sekretenko


On Nov. 12, 2019, 12:03 a.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71750/
> ---
> 
> (Updated Nov. 12, 2019, 12:03 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, and Meng Zhu.
> 
> 
> Bugs: MESOS-10026
> https://issues.apache.org/jira/browse/MESOS-10026
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This updates the handling to serialize directly to protobuf or json
> from the in-memory v0 state, bypassing expensive intermediate
> serialization / de-serialization / object construction / object
> destruction.
> 
> This initial patch shows the approach that will be used for the
> other expensive calls. Note that this type of manual writing is
> more brittle and complex, but it can be mostly eliminated if we
> keep an up-to-date v1 GetState in memory in the future.
> 
> When this approach is applied fully to GetState, it leads to the
> following improvement:
> 
> Before:
> v0 '/state' response took 6.55 secs
> v1 'GetState' application/x-protobuf response took 24.08 secs
> v1 'GetState' application/json response took 22.76 secs
> 
> After:
> v0 '/state' response took 8.00 secs
> v1 'GetState' application/x-protobuf response took 5.73 secs
> v1 'GetState' application/json response took 9.62 secs
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 60765c9b9d6903f6ed94fa8c614055698caad0da 
>   src/master/master.hpp dc45028d2ecfb61bf9ea82d90d2393af648a6023 
> 
> 
> Diff: https://reviews.apache.org/r/71750/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 71750: Improved performance of v1 operator API GetAgents call.

2019-11-19 Thread Benjamin Mahler


> On Nov. 19, 2019, 6:12 p.m., Andrei Sekretenko wrote:
> > src/master/http.cpp
> > Lines 2242 (patched)
> > 
> >
> > Did you consider allocating the `agent` temporaries outside of the 
> > loops? In my experience, protobuf construction used to be much more 
> > expensive than modification.
> > 
> > (Tried this, observed a ~10% speedup for `v1 'master::call::GetState' 
> > application/x-protobuf` in my setup.)
> 
> Benjamin Mahler wrote:
> Hm.. that's interesting, could certainly look into this for the GetAgents 
> and GetFrameworks code which both use temporaries. The temporaries aren't 
> necessary though, we could eliminate the need to copy entirely, but 
> allocating them outside the loops seems like an easier win.
> 
> Unfortunately the most expensive part in practice (non-pending tasks) are 
> not using temporaries like this. I'll look into a patch at the end of this 
> chain to allocate outside the loops.
> 
> Benjamin Mahler wrote:
> > (Tried this, observed a ~10% speedup for v1 'master::call::GetState' 
> application/x-protobuf in my setup.)
> 
> Curious to see your setup (did you just use a lot of agents instead of a 
> lot of tasks?) and your diff.

Also did you run with optimizations on? How many runs did you do? Because 
there's quite a high variance. Just did 10 runs of 
AgentFrameworkTaskCountContentType/MasterStateQuery_BENCHMARK_Test.GetState/3:

before: median 5.29 from [4.41, 4.51, 5.07, 5.10, 5.27, 5.32, 5.49, 5.51, 5.70, 
6.46]
after:  median 5.11 from [4.68, 4.79, 4.80, 4.84, 5.01, 5.21, 5.25, 5.42, 6.42, 
7.43]

So based on medians it's 96.6% of the time without the loop change?

With the following diff applied to this chain to produce 'after':

```
diff --git a/src/master/http.cpp b/src/master/http.cpp
index 0ca25b6d7..c77788cdf 100644
--- a/src/master/http.cpp
+++ b/src/master/http.cpp
@@ -1480,6 +1480,7 @@ string Master::Http::serializeGetFrameworks(
   // expect a large number of pending tasks, we currently don't
   // bother with the more efficient approach.
 
+  mesos::master::Response::GetFrameworks::Framework f;
   foreachvalue (const Framework* framework,
 master->frameworks.registered) {
 // Skip unauthorized frameworks.
@@ -1487,7 +1488,7 @@ string Master::Http::serializeGetFrameworks(
   continue;
 }
 
-mesos::master::Response::GetFrameworks::Framework f = model(*framework);
+f = model(*framework);
 writer.WriteTag(
 WireFormatLite::MakeTag(
 mesos::master::Response::GetFrameworks
@@ -1504,7 +1505,7 @@ string Master::Http::serializeGetFrameworks(
   continue;
 }
 
-mesos::master::Response::GetFrameworks::Framework f = model(*framework);
+f = model(*framework);
 writer.WriteTag(
 WireFormatLite::MakeTag(
 mesos::master::Response::GetFrameworks
@@ -2970,15 +2971,15 @@ string Master::Http::serializeGetAgents(
   google::protobuf::io::StringOutputStream stream();
   google::protobuf::io::CodedOutputStream writer();
 
+  mesos::master::Response::GetAgents::Agent agent;
   foreachvalue (const Slave* slave, master->slaves.registered) {
 // TODO(bmahler): Consider not constructing the temporary
 // agent object and instead serialize directly.
-mesos::master::Response::GetAgents::Agent agent =
-  protobuf::master::event::createAgentResponse(
-  *slave,
-  master->slaves.draining.get(slave->id),
-  master->slaves.deactivated.contains(slave->id),
-  approvers);
+agent = protobuf::master::event::createAgentResponse(
+*slave,
+master->slaves.draining.get(slave->id),
+master->slaves.deactivated.contains(slave->id),
+approvers);
 
 writer.WriteTag(
 WireFormatLite::MakeTag(
@@ -2988,14 +2989,15 @@ string Master::Http::serializeGetAgents(
 agent.SerializeToCodedStream();
   }
 
+  SlaveInfo recoveredAgent;
   foreachvalue (const SlaveInfo& slaveInfo, master->slaves.recovered) {
 // TODO(bmahler): Consider not constructing the temporary
 // SlaveInfo object and instead serialize directly.
-SlaveInfo agent = slaveInfo;
-agent.clear_resources();
+recoveredAgent = slaveInfo;
+recoveredAgent.clear_resources();
 foreach (const Resource& resource, slaveInfo.resources()) {
   if (approvers->approved(resource)) {
-*agent.add_resources() = resource;
+*recoveredAgent.add_resources() = resource;
   }
 }
 
@@ -3003,8 +3005,8 @@ string Master::Http::serializeGetAgents(
 WireFormatLite::MakeTag(
 mesos::master::Response::GetAgents::kRecoveredAgentsFieldNumber,
 WireFormatLite::WIRETYPE_LENGTH_DELIMITED));
-writer.WriteVarint32(agent.ByteSizeLong());
-agent.SerializeToCodedStream();
+writer.WriteVarint32(recoveredAgent.ByteSizeLong());
+recoveredAgent.SerializeToCodedStream();
   }
 
   // 

Re: Review Request 71750: Improved performance of v1 operator API GetAgents call.

2019-11-19 Thread Benjamin Mahler


> On Nov. 19, 2019, 6:12 p.m., Andrei Sekretenko wrote:
> > src/master/http.cpp
> > Lines 2242 (patched)
> > 
> >
> > Did you consider allocating the `agent` temporaries outside of the 
> > loops? In my experience, protobuf construction used to be much more 
> > expensive than modification.
> > 
> > (Tried this, observed a ~10% speedup for `v1 'master::call::GetState' 
> > application/x-protobuf` in my setup.)
> 
> Benjamin Mahler wrote:
> Hm.. that's interesting, could certainly look into this for the GetAgents 
> and GetFrameworks code which both use temporaries. The temporaries aren't 
> necessary though, we could eliminate the need to copy entirely, but 
> allocating them outside the loops seems like an easier win.
> 
> Unfortunately the most expensive part in practice (non-pending tasks) are 
> not using temporaries like this. I'll look into a patch at the end of this 
> chain to allocate outside the loops.

> (Tried this, observed a ~10% speedup for v1 'master::call::GetState' 
> application/x-protobuf in my setup.)

Curious to see your setup (did you just use a lot of agents instead of a lot of 
tasks?) and your diff.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71750/#review218678
---


On Nov. 12, 2019, 12:03 a.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71750/
> ---
> 
> (Updated Nov. 12, 2019, 12:03 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, and Meng Zhu.
> 
> 
> Bugs: MESOS-10026
> https://issues.apache.org/jira/browse/MESOS-10026
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This updates the handling to serialize directly to protobuf or json
> from the in-memory v0 state, bypassing expensive intermediate
> serialization / de-serialization / object construction / object
> destruction.
> 
> This initial patch shows the approach that will be used for the
> other expensive calls. Note that this type of manual writing is
> more brittle and complex, but it can be mostly eliminated if we
> keep an up-to-date v1 GetState in memory in the future.
> 
> When this approach is applied fully to GetState, it leads to the
> following improvement:
> 
> Before:
> v0 '/state' response took 6.55 secs
> v1 'GetState' application/x-protobuf response took 24.08 secs
> v1 'GetState' application/json response took 22.76 secs
> 
> After:
> v0 '/state' response took 8.00 secs
> v1 'GetState' application/x-protobuf response took 5.73 secs
> v1 'GetState' application/json response took 9.62 secs
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 60765c9b9d6903f6ed94fa8c614055698caad0da 
>   src/master/master.hpp dc45028d2ecfb61bf9ea82d90d2393af648a6023 
> 
> 
> Diff: https://reviews.apache.org/r/71750/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 71750: Improved performance of v1 operator API GetAgents call.

2019-11-19 Thread Benjamin Mahler


> On Nov. 19, 2019, 6:12 p.m., Andrei Sekretenko wrote:
> > src/master/http.cpp
> > Lines 2242 (patched)
> > 
> >
> > Did you consider allocating the `agent` temporaries outside of the 
> > loops? In my experience, protobuf construction used to be much more 
> > expensive than modification.
> > 
> > (Tried this, observed a ~10% speedup for `v1 'master::call::GetState' 
> > application/x-protobuf` in my setup.)

Hm.. that's interesting, could certainly look into this for the GetAgents and 
GetFrameworks code which both use temporaries. The temporaries aren't necessary 
though, we could eliminate the need to copy entirely, but allocating them 
outside the loops seems like an easier win.

Unfortunately the most expensive part in practice (non-pending tasks) are not 
using temporaries like this. I'll look into a patch at the end of this chain to 
allocate outside the loops.


> On Nov. 19, 2019, 6:12 p.m., Andrei Sekretenko wrote:
> > src/master/http.cpp
> > Lines 2257-2263 (patched)
> > 
> >
> > There is a certain similarity between serializeGetAgents and 
> > jsonifyGetAgents which probably can be factored out... 
> > 
> > However, it might make more sense to leave this as is for now, and 
> > refactor it after we get to getTasks/Executors/Frameworks.

Yeah, the same is true for the other serialization paths which I just 
published. I debated refactoring in front or behind, and I think I'll go with 
behind, so that these patches just duplicate the common parts for now.


- Benjamin


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71750/#review218678
---


On Nov. 12, 2019, 12:03 a.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71750/
> ---
> 
> (Updated Nov. 12, 2019, 12:03 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, and Meng Zhu.
> 
> 
> Bugs: MESOS-10026
> https://issues.apache.org/jira/browse/MESOS-10026
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This updates the handling to serialize directly to protobuf or json
> from the in-memory v0 state, bypassing expensive intermediate
> serialization / de-serialization / object construction / object
> destruction.
> 
> This initial patch shows the approach that will be used for the
> other expensive calls. Note that this type of manual writing is
> more brittle and complex, but it can be mostly eliminated if we
> keep an up-to-date v1 GetState in memory in the future.
> 
> When this approach is applied fully to GetState, it leads to the
> following improvement:
> 
> Before:
> v0 '/state' response took 6.55 secs
> v1 'GetState' application/x-protobuf response took 24.08 secs
> v1 'GetState' application/json response took 22.76 secs
> 
> After:
> v0 '/state' response took 8.00 secs
> v1 'GetState' application/x-protobuf response took 5.73 secs
> v1 'GetState' application/json response took 9.62 secs
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 60765c9b9d6903f6ed94fa8c614055698caad0da 
>   src/master/master.hpp dc45028d2ecfb61bf9ea82d90d2393af648a6023 
> 
> 
> Diff: https://reviews.apache.org/r/71750/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 71750: Improved performance of v1 operator API GetAgents call.

2019-11-19 Thread Andrei Sekretenko

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71750/#review218678
---




src/master/http.cpp
Lines 2242 (patched)


Did you consider allocating the `agent` temporaries outside of the loops? 
In my experience, protobuf construction used to be much more expensive than 
modification.

(Tried this, observed a ~10% speedup for `v1 'master::call::GetState' 
application/x-protobuf` in my setup.)



src/master/http.cpp
Lines 2257-2263 (patched)


There is a certain similarity between serializeGetAgents and 
jsonifyGetAgents which probably can be factored out... 

However, it might make more sense to leave this as is for now, and refactor 
it after we get to getTasks/Executors/Frameworks.


- Andrei Sekretenko


On Nov. 12, 2019, 12:03 a.m., Benjamin Mahler wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/71750/
> ---
> 
> (Updated Nov. 12, 2019, 12:03 a.m.)
> 
> 
> Review request for mesos, Andrei Sekretenko, Greg Mann, and Meng Zhu.
> 
> 
> Bugs: MESOS-10026
> https://issues.apache.org/jira/browse/MESOS-10026
> 
> 
> Repository: mesos
> 
> 
> Description
> ---
> 
> This updates the handling to serialize directly to protobuf or json
> from the in-memory v0 state, bypassing expensive intermediate
> serialization / de-serialization / object construction / object
> destruction.
> 
> This initial patch shows the approach that will be used for the
> other expensive calls. Note that this type of manual writing is
> more brittle and complex, but it can be mostly eliminated if we
> keep an up-to-date v1 GetState in memory in the future.
> 
> When this approach is applied fully to GetState, it leads to the
> following improvement:
> 
> Before:
> v0 '/state' response took 6.55 secs
> v1 'GetState' application/x-protobuf response took 24.08 secs
> v1 'GetState' application/json response took 22.76 secs
> 
> After:
> v0 '/state' response took 8.00 secs
> v1 'GetState' application/x-protobuf response took 5.73 secs
> v1 'GetState' application/json response took 9.62 secs
> 
> 
> Diffs
> -
> 
>   src/master/http.cpp 60765c9b9d6903f6ed94fa8c614055698caad0da 
>   src/master/master.hpp dc45028d2ecfb61bf9ea82d90d2393af648a6023 
> 
> 
> Diff: https://reviews.apache.org/r/71750/diff/1/
> 
> 
> Testing
> ---
> 
> make check
> 
> 
> Thanks,
> 
> Benjamin Mahler
> 
>



Re: Review Request 71750: Improved performance of v1 operator API GetAgents call.

2019-11-11 Thread Mesos Reviewbot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71750/#review218598
---



Bad patch!

Reviews applied: [71747, 71748, 71749, 71750]

Failed command: ['bash', '-c', "set -o pipefail; export OS='ubuntu:14.04' 
BUILDTOOL='autotools' COMPILER='gcc' CONFIGURATION='--verbose 
--disable-libtool-wrappers --disable-parallel-test-execution' 
ENVIRONMENT='GLOG_v=1 MESOS_VERBOSE=1'; ./support/docker-build.sh 2>&1 | tee 
build_71750"]

Error:
..
-DPACKAGE_STRING=\"mesos 1.10.0\"" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" 
-DPACKAGE=\"mesos\" -DVERSION=\"1.10.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 
-DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 
-DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 
-DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_CXX11=1 
-DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -DHAVE_OPENSSL_SSL_H=1 
-DHAVE_FTS_H=1 -DHAVE_APR_POOLS_H=1 -DHAVE_LIBAPR_1=1 -DHAVE_LIBCURL=1 
-DMESOS_HAS_JAVA=1 -DHAVE_LIBSASL2=1 -DHAVE_SVN_VERSION_H=1 
-DHAVE_LIBSVN_SUBR_1=1 -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 
-DHAVE_ZLIB_H=1 -DHAVE_LIBZ=1 -DHAVE_PYTHON=\"2.7\" -DMESOS_HAS_PYTHON=1 -I. 
-I../../src -Werror -DLIBDIR=\"/mesos/mesos-1.10.0/_inst/lib\" 
-DPKGLIBEXECDIR=\"/mesos/mesos-1.10.0/_inst/libexec/mesos\" 
-DPKGDATADIR=\"/mesos/mesos-1.10.0/_inst/share/mesos\" 
-DPKGMODULEDIR=\"/mesos/mesos-1.10.0/_inst/lib/mesos/modules\" -I../../include 
-I../include -I../include/mesos -
 D__STDC_FORMAT_MACROS -I../3rdparty/boost-1.65.0 
-I../3rdparty/concurrentqueue-7b69a8f -I../3rdparty/elfio-3.2 
-I../3rdparty/glog-0.4.0/src -I../3rdparty/grpc-1.10.0/include 
-I../3rdparty/leveldb-1.19/include -I../3rdparty/libarchive-3.3.2/libarchive/ 
-I../../3rdparty/libprocess/include -I../3rdparty/nvml-352.79 
-I../3rdparty/picojson-1.3.0 -I../3rdparty/protobuf-3.5.0/src 
-I../3rdparty/rapidjson-1.1.0/include -I../../3rdparty/stout/include 
-I../3rdparty/zookeeper-3.4.8/src/c/include 
-I../3rdparty/zookeeper-3.4.8/src/c/generated -I/usr/include/subversion-1 
-I/usr/include/apr-1 -I/usr/include/apr-1.0 -pthread -Wall -Wsign-compare 
-Wformat-security -fstack-protector -fPIC -g1 -O0 -Wno-unused-local-typedefs 
-std=c++11 -c ../../src/logging/flags.cpp  -fPIC -DPIC -o 
logging/.libs/libmesos_no_3rdparty_la-flags.o
/bin/bash ../libtool  --tag=CXX   --mode=compile g++ -DPACKAGE_NAME=\"mesos\" 
-DPACKAGE_TARNAME=\"mesos\" -DPACKAGE_VERSION=\"1.10.0\" 
-DPACKAGE_STRING=\"mesos\ 1.10.0\" -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" 
-DPACKAGE=\"mesos\" -DVERSION=\"1.10.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 
-DHAVE_SYS_STAT_H=1 -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 
-DHAVE_STRINGS_H=1 -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 
-DHAVE_DLFCN_H=1 -DLT_OBJDIR=\".libs/\" -DHAVE_CXX11=1 
-DHAVE_PTHREAD_PRIO_INHERIT=1 -DHAVE_PTHREAD=1 -DHAVE_OPENSSL_SSL_H=1 
-DHAVE_FTS_H=1 -DHAVE_APR_POOLS_H=1 -DHAVE_LIBAPR_1=1 -DHAVE_LIBCURL=1 
-DMESOS_HAS_JAVA=1 -DHAVE_LIBSASL2=1 -DHAVE_SVN_VERSION_H=1 
-DHAVE_LIBSVN_SUBR_1=1 -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 
-DHAVE_ZLIB_H=1 -DHAVE_LIBZ=1 -DHAVE_PYTHON=\"2.7\" -DMESOS_HAS_PYTHON=1 -I. 
-I../../src   -Werror -DLIBDIR=\"/mesos/mesos-1.10.0/_inst/lib\" 
-DPKGLIBEXECDIR=\"/mesos/mesos-1.10.0/_inst/libexec/mesos\" 
-DPKGDATADIR=\"/mesos/mesos-1.
 10.0/_inst/share/mesos\" 
-DPKGMODULEDIR=\"/mesos/mesos-1.10.0/_inst/lib/mesos/modules\" -I../../include 
-I../include -I../include/mesos -D__STDC_FORMAT_MACROS 
-I../3rdparty/boost-1.65.0 -I../3rdparty/concurrentqueue-7b69a8f 
-I../3rdparty/elfio-3.2 -I../3rdparty/glog-0.4.0/src 
-I../3rdparty/grpc-1.10.0/include -I../3rdparty/leveldb-1.19/include 
-I../3rdparty/libarchive-3.3.2/libarchive/ -I../../3rdparty/libprocess/include  
-I../3rdparty/nvml-352.79 -I../3rdparty/picojson-1.3.0 
-I../3rdparty/protobuf-3.5.0/src -I../3rdparty/rapidjson-1.1.0/include 
-I../../3rdparty/stout/include -I../3rdparty/zookeeper-3.4.8/src/c/include 
-I../3rdparty/zookeeper-3.4.8/src/c/generated -I/usr/include/subversion-1 
-I/usr/include/apr-1 -I/usr/include/apr-1.0  -pthread -Wall 
-Wsign-compare -Wformat-security -fstack-protector -fPIC -fPIE -g1 -O0 
-Wno-unused-local-typedefs -std=c++11 -c -o 
logging/libmesos_no_3rdparty_la-logging.lo `test -f 'logging/logging.cpp' || 
echo '../../src/'`logging/logging.cpp
libtool: compile:  g++ -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" 
-DPACKAGE_VERSION=\"1.10.0\" "-DPACKAGE_STRING=\"mesos 1.10.0\"" 
-DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" 
-DVERSION=\"1.10.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 
-DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 
-DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 
-DLT_OBJDIR=\".libs/\" -DHAVE_CXX11=1 -DHAVE_PTHREAD_PRIO_INHERIT=1 
-DHAVE_PTHREAD=1 

Review Request 71750: Improved performance of v1 operator API GetAgents call.

2019-11-11 Thread Benjamin Mahler

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71750/
---

Review request for mesos, Andrei Sekretenko, Greg Mann, and Meng Zhu.


Bugs: MESOS-10026
https://issues.apache.org/jira/browse/MESOS-10026


Repository: mesos


Description
---

This updates the handling to serialize directly to protobuf or json
from the in-memory v0 state, bypassing expensive intermediate
serialization / de-serialization / object construction / object
destruction.

This initial patch shows the approach that will be used for the
other expensive calls. Note that this type of manual writing is
more brittle and complex, but it can be mostly eliminated if we
keep an up-to-date v1 GetState in memory in the future.

When this approach is applied fully to GetState, it leads to the
following improvement:

Before:
v0 '/state' response took 6.55 secs
v1 'GetState' application/x-protobuf response took 24.08 secs
v1 'GetState' application/json response took 22.76 secs

After:
v0 '/state' response took 8.00 secs
v1 'GetState' application/x-protobuf response took 5.73 secs
v1 'GetState' application/json response took 9.62 secs


Diffs
-

  src/master/http.cpp 60765c9b9d6903f6ed94fa8c614055698caad0da 
  src/master/master.hpp dc45028d2ecfb61bf9ea82d90d2393af648a6023 


Diff: https://reviews.apache.org/r/71750/diff/1/


Testing
---

make check


Thanks,

Benjamin Mahler