[jira] [Commented] (MESOS-10026) Improve v1 operator API read performance.

Benjamin Mahler (Jira) Fri, 22 Nov 2019 13:43:08 -0800


    [ 
https://issues.apache.org/jira/browse/MESOS-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980530#comment-16980530
 ]


Benjamin Mahler commented on MESOS-10026:
-----------------------------------------

All of the GET_* reads are now done:

{noformat}
commit b275a032f217794f20dae15d86f188e55d43ce59
Author: Benjamin Mahler <bmah...@apache.org>
Date:   Fri Nov 8 16:13:40 2019 -0800

    Support jsonifying v0 protobuf to v1 protobuf.

    This allows us to jsonify a v0 protobuf directly to a v1 protobuf
    efficiently, with no need to `evolve()` the message (which is rather
    expensive).

    The way this works is by converting all "slave" and "SLAVE" strings
    in fields and enum values, respectively, to "agent" and "AGENT".

    Our current v0 to v1 conversion for the v1 operator API simply
    serializes the v0 message and de-serializes into a v1 message, which
    means all field tags and message structures are the same, except
    for field names. The only difference with field names is the use
    of "agent" in place of "slave".

    Review: https://reviews.apache.org/r/71748
{noformat}
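
As a rough illustration of the renaming described in the commit above, here is a minimal Python sketch. It is not the Mesos implementation (which is C++ and built on jsonify); the function name `as_v1`, the `enum_fields` parameter and the sample message are assumptions made for this example. It rewrites "slave" to "agent" in field names, and "SLAVE" to "AGENT" only in values of fields known to be enums, leaving ordinary strings alone.

```python
def as_v1(value, enum_fields=frozenset(), _field=None):
    """Return a copy of a decoded v0 JSON value using v1 naming.

    enum_fields is a hypothetical set of v0 field names whose values are
    enum names; only those string values are rewritten.
    """
    if isinstance(value, dict):
        # Field names: "slave" becomes "agent".
        return {
            key.replace("slave", "agent"): as_v1(item, enum_fields, key)
            for key, item in value.items()
        }
    if isinstance(value, list):
        # Repeated fields keep the name of the field that holds them.
        return [as_v1(item, enum_fields, _field) for item in value]
    if isinstance(value, str) and _field in enum_fields:
        # Enum values: "SLAVE" becomes "AGENT".
        return value.replace("SLAVE", "AGENT")
    return value


# Illustrative v0-style message; the shape is made up for this sketch.
v0_message = {
    "slave_id": {"value": "S1"},
    "hostname": "SLAVE-host",  # ordinary string, must not be rewritten
    "tasks": [{"slave_id": {"value": "S1"}, "reason": "REASON_SLAVE_REMOVED"}],
}
v1_message = as_v1(v0_message, enum_fields={"reason"})
```

Because only names change and the structure is untouched, the result has the same shape as the input, which is the property the commit relies on.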

{noformat}
commit 8bfbcab09be82d3a443697925fbf3c4f31333060
Author: Benjamin Mahler <bmah...@apache.org>
Date:   Fri Nov 8 16:49:36 2019 -0800

    Added a test for AsV1Protobuf.

    Review: https://reviews.apache.org/r/71749
{noformat}

{noformat}
commit 715035b24cb90ba17f9d92217f6556a2f66979e8
Author: Benjamin Mahler <bmah...@apache.org>
Date:   Fri Nov 8 16:52:37 2019 -0800

    Improved performance of v1 operator API GetAgents call.

    This updates the handling to serialize directly to protobuf or json
    from the in-memory v0 state, bypassing expensive intermediate
    serialization / de-serialization / object construction / object
    destruction.

    This initial patch shows the approach that will be used for the
    other expensive calls. Note that this type of manual writing is
    more brittle and complex, but it can be mostly eliminated if we
    keep an up-to-date v1 GetState in memory in the future.

    When this approach is applied fully to GetState, it leads to the
    following improvement:

    Before:
    v0 '/state' response took 6.55 secs
    v1 'GetState' application/x-protobuf response took 24.08 secs
    v1 'GetState' application/json response took 22.76 secs

    After:
    v0 '/state' response took 8.00 secs
    v1 'GetState' application/x-protobuf response took 5.73 secs
    v1 'GetState' application/json response took 9.62 secs

    Review: https://reviews.apache.org/r/71750
{noformat}
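
The difference between the two code paths can be sketched in Python as follows. This is a simplified illustration, not the Mesos code: the `Agent` class, both function names and the flat response shape are assumptions made for the example. The first function copies state into a temporary response, round-trips it and then serializes; the second traverses the in-memory state once and writes the v1 JSON directly.

```python
import io
import json
from dataclasses import dataclass


@dataclass
class Agent:
    """Hypothetical stand-in for the in-memory v0 agent state."""
    slave_id: str
    hostname: str
    active: bool


def get_agents_via_copy(agents):
    """Old shape: build temporaries, round-trip them, then serialize."""
    v0_response = {"slaves": [vars(a).copy() for a in agents]}  # temporary copy
    round_tripped = json.loads(json.dumps(v0_response))  # stand-in for evolve()
    v1_response = {
        "agents": [
            {"agent_id": s["slave_id"], "hostname": s["hostname"], "active": s["active"]}
            for s in round_tripped["slaves"]
        ]
    }
    return json.dumps(v1_response, separators=(",", ":"))


def get_agents_direct(agents):
    """New shape: one traversal that writes v1 JSON straight to the output."""
    out = io.StringIO()
    out.write('{"agents":[')
    for i, agent in enumerate(agents):
        if i:
            out.write(",")
        out.write('{"agent_id":' + json.dumps(agent.slave_id))
        out.write(',"hostname":' + json.dumps(agent.hostname))
        out.write(',"active":' + json.dumps(agent.active) + "}")
    out.write("]}")
    return out.getvalue()


state = [Agent("S1", "host-1", True), Agent("S2", "host-2", False)]
```

Both functions return the same bytes for the same state. The hand-written writer has to be kept in sync with the message definition, which is the brittleness the commit message mentions.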

{noformat}
commit 4f4dab961bd45ca444d13b831cdb2541dd10ced8
Author: Benjamin Mahler <bmah...@apache.org>
Date:   Fri Nov 8 16:56:16 2019 -0800

    Improved performance of v1 operator API GetFrameworks call.

    This follows the same approach used in the GetAgents call;
    serializing directly to protobuf or json from the in-memory
    v0 state.

    Review: https://reviews.apache.org/r/71751
{noformat}

{noformat}
commit 6ab835459a452e53fec8982a5aaab7e78094bbcb
Author: Benjamin Mahler <bmah...@apache.org>
Date:   Fri Nov 8 16:57:28 2019 -0800

    Improved performance of v1 operator API GetExecutors call.

    This follows the same approach used in the GetAgents call;
    serializing directly to protobuf or json from the in-memory
    v0 state.

    Review: https://reviews.apache.org/r/71752
{noformat}

{noformat}
commit d7dd4d0e8493331d7b7a21b504ebeab702ff06d5
Author: Benjamin Mahler <bmah...@apache.org>
Date:   Fri Nov 8 16:58:47 2019 -0800

    Improved performance of v1 operator API GetTasks call.

    This follows the same approach used in the GetAgents call;
    serializing directly to protobuf or json from the in-memory
    v0 state.

    Review: https://reviews.apache.org/r/71753
{noformat}

{noformat}
commit 1c60f0e4acbac96c34bd90e265150cdd3844f915
Author: Benjamin Mahler <bmah...@apache.org>
Date:   Fri Nov 8 16:59:44 2019 -0800

    Improved performance of v1 operator API GetState call.

    This follows the same approach used in the GetAgents call;
    serializing directly to protobuf or json from the in-memory
    v0 state.

    Before:
    v0 '/state' response took 6.55 secs
    v1 'GetState' application/x-protobuf response took 24.08 secs
    v1 'GetState' application/json response took 22.76 secs

    After:
    v0 '/state' response took 8.00 secs
    v1 'GetState' application/x-protobuf response took 5.73 secs
    v1 'GetState' application/json response took 9.62 secs

    Review: https://reviews.apache.org/r/71754
{noformat}
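
To put the before and after timings in the commit above into ratios, here is a small Python calculation using only the numbers quoted there. A ratio above 1 means the call got faster. The v0 '/state' path was not the target of this change, so its ratio below 1 may simply reflect run-to-run variance, which the issue description notes is considerable.

```python
# Response times in seconds, copied from the commit message above.
before = {"v0 /state": 6.55, "v1 protobuf": 24.08, "v1 json": 22.76}
after = {"v0 /state": 8.00, "v1 protobuf": 5.73, "v1 json": 9.62}

# Speedup = time before / time after.
speedup = {name: before[name] / after[name] for name in before}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x")
```

On these numbers the v1 protobuf response is roughly 4.2 times faster and the v1 json response roughly 2.4 times faster than before the change.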

{noformat}
commit 469f2ebaf65b1642d1eb4a1df81abfc2c94889dd
Author: Benjamin Mahler <bmah...@apache.org>
Date:   Fri Nov 8 17:00:37 2019 -0800

    Improved performance of v1 operator API GetMetrics call.

    This follows the same approach used in the GetAgents call;
    serializing directly to protobuf or json from the in-memory
    v0 state.

    Review: https://reviews.apache.org/r/71755
{noformat}

SUBSCRIBE will be next.

> Improve v1 operator API read performance.
> -----------------------------------------
>
>                 Key: MESOS-10026
>                 URL: https://issues.apache.org/jira/browse/MESOS-10026
>             Project: Mesos
>          Issue Type: Improvement
>          Components: HTTP API
>            Reporter: Benjamin Mahler
>            Assignee: Benjamin Mahler
>            Priority: Major
>              Labels: foundations
>
> Currently, the v1 operator API has poor performance relative to the v0 json 
> API. The following initial numbers were provided by [~Will Mahler] from our 
> state serving benchmark:
>  
> |OPTIMIZED - Master (baseline)| | | | |
> |Test setup|1000 agents with a total of 10000 running tasks and 10000 completed tasks|10000 agents with a total of 100000 running tasks and 100000 completed tasks|20000 agents with a total of 200000 running tasks and 200000 completed tasks|40000 agents with a total of 400000 running tasks and 400000 completed tasks|
> |v0 'state' response|0.17|1.66|8.96|12.42|
> |v1 x-protobuf|0.35|3.21|9.47|19.09|
> |v1 json|0.45|4.72|10.81|31.43|
> There is quite a lot of variance, but v1 protobuf is consistently slower than 
> v0 (sometimes significantly so) and v1 json is consistently slower than v1 
> protobuf (sometimes significantly so).
> The reason that the v1 operator API is slower is that it does the following:
> (1) Construct a temporary unversioned state response object by copying the 
> in-memory unversioned state into an overall response object. (expensive!)
> (2) Evolve it to v1: serialize, then de-serialize into a v1 overall state 
> object. (expensive!)
> (3) Serialize the overall v1 state object to protobuf or json.
> (4) Destruct the temporaries (expensive, but this is done after the response 
> starts being served).
> On the other hand, the v0 jsonify approach does the following:
> (1) Serialize the in-memory unversioned state into json, by traversing state 
> and accumulating the overall serialized json.
> This means that v1 has substantial overhead vs v0, and we need to remove it 
> to bring v1 on par with, or better than, v0. v1 should serialize directly to 
> json (straightforward with jsonify) or protobuf (this can be done via an 
> io::CodedOutputStream).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
