[jira] [Commented] (BEAM-3617) Performance degradation on the direct runner

2018-02-06 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16354170#comment-16354170
 ] 

Kenneth Knowles commented on BEAM-3617:
---------------------------------------

Ah, it is almost certainly due to repeated, non-cached deserialization. 
Reverting for the release is a good idea. It probably shows up most in Query 7 
because that query uses a particular primitive more heavily. The only primitive 
that caches is ParDo. There are some fairly arbitrary choices about which piece 
the runner should cache and which piece is built into pipeline rehydration.
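
To illustrate the difference between repeated and cached deserialization mentioned above, here is a minimal Python sketch. It is not Beam's Java code: `CachingDeserializer`, its `decode_calls` counter, and the sample payload are hypothetical names made up for this example, and `pickle` merely stands in for whatever deserialization the runner performs.

```python
import pickle


class CachingDeserializer:
    """Deserialize each distinct payload once and reuse the result."""

    def __init__(self):
        self._cache = {}       # serialized bytes -> deserialized object
        self.decode_calls = 0  # how many times pickle.loads actually ran

    def load(self, payload: bytes):
        if payload not in self._cache:
            self.decode_calls += 1
            self._cache[payload] = pickle.loads(payload)
        return self._cache[payload]


# Hypothetical stand-in for a transform's serialized configuration.
payload = pickle.dumps({"window_size_sec": 10, "fn": "max_bid"})

# Uncached: one deserialization per element processed.
uncached_calls = 0
for _ in range(1000):
    pickle.loads(payload)
    uncached_calls += 1

# Cached: one deserialization in total, however many elements arrive.
deserializer = CachingDeserializer()
for _ in range(1000):
    config = deserializer.load(payload)
```

The cached object is shared between calls, so this only works if callers treat it as read-only.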

> Performance degradation on the direct runner
> --------------------------------------------
>
>                 Key: BEAM-3617
>                 URL: https://issues.apache.org/jira/browse/BEAM-3617
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-direct
>    Affects Versions: 2.3.0
>            Reporter: Jean-Baptiste Onofré
>            Assignee: Kenneth Knowles
>            Priority: Blocker
>             Fix For: 2.3.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Running Nexmark queries with the direct runner between Beam 2.2.0 and 2.3.0 
> shows a performance degradation:
> {code}
> ====================================
>            Beam 2.2.0    Beam 2.3.0
>   Query  Runtime(sec)  Runtime(sec)
> ====================================
>   0000            6.4          10.6
>   0001            5.1          10.2
>   0002            3.0           5.8
>   0003            3.8           6.2
>   0004            0.9           1.4
>   0005            5.8          11.4
>   0006            0.8           1.4
>   0007          193.8        1249.1
>   0008            3.9           6.9
>   0009            0.9           1.3
>   0010            6.4           8.2
>   0011            5.0           9.4
>   0012            4.7           9.1
> {code}
> Query 7 is hit especially hard: its runtime grows from 193.8 to 1249.1 
> seconds, which is about 6 times longer.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (BEAM-3617) Performance degradation on the direct runner

2018-02-06 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16353662#comment-16353662
 ] 

Jean-Baptiste Onofré commented on BEAM-3617:
--------------------------------------------

After {{git bisect}}, it seems the performance degradation is due to the 
following commit:

{code}
c0cb28cc30733f561d4cc6155be5738584956ebd is the first bad commit
commit c0cb28cc30733f561d4cc6155be5738584956ebd
Author: Kenn Knowles 
Date:   Sat Sep 30 10:30:20 2017 -0700

    Reinstate proto round trip in Java DirectRunner

:040000 040000 9ecbf35bc0f21edba14076d089a0719c3996aaca 0137bbebd4032213001ccf5544d2cd717282db8c M  runners
{code}
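
For readers unfamiliar with bisection, the search that finds a "first bad commit" can be sketched in Python as a binary search. This is a simplified illustration that assumes a linear history; `git bisect` itself works on the full commit graph. The commit names and the `slow` check below are hypothetical.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad is true.

    Assumes commits are ordered oldest to newest, the first one is good,
    the last one is bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical history of 16 commits where the regression lands in "c11".
history = [f"c{i:02d}" for i in range(16)]
tested = []


def slow(commit: str) -> bool:
    # Stand-in for "build this commit, run the benchmark, compare the runtime".
    tested.append(commit)
    return commit >= "c11"


culprit = first_bad_commit(history, slow)
```

Each step halves the remaining range, so the number of benchmark runs grows with the logarithm of the number of commits rather than linearly.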

> Performance degradation on the direct runner
> --------------------------------------------
>
>                 Key: BEAM-3617
>                 URL: https://issues.apache.org/jira/browse/BEAM-3617
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-direct
>    Affects Versions: 2.3.0
>            Reporter: Jean-Baptiste Onofré
>            Assignee: Thomas Groh
>            Priority: Blocker
>             Fix For: 2.3.0
>


[jira] [Commented] (BEAM-3617) Performance degradation on the direct runner

2018-02-05 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352829#comment-16352829
 ] 

Kenneth Knowles commented on BEAM-3617:
---------------------------------------

Good idea to check this. I added it to the spreadsheet and signed you up, JB. I 
only added the direct runner, since other runners should opt in.



[jira] [Commented] (BEAM-3617) Performance degradation on the direct runner

2018-02-05 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-3617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352800#comment-16352800
 ] 

Jean-Baptiste Onofré commented on BEAM-3617:
--------------------------------------------

I just ran nexmark ({{mvn exec:java 
-Dexec.mainClass=org.apache.beam.sdk.nexmark.Main -Dexec.args="--suite=SMOKE 
--streaming=true --manageResources=false --monitorJobs=true"}}):

* On Beam 2.2.0:

{code}
Performance:
  Conf  Runtime(sec)  (Baseline)  Events(/sec)  (Baseline)  Results  (Baseline)
  0000           8.1                   12335.0                   10
  0001           4.0                   25106.7                92000
  0002           2.2                   46082.9                  351
  0003           4.6                   21767.5                  444
  0004           1.4                    7092.2                   40
  0005           5.9                   16869.1                   12
  0006           1.0                   10111.2                  401
  0007         153.2                     653.0                    1
  0008           3.3                   30413.6                 6000
  0009           0.9                       1.1                  298
  0010           5.2                   19241.9                    1
  0011           4.3                   23153.5                 1919
  0012           3.3                   30712.5                 1919
==============================================================================
{code}

* On Beam 2.3.0 (release branch):

{code}
Performance:
  Conf  Runtime(sec)  (Baseline)  Events(/sec)  (Baseline)  Results  (Baseline)
  0000          10.5                    9554.7                   10
  0001           7.2                   13848.5                92000
  0002           3.9                   25654.2                  351
  0003           5.9                   17059.0                  444
  0004           1.7                    6013.2                   40
  0005           8.4                   11899.1                   12
  0006           1.4                    7077.1                  401
  0007        1019.0                      98.1                    1
  0008           5.0                   19888.6                 6000
  0009           1.3                    7905.1                  298
  0010           6.2                   16186.5                    1
  0011           9.0                   11088.9                 1919
  0012           6.4                   15535.2                 1919
==============================================================================
{code}
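
To make the comparison between the two runs easier to read, the per-query slowdown can be worked out with a few lines of Python. This is only a convenience sketch: the two lists are the Runtime(sec) values copied from the outputs above in row order, and the variable names are made up for this example.

```python
# Runtime(sec) values from the two runs above, in row order (rows 0 through 12).
beam_220 = [8.1, 4.0, 2.2, 4.6, 1.4, 5.9, 1.0, 153.2, 3.3, 0.9, 5.2, 4.3, 3.3]
beam_230 = [10.5, 7.2, 3.9, 5.9, 1.7, 8.4, 1.4, 1019.0, 5.0, 1.3, 6.2, 9.0, 6.4]

# Slowdown factor per row: runtime on 2.3.0 divided by runtime on 2.2.0.
slowdown = [new / old for old, new in zip(beam_220, beam_230)]

# Index of the row with the largest slowdown.
worst_row = max(range(len(slowdown)), key=slowdown.__getitem__)

for row, ratio in enumerate(slowdown):
    print(f"row {row:2d}: {ratio:4.1f}x")
```

On these numbers, row 7 (Query 7) slows down by a factor of about 6.7, while every other row falls between roughly 1.2x and 2.1x.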
