[ 
https://issues.apache.org/jira/browse/FLINK-16001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17061696#comment-17061696
 ] 

Gary Yao commented on FLINK-16001:
----------------------------------

First of all, thanks for the benchmark. I am able to reproduce the results. 

Next, I have to correct my own statement that the complexity is linear in the 
number of distinct pipelined regions. Since we have to touch every vertex, the 
complexity should be linear in the number of vertices. 

However, the time difference between the streams and the non-streams version in 
your benchmark is less than 1 ms for 5000 regions. By increasing the number of 
vertices per region to 21, I can measure a difference of 8 ms. 
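
For illustration, the kind of comparison discussed here can be sketched 
roughly as follows. This is not the attached benchmark and not the actual 
code in PipelinedRegionComputeUtil; the class RegionSetSketch, its methods, 
and the use of plain strings as vertex ids are all hypothetical stand-ins. 
Both variants copy every vertex once, so both are linear in the number of 
vertices, and any difference is constant-factor overhead. The timing in 
main is naive (no warm-up), so treat its numbers as a rough indication only.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class RegionSetSketch {

    // Hypothetical input: each region is a list of vertex ids (plain strings).
    static List<List<String>> buildRegions(int numRegions, int verticesPerRegion) {
        List<List<String>> regions = new ArrayList<>(numRegions);
        for (int r = 0; r < numRegions; r++) {
            List<String> region = new ArrayList<>(verticesPerRegion);
            for (int v = 0; v < verticesPerRegion; v++) {
                region.add("r" + r + "-v" + v);
            }
            regions.add(region);
        }
        return regions;
    }

    // Streams variant: copies each region into a set and collects the sets.
    static Set<Set<String>> withStreams(List<List<String>> regions) {
        return regions.stream()
            .<Set<String>>map(region -> new HashSet<>(region))
            .collect(Collectors.toSet());
    }

    // Loop variant: same result, built with a plain for-each loop.
    static Set<Set<String>> withLoop(List<List<String>> regions) {
        Set<Set<String>> result = new HashSet<>();
        for (List<String> region : regions) {
            result.add(new HashSet<>(region));
        }
        return result;
    }

    public static void main(String[] args) {
        // Sizes taken from the comment above: 5000 regions, 21 vertices each.
        List<List<String>> regions = buildRegions(5000, 21);

        long t0 = System.nanoTime();
        Set<Set<String>> a = withStreams(regions);
        long t1 = System.nanoTime();
        Set<Set<String>> b = withLoop(regions);
        long t2 = System.nanoTime();

        System.out.println("streams: " + (t1 - t0) / 1_000_000.0 + " ms");
        System.out.println("loop:    " + (t2 - t1) / 1_000_000.0 + " ms");
        System.out.println("same result: " + a.equals(b));
    }
}
```

A proper measurement would use a harness such as JMH with warm-up 
iterations rather than a single System.nanoTime() pair. 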

This is a drop in the bucket, especially considering that building the regions 
can take several seconds. Therefore, rewriting the code to non-streams should 
be motivated by reasons of legibility and not performance. If you still insist 
on this performance improvement, I can assign you to this ticket, but I would 
recommend optimizing code paths that are actually slow.

> Avoid using Java Streams in construction of ExecutionGraph
> ----------------------------------------------------------
>
>                 Key: FLINK-16001
>                 URL: https://issues.apache.org/jira/browse/FLINK-16001
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Coordination
>    Affects Versions: 1.10.0
>            Reporter: Jiayi Liao
>            Priority: Major
>         Attachments: benchmark.csv
>
>
> I think we should avoid {{Java Streams}} in the construction of 
> {{ExecutionGraph}}, for example in the function {{toPipelinedRegionsSet}} in 
> {{PipelinedRegionComputeUtil}}, because job submission is definitely 
> performance sensitive, especially when {{distinctRegions}} has a large 
> cardinality.
> This also applies to some other places in the package 
> {{org.apache.flink.runtime.executiongraph}}.
> cc [~trohrmann] [~gjy] [~zhuzh] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
