[ https://issues.apache.org/jira/browse/FLINK-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Till Rohrmann closed FLINK-4175.
--------------------------------
    Resolution: Abandoned

This issue has been abandoned.

> Broadcast data sent increases with # slots per TM
> -------------------------------------------------
>
>                 Key: FLINK-4175
>                 URL: https://issues.apache.org/jira/browse/FLINK-4175
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Network
>    Affects Versions: 1.0.3
>            Reporter: Felix Neutatz
>            Priority: Major
>              Labels: performance, stale-assigned
>
> Problem:
> We see an unexpected increase in the data sent over the network for
> broadcasts as the number of slots per TaskManager increases.
> We provide a benchmark [1]. The effect not only increases the amount of data
> sent over the network but also hurts performance, as the preliminary results
> below show. In these results cloud-11 has 25 nodes and ibm-power has 8 nodes,
> with the number of slots per node scaled from 1 to 16.
> +-----------------------+--------------+-------------+
> | suite                 | name         | median_time |
> +=======================+==============+=============+
> | broadcast.cloud-11    | broadcast.01 |        8796 |
> | broadcast.cloud-11    | broadcast.02 |       14802 |
> | broadcast.cloud-11    | broadcast.04 |       30173 |
> | broadcast.cloud-11    | broadcast.08 |       56936 |
> | broadcast.cloud-11    | broadcast.16 |      117507 |
> | broadcast.ibm-power-1 | broadcast.01 |        6807 |
> | broadcast.ibm-power-1 | broadcast.02 |        8443 |
> | broadcast.ibm-power-1 | broadcast.04 |       11823 |
> | broadcast.ibm-power-1 | broadcast.08 |       21655 |
> | broadcast.ibm-power-1 | broadcast.16 |       37426 |
> +-----------------------+--------------+-------------+
> After looking into the code base, it seems that the data is de-serialized
> only once per TM, but the actual data is sent to every slot running the
> operator with broadcast vars and is simply discarded if it has already been
> de-serialized.
> We do not see a reason why the data can't be shared among the slots of a TM
> and therefore sent just once.
> [1] https://github.com/TU-Berlin-DIMA/flink-broadcast
> This Jira will continue the discussion started here:
> https://mail-archives.apache.org/mod_mbox/flink-dev/201606.mbox/%3c1465386300767.94...@tu-berlin.de%3E

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
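
The issue description argues that broadcast data is sent once per slot rather than once per TM. A rough way to see why that matters is the back-of-the-envelope model below. It is only a sketch under stated assumptions, not Flink code: the function name `broadcast_bytes_sent`, the 100 MiB payload, and the premise that every receiving slot gets one full copy over the network are all hypothetical simplifications. Only the 25-node cluster size and the 1 to 16 slot range come from the issue.

```python
def broadcast_bytes_sent(payload_bytes, task_managers, slots_per_tm, shared_per_tm):
    """Total bytes delivered cluster-wide for one broadcast set.

    Simplified model (an assumption, not measured Flink behavior):
    each receiver gets exactly one full copy of the payload.
    """
    # Per-slot sending: every slot on a TM receives its own copy.
    # Per-TM sharing: one copy per TM, reused by all of its slots.
    receivers_per_tm = 1 if shared_per_tm else slots_per_tm
    return payload_bytes * task_managers * receivers_per_tm


if __name__ == "__main__":
    payload = 100 * 1024 * 1024  # hypothetical 100 MiB broadcast set
    nodes = 25                   # cloud-11 cluster size from the issue
    for slots in (1, 2, 4, 8, 16):
        per_slot = broadcast_bytes_sent(payload, nodes, slots, shared_per_tm=False)
        per_tm = broadcast_bytes_sent(payload, nodes, slots, shared_per_tm=True)
        # Print the number of payload copies sent under each scheme.
        print(slots, per_slot // payload, per_tm // payload)
```

Under this model the per-slot volume grows linearly with the number of slots, while the shared-per-TM volume stays constant. That matches the direction of the median times reported in the issue, although the model says nothing about the absolute timings.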