Michael Ho created IMPALA-6766:
----------------------------------

             Summary: Resource management for network
                 Key: IMPALA-6766
                 URL: https://issues.apache.org/jira/browse/IMPALA-6766
             Project: IMPALA
          Issue Type: Improvement
          Components: Distributed Exec
    Affects Versions: Impala 3.0, Impala 2.12.0
            Reporter: Michael Ho

There is no way to manage the network bandwidth usage of a query. As a result, a query that shuffles a huge amount of data can slow down other concurrent queries. The profiles below show the observed bandwidth of the same query when it runs alone and when it runs alongside another query that shuffles a lot of data across the network. We should consider extending the resource pool concept to also manage network usage.

Good case:

DataStreamSender (dst_id=4)
   - BytesSent: 828.3 MiB (868564531)
   - InactiveTotalTime: 0ns (0)
   - NetworkThroughput(*): 706.4 MiB/s (740751383)

Bad case:

DataStreamSender (dst_id=4)
   - BytesSent: 828.3 MiB (868564531)
   - InactiveTotalTime: 0ns (0)
   - NetworkThroughput(*): 182.3 MiB/s (191106930)
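
To put a number on the gap between the two profiles, here is a minimal Python sketch that reads the raw counter values (the integers in parentheses) and compares the two NetworkThroughput(*) readings. The helper names parse_counters and throughput_ratio are hypothetical and exist only for this illustration; they are not part of Impala. The counter text is copied from the profiles in this report.

```python
import re

# Counter lines copied from the two DataStreamSender profiles in this report.
GOOD = """\
- BytesSent: 828.3 MiB (868564531)
- InactiveTotalTime: 0ns (0)
- NetworkThroughput(*): 706.4 MiB/s (740751383)
"""

BAD = """\
- BytesSent: 828.3 MiB (868564531)
- InactiveTotalTime: 0ns (0)
- NetworkThroughput(*): 182.3 MiB/s (191106930)
"""

# Matches "- Name: pretty value (raw)" and captures the counter name
# and the raw integer in the trailing parentheses.
COUNTER_RE = re.compile(r"-\s*(?P<name>[^:]+):.*\((?P<raw>\d+)\)\s*$")


def parse_counters(text):
    """Hypothetical helper: map counter name -> raw integer value."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_RE.search(line)
        if match:
            counters[match.group("name").strip()] = int(match.group("raw"))
    return counters


def throughput_ratio(good, bad):
    """Hypothetical helper: how many times faster the good case is."""
    return good["NetworkThroughput(*)"] / bad["NetworkThroughput(*)"]


if __name__ == "__main__":
    good = parse_counters(GOOD)
    bad = parse_counters(BAD)
    ratio = throughput_ratio(good, bad)
    print(f"same bytes sent: {good['BytesSent'] == bad['BytesSent']}")
    print(f"good/bad throughput ratio: {ratio:.2f}x")
    print(f"throughput lost under contention: {(1 - 1 / ratio) * 100:.1f}%")
```

With the values above, both runs send the same 868564531 bytes, while the throughput under contention is a bit more than 3.8 times lower than when the query runs alone.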