Michael Ho created IMPALA-6766:
----------------------------------

             Summary: Resource management for network
                 Key: IMPALA-6766
                 URL: https://issues.apache.org/jira/browse/IMPALA-6766
             Project: IMPALA
          Issue Type: Improvement
          Components: Distributed Exec
    Affects Versions: Impala 3.0, Impala 2.12.0
            Reporter: Michael Ho


There is currently no way to manage the network bandwidth usage of a query. As 
a result, a query which shuffles a huge amount of data can slow down other 
concurrent queries. We should consider extending the resource pool concept to 
also manage network usage. The profile excerpts below show the observed 
bandwidth of the same query when it runs alone (good case) and when it runs 
concurrently with another query which shuffles a lot of data across the network 
(bad case): throughput drops from 706.4 MiB/s to 182.3 MiB/s, roughly a 3.9x 
slowdown.
 
Good case:
      DataStreamSender (dst_id=4)
        - BytesSent: 828.3 MiB (868564531)
        - InactiveTotalTime: 0ns (0)
        - NetworkThroughput(*): 706.4 MiB/s (740751383)
 
Bad case:
      DataStreamSender (dst_id=4)
        - BytesSent: 828.3 MiB (868564531)
        - InactiveTotalTime: 0ns (0)
        - NetworkThroughput(*): 182.3 MiB/s (191106930)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
