[jira] [Updated] (IMPALA-10578) Big Query influence other query seriously when hardware not reach limit

wesleydeng (Jira) Tue, 30 Mar 2021 21:54:04 -0700


     [ 
https://issues.apache.org/jira/browse/IMPALA-10578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


wesleydeng updated IMPALA-10578:
--------------------------------
    Description: 
While a big query is running (with mt_dop=8), other queries have great 
difficulty starting. 

A small query (a select distinct of one field from a small table) may take 
about 1 minute; normally it takes only about 1~3 seconds.

 From the impalad log, I found log entries that I cannot explain, like these:

!image-2021-03-16-16-32-37-862.png|width=836,height=189!

!image-2021-03-10-19-59-24-188.png|width=892,height=435!

---------------

About the gap between "Handling call" and "Deserializing Batch", I found 
another code path: 
--KrpcDataStreamRecvr::SenderQueue::AddBatch

  ----EnqueueDeferredRpc(move(payload), l);   // after dequeue, this calls KrpcDataStreamRecvr::SenderQueue::AddBatchWork

--------------- 
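
To make the suspected gap concrete, here is a toy model of a deferred-RPC path. It is a sketch only, not Impala code: the class `SenderQueueModel`, its `capacity` parameter, and the rule for when a payload is deferred are assumptions made for illustration. Only the idea that a deferred payload is deserialized later, after it is dequeued, comes from the call path above.

```python
from collections import deque


class SenderQueueModel:
    """Toy model of a receiver-side sender queue. NOT Impala code."""

    def __init__(self, capacity):
        self.capacity = capacity   # hypothetical limit on ready batches
        self.batches = deque()     # batches ready for the consumer
        self.deferred = deque()    # payloads parked until space frees up
        self.log = []

    def add_batch(self, payload):
        self.log.append(f"Handling call {payload}")
        # Assumption: defer when the queue is full, or when earlier
        # payloads are already waiting (this keeps arrival order).
        if self.deferred or len(self.batches) >= self.capacity:
            self.deferred.append(payload)
            return
        self._add_batch_work(payload)

    def _add_batch_work(self, payload):
        self.log.append(f"Deserializing batch {payload}")
        self.batches.append(payload)

    def get_batch(self):
        batch = self.batches.popleft()
        # Space was freed, so one deferred payload can now be processed.
        if self.deferred:
            self._add_batch_work(self.deferred.popleft())
        return batch


queue = SenderQueueModel(capacity=1)
for payload in ("A", "B", "C"):
    queue.add_batch(payload)
queue.get_batch()
queue.get_batch()
print("\n".join(queue.log))
```

In this model, "Handling call B" and "Handling call C" are logged immediately, while the matching "Deserializing batch" lines appear only after the consumer drains the queue. If the real path behaves similarly, a slow consumer would widen the gap between the two log messages.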

 

 

While the big query was running, data spilling occurred because mem_limit was 
set and the big query consumes a lot of memory.

 

In the attachments, I have included the profiles of the big query and the 
small query. The small query normally finishes in seconds. The timeline of the 
small query is shown below:

Query Timeline: 21m39s
 - Query submitted: 48.846us (48.846us)
 - Planning finished: 2.934ms (2.886ms)
 - Submit for admission: 12.572ms (9.637ms)
 - Completed admission: 13.622ms (1.050ms)
 - Ready to start on 56 backends: 15.271ms (1.649ms)
 *- All 56 execution backends (171 fragment instances) started: 18s505ms (18s489ms)*
 - Rows available: 51s770ms (33s265ms)
 - First row fetched: 57s220ms (5s449ms)
 - Last row fetched: 59s119ms (1s899ms)
 - Released admission control resources: 1m1s (2s223ms)
 - AdmissionControlTimeSinceLastUpdate: 80.000ms
 - ComputeScanRangeAssignmentTimer: 439.749us
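
To see where the time goes, the timeline can be reduced to per-step gaps. The following is a minimal sketch: the timestamps are copied from the timeline above, while the helper names `parse_duration` and `step_durations` are made up for this example and are not part of any Impala tool.

```python
import re

# Cumulative timestamps copied from the small query's timeline.
TIMELINE = """\
Query submitted: 48.846us
Planning finished: 2.934ms
Submit for admission: 12.572ms
Completed admission: 13.622ms
Ready to start on 56 backends: 15.271ms
All 56 execution backends (171 fragment instances) started: 18s505ms
Rows available: 51s770ms
First row fetched: 57s220ms
Last row fetched: 59s119ms
"""

UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0,
                "ms": 1e-3, "us": 1e-6, "ns": 1e-9}
# Two-letter units are listed first so "ms" is not read as "m".
TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ms|us|ns|h|m|s)")


def parse_duration(text):
    """Convert a duration such as '18s505ms' or '1m1s' to seconds."""
    return sum(float(num) * UNIT_SECONDS[unit]
               for num, unit in TOKEN.findall(text))


def step_durations(timeline):
    """Return (event, seconds since the previous event) pairs."""
    steps, previous = [], 0.0
    for line in timeline.strip().splitlines():
        name, _, stamp = line.rpartition(": ")
        elapsed = parse_duration(stamp)
        steps.append((name, elapsed - previous))
        previous = elapsed
    return steps


steps = step_durations(TIMELINE)
total = sum(seconds for _, seconds in steps)
for name, seconds in sorted(steps, key=lambda s: s[1], reverse=True)[:2]:
    share = 100 * seconds / total
    print(f"{name}: {seconds:.3f}s ({share:.0f}% of elapsed time)")
```

The two largest gaps are the wait for rows to become available and the startup of the 56 execution backends, matching the 33s265ms and 18s489ms deltas in the profile. Everything before backend startup takes about 15ms in total.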

 

 

 

 

  was:
While a big query is running (with mt_dop=8), other queries have great 
difficulty starting. 

A small query (a select distinct of one field from a small table) may take 
about 1 minute; normally it takes only about 1~3 seconds.

 From the impalad log, I found log entries that I cannot explain, like these:

!image-2021-03-16-16-32-37-862.png|width=836,height=189!

!image-2021-03-10-19-59-24-188.png|width=892,height=435!

 

While the big query was running, data spilling occurred because mem_limit was 
set and the big query consumes a lot of memory.

 

In the attachments, I have included the profiles of the big query and the 
small query. The small query normally finishes in seconds. The timeline of the 
small query is shown below:

Query Timeline: 21m39s
 - Query submitted: 48.846us (48.846us)
 - Planning finished: 2.934ms (2.886ms)
 - Submit for admission: 12.572ms (9.637ms)
 - Completed admission: 13.622ms (1.050ms)
 - Ready to start on 56 backends: 15.271ms (1.649ms)
 *- All 56 execution backends (171 fragment instances) started: 18s505ms (18s489ms)*
 - Rows available: 51s770ms (33s265ms)
 - First row fetched: 57s220ms (5s449ms)
 - Last row fetched: 59s119ms (1s899ms)
 - Released admission control resources: 1m1s (2s223ms)
 - AdmissionControlTimeSinceLastUpdate: 80.000ms
 - ComputeScanRangeAssignmentTimer: 439.749us

 

 

 

 


> Big Query influence other query seriously when hardware not reach limit 
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-10578
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10578
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 3.4.0
>         Environment: impala-3.4
> 80 machines with 96 cpu and 256GB mem
> scratch-dir is on separate disk different from HDFS data dir
>            Reporter: wesleydeng
>            Priority: Major
>         Attachments: big_query.txt.bz2, image-2021-03-10-19-59-24-188.png, 
> image-2021-03-16-16-32-37-862.png, small_query_be_influenced_very_slow.txt.bz2
>
>
> While a big query is running (with mt_dop=8), other queries have great 
> difficulty starting. 
> A small query (a select distinct of one field from a small table) may take 
> about 1 minute; normally it takes only about 1~3 seconds.
>  From the impalad log, I found log entries that I cannot explain, like these:
> !image-2021-03-16-16-32-37-862.png|width=836,height=189!
> !image-2021-03-10-19-59-24-188.png|width=892,height=435!
> ---------------
> About the gap between "Handling call" and "Deserializing Batch", I found 
> another code path: 
> --KrpcDataStreamRecvr::SenderQueue::AddBatch
>   ----EnqueueDeferredRpc(move(payload), l);   // after dequeue, this calls KrpcDataStreamRecvr::SenderQueue::AddBatchWork
> --------------- 
>  
>  
> While the big query was running, data spilling occurred because mem_limit 
> was set and the big query consumes a lot of memory.
>  
> In the attachments, I have included the profiles of the big query and the 
> small query. The small query normally finishes in seconds. The timeline of 
> the small query is shown below:
> Query Timeline: 21m39s
>  - Query submitted: 48.846us (48.846us)
>  - Planning finished: 2.934ms (2.886ms)
>  - Submit for admission: 12.572ms (9.637ms)
>  - Completed admission: 13.622ms (1.050ms)
>  - Ready to start on 56 backends: 15.271ms (1.649ms)
>  *- All 56 execution backends (171 fragment instances) started: 18s505ms (18s489ms)*
>  - Rows available: 51s770ms (33s265ms)
>  - First row fetched: 57s220ms (5s449ms)
>  - Last row fetched: 59s119ms (1s899ms)
>  - Released admission control resources: 1m1s (2s223ms)
>  - AdmissionControlTimeSinceLastUpdate: 80.000ms
>  - ComputeScanRangeAssignmentTimer: 439.749us
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org
