[ https://issues.apache.org/jira/browse/IMPALA-10578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
wesleydeng updated IMPALA-10578:
--------------------------------
    Description:
When a big query is running (with mt_dop=8), it is very difficult for other queries to start.
A small query (selecting distinct values of one field from a small table) may take about 1 minute; normally it takes only about 1~3 seconds.
From the impalad log, I found some log entries that I cannot make sense of:
!image-2021-03-16-16-32-37-862.png|width=836,height=189!
!image-2021-03-10-19-59-24-188.png|width=892,height=435!
---------------
About the gap between "Handling call" and "Deserializing Batch", I found another path:
--KrpcDataStreamRecvr::SenderQueue::AddBatch
----EnqueueDeferredRpc(move(payload), l); // after dequeue, will call KrpcDataStreamRecvr::SenderQueue::AddBatchWork
---------------

While the big query is running, data spilling occurs because mem_limit was set and the big query consumes a lot of memory.

In the attachments, I have included the profiles of the big query and the small query. The small query normally finishes in seconds. The timeline of the small query is shown below:
 Query Timeline: 21m39s
 - Query submitted: 48.846us (48.846us)
 - Planning finished: 2.934ms (2.886ms)
 - Submit for admission: 12.572ms (9.637ms)
 - Completed admission: 13.622ms (1.050ms)
 - Ready to start on 56 backends: 15.271ms (1.649ms)
 -- All 56 execution backends (171 fragment instances) started: 18s505ms (18s489ms)*
 - Rows available: 51s770ms (33s265ms)
 - First row fetched: 57s220ms (5s449ms)
 - Last row fetched: 59s119ms (1s899ms)
 - Released admission control resources: 1m1s (2s223ms)
 - AdmissionControlTimeSinceLastUpdate: 80.000ms
 - ComputeScanRangeAssignmentTimer: 439.749us

  was:
When a big query is running (with mt_dop=8), it is very difficult for other queries to start.
A small query (selecting distinct values of one field from a small table) may take about 1 minute; normally it takes only about 1~3 seconds.
From the impalad log, I found some log entries that I cannot make sense of:
!image-2021-03-16-16-32-37-862.png|width=836,height=189!
!image-2021-03-10-19-59-24-188.png|width=892,height=435!

While the big query is running, data spilling occurs because mem_limit was set and the big query consumes a lot of memory.

In the attachments, I have included the profiles of the big query and the small query. The small query normally finishes in seconds. The timeline of the small query is shown below:
 Query Timeline: 21m39s
 - Query submitted: 48.846us (48.846us)
 - Planning finished: 2.934ms (2.886ms)
 - Submit for admission: 12.572ms (9.637ms)
 - Completed admission: 13.622ms (1.050ms)
 - Ready to start on 56 backends: 15.271ms (1.649ms)
 *- All 56 execution backends (171 fragment instances) started: 18s505ms (18s489ms)*
 - Rows available: 51s770ms (33s265ms)
 - First row fetched: 57s220ms (5s449ms)
 - Last row fetched: 59s119ms (1s899ms)
 - Released admission control resources: 1m1s (2s223ms)
 - AdmissionControlTimeSinceLastUpdate: 80.000ms
 - ComputeScanRangeAssignmentTimer: 439.749us


> Big Query influence other query seriously when hardware not reach limit
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-10578
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10578
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 3.4.0
>         Environment: impala-3.4
> 80 machines with 96 cpu and 256GB mem
> scratch-dir is on separate disk different from HDFS data dir
>            Reporter: wesleydeng
>            Priority: Major
>         Attachments: big_query.txt.bz2, image-2021-03-10-19-59-24-188.png, image-2021-03-16-16-32-37-862.png, small_query_be_influenced_very_slow.txt.bz2
>
>
> When a big query is running (with mt_dop=8), it is very difficult for other queries to start.
> A small query (selecting distinct values of one field from a small table) may take about 1 minute; normally it takes only about 1~3 seconds.
> From the impalad log, I found some log entries that I cannot make sense of:
> !image-2021-03-16-16-32-37-862.png|width=836,height=189!
> !image-2021-03-10-19-59-24-188.png|width=892,height=435!
> ---------------
> About the gap between "Handling call" and "Deserializing Batch", I found another path:
> --KrpcDataStreamRecvr::SenderQueue::AddBatch
> ----EnqueueDeferredRpc(move(payload), l); // after dequeue, will call KrpcDataStreamRecvr::SenderQueue::AddBatchWork
> ---------------
>
>
> While the big query is running, data spilling occurs because mem_limit was set and the big query consumes a lot of memory.
>
> In the attachments, I have included the profiles of the big query and the small query. The small query normally finishes in seconds. The timeline of the small query is shown below:
> Query Timeline: 21m39s
> - Query submitted: 48.846us (48.846us)
> - Planning finished: 2.934ms (2.886ms)
> - Submit for admission: 12.572ms (9.637ms)
> - Completed admission: 13.622ms (1.050ms)
> - Ready to start on 56 backends: 15.271ms (1.649ms)
> -- All 56 execution backends (171 fragment instances) started: 18s505ms (18s489ms)*
> - Rows available: 51s770ms (33s265ms)
> - First row fetched: 57s220ms (5s449ms)
> - Last row fetched: 59s119ms (1s899ms)
> - Released admission control resources: 1m1s (2s223ms)
> - AdmissionControlTimeSinceLastUpdate: 80.000ms
> - ComputeScanRangeAssignmentTimer: 439.749us



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
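
In the query timeline quoted in this message, each event shows the cumulative elapsed time followed by the time spent in that step in parentheses. The following is a minimal sketch, not part of the original report, of how the per-step times could be extracted and ranked to see where the small query stalls. The helper names (parse_duration, parse_timeline) are hypothetical, and the duration format is inferred only from the values shown in this timeline, not from Impala documentation.

```python
import re

# Seconds per unit; the unit spellings are assumed from the timeline values.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0,
                 "ms": 1e-3, "us": 1e-6, "ns": 1e-9}
# "ms" must be tried before "m" so that "505ms" is not read as minutes.
_TOKEN = re.compile(r"(\d+(?:\.\d+)?)(ns|us|ms|h|m|s)")
# Matches "<prefix> <label>: <cumulative> (<delta>)", tolerating quote
# markers and stray bold markup ("> ", "- ", "*") around the event.
_EVENT = re.compile(
    r"^[\s>*-]*(?P<label>.+?):\s+(?P<total>\S+)\s+\((?P<delta>\S+)\)\*?\s*$"
)


def parse_duration(text):
    """Convert a string such as '18s505ms' or '2.934ms' to seconds."""
    tokens = _TOKEN.findall(text)
    if not tokens or "".join(n + u for n, u in tokens) != text:
        raise ValueError(f"unrecognized duration: {text!r}")
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in tokens)


def parse_timeline(lines):
    """Return (label, step_seconds) for every line that carries a delta."""
    events = []
    for line in lines:
        match = _EVENT.match(line)
        if match:
            events.append((match["label"].strip(),
                           parse_duration(match["delta"])))
    return events


# Timeline lines copied from the issue description.
TIMELINE = [
    "Query Timeline: 21m39s",
    "- Query submitted: 48.846us (48.846us)",
    "- Planning finished: 2.934ms (2.886ms)",
    "- Submit for admission: 12.572ms (9.637ms)",
    "- Completed admission: 13.622ms (1.050ms)",
    "- Ready to start on 56 backends: 15.271ms (1.649ms)",
    "-- All 56 execution backends (171 fragment instances) started: 18s505ms (18s489ms)*",
    "- Rows available: 51s770ms (33s265ms)",
    "- First row fetched: 57s220ms (5s449ms)",
    "- Last row fetched: 59s119ms (1s899ms)",
    "- Released admission control resources: 1m1s (2s223ms)",
    "- AdmissionControlTimeSinceLastUpdate: 80.000ms",
    "- ComputeScanRangeAssignmentTimer: 439.749us",
]

if __name__ == "__main__":
    # Print the steps from slowest to fastest.
    ranked = sorted(parse_timeline(TIMELINE), key=lambda e: e[1], reverse=True)
    for label, seconds in ranked:
        print(f"{seconds:10.3f}s  {label}")
```

Ranked this way, the two steps that dominate are the wait for rows to become available and the startup of the fragment instances on the 56 backends; every step before backend startup completes in milliseconds.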