Mostafa Mokhtar created IMPALA-6396:
---------------------------------------

             Summary: KrpcDataStreamRecvr should correctly report peak memory in query profile and summary
                 Key: IMPALA-6396
                 URL: https://issues.apache.org/jira/browse/IMPALA-6396
             Project: IMPALA
          Issue Type: Sub-task
          Components: Distributed Exec
            Reporter: Mostafa Mokhtar
            Assignee: Michael Ho
         Attachments: KrpcDataStreamRecvr with 5.7GB memory profile.txt

With KRPC enabled, KrpcDataStreamRecvr doesn't correctly report its memory 
usage in the query profile.

From the OOM message:
{code}
    EXCHANGE_NODE (id=22): Total=0 Peak=0
    KrpcDataStreamRecvr: Total=5.77 GB Peak=5.77 GB
    EXCHANGE_NODE (id=23): Total=0 Peak=0
    KrpcDataStreamRecvr: Total=0 Peak=35.25 MB
    EXCHANGE_NODE (id=24): Total=0 Peak=0
{code}
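
A dump like the one above can be scanned mechanically. The following is a minimal sketch, not part of Impala: the helper names (`to_bytes`, `parse_dump`) are hypothetical, and it assumes the sizes are printed in binary units (1 KB = 1024 bytes). It lists the EXCHANGE_NODE entries that report a peak of 0 even though the receivers next to them hold memory.

```python
import re

# Assumption: sizes use binary multiples (1 KB = 1024 bytes).
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}
SIZE = r"[\d.]+(?: (?:B|KB|MB|GB))?"
LINE = re.compile(
    rf"^\s*(?P<label>.+?): Total=(?P<total>{SIZE}) Peak=(?P<peak>{SIZE})\s*$"
)


def to_bytes(text):
    """Convert a size such as '35.25 MB' or '0' to a whole number of bytes."""
    number, _, unit = text.partition(" ")
    return int(float(number) * UNITS.get(unit, 1))


def parse_dump(dump):
    """Return (label, total_bytes, peak_bytes) for each tracker line."""
    rows = []
    for line in dump.splitlines():
        match = LINE.match(line)
        if match:
            rows.append(
                (match["label"], to_bytes(match["total"]), to_bytes(match["peak"]))
            )
    return rows


# The lines quoted from the OOM message in this report.
DUMP = """\
    EXCHANGE_NODE (id=22): Total=0 Peak=0
    KrpcDataStreamRecvr: Total=5.77 GB Peak=5.77 GB
    EXCHANGE_NODE (id=23): Total=0 Peak=0
    KrpcDataStreamRecvr: Total=0 Peak=35.25 MB
    EXCHANGE_NODE (id=24): Total=0 Peak=0
"""

rows = parse_dump(DUMP)
zero_peak_exchanges = [
    label
    for label, _, peak in rows
    if label.startswith("EXCHANGE_NODE") and peak == 0
]
print(zero_peak_exchanges)
```

The dump does not say which receiver belongs to which exchange node, so the sketch only flags the zero-peak nodes and makes no attempt to pair them.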

From the profile:
{code}
        EXCHANGE_NODE (id=22):(Total: 5m32s, non-child: 9s448ms, % non-child: 2.84%)
           - ConvertRowBatchTime: 3s039ms
           - PeakMemoryUsage: 0
           - RowsReturned: 124.08M (124083200)
           - RowsReturnedRate: 373.33 K/sec
          RecvrSide:
            BytesReceived(16s000ms): 8.70 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 13.08 MB, 90.83 MB, 241.70 MB, 430.76 MB, 641.32 MB, 864.95 MB, 1.05 GB, 1.22 GB, 1.36 GB, 1.52 GB, 1.69 GB, 1.86 GB, 2.01 GB, 2.18 GB, 2.34 GB, 2.51 GB, 2.70 GB, 2.88 GB, 3.09 GB, 3.29 GB, 3.49 GB, 3.64 GB
             - FirstBatchArrivalWaitTime: 0.000ns
             - TotalBytesReceived: 3.81 GB (4094818431)
             - TotalGetBatchTime: 5m29s
               - DataArrivalTimer: 5m22s
          SenderSide:
             - DeserializeRowBatchTime: 2m14s
             - NumBatchesAccepted: 5.92K (5917)
             - NumBatchesDeferred: 37 (37)
             - NumEarlySenders: 0 (0)
          Buffer pool:
             - AllocTime: 105.467ms
             - CumulativeAllocationBytes: 104.00 MB (109051904)
             - CumulativeAllocations: 52 (52)
             - PeakReservation: 104.00 MB (109051904)
             - PeakUnpinnedBytes: 0
             - PeakUsedReservation: 104.00 MB (109051904)
             - ReadIoBytes: 0
             - ReadIoOps: 0 (0)
             - ReadIoWaitTime: 0.000ns
             - WriteIoBytes: 0
             - WriteIoOps: 0 (0)
             - WriteIoWaitTime: 0.000ns
{code}

Exec summary
{code}
Operator                #Hosts   Avg Time   Max Time    #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail
---------------------------------------------------------------------------------------------------------------------------------------------------------
28:MERGING-EXCHANGE          1    0.000ns    0.000ns        0         100          0              0  UNPARTITIONED
16:TOP-N                    37   34.139us   49.928us        0         100    4.00 KB       20.72 KB
27:AGGREGATE                37   12.541ms  104.100ms        0      42.92M   76.12 KB        9.33 GB  FINALIZE
26:EXCHANGE                 37    0.000ns    0.000ns        0      42.92M          0              0  HASH(i_item_id,i_item_desc,s_store_id,s_store_name)
15:AGGREGATE                37    2.728ms   91.568ms        0      42.92M   76.12 KB        9.33 GB  STREAMING
14:HASH JOIN                37    3.255ms   75.023ms        0      42.92M    2.03 MB        1.94 MB  INNER JOIN, BROADCAST
|--25:EXCHANGE              37   66.054us   93.020us    1.50K       1.50K          0              0  BROADCAST
|  06:SCAN HDFS              1   19.992ms   19.992ms    1.50K       1.50K  349.98 KB       48.00 MB  tpcds_10000_parquet_1_rack.store
13:HASH JOIN                37    5.681ms  141.448ms        0      42.92M    2.03 MB        1.94 MB  INNER JOIN, BROADCAST
|--24:EXCHANGE              37   47.316us   76.244us    1.10K       1.12K          0              0  BROADCAST
|  05:SCAN HDFS              1   25.697ms   25.697ms    1.10K       1.12K  744.27 KB       32.00 MB  tpcds_10000_parquet_1_rack.date_dim d3
12:HASH JOIN                37    1s123ms   21s802ms        0      70.53M  108.13 MB      467.78 MB  INNER JOIN, PARTITIONED
|--23:EXCHANGE              37   60.893ms  194.422ms   11.15M      70.53M          0              0  HASH(sr_customer_sk,sr_item_sk)
|  11:HASH JOIN             37    1s202ms    1s969ms   11.15M      70.53M   80.07 MB       65.90 MB  INNER JOIN, BROADCAST
|  |--21:EXCHANGE           37   62.813ms  801.796ms  402.00K     402.00K          0              0  BROADCAST
|  |  07:SCAN HDFS          13  211.661ms  577.861ms  402.00K     402.00K   14.83 MB       48.00 MB  tpcds_10000_parquet_1_rack.item
|  10:HASH JOIN             37  130.099ms  372.693ms   11.15M      70.94M    1.99 MB        1.94 MB  INNER JOIN, BROADCAST
|  |--20:EXCHANGE           37   16.704us   31.121us      122         118          0              0  BROADCAST
|  |  04:SCAN HDFS           1   57.044ms   57.044ms      122         118  808.28 KB       48.00 MB  tpcds_10000_parquet_1_rack.date_dim d2
|  09:HASH JOIN             37    7s109ms    7s982ms   11.33M       1.20B  576.10 MB        2.02 GB  INNER JOIN, PARTITIONED
|  |--19:EXCHANGE           37  318.715ms  629.917ms  260.03M       1.71B          0              0  HASH(ss_customer_sk,ss_item_sk,ss_ticket_number)
|  |  08:HASH JOIN          37  496.198ms  735.656ms  260.03M       1.71B    2.03 MB        1.94 MB  INNER JOIN, BROADCAST
|  |  |--17:EXCHANGE        37   15.748us   36.389us       30         108          0              0  BROADCAST
|  |  |  03:SCAN HDFS        1   58.974ms   58.974ms       30         108  808.28 KB       48.00 MB  tpcds_10000_parquet_1_rack.date_dim d1
|  |  00:SCAN HDFS          37    1s440ms   12s142ms  260.03M      28.80B   12.87 MB      160.00 MB  tpcds_10000_parquet_1_rack.store_sales
|  18:EXCHANGE              37  468.946ms    1s080ms  211.94M       2.88B          0              0  HASH(sr_customer_sk,sr_item_sk,sr_ticket_number)
|  01:SCAN HDFS             37    5s688ms   11s084ms  211.94M       2.88B   46.75 MB       64.00 MB  tpcds_10000_parquet_1_rack.store_returns
22:EXCHANGE                 37    7s656ms   10s110ms    4.46B      14.40B          0              0  HASH(cs_bill_customer_sk,cs_item_sk)
02:SCAN HDFS                37   13s752ms   20s057ms    4.46B      14.40B   34.68 MB      168.00 MB  tpcds_10000_parquet_1_rack.catalog_sales
{code}


With KRPC, KrpcDataStreamRecvr and KrpcDataStreamSender can consume a lot of 
memory due to queuing and caching of allocations via FreePool, so it would help 
to have accurate reporting in the query profile. 
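
To illustrate why an exchange node can show a peak of 0 while its receiver holds gigabytes, here is a minimal sketch of a hierarchical memory tracker. It is not Impala's MemTracker API: the `Tracker` class and the `run` helper are hypothetical. It rests on one assumption drawn from the OOM dump, which prints the receiver at the same level as the exchange nodes: the receiver's tracker is a sibling of the exchange node's tracker rather than its child. Whether the eventual fix reparents the tracker or reports the receiver's peak through a separate counter is not stated in this report.

```python
class Tracker:
    """Hypothetical hierarchical memory tracker, for illustration only."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.current = 0
        self.peak = 0

    def consume(self, num_bytes):
        # Charge this tracker and every ancestor, updating each peak.
        node = self
        while node is not None:
            node.current += num_bytes
            node.peak = max(node.peak, node.current)
            node = node.parent

    def release(self, num_bytes):
        # Releasing is a negative charge; peaks are left unchanged.
        self.consume(-num_bytes)


def run(recvr_under_exchange, num_bytes):
    """Charge and release num_bytes on a receiver; return the observed peaks."""
    fragment = Tracker("fragment instance")
    exchange = Tracker("EXCHANGE_NODE", parent=fragment)
    recvr_parent = exchange if recvr_under_exchange else fragment
    recvr = Tracker("KrpcDataStreamRecvr", parent=recvr_parent)
    recvr.consume(num_bytes)
    recvr.release(num_bytes)
    return exchange.peak, recvr.peak, fragment.peak


MB = 1024 * 1024
# Receiver as a sibling of the exchange node: the node's peak stays at 0.
print(run(recvr_under_exchange=False, num_bytes=512 * MB))
# Receiver as a child of the exchange node: the node's peak reflects it.
print(run(recvr_under_exchange=True, num_bytes=512 * MB))
```

In the sibling layout the fragment-level peak is still correct, which matches the symptom here: the memory is accounted for in the OOM dump, but the per-operator PeakMemoryUsage and the exec summary show 0.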



