Mostafa Mokhtar created IMPALA-6396:
---------------------------------------
Summary: KrpcDataStreamRecvr should correctly report peak memory in query profile and summary
Key: IMPALA-6396
URL: https://issues.apache.org/jira/browse/IMPALA-6396
Project: IMPALA
Issue Type: Sub-task
Components: Distributed Exec
Reporter: Mostafa Mokhtar
Assignee: Michael Ho
Attachments: KrpcDataStreamRecvr with 5.7GB memory profile.txt

With KRPC, KrpcDataStreamRecvr does not correctly report its memory usage in the query profile.

From the OOM message:
{code}
EXCHANGE_NODE (id=22): Total=0 Peak=0
KrpcDataStreamRecvr: Total=5.77 GB Peak=5.77 GB
EXCHANGE_NODE (id=23): Total=0 Peak=0
KrpcDataStreamRecvr: Total=0 Peak=35.25 MB
EXCHANGE_NODE (id=24): Total=0 Peak=0
{code}

From the profile:
{code}
EXCHANGE_NODE (id=22):(Total: 5m32s, non-child: 9s448ms, % non-child: 2.84%)
   - ConvertRowBatchTime: 3s039ms
   - PeakMemoryUsage: 0
   - RowsReturned: 124.08M (124083200)
   - RowsReturnedRate: 373.33 K/sec
  RecvrSide:
    BytesReceived(16s000ms): 8.70 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 9.90 MB, 13.08 MB, 90.83 MB, 241.70 MB, 430.76 MB, 641.32 MB, 864.95 MB, 1.05 GB, 1.22 GB, 1.36 GB, 1.52 GB, 1.69 GB, 1.86 GB, 2.01 GB, 2.18 GB, 2.34 GB, 2.51 GB, 2.70 GB, 2.88 GB, 3.09 GB, 3.29 GB, 3.49 GB, 3.64 GB
     - FirstBatchArrivalWaitTime: 0.000ns
     - TotalBytesReceived: 3.81 GB (4094818431)
     - TotalGetBatchTime: 5m29s
       - DataArrivalTimer: 5m22s
  SenderSide:
     - DeserializeRowBatchTime: 2m14s
     - NumBatchesAccepted: 5.92K (5917)
     - NumBatchesDeferred: 37 (37)
     - NumEarlySenders: 0 (0)
  Buffer pool:
     - AllocTime: 105.467ms
     - CumulativeAllocationBytes: 104.00 MB (109051904)
     - CumulativeAllocations: 52 (52)
     - PeakReservation: 104.00 MB (109051904)
     - PeakUnpinnedBytes: 0
     - PeakUsedReservation: 104.00 MB (109051904)
     - ReadIoBytes: 0
     - ReadIoOps: 0 (0)
     - ReadIoWaitTime: 0.000ns
     - WriteIoBytes: 0
     - WriteIoOps: 0 (0)
     - WriteIoWaitTime: 0.000ns
{code}

Exec summary:
{code}
Operator               #Hosts  Avg Time   Max Time   #Rows    Est. #Rows  Peak Mem   Est. Peak Mem  Detail
---------------------------------------------------------------------------------------------------------------------------------------------------------
28:MERGING-EXCHANGE    1       0.000ns    0.000ns    0        100         0          0              UNPARTITIONED
16:TOP-N               37      34.139us   49.928us   0        100         4.00 KB    20.72 KB
27:AGGREGATE           37      12.541ms   104.100ms  0        42.92M      76.12 KB   9.33 GB        FINALIZE
26:EXCHANGE            37      0.000ns    0.000ns    0        42.92M      0          0              HASH(i_item_id,i_item_desc,s_store_id,s_store_name)
15:AGGREGATE           37      2.728ms    91.568ms   0        42.92M      76.12 KB   9.33 GB        STREAMING
14:HASH JOIN           37      3.255ms    75.023ms   0        42.92M      2.03 MB    1.94 MB        INNER JOIN, BROADCAST
|--25:EXCHANGE         37      66.054us   93.020us   1.50K    1.50K       0          0              BROADCAST
|  06:SCAN HDFS        1       19.992ms   19.992ms   1.50K    1.50K       349.98 KB  48.00 MB       tpcds_10000_parquet_1_rack.store
13:HASH JOIN           37      5.681ms    141.448ms  0        42.92M      2.03 MB    1.94 MB        INNER JOIN, BROADCAST
|--24:EXCHANGE         37      47.316us   76.244us   1.10K    1.12K       0          0              BROADCAST
|  05:SCAN HDFS        1       25.697ms   25.697ms   1.10K    1.12K       744.27 KB  32.00 MB       tpcds_10000_parquet_1_rack.date_dim d3
12:HASH JOIN           37      1s123ms    21s802ms   0        70.53M      108.13 MB  467.78 MB      INNER JOIN, PARTITIONED
|--23:EXCHANGE         37      60.893ms   194.422ms  11.15M   70.53M      0          0              HASH(sr_customer_sk,sr_item_sk)
|  11:HASH JOIN        37      1s202ms    1s969ms    11.15M   70.53M      80.07 MB   65.90 MB       INNER JOIN, BROADCAST
|  |--21:EXCHANGE      37      62.813ms   801.796ms  402.00K  402.00K     0          0              BROADCAST
|  |  07:SCAN HDFS     13      211.661ms  577.861ms  402.00K  402.00K     14.83 MB   48.00 MB       tpcds_10000_parquet_1_rack.item
|  10:HASH JOIN        37      130.099ms  372.693ms  11.15M   70.94M      1.99 MB    1.94 MB        INNER JOIN, BROADCAST
|  |--20:EXCHANGE      37      16.704us   31.121us   122      118         0          0              BROADCAST
|  |  04:SCAN HDFS     1       57.044ms   57.044ms   122      118         808.28 KB  48.00 MB       tpcds_10000_parquet_1_rack.date_dim d2
|  09:HASH JOIN        37      7s109ms    7s982ms    11.33M   1.20B       576.10 MB  2.02 GB        INNER JOIN, PARTITIONED
|  |--19:EXCHANGE      37      318.715ms  629.917ms  260.03M  1.71B       0          0              HASH(ss_customer_sk,ss_item_sk,ss_ticket_number)
|  |  08:HASH JOIN     37      496.198ms  735.656ms  260.03M  1.71B       2.03 MB    1.94 MB        INNER JOIN, BROADCAST
|  |  |--17:EXCHANGE   37      15.748us   36.389us   30       108         0          0              BROADCAST
|  |  |  03:SCAN HDFS  1       58.974ms   58.974ms   30       108         808.28 KB  48.00 MB       tpcds_10000_parquet_1_rack.date_dim d1
|  |  00:SCAN HDFS     37      1s440ms    12s142ms   260.03M  28.80B      12.87 MB   160.00 MB      tpcds_10000_parquet_1_rack.store_sales
|  18:EXCHANGE         37      468.946ms  1s080ms    211.94M  2.88B       0          0              HASH(sr_customer_sk,sr_item_sk,sr_ticket_number)
|  01:SCAN HDFS        37      5s688ms    11s084ms   211.94M  2.88B       46.75 MB   64.00 MB       tpcds_10000_parquet_1_rack.store_returns
22:EXCHANGE            37      7s656ms    10s110ms   4.46B    14.40B      0          0              HASH(cs_bill_customer_sk,cs_item_sk)
02:SCAN HDFS           37      13s752ms   20s057ms   4.46B    14.40B      34.68 MB   168.00 MB      tpcds_10000_parquet_1_rack.catalog_sales
{code}

With KRPC, KrpcDataStreamRecvr and KrpcDataStreamSender can consume a lot of memory due to queuing and caching of allocations via FreePool, so it would help to have accurate reporting in the query profile.
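To make the requested behavior concrete, here is a minimal sketch of hierarchical memory accounting in which a receiver's consumption rolls up into its exchange node, so the node's peak can no longer read 0 while the receiver holds memory. This is an illustration only: the class and method names are hypothetical and are not Impala's actual MemTracker API, and it assumes the receiver's tracker is intended to be a child of the exchange node's tracker.

```python
class MemTracker:
    """Hypothetical tracker: consumption propagates to every ancestor."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.current = 0  # bytes currently accounted for
        self.peak = 0     # high-water mark of `current`

    def consume(self, nbytes):
        # Walk up the chain so each ancestor sees the same delta
        # and updates its own high-water mark.
        tracker = self
        while tracker is not None:
            tracker.current += nbytes
            tracker.peak = max(tracker.peak, tracker.current)
            tracker = tracker.parent

    def release(self, nbytes):
        # Releasing is a negative consume; the peak is left unchanged.
        self.consume(-nbytes)


if __name__ == "__main__":
    node = MemTracker("EXCHANGE_NODE (id=22)")
    recvr = MemTracker("KrpcDataStreamRecvr", parent=node)

    recvr.consume(1000)   # e.g. queued row batches
    recvr.consume(500)    # e.g. allocations cached in a free pool
    recvr.release(1500)   # everything handed back

    # The node's peak reflects what the receiver held at its high point.
    print(node.label, node.current, node.peak)  # EXCHANGE_NODE (id=22) 0 1500
```

With accounting like this, the per-node PeakMemoryUsage counter and the Peak Mem column of the exec summary would both show the receiver's high-water mark instead of 0.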