[ 
https://issues.apache.org/jira/browse/IMPALA-13075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848720#comment-17848720
 ] 

Riza Suminto commented on IMPALA-13075:
---------------------------------------

I think I managed to reproduce the issue myself with TPC-DS Q97

 
{code:java}
set RUNTIME_FILTER_MODE=OFF;
set BATCH_SIZE=65536;
set MEM_LIMIT=149mb;

use tpcds_partitioned_parquet_snap;

with ssci as (
select ss_customer_sk customer_sk
      ,ss_item_sk item_sk
from store_sales,date_dim
where ss_sold_date_sk = d_date_sk
  and d_month_seq between 1199 and 1199 + 11
group by ss_customer_sk
        ,ss_item_sk),
csci as(
 select cs_bill_customer_sk customer_sk
      ,cs_item_sk item_sk
from catalog_sales,date_dim
where cs_sold_date_sk = d_date_sk
  and d_month_seq between 1199 and 1199 + 11
group by cs_bill_customer_sk
        ,cs_item_sk)
 select  sum(case when ssci.customer_sk is not null and csci.customer_sk is 
null then 1 else 0 end) store_only
      ,sum(case when ssci.customer_sk is null and csci.customer_sk is not null 
then 1 else 0 end) catalog_only
      ,sum(case when ssci.customer_sk is not null and csci.customer_sk is not 
null then 1 else 0 end) store_and_catalog
from ssci full outer join csci on (ssci.customer_sk=csci.customer_sk
                               and ssci.item_sk = csci.item_sk)
limit 100;{code}
Query failed with following status in profile:
{code:java}
    Query State: EXCEPTION
    Impala Query State: ERROR
    Query Status: Memory limit exceeded: 
ParquetColumnChunkReader::ReadDataPage() failed to allocate 258 bytes for 
decompressed data.
HDFS_SCAN_NODE (id=4) could not allocate 258.00 B without exceeding limit.
Error occurred on backend rsuminto-22746:27000 by fragment 
084e26373d5447b1:c381024600000006
Memory left in process limit: 11.62 GB
Memory left in query limit: -92.75 KB
Query(084e26373d5447b1:c381024600000000): memory limit exceeded. Limit=149.00 
MB Reservation=116.00 MB ReservationLimit=117.00 MB OtherMemory=33.09 MB 
Total=149.09 MB Peak=149.88 MB
  Unclaimed reservations: Reservation=51.00 MB OtherMemory=0 Total=51.00 MB 
Peak=117.00 MB
  Fragment 084e26373d5447b1:c381024600000005: Reservation=0 OtherMemory=0 
Total=0 Peak=3.56 MB
    HDFS_SCAN_NODE (id=5): Reservation=0 OtherMemory=0 Total=0 Peak=3.02 MB
    KrpcDataStreamSender (dst_id=13): Total=0 Peak=49.12 KB
      RowBatchSerialization: Total=0 Peak=6.46 KB
  CodeGen: Total=1.86 KB Peak=1.86 KB
  Fragment 084e26373d5447b1:c381024600000006: Reservation=11.06 MB 
OtherMemory=12.29 MB Total=23.36 MB Peak=28.42 MB
    AGGREGATION_NODE (id=7): Reservation=9.00 MB OtherMemory=8.79 MB 
Total=17.79 MB Peak=18.61 MB
      GroupingAggregator 0: Reservation=9.00 MB OtherMemory=452.00 KB 
Total=9.44 MB Peak=9.44 MB
        Exprs: Total=452.00 KB Peak=452.00 KB
    HASH_JOIN_NODE (id=6): Reservation=1.94 MB OtherMemory=1.64 MB Total=3.58 
MB Peak=4.58 MB
      Exprs: Total=584.00 KB Peak=584.00 KB
      Hash Join Builder (join_node_id=6): Total=584.00 KB Peak=1.07 MB
        Hash Join Builder (join_node_id=6) Exprs: Total=584.00 KB Peak=584.00 KB
    HDFS_SCAN_NODE (id=4): Reservation=128.00 KB OtherMemory=1.32 MB Total=1.44 
MB Peak=6.20 MB
    EXCHANGE_NODE (id=13): Reservation=0 OtherMemory=0 Total=0 Peak=16.00 KB
      KrpcDeferredRpcs: Total=0 Peak=0
    KrpcDataStreamSender (dst_id=14): Total=42.66 KB Peak=42.66 KB
      RowBatchSerialization: Total=0 Peak=0
  Fragment 084e26373d5447b1:c381024600000001: Reservation=0 OtherMemory=0 
Total=0 Peak=3.37 MB
    HDFS_SCAN_NODE (id=1): Reservation=0 OtherMemory=0 Total=0 Peak=2.83 MB
    KrpcDataStreamSender (dst_id=10): Total=0 Peak=49.12 KB
      RowBatchSerialization: Total=0 Peak=6.46 KB
  CodeGen: Total=1.86 KB Peak=1.86 KB
  Fragment 084e26373d5447b1:c381024600000002: Reservation=19.94 MB 
OtherMemory=16.35 MB Total=36.28 MB Peak=38.69 MB
    AGGREGATION_NODE (id=3): Reservation=17.00 MB OtherMemory=9.61 MB 
Total=26.61 MB Peak=26.61 MB
      GroupingAggregator 0: Reservation=17.00 MB OtherMemory=452.00 KB 
Total=17.44 MB Peak=17.44 MB
        Exprs: Total=452.00 KB Peak=452.00 KB
    HASH_JOIN_NODE (id=2): Reservation=1.94 MB OtherMemory=1.64 MB Total=3.58 
MB Peak=4.58 MB
      Exprs: Total=584.00 KB Peak=584.00 KB
      Hash Join Builder (join_node_id=2): Total=584.00 KB Peak=1.07 MB
        Hash Join Builder (join_node_id=2) Exprs: Total=584.00 KB Peak=584.00 KB
    HDFS_SCAN_NODE (id=0): Reservation=1.00 MB OtherMemory=4.32 MB Total=5.32 
MB Peak=7.96 MB
      Queued Batches: Total=4.32 MB Peak=5.64 MB
    EXCHANGE_NODE (id=10): Reservation=0 OtherMemory=0 Total=0 Peak=16.00 KB
      KrpcDeferredRpcs: Total=0 Peak=0
    KrpcDataStreamSender (dst_id=11): Total=254.71 KB Peak=294.71 KB
      RowBatchSerialization: Total=128.05 KB Peak=144.05 KB
  Fragment 084e26373d5447b1:c381024600000009: Reservation=34.00 MB 
OtherMemory=3.84 MB Total=37.84 MB Peak=37.84 MB
    AGGREGATION_NODE (id=9): Total=4.00 KB Peak=4.00 KB
      NonGroupingAggregator 0: Total=4.00 KB Peak=4.00 KB
        Exprs: Total=4.00 KB Peak=4.00 KB
    HASH_JOIN_NODE (id=8): Total=1.88 MB Peak=1.88 MB
      Exprs: Total=452.00 KB Peak=452.00 KB
      Hash Join Builder (join_node_id=8): Total=452.00 KB Peak=452.00 KB
        Hash Join Builder (join_node_id=8) Exprs: Total=452.00 KB Peak=452.00 KB
    AGGREGATION_NODE (id=12): Reservation=34.00 MB OtherMemory=964.00 KB 
Total=34.94 MB Peak=34.94 MB
      GroupingAggregator 0: Reservation=34.00 MB OtherMemory=452.00 KB 
Total=34.44 MB Peak=34.44 MB
        Exprs: Total=452.00 KB Peak=452.00 KB
    EXCHANGE_NODE (id=11): Reservation=64.00 KB OtherMemory=0 Total=64.00 KB 
Peak=64.00 KB
      KrpcDeferredRpcs: Total=0 Peak=0
    AGGREGATION_NODE (id=15): Reservation=0 OtherMemory=452.00 KB Total=452.00 
KB Peak=452.00 KB
      GroupingAggregator 0: Total=452.00 KB Peak=452.00 KB
        Exprs: Total=452.00 KB Peak=452.00 KB
    EXCHANGE_NODE (id=14): Reservation=0 OtherMemory=0 Total=0 Peak=0
      KrpcDeferredRpcs: Total=0 Peak=0
    KrpcDataStreamSender (dst_id=16): Total=5.12 KB Peak=5.12 KB
      RowBatchSerialization: Total=0 Peak=0
  Fragment 084e26373d5447b1:c381024600000000: Reservation=0 OtherMemory=516.00 
KB Total=516.00 KB Peak=516.00 KB
    AGGREGATION_NODE (id=17): Total=4.00 KB Peak=4.00 KB
      NonGroupingAggregator 0: Total=4.00 KB Peak=4.00 KB
        Exprs: Total=4.00 KB Peak=4.00 KB
    EXCHANGE_NODE (id=16): Reservation=0 OtherMemory=0 Total=0 Peak=0
      KrpcDeferredRpcs: Total=0 Peak=0
    PLAN_ROOT_SINK: Total=0 Peak=0
  CodeGen: Total=402.00 B Peak=402.00 B
  CodeGen: Total=26.20 KB Peak=26.20 KB
  CodeGen: Total=26.20 KB Peak=26.20 KB
  CodeGen: Total=31.71 KB Peak=31.71 KB {code}
 

> Setting very high BATCH_SIZE can blow up memory usage of fragments
> ------------------------------------------------------------------
>
>                 Key: IMPALA-13075
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13075
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 4.0.0
>            Reporter: Ezra Zerihun
>            Priority: Major
>
> In Impala 4.0, setting a very high BATCH_SIZE or near max limit of 65536 can 
> cause some fragment's memory usage to spike way past the query's defined 
> MEM_LIMIT or pool's Maximum Query Memory Limit with Clamp on. So even though 
> MEM_LIMIT is set reasonable, the query can still fail with out of memory and 
> a huge amount of memory used on fragment. Reducing BATCH_SIZE to a reasonable 
> amount or back to default will allow the query to run without issue and use 
> reasonable amount of memory within query's MEM_LIMIT or pool's Maximum Query 
> Memory Limit.
>  
> 1) set BATCH_SIZE=65536; set MEM_LIMIT=1g;
>  
> {code:java}
>     Query State: EXCEPTION
>     Impala Query State: ERROR
>     Query Status: Memory limit exceeded: Error occurred on backend ...:27000 
> by fragment ... Memory left in process limit: 145.53 GB Memory left in query 
> limit: -6.80 GB Query(...): memory limit exceeded. Limit=1.00 GB 
> Reservation=86.44 MB ReservationLimit=819.20 MB OtherMemory=7.71 GB 
> Total=7.80 GB Peak=7.84 GB   Unclaimed reservations: Reservation=8.50 MB 
> OtherMemory=0 Total=8.50 MB Peak=56.44 MB   Runtime Filter Bank: 
> Reservation=4.00 MB ReservationLimit=4.00 MB OtherMemory=0 Total=4.00 MB 
> Peak=4.00 MB   Fragment ...: Reservation=1.94 MB OtherMemory=7.59 GB 
> Total=7.59 GB Peak=7.63 GB     HASH_JOIN_NODE (id=8): Reservation=1.94 MB 
> OtherMemory=7.57 GB Total=7.57 GB Peak=7.57 GB       Exprs: Total=7.57 GB 
> Peak=7.57 GB       Hash Join Builder (join_node_id=8): Total=0 Peak=1.95 MB
> ...
>     Query Options (set by configuration): 
> BATCH_SIZE=65536,MEM_LIMIT=1073741824,CLIENT_IDENTIFIER=Impala Shell 
> v4.0.0.7.2.16.0-287 (5ae3917) built on Mon Jan  9 21:23:59 UTC 
> 2023,DEFAULT_FILE_FORMAT=PARQUET,...
> ...
>    ExecSummary:
> ...
> 09:AGGREGATE                    32     32    0.000ns    0.000ns        0      
>  4.83M   36.31 MB      212.78 MB  STREAMING                                 
> 08:HASH JOIN                    32     32    5s149ms      2m44s        0     
> 194.95M    7.57 GB        1.94 MB  RIGHT OUTER JOIN, PARTITIONED
> |--18:EXCHANGE                  32     32   93.750us    1.000ms   10.46K      
>  1.55K    1.65 MB        2.56 MB  HASH(...
> {code}
>  
>  
> 2) set BATCH_SIZE=0; set MEM_LIMIT=1g;
>  
> {code:java}
>     Query State: FINISHED
>     Impala Query State: FINISHED
> ...
>     Query Options (set by configuration and planner): 
> MEM_LIMIT=1073741824,CLIENT_IDENTIFIER=Impala Shell v4.0.0.7.2.16.0-287 
> (5ae3917) built on Mon Jan  9 21:23:59 UTC 
> 2023,DEFAULT_FILE_FORMAT=PARQUET,...
> ...
>     ExecSummary:
> ...
> 09:AGGREGATE                    32     32  593.748us   18.999ms       45      
>  4.83M    34.06 MB      212.78 MB  STREAMING
> 08:HASH JOIN                    32     32   10s873ms      5m47s   10.47K     
> 194.95M   123.48 MB        1.94 MB  RIGHT OUTER JOIN, PARTITIONED
> |--18:EXCHANGE                  32     32    0.000ns    0.000ns   10.46K      
>  1.55K   344.00 KB        1.69 MB  HASH(...
> {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscr...@impala.apache.org
For additional commands, e-mail: issues-all-h...@impala.apache.org

Reply via email to