Ezra Zerihun created IMPALA-13075:
-------------------------------------

             Summary: Setting very high BATCH_SIZE can blow up memory usage of 
fragments
                 Key: IMPALA-13075
                 URL: https://issues.apache.org/jira/browse/IMPALA-13075
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 4.0.0
            Reporter: Ezra Zerihun


In Impala 4.0, setting BATCH_SIZE very high, at or near the maximum of 65536,
can cause a fragment's memory usage to spike far past the query's MEM_LIMIT or
the pool's Maximum Query Memory Limit with Clamp on. Even when MEM_LIMIT is set
to a reasonable value, the query can still fail with an out-of-memory error
while a single fragment holds a huge amount of memory. Reducing BATCH_SIZE to a
reasonable value, or resetting it to the default, lets the query run without
issue and keeps its memory usage within the query's MEM_LIMIT or the pool's
Maximum Query Memory Limit.

 

1) set BATCH_SIZE=65536; set MEM_LIMIT=1g;

 
{code:java}
    Query State: EXCEPTION
    Impala Query State: ERROR
    Query Status: Memory limit exceeded: Error occurred on backend ...:27000 by fragment ...
    Memory left in process limit: 145.53 GB
    Memory left in query limit: -6.80 GB
    Query(...): memory limit exceeded. Limit=1.00 GB Reservation=86.44 MB ReservationLimit=819.20 MB OtherMemory=7.71 GB Total=7.80 GB Peak=7.84 GB
      Unclaimed reservations: Reservation=8.50 MB OtherMemory=0 Total=8.50 MB Peak=56.44 MB
      Runtime Filter Bank: Reservation=4.00 MB ReservationLimit=4.00 MB OtherMemory=0 Total=4.00 MB Peak=4.00 MB
      Fragment ...: Reservation=1.94 MB OtherMemory=7.59 GB Total=7.59 GB Peak=7.63 GB
        HASH_JOIN_NODE (id=8): Reservation=1.94 MB OtherMemory=7.57 GB Total=7.57 GB Peak=7.57 GB
          Exprs: Total=7.57 GB Peak=7.57 GB
          Hash Join Builder (join_node_id=8): Total=0 Peak=1.95 MB
...
    Query Options (set by configuration): BATCH_SIZE=65536,MEM_LIMIT=1073741824,CLIENT_IDENTIFIER=Impala Shell v4.0.0.7.2.16.0-287 (5ae3917) built on Mon Jan  9 21:23:59 UTC 2023,DEFAULT_FILE_FORMAT=PARQUET,...
...
   ExecSummary:
...
09:AGGREGATE                    32     32    0.000ns    0.000ns        0       4.83M   36.31 MB      212.78 MB  STREAMING
08:HASH JOIN                    32     32    5s149ms      2m44s        0     194.95M    7.57 GB        1.94 MB  RIGHT OUTER JOIN, PARTITIONED
|--18:EXCHANGE                  32     32   93.750us    1.000ms   10.46K       1.55K    1.65 MB        2.56 MB  HASH(...
{code}
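As a cross-check on the error text above, the negative "Memory left in query limit" value appears to be simply Limit minus Total on the Query(...) line. The Python sketch below parses the counters from that line and recomputes it. The helper name `parse_counters` and the use of binary unit multiples are assumptions made for illustration; this is not code from Impala.

```python
import re

# Assumption: sizes in the error text use binary multiples.
UNIT_BYTES = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}


def parse_counters(line):
    """Return {name: bytes} for every 'Name=<number> <unit>' pair in line."""
    pairs = re.findall(r"(\w+)=([\d.]+) (B|KB|MB|GB)\b", line)
    return {name: float(num) * UNIT_BYTES[unit] for name, num, unit in pairs}


# Query-level line copied from the error message in case 1.
query_line = (
    "Query(...): memory limit exceeded. Limit=1.00 GB "
    "Reservation=86.44 MB ReservationLimit=819.20 MB "
    "OtherMemory=7.71 GB Total=7.80 GB Peak=7.84 GB"
)

counters = parse_counters(query_line)

# Headroom left under MEM_LIMIT; negative means the limit was overshot.
left_gb = (counters["Limit"] - counters["Total"]) / UNIT_BYTES["GB"]
print(f"Memory left in query limit: {left_gb:.2f} GB")
```

The recomputed value matches the -6.80 GB reported in the error, and almost all of the Total is OtherMemory rather than Reservation, which is consistent with the memory being held under Exprs in HASH_JOIN_NODE (id=8).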
 

 

2) set BATCH_SIZE=0; set MEM_LIMIT=1g;

 
{code:java}
    Query State: FINISHED
    Impala Query State: FINISHED
...
    Query Options (set by configuration and planner): MEM_LIMIT=1073741824,CLIENT_IDENTIFIER=Impala Shell v4.0.0.7.2.16.0-287 (5ae3917) built on Mon Jan  9 21:23:59 UTC 2023,DEFAULT_FILE_FORMAT=PARQUET,...
...
    ExecSummary:
...
09:AGGREGATE                    32     32  593.748us   18.999ms       45       4.83M    34.06 MB      212.78 MB  STREAMING
08:HASH JOIN                    32     32   10s873ms      5m47s   10.47K     194.95M   123.48 MB        1.94 MB  RIGHT OUTER JOIN, PARTITIONED
|--18:EXCHANGE                  32     32    0.000ns    0.000ns   10.46K       1.55K   344.00 KB        1.69 MB  HASH(...
{code}
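For a rough sense of scale, the short Python sketch below compares the growth in batch size with the growth in peak memory of 08:HASH JOIN between the two runs. It assumes that BATCH_SIZE=0 means the documented default of 1024 rows on this build. With only two data points it suggests, but does not prove, that the join's peak memory grows roughly in proportion to BATCH_SIZE.

```python
# Assumption: BATCH_SIZE=0 selects the default batch size of 1024 rows.
DEFAULT_BATCH_SIZE = 1024
HIGH_BATCH_SIZE = 65536

# Peak Mem of 08:HASH JOIN from the two ExecSummary excerpts, in MB.
peak_high_mb = 7.57 * 1024   # case 1: 7.57 GB
peak_default_mb = 123.48     # case 2: 123.48 MB

batch_ratio = HIGH_BATCH_SIZE / DEFAULT_BATCH_SIZE
mem_ratio = peak_high_mb / peak_default_mb

print(f"batch size grew {batch_ratio:.0f}x")
print(f"hash join peak memory grew {mem_ratio:.1f}x")
```

The two ratios come out close to each other, which would fit a per-row allocation in the join's expression evaluation that scales with the batch size; that interpretation is a guess from these numbers, not a confirmed root cause.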
 


