Arina Ielchiieva created DRILL-4823:
---------------------------------------

             Summary: Fix OOM while trying to prune partitions with reasonable data size
                 Key: DRILL-4823
                 URL: https://issues.apache.org/jira/browse/DRILL-4823
             Project: Apache Drill
          Issue Type: Bug
          Components: Functions - Drill, Query Planning & Optimization
    Affects Versions: 1.6.0
            Reporter: Arina Ielchiieva
            Assignee: Arina Ielchiieva
             Fix For: 1.8.0


_Example query:_
{code:sql}
select  '/'||dir0||'/'||dir1||'/'||dir2||'/'||dir3 , count(*)
  FROM dfs.`/path/to/parquet/files`
  WHERE ('/'||dir0||'/'||dir1||'/'||dir2||'/'||dir3)  IN ( 
  '/2015/11/30', 
  '/2015/09/01',
  '/2015/09/02', 
  '/2015/09/03',
  '/2015/09/04',
  '/2015/09/09',  
  '/2016/03/30'
  )
  group by   '/'||dir0||'/'||dir1||'/'||dir2||'/'||dir3
  order by 1;
{code}

_Error:_
OOM while trying to prune partitions:
{noformat}
org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate buffer of size 256 due to memory limit. Current allocation: 5242880
        at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:216) ~[drill-memory-base-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.ops.BufferManagerImpl.getManagedBuffer(BufferManagerImpl.java:60) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.ops.BufferManagerImpl.getManagedBuffer(BufferManagerImpl.java:56) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.ops.QueryContext.getManagedBuffer(QueryContext.java:241) ~[drill-java-exec-1.6.0.jar:1.6.0]
        at org.apache.drill.exec.expr.fn.interpreter.InterpreterEvaluator$EvalVisitor.getManagedBufferIfAvailable(InterpreterEvaluator.java:158) ~[drill-java-exec-1.6.0.jar:1.6.0]
{noformat}

_Cause:_
The interpreter always requests a new buffer to hold each varchar/varbinary or
decimal constant value. The memory required is therefore proportional to the
number of constant expressions multiplied by the number of input rows
(partitions). This differs from evaluation with run-time generated code, where
each constant expression is evaluated once and uses only one buffer per value.

_Fix:_
Use one buffer for each unique constant value in the query.
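
_Illustration (sketch only):_
The idea behind the fix can be sketched as below. This is a minimal, self-contained illustration and not Drill's actual code: the class {{ConstantBufferCache}}, its methods, and the use of {{java.nio.ByteBuffer}} in place of Drill's managed buffers are all assumptions made for the example. It shows that when buffers are keyed by constant value, the number of allocations depends only on the number of unique constants, not on the number of partitions evaluated.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical illustration of "one buffer per unique constant value".
 * Not part of Drill; ByteBuffer stands in for a managed buffer.
 */
public class ConstantBufferCache {
    // Maps each constant value to the single buffer that holds it.
    private final Map<String, ByteBuffer> buffers = new HashMap<>();
    private int allocations = 0;

    /** Returns the buffer for a constant, allocating only the first time the value is seen. */
    public ByteBuffer bufferFor(String constant) {
        return buffers.computeIfAbsent(constant, value -> {
            allocations++;
            return ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8)).asReadOnlyBuffer();
        });
    }

    /** Number of buffers allocated so far. */
    public int allocations() {
        return allocations;
    }

    /**
     * Simulates interpreting the same constant expressions once per partition
     * and returns how many buffers were allocated in total.
     */
    public static int simulate(String[] constants, int partitions) {
        ConstantBufferCache cache = new ConstantBufferCache();
        for (int row = 0; row < partitions; row++) {
            for (String constant : constants) {
                cache.bufferFor(constant);
            }
        }
        return cache.allocations();
    }
}
```

Without such a cache, the same simulation would allocate {{constants.length * partitions}} buffers, which matches the growth described under _Cause_.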



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
