[ 
https://issues.apache.org/jira/browse/DRILL-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744390#comment-14744390
 ] 

Victoria Markman commented on DRILL-2418:
-----------------------------------------

The original error happened much later during execution. Since we started 
throwing the new error message, the memory leak is no longer reproducible.
I automated the unsupported implicit cast cases (under 
Functional/Passing/joins/implicit_cast_not_supported) and ran them in a loop 
of 20 iterations with 10 concurrent queries, and I no longer see a memory 
leak.

{code}
heap(b)         direct(b)       jvm_direct(b)
2442800296      11799156        1509996862
2207849320      11798941        1493219626
1968600792      11798941        1493219598
1737329064      11798941        1493219582
1503715872      11798941        1493219558
1282546296      11798941        1493219538
1050979952      11798941        1493203108
3357847152      11799156        1493203340
3120888640      11798941        1493203308
2885817664      11798941        1493203288
2654219216      11798941        1493203268
2421491408      11799156        1493203236
2193786112      11798941        1493203220
1957000352      11798941        1493186745
1725191640      11798941        1493186725
1500829160      11798941        1493186701
1282839416      11798941        1493186689
1061937448      11798941        1493186661
3366707496      11798941        1493186893
3131793720      11798941        1493186865
2901312272      11798941        1493186849
{code}
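
One way to check the samples above mechanically is to look at how far each direct-memory column moves over the run. The following is only a minimal sketch: the helper names (parse, spread, looks_flat) and the 1 KiB tolerance are assumptions made for illustration, not part of Drill or of the test framework. The heap column is parsed but not judged, since it rises and falls throughout the run.

```python
# Samples copied from the table above, one row per line:
# heap, direct, jvm_direct (all in bytes).
SAMPLES = """\
2442800296 11799156 1509996862
2207849320 11798941 1493219626
1968600792 11798941 1493219598
1737329064 11798941 1493219582
1503715872 11798941 1493219558
1282546296 11798941 1493219538
1050979952 11798941 1493203108
3357847152 11799156 1493203340
3120888640 11798941 1493203308
2885817664 11798941 1493203288
2654219216 11798941 1493203268
2421491408 11799156 1493203236
2193786112 11798941 1493203220
1957000352 11798941 1493186745
1725191640 11798941 1493186725
1500829160 11798941 1493186701
1282839416 11798941 1493186689
1061937448 11798941 1493186661
3366707496 11798941 1493186893
3131793720 11798941 1493186865
2901312272 11798941 1493186849
"""


def parse(text):
    """Turn whitespace-separated rows into tuples of ints."""
    return [tuple(int(v) for v in line.split())
            for line in text.splitlines() if line.strip()]


def spread(values):
    """Difference between the largest and smallest sample."""
    return max(values) - min(values)


def looks_flat(values, tolerance_bytes):
    """True if the samples stay within tolerance_bytes of each other.

    The tolerance is an arbitrary, hypothetical threshold chosen by the caller.
    """
    return spread(values) <= tolerance_bytes


rows = parse(SAMPLES)
direct = [r[1] for r in rows]
jvm_direct = [r[2] for r in rows]

direct_spread = spread(direct)                  # bytes between min and max
jvm_direct_change = jvm_direct[-1] - jvm_direct[0]  # last sample minus first

print("direct spread (bytes):", direct_spread)
print("direct flat within 1 KiB:", looks_flat(direct, 1024))
print("jvm_direct net change (bytes):", jvm_direct_change)
```

On these samples the direct column varies by only a couple of hundred bytes and jvm_direct ends lower than it started, which is consistent with the leak no longer showing up in this run.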

> Memory leak during execution if comparison function is not found
> ----------------------------------------------------------------
>
>                 Key: DRILL-2418
>                 URL: https://issues.apache.org/jira/browse/DRILL-2418
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 0.8.0
>            Reporter: Victoria Markman
>            Assignee: Chris Westin
>             Fix For: 1.2.0
>
>         Attachments: cast_tbl_1.parquet, cast_tbl_2.parquet, 
> not_supported_cast.txt
>
>
> While testing implicit casts during joins, I ran into an issue: if you 
> repeatedly run a query that throws an exception during execution, Drill 
> eventually runs out of memory.
> Here is an example query:
> {code}
> select count(*) from cast_tbl_1 a, cast_tbl_2 b where a.c_float = b.c_time
>  failed: RemoteRpcException: Failure while running fragment., Failure finding 
> function that runtime code generation expected.  Signature: 
> compare_to_nulls_high( TIME:OPTIONAL, FLOAT4:OPTIONAL ) returns INT:REQUIRED 
> [ 633c8ce3-1ed2-4a0a-8248-1e3d5b4f7c0a on atsqa4-133.qa.lab:31010 ]
> [ 633c8ce3-1ed2-4a0a-8248-1e3d5b4f7c0a on atsqa4-133.qa.lab:31010 ]
> Test_Failed: 2015/03/10 18:34:15.0015 - Failed to execute.
> {code}
> If you set planner.slice_target to 1, you hit out of memory after about 40 
> such failures on my cluster.
> {code}
> select count(*) from cast_tbl_1 a, cast_tbl_2 b where a.d38 = b.c_double
> Query failed: OutOfMemoryException: You attempted to create a new child 
> allocator with initial reservation 3000000 but only 916199 bytes of memory 
> were available.
> {code}
> From drillbit.log:
> {code}
> 2015-03-10 18:34:34,588 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.d.e.store.parquet.FooterGatherer - Fetch Parquet Footers: Executed 1 out 
> of 1 using 1 threads. Time: 1ms total, 1.190007ms avg, 1ms max.
> 2015-03-10 18:34:34,591 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.d.e.store.parquet.FooterGatherer - Fetch Parquet Footers: Executed 1 out 
> of 1 using 1 threads. Time: 0ms total, 0.953679ms avg, 0ms max.
> 2015-03-10 18:34:34,627 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
> atsqa4-136.qa.lab.  Skipping affinity to that host.
> 2015-03-10 18:34:34,627 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.d.e.s.parquet.ParquetGroupScan - Load Parquet RowGroup block maps: 
> Executed 1 out of 1 using 1 threads. Time: 1ms total, 1.609586ms avg, 1ms max.
> 2015-03-10 18:34:34,629 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.d.e.s.schedule.BlockMapBuilder - Failure finding Drillbit running on host 
> atsqa4-136.qa.lab.  Skipping affinity to that host.
> 2015-03-10 18:34:34,629 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.d.e.s.parquet.ParquetGroupScan - Load Parquet RowGroup block maps: 
> Executed 1 out of 1 using 1 threads. Time: 1ms total, 1.270340ms avg, 1ms max.
> 2015-03-10 18:34:34,684 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING --> 
> FAILED
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
> during fragment initialization: Failure while getting memory allocator for 
> fragment.
>         at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:195) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_71]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_71]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: org.apache.drill.common.exceptions.ExecutionSetupException: 
> Failure while getting memory allocator for fragment.
>         at 
> org.apache.drill.exec.ops.FragmentContext.<init>(FragmentContext.java:119) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.foreman.Foreman.setupRootFragment(Foreman.java:535)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:307) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:511) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:186) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         ... 4 common frames omitted
> Caused by: org.apache.drill.exec.memory.OutOfMemoryException: You attempted 
> to create a new child allocator with initial reservation 3000000 but only 
> 916199 bytes of memory were available.
>         at 
> org.apache.drill.exec.memory.TopLevelAllocator.getChildAllocator(TopLevelAllocator.java:121)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.ops.FragmentContext.<init>(FragmentContext.java:116) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         ... 8 common frames omitted
> 2015-03-10 18:34:34,700 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] ERROR 
> o.a.drill.exec.work.foreman.Foreman - Error 
> 96a7baf4-f17a-454c-831b-f3dc77bd4381: OutOfMemoryException: You attempted to 
> create a new child allocator with initial reservation 3000000 but only 916199 
> bytes of memory were available.
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
> during fragment initialization: Failure while getting memory allocator for 
> fragment.
>         at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:195) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_71]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_71]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: org.apache.drill.common.exceptions.ExecutionSetupException: 
> Failure while getting memory allocator for fragment.
>         at 
> org.apache.drill.exec.ops.FragmentContext.<init>(FragmentContext.java:119) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.foreman.Foreman.setupRootFragment(Foreman.java:535)
>  [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:307) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:511) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:186) 
> [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         ... 4 common frames omitted
> Caused by: org.apache.drill.exec.memory.OutOfMemoryException: You attempted 
> to create a new child allocator with initial reservation 3000000 but only 
> 916199 bytes of memory were available.
>         at 
> org.apache.drill.exec.memory.TopLevelAllocator.getChildAllocator(TopLevelAllocator.java:121)
>  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at 
> org.apache.drill.exec.ops.FragmentContext.<init>(FragmentContext.java:116) 
> ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         ... 8 common frames omitted
> 2015-03-10 18:34:34,700 [2b00c6c5-5525-ae65-25f8-24ea2d88ba2f:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - foreman cleaning up - status: 
> [0=>[0=>FragmentData [isLocal=true, status=profile {
> {code}
> I will attach a reproduction. I should add that I have no proof that the 
> error is actually causing the memory leak (this is speculation on my part).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
