[jira] [Commented] (DRILL-5370) Drillbit dies for 5 MB SELECT statement

Paul Rogers (JIRA) Mon, 20 Mar 2017 18:03:07 -0700

    [ https://issues.apache.org/jira/browse/DRILL-5370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15933918#comment-15933918 ]


Paul Rogers commented on DRILL-5370:
------------------------------------

Output:

{code}
Query Length: 5,243,141
17:49:30.583 [272f877e-f0f4-2e8b-29ef-d4220ab8dac2:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: OutOfMemoryError: Java heap space
[Error Id: 6f097f5a-908e-4c54-8d98-6cb47072c9b9 on 172.30.1.69:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: OutOfMemoryError: Java heap space
...
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: Internal error: Error while applying rule ProjectRemoveRule, args [rel#8891:DrillProjectRel.LOGICAL.ANY([]).[](input=rel#8890:Subset#1.LOGICAL.ANY([]).[],outer1_0…
...
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: OutOfMemoryError: Java heap space
{code}

This run did not end in a "catastrophic error, system shut down", but other 
runs did.

> Drillbit dies for 5 MB SELECT statement
> ---------------------------------------
>
>                 Key: DRILL-5370
>                 URL: https://issues.apache.org/jira/browse/DRILL-5370
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>
> Some community users use Drill with BI tools that generate queries. One such 
> tool generates queries that map Drill data into a "cube" format for a 
> cube-based visualization engine. Such tools tend to create very large, very 
> complex queries.
> In replicating an issue found by this user, I created a simple program that 
> creates deeply-nested queries of the form:
> SELECT a98 AS a99 FROM (SELECT a97 AS a98 FROM (… SELECT a1 FROM myTable)…))
> The test used 200 columns, each with a name 500 characters long. (Drill has 
> a hard limit of 1024 characters for a symbol name.)
> The setup was an embedded Drillbit using the new "cluster fixture" test 
> framework. The test ran multiple iterations, each wrapping the prior SELECT 
> in a new one as shown above. The result is a series of queries that grew in 
> size by about 100 KB each iteration.
> Drill handled SELECT statements up to 5 MB in size, after which the Drillbit 
> ran out of heap memory, suffered a fatal exception and exited.
> One question is why a 5 MB query exhausted multiple GB of heap during query 
> parsing and planning.
> But, more importantly, Drill should have some way to protect itself from such 
> failures. In a production cluster, heap exhaustion will bring down all 
> in-flight queries and require a manual restart of the Drillbit.
> So, Drill should enforce some limit on the amount of heap memory used by a 
> query during the parsing and planning process.
> The community user found a failure at around 1 MB, but they very likely had a 
> query with much more complex structure than the simple nested-SELECT used in 
> my test.
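
For anyone who wants to reproduce the query shape without the Drill test framework, here is a minimal sketch of a generator for the nested SELECT statements described above. It is not the original test program (which ran against an embedded Drillbit) and it only builds SQL strings. The function names, the "c<N>_xxx…" column naming scheme, and the choice to re-select the same column names at every level instead of renaming them are all assumptions made for illustration.

```python
import itertools


def column_names(count=200, width=500):
    """Return `count` distinct column names, each exactly `width` characters.

    The "c<N>_xxx..." naming scheme is hypothetical; the issue only states
    the number of columns and the length of each name.
    """
    names = []
    for i in range(count):
        prefix = f"c{i}_"
        names.append(prefix + "x" * (width - len(prefix)))
    return names


def wrap(inner_sql, names):
    """Wrap a query in one more SELECT level that re-selects every column."""
    return f"SELECT {', '.join(names)} FROM ({inner_sql})"


def nested_queries(table="myTable", count=200, width=500):
    """Yield ever-larger queries; each one wraps the previous in a new SELECT."""
    names = column_names(count, width)
    sql = f"SELECT {', '.join(names)} FROM {table}"
    while True:
        yield sql
        sql = wrap(sql, names)


if __name__ == "__main__":
    # Print the size of the first few queries to show the per-level growth.
    for depth, sql in enumerate(itertools.islice(nested_queries(), 5)):
        print(f"depth={depth} length={len(sql):,}")
```

With 200 names of 500 characters, each extra level adds roughly 200 x 500 = 100,000 characters plus separators, which matches the "about 100K each iteration" growth in the description. At that rate a query of about 5 MB corresponds to roughly 50 nesting levels.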



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
