[ 
https://issues.apache.org/jira/browse/DRILL-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-7675:
-------------------------------
    Comment: was deleted

(was: Increase parallelism to 2. Query time decreases to 3 seconds.

{code:sql}
ALTER SYSTEM SET `planner.width.max_per_node` = 2
{code}

The hash partition senders now consume 145MB each, for two threads, for nearly 
300MB total: far larger than the actual data size.

At a parallelism of 3 we run out of memory on the node as shown above. Clearly 
there is a problem somewhere.

The short-term workaround is to reduce parallelism to whatever works on the 
target setup since, even with just two fragments, the query runs decently fast.

)

> Very slow performance and memory exhaustion while querying a very small 
> dataset of Parquet files
> -------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-7675
>                 URL: https://issues.apache.org/jira/browse/DRILL-7675
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization, Storage - Parquet
>    Affects Versions: 1.18.0
>         Environment: [^sample-dataset.zip]
>            Reporter: Idan Sheinberg
>            Assignee: Paul Rogers
>            Priority: Critical
>         Attachments: sample-dataset.zip
>
>
> Per our discussion on Slack/the dev list, here are all the details and a sample 
> dataset to recreate the problematic query behavior:
>  * We are using Drill 1.18.0-SNAPSHOT built on March 6
>  * We are joining two small Parquet datasets residing on S3 using the 
> following query:
> {code:sql}
> SELECT 
>  CASE
>  WHEN tbl1.`timestamp` IS NULL THEN tbl2.`timestamp`
>  ELSE tbl1.`timestamp`
>  END AS ts, *
>  FROM `s3-store.state`.`/164` AS tbl1
>  FULL OUTER JOIN `s3-store.result`.`/164` AS tbl2
>  ON tbl1.`timestamp`*10 = tbl2.`timestamp`
>  ORDER BY ts ASC
>  LIMIT 500 OFFSET 0 ROWS
> {code}
>  * We are running Drill in a single-node setup on a 16-core, 64 GB RAM 
> machine. The Drill heap size is set to 16 GB, while max direct memory is set 
> to 32 GB.
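> For reference, a minimal way to double-check the effective heap and 
> direct-memory limits on the running Drillbit is to query Drill's system 
> tables (a sketch, assuming the sys.memory system table is available in this build):
> {code:sql}
> -- Inspect current and maximum heap/direct memory per Drillbit
> SELECT * FROM sys.memory;
> {code}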
>  * As the dataset consists of really small files, Drill has been tuned to 
> parallelize even on small row counts by tweaking the following options:
> {code:java}
> planner.slice_target = 25
> planner.width.max_per_node = 16 (to match the core count){code}
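> For reference, a sketch of the equivalent ALTER statements (assuming system 
> scope is intended; ALTER SESSION would limit the change to the current session):
> {code:sql}
> -- Lower the slice target so small inputs still get parallelized
> ALTER SYSTEM SET `planner.slice_target` = 25;
> -- Allow up to 16 minor fragments per node, matching the core count
> ALTER SYSTEM SET `planner.width.max_per_node` = 16;
> {code}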
>  * Without the above parallelization, queries on the Parquet files are very 
> slow (tens of seconds).
>  * While queries do work, we are seeing disproportionate direct memory/heap 
> utilization (up to 20 GB of direct memory used, and a minimum of 12 GB of 
> heap required).
>  * We're still encountering the occasional out-of-memory error (we're also 
> seeing heap exhaustion, but I guess that's another symptom of the same 
> problem). Reducing the node parallelization width to, say, 8 reduces memory 
> contention, though direct memory usage still reaches 8 GB:
> {code:java}
> User Error Occurred: One or more nodes ran out of memory while executing the query. (null)
>  org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query. null [Error Id: 67b61fc9-320f-47a1-8718-813843a10ecc ]
>  at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657)
>  at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:338)
>  at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
>  Caused by: org.apache.drill.exec.exception.OutOfMemoryException: null
>  at org.apache.drill.exec.vector.complex.AbstractContainerVector.allocateNew(AbstractContainerVector.java:59)
>  at org.apache.drill.exec.test.generated.PartitionerGen5$OutgoingRecordBatch.allocateOutgoingRecordBatch(PartitionerTemplate.java:380)
>  at org.apache.drill.exec.test.generated.PartitionerGen5$OutgoingRecordBatch.initializeBatch(PartitionerTemplate.java:400)
>  at org.apache.drill.exec.test.generated.PartitionerGen5.setup(PartitionerTemplate.java:126)
>  at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.createClassInstances(PartitionSenderRootExec.java:263)
>  at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.createPartitioner(PartitionSenderRootExec.java:218)
>  at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:188)
>  at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:93)
>  at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:323)
>  at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:310)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:310)
>  ... 4 common frames omitted
> {code}
> I've attached a (real!) sample dataset to match the query above. That same 
> dataset reproduces the aforementioned memory behavior.
> Help, please.
> Idan
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
