[ https://issues.apache.org/jira/browse/DRILL-7675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Rogers updated DRILL-7675:
-------------------------------
    Comment: was deleted

(was: Increase parallelism to 2. Query time decreases to 3 seconds.

{code:sql}
ALTER SYSTEM SET `planner.width.max_per_node`=1
{code}

The hash partition senders now consume 145 MB each across two threads, nearly 300 MB total: far larger than the actual data size. At a parallelism of 3 we run out of memory on the node, as shown above. Clearly there is a problem somewhere. The short-term workaround is to reduce parallelism to whatever works on the target setup since, even with just two fragments, the query runs decently fast.
)

> Very slow performance and memory exhaustion while querying a very small dataset of Parquet files
> -------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-7675
>                 URL: https://issues.apache.org/jira/browse/DRILL-7675
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization, Storage - Parquet
>    Affects Versions: 1.18.0
>         Environment: [^sample-dataset.zip]
>            Reporter: Idan Sheinberg
>            Assignee: Paul Rogers
>            Priority: Critical
>         Attachments: sample-dataset.zip
>
>
> Per our discussion on Slack and the dev list, here are all the details and a sample dataset to recreate the problematic query behavior:
> * We are using Drill 1.18.0-SNAPSHOT, built on March 6.
> * We are joining two small Parquet datasets residing on S3 using the following query:
> {code:sql}
> SELECT
>   CASE
>     WHEN tbl1.`timestamp` IS NULL THEN tbl2.`timestamp`
>     ELSE tbl1.`timestamp`
>   END AS ts, *
> FROM `s3-store.state`.`/164` AS tbl1
> FULL OUTER JOIN `s3-store.result`.`/164` AS tbl2
>   ON tbl1.`timestamp`*10 = tbl2.`timestamp`
> ORDER BY ts ASC
> LIMIT 500 OFFSET 0 ROWS
> {code}
> * We are running Drill in a single-node setup on a 16-core, 64 GB RAM machine. The Drill heap size is set to 16 GB, while max direct memory is set to 32 GB.
> * As the dataset consists of really small files, Drill has been tweaked to parallelize even on a small row count by setting the following variables (a consolidated ALTER SYSTEM sketch appears at the end of this message):
> {code}
> planner.slice_target = 25
> planner.width.max_per_node = 16 (to match the core count)
> {code}
> * Without the above parallelization, query speeds on the Parquet files are super slow (tens of seconds).
> * While queries do work, we are seeing direct memory/heap utilization that is far out of proportion to the data size (up to 20 GB of direct memory used, a minimum of 12 GB of heap required).
> * We're still encountering the occasional out-of-memory error (we're also seeing heap exhaustion, but I guess that's another indication of the same problem). Reducing the node parallelization width to, say, 8 reduces memory contention, though direct memory use still reaches 8 GB:
> {code:java}
> User Error Occurred: One or more nodes ran out of memory while executing the query.
> (null)
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
> null
> [Error Id: 67b61fc9-320f-47a1-8718-813843a10ecc ]
>     at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:657)
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:338)
>     at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: null
>     at org.apache.drill.exec.vector.complex.AbstractContainerVector.allocateNew(AbstractContainerVector.java:59)
>     at org.apache.drill.exec.test.generated.PartitionerGen5$OutgoingRecordBatch.allocateOutgoingRecordBatch(PartitionerTemplate.java:380)
>     at org.apache.drill.exec.test.generated.PartitionerGen5$OutgoingRecordBatch.initializeBatch(PartitionerTemplate.java:400)
>     at org.apache.drill.exec.test.generated.PartitionerGen5.setup(PartitionerTemplate.java:126)
>     at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.createClassInstances(PartitionSenderRootExec.java:263)
>     at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.createPartitioner(PartitionSenderRootExec.java:218)
>     at org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:188)
>     at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:93)
>     at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:323)
>     at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:310)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:310)
>     ... 4 common frames omitted
> {code}
> I've attached a (real!) sample dataset to match the query above. That same dataset recreates the aforementioned memory behavior.
> Help, please.
> Idan
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
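
For reference, a minimal sketch (not part of the original report) of how the planner options quoted above would typically be applied from a Drill SQL session. The option names and values are the ones the reporter lists; the commented statement reflects the lower-width workaround from the deleted comment.

{code:sql}
-- Sketch only: the reporter's settings expressed as Drill ALTER SYSTEM statements.

-- Parallelize even small inputs by lowering the rows-per-slice target:
ALTER SYSTEM SET `planner.slice_target` = 25;

-- Allow up to one minor fragment per core on the 16-core node:
ALTER SYSTEM SET `planner.width.max_per_node` = 16;

-- Short-term workaround per the deleted comment: reduce the per-node width
-- (e.g. to 2) until the partition-sender memory behavior is resolved.
-- ALTER SYSTEM SET `planner.width.max_per_node` = 2;
{code}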