[ https://issues.apache.org/jira/browse/DRILL-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168286#comment-16168286 ]
Paul Rogers commented on DRILL-5478:
------------------------------------

First, in your tests, without any changes, what size are the spill files? Then, when you change the size, what size are the files?

Files are approximately equal to the requested size. But, due to Drill's internal fragmentation and variety of memory layouts, it is hard to nail the requested size exactly.

> Spill file size parameter is not honored by the managed external sort
> ---------------------------------------------------------------------
>
>                 Key: DRILL-5478
>                 URL: https://issues.apache.org/jira/browse/DRILL-5478
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.10.0
>            Reporter: Rahul Challapalli
>            Assignee: Paul Rogers
>             Fix For: 1.12.0
>
>
> git.commit.id.abbrev=1e0a14c
> Query:
> {code}
> ALTER SESSION SET `exec.sort.disable_managed` = false;
> alter session set `planner.width.max_per_node` = 1;
> alter session set `planner.disable_exchanges` = true;
> alter session set `planner.width.max_per_query` = 1;
> alter session set `planner.memory.max_query_memory_per_node` = 1052428800;
> alter session set `planner.enable_decimal_data_type` = true;
> select count(*) from (
>   select * from dfs.`/drill/testdata/resource-manager/all_types_large` d1
>   order by d1.map.missing
> ) d;
> {code}
> Boot Options (spill file size is set to 256MB)
> {code}
> 0: jdbc:drill:zk=10.10.100.190:5181> select * from sys.boot where name like '%spill%';
> +-------------------------------------------------+--------+------+--------+---------+-------------------------------------------+----------+-----------+
> | name                                            | kind   | type | status | num_val | string_val                                | bool_val | float_val |
> +-------------------------------------------------+--------+------+--------+---------+-------------------------------------------+----------+-----------+
> | drill.exec.sort.external.spill.directories      | STRING | BOOT | BOOT   | null    | [ # drill-override.conf: 26 "/tmp/test" ] | null     | null      |
> | drill.exec.sort.external.spill.file_size        | STRING | BOOT | BOOT   | null    | "256M"                                    | null     | null      |
> | drill.exec.sort.external.spill.fs               | STRING | BOOT | BOOT   | null    | "maprfs:///"                              | null     | null      |
> | drill.exec.sort.external.spill.group.size       | LONG   | BOOT | BOOT   | 40000   | null                                      | null     | null      |
> | drill.exec.sort.external.spill.merge_batch_size | STRING | BOOT | BOOT   | null    | "16M"                                     | null     | null      |
> | drill.exec.sort.external.spill.spill_batch_size | STRING | BOOT | BOOT   | null    | "8M"                                      | null     | null      |
> | drill.exec.sort.external.spill.threshold        | LONG   | BOOT | BOOT   | 40000   | null                                      | null     | null      |
> +-------------------------------------------------+--------+------+--------+---------+-------------------------------------------+----------+-----------+
> {code}
> Below are the spill files while the query is still executing. The size of the spill files is ~34MB
> {code}
> -rwxr-xr-x 3 root root 34957815 2017-05-05 11:26 /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run1
> -rwxr-xr-x 3 root root 34957815 2017-05-05 11:27 /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run2
> -rwxr-xr-x 3 root root        0 2017-05-05 11:27 /tmp/test/26f33c36-4235-3531-aeaa-2c73dc4ddeb5_major0_minor0_op5_sort/run3
> {code}
> The data set is too large to attach here. Reach out to me if you need anything

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)