Rahul Challapalli created DRILL-5472:
----------------------------------------
Summary: Parquet reader generating low-density batches causing Sort operator to spill un-necessarily
Key: DRILL-5472
URL: https://issues.apache.org/jira/browse/DRILL-5472
Project: Apache Drill
Issue Type: Bug
Components: Execution - Relational Operators, Storage - Parquet
Reporter: Rahul Challapalli
Assignee: Paul Rogers

git.commit.id.abbrev=1e0a14c

The Parquet file used in the query below is ~20 MB on disk; its uncompressed size is ~1.2 GB. The query contains a sort that is given ~6 GB of memory for a single fragment, yet the sort still spills.

{code}
select * from (select * from dfs.`/drill/testdata/resource-manager/all_types_large` s order by s.missing12.x) d where d.missing3 is false;
{code}

The profile indicates that the above query spilled twice. The profile and the logs are attached.
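
To put the reported sizes in perspective, here is a rough back-of-the-envelope sketch in Python. It is not Drill code. It assumes a simplified model in which the sort spills once the memory held by its buffered batches exceeds its budget, and in which a batch whose buffers are only partly filled occupies `data / density` bytes. The function names are hypothetical; only the ~1.2 GB and ~6 GB figures come from the report.

```python
# Illustrative arithmetic only. The sizes are from the report;
# the memory model (footprint = data / density) is an assumption.

GIB = 1024 ** 3
DATA_BYTES = 1.2 * GIB        # ~1.2 GB of uncompressed data (from the report)
SORT_BUDGET_BYTES = 6 * GIB   # ~6 GB given to the sort (from the report)


def buffered_bytes(data_bytes, density):
    """Memory held by batches whose buffers are only `density` full (0 < density <= 1)."""
    if not 0 < density <= 1:
        raise ValueError("density must be in (0, 1]")
    return data_bytes / density


def break_even_density(data_bytes, budget_bytes):
    """Average density below which the data no longer fits in the budget."""
    return data_bytes / budget_bytes


if __name__ == "__main__":
    threshold = break_even_density(DATA_BYTES, SORT_BUDGET_BYTES)
    print(f"break-even density: {threshold:.0%}")
    for density in (1.0, 0.5, 0.1):
        needed = buffered_bytes(DATA_BYTES, density)
        verdict = "exceeds budget" if needed > SORT_BUDGET_BYTES else "fits"
        print(f"density {density:.0%}: {needed / GIB:.1f} GiB buffered, {verdict}")
```

Under this simplified model, the data would exceed the sort's budget only if the incoming batches were on average less than about 20% full, which is consistent with the low-density batches named in the summary.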