[jira] [Created] (DRILL-5472) Parquet reader generating low-density batches causing Sort operator to spill un-necessarily

Rahul Challapalli (JIRA) Thu, 04 May 2017 09:50:34 -0700

Rahul Challapalli created DRILL-5472:
----------------------------------------


             Summary: Parquet reader generating low-density batches causing 
Sort operator to spill un-necessarily
                 Key: DRILL-5472
                 URL: https://issues.apache.org/jira/browse/DRILL-5472
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators, Storage - Parquet
            Reporter: Rahul Challapalli
            Assignee: Paul Rogers


git.commit.id.abbrev=1e0a14c

The Parquet file used in the query below is ~20 MB on disk; its uncompressed
size is ~1.2 GB. The query contains a sort that is given ~6 GB of memory for a
single fragment, and yet the sort spills.
{code}
select * from (
  select * from dfs.`/drill/testdata/resource-manager/all_types_large` s
  order by s.missing12.x
) d where d.missing3 is false;
{code}

The profile shows that this query spilled twice. The profile and the logs are
attached.
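
As a rough sanity check on the sizes quoted in this report, the sketch below
estimates how sparse the reader's batches would have to be for ~1.2 GB of data
to overflow a ~6 GB sort budget. It rests on a deliberately simplified
assumption: the sort spills only once the memory held by incoming batches
exceeds its budget, and "density" means the fraction of allocated vector memory
that actually carries data. The function names and this definition of density
are illustrative only and are not taken from the Drill code base; the real sort
has additional overheads and thresholds that this ignores.

```python
# Back-of-the-envelope sketch using the approximate figures from this report.
# Assumption: batches occupy data_size / density bytes in memory, where
# density is the fraction of allocated vector space that holds real data.

UNCOMPRESSED_GB = 1.2  # approximate uncompressed size of the Parquet data
SORT_BUDGET_GB = 6.0   # approximate memory given to the sort in one fragment


def in_memory_footprint_gb(data_gb: float, density: float) -> float:
    """Memory held by the batches for a given density in (0, 1]."""
    if not 0.0 < density <= 1.0:
        raise ValueError("density must be in (0, 1]")
    return data_gb / density


def spill_density_threshold(data_gb: float, budget_gb: float) -> float:
    """Density below which the batches no longer fit in the budget."""
    return data_gb / budget_gb


if __name__ == "__main__":
    threshold = spill_density_threshold(UNCOMPRESSED_GB, SORT_BUDGET_GB)
    print(f"batches overflow the budget below ~{threshold:.0%} density")
    for density in (1.0, 0.5, 0.1):
        footprint = in_memory_footprint_gb(UNCOMPRESSED_GB, density)
        print(f"density {density:.0%}: ~{footprint:.1f} GB in memory")
```

Under these assumptions the data should fit with room to spare at full density,
so a spill suggests that the batches reaching the sort are mostly empty space.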



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
