GitHub user paul-rogers opened a pull request:
https://github.com/apache/drill/pull/761
DRILL-5284: Roll-up of final fixes for managed sort
See subtasks for details.
* Provide detailed, accurate estimate of size consumed by a record batch
* Managed external sort spills too often with Parquet data
* Managed External Sort fails with OOM
* External sort refers to the deprecated HDFS fs.default.name param
* Config param drill.exec.sort.external.batch.size is not used
* NPE in managed external sort while spilling to disk
* External Sort BatchGroup leaks memory if an OOM occurs during read
* Ensure at least two batches are merged in low-memory conditions
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/paul-rogers/drill DRILL-5284
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/drill/pull/761.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #761
----
commit 5558e9439805d595cfc9625591a276385454625f
Author: Paul Rogers <[email protected]>
Date: 2017-02-24T18:31:25Z
DRILL-5284: Roll-up of final fixes for managed sort
See subtasks for details.
* Provide detailed, accurate estimate of size consumed by a record batch
* Managed external sort spills too often with Parquet data
* Managed External Sort fails with OOM
* External sort refers to the deprecated HDFS fs.default.name param
* Config param drill.exec.sort.external.batch.size is not used
* NPE in managed external sort while spilling to disk
* External Sort BatchGroup leaks memory if an OOM occurs during read
commit 0028f26fef5d9b462700a28b689d47241ee3a1ce
Author: Paul Rogers <[email protected]>
Date: 2017-02-24T20:45:18Z
Fix for DRILL-5294
Under certain low-memory conditions, the sort must be forced to merge
at least two batches to make progress, even though two batches need
slightly more memory than comfortably fits.
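The forced merge described above can be sketched roughly as follows. This is
a minimal illustration in Java, assuming the sort knows how much memory is
available for merging and has a size estimate per batch. The class
MergeWidth, the method mergeWidth, and its parameters are hypothetical names
chosen for this sketch; they are not taken from the Drill code base.

```java
public final class MergeWidth {
  private MergeWidth() {}

  /**
   * Returns how many spilled batches to merge in one pass.
   * Hypothetical helper for illustration only, not Drill's actual logic.
   *
   * @param availableBytes memory the merge may use
   * @param batchBytes     estimated memory needed per batch (must be positive)
   * @param spilledBatches number of batches waiting to be merged
   */
  public static int mergeWidth(long availableBytes, long batchBytes, int spilledBatches) {
    if (batchBytes <= 0) {
      throw new IllegalArgumentException("batchBytes must be positive");
    }
    if (spilledBatches < 2) {
      return spilledBatches; // zero or one batch: nothing to merge
    }
    // Number of batches that fit comfortably in the available memory.
    long fits = availableBytes / batchBytes;
    // Merging fewer than two batches makes no progress, so use two as a
    // floor even if that slightly exceeds the comfortable memory budget.
    long width = Math.max(2L, fits);
    // Never ask for more batches than actually exist.
    return (int) Math.min(width, (long) spilledBatches);
  }
}
```

With room for only one batch, for example mergeWidth(100, 80, 5), the sketch
still returns a width of two, which is the behavior the commit message calls
for.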
----