GitHub user paul-rogers commented on the issue: https://github.com/apache/drill/pull/717 Some comments got lost in the force-push. One was related to the output batch size, suggesting we cap it at 16 MB. The reason is that value vectors above 16 MB cause memory fragmentation. A later fix will limit an output batch to either 64K rows (the size of an sv2) or the row count at which the largest vector reaches 16 MB, whichever is smaller. The most recent commit added per-column size information so that we can enforce this limit. For example, 64K rows of a column that is 256 bytes wide fit exactly within a 16 MB vector. There is no reason not to allow 64K rows even when each row has four such 256-byte columns: the total batch size would be 64 MB, but no single vector would be above 16 MB. That fix will be offered, along with tests and enabling the managed sort by default, in a subsequent PR.
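To make the arithmetic concrete, here is a minimal sketch of the proposed limit. It is not Drill code: the class, method, and constant names are hypothetical, and it assumes fixed per-row column widths in bytes and that a vector of exactly 16 MB is acceptable, as in the 256-byte example.

```java
public class BatchSizeSketch {
    // Hypothetical constants mirroring the limits described in the comment.
    static final int MAX_ROWS = 64 * 1024;                   // 64K rows, the sv2 limit
    static final long MAX_VECTOR_BYTES = 16L * 1024 * 1024;  // 16 MB per vector

    // Row limit for a batch: capped at 64K rows, and reduced further
    // if the widest column would otherwise push its vector past 16 MB.
    static int rowLimit(int[] columnWidths) {
        int widest = 0;
        for (int width : columnWidths) {
            widest = Math.max(widest, width);
        }
        if (widest <= 0) {
            return MAX_ROWS;
        }
        long rowsThatFit = MAX_VECTOR_BYTES / widest;
        return (int) Math.max(1, Math.min(MAX_ROWS, rowsThatFit));
    }

    // Total batch size is the sum over all vectors, so it may exceed 16 MB.
    static long batchBytes(int[] columnWidths, int rows) {
        long total = 0;
        for (int width : columnWidths) {
            total += (long) width * rows;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] widths = {256, 256, 256, 256};  // four 256-byte columns
        int rows = rowLimit(widths);
        System.out.println("rows = " + rows);
        System.out.println("largest vector bytes = " + 256L * rows);
        System.out.println("batch bytes = " + batchBytes(widths, rows));
    }
}
```

With four 256-byte columns the sketch allows the full 64K rows: each vector is exactly 16 MB and the batch totals 64 MB. A single 512-byte column would halve the row limit to 32K.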