[ 
https://issues.apache.org/jira/browse/DRILL-5211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052570#comment-16052570
 ] 

Paul Rogers commented on DRILL-5211:
------------------------------------

In reference to the use case mentioned above, a special case to consider for 
this project is the one in which the input row contains an array, and that 
array is larger than 16 MB in size. In this case, we can't even fit a single 
record into a 16 MB vector. Possible solutions:

* Forbid the case: fail for "oversize" arrays.
* Create the array as a special "chained" set of vectors, in which each vector 
contains a block of entries. This adds lots of complexity to downstream 
operators and will likely slow that code as more bounds checks are needed.
* Push the flatten operation into the JSON reader (etc.) so that we don't read 
the jumbo array and then flatten it, but rather flatten as we read, so that each 
record stays a reasonable size (see the sketch below).
* Other solutions?
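
As a rough illustration of the third option only: the reader emits one output 
row per array element as it parses, so batch size is bounded by a row-count 
limit rather than by the length of the incoming array. The interface and names 
below are hypothetical, not Drill's actual scan APIs.

{code:java}
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: flatten a jumbo array while reading, so no batch
// ever has to hold the entire array in a single value vector.
class FlatteningReader {
  static final int MAX_ROWS_PER_BATCH = 4096; // bound rows, not array length

  // Streams array elements one at a time instead of materializing the array.
  interface ArrayElementParser extends Iterator<String> { }

  // Fills one batch; the next call resumes where the parser stopped,
  // so an arbitrarily large source array spans many small batches.
  void readBatch(ArrayElementParser elements, List<String> batch) {
    batch.clear();
    while (batch.size() < MAX_ROWS_PER_BATCH && elements.hasNext()) {
      batch.add(elements.next());
    }
  }
}
{code}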

> Queries fail due to direct memory fragmentation
> -----------------------------------------------
>
>                 Key: DRILL-5211
>                 URL: https://issues.apache.org/jira/browse/DRILL-5211
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.9.0
>
>         Attachments: ApacheDrillMemoryFragmentationBackground.pdf, 
> ApacheDrillVectorSizeLimits.pdf, EnhancedScanOperator.pdf, 
> ScanSchemaManagement.pdf
>
>
> Consider a test of the external sort as follows:
> * Direct memory: 3GB
> * Input file: 18 GB, with one Varchar column of 8K width
> The sort runs, spilling to disk. Once all data arrives, the sort begins to 
> merge the results. But, to do that, it must first do an intermediate merge. 
> For example, in this sort, there are 190 spill files, but only 19 can be 
> merged at a time. (Each merge file contains 128 MB batches, and only 19 can 
> fit in memory, giving a total footprint of 2.5 GB, well below the 3 GB limit.)
> Yet, when loading batch xx, Drill fails with an OOM error. At that point, 
> total available direct memory is 3,817,865,216. (Obtained from {{maxMemory}} 
> in the {{Bits}} class in the JDK.)
> It appears that Drill wants to allocate 58,257,868 bytes, but the 
> {{totalCapacity}} (again in {{Bits}}) is already 3,800,769,206, causing an 
> OOM.
> The problem is that, at this point, the external sort should not ask the 
> system for more memory. The allocator for the external sort is at just 
> 1,192,350,366 before the allocation request. Plenty of spare memory should be 
> available, released when the in-memory batches were spilled to disk prior to 
> merging. Indeed, earlier in the run, the sort had reached a peak memory usage 
> of 2,710,716,416 bytes. This memory should be available for reuse during 
> merging, and is more than sufficient to satisfy the particular request in question.
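
For reference, the failing check amounts to a simple bookkeeping comparison in 
the JDK: a direct allocation is admitted only while the running total of 
reserved capacity plus the new request stays under the configured limit, 
regardless of how much of that reservation the requesting operator's own 
allocator holds. A simplified sketch of that accounting (illustrative only, 
not the actual {{Bits}} implementation), using the numbers above:

{code:java}
// Simplified sketch of JDK direct-memory accounting; illustrative only,
// not the actual java.nio.Bits code.
class DirectMemoryAccounting {
  private final long maxMemory;   // 3,817,865,216 in the report
  private long totalCapacity;     // 3,800,769,206 just before the failure

  DirectMemoryAccounting(long maxMemory) {
    this.maxMemory = maxMemory;
  }

  synchronized void reserve(long size) {
    // 3,800,769,206 + 58,257,868 = 3,859,027,074 > 3,817,865,216 -> OOM,
    // even though the sort's own allocator sits at only ~1.19 GB.
    if (totalCapacity + size > maxMemory) {
      throw new OutOfMemoryError("Direct buffer memory");
    }
    totalCapacity += size;
  }

  synchronized void release(long size) {
    totalCapacity -= size;
  }
}
{code}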



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
