Parth notes: Also note that memory allocations by Netty greater than the 16 MB chunk size are returned to the OS when the memory is freed. Both this document and the original document on memory fragmentation state incorrectly that such memory is not released back to the OS. A quick thought experiment - where does this memory go if it is not released back to the OS?
This is true. If the original docs said otherwise, then it is an error for which I apologize. If this were not true, we'd have lots of memory leaks, which we'd have found and fixed. So, clearly memory is returned.

The issue is not whether memory is returned to the OS. Rather, it is the fragmentation that occurs when most memory is on the Netty free list and we want to get a large chunk from the OS. We can run out of memory even when lots is free (in Netty).

The original jemalloc paper talks about an algorithm to return unused memory to the OS; perhaps we can add that to our own Netty-based allocator. We'd want to be clever, however, because allocations from the OS are 1000 times slower than allocations from the Netty free list, or at least that was true in a prototype I did a year ago on the Mac.

Further, in the general case, even Netty is not a panacea. Even if we keep blocks to 16 MB and smaller, doing random-sized allocations in random order will cause Netty fragmentation: we might want a 16 MB block, half of memory might be free, but due to historical alloc/free patterns, all of the free memory is in 8 MB blocks and so the allocation fails.

Java avoids this issue because it compacts free heap space. I'd guess we don't really want to try to implement that for direct memory. This is why DBs generally use fixed-size allocations: doing so completely avoids memory fragmentation issues.

One of the goals of the recent "result set loader" work is to encapsulate all vector accesses in a higher-level abstraction so that, eventually, we can try alternative memory layouts with minimal impact on the rest of the Drill code. (The column reader and writer layer isolates code from the actual vector APIs and memory layout.)

Thanks,
- Paul
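
P.S. The fragmentation scenario described above can be illustrated with a toy sketch. It is a simple first-fit allocator over one contiguous range, not Netty's actual algorithm; the `Arena` class, the 64 MB arena size, and the alloc/free pattern are all assumptions made up for illustration. It shows half of memory free while a 16 MB request still fails, because the free space is split into holes smaller than the request.

```python
class Arena:
    """Toy first-fit allocator over a contiguous range (units: MB).

    Hypothetical illustration only; this is not how Netty manages chunks.
    """

    def __init__(self, size_mb):
        self.size = size_mb
        self.used = {}  # start offset -> block length

    def free_runs(self):
        """Return (start, length) for each contiguous free run, in address order."""
        runs, cursor = [], 0
        for start in sorted(self.used):
            if start > cursor:
                runs.append((cursor, start - cursor))
            cursor = start + self.used[start]
        if cursor < self.size:
            runs.append((cursor, self.size - cursor))
        return runs

    def alloc(self, length):
        """Return the start offset of a new block, or None if no run is large enough."""
        for start, run_len in self.free_runs():
            if run_len >= length:
                self.used[start] = length
                return start
        return None

    def free(self, start):
        """Release the block that begins at the given offset."""
        del self.used[start]

    def total_free(self):
        return sum(length for _, length in self.free_runs())


arena = Arena(64)
blocks = [arena.alloc(8) for _ in range(8)]  # fill the arena with 8 MB blocks
for start in blocks[::2]:                    # free every other block
    arena.free(start)

half_free = arena.total_free()               # total free space, in MB
largest_run = max(length for _, length in arena.free_runs())
big = arena.alloc(16)                        # needs 16 MB of contiguous space
small = arena.alloc(8)                       # same size as the holes

print(half_free, largest_run, big, small)
```

With only 8 MB requests, any free hole satisfies any request, which is the property that fixed-size allocation relies on.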