Re: std.allocator needs your help

Brad Anderson Tue, 24 Sep 2013 09:41:28 -0700

On Tuesday, 24 September 2013 at 08:46:36 UTC, Dmitry Olshansky wrote:

23-Sep-2013 03:49, Andrei Alexandrescu wrote:
Hello,
I am making good progress on the design of std.allocator, and I am
optimistic about the way it turns out. D's introspection capabilities
really shine through, and in places the design does feel really
archetypal - e.g. "this is the essence of a freelist allocator". It's a
very good feeling. The overall inspiration comes from Berger's
HeapLayers, but D's introspection takes that pattern to a whole new level.
Several details are still in flux, but at the top level it seems most
natural to divide things in two tiers:
1. Typed allocator, i.e. every request for allocation comes with the
exact type requested;

2. Untyped allocator - traffics exclusively in ubyte[].
Looks good (s/ubyte[]/void[] per current discussion).
Do you imagine Typed allocators as something more than adapters that simplify a common pattern of allocate + emplace / destroy + deallocate? (create!T/destroy!T)
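To make the "adapter" reading concrete, here is a minimal sketch of a typed layer that is nothing more than allocate + emplace and destroy + deallocate. It is not taken from the std.allocator draft: the Mallocator shown is a stand-in built on C malloc, it uses void[] as suggested above, and the names create/dispose are hypothetical (dispose is used instead of destroy!T only to avoid clashing with D's built-in destroy).

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

// Hypothetical untyped allocator backed by C malloc; traffics in void[].
struct Mallocator
{
    void[] allocate(size_t n)
    {
        auto p = malloc(n);
        return p is null ? null : p[0 .. n];
    }

    void deallocate(void[] b) { free(b.ptr); }
}

// Typed adapter, part 1: allocate raw memory, then construct a T in it.
T* create(T, Allocator, Args...)(ref Allocator a, auto ref Args args)
    if (is(T == struct))
{
    void[] block = a.allocate(T.sizeof);
    if (block is null) return null;
    return emplace!T(block, args);
}

// Typed adapter, part 2: run the destructor, then hand the memory back.
void dispose(T, Allocator)(ref Allocator a, T* p)
{
    if (p is null) return;
    destroy(*p);
    a.deallocate((cast(void*) p)[0 .. T.sizeof]);
}

// Example payload type (hypothetical).
struct Point
{
    int x, y;
    this(int x, int y) { this.x = x; this.y = y; }
}

unittest
{
    Mallocator m;
    Point* p = create!Point(m, 3, 4);
    assert(p !is null);
    assert(p.x == 3 && p.y == 4);
    dispose(m, p);
}
```

Compile with dmd -unittest -main to run the check. In this shape the typed layer carries no state of its own, which is exactly the "simple helper" case; a tracing collect() would need more than this.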
struct NullAllocator
{
    enum alignment = real.alignof;
    enum size_t available = 0;
    ubyte[] allocate(size_t s) shared { return null; }
    bool expand(ref ubyte[] b, size_t minDelta, size_t maxDelta) shared
    { assert(b is null); return false; }
    bool reallocate(ref ubyte[] b, size_t) shared
    { assert(b is null); return false; }
    void deallocate(ubyte[] b) shared { assert(b is null); }
    void collect() shared { }
    void deallocateAll() shared { }
    static shared NullAllocator it;
}

Primitives:
First things first - can't the allocator return alignment as a run-time value - a property (just like 'available' does)? The implementation contract is that it must be an O(1) vanilla syscall-free function. (Read this as: consult system info exactly once, at construction.)
Thinking more of it - it also may be immutable size_t? Then it gets its proper value at construction and is never changed.
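A minimal sketch of that idea, assuming a POSIX system and a page-granular allocator: the alignment is an immutable size_t filled in once at construction from sysconf(_SC_PAGESIZE), so every later read is a plain field access. PageAllocator and make() are hypothetical names, and only the alignment property is shown.

```d
import core.sys.posix.unistd : sysconf, _SC_PAGESIZE;

// Hypothetical page-granular allocator; only the alignment property is sketched.
struct PageAllocator
{
    // Set once at construction and never changed afterwards.
    immutable size_t alignment;

    private this(size_t a) { alignment = a; }

    // The only place that consults the OS (POSIX only).
    static PageAllocator make()
    {
        return PageAllocator(cast(size_t) sysconf(_SC_PAGESIZE));
    }
}

unittest
{
    auto a = PageAllocator.make();
    assert(a.alignment > 0);
    // Page sizes are powers of two.
    assert((a.alignment & (a.alignment - 1)) == 0);
}
```

The trade-off against enum alignment is that generic code can no longer branch on the value at compile time.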
* expand(b, minDelta, maxDelta) grows b's length by at least minDelta
(and on a best-effort basis by at least maxDelta) and returns true, or
does nothing and returns false. In most allocators this should be @safe.
(One key insight is that expand() can be made @safe whereas shrink() or
realloc() are most often not; such mini-epiphanies are very exciting
because they put the design on a beam guide with few principles and many
consequences.) The precondition is that b is null or has been previously
returned by allocate(). This method is optional.
Great to see this primitive. Classic malloc-ators are so lame...
(+ WinAPI Heaps fits here)
* deallocate(b) deallocates the memory allocated for b. b must have been
previously allocated by the same allocator. This method is usually
unsafe (but there are notable implementations that may offer safety,
such as unbounded freelists.) This method is optional.
Does the following implication hold: "have a deallocate" --> must be manually managed? Otherwise how would one reliably work with it and not leak? This brings us to traits that allocators may (should) have, a la: automatic? free-all on termination? Zeros on allocate (more common than one would think)? etc.
* deallocateAll() deallocates in one shot all memory previously
allocated by this allocator. This method is optional, and when present
is almost always unsafe (I see no meaningful @safe implementation.)
Region allocators are notable examples of allocators that define this
method.
Okay.. I presume region "mark" works by splitting off a subinstance of the allocator and "release" by deallocateAll().
* collect() frees unused memory that had been allocated with this
allocator. This optional method is implemented by tracing collectors and
is usually @safe.
This seems hard and/or suboptimal w/o typeinfo --> typed allocators? I guess they could be more than a simple helper.
There are quite a few more things to define more precisely, but this
part of the design has become quite stable. To validate the design, I've
defined some simple allocators (Mallocator, GCAllocator, Freelist,
StackRegion, Region etc) but not the more involved ones, such as
coalescing heaps, buddy system etc.
The problem I see is that allocators are rooted in a poor foundation - malloc is so out of fashion (its interface is simply too thin on guarantees), sbrk is frankly a stone-age syscall.
I would love to make a root allocator on top of mmap/VirtualAlloc.
This leads me to my biggest problem with classical memory management - ~20 years of PCs having virtual memory support directly in the CPU, and yet hardly anything outside of the OS takes advantage of it. They (allocators) seem simply not aware of it - nothing in the interface implies anything that would help the user take advantage of it. I'm talking of (potentially) large allocations that may be often resized.
(large allocations are actually almost always a blind spot/fallback in allocators)
To the point - we have address space and optionally memory committed in there, leading us to a 3-state of an _address range_: available, reserved, committed. The ranges of the 2nd state are dirt cheap (abundant on 64bit) and could be easily flipped to committed and back.
So I have a pattern I want to try to fit in your design. At present it is a data structure + allocation logic. What I would want is clean and nice composition for ultimate reuse.
An example is a heavy-duty array (could be used for a potentially large stack). The requirements are:
a) Never reallocate on resize - in certain cases it may even be too large to reallocate in RAM (!)
b) Never waste RAM beyond some limit (a few pages or a tiny fraction of size is fine)
c) O(1) amortized appending as in a plain array and other properties of an array -> it's a dynamic array after all
Currently I use mmap/madvise and related entities directly. It goes as follows.
Allocate a sufficiently large - as large as "it" may get (1Mb, 1G, 10G you name it) _address range_. Call this size a 'capacity'.
Commit some memory (optionally on first resize) - call this a 'committed'. Appending goes using up committed space as a typical array would do. Whenever it needs more than that, it applies the same extending algorithm as usual but with the (committed, capacity) pair.
Ditto on resize back - (with some amortization) it asks the OS to decommit pages, decreasing the committed amount.
In short - a c++ "vector" that has 2 capacities (current and absolute maximum) with a plus that it never reallocates. What to do if/when it hits the upper bound - is up to the specific application (terminate, fallback to something else). It has the danger of sucking up address space on 32bits though.. but who cares :)
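The reserve-then-commit scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code referred to in the post: it is POSIX-only, it reserves with mmap(PROT_NONE) and commits with mprotect rather than madvise, it omits decommit on shrink, and ReservedArray, put, release and the 64 KiB commit granularity (assumed to be a multiple of the page size) are all hypothetical.

```d
import core.sys.posix.sys.mman;

// Hypothetical append-only byte array with two capacities:
// 'committed' (usable now) and 'capacity' (reserved address range).
struct ReservedArray
{
    ubyte* base;       // start of the reserved address range
    size_t capacity;   // reserved bytes, the absolute maximum
    size_t committed;  // bytes currently readable and writable
    size_t length;     // bytes in use

    // Assumed commit granularity; must be a multiple of the page size.
    enum size_t chunk = 64 * 1024;

    // Reserve address space only; nothing is readable or writable yet.
    static ReservedArray reserve(size_t cap)
    {
        cap = (cap + chunk - 1) / chunk * chunk; // round up to chunk
        void* p = mmap(null, cap, PROT_NONE, MAP_PRIVATE | MAP_ANON, -1, 0);
        assert(p != MAP_FAILED, "could not reserve address range");
        return ReservedArray(cast(ubyte*) p, cap, 0, 0);
    }

    // Append one byte; returns false once the reserved range is exhausted.
    bool put(ubyte value)
    {
        if (length == committed)
        {
            if (committed == capacity) return false;
            // Usual geometric growth, applied to (committed, capacity).
            size_t want = committed == 0 ? chunk : committed * 2;
            if (want > capacity) want = capacity;
            if (mprotect(base + committed, want - committed,
                         PROT_READ | PROT_WRITE) != 0)
                return false;
            committed = want;
        }
        base[length++] = value;
        return true;
    }

    // Give the whole address range back to the OS.
    void release()
    {
        if (base !is null) munmap(base, capacity);
        this = ReservedArray.init;
    }
}

unittest
{
    auto a = ReservedArray.reserve(256 * 1024 * 1024);
    scope (exit) a.release();
    const start = a.base;
    foreach (i; 0 .. 1_000_000)
        assert(a.put(cast(ubyte) i));
    assert(a.base is start);          // grew in place, never moved
    assert(a.length == 1_000_000);
    assert(a.committed >= a.length && a.committed <= a.capacity);
}
```

Because the elements never move, pointers into the array stay valid across appends, which is requirement (a). What to do when put returns false is left to the caller, as in the post.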

Somewhat related: http://probablydance.com/2013/05/13/4gb-per-vector/
