Thanks for the review.  Detailed comments below.

On 9/9/2011 2:38 AM, Martin Nowak wrote:
So I have used RegionAllocator.
First of all, thanks for getting things like these into the std library.
I didn't even know segmented stacks before and they were the perfect
solution to my use case.
Since first reading about this I've used a TLS Appender which I cleared
occasionally.
Performance is top. For me it's on par with statically allocating a
huge chunk and draw from this with memset initalization.

The Use Case:

I have implemented a wavelet rasterizer. It uses a sparse quadtree.
Bezier curves are recursively sliced into each child quadrant.
I lazily allocate 4 child nodes when recursing into a quadrant.
After all bezier curves are added the coverage for pixels can be
calculated and used for blitting.
Afterward the whole tree is disposed.

Details for this pretty neat and simple rasterizer at
http://josiahmanson.com/research/wavelet_rasterization/.

Criticism of regionallocator:

1. Please M-x delete-trailing-whitespace


I am surprised there is any trailing whitespace. I use CodeBlocks which usually gets rid of it. I think I disabled this because it was showing up in diffs when I edited Phobos modules.

2. Using RAII for the allocator also pushes RAII onto the user.
To avoid having to pass the allocator around I had to add another static
variable holding the allocator.
This doesn't work well, because there is no explicit freeAll function
for the allocator.
Furthermore assigning a newRegionAllocator to an initialized
allocator is not possible as it breaks the LIFO order. So I had
to resort to sAlloc = RegionAllocator.init; sAlloc = newRegionAllocator();.

But RAII makes things simple and relatively safe. What would you suggest as the alternative?


3. All initialization (newArray, newUnitializedArray, newT) should
not be part of an allocator interface.
We should instead create a templated Initializer or free functions
that will do this using one of the new allocators providing void*
memory.

Unfortunately, this leads to a fundamental disagreement that I have with a large portion of this review. I want RegionAllocator to be high-level enough to be easy to use right out of the box. You seem to be asking for something lower level but more composable. The problem is that low-level but composable often makes simple things complicated and messy. Furthermore, if the allocator knows the type being allocated, it can sometimes take that into account and do "special" things like GC scanning, etc. Similarly, array() takes advantage of knowing the implementation details of the allocator.


4. I don't appreciate all the bookkeeping involved with every
allocation, mainly just to provide freeLast.
RegionAllocator should be such lightweight that you can use one
for each allocation that you want individual freeing for.

This is also necessary to allow proper freeing of huge allocations (bigger than a segment size). I really don't see much downside given that the overhead is small. Furthermore, having tons of RegionAllocators local to one function would get really messy really fast.


5. Separating the stack from the allocator creates mental/code bloat.
It suggest the wrong association with allocating from the
allocator while in reality your allocating off the stack using
kind of a cookie.
I had a simpler interface in mind https://gist.github.com/1204402.
Can you elaborate a little why/what additional complexity is needed.

Creating a brand-new stack is expensive because it heap allocates and incurs synchronization overhead. The stack is designed to make creating a new RegionAllocator cheap (i.e. no synchronization overhead or heap allocation in most cases). RegionAllocator mainly provides RAII freeing.


6. Allocating more bytes than the segment size could be an error.
This would reduce the complexity for handling big objects.

I really like the transparent "just works" design for big objects. Again, this goes back to the high-level vs. simple point. You seem to be calling for the Unix way (everything is simple, stupid and composable, implementation simplicity is most important). I'd much rather make the implementation more complicated but the interface simpler/less error prone.


Allocator interface:

7. Instead of using malloc/free you should use an malloc/free version of
the allocator interface.
This way RegionAllocator can be used for other memory (e.g. shared memory).

This might not be a bad idea, especially if I make the malloc/free allocator the default so most users don't need to think about this.


8. The allocator interface should provide means to do aligned allocations.
As this would arguably complicate implementations alternatively requiring
all allocators to return 16-Byte aligned memory seems reasonable.
Still another possibility is having an enum flag for alignment and
provide an
allocator wrapper that does aligned allocations.

Hmm, I guess my alignedMalloc/alignedFree could be made into LibcAllocator.


9. The allocator interface should have a flag to advice GC range adding.
Could be:
alloc(size_t nbytes, GCScan scan = GCScan.no)

RegionAllocator could deduce it's scan flag from the first use and
enforce it never changes afterwards. I'm not to sure about this,
but requiring the user to add memory to the GC seems error prone and
reduce the design space for allocators.

I don't understand. If they have to set a flag to get it added, then I fail to see how adding it manually is any more difficult.


10. Using a free list for the segments should be left to a
FreeListAllocator (see 7).

Maybe in the long run once we have a FreeListAllocator, but IMHO this is an implementation detail that can be changed later. I definitely don't want to expose the free list stuff in the API because I'd rather not complicate the interface with these details.


11. I'm really much in favor of using classes (final) for such long
living objects (the stack).
This allow simpler lazy static initialization and has scope new for
stack variables.
But most important you don't need to have unitialized structs. Which at the
end of the day use a ref counted impl (see 2). OTOH classes are more
complicated
for templated decoration.

Hmm, I really like the idea of the stack memory getting freed automatically and deterministically by ref counting.


12. Ideally RegionAllocator would be AlignedAllocator!(16,
RegionAllocator!(FreeListAllocator!(LibcAllocator))).

The alignment is taken care of internally to RegionAllocator in a way that a generic AlignedAllocator wouldn't be able to handle as efficiently. FreeListAllocator and the whole free list concept is an implementation detail and shouldn't appear in the API at all. I think you may have a good point about LibcAllocator being a template parameter, though.



Smaller issues:

- In the constructor of RegionAllocatorStack you should enforce that
segmentSize is bigger than alignBytes instead of only catching 0.

Good point.  Will fix.


- At a quick glance in alignedMalloc you only need to store an offset
smaller than alignBytes instead of a pointer (ubyte will suffice).

Good catch. I'll do this if I make a generic LibcAllocator with these functions, but otherwise the savings are so small (a few bytes on the multi-megabyte allocations that RegionAllocator does) that I'd rather not change it just due to inertia and risk aversion.


- regionallocator.d(594): idempotent assertion
while (nLargeObjects > 0 && bookkeepIndex > regionIndex) {
assert(bookkeepIndex > regionIndex);
freeLast();
}

Another good catch. This is just cruft from when the second check in the while statement wasn't there.


- regionallocator.d(939): something is missing/too much
static size_t alignBytes(size_t nBytes) {
return .alignBytes;
}

alignBytes() is a function that is part of the allocator interface. .alignBytes is an enum that needs to be visible at the module level for alignedMalloc/alignedFree.

Reply via email to