On 9/22/13 9:03 PM, Manu wrote:
On 23 September 2013 12:28, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:
    My design makes it very easy to experiment by allowing one to define
    complex allocators out of a few simple building blocks. It is not a
    general-purpose allocator, but it allows one to define any number of
    such.

Oh okay, so this isn't really intended as a system then, so much as a
suggested API?

For some definition of "system" and "API", yes :o).

That makes almost all my questions redundant. I'm interested in the
system, not the API of a single allocator (although your API looks fine
to me).
I already have allocators I use in my own code. Naturally, they don't
inter-operate with anything, and that's what I thought std.allocator was
meant to address.

Great. Do you have a couple of nontrivial allocators (heap, buddy system etc) that could be adapted to the described API?

    The proposed design makes it easy to create allocator objects. How
    they are used and combined is left to the application.

Is that the intended limit of std.allocator's responsibility, or will
patterns come later?

Some higher level design will come later. I'm not sure whether or not you'll find it satisfying, for reasons I'll expand on below.

Leaving the usage up to the application means we've gained nothing.
I already have more than enough allocators which I use throughout my
code. The problem is that they don't inter-operate, and certainly not
with foreign code/libraries.
This is what I hoped std.allocator would address.

Again, if you already have many allocators, please let me know if you can share some.

std.allocator will prescribe a standard for defining allocators, with which the rest of std will work, same as std.range prescribes a standard for defining ranges, with which std.algorithm, std.format, and other modules work. Clearly one could come back with "but I already have my own ranges that use first/done/next instead of front/empty/popFront, so I'm not sure what we're gaining here".
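
To make the analogy with ranges concrete, here is a minimal sketch, written in C++ purely for illustration, of generic code that depends only on the "shape" of an allocator. The names Mallocator, CountingAllocator, and sum_of_first, as well as the allocate/deallocate signatures, are assumptions made up for this example and are not the std.allocator API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical untyped allocator "shape": allocate(bytes) and
// deallocate(pointer, bytes). Any type providing both can be used.
struct Mallocator {
    void* allocate(std::size_t bytes) { return std::malloc(bytes); }
    void deallocate(void* p, std::size_t) { std::free(p); }
};

// A user-defined allocator that satisfies the same shape and also
// counts the blocks that are currently live.
struct CountingAllocator {
    std::size_t live = 0;
    void* allocate(std::size_t bytes) {
        void* p = std::malloc(bytes);
        if (p) ++live;
        return p;
    }
    void deallocate(void* p, std::size_t) {
        if (p) { std::free(p); --live; }
    }
};

// Generic "library" code written only against the shape above.
// It returns 0 + 1 + ... + (n - 1), computed in scratch memory taken
// from whichever allocator the caller passes in, or -1 on failure.
template <class Allocator>
long sum_of_first(Allocator& alloc, std::size_t n) {
    long* scratch = static_cast<long*>(alloc.allocate(n * sizeof(long)));
    if (!scratch) return -1;  // allocation failed
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        scratch[i] = static_cast<long>(i);
        total += scratch[i];
    }
    alloc.deallocate(scratch, n * sizeof(long));
    return total;
}
```

The library function never names a concrete allocator, so an allocator the application already owns can be plugged in as long as it is adapted to the agreed-upon shape.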

    An allocator instance is a variable like any other. So you use the
    classic techniques (shared globals, thread-local globals, passing
    around as parameter) for using the same allocator object from
    multiple places.


Okay, that's fine... but this sort of manual management implies that I'm
using it explicitly. That's where it all falls down for me.

I think a disconnect here is that you think "it" where I think "them". It's natural for an application to use one allocator that's not provided by the standard library, and it's often the case that an application defines and uses _several_ allocators for different parts of it. Then the natural question arises, how to deal with these allocators, pass them around, etc. etc.

Eg, I want to use a library, its allocation patterns are incompatible
with my application; I need to provide it with an allocator.
What now? Is every library responsible for presenting the user with a
mechanism for providing allocators? What if the author forgets? (a
problem I've frequently had to chase up in the past when dealing with
3rd party libraries)

If the author forgets and hardcodes a library to use malloc(), I have no way around that.

Once a library is designed to expect a user to supply an allocator, what
happens if the user doesn't? Fall-back logic/boilerplate exists in every
library I guess...

The library wouldn't need to worry as there would be the notion of a default allocator (probably backed by the existing GC).
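
A rough sketch of what a default allocator could look like from the library's side, again in C++ for illustration only. IAllocator, default_allocator, duplicate, and TrackingAllocator are hypothetical names, and the default here is backed by malloc/free rather than by a GC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical runtime interface for an untyped allocator.
struct IAllocator {
    virtual void* allocate(std::size_t bytes) = 0;
    virtual void deallocate(void* p, std::size_t bytes) = 0;
    virtual ~IAllocator() = default;
};

// Stand-in for the default allocator. In the discussion it would
// probably be backed by the GC; malloc/free is used here instead.
struct DefaultAllocator final : IAllocator {
    void* allocate(std::size_t bytes) override { return std::malloc(bytes); }
    void deallocate(void* p, std::size_t) override { std::free(p); }
};

inline IAllocator& default_allocator() {
    static DefaultAllocator instance;
    return instance;
}

// "Library" function: callers may supply an allocator, and callers
// who do not get the default one without any extra boilerplate.
inline char* duplicate(const char* text,
                       IAllocator& alloc = default_allocator()) {
    std::size_t bytes = std::strlen(text) + 1;
    char* copy = static_cast<char*>(alloc.allocate(bytes));
    if (copy) std::memcpy(copy, text, bytes);
    return copy;
}

// A caller-supplied allocator that records how many bytes were requested.
struct TrackingAllocator final : IAllocator {
    std::size_t requested = 0;
    void* allocate(std::size_t bytes) override {
        requested += bytes;
        return std::malloc(bytes);
    }
    void deallocate(void* p, std::size_t) override { std::free(p); }
};
```

Calling duplicate("abc") uses the default allocator, while duplicate("abc", mine) routes the same library code through an allocator chosen by the application.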

And does that mean that applications+libraries are required to ALWAYS
allocate through given allocator objects?

Yes, they should.

That effectively makes the new keyword redundant.

new will still be used to tap into the global shared GC. std.allocator will provide other means of allocating memory.

And what about the GC?

The current global GC is unaffected for the time being.

I can't really consider std.allocator until it presents some usage patterns.

Then you'd need to wait a little bit.

        It wasn't clear to me from your demonstration, but 'collect()'
        implies
        that GC becomes allocator-aware; how does that work?


    No, each allocator has its own means of dealing with memory. One
    could define a tracing allocator independent of the global GC.


I'm not sure what this means. Other than I gather that the GC and
allocators are fundamentally separate?

Yes, they'd be distinct. Imagine an allocator that requests 4 MB from the GC as NO_SCAN memory, and then does its own management inside that block. User-level code allocates and frees e.g. strings or whatever from that block, without the global GC intervening.
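
The idea of grabbing one large block and managing it privately can be sketched as a simple bump-pointer region. This is C++ for illustration only: the Region class and its member names are assumptions, and std::malloc stands in for the request to the GC for NO_SCAN memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical region allocator: one big block from a parent source,
// then bump-pointer management inside it.
class Region {
public:
    explicit Region(std::size_t capacity)
        : begin_(static_cast<char*>(std::malloc(capacity))),
          capacity_(begin_ ? capacity : 0) {}
    ~Region() { std::free(begin_); }
    Region(const Region&) = delete;
    Region& operator=(const Region&) = delete;

    // Hands out the next `bytes` bytes, rounded up so that every
    // returned pointer stays aligned for any fundamental type.
    void* allocate(std::size_t bytes) {
        const std::size_t align = alignof(std::max_align_t);
        const std::size_t rounded = (bytes + align - 1) / align * align;
        if (rounded < bytes || rounded > capacity_ - used_) return nullptr;
        void* p = begin_ + used_;
        used_ += rounded;
        return p;
    }

    // Individual frees are no-ops; memory is reclaimed in bulk.
    void deallocate(void*, std::size_t) {}
    void deallocateAll() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    char* begin_;
    std::size_t capacity_;
    std::size_t used_ = 0;
};
```

With Region r(4 * 1024 * 1024), user code can allocate strings or other small objects from r and release them all at once with r.deallocateAll(); the parent allocator is only involved when the region is created and destroyed.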

Is it possible to create a tracing allocator without language support?

I think it is possible.

Does the current language insert any runtime calls to support the GC?

Aside from operator new, I don't think so.

I want a ref-counting GC for instance to replace the existing GC, but
it's impossible to implement one of them nicely without support from the
language, to insert implicit inc/dec ref calls all over the place, and
to optimise away redundant inc/dec sequences.

Unfortunately that's a chimera I had to abandon, at least at this level. The problem is that installing an allocator does not get to define what a pointer is and what a reference is. These are notions hardwired into the language, so the notion of turning a switch and replacing the global GC with a reference counting scheme is impossible at the level of a library API.

(As an aside, you still need tracing for collecting cycles in a transparent reference counting scheme, so it's not all roses.)

What I do hope to get to is to have allocators define their own pointers and reference types. User code that uses those will be guaranteed certain allocation behaviors.

I can easily define an allocator to use in my own code if it's entirely
up to me how I use it, but that completely defeats the purpose of this
exercise.

It doesn't. As long as the standard prescribes ONE specific API for defining untyped allocators, if you define your own to satisfy that API, then you'll be able to use your allocator with e.g. std.container, just as defining your own range the way std.range requires allows you to tap into std.algorithm.

Until there are standard usage patterns, practises, conventions that
ALL code follows, then we have nothing. I was hoping to hear your
thoughts about those details.



        It's quite an additional burden of resources and management to
        manage
        the individual allocations with a range allocator above what is
        supposed
        to be a performance critical allocator to begin with.


    I don't understand this.


It's irrelevant here.
But fwiw, in relation to the prior point about block-freeing a range
allocation;

What is a "range allocation"?

there will be many *typed* allocations within these ranges,
but a typical range allocator doesn't keep track of the allocations within.

Do you mean s/range/region/?

This seems like a common problem that may or may not want to be
addressed in std.allocator.
If the answer is simply "your range allocator should keep track of the
offsets of allocations, and their types", then fine. But that seems like
boilerplate that could be automated, or maybe there is a
different/separate system for such tracking?

If you meant region, then yes that's boilerplate that hopefully will be reasonably automated by std.allocator. (What I discussed so far predates that stage of the design.)

        C++'s design seems reasonable in some ways, but history has
        demonstrated
        that it's a total failure, which is almost never actually used (I've
        certainly never seen anyone use it).


    Agreed. I've seen some uses of it that quite fall within the notion
    of the proverbial exception that proves the rule.


I think the main fail of C++'s design is that it mangles the type.
I don't think a type should be defined by the way its memory is
allocated, especially since that could change from application to
application, or even call to call. For my money, that's the fundamental
flaw in C++'s design.

This is not a flaw as much as an engineering choice with advantages and disadvantages on the relative merits of which reasonable people may disagree.

There are two _fundamental_ flaws of the C++ allocator design, in the sense that they are very difficult to argue in favor of and relatively easy to argue against:

1. Allocators are parameterized by type; instead, individual allocations should be parameterized by type.

2. There is no appropriate handling for allocators with state.

The proposed std.allocator design deals with (2) with care, and will deal with (1) when it gets to typed allocators.
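
To illustrate both points, here is a small C++ sketch in which the allocator itself is untyped and carries per-instance state, while the type parameter is attached to each individual allocation. CountingHeap, make, dispose, and Point are hypothetical names invented for this example; the sketch assumes types that need no more alignment than malloc provides and does not handle constructors that throw.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <utility>

// Hypothetical untyped allocator with state: it knows nothing about
// the types stored in its memory, only about bytes.
struct CountingHeap {
    std::size_t bytes_in_use = 0;  // per-instance state
    void* allocate(std::size_t bytes) {
        void* p = std::malloc(bytes);
        if (p) bytes_in_use += bytes;
        return p;
    }
    void deallocate(void* p, std::size_t bytes) {
        if (!p) return;
        std::free(p);
        bytes_in_use -= bytes;
    }
};

// The type parameter belongs to the individual allocation, not to the
// allocator: one allocator object serves objects of any type.
template <class T, class Allocator, class... Args>
T* make(Allocator& alloc, Args&&... args) {
    void* raw = alloc.allocate(sizeof(T));
    if (!raw) return nullptr;
    return ::new (raw) T(std::forward<Args>(args)...);
}

// Destroys the object and returns its bytes to the same allocator.
template <class T, class Allocator>
void dispose(Allocator& alloc, T* object) {
    if (!object) return;
    object->~T();
    alloc.deallocate(object, sizeof(T));
}

struct Point {
    int x;
    int y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};
```

A single CountingHeap instance can then serve make<Point>(heap, 1, 2) and make<double>(heap, 2.5) alike, and because the allocator is passed by reference its state stays with that one object.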

Well as an atom, as you say, it seems like a good first step.
I can't see any obvious issues, although I don't think I quite
understand the collect() function if it has no relation to the GC. What
is its purpose?

At this point collect() is only implemented by the global GC. It is possible I'll drop it from the final design. However, it's also possible that collect() will be properly defined as "collect all objects allocated within this particular allocator that are not referred to by any objects also allocated within this allocator". I think that's a useful definition.

If the idea is that you might implement some sort of tracking heap which
is able to perform a collect, how is that actually practical without
language support?

Language support would be needed for things like scanning the stack and the globals. But one can gainfully use a heap with semantics as described just above, which requires no language support.

I had imagined going into this that, like the range interface which the
_language_ understands and interacts with, the allocator interface would
be the same, ie, the language would understand this API and integrate it
with 'new', and the GC... somehow.

The D language has no idea what a range is. The notion is completely defined in std.range.

If allocators are just an object like in C++ that people may or may not
use, I don't think it'll succeed as a system. I reckon it needs deep
language integration to be truly useful.

I guess that's to be seen.

The key problem to solve is the friction between different libraries,
and different moments within a single application itself.
I feel almost like the 'current' allocator needs to be managed as some
sort of state-machine. Passing them manually down the callstack is no
good. And 'hard' binding objects to their allocators like C++ is no good
either.

I think it's understood that if a library chooses its own ways to allocate memory, there's no way around that. The point of std.allocator is that it defines a common interface that user code can work with.


Andrei
