On Tue, 14 Dec 2010 14:02:34 -0500, Andrei Alexandrescu
<seewebsiteforem...@erdani.org> wrote:
> I continue to believe that containers should have reference semantics,
> just like classes. Copying a container wholesale is not something you
> want to be automatic.

I agree.

> I also continue to believe that controlled lifetime (i.e.
> reference-counted implementation) is important for a container.
> Containers tend to be large compared to other objects, so exercising
> strict control over their allocated storage makes a lot of sense. What
> has recently shifted in my beliefs is that we should attempt to
> implement controlled lifetime _outside_ the container definition, by
> using introspection. (Currently some containers use reference counting
> internally, which makes their implementation more complicated than it
> could be.)

I think ref counting needs to be fleshed out more before we use it. I
don't think Phobos should rely on concepts that can't be properly
implemented with the current compiler/runtime design, in the hope that
the design gets better. I'd rather design it to work now, and redesign
later if the opportunity becomes available.

> Finally, I continue to believe that sealing is worthwhile. In brief, a
> sealing container never gives out addresses of its elements so it has
> great freedom in controlling the data layout (e.g. pack 8 bools in one
> ubyte) and in controlling the lifetime of its own storage. Currently I'm
> not sure whether that decision should be taken by the container, by the
> user of the container, or by an introspection-based wrapper around an
> unsealed container.

I agree that a sealed container is worthwhile. I think it needs to be the
container's decision (for instance, packing bools into bits must be a
container decision).
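
To illustrate why packing is a layout decision only the container can make, here is a minimal sketch of a sealed container that stores 8 bools per ubyte. PackedBools and its members are hypothetical names made up for this example, not anything in Phobos.

```d
// Hypothetical sealed container: elements are only ever handed out by
// value, so the storage layout stays private to the container.
final class PackedBools
{
    private ubyte[] bits;   // 8 bools per ubyte
    private size_t count;

    this(size_t n)
    {
        count = n;
        bits = new ubyte[(n + 7) / 8];
    }

    @property size_t length() const { return count; }

    // Returns a copy of the element; no bool* into storage can be obtained.
    bool opIndex(size_t i) const
    {
        assert(i < count);
        return (bits[i / 8] & (1 << (i % 8))) != 0;
    }

    void opIndexAssign(bool value, size_t i)
    {
        assert(i < count);
        immutable ubyte mask = cast(ubyte)(1 << (i % 8));
        if (value)
            bits[i / 8] |= mask;
        else
            bits[i / 8] &= ~mask;
    }
}

void main()
{
    auto flags = new PackedBools(10);
    flags[3] = true;
    assert(flags[3]);
    assert(!flags[4]);
    // auto p = &flags[3]; // rejected: opIndex returns an rvalue
}
```

An unsealed container that returned elements by ref would have to keep one addressable bool per element, so a wrapper could not add this packing afterwards.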

> That all being said, I'd like to make a motion that should simplify
> everyone's life - if only for a bit. I'm thinking of making all
> containers classes (either final classes or at a minimum classes with
> only final methods). Currently containers are implemented as structs
> that are engineered to have reference semantics. Some collections use
> reference counting to keep track of the memory used.

I think this is the right move. Responding to pros/cons below:

> Advantages of the change:
> - Clear, self-documented reference semantics
> - Uses the right tool (classes) for the job (define a type with
> reference semantics)
> - Pushes deterministic lifetime issues outside the containers
> (simplifying them) and factors such issues into reusable wrappers a la
> RefCounted.

- Removes the default-initialization problem by disallowing it. This is
the problem of passing an uninitialized struct into a function and the
function being unable to affect the original. A class has a
better-defined and better-understood life cycle: nothing exists until
new is used.
- No more need to "check if it's valid" in every member function.
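
A minimal sketch of the default-initialization problem, assuming a struct container that allocates its shared state lazily. StructList, ClassList and the fill functions are hypothetical names for illustration, not the actual Phobos containers.

```d
// Hypothetical struct container engineered to have reference semantics.
struct StructList
{
    private static class Impl { int[] items; }
    private Impl impl;   // null until first use

    void add(int x)
    {
        if (impl is null)        // the "check if it's valid" step
            impl = new Impl;
        impl.items ~= x;
    }

    @property size_t length() const
    {
        return impl is null ? 0 : impl.items.length;
    }
}

// The same container as a class: it cannot be used before `new`.
final class ClassList
{
    private int[] items;
    void add(int x) { items ~= x; }
    @property size_t length() const { return items.length; }
}

void fillStruct(StructList list) { list.add(1); }
void fillClass(ClassList list)   { list.add(1); }

void main()
{
    StructList s;            // default-initialized: impl is null
    fillStruct(s);           // the callee's copy allocates its own Impl
    assert(s.length == 0);   // the caller never sees the element

    s.add(0);                // now s has an Impl
    fillStruct(s);           // the copy shares that Impl
    assert(s.length == 2);   // this time the change is visible

    auto c = new ClassList;
    fillClass(c);
    assert(c.length == 1);   // always visible
}
```

The struct version behaves like a reference only after its first mutation, which is the inconsistency the class version avoids.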

> Disadvantages:
> - Containers must be dynamically allocated to do anything - even calling
> empty requires allocation.

Couldn't emplace fix this, at least for cases where the container
doesn't need to outlive the scope of a function?

> - There's a two-words overhead associated with any class object.

I assume this is a concern for containers of containers? On 32-bit the
overhead is actually 96 bits, because the minimal memory block size is
16 bytes. Therefore, a container which could potentially have a 1-word
footprint must occupy 4 words. For 64-bit, I'm unsure of the proposed GC
implementation.
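
One way to read the 96-bit figure, worked through as compile-time arithmetic. This is a sketch that assumes a 32-bit target and the 16-byte minimal block mentioned above; OneWord and roundUpToBlock are hypothetical names.

```d
import std.stdio : writeln;

// Rounds n up to the next multiple of block.
size_t roundUpToBlock(size_t n, size_t block)
{
    return (n + block - 1) / block * block;
}

// Hypothetical container whose only state is one word.
final class OneWord { size_t payload; }

// Hidden vtable pointer + monitor pointer + payload = 3 words.
static assert(__traits(classInstanceSize, OneWord) == 3 * size_t.sizeof);

// 32-bit figures (assumed here, not queried from the target).
enum wordBytes = 4;
enum minBlockBytes = 16;
enum objectBytes = 3 * wordBytes;                                  // 12
enum allocatedBytes = roundUpToBlock(objectBytes, minBlockBytes);  // 16
enum overheadBits = (allocatedBytes - wordBytes) * 8;              // 96

static assert(allocatedBytes == 4 * wordBytes);
static assert(overheadBits == 96);

void main()
{
    writeln(overheadBits, " bits of overhead for a one-word payload");
}
```

So the two-word header costs a third word once the allocation is rounded up to a full block.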

I have some ideas for solving this, but they are still abstract in my
head and I haven't solidified them enough to start a discussion. In
short: I think if we clearly separate the implementation from the
container, we might be able to combine implementations in a minimal way.

> - Containers cannot do certain optimizations that depend on container's
> control over its own storage.

Can you explain this further?
-Steve