Re: Is mimicking a reference type with a struct reliable?

Denis Koroskin Sat, 16 Oct 2010 10:00:46 -0700

Sorry, I misclicked a button and sent the message prematurely.

On Sat, 16 Oct 2010 20:16:40 +0400, Steven Schveighoffer <schvei...@yahoo.com> wrote:

A final option is to disable the copy constructor of such an unsafe appender, but then you couldn't pass it around.
What do you think? If you think it's worth having, suggest it on the phobos mailing list, and we'll discuss.

It's still possible to pass it by reference, or even by pointer. You know, that's what you actually do right now: you are passing a Data* (a pointer to an internal state, wrapped with an Appender struct). Passing by pointer might actually be a good idea (because you can default it to null). One of the reasons I use "T[] buffer = null" as a buffer is that you aren't forced to provide one; null is also a valid buffer. Many functions would benefit from accepting an optional Appender (e.g. converting from utf8 to utf16 etc), but we shouldn't force callers to provide one.
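
To illustrate the point, here is a minimal sketch of a non-copyable struct that is still passed around by reference and by an optional pointer. Sink, fillByRef and fillByPtr are made-up names for this example; this is not how Phobos' Appender is written.

```d
import std.stdio;

// Hypothetical non-copyable buffer; not Phobos' Appender.
struct Sink
{
    char[] data;
    @disable this(this);          // forbid copies

    void put(const(char)[] s) { data ~= s; }
}

// By reference: the caller's object is modified directly.
void fillByRef(ref Sink sink)
{
    sink.put("ref ");
}

// By pointer: null means "no sink supplied", so a local one is used.
void fillByPtr(Sink* sink = null)
{
    Sink local;
    if (sink is null)
        sink = &local;
    sink.put("ptr");
}

void main()
{
    Sink s;
    fillByRef(s);
    fillByPtr(&s);
    fillByPtr();                  // also valid: nothing to pass
    writeln(s.data);              // prints "ref ptr"

    // Copying is rejected at compile time.
    static assert(!is(typeof({ Sink a; Sink b = a; })));
}
```

The pointer variant keeps the argument optional, which is the same convenience a "T[] buffer = null" parameter gives.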

Note that Appender is supposed to be fast at *appending*, not initializing itself. In that respect, it's very fast.


This makes it useless for appending small amounts of data.

I'm not sure it's worth the trade-off, and as such I defined and use my own set of primitives that don't allocate when a buffer is provided:
void put(T)(ref T[] array, ref size_t offset, const(T) value)
{
     ensureCapacity(array, offset + 1);
     array[offset++] = value;
}

void put(T)(ref T[] array, ref size_t offset, const(T)[] value)
{
     // Same but for an array
}

void ensureCapacity(T)(ref T[] array, size_t minCapacity)
{
    // ...
}
I'm not sure what ensureCapacity does, but if it does what I think it does (use the capacity property of arrays), it's probably slower than Appender, which has a dedicated variable for capacity.
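
For illustration only, here is one way ensureCapacity could be written. The original body is elided, so the doubling policy and the initial size of 16 are pure assumptions, and this sketch tracks the slice length rather than the capacity property.

```d
// Hypothetical implementation: grow the slice geometrically
// so that repeated put() calls stay cheap on average.
void ensureCapacity(T)(ref T[] array, size_t minCapacity)
{
    if (array.length >= minCapacity)
        return;                       // the provided buffer is big enough
    size_t newLength = array.length ? array.length * 2 : 16;
    if (newLength < minCapacity)
        newLength = minCapacity;
    array.length = newLength;         // may reallocate via the GC
}

void put(T)(ref T[] array, ref size_t offset, const(T) value)
{
    ensureCapacity(array, offset + 1);
    array[offset++] = value;
}

void main()
{
    char[64] stackBuffer;             // caller-provided storage
    char[] buffer = stackBuffer[];
    size_t offset = 0;
    foreach (char c; "hello")
        put(buffer, offset, c);
    assert(buffer[0 .. offset] == "hello");
    assert(buffer.ptr is stackBuffer.ptr); // the buffer was never replaced
}
```

As long as the provided buffer is large enough, no allocation takes place; the slice is only reallocated once it runs out of room.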
Back to my original question, can we mimic reference behavior with a struct? I thought, why not, until I hit this bug:
import std.array;
import std.stdio;

void append(Appender!(string) a, string s)
{
        a.put(s);
}

void main()
{
        Appender!(string) a;
        string s = "test";
        
        append(a, s); // <
        
        writeln(a.data);        
}
I'm passing an appender by value since it's supposed to have reference type behavior, and passing 4 bytes by reference is overkill.
However, the code above doesn't work for a simple reason: structs lack default ctors. As such, an appender is initialized to null internally; when I call append, a copy of it gets initialized (lazily), but the original one remains unchanged. Note that if you append to the appender at least once before passing it by value, it will work. But that's sad. Not only does it allocate when it shouldn't, I also have to initialize it explicitly!
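
For comparison, a minimal sketch of the same program with the appender taken by ref, so that the lazily allocated state lands in the caller's object. Only the signature of append changes; whether the extra indirection is acceptable is a separate question.

```d
import std.array;
import std.stdio;

// Taking the Appender by ref means the lazy allocation happens
// in the caller's object rather than in a temporary copy.
void append(ref Appender!(string) a, string s)
{
    a.put(s);
}

void main()
{
    Appender!(string) a;   // still default-initialized, no explicit setup
    append(a, "test");
    writeln(a.data);       // prints "test"
}
```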
I think a far better solution would be to make it non-copyable.
TL;DR Mimicking reference semantics with a struct without default ctors is unreliable since you must initialize your object lazily. Moreover, you have to check that your struct is not initialized yet on every single function call, and that's error prone and bad for code clarity and performance. I'm opposed to that practice.
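
The failure mode does not depend on Appender itself. Here is a small self-contained sketch with a made-up Handle type (all names are hypothetical) that shows both the per-call null check and the by-value surprise:

```d
// Hypothetical handle type, not Phobos code.
struct Handle
{
    private static struct Impl { int[] items; }
    private Impl* impl;               // null until first use

    private void ensureInit()
    {
        if (impl is null)             // needed in every mutating method
            impl = new Impl;
    }

    void add(int x) { ensureInit(); impl.items ~= x; }
    size_t length() const { return impl is null ? 0 : impl.items.length; }
}

void addByValue(Handle h, int x) { h.add(x); }

void main()
{
    Handle fresh;
    addByValue(fresh, 1);             // allocates inside the copy only
    assert(fresh.length == 0);        // the original never saw the element

    Handle used;
    used.add(1);                      // forces allocation up front
    addByValue(used, 2);              // the copy shares the same Impl
    assert(used.length == 2);
}
```

The same function call behaves differently depending on whether the handle happened to be used before, which is exactly what makes the pattern unreliable.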
This is a point I've brought up before. As of yet there is no solution. There have been a couple of ideas passed around, but there hasn't been anything decided. The one idea I remember (but didn't really like) is to have the copy constructor be able to modify the original. This makes it possible to allocate the underlying implementation in Appender for example, even on the data being passed. There are lots of problems with this solution, and I don't think it got much traction.
I think the default constructor solution is probably never going to happen. It's very nice to always have a default fast way to initialize structs, and there is precedent (C# has the same rule).

I think there is a solution, but it goes far beyond the default ctors problem (it solves many other issues, too).

Currently, a struct is initialized with T.init/T.classinfo.init

Pros:
- simple initialization: malloc, followed by memcpy
- there is always an immutable instance of an object in memory, and you can use it as the default/uninitialized state

Cons:
- you can't initialize class/struct variables with runtime values
- increased file size (every single class/struct now has a copy of its own)
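
As a quick illustration of the current behaviour, a sketch with a made-up Point struct: constant field initializers are baked into Point.init, and every default-initialized instance starts out equal to it.

```d
struct Point
{
    int x = 3;   // constant initializer, stored in Point.init
    int y;       // no initializer: int.init, i.e. 0
}

void main()
{
    Point p;                       // default initialization yields Point.init
    assert(p == Point.init);
    assert(p.x == 3 && p.y == 0);

    auto q = new Point;            // heap allocation starts from Point.init too
    assert(*q == Point.init);
}
```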

In Java, they use another approach. Instead of memcpy'ing T.init on top of allocated data, they invoke a so-called cctor (as opposed to ctor). This is a method that initializes memory so that a ctor can be called. memcpy'ing T.init follows the same idea, however it is not moved into a separate method. In general, a cctor can be implemented the way it is in D without sacrificing anything. However, a type-unique method is a lot better than that:

1) most structs initialize all of their members with 0. For these, the compiler can use memset instead.

2) killer feature in my opinion: it allows initializing values to non-constant expressions:


class Foo
{
        ubyte[] buffer = new ubyte[BUFFER_SIZE];
}

This also solves an Appender issue:

struct Appender
{
        Data* data = new Data();
}

3) it allows getting rid of T.init, significantly reducing the resulting file size

I'm not sure Walter will agree to such a radical change, but it can be achieved in small steps. D doesn't even have to get rid of T.init, it can still be there (but I'd like to get rid of it eventually)

a) Keep T.init/T.classinfo.init, introduce a compiler-generated cctor that memcpy's T.init over the object
(Optionally) Make the cctor smarter, and generate proper class/struct initialization code that doesn't rely on T.init

b) Allow non-constant expressions as initializers and initialize such members in the cctor
(Optionally) Get rid of T.init altogether

My suggestion would be to have it be an actual reference type -- i.e. a class. I don't see any issues with that. In that respect, you could even have it be stack-allocated, since you have emplace. But I don't have a say in that. I was the last one to update Appender, since it had a bug-ridden design and needed to be fixed, but I tried to change as little as possible.
-Steve
