Re: Mutable enums

Timon Gehr Mon, 14 Nov 2011 13:30:26 -0800

On 11/14/2011 09:39 PM, Steven Schveighoffer wrote:

On Mon, 14 Nov 2011 14:59:50 -0500, Timon Gehr <[email protected]> wrote:

On 11/14/2011 08:37 PM, Steven Schveighoffer wrote:

On Mon, 14 Nov 2011 13:37:18 -0500, Timon Gehr <[email protected]>
wrote:

On 11/14/2011 02:13 PM, Steven Schveighoffer wrote:

On Mon, 14 Nov 2011 03:27:21 -0500, Timon Gehr <[email protected]>
wrote:

On 11/14/2011 01:02 AM, bearophile wrote:

Jonathan M Davis:

import std.algorithm;
void main() {
enum a = [3, 1, 2];
enum s = sort(a);
assert(equal(a, [3, 1, 2]));
assert(equal(s, [1, 2, 3]));
}


It's not a bug. Those are manifest constants. They're copy-pasted
into whatever code you used them in. So,

enum a = [3, 1, 2];
enum s = sort(a);

is equivalent to

enum a = [3, 1, 2];
enum s = sort([3, 1, 2]);
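
A minimal sketch of what that substitution means in practice, assuming a current D compiler: each use of the enum behaves like writing the literal again, so two uses should give two independent arrays. The variable names x and y are only for illustration.

```d
enum a = [3, 1, 2];

void main()
{
    auto x = a; // behaves like: auto x = [3, 1, 2];
    auto y = a; // a second, independent literal

    x[0] = 42;  // modifies only the array that x refers to

    assert(x == [42, 1, 2]);
    assert(y == [3, 1, 2]);
    assert(x.ptr !is y.ptr); // distinct allocations
}
```

This is also why sorting one use of a in place does not affect any other use.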


You are right, there's no DMD bug here. Yet, it's a bit
surprising to
sort in-place a "constant". I have to stop thinking of them as
constants. I don't like this design of enums...


It is the right design. Why should enum imply const or immutable? (or
inout, for that matter). They are completely orthogonal.


There is definitely some debatable practice here for wherever enum is
used on an array.

Consider that:

enum a = "hello";

foo(a);

Does not allocate heap memory, even though "hello" is a reference
type.

However:

enum a = ['h', 'e', 'l', 'l', 'o'];

foo(a);

Allocates heap memory every time a is *used*. This is
counter-intuitive,
one uses enum to define things using the compiler, not during runtime.
It's used to invoke CTFE, to avoid heap allocation. It's not a
glorified
#define macro.

The deep issue here is not that enum is used as a manifest
constant, but
rather the fact that enum can map to a *function call* rather than the
*result* of that function call.

Would you say this should be acceptable?

enum a = malloc(5);

foo(a); // calls malloc(5) and passes the result to foo.

If the [...] form is an acceptable enum, I contend that malloc
should be
acceptable as well.


a indeed refers to the result of the evaluation of ['h', 'e', 'l',
'l', 'o'].

enum a = {return ['h', 'e', 'l', 'l', 'o'];}(); // also allocates on every use

But malloc is not CTFE-able, that is why it fails.


You are comparing apples to oranges here. Whether it's CTFE able or not
has nothing to do with it, since the code is executed at runtime, not
compile time.


The code is executed at compile time. It is just that the value is
later created by allocating at runtime.

enum foo = {writeln("foo"); return [1,2,3];}(); // fails, because writeln is not ctfe-able.


Look at the code generated for enum a = [1, 2, 3]. using a is replaced
with a call to _d_arrayliteral. There is no CTFE going on.

There is some ctfe going on, but the compiler has to allocate the result anew every time it is used. So there is also some runtime overhead.


To make my point clearer:

int foo(){return 100;}
enum a = [foo(), foo(), foo()]; // a is the array literal [100, 100, 100];

void main(){

auto x = a; // this does *not* call foo. But it allocates a new array literal
}
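
One way to check this, as a sketch rather than anything from the original posts: a hypothetical function probe() that returns a different value depending on the built-in __ctfe flag. If the enum's elements were computed by calling the function at runtime, the array would hold -1 instead of 100.

```d
// Hypothetical probe: yields 100 only when evaluated by CTFE.
int probe()
{
    return __ctfe ? 100 : -1;
}

enum fromCtfe = [probe(), probe(), probe()];

void main()
{
    auto x = fromCtfe;            // allocates, but does not call probe()
    assert(x == [100, 100, 100]); // values were fixed at compile time
    assert(probe() == -1);        // a genuine runtime call behaves differently
}
```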

My view is that enum should only be acceptable on data that is
immutable, or implicitly cast to immutable,


Too restrictive imho.


It allows the compiler to evaluate the enum at compile time, and store
any referenced data in ROM, avoiding frequent heap allocations (similar
to string literals).

IMO, type freedom is lower on the priority list than performance.

You can already define a symbol that calls arbitrary code at runtime:

@property int[] a() { return [3, 1, 2];}

Why should we muddy enum's goals with also being able to call functions
during runtime?


As I said, I would not miss the capability of enums to create mutable
arrays a lot. Usually you don't want that behavior, and explicitly
.dup-ing is just fine.

But I think it is a bit exaggerated to say enums can call functions at
runtime. It is up to the compiler how to implement the array allocation.


The compiler has no choice. It must develop the array at runtime, or
else the type allows one to modify the source value (just like in D1 how
you could modify string literals). In essence, the compiler is creating
a new copy for every usage (and building it from scratch).

That is a quality of implementation issue. The language semantics do not require that.

and should *never* map to an
expression that calls a function during runtime.


Well, I would not miss that at all.
But being stored as enum should not imply restrictions on type
qualifiers.


The restrictions are required in order to avoid calling runtime
functions for enum usage. Without the restrictions, you must necessarily
call runtime functions for any reference-based types (to avoid modifying
the original).


Yes, I don't need that. But I don't really want compile time
capabilities hampered.

enum a = [2,1,4];
enum b = sort(a); // should be fine.


I was actually surprised that this compiles. But this should not be a
problem even if a was immutable(int)[]. sort should be able to create a
copy of an immutable array in order to sort it. It doesn't matter the
performance hit, because this should all be done at compile time.


It does not, but explicitly calling .dup works
immutable x = [3,2,1];
immutable y = sort(x.dup);


Note that I'm not saying literals in general should not trigger heap
allocations, I'm saying assigning such literals to enums should require
unrestricted copying without runtime function calls.


Yes, I get that. And I think it makes sense. But I am not (yet?)
convinced that the solution to make all enums non-assignable,
head-mutable and tail-immutable is satisfying.


When I see an enum, I think "evaluated at compile time". No matter how
complex it is to build that value, it should be built at compile-time
and *used* at runtime. No complex function calls should be done at
runtime, an enum is a value.


Exactly. Therefore you assign from it by copying it.

Compare to static array.

int[10] x = [1,2,3,4,5,6,7,8,9,0];

x still needs to be initialized at runtime.


I did an interesting little test:

import std.algorithm;
import std.stdio;

int[] foo(int[] x)
{
return x ~ x;
}
enum a = [3, 1, 2];
enum b = sort(foo(foo(foo(a))));

void main()
{
writeln(b);
}

Want to see the assembly generated for the writeln call?

push 018h
mov EAX,offset FLAT:_D11TypeInfo_Ai6__initZ@SYM32
push EAX
call _d_arrayliteralTX@PC32
add ESP,8
mov ECX,1
mov [EAX],ECX
mov 4[EAX],ECX
mov 8[EAX],ECX
mov 0Ch[EAX],ECX
mov 010h[EAX],ECX
mov 014h[EAX],ECX
mov 018h[EAX],ECX
mov 01Ch[EAX],ECX
mov EDX,2
mov 020h[EAX],EDX
mov 024h[EAX],EDX
mov 028h[EAX],EDX
mov 02Ch[EAX],EDX
mov 030h[EAX],EDX
mov 034h[EAX],EDX
mov 038h[EAX],EDX
mov 03Ch[EAX],EDX
mov EBX,3
mov 040h[EAX],EBX
mov 044h[EAX],EBX
mov 048h[EAX],EBX
mov 04Ch[EAX],EBX
mov 050h[EAX],EBX
mov 054h[EAX],EBX
mov 058h[EAX],EBX
mov 05Ch[EAX],EBX
mov ECX,EAX
mov EAX,018h
mov -8[EBP],EAX
mov -4[EBP],ECX
mov EDX,-4[EBP]
mov EAX,-8[EBP]
push EDX
push EAX
call _D3std5stdio76__T7writelnTS3std5range37__T11SortedRangeTAiVAyaa5_61203c2062Z11SortedRangeZ7writelnFS3std5range37__T11SortedRangeTAiVAyaa5_61203c2062Z11SortedRangeZv@PC32



Really? That's a better solution than using ROM space to store the
result of the expression as evaluated at compile time? The worst part is
that this will be used *EVERY TIME* I use the enum b (even if I pass it
as a const array).


That just tells us that DMD sucks at generating code for array literals.

This generates identical code:

import std.stdio;

void main() {

writeln([1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3]);

}

You don't need enums for that.


What it actually should do for both our examples is more like the following:

import std.stdio;

immutable _somewhereinrom = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3];


void main() {
    writeln(_somewhereinrom.dup);
}

push   %ebp
mov    %esp,%ebp
pushl  0x8097184
pushl  0x8097180
mov    $0x80975c8,%eax
push   %eax
call   8079470 <_adDupT>
add    $0xc,%esp
push   %edx
push   %eax
call   807041c <_D3std5stdio15__T7writelnTAiZ7writelnFAiZv>
xor    %eax,%eax
pop    %ebp
ret

If writeln were actually const correct, the compiler could even get rid of the allocation.


This is not about enums that much, it is about array literals.

The fact that stack static array initialization allocates is one of DMD's bigger warts.


Look at the ridiculous code generated for the following example:

import std.stdio;

void main() {

int[24] x = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3];

    writeln(x);
}


I don't think you would miss this as much as you think. Assigning a
non-immutable array from an immutable one is as easy as adding a .dup,
and then the code is more clear that an allocation is taking place.


It would be somewhat odd.

enum a = [2,1,4];
enum b = sort(a.dup); // what exactly is that 'a.dup' thing?


I don't think .dup should be necessary at compile time. Creating a
sorted copy of an immutable array should be quite doable.


I agree, phobos won't currently do it though.
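
As a rough sketch of what such a function could look like (sortedCopy is a hypothetical helper, not something Phobos provides, and the insertion sort is only there to keep the example self-contained and CTFE-friendly):

```d
// Hypothetical helper: returns a sorted mutable copy and leaves the
// immutable source untouched.
int[] sortedCopy(immutable(int)[] src)
{
    auto result = src.dup;          // the copy happens inside the helper
    foreach (i; 1 .. result.length) // plain insertion sort
    {
        auto key = result[i];
        auto j = i;
        while (j > 0 && result[j - 1] > key)
        {
            result[j] = result[j - 1];
            --j;
        }
        result[j] = key;
    }
    return result;
}

enum immutable(int)[] a = [2, 1, 4];
enum b = sortedCopy(a);             // evaluated by CTFE, no .dup at the call site

static assert(a == [2, 1, 4]);      // the source is unchanged
static assert(b == [1, 2, 4]);

void main()
{
    assert(sortedCopy([3, 1, 2]) == [1, 2, 3]);
}
```

With a helper along these lines the dependent enum needs no explicit .dup, which is the usage discussed above.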

enum c = a.dup; // does this implicitly convert to immutable, or what happens here?


Either a compile error (cannot store mutable reference data as an enum),
or an implicit conversion back to immutable.

enum d = sort(c); // does not work?

enum e = foo(a.dup, b.dup, c.dup, d.dup);


Again, I don't think .dup would be used for dependent enums, I was
rather thinking dup would be used where you need a mutable copy of an
array during enum usage in normal code.

But if the type of a,b,c,d is immutable(int)[] and foo is a function that takes 4 int[]s then the .dup's are necessary to pass type checking.
