(This post is a second try of the ideas I discussed in this ER:
http://d.puremagic.com/issues/show_bug.cgi?id=5906 )

It is sometimes handy and useful to give structs some of the capabilities and qualities of built-in values. This makes structs more powerful and handy, and allows built-in-like features to be implemented better in library code.

The bounds of built-in arrays are verified at compile time where possible:


void main() {
    int[5] a;
    immutable idx = 5;
    a[idx] = 1;
}

test.d(4): Error: array index 5 is out of bounds a[0 .. 5]

- - - - - - - - - - - - - - - - - - -

Where the values are not known at compile-time no static check is done, and the problem is found at run-time (here foo() could easily be run at compile-time, but the compiler does not attempt to run it at compile-time):

size_t foo() { return 5; }
void main(string[] args) {
    int[5] a;
    a[5 + args.length] = 1;
    a[foo()] = 2;
}


core.exception.RangeError@test(4): Range violation

- - - - - - - - - - - - - - - - - - -

If I define a library type, using a struct, that mimics/wraps a fixed size array, the compile-time test capability of built-in arrays is lost:


struct IntArray(size_t n) {
    int[n] data;
    ref int opIndex(size_t i) {
        return data[i];
    }
}
void main() {
    IntArray!5 a;
    a[5] = 1;
}


core.exception.RangeError@test(4): Range violation

- - - - - - - - - - - - - - - - - - -

For some time I have wanted to give this basic feature of built-ins to user-defined types too. (I have seen that Ada 2012 has "Static_Predicate" and "Dynamic_Predicate", which are related to this feature.)

A bare-bones library implementation of a ranged value (the interval is closed on the right, to make it easier to include the last representable values of ints, ubytes, etc.):


import std.stdio, std.typecons;

struct RangedValue(T, alias inf, alias sup) {
    T _data;
    alias access this;

    this(T x)
    in {
        assert(x >= inf && x <= sup);
    } body {
        _data = x;
    }

    invariant() { /*...*/ }

    @property T access() { return _data; }

    @property T access(T x)
    in {
        assert(x >= inf && x <= sup);
    } body {
        return _data = x;
    }
}

void main() {
    alias Month = RangedValue!(int, 1, 12);
    auto m = Month(12);
    m = 13; // line 29, core.exception.AssertError@test(11)
}

- - - - - - - - - - - - - - - - - - -

An idea is to introduce compile-time preconditions. Here I have used the syntax:
"enum in(arg1, arg2, ...)":


    this(T x)
    enum in(x) {
        assert(x >= inf && x <= sup);
    } in {
        assert(x >= inf && x <= sup);
    } body {
        _data = x;
    }

- - - - - - - - - - - - - - - - - - -

Some of the rules:
- A function with one or more arguments is allowed to have a (regular, dynamic) precondition, an enum ("static") precondition, or both.
- An enum pre-condition is run only at compile-time, or never.
- An enum pre-condition lists one or more arguments that will be "verified". If all the arguments listed inside the "in()" are literals or explicit compile-time constants, then the enum precondition is run, otherwise it is ignored. No attempt is made to run the enum pre-condition in any other case.
- This analysis of "literalness" or "near-literalness" is conservative. This means a D compiler is free to assume a value is not a literal and to ignore the enum pre-condition.
- enum pre-conditions are run at the calling point. This means the code they contain is never present inside the binary, so it never causes code slowdown or binary bloat. Errors and violations are reported at the calling point, so in the code above the compile-time error message of the violation shows line number 29 instead of 11. So enum preconditions are a bit like default values of function arguments: they are relative to the calling point and not to the function itself.
- Like CTFE, enum preconditions should be pure (but they can call __ctWrite, so purity is not fully enforced).
- The D programmer who uses enum preconditions has to keep in mind that the run-time correctness of a program can't rely on an enum precondition. An enum precondition is just meant to catch at compile time some violations and bugs that would otherwise be caught at run-time. So typically there is some code duplication between the precondition and the enum precondition. But this is not terrible.
- D programmers who use enum preconditions should remember that enum preconditions must contain very light code, to keep compilation times low. (Given the performance of "interpreters" like JavaScript V8 or LuaJIT, future D compilers will run compile-time code much faster even if they keep not using a JIT, so this problem will become much smaller.)
- Future smart IDEs for D are allowed to run an enum precondition while you type code, to catch and underline some bugs in red even before compile-time.

[Note: to speed up compilation, a compiler switch could be added that makes the D compiler ignore all enum preconditions. But probably this is not necessary.]

Data in external files loaded at compile-time with mixin(import("data.txt")) can be verified with an enum precondition (currently you can call a test function at compile-time and assign its result to an enum).
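
A minimal sketch of that current workaround (the file name "data.txt", the -J import path and the isValidData() function are assumptions of this example, not part of the proposal):

bool isValidData(in string s) pure {
    foreach (c; s)
        assert(c == '0' || c == '1' || c == '\n',
               "data.txt contains an invalid character");
    return true;
}

enum rawData = import("data.txt");    // Requires compiling with -J<path>.
enum dataIsOk = isValidData(rawData); // Runs with CTFE; a bad file stops compilation.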

[I am not sure, but I think enum preconditions of class members follow inheritance rules similar to those of the normal dynamic pre-conditions.]

- - - - - - - - - - - - - - - - - - -

Alternative syntax, one more usage of the "static" keyword:

"static in(arg1, arg2, ...)"

- - - - - - - - - - - - - - - - - - -

Some use cases for enum preconditions:

1-2) IntArray() and RangedValue!(int, 1, 12): enum preconditions can catch some bugs at compile-time; they make the usage of such simple data structures more similar to built-ins, giving compile-time errors like "Error: array index 5 is out of bounds a[0 .. 5]".
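
As a comparison, a minimal sketch of what is already possible today (the at() member is an invention of this sketch, not part of the IntArray above): a template member gives the compile-time check, but only when the caller opts in to a call syntax different from a[5].

struct IntArray(size_t n) {
    int[n] data;

    // Run-time checked path, used by a[i].
    ref int opIndex(size_t i) {
        return data[i];
    }

    // Compile-time checked path, used by a.at!i().
    ref int at(size_t i)() if (i < n) {
        return data[i];
    }
}

void main() {
    IntArray!5 a;
    a.at!4() = 1;    // OK
    // a.at!5() = 1; // Rejected at compile time by the template constraint.
}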

- - - - - - - - -

3) EnumSubset() is an idea to define enumeration subsets in library code, something that is sometimes useful to me. enum preconditions allow catching some wrong assignments at compile time (I think this is a use case for the "Static_Predicate" of Ada 2012):

Given a struct EnumSubset used like this:

enum Foo { A, B, C }
alias Foo2 = EnumSubset!(Foo.A, Foo.B);

an enum pre-condition can verify:

Foo2 f = Foo.A; // OK
Foo2 g = Foo.C; // Compile-time error.
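
A minimal sketch of what such an EnumSubset wrapper could look like today; with current D the membership test can only be a run-time assert, while the proposed enum precondition would move it to compile time for literal arguments:

import std.algorithm.searching: canFind;

struct EnumSubset(members...) {
    alias E = typeof(members[0]);
    E _data;
    alias _data this;

    this(E x) {
        assert([members].canFind(x), "value is not in this enum subset");
        _data = x;
    }
}

enum Foo { A, B, C }
alias Foo2 = EnumSubset!(Foo.A, Foo.B);

void main() {
    Foo2 f = Foo.A; // OK
    Foo2 g = Foo.C; // Currently only a run-time AssertError.
}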

- - - - - - - - -

4) enum array literals, for table games, etc.:

Sometimes I like to use enums of chars:

enum Cell : char { A='A', B='B', C='C' }

Then later in the program you can use a with() to catch mistakes in literals:

with (Cell)
    foo([A,B,A,A,B,B,A,A,D], 3);

But putting them in a string is more convenient when such tables become larger (the Ada language allows something like this, and verifies at compile-time that such an array of enumerated values is correct!):

foo(ToEnumArray!"ABAABBAAD", 3);

ToEnumArray verifies that its string input is correct and turns it into an array, at compile-time (std.conv.to can't be used directly here because it handles strings differently when you want to convert them to enums).
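
A minimal sketch of what such a ToEnumArray helper can look like with current D (its body here is an assumption of this sketch, hard-coded to the Cell enum above):

enum Cell : char { A='A', B='B', C='C' } // The enum from above.

Cell[] toCells(in string s) pure {
    Cell[] result;
    foreach (c; s) {
        assert(c == 'A' || c == 'B' || c == 'C', "invalid Cell character");
        result ~= cast(Cell)c;
    }
    return result;
}

// The enum template forces the conversion (and the checks) to run with CTFE.
enum ToEnumArray(string s) = toCells(s);

void foo(in Cell[] cells, in size_t nCols) { /*...*/ }

void main() {
    foo(ToEnumArray!"ABAABBAA", 3);     // OK
    // foo(ToEnumArray!"ABAABBAAD", 3); // The 'D' is caught at compile time.
}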

With an enum pre-condition I can simplify the code (this foo will still need to convert the input string to an array of enums, perhaps even using just a cast):

foo("ABAABBAAD", 3);

- - - - - - - - -

5) In high-reliability D code you want to move as many tests of the input data as possible from run-time to compile-time, and to minimize the number of run-time asserts needed in the code. I think that D could (someday) be used to replace some of the usages of the Ada language for medium-integrity programs, like some of the control units used in cars. In such programs you call functions with various kinds of literals. Verifying such literals at compile-time could improve the run-time integrity of the program.

Here you can see a small D program that accepts complex literal data:
http://rosettacode.org/wiki/Universal_Turing_machine#Nearly_Strongly_Typed_Version

    enum States2 : ubyte { A, B, C, H }
    alias Symbols2 = Symbols1;
    alias M2 = UTM!(States2, Symbols2);
    M2.TuringMachine tm2;
    with (tm2) with (States2) with (Symbols2) with (M2.Direction) {
        alias R = M2.Rule;
        initialState = A;
        rules = [A: [s0: R(s1, right, B), s1: R(s1, left,  C)],
                 B: [s0: R(s1, left,  A), s1: R(s1, right, B)],
                 C: [s0: R(s1, left,  B), s1: R(s1, stay,  H)]];
        symbolMap = ["0", "1"];
    }


The Ada entry of this program is able to verify its whole input data at compile-time:
http://rosettacode.org/wiki/Universal_Turing_machine#Ada


type States is (A, B, C, Stop);
type Symbols is range 0 .. 1;
package UTM is new Turing(States, Symbols); use UTM;

Map: Symbol_Map := (1 => '1', 0 => '0');

Rules: Rules_Type :=
  (A => (0 => (New_State => B, Move_To => Right, New_Symbol => 1),
         1 => (New_State => C, Move_To => Left,  New_Symbol => 1)),
   B => (0 => (New_State => A, Move_To => Left,  New_Symbol => 1),
         1 => (New_State => B, Move_To => Right, New_Symbol => 1)),
   C => (0 => (New_State => B, Move_To => Left,  New_Symbol => 1),
         1 => (New_State => Stop, Move_To => Stay, New_Symbol => 1)));

Tape: Tape_Type := To_Tape("", Map);


enum pre-conditions could be used in D (with some changes in the D entry) to verify all the input at compile-time (again, currently you can write compile-time functions in D to perform similar tests, but the code becomes less nice and less natural, you need two functions, etc).

- - - - - - - - -

6) writef("%d", 5.5), format("%d", 5.5):

void writef(Char, A...)(in Char[] fmt, A args) ...
enum in (fmt) {
    // Verify fmt correctness
} ...


This would stop most of the writefln calls in my D code from giving format string errors at run-time, and it causes no binary bloat, unlike this template function call (which also can't be used with a run-time format string, unlike writef with an enum precondition, which needs only one function for both cases and allows writef to be retrofitted with no changes to its API):

cWritef!"%d"(5.5);
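
For comparison, a minimal sketch of a cWritef-style template with current D (the name cWritef and the helper are assumptions of this example, not Phobos API; only the number of % specifiers is checked here, while the enum precondition on writef could also verify the specifier/argument type pairs):

import std.stdio: writef;

size_t countSpecifiers(in string fmt) pure {
    size_t n = 0;
    for (size_t i = 0; i < fmt.length; i++) {
        if (fmt[i] == '%') {
            if (i + 1 < fmt.length && fmt[i + 1] == '%')
                i++; // "%%" is an escaped percent sign, not a specifier.
            else
                n++;
        }
    }
    return n;
}

void cWritef(string fmt, A...)(A args) {
    static assert(countSpecifiers(fmt) == A.length,
                  "format string doesn't match the number of arguments");
    writef(fmt, args);
}

void main() {
    cWritef!"%d %s\n"(10, "x"); // OK
    // cWritef!"%d %s\n"(10);   // Rejected at compile time.
}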

(Rust language uses a different strategy to verify those formatting strings at compile-time.)

- - - - - - - - -

I am sure more use cases could be found.

The ideas discussed here are also related to the __builtin_constant_p() of GCC:
http://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

DMD is already able to test some array bounds at compile-time, so a related feature is present inside the compiler. enum preconditions expose this compiler capability to some user-code usages, just as __traits often does.

Bye,
bearophile
