On 02/02/2012 04:21 PM, bearophile wrote:
Through Reddit I've found this good, long slide deck about using Java 
data structures to increase the memory efficiency of programs:

http://domino.research.ibm.com/comm/research_people.nsf/pages/sevitsky.pubs.html/$FILE/oopsla08%20memory-efficient%20java%20slides.pdf

Although the situation in D is different (there are structs, as in C#), it 
would be good to have weak and soft references in Phobos, and to have better 
memory analysis tools outside Phobos.

The slides reminded me of my wish for a column-oriented "struct array" in 
Phobos (some time ago someone wrote a minimal version for D1).

The usage is simple:


import std.stdio, std.conv;

struct Foo { // an example struct
     int x;
     float y;
     string s;

     this(int xx, float yy) {
         x = xx;
         y = yy;
         s = text(x);
     }

     float sum() {
         return x + y;
     }
}

void main() {
     auto a1 = new Foo[1000]; // normal, non-parallel array
     foreach (ref Foo f; a1)
         writeln(f.s, " ", f.sum());

     // default usage example of ParallelArray
     // 3 Foo fields stored as 3 separated arrays inside a2
     ParallelArray!Foo a2; // valid
     static assert(a2.sizeof == size_t.sizeof * 4); // 3 pointers + 1 length
     a2.length = 1000;
     foreach (ref Foo f; a2) // a Foo f is built on the fly
         writeln(f, " ", f.sum());
     a2[10] = Foo(1, 2); // the constructor sets s to "1"
     foreach (x; a2.x_array) // x_array is a property slice

Ideally this shouldn't require the property. The "natural" or auto type for iterating a ParallelArray should be a proxy value that defines properties for all the members and looks them up on demand. It would need just two words: a pointer to the parent ParallelArray and an index into it.
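
A rough sketch of what I mean by such a proxy. The names Columns and Row are hypothetical and are not part of the proposed ParallelArray; the three arrays are written out by hand here, whereas a real implementation would presumably generate them from the fields of Foo:

```d
import std.stdio;

// Hypothetical stand-in for ParallelArray!Foo: one array per field.
struct Columns
{
    int[] x;
    float[] y;
    string[] s;

    // Indexing hands out a two-word proxy instead of building a Foo.
    Row opIndex(size_t i) { return Row(&this, i); }
}

// The proxy: a pointer to the parent plus an index into it.
// Each property looks its field up on demand and returns it by reference.
struct Row
{
    Columns* parent;
    size_t index;

    @property ref int x() { return parent.x[index]; }
    @property ref float y() { return parent.y[index]; }
    @property ref string s() { return parent.s[index]; }
}

void main()
{
    auto cols = Columns([1, 2], [0.5f, 1.5f], ["1", "2"]);
    auto r = cols[1];
    r.x = 7; // writes through to cols.x[1]
    writeln(cols.x[1], " ", r.y, " ", r.s); // prints "7 1.5 2"
    static assert(Row.sizeof == 2 * size_t.sizeof);
}
```

Since the properties return by reference, reading and writing through the proxy touches only the columns that are actually used.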

         writeln(x);
     foreach (y; a2.y_array)
         writeln(y);
     foreach (s; a2.s_array)
         writeln(s);

     // specialized usage example of ParallelArray
     // x,y fields stored as an array, s field as another array
     ParallelArray!(Foo, "x y # s") a3; // valid
     static assert(a3.sizeof == size_t.sizeof * 3); // 2 pointers + 1 length
     a3.length = 1000;
     foreach (ref Foo f; a3) // a Foo f is built on the fly
         writeln(f, " ", f.sum());
     a3[10] = Foo(1, 2); // the constructor sets s to "1"
     foreach (xy; a3.x_y_array)
         writeln(xy.x, " ", xy.y);
     foreach (s; a3.s_array)
         writeln(s);

     // float z0 = a3.x_y_array[10].sum(); // invalid code
     ParallelArray!(Foo, "x # y # s") a4; // valid code
     // ParallelArray!(Foo, "x y # s x") a5; // invalid, dupe field x
     // ParallelArray!(Foo, "x # y") a6; // invalid, s field missing
     // so if you give a string with the field names, you need to
     // list them all, and only once each. Other designs are possible
     // but this is the simplest to use and implement.

     float z1 = a3[10].sum(); // a3[10] returns a Foo

     // a3(10) doesn't create a Foo; it fetches only what
     // .sum() needs, so it's faster if you have to call .sum()
     // on many records.
     // Which fields sum() needs is worked out at compile time.
     float z2 = a3(10).sum();

     // To keep the design simple, ParallelArray can't create 2D arrays
}
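
For reference, the default layout above (one array per field) could be approximated along these lines. This is only a minimal sketch under assumptions: SoA and column!"name" are hypothetical names, not the proposed ParallelArray API, there is no support for the "x y # s" grouping string, and Foo is reduced to a plain struct without the constructor:

```d
import std.stdio;
import std.meta : staticIndexOf, staticMap;
import std.traits : FieldNameTuple, Fields;

struct Foo { int x; float y; string s; }

// Hypothetical minimal column-oriented container: one array per field of T.
struct SoA(T) if (is(T == struct))
{
    private alias ArrayOf(U) = U[];
    // Expands to one dynamic array per field, in declaration order.
    staticMap!(ArrayOf, Fields!T) columns;

    @property size_t length() const { return columns[0].length; }

    // Resize every column together so they stay in step.
    @property void length(size_t n)
    {
        foreach (ref col; columns)
            col.length = n;
    }

    // Gather one record from the columns (a T is built on the fly).
    T opIndex(size_t i)
    {
        T result;
        foreach (j, ref col; columns)
            result.tupleof[j] = col[i];
        return result;
    }

    // Scatter one record into the columns.
    void opIndexAssign(T value, size_t i)
    {
        foreach (j, ref col; columns)
            col[i] = value.tupleof[j];
    }

    // Direct access to a single column by field name.
    ref auto column(string name)()
    {
        enum idx = staticIndexOf!(name, FieldNameTuple!T);
        static assert(idx >= 0, "no field named " ~ name);
        return columns[idx];
    }
}

void main()
{
    SoA!Foo a;
    a.length = 3;
    a[1] = Foo(1, 2.5f, "one");
    writeln(a[1].s);
    foreach (x; a.column!"x"())
        writeln(x);
}
```

Iterating a.column!"x"() touches only the int array, which is the cache behaviour the column-oriented layout is meant to give.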


Do you like it?
I have several uses for such a struct in my code.

Bye,
bearophile
