Hello again,

Thanks to previous thread on multidimensional arrays, I managed to play around with pure D matrix representations and even benchmark a little against numpy:

+-------------------------------------------------------------+------------+-----------+
| benchmark | time (sec) | vs Numpy |
+-------------------------------------------------------------+------------+-----------+
| Sum of two [5000, 6000] int array of arrays | ~0.28 | x4.5 | | Multiplication of two [5000, 6000] double array of arrays | ~0.3 | x2.6 | | Sum of two [5000, 6000] int struct matrices | ~0.039 | x0.6 | | Multiplication of two [5000, 6000] double struct matrices | ~0.135 | x1.2 | | L2 norm of [5000, 6000] double struct matrix | ~0.015 | x15 | | Sort of [5000, 6000] double struct matrix (axis=-1) | ~2.435 | x1.9 | | Dot product of [500, 600]&[600, 500] double struct matrices | ~0.172 | -- |
+-------------------------------------------------------------+------------+-----------+

However, there is one benchmark I am trying to make at least a little comparable. That is the dot product of two struct matrices. Concretely [A x B] @ [B x A] = [A, A] operation. There is a dotProduct function in std.numeric but it works with 1D ranges only.

After it was clear that array of arrays are not very good to represent multidimensional data, I used struct to represent a multidimensional arrays like so:

******************************************************************
struct Matrix(T)
{
T[] data; // keep our data as 1D array and reshape to 2D when needed
    int rows;
    int cols;
    // allow Matrix[] instead of Matrix.data[]
    alias data this;

    this(int rows, int cols)
    {
        this.data = new T[rows * cols];
        this.rows = rows;
        this.cols = cols;
    }

    this(int rows, int cols, T[] data)
    {
        assert(data.length == rows * cols);
        this.data = data;
        this.rows = rows;
        this.cols = cols;
    }

    T[][] to2D()
    {
        return this.data.chunks(this.cols).array;
    }

    /// Allow element 2D indexing, e.g. Matrix[row, col]
    T opIndex(in int r, in int c)
    {
        return this.data[toIdx(this, r, c)];
    }

}

pragma(inline) static int toIdx(T)(Matrix!T m, in int i, in int j)
{
    return m.cols * i + j;
}
******************************************************************

And here is the dot product function:

******************************************************************
Matrix!T matrixDotProduct(T)(Matrix!T m1, Matrix!T m2)
in
{
    assert(m1.rows == m2.cols);
}
do
{
    Matrix!T m3 = Matrix!T(m1.rows, m2.cols);

    for (int i; i < m1.rows; ++i)
    {
        for (int j; j < m2.cols; ++j)
        {
            for (int k; k < m2.rows; ++k)
            {
                m3.data[toIdx(m3, i, j)] += m1[i, k] * m2[k, j];
            }
        }
    }
    return m3;
}
******************************************************************

However, attempting to run dotProduct on two 5000x6000 struct Matrices took ~20 min while 500x600 only 0.172 sec. And I wondered if there is something really wrong with the matrixDotProduct function.

I can see that accessing the appropriate array member in Matrix.data is costly due to toIdx operation but, I can hardly explain why it gets so much costly. Maybe there is a better way to do it after all?

Reply via email to