mike0609king opened a new pull request, #2196:
URL: https://github.com/apache/systemds/pull/2196
[SYSTEMDS-3778] Introduce determinant computation primitives
This PR introduces the computation of the determinant of a matrix and provides different strategies for computing it. Internally, it is possible to switch between calculating the determinant using the built-in LU decomposition of the Commons Math library, the Gaussian algorithm, the Bareiss algorithm, and Laplace expansion. Furthermore, the following static simplification rewrites have been implemented:
- `det(t(X)) -> det(X)`
- `det(X%*%Y) -> det(X)*det(Y)`
- `det(lambda*X) -> lambda^nrow(X)*det(X)`
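
To make the rewrites concrete, here is a small, self-contained Java sketch that checks the three identities numerically via the Commons Math LU determinant; the class and helper names are illustrative only and not part of this PR:

```java
import org.apache.commons.math3.linear.LUDecomposition;
import org.apache.commons.math3.linear.MatrixUtils;
import org.apache.commons.math3.linear.RealMatrix;

public class DetRewriteCheck {

    // Determinant via the Commons Math LU decomposition, as referenced above.
    static double det(RealMatrix m) {
        return new LUDecomposition(m).getDeterminant();
    }

    public static void main(String[] args) {
        RealMatrix X = MatrixUtils.createRealMatrix(new double[][] {{4, 7}, {2, 6}});
        RealMatrix Y = MatrixUtils.createRealMatrix(new double[][] {{3, 1}, {5, 2}});
        double lambda = 2.5;
        int n = X.getRowDimension();

        // det(t(X)) -> det(X)
        System.out.println(det(X.transpose()) + " == " + det(X));
        // det(X %*% Y) -> det(X) * det(Y)
        System.out.println(det(X.multiply(Y)) + " == " + det(X) * det(Y));
        // det(lambda * X) -> lambda^nrow(X) * det(X)
        System.out.println(det(X.scalarMultiply(lambda)) + " == " + Math.pow(lambda, n) * det(X));
    }
}
```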
All strategies have been benchmarked using 12x12 matrices; very large matrices can cause the Laplace algorithm to take too long. The reported numbers are the average execution times of three runs with different matrix seeds (7, 233, 4242). Furthermore, we distinguish between sparse and dense matrices, because each algorithm has its own way of optimizing for sparse inputs.
| Strategy | Total execution time (dense matrix) | Total execution time (sparse matrix) |
|---|---|---|
| LU decomposition | 0.403 | 0.056 |
| Gaussian algorithm | 0.364 | 0.049 |
| Bareiss algorithm | 0.370 | 0.054 |
| Laplace expansion | 0.389 | 6.553 |
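
For context, this is roughly what the Gaussian strategy computes: a minimal illustrative Java sketch (not the SystemDS implementation) of elimination with partial pivoting, accumulating the product of the pivots:

```java
// Illustrative sketch: determinant via Gaussian elimination with partial pivoting.
static double detGaussian(double[][] in) {
    int n = in.length;
    double[][] a = new double[n][];
    for (int i = 0; i < n; i++)
        a[i] = in[i].clone(); // elimination is destructive, so work on a copy
    double det = 1.0;
    for (int k = 0; k < n; k++) {
        // partial pivoting: pick the row with the largest absolute pivot
        int p = k;
        for (int i = k + 1; i < n; i++)
            if (Math.abs(a[i][k]) > Math.abs(a[p][k]))
                p = i;
        if (a[p][k] == 0.0)
            return 0.0; // singular matrix
        if (p != k) {
            double[] tmp = a[p]; a[p] = a[k]; a[k] = tmp;
            det = -det; // each row swap flips the sign of the determinant
        }
        det *= a[k][k];
        for (int i = k + 1; i < n; i++) {
            double factor = a[i][k] / a[k][k];
            for (int j = k; j < n; j++)
                a[i][j] -= factor * a[k][j];
        }
    }
    return det;
}
```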
Further observations:
- The Laplace expansion is slow for big matrices and should not be used.
- The Gaussian algorithm is slightly faster than the Bareiss algorithm and the LU decomposition.
- Analytical observation: the Bareiss algorithm can be more accurate if whole numbers are used, because the divisions in the algorithm then have no remainder (see the sketch after this list).
- The Gaussian and Bareiss algorithms show deviations larger than eps if the matrix is transposed (`det(t(X))` in R vs `det(X)` in SystemDS). The LU decomposition does not have this issue, which could indicate that it is more robust against floating-point errors.
- The Gaussian algorithm shows floating-point errors larger than 1e-8 compared to the equivalent implementation in R and does not pass the test for matrix sizes around 30x30. The LU decomposition and the Bareiss algorithm are suspected to have lower floating-point errors.
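
To make the whole-number argument concrete, here is a minimal Java sketch of the Bareiss (fraction-free) elimination; it is illustrative only and not the code added in this PR. Every division is by the pivot of the previous step and is exact for integer inputs, which is where the accuracy advantage comes from:

```java
// Illustrative sketch of the Bareiss fraction-free elimination.
static double detBareiss(double[][] in) {
    int n = in.length;
    double[][] a = new double[n][];
    for (int i = 0; i < n; i++)
        a[i] = in[i].clone(); // elimination is destructive, so work on a copy
    double sign = 1.0;
    double prev = 1.0; // pivot of the previous step, divides exactly for whole numbers
    for (int k = 0; k < n - 1; k++) {
        if (a[k][k] == 0.0) {
            // find a non-zero pivot below and swap rows (flips the sign)
            int p = k + 1;
            while (p < n && a[p][k] == 0.0)
                p++;
            if (p == n)
                return 0.0; // singular matrix
            double[] tmp = a[p]; a[p] = a[k]; a[k] = tmp;
            sign = -sign;
        }
        for (int i = k + 1; i < n; i++) {
            for (int j = k + 1; j < n; j++)
                a[i][j] = (a[i][j] * a[k][k] - a[i][k] * a[k][j]) / prev;
            a[i][k] = 0.0;
        }
        prev = a[k][k];
    }
    return sign * a[n - 1][n - 1]; // last pivot is the determinant
}
```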
It is advised to never use the Laplace expansion because of its inefficiency; its runtime is factorial in the matrix size. These observations led to using the LU decomposition because of its higher accuracy, despite it being a bit slower than the alternatives.
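
To illustrate the factorial runtime, here is a minimal recursive cofactor-expansion sketch (a hypothetical helper, not the SystemDS code): each call on an n x n matrix triggers n calls on (n-1) x (n-1) minors, so T(n) = n * T(n-1) = O(n!):

```java
// Illustrative recursive Laplace (cofactor) expansion along the first row.
static double detLaplace(double[][] a) {
    int n = a.length;
    if (n == 1)
        return a[0][0];
    double det = 0.0;
    for (int j = 0; j < n; j++) {
        // build the (n-1) x (n-1) minor that removes row 0 and column j
        double[][] minor = new double[n - 1][n - 1];
        for (int i = 1; i < n; i++)
            for (int c = 0, m = 0; c < n; c++)
                if (c != j)
                    minor[i - 1][m++] = a[i][c];
        double sign = (j % 2 == 0) ? 1.0 : -1.0;
        det += sign * a[0][j] * detLaplace(minor); // n recursive calls of size n-1
    }
    return det;
}
```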
The reasoning behind the rule `det(lambda*X) -> lambda^nrow(X)*det(X)` is that computing the power of a scalar can be done in logarithmic time, whereas multiplying a scalar with a matrix is quadratic in the matrix dimension. Another reason for this rewrite is that it enables simplifications of expressions such as `det(lambda*t(X)) -> lambda^nrow(X)*det(X)`, which would be harder to implement without it.
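
A small sketch of that cost argument, with hypothetical helper names that are not part of this PR:

```java
// O(log n): exponentiation by squaring for lambda^nrow(X).
static double scalarPow(double base, int exp) {
    double result = 1.0;
    double b = base;
    int e = exp;
    while (e > 0) {
        if ((e & 1) == 1)
            result *= b;
        b *= b;
        e >>= 1;
    }
    return result;
}

// O(n^2): what the unrewritten plan has to do before taking the determinant,
// namely touch every one of the n*n entries once to scale the matrix.
static double[][] scale(double[][] x, double lambda) {
    int n = x.length;
    double[][] out = new double[n][n];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            out[i][j] = lambda * x[i][j];
    return out;
}

// With the rewrite: det(lambda*X) == scalarPow(lambda, n) * det(X),
// so the quadratic scaling pass is avoided entirely.
```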