https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120835
--- Comment #7 from Benjamin Schulz <schulz.benjamin at googlemail dot com> ---
Note that the matrix multiplication only goes wrong at higher -O levels when it is called from within another function that does computations.

The correct results are these:

A Cholesky decomposition with the multiplication on gpu
4 12 -16
12 37 -43
-16 -43 98
2 0 0
6 1 0
-8 5 3
Now the cholesky decomposition is entirely done on gpu
2 0 0
6 1 0
-8 5 3
Now we do the same with the lu decomposition
1 -2 -2 -3
3 -9 0 -9
-1 2 4 7
-3 -6 26 2
Just the multiplication on gpu
1 0 0 0
3 1 0 0
-1 -0 1 0
-3 4 -2 1
1 -2 -2 -3
0 -3 6 0
0 0 2 4
0 0 0 1
Entirely on gpu
1 0 0 0
3 1 0 0
-1 -0 1 0
-3 4 -2 1
1 -2 -2 -3
0 -3 6 0
0 0 2 4
0 0 0 1

With -O1, one gets this instead. The last diagonal entry of the Cholesky factor is 9.89949 rather than 3 in both variants, and the last diagonal entry of U is 2 rather than 1 in the "Just the multiplication on gpu" variant of the LU decomposition:

A Cholesky decomposition with the multiplication on gpu
4 12 -16
12 37 -43
-16 -43 98
2 0 0
6 1 0
-8 5 9.89949
Now the cholesky decomposition is entirely done on gpu
2 0 0
6 1 0
-8 5 9.89949
Now we do the same with the lu decomposition
1 -2 -2 -3
3 -9 0 -9
-1 2 4 7
-3 -6 26 2
Just the multiplication on gpu
1 0 0 0
3 1 0 0
-1 -0 1 0
-3 4 -2 1
1 -2 -2 -3
0 -3 6 0
0 0 2 4
0 0 0 2
Entirely on gpu
1 0 0 0
3 1 0 0
-1 -0 1 0
-3 4 -2 1
1 -2 -2 -3
0 -3 6 0
0 0 2 4
0 0 0 1
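
For comparison, here is a minimal CPU-only sketch of the Cholesky factorization of the 3x3 input matrix shown above. It is not the code from the reproducer: the names `Mat3` and `cholesky_reference` and the parameter `drop_sum_at` are made up for illustration, and no offloading is involved. Under these assumptions it yields the expected last diagonal entry 3 = sqrt(98 - 64 - 25). If the subtraction of the accumulated dot product is dropped for that one entry, it yields 9.89949 = sqrt(98), which is the value printed with -O1. That suggests, but does not prove, that the accumulated sum is lost for this element.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical names for illustration; not taken from the bug's reproducer.
using Mat3 = std::array<std::array<double, 3>, 3>;

// Plain CPU Cholesky factorization: returns lower-triangular L with A = L * L^T.
// drop_sum_at (assumption, for experimenting only): if it equals a row index i,
// the dot product is NOT subtracted when computing the diagonal entry L[i][i].
// The default -1 gives the ordinary algorithm.
Mat3 cholesky_reference(const Mat3& a, int drop_sum_at = -1) {
    Mat3 l{};  // value-initialized: all entries start at 0
    for (std::size_t i = 0; i < 3; ++i) {
        for (std::size_t j = 0; j <= i; ++j) {
            // sum = L[i][0..j-1] . L[j][0..j-1]
            double sum = 0.0;
            for (std::size_t k = 0; k < j; ++k) {
                sum += l[i][k] * l[j][k];
            }
            if (i == j) {
                if (static_cast<int>(i) == drop_sum_at) {
                    sum = 0.0;  // mimic a lost update on this diagonal entry
                }
                assert(a[i][i] - sum > 0.0);  // input must be positive definite
                l[i][j] = std::sqrt(a[i][i] - sum);
            } else {
                l[i][j] = (a[i][j] - sum) / l[j][j];
            }
        }
    }
    return l;
}
```

Calling `cholesky_reference(a)` on the matrix with rows (4, 12, -16), (12, 37, -43), (-16, -43, 98) gives the factor listed under the correct results, while `cholesky_reference(a, 2)` changes only the last diagonal entry.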