https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70855

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 38468
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38468&action=edit
gcc7-pr70855.patch

Untested simple fix (that is backportable too).
If we want to parallelize this, I'd say the right thing would still be to
disable the inlining during the frontend passes when inside omp workshare, make
the inline_matmul_assign function non-static, and during omp workshare
translation call it with some special arguments that would arrange for the
result to be properly parallelized.  We'd need to ensure that the c = 0
clearing is split among threads the same way as the following loop, and that
each entry in the c array is only set and modified by the same thread.
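
To illustrate the last requirement, here is a minimal C/OpenMP sketch of the
partitioning idea.  It is not code from the attached patch or from gfortran;
the function name matmul_workshare_sketch and the column-major layout are
assumptions made for the example.  Both loops run over the columns of c with
schedule(static) inside the same parallel region, so for equal iteration counts
OpenMP assigns the same columns to the same thread in both loops, and every
entry of c is cleared and accumulated by one thread only.

```c
#include <stddef.h>

/* Hypothetical helper, not GCC code: computes c = a * b, where a is m x k,
   b is k x n and c is m x n, all stored column-major as in Fortran.  */
void
matmul_workshare_sketch (int m, int n, int k,
                         const double *a, const double *b, double *c)
{
#pragma omp parallel
  {
    /* The "c = 0" clearing, split over columns of c.  nowait is safe here
       only because the next loop uses the same static schedule and the same
       iteration count, so each thread later touches only columns it has
       cleared itself.  */
#pragma omp for schedule(static) nowait
    for (int j = 0; j < n; j++)
      for (int i = 0; i < m; i++)
        c[(size_t) i + (size_t) j * m] = 0.0;

    /* The accumulation loop, split over the same columns of c.  */
#pragma omp for schedule(static)
    for (int j = 0; j < n; j++)
      for (int l = 0; l < k; l++)
        for (int i = 0; i < m; i++)
          c[(size_t) i + (size_t) j * m]
            += a[(size_t) i + (size_t) l * m] * b[(size_t) l + (size_t) j * k];
  }
}
```

Without -fopenmp the pragmas are ignored and the function runs serially with
the same result.  If the two loops were scheduled differently (for example
dynamic), a barrier between them would be needed instead of nowait.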
