Re: [petsc-dev] Broken MatMatMult_MPIAIJ_MPIDense

Smith, Barry F. via petsc-dev Sun, 22 Sep 2019 09:33:40 -0700

   Ok. So we definitely need better error checking and to clean up the code, 
comments and docs


   As the approaches for these computations of products get more complicated it 
becomes a bit harder to support the use of a raw product matrix so I don't 
think we want to add all the code needed to call the symbolic part (after the 
fact) when the matrix is raw. Would that make things terribly difficult for you 
not being able to use a raw matrix?

   I suspect that the dense case was just lucky that using a raw matrix worked. 

   The removal of the de facto support for REUSE on the raw matrix should be 
added to the changes document.

   Sorry for the difficulties. We have trouble testing all the combinations of 
possible usage, even a coverage tool would not have indicated a problems the 
lack of lda support. 

  Hong,

     Can you take a look at these things on Monday and maybe get a clean into a 
MR so it gets into the release?

   Thanks


   Barry





> On Sep 22, 2019, at 11:12 AM, Pierre Jolivet <pierre.joli...@enseeiht.fr> 
> wrote:
> 
> 
>> On 22 Sep 2019, at 6:03 PM, Smith, Barry F. <bsm...@mcs.anl.gov> wrote:
>> 
>> 
>> 
>>> On Sep 22, 2019, at 10:14 AM, Pierre Jolivet via petsc-dev 
>>> <petsc-dev@mcs.anl.gov> wrote:
>>> 
>>> FWIW, I’ve fixed MatMatMult and MatTransposeMatMult here 
>>> https://gitlab.com/petsc/petsc/commit/93d7d1d6d29b0d66b5629a261178b832a925de80
>>>  (with MAT_INITIAL_MATRIX).
>>> I believe there is something not right in your MR (2032) with 
>>> MAT_REUSE_MATRIX (without having called MAT_INITIAL_MATRIX first), cf. 
>>> https://gitlab.com/petsc/petsc/merge_requests/2069#note_220269898.
>>> Of course, I’d love to be proved wrong!
>> 
>>   I don't understand the context.
>> 
>>    MAT_REUSE_MATRIX requires that the C matrix has come from a previous call 
>> with MAT_INITIAL_MATRIX, you cannot just put any matrix in the C location.
> 
> 1) It was not the case before the MR, I’ve used that “feature” (which may be 
> specific for MatMatMult_MPIAIJ_MPIDense) for as long as I can remember
> 2) If it is not the case anymore, I think it should be mentioned somewhere 
> (and not only in the git log, because I don’t think all users will go through 
> that)
> 3) This comment should be removed from the code as well: 
> https://www.mcs.anl.gov/petsc/petsc-dev/src/mat/impls/aij/mpi/mpimatmatmult.c.html#line398
> 
>> This is documented in the manual page. We should have better error checking 
>> that this is the case so the code doesn't crash at memory access but instead 
>> produces a very useful error message if the matrix was not obtained with 
>> MAT_INITIAL_MATRIX. 
>> 
>>   Is this the issue or do I not understand?
> 
> This is exactly the issue.
> 
>>   Barry
>> 
>> BTW: yes MAT_REUSE_MATRIX has different meanings for different matrix 
>> operations in terms of where the matrix came from, this is suppose to be all 
>> documented in each methods manual page but some may be missing or 
>> incomplete, and error checking is probably not complete for all cases.  
>> Perhaps the code should be changed to have multiple different names for each 
>> reuse case for clarity to user?
> 
> Definitely, cf. above.
> 
> Thanks,
> Pierre
> 
>>> 
>>> Thanks,
>>> Pierre
>>> 
>>>> On 22 Sep 2019, at 5:04 PM, Zhang, Hong <hzh...@mcs.anl.gov> wrote:
>>>> 
>>>> I'll check it tomorrow.
>>>> Hong
>>>> 
>>>> On Sun, Sep 22, 2019 at 1:04 AM Pierre Jolivet via petsc-dev 
>>>> <petsc-dev@mcs.anl.gov> wrote:
>>>> Jed,
>>>> I’m not sure how easy it is to put more than a few lines of code on 
>>>> GitLab, so I’ll just send the (tiny) source here, as a follow-up of our 
>>>> discussion 
>>>> https://gitlab.com/petsc/petsc/merge_requests/2069#note_220229648.
>>>> Please find attached a .cpp showing the brokenness of C=A*B with A of type 
>>>> MPIAIJ and B of type MPIDense when the LDA of B is not equal to its number 
>>>> of local rows.
>>>> It does [[1,1];[1,1]] * [[0,1,2,3];[0,1,2,3]]
>>>> C should be equal to 2*B, but it’s not, unless lda = m (= 1).
>>>> Mat Object: 2 MPI processes
>>>>  type: mpidense
>>>> 0.0000000000000000e+00 1.0000000000000000e+00 2.0000000000000000e+00 
>>>> 3.0000000000000000e+00
>>>> 0.0000000000000000e+00 1.0000000000000000e+00 2.0000000000000000e+00 
>>>> 3.0000000000000000e+00
>>>> 
>>>> If you change Bm here 
>>>> https://www.mcs.anl.gov/petsc/petsc-dev/src/mat/impls/aij/mpi/mpimatmatmult.c.html#line549
>>>>  to the LDA of B, you’ll get the correct result.
>>>> Mat Object: 2 MPI processes
>>>>  type: mpidense
>>>> 0.0000000000000000e+00 2.0000000000000000e+00 4.0000000000000000e+00 
>>>> 6.0000000000000000e+00
>>>> 0.0000000000000000e+00 2.0000000000000000e+00 4.0000000000000000e+00 
>>>> 6.0000000000000000e+00
>>>> 
>>>> Unfortunately, w.r.t. MR 2069, I still don’t get the same results with a 
>>>> plain view LDA > m (KO) and a view + duplicate LDA = m (OK).
>>>> So there might be something else to fix (or this might not even be a 
>>>> correct fix), but the only reproducer I have right now is the full solver.
>>>> 
>>>> Thanks,
>>>> Pierre
>>>> 
>>> 
>> 
>

Re: [petsc-dev] Broken MatMatMult_MPIAIJ_MPIDense

Reply via email to