Hi Wolfgang,

Thanks for your reply!

Of course; I will open a pull request once the preconditioner is fully
commented and tested, although it is only a small addition to the code.

Sorry that I was not clear in explaining my problem. I have a block system to
solve, and inside my hand-written block preconditioner the ILU is used for
inverting the Schur complement: the ILU is built on an approximation of the
Schur complement and then used as the preconditioner for that inner solve.
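
To make that concrete, the structure is roughly the following (the class and
variable names are just placeholders from my code, and PreconditionPilut is
simply what I called my wrapper locally):

  // The Schur complement S = C A^{-1} B is never assembled; it is a small
  // class that only knows how to apply itself to a vector via vmult().
  class SchurComplement : public Subscriptor
  {
  public:
    void vmult(PETScWrappers::MPI::Vector       &dst,
               const PETScWrappers::MPI::Vector &src) const;
    // internally: tmp1 = B*src;  solve A*tmp2 = tmp1;  dst = C*tmp2;
  };

  // The ILU(T) is built on an assembled approximation of S (not on S
  // itself) and used as the preconditioner for the inner solve:
  PreconditionPilut schur_preconditioner;                    // my wrapper
  schur_preconditioner.initialize(approximate_schur_matrix); // assembled

  SolverControl control(1000, 1e-8);
  SolverGMRES<PETScWrappers::MPI::Vector> gmres(control); // deal.II GMRES,
                                                          // since S is
                                                          // matrix-free
  gmres.solve(schur_complement, dst, src, schur_preconditioner);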

I tried different numbers of DoFs as you suggested and the result is very
strange: for the same case at different levels of global refinement, only the
run with about 26k DoFs is faster on 2 cores than on 1; both coarser and finer
meshes make the solver slower in parallel (I also tried BlockJacobi and the
results are almost the same). Could it be that something is wrong with my
communication, or that the matrix is not actually distributed? I don't
understand it.
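
To rule out the second possibility, I suppose I can print the locally owned
row range of the relevant block on every rank and check that the rows are
really split between the processes (system_matrix, the block indices, and
mpi_communicator below are just the names from my code), something like:

  // Print which rows of this block each rank owns; if rank 0 owns all of
  // them, the matrix is not actually distributed.
  const auto range = system_matrix.block(1, 1).local_range();
  std::cout << "Rank " << Utilities::MPI::this_mpi_process(mpi_communicator)
            << " owns rows [" << range.first << ", " << range.second << ")"
            << " out of " << system_matrix.block(1, 1).m() << std::endl;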

Thanks!
Feimi





On Tuesday, March 27, 2018 at 1:45:41 PM UTC-4, Wolfgang Bangerth wrote:
>
>
> > I'm going to solve a Schur complement in my preconditioner for a 
> > SUPG-stabilized fluid solver and will be using an ILU(0) decomposition. 
>
> That's generally not a good preconditioner -- it will lead to large 
> numbers of iterations in your linear solver if the problem becomes 
> large. But that's tangential to your point... 
>
>
> > Since BlockJacobi did not turn out to work well, I wrote a wrapper for 
> > Pilut, Hypre's ILUT decomposition package, which is accessible through 
> > PETSc. I tested my Pilut preconditioner with tutorial step-17 (the MPI 
> > elasticity problem) and everything looks OK. 
>
> Interesting. Would you be willing to contribute this wrapper? 
>
>
> > However, when I apply it to my own code, running with 2 ranks takes more 
> > time than 1, specifically in solving the Schur complement, where my Pilut 
> > is applied. 
> > I tried to output the total iteration count of the solver, and with more 
> > ranks fewer iterations are actually used. I don't understand why fewer 
> > iterations take more time. Is there any potential reason for that? 
>
> It often comes down to how much communication you do. If your problem is 
> small, then there is not very much work to do for each process, and much 
> time is spent on communication between processes. A typical rule of 
> thumb is that your problem needs to have at least 50,000 to 100,000 
> unknowns per process for computations to offset communication. 
>
>
> > Something that came to my mind, but I'm not sure about: my Schur 
> > complement is actually not a matrix, but a class that only defines 
> > vmult. For that reason, my solver is a dealii::SolverGMRES instead of a 
> > PETScWrappers::SolverGMRES. 
> > However, my tests with step-17 did not show significant differences 
> > between these two. 
>
> I don't think that is the problem -- the two GMRES implementations are 
> comparable in performance. 
>
> Best 
>   W. 
>
> -- 
> ------------------------------------------------------------------------ 
> Wolfgang Bangerth          email:                 bang...@colostate.edu 
>                             www: http://www.math.colostate.edu/~bangerth/ 
>
