Hello Professor,

Thanks for the reply. 
I have been struggling with this issue for 5 days now. I raised a ticket on 
the XSEDE forum on January 18th.
The technical team believed everything was fine on their side and advised 
me to reinstall with some different modules loaded. They also suspected 
that different MPI components were being mixed.

I actually tried every possible combination of modules (over 20). When 
nothing worked, I finally decided to seek help from this forum. I was 
hesitant at first because I was fairly sure the problem was not with the 
library or its installation but with the system itself; I just needed some 
guidance to confirm this.

However, I finally received an email yesterday (after posting this) saying 
that a security update they made to the system last week had caused the 
issue. After they corrected the change, everything works fine using just 
candi.

Thanks a lot for the help. I really appreciate it.

I am still posting the answers to the questions you asked.

1. The error always occurred at a different position when run with a 
different number of processes on a single node.

2. The error also occurred when running on multiple nodes.

On Monday, January 22, 2018 at 11:09:41 PM UTC-5, Wolfgang Bangerth wrote:
>
> On 01/22/2018 08:48 AM, RAJAT ARORA wrote: 
> > 
> > Running with PETSc on 2 MPI rank(s)... 
> > Cycle 0: 
> >     Number of active cells:       1024 
> >     Number of degrees of freedom: 4225 
> >     Solved in 10 iterations. 
> > 
> > +---------------------------------------------+------------+------------+ 
> > | Total wallclock time elapsed since start    |     0.222s |            | 
> > |                                             |            |            | 
> > | Section                         | no. calls |  wall time | % of total | 
> > +---------------------------------+-----------+------------+------------+ 
> > | assembly                        |         1 |     0.026s |        12% | 
> > | output                          |         1 |    0.0448s |        20% | 
> > | setup                           |         1 |    0.0599s |        27% | 
> > | solve                           |         1 |    0.0176s |       7.9% | 
> > +---------------------------------+-----------+------------+------------+ 
> > 
> > Cycle 1: 
> >     Number of active cells:       1960 
> >     Number of degrees of freedom: 8421 
> > r001.pvt.bridges.psc.edu.27927Assertion failure at /nfs/site/home/phcvs2/gitrepo/ifs-all/Ofed_Delta/rpmbuild/BUILD/libpsm2-10.3.3/ptl_am/ptl.c:152: 
> > nbytes == req->recv_msglen 
> > r001.pvt.bridges.psc.edu.27927step-40: Reading from remote process' memory failed. Disabling CMA support 
> > [r001:27927] *** Process received signal *** 
>
> These error messages suggest that the first cycle actually worked. So your 
> MPI installation is not completely broken apparently. 
>
> Is the error message reproducible? Is it always in the same place and with 
> the same message? When you run two processes, are they on separate machines 
> or on the same one? 
>
> Best 
>   W. 
>
> -- 
> ------------------------------------------------------------------------ 
> Wolfgang Bangerth          email:                 bang...@colostate.edu 
>                             www: http://www.math.colostate.edu/~bangerth/ 
>
>
