Re: [deal.II] Re: Parallel DoFRenumbering does not work on mac

2018-01-27 Thread Praveen C
Attached is the simplest code that shows problem for me on mac osx 10.13.3 + mpich

Ran in debug mode

without renumbering: runs fine

with cuthill_mckee renumbering

$ mpirun -np 2 ./step-40

===
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 4712 RUNNING AT Praveens-MacBook-Pro.local
=   EXIT CODE: 6
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Abort trap: 6 (signal 6)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions

So the problem seems to occur inside renumbering itself.

Best
praveen



-- 
The deal.II project is located at http://www.dealii.org/
For mailing list/forum options, see https://groups.google.com/d/forum/dealii?hl=en
--- 
You received this message because you are subscribed to the Google Groups "deal.II User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dealii+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


step-40.cc
Description: Binary data






Re: [deal.II] Re: Parallel DoFRenumbering does not work on mac

2018-01-27 Thread Praveen C
I ran the example code with renumbering on my macbook with mpich + osx 10.13 
and I get same type of crash when running in parallel.

I recently switched to mpich since I was having some memory leak with openmpi 
in one of my applications (not deal.II related).

I have not extensively tested my deal.II programs with mpich yet.

Best
praveen

> On 28-Jan-2018, at 9:53 AM, Wolfgang Bangerth  wrote:
> 
> On 01/27/2018 09:17 PM, Jie Cheng wrote:
>> I finally fixed it! If there is anyone who has the same problem as I
>> described: replacing MPICH with open-mpi resolves this issue. I do not know
>> why, but this method has fixed two macs for me: I have never installed
>> open-mpi before so there were no conflicts between MPICH and open-mpi. I
>> still think something is wrong among dealii, mpich and mac OS 10.13.
> 
> Jie -- I'm glad to hear you made it work!
> 
> Everyone who is in software for long enough will be prudent to say that it's 
> likely that these interactions exist. It's possible that dealii, mpich and 
> mac OS 10.13 don't work well together. The difficulty is finding a testcase 
> that clearly demonstrates that this is so, and then being able to understand 
> what exactly is happening. I tried the program you sent earlier this week and 
> it works just fine for me. I think that without a debugger backtrace and 
> other information, it's going to be difficult to figure out what concretely 
> doesn't work. But it's good to know that at least you have found a workaround.
> 
> Best
> W.



Re: [deal.II] Re: Parallel DoFRenumbering does not work on mac

2018-01-27 Thread Wolfgang Bangerth

On 01/27/2018 09:17 PM, Jie Cheng wrote:


> I finally fixed it! If there is anyone who has the same problem as I
> described: replacing MPICH with open-mpi resolves this issue. I do not know
> why, but this method has fixed two macs for me: I have never installed
> open-mpi before so there were no conflicts between MPICH and open-mpi. I
> still think something is wrong among dealii, mpich and mac OS 10.13.


Jie -- I'm glad to hear you made it work!

Everyone who is in software for long enough will be prudent to say that it's 
likely that these interactions exist. It's possible that dealii, mpich and mac 
OS 10.13 don't work well together. The difficulty is finding a testcase that 
clearly demonstrates that this is so, and then being able to understand what 
exactly is happening. I tried the program you sent earlier this week and it 
works just fine for me. I think that without a debugger backtrace and other 
information, it's going to be difficult to figure out what concretely doesn't 
work. But it's good to know that at least you have found a workaround.


Best
 W.

--

Wolfgang Bangerth  email: bange...@colostate.edu
   www: http://www.math.colostate.edu/~bangerth/
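
When no debugger session is available, one way to get at least a rough backtrace for an "Abort trap: 6" is to print the call stack from a SIGABRT handler. The following is only a minimal sketch under stated assumptions: the helper names print_backtrace and install_abort_backtrace are hypothetical and not part of deal.II or MPI, and backtrace() / backtrace_symbols_fd() from <execinfo.h> are assumed to be available (they are on glibc and macOS).

```cpp
#include <csignal>
#include <execinfo.h> // backtrace(), backtrace_symbols_fd(): glibc and macOS
#include <unistd.h>   // STDERR_FILENO

// Hypothetical helper: write the current call stack to stderr and
// return the number of stack frames that were captured.
int print_backtrace()
{
  void *frames[64];
  const int n_frames = backtrace(frames, 64);
  // backtrace_symbols_fd writes directly to the file descriptor and does
  // not allocate memory for the symbol strings.
  backtrace_symbols_fd(frames, n_frames, STDERR_FILENO);
  return n_frames;
}

// Signal handler: print the stack, then restore the default action and
// re-raise the signal so the process still terminates with SIGABRT.
extern "C" void abort_handler(int sig)
{
  print_backtrace();
  std::signal(sig, SIG_DFL);
  std::raise(sig);
}

// Hypothetical helper: call once at the start of main().
void install_abort_backtrace()
{
  std::signal(SIGABRT, abort_handler);
}
```

Calling install_abort_backtrace() as the first statement of main() in step-40.cc would be enough to try this. The frames are only readable if the program and libraries carry symbol information, and each MPI rank prints its own stack, so the output of several ranks may interleave.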



Re: [deal.II] Re: Parallel DoFRenumbering does not work on mac

2018-01-27 Thread Jie Cheng
Hi Bruno and Wolfgang

I finally fixed it! If there is anyone who has the same problem as I
described: replacing MPICH with open-mpi resolves this issue. I do not know
why, but this method has fixed two macs for me: I have never installed
open-mpi before so there were no conflicts between MPICH and open-mpi. I
still think something is wrong among dealii, mpich and mac OS 10.13.

Jie



Re: [deal.II] Re: Parallel DoFRenumbering does not work on mac

2018-01-23 Thread Jie Cheng
Hi Wolfgang

Here is the program. I did nothing but add a
DoFRenumbering::Cuthill_McKee(dof_handler) call at line 251. I tried to debug
with lldb but did not gain any useful information. I will check the video
again to make sure I was doing it right.

Thank you very much!
Jie



On Tue, Jan 23, 2018 at 11:28 PM Wolfgang Bangerth 
wrote:

> On 01/23/2018 01:16 PM, Jie Cheng wrote:
> > > These are just warnings -- what happens if you run the executable?
> >
> > If I do not modify step-40.cc, it runs fine both in serial and parallel.
> > After I added DoFRenumbering in it, it crashes in parallel. Does
> > DoFRenumbering have any dependency?
>
> Not that I know of. Remind me which function in that namespace you are using?
>
> > As I posted in previous messages, the debugger does not help
> > much. The same problem happens on another mac in my lab which has Mac OS
> > 10.13.2 installed.
>
> This is certainly strange. I wouldn't know what might cause this, and
> without having a backtrace of where things go wrong, it's just really hard
> to tell.
>
> If you send me the program again, I can try to run it on my linux boxes to
> see whether it's something I can reproduce. Otherwise, have you tried the
> technique for running a parallel program in a debugger I discuss in one of
> my video lectures?
>
> Best
>   W.



step-40.cc
Description: Binary data


Re: [deal.II] Re: Parallel DoFRenumbering does not work on mac

2018-01-23 Thread Wolfgang Bangerth

On 01/23/2018 01:16 PM, Jie Cheng wrote:

>> These are just warnings -- what happens if you run the executable?
>
> If I do not modify step-40.cc, it runs fine both in serial and parallel. After
> I added DoFRenumbering in it, it crashes in parallel. Does DoFRenumbering have
> any dependency?

Not that I know of. Remind me which function in that namespace you are using?

> As I posted in previous messages, the debugger does not help
> much. The same problem happens on another mac in my lab which has Mac OS
> 10.13.2 installed.

This is certainly strange. I wouldn't know what might cause this, and without
having a backtrace of where things go wrong, it's just really hard to tell.


If you send me the program again, I can try to run it on my linux boxes to see 
whether it's something I can reproduce. Otherwise, have you tried the 
technique for running a parallel program in a debugger I discuss in one of my 
video lectures?


Best
 W.

--

Wolfgang Bangerth  email: bange...@colostate.edu
   www: http://www.math.colostate.edu/~bangerth/
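
One common way to get a debugger onto a single rank of an MPI job (not necessarily the technique from the video lecture mentioned above) is to let each process print its PID and wait until a debugger attaches. The sketch below is hedged accordingly: the helper wait_for_debugger_if_requested and the environment variable WAIT_FOR_DEBUGGER are hypothetical names chosen for illustration, and only POSIX calls are used.

```cpp
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <thread>
#include <unistd.h> // getpid(), gethostname() (POSIX)

// Hypothetical helper, not part of deal.II or MPI.
// If the environment variable WAIT_FOR_DEBUGGER is set, print this
// process's PID and host name, then sleep in a loop until an attached
// debugger sets `keep_waiting` to 0.
// Returns true if the function waited, false if the variable was not set.
bool wait_for_debugger_if_requested()
{
  if (std::getenv("WAIT_FOR_DEBUGGER") == nullptr)
    return false;

  char host[256] = "unknown";
  // Leave the last byte untouched so the buffer stays null-terminated.
  gethostname(host, sizeof(host) - 1);
  std::fprintf(stderr, "PID %ld on %s is waiting for a debugger\n",
               static_cast<long>(getpid()), host);
  std::fflush(stderr);

  // volatile: the value is changed from outside by the debugger.
  volatile int keep_waiting = 1;
  while (keep_waiting)
    std::this_thread::sleep_for(std::chrono::seconds(1));
  return true;
}
```

A possible workflow, assuming the MPI launcher forwards the environment to the ranks: call the helper at the top of main(), start the job with WAIT_FOR_DEBUGGER=1 mpirun -np 2 ./step-40, attach from a second terminal with lldb -p <PID>, select the frame of the helper, run expr keep_waiting = 0, and then continue. Because the debugger attaches to an already running process, mpirun itself launches the program as usual.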



Re: [deal.II] Re: Parallel DoFRenumbering does not work on mac

2018-01-23 Thread Jie Cheng
Hi Wolfgang
 

> These are just warnings -- what happens if you run the executable? 
>

If I do not modify step-40.cc, it runs fine both in serial and parallel.
After I added DoFRenumbering in it, it crashes in parallel. Does
DoFRenumbering have any dependency? As I posted in previous messages, the
debugger does not help much. The same problem happens on another mac in my
lab which has Mac OS 10.13.2 installed. I wonder if anyone can reproduce
this problem on Mac OS X 10.13.2?

Thanks
Jie



Re: [deal.II] Re: Parallel DoFRenumbering does not work on mac

2018-01-23 Thread Wolfgang Bangerth

On 01/22/2018 09:17 PM, Jie Cheng wrote:

> I've reinstalled MPICH, and did a clean build of p4est, petsc and dealii, this
> problem still exists. At the linking stage of building dealii, I got warnings:
>
> [526/579] Linking CXX shared library lib/libdeal_II.9.0.0-pre.dylib
> ld: warning: could not create compact unwind for _dgehrd_: stack subq instruction is too different from dwarf stack size

These are just warnings -- what happens if you run the executable?

The function referenced here is a LAPACK function. It looks like it has been 
compiled for a different system than the one you're on. But the error message 
you show doesn't give away how this library got onto your system, so I don't 
know what to suggest.


Best
 W.

--

Wolfgang Bangerth  email: bange...@colostate.edu
   www: http://www.math.colostate.edu/~bangerth/



Re: [deal.II] Re: Parallel DoFRenumbering does not work on mac

2018-01-22 Thread Jie Cheng
I've reinstalled MPICH and did a clean build of p4est, petsc and dealii,
but this problem still exists. At the linking stage of building dealii, I got
warnings:

[526/579] Linking CXX shared library lib/libdeal_II.9.0.0-pre.dylib
ld: warning: could not create compact unwind for _dgehrd_: stack subq instruction is too different from dwarf stack size
ld: warning: could not create compact unwind for _dhseqr_: stack subq instruction is too different from dwarf stack size
ld: warning: could not create compact unwind for _sgehrd_: stack subq instruction is too different from dwarf stack size
ld: warning: could not create compact unwind for _shseqr_: stack subq instruction is too different from dwarf stack size
ld: warning: could not create compact unwind for _dormlq_: stack subq instruction is too different from dwarf stack size
ld: warning: could not create compact unwind for _dormql_: stack subq instruction is too different from dwarf stack size
ld: warning: could not create compact unwind for _dormqr_: stack subq instruction is too different from dwarf stack size
ld: warning: could not create compact unwind for _sormlq_: stack subq instruction is too different from dwarf stack size
ld: warning: could not create compact unwind for _sormql_: stack subq instruction is too different from dwarf stack size
ld: warning: could not create compact unwind for _sormqr_: stack subq instruction is too different from dwarf stack size

I am not sure if this is related. Does anyone have a clue?

Thanks
Jie



Re: [deal.II] Re: Parallel DoFRenumbering does not work on mac

2018-01-18 Thread Jie Cheng
Bruno,
 

> This says that you have a problem with MPI. Your program died before
> executing any code in deal.II. Can you run other MPI codes?
>
Yes, I can run step-40 in parallel. This problem happens only after I added
a DoFRenumbering call in it (step-40.cc originally does not call
DoFRenumbering). What's even weirder is that running on 1 processor
without a debugger is OK. But if I use lldb, it immediately crashes as shown
above.

Jie



Re: [deal.II] Re: Parallel DoFRenumbering does not work on mac

2018-01-18 Thread Bruno Turcksin
Jie,

2018-01-18 14:31 GMT-05:00 Jie Cheng :

> Fatal error in MPI_Init_thread: Other MPI error, error stack:
> MPIR_Init_thread(474):
> MPID_Init(152)...: channel initialization failed
> MPID_Init(426)...: PMI_Get_appnum returned -1
> [cli_0]: write_line error; fd=6 buf=:cmd=abort exitcode=1094159
> :
> system msg for write_line failure : Bad file descriptor
> Process 73295 exited with status = 15 (0x000f)
>
This says that you have a problem with MPI. Your program died before
executing any code in deal.II. Can you run other MPI codes?

Best,

Bruno



[deal.II] Re: Parallel DoFRenumbering does not work on mac

2018-01-18 Thread Jie Cheng
Hi Bruno


> There is not enough information for us to help you; this error message is
> not from deal.II. At this point the code has failed and this is just a
> cleaning operation. Have you tried running your code in a debugger?
>
When I run `lldb ./step-40` everything runs smoothly. But if I do `mpirun 
-n 1 lldb ./step-40`, I get: 

zhangl-macpro12:step-40 jiecheng$ mpirun -n 1 lldb ./step-40
(lldb) target create "./step-40"
Current executable set to './step-40' (x86_64).
run
(lldb) run
Process 73295 launched: './step-40' (x86_64)
[cli_0]: write_line error; fd=6 buf=:cmd=init pmi_version=1 pmi_subversion=1
:
system msg for write_line failure : Bad file descriptor
[cli_0]: Unable to write to PMI_fd
[cli_0]: write_line error; fd=6 buf=:cmd=get_appnum
:
system msg for write_line failure : Bad file descriptor
Fatal error in MPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(474): 
MPID_Init(152)...: channel initialization failed
MPID_Init(426)...: PMI_Get_appnum returned -1
[cli_0]: write_line error; fd=6 buf=:cmd=abort exitcode=1094159
:
system msg for write_line failure : Bad file descriptor
Process 73295 exited with status = 15 (0x000f) 


Is this useful information?

Jie



[deal.II] Re: Parallel DoFRenumbering does not work on mac

2018-01-18 Thread Bruno Turcksin
Jie,

On Thursday, January 18, 2018 at 11:17:55 AM UTC-5, Jie Cheng wrote:
>
> I found that DoFRenumbering with distributed dof_handler does not work on 
> my mac (Mac OS X 10.13.2). My dealii is built with PETSc and Step-40 runs 
> fine. But once I added DoFRenumbering::Cuthill_McKee(dof_handler) 
> after dof_handler.distribute_dofs (fe), it gave me runtime error:
>
> ===
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   PID 65482 RUNNING AT zhangl-macpro12
> =   EXIT CODE: 6
> =   CLEANING UP REMAINING PROCESSES
> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> ===
> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Abort trap: 6 (signal 6)
> This typically refers to a problem with your application.
> Please see the FAQ page for debugging suggestions
>
There is not enough information for us to help you; this error message is
not from deal.II. At this point the code has failed and this is just a
cleaning operation. Have you tried running your code in a debugger?

Best,

Bruno
