Re: [OMPI users] problem
OK, thank you so much, sir.

On Wed, May 9, 2018 at 11:13 PM, Jeff Squyres (jsquyres) wrote:
> It looks like you're getting a segv when calling MPI_Comm_rank().
>
> This is quite unusual -- MPI_Comm_rank() is just a local lookup / return
> of an integer. If MPI_Comm_rank() is seg faulting, it usually indicates
> that there's some other kind of memory error in the application, and this
> seg fault you're seeing is just a symptom -- it's not the real problem. It
> may have worked with Intel MPI by chance, or for some reason, Intel MPI has
> a different memory pattern than Open MPI and it didn't happen to trigger
> this exact problem.
>
> You might want to run your application through a memory-checking debugger.
>
> > On May 9, 2018, at 11:39 AM, Ankita m wrote:
> >
> > Yes, because previously I was using Intel MPI, and the program ran
> > perfectly then. Now that I use Open MPI, it produces these error files,
> > though I am not quite sure. I just thought that if the issue is with
> > Open MPI, I could get some help here.
> >
> > On Wed, May 9, 2018 at 6:47 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
> > Ankita,
> >
> > Do you have any reason to suspect the root cause of the crash is Open MPI?
> >
> > Cheers,
> >
> > Gilles
> >
> > On Wednesday, May 9, 2018, Ankita m wrote:
> > The MPI "Hello World" program is also working.
> >
> > Please see the error file attached below. It is from a different program.
> >
> > On Wed, May 9, 2018 at 4:10 PM, John Hearns via users <users@lists.open-mpi.org> wrote:
> > Ankita, looks like your program is not launching correctly.
> > I would try the following:
> > define two hosts in a machinefile. Use mpirun -np 2 --machinefile machinefile date
> > I.e., can you use mpirun just to run the command 'date'?
> >
> > Secondly, compile up and try to run an MPI 'Hello World' program.
> >
> > On 9 May 2018 at 12:28, Ankita m wrote:
> > I am using Open MPI 3.1.0 in my program, and the compiler is mpicc.
> >
> > It is a parallel program which uses multiple nodes with 16 cores in each node.
> >
> > But it is not working and generates an error file. I have attached the
> > error file below.
> >
> > Can anyone please tell me what the issue actually is?
>
> --
> Jeff Squyres
> jsquy...@cisco.com

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
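The checks suggested in this thread escalate from launching a plain command, to an MPI "Hello World", to running the real application under a memory checker. A minimal Python sketch of those three command lines follows. It only builds and prints the commands; it does not run them. The names hello_mpi, my_app and sanity_check_commands are hypothetical, and valgrind is just one example of a memory-checking debugger (the thread does not name one). The --machinefile option is spelled out because a bare "machinefile" argument would be taken by mpirun as the program to launch.

```python
import shlex


def sanity_check_commands(hostfile, app, nprocs=2):
    """Return three escalating checks as argument lists (nothing is executed)."""
    base = ["mpirun", "-np", str(nprocs), "--machinefile", hostfile]
    return [
        # 1. Can mpirun start a non-MPI command on the listed hosts at all?
        base + ["date"],
        # 2. Does a trivial MPI program run? (hypothetical binary name)
        base + ["./hello_mpi"],
        # 3. Run the real application under a memory checker (valgrind assumed).
        base + ["valgrind", app],
    ]


if __name__ == "__main__":
    for argv in sanity_check_commands("machinefile", "./my_app"):
        print(shlex.join(argv))
```

If step 1 or 2 fails, the problem is in the launch environment rather than in the application; if only step 3 reports errors, the memory checker output is the place to look before suspecting MPI_Comm_rank() itself.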
Re: [OMPI users] problem
Yes, because previously I was using Intel MPI, and the program ran perfectly then. Now that I use Open MPI, it produces these error files, though I am not quite sure. I just thought that if the issue is with Open MPI, I could get some help here.

On Wed, May 9, 2018 at 6:47 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
> Ankita,
>
> Do you have any reason to suspect the root cause of the crash is Open MPI?
>
> Cheers,
>
> Gilles
>
> On Wednesday, May 9, 2018, Ankita m wrote:
>> The MPI "Hello World" program is also working.
>>
>> Please see the error file attached below. It is from a different program.
>>
>> On Wed, May 9, 2018 at 4:10 PM, John Hearns via users <users@lists.open-mpi.org> wrote:
>>> Ankita, looks like your program is not launching correctly.
>>> I would try the following:
>>> define two hosts in a machinefile. Use mpirun -np 2 --machinefile machinefile date
>>> I.e., can you use mpirun just to run the command 'date'?
>>>
>>> Secondly, compile up and try to run an MPI 'Hello World' program.
>>>
>>> On 9 May 2018 at 12:28, Ankita m wrote:
>>>> I am using Open MPI 3.1.0 in my program, and the compiler is mpicc.
>>>>
>>>> It is a parallel program which uses multiple nodes with 16 cores in each node.
>>>>
>>>> But it is not working and generates an error file. I have attached the
>>>> error file below.
>>>>
>>>> Can anyone please tell me what the issue actually is?
Re: [OMPI users] problem
The MPI "Hello World" program is also working.

Please see the error file attached below. It is from a different program.

On Wed, May 9, 2018 at 4:10 PM, John Hearns via users <users@lists.open-mpi.org> wrote:
> Ankita, looks like your program is not launching correctly.
> I would try the following:
> define two hosts in a machinefile. Use mpirun -np 2 --machinefile machinefile date
> I.e., can you use mpirun just to run the command 'date'?
>
> Secondly, compile up and try to run an MPI 'Hello World' program.
>
> On 9 May 2018 at 12:28, Ankita m wrote:
>> I am using Open MPI 3.1.0 in my program, and the compiler is mpicc.
>>
>> It is a parallel program which uses multiple nodes with 16 cores in each node.
>>
>> But it is not working and generates an error file. I have attached the
>> error file below.
>>
>> Can anyone please tell me what the issue actually is?

bicgstab_Test.e88
Description: Binary data
[OMPI users] problem
I am using Open MPI 3.1.0 in my program, and the compiler is mpicc.

It is a parallel program which uses multiple nodes with 16 cores in each node.

But it is not working and generates an error file. I have attached the error file below.

Can anyone please tell me what the issue actually is?

bicgstab_Test.e61
Description: Binary data
Re: [OMPI users] Fwd: Fwd: problem in cluster
Can you please tell me whether to use the mpicc compiler or any other compiler for Open MPI programs?

On Wed, Apr 25, 2018 at 3:13 PM, Ankita m wrote:
> I have 16 cores per node. I usually use 4 nodes, each with 16 cores,
> so 64 processes in total.
>
> On Wed, Apr 25, 2018 at 2:57 PM, John Hearns via users <users@lists.open-mpi.org> wrote:
>> I do not see much wrong with that.
>> However nodes=4 ppn=2 makes 8 processes in all.
>> You are using mpirun -np 64
>>
>> Actually it is better practice to use the PBS supplied environment
>> variables during the job, rather than hard-wiring 64.
>> I don't have access to a PBS cluster from my desk at the moment.
>> You could also investigate using mpiprocs=2. Then I think with Open MPI,
>> if it has compiled-in PBS support, all you would have to do is
>> mpirun
>>
>> Are you sure your compute servers only have two cores ??
>>
>> I also see that you are commenting out the module load openmpi-3.0.1. I
>> would guess you want the default Open MPI, which is OK.
>>
>> First thing I would do, before the mpirun line in that job script:
>>
>> which mpirun (check that you are picking up an Open MPI version)
>>
>> ldd ./cgles (check you are bringing in the libraries that you should)
>>
>> Also run mpirun with the verbose flag -v
>>
>> On 25 April 2018 at 11:10, Ankita m wrote:
>>>> While using openmpi-1.4.5 the program ended by showing this error file
>>>> (in the attachment).
>>>
>>> I am using a PBS file. Below you can find the script that I am using to
>>> run my program.
Re: [OMPI users] Fwd: Fwd: problem in cluster
I have 16 cores per node. I usually use 4 nodes, each with 16 cores, so 64 processes in total.

On Wed, Apr 25, 2018 at 2:57 PM, John Hearns via users <users@lists.open-mpi.org> wrote:
> I do not see much wrong with that.
> However nodes=4 ppn=2 makes 8 processes in all.
> You are using mpirun -np 64
>
> Actually it is better practice to use the PBS supplied environment
> variables during the job, rather than hard-wiring 64.
> I don't have access to a PBS cluster from my desk at the moment.
> You could also investigate using mpiprocs=2. Then I think with Open MPI,
> if it has compiled-in PBS support, all you would have to do is
> mpirun
>
> Are you sure your compute servers only have two cores ??
>
> I also see that you are commenting out the module load openmpi-3.0.1. I
> would guess you want the default Open MPI, which is OK.
>
> First thing I would do, before the mpirun line in that job script:
>
> which mpirun (check that you are picking up an Open MPI version)
>
> ldd ./cgles (check you are bringing in the libraries that you should)
>
> Also run mpirun with the verbose flag -v
>
> On 25 April 2018 at 11:10, Ankita m wrote:
>>> While using openmpi-1.4.5 the program ended by showing this error file
>>> (in the attachment).
>>
>> I am using a PBS file. Below you can find the script that I am using to run
>> my program.
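The mismatch discussed here is simple arithmetic: nodes=4 with ppn=2 grants 4 x 2 = 8 slots, while mpirun -np 64 asks for 64, which would need ppn=16 on 4 nodes. A small Python sketch of that check follows. It assumes a PBS/Torque-style scheduler that exports PBS_NODEFILE, a file with one line per granted slot; the thread mentions PBS environment variables without naming one, so treat that variable and the function names as assumptions.

```python
import os
from collections import Counter


def slots_per_host(nodefile_lines):
    """Count granted slots per host; each non-empty line is one slot."""
    return Counter(line.strip() for line in nodefile_lines if line.strip())


def fits(nodefile_lines, requested_np):
    """True if the requested process count does not exceed the granted slots."""
    return requested_np <= sum(slots_per_host(nodefile_lines).values())


if __name__ == "__main__":
    # Assumption: the scheduler sets PBS_NODEFILE inside the job.
    path = os.environ.get("PBS_NODEFILE")
    if path:
        with open(path) as f:
            lines = f.readlines()
        granted = sum(slots_per_host(lines).values())
        print(f"granted slots: {granted}, -np 64 fits: {fits(lines, 64)}")
    else:
        print("PBS_NODEFILE is not set; run this inside a PBS job")
```

Deriving the process count from the nodefile inside the job script, instead of hard-wiring 64, keeps the mpirun line consistent with whatever the resource request actually granted.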
[OMPI users] Fwd: Fwd: problem in cluster
> While using openmpi-1.4.5 the program ended by showing this error file
> (in the attachment).

I am using a PBS file. Below you can find the script that I am using to run my program.

cgles.err
Description: Binary data

run.pbs
Description: Binary data
[OMPI users] Fwd: problem in cluster
While using Open MPI I got this error. Can you please tell me why?

-- Forwarded message --
From: Ankita m
Date: Tue, 24 Apr 2018, 12:55 pm
Subject: Re: problem in cluster
To: sagar mcp, Krishna Singh

While using openmpi-1.4.5 the program ended by showing this error.

On Tue, Apr 24, 2018 at 12:28 PM, Ankita m wrote:
> TeamViewer ID: [redacted]
> Password: [redacted]
> My contact number: [redacted]
>
> On Tue, Apr 24, 2018 at 12:18 PM, Sagar Naik wrote:
>> Share your contact details
>>
>> Thanks & Regards,
>>
>> Sagar Vijay Naik
>> Sr. Customer Support Engineer
>>
>> Address: 17/18 Navketan Estate | Opp. Onida House | Mahakali Caves Road
>> | Andheri (East) | Mumbai - 400 093.
>> Email: sa...@mpcl.in | Mobile No: +91 9969478594 | Board Line (D):
>> 022-40956342 | Fax No: 022- 6870250 | URL: www.mpcl.in
>>
>> From: Ankita m [mailto:ankitamait...@gmail.com]
>> Sent: 24 April 2018 10:52
>> To: sagar mcp; Krishna Singh
>> Subject: Fwd: problem in cluster
>>
>> -- Forwarded message --
>> From: Ankita m
>> Date: Mon, Apr 23, 2018 at 4:18 PM
>> Subject: problem in cluster
>> To: sagar mcp
>>
>> Hello Sir
>>
>> I am Ankita Maity from the Mechanical Department, IIT Roorkee.
>>
>> I am facing a problem while submitting a job. All the programs
>> automatically either go to the queue or show status "H". Please
>> help, sir.
>>
>> My program folder is Home/ankitamed/MarineTurbine1/
>>
>> TeamViewer ID and password are
>>
>> [redacted]
>>
>> Regards
>> Ankita
[OMPI users] Fwd: problem related ORTE
On Wed, Apr 11, 2018 at 3:55 PM, Ankita m wrote:
> Hello Sir
>
> Currently I am using version "openmpi-1.4.5". A parallel program fails
> on submission and generates the error file which I have attached. I
> think this is a run-time problem.
> Therefore I have attached a zip file containing all the files
> asked for at https://www.open-mpi.org/community/help/
>
> Regards
> Ankita Maity
> IIT Roorkee
> India
>
> On Fri, Apr 6, 2018 at 9:55 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
>> Can you please send all the information listed here:
>>
>> https://www.open-mpi.org/community/help/
>>
>> Thanks!
>>
>> > On Apr 6, 2018, at 8:27 AM, Ankita m wrote:
>> >
>> > Hello Sir/Madam
>> >
>> > I am Ankita Maity, a PhD scholar from Mechanical Dept., IIT Roorkee, India.
>> >
>> > I am facing a problem while submitting a parallel program to the HPC
>> > cluster available in our dept.
>> >
>> > I have attached the error file it shows at run time.
>> >
>> > Can you please help me with the issue? I will be very grateful to you.
>> >
>> > With Regards
>> >
>> > ANKITA MAITY
>> > IIT ROORKEE
>> > INDIA
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
Re: [OMPI users] disabling libraries?
Thank you so much, sir. I will discuss this with my supervisor and proceed accordingly.

On Fri, Apr 6, 2018 at 5:42 PM, Michael Di Domenico wrote:
> On Thu, Apr 5, 2018 at 7:59 PM, Gilles Gouaillardet wrote:
> > That being said, the error suggests mca_oob_ud.so is a module from a
> > previous install,
> > Open MPI was not built on the system it is running, or libibverbs.so.1
> > has been removed after
> > Open MPI was built.
>
> yes, understood, i compiled openmpi on a node that has all the
> libraries installed for our various interconnects, opa/psm/mxm/ib, but
> i ran mpirun on a node that has none of them
>
> so the resulting warnings i get
>
> mca_btl_openib: librdmacm.so.1
> mca_btl_usnic: libfabric.so.1
> mca_oob_ud: libibverbs.so.1
> mca_mtl_mxm: libmxm.so.2
> mca_mtl_ofi: libfabric.so.1
> mca_mtl_psm: libpsm_infinipath.so.1
> mca_mtl_psm2: libpsm2.so.2
> mca_pml_yalla: libmxm.so.2
>
> you referenced them as "errors" above, but mpi actually runs just fine
> for me even with these msgs, so i would consider them more warnings.
>
> > So I do encourage you to take a step back, and think if you can find a
> > better solution for your site.
>
> there are two alternatives
>
> 1 i can compile a specific version of openmpi for each of our clusters
> with each specific interconnect libraries
>
> 2 i can install all the libraries on all the machines regardless of
> whether the interconnect is present
>
> both are certainly plausible, but my effort here is to see if i can
> reduce the size of our software stack and/or reduce the number of
> compiled versions of openmpi
>
> it would be nice if openmpi had (or may already have) a simple switch
> that lets me disable entire portions of the library chain, ie this
> host doesn't have a particular interconnect, so don't load any of the
> libraries. this might run counter to how openmpi discovers and loads
> libs though.
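One possible shape for the "simple switch" asked about above is Open MPI's MCA component selection, where a leading ^ on a framework's component list excludes the named components. The Python sketch below turns module names of the form mca_<framework>_<component>, as they appear in the warning list, into the matching --mca arguments. The function name is hypothetical, and whether excluding a component also silences its load warning on a given Open MPI version is an assumption to verify locally; the mca_base_component_show_load_errors parameter may be the more direct knob for the messages themselves.

```python
import shlex


def exclusion_args(module_names):
    """Group mca_<framework>_<component> names into '--mca <framework> ^a,b' args."""
    by_framework = {}
    for name in module_names:
        prefix, framework, component = name.split("_", 2)
        if prefix != "mca":
            raise ValueError(f"not an MCA module name: {name}")
        by_framework.setdefault(framework, []).append(component)
    args = []
    for framework, components in by_framework.items():
        # '^' means "every component of this framework except these".
        args += ["--mca", framework, "^" + ",".join(components)]
    return args


if __name__ == "__main__":
    warned = [
        "mca_btl_openib", "mca_btl_usnic", "mca_oob_ud", "mca_mtl_mxm",
        "mca_mtl_ofi", "mca_mtl_psm", "mca_mtl_psm2", "mca_pml_yalla",
    ]
    # Prints a shell-quoted mpirun prefix; nothing is executed.
    print(shlex.join(["mpirun"] + exclusion_args(warned)))
```

The same framework/value pairs could instead be placed in a per-host MCA parameter file, so that hosts without a given interconnect carry their own exclusion list rather than a separate Open MPI build.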
[OMPI users] problem related ORTE
Hello Sir/Madam

I am Ankita Maity, a PhD scholar from Mechanical Dept., IIT Roorkee, India.

I am facing a problem while submitting a parallel program to the HPC cluster available in our dept.

I have attached the error file it shows at run time.

Can you please help me with the issue? I will be very grateful to you.

With Regards

ANKITA MAITY
IIT ROORKEE
INDIA

cgles.err
Description: Binary data