Re: [OMPI devel] Open-mpi in Fedora 5

2012-10-09 Thread Sandra Guija

Hi Jeff,

My error occurs either when I run an MPI program on the remote host, or when I
run on the local host and include the remote host.
Open MPI only works where it was installed, i.e. locally.
Plan B is to install Open MPI on the remote host and try to run it, but I'm
getting discouraged.
I am using 1.2.9 and I found the same error in this post:
http://www.open-mpi.org/community/lists/users/2008/07/6034.php
My error looks exactly like the one described there, but my first line is:
  bash: orted: command not found
If that does not work, Plan C is to try version 1.3, as the post recommended,
but I'm not sure.
Any input is appreciated.
Thanks again,
Sandra Guija
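
One way to tell a missing "orted" apart from one that is present but not
executable is to resolve it against the PATH that the launching shell actually
sees. The following is a minimal sketch in Python and is not part of Open MPI;
the function name diagnose_executable is made up for illustration.

```python
import os
import shutil


def diagnose_executable(name, search_path):
    """Classify why `name` can or cannot be launched from `search_path`.

    Returns a (status, location) tuple where status is one of
    "ok", "not executable", or "not found".
    """
    # shutil.which only returns files that exist and carry the execute bit.
    found = shutil.which(name, path=search_path)
    if found is not None:
        return "ok", found

    # Fall back to a plain existence check to tell "missing" apart from
    # "present but not executable" (the "Permission denied" case).
    for directory in search_path.split(os.pathsep):
        candidate = os.path.join(directory, name)
        if directory and os.path.isfile(candidate):
            return "not executable", candidate
    return "not found", None
```

Calling diagnose_executable("orted", os.environ.get("PATH", "")) on the remote
node, through the same ssh login that mpirun would use, may show a different
result than an interactive session, because a non-interactive shell does not
necessarily read the same startup files.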

> From: jsquy...@cisco.com
> Date: Mon, 8 Oct 2012 20:19:14 -0400
> To: de...@open-mpi.org
> Subject: Re: [OMPI devel] Open-mpi in Fedora 5
> 
> I'm not sure I understand -- you said you are able to run on the master, but 
> then you said you get an error message when you run on the master.
> 
> Please see the help page I mentioned 
> (http://www.open-mpi.org/community/help/) for a list of information that we 
> need to be able to help you.
> 
> If you're getting a message about permission denied for 
> /home/openmpi/bin/orted, you should check the permissions on that file; it 
> needs to be executable.  If it isn't executable, then you need to check how 
> you installed Open MPI -- Open MPI's installer should have marked it as 
> executable during the installation process.
> 
> 
> On Oct 7, 2012, at 12:17 PM, Sandra Guija wrote:
> 
> > Hi Jeff, 
> > I shared the Open MPI libraries through NFS. I am able to ssh without a 
> > password, using id_rsa and eval `ssh-agent`.
> > I got mpirun to run the three tests you mentioned below. The tests succeed 
> > only if I run on the master.
> > I am running as the mpiu user, with the same UID.
> > But I am not able to run mpirun on the remote host. The message I get 
> > when I run on the remote host is below.
> > I have checked the permissions on .ssh and the libraries, and I included 
> > the PATH in .bashrc and .bash_profile. I tried to modify /etc/profile but 
> > am not sure how to do it.
> > I attached the screenshot from when I run on the master.
> > 
> > shell$ mpirun --debug-daemons --host tango1 hello_world
> > -bash: /home/openmpi/bin/mpirun: Permission denied
> > 
> > 
> > Sandra Guija
> > 
> > > From: jsquy...@cisco.com
> > > Date: Fri, 5 Oct 2012 16:55:44 -0400
> > > To: de...@open-mpi.org
> > > Subject: Re: [OMPI devel] Open-mpi in Fedora 5
> > > 
> > > On Oct 5, 2012, at 3:40 PM, Sandra Guija wrote:
> > > 
> > > > I decided to use an environment with Fedora 5 and gcc 4.1.0.
> > > > I tried to install 1.6.2 and it failed, then tried 1.4.5 and it 
> > > > failed, then 1.2.9 and I did not get any error.
> > > 
> > > I know that we are sometimes a little slow to answer user emails, but you 
> > > need to give us more than a few hours to answer before re-posting your 
> > > mails. :-)
> > > 
> > > If you want to see if there are easy fixes to why 1.4.x and/or 1.6.x fail 
> > > to compile, see this page: http://www.open-mpi.org/community/help/ Send 
> > > all the info listed on that page.
> > > 
> > > > How can I check whether the installation works, prior to configuring 
> > > > the cluster?
> > > 
> > > See:
> > > 
> > > http://www.open-mpi.org/community/lists/users/2012/03/18846.php
> > > 
> > > We say something quite similar in the (1.6.x) README file:
> > > 
> > > When verifying a new Open MPI installation, we recommend running three
> > > tests:
> > > 
> > > -
> > > 1. Use "mpirun" to launch a non-MPI program (e.g., hostname or uptime)
> > > across multiple nodes.
> > > 
> > > 2. Use "mpirun" to launch a trivial MPI program that does no MPI
> > > communication (e.g., the hello_c program in the examples/ directory
> > > in the Open MPI distribution).
> > > 
> > > 3. Use "mpirun" to launch a trivial MPI program that sends and
> > > receives a few MPI messages (e.g., the ring_c program in the
> > > examples/ directory in the Open MPI distribution).
> > > 
> > > If you can run all three of these tests successfully, that is a good
> > > indication that Open MPI built and installed properly.
> > > -
> > > 
> > > > Also, will it be OK if I copy the openmpi-1.2.9 directory to the other 
> > > > nodes? The installation took almost 3 hours.
> > > 
> > > Wow; configuration / compilation of Open MPI took *3 hours*? I'm guessing 
> > > you have very old / low-power processors, or very slow network filesystem 
> > > access...?
> > > 
> > > See this FAQ information on where to install OMPI:
> > > 
> > > http://www.open-mpi.org/faq/?category=building#where-to-install
> > > 
> > > > I sent the "ccIVTymL.out" file to the forum but my mail is waiting for 
> > > > moderator approval.
> > > 
> > > It likely won't be approved. Send a smaller attachment, please, such as a 
> > > compressed text file (see the support page, above). :-)
> > > 
> > > -- 
> > > Jeff Squyres
> > > jsquy...@cisco.com
> > > For corporate legal information go to: 
> > > http://www.cisco.

Re: [OMPI devel] Open-mpi in Fedora 5

2012-10-09 Thread Jeff Squyres
On Oct 9, 2012, at 3:04 AM, Sandra Guija wrote:

> My error occurs either when I run an MPI program on the remote host, or when 
> I run on the local host and include the remote host.
> Open MPI only works where it was installed, i.e. locally.
> Plan B is to install Open MPI on the remote host and try to run it, but I'm 
> getting discouraged.

You still have not provided all the information that we need to help you.  I 
cannot know how you have installed / configured Open MPI unless you tell me -- 
we need you to tell us *precisely* how you have set it up.

See http://www.open-mpi.org/community/help/.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI devel] RFC: New memchecker component - mempin

2012-10-09 Thread Shiqing Fan


Just a few more important points that I forgot to mention.

This work has been helped and guided by Rainer. He will also continue to 
use the tool for further research. :-)


This new component can be enabled via the option --enable-memchecker. When 
disabled, it has no effect on Open MPI or the application.


The patch has been tested against a version of Open MPI from a few months 
ago, but it should be easy to move to the latest OMPI trunk. I will make 
a bitbucket branch for merging it back.


I would like to explain more details in the call today.

On 2012-10-08 3:05 PM, Shiqing Fan wrote:

*WHAT:*

A new memory checking tool named MemPin was developed based on the 
Intel Pin framework. It uses a callback mechanism to perform tasks 
similar to Valgrind Memcheck. The new tool is tiny and flexible, and 
users may implement their own callback functions for different purposes.


The basic idea here for Open MPI is to watch over the communication 
buffers. Every access to the buffers will be detected, and for a 
specific memory operation (read/write), a memory check callback will 
be triggered.


Only the required memory will be taken care of, so the shadow memory 
can be kept as small as possible. The shadow memory implemented for 
Open MPI is handled bit-wise, i.e. every byte of memory has 2 bits of 
shadow (4 different states). This doesn't provide bit-wise validity of 
the memory like Valgrind, where every byte of memory has 9 bits of 
shadow. However, the shadow memory for this new tool is extensible.
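
The 2-bits-per-byte layout described above can be pictured with a small 
sketch. This is an illustration in Python, not the MemPin implementation: 
the class name ShadowMemory and its methods are hypothetical, and because 
the post does not name the four states, they are left here as the plain 
integers 0 to 3.

```python
class ShadowMemory:
    """Tracks one of four states for each byte of a watched region."""

    BITS_PER_ENTRY = 2
    ENTRIES_PER_BYTE = 8 // BITS_PER_ENTRY  # four entries per shadow byte
    MASK = (1 << BITS_PER_ENTRY) - 1        # 0b11

    def __init__(self, base, length):
        self.base = base
        self.length = length
        # Round up so a partial final group still gets a shadow byte.
        size = (length + self.ENTRIES_PER_BYTE - 1) // self.ENTRIES_PER_BYTE
        self.bits = bytearray(size)

    def _locate(self, address):
        # Map an address to (shadow byte index, bit shift inside that byte).
        offset = address - self.base
        if not 0 <= offset < self.length:
            raise IndexError("address outside the watched region")
        index, slot = divmod(offset, self.ENTRIES_PER_BYTE)
        return index, slot * self.BITS_PER_ENTRY

    def get(self, address):
        index, shift = self._locate(address)
        return (self.bits[index] >> shift) & self.MASK

    def set(self, address, state):
        if not 0 <= state <= self.MASK:
            raise ValueError("state must fit in two bits")
        index, shift = self._locate(address)
        # Clear the two bits for this address, then write the new state.
        cleared = self.bits[index] & ~(self.MASK << shift) & 0xFF
        self.bits[index] = cleared | (state << shift)
```

With this packing, a watched buffer of N bytes needs about N/4 bytes of 
shadow, compared with a little more than N bytes for a 9-bits-per-byte 
scheme.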


Several predefined macros may be used in user applications and in 
Open MPI:


  * *MEMPIN_RUNNING_WITH_PIN*: Checks whether the user application is
running under MemPin and Pin
  * *MEMPIN_REG_MEM_WATCH*: Registers the memory entry for a specific
memory operation
  * *MEMPIN_UPDATE_MEM_WATCH*: Updates the memory entry parameters for
a specific memory operation
  * *MEMPIN_UNREG_MEM_WATCH*: Unregisters one memory entry
  * *MEMPIN_UNREG_ALL_MEM_WATCH*: Unregisters all the memory entries
  * *MEMPIN_SEARCH_MEM_INDEX*: Returns the memory entry index from the
memory address storage
  * *MEMPIN_PRINT_CALLSTACK*: Prints the current callstack to standard
output or a file


The new mempin component will have the same memchecker API as the 
valgrind component.


*WHY:*

This new implementation offers functionality similar to Valgrind 
Memcheck, but it is lightweight, faster, extensible and flexible. It 
also supports Windows platforms.


*WHERE:*
  opal/mca/memchecker/
  ompi/include/ompi/memchecker.h or another header file.
  ompi/mca/pml/ob1: several memchecker macros need to be updated.


*WHEN:*
 If everything is fine, probably some time next week or later this 
month.



We probably can also discuss it in the next teleconf.


Thanks,
Shiqing
--
---
Shiqing Fan
High Performance Computing Center Stuttgart (HLRS)
Tel: ++49(0)711-685-87234  Nobelstrasse 19
Fax: ++49(0)711-685-65832  70569 Stuttgart
http://www.hlrs.de/organization/people/shiqing-fan/
email:f...@hlrs.de


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel






Re: [OMPI devel] Open-mpi in Fedora 5

2012-10-09 Thread Sandra Guija

Hi Jeff,

I found my error, a typo in LD_LIBRARY_PATH, and I found it while pulling
together my information to send you an email.
Open MPI 1.2.9 is working on my old Linux boxes.
Thanks, thanks again,
Sandra Guija
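
A typo of this kind can be caught by checking that every entry in the variable
names a directory that really exists. A minimal sketch in Python follows; the
helper name missing_path_entries is hypothetical and not something Open MPI
ships.

```python
import os


def missing_path_entries(value):
    """Return the entries of a path list that are not existing directories.

    `value` is a string in the format of PATH or LD_LIBRARY_PATH, i.e.
    directories joined by os.pathsep (":" on Linux). Empty entries are
    skipped.
    """
    entries = [entry for entry in value.split(os.pathsep) if entry]
    return [entry for entry in entries if not os.path.isdir(entry)]
```

For example, missing_path_entries(os.environ.get("LD_LIBRARY_PATH", "")) returns
an empty list when every listed directory exists on the node where it is run,
and otherwise lists the entries that do not.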

  

Re: [OMPI devel] Open-mpi in Fedora 5

2012-10-09 Thread Jeff Squyres
On Oct 9, 2012, at 3:03 PM, Sandra Guija wrote:

> I found my error, a typo in LD_LIBRARY_PATH, and I found it while pulling 
> together my information to send you an email.
> Open MPI 1.2.9 is working on my old Linux boxes.

Excellent!

Now that you grok what it takes to get Open MPI installed successfully, you 
might want to try working forwards in versions a bit to see if you can get a 
newer version working.  Try these two versions:

- v1.4.5
- v1.6.2
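
When retrying with a newer version, the three README checks quoted earlier in
this thread can be written down once and reused. The sketch below, in Python,
only builds the command lines and does not run them. The function name
verification_commands and the host name tango0 are made up for illustration,
and it assumes that hello_c and ring_c have already been compiled in the
examples/ directory of the Open MPI distribution.

```python
import os


def verification_commands(hosts, num_procs, examples_dir="examples"):
    """Build the three mpirun command lines recommended for a new install.

    1. a non-MPI program (hostname) across the given hosts
    2. hello_c, an MPI program that does no communication
    3. ring_c, an MPI program that passes a few messages
    """
    launcher = ["mpirun", "-np", str(num_procs), "--host", ",".join(hosts)]
    return [
        launcher + ["hostname"],
        launcher + [os.path.join(examples_dir, "hello_c")],
        launcher + [os.path.join(examples_dir, "ring_c")],
    ]


if __name__ == "__main__":
    for command in verification_commands(["tango0", "tango1"], 2):
        print(" ".join(command))
```

Each list can be passed to subprocess.run as-is; a failure in the first
command points at the launch environment rather than at MPI itself.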

-- 
Jeff Squyres
jsquy...@cisco.com




Re: [OMPI devel] RFC: Proposed Fix for mmap Bus Error Due to an Inadequate Amount of Disk Space

2012-10-09 Thread Gutierrez, Samuel K
Committed revision 27433.

Sam
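
For readers skimming the quoted RFC below: extending a file with ftruncate
does not by itself guarantee that disk blocks are available, so a later write
through the mmap'd region can fail with a bus error when the filesystem is
full. Checking free space before creating the segment turns that into an
ordinary error. The following is a rough sketch of the idea in Python, not the
committed C code; the function names are hypothetical, and the check is
inherently racy because free space can change between the test and the write.

```python
import errno
import os


def available_bytes(directory):
    """Bytes available to an unprivileged process on the filesystem
    that holds `directory`."""
    stats = os.statvfs(directory)
    return stats.f_bavail * stats.f_frsize


def check_backing_store_space(directory, segment_size):
    """Raise OSError(ENOSPC) with a readable message if a backing file of
    `segment_size` bytes would not fit; otherwise return the free space."""
    free = available_bytes(directory)
    if free < segment_size:
        raise OSError(
            errno.ENOSPC,
            f"need {segment_size} bytes in {directory}, only {free} available",
        )
    return free
```

A caller would run check_backing_store_space on the directory that will hold
the backing file before creating and mapping it, and report the OSError
message instead of proceeding.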

On Oct 6, 2012, at 4:32 AM, Jeff Squyres wrote:

> Two minor suggestions:
> 
> 1. Ping Shiqing directly to get the proper Windows support before he 
> disappears.
> 
> 2. Word wrap the show-help message to 76 columns or so.
> 
> 
> On Oct 5, 2012, at 6:24 PM, Gutierrez, Samuel K wrote:
> 
>> WHAT: Fix the bus error caused by an inadequate amount of space during 
>> opal_shmem_segment_create by testing whether or not the target mount has 
>> enough space to accommodate the shared-memory backing store. I admit that 
>> this isn't an ideal solution, but I can't figure out another way to test 
>> this sort of thing given the way ftruncate and mmap behave.
>> 
>> WHY: Provide a nice error message instead of a bus error. See: 
>> https://svn.open-mpi.org/trac/ompi/ticket/2827
>> 
>> WHERE:
>> 
>> opal/util/path.[ch]
>> opal/mca/shmem/mmap/shmem_mmap_module.c
>> 
>> WHEN: Sometime next week, if everything is okay.
>> 
>> Please test, because the following branch has only been tested on Linux and 
>> OS X.
>> 
>> Code can be found:
>> 
>> https://bitbucket.org/samuelkgutierrez/opaldf
>> 
>> Give it a whirl and tell me what you think.
>> 
>> Thanks,
>> 
>> Sam
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com