[OMPI devel] Question regarding MCA_PML_CM_SEND_REQUEST_INIT_COMMON

2007-10-29 Thread Sajjad Tabib
Hi,

I was issuing an MPI_Bcast in a sample program and was hitting an unknown
error; at least that was what MPI was telling me. I traced through the
code to find my error and came upon the MCA_PML_CM_REQUEST_INIT_COMMON
macro in pml_cm_sendreq.h. I noticed that this macro initializes the
elements of req_status, except for req_status.MPI_ERROR. I thought that
MPI_ERROR might also require initialization, because if it held some
arbitrary value not equal to MPI_SUCCESS, my program would definitely
die. The exception would be if MPI_ERROR is propagated from upper layers
to signify an error, but I wasn't sure. I assumed that it was not, so I
set req_status.MPI_ERROR to MPI_SUCCESS and reran my test program. My
program worked. Now that it works, I thought I should run this by you to
make sure that MPI_ERROR is not propagated from upper layers. Is it OK
that I added:
"(req_send)->req_base.req_ompi.req_status.MPI_ERROR = MPI_SUCCESS;" to
MCA_PML_CM_REQUEST_INIT_COMMON?
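
A minimal sketch of the idea, using hypothetical stand-in types and names
(sketch_*, SKETCH_*) rather than the real Open MPI structures or macro. It
only illustrates why leaving the error field out of the init step can let a
stale value survive when a request is reused; it does not show how the PML
actually behaves:

```c
/* Hypothetical stand-ins for illustration only; NOT the Open MPI types. */
#define SKETCH_SUCCESS    0
#define SKETCH_ANY_SOURCE (-1)
#define SKETCH_ANY_TAG    (-1)

typedef struct {
    int source;
    int tag;
    int error;      /* plays the role of req_status.MPI_ERROR */
    int cancelled;
} sketch_status_t;

typedef struct {
    sketch_status_t status;
    int complete;
} sketch_request_t;

/* Reset every status field, including the error code, so that a recycled
 * request cannot carry an arbitrary error value into the next operation. */
static void sketch_request_init(sketch_request_t *req)
{
    req->status.source    = SKETCH_ANY_SOURCE;
    req->status.tag       = SKETCH_ANY_TAG;
    req->status.error     = SKETCH_SUCCESS;  /* the assignment in question */
    req->status.cancelled = 0;
    req->complete         = 0;
}
```

If the upper layers do set the error field themselves before it is read,
the extra assignment is redundant but harmless in this sketch.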

Thank You,

Sajjad Tabib


Re: [OMPI devel] Hostfile param: was Trying to get total procs num in odls framework

2007-10-29 Thread David Erukhimovich
Hi,
I was just reviewing my files in order to send them to Jeff, and I fixed
the problem!
I should have written:
mca_base_param_string_name("rds_hostfile", "path" . . . );
instead of:
mca_base_param_string("rds_hostfile", "path" . . .);
in the 'open' function of the component file.
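
The naming at issue can be sketched as follows. This assumes, as the
parameter names rds_hostfile_path and rds_mosix_path used in this thread
suggest, that a full parameter name is the framework and component joined
to the parameter name with underscores. The helper below is hypothetical
and is not part of the MCA API:

```c
#include <stdio.h>

/* Hypothetical helper for illustration; not an MCA function.
 * Builds "<prefix>_<param>" into buf. Returns 0 on success and -1 if the
 * result does not fit in len bytes (including the terminating NUL). */
static int sketch_param_full_name(char *buf, size_t len,
                                  const char *prefix, const char *param)
{
    int n = snprintf(buf, len, "%s_%s", prefix, param);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

With the prefix "rds_hostfile" and the parameter "path" this yields
rds_hostfile_path; with "rds_mosix" it yields rds_mosix_path, which is a
different parameter.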

But I don't understand how it compiled. There is no function
mca_base_param_string that takes a string as its first param (I know it
doesn't compile in the module file).
I compile using 'make all install' in the openmpi dir.

Thanks
--David

On Mon, 29 Oct 2007, Jeff Squyres wrote:

> Sorry guys, I did miss this earlier.
>
> I don't see a patch anywhere in the e-mail thread below -- can
> someone send me the problematic code in question?
>
> FWIW: The MCA param space is global, so there's no reason that a new/
> different RDS shouldn't be able to read the hostfile MCA parameter.
>
>
>
> On Oct 28, 2007, at 2:09 PM, Ralph Castain wrote:
>
> > Yo Jeff
> >
> > This may have slipped through your inbox (had OMPI devel in
> > subject, so may
> > have been caught in some filter) - could you please provide any
> > thoughts on
> > why the hostfile isn't getting picked up correctly? As I indicated
> > on the
> > prior note, I verified that it is working for the default hostfile
> > component
> > - I can't see anything wrong in David's call to cause the problem.
> > Please
> > refer to the prior note for that code.
> >
> > Thanks
> > Ralph
> >
> >
> >
> > On 10/28/07 10:31 AM, "David Erukhimovich"
> >  wrote:
> >
> >> Thank you very much for the patch, it helped me a lot (it works!) and
> >> I really appreciate this.
> >>
> >> p.s. Any idea about the rds thing?
> >>
> >> Regards
> >> --David
> >>
> >>
> >> Ralph H Castain wrote:
> >>> Hi David
> >>>
> >>> Here is the promised patch - it passes params just fine, but I
> >>> cannot vouch
> >>> for any unintended consequences. I -think- it will be fine, but
> >>> it lacks all
> >>> the usual testing for a patch to an official release.
> >>>
> >>> Hope it helps
> >>> Ralph
> >>>
> >>>
> >>>
> >>> On 10/20/07 10:10 AM, "David Erukhimovich"
> >>>  wrote:
> >>>
> 
>  Hi Ralph,
> 
>  2. I do want the user to be able to switch between my way of process
>  launching and the default way. I can do it using an mca flag, but I
>  would prefer a new component. If it is not too difficult for you,
>  please make the patch; if it is, I'll just use an mca flag.
> 
>  1. Just remembered another difficulty I had: I've created a new rds
>  component identical to the hostfile one. Let's call it mosix. Now,
>  orterun is saving the hostfile path in the mca parameter
>  rds_hostfile_path or something like that. When I try to retrieve
>  rds_hostfile_path or rds_mosix_path in the rds_mosix component, I
>  always get the default hostfile path (it doesn't matter whether I gave
>  a hostfile or not). And I tried everything: changing names in
>  rds_mosix_component, declaring a new parameter rds_mosix_path in
>  various places, etc. So now I'm just altering the existing hostfile
>  component.
>  Do you have any suggestions how to make it work?
> 
>  Sorry for all the questions and thank you very much for the quick
>  answers
> 
>  Regards
>  --David
> 
>  -- Forwarded message --
>  From: Ralph Castain 
>  Date: Oct 20, 2007 5:12 PM
>  Subject: Re: [OMPI devel] Trying to get total procs num in odls
>  framework
>  To: David Erukhimovich 
> 
>  Hi David
> 
>  Thanks for the info - see comments below.
> 
>  Ralph
> 
> 
>  On 10/20/07 6:58 AM, "David Erukhimovich"
>   wrote:
> 
> > Hi
> > Thank you for your answer.
> >
> > First of all, my two questions weren't connected and they belong to
> > different parts of my project. And the subject of the mail should have
> > been: Trying to get total procs num in rds framework (sorry, my
> > mistake).
> >
> > Here are the parts in the order of the last email
> >
> > 1. I've solved the problem about getting total num of procs in rds
> > (just called some function incorrectly), so sorry for disturbing you
> > about that.
> > Now a bit more about what I'm trying to do, maybe there is a better
> > way than mine:
> > I have a tool (external application) that, given a list of machines
> > and a number n, chooses the n best ones from the list (least loaded
> > ones), and if the list of machines isn't given, it just returns the n
> > best machines from the cluster. I wish to include this in ompi. Hence,
> > given a machinefile, it'll run the process only on the best nodes. If
> > a machinefile isn't given, it'll take the best node that my
> > application returns.
> > I think the best p

Re: [OMPI devel] PathScale 3.0 problems with Open MPI 1.2.[34]

2007-10-29 Thread Tom Mitchell
On Oct 23 08:57, Jeff Squyres wrote:
> On Oct 23, 2007, at 6:33 AM, Bogdan Costescu wrote:
> 
> >> There is in the openib BTL.
> >
> > Bug #1025 has the following phrase in one of its answers:
> >
> > "It looks like this will affect many threading issues with the
> > pathscale compiler -- the openib BTL is simply the first place we
> > tripped it."
..

> >> and I do not see the problems that you are seeing.  :-\  Is Debian
> >> etch a supported pathscale platform?
> >
> > Seems like it's not... And indeed the older RHEL4 is a supported
> > platform, which might explain the different results.
> 
> You might want to ask them if Debian etch is supported.


Debian is not supported by the PathScale compiler.  It might
be possible to get it to run... ;-) on the most recent "Debian
etch" but I am sure it is untested.

http://www.pathscale.com/docs/Install.pdf

See Section 2:  Table 2-1 Supported Linux Distributions and Platforms

Red Hat Enterprise Linux 4 (RHEL4) 
Fedora Core 3, 4, 5 (FC3, FC4, FC5)
SUSE Linux Enterprise 10a
SUSE Linux Enterprise Server (SLES) 9
SUSE Linux Professional 9.3

The PathScale compiler itself is a 32-bit object.
There are two versions of Debian etch, 32-bit and 64-bit.
Which one is being used?  The new Debian etch 64-bit has
minimal support for 32-bit user space programs by default. It may be
necessary to install 32-bit libs and/or compatibility libs to
get it to run correctly.

I do not know which gcc compiler suite Debian GCC is based on.
"gcc -v" can tell volumes.  Since the PathScale compiler
uses parts of the gcc compiler suite, this could be important.
In RHEL4, compare and contrast gcc4-4.1.1-53 vs. gcc-3.4.6-8.
The PathScale compiler was recently rebased to also work with gcc4
based systems.
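
As a complement to "gcc -v", a small C sketch like the one below can be
built with the compiler under test to see what it produces. It reports the
pointer width of the program it is compiled into (not of the compiler
binary itself) and any GCC version macros the front end exposes. The
function names are made up for this example, and whether a given compiler
defines the __GNUC__ family of macros is an assumption to check, not a
given:

```c
#include <limits.h>
#include <stdio.h>

/* Pointer width, in bits, of the program this is compiled into. */
static int sketch_pointer_bits(void)
{
    return (int)(sizeof(void *) * CHAR_BIT);
}

/* Print the target pointer width and any GCC-style version information
 * that the compiler's front end chooses to expose. */
static void sketch_print_build_info(void)
{
    printf("pointer width: %d bits\n", sketch_pointer_bits());
#ifdef __GNUC__
    printf("GCC-compatible front end: %d.%d.%d\n",
           __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__);
#else
    printf("no GCC version macros defined\n");
#endif
#ifdef __VERSION__
    printf("compiler version string: %s\n", __VERSION__);
#endif
}
```

Calling sketch_print_build_info() from main() and comparing the output
between the system gcc and the compiler in question shows whether both
target the same word size and which gcc generation each one reports.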

Do foundation Open MPI tools like .../bin/orterun 
compile and run well enough to return errors?  If the Usage message
fails to print then all bets are off for MPI itself. Example:

$  OMPIhome/bin/orterun
orterun (OpenRTE) YourVersion

Usage: orterun [OPTION]...  [PROGRAM]...
Start the given program using Open RTE

... etc.

Scan the eko man page for PathScale compiler specific options.

ompi_info does not capture compiler flags so report your
configure line, make line and environment variables.



-- 
T o m   M i t c h e l l
Host Solutions Group, QLogic Corp.  
http://www.qlogic.com   http://support.qlogic.com