Re: [OMPI devel] [OMPI users] flex.exe

2010-01-22 Thread Ralph Castain
Let's shift this to the devel mailing list and add it to the Tues telecon.

Thanks for clarifying. Sounds to me like the suggestions made below are right - 
we shouldn't be distributing binaries in the main tarball for export reasons. 
Seems like we have four options:

1. A separate Windows-tool tarball

2. remove flex from the 3-4 places it is used in the code base and replace it 
with something that doesn't have this requirement. We don't use that much text 
processing - it may not take that much effort to write our own utility for this 
purpose.

3. not use the features that are missing from the windows version.

4. even though it changes sometimes, generate the flex-code output and ship it 
like we used to do

Regardless, shipping a binary in a source tarball seems like a really bad idea 
in this age of viral concerns.


On Jan 22, 2010, at 3:09 AM, Shiqing Fan wrote:

> 
> Hi,
> 
> flex.exe is not generated at compile time, but flex.exe has to be used to 
> generate those *flex*.c files during compilation, like show_help_lex.c (a.k.a 
> the flex-generated code).
> 
> The windows binary of flex on sourceforge doesn't fit the requirement of Open 
> MPI, it has some missing features. That's why we have to compile a new 
> flex.exe for Windows, and put it in the source tree.
> 
> 
> Regards,
> Shiqing
> 
> 
> Ralph Castain wrote:
>> Maybe I'm misunderstanding, but if it is generated at -compile- time, then 
>> how did it get in the 1.4.1 tarball?
>> 
>> 
>> On Jan 22, 2010, at 1:56 AM, Shiqing Fan wrote:
>> 
>>  
>>> Hi,
>>> 
>>> No, that's not true, we did ship the flex-generated code a time ago, but as 
>>> that part of code changes sometimes, we decided to generate it during 
>>> compilation time, and the flex.exe came with the first support of Windows 
>>> (CMake).
>>> 
>>> 
>>> Regards,
>>> Shiqing
>>> 
>>> Jeff Squyres wrote:
>>>
>>>> Don't we ship the flex-generated code in the tarball anyway?  If so, why 
>>>> do we ship flex.exe?
>>>> 
>>>> On Jan 21, 2010, at 12:14 PM, Barrett, Brian W wrote:
>>>> 
> I have to agree with the two requests here. Having either a windows 
> tarball or a windows build tools tarball doesn't seem too burdensome, and 
> could even be done automatically at make dist time.
> 
> Brian
> 
> 
> - Original Message -
> From: users-boun...@open-mpi.org 
> To: us...@open-mpi.org 
> Sent: Thu Jan 21 10:05:03 2010
> Subject: Re: [OMPI users] flex.exe
> 
> Am Donnerstag, den 21.01.2010, 11:52 -0500 schrieb Michael Di Domenico:
>   
>> openmpi-1.4.1/contrib/platform/win32/bin/flex.exe
>> 
>> I understand this file might be required for building on windows,
>> since I'm not I can just delete the file without issue.
>> 
>> However, for those of us under import restrictions, where binaries are
>> not allowed in, this file causes me to open the tarball and delete the
>> file (not a big deal, i know, i know).
>> 
>> But, can I put up a vote for a pure source only tree?
>>   
> I'm very much in favor of that since we can't ship this binary in
> Debian. We'd have to delete it from the tarball and repack it with every
> release which is quite cumbersome. If these tools could be shipped in a
> separate tarball that would be great!
> 
> Best regards
> Manuel
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
>   
   
>>> -- 
>>> --
>>> Shiqing Fan  http://www.hlrs.de/people/fan
>>> High Performance Computing   Tel.: +49 711 685 87234
>>> Center Stuttgart (HLRS)Fax.: +49 711 685 65832
>>> Address:Allmandring 30   email: f...@hlrs.de70569 Stuttgart
>>> 
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>> 
>>  
> 
> 
> -- 
> --
> Shiqing Fan  http://www.hlrs.de/people/fan
> High Performance Computing   Tel.: +49 711 685 87234
> Center Stuttgart (HLRS)Fax.: +49 711 685 65832
> Address:Allmandring 30   email: f...@hlrs.de70569 Stuttgart
> 




Re: [OMPI devel] [OMPI users] flex.exe

2010-01-22 Thread Shiqing Fan

Hi,

On the users list, Jeff suggested generating the Windows flex code at 
make dist time. I hadn't thought about that before; it should work if the 
flex used is newer than 2.5.4 (the latest version is 2.5.35).


In the current tarball, the flex-generated C source won't compile under 
Windows. That's because it was generated with an old version of flex: the 
generated file includes unistd.h and there is no way to exclude it. Newer 
flex versions generate an output file containing the following piece of code:


#ifndef YY_NO_UNISTD_H
/* Special case for "unistd.h", since it is non-ANSI. We include it way
 * down here because we want the user's section 1 to have been scanned first.
 * The user has a chance to override it with an option.
 */
#include <unistd.h>
#endif

So on platforms that don't have unistd.h, just define 'YY_NO_UNISTD_H' 
to get rid of it.
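
A minimal sketch of how a build could use this guard. The `_WIN32` check, the `SCANNER_HAS_UNISTD` macro and the helper `scanner_includes_unistd` are assumptions made for illustration only; they are not part of flex or of Open MPI, and a build system could just as well pass -DYY_NO_UNISTD_H on the compiler command line:

```c
/* Assumed platform test: define the opt-out macro where unistd.h is missing. */
#ifdef _WIN32
#define YY_NO_UNISTD_H 1
#endif

/* Same guard shape as in the scanner emitted by newer flex versions. */
#ifndef YY_NO_UNISTD_H
#include <unistd.h>
#define SCANNER_HAS_UNISTD 1   /* illustrative marker, not a flex macro */
#else
#define SCANNER_HAS_UNISTD 0
#endif

/* Hypothetical helper: reports whether unistd.h was pulled in. */
int scanner_includes_unistd(void)
{
    return SCANNER_HAS_UNISTD;
}
```

The point of the guard is that the same generated .c file can be shipped in the tarball and compiled on both kinds of platform without regenerating it.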


Updating the flex used for make dist would be the best way to remove 
flex.exe from the tarball. But the Windows flex.exe should remain in the 
svn repository for svn checkout builds.



Thanks,
Shiqing


Re: [OMPI devel] [OMPI users] flex.exe

2010-01-22 Thread Jeff Squyres
Ok, moving this back to devel (sorry, I replied to an earlier mail -- before 
Ralph moved it to devel).

Let's figure out how to generate the relevant code that you need at "make dist" 
time and not include flex.exe in the tarball -- it can still be in svn if you 
want/need it.  You might want to note in README.windows that flex.exe is not 
included in the tarball for the reasons cited on the users thread.

I'll poke around and see if I can get the .c files in the tarball and therefore 
be able to exclude flex.exe -- let me get back to you later today...



On Jan 22, 2010, at 8:07 AM, Shiqing Fan wrote:

> 
> Yes, that should work but only with newer version of flex, I didn't think 
> about it before. But the windows flex.exe should still be available for svn 
> checkout build.
> 
> 
> Thanks,
> Shiqing
> 
> 
> Jeff Squyres (jsquyres) wrote:
>> 
>> What prevents us from generating the code during make dist time and 
>> therefore not shipping flex.exe?
>> 
>> -jms
>> Sent from my PDA.  No type good.
>> 
>> - Original Message -
>> From: Shiqing Fan 
>> To: Open MPI Users 
>> Cc: Jeff Squyres (jsquyres)
>> Sent: Fri Jan 22 03:56:52 2010
>> Subject: Re: [OMPI users] flex.exe
>> 
>> Hi,
>> 
>> No, that's not true, we did ship the flex-generated code a time ago, but
>> as that part of code changes sometimes, we decided to generate it during
>> compilation time, and the flex.exe came with the first support of
>> Windows (CMake).
>> 
>> 
>> Regards,
>> Shiqing
>> 
>> Jeff Squyres wrote:
>> > Don't we ship the flex-generated code in the tarball anyway?  If so, why 
>> > do we ship flex.exe?
>> >
>> > On Jan 21, 2010, at 12:14 PM, Barrett, Brian W wrote:
>> >
>> >  >> I have to agree with the two requests here. Having either a windows 
>> > tarball or a windows build tools tarball doesn't seem too burdensome, and 
>> > could even be done automatically at make dist time.
>> >>
>> >> Brian
>> >>
>> >>
>> >> - Original Message -
>> >> From: users-boun...@open-mpi.org 
>> >> To: us...@open-mpi.org 
>> >> Sent: Thu Jan 21 10:05:03 2010
>> >> Subject: Re: [OMPI users] flex.exe
>> >>
>> >> Am Donnerstag, den 21.01.2010, 11:52 -0500 schrieb Michael Di Domenico:
>> >>>>> openmpi-1.4.1/contrib/platform/win32/bin/flex.exe
>> >>>
>> >>> I understand this file might be required for building on windows,
>> >>> since I'm not I can just delete the file without issue.
>> >>>
>> >>> However, for those of us under import restrictions, where binaries are
>> >>> not allowed in, this file causes me to open the tarball and delete the
>> >>> file (not a big deal, i know, i know).
>> >>>
>> >>> But, can I put up a vote for a pure source only tree?
>> >>>  >> I'm very much in favor of that since we can't ship this binary in
>> >> Debian. We'd have to delete it from the tarball and repack it with every
>> >> release which is quite cumbersome. If these tools could be shipped in a
>> >> separate tarball that would be great!
>> >>
>> >> Best regards
>> >> Manuel
>> >>
>> >> ___
>> >> users mailing list
>> >> us...@open-mpi.org
>> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >>
>> >>
>> >> ___
>> >> users mailing list
>> >> us...@open-mpi.org
>> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >>
>> >>>
>> >
>> >  
>> 
>> --
>> --
>> Shiqing Fan  http://www.hlrs.de/people/fan
>> High Performance Computing   Tel.: +49 711 685 87234
>>  Center Stuttgart (HLRS)Fax.: +49 711 685 65832
>> Address:Allmandring 30   email: f...@hlrs.de   70569 Stuttgart
>> 
> 
> 
> -- 
> --
> Shiqing Fan  http://www.hlrs.de/people/fan
> High Performance Computing   Tel.: +49 711 685 87234
> Center Stuttgart (HLRS)Fax.: +49 711 685 65832
> Address:Allmandring 30   email: f...@hlrs.de70569 Stuttgart
> 


-- 
Jeff Squyres
jsquy...@cisco.com




Re: [OMPI devel] [OMPI users] flex.exe

2010-01-22 Thread Jeff Squyres
Actually, I take that back -- the .c files *are* in the tarball already.

Are you saying (per your other mail) that the .c files are simply generated by 
a flex that is too old, and we need to update the flex that is used to generate 
the .c files in the tarball?  If so, that's a relatively simple change to make 
in the "make a tarball" scripts at IU.


On Jan 22, 2010, at 8:38 AM, Jeff Squyres (jsquyres) wrote:

> Ok, moving this back to devel (sorry, I replied to an earlier mail -- before 
> Ralph moved it to devel).
> 
> Let's figure out how to generate the relevant code that you need at "make 
> dist" time and not include flex.exe in the tarball -- it can still be in svn 
> if you want/need it.  You might want to note in README.windows that flex.exe 
> is not included in the tarball for the reasons cited on the users thread.
> 
> I'll poke around and see if I can get the .c files in the tarball and 
> therefore be able to exclude flex.exe -- let me get back to you later today...
> 
> 
> 
> On Jan 22, 2010, at 8:07 AM, Shiqing Fan wrote:
> 
> >
> > Yes, that should work but only with newer version of flex, I didn't think 
> > about it before. But the windows flex.exe should still be available for svn 
> > checkout build.
> >
> >
> > Thanks,
> > Shiqing
> >
> >
> > Jeff Squyres (jsquyres) wrote:
> >>
> >> What prevents us from generating the code during make dist time and 
> >> therefore not shipping flex.exe?
> >>
> >> -jms
> >> Sent from my PDA.  No type good.
> >>
> >> - Original Message -
> >> From: Shiqing Fan 
> >> To: Open MPI Users 
> >> Cc: Jeff Squyres (jsquyres)
> >> Sent: Fri Jan 22 03:56:52 2010
> >> Subject: Re: [OMPI users] flex.exe
> >>
> >> Hi,
> >>
> >> No, that's not true, we did ship the flex-generated code a time ago, but
> >> as that part of code changes sometimes, we decided to generate it during
> >> compilation time, and the flex.exe came with the first support of
> >> Windows (CMake).
> >>
> >>
> >> Regards,
> >> Shiqing
> >>
> >> Jeff Squyres wrote:
> >> > Don't we ship the flex-generated code in the tarball anyway?  If so, why 
> >> > do we ship flex.exe?
> >> >
> >> > On Jan 21, 2010, at 12:14 PM, Barrett, Brian W wrote:
> >> >
> >> >  >> I have to agree with the two requests here. Having either a windows 
> >> > tarball or a windows build tools tarball doesn't seem too burdensome, and 
> >> > could even be done automatically at make dist time.
> >> >>
> >> >> Brian
> >> >>
> >> >>
> >> >> - Original Message -
> >> >> From: users-boun...@open-mpi.org 
> >> >> To: us...@open-mpi.org 
> >> >> Sent: Thu Jan 21 10:05:03 2010
> >> >> Subject: Re: [OMPI users] flex.exe
> >> >>
> >> >> Am Donnerstag, den 21.01.2010, 11:52 -0500 schrieb Michael Di Domenico:
> >> >>>>> openmpi-1.4.1/contrib/platform/win32/bin/flex.exe
> >> >>>
> >> >>> I understand this file might be required for building on windows,
> >> >>> since I'm not I can just delete the file without issue.
> >> >>>
> >> >>> However, for those of us under import restrictions, where binaries are
> >> >>> not allowed in, this file causes me to open the tarball and delete the
> >> >>> file (not a big deal, i know, i know).
> >> >>>
> >> >>> But, can I put up a vote for a pure source only tree?
> >> >>>  >> I'm very much in favor of that since we can't ship this binary 
> >> >>> in
> >> >> Debian. We'd have to delete it from the tarball and repack it with every
> >> >> release which is quite cumbersome. If these tools could be shipped in a
> >> >> separate tarball that would be great!
> >> >>
> >> >> Best regards
> >> >> Manuel
> >> >>
> >> >> ___
> >> >> users mailing list
> >> >> us...@open-mpi.org
> >> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >> >>
> >> >>
> >> >> ___
> >> >> users mailing list
> >> >> us...@open-mpi.org
> >> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >> >>
> >> >>>
> >> >
> >> > 
> >>
> >> --
> >> --
> >> Shiqing Fan  http://www.hlrs.de/people/fan
> >> High Performance Computing   Tel.: +49 711 685 87234
> >>  Center Stuttgart (HLRS)Fax.: +49 711 685 65832
> >> Address:Allmandring 30   email: f...@hlrs.de   70569 Stuttgart
> >>
> >
> >
> > --
> > --
> > Shiqing Fan  http://www.hlrs.de/people/fan
> > High Performance Computing   Tel.: +49 711 685 87234
> > Center Stuttgart (HLRS)Fax.: +49 711 685 65832
> > Address:Allmandring 30   email: f...@hlrs.de70569 Stuttgart
> >
> 
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 


-- 
Jeff Squyres
jsquy...@cisco.com




Re: [OMPI devel] [OMPI users] flex.exe

2010-01-22 Thread Shiqing Fan



> Are you saying (per your other mail) that the .c files are simply generated
> by a flex that is too old, and we need to update the flex that is used to
> generate the .c files in the tarball?  If so, that's a relatively simple
> change to make in the "make a tarball" scripts at IU.
  


Yes, that's exactly what I meant. I've already tested under Linux with flex 
2.5.35, and the generated .c files also worked under Windows. So we only 
need a newer flex to be used; then we can remove the Windows flex.exe from 
the tarball.




Thanks,
Shiqing




--
Shiqing Fan                          http://www.hlrs.de/people/fan
High Performance Computing           Tel.: +49 711 685 87234
Center Stuttgart (HLRS)              Fax.: +49 711 685 65832
Address: Allmandring 30              email: f...@hlrs.de
70569 Stuttgart




[OMPI devel] HOSTNAME environment variable

2010-01-22 Thread Nadia Derbey
Hi,

I'm wondering whether the HOSTNAME environment variable shouldn't be
handled as a "special case" when the orted daemons launch the remote
jobs. This particularly applies to batch schedulers where the caller's
environment is copied to the remote job: we are inheriting a $HOSTNAME
which is the name of the host mpirun was called from:

I ran the following small test (see getenv.c in attachment; it simply
reads the hostname once through $HOSTNAME and once through
gethostname(2)):


[derbeyn@pichu0 ~]$ hostname
pichu0
[derbeyn@pichu0 ~]$ salloc -N 2 -p pichu mpirun ./getenv
salloc: Granted job allocation 358789
Processor 0 of 2 on $HOSTNAME pichu0: Hello World
Processor 0 of 2 on host pichu93: Hello World
Processor 1 of 2 on $HOSTNAME pichu0: Hello World
Processor 1 of 2 on host pichu94: Hello World
salloc: Relinquishing job allocation 358789


Shouldn't we be getting the same value when using getenv("HOSTNAME") and 
gethostname()?
Applying the following small patch, we actually do.

Regards,
Nadia

--

Do not propagate the HOSTNAME environment variable on remote hosts

diff -r 4ab256be2a17 orte/orted/orted_main.c
--- a/orte/orted/orted_main.c   Wed Jan 20 16:45:07 2010 +0100
+++ b/orte/orted/orted_main.c   Fri Jan 22 14:54:02 2010 +0100
@@ -299,12 +299,17 @@ int orte_daemon(int argc, char *argv[])
  */
 orte_launch_environ = opal_argv_copy(environ);

+/*
+ * Set HOSTNAME to the actual hostname in order to avoid propagating
+ * the caller's HOSTNAME.
+ */
+gethostname(hostname, 100);
+opal_setenv("HOSTNAME", hostname, true, &orte_launch_environ);

 /* if orte_daemon_debug is set, let someone know we are alive right
  * away just in case we have a problem along the way
  */
 if (orted_globals.debug) {
-gethostname(hostname, 100);
 fprintf(stderr, "Daemon was launched on %s - beginning to 
initialize\n", hostname);
 }

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <mpi.h>


int main(int argc, char **argv)
{
    char *env_hostname;
    char hostname[255];
    int myrank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Hostname as inherited through the environment */
    env_hostname = getenv("HOSTNAME");
    if (NULL != env_hostname) {
        printf("Processor %d of %d on $HOSTNAME %s: Hello World\n",
               myrank, size, env_hostname);
    } else {
        printf("Processor %d of %d on $HOSTNAME NULL: Hello World\n",
               myrank, size);
    }
    /* Hostname as reported by the local system */
    if (0 == gethostname(hostname, 255)) {
        printf("Processor %d of %d on host %s: Hello World\n",
               myrank, size, hostname);
    }

    MPI_Finalize();

    exit(0);
}
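
The idea behind the patch can be sketched outside of Open MPI with plain POSIX calls. This is only an illustration: `refresh_hostname_env` and `hostname_env_matches` are hypothetical helpers, not Open MPI functions, and the patch itself uses `opal_setenv` on `orte_launch_environ` rather than `setenv` on the process environment.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for gethostname() and setenv() */
#endif
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if $HOSTNAME equals the name reported by
 * gethostname(), 0 otherwise (including when $HOSTNAME is unset). */
int hostname_env_matches(void)
{
    char name[256];
    const char *env_name = getenv("HOSTNAME");

    if (env_name == NULL || gethostname(name, sizeof(name)) != 0) {
        return 0;
    }
    name[sizeof(name) - 1] = '\0';   /* may be unterminated if truncated */
    return strcmp(env_name, name) == 0;
}

/* Hypothetical helper: overwrite an inherited HOSTNAME with the local name.
 * Returns 0 on success, -1 on failure. */
int refresh_hostname_env(void)
{
    char name[256];

    if (gethostname(name, sizeof(name)) != 0) {
        return -1;
    }
    name[sizeof(name) - 1] = '\0';
    return setenv("HOSTNAME", name, 1);   /* 1 = overwrite existing value */
}
```

Calling `refresh_hostname_env()` before child processes are spawned means they inherit the local name instead of the one copied from the node where mpirun was started.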


Re: [OMPI devel] HOSTNAME environment variable

2010-01-22 Thread N.M. Maclaren

On Jan 22 2010, Nadia Derbey wrote:


I'm wondering whether the HOSTNAME environment variable shouldn't be
handled as a "special case" when the orted daemons launch the remote
jobs. This particularly applies to batch schedulers where the caller's
environment is copied to the remote job: we are inheriting a $HOSTNAME
which is the name of the host mpirun was called from:


This is slightly orthogonal, but relevant.

This is an ancient mess with propagating environment variables, and predates
MPI by many years.  The most traditional form was the demented connexion
protocols that propagated TERM - truly wonderful when logging in from SunOS
to HP-UX!  Whether it is worth kludging up one variable and leaving the rest
is unclear.

Even if systems are fairly homogeneous, it is common for the head node to
have a different set of standard values from the others.  TMPDIR is one
very common one, but any of the dozen or so path variables is likely to
vary, at least sometimes, as are many of the others.

I used to have to write the most DISGUSTING hacks to stop unwanted export
when I managed our supercomputer.  Yet there are other systems that will
work only if you DO export environment variables.  And there are systems
where the secondary nodes aren't real systems, and using the parent hostname
would be better, though I haven't managed any.

Realistically, there should really be some kind of hook to control which
are transferred and which are not.  I haven't found one - if there is, it's
a better way to tackle this.

Regards,
Nick Maclaren.
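
One possible shape for such a hook is a deny list applied when the environment is copied. The sketch below is an assumption about how it could look, not an existing Open MPI or SLURM interface; `filter_environ` and `env_entry_is_denied` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the "NAME=value" entry names a variable that appears in the
 * NULL-terminated deny list, 0 otherwise. Names must match exactly. */
static int env_entry_is_denied(const char *entry, const char *const *deny)
{
    const char *eq = strchr(entry, '=');
    size_t name_len = eq ? (size_t)(eq - entry) : strlen(entry);

    for (; *deny != NULL; ++deny) {
        if (strlen(*deny) == name_len &&
            strncmp(entry, *deny, name_len) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Hypothetical filter hook: copy entry pointers from env (NULL-terminated)
 * into out, skipping denied names. out must have room for cap pointers,
 * including the terminating NULL. Returns the number of entries kept. */
size_t filter_environ(char *const *env, const char *const *deny,
                      const char **out, size_t cap)
{
    size_t kept = 0;

    if (cap == 0) {
        return 0;
    }
    for (; *env != NULL && kept + 1 < cap; ++env) {
        if (!env_entry_is_denied(*env, deny)) {
            out[kept++] = *env;
        }
    }
    out[kept] = NULL;
    return kept;
}
```

A deny list covers the HOSTNAME and TMPDIR cases; sites that must export only selected variables would want the inverse, an allow list, which is the same loop with the test negated.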




Re: [OMPI devel] HOSTNAME environment variable

2010-01-22 Thread Ralph Castain
Hi Nadia

That sounds like a bug in your SLURM config file - SLURM certainly doesn't 
propagate "hostname" by default as that would definitely mess things up for 
more than OMPI.

Are you sure that SLURM is propagating the environment (something I have never 
seen before)? Or is OMPI mistakenly picking it up and propagating it?

On Jan 22, 2010, at 7:25 AM, Nadia Derbey wrote:

> Hi,
> 
> I'm wondering whether the HOSTNAME environment variable shouldn't be
> handled as a "special case" when the orted daemons launch the remote
> jobs. This particularly applies to batch schedulers where the caller's
> environment is copied to the remote job: we are inheriting a $HOSTNAME
> which is the name of the host mpirun was called from:
> 
> I tried to run the following small test (see getenv.c in attachment - it
> substantially gets the hostname once through $HOSTNAME, and once through
> gethostname(2)):
> 
> 
> [derbeyn@pichu0 ~]$ hostname
> pichu0
> [derbeyn@pichu0 ~]$ salloc -N 2 -p pichu mpirun ./getenv
> salloc: Granted job allocation 358789
> Processor 0 of 2 on $HOSTNAME pichu0: Hello World
> Processor 0 of 2 on host pichu93: Hello World
> Processor 1 of 2 on $HOSTNAME pichu0: Hello World
> Processor 1 of 2 on host pichu94: Hello World
> salloc: Relinquishing job allocation 358789
> 
> 
> Shouldn't we be getting the same value when using getenv("HOSTNAME") and 
> gethostname()?
> Applying the following small patch, we actually do.
> 
> Regards,
> Nadia
> 
> --
> 
> Do not propagate the HOSTNAME environment variable on remote hosts
> 
> diff -r 4ab256be2a17 orte/orted/orted_main.c
> --- a/orte/orted/orted_main.c   Wed Jan 20 16:45:07 2010 +0100
> +++ b/orte/orted/orted_main.c   Fri Jan 22 14:54:02 2010 +0100
> @@ -299,12 +299,17 @@ int orte_daemon(int argc, char *argv[])
>  */
> orte_launch_environ = opal_argv_copy(environ);
> 
> +/*
> + * Set HOSTNAME to the actual hostname in order to avoid propagating
> + * the caller's HOSTNAME.
> + */
> +gethostname(hostname, 100);
> +opal_setenv("HOSTNAME", hostname, true, &orte_launch_environ);
> 
> /* if orte_daemon_debug is set, let someone know we are alive right
>  * away just in case we have a problem along the way
>  */
> if (orted_globals.debug) {
> -gethostname(hostname, 100);
> fprintf(stderr, "Daemon was launched on %s - beginning to 
> initialize\n", hostname);
> }
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] HOSTNAME environment variable

2010-01-22 Thread Ralph Castain
For SLURM, there is a config file where you can specify what gets propagated. 
It is clearly an error to include hostname as it messes many things up, not 
just OMPI. Frankly, I've never seen someone do that on SLURM.

I believe in this case OMPI is likely incorrectly picking up the environment 
and propagating it. We know this is incorrectly happening on Torque, and it 
appears to also be happening on SLURM. This is a bug that I will be fixing on 
Torque - and as soon as Nadia confirms, on SLURM as well.

I know that on Torque it was an innocent mistake where a line got added to the 
launch code that shouldn't have...

On Jan 22, 2010, at 8:07 AM, N.M. Maclaren wrote:

> On Jan 22 2010, Nadia Derbey wrote:
>> 
>> I'm wondering whether the HOSTNAME environment variable shouldn't be
>> handled as a "special case" when the orted daemons launch the remote
>> jobs. This particularly applies to batch schedulers where the caller's
>> environment is copied to the remote job: we are inheriting a $HOSTNAME
>> which is the name of the host mpirun was called from:
> 
> This is slightly orthogonal, but relevant.
> 
> This is an ancient mess with propagating environment variables, and predates
> MPI by many years.  The most traditional form was the demented connexion
> protocols that propagated TERM - truly wonderful when logging in from SunOS
> to HP-UX!  Whether it is worth kludging up one variable and leaving the rest
> is unclear.
> 
> Even if systems are fairly homogeneous, it is common for the head node to
> have a different set of standard values from the others.  TMPDIR is one
> very common one, but any of the dozen or so path variables is likely to
> vary, at least sometimes, as are many of the others.
> 
> I used to have to write the most DISGUSTING hacks to stop unwanted export
> when I managed our supercomputer.  Yet there are other systems that will
> work only if you DO export environment variables.  And there are systems
> where the secondary nodes aren't real systems, and using the parent hostname
> would be better, though I haven't managed any.
> 
> Realistically, there should really be some kind of hook to control which
> are transferred and which are not.  I haven't found one - if there is, it's
> a better way to tackle this.
> 
> Regards,
> Nick Maclaren.
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] HOSTNAME environment variable

2010-01-22 Thread Ralph Castain
A quick and easy way to answer my question of slurm vs ompi:

Just do "srun script-that-echos-hostname-and-gethostname". If you get the right 
hostnames, then OMPI is to blame, not slurm.
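
The check can also be done with a small C program instead of a script. A sketch, assuming a hypothetical helper `hostname_report` that formats both values into one line:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for gethostname() */
#endif
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: write "env=<$HOSTNAME> sys=<gethostname()>" into buf.
 * Returns 0 on success, -1 if gethostname() fails or buf is too small. */
int hostname_report(char *buf, size_t len)
{
    char sys_name[256];
    const char *env_name = getenv("HOSTNAME");
    int n;

    if (gethostname(sys_name, sizeof(sys_name)) != 0) {
        return -1;
    }
    sys_name[sizeof(sys_name) - 1] = '\0';   /* may be unterminated */

    n = snprintf(buf, len, "env=%s sys=%s",
                 env_name ? env_name : "(unset)", sys_name);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Printing the result from a `main` and launching that binary once under srun and once under mpirun shows which launcher hands over the stale value: the two fields differ only where HOSTNAME was inherited from another node.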

On Jan 22, 2010, at 8:07 AM, Ralph Castain wrote:

> Hi Nadia
> 
> That sounds like a bug in your SLURM config file - SLURM certainly doesn't 
> propagate "hostname" by default as that would definitely mess things up for 
> more than OMPI.
> 
> Are you sure that SLURM is propagating the environment (something I have 
> never seen before)? Or is OMPI mistakenly picking it up and propagating it?
> 
> On Jan 22, 2010, at 7:25 AM, Nadia Derbey wrote:
> 
>> Hi,
>> 
>> I'm wondering whether the HOSTNAME environment variable shouldn't be
>> handled as a "special case" when the orted daemons launch the remote
>> jobs. This particularly applies to batch schedulers where the caller's
>> environment is copied to the remote job: we are inheriting a $HOSTNAME
>> which is the name of the host mpirun was called from:
>> 
>> I tried to run the following small test (see getenv.c in attachment - it
>> substantially gets the hostname once through $HOSTNAME, and once through
>> gethostname(2)):
>> 
>> 
>> [derbeyn@pichu0 ~]$ hostname
>> pichu0
>> [derbeyn@pichu0 ~]$ salloc -N 2 -p pichu mpirun ./getenv
>> salloc: Granted job allocation 358789
>> Processor 0 of 2 on $HOSTNAME pichu0: Hello World
>> Processor 0 of 2 on host pichu93: Hello World
>> Processor 1 of 2 on $HOSTNAME pichu0: Hello World
>> Processor 1 of 2 on host pichu94: Hello World
>> salloc: Relinquishing job allocation 358789
>> 
>> 
>> Shouldn't we be getting the same value when using getenv("HOSTNAME") and 
>> gethostname()?
>> Applying the following small patch, we actually do.
>> 
>> Regards,
>> Nadia
>> 
>> --
>> 
>> Do not propagate the HOSTNAME environment variable on remote hosts
>> 
>> diff -r 4ab256be2a17 orte/orted/orted_main.c
>> --- a/orte/orted/orted_main.c   Wed Jan 20 16:45:07 2010 +0100
>> +++ b/orte/orted/orted_main.c   Fri Jan 22 14:54:02 2010 +0100
>> @@ -299,12 +299,17 @@ int orte_daemon(int argc, char *argv[])
>> */
>>orte_launch_environ = opal_argv_copy(environ);
>> 
>> +/*
>> + * Set HOSTNAME to the actual hostname in order to avoid propagating
>> + * the caller's HOSTNAME.
>> + */
>> +gethostname(hostname, 100);
>> +opal_setenv("HOSTNAME", hostname, true, &orte_launch_environ);
>> 
>>/* if orte_daemon_debug is set, let someone know we are alive right
>> * away just in case we have a problem along the way
>> */
>>if (orted_globals.debug) {
>> -gethostname(hostname, 100);
>>fprintf(stderr, "Daemon was launched on %s - beginning to 
>> initialize\n", hostname);
>>}
>> 
> 




Re: [OMPI devel] HOSTNAME environment variable

2010-01-22 Thread Nadia Derbey
On Fri, 2010-01-22 at 08:12 -0700, Ralph Castain wrote:
> For SLURM, there is a config file where you can specify what gets propagated. 
> It is clearly an error to include hostname as it messes many things up, not 
> just OMPI. Frankly, I've never seen someone do that on SLURM.
> 
I'm going to check that.

Thanks,
Nadia

> I believe in this case OMPI is likely incorrectly picking up the environment 
> and propagating it. We know this is incorrectly happening on Torque, and it 
> appears to also be happening on SLURM. This is a bug that I will be fixing on 
> Torque - and as soon as Nadia confirms, on SLURM as well.
> 
> I know that on Torque it was an innocent mistake where a line got added to 
> the launch code that shouldn't have...
> 
> On Jan 22, 2010, at 8:07 AM, N.M. Maclaren wrote:
> 
> > On Jan 22 2010, Nadia Derbey wrote:
> >> 
> >> I'm wondering whether the HOSTNAME environment variable shouldn't be
> >> handled as a "special case" when the orted daemons launch the remote
> >> jobs. This particularly applies to batch schedulers where the caller's
> >> environment is copied to the remote job: we are inheriting a $HOSTNAME
> >> which is the name of the host mpirun was called from:
> > 
> > This is slightly orthogonal, but relevant.
> > 
> > This is an ancient mess with propagating environment variables, and predates
> > MPI by many years.  The most traditional form was the demented connexion
> > protocols that propagated TERM - truly wonderful when logging in from SunOS
> > to HP-UX!  Whether it is worth kludging up one variable and leaving the rest
> > is unclear.
> > 
> > Even if systems are fairly homogeneous, it is common for the head node to
> > have a different set of standard values from the others.  TMPDIR is one
> > very common one, but any of the dozen or so path variables is likely to
> > vary, at least sometimes, as are many of the others.
> > 
> > I used to have to write the most DISGUSTING hacks to stop unwanted export
> > when I managed our supercomputer.  Yet there are other systems that will
> > work only if you DO export environment variables.  And there are systems
> > where the secondary nodes aren't real systems, and using the parent hostname
> > would be better, though I haven't managed any.
> > 
> > Realistically, there should really be some kind of hook to control which
> > are transferred and which are not.  I haven't found one - if there is, it's
> > a better way to tackle this.
> > 
> > Regards,
> > Nick Maclaren.
> > 
> > 
> 
-- 
Nadia Derbey 



Re: [OMPI devel] HOSTNAME environment variable

2010-01-22 Thread N.M. Maclaren

On Jan 22 2010, Ralph Castain wrote:

> For SLURM, there is a config file where you can specify what gets 
> propagated. It is clearly an error to include hostname as it messes many 
> things up, not just OMPI. Frankly, I've never seen someone do that on 
> SLURM.


Well, it's USUALLY an error.  That's clearly a good solution.

> I believe in this case OMPI is likely incorrectly picking up the 
> environment and propagating it. We know this is incorrectly happening on 
> Torque, and it appears to also be happening on SLURM. This is a bug that 
> I will be fixing on Torque - and as soon as Nadia confirms, on SLURM as 
> well.


I should have run a cross-check!  It doesn't happen on my bare OpenMPI
installation.

Regards,
Nick Maclaren.




Re: [OMPI devel] [OMPI users] flex.exe

2010-01-22 Thread Jeff Squyres
Shiqing and I took this offlist and have a solution which looks like it works.  
End results:

- no more flex.exe in tarballs
- updated the flex to 2.5.35 on the IU machine that is used to generate 1.4 and 
1.5 tarballs; hence, the generated _lex.c files in the tarball are 
Windows-friendly
- changes to cmake files to adapt to the above

We should be able to commit these changes sometime today (i.e., the changes 
will appear in trunk nightlies tonight); we'll CMR them to the v1.4 and 1.5 
branches so that they'll be in v1.4.2 and v1.5[.0], respectively.



On Jan 22, 2010, at 8:52 AM, Shiqing Fan wrote:

> 
> > Are you saying (per your other mail) that the .c files are simply generated 
> > by a flex that is too old, and we need to update the flex that is used to 
> > generate the .c files in the tarball?  If so, that's a relatively simple 
> > change to make in the "make a tarball" scripts at IU.
> >  
> 
> Yes, exactly what I meant. I've already tested under Linux with flex
> 2.5.35, and the generated .c files also worked under Windows. So only
> a new flex needs to be used, then we can remove the windows flex.exe from
> the tarball.
> 
> 
> 
> Thanks,
> Shiqing
> 
> 
> > On Jan 22, 2010, at 8:38 AM, Jeff Squyres (jsquyres) wrote:
> >
> >  
> >> Ok, moving this back to devel (sorry, I replied to an earlier mail -- 
> >> before Ralph moved it to devel).
> >>
> >> Let's figure out how to generate the relevant code that you need at "make 
> >> dist" time and not include flex.exe in the tarball -- it can still be in 
> >> svn if you want/need it.  You might want to note in README.windows that 
> >> flex.exe is not included in the tarball for the reasons cited on the users 
> >> thread.
> >>
> >> I'll poke around and see if I can get the .c files in the tarball and 
> >> therefore be able to exclude flex.exe -- let me get back to you later 
> >> today...
> >>
> >>
> >>
> >> On Jan 22, 2010, at 8:07 AM, Shiqing Fan wrote:
> >>
> >>
> >>> Yes, that should work, but only with a newer version of flex; I didn't think 
> >>> about it before. But the windows flex.exe should still be available for 
> >>> svn checkout builds.
> >>>
> >>>
> >>> Thanks,
> >>> Shiqing
> >>>
> >>>
> >>> Jeff Squyres (jsquyres) wrote:
> >>>  
>  What prevents us from generating the code during make dist time and 
>  therefore not shipping flex.exe?
> 
>  -jms
>  Sent from my PDA.  No type good.
> 
>  - Original Message -
>  From: Shiqing Fan 
>  To: Open MPI Users 
>  Cc: Jeff Squyres (jsquyres)
>  Sent: Fri Jan 22 03:56:52 2010
>  Subject: Re: [OMPI users] flex.exe
> 
>  Hi,
> 
>  No, that's not true, we did ship the flex-generated code a time ago, but
>  as that part of code changes sometimes, we decided to generate it during
>  compilation time, and the flex.exe came with the first support of
>  Windows (CMake).
> 
> 
>  Regards,
>  Shiqing
> 
>  Jeff Squyres wrote:
> 
> > Don't we ship the flex-generated code in the tarball anyway?  If so, 
> > why do we ship flex.exe?
> >
> > On Jan 21, 2010, at 12:14 PM, Barrett, Brian W wrote:
> >
> >> I have to agree with the two requests here. Having either a windows 
> >> tarball or a windows build tools tarball doesn't seem too burdensome, 
> >> and could even be done automatically at make dist time.
> >  
> >> Brian
> >>
> >>
> >> - Original Message -
> >> From: users-boun...@open-mpi.org 
> >> To: us...@open-mpi.org 
> >> Sent: Thu Jan 21 10:05:03 2010
> >> Subject: Re: [OMPI users] flex.exe
> >>
> >> On Thursday, 21.01.2010, at 11:52 -0500, Michael Di Domenico wrote:
> >>>>> openmpi-1.4.1/contrib/platform/win32/bin/flex.exe
> >>
> >>> I understand this file might be required for building on windows,
> >>> since I'm not I can just delete the file without issue.
> >>>
> >>> However, for those of us under import restrictions, where binaries are
> >>> not allowed in, this file causes me to open the tarball and delete the
> >>> file (not a big deal, i know, i know).
> >>>
> >>> But, can I put up a vote for a pure source only tree?
> >>
> >> I'm very much in favor of that since we can't ship this binary in
> >> Debian. We'd have to delete it from the tarball and repack it with every
> >> release which is quite cumbersome. If these tools could be shipped in a
> >> separate tarball that would be great!
> >>
> >> Best regards
> >> Manuel
> >>
> >> ___
> >> users mailing list
> >> us...@open-mpi.org
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >>
> >>

Re: [OMPI devel] HOSTNAME environment variable

2010-01-22 Thread Nadia Derbey
On Fri, 2010-01-22 at 08:22 -0700, Ralph Castain wrote:
> A quick and easy way to answer my question of slurm vs ompi:
> 
> Just do "srun script-that-echos-hostname-and-gethostname". If you get the 
> right hostnames, then OMPI is to blame, not slurm.
> 

No, I'm not...
Will check the configuration.

Thanks a lot,
Nadia

> On Jan 22, 2010, at 8:07 AM, Ralph Castain wrote:
> 
> > Hi Nadia
> > 
> > That sounds like a bug in your SLURM config file - SLURM certainly doesn't 
> > propagate "hostname" by default as that would definitely mess things up for 
> > more than OMPI.
> > 
> > Are you sure that SLURM is propagating the environment (something I have 
> > never seen before)? Or is OMPI mistakenly picking it up and propagating it?
> > 
> > On Jan 22, 2010, at 7:25 AM, Nadia Derbey wrote:
> > 
> >> Hi,
> >> 
> >> I'm wondering whether the HOSTNAME environment variable shouldn't be
> >> handled as a "special case" when the orted daemons launch the remote
> >> jobs. This particularly applies to batch schedulers where the caller's
> >> environment is copied to the remote job: we are inheriting a $HOSTNAME
> >> which is the name of the host mpirun was called from:
> >> 
> >> I tried to run the following small test (see getenv.c in attachment - it
> >> substantially gets the hostname once through $HOSTNAME, and once through
> >> gethostname(2)):
> >> 
> >> 
> >> [derbeyn@pichu0 ~]$ hostname
> >> pichu0
> >> [derbeyn@pichu0 ~]$ salloc -N 2 -p pichu mpirun ./getenv
> >> salloc: Granted job allocation 358789
> >> Processor 0 of 2 on $HOSTNAME pichu0: Hello World
> >> Processor 0 of 2 on host pichu93: Hello World
> >> Processor 1 of 2 on $HOSTNAME pichu0: Hello World
> >> Processor 1 of 2 on host pichu94: Hello World
> >> salloc: Relinquishing job allocation 358789
> >> 
> >> 
> >> Shouldn't we be getting the same value when using getenv("HOSTNAME") and 
> >> gethostname()?
> >> Applying the following small patch, we actually do.
> >> 
> >> Regards,
> >> Nadia
> >> 
> >> --
> >> 
> >> Do not propagate the HOSTNAME environment variable on remote hosts
> >> 
> >> diff -r 4ab256be2a17 orte/orted/orted_main.c
> >> --- a/orte/orted/orted_main.c   Wed Jan 20 16:45:07 2010 +0100
> >> +++ b/orte/orted/orted_main.c   Fri Jan 22 14:54:02 2010 +0100
> >> @@ -299,12 +299,17 @@ int orte_daemon(int argc, char *argv[])
> >> */
> >>orte_launch_environ = opal_argv_copy(environ);
> >> 
> >> +/*
> >> + * Set HOSTNAME to the actual hostname in order to avoid propagating
> >> + * the caller's HOSTNAME.
> >> + */
> >> +gethostname(hostname, 100);
> >> +opal_setenv("HOSTNAME", hostname, true, &orte_launch_environ);
> >> 
> >>/* if orte_daemon_debug is set, let someone know we are alive right
> >> * away just in case we have a problem along the way
> >> */
> >>if (orted_globals.debug) {
> >> -gethostname(hostname, 100);
> >>fprintf(stderr, "Daemon was launched on %s - beginning to 
> >> initialize\n", hostname);
> >>}
> >> 
> 
-- 
Nadia Derbey 



[OMPI devel] crash when using coll_tuned_use_dynamic_rules option with 1.4

2010-01-22 Thread Shiqing Fan

Hi,

When I try to select different alltoall algorithms using command line:

$> mpirun -mca coll_tuned_use_dynamic_rules 1 -mca coll_tuned_alltoall_algorithm 2 IMB-MPI alltoall

it just crashes.

I suppose that "coll_tuned_use_dynamic_rules" and 
"coll_tuned_alltoall_algorithm" should be used together when no extra rule file is 
specified, is that correct? But whatever algorithm I try to use, the application just crashes. 
Could this be a bug?

The problem seems to exist only in Open MPI v1.4; with 1.3 and 1.3.3 there is no 
such problem.


Thanks,
Shiqing 




--
--
Shiqing Fan  http://www.hlrs.de/people/fan
High Performance Computing   Tel.: +49 711 685 87234
 Center Stuttgart (HLRS)Fax.: +49 711 685 65832
Address:Allmandring 30   email: f...@hlrs.de
70569 Stuttgart




Re: [OMPI devel] crash when using coll_tuned_use_dynamic_rules option with 1.4

2010-01-22 Thread Holger Berger
Hi,

I tracked this down a bit, and my impression is that this piece of code in
coll_tuned_component.c

    if (ompi_coll_tuned_use_dynamic_rules) {
        mca_base_param_reg_string(&mca_coll_tuned_component.super.collm_version,
                                  "dynamic_rules_filename",
                                  "Filename of configuration file that contains the dynamic (@runtime) decision function rules",
                                  false, false,
                                  ompi_coll_tuned_dynamic_rules_filename,
                                  &ompi_coll_tuned_dynamic_rules_filename);
        if( ompi_coll_tuned_dynamic_rules_filename ) {
            OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:component_open Reading collective rules file [%s]",
                         ompi_coll_tuned_dynamic_rules_filename));
            rc = ompi_coll_tuned_read_rules_config_file( ompi_coll_tuned_dynamic_rules_filename,
                                                         &(mca_coll_tuned_component.all_base_rules), COLLCOUNT);
            if( rc >= 0 ) {
                OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:module_open Read %d valid rules\n", rc));
            } else {
                OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:module_open Reading collective rules file failed\n"));
                mca_coll_tuned_component.all_base_rules = NULL;
            }
        }

    }

does not initialize the msg_rules (ompi_coll_tuned_read_rules_config_file 
does that by calling ompi_coll_tuned_mk_msg_rules) in the case that

ompi_coll_tuned_use_dynamic_rules is TRUE
and
ompi_coll_tuned_dynamic_rules_filename is FALSE (i.e. NULL, no file given)

which leads to a crash at the line 
  if( (NULL == base_com_rule) || (0 == base_com_rule->n_msg_sizes)) 
in coll_tuned_dynamic_rules.c:361,
as base_com_rule seems to be uninitialized, but NOT zero, and points somewhere...


That is probably not intended, as it prevents the selection of an algorithm
by a switch like -mca coll_tuned_alltoall_algorithm 2.
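
The failure mode described above can be sketched in isolation. The following is a simplified illustration with hypothetical types and function names (com_rule_t, tuned_state_t, tuned_open, has_msg_rules); it is not the Open MPI code and not necessarily the fix that should be applied there. It assumes the remedy is to give the rule pointer a defined NULL value on the path where no rules file is read, so that the existing NULL check can do its job:

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-communicator rule entry. */
typedef struct {
    int n_msg_sizes;
} com_rule_t;

/* Hypothetical stand-in for the component state discussed above. */
typedef struct {
    int         use_dynamic_rules;  /* like coll_tuned_use_dynamic_rules    */
    const char *rules_filename;     /* NULL when no rules file was given    */
    com_rule_t *base_com_rule;      /* must never be left uninitialized     */
} tuned_state_t;

/* Sketch of an "open" step: the pointer is reset to NULL first, so it has
 * a defined value even when dynamic rules are on but no file is given.
 * `rules_from_file` stands in for whatever reading the file would yield. */
void tuned_open(tuned_state_t *s, com_rule_t *rules_from_file)
{
    s->base_com_rule = NULL;
    if (s->use_dynamic_rules && s->rules_filename != NULL) {
        s->base_com_rule = rules_from_file;
    }
}

/* Same shape of guard as the line quoted from coll_tuned_dynamic_rules.c:
 * safe only if base_com_rule is either NULL or a valid pointer. */
int has_msg_rules(const tuned_state_t *s)
{
    const com_rule_t *base_com_rule = s->base_com_rule;
    if ((NULL == base_com_rule) || (0 == base_com_rule->n_msg_sizes)) {
        return 0;
    }
    return 1;
}
```

With the pointer reset up front, the "dynamic rules on, no file" case falls through the guard cleanly instead of dereferencing a stale address, which would leave the per-collective algorithm switch usable.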

Hope that helps fixing it...





-- 
Holger Berger
System Integration and Support
HPCE Division NEC Deutschland GmbH
Tel: +49-711-6877035 hber...@hpce.nec.com
Fax: +49-711-6877145 http://www.nec.com/de
NEC Deutschland GmbH, Hansaallee 101, 40549 Düsseldorf
Geschäftsführer Yuya Momose
Handelsregister Düsseldorf HRB 57941; VAT ID DE129424743