Re: [OMPI devel] [OMPI users] flex.exe
Let's shift this to the devel mailing list and add it to the Tues telecon.

Thanks for clarifying. Sounds to me like the suggestions made below are right - we shouldn't be distributing a binary in the main tarball for export reasons. Seems like we have four options:

1. A separate Windows-tool tarball.

2. Remove flex from the 3-4 places it is used in the code base and replace it with something that doesn't have this requirement. We don't use that much text processing - it may not take that much effort to write our own utility for this purpose.

3. Not use the features that are missing from the Windows version.

4. Even though it changes sometimes, generate the flex-code output and ship it like we used to do.

Regardless, shipping a binary in a source tarball seems like a really bad idea in this age of viral concerns.

On Jan 22, 2010, at 3:09 AM, Shiqing Fan wrote:

> Hi,
>
> flex.exe is not generated at compile time, but flex.exe has to be used to
> generate those *flex*.c files during compilation, like show_help_lex.c (a.k.a
> the flex-generated code).
>
> The windows binary of flex on sourceforge doesn't fit the requirement of Open
> MPI, it has some missing features. That's why we have to compile a new
> flex.exe for Windows, and put it in the source tree.
>
> Regards,
> Shiqing
>
> Ralph Castain wrote:
>> Maybe I'm misunderstanding, but if it is generated at -compile- time, then
>> how did it get in the 1.4.1 tarball?
>>
>> On Jan 22, 2010, at 1:56 AM, Shiqing Fan wrote:
>>
>>> Hi,
>>>
>>> No, that's not true, we did ship the flex-generated code a time ago, but as
>>> that part of code changes sometimes, we decided to generate it during
>>> compilation time, and the flex.exe came with the first support of Windows
>>> (CMake).
>>>
>>> Regards,
>>> Shiqing
>>>
>>> Jeff Squyres wrote:
>>>> Don't we ship the flex-generated code in the tarball anyway? If so, why
>>>> do we ship flex.exe?
>>>>
>>>> On Jan 21, 2010, at 12:14 PM, Barrett, Brian W wrote:
>>>>
>>>>> I have to agree with the two requests here. Having either a windows
>>>>> tarball or a windows build tools tarball doesn't seem too burdensome, and
>>>>> could even be done automatically at make dist time.
>>>>>
>>>>> Brian
>>>>>
>>>>> - Original Message -
>>>>> From: users-boun...@open-mpi.org
>>>>> To: us...@open-mpi.org
>>>>> Sent: Thu Jan 21 10:05:03 2010
>>>>> Subject: Re: [OMPI users] flex.exe
>>>>>
>>>>> On Thursday, 21.01.2010, at 11:52 -0500, Michael Di Domenico wrote:
>>>>>>
>>>>>> openmpi-1.4.1/contrib/platform/win32/bin/flex.exe
>>>>>>
>>>>>> I understand this file might be required for building on windows,
>>>>>> since I'm not I can just delete the file without issue.
>>>>>>
>>>>>> However, for those of us under import restrictions, where binaries are
>>>>>> not allowed in, this file causes me to open the tarball and delete the
>>>>>> file (not a big deal, i know, i know).
>>>>>>
>>>>>> But, can I put up a vote for a pure source only tree?
>>>>>>
>>>>> I'm very much in favor of that since we can't ship this binary in
>>>>> Debian. We'd have to delete it from the tarball and repack it with every
>>>>> release which is quite cumbersome. If these tools could be shipped in a
>>>>> separate tarball that would be great!
>>>>>
>>>>> Best regards
>>>>> Manuel
>>>>>
>>>>> ___
>>>>> users mailing list
>>>>> us...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> --
> Shiqing Fan                          http://www.hlrs.de/people/fan
> High Performance Computing           Tel.: +49 711 685 87234
>   Center Stuttgart (HLRS)            Fax.: +49 711 685 65832
> Address: Allmandring 30              email: f...@hlrs.de
> 70569 Stuttgart
Re: [OMPI devel] [OMPI users] flex.exe
Hi,

On the users list, Jeff mentioned generating the Windows flex code at make dist time. I didn't think about it before; it should work if flex is newer than 2.5.4 (the latest version is 2.5.35).

In the tarball as it is created now, the flex-generated C source won't compile under Windows. That's because an old version of flex is used: the generated file includes unistd.h, and there is no way to exclude it. Newer flex generates an output file with the following code piece:

    #ifndef YY_NO_UNISTD_H
    /* Special case for "unistd.h", since it is non-ANSI. We include it way
     * down here because we want the user's section 1 to have been scanned first.
     * The user has a chance to override it with an option.
     */
    #include <unistd.h>
    #endif

So on platforms that don't have unistd.h, just define 'YY_NO_UNISTD_H' to get rid of it.

Updating the flex that is used for make dist will be the best solution for removing flex.exe from the tarball. But the Windows flex.exe had better remain in the svn repository for svn checkout builds.

Thanks,
Shiqing
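The YY_NO_UNISTD_H guard described in Shiqing's mail can be tried outside of a generated scanner. What follows is only a minimal sketch: the _WIN32 test, the SCANNER_HAS_UNISTD macro and the helper scanner_has_unistd() are assumptions made up for this illustration and are not taken from the Open MPI build system or from flex itself.

```c
/* Sketch only: mimics the guard that newer flex emits in generated scanners.
 * The _WIN32 test and all names below are assumptions for this example. */

/* On a platform without unistd.h, define the macro before the guard is seen. */
#if defined(_WIN32) && !defined(YY_NO_UNISTD_H)
#define YY_NO_UNISTD_H 1
#endif

/* Same shape as the guard in the flex-generated file. */
#ifndef YY_NO_UNISTD_H
#include <unistd.h>
#define SCANNER_HAS_UNISTD 1
#else
#define SCANNER_HAS_UNISTD 0
#endif

/* Hypothetical helper: reports whether unistd.h was pulled in. */
int scanner_has_unistd(void)
{
    return SCANNER_HAS_UNISTD;
}
```

The same effect can presumably be had by passing -DYY_NO_UNISTD_H on the compiler command line for the Windows build, which keeps the generated .c file identical on all platforms.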
Re: [OMPI devel] [OMPI users] flex.exe
Ok, moving this back to devel (sorry, I replied to an earlier mail -- before Ralph moved it to devel).

Let's figure out how to generate the relevant code that you need at "make dist" time and not include flex.exe in the tarball -- it can still be in svn if you want/need it. You might want to note in README.windows that flex.exe is not included in the tarball for the reasons cited on the users thread.

I'll poke around and see if I can get the .c files in the tarball and therefore be able to exclude flex.exe -- let me get back to you later today...

On Jan 22, 2010, at 8:07 AM, Shiqing Fan wrote:

> Yes, that should work but only with newer version of flex, I didn't think
> about it before. But the windows flex.exe should still be available for svn
> checkout build.
>
> Thanks,
> Shiqing
>
> Jeff Squyres (jsquyres) wrote:
>> What prevents us from generating the code during make dist time and
>> therefore not shipping flex.exe?
>>
>> -jms
>> Sent from my PDA. No type good.

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI devel] [OMPI users] flex.exe
Actually, I take that back -- the .c files *are* in the tarball already.

Are you saying (per your other mail) that the .c files are simply generated by a flex that is too old, and we need to update the flex that is used to generate the .c files in the tarball? If so, that's a relatively simple change to make in the "make a tarball" scripts at IU.

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI devel] [OMPI users] flex.exe
Jeff Squyres wrote:
> Are you saying (per your other mail) that the .c files are simply generated
> by a flex that is too old, and we need to update the flex that is used to
> generate the .c files in the tarball? If so, that's a relatively simple
> change to make in the "make a tarball" scripts at IU.

Yes, exactly what I meant. I've already tested under Linux with flex 2.5.35, and the generated .c files also worked under Windows. So only a newer flex needs to be used; then we can remove the Windows flex.exe from the tarball.

Thanks,
Shiqing
[OMPI devel] HOSTNAME environment variable
Hi,

I'm wondering whether the HOSTNAME environment variable shouldn't be handled as a "special case" when the orted daemons launch the remote jobs. This particularly applies to batch schedulers where the caller's environment is copied to the remote job: we are inheriting a $HOSTNAME which is the name of the host mpirun was called from.

I tried to run the following small test (see getenv.c in attachment - it essentially gets the hostname once through $HOSTNAME, and once through gethostname(2)):

    [derbeyn@pichu0 ~]$ hostname
    pichu0
    [derbeyn@pichu0 ~]$ salloc -N 2 -p pichu mpirun ./getenv
    salloc: Granted job allocation 358789
    Processor 0 of 2 on $HOSTNAME pichu0: Hello World
    Processor 0 of 2 on host pichu93: Hello World
    Processor 1 of 2 on $HOSTNAME pichu0: Hello World
    Processor 1 of 2 on host pichu94: Hello World
    salloc: Relinquishing job allocation 358789

Shouldn't we be getting the same value when using getenv("HOSTNAME") and gethostname()?
Applying the following small patch, we actually do.

Regards,
Nadia

--

Do not propagate the HOSTNAME environment variable on remote hosts

diff -r 4ab256be2a17 orte/orted/orted_main.c
--- a/orte/orted/orted_main.c Wed Jan 20 16:45:07 2010 +0100
+++ b/orte/orted/orted_main.c Fri Jan 22 14:54:02 2010 +0100
@@ -299,12 +299,17 @@ int orte_daemon(int argc, char *argv[])
      */
     orte_launch_environ = opal_argv_copy(environ);
 
+    /*
+     * Set HOSTNAME to the actual hostname in order to avoid propagating
+     * the caller's HOSTNAME.
+     */
+    gethostname(hostname, 100);
+    opal_setenv("HOSTNAME", hostname, true, &orte_launch_environ);
 
     /* if orte_daemon_debug is set, let someone know we are alive right
      * away just in case we have a problem along the way
      */
     if (orted_globals.debug) {
-        gethostname(hostname, 100);
         fprintf(stderr, "Daemon was launched on %s - beginning to initialize\n", hostname);
     }

getenv.c:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    char *env_hostname;
    char hostname[255];
    int myrank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Hostname as seen through the (possibly inherited) environment */
    env_hostname = getenv("HOSTNAME");
    if (NULL != env_hostname) {
        printf("Processor %d of %d on $HOSTNAME %s: Hello World\n",
               myrank, size, env_hostname);
    } else {
        printf("Processor %d of %d on $HOSTNAME NULL: Hello World\n",
               myrank, size);
    }

    /* Hostname as reported by the local system */
    if (0 == gethostname(hostname, 255)) {
        printf("Processor %d of %d on host %s: Hello World\n",
               myrank, size, hostname);
    }

    MPI_Finalize();
    exit(0);
}
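The idea behind the patch, overwriting an inherited HOSTNAME with what the local system reports before any child is started, can be sketched without ORTE. This is only an illustration under stated assumptions: it uses POSIX setenv() on the process's own environment rather than opal_setenv() on orte_launch_environ, and the helper name refresh_hostname_env() is made up for the example.

```c
/* Sketch only: refresh HOSTNAME in the current process environment so that
 * children started afterwards inherit the local name. Uses POSIX setenv(),
 * not Open MPI's opal_setenv(); the helper name is hypothetical. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns 0 on success, -1 if the local hostname cannot be obtained or set. */
int refresh_hostname_env(void)
{
    char name[256];
    const char *current;

    if (gethostname(name, sizeof(name)) != 0) {
        return -1;
    }
    /* gethostname() may leave the buffer unterminated on truncation */
    name[sizeof(name) - 1] = '\0';

    /* Nothing to do if the inherited value is already correct */
    current = getenv("HOSTNAME");
    if (current != NULL && strcmp(current, name) == 0) {
        return 0;
    }
    return setenv("HOSTNAME", name, 1);
}
```

Calling such a helper once, early in a daemon and before it forks any application process, would give the same observable result as the patch for this one variable; it does nothing for other inherited variables.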
Re: [OMPI devel] HOSTNAME environment variable
On Jan 22 2010, Nadia Derbey wrote:
>
> I'm wondering whether the HOSTNAME environment variable shouldn't be
> handled as a "special case" when the orted daemons launch the remote
> jobs. This particularly applies to batch schedulers where the caller's
> environment is copied to the remote job: we are inheriting a $HOSTNAME
> which is the name of the host mpirun was called from:

This is slightly orthogonal, but relevant.

This is an ancient mess with propagating environment variables, and predates MPI by many years. The most traditional form was the demented connexion protocols that propagated TERM - truly wonderful when logging in from SunOS to HP-UX! Whether it is worth kludging up one variable and leaving the rest is unclear.

Even if systems are fairly homogeneous, it is common for the head node to have a different set of standard values from the others. TMPDIR is one very common one, but any of the dozen or so path variables is likely to vary, at least sometimes, as are many of the others.

I used to have to write the most DISGUSTING hacks to stop unwanted export when I managed our supercomputer. Yet there are other systems that will work only if you DO export environment variables. And there are systems where the secondary nodes aren't real systems, and using the parent hostname would be better, though I haven't managed any.

Realistically, there should really be some kind of hook to control which are transferred and which are not. I haven't found one - if there is, it's a better way to tackle this.

Regards,
Nick Maclaren.
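To make the "hook" idea above concrete, here is one possible shape for it: a deny list applied to an environment array before it is handed to a remote launch. This is a hypothetical sketch only. Neither filter_env() nor entry_is_var() is an existing Open MPI, SLURM or Torque interface, and a real hook would also need an allow-list mode for the systems that work only when variables are exported.

```c
/* Sketch only: a deny-list "hook" deciding which variables are forwarded.
 * All names are hypothetical; this is not an existing Open MPI API. */
#include <stddef.h>
#include <string.h>

/* Returns 1 if entry (of the form "NAME=value") belongs to variable name. */
static int entry_is_var(const char *entry, const char *name)
{
    size_t n = strlen(name);
    return strncmp(entry, name, n) == 0 && entry[n] == '=';
}

/* Compacts the NULL-terminated array env in place, dropping every entry
 * whose variable name appears in the NULL-terminated deny list.
 * Returns the number of entries kept. Dropped strings are not freed. */
size_t filter_env(char **env, const char *const *deny)
{
    size_t kept = 0;

    for (size_t i = 0; env[i] != NULL; i++) {
        int drop = 0;
        for (size_t j = 0; deny[j] != NULL; j++) {
            if (entry_is_var(env[i], deny[j])) {
                drop = 1;
                break;
            }
        }
        if (!drop) {
            env[kept++] = env[i];
        }
    }
    env[kept] = NULL;
    return kept;
}
```

With a deny list of HOSTNAME and TMPDIR, an array holding HOSTNAME=pichu0, PATH=/usr/bin and TMPDIR=/scratch would be reduced to the PATH entry alone. Matching on the full name followed by '=' keeps a variable such as HOSTNAME_ALIAS from being dropped by accident.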
Re: [OMPI devel] HOSTNAME environment variable
Hi Nadia

That sounds like a bug in your SLURM config file - SLURM certainly doesn't propagate "hostname" by default as that would definitely mess things up for more than OMPI.

Are you sure that SLURM is propagating the environment (something I have never seen before)? Or is OMPI mistakenly picking it up and propagating it?
Re: [OMPI devel] HOSTNAME environment variable
For SLURM, there is a config file where you can specify what gets propagated. It is clearly an error to include hostname as it messes many things up, not just OMPI. Frankly, I've never seen someone do that on SLURM.

I believe in this case OMPI is likely incorrectly picking up the environment and propagating it. We know this is incorrectly happening on Torque, and it appears to also be happening on SLURM. This is a bug that I will be fixing on Torque - and as soon as Nadia confirms, on SLURM as well.

I know that on Torque it was an innocent mistake where a line got added to the launch code that shouldn't have...
Re: [OMPI devel] HOSTNAME environment variable
A quick and easy way to answer my question of slurm vs ompi:

Just do "srun script-that-echos-hostname-and-gethostname". If you get the right hostnames, then OMPI is to blame, not slurm.
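The check Ralph suggests does not need MPI at all. A minimal sketch of such a test, written in C rather than as a shell script, could look like the following; the helper name hostname_env_matches() and its return convention are assumptions for this example only.

```c
/* Sketch only: compares $HOSTNAME with gethostname(2), no MPI required.
 * Helper name and return convention are assumptions for this example. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if both values match, 0 if they differ,
 * and -1 if either value is unavailable. */
int hostname_env_matches(void)
{
    char name[256];
    const char *env = getenv("HOSTNAME");

    if (env == NULL || gethostname(name, sizeof(name)) != 0) {
        return -1;
    }
    /* gethostname() may leave the buffer unterminated on truncation */
    name[sizeof(name) - 1] = '\0';

    printf("$HOSTNAME=%s gethostname=%s\n", env, name);
    return strcmp(env, name) == 0;
}
```

Wrapped in a trivial main() and launched directly with srun, with no mpirun involved, this shows what SLURM itself hands to each node: if the two values already differ there, the stale HOSTNAME comes from the scheduler configuration rather than from OMPI.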
Re: [OMPI devel] HOSTNAME environment variable
On Fri, 2010-01-22 at 08:12 -0700, Ralph Castain wrote:
> For SLURM, there is a config file where you can specify what gets propagated.
> It is clearly an error to include hostname as it messes many things up, not
> just OMPI. Frankly, I've never seen someone do that on SLURM.
>

I'm going to check that.

Thanks,
Nadia

> I believe in this case OMPI is likely incorrectly picking up the environment
> and propagating it. We know this is incorrectly happening on Torque, and it
> appears to also be happening on SLURM. This is a bug that I will be fixing on
> Torque - and as soon as Nadia confirms, on SLURM as well.
>
> I know that on Torque it was an innocent mistake where a line got added to
> the launch code that shouldn't have...
>
> On Jan 22, 2010, at 8:07 AM, N.M. Maclaren wrote:
>
> > On Jan 22 2010, Nadia Derbey wrote:
> >>
> >> I'm wondering whether the HOSTNAME environment variable shouldn't be
> >> handled as a "special case" when the orted daemons launch the remote
> >> jobs. This particularly applies to batch schedulers where the caller's
> >> environment is copied to the remote job: we are inheriting a $HOSTNAME
> >> which is the name of the host mpirun was called from:
> >
> > This is slightly orthogonal, but relevant.
> >
> > This is an ancient mess with propagating environment variables, and predates
> > MPI by many years. The most traditional form was the demented connexion
> > protocols that propagated TERM - truly wonderful when logging in from SunOS
> > to HP-UX! Whether it is worth kludging up one variable and leaving the rest
> > is unclear.
> >
> > Even if systems are fairly homogeneous, it is common for the head node to
> > have a different set of standard values from the others. TMPDIR is one
> > very common one, but any of the dozen or so path variables is likely to
> > vary, at least sometimes, as are many of the others.
> >
> > I used to have to write the most DISGUSTING hacks to stop unwanted export
> > when I managed our supercomputer. Yet there are other systems that will
> > work only if you DO export environment variables. And there are systems
> > where the secondary nodes aren't real systems, and using the parent hostname
> > would be better, though I haven't managed any.
> >
> > Realistically, there should really be some kind of hook to control which
> > are transferred and which are not. I haven't found one - if there is, it's
> > a better way to tackle this.
> >
> > Regards,
> > Nick Maclaren.
>

--
Nadia Derbey
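
The kind of hook Nick Maclaren asks for (something that controls which variables are transferred and which are not) could look roughly like the following C sketch. It is an assumption made for illustration only: filter_environ, free_environ and the deny list are hypothetical names, not part of Open MPI, SLURM or Torque, and a real launcher would also need a way for the administrator to configure the list.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return 1 if entry ("NAME=value") names a variable on the deny list. */
static int is_denied(const char *entry, const char *const deny[], size_t ndeny)
{
    const char *eq = strchr(entry, '=');
    size_t name_len = eq ? (size_t)(eq - entry) : strlen(entry);

    for (size_t i = 0; i < ndeny; i++) {
        /* Compare whole names, so HOSTNAME does not match HOSTNAMEX. */
        if (strlen(deny[i]) == name_len && strncmp(entry, deny[i], name_len) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Free a NULL-terminated environment array produced by filter_environ(). */
void free_environ(char **env)
{
    if (env == NULL) {
        return;
    }
    for (size_t i = 0; env[i] != NULL; i++) {
        free(env[i]);
    }
    free(env);
}

/*
 * Hypothetical hook: copy env, dropping every variable named in deny.
 * Returns a newly allocated NULL-terminated array, or NULL on allocation
 * failure. The caller releases it with free_environ().
 */
char **filter_environ(char *const env[], const char *const deny[], size_t ndeny)
{
    size_t n = 0, kept = 0;
    char **out;

    assert(env != NULL);
    while (env[n] != NULL) {
        n++;
    }
    out = calloc(n + 1, sizeof(*out));  /* zeroed, so always NULL-terminated */
    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        if (is_denied(env[i], deny, ndeny)) {
            continue;
        }
        out[kept] = malloc(strlen(env[i]) + 1);
        if (out[kept] == NULL) {
            free_environ(out);
            return NULL;
        }
        strcpy(out[kept], env[i]);
        kept++;
    }
    return out;
}
```

A deny list is used here because the thread is about stopping a few host-specific variables (HOSTNAME, TMPDIR, TERM); for the systems mentioned above that only work when variables are exported, an allow list would be the mirror-image design.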
Re: [OMPI devel] HOSTNAME environment variable
On Jan 22 2010, Ralph Castain wrote:

> For SLURM, there is a config file where you can specify what gets propagated.
> It is clearly an error to include hostname as it messes many things up, not
> just OMPI. Frankly, I've never seen someone do that on SLURM.

Well, it's USUALLY an error. That's clearly a good solution.

> I believe in this case OMPI is likely incorrectly picking up the environment
> and propagating it. We know this is incorrectly happening on Torque, and it
> appears to also be happening on SLURM. This is a bug that I will be fixing on
> Torque - and as soon as Nadia confirms, on SLURM as well.

I should have run a cross-check! It doesn't happen on my bare OpenMPI installation.

Regards,
Nick Maclaren.
Re: [OMPI devel] [OMPI users] flex.exe
Shiqing and I took this offlist and have a solution which looks like it works. End results:

- no more flex.exe in tarballs
- updated the flex to 2.5.35 on the IU machine that is used to generate 1.4 and 1.5 tarballs; hence, the generated _lex.c files in the tarball are Windows-friendly
- changes to cmake files to adapt to the above

We should be able to commit these changes sometime today (i.e., the changes will appear in trunk nightlies tonight); we'll CMR them to the v1.4 and 1.5 branches so that they'll be in v1.4.2 and v1.5[.0], respectively.

On Jan 22, 2010, at 8:52 AM, Shiqing Fan wrote:

> > Are you saying (per your other mail) that the .c files are simply generated
> > by a flex that is too old, and we need to update the flex that is used to
> > generate the .c files in the tarball? If so, that's a relatively simple
> > change to make in the "make a tarball" scripts at IU.
> >
>
> Yes, exactly what I meant. I've already tested under Linux with flex
> 2.5.35, and the generated .c files also worked under Windows. So only
> a new flex needs to be used, then we can remove the windows flex.exe from
> the tarball.
>
> Thanks,
> Shiqing
>
> > On Jan 22, 2010, at 8:38 AM, Jeff Squyres (jsquyres) wrote:
> >
> >> Ok, moving this back to devel (sorry, I replied to an earlier mail --
> >> before Ralph moved it to devel).
> >>
> >> Let's figure out how to generate the relevant code that you need at "make
> >> dist" time and not include flex.exe in the tarball -- it can still be in
> >> svn if you want/need it. You might want to note in README.windows that
> >> flex.exe is not included in the tarball for the reasons cited on the users
> >> thread.
> >>
> >> I'll poke around and see if I can get the .c files in the tarball and
> >> therefore be able to exclude flex.exe -- let me get back to you later
> >> today...
> >>
> >> On Jan 22, 2010, at 8:07 AM, Shiqing Fan wrote:
> >>
> >>> Yes, that should work but only with newer version of flex, I didn't think
> >>> about it before. But the windows flex.exe should still be available for
> >>> svn checkout build.
> >>>
> >>> Thanks,
> >>> Shiqing
> >>>
> >>> Jeff Squyres (jsquyres) wrote:
> >>>
> >>> > What prevents us from generating the code during make dist time and
> >>> > therefore not shipping flex.exe?
> >>> >
> >>> > -jms
> >>> > Sent from my PDA. No type good.
> >>> >
> >>> > - Original Message -
> >>> > From: Shiqing Fan
> >>> > To: Open MPI Users
> >>> > Cc: Jeff Squyres (jsquyres)
> >>> > Sent: Fri Jan 22 03:56:52 2010
> >>> > Subject: Re: [OMPI users] flex.exe
> >>> >
> >>> > Hi,
> >>> >
> >>> > No, that's not true, we did ship the flex-generated code a time ago, but
> >>> > as that part of code changes sometimes, we decided to generate it during
> >>> > compilation time, and the flex.exe came with the first support of
> >>> > Windows (CMake).
> >>> >
> >>> > Regards,
> >>> > Shiqing
> >>> >
> >>> > Jeff Squyres wrote:
> >>> > > Don't we ship the flex-generated code in the tarball anyway? If so,
> >>> > > why do we ship flex.exe?
> >>> > >
> >>> > > On Jan 21, 2010, at 12:14 PM, Barrett, Brian W wrote:
> >>> > >
> >>> > >> I have to agree with the two requests here. Having either a windows
> >>> > >> tarball or a windows build tools tarball doesn't seem too burdensome,
> >>> > >> and could even be done automatically at make dist time.
> >>> > >>
> >>> > >> Brian
> >>> > >>
> >>> > >> - Original Message -
> >>> > >> From: users-boun...@open-mpi.org
> >>> > >> To: us...@open-mpi.org
> >>> > >> Sent: Thu Jan 21 10:05:03 2010
> >>> > >> Subject: Re: [OMPI users] flex.exe
> >>> > >>
> >>> > >> On Thursday, 21.01.2010, at 11:52 -0500, Michael Di Domenico wrote:
> >>> > >>> openmpi-1.4.1/contrib/platform/win32/bin/flex.exe
> >>> > >>>
> >>> > >>> I understand this file might be required for building on windows,
> >>> > >>> since I'm not I can just delete the file without issue.
> >>> > >>>
> >>> > >>> However, for those of us under import restrictions, where binaries are
> >>> > >>> not allowed in, this file causes me to open the tarball and delete the
> >>> > >>> file (not a big deal, i know, i know).
> >>> > >>>
> >>> > >>> But, can I put up a vote for a pure source only tree?
> >>> > >>>
> >>> > >> I'm very much in favor of that since we can't ship this binary in
> >>> > >> Debian. We'd have to delete it from the tarball and repack it with
> >>> > >> every release which is quite cumbersome. If these tools could be
> >>> > >> shipped in a separate tarball that would be great!
> >>> > >>
> >>> > >> Best regards
> >>> > >> Manuel
> >>> > >>
> >>> > >> ___
> >>> > >> users mailing list
> >>> > >> us...@open-mpi.org
> >>> > >> http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI devel] HOSTNAME environment variable
On Fri, 2010-01-22 at 08:22 -0700, Ralph Castain wrote:
> A quick and easy way to answer my question of slurm vs ompi:
>
> Just do "srun script-that-echos-hostname-and-gethostname". If you get the
> right hostnames, then OMPI is to blame, not slurm.
>

No, I'm not... Will check the configuration.

Thanks a lot,
Nadia

> On Jan 22, 2010, at 8:07 AM, Ralph Castain wrote:
>
> > Hi Nadia
> >
> > That sounds like a bug in your SLURM config file - SLURM certainly doesn't
> > propagate "hostname" by default as that would definitely mess things up for
> > more than OMPI.
> >
> > Are you sure that SLURM is propagating the environment (something I have
> > never seen before)? Or is OMPI mistakenly picking it up and propagating it?
> >
> > On Jan 22, 2010, at 7:25 AM, Nadia Derbey wrote:
> >
> >> Hi,
> >>
> >> I'm wondering whether the HOSTNAME environment variable shouldn't be
> >> handled as a "special case" when the orted daemons launch the remote
> >> jobs. This particularly applies to batch schedulers where the caller's
> >> environment is copied to the remote job: we are inheriting a $HOSTNAME
> >> which is the name of the host mpirun was called from:
> >>
> >> I tried to run the following small test (see getenv.c in attachment - it
> >> essentially gets the hostname once through $HOSTNAME, and once through
> >> gethostname(2)):
> >>
> >> [derbeyn@pichu0 ~]$ hostname
> >> pichu0
> >> [derbeyn@pichu0 ~]$ salloc -N 2 -p pichu mpirun ./getenv
> >> salloc: Granted job allocation 358789
> >> Processor 0 of 2 on $HOSTNAME pichu0: Hello World
> >> Processor 0 of 2 on host pichu93: Hello World
> >> Processor 1 of 2 on $HOSTNAME pichu0: Hello World
> >> Processor 1 of 2 on host pichu94: Hello World
> >> salloc: Relinquishing job allocation 358789
> >>
> >> Shouldn't we be getting the same value when using getenv("HOSTNAME") and
> >> gethostname()?
> >> Applying the following small patch, we actually do.
--
Nadia Derbey
[OMPI devel] crash when using coll_tuned_use_dynamic_rules option with 1.4
Hi,

When I try to select a different alltoall algorithm from the command line:

$> mpirun -mca coll_tuned_use_dynamic_rules 1 -mca coll_tuned_alltoall_algorithm 2 IMB-MPI alltoall

it just crashes. I suppose that "coll_tuned_use_dynamic_rules" and "coll_tuned_alltoall_algorithm" should be used together when no extra rule file is specified; is that correct? But whichever algorithm I try to use, the application just crashes. Could this be a bug?

The problem seems to exist only in Open MPI v1.4; with 1.3 and 1.3.3 there is no such problem.

Thanks,
Shiqing

--
Shiqing Fan                      http://www.hlrs.de/people/fan
High Performance Computing       Tel.: +49 711 685 87234
Center Stuttgart (HLRS)          Fax.: +49 711 685 65832
Address: Allmandring 30          email: f...@hlrs.de
70569 Stuttgart
Re: [OMPI devel] crash when using coll_tuned_use_dynamic_rules option with 1.4
Hi,

I tracked this down a bit, and my impression is that this piece of code in coll_tuned_component.c

    if (ompi_coll_tuned_use_dynamic_rules) {
        mca_base_param_reg_string(&mca_coll_tuned_component.super.collm_version,
                                  "dynamic_rules_filename",
                                  "Filename of configuration file that contains the dynamic (@runtime) decision function rules",
                                  false, false, ompi_coll_tuned_dynamic_rules_filename,
                                  &ompi_coll_tuned_dynamic_rules_filename);
        if( ompi_coll_tuned_dynamic_rules_filename ) {
            OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:component_open Reading collective rules file [%s]",
                         ompi_coll_tuned_dynamic_rules_filename));
            rc = ompi_coll_tuned_read_rules_config_file( ompi_coll_tuned_dynamic_rules_filename,
                                                         &(mca_coll_tuned_component.all_base_rules), COLLCOUNT);
            if( rc >= 0 ) {
                OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:module_open Read %d valid rules\n", rc));
            } else {
                OPAL_OUTPUT((ompi_coll_tuned_stream,"coll:tuned:module_open Reading collective rules file failed\n"));
                mca_coll_tuned_component.all_base_rules = NULL;
            }
        }
    }

does not initialize the msg_rules, which ompi_coll_tuned_read_rules_config_file would do by calling ompi_coll_tuned_mk_msg_rules, in the case that ompi_coll_tuned_use_dynamic_rules is TRUE and ompi_coll_tuned_dynamic_rules_filename is not set (NULL). This leads to a crash at the line

    if( (NULL == base_com_rule) || (0 == base_com_rule->n_msg_sizes))

in coll_tuned_dynamic_rules.c:361, as base_com_rule seems to be uninitialized, but NOT zero, and points somewhere...

That is probably not intended, as it prohibits selecting an algorithm with a switch like -mca coll_tuned_alltoall_algorithm 2.

Hope that helps in fixing it...

--
Holger Berger
System Integration and Support
HPCE Division NEC Deutschland GmbH
Tel: +49-711-6877035 hber...@hpce.nec.com
Fax: +49-711-6877145 http://www.nec.com/de
NEC Deutschland GmbH, Hansaallee 101, 40549 Düsseldorf
Managing Director Yuya Momose
Commercial Register Düsseldorf HRB 57941; VAT ID DE129424743
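
The failure mode described above (a rules pointer that is only assigned on the "file name given" path, so that the later NULL check cannot catch the other path) can be illustrated with a minimal, self-contained C sketch. Everything in it is a hypothetical stand-in: com_rule_t, tuned_state_t, tuned_open and tuned_pick_algorithm are not the Open MPI types or functions, and the sketch shows only the general defensive pattern, not the fix that was eventually committed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-collective rule set. */
typedef struct {
    int n_msg_sizes;   /* number of message-size rules */
    int algorithm;     /* algorithm those rules would select */
} com_rule_t;

/* Hypothetical stand-in for the component state. */
typedef struct {
    const com_rule_t *base_rules;  /* rules read from a file, or NULL */
    int forced_algorithm;          /* value given on the command line */
} tuned_state_t;

/*
 * Put base_rules into a defined state BEFORE the conditional path runs,
 * so that "no rules file" always means NULL rather than stale memory.
 * rules_from_file is NULL when no file name was configured.
 */
void tuned_open(tuned_state_t *st, const com_rule_t *rules_from_file,
                int forced_algorithm)
{
    assert(st != NULL);
    st->base_rules = NULL;
    st->forced_algorithm = forced_algorithm;
    if (rules_from_file != NULL) {
        st->base_rules = rules_from_file;
    }
}

/* Same shape as the guard quoted above: fall back when there are no rules. */
int tuned_pick_algorithm(const tuned_state_t *st)
{
    const com_rule_t *rule = st->base_rules;

    if (rule == NULL || rule->n_msg_sizes == 0) {
        return st->forced_algorithm;
    }
    return rule->algorithm;
}
```

With the pointer reset unconditionally, the guard takes the fallback branch whenever no rules file was read, which is the behaviour one would expect from passing only -mca coll_tuned_alltoall_algorithm 2.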