[hwloc-devel] hwloc nightly build: SUCCESS

2017-06-23 Thread mpiteam
Successful builds: ['v1.11', 'master']
Skipped builds: []
Failed builds: []

=== Build output ===

Branches: ['v1.11', 'master']

Starting build for v1.11
Found new revision 2110c37
v1.11 build of revision 2110c37 completed successfully

Starting build for master
Found new revision 7de553e
Successfully submitted Coverity build
master build of revision 7de553e completed successfully

Your friendly daemon,
Cyrador
___
hwloc-devel mailing list
hwloc-devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/hwloc-devel


Re: [OMPI devel] orterun busted

2017-06-23 Thread r...@open-mpi.org
Odd - I guess my machine is just consistently lucky, as was the CI's when this 
went through. The problem field is actually stale - we haven't used it in years - 
so I simply removed it from orte_process_info.

https://github.com/open-mpi/ompi/pull/3741 should fix the problem.

> On Jun 23, 2017, at 3:38 AM, George Bosilca  wrote:
> 
> Ralph,
> 
> I got consistent segfaults during infrastructure teardown in orterun (I 
> noticed them on OSX). After digging a little it turns out that the 
> opal_buffer_t class has been cleaned up in orte_finalize before 
> orte_proc_info_finalize is called, leading to the destructors being called 
> on uninitialized memory. If I change the teardown order to move 
> orte_proc_info_finalize before orte_finalize, things work better, but I 
> still get a very annoying warning about a "Bad file descriptor in select".
> 
> Any better fix ?
> 
> George.
> 
> PS: Here is the patch I am currently using to get rid of the segfaults
> 
> diff --git a/orte/tools/orterun/orterun.c b/orte/tools/orterun/orterun.c
> index 85aba0a0f3..506b931d35 100644
> --- a/orte/tools/orterun/orterun.c
> +++ b/orte/tools/orterun/orterun.c
> @@ -222,10 +222,10 @@ int orterun(int argc, char *argv[])
>   DONE:
>  /* cleanup and leave */
>  orte_submit_finalize();
> -orte_finalize();
> -orte_session_dir_cleanup(ORTE_JOBID_WILDCARD);
>  /* cleanup the process info */
>  orte_proc_info_finalize();
> +orte_finalize();
> +orte_session_dir_cleanup(ORTE_JOBID_WILDCARD);
> 
>  if (orte_debug_flag) {
>  fprintf(stderr, "exiting with status %d\n", orte_exit_status);
> 

___
devel mailing list
devel@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/devel

Re: [OMPI devel] Abstraction violation!

2017-06-23 Thread Jeff Squyres (jsquyres)
FWIW: mpi.h is created at the end of configure (it's an AC_CONFIG_HEADERS file).


> On Jun 22, 2017, at 9:37 PM, Barrett, Brian via devel wrote:
> 
> Thanks, Nathan.
> 
> There’s no mpi.h available on the PR builder hosts, so somehow it works out.  
> Haven’t thought through that path, however.
> 
> Brian
> 
>> On Jun 22, 2017, at 6:04 PM, Nathan Hjelm  wrote:
>> 
>> I have a fix I am working on. Will open a PR tomorrow morning.
>> 
>> -Nathan
>> 
>>> On Jun 22, 2017, at 6:11 PM, r...@open-mpi.org wrote:
>>> 
>>> Here’s something even weirder. You cannot build that file unless mpi.h 
>>> already exists, which it won’t until you build the MPI layer. So apparently 
>>> what is happening is that we somehow pickup a pre-existing version of mpi.h 
>>> and use that to build the file?
>>> 
>>> Checking around, I find that all my available machines have an mpi.h 
>>> somewhere in the default path because we always install _something_. I 
>>> wonder if our master would fail in a distro that didn’t have an MPI 
>>> installed...
>>> 
>>>> On Jun 22, 2017, at 5:02 PM, r...@open-mpi.org wrote:
>>>> 
>>>> It apparently did come in that way. We just never test -no-ompi and so it 
>>>> wasn’t discovered until a downstream project tried to update. Then...boom.
 
 
> On Jun 22, 2017, at 4:07 PM, Barrett, Brian via devel wrote:
> 
> I’m confused; looking at history, there’s never been a time when 
> opal/util/info.c hasn’t included mpi.h.  That seems odd, but so does info 
> being in opal.
> 
> Brian
> 
>> On Jun 22, 2017, at 3:46 PM, r...@open-mpi.org wrote:
>> 
>> I don’t understand what someone was thinking, but you CANNOT #include 
>> “mpi.h” in opal/util/info.c. It has broken pretty much every downstream 
>> project.
>> 
>> Please fix this!
>> Ralph
>> 


-- 
Jeff Squyres
jsquy...@cisco.com


[OMPI devel] orterun busted

2017-06-23 Thread George Bosilca
Ralph,

I got consistent segfaults during infrastructure teardown in orterun (I
noticed them on OSX). After digging a little it turns out that the
opal_buffer_t class has been cleaned up in orte_finalize before
orte_proc_info_finalize is called, leading to the destructors being called on
uninitialized memory. If I change the teardown order to move
orte_proc_info_finalize before orte_finalize, things work better, but I
still get a very annoying warning about a "Bad file descriptor in select".

Any better fix ?

George.

PS: Here is the patch I am currently using to get rid of the segfaults

diff --git a/orte/tools/orterun/orterun.c b/orte/tools/orterun/orterun.c
index 85aba0a0f3..506b931d35 100644
--- a/orte/tools/orterun/orterun.c
+++ b/orte/tools/orterun/orterun.c
@@ -222,10 +222,10 @@ int orterun(int argc, char *argv[])
  DONE:
 /* cleanup and leave */
 orte_submit_finalize();
-orte_finalize();
-orte_session_dir_cleanup(ORTE_JOBID_WILDCARD);
 /* cleanup the process info */
 orte_proc_info_finalize();
+orte_finalize();
+orte_session_dir_cleanup(ORTE_JOBID_WILDCARD);

 if (orte_debug_flag) {
 fprintf(stderr, "exiting with status %d\n", orte_exit_status);

Re: [OMPI devel] orte-clean not cleaning left over temporary I/O files in /tmp

2017-06-23 Thread Christoph Niethammer
Hi Howard,

You find the pull request under https://github.com/open-mpi/ompi/pull/3739

Best
Christoph

- Original Message -
From: "Howard Pritchard" 
To: "Open MPI Developers" 
Sent: Thursday, June 22, 2017 4:42:14 PM
Subject: Re: [OMPI devel] orte-clean not cleaning left over temporary I/O files 
in /tmp

Hi Chris 

Please go ahead and open a PR for master and I'll open corresponding ones for 
the release branches. 

Howard 

Christoph Niethammer <nietham...@hlrs.de> wrote on Thu, 22 June 2017 at 01:10: 


Hi Howard, 

Sorry, missed the new license policy. I added a Sign-off now. 
Shall I open a pull request? 

Best 
Christoph 

- Original Message - 
From: "Howard Pritchard" <hpprit...@gmail.com> 
To: "Open MPI Developers" <devel@lists.open-mpi.org> 
Sent: Wednesday, June 21, 2017 5:57:05 PM 
Subject: Re: [OMPI devel] orte-clean not cleaning left over temporary I/O files 
in /tmp 

Hi Chris, 

Sorry for being a bit picky, but could you add a sign-off to the commit 
message? 
I'm not suppose to manually add it for you. 

Thanks, 

Howard 


2017-06-21 9:45 GMT-06:00 Howard Pritchard <hpprit...@gmail.com>: 



Hi Chris, 

Thanks very much for the patch! 

Howard 


2017-06-21 9:43 GMT-06:00 Christoph Niethammer <nietham...@hlrs.de>: 


Hello Ralph, 

Thanks for the update on this issue. 

I used the latest master (c38866eb3929339147259a3a46c6fc815720afdb). 

The behaviour is still the same: aborting before MPI_File_close leaves 
/tmp/OMPI_*.sm files. 
These are not removed by your updated orte-clean. 

I now searched for the origin of these files and it seems to be in 
ompi/mca/sharedfp/sm/sharedfp_sm_file_open.c:154, where a leftover TODO note a 
few lines above also mentions the need for a correct directory. 

I would suggest updating the path there to be under the 
directory which is cleaned by orte-clean, see 

https://github.com/cniethammer/ompi/commit/2aedf6134813299803628e7d6856a3b781542c02 

Best 
Christoph 

- Original Message - 
From: "Ralph Castain" <r...@open-mpi.org> 
To: "Open MPI Developers" <devel@lists.open-mpi.org> 
Sent: Wednesday, June 21, 2017 4:33:29 AM 
Subject: Re: [OMPI devel] orte-clean not cleaning left over temporary I/O files 
in /tmp 

I updated orte-clean in master, and for v3.0, so it cleans up both current 
and legacy session directory files as well as any PMIx artifacts. I don’t see 
any files named OMPI_*.sm, though that might be something from v2.x? I don’t 
recall us ever making files of that name before - anything we make should be 
under the session directory, not directly in /tmp. 

> On May 9, 2017, at 2:10 AM, Christoph Niethammer <nietham...@hlrs.de> wrote: 
> 
> Hi, 
> 
> I am using Open MPI 2.1.0. 
> 
> Best 
> Christoph 
> 
> - Original Message - 
> From: "Ralph Castain" <r...@open-mpi.org> 
> To: "Open MPI Developers" <devel@lists.open-mpi.org> 
> Sent: Monday, May 8, 2017 6:28:42 PM 
> Subject: Re: [OMPI devel] orte-clean not cleaning left over temporary I/O 
> files in /tmp 
> 
> What version of OMPI are you using? 
> 
>> On May 8, 2017, at 8:56 AM, Christoph Niethammer <nietham...@hlrs.de> wrote: 
>> 
>> Hello 
>> 
>> According to the manpage "...orte-clean attempts to clean up any processes 
>> and files left over from Open MPI jobs that were run in the past as well as 
>> any currently running jobs. This includes OMPI infrastructure and helper 
>> commands, any processes that were spawned as part of the job, and any 
>> temporary files...". 
>> 
>> If I now have a program which calls MPI_File_open, MPI_File_write and 
>> MPI_Abort() in order, I get left over files /tmp/OMPI_*.sm. 
>> Running orte-clean does not remove them. 
>> 
>> Is this a