Re: [OMPI devel] 1.7.4rc: yet another launch failure

2014-01-22 Thread Ralph Castain
Interesting - still, I see no reason for OMPI to fail just because of that. We 
can run just fine with the uid, so I'll make things a little more flexible.

Thanks for tracking it down!

On Jan 22, 2014, at 7:54 PM, Paul Hargrove  wrote:

> Not lacking getpwuid():
> 
> [phh1@biou2 BLD]$ grep HAVE_GETPWUID */include/*_config.h
> opal/include/opal_config.h:#define HAVE_GETPWUID 1
> 
> I also can't see why the quoted code could fail.
> The following is working fine:
> 
> [phh1@biou2 BLD]$ cat q.c
> #include <sys/types.h>
> #include <unistd.h>
> #include <pwd.h>
> #include <stdio.h>
> int main(void) {
>uid_t uid = getuid();
>printf("uid = %d\n", (int)uid);
>struct passwd *p = getpwuid(uid); 
>if (p) printf("name = %s\n", p->pw_name);
>return 0;
> }
> 
> [phh1@biou2 BLD]$ gcc -std=c99 q.c && ./a.out
> uid = 44154
> name = phh1
> 
> HOWEVER, building for ILP32 target (as in the reported failure) fails:
> 
> [phh1@biou2 BLD]$ gcc -m32 -std=c99 q.c && ./a.out
> uid = 44154
> 
> So, I am going to guess that this *is* a system misconfiguration (maybe 
> missing the 32-bit foo.so for the appropriate nsswitch resolver?) just as the 
> error message said.
> 
> Sorry for the false alarm,
> -Paul
> 
> 
> On Wed, Jan 22, 2014 at 7:36 PM, Ralph Castain  wrote:
> Here is the offending code:
> 
>  /* get the name of the user */
> uid = getuid();
> #ifdef HAVE_GETPWUID
> pwdent = getpwuid(uid);
> #else
> pwdent = NULL;
> #endif
> if (NULL != pwdent) {
> user = strdup(pwdent->pw_name);
> } else {
> orte_show_help("help-orte-runtime.txt",
>"orte:session:dir:nopwname", true);
> return ORTE_ERR_OUT_OF_RESOURCE;
> }
> 
> Is it possible on this platform that you don't have getpwuid? I'm surprised 
> at the code as we could just use the uid instead - not sure why this more 
> stringent test was applied
> 
> 
> 
> On Jan 22, 2014, at 7:02 PM, Paul Hargrove  wrote:
> 
>> On yet another test platform I see the following:
>> 
>> $ mpirun -mca btl sm,self -np 1 examples/ring_c
>> --
>> Open MPI was unable to obtain the username in order to create a path
>> for its required temporary directories.  This type of error is usually
>> caused by a transient failure of network-based authentication services
>> (e.g., LDAP or NIS failure due to network congestion), but can also be
>> an indication of system misconfiguration.
>> 
>> Please consult your system administrator about these issues and try
>> again.
>> --
>> [biou2.rice.edu:30021] [[40214,0],0] ORTE_ERROR_LOG: Out of resource in file 
>> /home/phh1/SCRATCH/OMPI/openmpi-1.7-latest-linux-ppc32-xlc-11.1/openmpi-1.7.4rc2r30361/orte/util/session_dir.c
>>  at line 380
>> [biou2.rice.edu:30021] [[40214,0],0] ORTE_ERROR_LOG: Out of resource in file 
>> /home/phh1/SCRATCH/OMPI/openmpi-1.7-latest-linux-ppc32-xlc-11.1/openmpi-1.7.4rc2r30361/orte/mca/ess/hnp/ess_hnp_module.c
>>  at line 599
>> --
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>> 
>>   orte_session_dir failed
>>   --> Returned value Out of resource (-2) instead of ORTE_SUCCESS
>> --
>> 
>> 
>> An "-np 2" run fails in the same manner.
>> This is a production system and there is no problem with "whoami" or "id", 
>> leaving me doubting the explanation provided by the error message.
>> 
>> [phh1@biou2 ~]$ whoami
>> phh1
>> [phh1@biou2 ~]$ id
>> uid=44154(phh1) gid=2016(hpc) 
>> groups=2016(hpc),3803(hpcusers),3805(sshgw),3808(biou)
>> 
>> The "ompi_info --all" output is attached.
>> Please let me know what additional info is needed.
>> 
>> -Paul
>> 
>> -- 
>> Paul H. Hargrove  phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> 

[OMPI devel] Unknown object files in libmpi.a

2014-01-22 Thread Irvanda Kurniadi
Hi,

I'm trying to port openmpi-1.6.5 to l4/fiasco. I ran "ar t libmpi.a" in my
terminal, and I can't find the source file (.c) for some of the object files
in libmpi.a, such as:
ompi_bitmap.o
op_predefined.o
convertor.o
copy_functions.o
copy_functions_heterogeneous.o
datatype_pack.o
datatype_unpack.o
dt_add.o dt_args.o .. dt_sndrcv.o (15 files)
fake_stack.o
position.o
libdatatype_reliable_la-datatype_pack.o
libdatatype_reliable_la-datatype_unpack.o
common_sm_mmap.o

Can you tell me where the sources of those object files are? I have to
compile every single .c file in openmpi that needs to be compiled.
Thanks

regards,
Irvanda


Re: [OMPI devel] 1.7.4rc: MPI_F08_INTERFACES_CALLBACKS build failure with PathScale 4.0.12.1

2014-01-22 Thread Paul Hargrove
The reason appears to be:
  checking if Fortran compiler supports BIND(C) with LOGICAL params... no

The requested files are attached.

-Paul


On Wed, Jan 22, 2014 at 7:46 PM, Jeff Squyres (jsquyres)  wrote:

> Can you send me the configure output and config.log from this build?  I'd
> like to see why it chose not to build the mpi_f08 module.
>
>
> On Jan 22, 2014, at 10:08 PM, Paul Hargrove  wrote:
>
> >
> > On Wed, Jan 22, 2014 at 6:31 PM, Jeff Squyres (jsquyres) <
> jsquy...@cisco.com> wrote:
> > But just to confirm: you said that your pathscale compilers *do* compile
> 1.7.3 -- including the mpi_f08 module -- with no problems?  That would be a
> little surprising, because those same >=32 character symbol names are in
> 1.7.3...
> >
> > Not quite - I was a bit too quick in composing that email.
> > The mpi_f08 stuff is NOT getting built in 1.7.3 when using pathf95.
> > What I should have said is "the fortran code and configure script in
> 1.7.3 work together to produce a failure-free build".
> >
> > $ bin/ompi_info | grep -e Ident -e Fort
> > Ident string: 1.7.3
> >  Fort mpif.h: yes (all)
> > Fort use mpi: yes (full: ignore TKR)
> >Fort use mpi size: deprecated-ompi-info-value
> > Fort use mpi_f08: no
> >  Fort mpi_f08 compliance: The mpi_f08 module was not built
> >   Fort mpi_f08 subarrays: no
> >Fort compiler: pathf95
> >Fort compiler abs:
> /project/projectdirs/ftg/ekopath-4.0.12.1/bin/pathf95
> >  Fort ignore TKR: yes (!DIR$ IGNORE_TKR)
> >Fort 08 assumed shape: no
> >   Fort optional args: no
> > Fort BIND(C): yes
> > Fort PRIVATE: no
> >Fort ABSTRACT: no
> >Fort ASYNCHRONOUS: no
> >   Fort PROCEDURE: no
> >  Fort f08 using wrappers: yes
> >Fort mpif.h profiling: yes
> >   Fort use mpi profiling: yes
> >Fort use mpi_f08 prof: no
> >
> > -Paul
> >
> > --
> > Paul H. Hargrove  phhargr...@lbl.gov
> > Future Technologies Group
> > Computer and Data Sciences Department Tel: +1-510-495-2352
> > Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> > ___
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


configure.stdout.bz2
Description: BZip2 compressed data


config.log.bz2
Description: BZip2 compressed data


Re: [OMPI devel] 1.7.4rc: yet another launch failure

2014-01-22 Thread Paul Hargrove
Not lacking getpwuid():

[phh1@biou2 BLD]$ grep HAVE_GETPWUID */include/*_config.h
opal/include/opal_config.h:#define HAVE_GETPWUID 1

I also can't see why the quoted code could fail.
The following is working fine:

[phh1@biou2 BLD]$ cat q.c
#include <sys/types.h>
#include <unistd.h>
#include <pwd.h>
#include <stdio.h>
int main(void) {
   uid_t uid = getuid();
   printf("uid = %d\n", (int)uid);
   struct passwd *p = getpwuid(uid);
   if (p) printf("name = %s\n", p->pw_name);
   return 0;
}

[phh1@biou2 BLD]$ gcc -std=c99 q.c && ./a.out
uid = 44154
name = phh1

HOWEVER, building for ILP32 target (as in the reported failure) fails:

[phh1@biou2 BLD]$ gcc -m32 -std=c99 q.c && ./a.out
uid = 44154

So, I am going to guess that this *is* a system misconfiguration (maybe
missing the 32-bit foo.so for the appropriate nsswitch resolver?) just as
the error message said.
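One hypothetical way to confirm that theory without rebuilding anything is to ask glibc's resolver directly (diagnostic sketch, not from the thread):

```shell
# If this resolves from a (64-bit) shell but the 32-bit test program
# prints no name, the 32-bit NSS plugin (a 32-bit libnss_*.so.2 for
# whatever 'passwd:' lists in /etc/nsswitch.conf) is likely missing.
getent passwd "$(id -u)"
```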

Sorry for the false alarm,
-Paul


On Wed, Jan 22, 2014 at 7:36 PM, Ralph Castain  wrote:

> Here is the offending code:
>
>  /* get the name of the user */
> uid = getuid();
> #ifdef HAVE_GETPWUID
> pwdent = getpwuid(uid);
> #else
> pwdent = NULL;
> #endif
> if (NULL != pwdent) {
> user = strdup(pwdent->pw_name);
> } else {
> orte_show_help("help-orte-runtime.txt",
>"orte:session:dir:nopwname", true);
> return ORTE_ERR_OUT_OF_RESOURCE;
> }
>
> Is it possible on this platform that you don't have getpwuid? I'm
> surprised at the code as we could just use the uid instead - not sure why
> this more stringent test was applied
>
>
>
> On Jan 22, 2014, at 7:02 PM, Paul Hargrove  wrote:
>
> On yet another test platform I see the following:
>
> $ mpirun -mca btl sm,self -np 1 examples/ring_c
> --
> Open MPI was unable to obtain the username in order to create a path
> for its required temporary directories.  This type of error is usually
> caused by a transient failure of network-based authentication services
> (e.g., LDAP or NIS failure due to network congestion), but can also be
> an indication of system misconfiguration.
>
> Please consult your system administrator about these issues and try
> again.
> --
> [biou2.rice.edu:30021] [[40214,0],0] ORTE_ERROR_LOG: Out of resource in
> file
> /home/phh1/SCRATCH/OMPI/openmpi-1.7-latest-linux-ppc32-xlc-11.1/openmpi-1.7.4rc2r30361/orte/util/session_dir.c
> at line 380
> [biou2.rice.edu:30021] [[40214,0],0] ORTE_ERROR_LOG: Out of resource in
> file
> /home/phh1/SCRATCH/OMPI/openmpi-1.7-latest-linux-ppc32-xlc-11.1/openmpi-1.7.4rc2r30361/orte/mca/ess/hnp/ess_hnp_module.c
> at line 599
> --
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
>   orte_session_dir failed
>   --> Returned value Out of resource (-2) instead of ORTE_SUCCESS
> --
>
>
> An "-np 2" run fails in the same manner.
> This is a production system and there is no problem with "whoami" or "id",
> leaving me doubting the explanation provided by the error message.
>
> [phh1@biou2 ~]$ whoami
> phh1
> [phh1@biou2 ~]$ id
> uid=44154(phh1) gid=2016(hpc)
> groups=2016(hpc),3803(hpcusers),3805(sshgw),3808(biou)
>
> The "ompi_info --all" output is attached.
> Please let me know what additional info is needed.
>
> -Paul
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>  ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.7.4rc: MPI_F08_INTERFACES_CALLBACKS build failure with PathScale 4.0.12.1

2014-01-22 Thread Jeff Squyres (jsquyres)
Can you send me the configure output and config.log from this build?  I'd like 
to see why it chose not to build the mpi_f08 module.


On Jan 22, 2014, at 10:08 PM, Paul Hargrove  wrote:

> 
> On Wed, Jan 22, 2014 at 6:31 PM, Jeff Squyres (jsquyres)  
> wrote:
> But just to confirm: you said that your pathscale compilers *do* compile 
> 1.7.3 -- including the mpi_f08 module -- with no problems?  That would be a 
> little surprising, because those same >=32 character symbol names are in 
> 1.7.3...
> 
> Not quite - I was a bit too quick in composing that email.
> The mpi_f08 stuff is NOT getting built in 1.7.3 when using pathf95.
> What I should have said is "the fortran code and configure script in 1.7.3 
> work together to produce a failure-free build".
> 
> $ bin/ompi_info | grep -e Ident -e Fort
> Ident string: 1.7.3
>  Fort mpif.h: yes (all)
> Fort use mpi: yes (full: ignore TKR)
>Fort use mpi size: deprecated-ompi-info-value
> Fort use mpi_f08: no
>  Fort mpi_f08 compliance: The mpi_f08 module was not built
>   Fort mpi_f08 subarrays: no
>Fort compiler: pathf95
>Fort compiler abs: 
> /project/projectdirs/ftg/ekopath-4.0.12.1/bin/pathf95
>  Fort ignore TKR: yes (!DIR$ IGNORE_TKR)
>Fort 08 assumed shape: no
>   Fort optional args: no
> Fort BIND(C): yes
> Fort PRIVATE: no
>Fort ABSTRACT: no
>Fort ASYNCHRONOUS: no
>   Fort PROCEDURE: no
>  Fort f08 using wrappers: yes
>Fort mpif.h profiling: yes
>   Fort use mpi profiling: yes
>Fort use mpi_f08 prof: no
> 
> -Paul
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] 1.7.4rc: yet another launch failure

2014-01-22 Thread Ralph Castain
Here is the offending code:

 /* get the name of the user */
uid = getuid();
#ifdef HAVE_GETPWUID
pwdent = getpwuid(uid);
#else
pwdent = NULL;
#endif
if (NULL != pwdent) {
user = strdup(pwdent->pw_name);
} else {
orte_show_help("help-orte-runtime.txt",
   "orte:session:dir:nopwname", true);
return ORTE_ERR_OUT_OF_RESOURCE;
}

Is it possible on this platform that you don't have getpwuid? I'm surprised at 
the code as we could just use the uid instead - not sure why this more 
stringent test was applied



On Jan 22, 2014, at 7:02 PM, Paul Hargrove  wrote:

> On yet another test platform I see the following:
> 
> $ mpirun -mca btl sm,self -np 1 examples/ring_c
> --
> Open MPI was unable to obtain the username in order to create a path
> for its required temporary directories.  This type of error is usually
> caused by a transient failure of network-based authentication services
> (e.g., LDAP or NIS failure due to network congestion), but can also be
> an indication of system misconfiguration.
> 
> Please consult your system administrator about these issues and try
> again.
> --
> [biou2.rice.edu:30021] [[40214,0],0] ORTE_ERROR_LOG: Out of resource in file 
> /home/phh1/SCRATCH/OMPI/openmpi-1.7-latest-linux-ppc32-xlc-11.1/openmpi-1.7.4rc2r30361/orte/util/session_dir.c
>  at line 380
> [biou2.rice.edu:30021] [[40214,0],0] ORTE_ERROR_LOG: Out of resource in file 
> /home/phh1/SCRATCH/OMPI/openmpi-1.7-latest-linux-ppc32-xlc-11.1/openmpi-1.7.4rc2r30361/orte/mca/ess/hnp/ess_hnp_module.c
>  at line 599
> --
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   orte_session_dir failed
>   --> Returned value Out of resource (-2) instead of ORTE_SUCCESS
> --
> 
> 
> An "-np 2" run fails in the same manner.
> This is a production system and there is no problem with "whoami" or "id", 
> leaving me doubting the explanation provided by the error message.
> 
> [phh1@biou2 ~]$ whoami
> phh1
> [phh1@biou2 ~]$ id
> uid=44154(phh1) gid=2016(hpc) 
> groups=2016(hpc),3803(hpcusers),3805(sshgw),3808(biou)
> 
> The "ompi_info --all" output is attached.
> Please let me know what additional info is needed.
> 
> -Paul
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel



Re: [OMPI devel] 1.7.4rc: MPI_F08_INTERFACES_CALLBACKS build failure with PathScale 4.0.12.1

2014-01-22 Thread Paul Hargrove
On Wed, Jan 22, 2014 at 6:31 PM, Jeff Squyres (jsquyres)  wrote:

> But just to confirm: you said that your pathscale compilers *do* compile
> 1.7.3 -- including the mpi_f08 module -- with no problems?  That would be a
> little surprising, because those same >=32 character symbol names are in
> 1.7.3...
>

Not quite - I was a bit too quick in composing that email.
The mpi_f08 stuff is NOT getting built in 1.7.3 when using pathf95.
What I should have said is "the fortran code and configure script in 1.7.3
work together to produce a failure-free build".

$ bin/ompi_info | grep -e Ident -e Fort
Ident string: 1.7.3
 Fort mpif.h: yes (all)
Fort use mpi: yes (full: ignore TKR)
   Fort use mpi size: deprecated-ompi-info-value
Fort use mpi_f08: no
 Fort mpi_f08 compliance: The mpi_f08 module was not built
  Fort mpi_f08 subarrays: no
   Fort compiler: pathf95
   Fort compiler abs:
/project/projectdirs/ftg/ekopath-4.0.12.1/bin/pathf95
 Fort ignore TKR: yes (!DIR$ IGNORE_TKR)
   Fort 08 assumed shape: no
  Fort optional args: no
Fort BIND(C): yes
Fort PRIVATE: no
   Fort ABSTRACT: no
   Fort ASYNCHRONOUS: no
  Fort PROCEDURE: no
 Fort f08 using wrappers: yes
   Fort mpif.h profiling: yes
  Fort use mpi profiling: yes
   Fort use mpi_f08 prof: no

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] 1.7.4rc: yet another launch failure

2014-01-22 Thread Paul Hargrove
On yet another test platform I see the following:

$ mpirun -mca btl sm,self -np 1 examples/ring_c
--
Open MPI was unable to obtain the username in order to create a path
for its required temporary directories.  This type of error is usually
caused by a transient failure of network-based authentication services
(e.g., LDAP or NIS failure due to network congestion), but can also be
an indication of system misconfiguration.

Please consult your system administrator about these issues and try
again.
--
[biou2.rice.edu:30021] [[40214,0],0] ORTE_ERROR_LOG: Out of resource in
file
/home/phh1/SCRATCH/OMPI/openmpi-1.7-latest-linux-ppc32-xlc-11.1/openmpi-1.7.4rc2r30361/orte/util/session_dir.c
at line 380
[biou2.rice.edu:30021] [[40214,0],0] ORTE_ERROR_LOG: Out of resource in
file
/home/phh1/SCRATCH/OMPI/openmpi-1.7-latest-linux-ppc32-xlc-11.1/openmpi-1.7.4rc2r30361/orte/mca/ess/hnp/ess_hnp_module.c
at line 599
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_session_dir failed
  --> Returned value Out of resource (-2) instead of ORTE_SUCCESS
--


An "-np 2" run fails in the same manner.
This is a production system and there is no problem with "whoami" or "id",
leaving me doubting the explanation provided by the error message.

[phh1@biou2 ~]$ whoami
phh1
[phh1@biou2 ~]$ id
uid=44154(phh1) gid=2016(hpc)
groups=2016(hpc),3803(hpcusers),3805(sshgw),3808(biou)

The "ompi_info --all" output is attached.
Please let me know what additional info is needed.

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


biou2_info.txt.bz2
Description: BZip2 compressed data


Re: [OMPI devel] 1.7.4 status update

2014-01-22 Thread Jeff Squyres (jsquyres)
Ok, here's my update: I fixed a bunch of issues in the Fortran support today; 
most are minor, but they took a while to verify (and some are slated for v1.7.5 
because they aren't critical).  I also added the ability to disable building 
the mpi_f08 module.

Here's what's on the trunk / v1.7, and will be in nightly tarballs soon (v1.7 
is building now; I have to re-start the trunk one when v1.7 finishes because I 
goofed and it failed the first time):

- Fix for MPI_STATUS_IGNORE issue (a side-effect of the "protected" update the 
other day)
- Add some missing interfaces for MPI_Neighbor subroutines
- Add missing interfaces for the profiled versions of the MPI_Dist_graph 
subroutines
- Add missing pre-defined function callbacks in the mpi_f08 module

Here's what's on the trunk and still awaiting a code review (probably tomorrow) 
before it can go to v1.7:

- --enable-mpi-fortran behavior now allows you specify up to what level of 
Fortran bindings you want built:

  --enable-mpi-fortran: tries to build them all (this is the default)
  --enable-mpi-fortran=mpifh: only builds mpif.h support
  --enable-mpi-fortran=usempi: builds mpif.h and use mpi support
  --enable-mpi-fortran=usempif08: builds mpif.h, use mpi, and use mpi_f08 
support
  --disable-mpi-fortran: does not build any Fortran support

So to disable the mpi_f08 bindings, you can configure with --enable-mpi-fortran=usempi.



On Jan 22, 2014, at 5:45 PM, Jeff Squyres (jsquyres)  wrote:

> Update: I've been working all day on Fortran issues (pulling on one 
> Paul-Fortran--sweater-thread revealed several other issues :-( ).  
> 
> I'll be sending an update soon...
> 
> 
> 
> On Jan 22, 2014, at 5:40 PM, Paul Hargrove  wrote:
> 
>> 
>> On Wed, Jan 22, 2014 at 1:33 PM, Ralph Castain  wrote:
>> My main concern with 1.7.4 at the moment stems from all the Fortran changes 
>> we pushed into that release - this occurred *after* 1.7.3, and so those 
>> problems represent a regression in the 1.7 series.
>> 
>> Unless I am missing something, the currently open Fortran issues are:
>> + XLF, which didn't work in 1.7.3 either (just verified this today)
>> + PathScale and Open64 which fail building in ompi/mpi/fortran/use-mpi-f08/
>> 
>> The XLF issue is not a regression.
>> The remaining PathScale/Open64 issue MAY be a compiler bug.
>> 
>> If Jeff follows through on his promise to implement --disable-mpi-fortran-08 
>> then use of that option is a work-around for the regression on PathScale and 
>> Open64.
>> 
>> -Paul
>> 
>> 
>> -- 
>> Paul H. Hargrove  phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] 1.7.4rc: MPI_F08_INTERFACES_CALLBACKS build failure with PathScale 4.0.12.1

2014-01-22 Thread Jeff Squyres (jsquyres)
Yep, this is 'zactly what I was contemplating today.  Chris from Pathscale is 
checking into what their symbol length limits are for me and promises to get 
back to me shortly.  So I'm holding off on the symbol length issues until then.

But just to confirm: you said that your pathscale compilers *do* compile 1.7.3 
-- including the mpi_f08 module -- with no problems?  That would be a little 
surprising, because those same >=32 character symbol names are in 1.7.3...


On Jan 22, 2014, at 9:20 PM, Paul Hargrove  wrote:

> 
> On Wed, Jan 22, 2014 at 8:50 AM, Jeff Squyres (jsquyres)  
> wrote:
> Can you do me a favor and cd into ompi/mpi/fortran/use-mpi-f08 and try to 
> manually "make type_create_indexed_block_f08.lo" and see if it also 
> complains?  That's a 32 character name -- let's see if the limit is >=32 or 
> >=33...
> 
> 
> Jeff,
> 
> Perhaps you came to the same conclusion already, but just in case:
> 
> I think the simplest approach to this problem is to include a configure check 
> with the longest name in the interface (without regard to WHAT that length 
> is).  This would be added to the ever-growing list of probes of BIND 
> behavior.  If the compiler can't handle the longest name required, then it is 
> disqualified from building use-mpi-f08.
> 
> Of course that solution adeptly avoids the "Internal" failure in PathScale 
> and Open64 compilers, but a configure option to disable the F08 support 
> addresses that and other misc cases of fortran compilers that just aren't 
> ready for F08 (or 03 for that matter).
> 
> -Paul
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] [EXTERNAL] 1.7.4rc: linux/ppc32/xlc-11.1 build failure

2014-01-22 Thread Paul Hargrove
On Mon, Jan 20, 2014 at 12:24 PM, Barrett, Brian W wrote:

> Ugh, this is a 32 bit RISC problem; we don't have a 64 bit atomic on a 32
> bit platform.  People are supposed to check to see if there's 64 bit atomic
> support, but that clearly hasn't been happening.  I've fixed this compile
> error, but there are still two places in the code base (bcol-basesmuma and
> coll-ml) that blindly use 64 bit atomics and I don't have time to fix
> those.  I'll file a CMR for the core fix and bugs about the components, but
> I'm not hopeful people will fix them before the 1.7.4 release.  Sigh.



I can confirm that a PPC32 build of 1.7.4rc2r30361 w/ xlc (w/o fortran
support) now works for me.

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.7.4rc: MPI_F08_INTERFACES_CALLBACKS build failure with PathScale 4.0.12.1

2014-01-22 Thread Paul Hargrove
On Wed, Jan 22, 2014 at 8:50 AM, Jeff Squyres (jsquyres)  wrote:

> Can you do me a favor and cd into ompi/mpi/fortran/use-mpi-f08 and try to
> manually "make type_create_indexed_block_f08.lo" and see if it also
> complains?  That's a 32 character name -- let's see if the limit is >=32 or
> >=33...



Jeff,

Perhaps you came to the same conclusion already, but just in case:

I think the simplest approach to this problem is to include a configure
check with the longest name in the interface (without regard to WHAT that
length is).  This would be added to the ever-growing list of probes of BIND
behavior.  If the compiler can't handle the longest name required, then it
is disqualified from building use-mpi-f08.
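A sketch of what such a probe could look like (hypothetical configure.ac fragment; the result-variable name and the surrounding macro plumbing are assumptions, and the 32-character name is the one discussed in this thread):

```m4
AC_LANG_PUSH([Fortran])
AC_MSG_CHECKING([if Fortran compiler accepts the longest BIND(C) name we need])
AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
subroutine ompi_longest_name_probe() &
    bind(c, name="type_create_indexed_block_f08")
end subroutine
]])],
  [AC_MSG_RESULT([yes])],
  [AC_MSG_RESULT([no])
   ompi_fortran_build_f08=no])
AC_LANG_POP([Fortran])
```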

Of course that solution adeptly avoids the "Internal" failure in PathScale
and Open64 compilers, but a configure option to disable the F08 support
addresses that and other misc cases of fortran compilers that just aren't
ready for F08 (or 03 for that matter).

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[hwloc-devel] Create success (hwloc git dev-40-g183a7da)

2014-01-22 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success.

Snapshot:   hwloc dev-40-g183a7da
Start time: Wed Jan 22 21:01:01 EST 2014
End time:   Wed Jan 22 21:03:37 EST 2014

Your friendly daemon,
Cyrador


Re: [OMPI devel] 1.7.4rc: mpirun hangs on ia64

2014-01-22 Thread Paul Hargrove
On Wed, Jan 22, 2014 at 2:22 PM, Paul Hargrove  wrote:

> My ia64 asm is a bit rusty, but I'll give a quick look if/when I can.


I had a look (in v1.7) and this is what I see:

$ cat -n IA64.asm | grep -A14 opal_atomic_cmpset_acq_64:
70  opal_atomic_cmpset_acq_64:
71  .prologue
72  .body
73  mov ar.ccv=r33;;
74  cmpxchg8.acq r32=[r32],r34,ar.ccv
75  ;;
76  sxt4 r32 = r32
77  ;;
78  cmp.eq p6, p7 = r33, r32
79  ;;
80  (p6) addl r8 = 1, r0
81  (p7) mov r8 = r0
82  br.ret.sptk.many b0
83  ;;
84  .endp opal_atomic_cmpset_acq_64#

The (approximate and non-atomic) C equivalent is:

// r32 = address
// r33 = oldvalue
// r34 = newvalue
int opal_atomic_cmpset_acq_64(int64_t r32, int64_t r33, int64_t r34) {
   int64_t ccv = r33; // L73
   int64_t old = *(int64_t *)r32; // L74: cmpxchg reads the old value,
   if (old == ccv) *(int64_t *)r32 = r34; // stores newvalue on a match,
   r32 = old; // and leaves the value read in r32

   r32 = (int64_t)(int32_t)r32; // L76 = sign-extend 32->64

   bool p6, p7;
   p7 = !(p6 = (r33 == r32)); // L78

   const int r0 = 0;
   int r8;
   if (p6) r8 = 1 + r0; // L80
   if (p7) r8 = r0; // L81
   return r8; // L82
}

Which is fine except that line 76 is totally wrong!!
The "sxt4" instruction is "sign-extend from 4 bytes to 8 bytes".
Thus the upper 32-bits of the value read from memory are lost!
Unless the upper 33 bits of r33 (oldvalue) are all 0s or all 1s, the
comparison on line 78 MUST fail.
This explains the hang, as the lifo push will loop indefinitely waiting for
the success of this cmpset.

Note the same erroneous instruction is also present in the _rel variant (at
line 94).
The trunk has the same issue.
This code has not changed at all since IA64.asm was added way back in r4471.

I won't have access to the IA64 platform again until tomorrow AM.
So, testing my hypothesis will need to wait.

BTW:
IFF I am right about the source of this problem, then it would be
beneficial to have (and I may contribute) a stronger test (for "make
check") that would detect this sort of bug in the atomics (specifically
look for both false-positive and false-negative return value from 64-bit
cmpset operations with values satisfying a range of "corner cases").  I
think I have single-bit and double-bit "marching tests" for cmpset in my
own arsenal of tests for GASNet's atomics.  If I don't have time to
contribute a complete test, I can at least contribute that logic for
somebody else to port to the OPAL atomics.

-Paul

P.S.:
The cmpxchgN for N in 1,2,4 are documented as ZERO-extending their loads to
64-bits.
So, there is a slim chance that the sxt4 actually was intended for the
32-bit cmpset code.
However, since the comparison used there is a "cmp4.eq" the "sxt4" would
still not be needed.


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.7.4 status update

2014-01-22 Thread Paul Hargrove
On Wed, Jan 22, 2014 at 2:40 PM, Paul Hargrove  wrote:
>
> + PathScale and Open64 which fail building in ompi/mpi/fortran/use-mpi-f08/
>


I implied, but forgot to state the following explicitly:
Both PathScale and Open64 can build the Fortran support present in 1.7.3
(verified today).

So, as Ralph stated, the current situation IS a regression in the 1.7
series.

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.7.4 status update

2014-01-22 Thread Jeff Squyres (jsquyres)
Update: I've been working all day on Fortran issues (pulling on one 
Paul-Fortran-sweater thread revealed several other issues :-( ).  

I'll be sending an update soon...



On Jan 22, 2014, at 5:40 PM, Paul Hargrove  wrote:

> 
> On Wed, Jan 22, 2014 at 1:33 PM, Ralph Castain  wrote:
> My main concern with 1.7.4 at the moment stems from all the Fortran changes 
> we pushed into that release - this occurred *after* 1.7.3, and so those 
> problems represent a regression in the 1.7 series.
> 
> Unless I am missing something, the currently open Fortran issues are:
> + XLF, which didn't work in 1.7.3 either (just verified this today)
> + PathScale and Open64 which fail building in ompi/mpi/fortran/use-mpi-f08/
> 
> The XLF issue is not a regression.
> The remaining PathScale/Open64 issue MAY be a compiler bug.
> 
> If Jeff follows through on his promise to implement --disable-mpi-fortran-08 
> then use of that option is a work-around for the regression on PathScale and 
> Open64.
> 
> -Paul
> 
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] 1.7.4 status update

2014-01-22 Thread Paul Hargrove
On Wed, Jan 22, 2014 at 1:33 PM, Ralph Castain  wrote:

> My main concern with 1.7.4 at the moment stems from all the Fortran
> changes we pushed into that release - this occurred *after* 1.7.3, and so
> those problems represent a regression in the 1.7 series.


Unless I am missing something, the currently open Fortran issues are:
+ XLF, which didn't work in 1.7.3 either (just verified this today)
+ PathScale and Open64 which fail building in ompi/mpi/fortran/use-mpi-f08/

The XLF issue is not a regression.
The remaining PathScale/Open64 issue MAY be a compiler bug.

If Jeff follows through on his promise to
implement --disable-mpi-fortran-08 then use of that option is a work-around
for the regression on PathScale and Open64.

-Paul


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.7.4rc: mpirun hangs on ia64

2014-01-22 Thread Paul Hargrove
On Wed, Jan 22, 2014 at 1:59 PM, Ralph Castain  wrote:

> Huh - afraid I can't see anything wrong so far. All looks normal and then
> it just hangs. Any chance you can "gdb" to the proc and see where it is
> stuck?
>

Ralph,

The gstack output below looks like one thread is spinning on an atomic of
some sort.
Running gstack repeatedly 100 times yields the following "histogram" of the
top frame of Thread 1:

 47 opal_atomic_lifo_push > opal_atomic_cmpset_ptr >
opal_atomic_cmpset_acq_64
 19 opal_atomic_lifo_push > opal_atomic_cmpset_ptr
  6 opal_atomic_lifo_push > opal_atomic_wmb
 28 opal_atomic_lifo_push

A spin in a lifo push is not consistent (in my experience) with the
possibility that the other thread had failed to post some event. So, the
problem is probably in the atomics or lifo code, though "make check" passes
just fine.


My ia64 asm is a bit rusty, but I'll give a quick look if/when I can.
I've implemented a lock-free LIFO for ia64 in the past and so have some
idea what I am looking at/for.
However, with my access window closing under 10 minutes from now, anything
more than source inspection will need to wait until tomorrow.

-Paul

$ gstack 21094
Thread 2 (Thread 0x216bf200 (LWP 21095)):
#0  0xa0010721 in __kernel_syscall_via_break ()
#1  0x205a00d0 in poll () from /lib/libc.so.6.1
#2  0x20a0c3e0 in poll_dispatch () from
/eng/home/PHHargrove/OMPI/openmpi-1.7-latest-linux-ia64/INST/lib/libopen-pal.so.6
#3  0x209e5e90 in opal_libevent2021_event_base_loop () from
/eng/home/PHHargrove/OMPI/openmpi-1.7-latest-linux-ia64/INST/lib/libopen-pal.so.6
#4  0x206bd8a0 in orte_progress_thread_engine () from
/eng/home/PHHargrove/OMPI/openmpi-1.7-latest-linux-ia64/INST/lib/libopen-rte.so.7
#5  0x203dc310 in start_thread () from /lib/libpthread.so.0
#6  0x205b49a0 in __clone2 () from /lib/libc.so.6.1
#7  0x in ?? ()
Thread 1 (Thread 0x200566a0 (LWP 21094)):
#0  0x200973f2 in opal_atomic_cmpset_acq_64 () from
/eng/home/PHHargrove/OMPI/openmpi-1.7-latest-linux-ia64/INST/lib/libmpi.so.1
#1  0x20097350 in opal_atomic_cmpset_ptr () from
/eng/home/PHHargrove/OMPI/openmpi-1.7-latest-linux-ia64/INST/lib/libmpi.so.1
#2  0x200995d0 in opal_atomic_lifo_push () from
/eng/home/PHHargrove/OMPI/openmpi-1.7-latest-linux-ia64/INST/lib/libmpi.so.1
#3  0x20099030 in ompi_free_list_grow () from
/eng/home/PHHargrove/OMPI/openmpi-1.7-latest-linux-ia64/INST/lib/libmpi.so.1
#4  0x2009a2a0 in ompi_rb_tree_init () from
/eng/home/PHHargrove/OMPI/openmpi-1.7-latest-linux-ia64/INST/lib/libmpi.so.1
#5  0x2029ec10 in mca_mpool_base_tree_init () from
/eng/home/PHHargrove/OMPI/openmpi-1.7-latest-linux-ia64/INST/lib/libmpi.so.1
#6  0x20299380 in mca_mpool_base_open () from
/eng/home/PHHargrove/OMPI/openmpi-1.7-latest-linux-ia64/INST/lib/libmpi.so.1
#7  0x2098fd80 in mca_base_framework_open () from
/eng/home/PHHargrove/OMPI/openmpi-1.7-latest-linux-ia64/INST/lib/libopen-pal.so.6
#8  0x2010d6b0 in ompi_mpi_init () from
/eng/home/PHHargrove/OMPI/openmpi-1.7-latest-linux-ia64/INST/lib/libmpi.so.1
#9  0x201b3460 in PMPI_Init () from
/eng/home/PHHargrove/OMPI/openmpi-1.7-latest-linux-ia64/INST/lib/libmpi.so.1
#10 0x4c00 in main ()





>
> On Jan 22, 2014, at 11:39 AM, Paul Hargrove  wrote:
>
> Ralph,
>
> Attached is the requested output with the addition of "-mca
> grpcomm_base_verbose 5".
> I have also attached a 2nd output with the further addition of "-mca
> oob_tcp_if_include lo" to ensure that this is not related to the firewall
> issues I've seen on other hosts.
>
> I have use of this host until 14:30 PST today, and then lose it for 12
> hours.
> So, tests of the next tarball won't start until after 2:30am - which
> probably means 10am.
>
> -Paul
>
>
> On Wed, Jan 22, 2014 at 7:34 AM, Ralph Castain  wrote:
>
>> Weird - everything looks completely normal. Can you add -mca
>> grpcomm_base_verbose 5 to your cmd line?
>>
>>
>> On Jan 22, 2014, at 1:38 AM, Paul Hargrove  wrote:
>>
>> Following-up as promised:
>>
>> Output from an --enable-debug build is attached.
>>
>> -Paul
>>
>>
>> On Tue, Jan 21, 2014 at 11:25 PM, Paul Hargrove wrote:
>>
>>> Yes, this is familiar. See:
>>> http://www.open-mpi.org/community/lists/devel/2013/11/13182.php
>>>
>>> If I understand correctly, the thread ended with:
>>>
>>> On 03 December 2013, Sylvestre Ledru wrote:
>>>
 FYI, Debian has stopped supporting ia64 for its next release
 So, I stopped working on that issue.
>>>
>>>
>>> Well, I have access to a Linux/IA64 system and my trials with
>>> openmpi-1.7.4rc2r30361 appear to hang, much as Sylvestre had reported w/
>>> 1.6.5.
>>>
>>> I am attaching output from a build w/o --enable-debug for the command:
>>> $ mpirun -mca plm_base_verbose 5 -mca ras_base_verbose 5 -mca

Re: [OMPI devel] 1.7.4rc: mpirun hangs on ia64

2014-01-22 Thread Ralph Castain
Huh - afraid I can't see anything wrong so far. All looks normal and then it 
just hangs. Any chance you can "gdb" to the proc and see where it is stuck?


On Jan 22, 2014, at 11:39 AM, Paul Hargrove  wrote:

> Ralph,
> 
> Attached is the requested output with the addition of "-mca 
> grpcomm_base_verbose 5".
> I have also attached a 2nd output with the further addition of "-mca 
> oob_tcp_if_include lo" to ensure that this is not related to the firewall 
> issues I've seen on other hosts.
> 
> I have use of this host until 14:30 PST today, and then lose it for 12 hours.
> So, tests of the next tarball won't start until after 2:30am - which probably 
> means 10am.
> 
> -Paul
> 
> 
> On Wed, Jan 22, 2014 at 7:34 AM, Ralph Castain  wrote:
> Weird - everything looks completely normal. Can you add -mca 
> grpcomm_base_verbose 5 to your cmd line?
> 
> 
> On Jan 22, 2014, at 1:38 AM, Paul Hargrove  wrote:
> 
>> Following-up as promised:
>> 
>> Output from an --enable-debug build is attached.
>> 
>> -Paul
>> 
>> 
>> On Tue, Jan 21, 2014 at 11:25 PM, Paul Hargrove  wrote:
>> Yes, this is familiar. See:
>> http://www.open-mpi.org/community/lists/devel/2013/11/13182.php
>> 
>> If I understand correctly, the thread ended with:
>> 
>> On 03 December 2013, Sylvestre Ledru wrote: 
>> FYI, Debian has stopped supporting ia64 for its next release
>> So, I stopped working on that issue.
>> 
>> 
>> Well, I have access to a Linux/IA64 system and my trials with 
>> openmpi-1.7.4rc2r30361 appear to hang, much as Sylvestre had reported w/ 
>> 1.6.5.
>> 
>> I am attaching output from a build w/o --enable-debug for the command:
>> $ mpirun -mca plm_base_verbose 5 -mca ras_base_verbose 5 -mca 
>> rmaps_base_verbose 5 -mca ess_base_verbose 5 -np 1 ./ring_c
>> 
>> I will follow-up with an --enable-debug build when possible.
>> 
>> -Paul
>> 
>> -- 
>> Paul H. Hargrove  phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>> 
>> 
>> 
>> -- 
>> Paul H. Hargrove  phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel



Re: [OMPI devel] 1.7.4 status update

2014-01-22 Thread Ralph Castain
I appreciate those points, Paul. My main concern with 1.7.4 at the moment stems 
from all the Fortran changes we pushed into that release - this occurred 
*after* 1.7.3, and so those problems represent a regression in the 1.7 series.

We obviously appreciate all your testing since you have far more systems than 
we do!


On Jan 22, 2014, at 1:18 PM, Paul Hargrove  wrote:

> My $0.02USD:
> 
> I agree that "just keep the bar high" for 1.7.4 is the right approach.
> In other words: just because I found all these issues doesn't mean they 
> should delay 1.7.4.
> Considering most (possibly all) were in 1.7.3 and nobody noticed, what harm 
> in leaving the issue unresolved in 1.7.4?
> If my help is needed to determine if a given issue was in 1.7.3 then just ask.
> 
> For those who don't know me, or have forgotten:
> 
> I am not an MPI applications programmer or user, nor do I admin systems for 
> people who are.
> If every single issue I reported were to be ignored and never fixed, it would 
> not harm me in any way.
> I will push back if I ever think the core developers are making poor choices, 
> but have no reason to "fight" for any particular issue to be fixed.
> 
> I am a middleware developer who happens to have access to an exceptionally 
> wide range of systems and compilers.
> I use those resources to work hard to ensure portability of my own s/w.
> Having known Jeff and Brian since the LAM/MPI days I occasionally apply my 
> resources and knowledge to testing of Open MPI release candidates.
> 
> -Paul
> 
> 
> On Wed, Jan 22, 2014 at 12:57 PM, Rolf vandeVaart  
> wrote:
> Hi Ralph:
> In my opinion, we still try to get to a stable 1.7.4.  I think we can just 
> keep the bar high (as you said in the meeting) about what types of fixes need 
> to get into 1.7.4.  I have been telling folks 1.7.4 would be ready "really 
> soon" so the idea of folding in 1.7.5 CMRs and delaying it is less desirable 
> to me.
> 
> Can you remind me again about why the 1.8.0 by mid-March is a requirement?
> 
> Thanks,
> Rolf
> 
> >-Original Message-
> >From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph
> >Castain
> >Sent: Tuesday, January 21, 2014 6:41 PM
> >To: Open MPI Developers
> >Subject: [OMPI devel] 1.7.4 status update
> >
> >Hi folks
> >
> >I think it is safe to say that we are not going to get a release candidate 
> >out
> >tonight - more Fortran problems have surfaced, along with the need to
> >complete the ROMIO review. I have therefore concluded we cannot release
> >1.7.4 this week. This leaves us with a couple of options:
> >
> >1. continue down this path, hopefully releasing 1.7.4 sometime next week,
> >followed by 1.7.5 in the latter half of Feb. The risk here is that any 
> >further
> >slippage in 1.7.4/5 means that we will not release it as we must roll 1.8.0 
> >by
> >mid-March. I'm not too concerned about most of those cmr's as they could be
> >considered minor bug fixes and pushed to the 1.8 series, but it leaves
> >oshmem potentially pushed into 1.9.0.
> >
> >2. "promote" all the 1.7.5 cmr's into 1.7.4 and just do a single release 
> >before
> >ending the series. This eases the immediate schedule crunch, but means we
> >will have to deal with all the bugs that surface when we destabilize the 1.7
> >branch again.
> >
> >
> >I'm open to suggestions. Please be prepared to discuss at next Tues telecon.
> >Ralph
> >
> >___
> >devel mailing list
> >de...@open-mpi.org
> >http://www.open-mpi.org/mailman/listinfo.cgi/devel
> ---
> This email message is for the sole use of the intended recipient(s) and may 
> contain
> confidential information.  Any unauthorized review, use, disclosure or 
> distribution
> is prohibited.  If you are not the intended recipient, please contact the 
> sender by
> reply email and destroy all copies of the original message.
> ---
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel



Re: [OMPI devel] 1.7.4 status update

2014-01-22 Thread Paul Hargrove
My $0.02USD:

I agree that "just keep the bar high" for 1.7.4 is the right approach.
In other words: just because I found all these issues doesn't mean they
should delay 1.7.4.
Considering most (possibly all) were in 1.7.3 and nobody noticed, what harm
in leaving the issue unresolved in 1.7.4?
If my help is needed to determine if a given issue was in 1.7.3 then just
ask.

For those who don't know me, or have forgotten:

I am not an MPI applications programmer or user, nor do I admin systems for
people who are.
If every single issue I reported were to be ignored and never fixed, it
would not harm me in any way.
I will push back if I ever think the core developers are making poor
choices, but have no reason to "fight" for any particular issue to be fixed.

I am a middleware developer who happens to have access to an exceptionally
wide range of systems and compilers.
I use those resources to work hard to ensure portability of my own s/w.
Having known Jeff and Brian since the LAM/MPI days I occasionally apply my
resources and knowledge to testing of Open MPI release candidates.

-Paul


On Wed, Jan 22, 2014 at 12:57 PM, Rolf vandeVaart wrote:

> Hi Ralph:
> In my opinion, we still try to get to a stable 1.7.4.  I think we can just
> keep the bar high (as you said in the meeting) about what types of fixes
> need to get into 1.7.4.  I have been telling folks 1.7.4 would be ready
> "really soon" so the idea of folding in 1.7.5 CMRs and delaying it is less
> desirable to me.
>
> Can you remind me again about why the 1.8.0 by mid-March is a requirement?
>
> Thanks,
> Rolf
>
> >-Original Message-
> >From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph
> >Castain
> >Sent: Tuesday, January 21, 2014 6:41 PM
> >To: Open MPI Developers
> >Subject: [OMPI devel] 1.7.4 status update
> >
> >Hi folks
> >
> >I think it is safe to say that we are not going to get a release
> candidate out
> >tonight - more Fortran problems have surfaced, along with the need to
> >complete the ROMIO review. I have therefore concluded we cannot release
> >1.7.4 this week. This leaves us with a couple of options:
> >
> >1. continue down this path, hopefully releasing 1.7.4 sometime next week,
> >followed by 1.7.5 in the latter half of Feb. The risk here is that any
> further
> >slippage in 1.7.4/5 means that we will not release it as we must roll
> 1.8.0 by
> >mid-March. I'm not too concerned about most of those cmr's as they could
> be
> >considered minor bug fixes and pushed to the 1.8 series, but it leaves
> >oshmem potentially pushed into 1.9.0.
> >
> >2. "promote" all the 1.7.5 cmr's into 1.7.4 and just do a single release
> before
> >ending the series. This eases the immediate schedule crunch, but means we
> >will have to deal with all the bugs that surface when we destabilize the
> 1.7
> >branch again.
> >
> >
> >I'm open to suggestions. Please be prepared to discuss at next Tues
> telecon.
> >Ralph
> >
> >___
> >devel mailing list
> >de...@open-mpi.org
> >http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
> ---
> This email message is for the sole use of the intended recipient(s) and
> may contain
> confidential information.  Any unauthorized review, use, disclosure or
> distribution
> is prohibited.  If you are not the intended recipient, please contact the
> sender by
> reply email and destroy all copies of the original message.
>
> ---
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.7.4 status update

2014-01-22 Thread Ralph Castain
On Jan 22, 2014, at 12:57 PM, Rolf vandeVaart  wrote:

> Hi Ralph:
> In my opinion, we still try to get to a stable 1.7.4.  I think we can just 
> keep the bar high (as you said in the meeting) about what types of fixes need 
> to get into 1.7.4.  I have been telling folks 1.7.4 would be ready "really 
> soon" so the idea of folding in 1.7.5 CMRs and delaying it is less desirable 
> to me.

I generally agree, providing 1.7.4 can finally make it out someday :-(

> 
> Can you remind me again about why the 1.8.0 by mid-March is a requirement?

We committed to Fedora and other OS packagers that we would provide a stable 
1.8.0 by end of Q1 so they can include it in their next major release series

> 
> Thanks,
> Rolf
> 
>> -Original Message-
>> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph
>> Castain
>> Sent: Tuesday, January 21, 2014 6:41 PM
>> To: Open MPI Developers
>> Subject: [OMPI devel] 1.7.4 status update
>> 
>> Hi folks
>> 
>> I think it is safe to say that we are not going to get a release candidate 
>> out
>> tonight - more Fortran problems have surfaced, along with the need to
>> complete the ROMIO review. I have therefore concluded we cannot release
>> 1.7.4 this week. This leaves us with a couple of options:
>> 
>> 1. continue down this path, hopefully releasing 1.7.4 sometime next week,
>> followed by 1.7.5 in the latter half of Feb. The risk here is that any 
>> further
>> slippage in 1.7.4/5 means that we will not release it as we must roll 1.8.0 
>> by
>> mid-March. I'm not too concerned about most of those cmr's as they could be
>> considered minor bug fixes and pushed to the 1.8 series, but it leaves
>> oshmem potentially pushed into 1.9.0.
>> 
>> 2. "promote" all the 1.7.5 cmr's into 1.7.4 and just do a single release 
>> before
>> ending the series. This eases the immediate schedule crunch, but means we
>> will have to deal with all the bugs that surface when we destabilize the 1.7
>> branch again.
>> 
>> 
>> I'm open to suggestions. Please be prepared to discuss at next Tues telecon.
>> Ralph
>> 
>> ___
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> ---
> This email message is for the sole use of the intended recipient(s) and may 
> contain
> confidential information.  Any unauthorized review, use, disclosure or 
> distribution
> is prohibited.  If you are not the intended recipient, please contact the 
> sender by
> reply email and destroy all copies of the original message.
> ---
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel



Re: [OMPI devel] 1.7.4rc: MPI_F08_INTERFACES_CALLBACKS build failure with PathScale 4.0.12.1

2014-01-22 Thread Paul Hargrove
On Wed, Jan 22, 2014 at 8:50 AM, Jeff Squyres (jsquyres)  wrote:

> Wow.  Pulling on this thread turned up a whole pile of bugs :-\, including
> several other names that are >=32 characters:
>
> Found long name: ompi_type_create_indexed_block_f (32)
> Found long name: ompi_type_create_hindexed_block_f (33)
> Found long name: pompi_type_create_indexed_block_f (33)
> Found long name: pompi_type_create_hindexed_block_f (34)
> Found long name: pompi_file_get_position_shared_f (32)
> Found long name: pompi_file_write_ordered_begin_f (32)
>


Since Larry Baker has cast some doubt on the standards conformance of a
Fortran compiler that applies a 32- (or 31-?) character limit on the
identifiers used for subroutines (and/or in BIND), I would not suggest
radical changes to OMPI to shorten names - at least not for 1.7 (might
there be a resulting ABI break?).



>
> Can you do me a favor and cd into ompi/mpi/fortran/use-mpi-f08 and try to
> manually "make type_create_indexed_block_f08.lo" and see if it also
> complains?  That's a 32 character name -- let's see if the limit is >=32 or
> >=33...
>


Well that requested make command fails with the *original* complaint
because the 33-char "ompi_type_create_hindexed_block_f" is in the HEADER
file.

So, I manually "#if 0"-ed out ompi_type_create_hindexed_block_f from the header.
 That resolved ONE issue, but the Internal error remains:

$ make type_create_indexed_block_f08.lo
  PPFC mpi-f08.lo
pathf95-1044 pathf95: INTERNAL OMPI_COMM_CREATE_KEYVAL_F, File =
/global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.7-latest-linux-x86_64-pathcc-4.0/openmpi-1.7.4rc2r30361/ompi/mpi/fortran/use-mpi-f08/mpi-f-interfaces-bind.h,
Line = 1244, Column = 38
  Internal : Unexpected ATP_PGM_UNIT in check_interoperable_pgm_unit()
make: *** [mpi-f08.lo] Error 1


After the addition of 2-lines ("#if 0" and "#endif") line 1244 is now:
  1244  subroutine
ompi_comm_create_keyval_f(comm_copy_attr_fn,comm_delete_attr_fn, &
  1245
comm_keyval,extra_state,ierror) &
  1246 BIND(C, name="ompi_comm_create_keyval_f")
  1247 use :: mpi_f08_types, only : MPI_ADDRESS_KIND
  1248 use :: mpi_f08_interfaces_callbacks, only :
MPI_Comm_copy_attr_function


Since the PathScale and Open64 Fortran compilers print the same errors, I am
guessing that this comes from code both inherited from their common ancestor
(SGI's Pro64 was open-sourced to create the original Open64).  So, in case
anybody wants to reverse-engineer the problem, below is the source from
Open64 that issues the error (though I can't say I gained any insight from
looking at it).

-Paul

/*
 * Print error messages for constraint violations related to the BIND
attribute
 *
 * attr_idx AT_Tbl_Idx index for program unit
 */
static void
check_interoperable_pgm_unit(int attr_idx) {
  switch (ATP_PGM_UNIT(attr_idx)) {
case Function:
  check_interoperable_data(ATP_RSLT_IDX(attr_idx));
  check_interoperable_procedure(attr_idx);
  break;

case Subroutine:
  check_interoperable_procedure(attr_idx);
  break;

case Program:
case Blockdata:
case Module:
case Pgm_Unknown:
default:
  PRINTMSG(AT_DEF_LINE(attr_idx), 1044, Internal,
AT_DEF_COLUMN(attr_idx),
"Unexpected ATP_PGM_UNIT in check_interoperable_pgm_unit()");
  break;
  }
}




-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.7.4 status update

2014-01-22 Thread Rolf vandeVaart
Hi Ralph:
In my opinion, we still try to get to a stable 1.7.4.  I think we can just keep 
the bar high (as you said in the meeting) about what types of fixes need to get 
into 1.7.4.  I have been telling folks 1.7.4 would be ready "really soon" so 
the idea of folding in 1.7.5 CMRs and delaying it is less desirable to me.

Can you remind me again about why the 1.8.0 by mid-March is a requirement?

Thanks,
Rolf

>-Original Message-
>From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Ralph
>Castain
>Sent: Tuesday, January 21, 2014 6:41 PM
>To: Open MPI Developers
>Subject: [OMPI devel] 1.7.4 status update
>
>Hi folks
>
>I think it is safe to say that we are not going to get a release candidate out
>tonight - more Fortran problems have surfaced, along with the need to
>complete the ROMIO review. I have therefore concluded we cannot release
>1.7.4 this week. This leaves us with a couple of options:
>
>1. continue down this path, hopefully releasing 1.7.4 sometime next week,
>followed by 1.7.5 in the latter half of Feb. The risk here is that any further
>slippage in 1.7.4/5 means that we will not release it as we must roll 1.8.0 by
>mid-March. I'm not too concerned about most of those cmr's as they could be
>considered minor bug fixes and pushed to the 1.8 series, but it leaves
>oshmem potentially pushed into 1.9.0.
>
>2. "promote" all the 1.7.5 cmr's into 1.7.4 and just do a single release before
>ending the series. This eases the immediate schedule crunch, but means we
>will have to deal with all the bugs that surface when we destabilize the 1.7
>branch again.
>
>
>I'm open to suggestions. Please be prepared to discuss at next Tues telecon.
>Ralph
>
>___
>devel mailing list
>de...@open-mpi.org
>http://www.open-mpi.org/mailman/listinfo.cgi/devel
---
This email message is for the sole use of the intended recipient(s) and may 
contain
confidential information.  Any unauthorized review, use, disclosure or 
distribution
is prohibited.  If you are not the intended recipient, please contact the 
sender by
reply email and destroy all copies of the original message.
---


Re: [OMPI devel] 1.7.4rc: MIPS64 atomics tests fail

2014-01-22 Thread Paul Hargrove
Ralph,

It took all night to run, but I CAN confirm "make check" passed on my
64-bit MIPS platform with last night's v1.7 tarball (1.7.4rc2r30361).

This was with "-mabi=64" in {C,CXX,FC}FLAGS and the corresponding wrapper
flags. I would guess from Brian's response to my failures reported for
PPC32 and MIPS32 that the other two ABIs won't work.  However, that won't
stop me from trying (for completeness).

-Paul


On Tue, Jan 21, 2014 at 8:48 AM, Ralph Castain  wrote:

> I dug back and found that your trunk patch still applies, so I committed
> it and moved it to 1.7.4. So if you wouldn't mind verifying it once the
> nightly tarball is available, I'd appreciate it.
>
> Thanks!
> Ralph
>
> On Jan 20, 2014, at 9:38 PM, Paul Hargrove  wrote:
>
> Building a recent (1.7.4rc2r30303) v1.7 tarball on a (QEMU-emulated)
> MIPS64 system I find that the opal atomics test fail.
>
> Applying the "for trunk" patch I attached to ticket #3039 roughly 1 year
> ago resolves the problems for me.  I suppose it would be great if at least
> one person with real MIPS h/w could verify.
>
> As far as I am concerned there is no "pressure" to push this into 1.7.4 if
> time is tight.  I am just (re)testing this platform and reporting the
> results for completeness.
>
> -Paul
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>  ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


Re: [OMPI devel] 1.7.4rc: mpirun hangs on ia64

2014-01-22 Thread Paul Hargrove
Ralph,

Attached is the requested output with the addition of "-mca
grpcomm_base_verbose 5".
I have also attached a 2nd output with the further addition of "-mca
oob_tcp_if_include lo" to ensure that this is not related to the firewall
issues I've seen on other hosts.

I have use of this host until 14:30 PST today, and then lose it for 12
hours.
So, tests of the next tarball won't start until after 2:30am - which
probably means 10am.

-Paul


On Wed, Jan 22, 2014 at 7:34 AM, Ralph Castain  wrote:

> Weird - everything looks completely normal. Can you add -mca
> grpcomm_base_verbose 5 to your cmd line?
>
>
> On Jan 22, 2014, at 1:38 AM, Paul Hargrove  wrote:
>
> Following-up as promised:
>
> Output from an --enable-debug build is attached.
>
> -Paul
>
>
> On Tue, Jan 21, 2014 at 11:25 PM, Paul Hargrove wrote:
>
>> Yes, this is familiar. See:
>> http://www.open-mpi.org/community/lists/devel/2013/11/13182.php
>>
>> If I understand correctly, the thread ended with:
>>
>> On 03 December 2013, Sylvestre Ledru wrote:
>>
>>> FYI, Debian has stopped supporting ia64 for its next release
>>> So, I stopped working on that issue.
>>
>>
>> Well, I have access to a Linux/IA64 system and my trials with
>> openmpi-1.7.4rc2r30361 appear to hang, much as Sylvestre had reported w/
>> 1.6.5.
>>
>> I am attaching output from a build w/o --enable-debug for the command:
>> $ mpirun -mca plm_base_verbose 5 -mca ras_base_verbose 5 -mca
>> rmaps_base_verbose 5 -mca ess_base_verbose 5 -np 1 ./ring_c
>>
>> I will follow-up with an --enable-debug build when possible.
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove  phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>>
>
>
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>  ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
>
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


log.txt.bz2
Description: BZip2 compressed data


log-incl-lo.txt.bz2
Description: BZip2 compressed data


Re: [OMPI devel] 1.7.4rc: MPI_F08_INTERFACES_CALLBACKS build failure with PathScale 4.0.12.1

2014-01-22 Thread Larry Baker
My copy of the Fortran 2003 Standard (Adams, et al., The Fortran 2003 Handbook) 
says Fortran names (incl. procedures, section 3.2.2) are permitted to be up to 
63 characters.  This is not phrased as a requirement, though.  It could be that 
a conforming processor restricts this to fewer characters, e.g., if the 
linker/loader does not support that many characters in an external symbol.

Larry Baker
US Geological Survey
650-329-5608
ba...@usgs.gov



On 22 Jan 2014, at 8:50 AM, Jeff Squyres (jsquyres) wrote:

> On Jan 21, 2014, at 11:49 PM, Paul Hargrove  wrote:
> 
>> Looks like we may be getting closer, but are not quite there:
>> 
>>  PPFC mpi-f08.lo
>>   BIND(C, name="ompi_type_create_hindexed_block_f")
>>^
>> pathf95-1690 pathf95: ERROR OMPI_TYPE_CREATE_HINDEXED_BLOCK_F, File = 
>> /global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.7-latest-linux-x86_64-pathcc-4.0/openmpi-1.7.4rc2r30361/ompi/mpi/fortran/use-mpi-f08/mpi-f-interfaces-bind.h,
>>  Line = 605, Column = 17
>>  NAME= specifier in BIND clause requires scalar character constant
> 
> Wow.  Pulling on this thread turned up a whole pile of bugs :-\, including 
> several other names that are >=32 characters:
> 
> Found long name: ompi_type_create_indexed_block_f (32)
> Found long name: ompi_type_create_hindexed_block_f (33)
> Found long name: pompi_type_create_indexed_block_f (33)
> Found long name: pompi_type_create_hindexed_block_f (34)
> Found long name: pompi_file_get_position_shared_f (32)
> Found long name: pompi_file_write_ordered_begin_f (32)
> 
> Can you do me a favor and cd into ompi/mpi/fortran/use-mpi-f08 and try to 
> manually "make type_create_indexed_block_f08.lo" and see if it also 
> complains?  That's a 32 character name -- let's see if the limit is >=32 or 
> >=33...
> 
>> pathf95-1044 pathf95: INTERNAL OMPI_COMM_CREATE_KEYVAL_F, File = 
>> /global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.7-latest-linux-x86_64-pathcc-4.0/openmpi-1.7.4rc2r30361/ompi/mpi/fortran/use-mpi-f08/mpi-f-interfaces-bind.h,
>>  Line = 1242, Column = 38
>>  Internal : Unexpected ATP_PGM_UNIT in check_interoperable_pgm_unit()
>> make[2]: *** [mpi-f08.lo] Error 1
>> make[2]: Leaving directory 
>> `/global/scratch2/sd/hargrove/OMPI/openmpi-1.7-latest-linux-x86_64-pathcc-4.0/BLD/ompi/mpi/fortran/use-mpi-f08'
>> 
>> The first error appears likely to be due to the 33-character name for the C 
>> binding.
>> Not sure if that is a limitation allowed by the fortran spec, or an 
>> arbitrary limitation in this compiler.
>> 
>> The "Internal" may be a show-stopper (not OMPI's fault), unless it goes away 
>> once the prior error is resolved.
> 
> I'll ask Pathscale; thanks.
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel



Re: [OMPI devel] 1.7.4rc: MPI_F08_INTERFACES_CALLBACKS build failure with PathScale 4.0.12.1

2014-01-22 Thread Jeff Squyres (jsquyres)
On Jan 21, 2014, at 11:49 PM, Paul Hargrove  wrote:

> Looks like we may be getting closer, but are not quite there:
> 
>   PPFC mpi-f08.lo
>BIND(C, name="ompi_type_create_hindexed_block_f")
> ^
> pathf95-1690 pathf95: ERROR OMPI_TYPE_CREATE_HINDEXED_BLOCK_F, File = 
> /global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.7-latest-linux-x86_64-pathcc-4.0/openmpi-1.7.4rc2r30361/ompi/mpi/fortran/use-mpi-f08/mpi-f-interfaces-bind.h,
>  Line = 605, Column = 17
>   NAME= specifier in BIND clause requires scalar character constant

Wow.  Pulling on this thread turned up a whole pile of bugs :-\, including 
several other names that are >=32 characters:

Found long name: ompi_type_create_indexed_block_f (32)
Found long name: ompi_type_create_hindexed_block_f (33)
Found long name: pompi_type_create_indexed_block_f (33)
Found long name: pompi_type_create_hindexed_block_f (34)
Found long name: pompi_file_get_position_shared_f (32)
Found long name: pompi_file_write_ordered_begin_f (32)
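(A scan like the one above can be reproduced with a short script.  The
following is a minimal sketch; the regex, the 32-character threshold, and the
sample text are assumptions for illustration, not the actual tool used.)

```python
import re

# Match BIND(C, name="...") specifiers in Fortran source and capture the
# binding name.  Case-insensitive, tolerant of whitespace.
BIND_NAME = re.compile(r'BIND\s*\(\s*C\s*,\s*name\s*=\s*"([^"]+)"\s*\)',
                       re.IGNORECASE)

def find_long_names(source_text, limit=32):
    """Return (name, length) pairs for BIND(C) names >= limit characters."""
    return [(name, len(name))
            for name in BIND_NAME.findall(source_text)
            if len(name) >= limit]

sample = '''
  BIND(C, name="ompi_type_create_hindexed_block_f")
  BIND(C, name="ompi_barrier_f")
'''
for name, n in find_long_names(sample):
    print("Found long name: %s (%d)" % (name, n))
    # -> Found long name: ompi_type_create_hindexed_block_f (33)
```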

Can you do me a favor and cd into ompi/mpi/fortran/use-mpi-f08 and try to 
manually "make type_create_indexed_block_f08.lo" and see if it also complains?  
That's a 32 character name -- let's see if the limit is >=32 or >=33...

> pathf95-1044 pathf95: INTERNAL OMPI_COMM_CREATE_KEYVAL_F, File = 
> /global/homes/h/hargrove/GSCRATCH/OMPI/openmpi-1.7-latest-linux-x86_64-pathcc-4.0/openmpi-1.7.4rc2r30361/ompi/mpi/fortran/use-mpi-f08/mpi-f-interfaces-bind.h,
>  Line = 1242, Column = 38
>   Internal : Unexpected ATP_PGM_UNIT in check_interoperable_pgm_unit()
> make[2]: *** [mpi-f08.lo] Error 1
> make[2]: Leaving directory 
> `/global/scratch2/sd/hargrove/OMPI/openmpi-1.7-latest-linux-x86_64-pathcc-4.0/BLD/ompi/mpi/fortran/use-mpi-f08'
> 
> The first error appears likely to be due to the 33-character name for the C 
> binding.
> Not sure if that is a limitation allowed by the fortran spec, or an arbitrary 
> limitation in this compiler.
> 
> The "Internal" may be a show-stopper (not OMPI's fault), unless it goes away 
> once the prior error is resolved.

I'll ask Pathscale; thanks.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI devel] 1.7.4rc: mpirun hangs on ia64

2014-01-22 Thread Paul Hargrove
Following-up as promised:

Output from an --enable-debug build is attached.

-Paul


On Tue, Jan 21, 2014 at 11:25 PM, Paul Hargrove  wrote:

> Yes, this is familiar. See:
> http://www.open-mpi.org/community/lists/devel/2013/11/13182.php
>
> If I understand correctly, the thread ended with:
>
> On 03 December 2013, Sylvestre Ledru wrote:
>
>> FYI, Debian has stopped supporting ia64 for its next release
>> So, I stopped working on that issue.
>
>
> Well, I have access to a Linux/IA64 system and my trials with
> openmpi-1.7.4rc2r30361 appear to hang, much as Sylvestre had reported w/
> 1.6.5.
>
> I am attaching output from a build W/O --enable-debug for the command:
> $ mpirun -mca plm_base_verbose 5 -mca ras_base_verbose 5 -mca
> rmaps_base_verbose 5 -mca ess_base_verbose 5 -np 1 ./ring_c
>
> I will follow-up with an --enable-debug build when possible.
>
> -Paul
>
> --
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> Computer and Data Sciences Department Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


log.txt.bz2
Description: BZip2 compressed data


[OMPI devel] 1.7.4rc: mpirun hangs on ia64

2014-01-22 Thread Paul Hargrove
Yes, this is familiar. See:
http://www.open-mpi.org/community/lists/devel/2013/11/13182.php

If I understand correctly, the thread ended with:

On 03 December 2013, Sylvestre Ledru wrote:

> FYI, Debian has stopped supporting ia64 for its next release
> So, I stopped working on that issue.


Well, I have access to a Linux/IA64 system and my trials with
openmpi-1.7.4rc2r30361 appear to hang, much as Sylvestre had reported w/
1.6.5.

I am attaching output from a build W/O --enable-debug for the command:
$ mpirun -mca plm_base_verbose 5 -mca ras_base_verbose 5 -mca
rmaps_base_verbose 5 -mca ess_base_verbose 5 -np 1 ./ring_c

I will follow-up with an --enable-debug build when possible.

-Paul

-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


log.txt.bz2
Description: BZip2 compressed data