Re: [OMPI users] Windows Open MPI question

2012-12-18 Thread Kumar, Sudhir
Hi
 The error is resolved. The solution was actually in a previous post.
http://www.open-mpi.org/community/lists/users/2011/03/15954.php



-Original Message-
From: Kumar, Sudhir 
Sent: Tuesday, December 18, 2012 1:37 PM
To: 'Open MPI Users'
Subject: RE: [OMPI users] Windows Open MPI question

Hi 
 I am getting several Linker errors while doing a build

 Error  194 error LNK2001: unresolved external symbol ompi_mpi_int  


-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Jeff Squyres
Sent: Tuesday, December 18, 2012 12:59 PM
To: Open MPI Users
Subject: Re: [OMPI users] Windows Open MPI question

Just curious -- what do you need this struct type for?

It's an internal type; you shouldn't need the definition for MPI applications.


On Dec 18, 2012, at 12:05 PM, marco atzeri wrote:

> On 12/18/2012 5:49 PM, Kumar, Sudhir wrote:
>> Hi
>>  Is struct ompi_datatype_t defined only for Linux, or is there a Windows 
>> equivalent? If so, in which header file can it be found?
>> Thanks
> 
> ompi/datatype/ompi_datatype.h
> 
> Regards
> Marco
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/





[OMPI users] OpenMPI with cMake on Windows

2012-12-18 Thread Stephen Conley
Hello,

I have installed CMake version 2.8.10.2 and OpenMPI version 1.6.2 on a 64-bit
Windows 7 computer.

OpenMPI is installed in "C:\program files\OpenMPI" and the path has been
updated to include the bin subdirectory.

In the CMakeLists.txt file, I have: find_package(MPI REQUIRED)

When I run cmake, I receive the following error:

C:\Users\steve\workspace\Dales\build>cmake ..\src -G "MinGW Makefiles"
CMake Error at C:/Program Files (x86)/CMake 2.8/share/cmake-2.8/Modules/FindPackageHandleStandardArgs.cmake:97 (message):
  Could NOT find MPI_C (missing: MPI_C_LIBRARIES)
Call Stack (most recent call first):
  C:/Program Files (x86)/CMake 2.8/share/cmake-2.8/Modules/FindPackageHandleStandardArgs.cmake:291 (_FPHSA_FAILURE_MESSAGE)
  C:/Program Files (x86)/CMake 2.8/share/cmake-2.8/Modules/FindMPI.cmake:587 (find_package_handle_standard_args)
  CMakeLists.txt:9 (find_package)

-- Configuring incomplete, errors occurred!

Any ideas as to what I am missing?





Re: [OMPI users] openmpi-1.9a1r27674 on Cygwin-1.7.17

2012-12-18 Thread marco atzeri

On 12/18/2012 6:55 PM, Jeff Squyres wrote:
> ...but only of v1.6.x.

okay, adding development version on Christmas wishlist
;-)

> On Dec 18, 2012, at 10:32 AM, Ralph Castain wrote:
>
>> Also, be aware that the Cygwin folks have already released a fully functional
>> port of OMPI to that environment as a package. So if you want OMPI on Cygwin,
>> you can just download and install the Cygwin package - no need to build it
>> yourself.




Regards
Marco



Re: [OMPI users] openmpi-1.9a1r27674 on Cygwin-1.7.17

2012-12-18 Thread Jeff Squyres
...but only of v1.6.x.

On Dec 18, 2012, at 10:32 AM, Ralph Castain wrote:

> Also, be aware that the Cygwin folks have already released a fully functional 
> port of OMPI to that environment as a package. So if you want OMPI on Cygwin, 
> you can just download and install the Cygwin package - no need to build it 
> yourself.
> 
> 
> On Dec 18, 2012, at 7:23 AM, Jeff Squyres  wrote:
> 
>> On Dec 18, 2012, at 10:06 AM, JR Cary wrote:
>> 
>>> So, IMO, OpenMPI would have to turn to a different
>>> group for support.  E.g., Microsoft compatible HPC
>>> application vendors.  And for that one would need a
>>> compelling case of being better in, e.g., performance.
>> 
>> I doubt that a performance case could be made.  That is, I don't expect 
>> modern versions of Windows are any more/less efficient at integer/floating 
>> point ops (which are key to HPC apps) than modern versions of Linux or other 
>> OS's.  The underlying x86 hardware is the same (in most/commodity cases), 
>> after all.
>> 
>> Windows also has (effectively) an OS-bypass network stack, like Linux, for 
>> network providers.
>> 
>> Hence, I don't want to open the "Windows performance vs. Linux performance" 
>> religious debate.  I'm assuming that if someone cared, they could get 
>> comparable performance out of Windows and Linux.
>> 
>>> Perhaps there is another way?
>> 
>> 
>> At this point, I think we're up for volunteers.  :-\
>> 
>> FWIW: I'm still debating these cygwin patches.  
>> 
>> The cmake/native build process will likely go if no one steps up to maintain 
>> it.  But in our discussions, I don't think we've delineated between "Windows 
>> native" and "cygwin": a major difference is that the cygwin build uses the 
>> same Autotools build system that OMPI uses on POSIX systems.  And I don't 
>> know how much custom code cygwin requires vs. native Windows code (although 
>> I seem to recall that native windows code definitely performs better than 
>> its cygwin counterparts -- e.g., Windows SOCKETs are faster than cygwin 
>> POSIX sockets).
>> 
>> -- 
>> Jeff Squyres
>> jsquy...@cisco.com


-- 
Jeff Squyres
jsquy...@cisco.com




Re: [OMPI users] openmpi-1.9a1r27674 on Cygwin-1.7.17

2012-12-18 Thread Damien
It's a historical and emotional decision that also used to have a 
business driver.  I learned MPI with LAM on Linux (minute's silence...) 
and switched to OpenMPI when LAM went to join the big supercomputer in 
the sky.  Shortly after OpenMPI launched, we had some discussions about 
a Windows version, but it took until 1.5 before there was one because 
Shiqing did the heavy lifting.


At the time, MS was still heavily into HPC and I used OpenMPI on Windows 
(and Linux) because I like the team that develops it and wanted to 
support the product.  In 2011 when MS changed the direction of their HPC 
team I think OpenMPI on Windows lost the actual business driver for a 
critical mass of users, and that's what we're seeing now.  I'll still 
use OpenMPI on Linux, but I think on Windows it will be HPC Pack or MPICH.


Damien

On 18/12/2012 10:20 AM, JR Cary wrote:

So a question - why do *you* use (native) OpenMPI on Windows, when
you could just download HPC Pack?  Was it for any reason related
to implementation?

(I may have been one of those 2-3 candidate users, but I actually
just download HPC Pack.)

Back to the point of why OpenMPI might be desirable: I agree with
Jeff that it is not about on-node performance, nor use of the network
stack.  It would have to be better or more implementations above that
layer, such as OpenMPI having implementations for some advanced MPI
methods that are absent in HPC Pack (which I understand has forked
from MPICH).

But, yeah, it does seem like the coffin is pretty well shut, otherwise.

Thx...John

On 12/18/12 9:00 AM, Damien wrote:
Proper Windows support of OpenMPI is likely around 20 hours a week. 
That can be maintained by a small group, but it's probably too much 
for one person unless they're working in Windows HPC every day. When 
I posted a couple of weeks back, there were three people (maybe two?) 
who responded that they used OpenMPI on Windows regularly, other than 
me.


I hate to say it, but against MPICH and the Microsoft and Intel MPICH 
versions with probably a few thousand regular users, I think OpenMPI 
on native Windows is dead in the water.


Damien

On 18/12/2012 8:06 AM, JR Cary wrote:

On 12/18/12 6:29 AM, Jeff Squyres wrote:

This brings up the point again, however, of Windows support.

Open MPI recently lost its only Windows developer (he moved on to 
non-HPC things).  This has been discussed on the lists a few times 
(I honestly don't remember if it was this users list or the devel 
list), and there hasn't really been anyone who volunteered their 
time to support Open MPI on Windows.


Definitely this list.

We're seriously considering removing all Windows support for 1.7 
and beyond (keep in mind that the native Windows support on the SVN 
trunk and v1.7 branch is very, very out of date and needs some 
serious work to get working again -- the last working native 
Windows version is on the v1.6 branch).


Sounds appropriate.  My conversations with Microsoft
went nowhere.  Spoke last night with another good
friend there who worked in their HPC unit when that
existed.  Microsoft has their own implementation, and
they see no need for another.

So, IMO, OpenMPI would have to turn to a different
group for support.  E.g., Microsoft compatible HPC
application vendors.  And for that one would need a
compelling case of being better in, e.g., performance.

Can this case be made?

Perhaps there is another way?

John







On Dec 18, 2012, at 3:04 AM, Siegmar Gross wrote:


Hi,

I tried to install openmpi-1.9a1r27674 on Cygwin-1.7.17 and
got the following error (gcc-4.5.3).

...
  CC   path.lo
../../../openmpi-1.9a1r27668/opal/util/path.c: In function
  'opal_path_df':
../../../openmpi-1.9a1r27668/opal/util/path.c:578:18: error:
  'buf' undeclared (first use in this function)
../../../openmpi-1.9a1r27668/opal/util/path.c:578:18: note:
  each undeclared identifier is reported only once for each
  function it appears in
Makefile:1669: recipe for target `path.lo' failed
make[3]: *** [path.lo] Error 1
...


The reason is that "buf" is only declared for some operating
systems. I added "defined(__CYGWIN__)" in some places and
was able to compile "path.c".


hermes util 41 diff path.c path.c.orig
452c452
< #elif defined(__linux__) || defined(__CYGWIN__) ||
  defined (__BSD) || (defined(__APPLE__) && defined(__MACH__))
---
> #elif defined(__linux__) || defined (__BSD) ||
  (defined(__APPLE__) && defined(__MACH__))
480c480
< #elif defined(__linux__) || defined(__CYGWIN__) ||
  defined (__BSD) || (defined(__APPLE__) && defined(__MACH__))
---
> #elif defined(__linux__) || defined (__BSD) ||
  (defined(__APPLE__) && defined(__MACH__))
517c517
< #elif defined(__linux__) || defined(__CYGWIN__)
---
> #elif defined(__linux__)
549c549
< #elif defined(__linux__) || defined(__CYGWIN__) ||
  defined (__BSD) || \
---
> #elif defined(__linux__) || defined (__BSD) ||\
562c562
< #elif defined(__linux__) || defined (__CYGWIN__) ||
  defined (__BSD) ||   \

Re: [OMPI users] openmpi-1.9a1r27674 on Cygwin-1.7.17

2012-12-18 Thread JR Cary

So a question - why do *you* use (native) OpenMPI on Windows, when
you could just download HPC Pack?  Was it for any reason related
to implementation?

(I may have been one of those 2-3 candidate users, but I actually
just download HPC Pack.)

Back to the point of why OpenMPI might be desirable: I agree with
Jeff that it is not about on-node performance, nor use of the network
stack.  It would have to be better or more implementations above that
layer, such as OpenMPI having implementations for some advanced MPI
methods that are absent in HPC Pack (which I understand has forked
from MPICH).

But, yeah, it does seem like the coffin is pretty well shut, otherwise.

Thx...John

On 12/18/12 9:00 AM, Damien wrote:
Proper Windows support of OpenMPI is likely around 20 hours a week. 
That can be maintained by a small group, but it's probably too much 
for one person unless they're working in Windows HPC every day. When I 
posted a couple of weeks back, there were three people (maybe two?) 
who responded that they used OpenMPI on Windows regularly, other than me.


I hate to say it, but against MPICH and the Microsoft and Intel MPICH 
versions with probably a few thousand regular users, I think OpenMPI 
on native Windows is dead in the water.


Damien

On 18/12/2012 8:06 AM, JR Cary wrote:

On 12/18/12 6:29 AM, Jeff Squyres wrote:

This brings up the point again, however, of Windows support.

Open MPI recently lost its only Windows developer (he moved on to 
non-HPC things).  This has been discussed on the lists a few times 
(I honestly don't remember if it was this users list or the devel 
list), and there hasn't really been anyone who volunteered their 
time to support Open MPI on Windows.


Definitely this list.

We're seriously considering removing all Windows support for 1.7 and 
beyond (keep in mind that the native Windows support on the SVN 
trunk and v1.7 branch is very, very out of date and needs some 
serious work to get working again -- the last working native Windows 
version is on the v1.6 branch).


Sounds appropriate.  My conversations with Microsoft
went nowhere.  Spoke last night with another good
friend there who worked in their HPC unit when that
existed.  Microsoft has their own implementation, and
they see no need for another.

So, IMO, OpenMPI would have to turn to a different
group for support.  E.g., Microsoft compatible HPC
application vendors.  And for that one would need a
compelling case of being better in, e.g., performance.

Can this case be made?

Perhaps there is another way?

John







On Dec 18, 2012, at 3:04 AM, Siegmar Gross wrote:


Hi,

I tried to install openmpi-1.9a1r27674 on Cygwin-1.7.17 and
got the following error (gcc-4.5.3).

...
  CC   path.lo
../../../openmpi-1.9a1r27668/opal/util/path.c: In function
  'opal_path_df':
../../../openmpi-1.9a1r27668/opal/util/path.c:578:18: error:
  'buf' undeclared (first use in this function)
../../../openmpi-1.9a1r27668/opal/util/path.c:578:18: note:
  each undeclared identifier is reported only once for each
  function it appears in
Makefile:1669: recipe for target `path.lo' failed
make[3]: *** [path.lo] Error 1
...


The reason is that "buf" is only declared for some operating
systems. I added "defined(__CYGWIN__)" in some places and
was able to compile "path.c".


hermes util 41 diff path.c path.c.orig
452c452
< #elif defined(__linux__) || defined(__CYGWIN__) ||
  defined (__BSD) || (defined(__APPLE__) && defined(__MACH__))
---
> #elif defined(__linux__) || defined (__BSD) ||
  (defined(__APPLE__) && defined(__MACH__))
480c480
< #elif defined(__linux__) || defined(__CYGWIN__) ||
  defined (__BSD) || (defined(__APPLE__) && defined(__MACH__))
---
> #elif defined(__linux__) || defined (__BSD) ||
  (defined(__APPLE__) && defined(__MACH__))
517c517
< #elif defined(__linux__) || defined(__CYGWIN__)
---
> #elif defined(__linux__)
549c549
< #elif defined(__linux__) || defined(__CYGWIN__) ||
  defined (__BSD) || \
---
> #elif defined(__linux__) || defined (__BSD) ||\
562c562
< #elif defined(__linux__) || defined (__CYGWIN__) ||
  defined (__BSD) ||   \
---
> #elif defined(__linux__) || defined (__BSD) ||\
hermes util 42
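
The patch above has to touch five separate #elif lines because the same platform list is repeated at each declaration and each use of "buf". A minimal sketch in C of one way to keep such lists from drifting apart is shown below. It is not the actual path.c code: the macro SKETCH_HAVE_BUF, the function sketch_free_blocks, and the stand-in struct are all hypothetical names made up for illustration.

```c
#include <assert.h>

/* Hypothetical macro, not from Open MPI: the platform list lives in
 * exactly one place, so adding __CYGWIN__ is a one-line change. */
#if defined(__linux__) || defined(__CYGWIN__) || defined(__BSD) || \
    (defined(__APPLE__) && defined(__MACH__))
#define SKETCH_HAVE_BUF 1
#else
#define SKETCH_HAVE_BUF 0
#endif

/* Hypothetical function: returns a value read from `buf` on platforms
 * where it is declared, otherwise the caller's fallback. */
static long sketch_free_blocks(long fallback)
{
#if SKETCH_HAVE_BUF
    struct { long f_bavail; } buf = { 42 };  /* declaration site (stand-in struct) */
#endif

    assert(fallback >= 0);

#if SKETCH_HAVE_BUF
    return buf.f_bavail;                     /* use site, guarded by the same macro */
#else
    return fallback;
#endif
}
```

Because the declaration and the use are guarded by the same macro, a platform cannot reach the use without also compiling the declaration, which is the "'buf' undeclared" failure reported for Cygwin.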


Searching for "__linux__" turned up some more files which
may need to be adapted.

opal/config/opal_check_os_flavors.m4
opal/mca/event/libevent2019/libevent/buffer.c


I assume that the following files do not need any changes
because they are special for Linux or for features which
are not important/available for Cygwin.


configure:{ $as_echo "$as_me:${as_lineno-$LINENO}:
  checking __linux__" >&5
configure:$as_echo_n "checking __linux__... " >&6; }
configure:#ifndef __linux__
configure:  error: this isnt __linux__

test/util/opal_path_nfs.c

opal/asm/base/MIPS.asm:#ifdef __linux__
opal/asm/generated/atomic-mips64el.s:#ifdef __linux__
opal/asm/generated/atomic-mips64-linux.s:#ifdef __linux__
opal/asm/generated/atomic-mips-irix.s:#ifdef __linux__

Re: [OMPI users] [Open MPI] #3351: JAVA scatter error

2012-12-18 Thread Siegmar Gross
Hi

> >>  1. The datatypes passed to Scatter are not valid MPI datatypes
> >> (MPI.OBJECT).  You need to construct a datatype that is specific to the
> >> MyData class, just like you would in C/C++.  I think that this is the
> >> first error that you are seeing (i.e., that OMPI is trying to treat
> >> MPI.OBJECT as an MPI Datatype object, and failing, and therefore throwing
> >> a ClassCastException exception).
> > 
> > Perhaps you are right and my small example program is not a valid MPI
> > program. The problem is that I couldn't find any good documentation or
> > example programs showing how to write a program which uses a structured
> > data type.
> 
> In Java, that's probably true.  Remember: there are no official MPI
> Java bindings. What is included in Open MPI is a research project
> from several years ago.  We picked what appeared to be the best one,
> freshened it up a little, updated its build system to incorporate
> into ours, verified its basic functionality, and went with that.
> 
> In C, there should be plenty of google-able examples about how to
> use Scatter (and friends).  You might want to have a look at a few
> of those to get an idea how to use MPI_Scatter in general, and then
> apply that knowledge to a Java program.
> 
> Make sense?

I know how to use MPI_Scatter or MPI_Scatterv in C, because I have
written some small working example programs myself in the past.
My first Java program with MPI_Scatter was ColumnScatterMain.java,
which I sent to the list in early October and now once more to you in
December. On October 10th I sent the program ColumnSendRecvMain.java
to the list (Subject: Datatype.Vector in mpijava in openmpi-1.9a1r27380),
because I thought and still think that building a column vector
doesn't work as expected. At the end of that email I wrote "In my
opinion Datatype.Vector doesn't work as expected. mpiJava doesn't
support something similar to MPI_Type_create_resized so how can I use
column_t in a scatter operation? Will scatter automatically start with
the next element and not with the element following the extent of
column_t?".

In my opinion Datatype.Vector must set the size of the
base datatype as the extent of the vector and not the true extent, because
MPI-Java doesn't provide a function to resize a datatype. Furthermore,
Datatype.Struct allows only a collection of elements of the same type,
so you must use a data object if you want to scatter or broadcast
data of different types in one operation. We should forget
ObjectScatterMain.java for the moment and concentrate on
ObjectBroadcastMain.java, which I sent to the list three days ago,
because it has the same problem.
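
To make the extent question concrete, here is a small sketch in plain C (no MPI calls) of the arithmetic the MPI standard defines for the C interface: a vector type with a positive stride spans (count - 1) * stride + blocklength base elements, and a scatter places consecutive items one extent apart. The function names are made up for illustration, and whether the Java bindings follow the same rule is exactly the open question, so this only shows the C-side behaviour that MPI_Type_create_resized is normally used to adjust.

```c
#include <assert.h>
#include <stddef.h>

/* Extent, in base elements, of a vector type made of `count` blocks of
 * `blocklength` elements whose starts are `stride` elements apart
 * (positive stride assumed). Hypothetical helper name. */
static size_t vector_extent_elems(size_t count, size_t blocklength, size_t stride)
{
    assert(blocklength > 0);
    if (count == 0) {
        return 0;
    }
    return (count - 1) * stride + blocklength;
}

/* Index of the first base element of the k-th item when consecutive
 * items are laid out one extent apart. Hypothetical helper name. */
static size_t item_start(size_t k, size_t extent_elems)
{
    return k * extent_elems;
}
```

For a 4 x 6 matrix, a column type with count 4, blocklength 1 and stride 6 has an extent of 19 elements, so the second item would start at element 19 rather than at element 1. With the extent resized to a single element, item_start(1, 1) gives 1, which is the start of the next column.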

1) ColumnSendRecvMain.java

I create a 2D-matrix with (Java books would use "double[][] matrix"
which is the same in my opinion, but I like C notation)

double matrix[][] = new double[P][Q];

Next I create a column vector

column_t = Datatype.Vector (P, 1, Q, MPI.DOUBLE);
column_t.Commit ();

which I can use in a send/recv-operation

if (mytid == 0)
{
  /* send one column to each process */
  for (i = 0; i < Q; ++i)
  {
    MPI.COMM_WORLD.Send (matrix, i, 1, column_t, i + 1, 0);
  }
}
else
{
  MPI.COMM_WORLD.Recv (column, 0, P, MPI.DOUBLE, 0, 0);
}


This example doesn't depend on the extent of column_t, because I set
the "offset" where every column starts (at least I think so :-) ).
Java does not want the user to know anything about memory layouts
or addresses of data structures. That's why I think that
all necessary computations and transformations must be done in
Datatype.Vector, MPI.COMM_WORLD.Send, and MPI.COMM_WORLD.Recv.
Unfortunately, that does not seem to be the case.

tyr java 125 mpiexec -np 7 -output-filename xx java ColumnSendRecvMain
tyr java 128 cat xx.1.0 xx.1.1

matrix:

  1.00  2.00  3.00  4.00  5.00  6.00
  7.00  8.00  9.00 10.00 11.00 12.00
 13.00 14.00 15.00 16.00 17.00 18.00
 19.00 20.00 21.00 22.00 23.00 24.00

Column of process 1

  0.00  3.00  7.00  0.00


I get the following output, if I use "int" instead of "double".

tyr java 143 mpiexec -np 7 -output-filename xx java ColumnSendRecvIntMain
tyr java 144 cat xx.1.0 xx.1.1

matrix:

 1   2   3   4   5   6  
 7   8   9  10  11  12  
13  14  15  16  17  18  
19  20  21  22  23  24  

Column of process 1

99731135  1586 5 7

It is easy to see that process 1 doesn't get column 0. Your
suggestion to allocate enough memory for a matrix (without defining
a matrix) and to do all index computations yourself is, in my opinion,
not practical for a "normal" Java programmer (it's even hard for
most C programmers :-) ). Hopefully you have an idea how to solve
this problem so that all processes receive correct column values.
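
For reference, the "do all index computations yourself" approach looks roughly like this in C. It is a sketch only: the function names are made up for illustration, and it makes no claim about how the Java bindings handle a double[][] internally (in Java, each row of such an array is a separately allocated object, so a fixed stride across rows is not guaranteed to be meaningful).

```c
#include <assert.h>
#include <stddef.h>

/* Fill a rows x cols matrix stored in one contiguous row-major block
 * with 1, 2, 3, ...; element (r, c) lives at index r * cols + c. */
static void fill_row_major(double *flat, size_t rows, size_t cols)
{
    for (size_t r = 0; r < rows; ++r) {
        for (size_t c = 0; c < cols; ++c) {
            flat[r * cols + c] = (double)(r * cols + c + 1);
        }
    }
}

/* Copy column `col` into `out` (length rows). This visits the same
 * elements a vector type with count = rows, blocklength = 1 and
 * stride = cols would select when started at offset `col`. */
static void copy_column(const double *flat, size_t rows, size_t cols,
                        size_t col, double *out)
{
    assert(col < cols);
    for (size_t r = 0; r < rows; ++r) {
        out[r] = flat[r * cols + col];
    }
}
```

With rows = 4 and cols = 6 this reproduces the matrix printed above, and copy_column for column 0 yields 1, 7, 13, 19, which is what process 1 was expected to receive.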


2) 

Re: [OMPI users] Windows Open MPI question

2012-12-18 Thread marco atzeri

On 12/18/2012 5:49 PM, Kumar, Sudhir wrote:

Hi
  Is struct ompi_datatype_t defined only for Linux, or is there a Windows 
equivalent? If so, in which header file can it be found?
Thanks


 ompi/datatype/ompi_datatype.h

Regards
Marco



[OMPI users] Windows Open MPI question

2012-12-18 Thread Kumar, Sudhir
Hi
 Is struct ompi_datatype_t defined only for Linux, or is there a Windows 
equivalent? If so, in which header file can it be found?
Thanks



Re: [OMPI users] openmpi-1.9a1r27674 on Cygwin-1.7.17

2012-12-18 Thread Jeff Squyres
On Dec 18, 2012, at 10:06 AM, JR Cary wrote:

> So, IMO, OpenMPI would have to turn to a different
> group for support.  E.g., Microsoft compatible HPC
> application vendors.  And for that one would need a
> compelling case of being better in, e.g., performance.

I doubt that a performance case could be made.  That is, I don't expect modern 
versions of Windows are any more/less efficient at integer/floating point ops 
(which are key to HPC apps) than modern versions of Linux or other OS's.  The 
underlying x86 hardware is the same (in most/commodity cases), after all.

Windows also has (effectively) an OS-bypass network stack, like Linux, for 
network providers.

Hence, I don't want to open the "Windows performance vs. Linux performance" 
religious debate.  I'm assuming that if someone cared, they could get 
comparable performance out of Windows and Linux.

> Perhaps there is another way?


At this point, I think we're up for volunteers.  :-\

FWIW: I'm still debating these cygwin patches.  

The cmake/native build process will likely go if no one steps up to maintain 
it.  But in our discussions, I don't think we've delineated between "Windows 
native" and "cygwin": a major difference is that the cygwin build uses the same 
Autotools build system that OMPI uses on POSIX systems.  And I don't know how 
much custom code cygwin requires vs. native Windows code (although I seem to 
recall that native windows code definitely performs better than its cygwin 
counterparts -- e.g., Windows SOCKETs are faster than cygwin POSIX sockets).

-- 
Jeff Squyres
jsquy...@cisco.com




Re: [OMPI users] openmpi-1.9a1r27674 on Cygwin-1.7.17

2012-12-18 Thread JR Cary

On 12/18/12 6:29 AM, Jeff Squyres wrote:

This brings up the point again, however, of Windows support.

Open MPI recently lost its only Windows developer (he moved on to non-HPC 
things).  This has been discussed on the lists a few times (I honestly don't 
remember if it was this users list or the devel list), and there hasn't really 
been anyone who volunteered their time to support Open MPI on Windows.


Definitely this list.


We're seriously considering removing all Windows support for 1.7 and beyond 
(keep in mind that the native Windows support on the SVN trunk and v1.7 branch 
is very, very out of date and needs some serious work to get working again -- 
the last working native Windows version is on the v1.6 branch).


Sounds appropriate.  My conversations with Microsoft
went nowhere.  Spoke last night with another good
friend there who worked in their HPC unit when that
existed.  Microsoft has their own implementation, and
they see no need for another.

So, IMO, OpenMPI would have to turn to a different
group for support.  E.g., Microsoft compatible HPC
application vendors.  And for that one would need a
compelling case of being better in, e.g., performance.

Can this case be made?

Perhaps there is another way?

John







On Dec 18, 2012, at 3:04 AM, Siegmar Gross wrote:


Hi,

I tried to install openmpi-1.9a1r27674 on Cygwin-1.7.17 and
got the following error (gcc-4.5.3).

...
  CC   path.lo
../../../openmpi-1.9a1r27668/opal/util/path.c: In function
  'opal_path_df':
../../../openmpi-1.9a1r27668/opal/util/path.c:578:18: error:
  'buf' undeclared (first use in this function)
../../../openmpi-1.9a1r27668/opal/util/path.c:578:18: note:
  each undeclared identifier is reported only once for each
  function it appears in
Makefile:1669: recipe for target `path.lo' failed
make[3]: *** [path.lo] Error 1
...


The reason is that "buf" is only declared for some operating
systems. I added "defined(__CYGWIN__)" in some places and
was able to compile "path.c".


hermes util 41 diff path.c path.c.orig
452c452
< #elif defined(__linux__) || defined(__CYGWIN__) ||
  defined (__BSD) || (defined(__APPLE__) && defined(__MACH__))
---
> #elif defined(__linux__) || defined (__BSD) ||
  (defined(__APPLE__) && defined(__MACH__))
480c480
< #elif defined(__linux__) || defined(__CYGWIN__) ||
  defined (__BSD) || (defined(__APPLE__) && defined(__MACH__))
---
> #elif defined(__linux__) || defined (__BSD) ||
  (defined(__APPLE__) && defined(__MACH__))
517c517
< #elif defined(__linux__) || defined(__CYGWIN__)
---
> #elif defined(__linux__)
549c549
< #elif defined(__linux__) || defined(__CYGWIN__) ||
  defined (__BSD) || \
---
> #elif defined(__linux__) || defined (__BSD) ||\
562c562
< #elif defined(__linux__) || defined (__CYGWIN__) ||
  defined (__BSD) ||   \
---
> #elif defined(__linux__) || defined (__BSD) ||\
hermes util 42


Searching for "__linux__" turned up some more files which
may need to be adapted.

opal/config/opal_check_os_flavors.m4
opal/mca/event/libevent2019/libevent/buffer.c


I assume that the following files do not need any changes
because they are special for Linux or for features which
are not important/available for Cygwin.


configure:{ $as_echo "$as_me:${as_lineno-$LINENO}:
  checking __linux__" >&5
configure:$as_echo_n "checking __linux__... " >&6; }
configure:#ifndef __linux__
configure:  error: this isnt __linux__

test/util/opal_path_nfs.c

opal/asm/base/MIPS.asm:#ifdef __linux__
opal/asm/generated/atomic-mips64el.s:#ifdef __linux__
opal/asm/generated/atomic-mips64-linux.s:#ifdef __linux__
opal/asm/generated/atomic-mips-irix.s:#ifdef __linux__
opal/asm/generated/atomic-mips-linux.s:#ifdef __linux__

ompi/mca/common/verbs/common_verbs_basics.c:#if defined(__linux__)
opal/include/opal/sys/cma.h:#ifdef __linux__
opal/mca/memory/linux/arena.c:#ifdef __linux__

ompi/mca/io/romio/romio/configure:#ifdef __linux__
ompi/mca/io/romio/romio/configure.in:#ifdef __linux__
opal/include/opal/sys/mips/atomic.h:#ifdef __linux__
opal/include/opal/sys/mips/atomic.h:#ifdef __linux__
opal/include/opal/sys/mips/atomic.h:#ifdef __linux__
opal/mca/event/libevent2019/libevent/arc4random.c:#ifdef __linux__

ompi/mca/io/romio/romio/adio/ad_lustre/ad_lustre.h:#ifdef __linux__
ompi/mca/io/romio/romio/adio/ad_lustre/ad_lustre.h:#endif /* __linux__ */


Can somebody add __CYGWIN__ to all necessary files? Now I get
the following error.

...
Making all in mca/if/windows
make[2]: Entering directory
  `/home/Admin/openmpi/openmpi-1.9-Cygwin.x86.32_gcc/opal/mca/if/windows'
  CC   opal_if_windows.lo
../../../../../openmpi-1.9a1r27674/opal/mca/if/windows/opal_if_windows.c:
  In function 'if_windows_open':
../../../../../openmpi-1.9a1r27674/opal/mca/if/windows/opal_if_windows.c:58:5:
  error: 'SOCKET' undeclared (first use in this function)
../../../../../openmpi-1.9a1r27674/opal/mca/if/windows/opal_if_windows.c:58:5:
  note: each undeclared identifier is reported only 

Re: [OMPI users] problem configuring openmpi-1.6.4a1r27643 on Linux

2012-12-18 Thread Jeff Squyres
On Dec 13, 2012, at 2:39 AM, Siegmar Gross wrote:

> I found the error with your hint. For Open MPI 1.6.x I must also
> specify "F77" and "FFLAGS" for the Fortran 77 compiler. Otherwise
> it uses "gfortran" from the GNU package. "gfortran" worked for the
> 64 bit version and didn't work for the 32 bit version. (that's the
> reason why I got only an error for the 32 bit version).

Yep, sorry about that -- we revamped the Fortran support in v1.7 and beyond 
such that the following are used with configure:

v1.6.x and earlier: F77, FFLAGS, FC, FCFLAGS
v1.7.0 and later: FC, FCFLAGS

I.e., we consolidated down to one Fortran compiler for v1.7.0 and later (and 
similarly deprecated mpif77 and mpif90 -- mpifort is now preferred in v1.7.0 
and later).

-- 
Jeff Squyres
jsquy...@cisco.com




Re: [OMPI users] openmpi-1.9a1r27674 on Cygwin-1.7.17

2012-12-18 Thread Jeff Squyres
Thanks for all the patches.

This brings up the point again, however, of Windows support.

Open MPI recently lost its only Windows developer (he moved on to non-HPC 
things).  This has been discussed on the lists a few times (I honestly don't 
remember if it was this users list or the devel list), and there hasn't really 
been anyone who volunteered their time to support Open MPI on Windows.  

We're seriously considering removing all Windows support for 1.7 and beyond 
(keep in mind that the native Windows support on the SVN trunk and v1.7 branch 
is very, very out of date and needs some serious work to get working again -- 
the last working native Windows version is on the v1.6 branch).




On Dec 18, 2012, at 3:04 AM, Siegmar Gross wrote:

> Hi,
> 
> I tried to install openmpi-1.9a1r27674 on Cygwin-1.7.17 and
> got the following error (gcc-4.5.3).
> 
> ...
>  CC   path.lo
> ../../../openmpi-1.9a1r27668/opal/util/path.c: In function
>  'opal_path_df':
> ../../../openmpi-1.9a1r27668/opal/util/path.c:578:18: error:
>  'buf' undeclared (first use in this function)
> ../../../openmpi-1.9a1r27668/opal/util/path.c:578:18: note:
>  each undeclared identifier is reported only once for each
>  function it appears in
> Makefile:1669: recipe for target `path.lo' failed
> make[3]: *** [path.lo] Error 1
> ...
> 
> 
> The reason is that "buf" is only declared for some operating
> systems. I added "defined(__CYGWIN__)" in some places and
> was able to compile "path.c".
> 
> 
> hermes util 41 diff path.c path.c.orig
> 452c452
> < #elif defined(__linux__) || defined(__CYGWIN__) ||
>  defined (__BSD) || (defined(__APPLE__) && defined(__MACH__))
> ---
> > #elif defined(__linux__) || defined (__BSD) ||
>  (defined(__APPLE__) && defined(__MACH__))
> 480c480
> < #elif defined(__linux__) || defined(__CYGWIN__) ||
>  defined (__BSD) || (defined(__APPLE__) && defined(__MACH__))
> ---
> > #elif defined(__linux__) || defined (__BSD) ||
>  (defined(__APPLE__) && defined(__MACH__))
> 517c517
> < #elif defined(__linux__) || defined(__CYGWIN__)
> ---
> > #elif defined(__linux__)
> 549c549
> < #elif defined(__linux__) || defined(__CYGWIN__) ||
>  defined (__BSD) || \
> ---
> > #elif defined(__linux__) || defined (__BSD) ||\
> 562c562
> < #elif defined(__linux__) || defined (__CYGWIN__) ||
>  defined (__BSD) ||   \
> ---
> > #elif defined(__linux__) || defined (__BSD) ||\
> hermes util 42 
> 
> 
> Searching for "__linux__" delivered some more files which
> must possibly be adapted.
> 
> opal/config/opal_check_os_flavors.m4
> opal/mca/event/libevent2019/libevent/buffer.c
> 
> 
> I assume that the following files do not need any changes
> because they are special for Linux or for features which
> are not important/available for Cygwin.
> 
> 
> configure:{ $as_echo "$as_me:${as_lineno-$LINENO}:
>  checking __linux__" >&5
> configure:$as_echo_n "checking __linux__... " >&6; }
> configure:#ifndef __linux__
> configure:  error: this isnt __linux__
> 
> test/util/opal_path_nfs.c
> 
> opal/asm/base/MIPS.asm:#ifdef __linux__
> opal/asm/generated/atomic-mips64el.s:#ifdef __linux__
> opal/asm/generated/atomic-mips64-linux.s:#ifdef __linux__
> opal/asm/generated/atomic-mips-irix.s:#ifdef __linux__
> opal/asm/generated/atomic-mips-linux.s:#ifdef __linux__
> 
> ompi/mca/common/verbs/common_verbs_basics.c:#if defined(__linux__)
> opal/include/opal/sys/cma.h:#ifdef __linux__
> opal/mca/memory/linux/arena.c:#ifdef __linux__
> 
> ompi/mca/io/romio/romio/configure:#ifdef __linux__
> ompi/mca/io/romio/romio/configure.in:#ifdef __linux__
> opal/include/opal/sys/mips/atomic.h:#ifdef __linux__
> opal/include/opal/sys/mips/atomic.h:#ifdef __linux__
> opal/include/opal/sys/mips/atomic.h:#ifdef __linux__
> opal/mca/event/libevent2019/libevent/arc4random.c:#ifdef __linux__
> 
> ompi/mca/io/romio/romio/adio/ad_lustre/ad_lustre.h:#ifdef __linux__
> ompi/mca/io/romio/romio/adio/ad_lustre/ad_lustre.h:#endif /* __linux__ */
> 
> 
> Can somebody add __CYGWIN__ to all necessary files? Now I get
> the following error.
> 
> ...
> Making all in mca/if/windows
> make[2]: Entering directory
>  `/home/Admin/openmpi/openmpi-1.9-Cygwin.x86.32_gcc/opal/mca/if/windows'
>  CC   opal_if_windows.lo
> ../../../../../openmpi-1.9a1r27674/opal/mca/if/windows/opal_if_windows.c:
>  In function 'if_windows_open':
> ../../../../../openmpi-1.9a1r27674/opal/mca/if/windows/opal_if_windows.c:58:5:
>  error: 'SOCKET' undeclared (first use in this function)
> ../../../../../openmpi-1.9a1r27674/opal/mca/if/windows/opal_if_windows.c:58:5:
>  note: each undeclared identifier is reported only once for each function
>  it appears in
> ...
> 
> 
> Is it necessary to use Windows sockets directly or is it possible
> to use something similar to Linux sockets? Cygwin supports sockets
> (based on Windows sockets as far as I know) and very often provides
> interfaces similar to those of Linux. Which file is responsible for
> the selection of "opal_if_windows.c"?
> 

Re: [OMPI users] EXTERNAL: Re: Problems with shared libraries while launching jobs

2012-12-18 Thread Reuti
Am 17.12.2012 um 16:42 schrieb Blosch, Edwin L:

> Ralph,
>  
> Unfortunately I didn’t see the ssh output.  The output I got was pretty much 
> as before.
>  
> You know, the fact that the error message is not prefixed with a host name 
> makes me think it could be happening on the host where the job is placed by 
> PBS. If there is something wrong in the user environment prior to mpirun, 
> that is not an OpenMPI problem. And yet, in one of the jobs that failed, I 
> have also printed out the results of ‘ldd’ on the mpirun executable just prior 
> to executing the command, and all the shared libraries were resolved:

You checked mpirun, but not orted, which is the binary that cannot find 
Intel's "libimf.so". Is the libimf.so from the Intel redistributable archive 
present on all nodes?
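
A minimal sketch of that check, assuming a POSIX shell and that ldd is
available on the nodes. The name missing_libs is a hypothetical helper, not
part of Open MPI; the orted path and the node name in the comments are the
ones from the log quoted below.

```shell
#!/bin/sh
# Print the shared libraries that the dynamic linker cannot resolve
# for a given binary. missing_libs is a hypothetical helper name.
missing_libs() {
    # ldd marks every unresolved dependency with "not found"
    ldd "$1" 2>/dev/null | grep 'not found'
}

# Example, using the path from the error message in this thread:
#   missing_libs /release/cfd/openmpi-intel/bin/orted
# To see what a remotely started daemon would see, run ldd through ssh:
#   ssh c6n39 ldd /release/cfd/openmpi-intel/bin/orted | grep 'not found'
```

Running it through ssh matters because a non-interactive ssh session may not
have the same LD_LIBRARY_PATH as the login shell in which mpirun was checked.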

-- Reuti


>  
> ldd /release/cfd/openmpi-intel/bin/mpirun
> linux-vdso.so.1 =>  (0x7fffbbb39000)
> libopen-rte.so.0 => /release/cfd/openmpi-intel/lib/libopen-rte.so.0 
> (0x2abdf75d2000)
> libopen-pal.so.0 => /release/cfd/openmpi-intel/lib/libopen-pal.so.0 
> (0x2abdf7887000)
> libdl.so.2 => /lib64/libdl.so.2 (0x2abdf7b39000)
> libnsl.so.1 => /lib64/libnsl.so.1 (0x2abdf7d3d000)
> libutil.so.1 => /lib64/libutil.so.1 (0x2abdf7f56000)
> libm.so.6 => /lib64/libm.so.6 (0x2abdf8159000)
> libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x2abdf83af000)
> libpthread.so.0 => /lib64/libpthread.so.0 (0x2abdf85c7000)
> libc.so.6 => /lib64/libc.so.6 (0x2abdf87e4000)
> libimf.so => /appserv/intel/Compiler/11.1/072/lib/intel64/libimf.so 
> (0x2abdf8b42000)
> libsvml.so => /appserv/intel/Compiler/11.1/072/lib/intel64/libsvml.so 
> (0x2abdf8ed7000)
> libintlc.so.5 => 
> /appserv/intel/Compiler/11.1/072/lib/intel64/libintlc.so.5 
> (0x2abdf90ed000)
> /lib64/ld-linux-x86-64.so.2 (0x2abdf73b1000)
>  
> Hence my initial assumption that the shared-library problem was happening 
> with one of the child processes on a remote node.
>  
> So at this point I have more questions than answers.  I still don’t know if 
> this message comes from the main mpirun process or one of the child 
> processes, although it seems that it should not be the main process because 
> of the output of ldd above.
>  
> Any more suggestions are welcomed of course.
>  
> Thanks
>  
>  
> /release/cfd/openmpi-intel/bin/mpirun --machinefile 
> /var/spool/PBS/aux/20804.maruhpc4-mgt -np 160 -x LD_LIBRARY_PATH -x 
> MPI_ENVIRONMENT=1 --mca plm_base_verbose 5 --leave-session-attached 
> /tmp/fv420804.maruhpc4-mgt/test_jsgl -v -cycles 1 -ri restart.5000 -ro 
> /tmp/fv420804.maruhpc4-mgt/restart.5000
>  
> [c6n38:16219] mca:base:select:(  plm) Querying component [rsh]
> [c6n38:16219] mca:base:select:(  plm) Query of component [rsh] set priority 
> to 10
> [c6n38:16219] mca:base:select:(  plm) Selected component [rsh]
> Warning: Permanently added 'c6n39' (RSA) to the list of known hosts.
> Warning: Permanently added 'c6n40' (RSA) to the list of known hosts.
> Warning: Permanently added 'c6n41' (RSA) to the list of known hosts.
> Warning: Permanently added 'c6n42' (RSA) to the list of known hosts.
> Warning: Permanently added 'c5n26' (RSA) to the list of known hosts.
> Warning: Permanently added 'c3n20' (RSA) to the list of known hosts.
> Warning: Permanently added 'c4n10' (RSA) to the list of known hosts.
> Warning: Permanently added 'c4n40' (RSA) to the list of known hosts.
> /release/cfd/openmpi-intel/bin/orted: error while loading shared libraries: 
> libimf.so: cannot open shared object file: No such file or directory
> --
> A daemon (pid 16227) died unexpectedly with status 127 while attempting
> to launch so we are aborting.
>  
> There may be more information reported by the environment (see above).
>  
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --
> --
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --
> Warning: Permanently added 'c3n27' (RSA) to the list of known hosts.
> --
> mpirun was unable to cleanly terminate the daemons on the nodes shown
> below. Additional manual cleanup may be required - please refer to
> the "orte-clean" tool for assistance.
> --
> c6n39 - daemon did not report back 

[OMPI users] openmpi-1.9a1r27674 on Cygwin-1.7.17

2012-12-18 Thread Siegmar Gross
Hi,

I tried to install openmpi-1.9a1r27674 on Cygwin-1.7.17 and
got the following error (gcc-4.5.3).

...
  CC   path.lo
../../../openmpi-1.9a1r27668/opal/util/path.c: In function
  'opal_path_df':
../../../openmpi-1.9a1r27668/opal/util/path.c:578:18: error:
  'buf' undeclared (first use in this function)
../../../openmpi-1.9a1r27668/opal/util/path.c:578:18: note:
  each undeclared identifier is reported only once for each
  function it appears in
Makefile:1669: recipe for target `path.lo' failed
make[3]: *** [path.lo] Error 1
...


The reason is that "buf" is only declared for some operating
systems. I added "defined(__CYGWIN__)" in some places and
was able to compile "path.c".


hermes util 41 diff path.c path.c.orig
452c452
< #elif defined(__linux__) || defined(__CYGWIN__) ||
  defined (__BSD) || (defined(__APPLE__) && defined(__MACH__))
---
> #elif defined(__linux__) || defined (__BSD) ||
  (defined(__APPLE__) && defined(__MACH__))
480c480
< #elif defined(__linux__) || defined(__CYGWIN__) ||
  defined (__BSD) || (defined(__APPLE__) && defined(__MACH__))
---
> #elif defined(__linux__) || defined (__BSD) ||
  (defined(__APPLE__) && defined(__MACH__))
517c517
< #elif defined(__linux__) || defined(__CYGWIN__)
---
> #elif defined(__linux__)
549c549
< #elif defined(__linux__) || defined(__CYGWIN__) ||
  defined (__BSD) || \
---
> #elif defined(__linux__) || defined (__BSD) ||\
562c562
< #elif defined(__linux__) || defined (__CYGWIN__) ||
  defined (__BSD) ||   \
---
> #elif defined(__linux__) || defined (__BSD) ||\
hermes util 42 
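
Preprocessor macro names are case-sensitive, so the exact spelling used in
the guards above matters. A minimal sketch for checking which platform
macros a compiler really predefines, assuming gcc (or a compatible compiler
named in CC) is on the path. The name platform_macros is a hypothetical
helper.

```shell
#!/bin/sh
# Show the platform macros the compiler predefines.
# platform_macros is a hypothetical helper; CC defaults to gcc.
platform_macros() {
    # -dM -E preprocesses an empty C input and dumps all predefined macros;
    # grep keeps only the platform macros used in the guards of path.c
    "${CC:-gcc}" -dM -E -x c /dev/null |
        grep -E '__(linux|CYGWIN|APPLE|MACH)__'
}

# Example:
#   platform_macros
```

A guard only takes effect for a macro that shows up in this list with
exactly the same spelling.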


Searching for "__linux__" turned up some more files which
may need to be adapted.

opal/config/opal_check_os_flavors.m4
opal/mca/event/libevent2019/libevent/buffer.c
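
A minimal sketch of such a search, assuming a POSIX shell with a grep that
supports -r. It lists the files that test for __linux__ but never mention
__CYGWIN__, which are only candidates for a closer look. The name
linux_only_files is a hypothetical helper, and the directory in the example
is the source tree from this report.

```shell
#!/bin/sh
# List files below a directory that mention __linux__ but not __CYGWIN__.
# linux_only_files is a hypothetical helper name.
linux_only_files() {
    grep -rl '__linux__' "$1" | while IFS= read -r f; do
        # Keep the file only if it has no __CYGWIN__ guard at all
        grep -q '__CYGWIN__' "$f" || printf '%s\n' "$f"
    done
}

# Example:
#   linux_only_files openmpi-1.9a1r27674/opal
```

The result still has to be checked by hand, because many of the hits are
specific to Linux and need no Cygwin branch.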


I assume that the following files do not need any changes
because they are specific to Linux or to features which
are not important/available for Cygwin.


configure:{ $as_echo "$as_me:${as_lineno-$LINENO}:
  checking __linux__" >&5
configure:$as_echo_n "checking __linux__... " >&6; }
configure:#ifndef __linux__
configure:  error: this isnt __linux__

test/util/opal_path_nfs.c

opal/asm/base/MIPS.asm:#ifdef __linux__
opal/asm/generated/atomic-mips64el.s:#ifdef __linux__
opal/asm/generated/atomic-mips64-linux.s:#ifdef __linux__
opal/asm/generated/atomic-mips-irix.s:#ifdef __linux__
opal/asm/generated/atomic-mips-linux.s:#ifdef __linux__

ompi/mca/common/verbs/common_verbs_basics.c:#if defined(__linux__)
opal/include/opal/sys/cma.h:#ifdef __linux__
opal/mca/memory/linux/arena.c:#ifdef __linux__

ompi/mca/io/romio/romio/configure:#ifdef __linux__
ompi/mca/io/romio/romio/configure.in:#ifdef __linux__
opal/include/opal/sys/mips/atomic.h:#ifdef __linux__
opal/include/opal/sys/mips/atomic.h:#ifdef __linux__
opal/include/opal/sys/mips/atomic.h:#ifdef __linux__
opal/mca/event/libevent2019/libevent/arc4random.c:#ifdef __linux__

ompi/mca/io/romio/romio/adio/ad_lustre/ad_lustre.h:#ifdef __linux__
ompi/mca/io/romio/romio/adio/ad_lustre/ad_lustre.h:#endif /* __linux__ */


Can somebody add __CYGWIN__ to all necessary files? Now I get
the following error.

...
Making all in mca/if/windows
make[2]: Entering directory
  `/home/Admin/openmpi/openmpi-1.9-Cygwin.x86.32_gcc/opal/mca/if/windows'
  CC   opal_if_windows.lo
../../../../../openmpi-1.9a1r27674/opal/mca/if/windows/opal_if_windows.c:
  In function 'if_windows_open':
../../../../../openmpi-1.9a1r27674/opal/mca/if/windows/opal_if_windows.c:58:5:
  error: 'SOCKET' undeclared (first use in this function)
../../../../../openmpi-1.9a1r27674/opal/mca/if/windows/opal_if_windows.c:58:5:
  note: each undeclared identifier is reported only once for each function
  it appears in
...


Is it necessary to use Windows sockets directly or is it possible
to use something similar to Linux sockets? Cygwin supports sockets
(based on Windows sockets as far as I know) and very often provides
interfaces similar to those of Linux. Which file is responsible for
the selection of "opal_if_windows.c"?

I added the following constants to /usr/include/cygwin/shm.h before
I started to build openmpi-1.9a1r27674.

diff /usr/include/cygwin/shm.h /usr/include/cygwin/shm.h.orig
29,34d28
< /* Permission definitions   */
< #define SHM_R   0400/* read permission  */
< #define SHM_W   0200/* write permission */
< 

I used the following commands to configure Open MPI.
"/usr/local/jdk1.7.0" is a link to my Java installation
on Windows 7.

cd /usr/local
ln -s /cygdrive/c/Program\ Files\ \(x86\)/jdk1.7.0 jdk1.7.0


../openmpi-1.9a1r27674/configure --prefix=/usr/local/openmpi-1.9 \
  --with-jdk-bindir=/usr/local/jdk1.7.0/bin \
  --with-jdk-headers=/usr/local/jdk1.7.0/include \
  JAVA_HOME=/usr/local/jdk1.7.0 \
  LDFLAGS="-m32 -Wl,--export-all-symbols -no-undefined" \
  CC="gcc" CXX="g++" FC="gfortran" \
  CFLAGS="-m32" CXXFLAGS="-m32" FCFLAGS="-m32" \
  CPP="cpp" CXXCPP="cpp" \
  CPPFLAGS="" CXXCPPFLAGS="" \
  C_INCL_PATH="" C_INCLUDE_PATH=""