Re: [OMPI devel] Pack data mismatch in file dps_unpack.c 95/121

2006-04-22 Thread Brian Barrett

Well, so much for the easy one :(.

Is it possible that you have two versions of Open MPI in your path  
somewhere and that you might be getting different versions on  
different nodes?  The errors below generally indicate that data was  
received in a totally different format than expected, so I'm just  
kind of guessing as to how one could get to that situation...


Brian
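
A minimal sketch of the "different versions on different nodes" check, in Python. It assumes you have already collected, for each node, the path that the node resolves for an Open MPI executable (for example by running `which orted` on each node by hand). The function names and the example paths are hypothetical and only for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def group_nodes_by_install(reported: Dict[str, str]) -> Dict[str, List[str]]:
    """Group node names by the install path each node reported."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for node, install in sorted(reported.items()):
        # Strip whitespace so a trailing newline does not look like a difference.
        groups[install.strip()].append(node)
    return dict(groups)


def installs_agree(reported: Dict[str, str]) -> bool:
    """Return True when every node reported the same install path."""
    return len(group_nodes_by_install(reported)) <= 1


# Hypothetical paths, not taken from the original report.
example = {
    "s4": "/usr/local/bin/orted\n",
    "s7": "/opt/openmpi-1.0/bin/orted\n",
}
```

If `installs_agree(example)` returns False, the groups returned by `group_nodes_by_install(example)` show which nodes resolve to which installation.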

On Apr 21, 2006, at 5:01 PM, Manjunath G Venkata wrote:


On Thu, 20 Apr 2006, Brian Barrett wrote:

Are both of these nodes the same architecture?  Those errors look
suspiciously like what happens when you try to mix 32-bit and 64-bit,
or little-endian and big-endian, machines.
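
To illustrate why mixing byte order or word size tends to show up as a data mismatch, here is a small Python sketch. It is not Open MPI's DPS code: the helper `read_int32` and the one-integer wire layout are assumptions made up for this example.

```python
import struct


def read_int32(buf: bytes, byte_order: str) -> int:
    """Decode one 32-bit signed integer; byte_order is '<' (little) or '>' (big)."""
    fmt = byte_order + "i"
    expected = struct.calcsize(fmt)
    if len(buf) != expected:
        # Hypothetical check standing in for a "data mismatch" style error.
        raise ValueError(f"expected {expected} bytes, got {len(buf)}")
    return struct.unpack(fmt, buf)[0]


# A sender that uses big-endian byte order packs the value 1 as 32 bits.
big_endian_wire = struct.pack(">i", 1)

# A receiver that assumes little-endian decodes a very different number.
misread = read_int32(big_endian_wire, "<")

# A sender that packs the same value as 64 bits produces 8 bytes, not 4,
# so read_int32(wide_wire, ">") raises ValueError.
wide_wire = struct.pack(">q", 1)
```

In both cases the bytes on the wire are valid for the sender, but the receiver interprets them under different assumptions, which is the general situation described above.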



- Both of my nodes are Intel Xeons and run Linux 2.4.26.

-Manjunath


Brian

On Apr 20, 2006, at 8:53 PM, Galen M. Shipman wrote:


Hey Guys,
Not sure what is going on here. Has anyone seen this before?
- Galen

Hi Galen,
Sorry to bother you.
I have installed the latest stable version of Open MPI (1.0) on two of the
spider nodes (s7, s4) for some experiments, but there seems to be a
configuration error or something else that I don't understand. After
installing, as a test I ran a simple MPI program, but it complains with the
following errors:
[s4:10685] [0,0,0] ORTE_ERROR_LOG: Pack data mismatch in file dps_unpack.c at line 121
[s4:10685] [0,0,0] ORTE_ERROR_LOG: Pack data mismatch in file dps_unpack.c at line 95
Further digging with gdb prints the following errors:
[s7:07005] ERROR: A daemon on node s4 failed to start as expected.
[s7:07005] ERROR: There may be more information available from
[s7:07005] ERROR: the remote shell (see above).
[s7:07005] The daemon received a signal 5.
[s7:07005] [0,0,0] ORTE_ERROR_LOG: Pack data mismatch in file dps_unpack.c at line 121
[s7:07005] [0,0,0] ORTE_ERROR_LOG: Pack data mismatch in file dps_unpack.c at line 95
[s7:07005] [0,0,0] ORTE_ERROR_LOG: Pack data mismatch in file dps_unpack.c at line 121
[s7:07005] [0,0,0] ORTE_ERROR_LOG: Pack data mismatch in file dps_unpack.c at line 95
Any clue what I am doing wrong?
Thanks,
-Manjunath

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel






Re: [OMPI devel] Pack data mismatch in file dps_unpack.c 95/121

2006-04-22 Thread Ralph Castain




Another thing to try - go to your installation location's lib
subdirectory (at $prefix/lib) and delete everything that is there. Then
go back to the directory where you put the software and do a "make
install" again.

Sometimes, especially if you are upgrading to a new version, you can be
burned by stale shared libraries. This sounds like it could be the
problem here. We don't remove any old libraries when you do an
installation, so if you change versions, you really should do this
procedure to avoid picking up "old stuff".

Alternatively, you could build and run without shared libraries to
avoid this problem altogether - just reconfigure with "--enable-static
--disable-shared" and then do "make clean all install".

Ralph
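
One way to look for leftovers from an earlier install is to list the files in the lib directory whose modification time predates the most recent "make install". The following Python sketch assumes that modification times are a usable signal on your system and that you know roughly when the last install ran. The function name `files_older_than` and the cutoff parameter are hypothetical, and the sketch only reports files; it does not delete anything.

```python
import os
from typing import List


def files_older_than(lib_dir: str, cutoff: float) -> List[str]:
    """Return names of regular files in lib_dir last modified before cutoff.

    cutoff is a POSIX timestamp, e.g. the time of the latest install.
    Symlinks and subdirectories are skipped.
    """
    stale = []
    for entry in os.scandir(lib_dir):
        if not entry.is_file(follow_symlinks=False):
            continue
        if entry.stat(follow_symlinks=False).st_mtime < cutoff:
            stale.append(entry.name)
    return sorted(stale)
```

Any file this reports in $prefix/lib was not rewritten by the latest install and is a candidate for the cleanup described above.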


