Another thing to try: go to the lib subdirectory of your installation location ($prefix/lib) and delete everything in it. Then go back to the directory where you built the software and run "make install" again.

Sometimes, especially if you are upgrading to a new version, you can be burned by stale shared libraries. This sounds like it could be the problem here. We don't remove any old libraries when you do an installation, so if you change versions, you really should do this procedure to avoid picking up "old stuff".
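The cleanup described above might look roughly like the sketch below. The helper name clean_lib_dir and the example prefix are placeholders invented for illustration, not anything shipped with Open MPI; substitute the prefix you actually configured with.

```shell
# Hypothetical helper (not part of Open MPI): empty $prefix/lib so that
# a fresh "make install" cannot pick up libraries from an older version.
clean_lib_dir() {
    # Refuse to run without a prefix argument, so an empty argument can
    # never turn the rm below into "rm -rf /lib/*".
    prefix="${1:?usage: clean_lib_dir PREFIX}"
    libdir="$prefix/lib"
    if [ ! -d "$libdir" ]; then
        echo "no such directory: $libdir" >&2
        return 1
    fi
    # Remove the contents but keep the lib directory itself.
    rm -rf "${libdir:?}"/*
}

# Example, assuming the install prefix is /opt/openmpi (an assumption):
#   clean_lib_dir /opt/openmpi
#   cd /path/to/openmpi-source && make install
```

This has to be repeated on every node that has its own copy of the installation; a prefix on a shared filesystem only needs it once.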

Alternatively, you could build and run without shared libraries to avoid this problem altogether - just reconfigure with "--enable-static --disable-shared" and then do "make clean all install".
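As a rough sketch, the static rebuild could be wrapped up as follows. The function name rebuild_static and the prefix argument are assumptions for illustration; the two configure flags and the make targets are the ones named above, and --prefix is the standard configure option for choosing the install location.

```shell
# Hypothetical wrapper around the commands named above; run it from the
# top of the Open MPI source tree. The prefix argument is an assumption.
rebuild_static() {
    prefix="${1:?usage: rebuild_static PREFIX}"
    # Only run make if configure succeeded.
    ./configure --prefix="$prefix" --enable-static --disable-shared &&
        make clean all install
}

# Example, assuming the install prefix is /opt/openmpi (an assumption):
#   rebuild_static /opt/openmpi
```

Applications then need to be relinked against the static libraries before they are rerun.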

Ralph


Brian Barrett wrote:
Well, so much for the easy one :(.

Is it possible that you have two versions of Open MPI in your path  
somewhere and that you might be getting different versions on  
different nodes?  The errors below generally indicate that data was  
received in a totally different format than expected, so I'm just  
kind of guessing as to how one could get to that situation...
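One way to check for a second installation is to ask each node which mpirun it resolves and what its search paths look like. The sketch below assumes passwordless ssh to the nodes; show_mpi_env is a made-up helper name.

```shell
# Hypothetical helper: print which mpirun each node resolves, plus the
# search paths that a non-interactive ssh login sees on that node.
show_mpi_env() {
    for node in "$@"; do
        echo "== $node =="
        # Single quotes: the variables are expanded on the remote node.
        ssh "$node" 'command -v mpirun; echo "PATH=$PATH"; echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"'
    done
}

# Example with the two nodes from this thread:
#   show_mpi_env s4 s7
```

The non-interactive environment is the interesting one here, since it can differ from what you see in a login shell on the same node.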

Brian

On Apr 21, 2006, at 5:01 PM, Manjunath G Venkata wrote:

  
On Thu, 20 Apr 2006, Brian Barrett wrote:

    
Are both nodes the same architecture? Those errors look suspiciously
like what happens when you mix 32-bit and 64-bit, or little-endian
and big-endian, machines.
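A quick way to compare the nodes is to print the machine type and word size on each one. This is only a sketch: describe_arch is a made-up helper name, and it reports what the operating system says, not how the Open MPI binaries themselves were compiled.

```shell
# Hypothetical helper: summarize the local machine type and word size.
describe_arch() {
    # getconf LONG_BIT prints 32 or 64 where it is available; fall back
    # to "unknown" on systems that lack it.
    bits=$(getconf LONG_BIT 2>/dev/null || echo unknown)
    echo "machine=$(uname -m) bits=$bits"
}

# Example: run the same two commands on each node from this thread.
#   for n in s4 s7; do ssh "$n" 'uname -m; getconf LONG_BIT'; done
```

If the lines differ between nodes, that would point toward the mixed-architecture explanation.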

      
- Both my nodes are Intel Xeons and run Linux 2.4.26.

-Manjunath

    
Brian

On Apr 20, 2006, at 8:53 PM, Galen M. Shipman wrote:

      
Hey Guys,
Not sure what is going on here. Has anyone seen this before?
- Galen
        
Hi Galen,
Sorry to bother you.
I have installed the latest stable version of Open MPI (1.0) on two of
the spider nodes (s7, s4) for some experiments, but there seems to be a
configuration error or something else that I don't understand. After
installing, as a test I ran a simple MPI program, but it fails with the
following errors:
[s4:10685] [0,0,0] ORTE_ERROR_LOG: Pack data mismatch in file dps_unpack.c at line 121
[s4:10685] [0,0,0] ORTE_ERROR_LOG: Pack data mismatch in file dps_unpack.c at line 95
Further digging with gdb prints the following errors:
[s7:07005] ERROR: A daemon on node s4 failed to start as expected.
[s7:07005] ERROR: There may be more information available from
[s7:07005] ERROR: the remote shell (see above).
[s7:07005] The daemon received a signal 5.
[s7:07005] [0,0,0] ORTE_ERROR_LOG: Pack data mismatch in file dps_unpack.c at line 121
[s7:07005] [0,0,0] ORTE_ERROR_LOG: Pack data mismatch in file dps_unpack.c at line 95
[s7:07005] [0,0,0] ORTE_ERROR_LOG: Pack data mismatch in file dps_unpack.c at line 121
[s7:07005] [0,0,0] ORTE_ERROR_LOG: Pack data mismatch in file dps_unpack.c at line 95
Any clue as to what I am doing wrong?
thanks,
-Manjunath
          
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
        

  
