Re: [OMPI devel] Missing Symbol

2010-03-06 Thread Leonardo Fialho
Terry, mca_pml_v is declared in a .so, and at load time it should export the symbol. But this component loads other modules, such as mca_vprotocol_pessimist or mca_vprotocol_receiver (in my case). The symbol is declared in pml_v.c, which acts as a pseudo-framework loading other componen

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Jeff Squyres
We already use global symbols; mca_base_component_repository.c invokes: if (lt_dladvise_global(&opal_mca_dladvise)) { return OPAL_ERROR; } On Mar 5, 2010, at 6:18 PM, George Bosilca wrote: > Unfortunately this will not fix his issues ;( I'm pretty sure that his problem > is relat

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Jeff Squyres
On Mar 5, 2010, at 6:02 PM, Jeff Squyres (jsquyres) wrote: > I wondered aloud on IM to Terry after your earlier emails if we should just > custom-patch ltdl in OMPI to fix this issue. The problem is that libltdl is > effectively reporting the "wrong" error back to OMPI, so the error string > t

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread George Bosilca
Unfortunately this will not fix his issues ;( I'm pretty sure that his problem is related to the fact that mca_pml_v is exported by another dynamic module, and therefore not available via dlsym. I don't think there is a simple solution for this problem, except going back to GLOBAL symbols. geor
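
The GLOBAL versus local distinction discussed in this thread can be tried out with a minimal sketch. It uses Python's ctypes on a POSIX system, not Open MPI's C code or libltdl; the helper names open_library and has_symbol are hypothetical and exist only for illustration.

```python
import ctypes

def open_library(path, make_global):
    # RTLD_GLOBAL makes the library's symbols available to objects loaded
    # later; RTLD_LOCAL keeps them reachable only through this handle.
    mode = ctypes.RTLD_GLOBAL if make_global else ctypes.RTLD_LOCAL
    return ctypes.CDLL(path, mode=mode)

def has_symbol(handle, name):
    # ctypes performs a dlsym-style lookup on attribute access and raises
    # AttributeError when the symbol cannot be found.
    try:
        getattr(handle, name)
    except AttributeError:
        return False
    return True

# Passing None requests the handle of the running program (POSIX only).
# A lookup through it sees the executable, its load-time dependencies,
# and any library that was opened with RTLD_GLOBAL.
process = open_library(None, make_global=True)
print(has_symbol(process, "malloc"))
```

A symbol that lives only in a library opened with RTLD_LOCAL would not be found through the process handle, which is the situation George describes for mca_pml_v.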

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Jeff Squyres
Ick. I wondered aloud on IM to Terry after your earlier emails if we should just custom-patch ltdl in OMPI to fix this issue. The problem is that libltdl is effectively reporting the "wrong" error back to OMPI, so the error string that we get to print out ends up not being very useful (e.g.,

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Terry Dontje
Have you found the symbol being exposed by another .so (i.e., have you done an nm on the .so that shows the symbol)? And are you sure that .so is loaded by the time your .so is being loaded? --td Leonardo Fialho wrote: No George, this trick does not change the problem. I'm looking for the proble

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread George Bosilca
Because I guess it is declared by another module loaded dynamically at runtime. As libtool does not load the symbols into the global scope, mca_pml_v will not be visible to other modules trying to use it. george. On Mar 5, 2010, at 14:35 , Leonardo Fialho wrote: > No George, this trick does not

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Leonardo Fialho
No George, this trick does not change the problem. I'm looking for the problem in the mca_pml_v declaration, but I still can't figure out the reason why it doesn't work. Leonardo On Mar 5, 2010, at 8:12 PM, George Bosilca wrote: > I would first try the Open MPI configure option --disable-visib

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread George Bosilca
I would first try the Open MPI configure option --disable-visibility. If this doesn't fix it, you should make sure that dlopen is called with the GLOBAL flag on (I don't remember where exactly in the code and unfortunately I can't check right now). Use gdb and set a breakpoint on dlopen and you wi

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Leonardo Fialho
Yeah, probably ompi_request_null and opal_output are not good candidates. I'm trying with mca_pml_v. But I'm not familiar with this framework, although it is really small. George, you said to change this (opal/mca/base/mca_base_component_find.c): #if OPAL_HAVE_LTDL_ADVISE component_handle

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Terry Dontje
Leonardo Fialho wrote: Yes, I renamed all references to Aurelien's component name and removed all code related to the component itself. There are only functions which return OMPI_SUCCESS. No other function is called. I'm debugging with LD_DEBUG=symbols, but the output is really huge! Proba

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread George Bosilca
This might be an issue with the [new] way libtool loads the symbols, i.e., in a private space and not in a global one. Try turning off the visibility feature and see if you get the same error. george. On Mar 5, 2010, at 13:47 , Terry Dontje wrote: > I would also start nm'ing the .so's you thi

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Terry Dontje
I would also start nm'ing the .so's you think the U symbols are resolved in to make sure they are exposed. Luckily you only have 3 symbols to look for. --td Ralph Castain wrote: It's probably a visibility issue - check for an OMPI_DECLSPEC missing from the declaration of a symbol. On Mar 5
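
As a minimal sketch of the nm check suggested in this thread, the following filters nm output for undefined ('U') entries. The sample text is illustrative rather than the actual output from this thread: its symbol names are borrowed from messages here, and example_defined_function and the helper undefined_symbols are hypothetical.

```python
def undefined_symbols(nm_output):
    """Return the names that nm marks as undefined (type 'U'), in order."""
    names = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols have three fields (address, type, name);
        # undefined ones have no address, so only type and name remain.
        if len(fields) == 2 and fields[0] == "U":
            names.append(fields[1])
    return names

# Illustrative sample only; not real output from mca_vprotocol_receiver.so.
SAMPLE_NM_OUTPUT = """\
00201208 a _DYNAMIC
00000a10 T example_defined_function
         U mca_pml_v
         U ompi_request_null
         U opal_output
"""

print(undefined_symbols(SAMPLE_NM_OUTPUT))
```

Each name returned this way is a candidate to look for, with nm again, in the .so that is expected to define and export it.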

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Ralph Castain
It's probably a visibility issue - check for an OMPI_DECLSPEC missing from the declaration of a symbol. On Mar 5, 2010, at 11:40 AM, Leonardo Fialho wrote: > Yes, > > I renamed all references to Aurelien's component name and removed all code > related to the component itself. There are only

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Leonardo Fialho
Yes, I renamed all references to Aurelien's component name and removed all code related to the component itself. There are only functions which return OMPI_SUCCESS. No other function is called. I'm debugging with LD_DEBUG=symbols, but the output is really huge! Probably the error is in the

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Ralph Castain
You said this component was a copy of Aurelien's component? Did you rename the critical elements (e.g., component, module) inside it to avoid name confusion? On Mar 5, 2010, at 11:27 AM, Leonardo Fialho wrote: > I see... but it is really strange because this module is clean, it does not > use n

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Leonardo Fialho
I see... but it is really strange because this module is clean, it does not use anything. This is the output of the nm command, I can't see any symbol which is not available. [lfialho@aoclsb-clus openmpi]$ nm mca_vprotocol_receiver.so 00201208 a _DYNAMIC 00201408 a _GLOBAL_OFFSET

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Terry Dontje
Sorry meant to add this, but you might be able to try and find the symbol causing the issue by twiddling with LD_DEBUG --td Terry Dontje wrote: Possibly there is an external symbol in the .so that is being loaded that cannot be resolved. --td Leonardo Fialho wrote: Hi, I know that libtool do

Re: [OMPI devel] Missing Symbol

2010-03-05 Thread Terry Dontje
Possibly there is an external symbol in the .so that is being loaded that cannot be resolved. --td Leonardo Fialho wrote: Hi, I know that libtool does not help us to find the source of this error, but, what can generate the following error? [aoclsb-clus.uab.es:11724] mca: base: component_fi

Re: [OMPI devel] Missing symbol with DR PML

2008-03-11 Thread Andrew Friedley
Thanks George. Andrew George Bosilca wrote: I guess that, now that we have visibility turned on by default, this symbol is missing as it is not exported. Commit 17810 should fix the problem. Thanks, george. On Mar 11, 2008, at 2:03 PM, Andrew Friedley wrote: I'm running several differ

Re: [OMPI devel] Missing symbol with DR PML

2008-03-11 Thread George Bosilca
I guess that, now that we have visibility turned on by default, this symbol is missing as it is not exported. Commit 17810 should fix the problem. Thanks, george. On Mar 11, 2008, at 2:03 PM, Andrew Friedley wrote: I'm running several different trunk checkouts and I always see this er