Michael Felt added the comment:

First, my apologies for not responding earlier. I had other things to work on (real-life things), the customers who were interested in a fix for find_library() have indicated they no longer are, and there is also my personal issue: I was becoming disillusioned with the lack of progress in the discussion.
Clearly, the detail of your comments proves me wrong on that final point. I will be more attentive.

Jumping to your last comment:

> Uhm, ultimative solution feels complex already, while still some things to
> decide...

So, rather than try for perfection in one go - set some priorities.

IMHO (emphasis on the H) - having find_library return something useful, even if not right 100% of the time - is better than wrong 100% of the time.
- this concerns find_library(). Your "find" of the routine "loadquery()" may be a big improvement over what I found first.

FYI - while you show links to AIX 7.2 documentation I am, as a 'packager', still trying, for the time being, to package on AIX 5.3. I know AIX 5.3 is no longer supported, but as long as something such as RBAC intelligence is not needed I prefer to package only once - and let AIX upwards binary compatibility do its thing.

Back to the H from above: loadquery() is an enhancement - and in previous discussions about my patch, such enhancements were frowned upon (i.e., it seemed that anything that would consider or include LD_LIBRARY_PATH or LIBPATH would not be considered - so even something that would examine the environment, i.e., an exported LD_LIBRARY_PATH or LIBPATH, was not acceptable).

However, I concur that loadquery() using *L_GETLIBPATH* can be expected to be more efficient (and easier to read?) than my earlier attempts to get this info using "dump -H".

More comments (a few) below.

On 20/02/2017 14:37, Michael Haubenwallner wrote:
> Michael Haubenwallner added the comment:
>
> On 02/03/2017 09:52 PM, Michael Felt wrote:
>>> Anyway:
>>> Unfortunately, there is no concept of embedding something like ELF's
>>> DT_SONAME tag into the Shared Object.
>>> The very (PATH,BASE,MEMBER) value as (specified to and) discovered by the
>>> linker is recorded into the just-linked executable (or Shared Object).
>>> This implies that the runtime loader does search for the very same filename
>>> (and member eventually) as the linker (at linktime).
>> I assume this is why there are many systems besides AIX that do not
>> support/use DT_SONAME.
> Except for Windows, I'm not sure which "many systems besides AIX" you're
> talking here about, that "do not use/support DT_SONAME".

Clearly, my assumption is wrong. I "grew up" on BSD Unix, not System V, and AIX seems (imho) to favor BSD in some aspects. I assume (emphasis on assume!) that soname is not part of the POSIX standard (at least not as early as 2005 - standards change).

>> At least I see many references to "Shared
>> Objects" libFOO.so.X.Y.Z, libFOO.so.X.Y, libFOO.so.X and libFOO.so (with
>> the latter three being symbolic links to the first).
> When a system happens to find these symlinks useful, then it actually _does_
> support embedding DT_SONAME (or something similar) into its binary file
> format.

What I have seen on AIX - in packaging by others, and from memory in what libtool does - is the following: the file libfoo.so.X.Y.Z gets created, and then the following symbolic links are created, each pointing at libfoo.so.X.Y.Z: libfoo.so.X.Y, libfoo.so.X and libfoo.so.

I also see the same "logic" in IBM-provided archives. I use libssl.a as my guide (for 'versioning'), but most of my 'issues' have been with supporting backwards compatibility with libintl.a - except that here they are not symbolic links. The same "file" is added to the archive, but with a different name - so if a "soname extension" was used (better: found during linking), it is used. The order in the archive is important. If the generic name is first, then that is the name that will be used during linking (and remembered for execution).
root@x064:[/usr/lib]lslpp -L | grep openssl.base
openssl.base 1.0.2.1000 C F Open Secure Socket Layer
root@x064:[/usr/lib]ar -Xany -tv libssl.a
rwxr-xr-x 537912/767508 726474 Oct 18 11:38 2016 libssl.so
rwxr-xr-x 537912/767508 726474 Oct 18 11:38 2016 libssl.so.1.0.0
rwxr-xr-x 537912/767508 510610 Oct 18 11:39 2016 libssl.so.0.9.8
rwxr-xr-x 537912/767508 823217 Oct 18 11:39 2016 libssl64.so
rwxr-xr-x 537912/767508 823217 Oct 18 11:39 2016 libssl64.so.1.0.0
rwxr-xr-x 537912/767508 577122 Oct 18 11:54 2016 libssl64.so.0.9.8

root@x064:[/usr/lib]lslpp -L aixtools.gnu.gettext.rte
Fileset Level State Type Description (Uninstaller)
----------------------------------------------------------------------------
aixtools.gnu.gettext.rte 0.19.8.1 C F built 21-Aug-2016 1821 UTC
root@x064:[/usr/lib]ar -Xany tv libintl.a
rwxr-xr-x 0/0 87530 Aug 21 16:45 2016 libintl.so.8
rwxr-xr-x 0/0 79727 Aug 21 18:17 2016 libintl.so.8
rw-r--r-- 0/0 3753 Jun 03 18:05 2017 bindtextdom.o
...
rw-r--r-- 0/0 2014 Jun 03 18:05 2017 textdomain.o
rwxr-xr-x 0/0 64001 Jun 03 18:05 2017 libintl.so.1

Here, to keep ancient legacy programs working, I have copied the 32-bit libintl.so.1 member (and put it at the end of the archive) that some programs are still using - sadly.

And another example - with libz.a - where historical moments can be important as well:
-- Here I overwrite the member provided by IBM (libz.so.1) - and do not provide libz.so - as this has been the convention - forever.

root@x064:[/usr/lib]ar -Xany -tv libz.a
rwxr-xr-x 0/0 174334 Jan 31 12:53 2017 libz.so.1
rwxr-xr-x 0/0 174334 Jan 31 12:53 2017 libz.so.1.2.11
rwxr-xr-x 0/1954 174270 Feb 02 18:55 2017 libz.so.1.2.10
r-xr-xr-x 0/1954 164528 Feb 02 18:55 2017 libz.so.1.2.8
rw-r--r-- 0/0 5565 Jan 31 12:53 2017 adler32.o
rw-r--r-- 0/0 13383 Jan 31 12:53 2017 crc32.o
...
rw-r--r-- 0/0 10973 Jan 31 12:53 2017 gzwrite.o
rwxr-xr-x 0/0 164632 Jan 31 12:55 2017 libz.so.1
rwxr-xr-x 0/1954 164632 Feb 02 18:43 2017 libz.so.1.2.11
rwxr-xr-x 0/1954 164594 Feb 02 18:53 2017 libz.so.1.2.10
r-xr-xr-x 0/1954 156861 Feb 02 18:53 2017 libz.so.1.2.8
rw-r--r-- 0/0 5249 Jan 31 12:55 2017 adler32.o
rw-r--r-- 0/0 13880 Jan 31 12:55 2017 crc32.o
...

So, getting back to your comment about it being "complex": maybe I understand that in details that few have cared to investigate.

And I am skipping over the cases where IBM uses the name shr.o for the shared member that needs to be loaded (which is what find_library("c") should return, as just one example), or shr_64.o when a 64-bit version of python is packaged.

(Example from AIX 5.3 TL7 - and I would expect to see the same on AIX 7.2.)
root@x064:[/usr/lib]ar -Xany -tv libc.a | grep shr
rwxr-xr-x 300/300 4154464 Sep 26 20:00 2007 shr.o
rwxr-xr-x 300/300 4516730 Sep 26 20:00 2007 shr_64.o

>> Another issue is support for what I believe MacOS calls "fat" objects -
>> that support both 32-bit and 64-bit applications - rather than /XXX/lib
>> for 32-bit objects and /XXX/lib/lib64 or /XXX/lib64 for 64-bit objects.
> Yes, the AIX Archive Libraries supporting different bitwidths for members is
> quite similar to MacOS fat objects.
> However - although related, the creation of "fat" AIX archives is a different
> topic.
> But yes, Python ctypes.find_library+dlopen should know how to deal with them.

I am not asking for steps on how to create them - that is my 'private' problem as a packager. I can tell you it entails building the package twice - plus some additional steps.

>
>> b) One of the difficulties I faced is trying to guess what version -lFOO
>> should find when there is more than one version available.
> Exactly. There is an idea below (the symbol->member map).
I looked at that - and my expectation is: wonderful, but not for 2.7, as it is a different behavior than find_library() anno Python2.7 (or any version of Lib/ctypes for that matter) - and it should not be limited to AIX, but should deal with all platforms and how they provide multiple versions. I have seen packages that try to resolve that now by making, potentially, multiple calls to find_library and supplying specific values for libfoo.so.X. That is, I believe there are users (i.e., python application/module developers) that would make use of the added functionality.

However, for now I think it is simpler - on AIX at least - to 'assume' that libfoo.so is a symbolic link to libfoo.so.X[[.Y][.Z]] (hope I have the [] correctly and/or you understand my intent) - as that is the 'default' behavior of libtool. And, when the file or "member" libfoo.so does not exist, to go for the most specific .X.Y.Z available.

>>> But still, how to get ctypes.find_library() working - ideally for each
>>> variant, is another story. Right now it does not work for any variant,
>> Do you mean all systems, or specific to AIX - I am assuming you mean AIX.
> Yes - find_library currently does not work for any variant on *AIX*.

So, it is always up to the programmer to find a solution. And my observation is that they create an unmaintainable/unmanageable situation by extracting a member from an archive into /usr/lib and hard-coding the library name in a call to LoadLibrary(). Of course, if the archive they extracted from gets updated, their extracted member is not updated, etc.
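For illustration only, the fallback rule described above (use libfoo.so when it exists, otherwise take the most specific libfoo.so.X[.Y[.Z]]) might be sketched as follows. The helper name pick_member and the idea of passing in a ready-made list of member names are assumptions made for this sketch; they are not part of ctypes or of the patch under discussion.

```python
import re


def pick_member(name, members):
    """Pick an archive member (or file name) for library `name`.

    Hypothetical helper, not part of ctypes: prefer the generic
    lib<name>.so, otherwise the highest lib<name>.so.X[.Y[.Z]].
    """
    generic = "lib%s.so" % name
    if generic in members:
        return generic
    pattern = re.compile(r"^lib%s\.so\.(\d+(?:\.\d+)*)$" % re.escape(name))
    versioned = []
    for member in members:
        match = pattern.match(member)
        if match:
            # Compare versions numerically, so 1.2.10 sorts above 1.2.8
            # and 1.2.11 sorts above the shorter 1.
            version = tuple(int(part) for part in match.group(1).split("."))
            versioned.append((version, member))
    if not versioned:
        return None
    return max(versioned)[1]
```

With the libz.a member names listed earlier this picks libz.so.1.2.11, and with the libssl.a names it picks libssl.so. Legacy names such as shr.o, and the 32-bit versus 64-bit distinction, are deliberately left out of the sketch.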
If they have done enough research (which I have not seen) they could use, instead (part of my would-be 'patch' to Lib/ctypes/util.py):

        # load
        if sys.platform == "darwin":
            print cdll.LoadLibrary("libm.dylib")
            print cdll.LoadLibrary("libcrypto.dylib")
            print cdll.LoadLibrary("libSystem.dylib")
            print cdll.LoadLibrary("System.framework/System")
        elif sys.platform[:3] == "aix":
            from ctypes import CDLL
            RTLD_MEMBER = 0x00040000
            print CDLL("libc.a(shr.o)", RTLD_MEMBER)
        else:
            print cdll.LoadLibrary("libm.so")
            print cdll.LoadLibrary("libcrypt.so")
            print find_library("crypt")

And gives:
root@x064:[/data/prj/python/python-2.7.13.0/Lib/ctypes]../../python util.py
None
None
None
<CDLL 'libc.a(shr.o)', handle 10 at 300fe2f0>

So, I just want to point out - if find_library() was working, there is only a minor change to __init__.py needed (specifically, to add RTLD_MEMBER to the mode when the filename includes '(' and ends in ')').

In short, for AIX find_library() is broken. Period. It could be fixed so that code written as:

cdll.LoadLibrary(find_library("foo"))

would work cross-platform. (I do not have a mac to test on, but I expect find_library("m") returns 'libm.dylib'.)

FYI: On AIX, find_library("m") should return None - as there is no shared library in libm.a.

>>> but I guess that search algorithm should follow how the linker discovers
>>> the (PATH,BASE,MEMBER) values to
>> I am not a tool builder. My comments are based on observations and
>> experience from when I was a developer 25+ years ago. The AIX linker is
>> not interested in the member name - it seems to go through the
>> PATH/libBASE.a looking for the first object it can find to resolve a
>> symbol. The name of the object it finds becomes the MEMBER it records in
>> it's internal table of where to look later when the application runs.
> Exactly.
See above re: my observations re: current practice/behavior.

I could go through all of PEP 20 - but one of my goals was to have the AIX low-level details masked, so that code "written for any platform" such as:

    # leaving out the 'include statements' to show the 'core' objective
    solib = None
    solibname = find_library("foo")
    if solibname:
        solib = cdll.LoadLibrary(solibname)
    # optionally, there might also be an else block dealing with the
    # 'unexpected?' lack of solibname resolution.

would work.

    There should be one-- and preferably only one --obvious way to do it.
    Although that way may not be obvious at first unless you're Dutch.
    Now is better than never.
    Although never is often better than *right* now.
    If the implementation is hard to explain, it's a bad idea.
    If the implementation is easy to explain, it may be a good idea.

I would like to add - I may be American born, but I carry a Dutch passport when traveling in the EU. ;)

And, while *right* now may be too soon, remember that 'never' is - I hope in the spirit of PEP 20 - too late.

>
>>> write into just-linked executables, combined with how the runtime loader
>>> finds the Shared Object to actually load.
>> I worked on a patch - to do all that - taking into consideration the way
>> libtool names .so files/members and then looking into/at "legacy" aka
>> IBM dev ways they did things before the libtool model was so prominent.
>>
>> My algorithm - attempts to solve the (PATH, BASE, MEMBER) problem as
>> "dynamically" as possible. PATH and BASE are fairly straight forward -
>> but MEMBER is clearly more complex.
>>
>> PATH: start by looking at the python executable -
> As far as I can tell, any executable can actually link against the Python
> interpreter.
I am talking about packaging python as "stand-alone".

>> and looking at it's "blibpath" -
> There also is the loadquery() subroutine in AIX, see
> https://www.ibm.com/support/knowledgecenter/ssw_aix_72/com.ibm.aix.basetrf1/loadquery.htm
>
> loadquery(L_GETLIBPATH) "Returns the library path that was used at process
> exec time.",
> which includes both the environment variable LIBPATH (or LD_LIBRARY_PATH if
> LIBPATH is unset) and the executable's "blibpath" value.

Not having used loadquery(), I do not know if it also, by default, returns the default library path that the executable uses for its own (initial) libraries. E.g., when packaging I try to use /opt/lib rather than /usr/lib, as I do not want to conflict with or overwrite existing libraries. So, when I add -L/opt/lib, the XCOFF headers of python, the executable, include /opt/lib.

>> and using that as the default colon separated list of PATHs
> Question is if we do want to consider _current_ values of environment
> variable LIBPATH (or LD_LIBRARY_PATH) in addition to the "library path at
> process exec time"?

In Python 2.x, and in Python 3 prior to 3.6, it was ignored on Linux. I cannot comment on other platforms. On AIX I know that CDLL("libfoo.so") and CDLL("libfoo.a(member.o)", RTLD_MEMBER) follow LIBPATH (or, if LIBPATH is not defined, LD_LIBRARY_PATH if it is defined). In short, LoadLibrary() may find something different compared to what find_library returns.

>> to search for BASE.a archive. Once a BASE.a file is found it is examined
>> for a MEMBER. If all PATH/BASE.a do not find a potential MEMBER then the
>> PATHs are examined again for PATH/BASE.so.
> Erm, nope, the AIX linker has a different algorithm (for -lNAME):
> Iterating over the "library path", the first path entry containing any
> matching filename (either libNAME.a or libNAME.so) will be used, and no
> further library path iteration is performed.
> This one found PATH/filename does have to provide the requested symbol in one
> way or another.
Maybe the linker (ld) follows this. It has been a year since I did my tests for rtld behavior. Above is what I recall. Or maybe I just tested wrong. In any case, I felt it is more 'friendly' to skip over a libfoo.a that does not include a dlopen() suitable object (aka member) and look for a dlopen() suitable .so file.

>> When a .so file is found that
>> is returned - versioning must be accomplished via a symbolic link to a
>> versioned library.

I mean, if we are getting to "file" level, it should function as I believe "Linux" is behaving. Whatever the platform thinks is "best".

Linux x066 3.2.0-4-powerpc64 #1 SMP Debian 3.2.78-1 ppc64 GNU/Linux
root@x066:~# find / -name libc.so\*
/lib64/libc.so.6
/lib/powerpc-linux-gnu/libc.so.6
/usr/lib64/libc.so
/usr/lib/powerpc-linux-gnu/libc.so
root@x066:~# python -m ctypes/util
libm.so.6
libc.so.6
libbz2.so.1.0
<CDLL 'libm.so', handle f7e965e0 at f7c20370>
<CDLL 'libcrypt.so', handle 1096df58 at f7c20370>
libcrypt.so.1

michael@x067:~$ uname -a
Linux x067 3.16.0-4-powerpc64 #1 SMP Debian 3.16.7-ckt9-3 (2015-04-23) ppc64 GNU/Linux

And, as you look at the details - perhaps 'libm' shows more clearly how python really does not care - much. find_library() may find one thing, but if the less specific exists, then the less specific is loaded.
michael@x067:~$ uname -a
Linux x067 3.16.0-4-powerpc64 #1 SMP Debian 3.16.7-ckt9-3 (2015-04-23) ppc64 GNU/Linux

(switching to root, so find does not complain about non-accessible directories)
michael@x067:~$ su
Password:
root@x067:/home/michael# find / -name libc.so\*
/usr/lib/powerpc-linux-gnu/libc.so
/lib/powerpc-linux-gnu/libc.so.6
root@x067:/home/michael# python -m ctypes/util
libm.so.6
libc.so.6
libbz2.so.1.0
<CDLL 'libm.so', handle f7ea3528 at f7b970b0>
<CDLL 'libcrypt.so', handle 10a74e68 at f7b970b0>
libcrypt.so.1
root@x067:/home/michael# find / -name libm.so\*
/usr/lib/powerpc-linux-gnu/libm.so
/lib/powerpc-linux-gnu/libm.so.6

> The linker does not perform such a check, nor does it feel necessary for
> ctypes.find_library+dlopen as long as it does search similar to the linker.

Above shows (find output) that the names are 'equal' - but where they actually are is 'don't care' from the python perspective. That is a low-level detail - platform dependent.

And as the documentation says:

Python2.7:
ctypes.util.find_library(name)
    Try to find a library and return a pathname. name is the library name without any prefix like lib, suffix like .so, .dylib or version number (this is the form used for the posix linker option -l). If no library can be found, returns None.

    The exact functionality is system dependent.

    On Linux, find_library() tries to run external programs (/sbin/ldconfig, gcc, and objdump) to find the library file.

Python3.6.1:
ctypes.util.find_library(name)
    Try to find a library and return a pathname. name is the library name without any prefix like lib, suffix like .so, .dylib or version number (this is the form used for the posix linker option -l). If no library can be found, returns None.

    The exact functionality is system dependent.
    On Linux, find_library() tries to run external programs (/sbin/ldconfig, gcc, objdump and ld) to find the library file. It returns the filename of the library file.

    Changed in version 3.6: On Linux, the value of the environment variable LD_LIBRARY_PATH is used when searching for libraries, if a library cannot be found by any other means.

Please note: "The exact functionality is system dependent." So I have not understood the objection to adding _aix.py (or _dynload_aix.py if you must have a longer name; the leading underscore is key, and thinking of PEP 20 I thought _aix was a fine name). But somewhere, perhaps in my "noobish python attempts that needed upgrading", arose the objection to including anything - passing by the true issue: find_library() is broken.

>> The program "dump -H" provides this information for both executables and
>> archive (aka BASE) members.
> Eventually we might want to avoid spawning the 'dump' program, but implement
> reading the XCOFF Object File Format within _ctypes module instead.
> At least AIX does provide the necessary headers:
> https://www.ibm.com/support/knowledgecenter/ssw_aix_72/com.ibm.aix.files/XCOFF.htm

On my 'to read' list.

>> Starting from the "blibpath" values in the executable mean a cpython
>> packager can add a specific PATH by adding it to
>> LDFLAGS="-L/my/special/libdir:$LDFLAGS". Note that AIX archives also
>> have their own "blibpath" - so libraries dynamically loaded may also
>> follow additional paths that the executable is not aware of (nor need to
>> be).
> There is no need for the ctypes module to search libpaths from other Shared
> Objects than the main executable (and current env vars).

OK. But my point is - the environment variable is most generally not defined, and my observation is that the LIBPATHs that rtld searches are: a) the LIBPATH defined in the XCOFF info of the executable when it loads, and b) the LIBPATH specified in the libfoo.a(member) when it needs other libraries.
These may not be the same. Although, I must verify (again) part b. I agree that an executable should not need to care about the LIBPATH needs of libraries it wants to dlopen().

>> So - once the PATHS are determined the system is examined looking for
>> ${PATH}/BASE.a. If a target BASE.a is found, it is examined for a MEMBER
>> is set to BASE.so (now used a MEMBER.so) . If MEMBER.so is not found
>> then look for the "highest X[.Y[.Z]] aka MEMBER.so.X[.Y[.Z]] name. If
>> that is not found check AIX legacy names (mainly shr.o or shr_64.o,
>> although there are also certain key libraries that have additional
>> variations (boring)).
> When ctypes.dlopen is asked to load an Archive Library (either .a or .so)
> without a specific member, it probably should not immediately dlopen a
> specific member, but fetch the list of symbols provided by useable members
> (either Shared Objects without the F_LOADONLY flag, as well as specified in
> Import Files), and return the handle to some internal symbol->member map
> instead.
>
> Then, really loading a shared archive member is done by subsequent
> ctypes.dlsym - where it becomes clear which archive member to load.

IMHO (noob me), this is a "new behavior" not described in the Python2.7 (or even Python3.6) ctypes documentation - and beyond the scope of what I have been reporting.

>> Again, if PATH, BASE, MEMBER is not located as a .a archive - look for
>> libFOO.so in all the PATH directories known to the executable.
> Nope, see above - please iterate over a libpath list only _once_, and search
> for each filename while at one path list entry.

Actually, I think we largely agree. If I understand your comment correctly, you would prefer to search, per directory, for libfoo.a (for suitable members), then libfoo.so, then move on to the next directory in LIBPATH - rather than my search for libfoo.so only after all LIBPATH directories have 'failed' to find a shareable member.
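To make the two search orders concrete, here is a minimal sketch of both. The function names are made up for this illustration, and the check is reduced to "does the file exist"; the real question of whether an archive holds a dlopen() suitable member is not modelled here.

```python
import os


def search_per_directory(name, libpath):
    """Visit each directory once; try lib<name>.a, then lib<name>.so there."""
    for directory in libpath.split(":"):
        for filename in ("lib%s.a" % name, "lib%s.so" % name):
            candidate = os.path.join(directory, filename)
            if os.path.exists(candidate):
                return candidate
    return None


def search_archives_first(name, libpath):
    """Try lib<name>.a in every directory, and only then lib<name>.so."""
    directories = libpath.split(":")
    for filename in ("lib%s.a" % name, "lib%s.so" % name):
        for directory in directories:
            candidate = os.path.join(directory, filename)
            if os.path.exists(candidate):
                return candidate
    return None
```

The results differ when an earlier directory holds only libfoo.so and a later one holds libfoo.a: the per-directory order returns the .so, the archives-first order returns the .a.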
Our difference is that I put .a archives at the premium, whereas you put "files" at the premium. Maybe your way is how rtld (aka InitandLoad()) searches, and then that should be the way to go!

>
> However, I'm not sure yet how to identify if we should search for .a or .so
> first (before .so and .a, respectively):
> This depends on whether the current executable actually does use runtime
> linking or not, but I have no idea yet how to figure that out.

I have not specifically looked - but I have not seen any application on AIX (in recent years) that only uses static libraries.

> But probably we may want to use how the Python interpreter (libpython.a or
> libpython.so) was built.

Using "./configure" defaults - looking at my own packaging:

$ ./configure --prefix=/opt --sysconfdir=/var/python/etc --sharedstatedir=/var/python/com --localstatedir=/var/python --mandir=/usr/share/man --infodir=/opt/share/info/python --without-computed-gotos

executable: (never released though, am hung on whether to add my non-accepted patch to ctypes or not)

root@x064:[/data/prj/python/python-2.7.13.0]dump -H python

python:

                        ***Loader Section***
                      Loader Header Information
VERSION#         #SYMtableENT     #RELOCent        LENidSTR
0x00000001       0x000005b2       0x00003545       0x0000006e

#IMPfilID        OFFidSTR         LENstrTBL        OFFstrTBL
0x00000005       0x0003080c       0x00006837       0x0003087a

                        ***Import File Strings***
INDEX  PATH                          BASE                MEMBER
0      /usr/vac/lib:/usr/lib:/lib
1                                    libc.a              shr.o
2                                    libpthreads.a       shr_xpg5.o
3                                    libpthreads.a       shr_comm.o
4                                    libdl.a             shr.o

The executable uses rtld for 'other stuff'.

FYI: shareable does work (i.e., libpython* is also in the dump -H info), but ../../python (for testing) does not work when "shared". Since most of my time is spent testing, "shared" is time-consuming. So, I do not add --shared by default.
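As an aside on the "dump -H" output above: pulling the "blibpath" (the PATH at index 0) and the (BASE, MEMBER) pairs out of the Import File Strings table could look roughly like the sketch below. The function name is hypothetical, and the parsing assumes the whitespace-separated layout shown in this message, where only index 0 carries a PATH; rows that have a BASE but no MEMBER are not handled.

```python
def parse_import_file_strings(dump_output):
    """Return (blibpath, [(base, member), ...]) from `dump -H` text.

    Sketch only: assumes the layout shown in this message, where index 0
    carries just the PATH and later rows carry BASE and MEMBER.
    """
    blibpath = None
    imports = []
    in_table = False
    for line in dump_output.splitlines():
        fields = line.split()
        if fields[:2] == ["INDEX", "PATH"]:
            # Header row of the Import File Strings table.
            in_table = True
            continue
        if not in_table or not fields or not fields[0].isdigit():
            continue
        if fields[0] == "0":
            blibpath = fields[1] if len(fields) > 1 else ""
        elif len(fields) >= 3:
            # Take the last two fields so an optional PATH column is ignored.
            imports.append((fields[-2], fields[-1]))
    return blibpath, imports
```

Fed the output shown above, this would give "/usr/vac/lib:/usr/lib:/lib" as the blibpath and four (BASE, MEMBER) pairs, starting with ("libc.a", "shr.o").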
root@x064:[/data/prj/python/python-2.7.13.0]ls -l lib*
-rw-r----- 1 root 1954 3156377 Jan 13 13:18 libpython2.7.a

No shared members, only static .o files (112 of them)

>
> Uhm, ultimative solution feels complex already, while still some things to
> decide...

I am willing to help. But I can only take 'no' for so long. Then I'll pass the baton.

Again, my apologies for the long delay since your extensive reply. I hope you see I have spent (too) many hours on this. I also hope that you understand I do not see my solution as the only solution (I am not that Dutch ;) !).

I just hope for resolution - closer to sooner than to 'never'.

Michael

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27435>
_______________________________________