[HACKERS] Using RTLD_DEEPBIND to handle symbol conflicts in loaded libraries

2014-11-26 Thread Ants Aasma
I had to make oracle_fdw work with PostgreSQL compiled using
--with-ldap. The issue there is that Oracle's client library has the
delightful property of linking against an LDAP library they bundle that
has symbol conflicts with OpenLDAP. At PostgreSQL startup libldap is
loaded, so when libclntsh.so (the Oracle client) is loaded it gets
bound to OpenLDAP symbols, and unsurprisingly crashes with a segfault
when those functions get used.

glibc 2.3.4 and later has a dlopen flag called RTLD_DEEPBIND that makes
a loaded library prefer its own symbols, and those of the libraries it
links against, to same-named symbols already in the global scope. Using
this flag fixes my issue: PostgreSQL gets the LDAP functions from
libldap, and the Oracle client gets them from whatever it links to. Both
work fine.

Attached is a patch that enables this flag on Linux when available.
This specific case could also be fixed by rewriting oracle_fdw to use
dlopen for libclntsh.so and pass this flag, but I think it would be
better to enable it for all extension modules that PostgreSQL loads. I
can't think of a sane use case where it would be correct to prefer
symbols PostgreSQL has already loaded to those the library was actually
linked against.
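
For illustration only (this is not part of the attached patch, nor of
oracle_fdw): a minimal sketch of the alternative mentioned above, where
an extension opens the conflicting client library itself with
RTLD_DEEPBIND and resolves entry points through dlsym. The helper name
lookup_deepbound and its signature are made up for this example, and it
assumes a glibc-style <dlfcn.h>.

```c
#define _GNU_SOURCE             /* exposes RTLD_DEEPBIND in glibc's <dlfcn.h> */
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* As in the patch: if the flag is missing, make it a no-op. */
#ifndef RTLD_DEEPBIND
#define RTLD_DEEPBIND 0
#endif

/*
 * Hypothetical helper (not a PostgreSQL or oracle_fdw function): open a
 * library privately with deep binding and look up one symbol in it.
 * On success, returns the symbol address and stores the dlopen handle in
 * *handle_out so the caller can dlclose() it later.  On failure, reports
 * the problem on stderr, sets *handle_out to NULL and returns NULL.
 */
static void *
lookup_deepbound(const char *libname, const char *symname, void **handle_out)
{
    void       *handle;
    void       *sym;
    const char *err;

    assert(libname != NULL && symname != NULL && handle_out != NULL);

    *handle_out = NULL;

    /* RTLD_LOCAL keeps the library's symbols out of the global scope. */
    handle = dlopen(libname, RTLD_NOW | RTLD_LOCAL | RTLD_DEEPBIND);
    if (handle == NULL)
    {
        err = dlerror();
        fprintf(stderr, "dlopen(\"%s\") failed: %s\n",
                libname, err ? err : "unknown error");
        return NULL;
    }

    sym = dlsym(handle, symname);
    if (sym == NULL)
    {
        err = dlerror();
        fprintf(stderr, "dlsym(\"%s\") failed: %s\n",
                symname, err ? err : "symbol resolved to NULL");
        dlclose(handle);
        return NULL;
    }

    *handle_out = handle;
    return sym;
}
```

A caller would pass something like "libclntsh.so" and the name of the
client function it needs, then cast the result to the matching function
pointer type. The cost of this approach is that every client call has to
go through such a pointer, which is why setting the flag once in
pg_dlopen looks simpler.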

Does anybody know of a case where this flag wouldn't be a good idea?
Are there any similar options for other platforms? Alternatively, does
anyone know of linker flags that would give a similar effect?

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
diff --git a/src/backend/port/dynloader/linux.h b/src/backend/port/dynloader/linux.h
index db2ac66..34fd35c 100644
--- a/src/backend/port/dynloader/linux.h
+++ b/src/backend/port/dynloader/linux.h
@@ -25,8 +25,12 @@
 /*
  * In some older systems, the RTLD_NOW flag isn't defined and the mode
  * argument to dlopen must always be 1.  The RTLD_GLOBAL flag is wanted
- * if available, but it doesn't exist everywhere.
- * If it doesn't exist, set it to 0 so it has no effect.
+ * if available, but it doesn't exist everywhere. The RTLD_DEEPBIND flag
+ * may also be missing, but if it is available we want to enable it so
+ * extensions we load prefer symbols from libraries they were linked
+ * against to conflicting symbols from unrelated libraries loaded by
+ * PostgreSQL.
+ * If the optional flags don't exist, set them to 0 so they have no effect.
  */
 #ifndef RTLD_NOW
 #define RTLD_NOW 1
@@ -34,8 +38,11 @@
 #ifndef RTLD_GLOBAL
 #define RTLD_GLOBAL 0
 #endif
+#ifndef RTLD_DEEPBIND
+#define RTLD_DEEPBIND 0
+#endif
 
-#define pg_dlopen(f)	dlopen((f), RTLD_NOW | RTLD_GLOBAL)
+#define pg_dlopen(f)	dlopen((f), RTLD_NOW | RTLD_GLOBAL | RTLD_DEEPBIND)
 #define pg_dlsym		dlsym
 #define pg_dlclose		dlclose
 #define pg_dlerror		dlerror

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Using RTLD_DEEPBIND to handle symbol conflicts in loaded libraries

2014-11-26 Thread Albe Laurenz
Ants Aasma wrote:
> I had to make oracle_fdw work with PostgreSQL compiled using
> --with-ldap. The issue there is that Oracle's client library has the
> delightful property of linking against an LDAP library they bundle that
> has symbol conflicts with OpenLDAP. At PostgreSQL startup libldap is
> loaded, so when libclntsh.so (the Oracle client) is loaded it gets
> bound to OpenLDAP symbols, and unsurprisingly crashes with a segfault
> when those functions get used.
>
> glibc 2.3.4 and later has a dlopen flag called RTLD_DEEPBIND that makes
> a loaded library prefer its own symbols, and those of the libraries it
> links against, to same-named symbols already in the global scope. Using
> this flag fixes my issue: PostgreSQL gets the LDAP functions from
> libldap, and the Oracle client gets them from whatever it links to. Both
> work fine.

I am aware of the problem, but this solution is new to me.
My workaround so far has been to load OpenLDAP with the LD_PRELOAD
environment variable on PostgreSQL start.
But then you get a crash when Oracle uses LDAP functionality (directory naming).

> Attached is a patch that enables this flag on Linux when available.
> This specific case could also be fixed by rewriting oracle_fdw to use
> dlopen for libclntsh.so and pass this flag, but I think it would be
> better to enable it for all extension modules that PostgreSQL loads.

I'll consider changing oracle_fdw in that fashion, even if that will
only remedy the problem on Linux.
I think that this patch is a good idea though.

Yours,
Laurenz Albe
