Hello PostgreSQL Community Members,

I am stumped trying to install a few C-language functions on a particular Solaris server (64-bit, AMD CPU architecture, not SPARC). I have five PostgreSQL servers; the .so loads fine into four of them but refuses to load into the fifth. I've quintuple-checked the file permissions, the build of the .so, the gcc versions, the PostgreSQL versions, and so on. I've had a colleague double-check my work. We're both stumped. Details follow.

All servers are running Solaris 10u9 on 64-bit hardware inside Solaris zones. Two of the servers are X4270s (144 GB RAM, 24 Intel CPU cores). These two servers run the four working Solaris zones that are able to load the functions implemented in the .so files. PostgreSQL is version 8.4.6, compiled from source (not a binary package).

The server that is misbehaving is an X4600 (128 GB RAM, 16 AMD CPU cores) but is otherwise identical: Solaris 10u9, 64-bit OS, PostgreSQL 8.4.6. All five systems use the stock gcc that ships with Solaris (v3.4.3; it's old, I know).

Here are the permissions on the files and PostgreSQL directories, first on a working server, then on the server that is not working as expected.

(root@working: </db>) # ls -ld /db /db/*.so
drwx------ 11 pgsql root 23 Sep 27 10:39 /db
-rwxr-xr-x 1 root root 57440 Sep 27 10:39 /db/pgsql_micr_parser_64.so

(root@working: </db>) # psql -Upgsql -dpostgres -c"select version();"
PostgreSQL 8.4.6 on x86_64-pc-solaris2.11, compiled by GCC gcc (GCC) 3.4.3 (csl-sol210-3_4-20050802), 64-bit

(root@working: </db>) # file /opt/local/x64/postgresql-8.4.6/bin/postgres
/opt/local/x64/postgresql-8.4.6/bin/postgres: ELF 64-bit LSB executable AMD64 Version 1 [SSE], dynamically linked, not stripped

(root@working: </db>) # psql -Upgsql -dmy_db -c"create or replace function parse_micr(text) returns micr_struct as '/db/pgsql_micr_parser_64.so', 'pgsql_micr_parser' language c volatile cost 1;"
CREATE FUNCTION

(root@working: </db>) # psql -Upgsql -dmy_db -t -c"select transit from parse_micr(':8888=8888: <45800=100<');"
8888=8888

(root@failed: </db>) # ls -ld /db /db/*.so
drwx------ 11 pgsql root 24 Sep 29 11:16 /db
-rwxr-xr-x 1 root root 57440 Sep 29 09:46 /db/pgsql_micr_parser_64.so

(root@failed: </db>) # psql -Upgsql -dpostgres -c"select version();"
PostgreSQL 8.4.6 on x86_64-pc-solaris2.11, compiled by GCC gcc (GCC) 3.4.3 (csl-sol210-3_4-20050802), 64-bit

(root@failed: </db>) # file /opt/local/x64/postgresql-8.4.6/bin/postgres
/opt/local/x64/postgresql-8.4.6/bin/postgres: ELF 64-bit LSB executable AMD64 Version 1 [SSE], dynamically linked, not stripped

(root@failed: </db>) # psql -Upgsql -dmy_db -c"create or replace function parse_micr(text) returns micr_struct as '/db/pgsql_micr_parser_64.so', 'pgsql_micr_parser' language c volatile cost 1;"
ERROR: could not load library "/db/pgsql_micr_parser_64.so": ld.so.1: postgres: fatal: /db/pgsql_micr_parser_64.so: Permission denied

OK. Well, the file permissions are correct, so what gives? The next step is to trace the backend process as it attempts to load the .so. So I connect to the "failed" server via pgAdmin and run "select getpid();". I then run "truss -p <PID>" from my shell and, in pgAdmin, execute the SQL to create the function. This is the result of the system trace:

(root@failed: </db>) # truss -p 10369
recv(9, 0x0097C103, 5, 0) (sleeping...)
recv(9, "170301\0 ", 5, 0) = 5
recv(9, " TBEE5 n J\0 VF6E4DDCF84".., 32, 0) = 32
recv(9, "170301\0B0", 5, 0) = 5
recv(9, "AAD5A5 L97B0CEA5A9F0CD89".., 176, 0) = 176
stat("/db/pgsql_micr_parser_64.so", 0xFFFFFD7FFFDF9520) = 0
stat("/db/pgsql_micr_parser_64.so", 0xFFFFFD7FFFDF9530) = 0
stat("/db/pgsql_micr_parser_64.so", 0xFFFFFD7FFFDF8F50) = 0
resolvepath("/db/pgsql_micr_parser_64.so", "/db/pgsql_micr_parser_64.so", 1023) = 27
open("/db/pgsql_micr_parser_64.so", O_RDONLY) = 22
mmap(0x00010000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 22, 0) Err#13 EACCES
close(22) = 0
setcontext(0xFFFFFD7FFFDF9050)
setcontext(0xFFFFFD7FFFDF9BB0)

We can see that the backend is able to open the .so file for reading, but the mmap fails. From the Solaris man page on mmap:

ERRORS
     The mmap() function will fail if:

     EACCES    The fildes file descriptor is not open for read, regardless of the protection specified; or fildes is not open for write and PROT_WRITE was specified for a MAP_SHARED type mapping.

My analysis:

1) The file descriptor (#22) is open for O_RDONLY.
2) PROT_WRITE and MAP_SHARED are not specified, so write access is not relevant.

Things that I tried, unsuccessfully:

1) I recompiled the .so on the target system (X4600, AMD chips) in case it somehow differs from the .so that was built on the working system (X4270, Intel chips).

2) Tested with a different .so (I have another that implements forward and reverse DNS lookups, so one may invoke DNS functions inside SQL statements). Same behavior: it loads fine on the X4270 systems but fails on the X4600 system.

3) Compiled both .so's on 32-bit and 64-bit Gentoo Linux and loaded them into PostgreSQL 9.0.4. Works fine.

4) Compiled both .so's on 64-bit Solaris 10u9 with PostgreSQL 9.1 on an X4270, and they load fine there too.

5) Examined a truss on a working system while loading the function. Since it had already loaded fine, I had to drop the function, disconnect pgAdmin (to make the backend exit), reconnect, and redo the "create function":

(root@working: </db>) # truss -p 16921
## (I elided a bunch of non-relevant grovelling through the FSM mapped file)
stat("/db/pgsql_micr_parser_64.so", 0xFFFFFD7FFFDF9520) = 0
stat("/db/pgsql_micr_parser_64.so", 0xFFFFFD7FFFDF9530) = 0
stat("/db/pgsql_micr_parser_64.so", 0xFFFFFD7FFFDF8F50) = 0
resolvepath("/db/pgsql_micr_parser_64.so", "/db/pgsql_micr_parser_64.so", 1023) = 27
open("/db/pgsql_micr_parser_64.so", O_RDONLY) = 22
mmap(0x00010000, 32768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_ALIGN, 22, 0) = 0xFFFFFD7FFED80000
mmap(0x00010000, 90112, PROT_NONE, MAP_PRIVATE|MAP_NORESERVE|MAP_ANON|MAP_ALIGN, 4294967295, 0) = 0xFFFFFD7FFED00000
mmap(0xFFFFFD7FFED00000, 21997, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_TEXT, 22, 0) = 0xFFFFFD7FFED00000
mmap(0xFFFFFD7FFED15000, 2576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_INITDATA, 22, 20480) = 0xFFFFFD7FFED15000
munmap(0xFFFFFD7FFED06000, 61440) = 0
memcntl(0xFFFFFD7FFED00000, 7008, MC_ADVISE, MADV_WILLNEED, 0, 0) = 0
close(22) = 0

6) There is nothing interesting in dmesg or syslog.

7) Disconnected and reconnected a few times to try a freshly launched backend. No luck.

Any thoughts or suggestions?