[GENERAL] SSL and crash woes.
A couple of years back (2005) we were setting up replication for the first time (using slony) from our production database server to a stand-by box sitting next to it and a remote box in a DR site. We were running FreeBSD 5.X/6.X on all systems on Dell servers and postgres 7.4.X and then 8.0.X. Replication appeared to crash our production database...a lot. After looking at the core dumps/stack traces at the time, we determined that we were crashing in the ssl layers...so we disabled SSL (via pg_hba.conf and the slony conn settings) and haven't had an issue for the last couple of years. Stable as a rock.

Well...we just upgraded our hardware (Sun X4600s), operating systems (Solaris 10), postgres versions (8.2.4), and slony (1.2.10). Rock solid.

However, our first indication of trouble was a problem executing pg_dump from a remote backup server. (see http://archives.postgresql.org/pgsql-general/2007-08/msg01347.php) Local pg_dumps have no issue. So we changed our backup scheme to do local dumps and push the files off the server to the backup location. Problem solved.

Then...replication woes again. With these fresh installs, we didn't think too much about the SSL settings...and bing-bang...crash. Crash. Crash. Crash. Stopped replication. Problem goes away. Start replication...crash crash. So we stopped replication.

We recompiled postgres with debug info on a test db box and loaded up the most recent database dump. We then attempted a remote pg_dump from another local server. Crash. Took a look at the core dump...

Core was generated by `/usr/local/pgsql/bin/postgres -D /testdb'.
Program terminated with signal 11, Segmentation fault.
#0  0xfee8ec23 in sk_value () from /usr/local/ssl/lib/libcrypto.so.0.9.8
(gdb) bt
#0  0xfee8ec23 in sk_value () from /usr/local/ssl/lib/libcrypto.so.0.9.8
#1  0xfef5b05b in ssl3_output_cert_chain () from /usr/local/ssl/lib/libssl.so.0.9.8
#2  0x in ?? ()

Hmmm...that looked familiar (from years ago).

So...we set up the connection to be 'hostnossl' in pg_hba.conf and tried again. Success. Changed it back to 'hostssl'...crash. Same place.

I am going to take the time to set up a test environment for the replication as well, but I assume I will experience the same thing. SSL means crash...no SSL means no crash.

Anyone have any thoughts?

---(end of broadcast)---
TIP 6: explain analyze is your friend
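The with-SSL/without-SSL comparison above was done on the server side by switching pg_hba.conf between 'hostssl' and 'hostnossl'. A similar comparison can probably be driven from the client instead, using libpq's PGSSLMODE environment variable. What follows is only a sketch under assumptions: the helper name try_dump, the host name dbhost and the database name testdb are made up for illustration, and pg_hba.conf is assumed to have a plain 'host' entry that accepts both kinds of connection.

```shell
# Hypothetical helper: run a remote pg_dump with a given libpq sslmode and
# report whether it completed. The dump output itself is discarded.
try_dump() {
    mode=$1
    host=$2
    db=$3
    # PGSSLMODE is read by libpq; "disable" and "require" are two of its values.
    PGSSLMODE="$mode" pg_dump -h "$host" "$db" > /dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "sslmode=$mode: ok"
    else
        echo "sslmode=$mode: failed (exit $rc)"
    fi
}

# Example (assumed names, run against a test box only):
#   try_dump disable dbhost testdb
#   try_dump require dbhost testdb
```

If the backend segfaults during the SSL handshake, the second call would be expected to report a failure while the first succeeds. A failure here only says the dump did not finish, so the core file is still the place to confirm where it died.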
Re: [GENERAL] SSL and crash woes.
Jeff Amiel [EMAIL PROTECTED] writes:
> [ SSL plus slony = crash ]

Interesting --- I don't recall that that's ever been reported before. It might be best to take it up on the slony lists; I wouldn't want to speculate whether the bug is in slony or the core backend (or openssl?) but slony hackers would have less of a learning curve to try to reproduce your results.

regards, tom lane
Re: [GENERAL] SSL and crash woes.
On 8/29/07, Jeff Amiel [EMAIL PROTECTED] wrote:
> A couple of years back (2005) we were setting up replication for the first time (using slony) from our production database server to a stand-by box sitting next to it and a remote box in a DR site. We were running FreeBSD 5.X/6.X on all systems on Dell servers and postgres 7.4.X and then 8.0.X. Replication appeared to crash our production database...a lot. After looking at the core dumps/stack traces at the time, we determined that we were crashing in the ssl layers...so we disabled SSL (via pg_hba.conf and the slony conn settings) and haven't had an issue for the last couple of years.

Interesting. Is it possible that either you've got 2 versions of openssl? Maybe slony is being compiled against one, then using the other lib, etc.?
Re: [GENERAL] SSL and crash woes.
--- Scott Marlowe [EMAIL PROTECTED] wrote:
> Interesting. Is it possible that either you've got 2 versions of openssl? Maybe slony is being compiled against one, then using the other lib, etc.?

yes...I suppose it is...Solaris came with one...we installed another.

hm...

# find /usr /lib -name libssl*
/usr/lib/mps/amd64/libssl3.so
/usr/lib/mps/secv1/amd64/libssl3.so
/usr/lib/mps/secv1/libssl3.so
/usr/lib/mps/libssl3.so
/usr/sfw/lib/amd64/libssl.so
/usr/sfw/lib/amd64/libssl.so.0.9.7
/usr/sfw/lib/libssl.so
/usr/sfw/lib/libssl.so.0.9.7
/usr/sfw/lib/mozilla/libssl3.so
/usr/apache/libexec/libssl.so
/usr/local/ssl/lib/libssl.a
/usr/local/ssl/lib/libssl.so
/usr/local/ssl/lib/libssl.so.0.9.8
/usr/local/ssl/lib/pkgconfig/libssl.pc
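One way to follow up on the two-OpenSSL theory is to look at which libssl and libcrypto a given binary actually resolves at run time, rather than which copies exist on disk. A minimal sketch, assuming ldd is available and prints lines of the form "name => path"; the helper name ssl_libs_from_ldd is made up for illustration.

```shell
# Hypothetical helper: read ldd output on stdin and print the name and the
# resolved path of every libssl / libcrypto entry.
ssl_libs_from_ldd() {
    awk '$1 ~ /^lib(ssl|crypto)\.so/ && $2 == "=>" { print $1, $3 }'
}

# Example (binary path taken from the core dump message in this thread):
#   ldd /usr/local/pgsql/bin/postgres | ssl_libs_from_ldd
# The same check can be pointed at the slony binaries and at libpq.
```

If libssl and libcrypto resolve to different trees (say one under /usr/sfw/lib and the other under /usr/local/ssl/lib), or if a component was built against 0.9.7 headers but loads 0.9.8 at run time, that would be consistent with the mismatch suggested above, though it would not prove it.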