Re: [PERFORM] pg_connect takes 3.0 seconds
Thanks for advice, Dave! This saga ended in an unexpected way: the firewall died. Since the replacement firewall installed I have not seen any 3 seconds connects. Well, there was no real load so far, but I will keep checking. Thanks to everyone replied, it was very helpful. Cheers, Dmitri. On Fri, Jan 8, 2010 at 7:13 AM, Dave Crooke dcro...@gmail.com wrote: Oops, I meant to mention this too virtually all GigE and/or server class NICs do TCP checksum offload. Dimitri - it's unlikely that you have a hardware issue on the NIC, it's more likely to be a cable problem or network congestion. What you want to look for in the tcpdump capture is things like SYN retries. A good way to test for cable issues is to use a ping flood with a large packet size. Cheers Dave Hang on a sec. You need to ignore bad checksums on *outbound* packets, because many (most?) Ethernet drivers implement some level of TCP offloading, and this will result in packet sniffers seeing invalid checksums for transmitted packets - the checksums haven't been generated by the NIC yet. Unless you know for sure that your NIC doesn't do TSO, ignore bad checksums on outbound packets from the local interface. -- Craig Ringer -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org ) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- @Gmail
Re: [PERFORM] pg_connect takes 3.0 seconds
Hi everybody, Many thanks to everyone replied, I think we are on the right way. I've used tcpdump to generate the logs and there are a lot of dropped packets due to the bad checksum. Network guy is currently looking at the problem and most likely this is hardware issue. Cheers, Dmitri. On Tue, Jan 5, 2010 at 1:12 PM, Dmitri Girski mite...@gmail.com wrote: Hi everybody, I am running a PostgreSQL server 8.3.5 with a pretty much standard config. The web application server which runs Apache 1.3/PHP2.9 has an intermittent problem: pg_connect takes exactly 3.0 seconds. The usual connection time is 0.0045. The long request happens at approximate rate 1:100. I turned on logs on postgres server side, and there is nothing suspicious for me there. When a connection request comes, it is being served without any delay. Could anyone point me to the direction in which I should investigate this problem further? Thank you in advance! PS The hardware is: Dell SC1435/4Gb/2x2.0GHz/Gentoo Linux. The database web servers are in the 2 local subnets. Dmitri. -- @Gmail
Re: [PERFORM] pg_connect takes 3.0 seconds
Hi Andy, I tried 2 connections strings: - server name (DB1), which is listed in all machines hosts files. - ip address. There is no difference in both methods, still I have 5-7 pg_connects which last around 3 seconds. Cheers, Dmitri. On Wed, Jan 6, 2010 at 2:03 AM, Andy Colson a...@squeakycode.net wrote: On 1/4/2010 8:12 PM, Dmitri Girski wrote: Hi everybody, I am running a PostgreSQL server 8.3.5 with a pretty much standard config. The web application server which runs Apache 1.3/PHP2.9 has an intermittent problem: pg_connect takes exactly 3.0 seconds. The usual connection time is 0.0045. The long request happens at approximate rate 1:100. I turned on logs on postgres server side, and there is nothing suspicious for me there. When a connection request comes, it is being served without any delay. Could anyone point me to the direction in which I should investigate this problem further? Thank you in advance! PS The hardware is: Dell SC1435/4Gb/2x2.0GHz/Gentoo Linux. The database web servers are in the 2 local subnets. Dmitri. How do you have the connect string? With an IP or a name? Maybe its a DNS lookup timeout? You could switch to IP or drop the name in the hosts file and see if that makes a difference. -Andy -- @Gmail
Re: [PERFORM] pg_connect takes 3.0 seconds
Hi Scott, Thank you pointers, I've spoken to the network guy, he will help to monitor connections on the firewall. On the other hand, if I use ip addresses this should not attract any possible issues with DNS, right? Thanks! Dmitri. On Wed, Jan 6, 2010 at 9:32 AM, Scott Carey sc...@richrelevance.com wrote: Delays that are almost exactly 3 seconds over a network are almost always some sort of network configuration issue. Inside a datacenter, mis-configured load balancers or routers can cause low level network issues that result in intermittent network delays of exactly 3 seconds (a loop in a routing network?). DNS timeouts are often 3 seconds. Not sure if any of the above is it, but this sounds like a network configuration problem to me. On 1/4/10 6:12 PM, Dmitri Girski mite...@gmail.com wrote: Hi everybody, I am running a PostgreSQL server 8.3.5 with a pretty much standard config. The web application server which runs Apache 1.3/PHP2.9 has an intermittent problem: pg_connect takes exactly 3.0 seconds. The usual connection time is 0.0045. The long request happens at approximate rate 1:100. I turned on logs on postgres server side, and there is nothing suspicious for me there. When a connection request comes, it is being served without any delay. Could anyone point me to the direction in which I should investigate this problem further? Thank you in advance! PS The hardware is: Dell SC1435/4Gb/2x2.0GHz/Gentoo Linux. The database web servers are in the 2 local subnets. Dmitri. -- @Gmail
Re: [PERFORM] pg_connect takes 3.0 seconds
Thank you for reply , Andy! I tried both cases: server name which is listed in hosts file and ip address ( 192.168.2.2) - no difference so far. Cheers, Dmitri. On Wed, Jan 6, 2010 at 2:03 AM, Andy Colson a...@squeakycode.net wrote: On 1/4/2010 8:12 PM, Dmitri Girski wrote: Hi everybody, I am running a PostgreSQL server 8.3.5 with a pretty much standard config. The web application server which runs Apache 1.3/PHP2.9 has an intermittent problem: pg_connect takes exactly 3.0 seconds. The usual connection time is 0.0045. The long request happens at approximate rate 1:100. I turned on logs on postgres server side, and there is nothing suspicious for me there. When a connection request comes, it is being served without any delay. Could anyone point me to the direction in which I should investigate this problem further? Thank you in advance! PS The hardware is: Dell SC1435/4Gb/2x2.0GHz/Gentoo Linux. The database web servers are in the 2 local subnets. Dmitri. How do you have the connect string? With an IP or a name? Maybe its a DNS lookup timeout? You could switch to IP or drop the name in the hosts file and see if that makes a difference. -Andy -- @Gmail
Re: [PERFORM] pg_connect takes 3.0 seconds
Hi Tom, The timing is around 3.0 seconds Time=3.0037 Time=3.4038 Time=3.0038 Time=3.004 Time=3.2037 Time=3.0039 Time=3.0034 Time=3.0034 Time=3.2039 Time=3.0044 Time=3.8044 Time=3.2034 I don't think that it could relate to DNS problem as I tried 2 methods which does not use name resolution ( hosts file ip address) I will definitely seek the help from network geeks and I will check all TCP stack settings. Thank you! Cheers, Dmitri. On Tue, Jan 5, 2010 at 4:14 PM, Tom Lane t...@sss.pgh.pa.us wrote: Dmitri Girski mite...@gmail.com writes: I am running a PostgreSQL server 8.3.5 with a pretty much standard config. The web application server which runs Apache 1.3/PHP2.9 has an intermittent problem: pg_connect takes exactly 3.0 seconds. The usual connection time is 0.0045. The long request happens at approximate rate 1:100. Sounds a lot like a dropped-packets problem. The exact timing would be explained if that is the retransmit timeout in your client-side TCP stack. If that's what it is, you need some network engineers, not us database geeks ... regards, tom lane -- @Gmail
Re: [PERFORM] pg_connect takes 3.0 seconds
Hi Greg, Thank you for idea, reading about checkpints tuning was very useful. I had a checkpoints logging turned on. I studied a couple of days logs and I there is no clear dependency on checkpoint write. Sometimes it is within a vicinity of 3 seconds CONNECT, sometimes well off it. Also the postgres log file does not show any long operations, which inclines me to think that this is a network connectivity/apache/php issue rather than postgres. Here is the excerpts from logs: Log file from WWW server === [06-01-10 14:58:16] UserId=15 Time=3.0032 Req=DB CONNECT === Log file from DB server: === [2010-01-06 14:58:13 EST] idleLOG: 0: disconnection: session time: 0:00:00.027 user=pri_user database=data host=192.168.1.10 port=50087 [2010-01-06 14:58:13 EST] idleLOCATION: log_disconnections, postgres.c:3982 [2010-01-06 14:58:18 EST] /usr/lib64/postgresql-8.3/bin/postgresLOG: 0: connection received: host=192.168.1.10 port=52425 [2010-01-06 14:58:18 EST] /usr/lib64/postgresql-8.3/bin/postgresLOCATION: BackendInitialize, postmaster.c:3027 [2010-01-06 14:58:18 EST] authenticationLOG: 0: connection authorized: user=pri_user database=data [2010-01-06 14:58:18 EST] authenticationLOCATION: BackendInitialize, postmaster.c:3097 [2010-01-06 14:58:18 EST] idleLOG: 0: statement: SELECT fIsLoggedIn(15) [2010-01-06 14:58:18 EST] idleLOCATION: exec_simple_query, postgres.c:845 [2010-01-06 14:58:18 EST] SELECTLOG: 0: duration: 39.233 ms [2010-01-06 14:58:18 EST] SELECTLOCATION: exec_simple_query, postgres.c:1056 [2010-01-06 14:58:18 EST] idleLOG: 0: statement: START TRANSACTION [2010-01-06 14:58:18 EST] idleLOCATION: exec_simple_query, postgres.c:845 [2010-01-06 14:58:18 EST] START TRANSACTIONLOG: 0: duration: 0.050 ms === Cheers, Dmitri. On Tue, Jan 5, 2010 at 2:01 PM, Greg Smith g...@2ndquadrant.com wrote: Dmitri Girski wrote: I am running a PostgreSQL server 8.3.5 with a pretty much standard config. The web application server which runs Apache 1.3/PHP2.9 has an intermittent problem: pg_connect takes exactly 3.0 seconds. The usual connection time is 0.0045. The long request happens at approximate rate 1:100. First thing to check for intermittent multi-second delays is whether a checkpoint is happening at that time. See http://wiki.postgresql.org/wiki/Logging_Checkpoints for an intro, you'd want to see if the checkpoints are around the same time as the delays each time. The default configuration makes checkpoints happen all the time if there's any significant write traffic on your database. -- Greg Smith2ndQuadrant Baltimore, MD PostgreSQL Training, Services and supportg...@2ndquadrant.com www.2ndQuadrant.com -- @Gmail
[PERFORM] pg_connect takes 3.0 seconds
Hi everybody, I am running a PostgreSQL server 8.3.5 with a pretty much standard config. The web application server which runs Apache 1.3/PHP2.9 has an intermittent problem: pg_connect takes exactly 3.0 seconds. The usual connection time is 0.0045. The long request happens at approximate rate 1:100. I turned on logs on postgres server side, and there is nothing suspicious for me there. When a connection request comes, it is being served without any delay. Could anyone point me to the direction in which I should investigate this problem further? Thank you in advance! PS The hardware is: Dell SC1435/4Gb/2x2.0GHz/Gentoo Linux. The database web servers are in the 2 local subnets. Dmitri.