Re: [HACKERS] Millisecond-precision connect_timeout for libpq
Is there any hope to see it in libpq? If so, can anyone review the latest version of my patch?

On 10 July 2013 11:49, ivan babrou ibob...@gmail.com wrote:

On 9 July 2013 18:43, Andres Freund and...@2ndquadrant.com wrote:

On 2013-07-05 21:28:59 +0400, ivan babrou wrote:

Hi, guys! I made a quick patch to support floating-point numbers in the connect_timeout param for libpq. The value is treated as seconds, so this is backwards-compatible. I don't usually write in C, so there may be mistakes. Could you review it and give me some feedback?

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 18fcb0c..58c1a35 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1452,7 +1452,7 @@ static int
 connectDBComplete(PGconn *conn)
 {
 	PostgresPollingStatusType flag = PGRES_POLLING_WRITING;
-	time_t		finish_time = ((time_t) -1);
+	struct timeval finish_time;

 	if (conn == NULL || conn->status == CONNECTION_BAD)
 		return 0;
@@ -1462,17 +1462,14 @@ connectDBComplete(PGconn *conn)
 	 */
 	if (conn->connect_timeout != NULL)
 	{
-		int			timeout = atoi(conn->connect_timeout);
+		int			timeout_usec = (int) (atof(conn->connect_timeout) * 1000000);

I'd rather not use a plain int for storing usecs. An overflow is rather unlikely, but still. Also, I'd rather use something like USECS_PER_SEC instead of a plain 1000000 in multiple places.

-		if (timeout > 0)
+		if (timeout_usec > 0)
 		{
-			/*
-			 * Rounding could cause connection to fail; need at least 2 secs
-			 */
-			if (timeout < 2)
-				timeout = 2;
-			/* calculate the finish time based on start + timeout */
-			finish_time = time(NULL) + timeout;
+			gettimeofday(&finish_time, NULL);
+			finish_time.tv_usec += (int) timeout_usec;

Accordingly adjust this. Looks like a sensible thing to me. *Independent* from this patch, you might want to look into server-side connection pooling using transaction mode. If that's applicable for your application it might reduce latency noticeably.

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

I tried to make it more safe. Still not sure about the constants; I haven't found any good examples in libpq.

--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
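The deadline arithmetic under review can be illustrated in isolation. The sketch below is not libpq code and the helper name deadline_from_timeout is made up; it computes start + timeout in microseconds on a struct timeval and normalizes tv_usec overflow, as the revised patch does:

```c
#include <sys/time.h>

#define USECS_PER_SEC 1000000L

/* Compute start + timeout_usec, keeping tv_usec in [0, USECS_PER_SEC). */
static struct timeval
deadline_from_timeout(struct timeval start, long timeout_usec)
{
	struct timeval finish = start;

	finish.tv_sec += timeout_usec / USECS_PER_SEC;
	finish.tv_usec += timeout_usec % USECS_PER_SEC;

	/* Normalize: tv_usec must stay below one second. */
	if (finish.tv_usec >= USECS_PER_SEC)
	{
		finish.tv_sec += 1;
		finish.tv_usec -= USECS_PER_SEC;
	}
	return finish;
}
```

Without the normalization step, a timeout of 0.15 s added to a start time of x.9 s would leave tv_usec above one million, which some timeval consumers reject.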
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 9 July 2013 19:17, Dmitriy Igrishin dmit...@gmail.com wrote:

2013/7/9 Merlin Moncure mmonc...@gmail.com

On Fri, Jul 5, 2013 at 12:28 PM, ivan babrou ibob...@gmail.com wrote:

Hi, guys! I made a quick patch to support floating-point numbers in the connect_timeout param for libpq. The value is treated as seconds, so this is backwards-compatible. I don't usually write in C, so there may be mistakes. Could you review it and give me some feedback?

First thing that jumps into my head: why not use an asynchronous connection (PQconnectStart, etc.) and code the timeout on top of that?

+1.

merlin

--
// Dmitriy.

That doesn't look like a straightforward solution to me. In my case existing drivers will benefit from my patch; in the async case they would have to be rewritten. We don't use libpq directly, we use the native pgsql module from php. Even with that, the kernel can wait for milliseconds, so why should we limit precision 1000x and reinvent milliseconds again in userspace?

--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 9 July 2013 18:43, Andres Freund and...@2ndquadrant.com wrote:

On 2013-07-05 21:28:59 +0400, ivan babrou wrote:

Hi, guys! I made a quick patch to support floating-point numbers in the connect_timeout param for libpq. The value is treated as seconds, so this is backwards-compatible. I don't usually write in C, so there may be mistakes. Could you review it and give me some feedback?

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 18fcb0c..58c1a35 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1452,7 +1452,7 @@ static int
 connectDBComplete(PGconn *conn)
 {
 	PostgresPollingStatusType flag = PGRES_POLLING_WRITING;
-	time_t		finish_time = ((time_t) -1);
+	struct timeval finish_time;

 	if (conn == NULL || conn->status == CONNECTION_BAD)
 		return 0;
@@ -1462,17 +1462,14 @@ connectDBComplete(PGconn *conn)
 	 */
 	if (conn->connect_timeout != NULL)
 	{
-		int			timeout = atoi(conn->connect_timeout);
+		int			timeout_usec = (int) (atof(conn->connect_timeout) * 1000000);

I'd rather not use a plain int for storing usecs. An overflow is rather unlikely, but still. Also, I'd rather use something like USECS_PER_SEC instead of a plain 1000000 in multiple places.

-		if (timeout > 0)
+		if (timeout_usec > 0)
 		{
-			/*
-			 * Rounding could cause connection to fail; need at least 2 secs
-			 */
-			if (timeout < 2)
-				timeout = 2;
-			/* calculate the finish time based on start + timeout */
-			finish_time = time(NULL) + timeout;
+			gettimeofday(&finish_time, NULL);
+			finish_time.tv_usec += (int) timeout_usec;

Accordingly adjust this. Looks like a sensible thing to me. *Independent* from this patch, you might want to look into server-side connection pooling using transaction mode. If that's applicable for your application it might reduce latency noticeably.

Greetings,

Andres Freund

--
Andres Freund	http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

I tried to make it more safe. Still not sure about the constants; I haven't found any good examples in libpq.

--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou

connect_timeout_in_ms.patch
Description: Binary data
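Andres's overflow concern above could be addressed at parse time by widening the type before the multiplication. This is a sketch under assumptions, not the patch itself; parse_timeout_usec and the USECS_PER_SEC definition are made-up names for illustration:

```c
#include <stdlib.h>
#include <stdint.h>

#define USECS_PER_SEC INT64_C(1000000)

/*
 * Parse a connect_timeout string ("1.5" means 1.5 seconds) into
 * microseconds. A 64-bit result means even huge second counts cannot
 * overflow, unlike a plain int.
 */
static int64_t
parse_timeout_usec(const char *s)
{
	double		seconds = atof(s);

	if (seconds <= 0)
		return 0;				/* 0 (or unparsable input) means "no timeout" */
	return (int64_t) (seconds * (double) USECS_PER_SEC);
}
```

Keeping the value in microseconds throughout also avoids scattering bare 1000000 literals through the code, which is what the USECS_PER_SEC suggestion is about.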
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 9 July 2013 11:05, Markus Wanner mar...@bluegap.ch wrote:

On 07/08/2013 08:31 PM, ivan babrou wrote:

Seriously, I don't get why running 150 poolers is easier.

Did you consider running pgbouncer on the database servers?

Regards

Markus Wanner

Database server lost network — boom, 2 seconds delay. What's the point then?

--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 9 July 2013 12:20, Markus Wanner mar...@bluegap.ch wrote:

On 07/09/2013 09:15 AM, ivan babrou wrote:

Database server lost network — boom, 2 seconds delay. What's the point then?

Oh, I see. Good point. It could still improve connection time during normal operation, though.

Connection time during normal operation is 1.5ms, which is fast enough for now.

Nonetheless, I now agree with you: we recommend a pooler, which may be capable of millisecond timeouts, but arguably is vastly more complex than the proposed patch. And it even brings its own set of gotchas (lots of connections). I guess I don't quite buy the complexity argument, yet.

A pooler isn't capable of millisecond timeouts. At least I don't see how I could learn that the pooler is dead within 50ms.

Sure, gettimeofday() is subject to clock adjustments. But so is time(). And if you're setting timeouts that low, you probably know what you're doing (or at least care about latency a lot). Or is gettimeofday() still considerably slower on certain architectures or in certain scenarios?

Where's the complexity? There's no complexity here :)

Regards

Markus Wanner

--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou
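The clock-adjustment concern raised above applies to both time() and gettimeofday(), since both read the wall clock. On platforms that provide it, clock_gettime(CLOCK_MONOTONIC) sidesteps the problem entirely. This is not what the patch does, only an illustrative sketch; the helper name elapsed_ms is made up:

```c
#include <time.h>

/*
 * Milliseconds elapsed between two monotonic timestamps. CLOCK_MONOTONIC
 * is immune to wall-clock adjustments (NTP steps, admin changes), unlike
 * time() or gettimeofday().
 */
static long
elapsed_ms(const struct timespec *start, const struct timespec *end)
{
	return (end->tv_sec - start->tv_sec) * 1000L
		+ (end->tv_nsec - start->tv_nsec) / 1000000L;
}
```

Timestamps would come from clock_gettime(CLOCK_MONOTONIC, &ts); the trade-off is that CLOCK_MONOTONIC is POSIX-only and was not universally available on the platforms libpq targeted at the time.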
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 9 July 2013 17:59, Markus Wanner mar...@bluegap.ch wrote:

Ian,

On 07/05/2013 07:28 PM, ivan babrou wrote:

-			/*
-			 * Rounding could cause connection to fail; need at least 2 secs
-			 */

You removed this above comment... please check why it's there. The relevant revision seems to be:

### commit 2908a838ac2cf8cdccaa115249f8399eef8a731e
Author: Tom Lane t...@sss.pgh.pa.us
Date: Thu Oct 24 23:35:55 2002 +

That's not correct, facb72007 is the relevant revision. It seems that it's only applicable for small timeouts given in whole seconds, and if you request a connect timeout of 1 ms you should be ready to fail. I may be wrong about this; Bruce Momjian introduced that change in 2002.

Code review for connection timeout patch. Avoid unportable assumption that tv_sec is signed; return a useful error message on timeout failure; honor PGCONNECT_TIMEOUT environment variable in PQsetdbLogin; make code obey documentation statement that timeout=0 means no timeout.
###

-			if (timeout < 2)
-				timeout = 2;
-			/* calculate the finish time based on start + timeout */
-			finish_time = time(NULL) + timeout;
+			gettimeofday(&finish_time, NULL);
+			finish_time.tv_usec += (int) timeout_usec;

I vaguely recall tv_usec only being required to hold values up to 1000000 by some standard. A signed 32 bit value would qualify, but only hold up to a good half hour worth of microseconds. That doesn't quite seem enough to calculate finish_time the way you are proposing to do it.

Agree, this should be fixed.

+			finish_time.tv_sec += finish_time.tv_usec / 1000000;
+			finish_time.tv_usec = finish_time.tv_usec % 1000000;
 		}
 	}
@@ -1073,15 +1074,15 @@ pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time)
 		input_fd.events |= POLLOUT;

 	/* Compute appropriate timeout interval */
-	if (end_time == ((time_t) -1))
+	if (end_time == NULL)
 		timeout_ms = -1;
 	else
 	{
-		time_t		now = time(NULL);
+		struct timeval now;
+		gettimeofday(&now, NULL);

-		if (end_time > now)
-			timeout_ms = (end_time - now) * 1000;
-		else
+		timeout_ms = (end_time->tv_sec - now.tv_sec) * 1000 + (end_time->tv_usec - now.tv_usec) / 1000;

I think that's incorrect on a platform where tv_sec and/or tv_usec is unsigned. (And the cited commit above indicates there are such platforms.)

I don't get it. timeout_ms is signed, and can hold unsigned - unsigned. Is it about anything else?

On 07/09/2013 02:25 PM, ivan babrou wrote:

There's no complexity here :)

Not so fast, cowboy... :-)

Regards

Markus Wanner

Is there anything else I should fix?

--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou
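The poll-timeout computation being reviewed can be checked in isolation. In this sketch (standalone code, not the patch; compute_timeout_ms is a made-up name) the operands are cast to a signed 64-bit type before subtraction, which addresses the unsigned tv_sec/tv_usec concern, and an already-expired deadline is clamped to zero:

```c
#include <stdint.h>
#include <sys/time.h>

/*
 * Remaining time until *end in milliseconds, never negative. The caller
 * handles "no deadline" separately (e.g. end == NULL maps to -1), so this
 * function only deals with real deadlines.
 */
static int
compute_timeout_ms(const struct timeval *end, const struct timeval *now)
{
	/* Cast first: on platforms with unsigned tv_sec, a bare subtraction
	 * of a later "now" from an earlier "end" would wrap around. */
	int64_t		ms = ((int64_t) end->tv_sec - (int64_t) now->tv_sec) * 1000
		+ ((int64_t) end->tv_usec - (int64_t) now->tv_usec) / 1000;

	if (ms < 0)
		ms = 0;					/* deadline passed: poll once, then report timeout */
	return (int) ms;
}
```

This mirrors the role of the removed `if (end_time > now)` guard in the original code: the guard existed precisely so the subtraction never went negative.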
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 5 July 2013 23:47, ivan babrou ibob...@gmail.com wrote:

On 5 July 2013 23:26, Tom Lane t...@sss.pgh.pa.us wrote:

ivan babrou ibob...@gmail.com writes:

If you can figure out that postgresql is overloaded then you may decide what to do faster. In our app we have a very strict limit for connect time to mysql, redis and other services, but postgresql has a minimum of 2 seconds. When processing time for a request is under 100ms on average, sub-second timeouts matter.

If you are issuing a fresh connection for each sub-100ms query, you're doing it wrong anyway ...

regards, tom lane

In php you cannot persist a connection between requests without worrying about transaction state. We don't use postgresql for every sub-100ms query because it can block the whole request for 2 seconds. Usually it takes 1.5ms to connect, btw.

Can you tell me why having the ability to specify a more accurate connect timeout is a bad idea?

Nobody has answered my question yet.

--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 8 July 2013 20:40, David E. Wheeler da...@justatheory.com wrote:

On Jul 8, 2013, at 7:44 AM, ivan babrou ibob...@gmail.com wrote:

Can you tell me why having the ability to specify a more accurate connect timeout is a bad idea? Nobody has answered my question yet.

From an earlier post by Tom:

What exactly is the use case for that? It seems like extra complication for something with little if any real-world usefulness.

So the answer is: extra complication.

Best,

David

I don't see any extra complication in a backwards-compatible patch that removes more lines than it adds. Can you tell me what exactly is extra complicated?

About pooling connections: we have 150 application servers and 10 postgresql servers. Each app connects to each server, so that's 150 connections per server if I run a pooler on each application server. That's more than the default setting, and right now we usually have no more than 10 connections per server. What would happen if we had 300 app servers? I thought connections consume some memory. Running a pooler somewhere other than on every app server gives no advantage: I can still hit a network black hole and a 2-second delay. Moreover, right now I can guess that postgresql is overloaded when it does not accept connections; with a pooler in front I could simply blow up the disks with heavy io.

Seriously, I don't get why running 150 poolers is easier. And my problem is still here: the server (the pooler in this case) goes down, and there is a 2-second delay. 2000% slower. Where am I wrong?

--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou
[HACKERS] Millisecond-precision connect_timeout for libpq
Hi, guys! I made a quick patch to support floating-point numbers in the connect_timeout param for libpq. The value is treated as seconds, so this is backwards-compatible. I don't usually write in C, so there may be mistakes. Could you review it and give me some feedback?

--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou

connect_timeout_in_ms.patch
Description: Binary data
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
If you can figure out that postgresql is overloaded then you may decide what to do faster. In our app we have a very strict limit for connect time to mysql, redis and other services, but postgresql has a minimum of 2 seconds. When processing time for a request is under 100ms on average, sub-second timeouts matter.

On 5 July 2013 22:20, Tom Lane t...@sss.pgh.pa.us wrote:

ivan babrou ibob...@gmail.com writes:

Hi, guys! I made a quick patch to support floating-point numbers in the connect_timeout param for libpq.

What exactly is the use case for that? It seems like extra complication for something with little if any real-world usefulness.

regards, tom lane

--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou
Re: [HACKERS] Millisecond-precision connect_timeout for libpq
On 5 July 2013 23:26, Tom Lane t...@sss.pgh.pa.us wrote:

ivan babrou ibob...@gmail.com writes:

If you can figure out that postgresql is overloaded then you may decide what to do faster. In our app we have a very strict limit for connect time to mysql, redis and other services, but postgresql has a minimum of 2 seconds. When processing time for a request is under 100ms on average, sub-second timeouts matter.

If you are issuing a fresh connection for each sub-100ms query, you're doing it wrong anyway ...

regards, tom lane

In php you cannot persist a connection between requests without worrying about transaction state. We don't use postgresql for every sub-100ms query because it can block the whole request for 2 seconds. Usually it takes 1.5ms to connect, btw.

Can you tell me why having the ability to specify a more accurate connect timeout is a bad idea?

--
Regards, Ian Babrou
http://bobrik.name http://twitter.com/ibobrik skype:i.babrou