Re: [GENERAL] Should PQconsumeInput/PQisBusy be expensive to use?

2010-10-28 Thread Michael Clark
Hello all.

Thanks a lot for the responses, they are appreciated.

I think I now understand the folly of my loop, and how that was negatively
impacting my test.

I tried the suggestion Alex and Tom made to change my loop with a select()
and my results are now very close to the non-async version.

The main reason for looking at this API is not to support async in our
applications, that is being achieved architecturally in a PG agnostic way.
 It is to give our PG agnostic layer the ability to cancel queries.
(Admittedly the queries I mention in these emails are not candidates for
cancelling...).

Again, thanks so much for the help.
Michael.


On Wed, Oct 27, 2010 at 6:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Michael Clark codingni...@gmail.com writes:
  In doing some experiments I found that using
  PQsendQueryParams/PQconsumeInput/PQisBusy/PQgetResult produces slower
  results than simply calling PQexecParams.

 Well, PQconsumeInput involves at least one extra kernel call (to see
 whether data is available) so I don't know why this surprises you.
 The value of those functions is if your application can do something
 else useful while it's waiting.  If the data comes back so fast that
 you can't afford any extra cycles expended on the client side, then
 you don't have any use for those functions.

 However, if you do have something useful to do, the problem with
 this example code is that it's not doing that, it's just spinning:

  while ( ((consume_result = PQconsumeInput(self.db)) == 1) &&
  ((is_busy_result = PQisBusy(self.db)) == 1) )
  ;

 That's a busy-wait loop, so it's no wonder you're eating cycles there.
 You want to sleep, or more likely do something else productive, when
 there is no data available from the underlying socket.  Usually the
 idea is to include libpq's socket in the set of files being watched
 by select() or poll(), and dispatch off to something that absorbs
 the data whenever you see some data is available to read.

regards, tom lane



Re: [GENERAL] Should PQconsumeInput/PQisBusy be expensive to use?

2010-10-28 Thread A.M.

On Oct 28, 2010, at 11:08 AM, Michael Clark wrote:

 Hello all.
 
 Thanks a lot for the responses, they are appreciated.
 
 I think I now understand the folly of my loop, and how that was negatively
 impacting my test.
 
 I tried the suggestion Alex and Tom made to change my loop with a select()
 and my results are now very close to the non-async version.
 
 The main reason for looking at this API is not to support async in our
 applications, that is being achieved architecturally in a PG agnostic way.
 It is to give our PG agnostic layer the ability to cancel queries.
 (Admittedly the queries I mention in these emails are not candidates for
 cancelling...).

Hm- I'm not sure how the async API will allow you to cancel queries. In 
PostgreSQL, query canceling is implemented by opening a second connection and 
passing specific data which is received from the first connection (effectively 
sending a cancel signal to the connection instead of a specific query). This 
implementation is necessitated by the fact that the PostgreSQL backend isn't 
asynchronous.

Even if you cancel the query, you still need to consume the socket input. Query 
cancellation is available for libpq both in sync and async modes.

Cheers,
M
-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: [GENERAL] Should PQconsumeInput/PQisBusy be expensive to use?

2010-10-28 Thread Michael Clark
On Thu, Oct 28, 2010 at 11:15 AM, A.M. age...@themactionfaction.com wrote:


 On Oct 28, 2010, at 11:08 AM, Michael Clark wrote:

  Hello all.
 
  Thanks a lot for the responses, they are appreciated.
 
  I think I now understand the folly of my loop, and how that was negatively
  impacting my test.
 
  I tried the suggestion Alex and Tom made to change my loop with a select()
  and my results are now very close to the non-async version.
 
  The main reason for looking at this API is not to support async in our
  applications, that is being achieved architecturally in a PG agnostic way.
  It is to give our PG agnostic layer the ability to cancel queries.
  (Admittedly the queries I mention in these emails are not candidates for
  cancelling...).

 Hm- I'm not sure how the async API will allow you to cancel queries. In
 PostgreSQL, query canceling is implemented by opening a second connection
 and passing specific data which is received from the first connection
 (effectively sending a cancel signal to the connection instead of a specific
 query). This implementation is necessitated by the fact that the PostgreSQL
 backend isn't asynchronous.

 Even if you cancel the query, you still need to consume the socket input.
 Query cancellation is available for libpq both in sync and async modes.


Oh.  I misunderstood that.

I guess I can have one thread performing the query using the non-async PG
calls, and then issue the cancellation from another thread, with both threads
accessing the same PGconn?

I am glad I added that extra bit of info in my reply, and that you caught
it!

Thank you!
Michael.


Re: [GENERAL] Should PQconsumeInput/PQisBusy be expensive to use?

2010-10-28 Thread Daniel Verite
A.M. wrote:

 In PostgreSQL, query canceling is implemented by opening a
 second connection and passing specific data which is received
 from the first connection

With libpq's PQcancel(), a second connection is not necessary.

Best regards,
-- 
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org



Re: [GENERAL] Should PQconsumeInput/PQisBusy be expensive to use?

2010-10-28 Thread Daniel Verite
Michael Clark wrote:

 I guess I can have one thread performing the query using the non async PG
 calls, then from another thread issue the cancellation.  Both threads
 accessing the same PGconn ?

Yes. See http://www.postgresql.org/docs/9.0/static/libpq-cancel.html

Best regards,
-- 
Daniel
PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org



Re: [GENERAL] Should PQconsumeInput/PQisBusy be expensive to use?

2010-10-28 Thread A.M.

On Oct 28, 2010, at 12:04 PM, Daniel Verite wrote:

   A.M. wrote:
 
 In PostgreSQL, query canceling is implemented by opening a
 second connection and passing specific data which is received
 from the first connection
 
 With libpq's PQcancel(), a second connection is not necessary.

To clarify, PQcancel() opens a new socket to the backend and sends the cancel 
message. (The server's socket address is passed as part of the cancel structure 
to PQcancel.)

http://git.postgresql.org/gitweb?p=postgresql.git;a=blob;f=src/interfaces/libpq/fe-connect.c;h=8f318a1a8cc5bf2d49b2605dd76581609cf9be32;hb=HEAD#l2964

The point is that a query can be cancelled from anywhere really and 
cancellation will not use the original connection socket.

Cheers,
M
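
For illustration, the cancel message itself is tiny. The sketch below builds
the 16-byte CancelRequest packet as documented for frontend/backend protocol
version 3.0: a length word, the cancel request code, the backend process ID,
and the secret key, all as 32-bit big-endian integers. The function and macro
names are hypothetical and are not part of libpq. Application code should go
through PQgetCancel(), PQcancel() and PQfreeCancel() rather than build the
packet by hand; this only shows what travels over the new socket.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only (not libpq API). */
#define CANCEL_REQUEST_LEN  16
#define CANCEL_REQUEST_CODE 80877102u   /* (1234 << 16) | 5678 */

/* Store a 32-bit value in network (big-endian) byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/*
 * Fill out[] with a protocol 3.0 CancelRequest for the backend identified
 * by backend_pid and secret_key (both are handed to the client at
 * connection start). Returns the number of bytes written.
 */
size_t build_cancel_request(uint8_t out[CANCEL_REQUEST_LEN],
                            uint32_t backend_pid,
                            uint32_t secret_key)
{
    put_be32(out,      CANCEL_REQUEST_LEN);   /* message length, incl. itself */
    put_be32(out + 4,  CANCEL_REQUEST_CODE);  /* marks this as a cancel request */
    put_be32(out + 8,  backend_pid);          /* which backend to signal */
    put_be32(out + 12, secret_key);           /* proves we own that session */
    return CANCEL_REQUEST_LEN;
}
```

Because the packet names the target backend by process ID and key, it can be
sent from any thread or process that holds those two values, which is why the
original connection's socket is not involved.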


[GENERAL] Should PQconsumeInput/PQisBusy be expensive to use?

2010-10-27 Thread Michael Clark
Hello everyone.

I have been investigating the PG async calls and trying to determine whether
I should go down the road of using them.

In doing some experiments I found that using
PQsendQueryParams/PQconsumeInput/PQisBusy/PQgetResult produces slower
results than simply calling PQexecParams.
Upon some investigation I found that not calling PQconsumeInput/PQisBusy
produces results in line with PQexecParams (which PQexecParams seems to be
doing under the hood).

I profiled my test and found this calling stack:
(This is OS X 10.6)

lo_unix_scall
   recvfrom$UNIX2003
  recv$UNIX2003
 pqsecure_read
pqReadData
   PQconsumeInput
  .


This showed up as the hottest part of the execution by far.  This was a
pretty simple test of fetching 6000+ rows.

If I remove the PQconsumeInput/PQisBusy calls, which essentially makes the
code blocking, this hot spot goes away.

Fetching 1000 rows goes from .5 seconds to 3 seconds when I have the
PQconsumeInput/PQisBusy calls in.


I was wondering if maybe I am doing something wrong, or if there is a
technique that might help reduce this penalty?

Thanks in advance for any suggestions,
Michael.

P.S. here is a code snippet of what I am doing basically:
(please keep in mind this is just test code and rather simplistic...)

int send_result = PQsendQueryParams(self.db,
                                    [sql UTF8String],
                                    i,
                                    NULL,
                                    (const char *const *)vals,
                                    (const int *)lens,
                                    (const int *)formats,
                                    kTextResultFormat);
int consume_result = 0;
int is_busy_result = 0;

while ( ((consume_result = PQconsumeInput(self.db)) == 1) &&
        ((is_busy_result = PQisBusy(self.db)) == 1) )
    ;

if (consume_result != 1)
    NSLog(@"Got an error in PQconsumeInput");

PGresult* res = PQgetResult(self.db);
while (PQgetResult(self.db) != NULL)
    NSLog(@"Oops, seems we got an extra response?");


Re: [GENERAL] Should PQconsumeInput/PQisBusy be expensive to use?

2010-10-27 Thread Alex Hunsaker
On Wed, Oct 27, 2010 at 15:02, Michael Clark codingni...@gmail.com wrote:
 Hello everyone.
 Upon some investigation I found that not calling PQconsumeInput/PQisBusy
 produces results in line with PQexecParams (which PQexecParams seems to be
 doing under the hood).

 (please keep in mind this is just test code and rather simplistic...)
     int send_result = PQsendQueryParams(self.db,
                                         [sql UTF8String],
                                         i,
                                         NULL,
                                         (const char *const *)vals,
                                         (const int *)lens,
                                         (const int *)formats,
                                         kTextResultFormat);
     int consume_result = 0;
     int is_busy_result = 0;

     while ( ((consume_result = PQconsumeInput(self.db)) == 1) &&
 ((is_busy_result = PQisBusy(self.db)) == 1) )
         ;

You really want to select() or equivalent here...  This is basically a
busy loop using 100% CPU; neither PQconsumeInput nor PQisBusy does any
kind of sleeping...

Something like:
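fd_set read_mask;
int sock = PQsocket(self.db);

while (1)
{
  struct timeval tv = {5, 0};

  /* select() modifies the set, so rebuild it on every pass */
  FD_ZERO(&read_mask);
  FD_SET(sock, &read_mask);
  select(sock + 1, &read_mask, NULL, NULL, &tv);
  if (!PQconsumeInput(self.db))
    break;  /* connection trouble; see PQerrorMessage() */
  if (!PQisBusy(self.db))
    break;
}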

or something...



Re: [GENERAL] Should PQconsumeInput/PQisBusy be expensive to use?

2010-10-27 Thread David Wilson
On Wed, Oct 27, 2010 at 5:02 PM, Michael Clark codingni...@gmail.com wrote:


 while ( ((consume_result = PQconsumeInput(self.db)) == 1) &&
 ((is_busy_result = PQisBusy(self.db)) == 1) )
 ;


 The problem with this code is that it's effectively useless as a test.
You're just spinning in a loop; if you don't have anything else to be doing
while waiting for responses, then this sort of calling pattern is always
going to be worse than just blocking.

Only do async if you actually have an async problem, and only do a
performance test on it if you're actually doing a real async test, otherwise
the results are fairly useless.

-- 
- David T. Wilson
david.t.wil...@gmail.com


Re: [GENERAL] Should PQconsumeInput/PQisBusy be expensive to use?

2010-10-27 Thread Tom Lane
Michael Clark codingni...@gmail.com writes:
 In doing some experiments I found that using
 PQsendQueryParams/PQconsumeInput/PQisBusy/PQgetResult produces slower
 results than simply calling PQexecParams.

Well, PQconsumeInput involves at least one extra kernel call (to see
whether data is available) so I don't know why this surprises you.
The value of those functions is if your application can do something
else useful while it's waiting.  If the data comes back so fast that
you can't afford any extra cycles expended on the client side, then
you don't have any use for those functions.

However, if you do have something useful to do, the problem with
this example code is that it's not doing that, it's just spinning:

 while ( ((consume_result = PQconsumeInput(self.db)) == 1) &&
 ((is_busy_result = PQisBusy(self.db)) == 1) )
 ;

That's a busy-wait loop, so it's no wonder you're eating cycles there.
You want to sleep, or more likely do something else productive, when
there is no data available from the underlying socket.  Usually the
idea is to include libpq's socket in the set of files being watched
by select() or poll(), and dispatch off to something that absorbs
the data whenever you see some data is available to read.

regards, tom lane
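
A minimal sketch of the wait step described above, using poll() instead of a
spin loop. The helper name wait_readable and its return convention are made up
for illustration; it knows nothing about libpq and only blocks until a
descriptor becomes readable. With libpq, the descriptor would presumably come
from PQsocket(), and each wakeup would be followed by PQconsumeInput() and
PQisBusy(), as outlined in the comment at the end.

```c
#include <errno.h>
#include <poll.h>

/*
 * Block until fd has data to read or timeout_ms elapses (-1 waits forever).
 * Returns 1 if the descriptor is ready, 0 on timeout, -1 on error.
 * Hypothetical helper, not part of libpq.
 */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Retry if a signal interrupts the wait. */
    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;

    /* Readable, or hangup/error pending: the next read will report which. */
    return 1;
}

/*
 * Intended use with libpq (not compiled here), assuming conn already has a
 * query in flight via PQsendQueryParams():
 *
 *     while (PQisBusy(conn)) {
 *         if (wait_readable(PQsocket(conn), 5000) < 0)
 *             break;
 *         if (!PQconsumeInput(conn))
 *             break;
 *     }
 */
```

In a real application the socket would normally be one entry among several in
the poll() set, so the process sleeps in the kernel until any of its sources
has work, rather than calling recv() repeatedly on a socket with nothing to
read.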
