Re: [HACKERS] isolation check takes a long time

2012-07-24 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On Fri, 2012-07-20 at 13:15 -0400, Andrew Dunstan wrote:
 Meanwhile, I would like to remove the prepared_transactions test from 
 the main isolation schedule, and add a new Make target which runs that 
 test explicitly. Is there any objection to that?

 Why was this backpatched to 9.1?  That doesn't seem appropriate.

AIUI the point was to cut the load on buildfarm machines, so it won't
help (as much) if it's only applied to a subset of the branches that
have this test.  In any case, I don't follow your objection.  Removing
regression tests from stable branches is probably safer than removing
them from HEAD.

regards, tom lane



[HACKERS] Re: Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation)

2012-07-24 Thread Greg Stark
On Wed, Jul 18, 2012 at 1:13 AM, Craig Ringer ring...@ringerc.id.au wrote:

 That makes me wonder if on top of the buildfarm, extending some buildfarm
 machines into a crashfarm is needed:

 - Keep kvm instances with copy-on-write snapshot disks and the build env
 on them
 - Fire up the VM, do a build, and start the server
 - From outside the vm have the test controller connect to the server and
 start a test run
 - Hard-kill the OS instance at a random point in time.


For what it's worth, you don't need to do a hard kill of the VM and start
over repeatedly to kill at different times. You could take a snapshot of
the disk storage and keep running, and you could take many snapshots from a
single run. Each snapshot would represent the storage that would exist if
the machine had crashed at the point in time when the snapshot was taken.

You do want the snapshots to be taken by something outside the virtual
machine: either the kvm storage layer or LVM on the host, but not
LVM inside the guest virtual machine.

And yes, the hard part that always stopped me from looking at this was
having any way to test the correctness of the data.

-- 
greg


Re: [HACKERS] isolation check takes a long time

2012-07-24 Thread Alvaro Herrera

Excerpts from Noah Misch's message of Sun Jul 22 17:11:53 -0400 2012:

 I was pondering something like this:
 
 setting "i-rc" isolation = "READ COMMITTED"
 setting "i-rr" isolation = "REPEATABLE READ"
 
 session "s1"
 setup  { BEGIN TRANSACTION ISOLATION LEVEL :isolation; }
 step "foo"  { SELECT 1; }
 
 permutation "i-rc" "foo"
 permutation "i-rr" "foo"
 
 That is, introduce psql-style variable substitutions in per-session setup,
 step and teardown directives.  Introduce the setting directive to
 declare possible values for each variable.  Each permutation may name settings
 as well as steps.  Order within the permutation would not matter; we could
 allow them anywhere in the list or only at the beginning.  When the tester
 generates permutations, it would include all variable setting combinations.

The idea of using psql-style variables seems good to me.

Offhand, having the setting names appear in permutations just like step
names seems a bit weird to me, though I admit that I don't see any other
way that's not overly verbose.

I would expect that if no permutations are specified, all possible
values for a certain setting would be generated.  That way it'd be easy
to define tests that run through all possible permutations of two (or
more) sequences of commands on all isolation levels, without having to
specify them all by hand.  With that in mind, having each possible value
for a setting be declared independently might be a bit troublesome.

Something like 

 setting "isolevel" isolation = { "READ COMMITTED", "REPEATABLE READ" }
 
 session "s1"
 setup  { BEGIN TRANSACTION ISOLATION LEVEL :isolation; }
 step "foo"  { SELECT 1; }
 
 permutation "isolevel" "foo"

Maybe we can have a single name for both the setting specification and
variable name, instead of having to invent two names; in that case, it'd
reduce to

 setting "isolation" = { "READ COMMITTED", "REPEATABLE READ" }
 
 session "s1"
 setup  { BEGIN TRANSACTION ISOLATION LEVEL :isolation; }
 step "foo"  { SELECT 1; }
 
 permutation "isolation" "foo"

Thoughts?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Merlin Moncure
On Mon, Jul 23, 2012 at 5:07 PM, Marko Kreen mark...@gmail.com wrote:
 Does PQgetRowData() break the ability to call PQgetvalue() against the
 result as well as other functions like PQgetisnull()?  If so, I
 object.

 I don't get what you are objecting to.  PQgetRowData()
 is called instead of PQgetResult(); it returns a zero-row PGresult
 for the row header plus a list of PGdataValues that point to the actual
 data.  You can call the value functions and they don't crash, but as
 the result has zero rows (the data is not copied into it)
 they don't do anything useful either.

 The whole point of the API is to avoid the unnecessary copying.

 You can mix the calls to PQgetRowData() and PQgetResult()
 during one resultset, it works fine although does not seem
 that useful.

 I guess you fear that some code could unexpectedly see
 the new behavior, and that's exactly why the row-callback API
 needs to be scrapped: it allowed such a possibility.

 But with PQsetSingleRowMode and PQgetRowData, the
 new behavior is seen only by new code that clearly
 expects it, so I don't see what the problem is.

Well, for one, it breaks libpqtypes (see line 171@
http://libpqtypes.esilo.com/browse_source.html?file=exec.c), or at
least makes it unable to use the row-processing mode.   libpqtypes
wraps the libpq getter functions and exposes a richer api using only
the result object.  When writing libpqtypes we went to great
pains to keep access to server-side data through the existing result
API.  Note, I'm not arguing that compatibility with libpqtypes is a
desired objective, but the wrapping model it uses is one that other
code is reasonably likely to adopt -- the awkwardness there
should be a red flag of sorts.

I'm arguing that *all* data access must continue to go through the
result object, and that bypassing the result to get at data breaks the
result abstraction in the libpq API.  I bet you can still maintain
data access through the result object while avoiding extra copies.  For
example, maybe PQsetSingleRowMode should return a result object?

merlin



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 I'm arguing that *all* data access must continue to go through the
 result object, and that bypassing the result to get at data breaks the
 result abstraction in the libpq API.

That's a fair point, but the single-row mode without PQgetRowData still
fits that model, doesn't it?  From the point of view of libpqtypes it
just looks like you got a lot of one-row query results.
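
A minimal sketch of that consumption pattern, assuming the
PQsetSingleRowMode() interface under discussion together with the
PGRES_SINGLE_TUPLE result status it eventually shipped with in 9.2
(the query and table name are placeholders):

#include <stdio.h>
#include <libpq-fe.h>

/* Sketch: read a result set in single-row mode.  Each row arrives as
 * an ordinary one-row PGresult, so wrappers built on PQgetvalue() and
 * PQgetisnull() keep working unchanged. */
static void dump_rows(PGconn *conn)
{
	PGresult *res;

	if (!PQsendQuery(conn, "SELECT id, name FROM big_table"))
		return;
	/* must be requested right after PQsendQuery, before consuming input */
	if (!PQsetSingleRowMode(conn))
		return;

	while ((res = PQgetResult(conn)) != NULL)
	{
		if (PQresultStatus(res) == PGRES_SINGLE_TUPLE)
			printf("%s: %s\n", PQgetvalue(res, 0, 0), PQgetvalue(res, 0, 1));
		else if (PQresultStatus(res) != PGRES_TUPLES_OK)
			fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
		PQclear(res);		/* each row is an independent PGresult */
	}
}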

regards, tom lane



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Merlin Moncure
On Tue, Jul 24, 2012 at 10:49 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 I'm arguing that *all* data access must continue to go through the
 result object, and that bypassing the result to get at data breaks the
 result abstraction in the libpq API.

 That's a fair point, but the single-row mode without PQgetRowData still
 fits that model, doesn't it?  From the point of view of libpqtypes it
 just looks like you got a lot of one-row query results.

Sure: Marko's exec_query_single_row example looks like 100% reasonable
libpq code.  That said, I'd still spend a few cycles to think this
through and make sure we aren't walling ourselves off from 'copy free'
behavior, even if that's reserved for a future improvement.  In
particular, I'd like to explore if PQsetSingleRowMode() should be
returning a result.

merlin



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Tue, Jul 24, 2012 at 6:18 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Mon, Jul 23, 2012 at 5:07 PM, Marko Kreen mark...@gmail.com wrote:
 Does PQgetRowData() break the ability to call PQgetvalue() against the
 result as well as other functions like PQgetisnull()?  If so, I
 object.

 I don't get what you are objecting to.  PQgetRowData()
 is called instead of PQgetResult(); it returns a zero-row PGresult
 for the row header plus a list of PGdataValues that point to the actual
 data.  You can call the value functions and they don't crash, but as
 the result has zero rows (the data is not copied into it)
 they don't do anything useful either.

 The whole point of the API is to avoid the unnecessary copying.

 You can mix the calls to PQgetRowData() and PQgetResult()
 during one resultset, it works fine although does not seem
 that useful.

 I guess you fear that some code could unexpectedly see
 the new behavior, and that's exactly why the row-callback API
 needs to be scrapped: it allowed such a possibility.

 But with PQsetSingleRowMode and PQgetRowData, the
 new behavior is seen only by new code that clearly
 expects it, so I don't see what the problem is.

 Well, for one, it breaks libpqtypes (see line 171@
 http://libpqtypes.esilo.com/browse_source.html?file=exec.c), or at
 least makes it unable to use the row-processing mode.   libpqtypes
 wraps the libpq getter functions and exposes a richer api using only
 the result object.  When writing libpqtypes we went to great
 pains to keep access to server-side data through the existing result
 API.  Note, I'm not arguing that compatibility with libpqtypes is a
 desired objective, but the wrapping model it uses is one that other
 code is reasonably likely to adopt -- the awkwardness there
 should be a red flag of sorts.

 I'm arguing that *all* data access must continue to go through the
 result object, and that bypassing the result to get at data breaks the
 result abstraction in the libpq API.  I bet you can still maintain
 data access through the result object while avoiding extra copies.

Well, the main problem this week is whether we should
apply PQsetSingleRowMode() from the single-row-mode1
branch or the single-row-mode2 branch.

PQgetRowData() is unimportant, as it just exposes
the rowBuf to the user, and that's all.

Do you have an opinion about that?

  For example, maybe PQsetSingleRowMode should return a result object?

What do you mean by that?  And have you thought about both
sync and async usage patterns?

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Merlin Moncure
On Tue, Jul 24, 2012 at 11:08 AM, Marko Kreen mark...@gmail.com wrote:
 I'm arguing that *all* data access must continue to go through the
 result object, and that bypassing the result to get at data breaks the
 result abstraction in the libpq API.  I bet you can still maintain
 data access through the result object while avoiding extra copies.

 Well, the main problem this week is whether we should
 apply PQsetSingleRowMode() from the single-row-mode1
 branch or the single-row-mode2 branch.

 PQgetRowData() is unimportant, as it just exposes
 the rowBuf to the user, and that's all.

Right: branch 1 (containing PQgetRowData) seems wrong to me, so, if
given that choice, I'd argue for branch 2, forcing a PGresult pull on
each row.  However, what you were gunning for via branch 1, which is
extra performance via removing the extra allocs, is important and
useful; hopefully we can get the best of both worlds, or punt and
settle on branch 2.

  For example, maybe PQsetSingleRowMode should return a result object?

 What do you mean by that?  And have you thought about both
 sync and async usage patterns?

No, I haven't -- at least not very well.  The basic idea is that
PQsetSingleRowMode turns into PQgetSingleRowResult() and returns a
result object.  For row-by-row processing an extra API call (maybe
PQgetNextRow(PGconn, PGresult)) does the behind-the-scenes work
under the existing result object.  This is the same general structure
you have in branch 2, but the result allocation is moved out of the
loop.  Presumably sync and async would then follow the same pattern,
but we'd still have to be able to guarantee non-blocking behavior in
the async api.
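
To make the shape of that proposal concrete, a sketch of the loop being
described; PQgetSingleRowResult() and PQgetNextRow() are hypothetical
names from this message only, not functions libpq provides, and conn and
handle_row stand in for application code:

/* Hypothetical API sketch -- PQgetSingleRowResult() and PQgetNextRow()
 * are proposed names, not real libpq calls.  One PGresult is allocated
 * up front and refilled for each row, moving the per-row allocation
 * out of the loop. */
PGresult *res;

if (!PQsendQuery(conn, "SELECT ..."))
	return;				/* handle error */

res = PQgetSingleRowResult(conn);	/* template result: metadata only */
while (PQgetNextRow(conn, res) > 0)	/* refills res with the next row */
	handle_row(res);		/* read it via PQgetvalue() as usual */
PQclear(res);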

merlin



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Tue, Jul 24, 2012 at 7:08 PM, Marko Kreen mark...@gmail.com wrote:
 PQgetRowData() is unimportant, as it just exposes
 the rowBuf to the user, and that's all.

So to get back to the more interesting PQgetRowData() discussion...

Did you notice that it's up to the app to decide whether it calls
PQgetResult() or PQgetRowData() after PQsetSingleRowMode()?
So there is no way for it to get into conflict with anything.
If libpqtypes keeps using PQgetResult(), it keeps getting
a PGresult.  No problem.

PQgetRowData() is meant for use-cases where PGresult is *not* the
main data structure the app operates on.  If the app does a dumb
copy out of the PGresult as soon as possible, it can move to PQgetRowData().

If the app wants to do complex operations with the PGresult, or just
store it, then it can keep doing that.  Nobody forces the use
of PQgetRowData().


Now, about the idea of a more optimized PGresult: one way of doing
it would be to create a zero-copy PGresult that points into the network
buffer, like PQgetRowData() does.  But this breaks all expectations
about PGresult lifetime rules, so it seems like a bad idea.

The second way is to optimize the copying step: basically just
a malloc and 2 or 3 memcpys, for the struct, the headers,
and the data.  This leaves a standalone PGresult around that
behaves as expected.  But for apps that don't care about the
PGresult it is still an unnecessary copy.  So even if we
optimize PGresult that way, it still seems worthwhile
to have PQgetRowData() around.


Hm, I think it's possible to rig the test to do a dummy
copy of the PGresult, so we can see what kind
of speed is possible.  Will try.

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Tue, Jul 24, 2012 at 7:25 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Jul 24, 2012 at 11:08 AM, Marko Kreen mark...@gmail.com wrote:
 I'm arguing that *all* data access must continue to go through the
 result object, and that bypassing the result to get at data breaks the
 result abstraction in the libpq API.  I bet you can still maintain
 data access through the result object while avoiding extra copies.

 Well, the main problem this week is whether we should
 apply PQsetSingleRowMode() from the single-row-mode1
 branch or the single-row-mode2 branch.

 PQgetRowData() is unimportant, as it just exposes
 the rowBuf to the user, and that's all.

 Right: branch 1 (containing PQgetRowData) seems wrong to me.

Indeed, you are missing the fact that PQgetResult() works the same
in both branches.

And please see the benchmark I posted.

Even better, run it yourself...

 What do you mean by that?  And have you thought about both
 sync and async usage patterns?

 No, I haven't -- at least not very well.  The basic idea is that
 PQsetSingleRowMode turns into PQgetSingleRowResult() and returns a
 result object.  For row-by-row processing an extra API call (maybe
 PQgetNextRow(PGconn, PGresult)) does the behind-the-scenes work
 under the existing result object.  This is the same general structure
 you have in branch 2, but the result allocation is moved out of the
 loop.  Presumably sync and async would then follow the same pattern,
 but we'd still have to be able to guarantee non-blocking behavior in
 the async api.

Well, as discussed previously, it's worthwhile to keep standard functions
like PQisBusy() and PQgetResult() working sensibly in the new mode,
and currently they do so.

If you add new functions, you also need to define the behavior
when new and old ones are mixed, and it gets messy quickly.

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Merlin Moncure
On Tue, Jul 24, 2012 at 11:39 AM, Marko Kreen mark...@gmail.com wrote:
 On Tue, Jul 24, 2012 at 7:25 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Jul 24, 2012 at 11:08 AM, Marko Kreen mark...@gmail.com wrote:
 I'm arguing that *all* data access must continue to go through the
 result object, and that bypassing the result to get at data breaks the
 result abstraction in the libpq API.  I bet you can still maintain
 data access through the result object while avoiding extra copies.

 Well, the main problem this week is whether we should
 apply PQsetSingleRowMode() from the single-row-mode1
 branch or the single-row-mode2 branch.

 PQgetRowData() is unimportant, as it just exposes
 the rowBuf to the user, and that's all.

 Right: branch 1 (containing PQgetRowData) seems wrong to me.

 Indeed, you are missing the fact that PQgetResult() works the same
 in both branches.

 And please see the benchmark I posted.

 Even better, run it yourself...

 What do you mean by that?  And have you thought about both
 sync and async usage patterns?

 No, I haven't -- at least not very well.  The basic idea is that
 PQsetSingleRowMode turns into PQgetSingleRowResult() and returns a
 result object.  For row-by-row processing an extra API call (maybe
 PQgetNextRow(PGconn, PGresult)) does the behind-the-scenes work
 under the existing result object.  This is the same general structure
 you have in branch 2, but the result allocation is moved out of the
 loop.  Presumably sync and async would then follow the same pattern,
 but we'd still have to be able to guarantee non-blocking behavior in
 the async api.

 Well, as discussed previously, it's worthwhile to keep standard functions
 like PQisBusy() and PQgetResult() working sensibly in the new mode,
 and currently they do so.

 If you add new functions, you also need to define the behavior
 when new and old ones are mixed, and it gets messy quickly.

Right, I'm aware that you can use the 'slow' method in branch 1.  I also
saw the benchmarks and agree that single-row mode should be at least
competitive with current methods for pretty much all cases.

But the faster rowbuf method is a generally incompatible way of
dealing with data vs. current libpq -- this is bad.  If it's truly
impossible to get those benefits without bypassing the result API,
then I withdraw my objection on the grounds that it's optional behavior (my
gut tells me it is possible, though).

I think the dummy copy of PGresult is plausible (if by that you mean
optimizing PQgetResult when in single row mode).  That would be even
better: you'd remove the need for the rowbuf mode.

merlin



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Tue, Jul 24, 2012 at 7:52 PM, Merlin Moncure mmonc...@gmail.com wrote:
 But the faster rowbuf method is a generally incompatible way of
 dealing with data vs. current libpq -- this is bad.  If it's truly
 impossible to get those benefits without bypassing the result API,
 then I withdraw my objection on the grounds that it's optional behavior (my
 gut tells me it is possible, though).

Um, please clarify what you are talking about here.

What is the incompatibility of the PGresult from branch 1?

-- 
marko



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 I think the dummy copy of PGresult is plausible (if by that you mean
 optimizing PQgetResult when in single row mode).  That would be even
 better: you'd remove the need for the rowbuf mode.

I haven't spent any time looking at this, but my gut tells me that a big
chunk of the expense is copying the PGresult's metadata (the column
names, types, etc).  It has to be possible to make that cheaper.

One idea is to rearrange the internal storage so that that part reduces
to one memcpy().  Another thought is to allow PGresults to share
metadata by treating the metadata as a separate reference-counted
object.  The second could be a bit hazardous though, as we advertise
that PGresults are independent objects that can be manipulated by
separate threads.  I don't want to introduce mutexes into PGresults,
but I'm not sure reference-counted metadata can be safe without them.
So maybe the memcpy idea is the only workable one.
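
A rough illustration of the first idea, assuming (this is not libpq's
actual internal layout) that all per-column metadata, including the name
strings, were packed into a single self-describing allocation that uses
offsets rather than pointers:

#include <stdlib.h>
#include <string.h>

/* Illustration only: with the metadata in one contiguous block,
 * cloning it for each per-row PGresult reduces to a malloc() plus
 * a single memcpy(). */
typedef struct MetaBlock
{
	size_t	total_len;	/* size of the whole allocation */
	int	nattrs;		/* attr descriptors and name strings follow inline */
} MetaBlock;

static MetaBlock *clone_meta(const MetaBlock *src)
{
	MetaBlock *dst = malloc(src->total_len);

	if (dst)
		memcpy(dst, src, src->total_len);
	return dst;		/* internal offsets stay valid in the copy */
}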

regards, tom lane



Re: [HACKERS] Checkpointer split has broken things dramatically (was Re: DELETE vs TRUNCATE explanation)

2012-07-24 Thread karavelov



- Quote from David Fetter (da...@fetter.org), on 23.07.2012 at
15:41 -  I'm not sure how you automate testing a pull-the-plug
scenario.


I have a dim memory of how the FreeBSD project was alleged to have
done it, namely by rigging a serial port (yes, it was that long ago)
to the power supply of another machine and randomly cycling the power.   


These days most server-class hardware can be power-cycled with
an IPMI command.


Best regards 


--
Luben Karavelov

Re: [HACKERS] Use of rsync for data directory copying

2012-07-24 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote:
 
 if a write happens in both the first and second half of a second,
 
While I'm not sure whether I believe that granularity is really to
the nanosecond, a stat of a table in a production database on xfs
shows this:
 
Modify: 2012-07-24 10:15:44.096415501 -0500
 
So presumably both writes would need to happen within a much smaller
time frame than a second.
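
For anyone who wants to check what their own filesystem stores, a small
standalone program using the nanosecond-resolution st_mtim field that
POSIX.1-2008 added to struct stat (older systems may expose only
whole-second st_mtime):

#include <stdio.h>
#include <sys/stat.h>

/* Print a file's modification time with nanosecond precision. */
int main(int argc, char **argv)
{
	struct stat st;

	if (argc < 2)
	{
		fprintf(stderr, "usage: %s file\n", argv[0]);
		return 1;
	}
	if (stat(argv[1], &st) != 0)
	{
		perror("stat");
		return 1;
	}
	printf("mtime: %lld.%09ld\n",
	       (long long) st.st_mtim.tv_sec, st.st_mtim.tv_nsec);
	return 0;
}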
 
-Kevin



[HACKERS] Recent absence

2012-07-24 Thread Kevin Grittner
I have recently been laid up with an MRSA infection, which left me
suddenly unable to deal with even reading my email.  (I took a shot
at catching up a week ago, and it was too taxing, resulting in a
relapse.)  I'm trying to fight through the backlog and get caught up,
without pushing it too hard too soon again.  I hope to be back to
normal and caught up in about a week.  If there's something you know
of which needs more immediate attention from me, please send me an
email off-list.
 
Apologies for falling down on CF duties, and thanks to everyone who
helped pick up the slack.
 
You'll probably notice me responding to some slightly stale posts as
I catch up.
 
-Kevin



Re: [HACKERS] Recent absence

2012-07-24 Thread Joshua D. Drake


On 07/24/2012 10:26 AM, Kevin Grittner wrote:


I have recently been laid up with an MRSA infection, which left me
suddenly unable to deal with even reading my email.  (I took a shot
at catching up a week ago, and it was too taxing, resulting in a
relapse.)  I'm trying to fight through the backlog and get caught up,
without pushing it too hard too soon again.  I hope to be back to
normal and caught up in about a week.  If there's something you know
of which needs more immediate attention from me, please send me an
email off-list.

Apologies for falling down on CF duties, and thanks to everyone who
helped pick up the slack.

You'll probably notice me responding to some slightly stale posts as
I catch up.


Don't push yourself, take care of yourself. The community can wait. 
Better to have you at 95% than 30%.


Sincerely,

jD





--
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579



Re: [HACKERS] Recent absence

2012-07-24 Thread Atri Sharma
On Tue, Jul 24, 2012 at 11:05 PM, Joshua D. Drake j...@commandprompt.com 
wrote:

 On 07/24/2012 10:26 AM, Kevin Grittner wrote:


 I have recently been laid up with an MRSA infection, which left me
 suddenly unable to deal with even reading my email.  (I took a shot
 at catching up a week ago, and it was too taxing, resulting in a
 relapse.)  I'm trying to fight through the backlog and get caught up,
 without pushing it too hard too soon again.  I hope to be back to
 normal and caught up in about a week.  If there's something you know
 of which needs more immediate attention from me, please send me an
 email off-list.

 Apologies for falling down on CF duties, and thanks to everyone who
 helped pick up the slack.

 You'll probably notice me responding to some slightly stale posts as
 I catch up.


 Don't push yourself, take care of yourself. The community can wait. Better
 to have you at 95% than 30%.

 Sincerely,

 jD





 --
 Command Prompt, Inc. - http://www.commandprompt.com/
 PostgreSQL Support, Training, Professional Services and Development
 High Availability, Oracle Conversion, Postgres-XC
 @cmdpromptinc - 509-416-6579



Please take care, Kevin, and wishing you a speedy recovery.

Atri

-- 
Regards,

Atri
l'apprenant



[HACKERS] canceling autovacuum task woes

2012-07-24 Thread Robert Haas
I am running into a lot of customer situations where the customer
reports that "canceling autovacuum task" shows up in the logs, and
it's unclear whether this is happening often enough to matter, and
even more unclear what's causing it.

Me: So, do you know what table it's getting cancelled on?
Customer: Nope.
Me: Are you running any DDL commands anywhere in the cluster?
Customer: Nope, absolutely none.
Me: Well you've got to be running something somewhere or it wouldn't
be having a lock conflict.
Customer: OK, well I don't know of any.  What should I do?

It would be awfully nice if the process that does the cancelling would
provide the same kind of reporting that we do for a deadlock: the
relevant lock tag, the PID of the process sending the cancel, and the
query string.

Personally, I'm starting to have a sneaky suspicion that there is
something actually broken here - that is, that there are lock
conflicts involved here other than the obvious one (SHARE UPDATE
EXCLUSIVE on the table) that are allowing autovac to get cancelled
more often than we realize.  But whether that's true or not, the
current logging is wholly inadequate.

Thoughts?  Anybody else have this problem?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] 9.2 release schedule

2012-07-24 Thread Fujii Masao
On Tue, Jul 24, 2012 at 4:10 AM, Noah Misch n...@leadboat.com wrote:
 On Mon, Jul 23, 2012 at 10:29:06AM -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  Seems OK, but I think we need to work a little harder on evicting some
  things from the list of open items.  I don't think all of the things
  listed in the blockers section really are, and I'm not sure what needs
  to be done about some of the things that are there.

 I've got the libpq row processor thing.  That and the CHECK NO INHERIT
 syntax thing are definitely release-blockers, because we won't be able
 to change such decisions post-release (well, we could, but the pain to
 benefit ratio is bad).  I guess the SPGiST vs HS issue is a blocker too.
 A lot of the rest look like pre-existing bugs to me.

 The only preexisting issues listed under Blockers for 9.2 are "GiST indexes
 vs fuzzy comparisons used by geometric types" and "Should we fix tuple limit
 handling, or redefine 9.x behavior as correct?".  Also, I'm not sure what
 exactly the "keepalives" item indicates.  Whether every regression deserves to
 block the release is, of course, a separate question.

 I think "WAL files which were restored from the archive are archived again" is
 the thorniest regression, and we don't yet have a patch.

Yep, that's really a problem. Will implement the patch.

Regards,

-- 
Fujii Masao



Re: [HACKERS] canceling autovacuum task woes

2012-07-24 Thread Andres Freund
Hi,

On Tuesday, July 24, 2012 07:48:27 PM Robert Haas wrote:
 I am running into a lot of customer situations where the customer
 reports that "canceling autovacuum task" shows up in the logs, and
 it's unclear whether this is happening often enough to matter, and
 even more unclear what's causing it.
 
 Me: So, do you know what table it's getting cancelled on?
 Customer: Nope.
 Me: Are you running any DDL commands anywhere in the cluster?
 Customer: Nope, absolutely none.
 Me: Well you've got to be running something somewhere or it wouldn't
 be having a lock conflict.
 Customer: OK, well I don't know of any.  What should I do?
 
 It would be awfully nice if the process that does the cancelling would
 provide the same kind of reporting that we do for a deadlock: the
 relevant lock tag, the PID of the process sending the cancel, and the
 query string.
 
 Personally, I'm starting to have a sneaky suspicion that there is
 something actually broken here - that is, that there are lock
 conflicts involved here other than the obvious one (SHARE UPDATE
 EXCLUSIVE on the table) that are allowing autovac to get cancelled
 more often than we realize.  But whether that's true or not, the
 current logging is wholly inadequate.
Very, very, very quick guess: The relation extension lock?

 Thoughts?  Anybody else have this problem?
I have seen spuriously high occurrences of that message before, but I never 
really investigated it.

Andres
-- 
Andres Freund   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] canceling autovacuum task woes

2012-07-24 Thread Steve Singer

On 12-07-24 01:48 PM, Robert Haas wrote:

I am running into a lot of customer situations where the customer
reports that "canceling autovacuum task" shows up in the logs, and
it's unclear whether this is happening often enough to matter, and
even more unclear what's causing it.


Could autovacuum be compacting a lot of space at the end of the table?
This is described in the thread
http://archives.postgresql.org/message-id/4d8df88e.7080...@yahoo.com




Me: So, do you know what table it's getting cancelled on?
Customer: Nope.
Me: Are you running any DDL commands anywhere in the cluster?
Customer: Nope, absolutely none.
Me: Well you've got to be running something somewhere or it wouldn't
be having a lock conflict.
Customer: OK, well I don't know of any.  What should I do?

It would be awfully nice if the process that does the cancelling would
provide the same kind of reporting that we do for a deadlock: the
relevant lock tag, the PID of the process sending the cancel, and the
query string.

Personally, I'm starting to have a sneaky suspicion that there is
something actually broken here - that is, that there are lock
conflicts involved here other than the obvious one (SHARE UPDATE
EXCLUSIVE on the table) that are allowing autovac to get cancelled
more often than we realize.  But whether that's true or not, the
current logging is wholly inadequate.

Thoughts?  Anybody else have this problem?






Re: [HACKERS] canceling autovacuum task woes

2012-07-24 Thread Robert Haas
On Tue, Jul 24, 2012 at 2:11 PM, Steve Singer ssin...@ca.afilias.info wrote:
 On 12-07-24 01:48 PM, Robert Haas wrote:
 I am running into a lot of customer situations where the customer
 reports that "canceling autovacuum task" shows up in the logs, and
 it's unclear whether this is happening often enough to matter, and
 even more unclear what's causing it.

 Could autovacuum be compacting a lot of space at the end of the table?  This
 is described
 in the thread
 http://archives.postgresql.org/message-id/4d8df88e.7080...@yahoo.com

You (and Andres) may well be right, but I think the way we find out is
to add some better logging.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] canceling autovacuum task woes

2012-07-24 Thread Andrew Dunstan


On 07/24/2012 01:48 PM, Robert Haas wrote:

I am running into a lot of customer situations where the customer
reports that "canceling autovacuum task" shows up in the logs, and
it's unclear whether this is happening often enough to matter, and
even more unclear what's causing it.

Me: So, do you know what table it's getting cancelled on?
Customer: Nope.
Me: Are you running any DDL commands anywhere in the cluster?
Customer: Nope, absolutely none.
Me: Well you've got to be running something somewhere or it wouldn't
be having a lock conflict.
Customer: OK, well I don't know of any.  What should I do?

It would be awfully nice if the process that does the cancelling would
provide the same kind of reporting that we do for a deadlock: the
relevant lock tag, the PID of the process sending the cancel, and the
query string.



+1 for more information on this.

cheers

andrew





Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Merlin Moncure
On Tue, Jul 24, 2012 at 11:57 AM, Marko Kreen mark...@gmail.com wrote:
 On Tue, Jul 24, 2012 at 7:52 PM, Merlin Moncure mmonc...@gmail.com wrote:
 But the faster rowbuf method is a generally incompatible way of
 dealing with data vs. current libpq -- this is bad.  If it's truly
 impossible to get those benefits without bypassing the result API,
 then I withdraw my objection on the grounds that it's optional behavior (my
 gut tells me it is possible, though).

 Um, please clarify what you are talking about here.

 What is the incompatibility of the PGresult from branch 1?

Incompatibility in terms of usage -- we should be getting data with
PQgetvalue.  I think you're suspecting that I incorrectly believe you're
forced to use the rowbuf API -- I don't (although I wasn't clear on
that earlier).  Basically I'm saying that we should only buy into that
if all the alternative routes to getting the faster performance are
exhausted.

On Tue, Jul 24, 2012 at 11:59 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 I think the dummy copy of PGresult is plausible (if by that you mean
 optimizing PQgetResult when in single row mode).  That would be even
 better: you'd remove the need for the rowbuf mode.

 I haven't spent any time looking at this, but my gut tells me that a big
 chunk of the expense is copying the PGresult's metadata (the column
 names, types, etc).  It has to be possible to make that cheaper.

 One idea is to rearrange the internal storage so that that part reduces
 to one memcpy().  Another thought is to allow PGresults to share
 metadata by treating the metadata as a separate reference-counted
 object.  The second could be a bit hazardous though, as we advertise
 that PGresults are independent objects that can be manipulated by
 separate threads.  I don't want to introduce mutexes into PGresults,
 but I'm not sure reference-counted metadata can be safe without them.
 So maybe the memcpy idea is the only workable one.

Yeah -- we had a very similar problem in libpqtypes and we solved it
exactly as you're thinking.  libpqtypes potentially has to create a result
with each row iteration (we expose rows and composites as on-the-fly
created result objects) and stores some extra non-trivial data
with the result.  We solved it with the optimized-memcpy method (look
here: http://libpqtypes.esilo.com/browse_source.html?file=libpqtypes.h
and you'll see that all the important structs like PGtypeHandler are
somewhat haphazardly designed to be run through a memcpy).  We
couldn't do anything about internal libpq issues though, but some
micro-optimization of PQsetResultAttrs (which is called via
PQcopyResult) might fit the bill.

The 'source' result (or source data that would be copied into the
destination result) would be stored in the PGconn, right?  So the idea
is that when you set up single-row mode the connection generates a
template PGresult which is then copied out repeatedly during row-by-row
processing.  I like it, but only if we're reasonably confident the
PGresult can be sufficiently optimized like that.

merlin



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Merlin Moncure
On Tue, Jul 24, 2012 at 1:33 PM, Merlin Moncure mmonc...@gmail.com wrote:
 The 'source' result (or source data that would be copied into the
 destination result) would be stored in the PGconn, right?  So the idea
 is that when you set up single-row mode the connection generates a
 template PGresult which is then copied out repeatedly during row-by-row
 processing.  I like it, but only if we're reasonably confident the
 PGresult can be sufficiently optimized like that.

hm, PGresAttDesc is unfortunately in the public header and as such
probably can't be changed?

merlin



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Merlin Moncure
On Tue, Jul 24, 2012 at 1:49 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Jul 24, 2012 at 1:33 PM, Merlin Moncure mmonc...@gmail.com wrote:
 The 'source' result (or source data that would be copied into the
 destination result) would be stored in the PGconn, right?  So the idea
 is that when you set up single-row mode the connection generates a
 template PGresult which is then copied out repeatedly during row-by-row
 processing.  I like it, but only if we're reasonably confident the
 PGresult can be sufficiently optimized like that.

 hm, PGresAttDesc is unfortunately in the public header and as such
 probably can't be changed?

It can't -- at least not without breaking compatibility.  So, as
attractive as it sounds, it looks like a memcpy-based PGresult copy is
out unless we break the rules and change it anyway (with the probably
safe assumption that the only userland code that cares is libpqtypes,
which we'd certainly change).

However, it's not clear that a shared-metadata implementation would
require a mutex -- if you stored the shared data in the conn and were
willing to walk the results in the event the PGconn was freed before
the results, you'd probably be OK.  That's really unpleasant though.

Either way, it looks like there are plausible paths to optimizing
repeated result fetch without having to expose an alternate data-fetching
API, and the proposed implementation doesn't appear to be a wall
in terms of getting there.  So I'm firmly in the non-exposed-rowbuf
camp, even if we can't expose an optimal implementation in time for
9.2.

merlin



Re: [HACKERS] canceling autovacuum task woes

2012-07-24 Thread Alvaro Herrera


 On Tuesday, July 24, 2012 07:48:27 PM Robert Haas wrote:
  I am running into a lot of customer situations where the customer
  reports that "canceling autovacuum task" shows up in the logs, and
  it's unclear whether this is happening often enough to matter, and
  even more unclear what's causing it.
  
  Me: So, do you know what table it's getting cancelled on?
  Customer: Nope.
  Me: Are you running any DDL commands anywhere in the cluster?
  Customer: Nope, absolutely none.
  Me: Well you've got to be running something somewhere or it wouldn't
  be having a lock conflict.
  Customer: OK, well I don't know of any.  What should I do?
  
  It would be awfully nice if the process that does the cancelling would
  provide the same kind of reporting that we do for a deadlock: the
  relevant lock tag, the PID of the process sending the cancel, and the
  query string.

Hm, autovacuum is supposed to set an errcontext callback that would tell
you the table name it's working on at the time of the cancel.  So if
even that is missing, something strange is going on.

No objections to the general idea of adding more info about the process
blocked on autovacuum.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] canceling autovacuum task woes

2012-07-24 Thread Alvaro Herrera

Excerpts from Alvaro Herrera's message of Tue Jul 24 15:30:49 -0400 2012:
 
  On Tuesday, July 24, 2012 07:48:27 PM Robert Haas wrote:
   I am running into a lot of customer situations where the customer
   reports that "canceling autovacuum task" shows up in the logs, and
   it's unclear whether this is happening often enough to matter, and
   even more unclear what's causing it.
   
   Me: So, do you know what table it's getting cancelled on?
   Customer: Nope.

 Hm, autovacuum is supposed to set an errcontext callback that would tell
 you the table name it's working on at the time of the cancel.  So if
 even that is missing, something strange is going on.

Yep, it says:

ERROR:  canceling autovacuum task
CONTEXT:  automatic vacuum of table "alvherre.public.foo"

So at least that part seems pilot error more than anything else.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 Either way, it looks like there are plausible paths to optimizing
 repeated result fetch without having to expose an alternate data-fetching
 API, and the proposed implementation doesn't appear to be a wall
 in terms of getting there.  So I'm firmly in the non-exposed-rowbuf
 camp, even if we can't expose an optimal implementation in time for
 9.2.

Yeah, the schedule argument is a strong one.  If we put in PQgetRowData
now, we'll be stuck with it forevermore.  It would be better to push
that part of the patch to 9.3 so we can have more time to think it over
and investigate alternatives.  The single-row mode is already enough to
solve the demonstrated client-side performance issues we know about.
So IMO it would be a lot smarter to be spending our time right now on
making sure we have *that* part of the patch right.

In particular I agree that PQsetSingleRowMode is a bit ugly, so I'm
wondering about your thoughts on providing PQgetSingleRowResult instead.
I don't see how to make that work in async mode.  I think the library
needs to be aware of whether it's supposed to return a single-row result
in advance of actually doing so --- for instance, how can PQisBusy
possibly give the correct answer if it doesn't know whether having
received the first row is sufficient?  (Yeah, I know we could invent
a separate flavor of PQisBusy for single-row usage, but ick.  There are
implementation problems too, such as possibly having already copied a
lot of rows into a work-in-progress PGresult.)

The thing I fundamentally don't like about PQsetSingleRowMode is that
there's such a narrow window of time to use it correctly -- as soon as
you've done even one PQconsumeInput, it's too late.  (Or maybe not, if
the server is slow, which makes it timing-dependent whether you'll
observe a bug.  Maybe we should add more internal state so that we can
consistently throw error if you've done any PQconsumeInput?)  The most
obvious alternative is to invent new versions of the PQsendXXX functions
with an additional flag parameter; but there are enough of them to make
that not very attractive either.
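
For reference, a sketch of the async pattern at issue, assuming the
proposed PQsetSingleRowMode(); the mode switch has to land in that
narrow window right after PQsendQuery() and before the first
PQconsumeInput().  Here die, handle_result, and wait_for_read stand in
for application code:

#include <libpq-fe.h>

extern void die(PGconn *conn, const char *why);	/* placeholder */
extern void handle_result(PGresult *res);	/* placeholder */
extern void wait_for_read(PGconn *conn);	/* select()/poll() on PQsocket() */

static void run_query_async(PGconn *conn, const char *query)
{
	if (!PQsendQuery(conn, query))
		die(conn, "PQsendQuery");
	/* the narrow window: before any PQconsumeInput() */
	if (!PQsetSingleRowMode(conn))
		die(conn, "PQsetSingleRowMode");

	for (;;)
	{
		wait_for_read(conn);
		if (!PQconsumeInput(conn))
			die(conn, "PQconsumeInput");
		while (!PQisBusy(conn))	/* enough data buffered for the next result */
		{
			PGresult *res = PQgetResult(conn);

			if (res == NULL)
				return;		/* query fully processed */
			handle_result(res);	/* one one-row PGresult per data row */
			PQclear(res);
		}
	}
}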

regards, tom lane



Re: [HACKERS] canceling autovacuum task woes

2012-07-24 Thread Robert Haas
On Tue, Jul 24, 2012 at 3:35 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Yep, it says:

 ERROR:  canceling autovacuum task
 CONTEXT:  automatic vacuum of table "alvherre.public.foo"

 So at least that part seems pilot error more than anything else.

Yeah, you're right.  So you do get the table name.  But you don't get
the cause, which is what you really need to understand why it's
happening.  Attached is a patch that adds some more detail.  Here's an
example of what the output looks like:

LOG:  sending cancel to blocking autovacuum PID 21595
DETAIL:  Process 21618 waits for AccessExclusiveLock on relation 27863
of database 16384
STATEMENT:  drop table if exists pgbench_accounts
ERROR:  canceling autovacuum task
CONTEXT:  automatic vacuum of table "rhaas.public.pgbench_accounts"

I think that's a lot more useful than just getting those last two lines...
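
For readers without the patch in front of them: the LOG/DETAIL pair
above is the kind of output the backend's ereport() machinery produces.
A sketch of a call that could emit it (not the actual patch text;
autovac_pid, lock, and lockmode are placeholders for the patch's local
state) would be:

ereport(LOG,
		(errmsg("sending cancel to blocking autovacuum PID %d",
				autovac_pid),
		 errdetail("Process %d waits for %s on relation %u of database %u.",
				   MyProcPid,
				   GetLockmodeName(lock->tag.locktag_lockmethodid, lockmode),
				   lock->tag.locktag_field2,
				   lock->tag.locktag_field1)));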

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


more-autovac-cancel-logging.patch
Description: Binary data



Re: [HACKERS] canceling autovacuum task woes

2012-07-24 Thread Alvaro Herrera

Excerpts from Robert Haas's message of Tue Jul 24 15:52:23 -0400 2012:
 On Tue, Jul 24, 2012 at 3:35 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
  Yep, it says:
 
  ERROR:  canceling autovacuum task
  CONTEXT:  automatic vacuum of table "alvherre.public.foo"
 
  So at least that part seems pilot error more than anything else.
 
 Yeah, you're right.  So you do get the table name.  But you don't get
 the cause, which is what you really need to understand why it's
 happening.  Attached is a patch that adds some more detail.  Here's an
 example of what the output looks like:
 
 LOG:  sending cancel to blocking autovacuum PID 21595
 DETAIL:  Process 21618 waits for AccessExclusiveLock on relation 27863
 of database 16384
 STATEMENT:  drop table if exists pgbench_accounts
 ERROR:  canceling autovacuum task
 CONTEXT:  automatic vacuum of table "rhaas.public.pgbench_accounts"

Looks great.  Are you considering backpatching this?


-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] canceling autovacuum task woes

2012-07-24 Thread Robert Haas
On Tue, Jul 24, 2012 at 4:03 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of Tue Jul 24 15:52:23 -0400 2012:
 On Tue, Jul 24, 2012 at 3:35 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
  Yep, it says:
 
  ERROR:  canceling autovacuum task
  CONTEXT:  automatic vacuum of table "alvherre.public.foo"
 
  So at least that part seems pilot error more than anything else.

 Yeah, you're right.  So you do get the table name.  But you don't get
 the cause, which is what you really need to understand why it's
 happening.  Attached is a patch that adds some more detail.  Here's an
 example of what the output looks like:

 LOG:  sending cancel to blocking autovacuum PID 21595
 DETAIL:  Process 21618 waits for AccessExclusiveLock on relation 27863
 of database 16384
 STATEMENT:  drop table if exists pgbench_accounts
 ERROR:  canceling autovacuum task
 CONTEXT:  automatic vacuum of table "rhaas.public.pgbench_accounts"

 Looks great.  Are you considering backpatching this?

Well, that would certainly make MY life easier.  I am not sure whether
it would be in line with project policy, however.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] canceling autovacuum task woes

2012-07-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Yeah, you're right.  So you do get the table name.  But you don't get
 the cause, which is what you really need to understand why it's
 happening.  Attached is a patch that adds some more detail.

Uh, what's the added dependency on pgstat.h for?  Looks sane to the
eyeball otherwise.

regards, tom lane



Re: [HACKERS] canceling autovacuum task woes

2012-07-24 Thread Robert Haas
On Tue, Jul 24, 2012 at 4:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Yeah, you're right.  So you do get the table name.  But you don't get
 the cause, which is what you really need to understand why it's
 happening.  Attached is a patch that adds some more detail.

 Uh, what's the added dependency on pgstat.h for?  Looks sane to the
 eyeball otherwise.

Woops, that was leftovers from some earlier silliness that I (mostly)
removed before posting.

New version attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


more-autovac-cancel-logging.patch
Description: Binary data



Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
 Hm, I think it's possible to rig the test to do a dummy
 copy of the PGresult, so we can see what kind
 of speed is possible.  Will try.

I added a new method (-x) to rowdump where it asks for the row
with PQgetRowData() and then tries to emulate a super-efficient
PGresult copy, then loads the data from that PGresult.

Quick reference:
rowdump1 - single-row-mode1 [~ libpq 9.2]
rowdump2 - single-row-mode2 [~ libpq 9.1]

-s - single row mode with PQgetResult()
-z - single row mode with PQgetRowData()
-x - simulated optimized PQgetResult()

-
QUERY: select 1,200,30,rpad('x',30,'z') from
generate_series(1,500)
./rowdump1 -s:   6.28   6.25   6.39  avg:  6.31 [ 100.00 % ]
./rowdump2 -s:   7.49   7.40   7.57  avg:  7.49 [ 118.71 % ]
./rowdump1 -z:   2.86   2.77   2.79  avg:  2.81 [ 44.50 % ]
./rowdump1 -x:   3.46   3.27   3.29  avg:  3.34 [ 52.96 % ]
QUERY: select
rpad('x',10,'z'),rpad('x',20,'z'),rpad('x',30,'z'),rpad('x',40,'z'),rpad('x',50,'z'),rpad('x',60,'z')
from generate_series(1,300)
./rowdump1 -s:   7.76   7.76   7.68  avg:  7.73 [ 100.00 % ]
./rowdump2 -s:   8.24   8.12   8.66  avg:  8.34 [ 107.85 % ]
./rowdump1 -z:   5.34   5.07   5.23  avg:  5.21 [ 67.41 % ]
./rowdump1 -x:   5.53   5.61   5.61  avg:  5.58 [ 72.20 % ]
QUERY: select
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100
from generate_series(1,80)
./rowdump1 -s:   7.49   7.66   7.59  avg:  7.58 [ 100.00 % ]
./rowdump2 -s:   7.56   8.12   7.95  avg:  7.88 [ 103.91 % ]
./rowdump1 -z:   2.77   2.76   2.76  avg:  2.76 [ 36.46 % ]
./rowdump1 -x:   3.07   3.05   3.18  avg:  3.10 [ 40.90 % ]
QUERY: select 1000,rpad('x', 400, 'z'),rpad('x', 4000, 'z') from
generate_series(1,10)
./rowdump1 -s:   2.66   2.62   2.67  avg:  2.65 [ 100.00 % ]
./rowdump2 -s:   3.11   3.14   3.11  avg:  3.12 [ 117.74 % ]
./rowdump1 -z:   2.49   2.46   2.47  avg:  2.47 [ 93.33 % ]
./rowdump1 -x:   2.59   2.57   2.57  avg:  2.58 [ 97.23 % ]
-

It shows that even if the actual fast row copy will be slower
than this one here, it's still a quite competitive approach compared to
PQgetRowData(), as long as it's not too slow.

So the optimized PQgetResult() may be good enough, and we
can drop the idea of PQgetRowData().

Code attached, also in https://github.com/markokr/pqtest repo.

-- 
marko


pg1 = /opt/apps/pgsql92mode1
pg2 = /opt/apps/pgsql92mode2

X1 = -DHAVE_ROWDATA -I$(pg1)/include/internal -I$(pg1)/include/server

CFLAGS = -O -g -Wall

all: rowdump1 rowdump2

rowdump1: rowdump.c
	$(CC) -I$(pg1)/include $(CFLAGS) -o $@ $< -L$(pg1)/lib -Wl,-rpath=$(pg1)/lib -lpq $(X1)

rowdump2: rowdump.c
	$(CC) -I$(pg2)/include $(CFLAGS) -o $@ $< -L$(pg2)/lib -Wl,-rpath=$(pg2)/lib -lpq

clean:
	rm -f rowdump1 rowdump2 time.tmp README.html

html: README.html

README.html: README.rst
	rst2html $< > $@


#include stdio.h
#include stdlib.h
#include string.h
#include unistd.h
#include getopt.h

#include libpq-fe.h

#ifdef HAVE_ROWDATA
#include internal/libpq-int.h
#endif

struct Context {
	PGconn *db;
	int count;

	char *buf;
	int buflen;
	int bufpos;
};

/* print error and exit */
static void die(PGconn *db, const char *msg)
{
	if (db)
		fprintf(stderr, %s: %s\n, msg, PQerrorMessage(db));
	else
		fprintf(stderr, %s\n, msg);
	exit(1);
}

/* write out buffer */
static void out_flush(struct Context *ctx)
{
	int out;
	if (!ctx-buf)
		return;

	out = write(1, ctx-buf, ctx-bufpos);
	if (out != ctx-bufpos)
		die(NULL, failed to write file);
	ctx-bufpos = 0;
	ctx-buflen = 0;
	free(ctx-buf);
	ctx-buf = NULL;
}

/* add char to buffer */
static void out_char(struct Context *ctx, char c)
{
	if (ctx-bufpos + 1  ctx-buflen) {
		if (!ctx-buf) {
			ctx-buflen = 16;
			ctx-buf = malloc(ctx-buflen);
			if (!ctx-buf)
die(NULL, failed to allocate buffer);
		} else {
			ctx-buflen *= 2;
			ctx-buf = realloc(ctx-buf, ctx-buflen);
			if (!ctx-buf)
die(NULL, failed to resize buffer);
		}
	}

	ctx-buf[ctx-bufpos++] = c;
}

/* quote string for copy */
static void proc_value(struct Context *ctx, const char *val, int vlen)
{
	int i;
	char c;

	for (i = 0; i < vlen; i++) {
		c = val[i];
		switch (c) {
		case '\\':
			out_char(ctx, '\\');
			out_char(ctx, '\\');
			break;
		case '\t':
			out_char(ctx, '\\');
			out_char(ctx, 't');
			break;
		case '\n':
			out_char(ctx, '\\');
			out_char(ctx, 'n');
			break;
		case '\r':
			out_char(ctx, '\\');
			out_char(ctx, 'r');
			break;
		default:
			out_char(ctx, c);
			break;
		}
	}
}

/* quote one row for copy from regular PGresult */
static void proc_row(struct Context *ctx, PGresult *res, int tup)
{
	int n = PQnfields(res);
	const char *val;
	int i, vlen;

	ctx->count++;

	for 

Re: [HACKERS] canceling autovacuum task woes

2012-07-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jul 24, 2012 at 4:03 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 Looks great.  Are you considering backpatching this?

 Well, that would certainly make MY life easier.  I am not sure whether
 it would be in line with project policy, however.

+1 for a backpatch.  Otherwise it'll be years before we gain any
information about the unexpected cancels that you think exist.

However, after looking some more at deadlock.c, I wonder whether
(a) this patch gives sufficient detail, and (b) whether there isn't a
problem that's obvious by inspection.  It appears to me that as the
blocking_autovacuum_proc stuff is coded, it will finger an AV proc as
needing to be killed even though it may be several graph edges out from
the current proc.  This means that with respect to (a), the connection
from the process doing the kill to the AV proc may be inadequately
documented by this patch, and with respect to (b), there might well be
cases where we found an AV proc somewhere in the graph traversal but
it's not actually guilty of blocking the current process ... especially
not after the queue reorderings that we may have done.  I think I'd be
happier with that code if it restricted its AV targets to procs that
*directly* block the current process, which not incidentally would make
this amount of log detail sufficient.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CHECK NO INHERIT syntax

2012-07-24 Thread Alvaro Herrera

Excerpts from Alvaro Herrera's message of sáb jul 21 00:20:57 -0400 2012:

 Here's a (hopefully) complete patch.

Pushed.  I hope people like this one better (third time's the charm, and
all that).

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Merlin Moncure
On Tue, Jul 24, 2012 at 2:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 Either way, it looks like there's plausible paths to optimizing
 repeated result fetch without having expose an alternate data-fetching
 API and that the proposed implementation doesn't appear to be a wall
 in terms of getting there. So I'm firmly in the non-exposed-rowbuf
 camp, even if we can't expose an optimal implementation in time for
 9.2.

 Yeah, the schedule argument is a strong one.  If we put in PQgetRowData
 now, we'll be stuck with it forevermore.  It would be better to push
 that part of the patch to 9.3 so we can have more time to think it over
 and investigate alternatives.  The single-row mode is already enough to
 solve the demonstrated client-side performance issues we know about.
 So IMO it would be a lot smarter to be spending our time right now on
 making sure we have *that* part of the patch right.

 In particular I agree that PQsetSingleRowMode is a bit ugly, so I'm
 wondering about your thoughts on providing PQgetSingleRowResult instead.
 I don't see how to make that work in async mode.  I think the library
 needs to be aware of whether it's supposed to return a single-row result
 in advance of actually doing so --- for instance, how can PQisBusy
 possibly give the correct answer if it doesn't know whether having
 received the first row is sufficient?

Well (Marko probably wants to slap me with a halibut by now), the
original intent was for it not to have to: PQgetSingleRowResult is
more 'construct result for single row iteration', so it would never
block -- it only sets the internal single row mode and instantiates
the result object.  After that, you can do standard
consumeinput/isbusy processing on the conn.  The actual iteration
routine would be like PQiterateResult which you could guard with
PQisBusy.  Like I said: it's the same general structure but the result
construction is moved out of the loop.

I don't think this breaks result scoping rules... the conn keeps a copy
of the result during 'result construction', which we would define as
ending when the query terminates.  Until the query resolves, the result
remains 'under construction' and loses the expectation of const-ness
that you normally get, so you don't get to interleave threads reading
the result with threads iterating from the connection (which is fine),
and you have to avoid doing stupid things like closing the connection
before all the data has been read through.
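
Roughly, the flow I have in mind would look like this -- a sketch only:
PQgetSingleRowResult() and PQiterateResult() are proposed names from
this thread, not existing libpq functions, and process_row() and die()
(as in the rowdump.c attachment upthread) are stand-ins for user code:

    PGresult *res;

    if (!PQsendQuery(conn, "SELECT ..."))
        die(conn, "PQsendQuery");

    /* never blocks: only sets single-row mode and builds the result shell */
    res = PQgetSingleRowResult(conn);

    /* in async code, guard each step with PQconsumeInput()/PQisBusy() */
    while (PQiterateResult(conn, res))
        process_row(res);       /* res data is rewritten on each iteration */

    /* result becomes an ordinary user-owned PGresult once the query ends */
    PQclear(res);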

(but maybe all this is moot per Marko's latest benchmarks)

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Tue, Jul 24, 2012 at 11:34 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Jul 24, 2012 at 2:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 In particular I agree that PQsetSingleRowMode is a bit ugly, so I'm
 wondering about your thoughts on providing PQgetSingleRowResult instead.
 I don't see how to make that work in async mode.  I think the library
 needs to be aware of whether it's supposed to return a single-row result
 in advance of actually doing so --- for instance, how can PQisBusy
 possibly give the correct answer if it doesn't know whether having
 received the first row is sufficient?

 Well (Marko probably wants to slap me with a halibut by now), the
 original intent was for it not to have to: PQgetSingleRowResult is
 more 'construct result for single row iteration', so it would never
 block -- it only sets the internal single row mode and instantiates
 the result object.  After that, you can do standard
 consumeinput/isbusy processing on the conn.  The actual iteration
 routine would be like PQiterateResult which you could guard with
 PQisBusy.  Like I said: it's the same general structure but the result
 construction is moved out of the loop.

If you really don't like PQsetSingleRowMode, then I would prefer
new PQsend* functions to new fetch functions.

Because while send is a one-shot affair, receive is a complex
state machine that requires multiple function calls.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Tue, Jul 24, 2012 at 10:37 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 Either way, it looks like there's plausible paths to optimizing
 repeated result fetch without having expose an alternate data-fetching
 API and that the proposed implementation doesn't appear to be a wall
 in terms of getting there. So I'm firmly in the non-exposed-rowbuf
 camp, even if we can't expose an optimal implementation in time for
 9.2.

 Yeah, the schedule argument is a strong one.  If we put in PQgetRowData
 now, we'll be stuck with it forevermore.  It would be better to push
 that part of the patch to 9.3 so we can have more time to think it over
 and investigate alternatives.  The single-row mode is already enough to
 solve the demonstrated client-side performance issues we know about.
 So IMO it would be a lot smarter to be spending our time right now on
 making sure we have *that* part of the patch right.

Just to show agreement: both PQgetRowData() and the optimized PGresult
do not belong in 9.2.

The only open questions are:

* Is there a better API than PQsetSingleRowMode()?  New PQsend*
  functions are my alternative.

* Should we roll back the rowBuf change? I think no, as per the
  benchmark it performs better than the old code.

 The thing I fundamentally don't like about PQsetSingleRowMode is that
 there's such a narrow window of time to use it correctly -- as soon as
 you've done even one PQconsumeInput, it's too late.  (Or maybe not, if
 the server is slow, which makes it timing-dependent whether you'll
 observe a bug.  Maybe we should add more internal state so that we can
 consistently throw error if you've done any PQconsumeInput?)  The most
 obvious alternative is to invent new versions of the PQsendXXX functions
 with an additional flag parameter; but there are enough of them to make
 that not very attractive either.

It belongs logically together with PQsend, so if you don't like
the current separation, then the proper fix is to make them a single
function call.
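
For reference, a rough sketch of the usage pattern under the patch as
it stands -- note how PQsetSingleRowMode() must be wedged in right
after the send, before any result processing.  Error handling is
trimmed; process_row() and die() (as in rowdump.c upthread) stand in
for user code:

    PGresult *res;

    if (!PQsendQuery(conn, "SELECT ..."))
        die(conn, "PQsendQuery");

    /* must happen before any PQconsumeInput()/PQgetResult() */
    if (!PQsetSingleRowMode(conn))
        die(conn, "PQsetSingleRowMode");

    /* PQgetResult() blocks here; async code polls PQisBusy() first */
    while ((res = PQgetResult(conn)) != NULL)
    {
        if (PQresultStatus(res) == PGRES_SINGLE_TUPLE)
            process_row(res);               /* exactly one row */
        else if (PQresultStatus(res) != PGRES_TUPLES_OK)
            fprintf(stderr, "%s", PQerrorMessage(conn));
        PQclear(res);
    }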

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Merlin Moncure
On Tue, Jul 24, 2012 at 3:52 PM, Marko Kreen mark...@gmail.com wrote:
 * Is there a better API than PQsetSingleRowMode()?  New PQsend*
   functions are my alternative.

I would prefer to have PQsetSingleRowMode() over a new flavor of PQsend.

To consolidate my argument above: since you're throwing PQgetResult in
the main iteration loop, you're potentially walling yourself off from
one important optimization: not copying the result at all and reusing
the old one.  That's going to produce the fastest possible code, since
you're recovering all the attribute structures that have already been
set up, unless you're willing to do the following:

Reading your original patch, what if at the line inside PQgetResult:

res = pqSingleRowResult(conn);

you instead manipulated the result the connection already had and
directly returned it -- discarding the result data but not the
attributes etc?  Actually, you could even keep your API 'as is' if
you're willing to adjust the behavior of PQclear, while the connection
is doing row-by-row results processing, to be a no-op until it's done.

Single-row results' data would then be volatile across iterations,
not const -- even if you save off the pointer, the connection can and
will rewrite the data portion of the PGresult.  Any pointers to
PQgetvalue'd values would have to be copied off between iterations,
naturally (or the user can copy off using the user-facing copy-result
function).  This would be extremely efficient since we'd only ever
need to extend the PGresult buffer if a particular row was longer than
the longest one already fetched.
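
Under that scheme, user code that needs a value past the current
iteration would have to copy it out itself.  A hypothetical sketch,
assuming the volatile-data behavior described above and an open conn
(here we keep the first column of the last row seen):

    char *saved = NULL;
    PGresult *res;

    while ((res = PQgetResult(conn)) != NULL)
    {
        if (PQresultStatus(res) == PGRES_SINGLE_TUPLE)
        {
            free(saved);
            /* copy before the next iteration clobbers the data area */
            saved = strdup(PQgetvalue(res, 0, 0));
        }
        /* under this proposal: effectively a no-op until the query is done */
        PQclear(res);
    }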

If all that's a bridge too far, we can look at re-jiggering the API
like I was thinking ealier to make the adjusted result scoping more
user visible or run your non-rowbuf-exposing patch as-is and hope for
optimizations down the line.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] canceling autovacuum task woes

2012-07-24 Thread Robert Haas
On Jul 24, 2012, at 4:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Jul 24, 2012 at 4:03 PM, Alvaro Herrera
 alvhe...@commandprompt.com wrote:
 Looks great.  Are you considering backpatching this?
 
 Well, that would certainly make MY life easier.  I am not sure whether
 it would be in line with project policy, however.
 
 +1 for a backpatch.  Otherwise it'll be years before we gain any
 information about the unexpected cancels that you think exist

OK, great.

 However, after looking some more at deadlock.c, I wonder whether
 (a) this patch gives sufficient detail, and (b) whether there isn't a
 problem that's obvious by inspection.  It appears to me that as the
 blocking_autovacuum_proc stuff is coded, it will finger an AV proc as
 needing to be killed even though it may be several graph edges out from
 the current proc.  This means that with respect to (a), the connection
 from the process doing the kill to the AV proc may be inadequately
 documented by this patch, and with respect to (b), there might well be
 cases where we found an AV proc somewhere in the graph traversal but
 it's not actually guilty of blocking the current process ... especially
 not after the queue reorderings that we may have done.  I think I'd be
 happier with that code if it restricted its AV targets to procs that
 *directly* block the current process, which not incidentally would make
 this amount of log detail sufficient.

Uggh.  Well, that certainly sounds like something that could cause spurious 
cancels - or excessively fast ones, since presumably if we limit it to things 
that directly block the current process, you'll always allow the full 
deadlock_timeout before nuking the autovac worker.  So +1 for changing that.

Does an edge in this context mean any lock, or just an ungranted one?  I assume 
the latter, which still leaves the question of where the edges are coming from 
in the first place.

...Robert
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Wed, Jul 25, 2012 at 12:35 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Jul 24, 2012 at 3:52 PM, Marko Kreen mark...@gmail.com wrote:
  * Is there a better API than PQsetSingleRowMode()?  New PQsend*
    functions are my alternative.

 I would prefer to have PQsetSingleRowMode() over new flavor of PQsend.

 To consolidate my argument above: since you're throwing PQgetResult in
 the main iteration loop, you're potentially walling yourself off from
 one important optimization: not copying the result at all and reusing
 the old one.  That's going to produce the fastest possible code, since
 you're recovering all the attribute structures that have already been
 set up, unless you're willing to do the following:

I am not forgetting such an optimization -- I've already implemented it:
it's called PQgetRowData().  The main reason it works, and so simply,
is that it does not try to give you a PGresult.

PGresult carries various historical assumptions:
- Fully user-controlled lifetime.
- Zero-terminated strings.
- Aligned binary values.

Breaking them all does not seem like a good idea.

Trying to make them work with zero-copy seems
not worth the pain.

And if you are ready to introduce new accessor functions,
why care about PGresult at all?  Why not give user
PQgetRowData()?

Btw, PQgetRowData() and any other potential zero-copy API
is not incompatible with both slow PQgetResult() or optimized
PQgetResult().

So if we give only PQgetResult() in 9.2, I do not see that we
are locked out from any interesting optimizations.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Merlin Moncure
On Tue, Jul 24, 2012 at 4:59 PM, Marko Kreen mark...@gmail.com wrote:
 So if we give only PQgetResult() in 9.2, I do not see that we
 are locked out from any interesting optimizations.

Well, you are locked out of having PQgetResult reuse the conn's result
since that would then introduce potentially breaking changes to user
code.  Here are the approaches I see:

1) introduce rowbuf (but not user exposed rowbuf) patch:
Not the fastest method.  Users would begin using the feature and API
behaviors would be locked in -- the only possible optimization route
would be to try and make PQcopyResult as fast as possible.

2) expose PQgetRowData
Very fast, but foreign and somewhat awkward.  Existing libpq wrappers
(libpqtypes, libpqxx etc) could not be converted to use without
extensive revision, if at all, and would be stuck with the slower
version of iteration.  Also a side benefit here is that optimizing
result copying later has usefulness outside of row by row processing.

3) as #1, but instead of copying result, overwrite the data area.
this is bending the rule 'user defined lifetime of result' since each
iteration clears the data area of the PGresult. This is almost as fast
as #2, and existing libpq wrappers would be trivially converted to the
API.

4) as #3, but introduce a new iteration function
(PQiterateResult(conn, result)) that would be called instead of a
paired PQgetResult/PQclear.  This would also be fast, and could almost
lay directly on top of #2 as an alternate optimization strategy -- the
set result mode would have to be modified (or an additional function
introduced) to return a result.

I feel like you're partial to #2 and appreciate that, but I'm really
quite skeptical about it as noted.  #3 and #4 are not as well thought
out as what you've put together and would need some extra work and
buy-in to get out the door, so #2 seems like the lowest common
denominator (it would permanently preclude #3 and would require #4 to
introduce two new functions instead of just one).  #1 of course would
bolt on to #2.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Merlin Moncure
On Tue, Jul 24, 2012 at 5:29 PM, Merlin Moncure mmonc...@gmail.com wrote:
 so #2 seems like the lowest common
 denominator (it would permanently preclude #3 and would require #4 to
 introduce two new functions instead of just one).  #1 of course would
 bolt on to #2.

oops, got #1 and #2 backwards there.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Marko Kreen
On Wed, Jul 25, 2012 at 1:29 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Jul 24, 2012 at 4:59 PM, Marko Kreen mark...@gmail.com wrote:
 So if we give only PQgetResult() in 9.2, I do not see that we
 are locked out from any interesting optimizations.

 Well, you are locked out of having PQgetResult reuse the conn's result
 since that would then introduce potentially breaking changes to user
 code.

You can specify special flags to PQsend or have special PQgetResultWeird()
calls to get PGresults with unusual behavior.  Like I did with PQgetRowData().

I see no reason here to reject PQgetResult() that returns normal PGresult.

-- 
marko

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] libpq one-row-at-a-time API

2012-07-24 Thread Merlin Moncure
On Tuesday, July 24, 2012, Marko Kreen mark...@gmail.com wrote:
 On Wed, Jul 25, 2012 at 1:29 AM, Merlin Moncure mmonc...@gmail.com
wrote:
 On Tue, Jul 24, 2012 at 4:59 PM, Marko Kreen mark...@gmail.com wrote:
 So if we give only PQgetResult() in 9.2, I do not see that we
 are locked out from any interesting optimizations.

 Well, you are locked out of having PQgetResult reuse the conn's result
 since that would then introduce potentially breaking changes to user
 code.

 You can specify special flags to PQsend or have special PQgetResultWeird()
 calls to get PGresults with unusual behavior.  Like I did with
PQgetRowData().

 I see no reason here to reject PQgetResult() that returns normal PGresult.

Yeah -- I agree.  So, given the scheduling, I think we should go with
the non-PQgetRowData patch and reserve the optimized path for 9.3.

merlin


Re: [HACKERS] [BUGS] BUG #6748: sequence value may be conflict in some cases

2012-07-24 Thread Tom Lane
I wrote:
 The tuple in buffers has log_cnt = 1, is_called = false, but the initial
 XLOG_SEQ_LOG record shows log_cnt = 0, is_called = true.  So if we crash
 at this point, after recovery it looks like one nextval() has already
 been done.  However, AlterSequence generates another XLOG_SEQ_LOG record
 based on what's in shared buffers, so after replay of that, we're back
 to the original state where it does not appear that any nextval() has
 been done.

 I'm of the opinion that this kluge needs to be removed; it's just insane
 that we're not logging the same state we leave in our buffers.

Attached is a WIP patch for that.  I had to leave nextval() logging
a different state from what it puts into buffers, since without that
we can't have the optimization of one-WAL-record-per-N-nextval-calls,
which seems worth preserving.  But I was able to get rid of the other
routines' doing it.  I think it's all right for nextval to do it because
the only difference between what it logs and what it leaves in the
buffer is that last_value is advanced and log_cnt is zero; in particular
there is not a difference in is_called state.

For a while I was convinced that nextval() was broken in another way,
because it does MarkBufferDirty() before changing the buffer contents,
which sure *looks* wrong; what if a checkpoint writes the
not-yet-modified buffer contents and then clears the dirty bit?
However, I now think this is safe because we hold the buffer content
lock exclusively until we've finished making the changes.  A checkpoint
might observe the dirty bit set and attempt to write the page, but it
will block trying to get shared buffer content lock until we complete
the changes.  So that should be all right, and I've added some comments
to explain it, but if anyone thinks it's wrong speak up!
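
Schematically, the ordering in question looks like this (a sketch of
the general buffer-access pattern, not the exact sequence.c code):

    LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
    START_CRIT_SECTION();
    MarkBufferDirty(buf);       /* dirty bit set before the page changes */
    /* ... modify page contents, emit WAL ... */
    /* a checkpoint may see the dirty bit here, but its write blocks
     * acquiring the content lock in shared mode until we release it */
    END_CRIT_SECTION();
    UnlockReleaseBuffer(buf);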

Another thing that I think is a real bug is that AlterSequence needs
to reset log_cnt to zero any time it changes any of the sequence
generation parameters, to ensure that the new values will be applied
promptly.  I've done that in the attached.

I also modified the code so that log_cnt is initialized to zero not one
when a sequence is created or reset to is_called = false.  This doesn't
make any functional difference given the change to force logging a WAL
record when is_called becomes set, but I think it's a good change
anyway: it doesn't make any sense anymore to imply that one sequence
value has been pre-reserved.  However, if anyone's particularly annoyed
by the first diff hunk in the regression tests, we could undo that.

Lastly, I modified read_seq_tuple (nee read_info) to return a HeapTuple
pointer to the sequence's tuple, and made use of that to make the rest
of the code simpler and less dependent on page-layout assumptions.
At this point this is only a code-beautification change; it was
functionally necessary in an earlier version of the patch, but I didn't
back it out when I got rid of the change that made it necessary.

Oh, there are also some hunks in here to add debug logging, which I was
using for testing.  I'm undecided whether to remove those or clean them
up some more and wrap them in something like #ifdef SEQUENCE_DEBUG.

One other point about the regression test diffs: formerly, the typical
state seen at line 191 of sequence.out was log_cnt = 32.  Now it is 31,
because the first nextval call increases log_cnt to 32 and then the
second decreases it to 31.  The failure mode described in
http://archives.postgresql.org/pgsql-hackers/2008-08/msg01359.php
is no longer possible, because a checkpoint before the first nextval
call won't change its behavior.  But instead, if a checkpoint happens
*between* those two nextvals, the second nextval will decide to create
a WAL entry and it will increase log_cnt to 32.  So we still need the
variant expected file sequence_1.out, but now it shows log_cnt = 32
instead of the other way around.

Comments anyone?

regards, tom lane

diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 34b74f6c3844a3a172cd660357f8ccbc48f57f3c..69054b546cec1bb71ff79b18702a0e1edb3a45fb 100644
*** a/src/backend/commands/sequence.c
--- b/src/backend/commands/sequence.c
***
*** 38,44 
  /*
   * We don't want to log each fetching of a value from a sequence,
   * so we pre-log a few fetches in advance. In the event of
!  * crash we can lose as much as we pre-logged.
   */
  #define SEQ_LOG_VALS	32
  
--- 38,44 
  /*
   * We don't want to log each fetching of a value from a sequence,
   * so we pre-log a few fetches in advance. In the event of
!  * crash we can lose (skip over) as many values as we pre-logged.
   */
  #define SEQ_LOG_VALS	32
  
*** typedef struct SeqTableData
*** 73,79 
  	int64		cached;			/* last value already cached for nextval */
  	/* if last != cached, we have not used up all the cached values */
  	int64		increment;		/* copy of sequence's increment field */
! 	/* note that increment is zero until we 

Re: [HACKERS] canceling autovacuum task woes

2012-07-24 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Jul 24, 2012, at 4:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 ... This means that with respect to (a), the connection
 from the process doing the kill to the AV proc may be inadequately
 documented by this patch, and with respect to (b), there might well be
 cases where we found an AV proc somewhere in the graph traversal but
 it's not actually guilty of blocking the current process ... especially
 not after the queue reorderings that we may have done.  I think I'd be
 happier with that code if it restricted its AV targets to procs that
 *directly* block the current process, which not incidentally would make
 this amount of log detail sufficient.

 Uggh.  Well, that certainly sounds like something that could cause spurious 
 cancels - or excessively fast ones, since presumably if we limit it to things 
 that directly block the current process, you'll always allow the full 
 deadlock_timeout before nuking the autovac worker.  So +1 for changing that.

I think something as simple as the attached would do the trick.  I've
verified that this still allows the expected cancel cases, though of
course it's harder to prove that it gives any benefit.

 Does an edge in this context mean any lock, or just an ungranted one?  I 
 assume the latter, which still leaves the question of where the edges are 
 coming from in the first place.

The deadlock code follows hard edges, which mean A wants a lock that
B has already got, as well as soft edges, which mean A wants a lock
that B also wants, and B is ahead of A in the queue for it.  We don't
kill autovacs that are at the end of soft edges, which I think is good,
but the fact that we follow them at all makes it a little unclear what
we might reach recursively.  Your point about determinacy of the timeout
is probably even a better argument for only allowing the direct blockee
to wield the knife.
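
To illustrate (an assumed scenario, not taken from the patch itself):

    A --hard--> B    A wants a lock that B already holds
    A --soft--> C    A and C both want the same lock; C is queued ahead of A

With the change, only B can be fingered as blocking_autovacuum_proc
when A runs the deadlock check; an autovacuum reached through C, or
anywhere deeper in the graph, is left for its direct blockee to handle.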

regards, tom lane

diff --git a/src/backend/storage/lmgr/deadlock.c b/src/backend/storage/lmgr/deadlock.c
index 288186a6ceebf40d5a43638c7be5be35c12dd3b3..6620a3d571d769f75e6ae3fbbe4a0ed2cf271720 100644
*** a/src/backend/storage/lmgr/deadlock.c
--- b/src/backend/storage/lmgr/deadlock.c
*** FindLockCycleRecurse(PGPROC *checkProc,
*** 527,551 
if ((proclock->holdMask & LOCKBIT_ON(lm)) &&
	(conflictMask & LOCKBIT_ON(lm)))
  {
- 	/*
- 	 * Look for a blocking autovacuum. There can be more than
- 	 * one in the deadlock cycle, in which case we just pick a
- 	 * random one.	We stash the autovacuum worker's PGPROC so
- 	 * that the caller can send a cancel signal to it, if
- 	 * appropriate.
- 	 *
- 	 * Note we read vacuumFlags without any locking.  This is
- 	 * OK only for checking the PROC_IS_AUTOVACUUM flag,
- 	 * because that flag is set at process start and never
- 	 * reset; there is logic elsewhere to avoid canceling an
- 	 * autovacuum that is working for preventing Xid
- 	 * wraparound problems (which needs to read a different
- 	 * vacuumFlag bit), but we don't do that here to avoid
- 	 * grabbing ProcArrayLock.
- 	 */
- 	if (pgxact->vacuumFlags & PROC_IS_AUTOVACUUM)
- 		blocking_autovacuum_proc = proc;
- 
  	/* This proc hard-blocks checkProc */
  	if (FindLockCycleRecurse(proc, depth + 1,
  			 softEdges, nSoftEdges))
--- 527,532 
*** FindLockCycleRecurse(PGPROC *checkProc,
*** 559,565 
  
  		return true;
  	}
! 	/* If no deadlock, we're done looking at this proclock */
  	break;
  }
  			}
--- 540,575 
  
  		return true;
  	}
! 
! 	/*
! 	 * No deadlock here, but see if this proc is an autovacuum
! 	 * that is directly hard-blocking our own proc.  If so,
! 	 * report it so that the caller can send a cancel signal
! 	 * to it, if appropriate.  If there's more than one such
! 	 * proc (probably not possible given that autovacuums all
! 	 * take similar lock types), it's indeterminate which one
! 	 * will be reported.
! 	 *
! 	 * We don't touch autovacuums that are indirectly blocking
! 	 * us; it's up to the direct blockee to take action.  This
! 	 * rule simplifies understanding the behavior and ensures
! 	 * that an autovacuum won't be canceled with less than
! 	 * deadlock_timeout grace period.
! 	 *
! 	 * Note we read vacuumFlags without any locking.  This is
! 	 * OK only for checking the PROC_IS_AUTOVACUUM flag,
! 	 * because that flag is set at process start and never
! 	 * reset.  There is logic elsewhere to avoid canceling an
! 	 * autovacuum that is working to prevent XID wraparound
! 	 * problems (which needs to read a different vacuumFlag
! 	 * bit), but we don't do that here to avoid grabbing
! 	 * ProcArrayLock.
! 	 */
! 	if (checkProc == MyProc &&
! 		pgxact->vacuumFlags & PROC_IS_AUTOVACUUM)
! 		blocking_autovacuum_proc = proc;
!