date:20110131

On Mon, Jan 31, 2011 at 8:00 AM, Shigeru HANADA
han...@metrosystems.co.jp wrote:
 * Is there any use case for changing the handler or validator function
 of an existign FDW with ALTER? To me it just seems like an unnecessary
 complication.

 AFAICS, the only case for that is upgrading FDW to new one without
 re-creating foreign tables.  I don't have strong opinion for this
 issue, and it seems reasonable to remove ALTER feature in first
 version.

-1.  I don't think that removing the ability to change this is going
to save a measurable amount of complexity, and it certainly will suck
if you need it and don't have it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 14

Jeff Davis  wrote:
 
 1. In CheckForSerializableConflictIn(), I think the comment above
 may be out of date. It says:
 
 2. Also in the comment above CheckForSerializableConflictIn(), I
 see:
 
 3. The comment above CheckForSerializableConflictOut() seems to
 trail off, as though you may have meant to write more. It also
 seems to be out of date.
 
Will fix and post a patch version 15, along with the other fixes
based on feedback to v14 (mostly to comments and docs) within a few
hours.
 
I'll also do that global change from using tran as an abbreviation
for transaction in some places to consistently using xact whenever it
is abbreviated.
 
 And why are you reading the infomask directly? Do the existing
 visibility functions not suffice?
 
It's possible we re-invented some code somewhere, but I'm not clear
on what code from this patch might use what existing function.  Could
you provide specifics?
 
 I have made it through predicate.c, and I have a much better
 understanding of what it's actually doing. I can't claim that I
 have a clear understanding of everything involved, but the code
 looks like it's in good shape (given the complexity involved) and
 well-commented.
 
Thanks!  I know that's a lot of work, and I appreciate you pointing
out where comments have not kept up with coding.
 
 I am marking the patch Ready For Committer, because any committer
 will need time to go through the patch; and the authors have
 clearly applied the thought, care, and testing required for
 something of this complexity. I will continue to work on it,
 though.
 
Thanks!
 
 The biggest issue on my mind is what to do about Hot Standby. The
 authors have a plan, but there is also some resistance to it:


http://archives.postgresql.org/message-id/23698.1295566...@sss.pgh.pa.us

 We don't need a perfect solution for 9.1, but it would be nice if
 we had a viable plan for 9.2.
 
I don't recall any real opposition to what I sketched out in this
post, which came after the above-referenced one:
 
http://archives.postgresql.org/message-id/4d39d5ec022500039...@gw.wicourts.gov
 
Also, that opposition appears to be based on a misunderstanding of
the first alternative, which was for sending at most one snapshot per
commit or rollback of a serializable read write transaction, with
possible throttling.  The alternative needs at most two bits per
commit or rollback of a serializable read write transaction; although
I haven't checked whether that can be scared up without adding a
whole byte.  Read only transactions have nothing to do with the
traffic under either alternative.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [pgsql-general 2011-1-21:] Are there any projects interested in object functionality? (+ rule bases)

On Mon, Jan 31, 2011 at 4:34 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 What I know no body is working on SQL/OLB ISO/IEC 9075-10 now.

 I proposed a 3 years ago a support of methods, but without success.
 This propose was rejected. There isn't a real interest to implement it
 from commiters. And I have to say - users doesn't request it too. And
 there are a few issues with compatibility.

It seems to me it's a bit unfair to say there isn't real interest to
implement it from committers.  Plenty of features get implemented
that no committer particularly cares about, because a number of
committers - including me - spend a good deal of time reviewing and
committing patches written by other people which they never would have
written themselves.  It's true that patches sometimes get swatted down
because they are judged to be insufficiently useful or badly design or
because they create compatibility breaks, but that's not the same as
not interested, which to me implies a sort of purely arbitrary
rejection that I try hard to avoid.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [pgsql-general 2011-1-21:] Are there any projects interested in object functionality? (+ rule bases)

On Mon, Jan 31, 2011 at 3:32 AM, Jörg Roman Rudnick
joerg.rudn...@t-online.de wrote:
 * are there any people / projects known which are interested in ORDBMS /
 OODBMS usage of PostgreSQL? Strict SQL standard conformance is less
 important than the possibility to provide instructive and impressive
 examples to students.

 * are there any people / projects known which are interested in extending
 PostgreSQL at a higher level (plpgsql, creating operators, etc.) for the
 sake of ORDBMS / OODBMS functionality?

 * are there any people / projects known which are interested in extending
 PostgreSQL on the level of developing C code for the sake of ORDBMS / OODBMS
 functionality?

 * in how far does the backend support such efforts -- would it do fine, or
 is rather to be expected that doing ORDBMS / OODBMS driven queries would
 lead to disastrous performance?

 * are there any people / projects known which are interested in using the
 rule (?trigger?) system of PostgreSQL (maybe with extensions) to achieve
 some kind of rule base / datalog type inference engines? In how far does the
 backend constrain this in regard of performance?

I don't really know much about ORDBMS / OODBMS functionality; a quick
Google search suggests that SQL/OLB is mostly about Java language
bindings, and there's a separate project (pgsql-jdbc) which works on
PostgreSQL connectivity for Java.  As far as changes to the core
database are concerned, user-defined functions and operators are not
hard to create, but I'm fuzzy on what specifically you want to do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spread checkpoint sync

On Mon, Jan 31, 2011 at 3:04 AM, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 On Mon, Jan 31, 2011 at 13:41, Robert Haas robertmh...@gmail.com wrote:
 1. Absorb fsync requests a lot more often during the sync phase.
 2. Still try to run the cleaning scan during the sync phase.
 3. Pause for 3 seconds after every fsync.

 So if we want the checkpoint
 to finish in, say, 20 minutes, we can't know whether the write phase
 needs to be finished by minute 10 or 15 or 16 or 19 or only by 19:59.

 We probably need deadline-based scheduling, that is being used in write()
 phase. If we want to sync 100 files in 20 minutes, each file should be
 sync'ed in 12 seconds if we think each fsync takes the same time.
 If we would have better estimation algorithm (file size? dirty ratio?),
 each fsync chould have some weight factor.  But deadline-based scheduling
 is still needed then.

Right.  I think the problem is balancing the write and sync phases.
For example, if your operating system is very aggressively writing out
dirty pages to disk, then you want the write phase to be as long as
possible and the sync phase can be very short because there won't be
much work to do.  But if your operating system is caching lots of
stuff in memory and writing dirty pages out to disk only when
absolutely necessary, then the write phase could be relatively quick
without much hurting anything, but the sync phase will need to be long
to keep from crushing the I/O system.  The trouble is, we don't really
have a priori way to know which it's doing.  Maybe we could try to
tune based on the behavior of previous checkpoints, but I'm wondering
if we oughtn't to take the cheesy path first and split
checkpoint_completion_target into checkpoint_write_target and
checkpoint_sync_target.  That's another parameter to set, but I'd
rather add a parameter that people have to play with to find the right
value than impose an arbitrary rule that creates unavoidable bad
performance in certain environments.

 BTW, we should not sleep in full-speed checkpoint. CHECKPOINT command,
 shutdown, pg_start_backup(), and some of checkpoints during recovery
 might don't want to sleep.

Yeah, I think that's understood.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

 Actually, it was Simon and Florian who were arguing that we needed to
 distinguish these cases from other types of recovery conflict;
 Tatsuo-san was arguing that we needed to distinguish a
 dropped-database-recovery-conflict from a cluster shutdown - the
 current choice of ERRCODE_ADMIN_SHUTDOWN makes that confusing.
 
 ISTM we can invent zero, one, or two new error codes here.  If we
 invent zero, then we change all recovery conflicts to look like
 serialization failures and call it good.  If we invent one, then we
 make retryable recovery conflicts look like serialization failures and
 the dropped-database case gets a newly minted error code that means
 just that.  Or we can invent two, and make serialization failures
 different from recovery conflicts, and retryable recovery conflicts
 different from the dropped-database variety.
 
 I don't have a terribly strong opinion as between those options.

As a novice I am not sure why we _wouldn't_ create two new separate
error codes --- it not not like they cost us anything, and they
certainly sound distinct.  The requirement to retry is clearly something
we want to avoid if we get a new error code.

Backpatching to 9.0 makes sense too, though the problem is the delay in
getting the code into a released minor version.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FPI

On Fri, Jan 28, 2011 at 3:39 PM, Robert Haas robertmh...@gmail.com wrote:
 What happens if we (a) keep the current rule after reaching
 consistency and (b) apply any such updates *unconditionally* - that
 is, without reference to the LSN - prior to reaching consistency?
 Under that rule, if we encounter an FPI before reaching consistency,
 we're OK.  So let's suppose we don't.  No matter how many times we
 replay any initial prefix of any such updates between the redo pointer
 and the point at which we reach consistency, the state of the page
 when we finally reach consistency will be identical.  But we could get
 hosed if replay progressed *past* the minimum recovery point and then
 started over at the previous redo pointer.  If we forced an immediate
 restartpoint on reaching consistency, that seems like it might prevent
 that scenario.

Actually, I'm wrong, and this doesn't work at all.  At the time of the
crash, there could already be pages on disk with LSNs greater than the
minimum recovery point.  Duh.

It was such a good idea in my head...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Snapshots no longer build

Tom Lane wrote:
 Thom Brown t...@linux.com writes:
  On 29 January 2011 11:12, Magnus Hagander mag...@hagander.net wrote:
  Any idea why this is happening?
 
  I don't know what's causing that since I can see both of those IDs are
  present, but I should also mention that the identities those linkends
  point to should have xreflabel attributes.  At the moment, it's
  reading: Use the OSSP UUID library when building the Section F.44
  module
 
 Yeah, that is pretty icky, and Bruce introduced similar unpleasantness
 in a bunch of places.  I'm unsure that xreflabel is a good fix, though,
 because there are other places where the wording is such that a
 chapter/section number reference *is* appropriate, eg a parenthetical
 cross-reference (see xref).
 
 I think what we might need is to use link linkend=xml2xml2/link
 in the places where we want the xref to read as an incidental hyperlink
 rather than a cross-reference.  Comments?

Agreed, thanks. I did forget about that INSTALL build requiremnt.  We do
mention the requirement in release.sgml.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Add ENCODING option to COPY

2011-01-31 Thread Hitoshi Harada

2011/1/31 Hitoshi Harada umi.tan...@gmail.com:
 2011/1/31 Robert Haas robertmh...@gmail.com:
 On Tue, Jan 25, 2011 at 10:24 AM, Hitoshi Harada umi.tan...@gmail.com 
 wrote:
 I'll check the code more if we have better alternatives.

 Where are we with this?

 I'll post another version today.

Here's the patch.

Finally I concluded the concern Itagaki-san raised can be solved by
adding code that restores client_encoding in copy_in_error_callback. I
tested some encoding mismatch cases with this patch and saw
appropriate messages in NLS environment.

Regards,


-- 
Hitoshi Harada


copy_encoding.v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] FPI

2011-01-31 Thread Greg Stark

On Fri, Jan 28, 2011 at 8:08 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 3. Page LSN  WAL location: do NOT apply field update or change LSN.


I don't think this works. There could be multiple writes to a page for
different records before the crash occurs. The LSN could be far in the
future and yet still have a torn page missing the update.


Another thing to consider is that there's still a plan on the table to
implement block checksums. Does any of this block that?

Or do checksums make it *easier* to implement any of this? You can
check the checksum and if it's valid assume there isn't a torn page.
If the LSN = current you can skip the log record. If the checksum is
invalid you could try replaying the log entry and if it makes it valid
then you're golden. If not then you could continue and hopefully there
will be more unconditional records and eventually the block will
become consistent or a FP write will come along later.

Just for reference the Oracle solution is to ignore the problem but
provide recovery tools to recover an individual block. You go to the
last consistent backup, pull the old version of the block from there,
then apply the logs from that point forward replaying any records for
that block.



-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spread checkpoint sync


On 31.01.2011 16:44, Robert Haas wrote:

On Mon, Jan 31, 2011 at 3:04 AM, Itagaki Takahiro
itagaki.takah...@gmail.com  wrote:

On Mon, Jan 31, 2011 at 13:41, Robert Haasrobertmh...@gmail.com  wrote:

1. Absorb fsync requests a lot more often during the sync phase.
2. Still try to run the cleaning scan during the sync phase.
3. Pause for 3 seconds after every fsync.

So if we want the checkpoint
to finish in, say, 20 minutes, we can't know whether the write phase
needs to be finished by minute 10 or 15 or 16 or 19 or only by 19:59.


We probably need deadline-based scheduling, that is being used in write()
phase. If we want to sync 100 files in 20 minutes, each file should be
sync'ed in 12 seconds if we think each fsync takes the same time.
If we would have better estimation algorithm (file size? dirty ratio?),
each fsync chould have some weight factor.  But deadline-based scheduling
is still needed then.


Right.  I think the problem is balancing the write and sync phases.
For example, if your operating system is very aggressively writing out
dirty pages to disk, then you want the write phase to be as long as
possible and the sync phase can be very short because there won't be
much work to do.  But if your operating system is caching lots of
stuff in memory and writing dirty pages out to disk only when
absolutely necessary, then the write phase could be relatively quick
without much hurting anything, but the sync phase will need to be long
to keep from crushing the I/O system.  The trouble is, we don't really
have a priori way to know which it's doing.  Maybe we could try to
tune based on the behavior of previous checkpoints, ...


IMHO we should re-consider the patch to sort the writes. Not so much 
because of the performance gain that gives, but because we can then 
re-arrange the fsyncs so that you write one file, then fsync it, then 
write the next file and so on. That way we the time taken by the fsyncs 
is distributed between the writes, so we don't need to accurately 
estimate how long each will take. If one fsync takes a long time, the 
writes that follow will just be done a bit faster to catch up.



... but I'm wondering
if we oughtn't to take the cheesy path first and split
checkpoint_completion_target into checkpoint_write_target and
checkpoint_sync_target.  That's another parameter to set, but I'd
rather add a parameter that people have to play with to find the right
value than impose an arbitrary rule that creates unavoidable bad
performance in certain environments.


That is of course simpler..

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

Bruce Momjian br...@momjian.us wrote:
 
 As a novice I am not sure why we _wouldn't_ create two new
 separate error codes
 
The argument for using SQLSTATE 40001 for failures which are
strictly due to concurrency problems, and are likely to work if the
transaction is retried, is that there is already a lot of software
which knows how to do that.  On the other hand, going into such code
to turn that into a list of concurrency failure states is probably
only going to cause pain to those with applications intended to work
with multiple DBMS products without much modification.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

On Mon, 2011-01-31 at 09:46 -0500, Bruce Momjian wrote:
  Actually, it was Simon and Florian who were arguing that we needed to
  distinguish these cases from other types of recovery conflict;
  Tatsuo-san was arguing that we needed to distinguish a
  dropped-database-recovery-conflict from a cluster shutdown - the
  current choice of ERRCODE_ADMIN_SHUTDOWN makes that confusing.
  
  ISTM we can invent zero, one, or two new error codes here.  If we
  invent zero, then we change all recovery conflicts to look like
  serialization failures and call it good.  If we invent one, then we
  make retryable recovery conflicts look like serialization failures and
  the dropped-database case gets a newly minted error code that means
  just that.  Or we can invent two, and make serialization failures
  different from recovery conflicts, and retryable recovery conflicts
  different from the dropped-database variety.
  
  I don't have a terribly strong opinion as between those options.
 
 As a novice I am not sure why we _wouldn't_ create two new separate
 error codes --- it not not like they cost us anything, and they
 certainly sound distinct.  The requirement to retry is clearly something
 we want to avoid if we get a new error code.

It's the way it was because of discussion during 9.0.
The errors for serialization error and deadlock are appropriate
because everybody knows these exist. If you invent a new error code then
you'll need to re-program loads of applications which would all just
work if we use those error codes. Kevin specifically requested it for
that reason.

That leaves the error code for the drop database case. As both Tatsuo
and myself have said, it needs to be different so we do not retry it. I
don't personally think an edge case like that needs a whole new error
code to explain it.

 Backpatching to 9.0 makes sense too, though the problem is the delay in
 getting the code into a released minor version.

I treat this as a bugfix, so backpatch is appropriate. My earlier fix
in 9.0 beta didn't cover the case pointed out by Robert/Florian and as
Tatsuo points out, we need to have a retryable error for all common
cases.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Add reference to client_encoding parameter

2011-01-31 Thread Thom Brown

Hi,

I've attached a small patch for the docs which adds a reference to the
client_encoding parameter description.  This is in response to someone
attempting to submit a comment which explains where available
encodings can be found.

Thanks

Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935


encoding_ref.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

Kevin Grittner wrote:
 Bruce Momjian br...@momjian.us wrote:
  
  As a novice I am not sure why we _wouldn't_ create two new
  separate error codes
  
 The argument for using SQLSTATE 40001 for failures which are
 strictly due to concurrency problems, and are likely to work if the
 transaction is retried, is that there is already a lot of software
 which knows how to do that.  On the other hand, going into such code
 to turn that into a list of concurrency failure states is probably
 only going to cause pain to those with applications intended to work
 with multiple DBMS products without much modification.

The way they usually handle that is by having a class of codes that
designate that behavior, but I can see now that the number can't be
subdivided.  :-(

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Add ENCODING option to COPY

Hitoshi Harada umi.tan...@gmail.com writes:
 Finally I concluded the concern Itagaki-san raised can be solved by
 adding code that restores client_encoding in copy_in_error_callback.

That seems like an absolutely horrid idea.  Error context callbacks
should not have side-effects like that.  They're not guaranteed to be
called at all, let alone in any particular order.  In this case I'd also
be worried that the state needs to be fixed before elog.c reaches the
point of calling the callbacks --- there's nothing to say that it might
not try to translate some strings to the client encoding earlier than
that.

It might happen to work today (or at least in the scenarios you tested),
but it seems fragile as can be.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Allowing multiple concurrent base backups


On 25.01.2011 06:02, Fujii Masao wrote:

On Tue, Jan 25, 2011 at 6:02 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

Hmm, perhaps the code would be more readable if instead of the
forcePageWrites counter that counts exclusive and non-exclusive backups, and
an exclusiveBackup boolean indicating if one of the in-progress backups is
an exclusive one, we had a counter that only counts non-exclusive backups,
plus a boolean indicating if an exclusive backup is in progress in addition
to them.

Attached is a patch for that (against master branch, including only xlog.c).


I read this patch and previous-posted one. Those look good.

Comments:

+ * do_pg_start_backup is the workhorse of the user-visible pg_stop_backup()
+ * function.

Typo: s/do_pg_start_backup/do_pg_stop_backup

It's helpful to explain about this behavior in pg_basebackup.sgml or elsewhere.


Thanks. I've committed this now, fixing that, and hopefully all the 
other issues mentioned in this thread.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spread checkpoint sync

Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 IMHO we should re-consider the patch to sort the writes. Not so much 
 because of the performance gain that gives, but because we can then 
 re-arrange the fsyncs so that you write one file, then fsync it, then 
 write the next file and so on.

Isn't that going to make performance worse not better?  Generally you
want to give the kernel as much scheduling flexibility as possible,
which you do by issuing the write as far before the fsync as you can.
An arrangement like the above removes all cross-file scheduling freedom.
For example, if two files are on different spindles, you've just
guaranteed that no I/O overlap is possible.

 That way we the time taken by the fsyncs 
 is distributed between the writes,

That sounds like you have an entirely wrong mental model of where the
cost comes from.  Those times are not independent.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Allowing multiple concurrent base backups


On 27.01.2011 15:15, Fujii Masao wrote:

When I read the patch, I found that pg_stop_backup removes the backup history
file as soon as it creates the file, if archive_mode is not enabled.
This looks like
oversight. We should prevent pg_stop_backup from removing the fresh history
file? Or we should prevent pg_stop_backup from creating the history
file from the
beginning since it's not used at all if archiving is disabled?
(If archiving is enabled, the history file can be used to clean the
archived files up).


Hmm, good point. It's harmless, but creating the history file in the 
first place sure seems like a waste of time.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

On Mon, Jan 31, 2011 at 10:31 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Bruce Momjian br...@momjian.us wrote:
 As a novice I am not sure why we _wouldn't_ create two new
 separate error codes

 The argument for using SQLSTATE 40001 for failures which are
 strictly due to concurrency problems, and are likely to work if the
 transaction is retried, is that there is already a lot of software
 which knows how to do that.  On the other hand, going into such code
 to turn that into a list of concurrency failure states is probably
 only going to cause pain to those with applications intended to work
 with multiple DBMS products without much modification.

Yeah, I think that one's pretty logical.  I think my vote is to either
change the drop-database case to be the same as that, or to use a new
error code.  ERRCODE_ADMIN_SHUTDOWN is just strange.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spread checkpoint sync

On Mon, Jan 31, 2011 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 IMHO we should re-consider the patch to sort the writes. Not so much
 because of the performance gain that gives, but because we can then
 re-arrange the fsyncs so that you write one file, then fsync it, then
 write the next file and so on.

 Isn't that going to make performance worse not better?  Generally you
 want to give the kernel as much scheduling flexibility as possible,
 which you do by issuing the write as far before the fsync as you can.
 An arrangement like the above removes all cross-file scheduling freedom.
 For example, if two files are on different spindles, you've just
 guaranteed that no I/O overlap is possible.

 That way we the time taken by the fsyncs
 is distributed between the writes,

 That sounds like you have an entirely wrong mental model of where the
 cost comes from.  Those times are not independent.

Yeah, Greg Smith made the same point a week or three ago.  But it
seems to me that there is potential value in overlaying the write and
sync phases to some degree.  For example, if the write phase is spread
over 15 minutes and you have 30 files, then by, say, minute 7, it's a
probably OK to flush the file you wrote first.  Waiting longer isn't
necessarily going to help - the kernel has probably written what it is
going to write without prodding.

In fact, it might be that on a busy system, you could lose by waiting
*too long* to perform the fsync.  The cleaning scan and/or backends
may kick out additional dirty buffers that will now have to get forced
down to disk, even though you don't really care about them (because
they were dirtied after the checkpoint write had already been done).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spread checkpoint sync

Robert Haas robertmh...@gmail.com writes:
 On Mon, Jan 31, 2011 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 That sounds like you have an entirely wrong mental model of where the
 cost comes from.  Those times are not independent.

 Yeah, Greg Smith made the same point a week or three ago.  But it
 seems to me that there is potential value in overlaying the write and
 sync phases to some degree.  For example, if the write phase is spread
 over 15 minutes and you have 30 files, then by, say, minute 7, it's a
 probably OK to flush the file you wrote first.

Yeah, probably, but we can't do anything as stupid as file-by-file.

I wonder whether it'd be useful to keep track of the total amount of
data written-and-not-yet-synced, and to issue fsyncs often enough to
keep that below some parameter; the idea being that the parameter would
limit how much dirty kernel disk cache there is.  Of course, ideally the
kernel would have a similar tunable and this would be a waste of effort
on our part...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spread checkpoint sync

On Mon, Jan 31, 2011 at 11:51 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Jan 31, 2011 at 11:29 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 That sounds like you have an entirely wrong mental model of where the
 cost comes from.  Those times are not independent.

 Yeah, Greg Smith made the same point a week or three ago.  But it
 seems to me that there is potential value in overlaying the write and
 sync phases to some degree.  For example, if the write phase is spread
 over 15 minutes and you have 30 files, then by, say, minute 7, it's a
 probably OK to flush the file you wrote first.

 Yeah, probably, but we can't do anything as stupid as file-by-file.

Eh?

 I wonder whether it'd be useful to keep track of the total amount of
 data written-and-not-yet-synced, and to issue fsyncs often enough to
 keep that below some parameter; the idea being that the parameter would
 limit how much dirty kernel disk cache there is.  Of course, ideally the
 kernel would have a similar tunable and this would be a waste of effort
 on our part...

It's not clear to me how you'd maintain that information without it
turning into a contention bottleneck.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spread checkpoint sync

Robert Haas robertmh...@gmail.com writes:
 3. Pause for 3 seconds after every fsync.

 I think something along the lines of #3 is probably a good idea,

Really?  Any particular delay is guaranteed wrong.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spread checkpoint sync

On Mon, Jan 31, 2011 at 12:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 3. Pause for 3 seconds after every fsync.

 I think something along the lines of #3 is probably a good idea,

 Really?  Any particular delay is guaranteed wrong.

What I was getting at was - I think it's probably a good idea not to
do the fsyncs at top speed, but I'm not too sure how they should be
spaced out.  I agree a fixed delay isn't necessarily right.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spread checkpoint sync

Robert Haas robertmh...@gmail.com writes:
 On Mon, Jan 31, 2011 at 11:51 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wonder whether it'd be useful to keep track of the total amount of
 data written-and-not-yet-synced, and to issue fsyncs often enough to
 keep that below some parameter; the idea being that the parameter would
 limit how much dirty kernel disk cache there is.  Of course, ideally the
 kernel would have a similar tunable and this would be a waste of effort
 on our part...

 It's not clear to me how you'd maintain that information without it
 turning into a contention bottleneck.

What contention bottleneck?  I was just visualizing the bgwriter process
locally tracking how many writes it'd issued.  Backend-issued writes
should happen seldom enough to be ignorable for this purpose.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] windows build docs



Following recent discussions and the enabling of 64 bit Mingw builds, I 
propose to make the attached changes to the docs. I don't see any great 
reason for us to advise against building with Mingw, especially now that 
we have 64 bit support for it, so I removed that, amd also clarified 
where Cygwin is useful and where it's not, as well as adding some detail 
about how to make 64 bit builds.


cheers

andrew


*** a/doc/src/sgml/install-windows.sgml
--- b/doc/src/sgml/install-windows.sgml
***
*** 47,61 
productnameCygwin/productname uses the normal build system, see
xref linkend=installation and the specific notes in
xref linkend=installation-notes-mingw and xref linkend=installation-notes-cygwin.
!   These builds cannot generate 64-bit binaries.
!   productnameCygwin/productname is not recommended and should
!   only be used for older versions of productnameWindows/productname where
the native build does not work, such as
!   productnameWindows 98/productname. productnameMinGW/productname is
!   only recommended if you are building other modules using it. The official
binaries are built using productnameVisual Studio/productname.
   /para
  
   sect1 id=install-windows-full
titleBuilding with productnameVisual C++/productname or the
productnamePlatform SDK/productname/title
--- 47,72 
productnameCygwin/productname uses the normal build system, see
xref linkend=installation and the specific notes in
xref linkend=installation-notes-mingw and xref linkend=installation-notes-cygwin.
!   To produce native 64 bit binaries in these environments, use the tools from
!   productnameMingw64/productname. These tools can also be used to
!   cross-compile for 32 bit and 64 bit productnameWindows/productname 
!   targets on other hosts, such as productnameLinux/productname and 
!   productnameDarwin/productname.
!   productnameCygwin/productname is not recommended for running a
!   production server, and it should only be used for running on
!   older versions of productnameWindows/productname where
the native build does not work, such as
!   productnameWindows 98/productname. The official
binaries are built using productnameVisual Studio/productname.
   /para
  
+  para
+   Native builds of applicationpsql/application don't support command 
+   line editing. The productnameCygwin/productname build does support
+   command line editing, so it should be used where psql is needed for
+   interactive use on  productnameWindows/productname.
+  /para
+ 
   sect1 id=install-windows-full
titleBuilding with productnameVisual C++/productname or the
productnamePlatform SDK/productname/title
*** a/doc/src/sgml/installation.sgml
--- b/doc/src/sgml/installation.sgml
***
*** 2733,2738  cc-1020 cc: ERROR File = pqcomm.c, Line = 427
--- 2733,2746 
 /para
  
 para
+  To build 64 bit binaries using MinGW, install the 64 bit tool set
+  from ulink url=http://www.mingw64.org/;/ulink, put its bin
+  directory in the envarPATH/envar, and run 
+  commandconfigure/command with the  
+  command--host=x86_64-w64-mingw/command option.
+/para
+ 
+para
  After you have everything installed, it is suggested that you
  run applicationpsql/application
  under commandCMD.EXE/command, as the MSYS console has

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] windows build docs

2011-01-31 Thread Magnus Hagander

On Mon, Jan 31, 2011 at 18:14, Andrew Dunstan and...@dunslane.net wrote:

 Following recent discussions and the enabling of 64 bit Mingw builds, I
 propose to make the attached changes to the docs. I don't see any great
 reason for us to advise against building with Mingw, especially now that we
 have 64 bit support for it, so I removed that, amd also clarified where
 Cygwin is useful and where it's not, as well as adding some detail about how
 to make 64 bit builds.

Agreed, and looks good to me.

Do you know if cygwin psql works in non-US locales these days? It used
to be that it didn't, and if we recommend it we should probably
include a notice if it doesn't.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 15

On Mon, Jan 31, 2011 at 12:31 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Pretty minimal differences from V14, but I figured it would save the
 committer some work if I rolled them all up here.

Sounds good.  I believe Heikki is planning to work on this one.
Hopefully that will happen soon, since we are running short on time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] windows build docs




On 01/31/2011 12:17 PM, Magnus Hagander wrote:

On Mon, Jan 31, 2011 at 18:14, Andrew Dunstanand...@dunslane.net  wrote:

Following recent discussions and the enabling of 64 bit Mingw builds, I
propose to make the attached changes to the docs. I don't see any great
reason for us to advise against building with Mingw, especially now that we
have 64 bit support for it, so I removed that, amd also clarified where
Cygwin is useful and where it's not, as well as adding some detail about how
to make 64 bit builds.

Agreed, and looks good to me.

Do you know if cygwin psql works in non-US locales these days? It used
to be that it didn't, and if we recommend it we should probably
include a notice if it doesn't.



I just tested by setting my machine to fr_FR, and also setting 
LANG=fr_FR.utf8 under Cygwin. The build failed. The I set Cygwin back to 
LANG=C.utf8 and the build/install succeeded. After that, I switched back 
to LANG=fr_FR.utf8, and initdb, pg_ctl start and psql all behaved as 
expected. I'm adding a note accordingly.


Some day I'll try to work out why the build went boom, but now I need to 
move on to other things.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 14

On Mon, 2011-01-31 at 07:26 -0600, Kevin Grittner wrote:
  And why are you reading the infomask directly? Do the existing
  visibility functions not suffice?
  
 It's possible we re-invented some code somewhere, but I'm not clear
 on what code from this patch might use what existing function.  Could
 you provide specifics?

In CheckForSerializableConflictOut(), it takes a boolean valid. Then
within the function, it tries to differentiate:

  1. Valid with no indication that it will be deleted.
  2. Valid, but delete in progress
  3. Invalid

For #1, you are using the hint bit (not the real transaction status),
and manually checking whether it's just a lock or a real delete. For #2
you are assuming any other xmax means that the transaction is in
progress (which it may not be, because the hint bit might not be set for
some time). I assume that will cause additional false positives.

If you used HeapTupleSatisfiesVacuum(), you could do something like:

  case HEAPTUPLE_LIVE:
 return;
  case HEAPTUPLE_RECENTLY_DEAD:
  case HEAPTUPLE_DELETE_IN_PROGRESS:
 xid = HeapTupleHeaderGetXmax(tuple-t_data);
 break;
  case HEAPTUPLE_INSERT_IN_PROGRESS:
 xid = HeapTupleHeaderGetXmin(tuple-t_data);
 break;
  case HEAPTUPLE_DEAD:
 return;

This is not identical to what's happening currently, and I haven't
thought this through thoroughly yet. For instance, recently dead and
invalid would be checking on the xmax instead of the xmin. Perhaps you
could exit early in that case (if you still keep the valid flag), but
that will happen soon enough anyway.

I don't think this function really cares about the visibility with
respect to the current snapshot, right? It really cares about what other
transactions are interacting with the tuple and how. And I think HTSV
meets that need a little better.

  The biggest issue on my mind is what to do about Hot Standby. The
  authors have a plan, but there is also some resistance to it:
 
 
 http://archives.postgresql.org/message-id/23698.1295566...@sss.pgh.pa.us
 
  We don't need a perfect solution for 9.1, but it would be nice if
  we had a viable plan for 9.2.
  
 I don't recall any real opposition to what I sketched out in this
 post, which came after the above-referenced one:
  
 http://archives.postgresql.org/message-id/4d39d5ec022500039...@gw.wicourts.gov
  
 Also, that opposition appears to be based on a misunderstanding of
 the first alternative, which was for sending at most one snapshot per
 commit or rollback of a serializable read write transaction, with
 possible throttling.  The alternative needs at most two bits per
 commit or rollback of a serializable read write transaction; although
 I haven't checked whether that can be scared up without adding a
 whole byte.  Read only transactions have nothing to do with the
 traffic under either alternative.

Ok, great. When I read that before I thought that WAL might need to be
sent for implicit RO transactions. I will read it more carefully again.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spread checkpoint sync

On Mon, Jan 31, 2011 at 12:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Jan 31, 2011 at 11:51 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wonder whether it'd be useful to keep track of the total amount of
 data written-and-not-yet-synced, and to issue fsyncs often enough to
 keep that below some parameter; the idea being that the parameter would
 limit how much dirty kernel disk cache there is.  Of course, ideally the
 kernel would have a similar tunable and this would be a waste of effort
 on our part...

 It's not clear to me how you'd maintain that information without it
 turning into a contention bottleneck.

 What contention bottleneck?  I was just visualizing the bgwriter process
 locally tracking how many writes it'd issued.  Backend-issued writes
 should happen seldom enough to be ignorable for this purpose.

Ah.  Well, if you ignore backend writes, then yes, there's no
contention bottleneck.  However, I seem to recall Greg Smith showing a
system at PGCon last year with a pretty respectable volume of backend
writes (30%?) and saying OK, so here's a healthy system.  Perhaps
I'm misremembering.  But at any rate any backend that is using a
BufferAccessStrategy figures to do a lot of its own writes.  This is
probably an area for improvement in future releases, if we an figure
out how to do it: if we're doing a bulk load into a system with 4GB of
shared_buffers using a 16MB ring buffer, we'd ideally like the
background writer - or somebody other than the foreground process - to
go nuts on those buffers, writing them out as fast as it possibly can
- rather than letting the backend do it when the ring wraps around.

Back to the idea at hand - I proposed something a bit along these
lines upthread, but my idea was to proactively perform the fsyncs on
the relations that had gone the longest without a write, rather than
the ones with the most dirty data.  I'm not sure which is better.
Obviously, doing the ones that have gone idle gives the OS more time
to write out the data, but OTOH it might not succeed in purging much
dirty data.  Doing the ones with the most dirty data will definitely
reduce the size of the final checkpoint, but might also cause a
latency spike if it's triggered immediately after heavy write activity
on that file.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] windows build docs

2011-01-31 Thread Magnus Hagander

On Mon, Jan 31, 2011 at 19:34, Andrew Dunstan and...@dunslane.net wrote:


 On 01/31/2011 12:17 PM, Magnus Hagander wrote:

 On Mon, Jan 31, 2011 at 18:14, Andrew Dunstanand...@dunslane.net  wrote:

 Following recent discussions and the enabling of 64 bit Mingw builds, I
 propose to make the attached changes to the docs. I don't see any great
 reason for us to advise against building with Mingw, especially now that
 we
 have 64 bit support for it, so I removed that, amd also clarified where
 Cygwin is useful and where it's not, as well as adding some detail about
 how
 to make 64 bit builds.

 Agreed, and looks good to me.

 Do you know if cygwin psql works in non-US locales these days? It used
 to be that it didn't, and if we recommend it we should probably
 include a notice if it doesn't.


 I just tested by setting my machine to fr_FR, and also setting
 LANG=fr_FR.utf8 under Cygwin. The build failed. The I set Cygwin back to
 LANG=C.utf8 and the build/install succeeded. After that, I switched back to
 LANG=fr_FR.utf8, and initdb, pg_ctl start and psql all behaved as expected.
 I'm adding a note accordingly.

 Some day I'll try to work out why the build went boom, but now I need to
 move on to other things.

The thing that *didn't* work for me previously was running it with
*windows* in a different locale. Meaning setting regional settings
in the control panel to something like, say, Swedish :-) Or for that
matter, french.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] windows build docs




On 01/31/2011 01:51 PM, Magnus Hagander wrote:



I just tested by setting my machine to fr_FR, and also setting
LANG=fr_FR.utf8 under Cygwin. The build failed. The I set Cygwin back to
LANG=C.utf8 and the build/install succeeded. After that, I switched back to
LANG=fr_FR.utf8, and initdb, pg_ctl start and psql all behaved as expected.
I'm adding a note accordingly.

Some day I'll try to work out why the build went boom, but now I need to
move on to other things.

The thing that *didn't* work for me previously was running it with
*windows* in a different locale. Meaning setting regional settings
in the control panel to something like, say, Swedish :-) Or for that
matter, french.


Oh. Well, that worked fine for me just now. Maybe it's been fixed.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] windows build docs

2011-01-31 Thread Magnus Hagander

 I just tested by setting my machine to fr_FR, and also setting
 LANG=fr_FR.utf8 under Cygwin. The build failed. The I set Cygwin back to
 LANG=C.utf8 and the build/install succeeded. After that, I switched back
 to
 LANG=fr_FR.utf8, and initdb, pg_ctl start and psql all behaved as
 expected.
 I'm adding a note accordingly.

 Some day I'll try to work out why the build went boom, but now I need to
 move on to other things.

 The thing that *didn't* work for me previously was running it with
 *windows* in a different locale. Meaning setting regional settings
 in the control panel to something like, say, Swedish :-) Or for that
 matter, french.

 Oh. Well, that worked fine for me just now. Maybe it's been fixed.

Great. Doesn't surprise me if it is, it was years ago...

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Optimize PL/Perl function argument passing [PATCH]




On 01/15/2011 12:31 AM, Alex Hunsaker wrote:

On Tue, Dec 7, 2010 at 07:24, Tim Buncetim.bu...@pobox.com  wrote:

Changes:

Sets the local $_TD via C instead of passing an extra argument.
So functions no longer start with our $_TD; local $_TD = shift;

Pre-extend stack for trigger arguments for slight performance gain.

Passes installcheck.

Cool, surprisingly in the non trigger case I saw up to an 18% speedup.
The trigger case remained about the same, I suppose im I/O bound.

Find attached a v2 with some minor fixes, If it looks good to you Ill
mark this as Ready for Commit.

Changes:
- move up a declaration to make it c90 safe
- avoid using tg_trigger before it was initialized
- only extend the stack to the size we need (there was + 1 which
unless I am missing something was needed because we used to push $_TD
on the stack, but we dont any more)



This looks pretty good. But why are we bothering to keep $prolog at all 
any more, if all we're going to pass it is PL_sv_no all the time? Maybe 
we'll have a use for it in the future, but right now we don't appear to 
unless I'm missing something.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 14

Jeff Davis pg...@j-davis.com wrote:
 
 On Mon, 2011-01-31 at 07:26 -0600, Kevin Grittner wrote:
 And why are you reading the infomask directly? Do the existing
 visibility functions not suffice?
  
 It's possible we re-invented some code somewhere, but I'm not
 clear on what code from this patch might use what existing
 function. Could you provide specifics?
 
 In CheckForSerializableConflictOut(), it takes a boolean valid.
 
Ah, now I see what you're talking about.  Take a look at where that
valid flag come from -- the CheckForSerializableConflictOut are
all place right after calls to HeapTupleSatisfiesVisibility.  The
valid value is what HeapTupleSatisfiesVisibility returned.  Is it
possible that the hint bits will not be accurate right after that? 
With that in mind, do you still see a problem with how things are
currently done?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

On Mon, 2011-01-31 at 11:24 -0500, Robert Haas wrote:
 On Mon, Jan 31, 2011 at 10:31 AM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
  Bruce Momjian br...@momjian.us wrote:
  As a novice I am not sure why we _wouldn't_ create two new
  separate error codes
 
  The argument for using SQLSTATE 40001 for failures which are
  strictly due to concurrency problems, and are likely to work if the
  transaction is retried, is that there is already a lot of software
  which knows how to do that.  On the other hand, going into such code
  to turn that into a list of concurrency failure states is probably
  only going to cause pain to those with applications intended to work
  with multiple DBMS products without much modification.
 
 Yeah, I think that one's pretty logical. 

  I think my vote is to either
 change the drop-database case to be the same as that

No, we shouldn't have a can retry errcode for a must not retry case.

 , or to use a new
 error code.  ERRCODE_ADMIN_SHUTDOWN is just strange.

It's not strange at all. It's the same error code as we use for all of
the other cases listed. We need that because it is the current
catch-all errcode for cannot retry.

The purpose of errcodes is to allow programs to check them and then act.
It's pointless to add a new errcode that is so rare that nobody will
ever program for it because they won't expect it, let alone test for it.
Or at least won't assign any sensible priority to handling that error.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 14

On Mon, 2011-01-31 at 13:32 -0600, Kevin Grittner wrote:
 Ah, now I see what you're talking about.  Take a look at where that
 valid flag come from -- the CheckForSerializableConflictOut are
 all place right after calls to HeapTupleSatisfiesVisibility.  The
 valid value is what HeapTupleSatisfiesVisibility returned.  Is it
 possible that the hint bits will not be accurate right after that? 
 With that in mind, do you still see a problem with how things are
 currently done?

Oh, ok. The staleness of the hint bit was a fairly minor point though.

Really, I think this should be using HTSV to separate concerns better
and improve readability. My first reaction was to try to find out what
the function was doing that's special. If it is doing something special,
and HTSV is not what you're really looking for, a comment to explain
would be helpful.

As an example, consider that Robert Haas recently suggested using an
infomask bit to mean frozen, rather than actually removing the xid, to
save the xid as forensic information. If that were to happen, your code
would be reading an xid that may have been re-used.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 14

Jeff Davis pg...@j-davis.com wrote:
 
 I don't think this function really cares about the visibility with
 respect to the current snapshot, right?
 
What it cares about is whether some other particular top level
transaction wrote a tuple which we *would* read except that it is
not visible to us because that other top level transaction is
concurrent with ours.  If so, we want to flag a read-write conflict
out from our transaction and in to that other transaction.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

Simon Riggs si...@2ndquadrant.com writes:
 On Mon, 2011-01-31 at 11:24 -0500, Robert Haas wrote:
 , or to use a new
 error code.  ERRCODE_ADMIN_SHUTDOWN is just strange.

 It's not strange at all. It's the same error code as we use for all of
 the other cases listed. We need that because it is the current
 catch-all errcode for cannot retry.

 The purpose of errcodes is to allow programs to check them and then act.
 It's pointless to add a new errcode that is so rare that nobody will
 ever program for it because they won't expect it, let alone test for it.
 Or at least won't assign any sensible priority to handling that error.

The trouble with ERRCODE_ADMIN_SHUTDOWN is that it might lead a
connection pooler to expect that *all* its connections are going bad,
not just the ones that are connected to a specific database.  I think
this is a bad decision.  Programs that are interested in testing for this
case at all are likely to need to be worried about that distinction.

Also, while I believe that ERRCODE_T_R_DEADLOCK_DETECTED is a reasonable
catchall retry code, I don't think it's equally sane to think that
ERRCODE_ADMIN_SHUTDOWN is a catchall non-retry code.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 14

Jeff Davis pg...@j-davis.com wrote:
 
 Really, I think this should be using HTSV to separate concerns
 better and improve readability. My first reaction was to try to
 find out what the function was doing that's special. If it is
 doing something special, and HTSV is not what you're really
 looking for, a comment to explain would be helpful.
 
It does seem that at least a comment would be needed.  I'm not at
all confident that there isn't some macro or function which would
yield what I need.  I just sent an email clarifying exactly what I
want to check, so if you can see a better way to determine that, I'm
all ears.
 
 As an example, consider that Robert Haas recently suggested using
 an infomask bit to mean frozen, rather than actually removing the
 xid, to save the xid as forensic information. If that were to
 happen, your code would be reading an xid that may have been
 re-used.
 
Yeah, clearly if the code remains as it is, it would be sensitive to
changes in how hint bits or the xid values are used.  If we can
abstract that, it's clearly a Good Thing to do so.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

On Mon, 2011-01-31 at 14:58 -0500, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  On Mon, 2011-01-31 at 11:24 -0500, Robert Haas wrote:
  , or to use a new
  error code.  ERRCODE_ADMIN_SHUTDOWN is just strange.
 
  It's not strange at all. It's the same error code as we use for all of
  the other cases listed. We need that because it is the current
  catch-all errcode for cannot retry.
 
  The purpose of errcodes is to allow programs to check them and then act.
  It's pointless to add a new errcode that is so rare that nobody will
  ever program for it because they won't expect it, let alone test for it.
  Or at least won't assign any sensible priority to handling that error.
 
 The trouble with ERRCODE_ADMIN_SHUTDOWN is that it might lead a
 connection pooler to expect that *all* its connections are going bad,
 not just the ones that are connected to a specific database.  I think
 this is a bad decision.  Programs that are interested in testing for this
 case at all are likely to need to be worried about that distinction.

That's a reasonable argument.

My objection to a new code is only to one that is so specific that
people have to program for ERRCODE_BLUE_MOON_ON_A_LEAP_YEAR.

Can we invent a new catch-all that might be used here? Something that
means unknown operational error, not sure what to do.

ERRCODE_ADMIN_OTHER or ERRCODE_ADMIN_UNCLASSIFIED.

 Also, while I believe that ERRCODE_T_R_DEADLOCK_DETECTED is a reasonable
 catchall retry code, I don't think it's equally sane to think that
 ERRCODE_ADMIN_SHUTDOWN is a catchall non-retry code.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Problem with postgresql database connection in combination with HUAWEI data modem

2011-01-31 Thread Jürgen Wolfsgruber

Hello,

I discussed my problem at www.pg-forum.de with Mr. Scherbaum (ADS)  and he 
recommended me to inform you about this problem.

I worked on my Mac-Book (Mac OS X 10.6.6) with postgresql database version 9 
(postgresql-9.0.1-1-osx.dmg). After installing and connecting my HUAWEI E122 
data modem (net provider 3 (drei)), I have detected, that I could't connect to 
database.  
Disconnecting the HUAWEI modem has no effects. I could not connect to database. 
Restarting the system makes the database available again.
The problem occurs exactly at connecting to internet from 3DataManager. There 
are no problems with connecting USB-Stick to Mac Book or starting the 
3DataManager. 

I installed the postgresql-database with default settings:

host: localhost
port: 5432
user: postgres

listen_addresses: *

This is a netstat-result after connecting via HUAWEI data modem:

Active Internet connections (including servers)
Proto Recv-Q Send-Q Local Address Foreign Address (state)
tcp4 0 0 178.113.226.250..52044 c-98-245-185-91..56489 ESTABLISHED
tcp4 0 0 *.21234 *.* LISTEN
tcp4 0 0 *.postgresql *.* LISTEN
tcp6 0 0 *.postgres *.* LISTEN
tcp4 0 0 .psf.ipp *.* LISTEN
tcp6 0 0 localhost.ipp *.* LISTEN

Trying to start a telnet-connection, the result was:

telnet 127.0.0.1 5432
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Operation timed out
telnet: Unable to connect to remote host

As I had no idea, what the problem could be, I tried installing the latest 
version of postgresql (postgresql-9.0.2-1-osx.dmg), maybe that this problem is 
fixed now. 
But it has no effect. 

Then I tried reproducing the problem on my Windows 7 system. I installed the 
HUAWEI data stick and connected it to the internet. 

I could connect to database without problems!! So I checked the version of 
postgresql on my windows system. The version is 8.4! 
(postgresql-8.4.2-1-windows.exe)

After uninstalling of postgresql 9 from my Mac Book and installing the latest 
8.4 version (postgresql-8.4.6-1-osx.dmg), the problem was also solved.

Maybe that the problem is a 32/64 bit problem. 

best regards,

Juergen Wolfsgruber




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 14

On Mon, 2011-01-31 at 13:55 -0600, Kevin Grittner wrote:
 Jeff Davis pg...@j-davis.com wrote:
  
  I don't think this function really cares about the visibility with
  respect to the current snapshot, right?
  
 What it cares about is whether some other particular top level
 transaction wrote a tuple which we *would* read except that it is
 not visible to us because that other top level transaction is
 concurrent with ours.

Or a tuple that you *are* reading, but is being deleted concurrently,
right? Or has been deleted by an overlapping transaction?

 If so, we want to flag a read-write conflict
 out from our transaction and in to that other transaction.

It still seems like HTSV would suffice, unless I'm missing something.

I think visible is still needed though: it matters in the cases
HEAPTUPLE_RECENTLY_DEAD and HEAPTUPLE_LIVE. For the former, it only
allows an early exit (if !visible); but for the latter, I think it's
required.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spread checkpoint sync

Robert Haas wrote:
 Back to the idea at hand - I proposed something a bit along these
 lines upthread, but my idea was to proactively perform the fsyncs on
 the relations that had gone the longest without a write, rather than
 the ones with the most dirty data.  I'm not sure which is better.
 Obviously, doing the ones that have gone idle gives the OS more time
 to write out the data, but OTOH it might not succeed in purging much
 dirty data.  Doing the ones with the most dirty data will definitely
 reduce the size of the final checkpoint, but might also cause a
 latency spike if it's triggered immediately after heavy write activity
 on that file.

Crazy idea #2 --- it would be interesting if you issued an fsync
_before_ you wrote out data to a file that needed an fsync.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 14

Jeff Davis pg...@j-davis.com wrote:
 On Mon, 2011-01-31 at 13:55 -0600, Kevin Grittner wrote:
 
 What it cares about is whether some other particular top level
 transaction wrote a tuple which we *would* read except that it is
 not visible to us because that other top level transaction is
 concurrent with ours.
 
 Or a tuple that you *are* reading, but is being deleted
 concurrently, right? Or has been deleted by an overlapping
 transaction?
 
Right.  I guess that wasn't as precise a statement as I thought.  I
was trying to say that the effects of some write (insert, update,
delete to a permanent table) would not be visible to us because the
writing transaction is concurrent, for some tuple under
consideration.
 
 If so, we want to flag a read-write conflict
 out from our transaction and in to that other transaction.
 
 It still seems like HTSV would suffice, unless I'm missing
 something.
 
It is at least as likely that I'm missing something.  If I'm
following you, we're talking about these 24 lines of code, where
valid is the what was just returned from
HeapTupleSatisfiesVisibility:
 
if (valid)
{
/*
 * We may bail out if previous xmax aborted, or if it
committed but
 * only locked the tuple without updating it.
 */
if (tuple-t_data-t_infomask  (HEAP_XMAX_INVALID |
HEAP_IS_LOCKED))
return;

/*
 * If there's a valid xmax, it must be from a concurrent
transaction,
 * since it deleted a tuple which is visible to us.
 */
xid = HeapTupleHeaderGetXmax(tuple-t_data);
if (!TransactionIdIsValid(xid))
return;
}
else
{
/*
 * We would read this row, but it isn't visible to us.
 */
xid = HeapTupleHeaderGetXmin(tuple-t_data);
}
 
We follow this by a check for the top-level xid, and return if
that's early enough to have overlapped our transaction.
 
This seems to work as intended for a all known tests.  I guess my
questions would be:
 
(1)  Do you see a case where this would do the wrong thing?  Can you
describe that or (even better) provide a test case to demonstrate
it?
 
(2)  I haven't gotten my head around how HTSV helps or is even the
right thing.  If I want to try the switch statement from your recent
post, what should I use as the OldestXmin value on the call to HTSV?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 14

I wrote:
 
 We follow this by a check for the top-level xid, and return if
 that's early enough to have overlapped our transaction.
 
s/early enough to have overlapped/early enough not to have
overlapped/
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 14

On Mon, 2011-01-31 at 14:38 -0600, Kevin Grittner wrote:
 It is at least as likely that I'm missing something.  If I'm
 following you, we're talking about these 24 lines of code, where
 valid is the what was just returned from
 HeapTupleSatisfiesVisibility:

Yes.

 (1)  Do you see a case where this would do the wrong thing?  Can you
 describe that or (even better) provide a test case to demonstrate
 it?

No, I don't see any incorrect results.
 
 (2)  I haven't gotten my head around how HTSV helps or is even the
 right thing.

It primarily just encapsulates the access to the tuple header fields. I
think that avoiding the messy logic of hint bits, tuple locks, etc., is
a significant win for readability and maintainability.

 If I want to try the switch statement from your recent
 post, what should I use as the OldestXmin value on the call to HTSV?

I believe RecentGlobalXmin should work.

And I don't think the original switch statement I posted did the right
thing for HEAPTUPLE_LIVE. I think that case needs to account for the
visible flag (if it's live but not visible, that's the same as
insert-in-progress for your purposes).

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spread checkpoint sync

2011-01-31 Thread Greg Smith


Tom Lane wrote:

I wonder whether it'd be useful to keep track of the total amount of
data written-and-not-yet-synced, and to issue fsyncs often enough to
keep that below some parameter; the idea being that the parameter would
limit how much dirty kernel disk cache there is.  Of course, ideally the
kernel would have a similar tunable and this would be a waste of effort
on our part...
  


I wanted to run the tests again before reporting in detail here, because 
the results are so bad, but I threw out an initial report about trying 
to push this toward this down to be the kernel's job at 
http://blog.2ndquadrant.com/en/2011/01/tuning-linux-for-low-postgresq.html


So far it looks like the newish Linux dirty_bytes parameter works well 
at reducing latency by limiting how much dirty data can pile up before 
it gets nudged heavily toward disk.  But the throughput drop you pay on 
VACUUM in particular is brutal, I'm seeing over a 50% slowdown in some 
cases.  I suspect we need to let the regular cleaner and backend writes 
queue up in the largest possible cache for VACUUM, so it benefits as 
much as possible from elevator sorting of writes.  I suspect this being 
the worst case now for a tightly controlled write cache is an unintended 
side-effect of the ring buffer implementation it uses now.


Right now I'm running the same tests on XFS instead of ext3, and those 
are just way more sensible all around; I'll revisit this on that 
filesystem and ext4.  The scale=500 tests I've running lots of lately 
are a full 3X TPS faster on XFS relative to ext3, with about 1/8 as much 
worst-case latency.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 14

Jeff Davis pg...@j-davis.com wrote:
 
 Ok, great. When I read that before I thought that WAL might need
 to be sent for implicit RO transactions. I will read it more
 carefully again.
 
In looking back over recent posts to see what I might have missed or
misinterpreted, I now see your point.  Either of these alternatives
would involve potentially sending something through the WAL on
commit or rollback of some serializable transactions which *did not*
write anything, if they were not *declared* to be READ ONLY.  If
that is not currently happening (again, I confess to not having yet
delved into the mysteries of writing WAL records), then we would
need a new WAL record type for writing these.
 
That said, the logic would not make it at all useful to send
something for *every* such transaction, and I've rather assumed
that we would want some heuristic for setting a minimum interval
between notifications, whether we sent the snapshots themselves or
just flags to indicate it was time to build or validate a candidate
snapshot.
 
Sorry for misunderstanding the concerns.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

Simon Riggs si...@2ndquadrant.com writes:
 On Mon, 2011-01-31 at 14:58 -0500, Tom Lane wrote:
 The trouble with ERRCODE_ADMIN_SHUTDOWN is that it might lead a
 connection pooler to expect that *all* its connections are going bad,
 not just the ones that are connected to a specific database.  I think
 this is a bad decision.  Programs that are interested in testing for this
 case at all are likely to need to be worried about that distinction.

 That's a reasonable argument.

 My objection to a new code is only to one that is so specific that
 people have to program for ERRCODE_BLUE_MOON_ON_A_LEAP_YEAR.

What's wrong with ERRCODE_DATABASE_DROPPED, or something like that?

 Can we invent a new catch-all that might be used here? Something that
 means unknown operational error, not sure what to do.

Because that's not the situation here.  We know exactly what a pooler
should do.  It might be an infrequent case, but obscurantism isn't going
to help anyone.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spread checkpoint sync

Robert Haas robertmh...@gmail.com writes:
 Back to the idea at hand - I proposed something a bit along these
 lines upthread, but my idea was to proactively perform the fsyncs on
 the relations that had gone the longest without a write, rather than
 the ones with the most dirty data.

Yeah.  What I meant to suggest, but evidently didn't explain well, was
to use that or something much like it as the rule for deciding *what* to
fsync next, but to use amount-of-unsynced-data-versus-threshold as the
method for deciding *when* to do the next fsync.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 14

Jeff Davis pg...@j-davis.com wrote:
 On Mon, 2011-01-31 at 14:38 -0600, Kevin Grittner wrote:
 
 If I want to try the switch statement from your recent
 post, what should I use as the OldestXmin value on the call to
 HTSV?
 
 I believe RecentGlobalXmin should work.
 
 And I don't think the original switch statement I posted did the
 right thing for HEAPTUPLE_LIVE. I think that case needs to account
 for the visible flag (if it's live but not visible, that's the
 same as insert-in-progress for your purposes).
 
I'll try to set this up and see if I can get it to pass the check
and dcheck make targets.  Can we assume that the performance impact
would be too small to matter when we know for sure that hint bits
have already been set?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

2011-01-31 Thread Josh Berkus

On 1/31/11 11:26 AM, Simon Riggs wrote:
 The purpose of errcodes is to allow programs to check them and then act.
 It's pointless to add a new errcode that is so rare that nobody will
 ever program for it because they won't expect it, let alone test for it.
 Or at least won't assign any sensible priority to handling that error.

Personally, I would prefer a new error code because I *intend* to write
specific code to deal with replication cancellation.  Without a specific
code, then I cannot distinguish replication cancel from whatever else
it's lumped in with, except by regexping the error message.

If we're not going to get a new code, then I'd prefer DEADLOCK_DETECTED
because the behavior we expect from the client (try the whole
transaction over from the beginning) is the same.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spread checkpoint sync

2011-01-31 Thread Greg Smith


Tom Lane wrote:

Robert Haas robertmh...@gmail.com writes:
  

3. Pause for 3 seconds after every fsync.



  

I think something along the lines of #3 is probably a good idea,



Really?  Any particular delay is guaranteed wrong.

  


'3 seconds' is just a placeholder for whatever comes out of a total 
time scheduled to sync / relations to sync computation.  (Still doing 
all my thinking in terms of time, altough I recognize a showdown with 
segment-based checkpoints is coming too)


I think the right way to compute relations to sync is to finish the 
sorted writes patch I sent over a not quite right yet update to already, 
which is my next thing to work on here.  I remain pessimistic that any 
attempt to issue fsync calls without the maximum possible delay after 
asking kernel to write things out first will work out well.  My recent 
tests with low values of dirty_bytes on Linux just reinforces how bad 
that can turn out.  In addition to computing the relation count while 
sorting them, placing writes in-order by relation and then doing all 
writes followed by all syncs should place the database right in the 
middle of the throughput/latency trade-off here.  It will have had the 
maximum amount of time we can give it to sort and flush writes for any 
given relation before it is asked to sync it.  I don't want to try and 
be any smarter than that without trying to be a *lot* smarter--timing 
individual sync calls, feedback loops on time estimation, etc.


At this point I have to agree with Robert's observation that splitting 
checkpoints into checkpoint_write_target and checkpoint_sync_target is 
the only reasonable thing left that might be possible complete in a 
short period.  So that's how this can compute the total time numerator here.


The main thing I will warn about in relations to discussion today is the 
danger of true dead-line oriented scheduling in this area.  The 
checkpoint process may discover the sync phase is falling behind 
expectations because the individual sync calls are taking longer than 
expected.  If that happens, aiming for the finish on target anyway 
goal puts you right back to a guaranteed nasty write spike again.  I 
think many people would prefer logging the overrun as tuning feedback 
for the DBA rather than to accelerate, which is likely to make the 
problem even worse if the checkpoint is falling behind.  But since 
ultimately the feedback for this will be make the checkpoints longer or 
increase checkpoint_sync_target, sync acceleration to meet the deadline 
isn't unacceptable; DBA can try both of those themselves if seeing spikes.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books

Re: [HACKERS] SSI patch version 15


On 31.01.2011 20:05, Robert Haas wrote:

On Mon, Jan 31, 2011 at 12:31 PM, Kevin Grittner
kevin.gritt...@wicourts.gov  wrote:

Pretty minimal differences from V14, but I figured it would save the
committer some work if I rolled them all up here.


Sounds good.  I believe Heikki is planning to work on this one.
Hopefully that will happen soon, since we are running short on time.


Yeah, I can commit this. Jeff, are you satisfied with this patch now? 
I'm glad you're reviewing this, more eyeballs helps a lot with a big 
patch like this.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 14

On Mon, 2011-01-31 at 15:30 -0600, Kevin Grittner wrote:
 I'll try to set this up and see if I can get it to pass the check
 and dcheck make targets.  Can we assume that the performance impact
 would be too small to matter when we know for sure that hint bits
 have already been set?

I think that's a safe assumption. If there is some kind of noticeable
difference in conflict rates or runtime, that probably indicates a bug
in the new or old code.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [pgsql-general 2011-1-21:] Are there any projects interested in object functionality? (+ rule bases)

2011-01-31 Thread Nick Rudnick

Interesting... I remember that some years ago, I fiddled around with 
functions, operators etc. to allow a method like syntax -- but I ever 
was worried this approach would have serious weaknesses -- are there any 
principal hindrances to having methods, if no, can this be implemented 
in a straightforward way?


Thank you in advance,

Nick

On 01/31/2011 03:19 PM, Robert Haas wrote:

On Mon, Jan 31, 2011 at 4:34 AM, Pavel Stehulepavel.steh...@gmail.com  wrote:

What I know no body is working on SQL/OLB ISO/IEC 9075-10 now.

I proposed a 3 years ago a support of methods, but without success.
This propose was rejected. There isn't a real interest to implement it
from commiters. And I have to say - users doesn't request it too. And
there are a few issues with compatibility.

It seems to me it's a bit unfair to say there isn't real interest to
implement it from committers.  Plenty of features get implemented
that no committer particularly cares about, because a number of
committers - including me - spend a good deal of time reviewing and
committing patches written by other people which they never would have
written themselves.  It's true that patches sometimes get swatted down
because they are judged to be insufficiently useful or badly design or
because they create compatibility breaks, but that's not the same as
not interested, which to me implies a sort of purely arbitrary
rejection that I try hard to avoid.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [pgsql-general 2011-1-21:] Are there any projects interested in object functionality? (+ rule bases)

2011-01-31 Thread Nick Rudnick


Hello Robert,

a good moment to clear things up:

* Of course, compliance with an ISO-SQL standard is of minimal 
importance -- I just grabbed it from the docs.


* The same holds (in a somewhat weaker way) for Java -- I would even 
prefer the more general notion type instead of OO, but I am asking in 
interest of a CS professor I worked for in the past, who is looking for 
impressive demos of non standard DBMS technology. The students might 
find resemblance to Java helpful, who knows?


* In this regard it is of interest in how far there are principal 
efficiency problems with the support of (deeply nested) object like 
structure by the backend, or if the backend may be expected to do this 
job not terribly worse then more specialized OODMS -- of course, I would 
be interested in any discussions of these topics...


* The same question for doing rule bases on top of the PostgreSQL backend...

* For teaching at university courses, on the other hand, efficiency 
would be of lower interest, so there was an idea that there might be 
some (possibly toy example like) efforts to tune the frontend into this 
direction.


== Academic prototypes, which preferably do not bring too much overhead 
for students.


Cheers,

Nick



On 01/31/2011 03:22 PM, Robert Haas wrote:

On Mon, Jan 31, 2011 at 3:32 AM, Jörg Roman Rudnick
joerg.rudn...@t-online.de  wrote:

* are there any people / projects known which are interested in ORDBMS /
OODBMS usage of PostgreSQL? Strict SQL standard conformance is less
important than the possibility to provide instructive and impressive
examples to students.

* are there any people / projects known which are interested in extending
PostgreSQL at a higher level (plpgsql, creating operators, etc.) for the
sake of ORDBMS / OODBMS functionality?

* are there any people / projects known which are interested in extending
PostgreSQL on the level of developing C code for the sake of ORDBMS / OODBMS
functionality?

* in how far does the backend support such efforts -- would it do fine, or
is rather to be expected that doing ORDBMS / OODBMS driven queries would
lead to disastrous performance?

* are there any people / projects known which are interested in using the
rule (?trigger?) system of PostgreSQL (maybe with extensions) to achieve
some kind of rule base / datalog type inference engines? In how far does the
backend constrain this in regard of performance?

I don't really know much about ORDBMS / OODBMS functionality; a quick
Google search suggests that SQL/OLB is mostly about Java language
bindings, and there's a separate project (pgsql-jdbc) which works on
PostgreSQL connectivity for Java.  As far as changes to the core
database are concerned, user-defined functions and operators are not
hard to create, but I'm fuzzy on what specifically you want to do.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Invitation to Cluster Hackers meeting at pgCon

2011-01-31 Thread Josh Berkus

All,

This year we will be having a Cluster Hackers summit at pgCon.  You
are invited if you are currently working on any PostgreSQL replication
or clustering solution, or core features to support such solutions.
This includes, but is not limited to: PostgreXC, GridSQL, Postgres-R,
Slony-I, Bucardo, binary replication, pgPool, Skytools, Aster Data,
Greenplum, HadoopDB, or others.

This is a working meeting, intended to coordinate development of joint
open source projects in order to enhance PostgreSQL clustering.  It is
not a meeting for users, just hackers.  The meeting is sponsored by NTT
Open Source.

Tuesday May 17th
9am to 5pm
University of Ottawa Campus (room TBD)

This is concurrent with the first day of tutorials at pgCon.

Please RSVP if you plan to be there.  Please also make a note if you
need assistance with travel costs (We probably will not have funds
for assistance, but I'll let you know if we do).

RSVP:
https://spreadsheets.google.com/viewform?formkey=dEFINzFPYlROTFZBVjJEYVFTbWZIUXc6MQ

Tentative schedule is:

09:00-09:30 - Introduction, organize schedule

09:30-12:00 - Status updates on clustering projects
and on core-code clustering projects

12:00-13:00 - Break for lunch

13:00-17:00 - Breakout sessions: discuss specific projects
  in small groups
  (extra room will be available for this)

19:00 Dinner (optional, not sponsored)




-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 15

On Mon, 2011-01-31 at 23:35 +0200, Heikki Linnakangas wrote:
 Yeah, I can commit this. Jeff, are you satisfied with this patch now? 
 I'm glad you're reviewing this, more eyeballs helps a lot with a big 
 patch like this.

I think the patch is very close. I am doing my best in my free time to
complete a thorough review. If you have other patches to review/commit
then I will still be making progress reviewing SSI.

However, I would recommend leaving yourself some time to think on this
one if you don't already understand the design well.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

On Mon, 2011-01-31 at 16:24 -0500, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  On Mon, 2011-01-31 at 14:58 -0500, Tom Lane wrote:
  The trouble with ERRCODE_ADMIN_SHUTDOWN is that it might lead a
  connection pooler to expect that *all* its connections are going bad,
  not just the ones that are connected to a specific database.  I think
  this is a bad decision.  Programs that are interested in testing for this
  case at all are likely to need to be worried about that distinction.
 
  That's a reasonable argument.
 
  My objection to a new code is only to one that is so specific that
  people have to program for ERRCODE_BLUE_MOON_ON_A_LEAP_YEAR.
 
 What's wrong with ERRCODE_DATABASE_DROPPED, or something like that?
 
  Can we invent a new catch-all that might be used here? Something that
  means unknown operational error, not sure what to do.
 
 Because that's not the situation here.  We know exactly what a pooler
 should do.  It might be an infrequent case, but obscurantism isn't going
 to help anyone.

OK, that makes sense to me.

I would make ERRCODE_DATABASE_DROPPED an Invalid Authorization error,
rather than a Transaction Rollback code. So sqlstate 28P02

The sensible handling of such an error is not to retry, or at least to
switch to an alternate if one is available.

Ready to commit if no objection.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

On Mon, Jan 31, 2011 at 6:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, 2011-01-31 at 16:24 -0500, Tom Lane wrote:
 Simon Riggs si...@2ndquadrant.com writes:
  On Mon, 2011-01-31 at 14:58 -0500, Tom Lane wrote:
  The trouble with ERRCODE_ADMIN_SHUTDOWN is that it might lead a
  connection pooler to expect that *all* its connections are going bad,
  not just the ones that are connected to a specific database.  I think
  this is a bad decision.  Programs that are interested in testing for this
  case at all are likely to need to be worried about that distinction.

  That's a reasonable argument.

  My objection to a new code is only to one that is so specific that
  people have to program for ERRCODE_BLUE_MOON_ON_A_LEAP_YEAR.

 What's wrong with ERRCODE_DATABASE_DROPPED, or something like that?

  Can we invent a new catch-all that might be used here? Something that
  means unknown operational error, not sure what to do.

 Because that's not the situation here.  We know exactly what a pooler
 should do.  It might be an infrequent case, but obscurantism isn't going
 to help anyone.

 OK, that makes sense to me.

 I would make ERRCODE_DATABASE_DROPPED an Invalid Authorization error,
 rather than a Transaction Rollback code. So sqlstate 28P02

 The sensible handling of such an error is not to retry, or at least to
 switch to an alternate if one is available.

 Ready to commit if no objection.

ISTM it should still be in class 40.  There's nothing wrong with the
user's authorization; we've just decided to roll back the transaction
for our own purposes.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] SSI patch version 14

Jeff Davis pg...@j-davis.com wrote:
On Mon, 2011-01-31 at 15:30 -0600, Kevin Grittner wrote:
I'll try to set this up and see if I can get it to pass the check
and dcheck make targets. Can we assume that the performance
impact would be too small to matter when we know for sure that
hint bits have already been set?

I think that's a safe assumption. If there is some kind of
noticeable difference in conflict rates or runtime, that probably
indicates a bug in the new or old code.

That worked fine; passed check and dcheck targets.

Here's the code:

http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=6360b0d4ca88c09cf590a75409cd29831afff58b

With confidence that it works, I looked it over some more and now
like this a lot. It is definitely more readable and should be less
fragile in the face of changes to MVCC bit-twiddling techniques. Of
course, any changes to the HTSV_Result enum will require changes to
this code, but that seems easier to spot and fix than the
alternative. Thanks for the suggestion!

Having gotten my head around it, I embellished here:

http://git.postgresql.org/gitweb?p=users/kgrittn/postgres.git;a=commitdiff;h=f9307a41c198a9aa4203eb529f9c6d1b55c5c6e1

Do those changes look reasonable? None of that is really
*necessary*, but it seemed cleaner and clearer that way once I
looked at the code with the changes you suggested.

-Kevin

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [NOVICE] systable_getnext_ordered

y...@mwd.biglobe.ne.jp (YAMAMOTO Takashi) writes:
 the attached patch is to avoid unnecessary detoast'ing and EOF marker pages
 when possible.  does it make sense?

The blob page size is already chosen not to allow for out-of-line
storage, not to mention that pg_largeobject doesn't have a TOAST table.
So I think avoiding detoasting is largely a waste of time.  I'm
unexcited about the other consideration too --- it looks to me like it
just makes truncation slower, more complicated, and hence more
bug-prone, in return for a possible speedup that probably nobody will
ever notice.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

On Mon, 2011-01-31 at 18:21 -0500, Robert Haas wrote:
  Ready to commit if no objection.
 
 ISTM it should still be in class 40.  There's nothing wrong with the
 user's authorization; we've just decided to roll back the transaction
 for our own purposes.

OK.

BTW, anybody know why we have PL/pgSQL condition codes for conditions
that can't be trapped by PL/pgSQL? ERRCODE_ADMIN_SHUTDOWN and
ERRCODE_DATABASE_DROPPED are always FATAL. Seems like pointless code to
me.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

Robert Haas robertmh...@gmail.com writes:
 On Mon, Jan 31, 2011 at 6:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
 I would make ERRCODE_DATABASE_DROPPED an Invalid Authorization error,
 rather than a Transaction Rollback code. So sqlstate 28P02

 ISTM it should still be in class 40.  There's nothing wrong with the
 user's authorization; we've just decided to roll back the transaction
 for our own purposes.

I agree, 28 is a completely off-point category.  But it wasn't in 40
before, either --- we are talking about where it currently says
ADMIN_SHUTDOWN, no?  I'd vote for keeping it in class 57 (operator
intervention), as that is both sensible and a minimal change from
current behavior.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

2011-01-31 Thread Josh Berkus

 BTW, anybody know why we have PL/pgSQL condition codes for conditions
 that can't be trapped by PL/pgSQL? ERRCODE_ADMIN_SHUTDOWN and
 ERRCODE_DATABASE_DROPPED are always FATAL. Seems like pointless code to
 me.

So we can support autonomous transactions in the future?


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

On Mon, Jan 31, 2011 at 7:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Jan 31, 2011 at 6:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
 I would make ERRCODE_DATABASE_DROPPED an Invalid Authorization error,
 rather than a Transaction Rollback code. So sqlstate 28P02

 ISTM it should still be in class 40.  There's nothing wrong with the
 user's authorization; we've just decided to roll back the transaction
 for our own purposes.

 I agree, 28 is a completely off-point category.  But it wasn't in 40
 before, either --- we are talking about where it currently says
 ADMIN_SHUTDOWN, no?  I'd vote for keeping it in class 57 (operator
 intervention), as that is both sensible and a minimal change from
 current behavior.

Seems a little weird to me, since the administrator hasn't done
anything.  It's the system that has decide to roll the transaction
back, not the operator.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

On Mon, Jan 31, 2011 at 7:13 PM, Josh Berkus j...@agliodbs.com wrote:
 BTW, anybody know why we have PL/pgSQL condition codes for conditions
 that can't be trapped by PL/pgSQL? ERRCODE_ADMIN_SHUTDOWN and
 ERRCODE_DATABASE_DROPPED are always FATAL. Seems like pointless code to
 me.

 So we can support autonomous transactions in the future?

Huh?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

Robert Haas robertmh...@gmail.com writes:
 On Mon, Jan 31, 2011 at 7:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I agree, 28 is a completely off-point category.  But it wasn't in 40
 before, either --- we are talking about where it currently says
 ADMIN_SHUTDOWN, no?  I'd vote for keeping it in class 57 (operator
 intervention), as that is both sensible and a minimal change from
 current behavior.

 Seems a little weird to me, since the administrator hasn't done
 anything.

Sure he has: he issued the DROP DATABASE command that's causing the
system to disconnect standby sessions.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

On Mon, 2011-01-31 at 19:12 -0500, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  On Mon, Jan 31, 2011 at 6:07 PM, Simon Riggs si...@2ndquadrant.com wrote:
  I would make ERRCODE_DATABASE_DROPPED an Invalid Authorization error,
  rather than a Transaction Rollback code. So sqlstate 28P02
 
  ISTM it should still be in class 40.  There's nothing wrong with the
  user's authorization; we've just decided to roll back the transaction
  for our own purposes.
 
 I agree, 28 is a completely off-point category.  But it wasn't in 40
 before, either --- we are talking about where it currently says
 ADMIN_SHUTDOWN, no?  I'd vote for keeping it in class 57 (operator
 intervention), as that is both sensible and a minimal change from
 current behavior.

Sorry about that chaps, didn't see your emails: I just committed.

57 was another one I considered, as was 08 connection failures. I don't
think 40 was the right place, but my bed was calling. I'll sleep now and
fix in my morning according to what y'all decide.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

Simon Riggs si...@2ndquadrant.com writes:
 BTW, anybody know why we have PL/pgSQL condition codes for conditions
 that can't be trapped by PL/pgSQL? ERRCODE_ADMIN_SHUTDOWN and
 ERRCODE_DATABASE_DROPPED are always FATAL. Seems like pointless code to
 me.

There's a difference between not being able to trap the error and not
being able to name it at all.  You might wish to throw it, for instance.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_upgrade fails for non-postgres user

Magnus Hagander wrote:
 I just tried doing pg_upgrade on a database when logged in as user
 mha rather than postgres on my system. And it failed. Even though
 the db was initialized with superuser mha. The reason for this was
 that pg_upgrade tried to connect to the database mha (hardcoded to
 be the db username), and that certainly didn't exist.
 
 When that was fixed, I realized the psql command to create the
 datanbases connect to database template1 only to immediately switch
 to database postgres, which also seems rather pointless.
 
 Attach patch makes it connect to the postgres database instead of
 $USER, and then also changes the psql command to actually use it.
 
 I know way too little about pg_upgrade to tell if this is fully safe,
 but it does fix the problem in my installation.

I have found that this problem only affects PG 9.1 and is not part of
released PG 9.0 because we don't restore pg_authid in 9.0 (we don't need
to because we have no pg_largeobject_metadata table in PG 8.4).

I have applied a modified version of your patch to always retore into
the 'postgres' database rather than the OS user.  I thought we created
an os-user-named database, but it seems that database is always called
'postgres' but is owned by the OS user.  That seems kind of
inconsistent, but no matter.

I did not modify what we use for psql because everything else in
pg_upgrade connects to template1.  I am surprised that we recommend
restoring pg_dump to the 'postgres' database rather than template1, and
have no idea why we do that.  pg_dumpall also favors the 'postgres'
database:

   -l dbname, --database=dbname
   Specifies the name of the database to connect to to
   dump global objects and discover what other databases
   should be dumped. If not specified, the postgres
   database will be used, and if that does not exist,
   template1 will be used.

Anyway, it seems good to keep consistent and I defined a macro to record
what pg_dumpall uses as a hard-coded database for the restore. 
pg_dumpall always assumes the 'postgres' database exists, so we are OK
there.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/contrib/pg_upgrade/pg_upgrade.c b/contrib/pg_upgrade/pg_upgrade.c
index 294f58b..d3e1fef 100644
*** a/contrib/pg_upgrade/pg_upgrade.c
--- b/contrib/pg_upgrade/pg_upgrade.c
*** static void set_frozenxids(void);
*** 50,55 
--- 50,58 
  static void setup(char *argv0, bool live_check);
  static void cleanup(void);
  
+ /* This is the database used by pg_dumpall to restore global tables */
+ #define GLOBAL_DUMP_DB	postgres
+ 
  ClusterInfo old_cluster, new_cluster;
  OSInfo		os_info;
  
*** prepare_new_databases(void)
*** 226,235 
  	prep_status(Creating databases in the new cluster);
  
  	/*
! 	 *	Install support functions in the database accessed by
! 	 *	GLOBALS_DUMP_FILE because it can preserve pg_authid.oid.
  	 */
! 	install_support_functions_in_new_db(os_info.user);
  
  	/*
  	 * We have to create the databases first so we can install support
--- 229,238 
  	prep_status(Creating databases in the new cluster);
  
  	/*
! 	 *	Install support functions in the global-restore database
! 	 *	to preserve pg_authid.oid.
  	 */
! 	install_support_functions_in_new_db(GLOBAL_DUMP_DB);
  
  	/*
  	 * We have to create the databases first so we can install support
*** create_new_objects(void)
*** 266,272 
  		DbInfo	   *new_db = new_cluster.dbarr.dbs[dbnum];
  
  		/* skip db we already installed */
! 		if (strcmp(new_db-db_name, os_info.user) != 0)
  			install_support_functions_in_new_db(new_db-db_name);
  	}
  	check_ok();
--- 269,275 
  		DbInfo	   *new_db = new_cluster.dbarr.dbs[dbnum];
  
  		/* skip db we already installed */
! 		if (strcmp(new_db-db_name, GLOBAL_DUMP_DB) != 0)
  			install_support_functions_in_new_db(new_db-db_name);
  	}
  	check_ok();

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

On Mon, Jan 31, 2011 at 7:25 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Jan 31, 2011 at 7:12 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I agree, 28 is a completely off-point category.  But it wasn't in 40
 before, either --- we are talking about where it currently says
 ADMIN_SHUTDOWN, no?  I'd vote for keeping it in class 57 (operator
 intervention), as that is both sensible and a minimal change from
 current behavior.

 Seems a little weird to me, since the administrator hasn't done
 anything.

 Sure he has: he issued the DROP DATABASE command that's causing the
 system to disconnect standby sessions.

Well, I'm not sure how much this matters - as long as it's a dedicated
error code, the user can write code to DTRT somehow.  But I don't buy
your argument.  Ultimately, user activity causes any kind of recovery
conflict.  So I don't see why one particular kind of recovery conflict
should be in a different class than all the others.  It's the
administrator (I guess) who ran VACUUM FREEZE and blew up every query
running on the standby, too.  But the user doesn't directly control
when recovery conflicts get fired on the standby - it's the system
that decides to do that.  Right now, we happen to ignore
max_standby_delay for drop database, but in general the coupling
between when an action happens on the master and when conflicts occur
on the standby is fairly loose.

Then again - in theory, there's no reason why we couldn't drop a
database on the master when it's in use, kicking out everyone using it
with this very same error code.  We don't happen to handle it that way
right now, but...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

Robert Haas robertmh...@gmail.com writes:
 On Mon, Jan 31, 2011 at 7:25 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Seems a little weird to me, since the administrator hasn't done
 anything.

 Sure he has: he issued the DROP DATABASE command that's causing the
 system to disconnect standby sessions.

 Well, I'm not sure how much this matters - as long as it's a dedicated
 error code, the user can write code to DTRT somehow.  But I don't buy
 your argument.  Ultimately, user activity causes any kind of recovery
 conflict.

Well, yeah, but the predictability of the failure is pretty variable.
In this case we can say that the error definitely would not have
occurred if somebody hadn't done a DROP DATABASE on the master while
there were live sessions in that DB on the slave.  I think that's a
sufficiently close coupling to say that the error is the result of an
operator action.  OTOH, the occurrence of deadlocks is (usually) a lot
more dependent on random-chance timing of different transactions, and
you usually can't point to any action that intentionally caused a
deadlock.

 Then again - in theory, there's no reason why we couldn't drop a
 database on the master when it's in use, kicking out everyone using it
 with this very same error code.  We don't happen to handle it that way
 right now, but...

Yeah, that was in the back of my mind too.  DROP DATABASE foo FORCE,
maybe?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Add ENCODING option to COPY

2011-01-31 Thread Hitoshi Harada

2011/2/1 Tom Lane t...@sss.pgh.pa.us:
 Hitoshi Harada umi.tan...@gmail.com writes:
 Finally I concluded the concern Itagaki-san raised can be solved by
 adding code that restores client_encoding in copy_in_error_callback.

 It might happen to work today (or at least in the scenarios you tested),
 but it seems fragile as can be.

Although I thought its fragile-ness was acceptable to avoid making the
patch too complex, I agree with you.
The third patch is attached, modifying mb routines so that they can
receive conversion procedures as FmgrInof * and save the function
pointer in CopyState.
I tested it with encoding option and could not see performance slowdown.

Regards,


-- 
Hitoshi Harada


copy_encoding.v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Authentication Enhancement Proposal

2011-01-31 Thread Christopher Hotchkiss

To All,
I would like to propose (and volunteer to do if its considered to be a
decent idea) to extend the mapping of users to roles in the pg_ident.conf to
incorporate groups. This would allow any user who belonged to a particular
group in certain authentication systems to be mapped to a role using the
existing regular expression support that exists today. This would also allow
the offloading of the creation of new users for the system to an external
mechanism instead of needing to create a new role in the database for each
person. At the same time by allowing the mapping to match based off of
groups the offloading of authentication would still allow for restrictions
of who could connect to the database.

A second enhancement that would be useful would be despite what role the
database logs the user in as the server sets a read only session variable
similar to application_name could store the system username or username plus
groups for use in audit triggers.

For example:
User Bob is a sales clerk and needs to login to the database... the database
client uses his existing login credentials (sspi, gssapi, kerberos or pam)
and attempts authentication to the database. The database takes the incoming
user name appends the groups Bob belongs to and finds a user map that maps
him to a generic role for access privileges into the database.

Changes Needed:
- Add support for an option append_groups to the sspi, gssapi, kerberos
and pam authentication methods in pg_hba.conf
- After the authentication process if append_groups is enabled, use the apis
for those authentication methods to append all groups for the user in the
following format
- - USERNAME - USERNAME:[GROUP][,GROUP]...
- Add another session variable similar to session_user and current_user that
stores the username + group that cannot be reset without superuser
privileges.

How does this proposal sound?
-- 
Christopher Hotchkiss chotchki
http://www.chotchki.us

Re: [HACKERS] FPI

On Mon, Jan 31, 2011 at 10:01 AM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jan 28, 2011 at 3:39 PM, Robert Haas robertmh...@gmail.com wrote:
 What happens if we (a) keep the current rule after reaching
 consistency and (b) apply any such updates *unconditionally* - that
 is, without reference to the LSN - prior to reaching consistency?
 Under that rule, if we encounter an FPI before reaching consistency,
 we're OK.  So let's suppose we don't.  No matter how many times we
 replay any initial prefix of any such updates between the redo pointer
 and the point at which we reach consistency, the state of the page
 when we finally reach consistency will be identical.  But we could get
 hosed if replay progressed *past* the minimum recovery point and then
 started over at the previous redo pointer.  If we forced an immediate
 restartpoint on reaching consistency, that seems like it might prevent
 that scenario.

 Actually, I'm wrong, and this doesn't work at all.  At the time of the
 crash, there could already be pages on disk with LSNs greater than the
 minimum recovery point.  Duh.

 It was such a good idea in my head...

I should mention that most of this idea was Heikki's, originally.
Except for the crappy parts that don't work - those are all me.  But
I'm back to thinking this can work.  Heikki pointed out to me on IM
today that in crash recovery, we always replay to end-of-WAL before
opening for connections, and for Hot Standby every block we write
advances the minimum recovery point to its LSN.  This implies that if
we're accepting connections (either regular or Hot Standby) or at a
valid stopping point for PITR, there are no unreplayed WAL records
whose changes are reflected in blocks on disk.

So I'm back to proposing that we just apply FPI-free WAL records
unconditionally, without regard to the LSN.  This could potentially
corrupt the page, of course.  Consider delete (no FPI) - vacuum (with
FPI) - crash, leaving the vacuum page half on disk.  Now the replay of
the delete is probably going to do the wrong thing, because the page
is torn.  But it doesn't matter, because the vacuum's FPI will
overwrite the page anyway, and whatever stupid thing the delete replay
did will become irrelevant - BEFORE we can begin processing any
queries.  On the other hand, if the delete record *isn't* followed by
an FPI, but just, by, say, a bunch more deletes, then it should all
Just Work (TM).  As long as the page header (excluding LSN and TLI,
which we're ignoring by stipulation) and item pointer list are intact,
we can redo those deletes and clean things up.  And if they're not
intact, then we must've done something that emits an FPI, and so any
temporary page corruption will get overwritten when we get to that
point in the WAL stream...

That is a bit ugly, though, because it means the XLOG replay of
FPI-free records would have to be prepared to just punt if they
encounter any sort of corruption, in the sure hope that any such
corruption must imply the presence of a future FPI that will be
replayed - since if there is no such future FPI, it should be
impossible for the page to be corrupted in the first place.  But that
might reduce our chances of being able to detect real corruption.

Heikki also came up with another idea that might be worth exploring:
at the point when we currently emit FPIs, emit an image of just the
part of the page that precedes pd_lower - the page header and item
IDs.  To make this work, we'd have to make a rule that redo isn't
allowed to rely for correctness on any bits following the pd_lower
boundary - it can write those bits, but it can't read them.  But most
of the XLOG_HEAP records follow that rule already - we look at the
item pointers to figure out where we're putting a new tuple or to
locate an existing tuple and unconditionally overwrite some of its
bits.  The obvious exception is XLOG_HEAP2_CLEAN, emitted by VACUUM,
which would probably need to just log the entire page.  Also, we'd
again need to apply records unconditionally, without reference to the
page LSN, until we reached the minimum recovery point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Error code for terminating connection due to conflict with recovery

On Mon, Jan 31, 2011 at 8:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Then again - in theory, there's no reason why we couldn't drop a
 database on the master when it's in use, kicking out everyone using it
 with this very same error code.  We don't happen to handle it that way
 right now, but...

 Yeah, that was in the back of my mind too.  DROP DATABASE foo FORCE,
 maybe?

I have to think some people would find that useful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [pgsql-general 2011-1-21:] Are there any projects interested in object functionality? (+ rule bases)

On Mon, Jan 31, 2011 at 5:09 PM, Nick Rudnick joerg.rudn...@t-online.de wrote:
 Interesting... I remember that some years ago, I fiddled around with
 functions, operators etc. to allow a method like syntax -- but I ever was
 worried this approach would have serious weaknesses -- are there any
 principal hindrances to having methods, if no, can this be implemented in a
 straightforward way?

It would help if you were a bit more specific.  Do you mean you want
to write something like foo.bar(baz) and have that mean call the bar
method of foo and pass it baz as an argument?

If so, that'd certainly be possible to implement for purposes of a
college course, if you're so inclined - after all it's free software -
but we'd probably not make such a change to core PG, because right now
that would mean call the function bar in schema baz and pass it foo as
an argument.  We try not to break people's code to when adding
nonstandard features.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [pgsql-general 2011-1-21:] Are there any projects interested in object functionality? (+ rule bases)

On Mon, Jan 31, 2011 at 5:40 PM, Nick Rudnick joerg.rudn...@t-online.de wrote:
 * In this regard it is of interest in how far there are principal efficiency
 problems with the support of (deeply nested) object like structure by the
 backend, or if the backend may be expected to do this job not terribly worse
 then more specialized OODMS -- of course, I would be interested in any
 discussions of these topics...

I simply don't know what a more-specialized OODBMS would do that is
similar to or different than what PostgreSQL does, so it's hard to
comment.  I don't immediately see why we'd be any less efficient, but
without knowing what algorithms are in use on the other side, it's a
bit hard to say.

 * The same question for doing rule bases on top of the PostgreSQL backend...

I'm not sure if you're referring to the type of rules added by the SQL
command CREATE RULE here, or some other kind of rule.  But the rules
added by CREATE RULE are generally not too useful.  Most serious
server programming is done using triggers.

 * For teaching at university courses, on the other hand, efficiency would be
 of lower interest, so there was an idea that there might be some (possibly
 toy example like) efforts to tune the frontend into this direction.

You're still being awfully vague about what you mean by this direction.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] wildcard search support for pg_trgm

=?UTF-8?B?SmFuIFVyYmHFhHNraQ==?= wulc...@wulczer.org writes:
 OK, now it works flawlessly as far as I can tell. Will mark it as Ready
 for Committer.

Applied with mostly-stylistic corrections, plus addition of
documentation and a minimal regression test.

I did *not* apply this bit:

 2) I found gist index not very useful with default SIGLENINT = 3. I've
 changed this value to 15 and I found gist index performs very good on
 dictionary. But on longer strings greater values of SIGLENINT may be
 required (probably even SIGLENINT  122 will give benefit in some cases in
 spite of TOAST).

AFAICT that would break on-disk compatibility of pg_trgm GIST indexes.
I don't believe we have adequate evidence to justify doing that, and
in any case it ought to be a separate patch rather than buried inside a
mostly unrelated feature patch.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [pgsql-general 2011-1-21:] Are there any projects interested in object functionality? (+ rule bases)