[HACKERS] timezone buglet?

2011-10-04 Thread daveg

Postgresql 9.0.4 has the timezone:

  America/Blanc-Sablon

However other sources seem to spell this with an underscore instead of dash:

  America/Blanc_Sablon

It appears that, besides 'America/Blanc_Sablon', other multi-word timezones
are spelled with an underscore. For example: 'Australia/Broken_Hill',
'Asia/Ho_Chi_Minh', 'America/Los_Angeles', and so on.

Two questions:

Is this correct as is, or is it wrong in 9.0.4?

And, should I have reported this somewhere else? Bugs?

Err, three questions:

I'm a little unclear on how the tz machinery works. Can I just update the
name column in pg_timezones to fix it for now?
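
For reference, one way to check which spelling a server actually accepts
(a sketch assuming the standard pg_timezone_names view; read-only, it
changes nothing):

  -- which spelling does this server know about?
  SELECT name, abbrev, utc_offset
    FROM pg_timezone_names
   WHERE name ILIKE 'America/Blanc%Sablon';

  -- AT TIME ZONE errors out with "time zone ... not recognized" for an
  -- unknown name, so both spellings can also be probed directly:
  SELECT now() AT TIME ZONE 'America/Blanc-Sablon';
  SELECT now() AT TIME ZONE 'America/Blanc_Sablon';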

Thanks

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-08 Thread daveg
On Wed, Sep 07, 2011 at 09:02:04PM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  On Wed, Sep 07, 2011 at 07:39:15PM -0400, Tom Lane wrote:
  BTW ... what were the last versions you were running on which you had
  *not* seen the problem?  (Just wondering about the possibility that we
  back-patched some fix that broke things.  It would be good to have
  a version range before trawling the commit logs.)
 
  The first version we saw it on was 8.4.7.
 
 Yeah, you said that.  I was wondering what you'd last run before 8.4.7.

Sorry, misunderstood. We were previously running 8.4.4, but have been on 8.4.7
since shortly after it was released. Prior to that we have had all the major
and most of the minor releases since 7.1.
 
-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Tue, Aug 23, 2011 at 12:15:23PM -0400, Robert Haas wrote:
 On Mon, Aug 22, 2011 at 3:31 AM, daveg da...@sonic.net wrote:
  So far I've got:
 
   - affects system tables
   - happens very soon after process startup
   - in 8.4.7 and 9.0.4
   - not likely to be hardware or OS related
   - happens in clusters for period of a few second to many minutes
 
  I'll work on printing the LOCK and LOCALLOCK when it happens, but it's
  hard to get downtime to pick up new builds. Any other ideas on getting to
  the bottom of this?
 
 I've been thinking this one over, and doing a little testing. I'm
 still stumped, but I have a few thoughts.  What that error message is
 really saying is that the LOCALLOCK bookkeeping doesn't match the
 PROCLOCK bookkeeping; it doesn't tell us which one is to blame.
... 
 My second thought is that perhaps a process is occasionally managing
 to exit without fully cleaning up the associated PROCLOCK entry.  At
 first glance, it appears that this would explain the observed
 symptoms.  A new backend gets the PGPROC belonging to the guy who
 didn't clean up after himself, hits the error, and disconnects,
 sticking himself right back on to the head of the SHM_QUEUE where the
 next connection will inherit the same PGPROC and hit the same problem.
  But it's not clear to me what could cause the system to get into this
 state in the first place, or how it would eventually right itself.
 
 It might be worth kludging up your system to add a test to
 InitProcess() to verify that all of the myProcLocks SHM_QUEUEs are
 either NULL or empty, along the lines of the attached patch (which
 assumes that assertions are enabled; otherwise, put in an elog() of
 some sort).  Actually, I wonder if we shouldn't move all the
 SHMQueueInit() calls for myProcLocks to InitProcGlobal() rather than
 doing it over again every time someone calls InitProcess().  Besides
 being a waste of cycles, it's probably less robust this way.   If
 there somehow are leftovers in one of those queues, the next
 successful call to LockReleaseAll() ought to clean up the mess, but of
 course there's no chance of that working if we've nuked the queue
 pointers.

I did this in the elog flavor as we don't build production images with asserts
enabled. It has been running on all hosts for a few days. Today it hit the
extra check in InitProcess().

00:02:32.782  8626  [unknown] [unknown]  LOG:  connection received: host=bk0 
port=42700
00:02:32.783  8627  [unknown] [unknown]  LOG:  connection received: host=op2 
port=45876
00:02:32.783  8627  d61 apps  FATAL:  Initprocess myProclocks[4] not empty: 
queue 0x2ae6b4b895f8 (prev 0x2ae6b4a2b558, next 0x2ae6b4a2b558) 
00:02:32.783  8626  d35 postgres  LOG:  connection authorized: user=postgres 
database=c35
00:02:32.783  21535  LOG:  server process (PID 8627) exited with exit code 1
00:02:32.783  21535  LOG:  terminating any other active server processes
00:02:32.783  8626  c35 postgres  WARNING:  terminating connection because of 
crash of another server process

The patch that produced this is attached. If you can think of anything I
can add to this to help, I'd be happy to do so. Also, can I clean this up
and continue somehow? Maybe clear the queue instead of having to restart?
Or is there a way to just pause this proc here, perhaps mark it not to be
used and exit, or have it sleep forever so I can debug it later?

Thanks

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.
--- postgresql-9.0.4/src/backend/storage/lmgr/proc.c	2011-04-14 20:15:53.0 -0700
+++ postgresql-9.0.4.dg/src/backend/storage/lmgr/proc.c	2011-08-23 17:30:03.505176019 -0700
@@ -323,7 +323,15 @@
 	MyProc->waitLock = NULL;
 	MyProc->waitProcLock = NULL;
 	for (i = 0; i < NUM_LOCK_PARTITIONS; i++)
+	{
+		SHM_QUEUE  *queue = &(MyProc->myProcLocks[i]);
+		if (! (!queue->prev || queue->prev == queue ||
+			   !queue->next || queue->next == queue)
+		   )
+			elog(FATAL, "Initprocess myProclocks[%d] not empty: queue %p (prev %p, next %p)",
+				 i, queue, queue->prev, queue->next);
 		SHMQueueInit(&(MyProc->myProcLocks[i]));
+	}
 	MyProc->recoveryConflictPending = false;
 
/*



Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 10:20:24AM -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  After spending some time staring at the code, I do have one idea as to
  what might be going on here.  When a backend is terminated,
  ShutdownPostgres() calls AbortOutOfAnyTransaction() and then
  LockReleaseAll(USER_LOCKMETHOD, true).  The second call releases all
  user locks, and the first one (or so we suppose) releases all normal
  locks as part of aborting the transaction.  But what if there's no
  transaction in progress?  In that case, AbortOutOfAnyTransaction()
  won't do anything - which is fine as far as transaction-level locks
  go, because we shouldn't be holding any of those anyway if there's no
  transaction in progress.  However, if we hold a session-level lock at
  that point, then we'd orphan it.  We don't make much use of session
  locks.  As far as I can see, they are used by (1) VACUUM, (2) CREATE
  INDEX CONCURRENTLY, (3) ALTER DATABASE .. SET TABLESPACE, and (4) on
  standby servers, redo of DROP DATABASE actions.  Any chance one of
  those died or was killed off around the time this happened?
 
 I don't believe this theory at all, because if that were the issue,
 we'd have heard about it long since.  The correct theory has to involve
 a very-little-used triggering condition.  At the moment I'm wondering
 about advisory (userspace) locks ... Dave, do your apps use any of those?

Yes, we make extensive use of advisory locks. That was my thought too when
Robert mentioned session level locks.
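
For what it's worth, a quick way to see which backends are holding advisory
locks at any moment (just a sketch against the stock pg_locks view):

  SELECT pid, locktype, classid, objid, objsubid, mode, granted
    FROM pg_locks
   WHERE locktype = 'advisory'
   ORDER BY pid;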

I'm happy to add any additional instrumentation, but my client would be
happier to actually run it if there were a way to recover from this without
an unplanned outage. Is there something I can do when the patch detects the
problem so we can continue without a restart? Is it safe to just reset the
proclock queue? They would mind leaking locks, and needing a later planned
restart to clear them up, much less than they mind unscheduled downtime.

Thanks

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 04:55:24PM -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  Tom's right to be skeptical of my theory, because it would require a
  CHECK_FOR_INTERRUPTS() outside of a transaction block in one of the
  pathways that use session-level locks, and I can't find one.
 
 More to the point: session-level locks are released on error.  The only
 way to get out of a transaction while still holding one is for the
 VACUUM-or-whichever-command code to deliberately commit and exit while
 still holding it.  An error exit path would release the lock.
 
  OTOH, I'm skeptical of the theory that this involves userlocks,
  because this whole thread started because of a complaint about lock
  0/1260/0 already being held.  That ain't no userlock.
 
 Yeah, and for that matter it seems to let VACUUM off the hook too.
 If we assume that the reported object ID is non-corrupt (and if it's
 always the same, that seems like the way to bet) then this is a lock
 on pg_authid.
 
 Hmmm ... could the pathway involve an error exit from client
 authentication?  We're still finding bugs in the 9.0 rewrite of
 auth-time database access.

It does not seem restricted to pg_authid:

2011-08-24 18:35:57.445 24987  c23  apps  ERROR:  lock AccessShareLock on 
object 16403/2615/0 

And I think I've seen it on other tables too.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 06:35:08PM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  It does not seem restricted to pg_authid:
  2011-08-24 18:35:57.445 24987  c23  apps  ERROR:  lock AccessShareLock on 
  object 16403/2615/0 
  And I think I've seen it on other tables too.
 
 Hmm.  2615 = pg_namespace, which most likely is the first catalog
 accessed by just about any SQL command that's going to access tables at
 all, so I suspect that this is mostly just a the first access failed
 thing and not something peculiar to pg_namespace.  But we still don't
 have a clue why the locks are not getting released by the previous
 owner of the PGPROC slot.  Have you trawled your logs to see if there's
 any sign of any distress at all, shortly before the problem starts to
 happen?

Will do, but it's a pretty big haystack. Sure wish I knew what the needle
looked like. ;-)

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 06:25:23PM -0400, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  I thought about an error exit from client authentication, and that's a
  somewhat appealing explanation, but I can't quite see why we wouldn't
  clean up there the same as anywhere else.  The whole mechanism feels a
  bit rickety to me - we don't actually release locks; we just abort the
  transaction and *assume* that will cause locks to get released.
 
 Well, transaction abort will call LockReleaseAll, which is carefully
 coded to clean up the proclock lists regardless of what is in the
 locallocks table, so I'm not sure why you find that any more rickety
 than anything else.  But maybe it'd be interesting for Dave to stick a
 LockReleaseAll call into ProcKill() and see if that makes things better.
 (Dave: test that before you put it in production, I'm not totally sure
 it's safe.)

Re safety, what is the worst case here? 

Also, this is very intermittent; we have seen it only in recent months,
on both 8.4.7 and 9.0.4, after years of no problems. Lately it feels like
we see it a few times a month. Possibly some new application behaviour
is provoking it, but I have no guesses as to what.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-09-07 Thread daveg
On Wed, Sep 07, 2011 at 07:39:15PM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  Also, this is very intermittant, we have seen it only in recent months
  on both 8.4.7 and 9.0.4 after years of no problems. Lately we see it what
  feels like a few times a month. Possibly some new application behaviour
  is provoking it, but I have no guesses as to what.
 
 BTW ... what were the last versions you were running on which you had
 *not* seen the problem?  (Just wondering about the possibility that we
 back-patched some fix that broke things.  It would be good to have
 a version range before trawling the commit logs.)

The first version we saw it on was 8.4.7.

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] [GENERAL] pg_upgrade problem

2011-09-05 Thread daveg

Sorry I missed your reply, catching up now.

On Wed, Aug 31, 2011 at 09:56:59PM -0400, Bruce Momjian wrote:
 daveg wrote:
  On Mon, Aug 29, 2011 at 07:49:24PM +0200, hubert depesz lubaczewski wrote:
   On Mon, Aug 29, 2011 at 06:54:41PM +0200, hubert depesz lubaczewski wrote:
   vacuumdb: vacuuming of database etsy_v2 failed: ERROR:  could not 
   access status of transaction 3429738606
   DETAIL:  Could not open file pg_clog/0CC6: No such file or directory.
   
   Interestingly.
   
   In old dir there is pg_clog directory with files:
   0AC0 .. 0DAF (including 0CC6, size 262144)
   but new pg_clog has only:
   0D2F .. 0DB0
   
   File content - nearly all files that exist in both places are the same, 
   with exception of 2 newest ones in new datadir:
   3c5122f3e80851735c19522065a2d12a  0DAF
   8651fc2b9fa3d27cfb5b496165cead68  0DB0
   
   0DB0 doesn't exist in old, and 0DAF has different md5sum: 
   7d48996c762d6a10f8eda88ae766c5dd
...
  I had this same thing happen this Saturday just past and my client had to
  restore the whole 2+ TB instance from the previous days pg_dumps.
...
  After running pg_upgrade apparently successfully and analyzeing all the

Update: reviewing the logs, I see some of the analyzes hit the "could not
access status of transaction" error too.

  tables we restarted the production workload and started getting errors:
  
  2011-08-27 04:18:34.015  12337  c06  postgres  ERROR:  could not access 
  status of transaction 2923961093
  2011-08-27 04:18:34.015  12337  c06  postgres  DETAIL:  Could not open file 
  pg_clog/0AE4: No such file or directory.
  2011-08-27 04:18:34.015  12337  c06  postgres  STATEMENT:  analyze 
  public.b_pxx;
  
  On examination the pg_clog directory contained on two files timestamped
  after the startup of the new cluster with 9.0.4. Other hosts that upgraded
  successfully had numerous files in pg_clog dating back a few days. So it
  appears that all the clog files went missing during the upgrade somehow.
  a
  This happened upgrading from 8.4.7 to 9.0.4, with a brief session in between
  at 8.4.8. We have upgraded several hosts to 9.0.4 successfully previously.
 
 I have posted this fix to the hackers email list, but I found it only
 affected old 8.3 servers, not old 8.4.X, so I am confused by your bug
 report.
 
 I have tested 8.4.X to 9.0.4 and found pg_upgrade preserves toast
 relfrozenxids properly in that case.
 
 Can you tell me what table is showing this error?  Does it happen during
 vacuum?  Can you run a vacuum verbose to see what it is throwing the
 error on?  Thanks.

This was upgrading from 8.4.8 to 9.0.4. I don't have the running cluster
anymore, but I do have tar.gz archives of it and could probably find
2TB free somewhere to restore it to if there is something useful to extract.

However, I don't think this was toast related. Most of our rows are short and 
have only int, float, and short text columns. These errors hit over 60
different tables mostly during the analyzes we ran immediately after the
upgrade. It also hit during select, insert and delete statements. We did not
run the db more than a few minutes as the damage was so extensive.

As far as I can tell pg_upgrade never copied any pg_clog files from the
old cluster to the new cluster. I wish I had detected that before running
the remove_old_cluster.sh script.
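
As a sanity check on which segment a given error corresponds to: with the
default 8 kB block size each pg_clog segment covers 1,048,576 xids (4 xids
per byte, 32 pages per segment) and is named by that quotient in hex, so the
failing xids reported above map exactly to the missing files. For example:

  SELECT to_hex((3429738606::bigint / 1048576)::int) AS seg1,  -- depesz's xid
         to_hex((2923961093::bigint / 1048576)::int) AS seg2;  -- ours
  --  seg1 | seg2
  -- ------+------
  --  cc6  | ae4        i.e. pg_clog/0CC6 and pg_clog/0AE4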

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] [GENERAL] pg_upgrade problem

2011-09-05 Thread daveg
On Mon, Sep 05, 2011 at 08:19:21PM -0400, Bruce Momjian wrote:
 daveg wrote:
   Can you tell me what table is showing this error?  Does it happen during
   vacuum?  Can you run a vacuum verbose to see what it is throwing the
   error on?  Thanks.
  
  This was upgrading from 8.4.8 to 9.0.4. I don't have the running cluster
  anymore, but I do have tar.gz archives of it and could probably find
  2TB free somewhere to restore it to if there is something useful to extract.
  
  However, I don't think this was toast related. Most of our rows are short 
  and 
  have only int, float, and short text columns. These errors hit over 60
  different tables mostly during the analyzes we ran immediately after the
  upgrade. It also hit during select, insert and delete statements. We did not
  run the db more than a few minutes as the damage was so extensive.
  
  As far as I can tell pg_upgrade never copied any pg_clog files from the
  old cluster to the new cluster. I wish I had detected that before running
  the remove_old_cluster.sh script.
 
 Wow, no clogs?  That would make the system very confused.  You can pull
 the clogs out of the old backup and move them over if the files don't
 already exist.

We don't have the old cluster after running delete_old_cluster.sh. We use
pg_dump for backup, so no clogs. We ended up restoring 20 odd dbs totalling
2.1TB from the previous day's pg_dumps.

If you review my original report, I mentioned that there were only 2 clog
files in the new cluster, both with ctime after the post-upgrade start of
postgresql. I did the upgrade for three hosts at the same time; the others
were fine. They have dozens of clogs dating back days before the upgrade.
The failing system had only the 2 recent clog files.
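
If it happens again, something along these lines (run as superuser; a sketch,
not part of our actual procedure) would show at a glance whether the surviving
clog segments reach back as far as the oldest datfrozenxid still requires:

  SELECT pg_ls_dir('pg_clog') AS clog_segment ORDER BY 1;

  SELECT datname, datfrozenxid, age(datfrozenxid) AS xid_age
    FROM pg_database
   ORDER BY age(datfrozenxid) DESC;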

-dg
 
 -- 
   Bruce Momjian  br...@momjian.ushttp://momjian.us
   EnterpriseDB http://enterprisedb.com
 
   + It's impossible for everything to be true. +
 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] [GENERAL] pg_upgrade problem

2011-08-29 Thread daveg
On Mon, Aug 29, 2011 at 07:49:24PM +0200, hubert depesz lubaczewski wrote:
 On Mon, Aug 29, 2011 at 06:54:41PM +0200, hubert depesz lubaczewski wrote:
 vacuumdb: vacuuming of database etsy_v2 failed: ERROR:  could not access 
 status of transaction 3429738606
 DETAIL:  Could not open file pg_clog/0CC6: No such file or directory.
 
 Interestingly.
 
 In old dir there is pg_clog directory with files:
 0AC0 .. 0DAF (including 0CC6, size 262144)
 but new pg_clog has only:
 0D2F .. 0DB0
 
 File content - nearly all files that exist in both places are the same, with 
 exception of 2 newest ones in new datadir:
 3c5122f3e80851735c19522065a2d12a  0DAF
 8651fc2b9fa3d27cfb5b496165cead68  0DB0
 
 0DB0 doesn't exist in old, and 0DAF has different md5sum: 
 7d48996c762d6a10f8eda88ae766c5dd
 
 one more thing. I did select count(*) from transactions and it worked.
 
 that's about it. I can probably copy over files from old datadir to new (in
 pg_clog/), and will be happy to do it, but I'll wait for your call - retry 
 with
 copies files might destroy some evidence.

I had this same thing happen this Saturday just past, and my client had to
restore the whole 2+ TB instance from the previous day's pg_dumps.
I had been thinking that perhaps I did something wrong in setting up or
running the upgrade, but had not found it yet. Now that I see Hubert has
the same problem it is starting to look like pg_upgrade can eat all your
data.

After running pg_upgrade apparently successfully and analyzing all the
tables, we restarted the production workload and started getting errors:

2011-08-27 04:18:34.015  12337  c06  postgres  ERROR:  could not access status 
of transaction 2923961093
2011-08-27 04:18:34.015  12337  c06  postgres  DETAIL:  Could not open file 
pg_clog/0AE4: No such file or directory.
2011-08-27 04:18:34.015  12337  c06  postgres  STATEMENT:  analyze public.b_pxx;

On examination, the pg_clog directory contained only two files, timestamped
after the startup of the new cluster with 9.0.4. Other hosts that upgraded
successfully had numerous files in pg_clog dating back a few days. So it
appears that all the clog files went missing during the upgrade somehow.

This happened upgrading from 8.4.7 to 9.0.4, with a brief session in between
at 8.4.8. We have upgraded several hosts to 9.0.4 successfully previously.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] OperationalError: FATAL: lock AccessShareLock on object 0/1260/0 is already

2011-08-23 Thread daveg
On Sun, Aug 14, 2011 at 12:16:39AM -0400, Robert Haas wrote:
 On Fri, Aug 12, 2011 at 7:19 PM, daveg da...@sonic.net wrote:
  This seems to be bug month for my client. Now there are seeing periods
  where all new connections fail immediately with the error:
 
    FATAL:  lock AccessShareLock on object 0/1260/0 is already held
...
  What can I do to help track this down?
 
 I've seen that error (though not that exact fact pattern) caused by
 bad RAM.  It's unclear to me what else could cause it.
 
 In terms of debugging, it seems like it might be sensible to start by
 injecting some debugging code that dumps out the contents of the LOCK
 and LOCALLOCK structures at the point the error occurs.

I've made up the attached patch to print this; please suggest any additions.
I'll deploy this on a couple of the production hosts that have had the
issue this evening, but there is no telling when or if it will strike next.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.
*** postgresql-9.0.4/src/backend/storage/lmgr/lock.c.orig	2011-08-22 13:46:29.386428943 -0700
--- postgresql-9.0.4/src/backend/storage/lmgr/lock.c	2011-08-23 00:12:43.456422966 -0700
***************
*** 469,474 ****
--- 469,534 ----
  	return LockAcquireExtended(locktag, lockmode, sessionLock, dontWait, true);
  }
  
+ int DG_FORCE_TRAP = 0;
+ 
+ inline static void
+ DG_LOCK_PRINT(const char *where, const LOCK *lock, LOCKMODE type)
+ {
+ 	elog(LOG,
+ 		 "%s: lock(%p) id(%u,%u,%u,%u,%u,%u) grantMask(%x) "
+ 		 "req(%d,%d,%d,%d,%d,%d,%d)=%d "
+ 		 "grant(%d,%d,%d,%d,%d,%d,%d)=%d wait(%d) type(%s)",
+ 		 where, lock,
+ 		 lock->tag.locktag_field1, lock->tag.locktag_field2,
+ 		 lock->tag.locktag_field3, lock->tag.locktag_field4,
+ 		 lock->tag.locktag_type, lock->tag.locktag_lockmethodid,
+ 		 lock->grantMask,
+ 		 lock->requested[1], lock->requested[2], lock->requested[3],
+ 		 lock->requested[4], lock->requested[5], lock->requested[6],
+ 		 lock->requested[7], lock->nRequested,
+ 		 lock->granted[1], lock->granted[2], lock->granted[3],
+ 		 lock->granted[4], lock->granted[5], lock->granted[6],
+ 		 lock->granted[7], lock->nGranted,
+ 		 lock->waitProcs.size,
+ 		 LockMethods[LOCK_LOCKMETHOD(*lock)]->lockModeNames[type]);
+ }
+ 
+ 
+ inline static void
+ DG_PROCLOCK_PRINT(const char *where, const PROCLOCK *proclockP)
+ {
+ 	elog(LOG,
+ 		 "%s: proclock(%p) lock(%p) method(%u) proc(%p) "
+ 		 "hold(%x) release(%x) "
+ 		 "links(p,n): lock=(%p,%p), proc=(%p,%p)",
+ 		 where, proclockP, proclockP->tag.myLock,
+ 		 PROCLOCK_LOCKMETHOD(*(proclockP)),
+ 		 proclockP->tag.myProc,
+ 		 (int) proclockP->holdMask, (int) proclockP->releaseMask,
+ 		 proclockP->lockLink.prev, proclockP->lockLink.next,
+ 		 proclockP->procLink.prev, proclockP->procLink.next
+ 		);
+ }
+ 
+ inline static void
+ DG_LOCALLOCK_PRINT(const char *where, const LOCALLOCK *localP)
+ {
+ 	elog(LOG,
+ 		 "%s: locallock(%p) id(%u,%u,%u,%u,%u,%u mode %x) "
+ 		 "lock(%p), proclock(%p) "
+ 		 "hashcode %x  nlocks %ld  numLockOwners %d  maxLockOwners %d ",
+ 		 where, localP,
+ 		 localP->tag.lock.locktag_field1, localP->tag.lock.locktag_field2,
+ 		 localP->tag.lock.locktag_field3, localP->tag.lock.locktag_field4,
+ 		 localP->tag.lock.locktag_type, localP->tag.lock.locktag_lockmethodid,
+ 		 localP->tag.mode,
+ 		 localP->lock, localP->proclock,
+ 		 localP->hashcode, localP->nLocks,
+ 		 localP->numLockOwners, localP->maxLockOwners
+ 		 /* localP->lockOwners[0].owner, localP->lockOwners.nlocks  %p %d */
+ 		);
+ }
+ 
  /*
   * LockAcquireExtended - allows us to specify additional options
   *
***************
*** 500,505 ****
--- 560,568 ----
  	LWLockId	partitionLock;
  	int			status;
  	bool		log_lock = false;
+ 	int			DG_found_local = -1;
+ 	int			DG_found_lock = -1;
+ 	int			DG_found_proc = -1;
  
  	if (lockmethodid <= 0 || lockmethodid >= lengthof(LockMethods))
  		elog(ERROR, "unrecognized lock method: %d", lockmethodid);
***************
*** 540,546 ****
  	locallock = (LOCALLOCK *) hash_search(LockMethodLocalHash,
  										  (void *) &localtag,
  										  HASH_ENTER, &found);
! 
  	/*
  	 * if it's a new locallock object, initialize it
  	 */
--- 603,609 ----
  	locallock = (LOCALLOCK

Re: [HACKERS] FATAL: lock AccessShareLock on object 0/1260/0 is already held

2011-08-22 Thread daveg
On Fri, Aug 12, 2011 at 04:19:37PM -0700, daveg wrote:
 
 This seems to be bug month for my client. Now there are seeing periods
 where all new connections fail immediately with the error:
 
FATAL:  lock AccessShareLock on object 0/1260/0 is already held 
 
 This happens on postgresql 8.4.7 on a large (512GB, 32 core) system that has
 been up for months. It started happening sporadicly a few days ago. It will
 do this for a period of several minutes to an hour and then go back to
 normal for hours or days.
 
 One complete failing session out of several hundred around that time:
 -
 2011-08-09 00:01:04.446  8823  [unknown]  [unknown]  LOG:  connection 
 received: host=op05.xxx port=34067
 2011-08-09 00:01:04.446  8823  c77  apps  LOG:  connection authorized: 
 user=apps database=c77
 2011-08-09 00:01:04.449  8823  c77  apps  FATAL:  lock AccessShareLock on 
 object 0/1260/0 is already held
 --

This is to add additional information to the original report:

For a while this was happening on many different databases in one postgresql
8.4.7 instance on a single large host ('U2' 512GB 64cpu) running RH 5.
That has been quiet for several days, and the newest batches of errors have
happened only on a single database 'c23', in a postgresql 9.0.4 instance
on a smaller host ('A', 64GB 8cpu) running SuSE 10.2.

No memory errors or other misbehaviour have been seen on either of these
hosts in recent months.

The original error was:

  lock AccessShareLock on object 0/1260/0 is already held

which is for pg_authid. The recent errors are:

  lock AccessShareLock on object 16403/2615/0 is already held

which is for pg_namespace in database c23.
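
The middle number in the "object d/r/0" triple is just the relation's OID, so
either value can be decoded directly; a quick check:

  SELECT oid, relname, relisshared FROM pg_class WHERE oid IN (1260, 2615);
  --  oid  |   relname    | relisshared
  -- ------+--------------+-------------
  --  1260 | pg_authid    | t
  --  2615 | pg_namespace | f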

All of the original and most of the recent batches of errors came immediately
after connecting to a database and being authorized, that is, before any
statements were attempted. However, some of the most recent are on the first
query statement: after logging in and doing things like "set transaction ...",
the first select would hit this error.

It seems to come in clusters, sometimes, which suggests something shared
by multiple processes. For example, here are the times for the errors
on c23 in the afternoon of August 20:

20 07:14:12.722

20 16:05:07.798
20 16:05:07.808

20 16:05:10.519

20 16:07:07.726
20 16:07:08.722
20 16:07:09.734
20 16:07:10.656

20 16:07:25.436

20 16:22:23.983
20 16:22:24.014
20 16:22:24.335
20 16:22:24.409
20 16:22:24.477
20 16:22:24.499
20 16:22:24.516

20 16:30:58.210

20 16:31:15.261
20 16:31:15.296
20 16:31:15.324
20 16:31:15.348

20 18:06:16.515

20 18:06:49.198
20 18:06:49.204

20 18:06:51.444

20 21:03:05.940

So far I've got:

  - affects system tables
  - happens very soon after process startup
  - in 8.4.7 and 9.0.4
  - not likely to be hardware or OS related
  - happens in clusters lasting from a few seconds to many minutes

I'll work on printing the LOCK and LOCALLOCK when it happens, but it's
hard to get downtime to pick up new builds. Any other ideas on getting to
the bottom of this?

Thanks

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-15 Thread daveg
[adding back hackers so the thread shows the resolution]

On Sun, Aug 14, 2011 at 07:02:55PM -0400, Tom Lane wrote:
 Sounds good.  Based on my own testing so far, I think that patch will
 probably make things measurably better for you, though it won't resolve
 every corner case.

The most recent catalog vacuums did vacuum full pg_class in 34 databases on
that instance with no new failures. So it looks like the patch fixes it. This
is not conclusive, but it looks very good so far. I'll send an update if I
see any new errors during the week.

Thanks for your help on this. It looks like it has sent you on a merry
search through all the catcache-related program activities. I'm assuming
this patch or some improvement on it will show up in a point release.
Meanwhile, if this works as is for a couple more days, we will resume
upgrading the rest of the hosts to 9.0 using this patch.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] OperationalError: FATAL: lock AccessShareLock on object 0/1260/0 is already

2011-08-13 Thread daveg
On Sun, Aug 14, 2011 at 12:16:39AM -0400, Robert Haas wrote:
 On Fri, Aug 12, 2011 at 7:19 PM, daveg da...@sonic.net wrote:
  This seems to be bug month for my client. Now there are seeing periods
  where all new connections fail immediately with the error:
 
    FATAL:  lock AccessShareLock on object 0/1260/0 is already held
 
  This happens on postgresql 8.4.7 on a large (512GB, 32 core) system that has
  been up for months. It started happening sporadicly a few days ago. It will
  do this for a period of several minutes to an hour and then go back to
  normal for hours or days.
 
  One complete failing session out of several hundred around that time:
  -
  2011-08-09 00:01:04.446  8823  [unknown]  [unknown]  LOG:  connection 
  received: host=op05.xxx port=34067
  2011-08-09 00:01:04.446  8823  c77  apps  LOG:  connection authorized: 
  user=apps database=c77
  2011-08-09 00:01:04.449  8823  c77  apps  FATAL:  lock AccessShareLock on 
  object 0/1260/0 is already held
  --
 
  What can I do to help track this down?
 
 I've seen that error (though not that exact fact pattern) caused by
 bad RAM.  It's unclear to me what else could cause it.

I'll look into that. I think it is only happening on one host, so that might
make sense. On the other hand, these are pretty fancy hosts, all ECC and such,
so I'd hope they would have squeaked about bad RAM.
 
 In terms of debugging, it seems like it might be sensible to start by
 injecting some debugging code that dumps out the contents of the LOCK
 and LOCALLOCK structures at the point the error occurs.

Hmm, we will update to 9.0 next week on these hosts, so I'll try to hold off
on this part at least until then.

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] VACUUM FULL versus system catalog cache invalidation

2011-08-12 Thread daveg
On Fri, Aug 12, 2011 at 09:26:02PM +0100, Simon Riggs wrote:
 With HOT, there is very little need to perform a VACUUM FULL on any
 shared catalog table. Look at the indexes...
 
 I would a suggest that VACUUM FULL perform only a normal VACUUM on
 shared catalog tables, then perform an actual VACUUM FULL only in dire
 need (some simple heuristic in size and density). This avoids doing a
 VACUUM FULL unless it is actually necessary to do so. That has the
 added advantage of not locking out essential tables, which is always a
 concern.
 
 In the unlikely event we do actually have to VACUUM FULL a shared
 catalog table, nuke any cache entry for the whole shared catalog. That
 way we absolutely and positively will never get any more bugs in this
 area, ever again. Sounds harsh, but these events are only actually
 needed very, very rarely and hygiene is more important than a few
 minor points of performance.

This is a very optimistic view. My client makes heavy use of temp tables,
and HOT and autovacuum are not sufficient to keep catalog bloat under control.
We run a daily script that calculates the density of the catalogs and only
VACUUM FULLs those that are severely bloated. Here is a result from a
recent bloat check on one db: 'packed' is the number of pages needed for
the rows if they were packed, 'bloat' is the multiple of pages in use over
the number really needed.

    relation     | tuples | pages | packed | bloat
-----------------+--------+-------+--------+-------
 pg_class; --    |   4292 | 10619 |    114 |  93.2
 pg_depend; --   |  25666 |  7665 |    217 |  35.4
 pg_attrdef; --  |   6585 |  7595 |    236 |  32.2
 pg_type; --     |   4570 |  8177 |    416 |  19.6
 pg_shdepend; -- |  52040 |  7968 |    438 |  18.2
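
The actual script is larger, but the core of the density check is roughly
along these lines (a sketch only: the 28-byte per-row overhead and the
pg_stats-based row width are approximations, not the exact figures the
script uses):

  SELECT c.relname,
         c.reltuples::bigint AS tuples,
         c.relpages          AS pages,
         ceil(c.reltuples * (w.row_width + 28)
              / (current_setting('block_size')::int - 24))::int AS packed,
         round((c.relpages
                / greatest(ceil(c.reltuples * (w.row_width + 28)
                                / (current_setting('block_size')::int - 24)), 1))::numeric,
               1) AS bloat
    FROM pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace AND n.nspname = 'pg_catalog'
    JOIN (SELECT tablename, sum(avg_width) AS row_width
            FROM pg_stats
           WHERE schemaname = 'pg_catalog'
           GROUP BY tablename) w ON w.tablename = c.relname
   WHERE c.relkind = 'r'
   ORDER BY bloat DESC
   LIMIT 10;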

-dg
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] our buffer replacement strategy is kind of lame

2011-08-12 Thread daveg
On Fri, Aug 12, 2011 at 01:28:49PM +0100, Simon Riggs wrote:
 I think there are reasonable arguments to make
 
 * prefer_cache = off (default) | on a table level storage parameter,
 =on will disable the use of BufferAccessStrategy
 
 * make cache_spoil_threshold a parameter, with default 0.25
 
 Considering the world of very large RAMs in which we now live, some
 control of the above makes sense.

As long as we are discussing cache settings for tables, I have a client
who would like to be able to lock specific tables and indexes into cache
as they have strict response time requirements for particular queries.
At the moment they are running postgres with a tablespace on ramfs and
taking frequent backups, but this is not optimal.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



[HACKERS] OperationalError: FATAL: lock AccessShareLock on object 0/1260/0 is already

2011-08-12 Thread daveg

This seems to be bug month for my client. Now they are seeing periods
where all new connections fail immediately with the error:

   FATAL:  lock AccessShareLock on object 0/1260/0 is already held 

This happens on postgresql 8.4.7 on a large (512GB, 32 core) system that has
been up for months. It started happening sporadically a few days ago. It will
do this for a period of several minutes to an hour and then go back to
normal for hours or days.

One complete failing session out of several hundred around that time:
-
2011-08-09 00:01:04.446  8823  [unknown]  [unknown]  LOG:  connection received: 
host=op05.xxx port=34067
2011-08-09 00:01:04.446  8823  c77  apps  LOG:  connection authorized: 
user=apps database=c77
2011-08-09 00:01:04.449  8823  c77  apps  FATAL:  lock AccessShareLock on 
object 0/1260/0 is already held
--
 
What can I do to help track this down?

-dg


--
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-05 Thread daveg
On Fri, Aug 05, 2011 at 12:10:31PM -0400, Tom Lane wrote:
 I wrote:
  Ahh ... you know what, never mind about stack traces, let's just see if
  the attached patch doesn't fix it.
 
 On reflection, that patch would only fix the issue for pg_class, and
 that's not the only catalog that gets consulted during relcache reloads.
 I think we'd better do it as attached, instead.
 
   regards, tom lane

Should this be applied in addition to the earlier patch, or to replace it?

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-04 Thread daveg
On Wed, Aug 03, 2011 at 11:18:20AM -0400, Tom Lane wrote:
 Evidently not, if it's not logging anything, but now the question is
 why.  One possibility is that for some reason RelationGetNumberOfBlocks
 is persistently lying about the file size.  (We've seen kernel bugs
 before that resulted in transiently wrong values, so this isn't totally
 beyond the realm of possibility.)  Please try the attached patch, which
 extends the previous one to add a summary line including the number of
 blocks physically scanned by the seqscan.

Ok, I have results from the latest patch and have attached a redacted
server log with the select relfilenode output added inline. This is the
shorter of the logs and shows the sequence pretty clearly. I have additional
logs if wanted.

Summary: the failing process reads 0 rows from 0 blocks from the OLD
relfilenode. 
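
Note for anyone reading along: pg_class.relfilenode is 0 for the mapped
catalogs in 9.0, which is why the query in the log uses pg_relation_filenode(oid)
to get the real file name. For example (the filenode values here are taken
from the first snapshot in the attached log):

  SELECT relname, relfilenode, pg_relation_filenode(oid) AS filenode
    FROM pg_class
   WHERE relname IN ('pg_class', 'pg_class_oid_index');
  --       relname       | relfilenode |  filenode
  -- -------------------+-------------+------------
  --  pg_class           |           0 | 1245833951
  --  pg_class_oid_index |           0 | 1245833955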

-dg


-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.
SS.msec  pid   db   user  statement
--  -  ---    ---
03.804  29706  c27  postgres  connection authorized: user=postgres database=c27
...
03.824  29706  c27  postgres  set statement_timeout=60;
03.824  29706  c27  postgres  0.364 ms
03.825  29706  c27  postgres  select current_database() as db, relname, oid, 
pg_relation_filenode(oid) as filenode, clock_timestamp()::time as ts from 
pg_class where relname like 'pg_class%';
03.829  29706  c27  postgres  4.173 ms
 db  |  relname   | oid  |  filenode  |   ts
-++--++-
 c27 | pg_class   | 1259 | 1245833951 | 21:31:03.828293
 c27 | pg_class_oid_index | 2662 | 1245833955 | 21:31:03.828791
 c27 | pg_class_relname_nsp_index | 2663 | 1259100530 | 21:31:03.828807

03.829  29706  c27  postgres  vacuum full pg_catalog.pg_class;
03.829  29706  c27  postgres  LOCATION: exec_simple_query, postgres.c:900
...
03.845  29707  c27  postgres  disconnection: session time: 0:00:00.041 
user=postgres database=c27 host=bk-0
...
08.856  29706  c27  postgres  process 29706 still waiting for RowExclusiveLock 
on relation 1214 of database 0 after 5000.483 ms
08.856  29706  c27  postgres  LOCATION: ProcSleep, proc.c:1059
08.856  29706  c27  postgres  STATEMENT: vacuum full pg_catalog.pg_class;
09.383  29711  LOG: 0: process 29711 still waiting for AccessShareLock on 
relation 1259 of database 16408 after 5000.331 ms
09.383  29711  LOCATION: ProcSleep, proc.c:1059
...
11.559  28857  c27  apps  SELECT
...
16.560  28857  c27  apps  process 28857 still waiting for AccessShareLock 
on relation 1259 of database 16408 after 5001.209 ms
16.560  28857  c27  apps  LOCATION: ProcSleep, proc.c:1059
16.560  28857  c27  apps  STATEMENT: SELECT
...
19.763  29706  c27  postgres  process 29706 acquired RowExclusiveLock on 
relation 1214 of database 0 after 15907.284 ms
19.763  29706  c27  postgres  LOCATION: ProcSleep, proc.c:1063
19.763  29706  c27  postgres  STATEMENT: vacuum full pg_catalog.pg_class;
...
25.735  29711  LOG: 0: process 29711 acquired AccessShareLock on relation 
1259 of database 16408 after 21352.393 ms
25.735  29711  LOCATION: ProcSleep, proc.c:1063
25.735  28857  c27  apps  process 28857 acquired AccessShareLock on 
relation 1259 of database 16408 after 14176.040 ms
25.735  28857  c27  apps  LOCATION: ProcSleep, proc.c:1063
25.735  28857  c27  apps  STATEMENT: SELECT
25.736  28857  c27  apps  ScanPgRelationDetailed: found 0 tuples with OID 
2662 in 0 blocks of filenode 1245833951
25.736  28857  c27  apps  LOCATION: ScanPgRelationDetailed, relcache.c:372
25.736  28857  c27  apps  STATEMENT: SELECT
25.736  28857  c27  apps  ERROR: XX000: could not find pg_class tuple for 
index 2662
25.736  28857  c27  apps  LOCATION: RelationReloadIndexInfo, relcache.c:1816
25.736  28857  c27  apps  STATEMENT: SELECT
25.736  29706  c27  postgres  21906.865 ms
25.737  29706  c27  postgres  select current_database() as db, relname, oid, 
pg_relation_filenode(oid) as filenode, clock_timestamp()::time as ts from 
pg_class where relname like 'pg_class%';
25.767  29706  c27  postgres  30.902 ms
 db  |  relname   | oid  |  filenode  |   ts
-++--++-
 c27 | pg_class   | 1259 | 1279787837 | 21:31:25.76726
 c27 | pg_class_oid_index | 2662 | 1279788022 | 21:31:25.767764
 c27 | pg_class_relname_nsp_index | 2663 | 1279788023 | 21:31:25.767782

25.768  29706  c27  postgres  vacuum full pg_catalog.pg_attribute;
25.775  28857  c27  apps  disconnection: session time: 0:07:07.758 
user=apps database=c27 host=op-01
25.775  28857  c27  apps  LOCATION: log_disconnections, postgres.c:4339
...
30.914  29711  LOG: 0: process 29711 still waiting for AccessShareLock on 
relation 1249 of database 16408 

Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-04 Thread daveg
On Thu, Aug 04, 2011 at 12:28:31PM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  Summary: the failing process reads 0 rows from 0 blocks from the OLD
  relfilenode. 
 
 Hmm.  This seems to mean that we're somehow missing a relation mapping
 invalidation message, or perhaps not processing it soon enough during
 some complex set of invalidations.  I did some testing with that in mind
 but couldn't reproduce the failure.  It'd be awfully nice to get a look
 at the call stack when this happens for you ... what OS are you running?

cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 5.5 (Tikanga)
Linux version 2.6.18-194.el5

I can use gdb as well if we can get a core or stop the correct process.
Perhaps a long sleep when it hits this?
Or perhaps we could log invalidate processing for pg_class?

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-04 Thread daveg
On Thu, Aug 04, 2011 at 12:28:31PM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  Summary: the failing process reads 0 rows from 0 blocks from the OLD
  relfilenode. 
 
 Hmm.  This seems to mean that we're somehow missing a relation mapping
 invalidation message, or perhaps not processing it soon enough during
 some complex set of invalidations.  I did some testing with that in mind
 but couldn't reproduce the failure.  It'd be awfully nice to get a look
 at the call stack when this happens for you ... what OS are you running?

To recap, a few observations:

When it happens, the victim has recently been waiting on a lock for
several seconds.

We create a lot of temp tables, hundreds of thousands a day.

There are catalog vacuum fulls and reindexes running on 30 odd other databases
at the same time. The script estimates the amount of bloat on each table and
index and chooses either reindex on specific indexes or vacuum full as needed.

This is a 32 core (64 with hyperthreading) 512GB host with several hundred
connections.

We are seeing 'cannot read' and 'cannot open' errors too, which would be
consistent with trying to use a vanished file.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-04 Thread daveg
On Thu, Aug 04, 2011 at 04:16:08PM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  We are seeing cannot read' and 'cannot open' errors too that would be
  consistant with trying to use a vanished file.
 
 Yeah, these all seem consistent with the idea that the failing backend
 somehow missed an update for the relation mapping file.  You would get
 the could not find pg_class tuple syndrome if the process was holding
 an open file descriptor for the now-deleted file, and otherwise cannot
 open/cannot read type errors.  And unless it later received another
 sinval message for the relation mapping file, the errors would persist.
 
 If this theory is correct then all of the file-related errors ought to
 match up to recently-vacuumed mapped catalogs or indexes (those are the
 ones with relfilenode = 0 in pg_class).  Do you want to expand your
 logging of the VACUUM FULL actions and see if you can confirm that idea?

At your service, what would you like to see?
 
 Since the machine is running RHEL, I think we can use glibc's
 backtrace() function to get simple stack traces without too much effort.
 I'll write and test a patch and send it along in a bit.

Great.

Any point in trying to capture SI events somehow?
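
For matching the file errors against the relfilenode = 0 criterion above, a
sketch that lists the mapped relations in the affected database:

  SELECT oid, relname, relisshared
    FROM pg_class
   WHERE relfilenode = 0
     AND relkind IN ('r', 'i')
   ORDER BY relname;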

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-03 Thread daveg
On Mon, Aug 01, 2011 at 01:23:49PM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  On Sun, Jul 31, 2011 at 11:44:39AM -0400, Tom Lane wrote:
  I think we need to start adding some instrumentation so we can get a
  better handle on what's going on in your database.  If I were to send
  you a source-code patch for the server that adds some more logging
  printout when this happens, would you be willing/able to run a patched
  build on your machine?
 
  Yes we can run an instrumented server so long as the instrumentation does
  not interfere with normal operation. However, scheduling downtime to switch
  binaries is difficult, and generally needs to be happen on a weekend, but
  sometimes can be expedited. I'll look into that.
 
 OK, attached is a patch against 9.0 branch that will re-scan pg_class
 after a failure of this sort occurs, and log what it sees in the tuple
 header fields for each tuple for the target index.  This should give us
 some useful information.  It might be worthwhile for you to also log the
 results of
 
 select relname,pg_relation_filenode(oid) from pg_class
 where relname like 'pg_class%';
 
 in your script that does VACUUM FULL, just before and after each time it
 vacuums pg_class.  That will help in interpreting the relfilenodes in
 the log output.

We have installed the patch and have encountered the error as usual.
However there is no additional output from the patch. I'm speculating
that the pg_class scan in ScanPgRelationDetailed() fails to return
tuples somehow.


I have also been trying to trace it further by reading the code, but have not
got any solid hypothesis yet. In the absence of any debugging output I've
been trying to deduce the call tree leading to the original failure. So far
it looks like this:

RelationReloadIndexInfo(Relation)                  // Relation is 2662 and !rd_isvalid
    pg_class_tuple = ScanPgRelation(2662, indexOK=false)    // returns NULL
        pg_class_desc = heap_open(1259, ACC_SHARE)
            r = relation_open(1259, ACC_SHARE)     // locks oid, ensures RelationIsValid(r)
                r = RelationIdGetRelation(1259)
                    r = RelationIdCacheLookup(1259)    // assume success
                    if !rd_isvalid:
                        RelationClearRelation(r, true)
                            RelationInitPhysicalAddr(r)    // r is pg_class relcache
-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] Further news on Clang - spurious warnings

2011-08-03 Thread daveg
On Wed, Aug 03, 2011 at 04:03:39PM -0400, Tom Lane wrote:
 The C standard specifies that signed-to-unsigned conversions must work
 like that; and even if the standard didn't, it would surely work like
 that on any machine with two's-complement representation, which is to
 say every computer built in the last forty years or so.  So I don't find
 it a questionable assumption.

I had the pleasure of working on a Univac 1108 in about 1978 and
it was very definitely ones complement. I'm somewhat amazed to find that
the Univac 1100 series architecture and instruction set lives on to this
day. The last pure 1100 seems to be the Unisys 2200/3800 released in 1997.
Even later U1100/Exec 8 descendants appear to still exist and are still
actively supported:

  http://en.wikipedia.org/wiki/Unisys_OS_2200_operating_system

So there are still ones complement machines out there. However, I suggest we
pretend otherwise and continue to ignore them.

-dg

--
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-31 Thread daveg
On Thu, Jul 28, 2011 at 11:31:31PM -0700, daveg wrote:
 On Thu, Jul 28, 2011 at 07:45:01PM -0400, Robert Haas wrote:
  REINDEX.  My guess is that this is happening either right around the
  time the VACUUM FULL commits or right around the time the REINDEX
  commits.  It'd be helpful to know which, if you can figure it out.
 
 I'll update my vacuum script to skip reindexes after vacuum full for 9.0
 servers and see if that makes the problem go away. Thanks for reminding
 me that they are not needed. However, I suspect it is the vacuum, not the
 reindex causing the problem. I'll update when I know.

Here is the update: the problem happens with vacuum full alone; no reindex
is needed to trigger it. I updated the script to avoid reindexing after
vacuum. Over the past two days there are still many occurrences of this
error coincident with the vacuum.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-31 Thread daveg
On Sun, Jul 31, 2011 at 11:44:39AM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  Here is the update: the problem happens with vacuum full alone, no reindex
  is needed to trigger it. I updated the script to avoid reindexing after
  vacuum. Over the past two days there are still many ocurrances of this
  error coincident with the vacuum.
 
 Well, that jives with the assumption that the one case we saw in
 the buildfarm was the same thing, because the regression tests were
 certainly only doing a VACUUM FULL and not a REINDEX of pg_class.
 But it doesn't get us much closer to understanding what's happening.
 In particular, it seems to knock out most ideas associated with race
 conditions, because the VAC FULL should hold exclusive lock on pg_class
 until it's completely done (including index rebuilds).
 
 I think we need to start adding some instrumentation so we can get a
 better handle on what's going on in your database.  If I were to send
 you a source-code patch for the server that adds some more logging
 printout when this happens, would you be willing/able to run a patched
 build on your machine?

Yes, we can run an instrumented server so long as the instrumentation does
not interfere with normal operation. However, scheduling downtime to switch
binaries is difficult and generally needs to happen on a weekend, though it
can sometimes be expedited. I'll look into that.

 (BTW, just to be perfectly clear ... the could not find pg_class tuple
 errors always mention index 2662, right, never any other number?)

Yes, only index 2662, never any other.

I'm attaching a somewhat redacted log for two different databases on the same
instance around the time of vacuum full of pg_class in each database.
My observations so far are:

 - the error occurs at commit of vacuum full of pg_class
 - in these cases the error hits autovacuum after it waited for a lock on pg_class
 - in these two cases there was a new process startup while the vacuum was
   running. Don't know if this is relevant.
 - while these hit autovacuum, the error does hit other processes (just not in
   these sessions).  Unknown if autovacuum is a required component.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.
NOTES:
-- most statements for db c01 in one catalog vacuum session
-- database c01 is oid 16404
-- applications run as user 'app'
-- vacuums run as user postgres
-- somewhat redacted

MM:SS.ms   piduser  log text
-  -    -

03:00.350  2716   c01  postgres  vacuum full pg_catalog.pg_class;
03:00.507  2866  LOG: process 2866 acquired ShareUpdateExclusiveLock on 
relation 2604 of database 16416 after 10143.750 ms
03:01.222  21685  c01  app   11100.670 ms
03:01.223  21685  c01  app  SELECT DISTINCT
03:01.224  21685  c01  app   1.524 ms
03:02.031  3001   c01  app  connection authorized: user=app database=c01
03:04.952  2787  LOG: process 2787 still waiting for AccessShareLock on 
relation 1259 of database 16412 after 5001.035 ms
03:05.065  2787  LOG: process 2787 acquired AccessShareLock on relation 1259 of 
database 16412 after 5114.253 ms
03:05.564  2977  LOG: process 2977 still waiting for AccessShareLock on 
relation 1259 of database 16404 after 5000.970 ms
03:05.640  2731  LOG: process 2731 still waiting for RowExclusiveLock on 
relation 1259 of database 16404 after 5000.186 ms
03:06.045  2977  LOG: process 2977 acquired AccessShareLock on relation 1259 of 
database 16404 after 5482.389 ms
03:06.045  2731  LOG: process 2731 acquired RowExclusiveLock on relation 1259 
of database 16404 after 5405.652 ms
03:06.046  2731  ERROR: could not find pg_class tuple for index 2662
03:06.046  2731  CONTEXT: automatic vacuum of table c01.pg_catalog.pg_index
03:06.046  2731  ERROR: could not find pg_class tuple for index 2662
03:06.046  2716   c01  postgres   5696.537 ms
03:06.056  3001   c01  app  SET SESSION TIME ZONE 'UTC'; ...
03:06.057  2716   c01  postgres  vacuum full pg_catalog.pg_rewrite;
03:06.066  3001   c01  app   10.459 ms
03:06.091  3001   c01  app  disconnection: session time: 0:00:04.085 user=app 
database=c01 host=xxxl01
03:08.908  3006  LOG: process 3006 still waiting for AccessShareLock on 
relation 2659 of database 16407 after 5000.778 ms
03:11.777  21685  c01  app  SELECT
03:11.779  21685  c01  app   2.296 ms
03:11.779  21685  c01  app  SELECT
03:11.780  21685  c01  app   0.328 ms
03:11.798  21685  c01  app  SELECT
03:11.799  21685  c01  app   0.348 ms
03:11.800  21685  c01  app  SELECT
03:11.800  21685  c01  app   0.205 ms
03:11.804  21685  c01  app  SELECT
03:11.805  21685  c01  app   0.589 ms
03:11.806  21685  c01  app  SELECT DISTINCT
03:11.809  21685  c01  app   3.552 ms
03:11.810  21685  c01  app  SELECT
03:12.554  21685  c01  app   744.233 ms
03:12.555  21685  c01  app  SELECT DISTINCT
03:12.556  21685  c01  app   1.136 ms
03:13.153  2716   c01  postgres

Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-29 Thread daveg
On Thu, Jul 28, 2011 at 07:45:01PM -0400, Robert Haas wrote:
 On Thu, Jul 28, 2011 at 5:46 PM, daveg da...@sonic.net wrote:
  On Thu, Jul 28, 2011 at 09:46:41AM -0400, Robert Haas wrote:
  On Wed, Jul 27, 2011 at 8:28 PM, daveg da...@sonic.net wrote:
   My client has been seeing regular instances of the following sort of 
   problem:
  On what version of PostgreSQL?
 
  9.0.4.
 
  I previously said:
   This occurs on postgresql 9.0.4. on 32 core 512GB Dell boxes. We have
   identical systems still running 8.4.8 that do not have this issue, so I'm
   assuming it is related to the vacuum full work done for 9.0. Oddly, we 
   don't
   see this on the smaller hosts (8 core, 64GB, slower cpus) running 9.0.4,
   so it may be timing related.
 
 Ah, OK, sorry.  Well, in 9.0, VACUUM FULL is basically CLUSTER, which
 means that a REINDEX is happening as part of the same operation.  In
 9.0, there's no point in doing VACUUM FULL immediately followed by
 REINDEX.  My guess is that this is happening either right around the
 time the VACUUM FULL commits or right around the time the REINDEX
 commits.  It'd be helpful to know which, if you can figure it out.

I'll update my vacuum script to skip reindexes after vacuum full for 9.0
servers and see if that makes the problem go away. Thanks for reminding
me that they are not needed. However, I suspect it is the vacuum, not the
reindex causing the problem. I'll update when I know.

 If there's not a hardware problem causing those read errors, maybe a
 backend is somehow ending up with a stale or invalid relcache entry.
 I'm not sure exactly how that could be happening, though...

It does not appear to be a hardware problem. I also suspect it is a stale
relcache.

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-29 Thread daveg
On Fri, Jul 29, 2011 at 09:55:46AM -0400, Tom Lane wrote:
 The thing that was bizarre about the one instance in the buildfarm was
 that the error was persistent, ie, once a session had failed all its
 subsequent attempts to access pg_class failed too.  I gather from Dave's
 description that it's working that way for him too.  I can think of ways
 that there might be a transient race condition against a REINDEX, but
 it's very unclear why the failure would persist across multiple
 attempts.  The best idea I can come up with is that the session has
 somehow cached a wrong commit status for the reindexing transaction,
 causing it to believe that both old and new copies of the index's
 pg_class row are dead ... but how could that happen?  The underlying

It is definitely persistent. Once triggered, the error occurs for any new
statement until the session exits.

 state in the catalog is not wrong, because no concurrent sessions are
 upset (at least not in the buildfarm case ... Dave, do you see more than
 one session doing this at a time?).

It looks like it happens to multiple sessions so far as one can tell from
the timestamps of the errors:

 timestamp    sessionid     error
 ------------ ------------- -----
 03:05:37.434 4e26a861.4a6d could not find pg_class tuple for index 2662
 03:05:37.434 4e26a861.4a6f could not find pg_class tuple for index 2662

 03:06:12.041 4e26a731.438e could not find pg_class tuple for index 2662

 03:06:12.042 4e21b6a3.629b could not find pg_class tuple for index 2662
 03:06:12.042 4e26a723.42ec could not find pg_class tuple for index 2662 at 
character 13


-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-07-28 Thread daveg
On Thu, Jul 28, 2011 at 09:46:41AM -0400, Robert Haas wrote:
 On Wed, Jul 27, 2011 at 8:28 PM, daveg da...@sonic.net wrote:
  My client has been seeing regular instances of the following sort of 
  problem:
 On what version of PostgreSQL?

9.0.4.

I previously said:
  This occurs on postgresql 9.0.4. on 32 core 512GB Dell boxes. We have
  identical systems still running 8.4.8 that do not have this issue, so I'm
  assuming it is related to the vacuum full work done for 9.0. Oddly, we don't
  see this on the smaller hosts (8 core, 64GB, slower cpus) running 9.0.4,
  so it may be timing related.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] error: could not find pg_class tuple for index 2662

2011-07-27 Thread daveg

My client has been seeing regular instances of the following sort of problem:

...
 03:06:09.453 exec_simple_query, postgres.c:900
 03:06:12.042 XX000: could not find pg_class tuple for index 2662 at character 
13
 03:06:12.042 RelationReloadIndexInfo, relcache.c:1740
 03:06:12.042 INSERT INTO zzz_k(k) SELECT ...
 03:06:12.045 0: statement: ABORT
 03:06:12.045 exec_simple_query, postgres.c:900
 03:06:12.045 0: duration: 0.100 ms
 03:06:12.045 exec_simple_query, postgres.c:1128
 03:06:12.046 0: statement: INSERT INTO temp_807
  VALUES (...)
 03:06:12.046 exec_simple_query, postgres.c:900
 03:06:12.046 XX000: could not find pg_class tuple for index 2662 at character 
13
 03:06:12.046 RelationReloadIndexInfo, relcache.c:1740
 03:06:12.046 INSERT INTO temp_807
  VALUES (...)
 03:06:12.096 08P01: unexpected EOF on client connection
 03:06:12.096 SocketBackend, postgres.c:348
 03:06:12.096 XX000: could not find pg_class tuple for index 2662
 03:06:12.096 RelationReloadIndexInfo, relcache.c:1740
 03:06:12.121 0: disconnection: session time: 0:06:08.537 user=ZZZ 
database=ZZZ_01
 03:06:12.121 log_disconnections, postgres.c:4339


The above happens regularly (but not completely predictably) corresponding
with a daily cronjob that checks the catalogs for bloat and does vacuum full
and/or reindex as needed. Since some of the applications make very heavy
use of temp tables, this will usually mean pg_class and pg_index get a vacuum
full and reindex.

Sometimes queries will fail due to being unable to open a table's containing
file. On investigation the file will be absent in both the catalogs and the
filesystem so I don't know what table it refers to:

 20:41:19.063  ERROR:  could not open file 
pg_tblspc/16401/PG_9.0_201008051/16413/1049145092: No such file or directory
 20:41:19.063  STATEMENT:  insert into r_ar__30
   select aid, mid, pid, sum(wdata) as wdata, ...
--
 20:41:19.430  ERROR:  could not open file 
pg_tblspc/16401/PG_9.0_201008051/16413/1049145092: No such file or directory
 20:41:19.430  STATEMENT: SELECT nextval('j_id_seq')


Finally, I have seen several instances of failure to read data by
vacuum full itself:

 03:05:45.699 0: statement: vacuum full pg_catalog.pg_index;
 03:05:45.699 exec_simple_query, postgres.c:900
 03:05:46.142 XX001: could not read block 65 in file 
pg_tblspc/16401/PG_9.0_201008051/16416/1049146489: read only 0 of 8192 bytes
 03:05:46.142 mdread, md.c:656
 03:05:46.142 vacuum full pg_catalog.pg_index;

This occurs on postgresql 9.0.4 on 32 core 512GB Dell boxes. We have
identical systems still running 8.4.8 that do not have this issue, so I'm
assuming it is related to the vacuum full work done for 9.0. Oddly, we don't
see this on the smaller hosts (8 core, 64GB, slower cpus) running 9.0.4,
so it may be timing related.

This seems possibly related to the issues in:

  Bizarre buildfarm failure on baiji: can't find pg_class_oid_index
http://archives.postgresql.org/pgsql-hackers/2010-02/msg02038.php
  Broken HOT chains in system catalogs
http://archives.postgresql.org/pgsql-hackers/2011-04/msg00777.php

As far as I can tell from the logs I have, once a session sees one of these
errors any subsequent query will hit it again until the session exits.
However, it does not seem to harm other sessions or leave any persistent
damage (crossing fingers and hoping here).

I'm ready to do any testing/investigation/instrumented builds etc that may be
helpful in resolving this.

Regards

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

2011-03-08 Thread daveg
On Tue, Mar 08, 2011 at 10:00:01AM +0200, Heikki Linnakangas wrote:
 On 08.03.2011 04:07, Greg Stark wrote:
 Well from that log you definitely have OldestXmin going backwards. And
 not by a little bit either. at 6:33 it set the all_visible flag and
 then at 7:01 it was almost 1.3 million transactions earlier. In fact
 to precisely the same value that was in use for a transaction at 1:38.
 That seems like a bit of a coincidence though it's not repeated
 earlier.
 
 Yep. After staring at GetOldestXmin() again, it finally struck me how 
 OldestXmin can move backwards. You need two databases for it, which 
 probably explains why this has been so elusive.
... 
 What there are no other transactions active in the same database, 
 GetOldestXmin() returns just latestCompletedXid. When you open a 
 transaction in the same database after that, its xid will be above 
 latestCompletedXid, but its xmin includes transactions from all 
 databases, and there might be a transaction in some other database with 
 an xid that precedes the value that GetOldestXmin() returned earlier.
 
 I'm not sure what to do about that. One idea is track two xmin values in 
 proc-array, one that includes transactions in all databases, and another 
 that only includes transactions in the same database. GetOldestXmin() 
 (when allDbs is false) would only pay attention to the latter. It would 
 add a few instructions to GetSnapshotData(), though.
 
 Another idea is to give up on the warning when it appears that 
 oldestxmin has moved backwards, and assume that it's actually fine. We 
 could still warn in other cases where the flag appears to be incorrectly 
 set, like if there is a deleted tuple on the page.

I read this to mean that it is safe to ignore this warning and that these
databases are not at risk for data corruption or wrong results so long as
the warning is due to oldestxmin.  Please correct me if I have misunderstood. 

Thanks

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

2011-03-08 Thread daveg
On Tue, Mar 08, 2011 at 10:37:24AM +0200, Heikki Linnakangas wrote:
 On 08.03.2011 10:00, Heikki Linnakangas wrote:
 Another idea is to give up on the warning when it appears that
 oldestxmin has moved backwards, and assume that it's actually fine. We
 could still warn in other cases where the flag appears to be incorrectly
 set, like if there is a deleted tuple on the page.
 
 This is probably a better idea at least in back-branches. It also 
 handles the case of twiddling vacuum_defer_cleanup_age, which tracking 
 two xmins per transactions would not handle.
 
 Here's a patch. I also changed the warning per Robert's suggestion. 
 Anyone see a hole in this?

It would be helpful to have the dbname and schema in the message in addition
to the relname. I added those to the original diagnostic patch as it was not
clear that the messages were all related to the same page/table/db.

Also, in your comment you might mention that multiple databases are one way
we could see oldestxmin move backwards.

-dg
 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

2011-03-07 Thread daveg
On Fri, Mar 04, 2011 at 05:52:29PM +0200, Heikki Linnakangas wrote:
 Hmm, if these all came from the same database, then it looks OldestXmin 
 has moved backwards. That would explain the warnings. First one vacuum 
 determines that all the tuples are visible to everyone and sets the 
 flag. Then another vacuum runs with an older OldestXmin, and determines 
 that there's a tuple on the page with an xmin that is not yet visible to 
 everyone, hence it thinks that the flag should not have been set yet.
 
 Looking at the code, I don't see how that situation could arise, though. 
 The value calculated by GetOldestXmin() should never move backwards. And 
 GetOldestXmin() is called in lazy_vacuum_rel(), after it has acquired a 
 lock on the table, which should protect from a race condition where two 
 vacuums could run on the table one after another, in a way where the 
 later vacuum runs with an OldestXmin calculated before the first vacuum.
 
 Hmm, fiddling with vacuum_defer_cleanup_age on the fly could cause that, 
 though. You don't do that, do you?
 
No.

I've updated the patch to collect db and schema and added Merlin's patch as
well and run it for a while. The attached log is all the debug messages
for pg_statistic page 333 from one database. I've also attached the two
most recent page images for that particular page, the last digits in the
filename are the hour and minute of when the page was saved.

What else can I be doing to help figure this out?

Thanks

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.
2011-03-05 00:50:03.238 PST  13304  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 337186202)
2011-03-05 21:55:00.004 PST  10702  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 341342282)
2011-03-05 23:28:06.368 PST  16660  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 342000646)
2011-03-06 05:24:03.461 PST  12850  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 343420345)
2011-03-06 06:23:40.174 PST  21650  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 343773970)
2011-03-06 07:50:49.921 PST  3847  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 344373260)
2011-03-06 09:11:35.662 PST  10820  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 344660206)
2011-03-07 01:23:57.124 PST  14121  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 347873961)
2011-03-07 01:25:01.129 PST  14266  WARNING:  PD_ALL_VISIBLE flag was 
incorrectly set in relation c57.pg_catalog.pg_statistic page 333 via 
OldestXmin (OldestXmin 347186993)
2011-03-07 01:26:01.143 PST  14356  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 347879220)
2011-03-07 01:27:01.158 PST  14456  WARNING:  PD_ALL_VISIBLE flag was 
incorrectly set in relation c57.pg_catalog.pg_statistic page 333 via 
OldestXmin (OldestXmin 347186993)
2011-03-07 01:33:05.353 PST  15108  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 347705144)
2011-03-07 01:38:09.298 PST  15869  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 347705144)
2011-03-07 01:51:17.622 PST  18829  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 348042728)
2011-03-07 02:24:41.994 PST  22594  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 348116608)
2011-03-07 06:33:47.460 PST  18384  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 349004767)
2011-03-07 07:01:51.400 PST  23896  WARNING:  PD_ALL_VISIBLE flag was 
incorrectly set in relation c57.pg_catalog.pg_statistic page 333 via 
OldestXmin (OldestXmin 347705144)
2011-03-07 10:23:31.416 PST  10654  WARNING:  debugging: setting PD_ALL_VISIBLE 
in relation c57.pg_catalog.pg_statistic on page 333 (OldestXmin 349660381)


pageimage_c57_pg_catalog_pg_statistic_333.0127
Description: Binary data


pageimage_c57_pg_catalog_pg_statistic_333.0701
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

2011-03-04 Thread daveg
On Thu, Mar 03, 2011 at 09:04:04AM -0600, Merlin Moncure wrote:
 On Thu, Mar 3, 2011 at 2:16 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
  On 03.03.2011 09:12, daveg wrote:
 
  Question: what would be the consequence of simply patching out the setting
  of this flag? Assuming that the incorrect PD_ALL_VISIBLE flag is the only
  problem (big assumption perhaps) then simply never setting it would at
  least
  avoid the possibility of returning wrong answers, presumably at some
  performance cost. We possibly could live with that until we get a handle
  on the real cause and fix.
 
  Yes. With that assumption.
 
  If you really want to do that, I would suggest the attached patch instead.
  This just disables the optimization in seqscans to trust it, so an
  incorrectly set flag won't affect correctness of query results,  but the
  flag is still set as usual and you still get the warnings so that we can
  continue to debug the issue.
 
 This.  The mis-set flag can is likely a bug/concurrency issue etc,
 but could also be a symptom of more sinister data corruption.  I did
 various vacuum experiments all day yesterday on my windows workstation
 and was not able to produce any mis-flags.  I trust iscsi more than
 nfs, but maybe there is a connection here that is hardware based.  hm.
 do you think it would be helpful to know what is causing the
 all_visible flag to get flipped?  If so, the attached patch shows
 which case is throwing it...

I'll apply your patch and try it. Probably can only do it for a few minutes
tomorrow evening though as the output is huge and we have only limited down
time availability.
 
-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

2011-03-03 Thread daveg
On Thu, Mar 03, 2011 at 10:16:29AM +0200, Heikki Linnakangas wrote:
 On 03.03.2011 09:12, daveg wrote:
 Question: what would be the consequence of simply patching out the setting
 of this flag? Assuming that the incorrect PD_ALL_VISIBLE flag is the only
 problem (big assumption perhaps) then simply never setting it would at 
 least
 avoid the possibility of returning wrong answers, presumably at some
 performance cost. We possibly could live with that until we get a handle
 on the real cause and fix.
 
 Yes. With that assumption.
 
 If you really want to do that, I would suggest the attached patch 
 instead. This just disables the optimization in seqscans to trust it, so 
 an incorrectly set flag won't affect correctness of query results,  but 
 the flag is still set as usual and you still get the warnings so that we 
 can continue to debug the issue.

Thanks. I'll be applying this tomorrow and will send you some page images
to look at assuming it still does it.

I had a look at how this gets set and cleared and did not see anything obvious
so I'm pretty mystified. Also, we are seeing thousands of these daily for at
least a month on 4 large hosts and no-one has noticed any other issues,
which surprises me. Very strange.

-dg
 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

2011-03-02 Thread daveg
On Tue, Mar 01, 2011 at 01:20:43PM -0800, daveg wrote:
 On Tue, Mar 01, 2011 at 12:00:54AM +0200, Heikki Linnakangas wrote:
  On 28.02.2011 23:28, daveg wrote:
  On Wed, Jan 12, 2011 at 10:46:14AM +0200, Heikki Linnakangas wrote:
  We'll likely need to go back and forth a few times with various
  debugging patches until we get to the heart of this..
  
  Anything new on this? I'm seeing at on one of my clients production boxes.
  
  I haven't heard anything from the OP since.
  
  Also, what is the significance, ie what is the risk or damage potential if
  this flag is set incorrectly?
  
  Sequential scans will honor the flag, so you might see some dead rows 
  incorrectly returned by a sequential scan. That's the only damage, but 
  an incorrectly set flag could be a sign of something more sinister, like 
  corrupt tuple headers. The flag should never be set incorrectly, so if 
  you see that message you have hit a bug in PostgreSQL, or you have bad 
  hardware.
  
  This flag is quite new, so a bug in PostgreSQL is quite possible. If you 
  still have a backup that contains those incorrectly set flags, I'd like 
  to see what the page looks like.
 
 
 I ran vacuums on all the affected tables last night. I plan to take a downtime
 to clear the buffer cache and then to run vacuums on all the dbs in the
 cluster.
 
 Most but not all the tables involved are catalogs.
 
 However, I could probably pick up your old patch sometime next week if it
 recurrs and send you page images.

After a restart and vacuum of all dbs with no other activity things were
quiet for a couple hours and then we started seeing these PD_ALL_VISIBLE
messages again. 

Going back through the logs we have been getting these since at least before
mid January. Oddly, this only happens on four systems which are all new Dell
32 core Nehalem 512GB machines using iscsi partitions served off a Netapp.
Our older 8 core 64GB hosts have never logged any of these errors. I'm not
saying it is related to the hw, as these hosts are doing a lot more work than
the old hosts so it may be a concurrency problem that just never came up at
lower levels before.

Postgresql version is 8.4.4.

I'll pick up Heikkis page logging patch and run it for a bit to get some
damaged page images. What else could I be doing to track this down?

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

2011-03-02 Thread daveg
On Wed, Mar 02, 2011 at 06:45:13PM -0300, Alvaro Herrera wrote:
 Excerpts from daveg's message of mié mar 02 18:30:34 -0300 2011:
 
  After a restart and vacuum of all dbs with no other activity things were
  quiet for a couple hours and then we started seeing these PD_ALL_VISIBLE
  messages again. 
  
  Going back through the logs we have been getting these since at least before
  mid January. Oddly, this only happens on four systems which are all new Dell
  32 core Nehalem 512GB machines using iscsi partitions served off a Netapp.
  Our older 8 core 64GB hosts have never logged any of these errors. I'm not
  saying it is related to the hw, as these hosts are doing a lot more work 
  than
  the old hosts so it may be a concurrency problem that just never came up at
  lower levels before.
  
  Postgresql version is 8.4.4.
 
 I don't see how this could be related, but since you're running on NFS,
 maybe it is, somehow:
 http://archives.postgresql.org/message-id/4d40ddb7.1010...@credativ.com
 (for example what if the visibility map fork's last page is overwritten?)

Running on ISCSI, not nfs. But it is still a Netapp, so who knows. I'll look.
Also, we are not seeing any of the unexpected data beyond EOF errors,
just thousands per day of the PD_ALL_VISIBLE error.

-dg
 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

2011-03-02 Thread daveg
On Wed, Mar 02, 2011 at 04:20:24PM -0800, bricklen wrote:
 On Wed, Mar 2, 2011 at 3:53 PM, daveg da...@sonic.net wrote:
   Postgresql version is 8.4.4.
 
  I don't see how this could be related, but since you're running on NFS,
  maybe it is, somehow:
  http://archives.postgresql.org/message-id/4d40ddb7.1010...@credativ.com
  (for example what if the visibility map fork's last page is overwritten?)
 
  Running on ISCSI, not nfs. But it is still a Netapp, so who knows. I'll 
  look.
  Also, we are not seeing any of the unexpected data beyond EOF errors,
  just thousands per day of the PD_ALL_VISIBLE error.
 
  -dg
 
 FWIW, we had a couple occurrences of that message about a month ago on 9.0.2
 
 http://archives.postgresql.org/pgsql-general/2011-01/msg00887.php
 
 Haven't seen it since we ran a cluster-wide vacuum.

We did a shutdown and restart to clear the buffer cache (but did not reboot
the host) and a vacuum on all dbs in the cluster last night. That cleared it
up for a couple hours, but we are still getting lots of these messages.

Most of them are pg_statistic and we create and drop hundreds of thousands of
temp tables daily, so there is a good chance there is a concurrency issue.

-dg
 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

2011-03-02 Thread daveg
On Tue, Mar 01, 2011 at 08:40:37AM -0500, Robert Haas wrote:
 On Mon, Feb 28, 2011 at 10:32 PM, Greg Stark gsst...@mit.edu wrote:
  On Tue, Mar 1, 2011 at 1:43 AM, David Christensen da...@endpoint.com 
  wrote:
  Was this cluster upgraded to 8.4.4 from 8.4.0?  It sounds to me like a 
  known bug in 8.4.0 which was fixed by this commit:
 
 
  The reproduction script described was running vacuum repeatedly. A
  single vacuum run out to be sufficient to clean up the problem if it
  was left-over.
 
  I wonder if it would help to write a regression test that runs 100 or
  so vacuums and see if the bulid farm turns up any examples of this
  behaviour.
 
 One other thing to keep in mind here is that the warning message we've
 chosen can be a bit misleading.  The warning is:
 
 WARNING:  PD_ALL_VISIBLE flag was incorrectly set in relation test page 1
 
 ...which implies that the state of the tuples is correct, and that the
 page-level bit is wrong in comparison.  But I recently saw a case
 where the infomask got clobbered, resulting in this warning.  The page
 level bit was correct, at least relative to the intended page
 contents; it was the a tuple on the page that was screwed up.  It
 might have been better to pick a more neutral phrasing, like page is
 marked all-visible but some tuples are not visible.

Yeesh. Yikes. I hope that this is not the case as we are seeing thousands of
these daily on each of 4 large production hosts. Mostly on catalogs,
especially pg_statistic. However it does occur on some high delete/insert
traffic user tables too.

Question: what would be the consequence of simply patching out the setting
of this flag? Assuming that the incorrect PD_ALL_VISIBLE flag is the only
problem (big assumption perhaps) then simply never setting it would at least
avoid the possibility of returning wrong answers, presumably at some
performance cost. We possibly could live with that until we get a handle
on the real cause and fix.

I had a look and don't really see anything except vacuum_lazy that sets it,
so it seems simple to disable.

Or have I understood this incorrectly?

Anything else I can be doing to try to track this down?

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

2011-03-01 Thread daveg
On Mon, Feb 28, 2011 at 07:43:39PM -0600, David Christensen wrote:
 
 On Feb 28, 2011, at 3:28 PM, daveg wrote:
 
  Anything new on this? I'm seeing at on one of my clients production boxes.
  Also, what is the significance, ie what is the risk or damage potential if
  this flag is set incorrectly?
 
 
 Was this cluster upgraded to 8.4.4 from 8.4.0?  It sounds to me like a known 
 bug in 8.4.0 which was fixed by this commit:
 
 commit 7fc7a7c4d082bfbd579f49e92b046dd51f1faf5f
 Author: Tom Lane t...@sss.pgh.pa.us
 Date:   Mon Aug 24 02:18:32 2009 +
 
 Fix a violation of WAL coding rules in the recent patch to include an
 all tuples visible flag in heap page headers.  The flag update *must*
 be applied before calling XLogInsert, but heap_update and the tuple
 moving routines in VACUUM FULL were ignoring this rule.  A crash and
 replay could therefore leave the flag incorrectly set, causing rows
 to appear visible in seqscans when they should not be.  This might explain
 recent reports of data corruption from Jeff Ross and others.
 
 In passing, do a bit of editorialization on comments in visibilitymap.c.
 
 oy:postgresql machack$ git describe --tag 
 7fc7a7c4d082bfbd579f49e92b046dd51f1faf5f
 REL8_4_0-190-g7fc7a7c
 
 If the flag got twiddled while running as 8.4.0, the incorrect PD_ALL_VISIBLE 
 flag would (obviously) not be fixed by the upgrade to 8.4.4.  (Is this a 
 separate issue?)

This cluster was installed with 8.4.4. So it is still an existing problem.
Also, to my recollection, this cluster has never crashed.

-dg


-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

2011-03-01 Thread daveg
On Tue, Mar 01, 2011 at 12:00:54AM +0200, Heikki Linnakangas wrote:
 On 28.02.2011 23:28, daveg wrote:
 On Wed, Jan 12, 2011 at 10:46:14AM +0200, Heikki Linnakangas wrote:
 We'll likely need to go back and forth a few times with various
 debugging patches until we get to the heart of this..
 
 Anything new on this? I'm seeing at on one of my clients production boxes.
 
 I haven't heard anything from the OP since.
 
 Also, what is the significance, ie what is the risk or damage potential if
 this flag is set incorrectly?
 
 Sequential scans will honor the flag, so you might see some dead rows 
 incorrectly returned by a sequential scan. That's the only damage, but 
 an incorrectly set flag could be a sign of something more sinister, like 
 corrupt tuple headers. The flag should never be set incorrectly, so if 
 you see that message you have hit a bug in PostgreSQL, or you have bad 
 hardware.
 
 This flag is quite new, so a bug in PostgreSQL is quite possible. If you 
 still have a backup that contains those incorrectly set flags, I'd like 
 to see what the page looks like.


I ran vacuums on all the affected tables last night. I plan to take a downtime
to clear the buffer cache and then to run vacuums on all the dbs in the
cluster.

Most but not all the tables involved are catalogs.

However, I could probably pick up your old patch sometime next week if it
recurrs and send you page images.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [ADMIN] PD_ALL_VISIBLE flag was incorrectly set happend during repeatable vacuum

2011-02-28 Thread daveg
On Wed, Jan 12, 2011 at 10:46:14AM +0200, Heikki Linnakangas wrote:
 On 12.01.2011 06:21, Fujii Masao wrote:
 On Sat, Dec 25, 2010 at 2:09 PM, Maxim Bogukmaxim.bo...@gmail.com  wrote:
 While I trying create reproducible test case for BUG #5798 I
 encountered very strange effect on two of my servers (both servers
 have same hardware platform/OS (freebsd 7.2) and PostgreSQL 8.4.4).
 
 Very simple test table created as:
 CREATE TABLE test (id integer);
 INSERT INTO test select generate_series(0,1);
 
 And I trying repeateble vacuum of that table with script:
  perl -e "foreach (1..10) {system \"psql -d test -h -c 'vacuum test'\";}"
 
 And once per like an minute (really random intervals can be 5 minutes
 without problems can be 3 vacuum in row show same error)  I getting
 next errors:
 WARNING:  PD_ALL_VISIBLE flag was incorrectly set in relation test page 
 1
 ...
 WARNING:  PD_ALL_VISIBLE flag was incorrectly set in relation test
 page 30 for all pages of the relation.
 
 Oh, interesting. This is the first time anyone can reliably reproducible 
 that. I can't reproduce that on my laptop with that script, though, so 
 I'm going to need your help to debug this.
 
 Can you compile PostgreSQL with the attached patch, and rerun the test? 
 It will dump the pages with incorrectly set flags to files in /tmp/, and 
 adds a bit more detail in the WARNING.  Please run the test until you 
 get those warnings, and tar up the the created /tmp/pageimage* files, 
 and post them along with the warning generated.
 
 We'll likely need to go back and forth a few times with various 
 debugging patches until we get to the heart of this..

Anything new on this? I'm seeing it on one of my client's production boxes.
Also, what is the significance, ie what is the risk or damage potential if
this flag is set incorrectly?

Thanks

-dg


-- 
David Gould   da...@sonic.net
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL and HugePage

2010-10-21 Thread daveg
On Thu, Oct 21, 2010 at 08:16:27AM -0700, Mark Wong wrote:
 On Tue, Oct 19, 2010 at 8:30 PM, daveg da...@sonic.net wrote:
  On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
  On 20/10/10 16:05, Mark Kirkwood wrote:
  
  
  shmget and friends are hugetlbpage  aware, so it seems it should 'just
  work'.
  
 
  Heh - provided you specify
 
  SHM_HUGETLB
 
 
  in the relevant call that is :-)
 
  I had a patch for this against 8.3 that I could update if there is any
  interest. I suspect it is helpful.
 
 Oh, probably better than me digging up my broken one.  Send it out as
 is if you don't want to update it. :)

I'll update it and see if I can get a largish machine to test on, at least
with pgbench. But not today, alas.

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL and HugePage

2010-10-20 Thread daveg
On Wed, Oct 20, 2010 at 12:28:25PM -0700, Greg Stark wrote:
 On Wed, Oct 20, 2010 at 12:17 PM, Greg Stark gsst...@mit.edu wrote:
  I don't think it's a big cost once all the processes
  have been forked if you're reusing them beyond perhaps slightly more
  efficient cache usage.
 
 Hm, this site claims to get a 13% win just from the reduced tlb misses
 using a preload hack with Pg 8.2. That would be pretty substantial.
 
 http://oss.linbit.com/hugetlb/

That was my motivation in trying a patch. TLB misses can be a substantial
overhead. I'm not current on the state of play, but when I was working at
Sun's benchmark lab on a TPC-B database benchmark for the first generation
of MP systems, something like 30% of all bus traffic was TLB misses. The
next iteration of the hardware had a much larger TLB.

I have a client with 512GB memory systems, currently with 128GB configured
as postgresql buffer cache, which is 32M TLB entries trying to fit in the
few dozen cpu TLB slots. I suspect there may be some contention.

I'll benchmark of course.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PostgreSQL and HugePage

2010-10-19 Thread daveg
On Wed, Oct 20, 2010 at 04:08:37PM +1300, Mark Kirkwood wrote:
 On 20/10/10 16:05, Mark Kirkwood wrote:
 
 
 shmget and friends are hugetlbpage  aware, so it seems it should 'just 
 work'.
 
 
 Heh - provided you specify
 
 SHM_HUGETLB
 
 
 in the relevant call that is :-)

I had a patch for this against 8.3 that I could update if there is any
interest. I suspect it is helpful.
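
For what it's worth, the core of such a change is just passing the flag when
the segment is created. A rough sketch follows (not lifted from the actual 8.3
patch; the fallback to normal pages and the permissions are assumptions):

#include <sys/ipc.h>
#include <sys/shm.h>

#ifndef SHM_HUGETLB
#define SHM_HUGETLB 04000			/* Linux-specific flag */
#endif

static int
create_shared_segment(key_t key, size_t size)
{
	/* try huge pages first; size should normally be a multiple of the huge page size */
	int			shmid = shmget(key, size, IPC_CREAT | IPC_EXCL | SHM_HUGETLB | 0600);

	if (shmid < 0)
	{
		/* no (or not enough) huge pages configured: fall back to normal pages */
		shmid = shmget(key, size, IPC_CREAT | IPC_EXCL | 0600);
	}
	return shmid;
}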

-dg
 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] a faster compression algorithm for pg_dump

2010-04-14 Thread daveg
On Tue, Apr 13, 2010 at 03:03:58PM -0400, Tom Lane wrote:
 Joachim Wieland j...@mcknight.de writes:
  If we still cannot do this, then what I am asking is: What does the
  project need to be able to at least link against such a compression
  algorithm?
 
 Well, what we *really* need is a convincing argument that it's worth
 taking some risk for.  I find that not obvious.  You can pipe the output
 of pg_dump into your-choice-of-compressor, for example, and that gets
 you the ability to spread the work across multiple CPUs in addition to
 eliminating legal risk to the PG project.  And in any case the general
 impression seems to be that the main dump-speed bottleneck is on the
 backend side not in pg_dump's compression.

My client uses pg_dump -Fc and produces about 700GB of compressed postgresql
dump nightly from multiple hosts. They also depend on being able to read and
filter the dump catalog. A faster compression algorithm would be a huge
benefit for dealing with this volume.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TCP keepalive support for libpq

2010-02-10 Thread daveg
On Tue, Feb 09, 2010 at 09:34:10AM -0500, Andrew Chernow wrote:
 Tollef Fog Heen wrote:
 (please Cc me on replies, I am not subscribed)
 
 Hi,
 
 libpq currently does not use TCP keepalives.  This is a problem in our
 case where we have some clients waiting for notifies and then the
 connection is dropped on the server side.  The client never gets the FIN
 and thinks the connection is up.  The attached patch unconditionally
 adds keepalives.  I chose unconditionally as this is what the server
 does.  We didn't need the ability to tune the timeouts, but that could
 be added with reasonable ease.
 
 ISTM that the default behavior should be keep alives disabled, as it is 
 now, and those wanting it can just set it in their apps:
 
 setsockopt(PQsocket(conn), SOL_SOCKET, SO_KEEPALIVE, ...)

I disagree. I have clients who have problems with leftover client connections
due to server host failures. They do not write apps in C. For a non-default
change to be effective we would need to have all the client drivers, eg JDBC,
psycopg, DBD-DBI, and the apps like psql make changes to turn it on. Adding
this option as a non-default will not really help.
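
To be concrete, the client-side workaround quoted above amounts to something
like this sketch (the TCP_KEEPIDLE tuning is Linux-specific and the value is
made up; and of course none of this is available from the non-C drivers):

#include <libpq-fe.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

static int
enable_keepalive(PGconn *conn)
{
	int			fd = PQsocket(conn);
	int			on = 1;
	int			idle = 60;		/* seconds idle before the first probe */

	if (fd < 0)
		return -1;
	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) != 0)
		return -1;
#ifdef TCP_KEEPIDLE
	/* optionally shorten the default two-hour idle period (Linux only) */
	setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle));
#endif
	return 0;
}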

-dg
 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump enhancement proposal

2009-12-13 Thread daveg
On Thu, Nov 12, 2009 at 04:31:37PM -0500, Tom Lane wrote:
 Mark Hammonds mhammo...@omniti.com writes:
  2.  Custom Query Exports
 
  In my use of mysqldump, I found one feature very useful: the ability  
  to execute a custom SELECT. . .WHERE statement and then dump only the  
  results. This feature currently provides MySQL users with the ability  
  to quickly and easily export very granular data subsets, and I see no  
  reason why PostgreSQL users wouldn't benefit from the same capability.  
  While it is true that this functionality can already be achieved in  
  PostgreSQL using Copy, it seems to me that it would logically fit well  
  as an extension to pg_dump, especially since many beginning and even  
  some intermediate PostgreSQL users aren't aware of the alternatives.
 
 As you say, we already have this using COPY, and I don't agree that
 it would be a good idea to plaster it into pg_dump as well.  pg_dump
 is intended for dumping and restoring data, not for ETL-type tasks.
 Furthermore, pg_dump is a overly complex beast already --- much more
 so than one could wish, for a tool that is absolutely fundamental to
 database reliability.  Putting requirements on it that are well outside
 its charter seems like a short route to maintenance disaster.
 
 There has been some occasional chatter about developing one or more
 tools focused on ETL rather than dump/restore, and my thought is that
 this idea would fit better there.  An ETL tool would not have the
 kind of requirements pg_dump has for coping with multiple server
 versions and knowing everything there is to know about database
 contents, so it seems like it could address new areas of functionality
 without a complexity explosion.
 
 You might want to check the archives for previous discussions ---
 I think the last go-round started with someone wanting to add an
 arbitrary WHERE filter to pg_dump dumps.

Sorry I missed this thread.

Not only has there been previous discussion, there have been at least two,
and I seem to recall three, patches implementing this. None of the patches
was very large, and none of them impacted the basic make a backup paths
in pg_dump.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EOL for 7.4?

2009-11-12 Thread daveg
On Fri, Nov 13, 2009 at 02:22:01AM +, Greg Stark wrote:
 
 Really I think you guys are on the wrong track trying to map Postgres
 releases to commercial support terms. None of the Postgres releases
 are supported in the sense that there's no warranty and no promises,
 it's all best effort. If you want a promise of anything then pay
 someone for that service.
 
 As with any open source software if you're running 7-year-old versions
 of the software you can't seriously expect the developers to take any
 interest in bugs you discover which don't affect current releases.
 Other projects don't release back branches at all. The most the
 developers are likely to do if your bugs require serious engineering
 is declare that the version you're using is too old.

Claiming to support versions that are too old is giving users a false
sense of comfort. Encouraging users to stay on these versions is actually
harming them: when they do hit a bug they will be stuck with either living
with it or doing an immediate unplanned upgrade.

I suggest we announce now that both 7.4 and 8.0 will EOL when 8.5 is expected
to ship, or, to comfort those who never use .0 versions, when 8.5.1 ships.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EOL for 7.4?

2009-11-12 Thread daveg
On Fri, Nov 13, 2009 at 02:47:56AM +, Greg Stark wrote:
 On Fri, Nov 13, 2009 at 2:35 AM, daveg da...@sonic.net wrote:
  I suggest we announce now that both 7.4 and 8.0 will EOL when 8.5 is 
  expected
  to ship, or to comfort those who never use .0 versions when 8.5.1 ships.
 
 What would this mean? How would it be different than the status quo?

I suppose it would mean posting periodic prominent notices, moving the sources
to the OLD directory, that sort of thing. I thought that was the topic of this
thread?

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] EOL for 7.4?

2009-11-03 Thread daveg
On Tue, Nov 03, 2009 at 10:32:17AM -0800, Josh Berkus wrote:
 So I'm going to make a case in favor of EOL'ing 7.4.  In fact, I'd be in
 favor of doing so in, say, February after an announcement this month.
 
 The main reason I'm in favor of this is that we have a lot of users
 using 7.4 out of inertia, and they need a message that 7.4 is not
 supported to get them to upgrade.  I can think of several here in SF
 who have been working on upgrade plans for the past 3 years.  An EOL
 is what's needed to give them a kick in the pants.
 
 The same goes for other OSS projects.  There's quite a few random OSS
 apps which were created on PG 7.4 and have never offered their users an
 upgrade path (Gnuworld comes to mind).  They need an EOL announcement to
 get them motivated to upgrade.

+1

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Application name patch - v2

2009-10-20 Thread daveg
On Tue, Oct 20, 2009 at 12:16:42PM -0400, Tom Lane wrote:
 Magnus Hagander mag...@hagander.net writes:
  Also, how many platforms can't we do this on? If we have BSD and
  Windows covered already. on linux, I believe you can easily read it
  out of /proc/self/cmdline, no?
 
 Writing a pile of platform-specific code for this is simply insane from
 a support point of view.  The feature is NOT worth it.  Especially not
 since the typical result will be something quite uninformative like
 psql or java.  The cases that are actually useful are the ones where
 the application sets it.  I don't think we should have a default at all
 --- you don't set it, you don't get a name.

-1

I'd like a default, especially for psql, to help identify interactive sessions.

-dg 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres server goes in recovery mode repeteadly

2009-10-20 Thread daveg
On Fri, Oct 02, 2009 at 07:57:13PM -0700, daveg wrote:
 On Fri, Oct 02, 2009 at 10:41:07AM -0400, Alvaro Herrera wrote:
  daveg escribió:
  
   I work with Kunal and have been looking into this. It appears to be the 
   same
   as the bug described in:
   
 http://archives.postgresql.org/pgsql-bugs/2009-09/msg00355.php
   
   as I have localized it to a NULL pointer deference in
   RelationCacheInitializePhase2() as well. Tom speculates in:
   
 http://archives.postgresql.org/pgsql-bugs/2009-09/msg00372.php
   
   that large numbers of table drops might trigger this. The system in 
   question
   creates and drops temp tables at a high rate which tends to confirm this. 
  
  Did you test the patch posted by Tom?
 
 We are testing it since last night in our test environment. If it does not
 break anything (unlikely) we will deploy it next week. However, since the
 problem is only occasional, only happens every few days on one of 50+ hosts,
 it will take some extended time without further segfaults to say anything
 confident about the patch's effectiveness.

We have had this deployed in our test and production environments for a 
couple weeks now. We have not seen any further instance of the problem.
Without the patch, we would have expected to see at least a few by now.
So the patch appears to be effective.

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Deprecation

2009-10-19 Thread daveg
On Sat, Oct 17, 2009 at 03:01:27PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Sounds like a good reason to remove add_missing_from in 8.5.
 
 Seems like the general consensus is that it's okay to do that.
 I will go make it happen unless somebody squawks pretty soon...
 
   regards, tom lane

+1

-dg


-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Application name patch - v2

2009-10-19 Thread daveg
On Mon, Oct 19, 2009 at 01:00:28PM +0100, Dave Page wrote:
 On Mon, Oct 19, 2009 at 12:57 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
  It is not practical. I'll log errors. Usually SQL injection generates
  lot of errors. Loging all statements has not sense. What is difference
  bad and good SQL statement.? Maybe multistatements are good candidates
  for log as possible attackers statements. On highly load databases
  loging all statements significantly increase load :(
 
 Ahh, I see.
 
  My point is, that the query to change the app name is logged using the
  *original* app name, thus it will not be discarded by the log analysis
  tools in your scenario.
 
 
  I thing, so change of original name should generate warning.
 
 Well, if other people think that's necessary, it's certainly possible.

I have clients working around the lack of this feature by simply prepending
a single line comment to their sql in the application to supply the app name.
eg:

  -- monthly_report monthly_process.py:524
  select wev from foo;

This feature would be very handy, but not if it requires special permission
to use it.

-dg


-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres server goes in recovery mode repeteadly

2009-10-02 Thread daveg
On Fri, Oct 02, 2009 at 10:41:07AM -0400, Alvaro Herrera wrote:
 daveg escribió:
 
  I work with Kunal and have been looking into this. It appears to be the same
  as the bug described in:
  
http://archives.postgresql.org/pgsql-bugs/2009-09/msg00355.php
  
  as I have localized it to a NULL pointer deference in
  RelationCacheInitializePhase2() as well. Tom speculates in:
  
http://archives.postgresql.org/pgsql-bugs/2009-09/msg00372.php
  
  that large numbers of table drops might trigger this. The system in question
  creates and drops temp tables at a high rate which tends to confirm this. 
 
 Did you test the patch posted by Tom?

We are testing it since last night in our test environment. If it does not
break anything (unlikely) we will deploy it next week. However, since the
problem is only occasional, only happens every few days on one of 50+ hosts,
it will take some extended time without further segfaults to say anything
confident about the patch's effectiveness.

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Limit allocated memory per session

2009-10-01 Thread daveg

I'd like to propose adding a new GUC to limit the amount of memory a backend
can allocate for its own use. The problem this addresses is that sometimes
one needs to set work_mem fairly high to get good query plans for large joins.
However, some complex queries will then use huge amounts of memory so that
one or a few of them will consume all the memory on the host and run it deep
into swap or trigger the oom killer or worse.

I've attached a patch based on 8.4.1. It works by keeping track of the
total memory allocated via malloc to AllocBlocks (aset.c). If this is not
shot down/up too badly I will rebase it on CVS and submit it for the next
commit fest.

I would also like to propose a similar limit on temp space use. It is quite
easy for an unintended cartesian product to use hundreds of gigabytes of
scratch space and cause other processes to fail due to lack of disk space.
If this is not objectionable, I'll work on it too.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.
*** ./src/include/utils/memutils.h.orig	2009-09-30 01:54:36.000000000 -0700
--- ./src/include/utils/memutils.h	2009-09-30 03:33:44.000000000 -0700
***************
*** 114,119 ****
--- 114,122 ----
   */
  
  /* aset.c */
+ 
+ extern int max_allocated_mem;
+ 
  extern MemoryContext AllocSetContextCreate(MemoryContext parent,
                                             const char *name,
                                             Size minContextSize,
*** ./src/backend/utils/mmgr/aset.c.orig	2009-09-29 16:14:23.000000000 -0700
--- ./src/backend/utils/mmgr/aset.c	2009-10-01 03:07:34.000000000 -0700
***************
*** 168,173 ****
--- 168,187 ----
  } AllocBlockData;
  
  /*
+  * AllocBlock accounting maintains total allocated memory to enforce the
+  * memory use limit.
+  */
+ int max_allocated_mem = 0;
+ Size AllocBlockAccountingMemUsed = 0;
+ 
+ #define AllocBlockAccountingFree(block) \
+     (AllocBlockAccountingMemUsed -= block->endptr - (char *) (block))
+ #define AllocBlockAccountingAlloc(block) \
+     (AllocBlockAccountingMemUsed += block->endptr - (char *) (block))
+ #define AllocBlockAccountingOverLimit() \
+     (max_allocated_mem != 0 && \
+      AllocBlockAccountingMemUsed / 1024 > max_allocated_mem)
+ 
+ /*
   * AllocChunk
   *     The prefix of each piece of memory in an AllocBlock
   *
***************
*** 393,398 ****
--- 407,423 ----
          context->blocks = block;
          /* Mark block as not to be released at reset time */
          context->keeper = block;
+ 
+         AllocBlockAccountingAlloc(block);
+         if (AllocBlockAccountingOverLimit())
+         {
+             MemoryContextStats(TopMemoryContext);
+             ereport(ERROR,
+                     (errcode(ERRCODE_OUT_OF_MEMORY),
+                      errmsg("memory limit exceeded"),
+                      errdetail("Failed while creating memory context \"%s\".",
+                                name)));
+         }
      }
  
      context->isReset = true;
***************
*** 476,481 ****
--- 501,507 ----
      else
      {
          /* Normal case, release the block */
+         AllocBlockAccountingFree(block);
  #ifdef CLOBBER_FREED_MEMORY
          /* Wipe freed memory for debugging purposes */
          memset(block, 0x7F, block->freeptr - ((char *) block));
***************
*** 521,526 ****
--- 547,553 ----
      {
          AllocBlock  next = block->next;
  
+         AllocBlockAccountingFree(block);
  #ifdef CLOBBER_FREED_MEMORY
          /* Wipe freed memory for debugging purposes */
          memset(block, 0x7F, block->freeptr - ((char *) block));
***************
*** 597,602 ****
--- 624,640 ----
          set->blocks = block;
      }
  
+     AllocBlockAccountingAlloc(block);
+     if (AllocBlockAccountingOverLimit())
+     {
+         MemoryContextStats(TopMemoryContext);
+         ereport(ERROR,
+                 (errcode(ERRCODE_OUT_OF_MEMORY),
+                  errmsg("memory limit exceeded"),
+                  errdetail("Failed on request of size %lu.",
+                            (unsigned long) size)));
+     }
+ 
      set->isReset = false;
  
      AllocAllocInfo(set, chunk);
***************
*** 767,772 ****
--- 805,821 ----
  
      block->next = set->blocks;
      set->blocks = block;
+ 
+     AllocBlockAccountingAlloc(block);
+     if 

Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread daveg
On Thu, Oct 01, 2009 at 10:35:55AM -0400, Tom Lane wrote:
 daveg da...@sonic.net writes:
  I'd like to propose adding a new GUC to limit the amount of memory a backend
  can allocate for its own use.
 
 Use ulimit.

That was my initial thought too. However, ulimit() is documented as superseded
by setrlimit(), which has the option RLIMIT_DATA to limit the size of the data
segment. Perfect!

Except, RLIMIT_DATA does not appear to work on linux. The call succeeds and
the new value can even be read back with getrlimit(), but it does not seem
to do anything to actually limit the memory allocated. I tested this on
SuSE 11: kernel 2.6.25, and Ubuntu Intrepid: kernel 2.6.28.

Setting RLIMIT_AS to limit the total address space for a process works as
expected. However, this seems undesirable for postgresql as it can also cause
stack expansion to fail, which would then force a general restart. Also,
this limit would interact with the buffer cache size setting as it includes
the shared address space as well.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Limit allocated memory per session

2009-10-01 Thread daveg
On Thu, Oct 01, 2009 at 11:47:43AM -0400, Tom Lane wrote:
 Euler Taveira de Oliveira eu...@timbira.com writes:
  Tom Lane escreveu:
  daveg da...@sonic.net writes:
  I'd like to propose adding a new GUC to limit the amount of memory a 
  backend
  can allocate for its own use.
  
  Use ulimit.

 Seriously, the proposed patch introduces overhead into a place that is
 already a known hot spot, in return for not much of anything.  It will

The overhead is simply an integer addition and compare with values that are
likely already in processor caches. And this only occurs when we actually
call malloc() to get a new block, not on every palloc. So I suspect it will
not be noticeable. However, I welcome any suggestion on how to test this
and actually measure the overhead, if any. pgbench? Something else?
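
For instance, something allocation-heavy along these lines might do as a rough
first check (the sizes are arbitrary, just enough to force many block
allocations in the sort context):

  set work_mem = '256MB';
  explain analyze
  select g, md5(g::text)
    from generate_series(1, 5000000) as g
   order by md5(g::text);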

 *not* bound backend memory use very accurately, because there is no way
 to track raw malloc() calls. And I think that 99% of users will
  not find it useful.

The use case that motivated this is a client that runs many postgresql instances
with a mostly batch/large query workload. Some of the queries are code
generated by an application and can be very complex.  A few times a month
one of these will run through 64GB of memory and oom the host. So it
seriously hurts production. Setting work_mem low enough to prevent this
results in poor query performance.

This client does not use any outside libraries that call malloc() directly.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Postgres server goes in recovery mode repeteadly

2009-10-01 Thread daveg
On Tue, Sep 29, 2009 at 09:52:06PM +0530, kunal sharma wrote:
 Hi ,
 We are using Postgres 8.4 and it has been found going into recovery
 mode a couple of times. The server process seems to fork another child process
 which is another postgres server running under the same data directory, and after
 some time it goes away while the old server is still running. There were a few
 load issues on the server but the load didn't go above 32.
 
We are running opensuse 10.2 x86_64 with 32Gb of physical memory.
 Checking the logs I found that theres a segmentation fault ,
 
 
 Sep 26 05:39:54 pace kernel: postgres[28694]: segfault at 0030
 rip 0066ba8c rsp 7fffd364da30 error 4
 
 gdb dump shows this
 
 Reading symbols from /lib64/libdl.so.2...done.
 Loaded symbols for /lib64/libdl.so.2
 Reading symbols from /lib64/libm.so.6...done.
 Loaded symbols for /lib64/libm.so.6
 Reading symbols from /lib64/libc.so.6...done.
 Loaded symbols for /lib64/libc.so.6
 Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
 Loaded symbols for /lib64/ld-linux-x86-64.so.2
 Reading symbols from /lib64/libnss_files.so.2...done.
 Loaded symbols for /lib64/libnss_files.so.2
 0x2ad6d7b8c2b3 in __select_nocancel () from /lib64/libc.so.6
 (gdb)
 
   Any suggestions what is causing this segmentation fault?

I work with Kunal and have been looking into this. It appears to be the same
as the bug described in:

  http://archives.postgresql.org/pgsql-bugs/2009-09/msg00355.php

as I have localized it to a NULL pointer dereference in
RelationCacheInitializePhase2() as well. Tom speculates in:

  http://archives.postgresql.org/pgsql-bugs/2009-09/msg00372.php

that large numbers of table drops might trigger this. The system in question
creates and drops temp tables at a high rate which tends to confirm this. 

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] BEGIN TRANSACTION and START TRANSACTION: different error handling

2009-09-24 Thread daveg
On Thu, Sep 24, 2009 at 12:16:43PM +0300, Hannu Krosing wrote:
 I expect the transaction is aborted and rollback is executed
 automatically. - this is not how postgreSQL behaves. PostgreSQL needs
 an explicit end of transaction from client, either COMMIT; or ROLLBACK;
 
 when run from psql, they both act the same, except the string returned
 
 hannu=# begin transaction;
 BEGIN
 hannu=# select 1/0;
 ERROR:  division by zero
 hannu=# select 1/0;
 ERROR:  current transaction is aborted, commands ignored until end of
 transaction block
 hannu=# abort;
 ROLLBACK
 hannu=# start transaction;
 START TRANSACTION
 hannu=# select 1/0;
 ERROR:  division by zero
 hannu=# select 1/0;
 ERROR:  current transaction is aborted, commands ignored until end of
 transaction block
 hannu=# abort;
 ROLLBACK
 
 I suspect, that psqlodbc is the one doing the automatic rollback and it
 seems to rely on reply BEGIN to establish an in-transaction state.
 
 so when start transaction; returns START TRANSACTION instead of
 BEGIN, psqlodbc does not realise that it is in transaction and does
 not initiate the automatic rollback.

Well. I'd always thought BEGIN and START were syntactic Aspartame and had
the same underlying implementation. So this is a surprise. Why do they
return a different status?

-dg
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] numeric_to_number() function skipping some digits

2009-09-23 Thread daveg
On Tue, Sep 22, 2009 at 10:27:19AM +0530, Jeevan Chalke wrote:
 
 It seems that Oracle reads formatting string from right-to-left. Here are
 few results:
 ('number','format') ==  Oracle            PG
 
 ('34,50','999,99')  ==  3450              340
 ('34,50','99,99')   ==  3450              3450
 ('34,50','99,999')  ==  Invalid Number    3450
 ('34,50','999,999') ==  Invalid Number    340

It seems worse to give a wrong answer silently than to throw an error.
What we do now seems sort of MySqlish.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding \ev view editor?

2009-09-23 Thread daveg
On Mon, Sep 21, 2009 at 02:26:05PM -0400, Andrew Dunstan wrote:
andrew=# select pg_get_viewdef('foo',true);
        pg_get_viewdef        
--------------------------------
  SELECT 'a'::text AS b,
 ( SELECT 1
FROM dual) AS x,
 random() AS y,
 CASE
 WHEN true THEN 1
 ELSE 0
 END AS c,
 1 AS d
FROM dual;
(1 row)

+1 

-dg


-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] happy birthday Tom Lane ...

2009-09-18 Thread daveg
On Fri, Sep 18, 2009 at 01:04:23PM +0200, Hans-Juergen Schoenig -- PostgreSQL 
wrote:
 Tom,
 
 On behalf of the entire PostgreSQL team here in Austria I want to wish 
 you a happy birthday.
 We hope that you will be a vital part of PostgreSQL for many years to come.
 
   Best regards,
 
   Hans-Jürgen Schönig + team

+1 from me too.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RfD: more powerful any types

2009-09-17 Thread daveg
On Tue, Sep 15, 2009 at 07:38:18AM +0200, Pavel Stehule wrote:
 it isn't fair :) why do you use $$ without single quotes? And still this
 case should be vulnerable to SQL injection. Maybe you or I know
 what SQL injection means, but beginners know nothing and these people
 use the following bad code:
 
 sql := $$SELECT * FROM '${table_name}'$$} and are happy. But this code
 is wrong!

I have an idea you will like less: have multiple interpolation codes that
automagically do the right quoting. Perhaps as extra printf like type codes.
The above then becomes:

  sql := pgprintf($$SELECT * FROM %I;$$, table_name )

Where %I evaluates as if it were quote_ident(%s).

This would maybe even encourage users to do the quoting they should by
making it easy.
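
Just to make concrete what %I would amount to, here is a minimal sketch using
the existing quote_ident() function (the wrapper function and its name are
only illustrative):

  create or replace function select_all(table_name text) returns void as $f$
  declare
      sql text;
  begin
      sql := 'SELECT * FROM ' || quote_ident(table_name);
      raise notice 'would run: %', sql;  -- e.g. SELECT * FROM "my table"
      -- execute sql;
  end;
  $f$ language plpgsql;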

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RfD: more powerful any types

2009-09-12 Thread daveg
On Fri, Sep 11, 2009 at 11:43:32AM -0400, Merlin Moncure wrote:
 
 If you are going to use printf format codes, which is good and useful
 being something of a standard, I'd call routine printf (not format)
 and actually wrap vsnprintf.  The format codes in printf have a very
 specific meaning: converting native C types to arrays of characters.
 I think that a postgresql implementation should do exactly that:
 attempt to convert the passed in datum to the c type in question if
 possible (erroring if no cast exists) and then pass it down.  The idea
 is we are not adding new formatting routines but using a very high
 quality existing one...why reinvent the wheel?
 
 so if you did: select printf('%s %3.1f', foo::box, bar::circle);
 the box to char* cast would work (using the text cast) but the second
 cast would fail unless the user added a cast to float.  The code in
 question is easy to imagine...parse the format string, and loop the
 varargs using the appropriate looked up cast one by one...

+1

-dg 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Ragged CSV import

2009-09-12 Thread daveg
On Fri, Sep 11, 2009 at 10:27:06AM +0200, Dimitri Fontaine wrote:
 Maybe instead of opening FROM for COPY, having it accepted in WITH would
 be better, the same way (from the user point of view) that DML returning
 are worked on.
... 
  WITH csv AS (
COPY t FROM stdin CSV
  )
  INSERT INTO foo(x, y, z) 
  SELECT t[3], t[2], mytimestamptz([5], [6], [7])
FROM csv;
 
 Now the text[] has a strange feeling, without it it'd be:
 
  WITH csv AS (
COPY t(a, b, c, d, e, f, g) 
FROM stdin 
 CSV IGNORING EXTRA COLUMNS -- random nice syntax
 MISSING COLUMNS DEFAULTS NULL  -- that needs some reality check
  )
  INSERT INTO foo(x, y, z) 
  SELECT c, b, mytimestamptz(e, f, g)
FROM csv;
 
 The function mytimestamptz(date text, time text, timezone text) will
 accept input that PostgreSQL input types would have errored out on... so
 you can process in one go strange formats from other products.

+1

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] remove flatfiles.c

2009-09-03 Thread daveg
On Tue, Sep 01, 2009 at 07:42:56PM -0400, Tom Lane wrote:
 Greg Stark gsst...@mit.edu writes:
  On Wed, Sep 2, 2009 at 12:01 AM, Alvaro
  Herreraalvhe...@commandprompt.com wrote:
  The use cases where VACUUM FULL wins currently are where storing two
  copies of the table and its indexes concurrently just isn't practical.
  
  Yeah, but then do you really need to use VACUUM FULL?  If that's really
  a problem then there ain't that many dead tuples around.
 
  That's what I want to believe. But picture if you have, say a
  1-terabyte table which is 50% dead tuples and you don't have a spare
  1-terabytes to rewrite the whole table.
 
 But trying to VACUUM FULL that table is going to be horridly painful
 too, and you'll still have bloated indexes afterwards.  You might as
 well just live with the 50% waste, especially since if you did a
 full-table update once you'll probably do it again sometime.
 
 I'm having a hard time believing that VACUUM FULL really has any
 interesting use-case anymore.

I have a client who uses temp tables heavily, hundreds of thousands of creates
and drops per day. They also have long running queries. The only thing that
keeps catalog bloat somewhat in check is vacuum full on bloated catalogs
a few times a day. Without that, pg_class, pg_attribute, etc. quickly balloon to
thousands of pages.
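
For what it's worth, the growth is easy to see with something along these
lines (approximate, since relpages/reltuples are only refreshed by
VACUUM/ANALYZE):

  select relname, relpages, reltuples
    from pg_class
   where relname in ('pg_class', 'pg_attribute', 'pg_attrdef', 'pg_type', 'pg_depend')
   order by relpages desc;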

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] remove flatfiles.c

2009-09-03 Thread daveg
On Thu, Sep 03, 2009 at 07:57:25PM -0400, Andrew Dunstan wrote:
 daveg wrote:
 On Tue, Sep 01, 2009 at 07:42:56PM -0400, Tom Lane wrote:
 I'm having a hard time believing that VACUUM FULL really has any
 interesting use-case anymore.
 
 I have a client who uses temp tables heavily, hundreds of thousands of 
 creates
 and drops per day. They also have long running queries. The only thing that
 keeps catalog bloat somewhat in check is vacuum full on bloated catalogs
 a few times a day. Without that pg_class, pg_attribute etc quickly balloon 
 to thousands of pages.
 
 That's a rate of more than one create and drop per second. How does your 
 client handle the fact that VACUUM FULL will exclusively lock those 
 catalog tables? Without knowing more, it looks like a bit of a design issue.

I'd say it is several per second.

They wait for the catalog locks sometimes. This is not an interactive
application so that is somewhat acceptable. It also occasionally causes
deadlocks which is less agreeable.

There are various reasons for the heavy use of temps, mainly having to do
with loading external feeds or reusing intermediate query results in a series
of queries.

It would be great if there were a way to have temp tables that
did not get cataloged, e.g. local cache only.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add YAML option to explain

2009-08-31 Thread daveg
On Mon, Aug 31, 2009 at 02:15:08PM -, Greg Sabino Mullane wrote:
  Greg, can we see a few examples of the YAML output
  compared to both json and text?
...
 greg=# explain (format json, analyze on) select * from pg_class where relname 
 ~ 'x' order by 1,2,3;
 QUERY PLAN
 - ---

An interesting property of JSON: it is almost exactly the same as Python
data structure syntax. If I paste the following into Python:

plan = [
   {
     "Plan": {
       "Node Type": "Sort",
       "Startup Cost": 12.82,
       "Total Cost": 13.10,
       "Plan Rows": 111,
       "Plan Width": 185,
       "Actual Startup Time": 1.152,
       "Actual Total Time": 1.373,
       "Actual Rows": 105,
       "Actual Loops": 1,
       "Sort Key": ["relname", "relnamespace", "reltype"],
       "Sort Method": "quicksort",
       "Sort Space Used": 44,
       "Sort Space Type": "Memory",
       "Plans": [
         {
           "Node Type": "Seq Scan",
           "Parent Relationship": "Outer",
           "Relation Name": "pg_class",
           "Alias": "pg_class",
           "Startup Cost": 0.00,
           "Total Cost": 9.05,
           "Plan Rows": 111,
           "Plan Width": 185,
           "Actual Startup Time": 0.067,
           "Actual Total Time": 0.817,
           "Actual Rows": 105,
           "Actual Loops": 1,
           "Filter": "(relname ~ 'x'::text)"
         }
       ]
     },
     "Triggers": [
     ],
     "Total Runtime": 1.649
   }
 ]

I get a Python data structure, which can be manipulated directly or pretty
printed:

>>> import pprint
>>> pprint.pprint(plan)
[{'Plan': {'Actual Loops': 1,
   'Actual Rows': 105,
   'Actual Startup Time': 1.1519,
   'Actual Total Time': 1.373,
   'Node Type': 'Sort',
   'Plan Rows': 111,
   'Plan Width': 185,
   'Plans': [{'Actual Loops': 1,
  'Actual Rows': 105,
  'Actual Startup Time': 0.067004,
  'Actual Total Time': 0.81695,
  'Alias': 'pg_class',
   'Filter': "(relname ~ 'x'::text)",
  'Node Type': 'Seq Scan',
  'Parent Relationship': 'Outer',
  'Plan Rows': 111,
  'Plan Width': 185,
  'Relation Name': 'pg_class',
  'Startup Cost': 0.0,
  'Total Cost': 9.0507}],
   'Sort Key': ['relname', 'relnamespace', 'reltype'],
   'Sort Method': 'quicksort',
   'Sort Space Type': 'Memory',
   'Sort Space Used': 44,
   'Startup Cost': 12.82,
   'Total Cost': 13.1},
  'Total Runtime': 1.649,
  'Triggers': []}]

I'm not sure if all json can be read this way, but the python and json
notations are very similar.

-dg

--
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 8.5 release timetable, again

2009-08-28 Thread daveg
On Thu, Aug 27, 2009 at 09:38:15PM +0200, Dimitri Fontaine wrote:
 Exactly, and I think that what we're missing here is a simple tool for
 our users to check a new PostgreSQL release against their existing
 application.
 
 We already know how to either log all queries and analyze the log files
 (CSV makes it easier, pgfouine parses them too) or to have a fe/be
 protocol proxy to record application SQL traffic (tsung recorder does
 that).
 
 What we miss is a tool to run the captured queries through both versions
 of PG and report any resultset mismatch, of course with a way to account
 for ordering issues (but we've seen people rely on the ordering when
 they don't give an order by clause, then bug the lists about it if a new
 release changes it).

This would be very useful. I am often asked how much better the new
release will run our apps as part of convincing a client to upgrade to a
more current postgresql release. Being able to replay a day's workload in a
somewhat realistic manner would be a great help.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 8.5 release timetable, again

2009-08-28 Thread daveg
On Thu, Aug 27, 2009 at 08:02:03PM -0700, Ron Mayer wrote:
 Andrew Dunstan wrote:
  I don't know of anyone who is likely to want to try out alphas in their
  normal development environments. The client I approached was
  specifically prepared to test beta releases that way.
 
 Perhaps end-users won't, but I think companies who develop software that
 works on top of postgres will. Perhaps to make sure their existing software
 continues to work; or perhaps to get a head start working with new features.
 I test against CVS-head occasionally.

I've been trying to help a client take up new versions of postgresql
more quickly as the performance or feature content is often very valuable
to them. Accordingly, I have encouraged them to run periodic samples of the
nightly snapshots on at least one development instance, and to run the
betas in the test environment. The goal is to be confident on the day of the
postgresql release that we have tested enough and fixed any incompatibilities
so that, if they so choose, they could migrate production to the new release
immediately. This way they get several months extra benefit from improvements
to postgresql. 

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add YAML option to explain

2009-08-28 Thread daveg
On Fri, Aug 28, 2009 at 04:37:41PM -0700, David E. Wheeler wrote:
 On Aug 28, 2009, at 3:45 PM, Stephen Frost wrote:
 
 +1 from me.  I've read the other comments and just plain don't agree
 with them.  It's a small patch, adds a useful format for EXPLAIN, and
 would be used.
 
 One of the best things about PG is the flexibility and usability.
 
 I agree, I tend to prefer YAML output where it's parseable (and I
 expect the EXPLAIN YAML output won't be doing anything tricky).
 
 That said, maybe there should be a way to create modules add formats,  
 instead of adding them to core?

+1

-dg
 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GRANT ON ALL IN schema

2009-08-16 Thread daveg
On Sat, Aug 15, 2009 at 11:34:04PM +0200, Dimitri Fontaine wrote:
 Nitpicking dept, I think I prefer:
 
  DO [ [LANGUAGE] language] $$ ... $$;
  DO plperl $$ ... $$;
  DO language plpython $$ ... $$;
 
 language is optional and defaults to plpgsql.

+1
 
-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GRANT ON ALL IN schema

2009-08-16 Thread daveg
On Sun, Aug 16, 2009 at 02:59:53PM +0200, Pavel Stehule wrote:
 2009/8/16 Peter Eisentraut pete...@gmx.net:
  On sön, 2009-08-16 at 00:04 -0400, Andrew Dunstan wrote:
  SQL is not Lisp. Simple is  good. I didn't think Peter was really very
  serious.
 
  Well, I don't know if we really need to call it lambda, but I fully
  expect to be able to use these ad hoc functions as part of other
  expressions.  So making DO or whatever a top-level command that does not
  integrate with anything else would not really satisfy me.
 
 +1

+1

-dg
 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Alpha releases: How to tag

2009-08-08 Thread daveg
On Fri, Aug 07, 2009 at 06:28:34PM -0400, Tom Lane wrote:
 David Fetter da...@fetter.org writes:
  I am not suggesting that this change be immediate, and it's not ivory
  tower.  It's just how everybody else does it.
 
 You keep saying that, and it's completely meaningless.  What do you know
 about the development practices of Oracle, or DB2, or even Mysql?

When I was at Sybase, any change to the on-disk structure was required to
come with code to do the migration. Nonetheless, at release time, the
migration process was almost always discovered to be broken, sometimes even
before it was shipped to customers.

Of course, Sybase implemented its own complete filesystem layer on top of
raw partitions, so there was more scope to go wrong, especially since it
was possible to corrupt the on disk structure in subtle ways that would
not be discovered in normal operation but that would cause migration to
corrupt it still further.

In fairness, this is a very difficult problem to solve well and I expect
to rely on dump/load migrations for quite some time.

-dg 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: Revise parallel pg_restore's scheduling heuristic

2009-08-03 Thread daveg
On Mon, Aug 03, 2009 at 11:21:43AM -0400, Tom Lane wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
  Over the weekend I ran 40 restores of Milwaukee County's production
  data using Friday's snapshot with and without the patch.  I alternated
  between patched and unpatched.  It appears that this latest version is
  slightly slower for our production database on the same machine and
  configuration where the previous patch appeared to be 1% to 2% faster
  than unpatched (although I had fewer samples of that).
 
 I think we can conclude that for this particular test case, the effects
 of the patch are pretty much masked by noise.  I definitely see no way
 that the latest version of the patch could really be slower than the
 original; it has the same job-scheduling behavior and strictly less
 list-munging overhead.  Now the patch could be slower than unpatched
 as a result of different job-scheduling behavior ... but there's no
 evidence here of a consistently measurable benefit or loss from that.
 
 IIRC daveg was volunteering to do some tests with his own data; maybe
 we should wait for those results.

I have run extensive tests with three trials of each configuration on two
hosts with a variety of db sizes from 3GB to 142GB. These just finished,
and I will send a more detailed summary later, but at the moment I don't
see any significant difference between the patched and vanilla pg_restore.

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: Revise parallel pg_restore's scheduling heuristic

2009-07-31 Thread daveg
On Thu, Jul 30, 2009 at 12:29:34PM -0500, Kevin Grittner wrote:
 Tom Lane t...@sss.pgh.pa.us wrote: 
  
  I think we've pretty much established that it doesn't make things
  *worse*, so I'm sort of inclined to go ahead and apply it.  The
  theoretical advantage of eliminating O(N^2) search behavior seems
  like reason enough, even if it takes a ridiculous number of tables
  for that to become significant.
  
 Agreed, although I'm having some concerns about whether this should
 proceed based exclusively on my benchmarks.  On a thread on the
 performance list, people are talking about restores which go several
 times faster with parallel restore (compared to a single job).  On my
 hardware, I haven't even gotten it to run twice as fast.  This means
 that parallel restore is not a good fit for servers like we have, at
 least with databases like we have, which means it's probably a poor
 environment to get benchmarks for this patch.  :-(
  
 Can we get someone who has benchmarks showing parallel restore to be
 eight times the speed of a single job to benchmark with this patch,
 just for confirmation?

I have a couple spare 32GB 4 core and 64GB 8 core servers with 15 scsi/sas
drives and dumps of production dbs in the 100GB to 500 GB range. These have
several hundred tables most with an index or few and an fkey or too many.

It will take a couple days to run a variety of tests I suppose, and I will
be away starting mid next week, but maybe I could get some done before I go.

Will the patch apply to a vanilla 8.4.0?


-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] display previous query string of idle-in-transaction

2009-07-23 Thread daveg
On Thu, Jun 04, 2009 at 10:22:41PM -0400, Robert Haas wrote:
 The only thing I don't like about this is that I think it's kind of a
 hack to shove the IDLE in transaction designation and the query
 string into the same database column.  I've never liked having to
 write:
 
 select sum(1) from pg_stat_activity where current_query = 'IDLE in
 transaction';
 
 ...and I like it even less if I now have to modify that query to use
 like.  We should find some way to represent this as structured
 data...  maybe make a separate column called idle that's a boolean,
 or something, and let the query column contain the most recent query
 (whether or not it's still executing).
 
I like this idea a lot. Possibly it would be useful to have the end time
of the last query too, then one could find idle sessions that were old and
truly idle rather than just waiting for a busy client to send the next query.

   select ... from pg_stat_activity
 where idle
   and last_statement_endtime < now() - interval '1 minute';

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is a plan for lmza commpression in pg_dump

2009-02-07 Thread daveg
On Wed, Feb 04, 2009 at 10:23:17PM -0500, Andrew Chernow wrote:
 Dann Corbit wrote:
 
 The LZMA SDK is granted to the public domain:
 http://www.7-zip.org/sdk.html
 
 
 I played with this but found the SDK extremely confusing and flat out 
 horrible. One personal dislike was the unnecessary use of C++; although it 
  was the horrible API that turned me off.  I'm not even sure if I ever got a 
 test program working.
 
 LZO (http://www.oberhumer.com/opensource/lzo/) is a great algorithm, easy 
 API with many variants; my fav is LZO1X-1(15).  It's known for its 
 compression and decompression speeds ... it's blazing fast.  zlib typically 
 gets 5-8% more compression.

LZO rocks. I wonder if the lzo developer would consider a license exception
so that postgresql could use it?  What would we need?

-dg
 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is a plan for lmza commpression in pg_dump

2009-02-07 Thread daveg
On Sat, Feb 07, 2009 at 02:47:05PM -0500, Bruce Momjian wrote:
 daveg wrote:
  On Wed, Feb 04, 2009 at 10:23:17PM -0500, Andrew Chernow wrote:
   Dann Corbit wrote:
   
   The LZMA SDK is granted to the public domain:
   http://www.7-zip.org/sdk.html
   
   
   I played with this but found the SDK extremely confusing and flat out 
   horrible. One personal dislike was the unnecessary use of C++; although 
   it 
was the horrible API that turned me off.  I'm not even sure if I ever 
   got a 
   test program working.
   
   LZO (http://www.oberhumer.com/opensource/lzo/) is a great algorithm, easy 
   API with many variants; my fav is LZO1X-1(15).  Its known for its 
   compresison and decompresison speeds ... its blazing fast.  zlib 
   typically 
   gets 5-8% more compression.
  
  LZO rocks. I wonder if the lzo developer would consider a license exception
  so that postgresql could use it?  What would we need?
 
 The chance of us using anything but one zlib is near zero so please do
 not persue this;  this discussion comes up much too often.

That this comes up much too often suggests that there is more than near
zero interest.  Why can only one compression library be considered?
We use multiple readline implementations, for better or worse.

I think the context here is for pg_dump only and in that context a faster
compression library makes a lot of sense. I'd be happy to prepare a patch
if the license issue can be accommodated. Hence my question: what sort of
license accommodation would we need to be able to use this library?

-dg

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Is a plan for lmza commpression in pg_dump

2009-02-07 Thread daveg
On Sat, Feb 07, 2009 at 08:49:29PM -0500, Robert Haas wrote:
 Proprietary compression algorithms, even with Postgresql-specific  
 license exceptions?

To be fair, lzo is GPL, which is a stretch to consider proprietary.

-dg
 

-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keeping creation time of objects

2008-09-09 Thread daveg
On Tue, Sep 09, 2008 at 03:36:19PM -0400, Tom Lane wrote:
 Devrim =?ISO-8859-1?Q?G=DCND=DCZ?= [EMAIL PROTECTED] writes:
  AFAICS, PostgreSQL is not keeping info about when a table, database,
  sequence, etc was created. We cannot get that info even from OS, since
  CLUSTER or VACUUM FULL may change the metadata of corresponding
  relfilenode.
 
  Does anyone think that adding a timestamp column to pg_class would bring
  an overhead?
 
 There isn't sufficient support for such a feature.  In any case, why
 would creation time (as opposed to any other time, eg last schema
 modification, last data modification, yadda yadda) be especially
 significant?  Would you expect it to be preserved over dump/restore?
 How about every other object type in the system?

I'd be very interested in seeing a last schema modification time for pg_class
objects. I don't care about it being preserved over dump and restore as my
use case is more to find out when a table was created with a view to finding
out if it is still needed. So the question I'm looking to answer is "when did
that get here?"

-dg

-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keeping creation time of objects

2008-09-09 Thread daveg
On Tue, Sep 09, 2008 at 11:03:56PM +0300, Hannu Krosing wrote:
 On Tue, 2008-09-09 at 12:40 -0700, daveg wrote:
 
  I'd be very interested in seeing a last schema modification time for 
  pg_class
  objects. I don't care about it being preserved over dump and restore as my
  use case is more to find out when a table was created with a view to finding
  out if it is still needed.
 
 Isn't it easier to find out if it is still needed by looking if it is
 still used, say from pg_stat_user_tables ?

Except that pg_dump will access it and make it look used. Also, this does
not work for functions, views etc.

It seems to me to be pretty simple to put an abstime or timestamp column
on the major catalog tables and update it when the row is updated. A mod
time is probably more useful than a create time.

-dg
 

-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] IN vs EXISTS equivalence

2008-09-03 Thread daveg
On Thu, Aug 14, 2008 at 06:50:09PM +0100, Simon Riggs wrote:
 
 On Fri, 2008-08-08 at 16:23 -0400, Tom Lane wrote:
 
  NOT IN is a lot trickier,
  condition: you must also assume that the comparison operator involved
  never yields NULL for non-null inputs.  That might be okay for btree
  comparison functions but it's not a very comfy assumption in general;
  we certainly haven't got any explicit knowledge that any functions are
  guaranteed to act that way.  So this case might be worth doing later
... 
 Just found this comment, after reading what you said on other thread
 about NOT IN.
 
 NOT IN is a serious performance issue for most people. We simply can't
 say to people you were told not to.
 
 If we can fix it easily for the majority of cases, we should. We can't
 let the "it won't work in certain cases" reason prevent various

A suggestion: what about adding an attribute to functions to declare that
they never return null? 

   declare foo(int, int) returns int immutable not null as ...


-dg


-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal sql: labeled function params

2008-08-23 Thread daveg
On Sat, Aug 23, 2008 at 05:08:25PM +0100, Gregory Stark wrote:
 Pavel Stehule [EMAIL PROTECTED] writes:
 
  Hello
 
  2008/8/23 Peter Eisentraut [EMAIL PROTECTED]:
  On Friday 22 August 2008 07:41:30 Decibel! wrote:
  If we're really worried about it we can have a GUC for a few versions
  that turns off named parameter assignment. But I don't think we
  should compromise the design on the theory that some folks might be
  using that as an operator *and* can't change their application to
  wrap it's use in ().
 
  Even if that were a reasonable strategy, you can't use GUC parameters to 
  alter
  parser behavior.
 
  I thing, so it's possible - in this case. We should transform named
  params to expr  after syntax analyze.
 
 So for a bit of useless syntactic sugar we should introduce conflicts with
 named parameters, conflicts with operators, introduce an un-sqlish syntax and
 remove a feature users have already made use of and introduce backwards
 compatibility issues for those users?
 
 At any point in this discussion has anyone explained why these labels would
 actually be a good idea?

I was missing that too. What is this for that makes it so compelling?

-dg
 

-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adjusting debug_print_plan to be more useful by default

2008-08-19 Thread daveg
On Tue, Aug 19, 2008 at 06:33:33PM +0100, Simon Riggs wrote:
 
 On Tue, 2008-08-19 at 12:40 -0400, Tom Lane wrote:
  Back in April we changed EXPLAIN VERBOSE to not dump the internal plan
  tree anymore, on the grounds that non-hackers didn't want that info and
  hackers could get it with debug_print_plan and related variables.
  Well, now that I've tried to do some planner development work relying on
  debug_print_plan instead of EXPLAIN VERBOSE, I find it a mite annoying.
  It's not sufficient to set debug_print_plan = true, because the output
  comes out at priority DEBUG1, which is to say it doesn't come out at all
  in a default configuration.  If you boost up client_min_messages or
  log_min_messages so you can see it, you get lots of extraneous debugging
  messages too.
  
  I'd like to propose that the messages emitted by debug_print_plan
  and friends be given priority LOG rather than DEBUG1.  If you've gone
  to the trouble of turning on the variable, then you presumably want the
  results, so it seems dumb to print them at a priority that isn't logged
  by default.  (Note that this is biased to the assumption that you want
  the messages in the postmaster log, not on the console.  Which is
  usually what I want, but maybe someone wants to argue for NOTICE?)
  
  I'd also like to propose making debug_pretty_print default to ON.
  At least for me, the other formatting is 100% unreadable.
 
 +1

+1

-dg

-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A smaller default postgresql.conf

2008-08-19 Thread daveg
On Tue, Aug 19, 2008 at 09:39:39PM +0300, Peter Eisentraut wrote:
 On Tuesday 19 August 2008 19:12:16 Tom Lane wrote:
  Well, why not just make a one-eighty and say that the default
  postgresql.conf is *empty* (except for whatever initdb puts into it)?
 
 Well, my original implementation of GUC had an empty default configuration 
 file, which was later craptaculated to its current form based on seemingly 
 popular demand.  I am very happy to work back toward the empty state, and 
 there appears to be growing support for that.

I like the almost empty file idea.

I often use the include facility to set up postgresql.conf files
that look roughly like this:

  #include postgresql_site_policy.conf
  #include postgresql_host_class.conf
  listen_port=

where the included files set things like the overall site's logging
preferences, and buffer cache and work_mem sizes based on installed memory and
disk system (ex: postgresql_32GB.conf), etc.

-dg


-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Visibility Groups

2008-08-08 Thread daveg
On Thu, Aug 07, 2008 at 01:30:27PM +0100, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
 
  Currently, we calculate a single OldestXmin across all snapshots on the
  assumption that any transaction might access any table.
 
  I propose creating Visibility Groups that *explicitly* limit the
  ability of a transaction to access data outside its visibility group(s).
  By default, visibility_groups would be NULL, implying potential access
  to all tables.
 
  Once set, any attempt to lock an object outside of a transactions
  defined visibility_groups will result in an error:
ERROR attempt to lock table outside of visibility group(s): foo
HINT you need to set a different value for visibility_groups
  A transaction can only ever reduce or restrict its visibility_groups, it
  cannot reset or add visibility groups.
 
 Hm, so backing up a bit from the specific proposed interface, the key here is
 being able to explicitly mark which tables your transaction will need in the
 future?
 
 Is it always just a handful of heavily updated tables that you want to
 protect? In that case we could have a lock type which means I'll never need
 to lock this object. Then a session could issue LOCK TABLE foo IN
 INACCESSIBLE MODE or something like that. That requires people to hack up
 their pg_dump or replication script though which might be awkward.
 
 Perhaps the way to do that would be to preemptively take locks on all the
 objects that you'll need, then have a command to indicate you won't need any
 further objects beyond those. 

+1

-dg
-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Mini improvement: statement_cost_limit

2008-08-04 Thread daveg
On Sun, Aug 03, 2008 at 10:57:55PM -0400, Robert Treat wrote:
 
 ISTR that what ended up killing the enthusiasm for this was that most people 
 realized that this GUC was just a poor tool to take a stab at solving other 
 problems (ie. rate limiting cpu for queries). 

I'm not concerned with that; I want developers to have feedback on costs in
a way that is obvious.
 
   I think a variation on this could be very useful in development and test
   environments. Suppose it raised a warning or notice if the cost was over
   the limit. Then one could set a limit of a few million on the development
   and test servers and developers would at least have a clue that they
   needed to look at explain for that query. As it is now, one can exhort
   them to run explain, but it has no effect.  Instead we later see queries
   killed by a 24 hour timeout with estimated costs ranging from "until they
   unplug the machine and dump it" to "until the sun turns into a red
   giant."
 
  Great argument. So that's 4 in favour at least.
 
 
 Not such a great argument. Cost models on development servers can and often 
 are quite different from those on production, so you might be putting an 
 artificial limit on top of your developers. 

We load the production dumps into our dev environment, which are the same
hardware spec, so the costs should be identical.
 
 I still think it is worth revisiting what problems people are trying to 
 solve, 
 and see if there are better tools they can be given to solve them.  Barring 
 that, I suppose a crude solution is better than nothing, though I fear people 
 might point at the crude solution as a good enough solution to justify not 
 working on better solutions. 

Alerting developers and QA to potentially costly queries would help solve
some of the problems we are trying to solve. Better tools are welcome; an
argument that the good is the enemy of the best, so we should be content with
nothing, is not.

-dg
 

-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Mini improvement: statement_cost_limit

2008-08-04 Thread daveg
On Mon, Aug 04, 2008 at 03:09:34PM -0400, Greg Smith wrote:
 On Mon, 4 Aug 2008, daveg wrote:
 We load the production dumps into our dev environment, which are the same
 hardware spec, so the costs should be identical.
 
 Not identical, just close.  ANALYZE samples data from your table randomly. 
 The statistics used to compute the costs will therefore be slightly 
 different on the two servers even if the data is the same.  The problem of 
 discovering one plan on production and another on development is not quite 
 that easy to remove.  Ultimately, if your developers aren't thorough 
 enough to do things like look at EXPLAIN plans enough to discover things 
 that are just bad, I just chuckle at your thinking that putting a single 
 limiter on their bad behavior will somehow magically make that better.

Not all developers can be persuaded to run explain on every change.
However, many will investigate a new message. I'm only hoping to try to
focus their attention toward possible problem queries.

 Anyway, if your production server is small enough that you can afford to 
 have another one just like it for the developers to work on, that's great. 
 Robert's point is that many installs don't work like that.  The 
 development teams in lots of places only get a subset of the production 
 data because it's too large to deploy on anything but a big server, which 
 often is hard to cost justify buying just for development purposes.

Not to get into a size war ;-), but the production environment I'd like this
feature for is over forty 32GB, 16-SCSI-drive, quad-core boxes. These are dedicated
to postgresql and run one or just a few databases. There are also a bunch
of client boxes that we will not speak of. The staging and test environments
are similar hardware but have only a subset of the databases copied to them.
There are probably more than a dozen DB hosts for that.

 I like the concept of a cost limit, but I'm a bit horrified by the thought 
 of it being exposed simply through the internal cost numbers because they 
 are so arbitrary.  One of the endless projects I think about but never 


 start coding is to write something that measures the things the planner 
 cost constants estimate on a particular machine, so that all those numbers 
 actually can be tied to some real-world time measure.  If you did that, 
 you'd actually have a shot at accomplishing the real goal here, making 
 statement_cost_limit cut off statements expected to take longer than 
 statement_timeout before they even get started.

That is a nice idea. Possibly it could be a utility like the fsync tester.

But planner estimates are never going to be all that accurate even with solid
cost numbers, because for some classes of queries, particularly those with
many joins, the stats can be good at each level while the error accumulates
exponentially. Which is why I think a warning is appropriate instead of an
error. Even a notice in the logs would be useful.

-dg

-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Mini improvement: statement_cost_limit

2008-08-04 Thread daveg
On Mon, Aug 04, 2008 at 02:35:07PM -0400, Robert Treat wrote:
 On Monday 04 August 2008 03:50:40 daveg wrote:
 
 That's great for you, I am talking in the scope of a general solution. (Note 
 I'd also bet that even given the same hardware, different production loads 
 can produce different relative mappings of cost vs. performance, but 
 whatever)

Even on different hardware it would still likely warn of mistakes like
cartesian products due to missing join conditions, etc.
 
   I still think it is worth revisiting what problems people are trying to
   solve, and see if there are better tools they can be given to solve them.
Barring that, I suppose a crude solution is better than nothing, though
   I fear people might point at the crude solution as a good enough solution
   to justify not working on better solutions.
 
  Alerting developers and QA to potentially costly queries would help solve
  some of the probems we are trying to solve. Better tools are welcome, an
  argument that the good is the enemy of the best so we should be content
  with nothing is not.
 
 And you'll note, I specifically said that a crude tool is better than 
 nothing. 

I realized somewhat after I sent the above that it might have sounded a bit
snippy. I hope I have not offended.

 But you're completely ignoring that a crude tool can often end up as a foot-gun 
 once released into the wild. 

I'm suggesting a warning, or even just a notice in the logs; I don't see
the footgun. What am I missing?

Regards

-dg

-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Mini improvement: statement_cost_limit

2008-08-04 Thread daveg
On Mon, Aug 04, 2008 at 11:59:03AM -0700, Josh Berkus wrote:
 Greg,
 
 Well that's going to depend on the application. But I suppose there's
 nothing wrong with having options which aren't always a good idea to use. 
 The
 real question I guess is whether there's ever a situation where it would 
 be a
 good idea to use this. I'm not 100% sure.
 
 I can think of *lots*.   Primarily, simple web applications, where 
 queries are never supposed to take more than 50ms.  If a query turns up 
 with an estimated cost of 100, then you know something's wrong; 
 in the statistics if not in the query.  In either case, that query has a 
 good chance of dragging down the whole system.
 
 In such a production application, it is better to have false positives 
 and reject otherwise-OK queries becuase their costing is wrong, than to 
 let a single cartesian join bog down an application serving 5000 
 simultaneous users.  Further, with a SQL error, this would allow the 
 query rejection to be handled in a user-friendly way from the UI 
 (Search too complex.  Try changing search terms.) rather than timing 
 out, which is very difficult to handle well.
 
 The usefulness of this feature for interactive sessions is 
 limited-to-nonexistant.  It's for production applications.

Ok, that is a different use case where an error seems very useful. What
about slightly extending the proposal to have the severity of exceeding
the limit configurable too? Something like:

   costestimate_limit = 10 # default 0 to ignore limit
   costestimate_limit_severity = error # debug, notice, warning, error

-dg

-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] Automatic Client Failover

2008-08-04 Thread daveg
On Mon, Aug 04, 2008 at 05:17:59PM -0400, Jonah H. Harris wrote:
 On Mon, Aug 4, 2008 at 5:08 PM, Simon Riggs [EMAIL PROTECTED] wrote:
  When primary server fails, it would be good if the clients connected to
  the primary knew to reconnect to the standby servers automatically.
 
 This would be a nice feature which many people I've talked to have
 asked for.  In Oracle-land, it's called Transparent Application
 Failover (TAF) and it gives you a lot of options, including the
 ability to write your own callbacks when a failover is detected.

This might be better done as part of a proxy server, e.g. pgbouncer or pgpool,
than as part of postgresql or libpq. I like the concept, but the logic to
determine when a failover has occurred is complex and a client will often
not have access to enough information to make this determination accurately.

postgresql could have hooks to support this though, i.e. to determine when a
standby thinks it has become the master.
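
As a rough illustration of what client-side-only failover can and cannot do,
here is a minimal sketch using nothing but libpq; the conninfo strings are
placeholders. It simply tries each candidate server in order, which shows the
limitation: a successful connection proves the host is reachable, not that it
is the promoted master.

    /* Naive client-side failover sketch (assumes two hard-coded candidate
     * servers).  Deciding that the server reached is really the new master
     * is the hard part discussed above. */
    #include <stdio.h>
    #include <libpq-fe.h>

    static PGconn *
    connect_with_failover(const char *const *conninfos, int n)
    {
        int     i;

        for (i = 0; i < n; i++)
        {
            PGconn *conn = PQconnectdb(conninfos[i]);

            if (PQstatus(conn) == CONNECTION_OK)
                return conn;    /* reachable, but is it the master? */

            fprintf(stderr, "failover: %s: %s",
                    conninfos[i], PQerrorMessage(conn));
            PQfinish(conn);
        }
        return NULL;
    }

    int
    main(void)
    {
        const char *const candidates[] = {
            "host=db-primary dbname=app connect_timeout=5",
            "host=db-standby dbname=app connect_timeout=5",
        };
        PGconn *conn = connect_with_failover(candidates, 2);

        if (conn == NULL)
            return 1;
        /* ... run queries; rerun connect_with_failover() on failure ... */
        PQfinish(conn);
        return 0;
    }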

-dg

-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] Mini improvement: statement_cost_limit

2008-08-04 Thread daveg
On Mon, Aug 04, 2008 at 05:19:50PM -0400, Robert Treat wrote:
 See, this is what we ended up talking about before. Someone will say I'd
 like to prevent my devs from accidentally doing queries with cartesian
 products and they will use this to do it... but that will only work in
 some cases, so it becomes a poor tool to solve a different problem.
 
 BTW, what I really love about statement costs, is that they aren't even 
 reliable on the same machine with the same data. I have seen query plans 
 which run on the same data on the same machine where the resultant query 
 runtime can vary from 2 hours to 5 hours, depending on how much other 
 concurrent traffic is on the machine. Awesome eh? 

Sure, I don't think anyone believes that costs are precise. But the case that
is interesting is 2 hours versus years and years.

 The footgun in my mind is that people will think this solves a number of 
 problems even though it doesn't solve them well.  However, the footgun for yo

I suspect that a good solution to this problem is impossible as it is more
or less the halting problem. So I'm willing to accept a poor solution based
on costs and then hope we improve the cost model.

-dg

-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] Mini improvement: statement_cost_limit

2008-08-03 Thread daveg
On Sat, Aug 02, 2008 at 09:30:08PM +0200, Hans-Jürgen Schönig wrote:
 On Aug 2, 2008, at 8:38 PM, Tom Lane wrote:
 
 Andrew Dunstan [EMAIL PROTECTED] writes:
 Hans-Jürgen Schönig wrote:
 i introduced a GUC called statement_cost_limit which can be used to
 error out if a statement is expected to be too expensive.
 
 You clearly have far more faith in the cost estimates than I do.
 
 Wasn't this exact proposal discussed and rejected awhile back?
 
  regards, tom lane
 
 
 
 i don't remember precisely.
 i have seen it on simon's wiki page and it is something which would  
 have been useful in some cases in the past.

I think a variation on this could be very useful in development and test
environments. Suppose it raised a warning or notice if the cost was over
the limit. Then one could set a limit of a few million on the development
and test servers and developers would at least have a clue that they needed
to look at explain for that query. As it is now, one can exhort them to
run explain, but it has no effect.  Instead we later see queries killed
by a 24 hour timeout with estimated costs ranging from "until they unplug
the machine and dump it" to "until the sun turns into a red giant".

-dg

-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] Copy storage parameters on CREATE TABLE LIKE/INHERITS

2008-07-30 Thread daveg
On Wed, Jul 30, 2008 at 04:45:47PM +0900, ITAGAKI Takahiro wrote:
 Here is a patch to copy column storage parameters and reloptions on
 CREATE TABLE LIKE, which I proposed at:
   [HACKERS] Uncopied parameters on CREATE TABLE LIKE
   http://archives.postgresql.org/message-id/[EMAIL PROTECTED]
 
 Column storage parameters (by ALTER COLUMN SET STORAGE) and table
 storage parameters (by ALTER TABLE SET (...) ) are copied from template
 tables on LIKE or parent tables on INHERITS (1. and 2. at above e-mail).
 The patch does not include copying of comments (3.) for now.
 It also includes new regression tests and rough documentation.
 
 When template or parent tables have non-default settings,
 they are copied into a new table automatically on CREATE TABLE.
 If the CREATE TABLE statement has its own storage parameter definitions,
 they overwrite the inherited settings.

I'd like to have the ability to copy these parameters, but not to have it
be automatic. Some of my clients' applications commonly use CREATE TABLE
LIKE to get empty work tables to populate and update before adding the
data to the main table (which may have non-default settings). The automatic
behaviour may be undesirable for this use.

-dg

-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] [PATCHES] pg_dump additional options for performance

2008-07-27 Thread daveg
On Sun, Jul 27, 2008 at 10:37:34AM +0100, Simon Riggs wrote:
 
 On Sat, 2008-07-26 at 11:03 -0700, Joshua D. Drake wrote:
 
  2. We have no concurrency which means, anyone with any database over 50G
  has unacceptable restore times.
 
 Agreed.
 
 Also the core reason for wanting -w
 
  3. We have to continue develop hacks to define custom utilization. Why
  am I passing pre-data anything? It should be automatic. For example:

[adding hackers for discussion]

On Sat, Jul 26, 2008 at 01:56:14PM -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  I want to dump tables separately for performance reasons. There are
  documented tests showing 100% gains using this method. There is no gain
  adding this to pg_restore. There is a gain to be had - parallelising
  index creation, but this patch doesn't provide parallelisation.
 
 Right, but the parallelization is going to happen sometime, and it is
 going to happen in the context of pg_restore.  So I think it's pretty
 silly to argue that no one will ever want this feature to work in
 pg_restore.
 
 To extend the example I just gave to Stephen, I think a fairly probable
 scenario is where you only need to tweak some before object
 definitions, and then you could do
 
 pg_restore --schema-before-data whole.dump >before.sql
 edit before.sql
 psql -f before.sql target_db
 pg_restore --data-only --schema-after-data -d target_db whole.dump
 
 which (given a parallelizing pg_restore) would do all the time-consuming
 steps in a fully parallelized fashion.

A few thoughts about pg_restore performance:

To take advantage of non-logged copy, the table create and data load should
be in the same transaction.

To take advantage of file and buffer cache, it would be better to create
indexes immediately after loading table data. Many tables will be small
enough to fit in cache, and this will avoid re-reading them for index
builds. This is more advantageous with more indexes on one table. There
may also be some filesystem placement benefits to building the indexes for
a table immediately after loading the data.

Creating constraints immediately after loading data also would benefit from
warm buffer and file caches. Doing this is complicated by the need
for indexes and data in the referenced tables to exist first.

It seems that a high-performance restore will want to proceed in a different
order than the current sort order or that proposed by the before/data/after
patch.

 - The simplest unit of work for parallelism may be the table and its
   decorations, e.g. indexes and relational constraints.

 - Sort tables by foreign key dependency so that referenced tables are
   loaded before referencing tables.

 - Do table creation and data load together in one transaction to use
   non-logged copy. Index builds, and constraint creation should follow
   immediately, either as part of the same transaction, or possibly
   parallelized themselves.

Table creation, data load, index builds, and constraint creation could
be packaged up as the unit of work to be done in a subprocess which either
completes or fails as a unit. The worker process would be called with
connection info, a file pointer to the data, and the DDL for the table.
pg_restore would keep a work queue of tables to be restored in FK dependency
order and also do the other schema operations such as functions and types.
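
As a rough sketch (not pg_restore code; the SQL, file path, and table name are
placeholders), one such per-table unit of work might look like this with
libpq. Creating the table and loading it in the same transaction lets COPY
skip WAL when archiving is off, and the index and constraint are built while
the data is still warm in cache:

    /* Hypothetical per-table restore worker: one connection, one
     * transaction covering create + load, then index and constraint. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <libpq-fe.h>

    static void
    run(PGconn *conn, const char *sql)
    {
        PGresult *res = PQexec(conn, sql);

        if (PQresultStatus(res) != PGRES_COMMAND_OK)
        {
            fprintf(stderr, "%s: %s", sql, PQerrorMessage(conn));
            PQclear(res);
            PQfinish(conn);
            exit(1);
        }
        PQclear(res);
    }

    int
    main(void)
    {
        PGconn *conn = PQconnectdb("dbname=target");

        if (PQstatus(conn) != CONNECTION_OK)
            exit(1);

        run(conn, "BEGIN");
        run(conn, "CREATE TABLE work_orders (id int, note text)");
        /* Server-side COPY from a file; a client could instead stream the
         * dump data with PQputCopyData. */
        run(conn, "COPY work_orders FROM '/restore/data/work_orders.dat'");
        run(conn, "CREATE INDEX work_orders_note_idx ON work_orders (note)");
        run(conn, "ALTER TABLE work_orders ADD PRIMARY KEY (id)");
        run(conn, "COMMIT");

        PQfinish(conn);
        return 0;
    }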

-dg

-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.



Re: [HACKERS] Additional psql requirements

2008-07-25 Thread daveg
On Fri, Jul 25, 2008 at 08:16:59AM +0100, Simon Riggs wrote:
 
 On Fri, 2008-07-25 at 10:00 +0900, ITAGAKI Takahiro wrote:
  Simon Riggs [EMAIL PROTECTED] wrote:
  
   * access to version number
   * simple mechanism for conditional execution
   * ability to set substitution variables from command execution
   * conditional execution whether superuser or not
  
  Can we use pgScript for such flow controls?
  http://pgscript.projects.postgresql.org/INDEX.html
  
  I'm not sure pgScript can be used in pgAdmin already, but if we support
  it in both psql and pgAdmin, the scripting syntax will be a de facto standard
  because they are the two major user interfaces to postgres. I think it is
  not good to add another dialect that can be used only in psql.
 
 I just want a good way, not two imperfect ones.
 
 And I'm not going to suggest having pgscript in core.

It seems to me that an SQL-like client-side scripting language should be as
similar as possible to plpgsql. pgScript looks a bit like plpgsql, but is
pretty much incompatible with it for no particularly obvious reason.

-dg

-- 
David Gould   [EMAIL PROTECTED]  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.


