Re: [GENERAL] VACUUM Question

2006-01-31 Thread Jim C. Nasby
On Thu, Jan 26, 2006 at 04:14:45PM -0500, Chris Browne wrote:
> > As for Reindex, I'm not entirely sure; I don't think you would benefit
> > from reindex because you aren't updating or deleting.  Can anyone comment
> > on this?  Is it possible that a table with lots of inserts resulting in
> > lots of page splits etc. could ever benefit from REINDEX?
> 
> I could imagine a CLUSTER doing some good, and if that's the case,
> REINDEX could have some favorable results.  But you'd better have a
> real specific model as to why that would be...

Aside from the cluster case, are there any issues with how page splits
in the b-tree are done that could lead to better performance after a
REINDEX?
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [GENERAL] VACUUM Question

2006-01-26 Thread Matthew T. O'Connor
> Also, somebody made a real good point about rolled-back insertions.
> Even if the only command you ever apply to the table is INSERT, you
> could still have dead rows in the table if some of those transactions
> occasionally roll back.

Hmm... that's true.  I don't think autovacuum does anything to account
for the concept of rolled-back inserts.  I suppose in most real-world
situations that number is going to be small enough to be ignored, but not
in all cases.  Is there any way for the stats system to report
information about rolled-back inserts?  In fact, autovacuum probably has a
similar deficiency for rolled-back deletes, but not for rolled-back updates.
Does anyone think this is enough of an issue that it needs more attention?
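For what it's worth, the stats system does expose rollback counts at the
database level (though not per table).  A sketch of a rough proxy query,
assuming a PostgreSQL 8.x `pg_stat_database` view:

```sql
-- Rough proxy only: per-database commit vs. rollback counts from the
-- stats collector, to gauge how common aborted transactions are.
SELECT datname, xact_commit, xact_rollback
FROM pg_stat_database
ORDER BY xact_rollback DESC;
```

This won't tell autovacuum which tables accumulated dead rows from
aborts, but it does show whether rollbacks are frequent enough to matter.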

Matt


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [GENERAL] VACUUM Question

2006-01-26 Thread Chris Browne
matthew@zeut.net ("Matthew T. O'Connor") writes:
> If you really are just inserting, and never updating or deleting,
> then you will never need to vacuum the table, rather you will just
> need to ANALYSE the table.  If you use autovacuum that is exactly
> what it will do.

"Never" is a pretty long time...

You need a VACUUM every 2^31 transactions, but since there needs to be
such a vacuum for the whole database, that one will do...
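To see how close you are to that horizon, you can check each database's
transaction-ID age; a sketch, assuming the 8.x `age()`/`datfrozenxid`
machinery:

```sql
-- How far each database's oldest XID has aged; schedule a
-- database-wide VACUUM well before this approaches ~2 billion.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY age(datfrozenxid) DESC;
```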

> As for Reindex, I'm not entirely sure; I don't think you would benefit
> from reindex because you aren't updating or deleting.  Can anyone comment
> on this?  Is it possible that a table with lots of inserts resulting in
> lots of page splits etc. could ever benefit from REINDEX?

I could imagine a CLUSTER doing some good, and if that's the case,
REINDEX could have some favorable results.  But you'd better have a
real specific model as to why that would be...
-- 
let name="cbbrowne" and tld="ntlug.org" in name ^ "@" ^ tld;;
http://cbbrowne.com/info/spreadsheets.html
Oh,  boy, virtual memory!  Now I'm  gonna make  myself a  really *big*
RAMdisk!



Re: [GENERAL] VACUUM Question

2006-01-26 Thread Tom Lane
Michael Fuhr <[EMAIL PROTECTED]> writes:
> On Thu, Jan 26, 2006 at 02:35:42PM -0500, Matthew T. O'Connor wrote:
>> If you really are just inserting, and never updating or deleting, then you
>> will never need to vacuum the table, rather you will just need to ANALYSE
>> the table.

> That's not quite true; the table must still be vacuumed occasionally
> to prevent transaction ID wraparound failure,

Also, somebody made a real good point about rolled-back insertions.
Even if the only command you ever apply to the table is INSERT, you
could still have dead rows in the table if some of those transactions
occasionally roll back.
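A quick way to see this (sketch; table name and row count hypothetical,
and `generate_series` assumes 8.0+):

```sql
CREATE TABLE insert_only_demo (id int);

BEGIN;
INSERT INTO insert_only_demo SELECT * FROM generate_series(1, 1000);
ROLLBACK;

-- The 1000 rows from the aborted transaction are now dead row
-- versions; VACUUM VERBOSE reports them as removable.
VACUUM VERBOSE insert_only_demo;
```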

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [GENERAL] VACUUM Question

2006-01-26 Thread Michael Fuhr
On Thu, Jan 26, 2006 at 02:35:42PM -0500, Matthew T. O'Connor wrote:
> If you really are just inserting, and never updating or deleting, then you
> will never need to vacuum the table, rather you will just need to ANALYSE
> the table.

That's not quite true; the table must still be vacuumed occasionally
to prevent transaction ID wraparound failure, else you risk losing
data.

http://www.postgresql.org/docs/8.1/interactive/maintenance.html#VACUUM-FOR-WRAPAROUND

-- 
Michael Fuhr



Re: [GENERAL] VACUUM Question

2006-01-26 Thread Matthew T. O'Connor
If you really are just inserting, and never updating or deleting, then you
will never need to vacuum the table, rather you will just need to ANALYSE
the table.  If you use autovacuum that is exactly what it will do.

As for Reindex, I'm not entirely sure; I don't think you would benefit
from reindex because you aren't updating or deleting.  Can anyone comment
on this?  Is it possible that a table with lots of inserts resulting in
lots of page splits etc. could ever benefit from REINDEX?

Matt


> We have 2 tables we expect to grow by up to 50,000 rows per day each
> depending on the customer.  In normal operation we will most likely never
> update or delete from these tables as they are for historical reporting.
> (Eventually we may put a limit on the amount of data and delete older than
> X months or such)
> We intend to create a number of indexes based upon the reporting search
> criteria.
>
> What would the best setup be for VACUUM, ANALYSE, REINDEX?  A lot of the
> info refers to data hanging around from deletes and updates, which in the
> normal course we will not do on these tables.
>
>
> Oisin
>


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [GENERAL] VACUUM Question

2004-06-04 Thread Bill Moran
Rick Gigger <[EMAIL PROTECTED]> wrote:

> >>One more question; on one server the Vacuum Analyze before the insert takes
> >>approx. 2min after that the same command takes 15min.
> > 
> > 
> > You might try a VACUUM FULL sometime when you can deal with 15min of downtime
> > or so. Actually it would probably be longer. Perhaps the table that's taking
> > 15min has a ton of extra dead tuples left over from the fsm settings being too
> > low and/or vacuum being too infrequent.
> 
> Does VACUUM FULL just lock entire tables and thus cause you to 
> essentially have downtime on that database because it doesn't respond 
> quickly or do you actually have to shut down postgres to safely do a 
> vacuum full?

The former.  You don't shut the database server down, but it won't be
responsive while vacuum full is running.
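Concretely, VACUUM FULL takes an exclusive lock on each table it
processes, so concurrent queries against that table queue up until it
finishes.  A sketch (table name hypothetical):

```sql
-- Session 1:
VACUUM FULL history;           -- takes an ACCESS EXCLUSIVE lock

-- Session 2, started while session 1 is still running:
SELECT count(*) FROM history;  -- blocks until the VACUUM FULL finishes
```

A plain VACUUM takes only a lighter lock and can run alongside normal
traffic, which is why it's the one to use during production hours.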

-- 
Bill Moran
Potential Technologies
http://www.potentialtech.com

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [GENERAL] VACUUM Question

2004-06-04 Thread Rick Gigger
>> One more question; on one server the Vacuum Analyze before the insert takes
>> approx. 2min after that the same command takes 15min.

> You might try a VACUUM FULL sometime when you can deal with 15min of downtime
> or so. Actually it would probably be longer. Perhaps the table that's taking
> 15min has a ton of extra dead tuples left over from the fsm settings being too
> low and/or vacuum being too infrequent.

Does VACUUM FULL just lock entire tables and thus cause you to
essentially have downtime on that database because it doesn't respond
quickly, or do you actually have to shut down postgres to safely do a
vacuum full?

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [GENERAL] VACUUM Question

2004-06-04 Thread Joseph Shraibman
Greg Stark wrote:
> Alex <[EMAIL PROTECTED]> writes:
>
> There won't be anything to VACUUM after the insert, but perhaps you still want
> to run ANALYZE. Note that a plain ANALYZE uses a statistical sample which is
> much faster, whereas VACUUM ANALYZE has to look at every record anyways so
> it's slower but produces more accurate statistics.

That is not true.  My nightly vacuum analyze clearly is sampling:

INFO:  "tablename": 22102 pages, 3 rows sampled, 1712934 estimated total rows

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [GENERAL] VACUUM Question

2004-06-04 Thread Greg Stark
Alex <[EMAIL PROTECTED]> writes:

> How reasonable is it to run a Vacuum Analyze before and after the
> insert/update of the data.

On a busy system you should run vacuum more often than once per day.

You should probably run a VACUUM after the update. And running ANALYZE at the
same time isn't a bad idea, especially if the update changes the distribution
a lot.

There won't be anything to VACUUM after the insert, but perhaps you still want
to run ANALYZE. Note that a plain ANALYZE uses a statistical sample which is
much faster, whereas VACUUM ANALYZE has to look at every record anyways so
it's slower but produces more accurate statistics. If you don't have
performance problems then using VACUUM ANALYZE isn't a bad idea, but it's
probably overkill.

> Also, I noticed that i get quite some performance improvement if I run a
> count(*) on the two tables before the insert. Any reasons for that?

Uh, just preloading the kernel cache with blocks from the table and index?

> One more question; on one server the Vacuum Analyze before the insert takes
> approx. 2min after that the same command takes 15min.

You might try a VACUUM FULL sometime when you can deal with 15min of downtime
or so. Actually it would probably be longer. Perhaps the table that's taking
15min has a ton of extra dead tuples left over from the fsm settings being too
low and/or vacuum being too infrequent.

-- 
greg




Re: [GENERAL] VACUUM Question

2004-02-19 Thread scott.marlowe
On Thu, 19 Feb 2004, Alex wrote:

> Hi,
> just a few questions on the Vaccum
> 
> I run a vacuum analyze on the database every night as part of a
> maintenance job.
> 
> During the day I have a job that loads 30-70,000 records into two tables
> (each 30-70k).
> This job runs 2-3 times a day; the first time mainly inserts, the 2nd,
> 3rd time mostly updates.
> Both tables have in the area of 1-3Mio records
> 
> How reasonable is it to run a Vacuum Analyze before and after the
> insert/update of the data.

Running it before probably gains you little.  In some circumstances 
(running analyze on an empty table is one) analyzing before loading data 
is counterproductive, because postgresql's query planner will be making 
decisions on the 30,000th of 70,000 inserts based on a table size of very 
few rows, and favoring seq scans when it should be using index scans.

vacuuming (and analyzing) after the import is a good thing.
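The recommended ordering for that kind of bulk-load job, as a sketch
(table name and file path hypothetical):

```sql
-- Load first, then refresh statistics on the populated table:
COPY report_data FROM '/tmp/report_load.dat';
VACUUM ANALYZE report_data;
```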

> Also, I noticed that i get quite some performance improvement if I run a
> count(*) on the two tables before the insert. Any reasons for that?

Likely it is loading the whole table into kernel cache.

> One more question; on one server the Vacuum Analyze before the insert
> takes approx. 2min after that the same command takes 15min.

Normal.  Beforehand, there are no dead tuples to harvest / put in the 
fsm, but afterward there are plenty to harvest.

Make sure your fsm settings are high enough to retain all the freed pages, 
or you'll wind up with table bloat.
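The relevant knobs live in postgresql.conf; illustrative numbers only —
size them from the page counts that VACUUM VERBOSE reports for your own
tables:

```
# postgresql.conf (8.x free space map settings; values are examples)
max_fsm_pages = 200000     # total pages with free space to remember
max_fsm_relations = 1000   # total relations tracked in the FSM
```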

Vacuum full every so often (off hours are best) to make sure.  Do a df on 
the database mount point before and after and see how much space it 
recovers.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [GENERAL] Vacuum Question

2000-06-06 Thread Tom Lane

Ed Loehr <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> I can't reproduce any problem with just a "vacuum" (with or without
>> analyze) and no following command.
>> 
>> I did, however, notice that very occasionally the inserting process
>> would spit out weird error messages like "Function '(int4)' does not
>> exist" and "init_fcache: null probin for procedure 481".  This seems
>> to be due to VACUUM (on system tables) causing syscache entries to be
>> flushed at unexpected times.  I've committed patches for the two cases
>> I observed, but there may be more lurking...

> Yes, I was getting a similar Function error message, though I never saw
> the init_fcache message.  And it appeared the backend would crash
> coincidentally with the function error message.  If the patch can be
> applied to 7.0 (beta3), and you don't mind posting it, I could test it
> here...

Hmm, I only saw error messages, no crashes --- but I suppose a crash is
possible, since the root of the problem here is a dangling pointer.

Patches for 7.0.2 are attached.  Not sure if they will apply perfectly
cleanly to beta3, but you should be able to make the right mods by hand
if patch doesn't cope...

regards, tom lane

*** src/backend/parser/parse_type.c.origTue May 30 00:24:49 2000
--- src/backend/parser/parse_type.c Tue Jun  6 11:41:08 2000
***
*** 48,54 
return NULL;
}
typetuple = (Form_pg_type) GETSTRUCT(tup);
!   return NameStr(typetuple->typname);
  }
  
  /* return a Type structure, given a type id */
--- 48,55 
return NULL;
}
typetuple = (Form_pg_type) GETSTRUCT(tup);
!   /* pstrdup here because result may need to outlive the syscache entry */
!   return pstrdup(NameStr(typetuple->typname));
  }
  
  /* return a Type structure, given a type id */
***
*** 119,125 
Form_pg_type typ;
  
typ = (Form_pg_type) GETSTRUCT(t);
!   return NameStr(typ->typname);
  }
  
  /* given a type, return its typetype ('c' for 'c'atalog types) */
--- 120,127 
Form_pg_type typ;
  
typ = (Form_pg_type) GETSTRUCT(t);
!   /* pstrdup here because result may need to outlive the syscache entry */
!   return pstrdup(NameStr(typ->typname));
  }
  
  /* given a type, return its typetype ('c' for 'c'atalog types) */

*** src/backend/utils/cache/fcache.c~   Wed Apr 12 13:15:53 2000
--- src/backend/utils/cache/fcache.cTue Jun  6 13:39:03 2000
***
*** 14,19 
--- 14,20 
   */
  #include "postgres.h"
  
+ #include "access/heapam.h"
  #include "catalog/pg_language.h"
  #include "catalog/pg_proc.h"
  #include "catalog/pg_type.h"
***
*** 89,97 
if (!use_syscache)
elog(ERROR, "what the , init the fcache without the catalogs?");
  
!   procedureTuple = SearchSysCacheTuple(PROCOID,
!                                        ObjectIdGetDatum(foid),
!                                        0, 0, 0);
  
if (!HeapTupleIsValid(procedureTuple))
elog(ERROR, "init_fcache: Cache lookup failed for procedure %u",
--- 90,98 
if (!use_syscache)
elog(ERROR, "what the , init the fcache without the catalogs?");
  
!   procedureTuple = SearchSysCacheTupleCopy(PROCOID,
!                                            ObjectIdGetDatum(foid),
!                                            0, 0, 0);
  
if (!HeapTupleIsValid(procedureTuple))
elog(ERROR, "init_fcache: Cache lookup failed for procedure %u",
***
*** 258,263 
--- 259,266 
}
else
retval->func.fn_addr = (func_ptr) NULL;
+ 
+   heap_freetuple(procedureTuple);
  
return retval;
  }



Re: [GENERAL] Vacuum Question

2000-06-06 Thread Ed Loehr

Tom Lane wrote:
> 
> Ed Loehr <[EMAIL PROTECTED]> writes:
> > % while test 1; do echo -n "`date`: "; psql -d pdb -c "vacuum analyze;
> > select count(*) from foo;"; sleep 3; done
> >
> > This seems to consistently crash after the first vacuum with the
> > following message:
> >>
> >> This is a known gotcha that's got nothing to do with any sort of
> >> concurrency.  You can't safely send a VACUUM followed by anything
> >> else in a single query string.
> 
> > Well, I thought that select count(*) might've been causing a problem, so
> > I experimented without it and found the same problem.  Doesn't seem to
> > happen with 'vacuum' or 'vacuum analyze foo', only 'vacuum analyze'...
> 
> I can't reproduce any problem with just a "vacuum" (with or without
> analyze) and no following command.
> 
> I did, however, notice that very occasionally the inserting process
> would spit out weird error messages like "Function '(int4)' does not
> exist" and "init_fcache: null probin for procedure 481".  This seems
> to be due to VACUUM (on system tables) causing syscache entries to be
> flushed at unexpected times.  I've committed patches for the two cases
> I observed, but there may be more lurking...

Yes, I was getting a similar Function error message, though I never saw
the init_fcache message.  And it appeared the backend would crash
coincidentally with the function error message.  If the patch can be
applied to 7.0 (beta3), and you don't mind posting it, I could test it
here...

Regards,
Ed Loehr



Re: [GENERAL] Vacuum Question

2000-06-06 Thread Tom Lane

Ed Loehr <[EMAIL PROTECTED]> writes:
> % while test 1; do echo -n "`date`: "; psql -d pdb -c "vacuum analyze;
> select count(*) from foo;"; sleep 3; done
> 
> This seems to consistently crash after the first vacuum with the
> following message:
>> 
>> This is a known gotcha that's got nothing to do with any sort of
>> concurrency.  You can't safely send a VACUUM followed by anything
>> else in a single query string.  

> Well, I thought that select count(*) might've been causing a problem, so
> I experimented without it and found the same problem.  Doesn't seem to
> happen with 'vacuum' or 'vacuum analyze foo', only 'vacuum analyze'...

I can't reproduce any problem with just a "vacuum" (with or without
analyze) and no following command.

I did, however, notice that very occasionally the inserting process
would spit out weird error messages like "Function '(int4)' does not
exist" and "init_fcache: null probin for procedure 481".  This seems
to be due to VACUUM (on system tables) causing syscache entries to be
flushed at unexpected times.  I've committed patches for the two cases
I observed, but there may be more lurking...

regards, tom lane



Re: [GENERAL] Vacuum Question

2000-06-06 Thread Ed Loehr

Tom Lane wrote:
> 
> Ed Loehr <[EMAIL PROTECTED]> writes:
> >> Then, start this one in another bash window/terminal/whatever...
> >>
> >> % while test 1; do echo -n "`date`: "; psql -d pdb -c "vacuum analyze;
> >> select count(*) from foo;"; sleep 3; done
> >>
> >> This seems to consistently crash after the first vacuum with the
> >> following message:
> 
> This is a known gotcha that's got nothing to do with any sort of
> concurrency.  You can't safely send a VACUUM followed by anything
> else in a single query string.  

Well, I thought that select count(*) might've been causing a problem, so
I experimented without it and found the same problem.  Doesn't seem to
happen with 'vacuum' or 'vacuum analyze foo', only 'vacuum analyze'...

Regards,
Ed Loehr



Re: [GENERAL] Vacuum Question

2000-06-06 Thread Tom Lane

Ed Loehr <[EMAIL PROTECTED]> writes:
>> Then, start this one in another bash window/terminal/whatever...
>> 
>> % while test 1; do echo -n "`date`: "; psql -d pdb -c "vacuum analyze;
>> select count(*) from foo;"; sleep 3; done
>> 
>> This seems to consistently crash after the first vacuum with the
>> following message:

This is a known gotcha that's got nothing to do with any sort of
concurrency.  You can't safely send a VACUUM followed by anything
else in a single query string.  The problem is that VACUUM forces a
transaction commit, which releases all transiently allocated memory
in its backend ... including the already-parsed querytrees for the
rest of the query string.  Oops.  (cf. comment near line 560 in
src/backend/tcop/postgres.c)

You won't see the problem if you enter "vacuum analyze; select ..."
interactively or as a script in psql, because it chops up the
commands into separate query submittals.  But apparently psql
doesn't chop up a -c string.  Non-psql frontends can expose the bug
as well.
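So the workaround is to make sure the VACUUM travels alone.  A sketch of
the loop from above, reworked so each command arrives as its own query
string (requires a running server with the `pdb` database, as in the
original):

```
while true; do
    echo -n "`date`: "
    psql -d pdb -c "vacuum analyze"            # VACUUM alone in its string
    psql -d pdb -c "select count(*) from foo"  # follow-up in a second call
    sleep 3
done
```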

It's possible that this will get cleaned up as a byproduct of the
proposed rework of transaction-local memory contexts.  But it's
not a real high-priority problem, at least not IMHO.  For now,
the answer is "if it hurts, don't do it ;-)"

regards, tom lane