Re: [PERFORM] Slow vacuum performance

2004-06-21 Thread Patrick Hatcher

Thanks!

Patrick Hatcher

Re: [PERFORM] Slow vacuum performance

2004-06-21 Thread Andrew McMillan
On Fri, 2004-06-18 at 19:51 -0700, Patrick Hatcher wrote:
> 
> Thanks!
>  
> My effective_cache_size = 625000
>  
> I thought that having the shared_buffers above 2k or 3k didn't gain
> any performance and may in fact degrade it?

Hi Patrick,


Quoting from:
http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html

shared_buffers
Sets the size of PostgreSQL's memory buffer where queries are
held before being fed into the kernel buffer of the host system.
It's very important to remember that this is only a holding
area, and not the total memory available for the server. As
such, resist the urge to set this number to a large portion of
your RAM, as this will actually degrade performance on many
operating systems. Members of the pgsql-performance mailing list
have found useful values in the range of 1000-6000, depending on
available RAM, database size, and number of concurrent queries.
For servers with very large amounts of available RAM (more than
1 GB), increasing this setting to 6-15% of available RAM has
worked well for some users. The precise best setting is not
fully understood and is more readily determined through testing
than calculation.

As a rule of thumb, observe shared memory usage of PostgreSQL
with tools like ipcs and determine the setting. Remember that
this is only half the story. You also need to set
effective_cache_size so that PostgreSQL will use available
memory optimally.

Using this conservatively, on an 8G system, 6% would be roughly 60,000
pages - considerably higher than the 2000-3000 mentioned above...
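
As a rough worked example (figures purely illustrative; the right values
depend on the workload and on whatever else runs on the box):

  8 GB x 6%               ~= 480 MB
  480 MB / 8 KB per page  ~= 60000 buffer pages

  # hypothetical postgresql.conf lines for such an 8G machine
  shared_buffers       = 60000    # ~480 MB, i.e. ~6% of RAM
  effective_cache_size = 625000   # ~5 GB in 8 KB pages (the value you quote above)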

One day when I wasn't timid (well, OK, I was desperate :-), I did see a
_dramatic_ performance improvement in a single very narrow activity by
setting shared_buffers to 30 on a 4G RAM system (I was rolling back
a transaction involving an update to 2.8 million rows), but afterwards
I set shared_buffers back to 1, which I have now increased to 2
on that system.


You may also want to look at:
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

Or indeed, peruse the articles regularly as they appear:
http://www.varlena.com/varlena/GeneralBits/

Regards,
Andrew McMillan

-
Andrew @ Catalyst .Net .NZ  Ltd,  PO Box 11-053, Manners St,  Wellington
WEB: http://catalyst.net.nz/            PHYS: Level 2, 150-154 Willis St
DDI: +64(4)803-2201  MOB: +64(272)DEBIAN  OFFICE: +64(4)499-2267
   Tomorrow will be cancelled due to lack of interest.
-




Re: [PERFORM] Slow vacuum performance

2004-06-18 Thread Andrew McMillan
On Thu, 2004-06-17 at 13:09 -0700, Patrick Hatcher wrote:
> 
> 
> 
> Pg:  7.4.2
> RedHat 7.3
> Ram: 8gig
> 
> I have a 6 million row table that I vacuum full analyze each night.  The time
> seems to be stretching out further and further as I add more rows.  I read
> the archives and Josh's annotated pg.conf guide that setting the FSM higher
> might help.  Currently, my memory settings are set as such.  Does this seem
> low?
> 
> Last reading from vacuum verbose:
>   INFO:  analyzing "cdm.cdm_ddw_customer"
> INFO:  "cdm_ddw_customer": 209106 pages, 3000 rows sampled, 6041742
> estimated total rows
> >>I think I should now set my max FSM to at least 21 but wanted to make
> sure

Yes, that's my interpretation of those numbers too.  I would set
max_fsm_pages to 30 (or more) in that case.

If you have 8G of RAM in the machine, your shared_buffers setting seems very
low too.  Depending on how the machine is used, I would increase that to at
least the recommended maximum (1 - 80M).

You don't quote your setting for effective_cache_size, but you should
probably look at what "/usr/bin/free" reports as "cached", divide that
by 10, and set it to that as a quick rule of thumb...
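
For instance (numbers purely illustrative): if "free" reported about
6,000,000 kB as "cached", that rule of thumb would give

  effective_cache_size = 600000   # ~ cached kB / 10

("free" reports kilobytes and effective_cache_size is counted in 8 KB
pages, so dividing by 10 rather than by 8 keeps the estimate slightly
conservative.)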

Regards,
Andrew McMillan


> shared_buffers = 2000   # min 16, at least max_connections*2, 8KB each
> sort_mem = 12288        # min 64, size in KB
> 
> # - Free Space Map -
> 
> max_fsm_pages = 10  # min max_fsm_relations*16, 6 bytes each
> #max_fsm_relations = 1000   # min 100, ~50 bytes each
> 

-
Andrew @ Catalyst .Net .NZ  Ltd,  PO Box 11-053, Manners St,  Wellington
WEB: http://catalyst.net.nz/            PHYS: Level 2, 150-154 Willis St
DDI: +64(4)803-2201  MOB: +64(272)DEBIAN  OFFICE: +64(4)499-2267
  Make things as simple as possible, but no simpler -- Einstein
-




Re: [PERFORM] Slow vacuum performance

2004-06-17 Thread Dennis Bjorklund
On Thu, 17 Jun 2004, Patrick Hatcher wrote:

> I have a 6 million row table that I vacuum full analyze each night.  The time
> seems to be stretching out further and further as I add more rows.  I read

You could try to run a normal (non-full) vacuum every hour or so. If you run
a normal vacuum often enough, you probably don't need to run vacuum full at
all.
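
A minimal sketch of such a schedule, assuming the database is called mydb
and the job runs from the postgres user's crontab (name and timing are only
an example):

  # plain (non-full) vacuum plus analyze, once an hour
  0 * * * *   vacuumdb --analyze --quiet mydb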

-- 
/Dennis Björklund




[PERFORM] Slow vacuum performance

2004-06-17 Thread Patrick Hatcher




Pg:  7.4.2
RedHat 7.3
Ram: 8gig

I have a 6 million row table that I vacuum full analyze each night.  The time
seems to be stretching out further and further as I add more rows.  I read in
the archives and in Josh's annotated pg.conf guide that setting the FSM higher
might help.  Currently, my memory settings are as follows.  Do these seem
low?

Last reading from vacuum verbose:
  INFO:  analyzing "cdm.cdm_ddw_customer"
INFO:  "cdm_ddw_customer": 209106 pages, 3000 rows sampled, 6041742
estimated total rows
>>I think I should now set my max FSM to at least 21 but wanted to make
sure

shared_buffers = 2000       # min 16, at least max_connections*2, 8KB each
sort_mem = 12288            # min 64, size in KB

# - Free Space Map -

max_fsm_pages = 10  # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000   # min 100, ~50 bytes each


TIA

Patrick Hatcher
Macys.Com




Re: [PERFORM] slow vacuum performance

2004-03-24 Thread pginfo


scott.marlowe wrote:

> On Wed, 24 Mar 2004, pginfo wrote:
>
> > Hi,
> >
> > scott.marlowe wrote:
> >
> > > On Wed, 24 Mar 2004, pginfo wrote:
> > >
> > > > Hi,
> > > >
> > > > I am running pg 7.4.1 on a Linux box.
> > > > I have a middle-sized DB with many updates, and afterwards I try to run
> > > > vacuum full analyze.
> > >
> > > Is there a reason to not use just regular vacuum / analyze (i.e. NOT
> > > full)?
> > >
> >
> > Yes: when I make massive updates (only in my case, of course), for example
> > 2 M rows, I do not expect to have 2M new rows in the next 180 days.  That is
> > the reason for running vacuum full.
> > My idea was to free unneeded space and so to have a faster system.
> > It is possible that I am wrong.
>
> It's all about percentages.  If you've got an average of 5% dead tuples
> with regular vacuuming, then full vacuums won't gain you much, if
> anything.  If you've got 20 dead tuples for each live one, then a full
> vacuum is pretty much a necessity.  The generally accepted best
> performance comes with 5 to 50% or so dead tuples.  Keep in mind, having a
> few dead tuples is actually a good thing, as your database won't grow and
> then shrink the file all the time, but will keep a steady size.

Thanks for the good analysis.
ivan.





Re: [PERFORM] slow vacuum performance

2004-03-24 Thread scott.marlowe
On Wed, 24 Mar 2004, pginfo wrote:

> Hi,
> 
> scott.marlowe wrote:
> 
> > On Wed, 24 Mar 2004, pginfo wrote:
> >
> > > Hi,
> > >
> > > I am running pg 7.4.1 on a Linux box.
> > > I have a middle-sized DB with many updates, and afterwards I try to run
> > > vacuum full analyze.
> >
> > Is there a reason to not use just regular vacuum / analyze (i.e. NOT
> > full)?
> >
> 
> Yes: when I make massive updates (only in my case, of course), for example
> 2 M rows, I do not expect to have 2M new rows in the next 180 days.  That is
> the reason for running vacuum full.
> My idea was to free unneeded space and so to have a faster system.
> It is possible that I am wrong.

It's all about percentages.  If you've got an average of 5% dead tuples 
with regular vacuuming, then full vacuums won't gain you much, if 
anything.  If you've got 20 dead tuples for each live one, then a full 
vacuum is pretty much a necessity.  The generally accepted best 
performance comes with 5 to 50% or so dead tuples.  Keep in mind, having a 
few dead tuples is actually a good thing, as your database won't grow then 
srhink the file all the time, but keep it in a steady state size wise.
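
To put rough, purely illustrative numbers on that: for a table with, say,
1 million live rows,

  5% dead tuples      ~=     50,000 dead rows   (regular vacuum copes fine)
  20 dead per live    ~= 20,000,000 dead rows   (vacuum full territory)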






Re: [PERFORM] slow vacuum performance

2004-03-24 Thread pginfo
Hi,

scott.marlowe wrote:

> On Wed, 24 Mar 2004, pginfo wrote:
>
> > Hi,
> >
> > I am running pg 7.4.1 on a Linux box.
> > I have a middle-sized DB with many updates, and afterwards I try to run
> > vacuum full analyze.
>
> Is there a reason to not use just regular vacuum / analyze (i.e. NOT
> full)?
>

Yes: when I make massive updates (only in my case, of course), for example
2 M rows, I do not expect to have 2M new rows in the next 180 days.  That is
the reason for running vacuum full.
My idea was to free unneeded space and so to have a faster system.
It is possible that I am wrong.


> > It takes about 2 h.
>
> Full vacuums, by their nature, tend to be a bit slow.  It's better to let
> the database achieve a kind of "steady state" with regards to number of
> dead tuples, and use regular vacuums to reclaim said space rather than a
> full vacuum.
>
> > How can I improve the vacuum full analyze time?
> >
> > My configuration:
> >
> > shared_buffers = 15000  # min 16, at least max_connections*2, 8KB each
> > sort_mem = 1            # min 64, size in KB
>
> You might want to look at dropping sort_mem.  It would appear you've been
> going through the postgresql.conf file and bumping up numbers to see what
> works and what doesn't.  While most of the settings aren't too dangerous
> to crank up a little high, sort_mem is quite dangerous to crank up high,
> should you have a lot of people connected who are all sorting.  Note that
> sort_mem is a limit PER SORT, not per backend, or per database, or per
> user, or even per table, but per sort.  If a query needs to run three or
> four sorts, it can use 3 or 4x sort_mem.  If a hundred users do this at
> once, they can then use 300 or 400x sort_mem.  You can see where I'm
> heading.
>
> Note that for individual sorts in batch files, like import processes, you
> can bump up sort_mem with the set command, so you don't have to have a
> large setting in postgresql.conf to use a lot of sort mem when you need
> to, you can just grab it during that one session.
>

I know. In my case we are using many IDs declared as varchar/name (I know it
is a bad idea, but we are migrating this system from Oracle) and pg has very
bad performance with varchar/name indexes.
The only solution I found was to increase the sort mem.
But I will try to decrease this one and see the result.

> > vacuum_mem = 32000  # min 1024, size in KB
>
> If you've got lots of memory, crank up vacuum_mem to the 200 to 500 meg
> range and see what happens.
>

I will try it today. It is a good idea and I hope it will help.

> For a good tuning guide, go here:
>
> http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

  I know it. It is the best guide I have found, and the site is good too.

Thanks for the help.
ivan.




Re: [PERFORM] slow vacuum performance

2004-03-24 Thread scott.marlowe
On Wed, 24 Mar 2004, pginfo wrote:

> Hi,
> 
> I am running pg 7.4.1 on a Linux box.
> I have a middle-sized DB with many updates, and afterwards I try to run
> vacuum full analyze.

Is there a reason to not use just regular vacuum / analyze (i.e. NOT 
full)?  

> It takes about 2 h.

Full vacuums, by their nature, tend to be a bit slow.  It's better to let 
the database achieve a kind of "steady state" with regards to number of 
dead tuples, and use regular vacuums to reclaim said space rather than a 
full vacuum.

> How can I improve the vacuum full analyze time?
> 
> My configuration:
> 
> shared_buffers = 15000  # min 16, at least max_connections*2, 8KB each
> sort_mem = 1            # min 64, size in KB

You might want to look at dropping sort_mem.  It would appear you've been 
going through the postgresql.conf file and bumping up numbers to see what 
works and what doesn't.  While most of the settings aren't too dangerous 
to crank up a little high, sort_mem is quite dangerous to crank up high, 
should you have a lot of people connected who are all sorting.  Note that 
sort_mem is a limit PER SORT, not per backend, or per database, or per 
user, or even per table, but per sort.  If a query needs to run three or 
four sorts, it can use 3 or 4x sort_mem.  If a hundred users do this at 
once, they can then use 300 or 400x sort_mem.  You can see where I'm 
heading.

Note that for individual sorts in batch files, like import processes, you 
can bump up sort_mem with the set command, so you don't have to have a 
large setting in postgresql.conf to use a lot of sort mem when you need 
to, you can just grab it during that one session.
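
A sketch of what that looks like in practice (the 131072 figure is only an
example, not a recommendation):

  -- inside the one batch/import session only
  SET sort_mem = 131072;   -- 128 MB, in kB; applies per sort in this session
  -- ... run the big import / index build / ORDER BY here ...
  RESET sort_mem;          -- drop back to the postgresql.conf value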

> vacuum_mem = 32000  # min 1024, size in KB

If you've got lots of memory, crank up vacuum_mem to the 200 to 500 meg 
range and see what happens.
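
For example, one possible value in that range (vacuum_mem is in kB, per the
comment in postgresql.conf):

  vacuum_mem = 262144      # 256 MB; only used while VACUUM runs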

For a good tuning guide, go here:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html




Re: [PERFORM] slow vacuum performance

2004-03-24 Thread pginfo
Hi Bill,
I am vacuuming every 24 h.
I have a cron script for it.
But if I make a massive update (for example one that affects 1 M rows) and then
start vacuum, it takes the 2 h I mentioned.
Also I will note that this massive update runs in one transaction (I cannot
update 100K rows and then start a vacuum).

regards,
ivan.

Bill Moran wrote:

> pginfo wrote:
> > Hi,
> >
> > I am running pg 7.4.1 on a Linux box.
> > I have a middle-sized DB with many updates, and afterwards I try to run
> > vacuum full analyze.
> > It takes about 2 h.
> > If I dump and reload the DB it takes 20 min.
> >
> > How can I improve the vacuum full analyze time?
>
> How often are you vacuuming?  If you've gone a LONG time since the last vacuum,
> it can take quite a while, to the point where a dump/restore is faster.
>
> A recent realization that I've had some misconceptions about vacuuming led me
> to re-read section 8.2 of the admin guide (on vacuuming) ... I highly suggest
> a review of these 3 pages of the admin manual, as it contains an excellent
> description of why databases need to be vacuumed, which one can use to
> determine how often vacuuming is necessary.
>
> --
> Bill Moran
> Potential Technologies
> http://www.potentialtech.com






Re: [PERFORM] slow vacuum performance

2004-03-24 Thread Bill Moran
pginfo wrote:
> Hi,
>
> I am running pg 7.4.1 on a Linux box.
> I have a middle-sized DB with many updates, and afterwards I try to run
> vacuum full analyze.
> It takes about 2 h.
> If I dump and reload the DB it takes 20 min.
> How can I improve the vacuum full analyze time?
How often are you vacuuming?  If you've gone a LONG time since the last vacuum,
it can take quite a while, to the point where a dump/restore is faster.
A recent realization that I've had some misconceptions about vacuuming led me
to re-read section 8.2 of the admin guide (on vacuuming) ... I highly suggest
a review of these 3 pages of the admin manual, as it contains an excellent
description of why databases need to be vacuumed, which one can use to
determine how often vacuuming is necessary.
--
Bill Moran
Potential Technologies
http://www.potentialtech.com


[PERFORM] slow vacuum performance

2004-03-24 Thread pginfo
Hi,

I am running pg 7.4.1 on a Linux box.
I have a middle-sized DB with many updates, and afterwards I try to run
vacuum full analyze.
It takes about 2 h.
If I dump and reload the DB it takes 20 min.

How can I improve the vacuum full analyze time?

My configuration:

shared_buffers = 15000      # min 16, at least max_connections*2, 8KB each
sort_mem = 1                # min 64, size in KB
vacuum_mem = 32000          # min 1024, size in KB
effective_cache_size = 4    # typically 8KB each
#max_fsm_pages = 2          # min max_fsm_relations*16, 6 bytes each

#max_fsm_relations = 1000   # min 100, ~50 bytes each


regards,
ivan.

