Re: [PERFORM] General performance questions about postgres on Apple

2004-06-05 Thread scott.marlowe
On Fri, 20 Feb 2004, Sean Shanny wrote:

 max_connections = 100
 
 # - Memory -
  
 shared_buffers = 16000  # min 16, at least max_connections*2, 
 8KB each
 sort_mem = 256000   # min 64, size in KB

You might wanna drop sort_mem somewhat and just set it during your imports 
to something big like 512000 or larger.  That way with 100 users during 
the day you won't have to worry about swap storms, and when you run your 
updates, you get all that sort_mem.
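
For instance, in the import session only (the numbers here are purely illustrative):

  set sort_mem = 512000;   -- KB; applies only to this session
  -- ... run the big import / update here ...
  set sort_mem = default;  -- drop back to the postgresql.conf value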

 Actual row count in the temp table:
 
 select count(*) from referral_temp ;
   502347
 
 Actual row count in d_referral table:
 
 select count(*) from d_referral ;
   27908024
  
 
 Note: an analyze had not been performed on the referral_temp table 
 prior to the explain analyze run.
 
 explain analyze  SELECT t1.id, t2.md5, t2.url from referral_temp t2 LEFT 
 OUTER JOIN d_referral t1 ON t2.md5 = t1.referral_md5
 
 Nested Loop Left Join  (cost=0.00..3046.00 rows=1001 width=68) (actual time=136.513..6440616.541 rows=502347 loops=1)
   ->  Seq Scan on referral_temp t2  (cost=0.00..20.00 rows=1000 width=64) (actual time=21.730..10552.421 rows=502347 loops=1)
   ->  Index Scan using d_referral_referral_md5_key on d_referral t1  (cost=0.00..3.01 rows=1 width=40) (actual time=12.768..14.022 rows=1 loops=502347)
         Index Cond: ("outer".md5 = t1.referral_md5)
 Total runtime: 6441969.698 ms
 (5 rows)
 
 Thanks.
 
 --sean
 
 
 Here is an explain analyze after the analyze was done.  Unfortunately I 
 think a lot of the data was still in cache when I did this again :-(
 
 explain analyze  SELECT t1.id, t2.md5, t2.url from referral_temp t2 LEFT 
 OUTER JOIN d_referral t1 ON t2.md5 = t1.referral_md5;
 
 Nested Loop Left Join  (cost=0.00..1468759.69 rows=480082 width=149) (actual time=69.576..3226854.850 rows=502347 loops=1)
   ->  Seq Scan on referral_temp t2  (cost=0.00..16034.81 rows=480081 width=145) (actual time=11.206..4003.521 rows=502347 loops=1)
   ->  Index Scan using d_referral_referral_md5_key on d_referral t1  (cost=0.00..3.01 rows=1 width=40) (actual time=6.396..6.402 rows=1 loops=502347)
         Index Cond: ("outer".md5 = t1.referral_md5)
 Total runtime: 3227830.752 ms

Hmmm.  It looks like postgresql is still picking a nested loop when it 
should be sorting something faster.  Try doing a set enable_nestloop = 
off and see what you get.

If that makes it faster, you may want to adjust the costs of the cpu_* 
stuff higher to see if that can force it to do the right thing.
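
Something along these lines, for instance (the cost numbers are only examples; 
the defaults are 0.01 and 0.001):

  set enable_nestloop = off;
  explain analyze
    select t1.id, t2.md5, t2.url
      from referral_temp t2
      left outer join d_referral t1 on t2.md5 = t1.referral_md5;
  -- if that plan wins, turn nestloop back on and nudge the cpu costs instead:
  set enable_nestloop = on;
  set cpu_tuple_cost = 0.05;
  set cpu_index_tuple_cost = 0.005;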

Looking at the amount of time taken by the nested loop, it looks like the 
problem to me.

And why are you doing a left join of ONE row from one table against the 
whole temp table?  Do you really need to do that?  Since there's only one 
row in the source table, and I'd guess it only matches one or a few rows 
from the temp table, this means you're gonna have that one row and a bunch 
of null-filled rows to go with it.




Re: [PERFORM] Disappointing performance in db migrated from MS SQL

2004-06-03 Thread scott.marlowe
On Fri, 13 Feb 2004 [EMAIL PROTECTED] wrote:

  Josh, the disks in the new system should be substantially faster than
  the old. Both are Ultra160 SCSI RAID 5 arrays, but the new system has
  15k RPM disks, as opposed to the 10k RPM disks in the old system.
 
 Spindle speed does not correlate with 'throughput' in any easy way.  What
 controllers are you using for these disks?

This is doubly so with a good RAID card with battery backed cache.  

I'd bet that 10k rpm drives on a cached array card will beat an otherwise 
equal setup with 15k rpm disks and no cache.  I know that losing the cache 
(i.e. setting it to write-through instead of write-back) slows my system 
down to a crawl, comparatively speaking.




Re: [PERFORM] Tables on multiple disk drives

2004-06-03 Thread scott.marlowe
On Tue, 17 Feb 2004, Craig Thomas wrote:

  On Tue, 17 Feb 2004, Konstantin Tokar wrote:
 
  Hi!
  Does PostgreSQL allow tables and indices of a single database to be
  created on multiple disk drives in order to increase performance, as
  Oracle does?  If a symbolic link is the only method, then the next
  question is: how can it be determined which file corresponds to which
  table and index?
 
  Your life will be simpler, and your setup will be faster, without
  having to muck about with it, if you just buy a good RAID controller
  with battery backed cache.  LSI/Megaraid and Adaptec both make
  serviceable controllers for reasonable prices, and as you add drives,
  the speed just goes up, no muddling around with sym links.
 
 This works to a limited extent.  For very large databases, maximum
 throughput of I/O is the paramount factor for database performance.  With
 raid controllers, your LUN is still limited to a small number of disks.
 PostgreSQL can only write on a file system, but Oracle, SAP DB, DB2, etc
 can write directly to disk (raw I/O).  With large databases it is
 advantageous to spread a table across 100's of disks, if the table is
 quite large.  I don't know of any manufacturer that creates a 100 disk
 raid array yet.

You can run up to four LSI / Megaraids in one box, each with 3 UW SCSI 
interfaces, and they act as one unit.  That's 3*4*15 = 180 disks max.

With FC AL connections and four cards, it would be possible to approach 
1000 drives.  

Of course, I'm not sure how fast any RAID card setup is gonna be with that 
many drives, but ya never know.  My guess is that before you go there you 
buy a big external RAID box built for speed.  We have a couple of 200+ 
drive external RAID5 storage boxes at work that are quite impressive.

 Some of the problem can be addressed by using a volume manager (such as
 LVM in Linux, or Veritas on Unix-like systems).  This allows one to
 create a volume using partitions from many disks.  One can then create
 a file system and mount it on the volume.

Pretty much RAID arrays in software, which means no battery backed cache, 
which means it'll be fast at reading, but probably pretty slow at writes, 
especially if there's a lot of parallel access waiting to write to the 
database.

 However, to get the best performance, Raw I/O capability is the best
 way to go.

Unsupported statement made as fact.  I'm not saying it can't be or isn't
true, but my experience has been that large RAID5 arrays are a great 
compromise between maximum performance and reliability, giving a good 
measure of each.  It doesn't take 100 drives to do well; even a dozen to 
two dozen will get you in the same basic range as splitting out files by 
hand with sym links, without all the headache of chasing down the files, 
shutting down the database, linking it over, etc...





Re: [ADMIN] [PERFORM] Quad processor options

2004-05-12 Thread scott.marlowe
On Wed, 12 May 2004, Grega Bremec wrote:

 ...and on Tue, May 11, 2004 at 03:02:24PM -0600, scott.marlowe used the keyboard:
  
  If you get the LSI megaraid, make sure you're running the latest megaraid 
  2 driver, not the older, slower 1.18 series.  If you are running linux, 
  look for the dkms packaged version.  dkms, (Dynamic Kernel Module System) 
  automagically compiles and installs source rpms for drivers when you 
  install them, and configures the machine to use them to boot up.  Most 
  drivers seem to be slowly headed that way in the linux universe, and I 
  really like the simplicity and power of dkms.
  
 
 Hi,
 
 Given the fact LSI MegaRAID seems to be a popular solution around here, and
 many of you folx use Linux as well, I thought sharing this piece of info
 might be of use.
 
 Running v2 megaraid driver on a 2.4 kernel is actually not a good idea _at_
 _all_, as it will silently corrupt your data in the event of a disk failure.
 
 Sorry to have to say so, but we tested it (on kernels up to 2.4.25, not sure
 about 2.4.26 yet) and it turns out it doesn't do hotswap the way it should.
 
 Somehow the replaced disk drives are not _really_ added to the array, which
 continues to work in degraded mode for a while and (even worse than that)
 then starts to think the replaced disk is in order without actually having
 resynced it, thus beginning to issue writes to non-existent areas of it.
 
 The 2.6 megaraid driver indeed seems to be a merged version of the above
 driver and the old one, giving both improved performance and correct
 functionality in the event of a hotswap taking place.

This doesn't make any sense to me, since the hot swapping is handled by 
the card autonomously.  I also tested it with a hot spare and pulled one 
drive and it worked fine during our acceptance testing.

However, I've got a hot spare machine I can test on, so I'll try it again 
and see if I can make it fail.

When you tested it, was the problem present in several RAID configurations, 
or only one type?  I'm curious to try to reproduce this problem, since I've 
never heard of it before.

Also, what firmware version were those megaraid cards?  Ours is fairly 
new, as we got it at the beginning of this year, and I'm wondering if this 
is a firmware issue.




Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints

2004-05-11 Thread scott.marlowe
On Tue, 11 May 2004, Paul Tuckfield wrote:

 If you are having a write storm or bursty writes that's burying 
 performance, a scsi raid controller with writeback cache will greatly 
 improve the situation, but I do believe they run around $1-2k.   If 
 it's a write-specific problem, the cache matters more than the striping, 
 except to say that write-specific perf problems should avoid raid5

Actually, a single channel MegaRAID 320-1 (single channel Ultra 320) is 
only $421 at http://www.siliconmechanics.com/c248/u320-scsi.php  It works 
pretty well for me, having had 6 months of a production server on one with 
zero hiccups and very good performance.  They have a dual channel intel 
card for only $503, but I'm not at all familiar with that card.

The top of the line megaraid is the 320-4, which is only $1240, which 
ain't bad for a four channel RAID controller.

Battery backed cache is an addon, but I think it's only about $80 or so.




Re: [PERFORM] Configuring PostgreSQL to minimize impact of checkpoints

2004-05-11 Thread scott.marlowe
On Tue, 11 May 2004, Paul Tuckfield wrote:

 Love that froogle.
 
 It looks like a nice card. One thing I didn't get straight is if 
 the cache is writethru or write back.
 
 If the original poster's problem is truly a burst-write problem (and not 
 linux caching or virtual memory overcommitment) then writeback is key.

The MegaRAID can be configured either way.  It defaults to write-back if 
the battery backed cache is present, I believe.




Re: [PERFORM] Quad processor options

2004-05-11 Thread scott.marlowe
On Tue, 11 May 2004, Allan Wind wrote:

 On 2004-05-11T15:29:46-0600, scott.marlowe wrote:
  The other nice thing about the LSI cards is that you can install more 
  than one and they act like one big RAID array.  I.e. install two cards 
  each with a 20 drive RAID0, then make a RAID1 across them; if one or the 
  other card itself fails, you've still got 100% of your data sitting there.  
  Nice to know you can survive the complete failure of one half of your chain.
 
 ... unless that dying controller corrupted your file system.  Depending
 on your tolerance for risk, you may not want to operate for long with a
 file system in an unknown state.

It would have to be the primary controller for that to happen.  The way 
the LSI's work is that you disable the BIOS on the 2nd through 4th cards, 
and the first card, with the active BIOS, acts as the primary controller.

In this case, that means the main card is doing the RAID1 work, then 
handing off the data to the subordinate cards.

The subordinate cards do all their own RAID0 work.

mobo ---+---controller 1---array1 of disks in RAID0
        |---controller 2---array2 of disks in RAID0

and whichever controller fails just kind of disappears.

Note that if it is the master controller that fails, then you'll have to 
shut down and enable the BIOS on one of the secondary (now primary) controllers.

So while it's possible for the master card failing to corrupt the RAID1 
set, it's still a more reliable system than with just one card.

But nothing is 100% reliable, sadly.

 Btw, the Intel and LSI Logic RAID controller cards have suspiciously
 similar specifications, so I wouldn't be surprised if one is an OEM of the other.

Hmmm.  I'll take a closer look.




Re: [PERFORM] Quad processor options

2004-05-11 Thread scott.marlowe
On Tue, 11 May 2004, Bjoern Metzdorf wrote:

 scott.marlowe wrote:
 
  Well, from what I've read elsewhere on the internet, it would seem the 
  Opterons scale better to 4 CPUs than the basic Xeons do.  Of course, the 
  exception to this is SGI's altix, which uses their own chipset and runs 
  the itanium with very good memory bandwidth.
 
 This is basically what I read too.  But I cannot spend money on a quad 
 opteron just for testing purposes :)

Wouldn't it be nice to just have a lab full of these things?

  If your I/O is saturated, then the answer may well be a better RAID 
  array, with many more drives plugged into it.  Do you have any spare 
  drives you can toss on the machine to see if that helps?  Sometimes going 
  from 4 drives in a RAID 1+0 to 6 or 8 or more can give a big boost in 
  performance.
 
 Next drives I'll buy will certainly be 15k scsi drives.

Better to buy more 10k drives than fewer 15k drives.  Other than slightly 
faster seek times, the 15ks aren't really any faster.

  In short, don't expect 4 CPUs to solve the problem if the problem isn't 
  really the CPUs being maxed out.
  
  Also, what type of load are you running?  Mostly read, mostly written, few 
  connections handling lots of data, lots of connections each handling a 
  little data, lots of transactions, etc...
 
 In peak times we can get up to 700-800 connections at the same time. 
 There are quite a few updates involved; without having exact numbers, I'd 
 think we have about 70% selects and 30% updates/inserts.

Wow, a lot of writes then.

  If you are doing lots of writing, make SURE you have a controller that 
  supports battery backed cache and is configured to write-back, not 
  write-through.
 
 Could you recommend a certain controller type? The only battery backed 
 one that I found on the net is the newest model from icp-vortex.com.

Sure, adaptec makes one, and so does lsi (megaraid).  Dell resells both of 
these; the PERC3/DI and the PERC3/DC are adaptec and lsi, in that order, I 
believe.  We run the lsi megaraid with 64 megs of battery backed cache.

Intel also makes one, but I've heard nothing about it.

If you get the LSI megaraid, make sure you're running the latest megaraid 
2 driver, not the older, slower 1.18 series.  If you are running linux, 
look for the dkms packaged version.  dkms, (Dynamic Kernel Module System) 
automagically compiles and installs source rpms for drivers when you 
install them, and configures the machine to use them to boot up.  Most 
drivers seem to be slowly headed that way in the linux universe, and I 
really like the simplicity and power of dkms.

I haven't directly tested anything but the adaptec and the lsi megaraid.  
Here at work we've had massive issues trying to get the adaptec cards 
configured and installed, while the megaraid was a snap.  Installed RH, 
installed the dkms rpm, installed the dkms-enabled megaraid driver and 
rebooted.  Literally, that's all it took.




Re: [PERFORM] Quad processor options

2004-05-11 Thread scott.marlowe
On Tue, 11 May 2004, Bjoern Metzdorf wrote:

 scott.marlowe wrote:
  Sure, adaptec makes one, so does lsi megaraid.  Dell resells both of 
  these, the PERC3DI and the PERC3DC are adaptec, then lsi in that order, I 
  believe.  We run the lsi megaraid with 64 megs battery backed cache.
 
 The LSI sounds good.
 
  Intel also makes one, but I've heard nothing about it.
 
 It could well be the ICP Vortex one; ICP was bought by Intel some time ago.

Also, there are bigger, faster external RAID boxes as well, ones that make 
the internal cards seem puny.  They're nice because all you need in your 
main box is a good U320 controller to plug into the external RAID array.

That URL I mentioned earlier that had prices has some of the external 
boxes listed.  No price, not for sale on the web, get out the checkbook 
and write a blank check is my guess.  I.e. they're not cheap.

The other nice thing about the LSI cards is that you can install more than 
one and they act like one big RAID array.  I.e. install two cards each with 
a 20 drive RAID0, then make a RAID1 across them; if one or the other card 
itself fails, you've still got 100% of your data sitting there.  Nice to 
know you can survive the complete failure of one half of your chain.

  I haven't directly tested anything but the adaptec and the lsi megaraid.  
  Here at work we've had massive issues trying to get the adaptec cards 
  configured and installed on, while the megaraid was a snap.  Installed RH, 
  installed the dkms rpm, installed the dkms enabled megaraid driver and 
  rebooted.  Literally, that's all it took.
 
 I didn't hear anything about dkms for debian, so I will be hand-patching 
 as usual :)

Yeah, it seems to be an RPM kind of thing.  But I'm thinking the 2.0 
drivers got included in the latest 2.6 kernels, so no biggie.  I was 
looking around on google, and it definitely appears the 2.x and 1.x 
megaraid drivers were merged into a unified driver in the 2.6 kernel.




Re: [PERFORM] Why queries takes too much time to execute?

2004-05-10 Thread scott.marlowe
On Mon, 10 May 2004, Anderson Boechat Lopes wrote:

 Hmm... now I think I'm beginning to understand.
 
 The recommendation is to run a plain vacuum analyze at least every day, after
 adding or deleting a large number of records, and not a vacuum full analyze.
 I've performed a vacuum full analyze every day, and after some time I
 noticed the database was corrupted.  I couldn't select anything any more.

Hold it right there, full stop.

If you've got corruption you've either found a rare corner case in 
postgresql (unlikely, corruption is not usually a big problem for 
postgresql) OR you have bad hardware.  Test your RAM, CPUs, and hard 
drives before going any further.   Data corruption, 99% of the time, is 
not the fault of postgresql but the fault of the hardware.





Re: [PERFORM] Help how to tune-up my Database

2004-05-10 Thread scott.marlowe
Sorry about that, I meant kbytes, not megs.  My point being that it's NOT 
measured in 8k blocks like a lot of other settings.  Sorry for the mixup.

On Fri, 7 May 2004, Bricklen wrote:

 scott.marlowe wrote:
  sort_mem might do with a small bump, especially if you're only handling a 
  few connections at a time.  Be careful, it's per sort, and measured in 
  megs, so it's easy for folks to set it too high and make their machine 
  start flushing too much kernel cache, which will slow down the other 
  backends that have to go to disk for data.
 snip
 http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html
 (under Memory), it says that sort_mem is set in KB. Is this document 
 wrong (or outdated)?
 
 




Re: [PERFORM] [OFF-TOPIC] - Known maximum size of the PostgreSQL

2004-05-05 Thread scott.marlowe
On Sat, 5 Jun 2004, Carlos Eduardo Smanioto wrote:

 Hello all,
 
 What's the biggest PostgreSQL database (in total size and number of
 records) that you know of?

http://www.postgresql.org/docs/faqs/FAQ.html#4.5




Re: [PERFORM] cache table

2004-05-04 Thread scott.marlowe
On Mon, 3 May 2004, Joseph Shraibman wrote:

 I have a big table with some int fields.  I frequently need to do 
 queries like:
 
 SELECT if2, count(*) FROM table WHERE if1 = 20 GROUP BY if2;
 
 The problem is that this is slow and frequently requires a seqscan. I'd 
 like to cache the results in a second table and update the counts with 
 triggers, but this would a) require another UPDATE for each 
 INSERT/UPDATE which would slow down adding and updating of data and b) 
 produce a large amount of dead rows for vacuum to clear out.
 
 It would also be nice if this small table could be locked into the pg 
 cache somehow.  It doesn't need to store the data on disk because the 
 counts can be generated from scratch?

I think you might be interested in materialized views.  You could create 
this as a materialized view which should be very fast to just select * 
from.

While materialized views aren't a standard part of PostgreSQL just yet, 
there is a working implementation available from Jonathan Gardner at:

http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html

It's all implemented with plpgsql and is quite interesting to read 
through.  It has a nice tutorial methodology to it.
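
If a full materialized view setup is more than you need, a plain summary 
table rebuilt on a schedule is a rough sketch of the same idea (bigtable 
here stands in for your real table; if1/if2 are the columns from your example):

  -- build the summary once
  create table if1_if2_counts as
    select if1, if2, count(*) as cnt from bigtable group by if1, if2;
  create index if1_if2_counts_if1 on if1_if2_counts (if1);

  -- refresh it periodically, e.g. from cron
  begin;
  delete from if1_if2_counts;
  insert into if1_if2_counts
    select if1, if2, count(*) from bigtable group by if1, if2;
  commit;

  -- the original query then becomes a small index lookup:
  select if2, cnt from if1_if2_counts where if1 = 20;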




Re: [PERFORM] index usage

2004-04-28 Thread scott.marlowe
On Mon, 26 Apr 2004, Stephan Szabo wrote:

 
 On Fri, 23 Apr 2004 [EMAIL PROTECTED] wrote:
 
  I have a query which I think should be using an index all of the time but
  postgres only uses the index part of the time.  The index
  (ticket_crm_map_crm_id_suppid) has the where clause column (crm_id) listed
  first followed by the selected column (support_person_id).  Wouldn't the
  most efficient plan be to scan the index each time because the only columns
  needed are in the index?  Below is the table, 2 queries showing the
 
 Not necessarily.  The rows in the actual file still need to be checked to
 see if they're visible to the select, and if it's expected that the entire
 file (or a reasonable % of the pages, anyway) will need to be loaded, then
 using the index isn't necessarily a win.

While those of us familiar with PostgreSQL are well aware that indexes 
can't be used directly to garner information, but only as a lookup to a 
tuple in the table, it seems the opposite assumption is quite common among 
those coming to PostgreSQL from other databases.

Is there any information that directly reflects this issue in the docs?  
There are tons of hints that it works this way in how they're written, but 
nothing that just comes out and says that with pgsql's MVCC 
implementation, an index scan still has to hit the pages that contain the 
tuples, so a seq scan is often a win in pgsql where an index scan would 
have been a win in other databases.

If not, where would I add it if I were going to write something up for the 
docs?  Just wondering...
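
For what it's worth, the kind of example I'd want in there looks like the 
original poster's case (a sketch; I'm assuming the underlying table is 
called ticket_crm_map, and the crm_id value is made up):

  -- every selected column is present in ticket_crm_map_crm_id_suppid, yet...
  explain analyze
    select support_person_id from ticket_crm_map where crm_id = 7;
  -- ...each index hit still costs a visit to the heap page for the visibility
  -- check, so when a large fraction of the table matches, a seq scan wins.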




Re: [PERFORM] [ADMIN] 7.4.2 out of memory

2004-04-28 Thread scott.marlowe
On Wed, 28 Apr 2004, Jie Liang wrote:

 All,
 After I upgraded postgres from 7.3.4 to 7.4.2, one of my programs got the following error:
 ERROR:  out of memory
 DETAIL:  Failed on request of size 92.
 
 any idea??
 does memory management differ much between 7.3.4 and 7.4.2???
 this program uses a chunk of shared memory and a lot of temp tables.

More than likely this is a hash aggregate problem (or can they spill to 
disk in 7.4.2 yet?  I don't think they can, but maybe we should ask Tom).

Try setting this before running the query and see what happens:

set enable_hashagg = false;






Re: [PERFORM] OT: Help with performance problems

2004-04-27 Thread scott.marlowe
On Mon, 26 Apr 2004, Rob Fielding wrote:

 scott.marlowe wrote:
  On Fri, 23 Apr 2004, Chris Hoover wrote:
  
  
 DB's on Powervaults 220S using raid 5 (over 6 disks)
  
  
  What controller is this, the adaptec?  We've found it to be slower than 
  the LSI megaraid based controller, but YMMV.
 
 Wow, really? You got any more details of the chipset, mobo and kernel 
 driver ?

We're running on a Dell 2650; the controller is the U320 LSI megaraid 2 
channel (they only make the one that I know of right now).  Don't know my 
mobo chipset offhand, but I might be able to find out which one dell includes 
on the 2650.  The kernel driver is the latest megaraid2 driver as of about 
Feb this year.

 I've been taken to my wit's end wrestling with an LSI MegaRAID 320-1 
 controller on a supermicro board all weekend.  I just couldn't get 
 anything more than 10MB/sec out of it with megaraid driver v1 OR v2 in 
 Linux 2.4.26, nor the version in 2.6.6-rc2.  After 2 days of humming the 
 Adaptec mantra I gave in and switched the array straight onto the 
 onboard Adaptec 160 controller (same cable and everything).  Software 
 RAID 5 gets me over 40MB/sec for a nominal cpu hit - more than 4 times 
 what I could get out of the MegaRAID controller :(  Even the 2nd SCSI-2 
 channel gets 40MB/sec max (pg_xlog :)
 
 And HOW LONG does it take to detect drives during POST... never 
 mind ... I really just wanna rant :)  There should be a free counseling 
 service for enraged sysops.

I wonder if your controller is broken or something?  Or maybe it's in a PCI 
slot that has to share IRQs or something.  I've had great luck with 
SuperMicro mobos in the past (we're talking dual PPro 200 mobos, so 
seriously, IN THE PAST here...)  Hell, my Dual PPro 200 with an old 
MegaRAID 428 got an 18 megs a second transfer rate, no problem.

Have you tried that lsi card in another machine / mobo combo?  Can you 
disable the onboard adaptec?  On our Dell 2650s, the only active 
controllers are the onboard IDE and the add-in LSI 320-2 controller.

We're running ours with 128 megs of cache (I think; it could be 64) set to 
write back.  I think our throughput on a RAID-1 pair was somewhere around 
40+ megs a second for reads with bonnie++.  With RAID-5 it was not really 
much faster at reads (something like 60 megs a second) but was much more 
scalable under heavy parallel read/write access for PostgreSQL.

Have you updated the BIOS on the mobo to see if that helps?  I'm just 
throwing darts at the wall here.




Re: [PERFORM] Looking for ideas on how to speed up warehouse loading

2004-04-23 Thread scott.marlowe

On Thu, 22 Apr 2004, Sean Shanny wrote:

 I should have included this as well:
  fsync  | on
  shared_buffers | 4000
  sort_mem   | 64000

For purposes of loading only, you can try turning off fsync, assuming this 
is a virgin load and you can just re-initdb should bad things happen (OS, 
postgresql crash, power plug pulled, etc...)

Also increasing sort_mem and shared_buffers might help.  Especially 
sort_mem.  But turn it back down to something reasonable after the import.

And turn fsync back on after the import too.  Note you have to restart 
postgresql to make fsync = off take effect.
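
A rough sketch of the load-time setup, with illustrative values only:

  # postgresql.conf during the initial load only (restart to apply):
  fsync = off
  sort_mem = 256000        # KB; roomy for the load, revert afterwards
  shared_buffers = 10000   # 8 KB pages
  # when the load finishes, set fsync = on again and restart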




Re: [PERFORM] Help with performance problems

2004-04-23 Thread scott.marlowe
On Fri, 23 Apr 2004, Chris Hoover wrote:

 DB's on Powervaults 220S using raid 5 (over 6 disks)

What controller is this, the adaptec?  We've found it to be slower than 
the LSI megaraid based controller, but YMMV.

 Running RH ES 2.1

Are you running the latest kernel for ES 2.1?  Early 2.4 kernels are 
pretty pokey and have some odd behaviour under load that later 2.4 
kernels seemed to fix.

 Here is the postgresql.conf from the server with the 11GB db:
 
 max_connections = 64
 shared_buffers = 32768# 256MB=32768(buffs)*8192(bytes/buff)
 max_fsm_relations = 1000  # min 10, fsm is free space map, ~40 bytes
 max_fsm_pages = 1 # min 1000, fsm is free space map, ~6 bytes

If you're doing lots of updates and such, you might want these higher.
Have you vacuumed full the databases since taking over?

 sort_mem = 4096   # 256MB=4096(bytes/proc)*64(procs or conns)

Sorry, that's wrong.  sort_mem is measured in kbytes, i.e. 8192 means 8 
megs of sort_mem.  Try setting it a bit higher (you've got LOTS of ram in 
these boxes) to something like 16 or 32 meg.

 checkpoint_segments = 16  # in logfile segments, min 1, 16MB each
 checkpoint_timeout = 30   # range 30-3600, in seconds
 effective_cache_size = 131072 # typically 8KB each

This still looks low.  On one machine you're showing kernel cache of about 
.7 gig, on the other it's 6 gig.  6 gigs of kernel cache would be a 
setting of roughly 800000 (8 KB pages).  It's more of a nudge factor than 
an exact science, so don't worry too much.

If you've got fast I/O, look at lowering random_page_cost to something 
between 1 and 2.  We use 1.3 to 1.4 on most of our machines with fast 
drives under them.

I'd use vmstat to see if you're I/O bound.  

Also, look for index bloat.  Before 7.4 it was a serious problem.  With 
7.4, regular vacuuming should reclaim most lost space, but there are corner 
cases where you still might need to re-index.
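
Concretely, something in this neighborhood might be worth testing (these 
are illustrative starting points, not a prescription):

  sort_mem = 16384              # KB, i.e. 16 megs per sort
  effective_cache_size = 800000 # for the box showing ~6 gigs of kernel cache
  random_page_cost = 1.4        # only if the I/O really is fast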




Re: [PERFORM] Help with performance problems

2004-04-23 Thread scott.marlowe
On Fri, 23 Apr 2004, Chris Hoover wrote:

 On Friday 23 April 2004 13:21, scott.marlowe wrote:
  On Fri, 23 Apr 2004, Chris Hoover wrote:
   DB's on Powervaults 220S using raid 5 (over 6 disks)
 
  What controller is this, the adaptec?  We've found it to be slower than
  the LSI megaraid based controller, but YMMV.
 
 We are using the perc3/di controller.  Believe it is using the megaraid 
 driver.

No, that's the adaptec; the PERC3/DC is the lsi megaraid.  See if there 
are newer drivers for the RAID card.  In terms of performance, the adaptec 
and lsi drivers have improved considerably in later versions.  In terms of 
stability they've largely gotten better, with a few in-between releases on 
the megaraid getting poor grades.  The latest / greatest from Dell is 
pretty up to date.

   Running RH ES 2.1
 
  Are you running the latest kernel for ES 2.1?  Early 2.4 kernels are
  pretty pokey and have some odd behaviour under load that later 2.4
  kernels seemed to fix.
 
 I'm not sure we are at the latest and greatest for 2.1, but I am trying to get 
 there.  Management won't let me do the upgrade w/o first testing/proving it 
 will not cause any more issues.  Due to all of the current issues, and the 
 criticality of these systems to our bottom line, they are being very careful 
 with any change that may impact our users further.

Understood.  It's why my production box is still running a 2.4 kernel on 
rh 7.2 with pg 7.2.  They just work, and for us stability AND performance 
are both good with our load.

You can install a new kernel and set up the machine to still boot off of 
the old one, and test on the weekend to see how it behaves under 
simulated load.  Mining the logs for slow queries is a good way to build 
one.

While we don't upgrade our production servers' applications to the latest 
and greatest all the time (i.e. php or postgresql or openldap), we always 
run the latest security patches, and I think the latest kernels had 
security fixes for ES 2.1, so NOT upgrading is dangerous.  Late model 
linux kernels (the 2.0.x and 2.2.x where x>20) tend to be VERY stable and 
very conservatively backported and upgraded, so running a new one isn't 
usually a big risk.

   Here is the postgresql.conf from the server with the 11GB db:
  
   max_connections = 64
   shared_buffers = 32768# 256MB=32768(buffs)*8192(bytes/buff)
   max_fsm_relations = 1000  # min 10, fsm is free space map, ~40 bytes
   max_fsm_pages = 1 # min 1000, fsm is free space map, ~6 bytes
 
  IF you're doing lots of updates and such, you might want these higher.
  Have you vacuumed full the databases since taking over?
 
   sort_mem = 4096   # 256MB=4096(bytes/proc)*64(procs or conns)
 
  Sorry, that's wrong.  sort_mem is measure in kbytes.  i.e. 8192 means 8
  megs sort_mem.  Try setting it a bit higher (you've got LOTS of ram in
  these boxes) to something like 16 or 32 meg.
 
   checkpoint_segments = 16  # in logfile segments, min 1, 16MB each
   checkpoint_timeout = 30   # range 30-3600, in seconds
   effective_cache_size = 131072 # typically 8KB each
 
  This still looks low.  On one machine you're showing kernel cache of about
  .7 gig, on the other it's 6 gig.  6 gigs of kernel cache would be a
  setting of roughly 800000.  It's more of a nudge factor than an exact science, so
  don't worry too much.
 I believe changing this requires a restart of the cluster (correct?).  If so, 
 I'll try bumping up the effective_cache_size over the weekend.
 
 Also, with all of the memory available to these machines, should I be running 
 with larger shared_buffers?  It seems like 256M is a bit small.

No, you probably shouldn't.  PostgreSQL doesn't cache in the classical 
sense.  If all backends close, the stuff they had in their buffers 
disappears in a flash.  So, it's generally considered better to let the 
kernel do the bulk of the caching, with its cache being large enough to 
hold a large portion, if not all, of your working set of data.  
Between the cache management, which is dirt simple and works but seems 
to have performance issues with large numbers of buffers, and the fact 
that all the memory in it disappears when the last backend using it 
closes, there's not much to gain from a very large shared_buffers setting.

For instance, in doing the following seq scan select:

explain analyze select * from test;

where test is a ~10 megabyte table, the first time I ran it it took 5 
seconds to run.  The second time it took 2.5, the third 1.9, and it 
levelled out around there.  Starting up another backend and running the 
same query got a 1.9 second response also.  Shutting down both 
connections and running the query again, with only the kernel for 
caching, I got 1.9.

That's on a 2.4.2[2-4] kernel.

  If you've got fast I/O look at lowering random page cost to something
  between 1 and 2.  We use 1.3 to 1.4 on most of our machines with fast
  drives under them.
 
  I'd use vmstat to see if you're I/O bound.
 
 If we end up being I/O bound, should

Re: [PERFORM] Setting Shared Buffers , Effective Cache, Sort Mem

2004-04-22 Thread scott.marlowe
On Thu, 22 Apr 2004, Pallav Kalva wrote:

 Hi
 
 We are in the process of building a new machine for our production 
 database.  Below you will see some of the hardware specs for the machine. 
 I need some help with setting these parameters (shared buffers, 
 effective cache, sort mem) in the postgresql.conf file.  Also, can anyone 
 explain the difference between shared buffers and effective cache, and how 
 these are allocated in main memory (the docs are not clear on this)?
 
 Here are the Hardware details:
 Operating System: Red Hat 9
 Database Ver: Postgres 7.4
 CPU'S : 4
 RAM : 4 gig
 Datafile layout : RAID 1+0
 Transaction log : on different RAID1 Array
 RAID Stripe Size: 8k

Read this first:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

Basically shared buffers are the play area for the database backends to 
toss data in the air and munge it together.  The effective cache size 
reflects the approximate amount of space your operating system is using to 
buffer Postgresql data.  On a dedicated database machine this is about the 
same as the size of the kernel buffer shown in top.  On a mixed machine, 
you'll have to see how much of what data is getting buffered to get a 
guesstimate of how much kernel cache is being used for pgsql and how much 
for other processes.  Then divide that number in bytes by 8192, the 
default block size.  On a machine with 1.2 gigs of kernel cache, that'd be 
about 150,000 blocks.

Buffer sizes from 1000 to 10000 blocks are common.  Buffer sizes from 10000 
to 50000 can sometimes increase performance, but those sizes only really 
make sense for machines with lots of ram and very large datasets being 
operated on.
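
As a rough starting point for the 4 gig machine described above (a sketch 
only; measure and adjust):

  shared_buffers = 10000          # 8 KB pages, about 80 megs
  sort_mem = 8192                 # KB per sort
  effective_cache_size = 300000   # assumes ~2.4 gigs of kernel cache / 8 KB pages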




Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread scott.marlowe
On Wed, 7 Apr 2004, Andrew McMillan wrote:

 On Wed, 2004-04-07 at 20:56, huang yaqin wrote:
  Hello, Richard Huxton,
  
  You said turning fsync off may cause data loss; that's terrible.
  I use SCSI disks, and the file system is ext3.  I've tuned postgresql.conf
  and can't get any improvement.  So what can I do?
  Is there a difference between SCSI disks and IDE disks here?
 
 Yes, turning off fsync means that the database is not guaranteeing
 consistency of writes to disk any longer.  On the other hand your IDE
 system probably never was, because IDE drives just typically turn on
 write caching in hardware without telling anyone.
 
 SCSI typically doesn't turn on write caching in the physical drive by
 default, as Tom Lane pointed out earlier.  Good SCSI has a battery
 backed up cache, and then it is OK to turn on write caching, because the
 controller has enough battery to complete all writes in the event of a
 power failure.

Actually, almost all SCSI drives turn on write caching by default; they 
just don't lie about fsync, so you still have a one-update-per-revolution 
limit, but other things can be happening while that write is being 
committed, due to the multi-threaded nature of both the SCSI interface and 
the kernel drivers associated with them.

It would appear the linux kernel hackers are trying to implement the 
multi-threaded features of the latest ATA spec, so that, at some future 
date, you could have IDE drives that cache AND tell the truth about their 
sync AND can do more than one thing at a time.

 One thing I recommend is to use ext2 (or almost anything but ext3). 
 There is no real need (or benefit) from having the database on a
 journalled filesystem - the journalling is only trying to give similar
 sorts of guarantees to what the fsync in PostgreSQL is doing.

Is this true?  I was under the impression that without at least meta-data 
journaling postgresql could still be corrupted by power failure.

 The suggestion someone else made regarding use of software raid is
 probably also a good one if you are trying to use the on-board RAID at
 the moment.

Some onboard RAID controllers are fairly good (dell's 2600 series has an 
adaptec on board that can have battery backed cache and is ok; the lsi
megaraid based one is faster under linux though).  But some of them are 
pretty poor performers.

 Finally, I would say that because you are seeing poor performance on one
 box and great performance on another, you should look at the hardware,
 or at the hardware drivers, for the problem - not so much at PostgreSQL.

More than likely, the biggest issue is that the SCSI drives are performing 
proper fsync, while the IDE drives are lying.  Definitely a time to look 
at a good caching RAID controller.




Re: [PERFORM] Postgresql educational sources

2004-04-07 Thread scott.marlowe
On Wed, 7 Apr 2004, Nid wrote:

 Rather than ask some general, unanswerable question on how to tune my 
 database... I thought I'd ask where I might find an introduction to, or a 
 manual/tutorial for, the depths of managing a postgres db.  Books?  
 Websites?  Assume a basic to intermediate knowledge of DBs in general 
 with a desire to learn about postgres from the ground up.  If it makes a 
 difference, I'm using a postgres db in a Red Hat Linux OS environment.  
 Thanks!

The online (administration) docs are quite good, and for tuning, look at 
the excellent tuning document on varlena:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html






Re: [PERFORM] atrocious update performance

2004-04-05 Thread scott.marlowe
On Mon, 5 Apr 2004, Kevin Barnard wrote:

 On 5 Apr 2004 at 12:05, Rosser Schwarz wrote:
 
  Just this morning, however, I created a copy of the target table (all
  4.7M rows), with absolutely no foreign keys referring to it, and ran
  the update against the copy.  That update took 2300 seconds.  The
  join columns were indexed in both cases.
 
 Have you added indexes on the custid column for the tables account.acct, 
 account.orgacct and note?
 
 I haven't followed the entire thread, but if you have cascading FKs on those tables 
 without an index on that column, that could cause your delay.

Also make sure the fk/pk types match, or the index likely won't get used 
anyway.




Re: [PERFORM] postgres eating CPU on HP9000

2004-03-26 Thread scott.marlowe
On Fri, 26 Mar 2004, Fabio Esposito wrote:

 
 On Fri, 26 Mar 2004, scott.marlowe wrote:
 
   It maintains 48 hours of data, so it's not a large database; roughly
   600 megs.  We do this by running a housekeeping program in a cron job.
   It deletes all data older than 48 hours, then vacuum analyzes.  It will
   also perform a reindex if the option is set before it vacuums.
  
  This almost sounds like a problem (fixed in 7.4 I believe) where some
  system catalog indexes would get huge over time, and couldn't be vacuumed
  or reindexed while the database was up in multi-user mode.
 
  I'll defer to Tom or Bruce or somebody to say if my guess is even close...
 
 We haven't tried 7.4; I will experiment with it next week.  I hope it
 will be that simple.

In the meantime, a simple dump/reload into a test box running your 
current version may provide some insight.  If it fixes the problem, then 
you likely do have some kind of issue with index / table growth that isn't 
being addressed by vacuuming.




Re: [PERFORM] odd planner choice

2004-03-26 Thread scott.marlowe
On Thu, 25 Mar 2004, Ara Anjargolian wrote:

 I've run into this odd planner choice which I don't quite understand.
 
 I have two tables, articles and users;
 articles.article_id and users.user_id are primary keys.
 
 Inside articles there are two optional fields, author_id1 and author_id2,
 which both reference users.user_id.
 
 And now the plans:
 (by the way this is pg 7.4 and I set enable_seqscan to off).
 
 jargol=# explain select user_id, first_names, last_name from articles, users
 where article_id = 5027 and (articles.author_id1 = users.user_id);
 QUERY PLAN
 
 --
  Nested Loop  (cost=0.00..4.04 rows=1 width=26)
    ->  Index Scan using articles_pk on articles  (cost=0.00..2.01 rows=1 width=4)
          Index Cond: (article_id = 5027)
    ->  Index Scan using users_pk on users  (cost=0.00..2.01 rows=1 width=26)
          Index Cond: ("outer".author_id1 = users.user_id)
 (5 rows)
 
 jargol=# explain select user_id, first_names, last_name from articles, users
 where article_id = 5027 and (articles.author_id1 = users.user_id or
 articles.author_id2 = users.user_id);
 QUERY PLAN
 
 ---
  Nested Loop  (cost=1.00..10003.11 rows=2 width=26)
    Join Filter: (("outer".author_id1 = "inner".user_id) OR ("outer".author_id2 = "inner".user_id))
    ->  Index Scan using articles_pk on articles  (cost=0.00..2.01 rows=1 width=8)
          Index Cond: (article_id = 5027)
    ->  Seq Scan on users  (cost=1.00..10001.04 rows=4 width=26)
 (5 rows)
 
 Why does it think it MUST do a seq-scan in the second case? users.user_id is
 a primary key,
 so shouldn't it behave exactly as in the first case?
 
 Any enlightenment on this problem will be much appreciated.

Are articles.author_id1 and users.user_id the same type?  Have you tried 
casting one to the other's type if they're different?
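
For example, if author_id1/author_id2 turned out to be int8 while user_id 
is int4 (purely hypothetical), an explicit cast is worth a try:

  explain select user_id, first_names, last_name from articles, users
  where article_id = 5027
    and (articles.author_id1::int4 = users.user_id
         or articles.author_id2::int4 = users.user_id);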




Re: [PERFORM] postgres eating CPU on HP9000

2004-03-26 Thread scott.marlowe
On Tue, 23 Mar 2004, Fabio Esposito wrote:

 
 Hello fellow PostgreSQL users.
 
 We've been working on this interesting issue for some time now, and we're
 hoping that someone can help.
 
 We've recently integrated postgres into an existing mature app.  It's a
 time-sensitive 24x7 system.  It runs on an HP9000, a K370 Dual Processor
 system.  Postgres is version 7.3.2.  It's spawned as a child from a parent
 supervisory process, and they communicate with each other via shared memory.
 
 We perform 9-12K selects per hour,
   6-8K inserts per hour (a few updates here as well),
   1-1.5K deletes per hour.
 
 It maintains 48 hours of data, so it's not a large database; roughly
 600 megs.  We do this by running a housekeeping program in a cron job.
 It deletes all data older than 48 hours, then vacuum analyzes.  It will
 also perform a reindex if the option is set before it vacuums.
 
 Postgres initially worked wonderfully, fast and solid.  It
 performed complex joins in 0.01 secs, and was able to keep up with our
 message queue.  It stayed this way for almost a year during our
 development.
 
 Recently it started eating up the cpu, and cannot keep up with the system
 like it used to.  The interesting thing here is that it still runs great
 on an older system with less ram, one slower cpu, and an older disk.
 
 We tried the following with no success:
 
 running VACUUM FULL
 dropping all tables and starting anew
 reinstalling postgres
 tweaking kernel parameters (various combos)
 tweaking postgres parameters (various combos)
 a number of other ideas

This almost sounds like a problem (fixed in 7.4 I believe) where some 
system catalog indexes would get huge over time, and couldn't be vacuumed 
or reindexed while the database was up in multi-user mode.

I'll defer to Tom or Bruce or somebody to say if my guess is even close...




Re: [PERFORM] slow vacuum performance

2004-03-24 Thread scott.marlowe
On Wed, 24 Mar 2004, pginfo wrote:

 Hi,
 
 I am running pg 7.4.1 on a linux box.
 I have a middle-sized DB with many updates, and after them I try to run
 vacuum full analyze.

Is there a reason to not use just regular vacuum / analyze (i.e. NOT 
full)?  

 It takes about 2 h.

Full vacuums, by their nature, tend to be a bit slow.  It's better to let 
the database achieve a kind of steady state with regard to the number of 
dead tuples, and use regular vacuums to reclaim said space, rather than a 
full vacuum.

 How can I improve the vacuum full analyze time?
 
 My configuration:
 
 shared_buffers = 15000  # min 16, at least max_connections*2,
 8KB each
 sort_mem = 1# min 64, size in KB

You might want to look at dropping sort_mem.  It would appear you've been 
going through the postgresql.conf file and bumping up numbers to see what 
works and what doesn't.  While most of the settings aren't too dangerous 
to crank up a little high, sort_mem is quite dangerous to crank up high, 
should you have a lot of people connected who are all sorting.  Note that 
sort_mem is a limit PER SORT, not per backend, or per database, or per 
user, or even per table, but per sort.  If a query needs to run three or 
four sorts, it can use 3 or 4x sort_mem.  If a hundred users do this at 
once, they can then use 300 or 400x sort_mem.  You can see where I'm 
heading.

Note that for individual sorts in batch files, like import processes, you 
can bump up sort_mem with the set command, so you don't have to have a 
large setting in postgresql.conf to use a lot of sort mem when you need 
to, you can just grab it during that one session.

 vacuum_mem = 32000  # min 1024, size in KB

If you've got lots of memory, crank up vacuum_mem to the 200 to 500 meg 
range and see what happens.
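
Both can be set per session, so the big values only need to exist while 
the maintenance job runs (numbers are only examples):

  -- in the batch / maintenance session only:
  set vacuum_mem = 256000;   -- KB, used by this session's vacuums
  set sort_mem = 131072;     -- KB, for this session's big sorts
  vacuum analyze;
  -- regular sessions keep the modest postgresql.conf values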

For a good tuning guide, go here:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html




Re: [PERFORM] slow vacuum performance

2004-03-24 Thread scott.marlowe
On Wed, 24 Mar 2004, pginfo wrote:

 Hi,
 
 scott.marlowe wrote:
 
  On Wed, 24 Mar 2004, pginfo wrote:
 
   Hi,
  
    I am running pg 7.4.1 on a linux box.
    I have a middle-sized DB with many updates, and after them I try to run
    vacuum full analyze.
 
  Is there a reason to not use just regular vacuum / analyze (i.e. NOT
  full)?
 
 
  Yes, in cases where I make massive updates (only in my case, of course), for
  example 2M rows, I do not expect to have 2M new rows in the next 180 days.
  That is the reason for running vacuum full.
  My idea was to free unneeded space and so have a faster system.
  It is possible that I am wrong.

It's all about percentages.  If you've got an average of 5% dead tuples 
with regular vacuuming, then full vacuums won't gain you much, if 
anything.  If you've got 20 dead tuples for each live one, then a full 
vacuum is pretty much a necessity.  The generally accepted best 
performance comes with 5 to 50% or so dead tuples.  Keep in mind, having a 
few dead tuples is actually a good thing, as your database won't grow and 
then shrink the file all the time, but will keep it at a steady size.






Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread scott.marlowe
On Tue, 23 Mar 2004, Josh Berkus wrote:

 Matt, Stalin,
 
  As for the compute intensive side (complex joins & sorts etc), the Dell will 
 most likely beat the Sun by some distance, although what the Sun lacks in 
 CPU power it may make up a bit in memory bandwidth/latency.
 
 Personally, I've been unimpressed by Dell/Xeon; I think the Sun might do 
 better than you think, comparatively.  On all the Dell servers I've used so 
 far, I've not seen performance that comes even close to the hardware specs.

We use a 2600 at work (dual 2.8GHz) with the LSI/Megaraid based battery 
backed caching controller, and it flies.  Truly flies.

It's not Dell that's so slow, it's the default adaptec RAID controller or 
IDE drives that are slow.  Ours has a 533 MHz memory bus, by the way.




Re: [PERFORM] [ADMIN] syslog slowing the database?

2004-03-15 Thread scott.marlowe
On Thu, 11 Mar 2004, Christopher Kings-Lynne wrote:

  You could also consider not using syslog at all: let the postmaster
  output to its stderr, and pipe that into a log-rotation program.
  I believe some people use Apache's log rotator for this with good
  results.
 
 Not an option I'm afraid.  PostgreSQL just jams and stops logging after 
 the first rotation...
 
 I've read in the docs that syslog logging is the only production 
 solution...

Can you use the apache log rotator?  It's known to work in my environment 
(redhat 7.2, postgresql 7.2 and 7.4) with this command to start it in my 
rc.local file:

su - postgres -c 'pg_ctl start | rotatelogs $PGDATA/pglog 86400 2>&1'




Re: [PERFORM] [ADMIN] syslog slowing the database?

2004-03-15 Thread scott.marlowe
On Mon, 15 Mar 2004, Bruce Momjian wrote:

 scott.marlowe wrote:
  On Thu, 11 Mar 2004, Christopher Kings-Lynne wrote:
  
You could also consider not using syslog at all: let the postmaster
output to its stderr, and pipe that into a log-rotation program.
I believe some people use Apache's log rotator for this with good
results.
   
   Not an option I'm afraid.  PostgreSQL just jams and stops logging after 
   the first rotation...
   
   I've read in the docs that syslog logging is the only production 
   solution...
  
  Can you use the apache log rotator?  It's known to work in my environment 
  (redhat 7.2, postgresql 7.2 and 7.4) with this command to start it in my 
  rc.local file:
  
  su - postgres -c 'pg_ctl start | rotatelogs $PGDATA/pglog 86400 2>&1'
 
 Sure, our documentation specifically mentions using rotatelogs.


hehe.  What I meant was: can Christopher use it, or does he have a 
limitation in his environment where he can't get ahold of the apache log 
rotator... :-)




Re: [PERFORM] optimizing large query with IN (...)

2004-03-10 Thread scott.marlowe
On Wed, 10 Mar 2004, Marcus Andree S. Magalhaes wrote:

 
 Guys,
 
 I got a Java program to tune. It connects to a 7.4.1 postgresql server
 running Linux using JDBC.
 
 The program needs to update a counter on a somewhat large number of
 rows, about 1200 on a ~130k rows table. The query is something like
 the following:
 
 UPDATE table SET table.par = table.par + 1
 WHERE table.key IN ('value1', 'value2', ... , 'value1200' )
 
 This query runs in a transaction (by issuing a call to
 setAutoCommit(false)) with a commit() right after the query
 is sent to the backend.
 
 The process of committing and updating the values is painfully slow
 (no surprises here). Any ideas?

The thing is, as I understand it, that 7.4 introduced massive 
improvements in handling moderately large in() clauses, as long as they 
can fit in sort_mem and are provided by a subselect.

So, creating a temp table with all the values in it and using in() on the 
temp table may be a win:

begin;
create temp table t_ids(id int);
insert into t_ids(id) values (123); -- repeat a few hundred times
select * from maintable where id in (select id from t_ids);
...
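
Applied to your update, it would look something like this (a sketch; 
maintable, key and par stand in for your real names, and from JDBC the 
inserts would go through addBatch()/executeBatch() rather than 1200 
separate statements):

begin;
create temp table t_keys (key text);
insert into t_keys (key) values ('value1');
-- ... repeat for each key ...
update maintable set par = par + 1
 where key in (select key from t_keys);
commit;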





Re: [PERFORM] Scaling further up

2004-03-04 Thread scott.marlowe
On Thu, 4 Mar 2004, Paul Thomas wrote:

 On 03/03/2004 18:23 scott.marlowe wrote:
  [snip]
  There are three factors that affect how fast you can get to the next
  sector:
  
  seek time
  settle time
  rotational latency
  
  Most drives only list the first, and don't bother to mention the other
  two.
 
 Ah yes, one of my (very) few still functioning brain cells was nagging 
 about another bit of time in the equation :)
 
  On many modern drives, the seek times are around 5 to 10 milliseconds.
  [snip]
 
 Going back to the OP's posting about random_page_cost, imagine I have 2 
 servers identical in every way except the disk drive.  Server A has a 10K 
 rpm drive and server B has a 15K rpm drive.  Seek/settle times aren't 
 spectacularly different between the 2 drives.  I'm wondering if drive B 
 might actually merit a _higher_ random_page_cost than drive A as, once it 
 gets settled on a disk track, it can suck the data off a lot faster. 
 Opinions/experiences, anyone?

It might well be that you have higher settle times that offset the small 
gain in rotational latency.  I haven't looked into it, so I don't know one 
way or the other, but it seems a reasonable assumption.

However, a common misconception is that the higher angular velocity of 
the 15krpm drives would allow you to read data faster.  In fact, the limit 
of how fast you can read is set by the head.  There's a maximum frequency 
that it can read, and the areal density / rpm have to be such that you 
don't exceed that frequency.  Often, the speed at which you read off the 
platters is exactly the same between a 10k and a 15k of the same family.  

The required lower areal density is the reason 15krpm drives show up in 
the lower capacities first.




Re: [PERFORM] Scaling further up

2004-03-02 Thread scott.marlowe
On Tue, 2 Mar 2004, Anjan Dave wrote:

 By lots I mean dozen(s) in a raid 10 array with a good controller. 
 
 I believe, for RAID-10, I will need even number of drives.

Correct.

 Currently,
 the size of the database is about 13GB, and is not expected to grow
 exponentially with thousands of concurrent users, so total space is not
 of paramount importance compared to performance.
 
 Does this sound reasonable setup?
 10x36GB FC drives on RAID-10
 4x36GB FC drives for the logs on RAID-10 (not sure if this is the
 correct ratio)?
 1 hotspare
 Total=15 Drives per enclosure.

Putting the Logs on RAID-10 is likely to be slower than, or no faster than 
putting them on RAID-1, since the RAID-10 will have to write to 4 drives, 
while the RAID-1 will only have to write to two drives.  now, if you were 
reading in the logs a lot, it might help to have the RAID-10.

 Tentatively, I am looking at an entry-level EMC CX300 product with 2GB
 RAID cache, etc.

Pick up a spare, I'll get you my home address, etc... :-)

Seriously, that's huge.  At that point you may well find that putting 
EVERYTHING on a big old RAID-5 performs best, since you've got lots of 
caching / write buffering going on.

 Question - Are 73GB drives supposed to give better performance because
 of higher number of platters?

Generally, larger hard drives perform better than smaller hard drives 
because they a: have more heads and / or b: have a higher areal density.

It's a common misconception that faster RPM drives are a lot faster, when, 
in fact, their only speed advantage is slightly faster seeks.  The areal 
density of faster spinning hard drives tends to be somewhat less than the 
slower spinning drives, since the maximum frequency the heads can work at, 
on both drives, assuming the same technology, is the same.  I.e. the speed 
at which you can read data off of the platter doesn't usually go up with a 
higher RPM drive, only the speed with which you can get to the first 
sector.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Scaling further up

2004-03-02 Thread scott.marlowe
On Tue, 2 Mar 2004, Anjan Dave wrote:

 That was part of my original question - whether it makes sense to go for
 a mid-range SunFire machine (64bit HW, 64bit OS), which is scalable to
 high amounts of memory, and shouldn't have any issues addressing it all.
 I've had that kind of setup once temporarily on a V480 (quad UltraSparc,
 16GB RAM) machine, and it did well in production use. Without having the
 time/resources to do extensive testing,  I am not sure if
 Postgres/Solaris9 is really suggested by the community for
 high-performance, as opposed to a XEON/Linux setup. Storage being a
 separate discussion.

Some folks on the list have experience with Postgresql on Solaris, and 
they generally say they use Solaris not for performance reasons, but for 
reliability reasons.  I.e. the bigger Sun hardware is fault tolerant.

For speed, the X86 32 and 64 bit architectures seem to be noticeable 
faster than Sparc.  However, running Linux or BSD on Sparc make them 
pretty fast too, but you lose the fault tolerant support for things like 
hot swappable CPUs or memory.



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] FreeBSD config

2004-02-26 Thread scott.marlowe
On Thu, 26 Feb 2004, Dror Matalon wrote:

 Hi,
 
 We have postgres running on freebsd 4.9 with 2 Gigs of memory. As per
 repeated advice on the mailing lists we configured effective_cache_size
 = 25520 which you get by doing `sysctl -n vfs.hibufspace` / 8192
 
 Which results in using 200Megs for disk caching. 
 
 Is there a reason not to increase the hibufspace beyond the 200 megs and
 provide a bigger cache to postgres? I looked both on the postgres and
 freebsd mailing lists and couldn't find a good answer to this.

Actually, I think you're confusing effective_cache_size with 
shared_buffers.

effective_cache_size changes no cache settings for postgresql, it simply 
acts as a hint to the planner on about how much of the dataset your OS / 
Kernel / Disk cache can hold.

Making it bigger only tells the query planner it's more likely the data 
it's looking for will be in cache.

shared_buffers, OTOH, sets the amount of cache that postgresql uses.  It's 
generally considered that 256 Megs or 1/4 of memory, whichever is LESS, is 
a good setting for production database servers.
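
As a rough illustration (the numbers here are only examples, assuming 8k 
pages on a 2 gig box):

-- hint the planner that roughly 1 gig of the dataset may be in the OS cache:
set effective_cache_size = 131072;
-- shared_buffers, by contrast, is real shared memory, set in postgresql.conf
-- and only read at server start; 256 megs at 8k per buffer would be:
--   shared_buffers = 32768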


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] A cache for the results of queries ?

2004-02-26 Thread scott.marlowe
On Thu, 26 Feb 2004, David Pradier wrote:

 Hi everybody,
 
 i'd like to know if it exists a system of cache for the results of
 queries.

I believe there are some external libs that provide this at the 
application level.  PHP's adodb is purported to do so.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] FreeBSD config

2004-02-26 Thread scott.marlowe
On Thu, 26 Feb 2004, Dror Matalon wrote:

 On Thu, Feb 26, 2004 at 11:55:31AM -0700, scott.marlowe wrote:
  On Thu, 26 Feb 2004, Dror Matalon wrote:
  
   Hi,
   
   We have postgres running on freebsd 4.9 with 2 Gigs of memory. As per
   repeated advice on the mailing lists we configured effective_cache_size
   = 25520 which you get by doing `sysctl -n vfs.hibufspace` / 8192
   
   Which results in using 200Megs for disk caching. 
   
   Is there a reason not to increase the hibufspace beyond the 200 megs and
   provide a bigger cache to postgres? I looked both on the postgres and
   freebsd mailing lists and couldn't find a good answer to this.
  
  Actually, I think you're confusing effective_cache_size with 
  shared_buffers.
 
 No, I'm not.

OK, sorry, I wasn't sure which you meant.

  effective_cache_size changes no cache settings for postgresql, it simply 
  acts as a hint to the planner on about how much of the dataset your OS / 
  Kernel / Disk cache can hold.
 
 I understand that. The question is why have the OS, in this case FreeBsd
 use only 200 Megs for disk cache and not more. Why not double the
 vfs.hibufspace  to 418119680 and double the effective_cache_size to 51040.

Doesn't the kernel just use the spare memory to buffer anyway?

I'd say if you've got 2 gigs of memory and nothing else on the box, give a big 
chunk (1 gig or so) to the kernel to manage.  Unless large kernel caches 
cause some issues in FreeBSD.

  Making it bigger only tells the query planner it's more likely the data 
  it's looking for will be in cache.
  
  shared_buffers, OTOH, sets the amount of cache that postgresql uses.  It's 
  generally considered that 256 Megs or 1/4 of memory, whichever is LESS, is 
  a good setting for production database servers.
  
 
 Actually last I looked, I thought that the recommended max shared
 buffers was 10,000, 80MB,  even on machines with large amounts of memory.

It really depends on what you're doing.  For loads involving very large 
data sets, up to 256 Megs has resulted in improvements, but anything after 
that has only had advantages in very limited types of applications.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] General performance questions about postgres on Apple

2004-02-23 Thread scott.marlowe
On Sun, 22 Feb 2004, Sean Shanny wrote:

 Tom,
 
 We have the following setting for random page cost:
 
 random_page_cost = 1# units are one sequential page fetch cost
 
 Any suggestions on what to bump it up to?
 
 We are waiting to hear back from Apple on the speed issues, so far we 
 are not impressed with the hardware in helping in the IO department.  
 Our DB is about 263GB with indexes now so there is not way it is going 
 to fit into memory. :-(  I have taken the step of breaking out the data 
 into month based groups just to keep the table sizes down.  Our current 
 months table has around 72 million rows in it as of today.  The joys of 
 building a data warehouse and trying to make it as fast as possible.

You may be able to achieve similar benefits with a clustered index.

see cluster:

\h cluster
Command: CLUSTER
Description: cluster a table according to an index
Syntax:
CLUSTER indexname ON tablename
CLUSTER tablename
CLUSTER

I've found this can greatly increase speed, but on 263 gigs of data, I'd 
run it when you had a couple days free.  You might wanna test it on a 
smaller test set you can afford to chew up some I/O and CPU time on over a 
weekend.
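
For example, with made-up table and index names (pick the index your big 
range scans actually use):

CLUSTER hits_by_date_idx ON hits_2004_02;
ANALYZE hits_2004_02;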


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Slow response of PostgreSQL

2004-02-17 Thread scott.marlowe

Easy two step procedure for speeding this up:

1:  Upgrade to 7.4.1
2:  Read this: 
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Tables on multiple disk drives

2004-02-17 Thread scott.marlowe
On Tue, 17 Feb 2004, Konstantin Tokar wrote:

 Hi!
 Does PostgreSQL allow to create tables and indices of a single
 database on multiple disk drives with a purpose of increase
 performance as Oracle database does? If a symbolic reference is the
 only method then the next question is: how can it be determined what
 file is referred to what table and index?

Your life will be simpler, and your setup will be faster without having 
to muck about with it, if you just buy a good RAID controller with battery 
backed cache.  LSI/Megaraid and Adaptec both make serviceable controllers 
for reasonable prices, and as you add drives, the speed just goes up, no 
muddling around with sym links.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Tables on multiple disk drives

2004-02-17 Thread scott.marlowe
On Tue, 17 Feb 2004, Craig Thomas wrote:

  On Tue, 17 Feb 2004, Craig Thomas wrote:
 
   On Tue, 17 Feb 2004, Konstantin Tokar wrote:
  
   Hi!
   Does PostgreSQL allow to create tables and indices of a single
  database on multiple disk drives with a purpose of increase
   performance as Oracle database does? If a symbolic reference is the
  only method then the next question is: how can it be determined
  what file is referred to what table and index?
  
   Your life will be simpler, and your setup will be faster without
  having  to muck about with it, if you just buy a good RAID
  controller with battery  backed cache.  LSI/Megaraid and Adaptec
  both make serviceable controllers  for reasonable prices, and as you
  add drives, the speed just goes up, no  muddling around with sym
  links.
 
  This works to a limited extent.  For very large databases, maximum
  throughput of I/O is the paramount factor for database performance.
  With raid controllers, your LUN is still limited to a small number of
  disks. PostgreSQL can only write on a file system, but Oracle, SAP DB,
  DB2, etc can write directly to disk (raw I/O).  With large databases
  it is advantageous to spread a table across 100's of disks, if the
  table is quite large.  I don't know of any manufacturer that creates a
  100 disk raid array yet.
 
  You can run up to four LSI / Megaraids in one box, each with 3 UW SCSI
  interfaces, and they act as one unit.  That's 3*4*15 = 180 disks max.
 
  With FC AL connections and four cards, it would be possible to approach
  1000 drives.
 
  Of course, I'm not sure how fast any RAID card setup is gonna be with
  that  many drives, but ya never know.  My guess is that before you go
  there you  buy a big external RAID box built for speed.  We have a
  couple of 200+  drive external RAID5 storage boxes at work that are
  quite impressive.
 
 That's a good point.  But it seems that the databases that are the
 leaders of the TPC numbers seem to be the Oracles of the world.  I
 know that a former company I worked for publised TPC numbers using
 Oracle with Raw I/O to get the performance up.

But keep in mind, that in the TPC benchmarks, doing things that require 
lots of dba work don't tend to make the cost in the test go up (you can 
hide a lot of admin work in those things) while in real life, they do 
drive up the real cost of maintenance.

I'd imagine that with Postgresql coming along nicely, it may well be that 
in a year or two, in the real world, you can just take the money you'd 
have spend on Oracle licenses and Oracle DBAs and just throw more drives 
at a problem to solve it.

And still spend less money than you would on Oracle.  :-)


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Slow response of PostgreSQL

2004-02-17 Thread scott.marlowe
On Wed, 18 Feb 2004, Christopher Kings-Lynne wrote:

  1- How can I lock a single record so that other users can only read 
  it. ??
 
 
  You cannot do that in PostgreSQL.
  
  
  How about SELECT ... FOR UPDATE?
 
 No, because users cannot read the locked row in that case.

I just tested it (within transactions) and it appeared that I could still 
view the rows selected for update.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] slow database

2004-02-11 Thread scott.marlowe
If my boss came to me and asked me to make my database server run as 
slowly as possible, I might come up with the exact same postgresql.conf 
file as what you posted.

Just installing the default postgresql.conf that came with postgresql 
should make this machine run faster.

Read this:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Compile Vs RPMs

2004-02-04 Thread scott.marlowe
On Tue, 3 Feb 2004, Christopher Browne wrote:

 [EMAIL PROTECTED] (Anjan Dave) writes:
  I would like to know whether there are any significant performance
  advantages of compiling (say, 7.4) on your platform (being RH7.3, 8,
  and 9.0, and Fedora especially) versus getting the relevant binaries
  (rpm) from the postgresql site? Hardware is Intel XEON (various
  speeds, upto 2.8GHz, single/dual/quad configuration).
 
 Some Linux distribution makers make grand claims of such advantages,
 but it is not evident that this is much better than superstition.
 
 You are certainly NOT going to see GCC generating MMX code
 automagically that would lead to PostgreSQL becoming 8 times faster.
 
 Indeed, in database work, it is quite likely that you will find things
 to be largely I/O bound, with CPU usage being a very much secondary
 factor.
 
 I did some relative benchmarking between compiling PostgreSQL on GCC
 versus IBM's PPC compilers a while back; did not see differences that
 could be _clearly_ discerned as separate from observational noise.
 
 You should expect find that adding RAM, or adding a better disk
 controller would provide discernable differences in performance.  It
 is much less clear that custom compiling will have any substantial
 effect on I/O-bound processing.

I would add that the primary reason for compiling versus using RPMs is to 
take advantage of some compile time option having to do with block size, 
or using a patch to try and test a system that has found a new corner case 
where postgresql is having issues performing well, like the vacuum page 
delay patch for fixing the issue with disk bandwidth saturation.  If 
you've got a machine grinding to its knees under certain loads, and have a 
test box to test it on, and the test box shows better performance, it 
might be better to patch the live server on the off hours if it will keep 
the thing up and running during the day.  

In that way, performance differences are very real, but because you are 
doing something you can't do with factory rpms.  Of course, building 
custom rpms isn't that hard to do, so if you had a lot of boxes that 
needed a patched flavor of postgresql, you could still run from rpms and 
have the custom patch.  




---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Overlaping indexes

2004-02-02 Thread scott.marlowe
On Mon, 2 Feb 2004, Tomasz Myrta wrote:

 On 2004-02-02 15:46, Rigmor Ukuhe wrote:
  Hi,
  
  I have many indexes somehow overlaping like:
  ... btree (STATUS, VISIBLE, NP_ID);
  ... btree (STATUS, VISIBLE);
  
  is perfomance gained by more exact index worth overhead with managing
  indexes.
 
 The second (2 columns) index is useless - it's function is well done by 
 the first one (3 columns).

Not entirely, since it only has to sort two columns, it will be smaller, 
and will therefore be somewhat faster.

On the other hand, I've seen a lot of folks create multi column indexes 
who didn't really understand how they work in Postgresql.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Overlaping indexes

2004-02-02 Thread scott.marlowe
On Mon, 2 Feb 2004, Tomasz Myrta wrote:

 On 2004-02-02 19:30, scott.marlowe wrote:
  Not entirely, since it only has to sort two columns, it will be smaller, 
  and will therefore be somewhat faster.
 
 Can you say something more about it? Will it be enough faster to keep 
 them both? Did anyone make such tests?

that really depends on the distribution of the third column.  If there's 
only a couple of values in the third column, no big deal.  If each entry 
is unique, and it's a large table, very big deal.

It is only useful to have a three column index if you actually use it.  If 
you have an index on (a,b,c) and select order by b, the index won't get 
used unless the a part is in the where clause.

the other issue is updates.  IT WILL cost more to update two indexes 
rather than one.  Generally, you can drop / readd the index and use 
explain analyze on one of your own queries to see if that helps.
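
A quick illustration with a made-up table:

create table test3 (a int, b int, c int);
create index test3_a_b_c on test3 (a, b, c);
-- these can use the index, since the leading column a is constrained:
select * from test3 where a = 1 and b = 2;
select * from test3 where a = 1 order by b;
-- this one generally won't, since a isn't in the where clause:
select * from test3 order by b;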


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


inserting large number of rows was: Re: [PERFORM] Increasing number of PG connections.

2004-02-02 Thread scott.marlowe
On Mon, 2 Feb 2004, Qing Zhao wrote:

 I am new here. I have a question related to this in some way.
 
 Our web site needs to upload a large volume of data into Postgres at a 
 time. The performance deterioates as number of rows becomes larger.  
 When it reaches 2500 rows, it never come back to GUI. Since the tests 
 were run through GUI, my suspision is
 that it might be caused by the way the application server talking to 
 Postgres server, the connections, etc.. What might be the factors 
 involved here? Does anyone know?

Actually, I'm gonna go out on a limb here and assume two things:

1. you've got lotsa fk/pk relationships setup.
2. you're analyzing the table empty before loading it up.

What happens in this instance is that the analyze on an empty, or nearly 
so, table, means that during the inserts, postgresql thinks you have only 
a few rows.  At first, this is fine, as pgsql will seq scan the  
tables to make sure there is a proper key in both.  As the number of 
rows increases, the planner needs to switch to index scans but doesn't, 
because it doesn't know that the number of rows is increasing.

Fix:  insert a few hundred rows, run analyze, check to see if the explain 
for inserts is showing index scans or not.  If not, load a few more 
hundred rows, analyze, rinse, repeat.

Also, look for fk/pk mismatches.  I.e. an int4 field pointing to an int8 
field.  That's a performance killer, so if the pk/fk types don't match, 
see if you can change your field types to match and try again.
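
In other words, something like this (made-up tables), where the fk column 
type matches the pk type exactly:

create table master (id int8 primary key);
-- good: int8 fk referencing an int8 pk
create table detail (master_id int8 references master (id), amount numeric);
-- bad (in these versions): an int4 fk referencing that int8 pk forces a
-- type coercion on every fk check and keeps the index from being used.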


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Increasing number of PG connections.

2004-02-02 Thread scott.marlowe
On Mon, 2 Feb 2004, Kevin Barnard wrote:

 On 2 Feb 2004 at 13:58, scott.marlowe wrote:
 
  what do you mean at 2 GB?  Is that how much is in kernel cache plus 
  buffer, plus used, plus etc???  Could you give us the top of top output to 
  make sure?  If most of that is kernel cache, then that's fine.  
 
 2GB was total system memory.  We upgraded to 4GB to prior to increasing the 
 number of connections.

Oh, ok.  I thought you meant the system was using 2 gigs of RAM for 
postgresql

 Here's the top of top
 
  16:14:17  up 2 days, 16:15,  1 user,  load average: 7.60, 6.56, 4.61
 730 processes: 721 sleeping, 9 running, 0 zombie, 0 stopped
 CPU states:  cpu    user    nice  system    irq  softirq  iowait    idle
            total    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
            cpu00    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
            cpu01    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
            cpu02    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
            cpu03    0.0%    0.0%    0.0%   0.0%     0.0%    0.0%    0.0%
 Mem:  3747644k av, 3298344k used,  449300k free,   0k shrd,  147880k buff
   2158532k active, 760040k inactive
 Swap: 1048088k av,   0k used, 1048088k free 2262156k cached

when you have a high load but low CPU usage, you are usually I/O bound.

 The DB is pretty close to max connections at this point in time.  I don't know why 
 CPU shows 0% in every bucket.  It looks like I can increase the number of 
 connections a little from here.  This is a fairly standard Fedora install.  It's 
 using version 2.4.22 of the Kernel.  Postgres is a compiled version using 7.4.1

On this machine you could probably handle even more.  What I want is to 
get your page return times down enough so you don't need to increase the 
number of connections.  I.e. if you've got 2 second response times and you 
drop those to 0.2 seconds, then you won't need as many processes to handle 
the load (theoretically... :-)

  experience has been that individual postgresql backends only weigh in at a 
  mega byte at most, and they share buffer, so 700 connections can be 
  anywhere from 300meg to 1 gig.  the rest would be buffer memory.  It's not 
  a good idea to give up too much to shared buffers, as the database isn't 
  as good at caching as the kernel.
 
 OK I take this as I should keep shared buffers around 2x connections then correct?

Not really.  What happens is that if the shared buffers are so large that 
they are as large as or god forbid, larger than the kernel cache, then the 
kernel cache becomes less effective.  The general rule of thumb is 25% of 
memory, or 256 Megs, whichever is less.  The real test is that you want 
enough shared_buffers so that all the result sets currently being smooshed 
up against each other in joins, sorts, etc... can fit in postgresql's 
shared buffers, or at least the buffers can hold a fair chunk of it.  So, 
the number of buffers can be anywhere from a few thousand up to several 
tens of thousands, sometimes even higher.  But for most tuning you won't 
need to be above 32768, which is 256 Megs of ram.

  What do you have in postgresql.conf?  sort_mem, shared_buffers, etc???
 
 Here is what I have that is not set from the defaults.
 
 max_connections = 700
 shared_buffers = 1500
 sort_mem = 512
 random_page_cost = 2
 stats_start_collector = true
 stats_command_string = true
 stats_block_level = true
 stats_row_level = true
 
 
  sort_mem can be a real killer if it lets the processes chew up too much 
  memory.  Once sort_mem gets high enough to make the machine start swapping 
  it is doing more harm than good being that high, and should usually be 
  lowered a fair bit.
 
 I dropped it down to 512 as you can see.  Should I be running with all of the stats 
 on?  
 I am no longer using pg_autovacuum.  I seem to be getting better results with an 
 hourly Vacuum anaylse.

Seeing as how top shows 2262156k kernel cache, you can afford to give up a 
fair bit more than 512k per sort.  I generally run 8192 (8 meg) but I 
don't handle 700 simos.  Try running it a little higher, 2048, 4096, 
etc... and see if that helps.  Note you can change sort_mem and just do a 
pg_ctl reload to make the change, without interrupting service, unlike 
shared_buffers, which requires a restart.

  How many disks in your RAID5?  The more the better.  Is it hardware with 
  battery backed cache?  If you write much to it it will help to have 
  battery backed cache on board.  If it's a megaraid / LSI board, get the 
  megaraid2 driver, it's supposedly much faster.
 
 4 disk IBM ServeRAID 5i with battery backed cache.

Do you have the cache set to write back or write through?  Write through 
can be a performance killer.  But I don't think your RAID is the problem, 
it looks to me like postgresql is doing a lot of I/O.  When you run top, 
do the postgresql processes show a lot of D status? That's usually waiting

Re: [PERFORM] 100 simultaneous connections, critical limit?

2004-01-14 Thread scott.marlowe
On Wed, 14 Jan 2004, Jón Ragnarsson wrote:

 I am writing a website that will probably have some traffic.
 Right now I wrap every .php page in pg_connect() and pg_close().
 Then I read somewhere that Postgres only supports 100 simultaneous
 connections (default). Is that a limitation? Should I use some other
 method when writing code for high-traffic website?

A few tips from an old PHP/Apache/Postgresql developer.

1: Avoid pg_pconnect unless you are certain you have load tested the 
system and it will behave properly.  pg_pconnect often creates as many 
issues as it solves.

2: While php has pretty mediocre run time performance, it's startup / 
shutdown / cleanup are quite fast, and it caches previously executed 
pages.  Thus, if your pages are relatively small, code-wise, then the 
amount of time it will take to execute them, versus the amount of time the 
user will spend reading the output will be quite small.  So, you can 
likely handle many hundreds of users before hitting any limit on the 
database end.

3: Apache can only run so many children too.  The default for the 1.3 
branch is 150.  If you decrease that to 50 or so, you are quite unlikely 
to ever run out of connections to the database.

4: Postgresql can handle thousands of connections if the server and 
postgresql itself are properly configured, so don't worry so much about 
that.  You can always increase the max should you need to later.

5: Database connection time in a php script is generally a non-issue.  
pg_connect on a fast machine, hitting a local pgsql database generally 
runs in about 1/10,000th of a second.  Persistant connects get this down 
to about 1/1,000,000th of a second.  Either way, a typical script takes 
milliseconds to run, i.e. 1/100th of a second or longer, so the actual 
difference between a pg_pconnect and a pg_connect just isn't worth 
worrying about in 99% of all circumstances.

6: Profile your user's actions and the time it takes the server versus how 
long it takes them to make the next click.  Even the fastest user is 
usually much slower than your server, so it takes a whole bunch of them to 
start bogging the system down.  

7: Profile your machine under parallel load.  Note that machine simos 
(i.e. the kind you get from the ab utility) generally represent about 10 
to 20 real people.  I.e. if your machine runs well with 20 machine simos, 
you can bet on it handling 100 or more real people with ease.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] 100 simultaneous connections, critical limit?

2004-01-14 Thread scott.marlowe
On Wed, 14 Jan 2004, Adam Alkins wrote:

 scott.marlowe wrote:
 
 A few tips from an old PHP/Apache/Postgresql developer.
 
 1: Avoid pg_pconnect unless you are certain you have load tested the 
 system and it will behave properly.  pg_pconnect often creates as many 
 issues as it solves.
   
 
 
 I share the above view. I've had little success with persistent 
 connections. The cost of pg_connect is minimal, pg_pconnect is not a 
 viable solution IMHO. Connections are rarely actually reused.

I've found that for best performance with pg_pconnect, you need to 
restrict the apache server to a small number of backends, say 40 or 50, 
extend keep alive to 60 or so seconds, and use the same exact connection 
string all over the place.  Also, set max.persistant.connections or 
whatever it is in php.ini to 1 or 2.  Note that max.persistant.connections 
is PER BACKEND, not total, in php.ini, so 1 or 2 should be enough for most 
types of apps.  3 tops.  Then, setup postgresql for 200 connections, so 
you'll never run out.  Tis better to waste a little shared memory and be 
safe than it is to get the dreaded out of connections error from 
postgresql.

If you do all of the above, pg_pconnect can work pretty well, on things 
like dedicated app servers where only one thing is being done and it's 
being done a lot.  On general purpose servers with 60 databases and 120 
applications, it adds little, although extending the keep alive timeout 
helps.  

but if you just start using pg_pconnect without reconfiguring and then 
testing, it's quite likely your site will topple over under load with out 
of connection errors.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] COUNT Pagination

2004-01-14 Thread scott.marlowe
On Wed, 14 Jan 2004, Rajesh Kumar Mallah wrote:

 scott.marlowe wrote:
 
 On Tue, 13 Jan 2004, David Shadovitz wrote:
 
   
 
 We avert the subsequent execution of count(*) by passing the
 value of count(*) as a query parameter through the link in page
 numbers.
   
 
 Mallah, and others who mentioned caching the record count:
 
 Yes, I will certainly do this.  I can detect whether the query's filter has 
 been changed, or whether the user is merely paging through the results or 
 sorting* the results.
 
 I'd love to completely eliminate the cost of the COUNT(*) query, but I guess 
 that I cannot have everything.
 
 * My HTML table column headers are hyperlinks which re-execute the query, 
 sorting the results by the selected column.  The first click does an ASC 
 sort; a second click does a DESC sort.
 
 
 
 another useful trick is to have your script save out the count(*) result 
 in a single row table with a timestamp, and every time you grab if, check 
 to see if x number of minutes have passed, and if so, update that row with 
 a count(*). 
 
 
 Greetings!
 
 The count(*) can get evaluated with any arbitrary combination
 in whre clause how do you plan to store that information ?
 
 In a typical application pagination could be required in n number
 of contexts . I would be interested to know more about this trick
 and its applicability in such situations.
 
 Offtopic:
 
 Does PostgreSQL optimise repeated execution of similar queries ie
 queries on same table or set of tables (in a join) with same where clause
  and only differing in LIMIT and OFFSET.

Yes, and no.

Yes, a previously run query should be faster, if it fits in kernel cache.  

No, Postgresql doesn't cache any previous results or plans (unless you use 
prepare / execute, then it only caches the plan, not the query results).

Plus, the design of Postgresql is such that it would have to do a LOT of 
cache checking to see if there were any updates to the underlying data 
between selects.  Since such queries are unlikely to be repeated inside a 
transaction, the only place where you wouldn't have to check for new 
tuples, it's not really worth trying to implement.

Keep in mind most databases can use an index for max() because each 
aggregate is programmed by hand to do one thing.  In Postgresql, you can 
create your own aggregate, and since there's no simple way to make 
aggregates use indexes in the general sense, it's not likely to get 
optimized.  I.e. any optimization for JUST max()/min() is unlikely 
unless it can be used for the other aggregates.
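
For what it's worth, the standard workaround for max()/min() on a single 
column is to let an index do the work by hand, something like this 
(assuming an index on somecol):

-- instead of:  select max(somecol) from sometable;
select somecol from sometable order by somecol desc limit 1;
-- and for min():
select somecol from sometable order by somecol asc limit 1;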



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] COUNT Pagination

2004-01-13 Thread scott.marlowe
On Tue, 13 Jan 2004, David Shadovitz wrote:

  We avert the subsequent execution of count(*) by passing the
  value of count(*) as a query parameter through the link in page
  numbers.
 
 Mallah, and others who mentioned caching the record count:
 
 Yes, I will certainly do this.  I can detect whether the query's filter has 
 been changed, or whether the user is merely paging through the results or 
 sorting* the results.
 
 I'd love to completely eliminate the cost of the COUNT(*) query, but I guess 
 that I cannot have everything.
 
 * My HTML table column headers are hyperlinks which re-execute the query, 
 sorting the results by the selected column.  The first click does an ASC 
 sort; a second click does a DESC sort.

another useful trick is to have your script save out the count(*) result 
in a single row table with a timestamp, and every time you grab it, check 
to see if x number of minutes have passed, and if so, update that row with 
a count(*).  You can even have a cron job do it so your own scripts don't 
incur the cost of the count(*) and delay output to the user.
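
A bare-bones sketch of that (all names made up; the update is what the 
cron job runs):

create table row_count_cache (
    tablename text primary key,
    row_count bigint,
    updated   timestamp
);
insert into row_count_cache values ('bigtable', 0, now());
-- cron job, every x minutes:
update row_count_cache
   set row_count = (select count(*) from bigtable), updated = now()
 where tablename = 'bigtable';
-- the page then just reads:
select row_count, updated from row_count_cache where tablename = 'bigtable';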



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Index creation

2004-01-07 Thread scott.marlowe
On Wed, 7 Jan 2004, Eric Jain wrote:

 Any tips for speeding up index creation?
 
 I need to bulk load a large table with 100M rows and several indexes,
 some of which span two columns.
 
 By dropping all indexes prior to issuing the 'copy from' command, the
 operation completes 10x as fast (1.5h vs 15h).
 
 Unfortunately, recreating a single index takes nearly as long as loading
 all of the data into the table; this more or less eliminates the time
 gained by dropping the index in the first place.
 
 Also, there doesn't seem to be a simple way to disable/recreate all
 indexes for a specific table short of explicitely dropping and later
 recreating each index?

Note that you can issue the following command to see all the index 
definitions for a table:

select * from pg_indexes where tablename='sometable';

And store those elsewhere to be reused when you need to recreate the 
index.
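
So a bulk load could look roughly like this (table, index, and file names 
are all made up):

-- save the definitions first:
select indexname, indexdef from pg_indexes where tablename = 'bigtable';
-- drop them, load the data, then recreate from the saved definitions:
drop index bigtable_a_idx;
drop index bigtable_a_b_idx;
copy bigtable from '/tmp/bigtable.copy';
create index bigtable_a_idx on bigtable (a);
create index bigtable_a_b_idx on bigtable (a, b);
analyze bigtable;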


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] general peformance question

2003-12-22 Thread scott.marlowe
On Thu, 18 Dec 2003, Conny Thimren wrote:

 Hi,
 This a kind of newbie-question. I've been using Postgres for a long time in a low 
 transction environment - and it is great.
 
 Now I've got an inquiry for using Postgresql in a heavy-load on-line system. This 
 system must handle something like 20 questions per sec with a response time at 1/10 
 sec. Each question will result in approx 5-6 reads and a couple of updates.
 Anybody have a feeling if this is realistic on a Intelbased Linux server with 
 Postgresql. Ofcourse I know that this is too little info for an exact answer but - 
 as I said - maybe someone can give a hint if it's possible. Maybe someone with 
 heavy-load can give an example of what is possible...

That really depends on how heavy each query is, so it's hard to say from 
what little you've given us.

If you are doing simple banking style transactions, then you can easily 
handle this load; if you are talking a simple shopping cart, ditto.  If, 
however, you are talking about queries that run 4 or 5 tables with 
millions of rows against each other, you're gonna have to test it 
yourself.

With the autovacuum daemon running, I ran a test overnight of pgbench 
(more for general purpose burn in than anything else) 

pgbench -i -s 100
pgbench -c 50 -t 25

that's 10 million transactions, and it took just over twelve hours to 
complete at 220+ transactions per second.

so, for financials, you're likely to find it easy to meet your target.  
But as the tables get bigger / more complex / more interconnected you'll 
see a drop in performance.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] [ADMIN] ODBC Driver generates a too big windows swap file and

2003-12-12 Thread scott.marlowe
On Fri, 12 Dec 2003, Rhaoni Chiu Pereira wrote:


Hi, is there a switch in your pgsql/odbc connector to enable cursors?  If 
so, try turning that on.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] tuning questions

2003-12-04 Thread scott.marlowe
On Thu, 4 Dec 2003, Jack Coates wrote:

 Another problem is that performance of the 6 million row job is decent
 if I stop the job and run a vacuumdb --analyze before letting it
 continue; is this something that 7.4 will help with? vacuumb --analyze
 doesn't seem to have much effect on the 18 million row job.

Just to add to what the others have said here, you probably want to run 
the pg_autovacuum daemon in the background.  It comes with 7.4 but will 
work fine with 7.3.  



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] tuning questions

2003-12-04 Thread scott.marlowe
On Thu, 4 Dec 2003, Jack Coates wrote:

 On Thu, 2003-12-04 at 12:27, Richard Huxton wrote:
  On Thursday 04 December 2003 19:50, Jack Coates wrote:
  
   I'm trying to set Postgres's shared memory usage in a fashion that
   allows it to return requested results quickly. Unfortunately, none of
   these changes allow PG to use more than a little under 300M RAM.
   vacuumdb --analyze is now taking an inordinate amount of time as well
   (40 minutes and counting), so that change needs to be rolled back.
  
  You don't want PG to use all your RAM, it's designed to let the underlying OS 
  do a lot of caching for it. Probably worth having a look at vmstat/iostat and 
  see if it's saturating on I/O.
 
 latest changes:
 shared_buffers = 35642
 max_fsm_relations = 1000
 max_fsm_pages = 1
 wal_buffers = 64
 sort_mem = 32768
 vacuum_mem = 32768
 effective_cache_size = 1
 
 /proc/sys/kernel/shmmax = 5
 
 IO is active, but hardly saturated. CPU load is hefty though, load
 average is at 4 now.

Postgresql is busily managing a far too large shared buffer.  Let the 
kernel do that.  Postgresql's shared buffers should be big enough to hold 
as much of the current working set as it can, up to about 25% or so of the 
server's memory, or 512Meg, whichever comes first.  Unless a single query 
will actually use all of the buffer at once, you're not likely to see an 
improvement.

Also, your effective cache size is really small.  On a typical Postgresql 
server with 2 gigs of ram, you'll have about 1 to 1.5 gigs as kernel cache 
and buffer, and if it's dedicated to postgresql, then the effective cache 
setting for 1 gig would be 131072 (assuming 8k pages).

If you're updating a lot of tuples without vacuums, you'll likely want to 
up your fsm settings.

Note you can change things like sort_mem, effective_cache_size and 
random_page_cost on the fly (but not buffers, they're allocated at 
startup, nor fsm, they are as well.)

so, if you're gonna have one huge honkin query that needs to sort a 
hundred megs at a time, but you'd rather not up your sort memory that high 
(sort mem is PER SORT, not per backend or per database, so it can get out 
of hand quickly) then you can just 

set sort_mem=128000;

before throwing out the big queries that need all the sort.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Value of Quad vs. Dual Processor machine

2003-11-12 Thread scott.marlowe
As another post pointed out, you need to set cflags to get optimization 
under Solaris on that flavor of Postgresql.

Also, Postgresql tends to get its best performance from the free unixes, 
Linux and BSD.  those are available for Sun Sparcs, but postgresql in 64 
bit mode on those boxes is still a bit cutting edge.

It might be worth a try to set up the sun to dual boot to either BSD or 
Linux and test Postgresql under that environment to see how it works and 
compare it to Sun after you've set the cflags and recompiled.

On Tue, 11 Nov 2003, Chris Field wrote:

 we are looking at Xeon, We are currently running it on a quad sun v880
 compiled to be 64bit and have been getting dreadful performance.  I don't
 think we really have much to gain from going 64bit.
 
 
 - Original Message - 
 From: Ron Johnson [EMAIL PROTECTED]
 To: PgSQL Performance ML [EMAIL PROTECTED]
 Sent: Tuesday, November 11, 2003 8:24 PM
 Subject: Re: [PERFORM] Value of Quad vs. Dual Processor machine
 
 
  On Tue, 2003-11-11 at 17:32, Chris Field wrote:
   We are getting ready to spec out a new machine and are wondering about
   the wisdom of buying a quad versus a dual processor machine.  Seing as
   how postgres in not a threaded application, and this server will only be
   used for log/transaction analysis (it will only ever have a few large
   queries running).  Is there any performance to be gained, and if so is
   it worth the large cost?  Any thoughts/experience are much
   appreciated...
 
  Xeon or Opteron?  The faster Opterons *really* blaze, especially
  in 64-bit mode.  As others have said, though, RAM and I/O are most
  important.
 
  -- 
  -
  Ron Johnson, Jr. [EMAIL PROTECTED]
  Jefferson, LA USA
 
  As I like to joke, I may have invented it, but Microsoft made it
  popular
  David Bradley, regarding Ctrl-Alt-Del
 
 
  ---(end of broadcast)---
  TIP 5: Have you checked our extensive FAQ?
 
 http://www.postgresql.org/docs/faqs/FAQ.html
 
 
 
 
 ---(end of broadcast)---
 TIP 7: don't forget to increase your free space map settings
 
 


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] *very* slow query to summarize data for a month ...

2003-11-11 Thread scott.marlowe
On 11 Nov 2003, Greg Stark wrote:

 Marc G. Fournier [EMAIL PROTECTED] writes:
 
  On Tue, 11 Nov 2003, Greg Stark wrote:
  
   Actually you might be able to get the same effect using function indexes
   like:
  
   create index i on traffic_log (month_trunc(runtime), company_id)
  
  had actually thought of that one ... is it something that is only
  available in v7.4?
 
 Hum, I thought you could do simple functional indexes like that in 7.3, but
 perhaps only single-column indexes.
 
 In any case, given your situation I would seriously consider putting a
 month integer column on your table anyways. Then your index would be a
 simple (month, company_id) index.

In 7.3 and before, you had to use only column names as inputs, so you 
could cheat:

alter table test add alp int;
alter table test add omg int;
update test set alp=0;
update test set omg=13;

and then create a functional index:

create index test_xy on test (substr(info,alp,omg));

select * from test where substr(info,alp,omg)='abcd';





---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Value of Quad vs. Dual Processor machine

2003-11-11 Thread scott.marlowe
On Tue, 11 Nov 2003, Rod Taylor wrote:

 On Tue, 2003-11-11 at 18:32, Chris Field wrote:
  We are getting ready to spec out a new machine and are wondering about
  the wisdom of buying a quad versus a dual processor machine.  Seing as
  how postgres in not a threaded application, and this server will only be
  used for log/transaction analysis (it will only ever have a few large
  queries running).  Is there any performance to be gained, and if so is
  it worth the large cost?  Any thoughts/experience are much
  appreciated...
 
 Since you're asking the question, I'll assume you don't have CPU
 intensive queries or monstrous loads.
 
 I'd probably invest in a Quad system with 2 chips in it (2 empty
 sockets) and put the difference in funds into a few extra GB of Ram or
 improved IO.
 
 In 6 months or a year, if you start doing longer or more complex
 queries, toss in the other 2 chips. So long as you don't hit a memory
 limit, it'll be fine.

Note that you want to carefully look at the difference in cost of the 
motherboard versus the CPUs.  It's often the motherboard that raises the 
cost, not the CPUs so much.  Although with Xeons, the CPUs are not cheap.

The second issue is that Intel (and AMD probably) only guarantee proper 
performance from chips in the same batch, so you may wind up replacing the 
two working CPUs with two new ones to go with the other two you'll be 
buying, to make sure that they work together.

My guess is that more CPUs aren't gonna help this problem a lot, so look 
more at fast RAM and lots of it, as well as a fast I/O subsystem.

2 CPUs should be plenty.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] vacuum locking

2003-10-30 Thread scott.marlowe
On Wed, 29 Oct 2003, Rob Nagler wrote:

 Greg Stark writes:
SELECT a, (SELECT name FROM t2 WHERE t2.f2 = t1.f2)
  FROM t1
 GROUP BY f2 
   
   This doesn't solve the problem.  It's the GROUP BY that is doing the
   wrong thing.  It's grouping, then aggregating.
  
  But at least in the form above it will consider using an index on f2, and it
  will consider using indexes on t1 and t2 to do the join.
 
 There are 20 rows in t2, so an index actually slows down the join.
 I had to drop the index on t1.f2, because it was trying to use it
 instead of simply sorting 20 rows.

t2 was 'vacuum full'ed and analyzed, right?  Just guessing.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] vacuum locking

2003-10-30 Thread scott.marlowe
On Thu, 30 Oct 2003, Rob Nagler wrote:

 scott.marlowe writes:
  t2 was 'vacuum full'ed and analyzed, right?  Just guessing.
 
 Fresh import.  I've been told this includes a ANALYZE.

You should probably run analyze by hand just to be sure.  If the planner 
is using an index scan on a table with 20 rows, then it's likely it has 
the default statistics for the table, not real ones.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL 7.4beta5 vs MySQL 4.0.16 with

2003-10-29 Thread scott.marlowe
On Thu, 30 Oct 2003 [EMAIL PROTECTED] wrote:

  So its not just PostgreSQL that is suffering from the bad SQL but MySQL also.  
  But the
  question is my does PostgreSQL suffer so badly ??  I think not all developers 
  write very nice
  SQLs.
 
  Its really sad to see that a fine peice of work (RT) is performing sub-optimal 
  becoz of
  malformed SQLs.  [ specially on database of my choice ;-) ]
 
  Post EXPLAIN ANALYZES of the queries you're running, then maybe you'll be able to 
  get some
  useful help from this list.  Until then, it's very hard to speculate as to why 
  PostgreSQL is
  slower.  -sc
 
 Here It is:
 
 in case they are illegeble please lemme know i will attach it as .txt
 files.
 
 Slower One:
 
 explain analyze SELECT DISTINCT main.* FROM Groups main , Principals Principals_1, 
 ACL ACL_2 
 WHERE ((ACL_2.RightName = 'OwnTicket')OR(ACL_2.RightName = 'SuperUser')) AND  ( (   
 ACL_2.PrincipalId = Principals_1.id AND ACL_2.PrincipalType = 'Group' AND (   
 main.Domain =
 'SystemInternal' OR main.Domain = 'UserDefined' OR main.Domain = 'ACLEquivalence') 
 AND main.id =
 Principals_1.id) OR ( ( (main.Domain = 'RT::Queue-Role' AND main.Instance = 25)  OR 
 ( main.Domain
 = 'RT::Ticket-Role' AND main.Instance = 6973)  )  AND main.Type = 
 ACL_2.PrincipalType AND main.id
 = Principals_1.id) ) AND (ACL_2.ObjectType = 'RT::System'  OR (ACL_2.ObjectType = 
 'RT::Queue' AND
 ACL_2.ObjectId = 25) )  ORDER BY main.Name ASC ;

Note here:

Merge Join  
(cost=1788.68..4735.71 rows=1 width=85) 
(actual time=597.540..1340.526 rows=20153 loops=1)
Merge Cond: (outer.id = inner.id)

This estimate is WAY off.  Are both of those fields indexed and analyzed?  
Have you tried upping the statistics target on those two fields?
I assume they are compatible types.

You might try 'set enable_mergejoin = false' and see if it does something 
faster here.  Just a guess.
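
Something along these lines (column names are guessed from the plan, and 
the stats target is just a starting point):

-- raise the stats targets on the join columns, then re-analyze:
alter table Groups alter column id set statistics 100;
alter table Principals alter column id set statistics 100;
analyze Groups;
analyze Principals;
-- and, purely as a test, see what plan you get without merge joins:
set enable_mergejoin = false;
-- then re-run the explain analyze above and compare the times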


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] My own performance/tuning qa

2003-10-24 Thread scott.marlowe
On Fri, 24 Oct 2003, Vivek Khera wrote:

  AL == Allen Landsidel [EMAIL PROTECTED] writes:
 
 AL Well I had the vacuums running every 15 minutes for a while.. via a
 AL simple cron script I wrote just to make sure no more than one vacuum
 AL ran at once, and to 'nice' the job.. but performance on the db does
 
 nice-ing the client does nothing for the backend server that does
 the actual work.  You need to track down the PID of the backend server
 running the vacuum and renice *it* to get any effect.

Note that Tom has mentioned problems with possible deadlocks when nicing 
individual backends before, so proceed with caution here.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Tuning for mid-size server

2003-10-21 Thread scott.marlowe
On Tue, 21 Oct 2003, Josh Berkus wrote:

 Anjan,
 
  Pretty soon, a PowerEdge 6650 with 4 x 2Ghz XEONs, and 8GB Memory, with
  internal drives on RAID5 will be delivered. Postgres will be from RH8.0.
 
 How many drives?   RAID5 sucks for heavy read-write databases, unless you have 
 5+ drives.  Or a large battery-backed cache.

You don't need a large cache, so much as a cache.  The size isn't usually 
an issue now that 64 to 256 megs caches are the nominal cache sizes.  Back 
when it was a choice of 4 or 8 megs it made a much bigger difference than 
64 versus 256 meg make today.

Also, if it's a read only environment, RAID5 with n drives equals the 
performance of RAID0 with n-1 drives.

 Also, last I checked, you can't address 8GB of RAM without a 64-bit processor.  
 Since when are the Xeons 64-bit?

Josh, you gotta get out more.  IA32 has supported more than 4 gigs of ram 
for a long time now, and so has the linux kernel.  It uses a paging method 
to do it.  
Individual processes are still limited to ~3 gig on Linux on 32 bit 
hardware though, so the extra mem will almost certainly spend it's time as 
kernel cache.



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Tuning for mid-size server

2003-10-21 Thread scott.marlowe
On Tue, 21 Oct 2003, Josh Berkus wrote:

 Scott,
 
  Also, if it's a read only environment, RAID5 with n drives equals the
  performance of RAID0 with n-1 drives.
 
 True.
 
  Josh, you gotta get out more.  IA32 has supported 4 gig ram for a long
  time now, and so has the linux kernel.  It uses a paging method to do it.
  Individual processes are still limited to ~3 gig on Linux on 32 bit
  hardware though, so the extra mem will almost certainly spend it's time as
  kernel cache.
 
 Not that you'd want a single process to grow that large anyway.

True :-)  Especially a pgsql backend.

 So what is the ceiling on 32-bit processors for RAM? Most of the 64-bit 
 vendors are pushing Athalon64 and G5 as breaking the 4GB barrier, and even 
 I can do the math on 2^32.   All these 64-bit vendors, then, are talking 
 about the limit on ram *per application* and not per machine?

I think it's 64 gigs in the current implementation, but that could just be 
a chip set thing, i.e. the theoretical limit is probably 2^63 or 2^64, but 
the realistic limitation is that the current mobo chipsets are gonna have 
a much lower limit, and I seem to recall that being 64 gig last I looked.



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Tuning for mid-size server

2003-10-21 Thread scott.marlowe
On Tue, 21 Oct 2003, Josh Berkus wrote:

 Anjan,
 
  From what I know, there is a cache-row-set functionality that doesn't
  exist with the newer postgres...
 
 What?  PostgreSQL has always used the kernel cache for queries.
 
  Concurrent users will start from 1 to a high of 5000 or more, and could
  ramp up rapidly. So far, with increased users, we have gone up to
  starting the JVM (resin startup) with 1024megs min and max (recommended
  by Sun) - on the app side.
 
 Well, just keep in mind when tuning that your calculations should be based on 
 *available* RAM, meaning RAM not used by Apache or the JVM.
 
 With that many concurrent requests, you'll want to be *very* conservative with 
 sort_mem; I might stick to the default of 1024 if I were you, or even lower 
 it to 512k.

Exactly.  Remember, Anjan, that that if you have a single sort that can't 
fit in RAM, it will use the hard drive for temp space, effectively 
swapping on its own.  If the concurrent sorts run the server out of 
memory, the server will start swapping processes, quite possibly the sorts, 
in a sort of hideous round robin death spiral that will bring your machine 
to its knees at the worst possible time, midday under load.  sort_mem is 
one of the small foot guns in the postgresql.conf file that people tend 
to pick up and go "huh, what's this do?" right before cranking it up.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] RAID controllers etc... was: PostgreSQL data on a NAS device ?

2003-10-21 Thread scott.marlowe
On 21 Oct 2003, Will LaShell wrote:

 On Tue, 2003-10-21 at 08:40, scott.marlowe wrote:
 SNIP
  So that brings up my question, which is better, the Perc4 or Perc3 
  controllers, and what's the difference between them?  I find Dell's 
  tendency to hide other people's hardware behind their own model numbers 
  mildly bothersome, as it makes it hard to comparison shop.
 
 Perc4 has n LSI 1030 chip
 http://docs.us.dell.com/docs/storage/perc4di/en/ug/features.htm
 
 
 Perc3
 depending on the model can be a couple of things but I think they are
 all U160 controllers and not U320

Thanks.  I googled around and found this page:

http://www.domsch.com/linux/

Which says what each model is.  It looks like the RAID controller they 
wanna charge me for is about $500 or so, so I'm guessing it's the medium 
range Elite 1600 type controller, i.e. U160, which is plenty for the 
machine / drive number we'll be using.  

Has anyone played around with the latest ones to get a feel for them?  I 
want a battery backed controller that runs well under linux and also BSD 
that isn't gonna break the bank.  I'd heard bad stories about the 
performance of the Adaptec RAID controllers, but it seems the newer ones 
aren't bad from what I've found googling.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Tuning for mid-size server

2003-10-21 Thread scott.marlowe
On Tue, 21 Oct 2003, Andrew Sullivan wrote:

 On Tue, Oct 21, 2003 at 11:51:02AM -0700, Josh Berkus wrote:
 
  Of course, if you have anecdotal evidence to the contrary, then the
  only way to work this would be to have OSDL help us sort it out.
 
 Nope.  I too have such anecdotal evidence that 25% is way too high. 
 It also seems to depend pretty heavily on what you're trying to
 optimise for and what platform you have.  But I'm glad to hear
 (again) that people seem to think the 25% too high for most cases.  I
 don't feel so much like I'm tilting against windmills.

I think where it makes sense is when you have something like a report 
server where the result sets may be huge, but the parellel load is load, 
i.e. 5 or 10 users tossing around 100 Meg or more at time.

If you've got 5,000 users running queries that are indexed and won't be 
using that much memory each, then there's usually no advantage to going 
over a certain number of buffers, and that certain number may be as low 
as 1000 for some applications.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL data on a NAS device ?

2003-10-20 Thread scott.marlowe
On Mon, 20 Oct 2003, Alexander Priem wrote:

 Hi all,
 
 Does anyone have any experience with putting PostgreSQL data on a NAS
 device?
 
 I am asking this because a NAS device is much cheaper to set up than a
 couple of SCSI disks. I would like to use a relatively cheap NAS device
 which uses four IDE drives (7.200 rpm), like the Dell PowerVault 725N. The
 disks themselves would be much slower than SCSI disks, I know, but a NAS
 device can be equipped with 3 Gb of memory, so this would make a very large
 disk cache, right? If this NAS would be dedicated only to PostgreSQL, would
 this be slower/faster than a SCSI RAID-10 setup of 6 disks? It would be much
 cheaper...
 
 Any advice on this would be appreciated   :)

How important is this data?

With a local SCSI RAID controller and SCSI drives, you can pull the power 
cord out the back of the machine during 1000 transactions, and your 
database will come back up in a coherent state.

If you need that kind  of reliability, then you'll likely want to use 
local SCSI drives.

Note that you should test your setup to be sure, i.e. pull the network 
cord and see how the machine recovers (if the machine recovers).

Running storage on a NAS is a bit of a tightrope act with your data, as is 
using IDE drives with write cache enabled.  But depending on your 
application, using NAS may be a good solution.  So, what's this database 
gonna be used for?


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] further testing on IDE drives

2003-10-14 Thread scott.marlowe
On Tue, 14 Oct 2003, Tom Lane wrote:

 scott.marlowe [EMAIL PROTECTED] writes:
  open_sync was WAY faster at this than the other two methods.
 
 Do you not have open_datasync?  That's the preferred method if
 available.

Nope, when I try to start postgresql with it set to that, I get this error 
message:

FATAL:  invalid value for wal_sync_method: open_datasync

This is on RedHat 9, but I have the same problem on a RH 7.2 box as well.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] [PERFORM] sql performance and cache

2003-10-14 Thread scott.marlowe
On Tue, 14 Oct 2003, Wei Weng wrote:

 On Sat, 11 Oct 2003, Christopher Kings-Lynne wrote:
 
  
   I have two very similar queries which I need to execute. They both have
   exactly the same from / where conditions. When I execute the first, it takes
   about 16 seconds. The second is executed almost immediately after, it takes
   13 seconds. In short, I'd like to know why the query result isn't being
   cached and any ideas on how to improve the execution.
  
  snip
  
   OK - so I could execute the query once, and get the maximum size of the
   array and the result set in one. I know what I am doing is less than optimal
   but I had expected the query results to be cached. So the second execution
   would be very quick. So why aren't they ? I have increased my cache size -
   shared_buffers is 2000 and I have doubled the default max_fsm... settings
   (although I am not sure what they do). sort_mem is 8192.
  
  PostgreSQL does not have, and has never had a query cache - so nothing 
  you do is going to make that second query faster.
  
  Perhaps you are confusing it with the MySQL query cache?
  
  Chris
  
 Is there plan on developing one (query cache)?

Not really, Postgresql's design makes it a bit of a non-winner.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] further testing on IDE drives

2003-10-10 Thread scott.marlowe
On Thu, 9 Oct 2003, Bruce Momjian wrote:

 scott.marlowe wrote:
  I was testing to get some idea of how to speed up the speed of pgbench 
  with IDE drives and the write caching turned off in Linux (i.e. hdparm -W0 
  /dev/hdx).
  
  The only parameter that seems to make a noticeable difference was setting 
  wal_sync_method = open_sync.  With it set to either fsync, or fdatasync, 
  the speed with pgbench -c 5 -t 1000 ran from 11 to 17 tps.  With open_sync 
  it jumped to the range of 45 to 52 tps.  with write cache on I was getting 
  280 to 320 tps.  so, now instead of being 20 to 30 times slower, I'm only 
  about 5 times slower, much better.
  
  Now I'm off to start a pgbench -c 10 -t 1 and pull the power cord 
  and see if the data gets corrupted with write caching turned on, i.e. do 
  my hard drives have the ability to write at least some of their cache 
  during spin down.
 
 Is this a reason we should switch to open_sync as a default, if it is
 availble, rather than fsync?  I think we are doing a single write before
 fsync a lot more often than we are doing multiple writes before fsync.

Sounds reasonable to me.  Are there many / any scenarios where a plain 
fsync would be faster than open_sync?


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] further testing on IDE drives

2003-10-10 Thread scott.marlowe
On Fri, 10 Oct 2003, Josh Berkus wrote:

 Bruce,
 
  Yes.  If you were doing multiple WAL writes before transaction fsync,
  you would be fsyncing every write, rather than doing two writes and
  fsync'ing them both.  I wonder if larger transactions would find
  open_sync slower?
 
 Want me to test?   I've got an ide-based test machine here, and the TPCC 
 databases.

Just make sure the drive's write cache is disabled.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] further testing on IDE drives

2003-10-10 Thread scott.marlowe
On Fri, 10 Oct 2003, Josh Berkus wrote:

 Bruce,
 
  Yes.  If you were doing multiple WAL writes before transaction fsync,
  you would be fsyncing every write, rather than doing two writes and
  fsync'ing them both.  I wonder if larger transactions would find
  open_sync slower?
 
 Want me to test?   I've got an ide-based test machine here, and the TPCC 
 databases.

OK, I decided to do a quick and dirty test of things that are big transactions 
in each mode my kernel supports.  I did this:

createdb dbname
time pg_dump -O -h otherserver dbname|psql dbname

then I would drop the db, edit postgresql.conf, and restart the server.

open_sync was WAY faster at this than the other two methods.

open_sync:

1st run:

real    11m27.107s
user    0m26.570s
sys     0m1.150s

2nd run:

real    6m5.712s
user    0m26.700s
sys     0m1.700s

fsync:

1st run:

real    15m8.127s
user    0m26.710s
sys     0m0.990s

2nd run:

real    15m8.396s
user    0m26.990s
sys     0m1.870s

fdatasync:

1st run:

real    15m47.878s
user    0m26.570s
sys     0m1.480s

2nd run:

real    15m9.402s
user    0m27.000s
sys     0m1.660s

I did the first runs in order, then started over, i.e. opensync run1, 
fsync run1, fdatasync run1, opensync run2, etc...

The machine I was restoring to was under no other load.  The machine I was 
reading from had little or no load, but is a production server, so it's 
possible the load there could have had a small effect, but probably not 
this big of a one.

The machine this is on is set up so that the data partition is on a drive 
with write cache enabled, but the pg_xlog and pg_clog directories are on a 
drive with write cache disabled.  Same drive models as listed before in my 
previous test, Seagate generic 80gig IDE drives, model ST380023A.


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Linux filesystem shootout

2003-10-09 Thread scott.marlowe
On Thu, 9 Oct 2003, Shridhar Daithankar wrote:

 Kaarel wrote:
 http://www.ussg.iu.edu/hypermail/linux/kernel/0310.1/0208.html
 
 Shridhar
 
 
  I feel incompetent when it comes to file systems. Yet everybody would like to 
  have the best file system if given the choice...so do I :) Here I am looking at 
  those tables seeing JFS having more green cells than others. The more green the 
  better right? So based on these tests JFS ought to be the one?
 
 Yes and no. Yes for the results. No for the tests that weren't run.
 
 Database load is quite different. Its mixture of read and write load with a 
 dynamics varying from one extreme to other, between these two.
 
 All it says that if you want to choose a good file system for postgresql, look 
 at JFS first..:-)
 
   Besides, all the tests were done on files bigger than 1GB. If single file 
 size is restricted to 1GB, it might produce a different result set. And 
 postgresql does not exceed the 1GB limit per file.
 
 So still, quite a few unknowns there..

Absolutely.  For instance, one file system may be faster on a RAID card 
with battery backed cache, while another may be faster on an IDE drive 
with write cache disabled, while another may be faster on software RAID1, 
while another might be faster on software RAID5.

If you haven't tested different file systems on your setup, you don't 
really know which will be faster until you do.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] PostgreSQL vs MySQL

2003-10-09 Thread scott.marlowe
On Thu, 9 Oct 2003, Jeff wrote:

 On Thu, 9 Oct 2003, David Griffiths wrote:
 
  1) the MySQL docs are better (sorry - I found them easier to read, and
  more comprehensive; I had an easier time finding the answers I needed)
 
 Huh. I had the opposite experience. Each to his own.
 I think everybody agrees PG needs a better tuning doc (or pointers to it,
 or something).

I think the issue is that Postgresql documentation is oriented towards DBA 
types, who already understand databases in general, so they can find what 
they want, while MySQL docs are oriented towards dbms newbies, who don't 
know much, if anything, about databases.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Shopping for hardware

2003-10-06 Thread scott.marlowe
On Mon, 6 Oct 2003, Jason Hihn wrote:

 Ok, I asked this on [novice], but I was told it's be better to post it
 here...
 
 I've got some money to spend on a new servers. The biggest concern is the
 PostgreSQL database server that will be the company. (*Everyone* uses the
 database server in some form or another) I'm looking for hot-swappable RAID
 1 on a Linux platform at the least. Are there any vendors to avoid or
 prefer? What works best? Am I better off going with a DIY or getting
 something pre-packaged?

Depends on your hardware expertise.  You can do quite well either way.  I 
prefer adding my own components to a pre-built vanilla server.

 In terms of numbers, we expect have an average of 100 active connections
 (most of which are idle 9/10ths of the time), with about 85% reading
 traffic.  I hope to have one server host about 1000-2000 active databases,
 with the largest being about 60 meg (no blobs). Inactive databases will only
 be for reading (archival) purposes, and will seldom be accessed. (I could
 probably move them off to another server with a r/o disk...)

That's not a really big load, but I'm guessing the peaks will be big 
enough to notice.

 Does any of this represent a problem for Postgres? The datasets are
 typically not that large, only a few queries on a few databases ever return
 over 1000 rows.

Nah, this is pretty normal stuff for Postgresql or any other database in 
its approximate class (Sybase, Oracle, Informix, DB2, MSSQL2k).

 The configuration that is going on in my head is:
 RAID 1, 200gig disks
 1 server, 4g ram
 Linux 2.4 or 2.6 (depends on when we deploy and 2.6's track record at that
 time)

That's a good starting point.  I'd avoid 2.6 until it's had time for the 
bugs to drop out.  The latest 2.4 kernels are pretty stable.

List of things to include if you need more performance, in order of 
priority:

proper tuning of the postgresql.conf file (see 
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html)
hardware RAID card with battery backed cache, the bigger the cache the 
better.
more drives for RAID 1+0
faster CPUs.  

since you've already got 4 gigs of RAM slated, you're set there on linux, 
where having more won't likely help a lot unless you go to a 64 bit 
platform.

 I want something that can do hot-swaps and auto-mirroring after swap.
 Unfortunately, this is a new area for me. (I normally stick to S/W for
 non-high end systems)

The LSI/Megaraid cards can handle hot swaps quite well; make sure you get 
the right kind of hot swap shoes so they isolate the drive from the bus 
when you turn it off and they don't lock up your scsi bus.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] TPC-R benchmarks

2003-10-02 Thread scott.marlowe
Have you tried increasing the statistics target for those columns that are 
getting bad estimates, then turning enable_nestloop back on, rerunning 
analyze, and seeing how the query does?  

The idea being to try and get a good enough estimate of your statistics so 
the planner stops using nestloops on its own rather than forcing it to 
with enable_nestloop = false.
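
For example, a rough sketch of that sequence (the 100 is just a starting 
point for the target; the columns are the ones mentioned in the quoted 
message below):

ALTER TABLE part ALTER COLUMN p_partkey SET STATISTICS 100;
ALTER TABLE partsupp ALTER COLUMN ps_partkey SET STATISTICS 100;
ANALYZE part;
ANALYZE partsupp;
SET enable_nestloop = true;
-- then re-run the EXPLAIN ANALYZE and check whether the row estimates
-- now line up with the actual counts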

On Thu, 2 Oct 2003, Oleg Lebedev wrote:

 As Scott recommended, I did the following:
 # set enable_nestloop = false;
 # vacuum full analyze;
 
 After this I re-ran the query and its execution time went down from 2
 hours to 2 minutes. I attached the new query plan to this posting.
 Is there any way to optimize it even further?
 What should I do to make this query run fast without hurting the
 performance of the other queries?
 Thanks.
 
 Oleg
 
 -Original Message-
 From: scott.marlowe [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, October 01, 2003 4:00 PM
 To: Oleg Lebedev
 Cc: Josh Berkus; [EMAIL PROTECTED]
 Subject: Re: [PERFORM] TPC-R benchmarks
 
 
 For troubleshooting, can you try it with set enable_nestloop = false
 and 
 rerun the query and see how long it takes?  
 
 It looks like the estimates of rows returned is WAY off (estimate is too
 
 low compared to what really comes back.)
 
 Also, you might try to alter the table.column to have a higher target on
 
 the rows p_partkey and ps_partkey and any others where the estimate is
 so 
 far off of the reality.
 
 On Wed, 1 Oct 2003, Oleg Lebedev wrote:
 
  All right, my query just finished running with EXPLAIN ANALYZE. I show
 
  the plan below and also attached it as a file. Any ideas?
  
 -  Sort  (cost=54597.49..54597.50 rows=1 width=121) (actual 
  time=6674562.03..6674562.15 rows=175 loops=1)
   Sort Key: nation.n_name, date_part('year'::text,
  orders.o_orderdate)
   -  Aggregate  (cost=54597.45..54597.48 rows=1 width=121) 
  (actual time=6668919.41..6674522.48 rows=175 loops=1)
 -  Group  (cost=54597.45..54597.47 rows=3 width=121) 
  (actual time=6668872.68..6672136.96 rows=348760 loops=1)
   -  Sort  (cost=54597.45..54597.46 rows=3
  width=121) (actual time=6668872.65..6669499.95 rows=348760 loops=1)
 Sort Key: nation.n_name, 
  date_part('year'::text, orders.o_orderdate)
 -  Hash Join  (cost=54596.00..54597.42 
  rows=3
  width=121) (actual time=6632768.89..6650192.67 rows=348760 loops=1)
   Hash Cond: (outer.n_nationkey =
  inner.s_nationkey)
   -  Seq Scan on nation 
  (cost=0.00..1.25 rows=25 width=33) (actual time=6.75..7.13 rows=25
  loops=1)
   -  Hash  (cost=54596.00..54596.00 
  rows=3
  width=88) (actual time=6632671.96..6632671.96 rows=0 loops=1)
 -  Nested Loop 
  (cost=0.00..54596.00 rows=3 width=88) (actual time=482.41..6630601.46 
  rows=348760 loops=1)
   Join Filter: 
  (inner.s_suppkey = outer.l_suppkey)
   -  Nested Loop 
  (cost=0.00..54586.18 rows=3 width=80) (actual time=383.87..6594984.40 
  rows=348760 loops=1)
 -  Nested Loop 
  (cost=0.00..54575.47 rows=4 width=68) (actual time=199.95..3580882.07 
  rows=348760 loops=1)
   Join Filter: 
  (outer.p_partkey = inner.ps_partkey)
   -  Nested 
  Loop (cost=0.00..22753.33 rows=9343 width=49) (actual 
  time=146.85..3541433.10 rows=348760 loops=1)
 -  Seq
 
  Scan on part  (cost=0.00..7868.00 rows=320 width=4) (actual 
  time=33.64..15651.90 rows=11637 loops=1)
  
  Filter: (p_name ~~ '%green%'::text)
 -  
  Index Scan using i_l_partkey on lineitem  (cost=0.00..46.15 rows=29 
  width=45) (actual time=10.71..302.67 rows=30 loops=11637)
   
  Index
  Cond: (outer.p_partkey = lineitem.l_partkey)
   -  Index 
  Scan using pk_partsupp on partsupp  (cost=0.00..3.39 rows=1 width=19) 
  (actual time=0.09..0.09 rows=1 loops=348760)
 Index
  Cond: ((partsupp.ps_partkey = outer.l_partkey) AND 
  (partsupp.ps_suppkey =
  outer.l_suppkey))
 -  Index Scan 
  using pk_orders on orders  (cost=0.00..3.01 rows=1 width=12) (actual 
  time=8.62..8.62 rows=1 loops=348760)
   Index Cond: 
  (orders.o_orderkey = outer.l_orderkey)
   -  Index Scan using 
  pk_supplier on supplier  (cost=0.00..3.01 rows=1 width=8) (actual 
  time=0.08

[PERFORM] further testing on IDE drives

2003-10-02 Thread scott.marlowe
I was testing to get some idea of how to speed up the speed of pgbench 
with IDE drives and the write caching turned off in Linux (i.e. hdparm -W0 
/dev/hdx).

The only parameter that seems to make a noticeable difference was setting 
wal_sync_method = open_sync.  With it set to either fsync, or fdatasync, 
the speed with pgbench -c 5 -t 1000 ran from 11 to 17 tps.  With open_sync 
it jumped to the range of 45 to 52 tps.  with write cache on I was getting 
280 to 320 tps.  so, now instead of being 20 to 30 times slower, I'm only 
about 5 times slower, much better.

Now I'm off to start a pgbench -c 10 -t 1 and pull the power cord 
and see if the data gets corrupted with write caching turned on, i.e. do 
my hard drives have the ability to write at least some of their cache 
during spin down.




---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] TPC-R benchmarks

2003-10-02 Thread scott.marlowe
On Thu, 2 Oct 2003, Oleg Lebedev wrote:

 I was trying to get the pg_stats information to Josh and decided to
 recreate the indexes on all my tables. After that I ran vacuum full
 analyze, re-enabled nestloop and ran explain analyze on the query. It
 ran in about 2 minutes.
 I attached the new query plan. I am not sure what did the trick, but 2
 minutes is much better than 2 hours. But then again, I can't take long
 lunches anymore :)
 Is there any way to make this query run even faster without increasing
 the memory dedicated to postgres?
 Thanks.

As long as the estimated row counts and real ones match up, and postgresql 
seems to be picking the right plan, there's probably not a lot to be done.  
You might want to look at increasing sort_mem a bit, but don't go crazy, 
as being too high can result in swap storms under load, which are a very 
bad thing.

I'd check for index growth.  You may have been reloading your data over 
and over and had an index growth problem.  Next time instead of recreating 
the indexes completely, you might wanna try reindex indexname.
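
A quick way to keep an eye on that (the index name is a placeholder, and 
relpages is only refreshed by vacuum/analyze):

SELECT relname, relpages FROM pg_class WHERE relname = 'my_index_name';
-- reload the data, vacuum analyze, and check again; if relpages keeps
-- climbing across reloads, then:
REINDEX INDEX my_index_name;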

Also, 7.4 mostly fixes the index growth issue, especially as it applies to 
truncating/reloading a table over and over, so moving to 7.4 beta3/4 and 
testing might be a good idea (if you aren't there already).

What you want to avoid is having postgresql switch back to that nestloop 
join on you in the middle of the day, and to prevent that you might need 
to have higher statistics targets so the planner gets the right number 
all the time.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Optimizing >= and <= for numbers and dates

2003-10-01 Thread scott.marlowe
On Wed, 1 Oct 2003, Dimitri Nagiev wrote:

 here goes the EXPLAIN ANALYZE output:
 
 
 template1=# VACUUM analyze mytable;
 VACUUM
 template1=# explain analyze select * from mytable where
 mydate='2003-09-01';
   QUERY PLAN

 
 ---
  Seq Scan on mytable  (cost=0.00..2209.11 rows=22274 width=562) (actual
 time=0.06..267.30 rows=22677 loops=1)
Filter: (mydate = '2003-09-01'::date)
  Total runtime: 307.71 msec
 (3 rows)

How many rows are there in this table?  If the number is only two or three 
times as many as the number of rows returned (22677) then a seq scan is 
preferable.
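
For reference, two quick ways to check the row count (reltuples is the 
planner's estimate from the last vacuum/analyze, count(*) is exact but 
does a full scan):

SELECT reltuples, relpages FROM pg_class WHERE relname = 'mytable';
SELECT count(*) FROM mytable;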

The way to tune your random_page_cost is to keep making your range more 
selective until you get an index scan.  Then, see what the difference is 
in speed between the two queries that sit on either side of that number, 
i.e. if a query that returns 1000 rows switches to index scan, and takes 
100 msec, while one that returns 1050 uses seq scan and takes 200 msec, 
then you might want to lower your random page cost.

Ideally, what should happen is that as the query returns more and more 
rows, the switch to seq scan should happen so that it's taking about the 
same amount of time as the index scan, maybe just a little more.
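
A sketch of that kind of probing on the same table (the cutoff dates are 
arbitrary; pick ones that bracket the point where the plan flips):

SET random_page_cost = 4;    -- the default
EXPLAIN ANALYZE SELECT * FROM mytable WHERE mydate >= '2003-09-25';
EXPLAIN ANALYZE SELECT * FROM mytable WHERE mydate >= '2003-09-20';
-- once you've found the range where it flips from index scan to seq scan,
-- compare the actual times on either side, then try e.g.:
SET random_page_cost = 2;
-- and repeat until the flip happens where the timings say it should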


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Optimizing >= and <= for numbers and dates

2003-10-01 Thread scott.marlowe

Oh, to followup on my previously sent post, make sure you've got 
effective_cache_size set right BEFORE you go trying to set 
random_page_cost, and you might wanna run a select * from table to load 
the table into kernel buffer cache before testing, then also test it with 
the cache cleared out (select * from a_different_really_huge_table will 
usually do that.)
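
Concretely, something along these lines (assuming a 7.3-era server where 
the planner cost settings can be changed with SET; the huge table name is 
obviously a stand-in):

SET effective_cache_size = 100000;   -- in 8k blocks, sized to your free cache/buff
SELECT count(*) FROM mytable;        -- drag the whole table through the kernel cache
EXPLAIN ANALYZE SELECT * FROM mytable WHERE mydate >= '2003-09-25';  -- warm-cache run
SELECT count(*) FROM a_different_really_huge_table;                  -- push it back out
EXPLAIN ANALYZE SELECT * FROM mytable WHERE mydate >= '2003-09-25';  -- cold-cache run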


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] TPC-R benchmarks

2003-10-01 Thread scott.marlowe
For troubleshooting, can you try it with set enable_nestloop = false and 
rerun the query and see how long it takes?  

It looks like the estimates of rows returned is WAY off (estimate is too 
low compared to what really comes back.)

Also, you might try to alter the table.column to have a higher target on 
the rows p_partkey and ps_partkey and any others where the estimate is so 
far off of the reality.

On Wed, 1 Oct 2003, Oleg Lebedev wrote:

 All right, my query just finished running with EXPLAIN ANALYZE.
 I show the plan below and also attached it as a file.
 Any ideas?
 
-  Sort  (cost=54597.49..54597.50 rows=1 width=121) (actual
 time=6674562.03..6674562.15 rows=175 loops=1)
  Sort Key: nation.n_name, date_part('year'::text,
 orders.o_orderdate)
  -  Aggregate  (cost=54597.45..54597.48 rows=1 width=121)
 (actual time=6668919.41..6674522.48 rows=175 loops=1)
-  Group  (cost=54597.45..54597.47 rows=3 width=121)
 (actual time=6668872.68..6672136.96 rows=348760 loops=1)
  -  Sort  (cost=54597.45..54597.46 rows=3
 width=121) (actual time=6668872.65..6669499.95 rows=348760 loops=1)
Sort Key: nation.n_name,
 date_part('year'::text, orders.o_orderdate)
-  Hash Join  (cost=54596.00..54597.42
 rows=3
 width=121) (actual time=6632768.89..6650192.67 rows=348760 loops=1)
  Hash Cond: (outer.n_nationkey =
 inner.s_nationkey)
  -  Seq Scan on nation
 (cost=0.00..1.25 rows=25 width=33) (actual time=6.75..7.13 rows=25
 loops=1)
  -  Hash  (cost=54596.00..54596.00
 rows=3
 width=88) (actual time=6632671.96..6632671.96 rows=0 loops=1)
-  Nested Loop
 (cost=0.00..54596.00 rows=3 width=88) (actual time=482.41..6630601.46
 rows=348760 loops=1)
  Join Filter:
 (inner.s_suppkey = outer.l_suppkey)
  -  Nested Loop
 (cost=0.00..54586.18 rows=3 width=80) (actual time=383.87..6594984.40
 rows=348760 loops=1)
-  Nested Loop
 (cost=0.00..54575.47 rows=4 width=68) (actual time=199.95..3580882.07
 rows=348760 loops=1)
  Join Filter:
 (outer.p_partkey = inner.ps_partkey)
  -  Nested Loop
 (cost=0.00..22753.33 rows=9343 width=49) (actual time=146.85..3541433.10
 rows=348760 loops=1)
-  Seq
 Scan on part  (cost=0.00..7868.00 rows=320 width=4) (actual
 time=33.64..15651.90 rows=11637 loops=1)
 
 Filter: (p_name ~~ '%green%'::text)
-  Index
 Scan using i_l_partkey on lineitem  (cost=0.00..46.15 rows=29 width=45)
 (actual time=10.71..302.67 rows=30 loops=11637)
  
 Index
 Cond: (outer.p_partkey = lineitem.l_partkey)
  -  Index Scan
 using pk_partsupp on partsupp  (cost=0.00..3.39 rows=1 width=19) (actual
 time=0.09..0.09 rows=1 loops=348760)
Index
 Cond: ((partsupp.ps_partkey = outer.l_partkey) AND
 (partsupp.ps_suppkey =
 outer.l_suppkey))
-  Index Scan using
 pk_orders on orders  (cost=0.00..3.01 rows=1 width=12) (actual
 time=8.62..8.62 rows=1 loops=348760)
  Index Cond:
 (orders.o_orderkey = outer.l_orderkey)
  -  Index Scan using
 pk_supplier on supplier  (cost=0.00..3.01 rows=1 width=8) (actual
 time=0.08..0.08 rows=1 loops=348760)
Index Cond:
 (outer.ps_suppkey = supplier.s_suppkey)  Total runtime: 6674724.23
 msec (28 rows)
 
 
 -Original Message-
 From: Oleg Lebedev 
 Sent: Wednesday, October 01, 2003 12:00 PM
 To: Josh Berkus; scott.marlowe
 Cc: [EMAIL PROTECTED]
 Subject: Re: [PERFORM] TPC-R benchmarks
 Importance: Low
 
 
 Sure, below is the query. I attached the plan to this posting.
 
 select
   nation,
   o_year,
   sum(amount) as sum_profit
 from
   (
   select
   n_name as nation,
   extract(year from o_orderdate) as o_year,
   l_extendedprice * (1 - l_discount) -
 ps_supplycost * l_quantity as amount
   from
   part,
   supplier,
   lineitem,
   partsupp,
   orders,
   nation
   where
   s_suppkey = l_suppkey
   and ps_suppkey = l_suppkey
   and ps_partkey = l_partkey

Re: [PERFORM] advice on raid controller

2003-09-29 Thread scott.marlowe
I've used the megaraid / LSI cards in the past and they were pretty good 
in terms of reliability, but the last one I used was the 328 model, from 4 
years ago or so.  that one had a battery backup option for the cache, and 
could go to 128 Meg.  We tested it with 4/16 and 128 meg ram, and it was 
about the same with each, but we didn't do heavy parallel testing either.

Here's the page on the megaraid cards at lsilogic.com:

http://www.lsilogic.com/products/stor_prod/raid/ultra320products.html

On Sun, 28 Sep 2003, Matt Clark wrote:

 As others have mentioned, you really ought to get battery-backed cache if
 you're doing any volume of writes.  The ability to do safe write-back
 caching makes an *insane* difference to write performance.
 
 The site you link to also has that for only 15% more money:
 http://uk.azzurri.com/product/product.cgi?productId=80
 
 No experience with the card(s) I'm afraid.
 
 In general though, U320 will only be faster than U160 for large sequential
 reads, or when you have silly numbers of disks on a channel (i.e. more than
 4/channel).  If you have silly numbers of disks, then RAID5 will probably be
 better, if you have 4 disks total then RAID1+0 will probably be better.  In
 between it depends on all sorts of other factors.  Bear in mind though that
 if you *do* have silly numbers of disks then more channels and more cache
 will count for more than anything else, so spend the money on that rather
 than latest-and-greatest performance for a single channel.
 
 HTH
 
 Matt
 
  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] Behalf Of Richard
  Jones
  Sent: 27 September 2003 18:25
  To: [EMAIL PROTECTED]
  Subject: [PERFORM] advice on raid controller
 
 
  Hi, i'm on the verge of buying a MegaRAID SCSI 320-2 raid controller.
  I need it to build a db server using 4x ultra320 scsi disks
  i'm thinking raid 1+0 but will try with raid5 too and compare
 
  Does anyone have any experience with this model, good or bad i'd like to
  know.. thanks :)
 
  as seen:
  http://uk.azzurri.com/product/product.cgi?productId=188
 
  Regards,
  Richard.
 
  PS: whoever mentioned starting a site with raid controller
  reviews, excellent
  idea - its hard to find decent info on which card to buy.
 
 
  ---(end of broadcast)---
  TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match
 
 
 
 ---(end of broadcast)---
 TIP 2: you can get off all lists at once with the unregister command
 (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
 


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] advice on raid controller

2003-09-29 Thread scott.marlowe
On 29 Sep 2003, Will LaShell wrote:

 On Mon, 2003-09-29 at 06:48, scott.marlowe wrote:
  I've used the megaraid / LSI cards in the past and they were pretty good 
  in terms of reliability, but the last one I used was the 328 model, from 4 
  years ago or so.  that one had a battery backup option for the cache, and 
  could go to 128 Meg.  We tested it with 4/16 and 128 meg ram, and it was 
  about the same with each, but we didn't do heavy parallel testing either.
  
  Here's the page on the megaraid cards at lsilogic.com:
  
  http://www.lsilogic.com/products/stor_prod/raid/ultra320products.html
  
  On Sun, 28 Sep 2003, Matt Clark wrote:
  
   As others have mentioned, you really ought to get battery-backed cache if
   you're doing any volume of writes.  The ability to do safe write-back
   caching makes an *insane* difference to write performance.
   
   The site you link to also has that for only 15% more money:
   http://uk.azzurri.com/product/product.cgi?productId=80
   
   No experience with the card(s) I'm afraid.
   
   In general though, U320 will only be faster than U160 for large sequential
   reads, or when you have silly numbers of disks on a channel (i.e. more than
   4/channel).  If you have silly numbers of disks, then RAID5 will probably be
   better, if you have 4 disks total then RAID1+0 will probably be better.  In
   between it depends on all sorts of other factors.  Bear in mind though that
   if you *do* have silly numbers of disks then more channels and more cache
   will count for more than anything else, so spend the money on that rather
   than latest-and-greatest performance for a single channel.
 
 Just to add my thoughts,  we use the MegaRaid Elite 1650 in 3 servers
 here that drive our core databases.  We paired up the controllers with
 the Seagate Cheetah 10k drives,  we could have purchased the X15's which
 are Seagate's 15k version, but due to budget constraints and lack of
 true performance increase from the 10k to the 15k rpm drives we didn't
 opt for them.
 
 I have to say that I've been 100% pleased with the performance and
 reliability of the Megaraid controllers. They do everything a good
 controller should and they are very easy to manage. The driver is
 actively maintained by the guys at LSI and their tech support personnel
 are very good as well.
 
 If you want any specific information or have any questions about our
 experience or configuration please feel free to contact me.

To add one more feature the LSI/MegaRAIDs have that I find interesting, 
you can put two in a machine, build a RAID0 or 5 on each card, then mirror 
the two cards together, and should one card / RAID0 or 5 chain die, the 
other card will keep working.  I.e. they work like one big card with 
failover.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Is there a reason _not_ to vacuum continuously?

2003-09-17 Thread scott.marlowe
On Wed, 17 Sep 2003, Matt Clark wrote:

 *** THE QUESTION(S) ***
 Is there any reason for me not to run continuous sequential vacuum analyzes?
 At least for the 6 tables that see a lot of updates?
 I hear 10% of tuples updated as a good time to vac-an, but does my typical
 count of 3 indexes per table affect that?

Generally, the only time continuous vacuuming is a bad thing is when you 
are I/O bound.  If you are CPU bound, then continuous vacuuming is usually 
acceptable.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Inconsistent performance

2003-09-15 Thread scott.marlowe
On Mon, 15 Sep 2003, Joseph Bove wrote:

 Stephan,
 
 I've run explain analyze a number of times and have gotten results between 
 5.5 and 7.5 seconds
 
 Attached is a typical output
 
   QUERY PLAN
 -
   Aggregate  (cost=9993.92..9993.92 rows=1 width=0)
(actual time=7575.59..7575.59 rows=1 loops=1)
 -  Seq Scan on vetapview  (cost=0.00..9771.34 rows=89034 width=0)
 (actual time=0.06..7472.20 
 rows=88910 loops=1)
   Total runtime: 7575.67 msec
 (3 rows)
 
 The only things changing are the actual time. The costs are constant.
 
 The relpages from pg_class for vetapview (the table in question) is 8881.
 
 At the end of this message is the exhaustive contents of postgresql.conf. 
 The only settings I have attempted tuning are as follows:
 
 tcpip_socket = true
 max_connections = 100
 shared_buffers = 5000
 sort_mem = 8192
 fsync = false

A couple of things.

1:  Is there an index on the parts of the query used for the where clause?
2:  What is your effect_cache_size set to?  It needs to be set right for 
your postgresql server to be able to take advantage of the kernel's cache 
(i.e. use an index scan when the kernel is likely to have that data in 
memory.)


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Inconsistent performance

2003-09-15 Thread scott.marlowe
On Mon, 15 Sep 2003, scott.marlowe wrote:

 On Mon, 15 Sep 2003, Joseph Bove wrote:
 
  Stephan,
  
  I've run explain analyze a number of times and have gotten results between 
  5.5 and 7.5 seconds
  
  Attached is a typical output
  
QUERY PLAN
  -
Aggregate  (cost=9993.92..9993.92 rows=1 width=0)
 (actual time=7575.59..7575.59 rows=1 loops=1)
  -  Seq Scan on vetapview  (cost=0.00..9771.34 rows=89034 width=0)
  (actual time=0.06..7472.20 
  rows=88910 loops=1)
Total runtime: 7575.67 msec
  (3 rows)
  
  The only things changing are the actual time. The costs are constant.
  
  The relpages from pg_class for vetapview (the table in question) is 8881.
  
  At the end of this message is the exhaustive contents of postgresql.conf. 
  The only settings I have attempted tuning are as follows:
  
  tcpip_socket = true
  max_connections = 100
  shared_buffers = 5000
  sort_mem = 8192
  fsync = false
 
 A couple of things.
 
 1:  Is there an index on the parts of the query used for the where clause?
 2:  What is your effect_cache_size set to?  It needs to be set right for 
 your postgresql server to be able to take advantage of the kernel's cache 
 (i.e. use an index scan when the kernel is likely to have that data in 
 memory.)

Sorry, that should be effective_cache_size, not effect_cache_size.  It's 
set in 8k blocks and is usually about how much buffer / cache you have 
left over after the machines settles after being up and running for a 
while.  Fer instance, on my server, I show 784992K cache, and 42976K buff 
under top, so, that's 827968k/8k=103496 blocks.  Note that if you've 
recompiled you may have somehow set block size larger, but installations 
with postgresql block sizes ~=8k are pretty uncommon, and you'd know if 
you had done that, so it's probably 8k blocks.
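
In postgresql.conf terms that example works out to roughly:

effective_cache_size = 103496    # (784992K cache + 42976K buff) / 8K blocks

You can also SET it in a session first to see how it changes the plans 
before editing the conf file.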


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] [GENERAL] how to get accurate values in pg_statistic

2003-09-11 Thread scott.marlowe
On Thu, 11 Sep 2003, Tom Lane wrote:

 Christopher Browne [EMAIL PROTECTED] writes:
  The right answer for most use seems likely to involve:
   a) Getting an appropriate number of bins (I suspect 10 is a bit
  small, but I can't justify that mathematically), and
 
 I suspect that also, but I don't have real evidence for it either.
 We've heard complaints from a number of people for whom it was indeed
 too small ... but that doesn't prove it's not appropriate in the
 majority of cases ...
 
  Does the sample size change if you increase the number of bins?
 
 Yes, read the comments in backend/commands/analyze.c.
 
  Do we also need a parameter to control sample size?
 
 Not if the paper I read before writing that code is correct.

I was just talking to a friend of mine who does statistical analysis, and 
he suggested a different way of looking at this.  I know little of the 
analyze.c, but I'll be reading it some today.

His theory was that we can figure out the number of target bins by 
basically running analyze twice with two different random seeds, and 
initially setting the bins to 10.

Then, compare the variance of the two runs.  If the variance is great, 
increase the target by X, and run the two again.  Repeat, wash, rinse, until 
the variance drops below some threshold.

I like the idea, I'm not at all sure if it's practical for Postgresql to 
implement it.
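
A rough sketch of the manual equivalent, purely for illustration (table and 
column names are placeholders; since analyze takes a fresh random sample on 
every run, two back-to-back runs stand in for the two different seeds):

ALTER TABLE mytable ALTER COLUMN mycol SET STATISTICS 10;
ANALYZE mytable;
SELECT histogram_bounds FROM pg_stats
 WHERE tablename = 'mytable' AND attname = 'mycol';
ANALYZE mytable;
SELECT histogram_bounds FROM pg_stats
 WHERE tablename = 'mytable' AND attname = 'mycol';
-- if the two sets of bounds disagree badly, raise the target and repeat
ALTER TABLE mytable ALTER COLUMN mycol SET STATISTICS 20;

The automated version would of course live inside analyze.c rather than 
being driven from SQL.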


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] slow plan for min/max

2003-09-08 Thread scott.marlowe
On Sun, 7 Sep 2003, Pailloncy Jean-Gérard wrote:

Asking a question about why max(id) is so much slower than select id order 
by id desc limit 1, Pailloncy said:

 I ask for the same thing.
 That's better !

This is a Frequently asked question about something that isn't likely to 
change any time soon.

Basically, Postgresql uses an MVCC locking system that makes massively 
parallel operation possible, but costs in certain areas, and one of those 
areas is aggregate performance over large sets.  MVCC makes it very hard 
to optimize all but the simplest of aggregates, and even those 
optimizations which are possible would wind up being quite ugly at the 
parser level.

You might want to search the archives in the last couple years for this 
subject, as it's come up quite often.
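
For reference, the workaround being discussed is the usual rewrite 
(assuming an index on id; some_table is a placeholder):

-- slow: the aggregate has to look at every visible row
SELECT max(id) FROM some_table;

-- fast equivalent: walks the index from the high end and stops almost immediately
SELECT id FROM some_table ORDER BY id DESC LIMIT 1;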


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] slow plan for min/max

2003-09-08 Thread scott.marlowe
On Mon, 8 Sep 2003, Neil Conway wrote:

 On Mon, 2003-09-08 at 11:56, scott.marlowe wrote:
  Basically, Postgresql uses an MVCC locking system that makes massively 
  parallel operation possible, but costs in certain areas, and one of those 
  areas is aggregate performance over large sets.  MVCC makes it very hard 
  to optimize all but the simplest of aggregates, and even those 
   optimizations which are possible would wind up being quite ugly at the 
  parser level.
 
 As was pointed out in a thread a couple days ago, MIN/MAX() optimization
 has absolutely nothing to do with MVCC. It does, however, make
 optimizing COUNT() more difficult.

Not exactly.  While max(id) is easily optimized by query replacement, 
more complex aggregates will still have performance issues that would not 
be present in a row locking database.  i.e. max((field1/field2)*field3) is 
still going to cost more to process, isn't it?



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] SELECT's take a long time compared to other DBMS

2003-09-04 Thread scott.marlowe
On Wed, 3 Sep 2003, Relaxin wrote:

 I have a table with 102,384 records in it, each record is 934 bytes.
 
 Using the follow select statement:
   SELECT * from table
 
 PG Info: version 7.3.4 under cygwin on Windows 2000
 ODBC: version 7.3.100
 
 Machine: 500 Mhz/ 512MB RAM / IDE HDD
 
 
 Under PG:  Data is returned in 26 secs!!
 Under SQL Server:  Data is returned in 5 secs.
 Under SQLBase: Data is returned in 6 secs.
 Under SAPDB:Data is returned in 7 secs.

This is typical of postgresql under cygwin, it's much faster under a Unix 
OS like Linux or BSD.  That said, you CAN do some things to help speed it 
up, the biggest being tuning the shared_buffers to be something large 
enough to hold a fair bit of data.  Set the shared_buffers to 1000, 
restart, and see if things get better.

Running Postgresql in a unix emulation layer is guaranteed to make it 
slow.  If you've got a spare P100 with 128 Meg of RAM you can throw redhat 
9 or FreeBSD 4.7 on and run Postgresql on, it will likely outrun your 
500MHZ cygwin box, and might even keep up with the other databases on that 
machine as well.


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Selecting random rows efficiently

2003-09-03 Thread scott.marlowe
Can you just create an extra serial column and make sure that one is 
always in order and no holes in it?  (i.e. a nightly process, etc...)???

If so, then something like this truly flies:

select * from accounts where aid = (select cast(floor(random()*10)+1 as int));

My times on it on a 100,000 row table are < 1 millisecond.

Note that you have to have a hole free sequence AND know how many rows 
there are, but if you can meet those needs, this is screamingly fast.
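
If you'd rather not hard-code the row count, you can fold it in, at the 
price of a count(*) (which is itself a seq scan, so cache the count 
somewhere if you need the sub-millisecond behavior); this still assumes 
aid has no holes:

SELECT * FROM accounts
 WHERE aid = (SELECT (floor(random() * (SELECT count(*) FROM accounts)) + 1)::int);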

On Sat, 30 Aug 2003, Russell Garrett wrote:

 Considering that we'd have to index the random field too, it'd be neater in
 the long term to re-number the primary key. Although, being a primary key,
 that's foreign-keyed from absolutely everywhere, so that'd probably take an
 amusingly long time.
 
 ...and no we're not from Micronesia, we're from ever so slightly less exotic
 London. Though Micronesia might be nice...
 
 Russ (also from last.fm but without the fancy address)
 
 [EMAIL PROTECTED] wrote:
  On Sat, 2003-08-30 at 09:01, Rod Taylor wrote:
  i was hoping there was some trickery with sequences that would
  allow me to easily pick a random valid sequence number..?
 
  I would suggest renumbering the data.
 
  ALTER SEQUENCE ... RESTART WITH 1;
  UPDATE table SET pkey = DEFAULT;
 
  Of course, PostgreSQL may have trouble with that update due to
  evaluation of the unique constraint immediately -- so drop the
  primary key first, and add it back after.
 
  And if there are child tables, they'd all have to be updated, too.
 
 
 
 ---(end of broadcast)---
 TIP 8: explain analyze is your friend
 


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] opinion on RAID choice

2003-08-29 Thread scott.marlowe
On Thu, 28 Aug 2003, Vivek Khera wrote:

  sm == scott marlowe scott.marlowe writes:
 
 sm My experience has been that once you get past 6 disks, RAID5 is faster 
 sm than RAID1+0.
 
 Any opinion on stripe size for the RAID?

That's more determined by what kind of data you're gonna be handling.  If 
you want to do lots of little financial transactions, then 32k or less is 
good.  If you're gonna store moderately large text fields and such, then 
going above 32k or 64k is usually a good idea.


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Hardware recommendations to scale to silly load

2003-08-28 Thread scott.marlowe
On 27 Aug 2003, matt wrote:

 I'm wondering if the good people out there could perhaps give me some
 pointers on suitable hardware to solve an upcoming performance issue. 
 I've never really dealt with these kinds of loads before, so any
 experience you guys have would be invaluable.  Apologies in advance for
 the amount of info below...
 
 My app is likely to come under some serious load in the next 6 months,
 but the increase will be broadly predictable, so there is time to throw
 hardware at the problem.
 
 Currently I have a ~1GB DB, with the largest (and most commonly accessed
 and updated) two tables having 150,000 and 50,000 rows.
 
 A typical user interaction with the system involves about 15
 single-table selects, 5 selects with joins or subqueries, 3 inserts, and
 3 updates.  The current hardware probably (based on benchmarking and
 profiling) tops out at about 300 inserts/updates *or* 2500 selects per
 second.
 
 There are multiple indexes on each table that updates and inserts happen 
 on.  These indexes are necessary to provide adequate select performance.
 
 Current hardware/software:
 Quad 700MHz PIII Xeon/1MB cache
 3GB RAM
 RAID 10 over 4 18GB/10,000rpm drives
 128MB battery backed controller cache with write-back enabled
 Redhat 7.3, kernel 2.4.20
 Postgres 7.2.3 (stock redhat issue)
 
 I need to increase the overall performance by a factor of 10, while at
 the same time the DB size increases by a factor of 50.  e.g. 3000
 inserts/updates or 25,000 selects per second, over a 25GB database with
 most used tables of 5,000,000 and 1,000,000 rows.

It will likely take a combination of optimizing your database structure / 
methods and increasing your hardware / OS performance.

You probably, more than anything, should look at some kind of 
superfast, external storage array that has dozens of drives, and a large 
battery backed cache.  You may be able to approximate this yourself with 
just a few dual channel Ultra 320 SCSI cards and a couple dozen hard 
drives.  The more spindles you throw at a database, generally speaking, 
the more parallel load it can handle.  

You may find that once you get to 10 or 20 drives, RAID 5 or 5+0 or 0+5 
will be outrunning 1+0/0+1 due to fewer writes.

You likely want to look at the fastest CPUs with the fastest memory you 
can afford.  those 700MHz xeons are likely using PC133 memory, which is 
painfully slow compared to the stuff pumping data out at 4 to 8 times the 
rate of the older stuff.

Maybe an SGI Altix could do this?  Have you looked at them?  They're not 
cheap, but they do look to be quite fast, and can scale to 64 CPUs if need 
be.  Their interbox communication fabric is faster than most CPUs' front 
side busses.

 Notably, the data is very time-sensitive, so the active dataset at any
 hour is almost certainly going to be more on the order of 5GB than 25GB
 (plus I'll want all the indexes in RAM of course).
 
 Also, and importantly, the load comes but one hour per week, so buying a
 Starfire isn't a real option, as it'd just sit idle the rest of the
 time.  I'm particularly interested in keeping the cost down, as I'm a
 shareholder in the company!

Interesting.  If you can't spread the load out, can you batch some parts 
of it?  Or is the whole thing interactive therefore needing to all be 
done in real time at once?

 So what do I need?

whether you like it or not, you're gonna need heavy iron if you need to do 
this all in one hour once a week.

 Can anyone who has (or has ever had) that kind of
 load in production offer any pointers, anecdotes, etc?  Any theoretical
 musings also more than welcome.  Comments upon my sanity will be
 referred to my doctor.
 
 If the best price/performance option is a second hand 32-cpu Alpha
 running VMS I'd be happy to go that way ;-)

Actually, I've seen stuff like that going on Ebay pretty cheap lately.  I 
saw a 64 CPU E10k (366 MHz CPUs) with 64 gigs ram and 20 hard drives going 
for $24,000 a month ago.  Put Linux or BSD on it and Postgresql should 
fly.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Best tweak for fast results.. ?

2003-08-28 Thread scott.marlowe
On Tue, 26 Aug 2003, JM wrote:

 need input on parameter values on confs...
 
 our database is getting 1000 transactions/sec on peak periods..
 
 sitting on RH 7.3 
 2.4.7-10smp
 RAM: 1028400
 SWAP: 2040244

1:  Upgrade your kernel.  2.4.7 on RH 7.3 was updated to 2.4.18-24 in March, 
and the 2.4.18 kernel is MUCH faster and has many bugs squashed.

2:  Upgrade to the latest stable version of postgresql, 7.3.4

3:  Make sure your kernel's file-nr settings and shm settings are big 
enough to handle the load.  

4:  Edit the $PGDATA/postgresql.conf file to reflect all that extra cache 
you've got etc  

shared_buffers = 5000
sort_mem = 16384
effective_cache_size = (size of cache/buffer mem divided by 8192)

5:  Look at moving WAL to its own spindle(s), as it is often the choke 
point when doing lots of transactions.

6:  Look at using more drives in a RAID 1+0 array for the data (as well as 
a separate one for WAL if you can afford it.)

7:  Make sure your drives are mounted noatime.

8:  If you don't mind living dangerously, or the data can be reproduced 
from source files (i.e. catastrophic failure of your data set won't set 
you back) look at both mounting the drives async (the default for linux, 
slightly dangerous) and turning fsync off (quite dangerous; in case of 
crashed hardware / OS, you very well might lose data).


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Hardware recommendations to scale to silly load

2003-08-28 Thread scott.marlowe
On 27 Aug 2003, matt wrote:

  You probably, more than anything, should look at some kind of 
  superfast, external storage array
 
 Yeah, I think that's going to be a given.  Low end EMC FibreChannel
 boxes can do around 20,000 IOs/sec, which is probably close to good
 enough.
 
 You mentioned using multiple RAID controllers as a boost - presumably
 the trick here is to split the various elements (WAL, tables, indexes)
 across different controllers using symlinks or suchlike?  Can I feasibly
 split the DB tables across 5 or more controllers?

I'm not sure I'd split the tables by hand right up front.  Try getting as 
many hard drives as you can afford hooked up at once, and then try 
different ways of partitioning them.  I'm guessing that making two fairly 
good sized 1+0 sets, one for data and one for WAL might be the best 
answer.

  Actually, I've seen stuff like that going on Ebay pretty cheap lately.  I 
  saw a 64 CPU E10k (366 MHz CPUs) with 64 gigs ram and 20 hard drives going 
  for $24,000 a month ago.  Put Linux or BSD on it and Postgresql should 
  fly.
 
 Jeez, and I thought I was joking about the Starfire.  Even Slowaris
 would be OK on one of them.
 
 The financial issue is that there's just not that much money in the
 micropayments game for bursty sales.  If I was doing these loads
 *continuously* then I wouldn't be working, I'd be in the Maldives :-)

$24,000 isn't that much for a server really, and if you can leverage this 
one sale to get more, then it would likely pay for itself over time.

If you have problems keeping up with load, it will be harder to get more 
customers, so you kinda wanna do this as well as possible the first time.




---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] The results of my PostgreSQL/filesystem performance

2003-08-28 Thread scott.marlowe
On Thu, 28 Aug 2003, Sean Chittenden wrote:

  What it still leaves quite open is just what happens when the OS has
  more than one disk drive or CPU to play with.  It's not clear what
  happens in such cases, whether FreeBSD would catch up, or be left
  further in the dust.  The traditional propaganda has been that
  there are all sorts of reasons to expect PostgreSQL on FreeBSD to
  run a bit faster than on Linux; it is a bit unexpected for the
  opposite to seem true.
 
 Let me nip this in the butt before people run away with ideas that
 aren't correct.  When the tests were performed in FreeBSD 5.1 and
 Linux, the hard drives were running UDMA.  When running 4.8, for some
 reason his drives settled in on PIO mode:
 
 ad0s1a: UDMA ICRC error writing fsbn 1458368 of 729184-729215 (ad0s1 bn 1458368; cn 
 241 tn 12 sn 44) retrying
 ad0s1a: UDMA ICRC error writing fsbn 1458368 of 729184-729215 (ad0s1 bn 1458368; cn 
 241 tn 12 sn 44) retrying
 ad0s1a: UDMA ICRC error writing fsbn 1458368 of 729184-729215 (ad0s1 bn 1458368; cn 
 241 tn 12 sn 44) retrying
 ad0s1a: UDMA ICRC error writing fsbn 1458368 of 729184-729215 (ad0s1 bn 1458368; cn 
 241 tn 12 sn 44) falling back to PIO mode
 
 The benchmarks were hardly conclusive as UDMA runs vastly faster than
 PIO.  Until we hear back as to whether cables were jarred loose
 between the tests or hearing if something else changed, I'd hardly
 consider these conclusive tests given PIO/UDMA is apples to oranges in
 terms of speed and I fully expect that FreeBSD 4.8 will perform at
 least faster than 5.1 (5.x is still being unwound from Giant), but
 should outperform Linux as well if industry experience is any
 indicator.

Plus, in most real servers you're gonna be running SCSI, so it might be 
nice to see a test with a good SCSI controller (Symbios 875 is a nice 
choice) and a couple hard drives, one each for WAL and data.  This would 
more closely resemble actual usage and there are likely to be fewer issues 
with things like UDMA versus PIO on SCSI.


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match

