Re: [PERFORM] Opteron vs Xeon (Was: What to do with 6 disks?)

2005-04-19 Thread Jeff Frost
On Tue, 19 Apr 2005, J. Andrew Rogers wrote: I don't know about 2.5x faster (perhaps on specific types of loads), but the reason Opterons rock for database applications is their insanely good memory bandwidth and latency that scales much better than the Xeon. Opterons also have a ccNUMA-esque I

Re: [PERFORM] Opteron vs Xeon (Was: What to do with 6 disks?)

2005-04-19 Thread J. Andrew Rogers
I've seen quite a few folks touting the Opteron as 2.5x faster with postgres than a Xeon box. What makes the Opteron so quick? Is it that Postgres really prefers to run in 64-bit mode? I don't know about 2.5x faster (perhaps on specific types of loads), but the reason Opterons rock for datab

Re: [PERFORM] How to tell what your postgresql server is doing

2005-04-19 Thread Christopher Kings-Lynne
Is there a way to look at the stats tables and tell what is jamming up your postgres server the most? Other than seeing long running queries and watch top, atop, iostat, vmstat in separate xterms...I'm wondering if postgres keeps some stats on what it spends the most time doing or if there's a

[PERFORM] How to tell what your postgresql server is doing

2005-04-19 Thread Jeff Frost
Is there a way to look at the stats tables and tell what is jamming up your postgres server the most? Other than seeing long running queries and watch top, atop, iostat, vmstat in separate xterms...I'm wondering if postgres keeps some stats on what it spends the most time doing or if there's a

Re: [PERFORM] Opteron vs Xeon (Was: What to do with 6 disks?)

2005-04-19 Thread Jeff Frost
RAID1 2 disks OS, pg_xlog RAID 1+0 4 disks pgdata Looks like the consensus is RAID 1 for OS, pg_xlog and RAID10 for pgdata. Now here's another performance related question: I've seen quite a few folks touting the Opteron as 2.5x faster with postgres than a Xeon box. What makes the Opteron so q

Re: [PERFORM] Slow copy with little CPU/disk usage

2005-04-19 Thread Mischa Sandberg
Quoting Tom Lane <[EMAIL PROTECTED]>: > "Jim C. Nasby" <[EMAIL PROTECTED]> writes: > > A friend of mine has an application where he's copying in 4000 rows at a > > time into a table that has about 4M rows. Each row is 40-50 bytes. This > > is taking 25 seconds on a dual PIII-1GHz with 1G of R

Re: [PERFORM] Slow copy with little CPU/disk usage

2005-04-19 Thread Greg Stark
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > What's really odd is that neither the CPU or the disk are being > hammered. The box appears to be pretty idle; the postgresql proces is > using 4-5% CPU. Is he committing every row? In that case you would see fairly low i/o bandwidth usage because most

Re: [PERFORM] Sort and index

2005-04-19 Thread Jim C. Nasby
On Tue, Apr 19, 2005 at 11:01:26PM -0400, Tom Lane wrote: > "Jim C. Nasby" <[EMAIL PROTECTED]> writes: > > Actually, the planner (at least in 7.4) isn't smart enough to consider > > if the sort would fit in memory or not. > > Really? Have you read cost_sort()? > > It's certainly possible that th

Re: [PERFORM] Slow copy with little CPU/disk usage

2005-04-19 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > A friend of mine has an application where he's copying in 4000 rows at a > time into a table that has about 4M rows. Each row is 40-50 bytes. This > is taking 25 seconds on a dual PIII-1GHz with 1G of RAM and a 2 disk > SATA mirror, running FBSD 4.10-sta

Re: [PERFORM] What to do with 6 disks?

2005-04-19 Thread William Yu
My experience: 1xRAID10 for postgres 1xRAID1 for OS + WAL Jeff Frost wrote: Now that we've hashed out which drives are quicker and more money equals faster... Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid option would you use for a standalone postgres server? a) 3xRAI

Re: [PERFORM] What to do with 6 disks?

2005-04-19 Thread Josh Berkus
Jeff, > Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid > option would you use for a standalone postgres server? > > a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os? > b) 1xRAID1 for OS/xlog, 1xRAID5 for data > c) 1xRAID10 for OS/xlong/data > d) 1xRAID1 for OS, 1xRAID10 for

Re: [PERFORM] Sort and index

2005-04-19 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > Actually, the planner (at least in 7.4) isn't smart enough to consider > if the sort would fit in memory or not. Really? Have you read cost_sort()? It's certainly possible that the calculation is all wet, but to claim that the issue is not considered

[PERFORM] Slow copy with little CPU/disk usage

2005-04-19 Thread Jim C. Nasby
A friend of mine has an application where he's copying in 4000 rows at a time into a table that has about 4M rows. Each row is 40-50 bytes. This is taking 25 seconds on a dual PIII-1GHz with 1G of RAM and a 2 disk SATA mirror, running FBSD 4.10-stable. There's one index on the table. What's really

Re: [PERFORM] What to do with 6 disks?

2005-04-19 Thread Jim C. Nasby
http://stats.distributed.net is setup with the OS, WAL, and temp on a RAID1 and the database on a RAID10. The drives are 200G SATA with a 3ware raid card. I don't think the controller has battery-backed cache, but I'm not sure. In any case, it's almost never disk-bound on the mirror; when it's disk

[PERFORM] What to do with 6 disks?

2005-04-19 Thread Jeff Frost
Now that we've hashed out which drives are quicker and more money equals faster... Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid option would you use for a standalone postgres server? a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os? b) 1xRAID1 for OS/xlog, 1xRAID5 for d

Re: [PERFORM] Sort and index

2005-04-19 Thread Jim C. Nasby
On Mon, Apr 18, 2005 at 10:44:43AM -0500, Dave Held wrote: > Since you are fetching the entire table, you are touching all the rows. > If the query were to fetch the rows in index order, it would be seeking > all over the table's tracks. By fetching in sequence order, it has a > much better chance

Re: [PERFORM] How to improve postgres performace

2005-04-19 Thread Jim C. Nasby
On Mon, Apr 18, 2005 at 11:36:01AM -0300, Rodrigo Moreno wrote: > I really worried about that, because it's no enough anymore, and users claim > about performace. But running the vacuumdb full, everthing starts to run > better again, so i think the problem is not related to a specific query. Vacuu

Re: [PERFORM] immutable functions vs. join for lookups ?

2005-04-19 Thread Jim C. Nasby
You should re-run the function test using SQL as the function language instead of plpgsql. There might be some performance to be had there. -- Jim C. Nasby, Database Consultant [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Jim C. Nasby
On Tue, Apr 19, 2005 at 11:22:17AM -0500, [EMAIL PROTECTED] wrote: > > > [EMAIL PROTECTED] wrote on 04/19/2005 11:10:22 AM: > > > > What is 'multiple initiators' used for in the real world? > > I asked this same question and got an answer off list: Somebody said their > SAN hardware used multip

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Jim C. Nasby
On Mon, Apr 18, 2005 at 06:41:37PM -, Mohan, Ross wrote: > Don't you think "optimal stripe width" would be > a good question to research the binaries for? I'd > think that drives the answer, largely. (uh oh, pun alert) > > EG, oracle issues IO requests (this may have changed _just_ > recentl

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Jim C. Nasby
On Mon, Apr 18, 2005 at 10:20:36AM -0500, Dave Held wrote: > Hmm...so you're saying that at some point, quantity beats quality? > That's an interesting point. However, it presumes that you can > actually distribute your data over a larger number of drives. If > you have a db with a bottleneck of

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Jim C. Nasby
On Mon, Apr 18, 2005 at 07:41:49PM +0200, Jacques Caron wrote: > It would be interesting to actually compare this to real-world (or > nearly-real-world) benchmarks to measure the effectiveness of features like > TCQ/NCQ etc. I was just thinking that it would be very interesting to benchmark diff

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Jim C. Nasby
On Thu, Apr 14, 2005 at 10:51:46AM -0500, Matthew Nuzum wrote: > So if you all were going to choose between two hard drives where: > drive A has capacity C and spins at 15K rpms, and > drive B has capacity 2 x C and spins at 10K rpms and > all other features are the same, the price is the same and

Re: [PERFORM] [HACKERS] PLM pulling from CVS nightly for testing in STP

2005-04-19 Thread Mark Wong
I have dbt-2 tests automatically running against each pull from CVS and have started to automatically compile results here: http://developer.osdl.org/markw/postgrescvs/ I did start with a bit of a minimalistic approach, so I'm open for any comments, feedback, etc. Mark --

Re: [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-19 Thread Josh Berkus
Tom, > What's the histogram itself look like? (I'd like to see the whole > pg_stats row not just part of it ...) There's probably no point in > showing the target=1000 version, but maybe target=100 would be > informative. Here is the stats = 100 version. Notice that n_distinct has gone down.

Re: [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-19 Thread Tom Lane
Josh Berkus writes: > As you can see, n_distinct estimation is off by a factor of 10x and it's > causing query planning problems. Any suggested hacks to improve the > histogram on this? What's the histogram itself look like? (I'd like to see the whole pg_stats row not just part of it ...) T

Re: [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-19 Thread Dave Held
> -Original Message- > From: Josh Berkus [mailto:[EMAIL PROTECTED] > Sent: Tuesday, April 19, 2005 2:09 PM > To: pgsql-perform > Subject: [PERFORM] Bad n_distinct estimation; hacks suggested? > > [...] > (BTW, increasing the stats to 1000 only doubles n_distinct, > and doesn't solve the p

[PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-19 Thread Josh Berkus
Folks, Params: PostgreSQL 8.0.1 on Solaris 10 Statistics = 500 (tablenames have been changed to protect NDA) e1=# select tablename, null_frac, correlation, n_distinct from pg_stats where tablename = 'clickstream1' andattname = 'session_id'; tablename | null_frac | correlation | n_di

Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Am I right in thinking that vacuum does at least two passes: one > front-to-back to find removable tuples, and other back-to-front for > movement? VACUUM FULL, yes. VACUUM only does the first one. > I know maintenance_work_mem is used for storing TIDs

Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Alvaro Herrera
On Tue, Apr 19, 2005 at 10:06:40AM -0400, Tom Lane wrote: > BTW, VACUUM FULL does the data movement back-to-front, and stops as soon > as it finds a tuple it cannot move down; which is a reasonable strategy > since the goal is merely to make the file shorter. But it's entirely > likely that there

Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Tom Lane
Josh Berkus writes: >> Not at all. What it says is that you expect 100% of the pages to have >> useful amounts of free space, which is a *much* weaker criterion. > H actually, it seems like, if you are vacuuming regularly, you only > *do* need to track pages that have been touched by D

Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Josh Berkus
Tom, > Not at all. What it says is that you expect 100% of the pages to have > useful amounts of free space, which is a *much* weaker criterion. Hmmm. Good point. This seems to be another instance where my rule-of-thumb was based on false logic but nevertheless arrived at correct numbers.

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Mohan, Ross
Well, more like they both are allowed to issue disk requests and the magical "clustered file system" manages locking, etc. In reality, any disk is only reading/writing to one part of the disk at any given time, of course, but that in the multiple initiator deal, multiple streams of requests from

Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Tom Lane
Josh Berkus writes: >> select count(1), sum(relpages) from pg_class where relkind in >> ('r','i','t') > Well, if you do that for all databases in the cluster, it's the number you > start with. However, setting FSM_pages to that would be assuming that you > excpected 100% of the rows to be repl

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Richard_D_Levine
[EMAIL PROTECTED] wrote on 04/19/2005 11:10:22 AM: > > What is 'multiple initiators' used for in the real world? I asked this same question and got an answer off list: Somebody said their SAN hardware used multiple initiators. I would try to check the archives for you, but this thread is becom

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Bruce Momjian
Mohan, Ross wrote: > Clustered file systems is the first/best example that > comes to mind. Host A and Host B can both request from diskfarm, eg. So one host writes to part of the disk and another host writes to a different part? --

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Mohan, Ross
Clustered file systems is the first/best example that comes to mind. Host A and Host B can both request from diskfarm, eg. -Original Message- From: Bruce Momjian [mailto:[EMAIL PROTECTED] Sent: Tuesday, April 19, 2005 12:10 PM To: Mohan, Ross Cc: pgsql-performance@postgresql.org Subjec

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Bruce Momjian
Mohan, Ross wrote: > The only part I am pretty sure about is that real-world experience shows SCSI > is better for a mixed I/O environment. Not sure why, exactly, but the > command queueing obviously helps, and I am not sure what else does. > > || TCQ is the secret sauce, no doubt. I think NCQ

Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Josh Berkus
Tambet, > Hmm, thanks for a tip. BTW, is output of > > select count(1), sum(relpages) from pg_class where relkind in > ('r','i','t') Well, if you do that for all databases in the cluster, it's the number you start with. However, setting FSM_pages to that would be assuming that you excpected 10

Re: [PERFORM] Postgresql works too slow

2005-04-19 Thread Brad Nicholson
Simon Riggs wrote: On Mon, 2005-04-18 at 08:50 +0400, Nurlan Mukhanov (AL/EKZ) wrote: I'm trying to restore my database from dump in several parrallel processes, but restore process works too slow. Number of rows about 100 000 000, RAM: 8192M CPU: Ultra Sparc 3 Number of CPU: 4 OS: SunOS sun 5.8

Re: [PERFORM] Postgresql works too slow

2005-04-19 Thread Simon Riggs
On Mon, 2005-04-18 at 08:50 +0400, Nurlan Mukhanov (AL/EKZ) wrote: > I'm trying to restore my database from dump in several parrallel processes, > but restore process works too slow. > Number of rows about 100 000 000, > RAM: 8192M > CPU: Ultra Sparc 3 > Number of CPU: 4 > OS: SunOS sun 5.8 > RDBM

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Mohan, Ross
Good question. If the SCSI system was moving the head from track 1 to 10, and a request then came in for track 5, could the system make the head stop at track 5 on its way to track 10? That is something that only the controller could do. However, I have no idea if SCSI does that. || SCSI, A

Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Tom Lane
"Tambet Matiisen" <[EMAIL PROTECTED]> writes: > Is my current understanding correct: > 1) VACUUM defragments each page locally - moves free space to the end of > page. > 2) VACUUM FULL defragments table globally - tries to fill up all > partially free pages and deletes all resulting empty pages.

Re: [PERFORM] How to improve db performance with $7K?

2005-04-19 Thread Dave Held
> -Original Message- > From: Alex Turner [mailto:[EMAIL PROTECTED] > Sent: Monday, April 18, 2005 5:50 PM > To: Bruce Momjian > Cc: Kevin Brown; pgsql-performance@postgresql.org > Subject: Re: [PERFORM] How to improve db performance with $7K? > > Does it really matter at which end of the c

Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Tambet Matiisen
> > Josh Berkus writes: > >> 1) When is it necessary to run REINDEX or drop/create > >> an index? All I could really find in the docs is: > > > If you need to VACUUM FULL, you need to REINDEX as well. > For example, > > if you drop millions of rows from a table. > > That's probably a prett