Re: [PERFORM] Horribly slow query/ sequential scan

2007-01-09 Thread Tom Lane
"Gregory S. Williamson" <[EMAIL PROTECTED]> writes: > As Joe indicated, there is indeed an Informix explain, appended below my > signature ... > select > w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs, > sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits, > sum(w.s

Re: [PERFORM] group by will not use an index?

2007-01-09 Thread Andrew Lazarus
For the reasons indicated (that is, MVCC), PG can not do a DISTINCT or the equivalent GROUP BY from index values alone. If this table is large, perhaps you could denormalize and maintain a summary table with date (using truncation) and count, updated with triggers on the original table. This tabl

Re: [PERFORM] group by will not use an index?

2007-01-09 Thread Steinar H. Gunderson
On Tue, Jan 09, 2007 at 05:07:03PM -0800, Brian Herlihy wrote: > Actually, as I recently discovered, GROUP BY is faster than DISTINCT. It's > just due to how they are implemented, so don't go looking for any deep > reason :) The thread "GROUP BY vs DISTINCT" from 2006-12-20 discusses it. > DISTIN

Re: [PERFORM] group by will not use an index?

2007-01-09 Thread Brian Herlihy
Actually, as I recently discovered, GROUP BY is faster than DISTINCT. It's just due to how they are implemented, so don't go looking for any deep reason :) The thread "GROUP BY vs DISTINCT" from 2006-12-20 discusses it. DISTINCT sorts the results to find the unique rows, but GROUP BY uses a h

Re: [PERFORM] group by will not use an index?

2007-01-09 Thread Scott Marlowe
On Tue, 2007-01-09 at 17:05, tsuraan wrote: > I have a table of messages with paths and inserted dates (among other > things), like so: > > CREATE TABLE Messages ( > msgkey BIGSERIAL PRIMARY KEY, > path TEXT NOT NULL, > inserted TIMESTAMP WITHOUT TIMEZONE DEFAULT NOW() > ); > > I run

Re: [PERFORM] Horribly slow query/ sequential scan

2007-01-09 Thread Gregory S. Williamson
As Joe indicated, there is indeed an Informix explain, appended below my signature ... This table has 5565862 total rows, and 37 target rows. So about twice the total data, but all of the "extra" data in infomrix is much older. Thanks for the help, one and all! Greg W. QUERY: -- SELECT

Re: [PERFORM] group by will not use an index?

2007-01-09 Thread Adam Rich
That query looks strange to me (a group by without an aggregate). See if this is any faster: SELECT DISTINCT DATE(inserted) FROM Messages I won't hold my breath though, I don't think there's any way around the full table scan in Postgres, because the index does not contain enough information a

[PERFORM] group by will not use an index?

2007-01-09 Thread tsuraan
I have a table of messages with paths and inserted dates (among other things), like so: CREATE TABLE Messages ( msgkey BIGSERIAL PRIMARY KEY, path TEXT NOT NULL, inserted TIMESTAMP WITHOUT TIMEZONE DEFAULT NOW() ); I run a query to determine which days actually saw emails come in, like

Re: [PERFORM] High inserts, bulk deletes - autovacuum vs scheduled

2007-01-09 Thread Jeremy Haile
Good advice on the partitioning idea. I may have to restructure some of my queries, since some of them query across the whole range - but it may be a much more performant solution. How is the performance when querying across a set of partitioned tables vs. querying on a single table with all rows

Re: [PERFORM] High inserts, bulk deletes - autovacuum vs scheduled vacuum

2007-01-09 Thread Florian Weimer
* Jeremy Haile: > I'd like any performance advice, but my main concern is the amount of > time vacuum/analyze runs and its possible impact on the overall DB > performance. Thanks! You could partition your data tables by date and discard old data simply by dropping the tables. This is far more e

[PERFORM] High inserts, bulk deletes - autovacuum vs scheduled vacuum

2007-01-09 Thread Jeremy Haile
I am developing an application that has very predictable database operations: -inserts several thousand rows into 3 tables every 5 minutes. (table contain around 10 million rows each) -truncates and rebuilds aggregate tables of this data every 5 minutes. (several thousand rows each) -regu

Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS

2007-01-09 Thread Jeff Frost
On Tue, 9 Jan 2007, Jim C. Nasby wrote: On Thu, Dec 28, 2006 at 02:15:31PM -0800, Jeff Frost wrote: When benchmarking various options for a new PG server at one of my clients, I tried ext2 and ext3 (data=writeback) for the WAL and it appeared to be fastest to have ext2 for the WAL. The winning

Re: [PERFORM] Postgresql Configutation and overflow

2007-01-09 Thread Jim C. Nasby
On Thu, Dec 28, 2006 at 10:35:29PM -0500, Dave Cramer wrote: > start with 25% of your 12G as shared buffers, and 75% of 12G for > effective cache I'm curious... why leave 3G for the kernel? Seems like overkill... Granted, as long as you're in the ballpark on effective_cache_size that's all that

Re: [PERFORM] Horribly slow query/ sequential scan

2007-01-09 Thread Plugge, Joe R.
Yes it does: SET EXPLAIN ON; It writes the file to sqexplain.out -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane Sent: Tuesday, January 09, 2007 9:13 AM To: Gregory S. Williamson Cc: [EMAIL PROTECTED]; pgsql-performance@postgresql.org Subject:

Re: [PERFORM] Horribly slow query/ sequential scan

2007-01-09 Thread Tom Lane
"Gregory S. Williamson" <[EMAIL PROTECTED]> writes: > HAving burdened others with my foolishness too often, I hesitate to > ask, but could someone either point me to a reference or explain what > the difference might be ... I can see it with the eyes but I am having > trouble understanding what Inf

Re: [PERFORM] Horribly slow query/ sequential scan

2007-01-09 Thread Dave Cramer
On 9-Jan-07, at 7:50 AM, Nörder-Tuitje, Marcus wrote: Forget abount "IN". Its horribly slow. I think that statement above was historically correct, but is now incorrect. IN has been optimized quite significantly since 7.4 Dave try : select w.appid, w.rate, w.is_subscribe

Re: [PERFORM] Running PG on cluster files systems

2007-01-09 Thread Devrim GUNDUZ
Hi, On Tue, 2007-01-09 at 15:15 +0100, Hannes Dorbath wrote: > GFS2, OFCS2, lustre, CXFS, GPFS, Veritas and what else there is.. > > ..has someone experience with any of those? Is it bearable to run PG > on them from a performance point of view? I guess not, but any > positive reports? I have t

Re: [PERFORM] Worse perfomance on 8.2.0 than on 7.4.14

2007-01-09 Thread Rolf Østvik
--- Simon Riggs <[EMAIL PROTECTED]> skrev: > > The distribution of rows with those values also makes a difference to > the results. ANALYZE assumes that all values are randomly distributed > within the table, so if the values are clumped together for whatever > reason the ndistinct calc is less

[PERFORM] Running PG on cluster files systems

2007-01-09 Thread Hannes Dorbath
GFS2, OFCS2, lustre, CXFS, GPFS, Veritas and what else there is.. ..has someone experience with any of those? Is it bearable to run PG on them from a performance point of view? I guess not, but any positive reports? Thanks -- Regards, Hannes Dorbath ---(end of broadc

Re: [PERFORM] Horribly slow query/ sequential scan

2007-01-09 Thread Gregory S. Williamson
Thanks for the suggestion ... I will try it when I've had some sleep and the server is quiet again ... the IN seems to have improved markedly since the 7.4 release, as advertised, so I will be interested in trying this. GSW -Original Message- From: Nörder-Tuitje, Marcus [mailto:[EMAIL

Re: [PERFORM] Horribly slow query/ sequential scan

2007-01-09 Thread Nörder-Tuitje , Marcus
Forget abount "IN". Its horribly slow. try : select w.appid, w.rate, w.is_subscribed, sum(w.hits) AS Hits, sum(w.sius) AS IUs, sum(w.total_amnt) AS Total, sum(w.hits) * w.rate AS ByHits, sum(w.sius) * w.rate AS BYIUS from bill_rpt_work w where (

Re: [PERFORM] Horribly slow query/ sequential scan

2007-01-09 Thread Gregory S. Williamson
Voila ! You da man ! & other expressions of awe and appreciation ... HAving burdened others with my foolishness too often, I hesitate to ask, but could someone either point me to a reference or explain what the difference might be ... I can see it with the eyes but I am having trouble understan

Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS

2007-01-09 Thread Jim C. Nasby
On Sun, Jan 07, 2007 at 11:26:01PM -0500, Guy Rouillier wrote: > Ok, I ran with the settings below, but with > > shared_buffers=768MB > effective_cache_size=2048MB > fsync=on > > This run took 29000 seconds. I'm beginning to think configuration > changes are not going to buy significant additio

Re: [PERFORM] table partioning performance

2007-01-09 Thread Simon Riggs
On Mon, 2007-01-08 at 15:02 -0500, Steven Flatt wrote: > On 1/6/07, Colin Taylor <[EMAIL PROTECTED]> wrote: > Hi there, we've partioned a table (using 8.2) by day due to > the 50TB of data (500k row size, 100G rows) we expect to store > it in a year. > Our performa

Re: [PERFORM] Horribly slow query/ sequential scan

2007-01-09 Thread db
I don't think I understand the idea behind this query. Do you really need billing_reports twice? > The query: > explain analyze select > w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs, > sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits, > sum(w.sius) * w.rate AS BY

Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS

2007-01-09 Thread Jim C. Nasby
On Thu, Dec 28, 2006 at 02:15:31PM -0800, Jeff Frost wrote: > When benchmarking various options for a new PG server at one of my clients, > I tried ext2 and ext3 (data=writeback) for the WAL and it appeared to be > fastest to have ext2 for the WAL. The winning time was 157m46.713s for > ext2, 1

[PERFORM] Horribly slow query/ sequential scan

2007-01-09 Thread Gregory S. Williamson
This is a query migrated from postgres. In postgres it runs about 10,000 times *slower* than on informix on somewhat newer hardware. The problem is entirely due to the planner. This PostgreSQL 8.1.4 on linux, 2 gigs of ram. The table: Table "reporting.bill_rpt_work" Column |