Re: [PERFORM] good pc but bad performance, why?

2004-04-06 Thread huang yaqin
hello!

Thanks, you are right.
 I use postmaster -o "-F" to start my PG, and performance improved greatly.

   Best regards,
 huang yaqin

>huang yaqin <[EMAIL PROTECTED]> writes:
>>  I have some questions about using postgresql 7.4.1 on redhat adv server 2.1.
>> I use an IBM335 as server; it has 4 cpus and 1G RAM, but I got very bad performance.
>> I can only do about 50 inserts per second, even worse than my pc (PIII 800, 256M RAM).
>> Can anyone give me some advice?
>
>If the cheap machine appears to be able to commit more transactions
>per second than the better one, it's very likely because the cheap
>machine has a disk that lies about write completion.  Is the server
>using SCSI disks by any chance?  To a first approximation, IDE drives
>lie by default, SCSI drives don't.
>
>   regards, tom lane
>
>
>
>

= = = = = = = = = = = = = = = = = = = =


Best regards!


huang yaqin
[EMAIL PROTECTED]
  2004-04-07




---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] possible improvement between G4 and G5

2004-04-06 Thread Josh Berkus
Aaron,

> I do consulting, so they're all over the place and tend to be complex. Very
> few fit in RAM, but still are very buffered. These are almost all backed
> with very high end I/O subsystems, with dozens of spindles with battery
> backed up writethrough cache and gigs of buffers, which may be why I worry
> so much about CPU. I have had this issue with multiple servers.

Aha, I think this is the difference.  I never seem to be able to get my 
clients to fork out for adequate disk support.   They are always running off 
single or double SCSI RAID in the host server; not the sort of setup you 
have.

> What my CPU tends to be doing is a combination of general processing,
> complex SQL processing: nested loops and sorting and hashing and triggers
> and SPs.

I haven't noticed SPs to be particularly CPU-hoggish; they tend to be more RAM-hungry.

> I'm curious about you having flat CPU, which is not my experience. Are your
> apps mature and stable?

Well, "flat" was a bit of an exaggeration ... there are spikes ... but average 
CPU load is < 30%.  I think the difference is that your clients listen to 
you about disk access.   Mine are all too liable to purchase a quad-Xeon 
machine but with an Adaptec RAID-5 card with 4 drives, and *then* call me and 
ask for advice.

As a result, most intensive operations don't tend to swamp the CPU because 
they are waiting for disk.   

I have noticed the limitations on RAM for 64 vs. 32, as I find it easier to 
convince a client to get 8GB RAM than four-channel RAID with 12 drives, 
mostly because the former is cheaper.   Linux 2.4 + Bigmem just doesn't cut 
it for making effective use of > 3GB of RAM.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] possible improvement between G4 and G5

2004-04-06 Thread Aaron Werman


- Original Message - 
From: "Josh Berkus" <[EMAIL PROTECTED]>
To: "Aaron Werman" <[EMAIL PROTECTED]>; "Qing Zhao" <[EMAIL PROTECTED]>;
"Tom Lane" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Tuesday, April 06, 2004 2:52 PM
Subject: Re: [PERFORM] possible improvement between G4 and G5


> Aaron,
>
> > I'm surprised by this thought. I tend to hit CPU bottlenecks more often
> > than I/O ones. In most applications, db I/O is a combination of buffer
> > misses and logging, which are both reasonably constrained.
>
> Not my experience at all.  In fact, the only times I've seen modern
> platforms max out the CPU was when:
> a) I had bad queries with bad plans, or
> b) I had reporting queries that did a lot of calculation for display
> (think OLAP).
>
> Otherwise, on the numerous servers I administrate, RAM spikes, and I/O
> bottlenecks, but the CPU stays almost flat.
>
> Of course, most of my apps are large databases (i.e. too big for RAM)
> with a heavy transaction-processing component.
>
> What kind of applications are you running?
>
> -- 
> -Josh Berkus
>  Aglio Database Solutions
>  San Francisco
>
>



I do consulting, so they're all over the place and tend to be complex. Very
few fit in RAM, but still are very buffered. These are almost all backed
with very high end I/O subsystems, with dozens of spindles with battery
backed up writethrough cache and gigs of buffers, which may be why I worry
so much about CPU. I have had this issue with multiple servers.

Consider an analysis db with 10G of data. Of that, 98% of the access is read
and only 2% write (that is normal for almost anything that is not order
entry, even transaction processing with thorough cross-validation). Almost
all the queries access 10%, or 1G, of the data. The reads average ~3-level
b-trees, with the first 2 levels certainly cached and the last one often
cached. Virtually all the I/O activity is logical reads against buffer. A
system with 100 transactions which on average access 200 rows does 98% of
200 rows x 100 transactions x 3 logical I/Os per read = 58,800 logical
reads, of which maybe a hundred are actually physical reads. It also does
2% of 200 rows x 100 transactions x (1 table logical I/O and say 2 index
logical writes) per write = 1,200 logical writes to log, of which there are
100 transaction commit sync writes, and in reality fewer than that because
of queuing against logs (there are also 1,200 logical writes deferred to
checkpoint, of which there are likely to be only 40 physical writes because
of page overlaps).
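The back-of-envelope numbers above can be checked in a few lines (the workload figures are Aaron's assumptions, not measurements: 100 transactions touching 200 rows each, 3-level b-trees, a 98/2 read/write split, and 1 table plus 2 index writes per written row):

```python
# Quick check of the back-of-envelope arithmetic above.
# Assumed workload (from the text, not measured): 100 transactions,
# 200 rows touched per transaction, ~3-level b-trees, 98% reads / 2% writes,
# and 1 table write + 2 index writes per written row.
transactions = 100
rows_per_tx = 200
btree_levels = 3
read_fraction = 0.98
write_fraction = 0.02

# Reads: each row read costs ~3 logical I/Os (one per b-tree level).
logical_reads = read_fraction * rows_per_tx * transactions * btree_levels

# Writes: 1 table + 2 index logical writes per written row, to the log.
logical_writes = write_fraction * rows_per_tx * transactions * (1 + 2)

print(f"logical reads:  {logical_reads:,.0f}")   # 58,800
print(f"logical writes: {logical_writes:,.0f}")  # 1,200
```

The point of the exercise: of ~58,800 logical reads, only a handful hit disk, so a warm analysis workload of this shape ends up CPU-bound rather than I/O-bound.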

Transaction processing is a spectrum between activity logging and
database-centric design. The former, where actions are simply stored in the
database, is totally I/O-bound, with the engine acting as a thin layer of
logical-to-physical mapping. Database-centric processing makes the engine a
functional server of discrete actions - and is a big CPU hog.

What my CPU tends to be doing is a combination of general processing,
complex SQL processing: nested loops and sorting and hashing and triggers
and SPs.

I'm curious about you having flat CPU, which is not my experience. Are your
apps mature and stable?



/Aaron

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] plan problem

2004-04-06 Thread Ken Geis
I am trying to find an efficient way to draw a random sample from a 
complex query.  I also want it to be easy to use within my application.

So I've defined a view that encapsulates the query.  The id in the 
"driving" table is exposed, and I run a query like:

select * from stats_record_view
 where id in (select id from driver_stats
   order by random()
   limit 3);
driver_stats.id is unique, the primary key.  The problem I'm having is 
that neither the ORDER BY nor the LIMIT change the uniqueness of that 
column, but the planner doesn't know that.  It does a HashAggregate to 
make sure the results are unique.  It thinks that 200 rows will come out 
of that operation, and then 200 rows is small enough that it thinks a 
Nested Loop is the best way to proceed from there.

I can post more of the query plan, but I don't think it would be very 
helpful.  I'm considering just making a sample table and creating an 
analogous view around that.  I'd like to keep this as simple as 
possible though.

Ken





Re: [PERFORM] good pc but bad performance, why?

2004-04-06 Thread Stefan Kaltenbrunner
huang yaqin wrote:
> hello,
>
> I have some questions about using postgresql 7.4.1 on redhat adv server 2.1.
> I use an IBM335 as server; it has 4 cpus and 1G RAM, but I got very bad performance.

This is most likely a dual-processor Xeon machine with HT, because the
x335 is limited to two physical cpus.

> I can only do about 50 inserts per second, even worse than my pc (PIII 800, 256M RAM). Can anyone give me some advice?

Any chance you are using the onboard MPT-Fusion "RAID" controller with
RAID1 - we have seen absolutely horrible performance from these
controllers here.
Using them as a normal SCSI controller with software RAID on top fixed
this for us ...

stefan

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] possible improvement between G4 and G5

2004-04-06 Thread Josh Berkus
Aaron,

> I'm surprised by this thought. I tend to hit CPU bottlenecks more often than
> I/O ones. In most applications, db I/O is a combination of buffer misses and
> logging, which are both reasonably constrained. 

Not my experience at all.  In fact, the only times I've seen modern platforms 
max out the CPU was when:
a) I had bad queries with bad plans, or
b) I had reporting queries that did a lot of calculation for display (think 
OLAP).

Otherwise, on the numerous servers I administrate, RAM spikes, and I/O 
bottlenecks, but the CPU stays almost flat.

Of course, most of my apps are large databases (i.e. too big for RAM) with a 
heavy transaction-processing component.

What kind of applications are you running?

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [PERFORM] possible improvement between G4 and G5

2004-04-06 Thread Aaron Werman

- Original Message - 
From: "Tom Lane" <[EMAIL PROTECTED]>
To: "Qing Zhao" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Tuesday, April 06, 2004 1:47 AM
Subject: Re: [PERFORM] possible improvement between G4 and G5


> Qing Zhao <[EMAIL PROTECTED]> writes:
> > We have got a G5 64-bit processor to replace an old G4 32-bit
> > processor.  Given everything else equal, should we see a big
> > improvement on PG's performance?
>
> Nope.  Database performance typically depends on disk performance first,
> and RAM size second.

I'm surprised by this thought. I tend to hit CPU bottlenecks more often than
I/O ones. In most applications, db I/O is a combination of buffer misses and
logging, which are both reasonably constrained. RAM size seems to me to be
the best way to improve performance, and then CPU which is needed to perform
the in-memory searching, locking, versioning,  and processing, and finally
I/O (this is not the case in small I/O subsystems - if you have fewer than a
dozen drives, you're easily I/O bound). I/O is often the thing I tune first,
because I can do it in place without buying hardware.

Conceptually, an RDBMS converts slow random I/O into in-memory processing
and sequential logging writes. If successful, it should reduce the I/O
overhead.

/Aaron

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] good pc but bad performance, why?

2004-04-06 Thread Tom Lane
huang yaqin <[EMAIL PROTECTED]> writes:
>   I have some questions about using postgresql 7.4.1 on redhat adv server 2.1.
> I use an IBM335 as server; it has 4 cpus and 1G RAM, but I got very bad performance.
> I can only do about 50 inserts per second, even worse than my pc (PIII 800, 256M RAM).
> Can anyone give me some advice?

If the cheap machine appears to be able to commit more transactions
per second than the better one, it's very likely because the cheap
machine has a disk that lies about write completion.  Is the server
using SCSI disks by any chance?  To a first approximation, IDE drives
lie by default, SCSI drives don't.

regards, tom lane
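Tom's write-completion point can be illustrated with a small timing sketch (illustrative Python, not part of the thread: it times fsync'd vs. unsynced appends to a scratch file, which is roughly the difference between a drive that honours write completion and one that lies about it, or a server started with fsync disabled via "-F" as earlier in this thread):

```python
import os
import tempfile
import time

def timed_writes(n, sync):
    """Append n small 'commit records' to a scratch file, optionally
    fsync'ing after each one (as a durable WAL commit must)."""
    fd, path = tempfile.mkstemp()
    try:
        start = time.time()
        for _ in range(n):
            os.write(fd, b"commit record\n")
            if sync:
                os.fsync(fd)  # wait until the OS/disk claims the write is durable
        return time.time() - start
    finally:
        os.close(fd)
        os.unlink(path)

synced = timed_writes(200, sync=True)
unsynced = timed_writes(200, sync=False)
print(f"200 synced writes:   {synced:.3f}s")
print(f"200 unsynced writes: {unsynced:.3f}s")
```

On a drive that really flushes on fsync, the synced loop is bounded by rotational latency (on the order of 100 commits/s for a 7200 rpm disk, in the same ballpark as the ~50 inserts/s reported); on a drive that caches writes, or with fsync off, the two times converge - which is exactly why the "slower" PC can look faster.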



Re: [PERFORM] possible improvement between G4 and G5

2004-04-06 Thread Jim C. Nasby
On Tue, Apr 06, 2004 at 01:47:22AM -0400, Tom Lane wrote:
> Qing Zhao <[EMAIL PROTECTED]> writes:
> > We have got a G5 64-bit processor to replace an old G4 32-bit
> > processor.  Given everything else equal, should we see a big
> > improvement on PG's performance?
> 
> Nope.  Database performance typically depends on disk performance first,
> and RAM size second.  A 64-bit processor might help by allowing you to
> install more RAM, but you didn't say that you had.

Memory bandwidth is a consideration too, so you might see some
performance improvements on a G5. We recently debated between Xeons and
Opterons in a new PGSQL server and a little poking around on the lists
indicated that the Opterons did perform better, presumably due to the
increased memory bandwidth. Incidentally, this is why you need about 2x
the CPUs on Sun hardware vs RS6000 hardware for database stuff (and that
gets expensive if you're paying per CPU!).
-- 
Jim C. Nasby, Database Consultant  [EMAIL PROTECTED]
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"



[PERFORM] Weird issues

2004-04-06 Thread Andrew Matthews
This is what I got…

Two servers, one Debian, one Fedora.

Debian: dual 3GHz, 1 GB RAM, IDE; PostgreSQL 7.2.1 on
i686-pc-linux-gnu, compiled by GCC 2.95.4

Fedora: dual 3GHz, 1 GB RAM, SCSI; PostgreSQL 7.3.4-RH on
i386-redhat-linux-gnu, compiled by GCC i386-redhat-linux-gcc (GCC) 3.3.2
20031022 (Red Hat Linux 3.3.2-1)

Both have the same databases, and both have had VACUUM FULL run on
them. Both are doing the same query:

Select * from vpopmail;  (vpopmail is a view; this is the view:)

    View "vpopmail"
  Column   |          Type          | Modifiers
-----------+------------------------+-----------
 pw_name   | character varying(32)  |
 pw_domain | character varying(64)  |
 pw_passwd | character varying      |
 pw_uid    | integer                |
 pw_gid    | integer                |
 pw_gecos  | character varying      |
 pw_dir    | character varying(160) |
 pw_shell  | character varying(20)  |
View definition: SELECT ea.email_name AS pw_name, ea.domain AS pw_domain,
get_pwd(u.username, '127.0.0.1'::"varchar", '101'::"varchar",
'MD5'::"varchar") AS pw_passwd, 0 AS pw_uid, 0 AS pw_gid, ''::"varchar" AS
pw_gecos, ei.directory AS pw_dir, ei.quota AS pw_shell FROM email_addresses
ea, email_info ei, users u, user_resources ur WHERE ((((ea.user_resource_id
= ei.user_resource_id) AND (get_pwd(u.username, '127.0.0.1'::"varchar",
'101'::"varchar", 'MD5'::"varchar") IS NOT NULL)) AND (ur.id =
ei.user_resource_id)) AND (u.id = ur.user_id)) AND (NOT (EXISTS (SELECT
forwarding.email_id FROM forwarding WHERE (forwarding.email_id = ea.id))));

Both are set to the same buffers and everything… this is the execution time:

Debian: Total runtime: 35594.81 msec

Fedora: Total runtime: 2279869.08 msec

Huge difference as you can see… here are the pastes of the stuff.

Debian:

user_acl=# explain analyze SELECT count(*) from vpopmail;
NOTICE:  QUERY PLAN:

Aggregate  (cost=438231.94..438231.94 rows=1 width=20) (actual time=35594.67..35594.67 rows=1 loops=1)
  ->  Hash Join  (cost=434592.51..438142.51 rows=35774 width=20) (actual time=34319.24..35537.11 rows=70613 loops=1)
        ->  Seq Scan on email_info ei  (cost=0.00..1721.40 rows=71640 width=4) (actual time=0.04..95.13 rows=71689 loops=1)
        ->  Hash  (cost=434328.07..434328.07 rows=35776 width=16) (actual time=34319.00..34319.00 rows=0 loops=1)
              ->  Hash Join  (cost=430582.53..434328.07 rows=35776 width=16) (actual time=2372.45..34207.21 rows=70613 loops=1)
                    ->  Seq Scan on users u  (cost=0.00..1938.51 rows=71283 width=4) (actual time=0.81..30119.58 rows=70809 loops=1)
                    ->  Hash  (cost=430333.64..430333.64 rows=35956 width=12) (actual time=2371.51..2371.51 rows=0 loops=1)
                          ->  Hash Join  (cost=2425.62..430333.64 rows=35956 width=12) (actual time=176.73..2271.14 rows=71470 loops=1)
                                ->  Seq Scan on email_addresses ea  (cost=0.00..426393.25 rows=35956 width=4) (actual time=0.06..627.49 rows=71473 loops=1)
                                      SubPlan
                                        ->  Index Scan using forwarding_idx on forwarding  (cost=0.00..5.88 rows=1 width=4) (actual time=0.00..0.00 rows=0 loops=71960)
                                ->  Hash  (cost=1148.37..1148.37 rows=71637 width=8) (actual time=176.38..176.38 rows=0 loops=1)
                                      ->  Seq Scan on user_resources ur  (cost=0.00..1148.37 rows=71637 width=8) (actual time=0.03..82.21 rows=71686 loops=1)
Total runtime: 35594.81 msec

EXPLAIN

And for Fedora it's:

Aggregate  (cost=416775.52..416775.52 rows=1 width=20) (actual time=2279868.57..2279868.58 rows=1 loops=1)
  ->  Hash Join  (cost=413853.79..416686.09 rows=35772 width=20) (actual time=2279271.26..2279803.91 rows=70841 loops=1)
        Hash Cond: ("outer".user_resource_id = "inner".id)
        ->  Seq Scan on email_info ei  (cost=0.00..1666.07 rows=71907 width=4) (actual time=8.12..171.10 rows=71907 loops=1)
        ->  Hash  (cost=413764.36..413764.36 rows=35772 width=16) (actual time=2279263.03..2279263.03 rows=0 loops=1)
              ->  Hash Join  (cost=410712.87..413764.36 rows=35772 width=16) (actual time=993.90..2279008.72 rows=70841 loops=1)
                    Hash Cond: ("outer".id = "inner".user_id)
                    ->  Seq Scan on users u  (cost=0.00..1888.85 rows=71548 width=4) (actual time=18.38..2277152.51 rows=71028 loops=1)
                          Filter: (get_pwd(username, '127.0.0.1'::character varying, '101'::character varying, 'MD5'::character varying) IS NOT NULL)
                    ->  Hash  (cost=410622.99..410622.99 rows=35952 width=12) (actual time=975.40..975.40 rows=0 loops=1)
                          ->  Hash Join  (cost=408346.51..410622.99 rows=35952 width=12) (actual time=507.52..905.91 rows=71697 loops=1)
                                Hash Cond: ("outer"

Re: [PERFORM] performance comparison postgresql/ms-sql server

2004-04-06 Thread Kaarel
Heiko Kehlenbrink wrote:
> i want to convince people to use postgresql instead of ms-sql server, so i
> set up a kind of comparison: insert data / select data from postgresql /
> ms-sql server

[...]

> do you have any hints like compiler-flags and so on to get the answering
> time from postgresql equal to ms-sql?
> (btw both dbms were running on exactly the same hardware)
>
> i use suse 8.1
> postgresql 7.2 compiled from the rpms for using postgis, but that is
> another story...
> 1.5 gig ram
> 1.8 ghz intel cpu
>
> every help welcome

Suse 8.1 comes with a 2.4 series kernel, I suppose. Many have witnessed a 
speed increase when using a 2.6 series kernel. Might consider this too, 
besides the newer PostgreSQL version already suggested. 2.6 has some 
scheduling options that are not enabled by default but may enhance 
database performance 
(http://story.news.yahoo.com/news?tmpl=story&cid=75&e=2&u=/nf/20040405/tc_nf/23603).

Kaarel

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] performance comparison postgresql/ms-sql server

2004-04-06 Thread Shridhar Daithankar
Heiko Kehlenbrink wrote:
>> Hmm... I would suggest if you are testing, you should try 7.4.2. 7.4 has
>> some good optimisation for hash aggregates, though I am not sure if it
>> applies to averaging.
>
> would be the last option till we are running other applications on that
> 7.2 system

I can understand..

>> Also try forcing a seq. scan by turning off index scan. I guess index
>> scan for so many rows is not exactly a good thing even if tuple size is
>> pretty small.
>
> a sequential scan gives me the following result:
>
> [EMAIL PROTECTED]:~> time psql -d test -c 'explain analyse select
> avg(dist) from massive2 where dist > 100*sqrt(3.0)::float8 and dist <
> 150*sqrt(3.0)::float8 ;'
> NOTICE:  QUERY PLAN:
> Aggregate  (cost=1193714.43..1193714.43 rows=1 width=8) (actual
> time=166718.54..166718.54 rows=1 loops=1)
>   ->  Seq Scan on massive2  (cost=0.00..1192478.00 rows=494573 width=8)
> (actual time=3233.22..165576.40 rows=49 loops=1)
> Total runtime: 166733.73 msec

Certainly bad and not an option.. I can't think of anything offhand to 
speed this up..

 Shridhar

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[PERFORM] Back to Linux 2.6 kernel thoughts...

2004-04-06 Thread Richard Huxton
I seem to remember discussion of anticipatory vs deadline scheduler in 2.6. 
Here is what Andrew Morton (I think) says:

"The deadline scheduler has two additional scheduling queues that were not 
available to the 2.4 IO scheduler. The two new queues are a FIFO read queue 
and a FIFO write queue. This new multi-queue method allows for greater 
interactivity by giving the read requests a better deadline than write 
requests, thus ensuring that applications rarely will be delayed by read 
requests.

Deadline scheduling is best suited for database servers and high disk 
performance systems. Morton has experienced up to 15 percent increases on 
database loads while using deadline scheduling."

http://story.news.yahoo.com/news?tmpl=story&cid=75&e=2&u=/nf/20040405/tc_nf/23603

Nothing very in-depth in the story.

-- 
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] good pc but bad performance, why?

2004-04-06 Thread Bill Moran
huang yaqin wrote:
> hello,
> 
>   I have some questions about using postgresql 7.4.1 on redhat adv server 2.1.
> I use an IBM335 as server; it has 4 cpus and 1G RAM, but I got very bad performance.
> I can only do about 50 inserts per second, even worse than my pc (PIII 800, 256M RAM).
> Can anyone give me some advice?

Have you referenced this document?:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

-- 
Bill Moran
Potential Technologies
http://www.potentialtech.com




Re: [PERFORM] good pc but bad performance, why?

2004-04-06 Thread Richard Huxton
On Tuesday 06 April 2004 09:01, huang yaqin wrote:
> hello,
>
>   I have some questions about using postgresql 7.4.1 on redhat adv server 2.1.
> I use an IBM335 as server; it has 4 cpus and 1G RAM, but I got very bad
> performance. I can only do about 50 inserts per second, even worse than
> my pc (PIII 800, 256M RAM). Can anyone give me some advice?

How have you tuned your postgresql.conf file?
What disk systems do you have?
What does vmstat/iostat show as the bottleneck in the system?

-- 
  Richard Huxton
  Archonet Ltd



[PERFORM] good pc but bad performance, why?

2004-04-06 Thread huang yaqin
hello,

I have some questions about using postgresql 7.4.1 on redhat adv server 2.1.
I use an IBM335 as server; it has 4 cpus and 1G RAM, but I got very bad performance.
I can only do about 50 inserts per second, even worse than my pc (PIII 800, 256M RAM).
Can anyone give me some advice?

huang yaqin
[EMAIL PROTECTED]
  2004-04-06





Re: [PERFORM] performance comparison postgresql/ms-sql server

2004-04-06 Thread Shridhar Daithankar
Heiko Kehlenbrink wrote:
> [EMAIL PROTECTED]:~> psql -d test -c 'explain analyse select avg(dist)
> from massive2 where dist > (100*sqrt(3.0))::float8 and dist <
> (150*sqrt(3.0))::float8;'
> NOTICE:  QUERY PLAN:
> Aggregate  (cost=14884.61..14884.61 rows=1 width=8) (actual
> time=3133.24..3133.24 rows=1 loops=1)
>   ->  Index Scan using massive2_dist on massive2  (cost=0.00..13648.17
> rows=494573 width=8) (actual time=0.11..2061.38 rows=49 loops=1)
> Total runtime: 3133.79 msec
> EXPLAIN
>
> seems to me that most time was needed for the index scanning...

Hmm... I would suggest if you are testing, you should try 7.4.2. 7.4 has some 
good optimisation for hash aggregates, though I am not sure if it applies to 
averaging.

Also try forcing a seq. scan by turning off index scan. I guess index scan for 
so many rows is not exactly a good thing even if tuple size is pretty small.

 Shridhar
