Re: [PERFORM] First set of OSDL Shared Mem scalability results, some

2004-10-23 Thread Curt Sampson
On Sat, 9 Oct 2004, Tom Lane wrote:

 mmap provides msync which is comparable to fsync, but AFAICS it
 provides no way to prevent an in-memory change from reaching disk too
 soon.  This would mean that WAL entries would have to be written *and
 flushed* before we could make the data change at all, which would
 convert multiple updates of a single page into a series of write-and-
 wait-for-WAL-fsync steps.  Not good.  fsync'ing WAL once per transaction
 is bad enough, once per atomic action is intolerable.

Back when I was working out how to do this, I reckoned that you could
use mmap by keeping a write queue for each modified page. Reading,
you'd have to read the datum from the page and then check the write
queue for that page to see if that datum had been updated, using the
new value if it's there. Writing, you'd add the modified datum to the
write queue, but not apply the write queue to the page until you'd had
confirmation that the corresponding transaction log entry had been
written. So multiple writes are no big deal; they just all queue up in
the write queue, and at any time you can apply as much of the write
queue to the page itself as the current log entry will allow.

There are several different strategies available for mapping and
unmapping the pages, and in fact there might need to be several
available to get the best performance out of different systems. Most
OSes do not seem to be optimized for having thousands or tens of
thousands of small mappings (certainly NetBSD isn't), but I've never
done any performance tests to see what kind of strategies might work
well or not.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
 Make up enjoying your city life...produced by BIC CAMERA

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Insert performance, what should I expect?

2004-10-23 Thread Gaetano Mendola
Brock Henry wrote:
 Any comments/suggestions would be appreciated.
Tune also the disk I/O elevator.
look at this: http://www.varlena.com/varlena/GeneralBits/49.php
Regards
Gaetano Mendola

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Insert performance, what should I expect?

2004-10-23 Thread Steinar H. Gunderson
On Sat, Oct 23, 2004 at 12:31:32PM +0200, Gaetano Mendola wrote:
 Any comments/suggestions would be appreciated.
 Tune also the disk I/O elevator.
 
 look at this: http://www.varlena.com/varlena/GeneralBits/49.php

Mm, interesting. I've heard somewhere that the best for database-like loads
on Linux is to disable the anticipatory I/O scheduler
(http://kerneltrap.org/node/view/567), which should probably
influence the numbers for elvtune also -- anybody know whether this is true
or not for PostgreSQL?

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] futex results with dbt-3

2004-10-23 Thread Gaetano Mendola
Josh Berkus wrote:
 Tom,


The bigger problem here is that the SMP locking bottlenecks we are
currently seeing are *hardware* issues (AFAICT anyway).  The only way
that futexes can offer a performance win is if they have a smarter way
of executing the basic atomic-test-and-set sequence than we do;
and if so, we could read their code and adopt that method without having
to buy into any large reorganization of our code.


 Well, initial results from Gavin/Neil's patch seem to indicate that, while
 futexes do not cure the CSStorm bug, they do lessen its effects in terms of
 real performance loss.
I proposed weeks ago to see how the CSStorm is affected by stick each backend
in one processor ( where the process was born ) using the cpu-affinity capability
( kernel 2.6 ), is this proposal completely out of mind ?
Regards
Gaetano Mendola



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Slow query

2004-10-23 Thread Joshua Marsh
Any time you run subqueries, it's going to slow down the update
process a lot.  Each record that is updated in source_song_title runs
two additional queries.  When I do large updates like this, I usualy
Run a transaction that will select all the new data into a new table
on a join.  For example

SELECT 
  a.*, 
  b.licensing_match_order, 
  b.affiliation_match_order, 
  d.title
INTO 
  updated_data 
FROM 
  source_song_title AS a
INNER JOIN
  source_system AS b
ON
  b.id = d.id
INNER JOIN
  source_song AS c
ON
  a.id = c.id
INNER JOIN
  source_title AS d
ON
  a.id = d.id

I'm not sure that query does what you want, but you get the idea. 
Then just drop the old table and rename the updated_data table.  This
way instead of doing a bunch of updates, you do one select and a
rename.

-Josh

On Fri, 22 Oct 2004 16:37:14 -0400, Tom Lane [EMAIL PROTECTED] wrote:
 Roger Ging [EMAIL PROTECTED] writes:
  update source_song_title set
  source_song_title_id = nextval('source_song_title_seq')
  ,licensing_match_order = (select licensing_match_order from
  source_system where source_system_id = ss.source_system_id)
  ,affiliation_match_order = (select affiliation_match_order from
  source_system where source_system_id = ss.source_system_id)
  ,title = st.title
  from source_song_title sst
  join source_song ss on ss.source_song_id = sst.source_song_id
  join source_title st on st.title_id = sst.title_id
  where source_song_title.source_song_id = sst.source_song_id;
 
 Why is source_song_title sst in there?  To the extent that
 source_song_id is not unique, you are multiply updating rows
 because of the self-join.
 
 regards, tom lane
 
 ---(end of broadcast)---
 TIP 4: Don't 'kill -9' the postmaster


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] First set of OSDL Shared Mem scalability results, some wierdness ...

2004-10-23 Thread Tom Lane
Curt Sampson [EMAIL PROTECTED] writes:
 Back when I was working out how to do this, I reckoned that you could
 use mmap by keeping a write queue for each modified page. Reading,
 you'd have to read the datum from the page and then check the write
 queue for that page to see if that datum had been updated, using the
 new value if it's there. Writing, you'd add the modified datum to the
 write queue, but not apply the write queue to the page until you'd had
 confirmation that the corresponding transaction log entry had been
 written. So multiple writes are no big deal; they just all queue up in
 the write queue, and at any time you can apply as much of the write
 queue to the page itself as the current log entry will allow.

Seems to me the overhead of any such scheme would swamp the savings from
avoiding kernel/userspace copies ... the locking issues alone would be
painful.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] futex results with dbt-3

2004-10-23 Thread Gaetano Mendola
Tom Lane wrote:
Gaetano Mendola [EMAIL PROTECTED] writes:
I proposed weeks ago to see how the CSStorm is affected by stick each
backend in one processor ( where the process was born ) using the
cpu-affinity capability ( kernel 2.6 ), is this proposal completely
out of mind ?

That was investigated long ago.  See for instance
http://archives.postgresql.org/pgsql-performance/2004-04/msg00313.php
If I read correctly this help on the CSStorm, I guess also that this could
also help the performances. Unfortunatelly I do not have any kernel 2.6 running
on SMP to give it a try.
Regards
Gaetano Mendola
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] futex results with dbt-3

2004-10-23 Thread Gaetano Mendola
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
Josh Berkus wrote:
| Gaetano,
|
|
|I proposed weeks ago to see how the CSStorm is affected by stick each
|backend in one processor ( where the process was born ) using the
|cpu-affinity capability ( kernel 2.6 ), is this proposal completely out of
|mind ?
|
|
| I don't see how that would help.   The problem is not backends switching
| processors, it's the buffermgrlock needing to be swapped between processors.
This is not clear to me. What happen if during a spinlock a backend is
moved away from one processor to another one ?
Regards
Gaetano Mendola

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org
iD8DBQFBeudN7UpzwH2SGd4RAkL9AKCUY9vsw1CPmBV1kC7BKxUtuneN2wCfXaYr
E8utuJI34MAIP8jUm6By09M=
=oRvU
-END PGP SIGNATURE-
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] Vacuum takes a really long time, vacuum full required

2004-10-23 Thread Max Baker
On Tue, Oct 19, 2004 at 11:40:17AM -0400, Rod Taylor wrote:
  Whatever the case, the database still slows down to a halt after a month or
  so, and I have to go in and shut everything down and do a VACUUM FULL by
  hand.  One index (of many many) takes 2000 seconds to vacuum.  The whole
  process takes a few hours.
 
 Do a REINDEX on that table instead, and regular vacuum more frequently.
 
  $ pg_config --version
  PostgreSQL 7.3.2
 
 7.4.x deals with index growth a little better 7.3 and older did.

I did a REINDEX of the database.  The results are pretty insane, the db went
from 16GB to 381MB.  Needless to say things are running a lot faster. 

I will now take Tom's well-given advice and upgrade to 7.4.  But at least
now I have something to tell my users who are not able to do a DB upgrade
for whatever reason.

Thanks for all your help folks!
-m

Before:
# du -h pgsql   
135Kpgsql/global
128Mpgsql/pg_xlog
 80Mpgsql/pg_clog
3.6Mpgsql/base/1
3.6Mpgsql/base/16975
1.0Kpgsql/base/16976/pgsql_tmp
 16Gpgsql/base/16976
 16Gpgsql/base
 16Gpgsql

After Reindex:
# du /data/pgsql/
131K/data/pgsql/global
128M/data/pgsql/pg_xlog
 81M/data/pgsql/pg_clog
3.6M/data/pgsql/base/1
3.6M/data/pgsql/base/16975
1.0K/data/pgsql/base/16976/pgsql_tmp
268M/data/pgsql/base/16976
275M/data/pgsql/base
484M/data/pgsql/

After Vacuum:
# du /data/pgsql/  
131K/data/pgsql/global
144M/data/pgsql/pg_xlog
 81M/data/pgsql/pg_clog
3.6M/data/pgsql/base/1
3.6M/data/pgsql/base/16975
1.0K/data/pgsql/base/16976/pgsql_tmp
149M/data/pgsql/base/16976
156M/data/pgsql/base
381M/data/pgsql/

netdisco= select relname, relpages from pg_class order by relpages desc;

Before:
 relname | relpages 
-+--
 idx_node_switch_port_active |   590714
 idx_node_switch_port|   574344
 idx_node_switch |   482202
 idx_node_mac|   106059
 idx_node_mac_active |99842

After:
 relname | relpages 
-+--
 node_ip |13829
 node| 9560
 device_port | 2124
 node_ip_pkey| 1354
 idx_node_ip_ip  | 1017
 idx_node_ip_mac_active  |  846


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] First set of OSDL Shared Mem scalability results, some

2004-10-23 Thread Curt Sampson
On Sat, 23 Oct 2004, Tom Lane wrote:

 Seems to me the overhead of any such scheme would swamp the savings from
 avoiding kernel/userspace copies ...

Well, one really can't know without testing, but memory copies are
extremely expensive if they go outside of the cache.

 the locking issues alone would be painful.

I don't see why they would be any more painful than the current locking
issues. In fact, I don't see any reason to add more locking than we
already use when updating pages.

cjs
-- 
Curt Sampson  [EMAIL PROTECTED]   +81 90 7737 2974   http://www.NetBSD.org
 Make up enjoying your city life...produced by BIC CAMERA

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]