one of our tables has to be updated frequently, but concurrently running SELECT-queries must also have low latency times (it's being accessed through a web interface).
I'm looking for ideas that might improve the interactive performance of the system, without slowing down the updates too much. Here are the characteristics of the table and its use:
- approx. 2 million rows
- approx. 4-5 million rows per day are replaced in short bursts of 1-200k rows (average ~3000 rows per update)
- the table needs 6 indexes (not all indexes are used all the time, but keeping them all the time slows the system down less than re-creating some of them just before they're needed and dropping them afterwards)
- an "update" means that 1-200k rows with a common value in a particular field are replaced with an arbitrary number of new rows (with the same value in that field), i.e.:
begin transaction; delete from t where id=5; insert into t (id,...) values (5,...); ... [1-200k rows] end;
The problem is, that a large update of this kind can delay SELECT queries running in parallel for several seconds, so the web interface used by several people will be unusable for a short while.
Currently, I'm using temporary tables:
create table xyz as select * from t limit 0; insert into xyz ... ... begin transaction; delete from t where id=5; insert into t select * from xyz; end; drop table xyz;
This is slightly faster than inserting directly into t (and probably faster than using COPY, even though using that might reduce the overall load on the database).
What other possibilities are there, other than splitting up the 15 columns of that table into several smaller tables, which is something I'd like to avoid? Would replication help? (I doubt it, but haven't tried it yet...) Writing to one table (without indexes) and creating indexes and renaming it to the "read table" periodically in a double buffering-like fashion wouldn't work either(?), since the views and triggers would have to be re-created every time as well and other problems might arise.
The postgresql.conf options are already reasonably tweaked for performance(IMHO), but perhaps some settings are particularly critical:
shared_buffers=100000 (I tried many values, this seems to work well for us - 12GB RAM) wal_buffers=500 sort_mem=800000 checkpoint_segments=16 effective_cache_size=1000000 etc.
Any help/suggestions would be greatly appreciated... Even if it's something like "you need a faster db box, there's no other way" ;-)
Regards, Marinos
---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])