Frederik Ramm <frede...@remote.org> wrote:
 
> I am (ab)using a PostgreSQL database (with PostGIS extension) in
> a large data processing job - each day, I load several GB of data,
> run a lot of analyses on it, and then throw everything away again.
> Loading, running, and dumping the results takes about 18 hours
> every day.
> 
> The job involves a lot of index building and sorting, and is run
> on a 64-bit machine with 96 GB of RAM.
> 
> Naturally I would like the system to use as much RAM as possible
> before resorting to disk-based operations, but no amount of 
> maintenance_work_mem setting seems to make it do my bidding.
 
If you can tolerate some risk that on a given day the analysis might
fail to complete, or that you might have to push the schedule back to
regenerate it, you could improve performance by compromising
recoverability.  Your mention of a RAM disk suggests you are willing
to consider that kind of risk.
 
 - If a single session can be maintained for loading and using the
data, you might be able to use temporary tables and a large
temp_buffers setting (see the first sketch after this list).  Of
course, when the connection closes, the tables are gone.
 
 - You could turn off fsync and full_page_writes, but on a crash
your database might be corrupted beyond usability.  (These and the
next two settings appear in the postgresql.conf sketch after this
list.)
 
 - You could turn off synchronous_commit.
 
 - Make sure you have archiving turned off.
 
 - If you are not already doing so, load the data into each table
within the same database transaction that does the CREATE TABLE or
TRUNCATE; with archiving turned off, that allows PostgreSQL to skip
WAL-logging the loaded rows (see the example after this list).
 
Other than the possibility that the temp table might keep things in
RAM, these suggestions don't directly address your question, but I
thought they might be helpful.
 
-Kevin
