Hello performance wizards! (Sorry for the re-post if this appears twice - I see
no evidence that e-mailing to pgsql-performance is working yet.)

My client has migrated his 8.3 hosted DB to new machines running PG 9.0. It’s
time to look at the config settings.

Immediately below are the config settings.

The specifics of the DB and how it is used are below that, but in general let
me say that this is a full-time ETL system, with only a handful of actual
“users”; automated processes keep over 300 connections running “import”
programs 24/7.
 
I appreciate the help,
 
Carlo
 
The host system:
 
Intel® Xeon® Processor X5560 (8M Cache, 2.80 GHz, 6.40 GT/s Intel® QPI) x 2,
dual quad core
48 GB RAM
RAID 10, 6 x 600 GB 15krpm SAS
Linux RedHat/CentOS 2.6.18-164.el5
 
Sys admin says that the battery-backed RAID controller and the consequent
write settings should have no impact on performance. Is this true?
 
Current config and my thoughts on what to do with it. If a setting isn’t
mentioned here, it is at its default value:
 
# ===========================================================
max_connections = 300
shared_buffers = 500MB          # At 48GB of RAM, could we go to 2GB?
                                # - what is the impact on Linux config?
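# (My rough understanding of the Linux side, with illustrative numbers only:
#  a bigger shared_buffers on 9.0 needs the SysV shared memory limits raised
#  to match, e.g. in /etc/sysctl.conf something like
#    kernel.shmmax = 4294967296   # max single segment, in bytes
#    kernel.shmall = 1048576      # total shared memory, in 4kB pages
#  followed by "sysctl -p" before restarting Postgres. Please correct me if
#  I have this wrong.)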
effective_cache_size = 2457MB   # Sys admin says assume 25% of 48GB
                                #   is used by OS and other apps
work_mem = 512MB                # Complex reads are called many times a second
                                #   from each connection, so what should this be?
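# (Back-of-envelope arithmetic on the above, assuming each connection uses
#  only one work_mem allocation at a time - a simplification, since a single
#  query can use several: 300 connections x 512MB = 150GB of potential
#  demand, against 48GB of physical RAM in the box.)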
maintenance_work_mem = 256MB    # Should this be bigger - 1GB at least?
checkpoint_segments = 128       # There is lots of write activity; this is high
                                #   but could it be higher?
#checkpoint_completion_target not set;
#   Recommendation appears to be 0.9 for our 128 checkpoint segments
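#   (My reading is that this would translate to literally adding
#    "checkpoint_completion_target = 0.9" here - please correct me if not.)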
 
default_statistics_target = 200 # Deprecated?
 
#autovacuum_freeze_max_age not set;
#   recommendation is 1,000,000 for non-activity.
 
# What is the metric for the wal_buffers setting?
wal_buffers = 4MB               # Looks low, recommendation appears to be 16MB.
                                #   Is it really "set it and forget it"?
 
#synchronous_commit not set;
#   Recommendation is to turn this off and leave fsync on
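#   (If I understand correctly, that would mean "synchronous_commit = off"
#    with "fsync = on" - accepting the risk of losing the last few commits
#    after a crash, but not risking corruption. Please confirm.)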
 
#fsync not set;
#   Recommendation is to leave this on
 
#wal_level not set;
#   Is this only needed for replication?
 
#max_wal_senders not set;
#   Is this only needed for replication?
 
# The issue of vacuum/analyze is a tricky one.
#   Data imports are running 24/7. Once the DB is seeded, the vast majority
#   of write activity is updates, and not to indexed columns.
#   Deletions are very rare.
vacuum_cost_delay = 20ms
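# (One idea we have not tried - a sketch only, table name invented for
#  illustration: per-table autovacuum settings for the hottest update-only
#  tables, instead of relying on the global defaults, e.g.
#    ALTER TABLE core.some_hot_table
#      SET (autovacuum_vacuum_scale_factor = 0.05,
#           autovacuum_analyze_scale_factor = 0.05);
#  Would that make sense for our update-heavy, delete-free workload?)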
 
# The background writer has not been addressed at all.
#   Can our particular setup benefit from changing the bgwriter values?
bgwriter_lru_maxpages = 100     # This is the default
 
listen_addresses = '*'
port = 5432
log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%a.log'
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 0
log_line_prefix = '%t'
track_counts = on
# ===========================================================
 

The DB is pretty large, and organized by schema. The most active are:
 
1)  One “Core” schema
    a.  100 tables
    b.  Typical row counts in the low millions
    c.  This represents the enterprise’s core data
    d.  Equal read/write activity
2)  Multiple “Import” schemas
    a.  Contain several thousand raw “flat file” tables
    b.  Ragged column structure, up to hundreds of columns
    c.  Erratic row counts, from dozens of rows to 1 million
    d.  Each table is sequentially read once; only status fields are written back
3)  One “Audit” schema
    a.  A new log table is created every month
    b.  Typical row count is 200 million
    c.  Logs every write to the “Core”
    d.  Almost entirely write operations, but the few read operations that are
        done have to be fast owing to the size of the tables
    e.  Links the “Core” data to the “Import” data
 
There are next to no “users” on the system - each connection services a
constantly running import process which takes the incoming “import” data,
analyzes the “core” data and decides how to distil the import into the core.
 
Analytical processes are not report-oriented
The “Core” reads are mostly single row results
The “Import” reads are 1,000 row pages
There is next to no use of aggregate queries
 
Transactional processes are a steady stream of writes
Not bursty or sporadic
Overwhelmingly inserts and updates, next to no deletes
Each transaction represents 10 - 50 writes to the “core” schema
 
                                          
