I have a box with an app and PostgreSQL on it. Hardware is 2 x 2.8 GHz Xeons
with 512KB cache, 4 GB of memory, and 6 SCSI disks in a software RAID 5,
running Trustix 2.2 with a 2.6.15.3 kernel. The data and indexes are on the
RAID array while the transaction log is on the disk with the OS. All is well.
The application executes one transaction every 60 seconds or so. The
transaction can range from tiny to relatively large: maybe 30-70k inserts and
60-100k updates... nothing too heavy, taking about 8-12 seconds to finish the
entire update in the worst case. The application is using the latest JDBC
driver... I am using PreparedStatements with addBatch/executeBatch/clearBatch
to send statements in batches of 10 thousand (is that high?).
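For what it's worth, the batching pattern looks roughly like the sketch below;
the table and column names are made up for illustration, not our real schema:

// Sketch of the batching described above (illustrative only; the real
// application code and schema differ).
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class BatchLoader {
    private static final int BATCH_SIZE = 10000;

    static void insertRows(Connection conn, List<long[]> rows) throws SQLException {
        conn.setAutoCommit(false);   // everything goes into one transaction
        PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO events (id, value) VALUES (?, ?)");
        try {
            int pending = 0;
            for (long[] row : rows) {
                ps.setLong(1, row[0]);
                ps.setLong(2, row[1]);
                ps.addBatch();
                if (++pending == BATCH_SIZE) {   // flush every 10k statements
                    ps.executeBatch();
                    ps.clearBatch();
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch();               // flush the remainder
            }
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();                     // this is the case that bloated
            throw e;
        } finally {
            ps.close();
        }
    }
}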
The box itself is a little oversubscribed for memory, which is causing us to
swap a bit... As the application runs, I notice the postgres process which
handles this particular app connection grows in memory seemingly
uncontrollably until kaboom. Once the kernel kills off enough processes and
the system settles, I see the postgres process is at 1.9GB of resident memory
and 77MB of shared memory. This challenges a number of assumptions I have
made in the last while and raises a few questions... BTW, I am assuming this
is not a memory leak because the same install of our software on a box with
8GB of memory and no swap being used shows no unexplained growth in memory...
it is perfectly healthy and quite performant.
Anyway, due to errors the transaction was rolled back, after which the
postgres process remains at 901MB of resident memory and 91MB of shared
memory:
27116 postgres 15 0 1515m 901m 91m S 0.0 22.9 18:33.96 postgres: qradar
qradar ::ffff:x.x.x.x(51149) idle
There are a few things I would like to understand.
- What in Postgres will grow at an uncontrolled rate when the system is under
heavy load or the transaction is larger? There must be something not governed
by shared memory or the other settings in postgresql.conf. It seems like,
once we start hitting swap, postgres grows in memory, resulting in more
swapping... until applications start getting killed.
- When the transaction was rolled back, why did the process hold onto the
901MB of memory?
- When is a transaction too big? Is this determined by the configuration and
performance of wal_buffers and the WAL, or is there housekeeping which MUST
be done at commit/rollback to avoid situations like this, thus indicating
there is an upper bound?
I have been configuring postgres from tidbits I collected reading this list
over the last few months... not sure if what I have is totally right for the
workload, but when I have adequate memory and avoid swap, we are more than
happy with the performance. Any setting not listed below is at its default.
shared_buffers = 32767
work_mem = 20480
maintenance_work_mem = 32768
max_fsm_pages = 4024000
max_fsm_relations = 2000
fsync = false
wal_sync_method = fsync
wal_buffers = 4096
checkpoint_segments = 32
checkpoint_timeout = 1200
checkpoint_warning = 60
commit_delay = 5000
commit_siblings = 5
effective_cache_size = 175000
random_page_cost = 2
autovacuum = true
autovacuum_naptime = 60
autovacuum_vacuum_threshold = 500
autovacuum_analyze_threshold = 250
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_cost_delay=100
autovacuum_vacuum_cost_limit=100
default_statistics_target = 40
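For scale, my own back-of-envelope math on those numbers (assuming the
default 8kB block size, with shared_buffers/wal_buffers/effective_cache_size
in pages and work_mem/maintenance_work_mem in kB): shared_buffers is about
256MB, work_mem 20MB per sort/hash, maintenance_work_mem 32MB, wal_buffers
32MB, and effective_cache_size about 1.4GB (a planner hint, not an
allocation). So the explicitly configured allocations are nowhere near the
1.9GB that one backend grew to.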
Is there anything here which looks weird or misconfigured? I am just starting
to play with the bgwriter configuration so I did not include it. Typically
there is little or no iowait... and no reason to think anything is
misconfigured, from what I have seen.
In one transaction I have seen as many as 5 checkpoint_segments be
created/used, so I was considering increasing wal_buffers to 8192 from 4096,
given as many as 4 segments can be in memory/cache at once... need to test
this though...
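If my units are right (16MB per WAL segment, 8kB per wal_buffers page), 5
segments is roughly 80MB of WAL generated in that one transaction, while
wal_buffers at 4096 pages is 32MB and 8192 would be 64MB... though as I
understand it the buffers only have to hold WAL that hasn't been written out
yet, not the whole transaction, so I'm not sure how much the bump will buy.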
Anyone have any thoughts on what could have caused the bloat?
thanks