We are using a third party monitoring solution called Zabbix with Postgresql 
9.1.We are observing lots of sharelock problems possibly as a result of 
multiple, concurrent updates to the specific table called items.

First of all, what can I do to reduce lock contention in Postgresql?  Surely 
someone is using postgresql in a high transaction environment supporting 
concurrent updates to the same table?  Secondly how do I determine what would 
be appropriate in terms of settings for the postgresql.conf to deal with this 
issue.  Third, what you suggest to the Zabbix developers to fix such a problem?

2012-11-01 00:14:23 PDT zabbix zabbix 127.0.0.1 - DETAIL:  Process 5983 waits 
for ShareLock on transaction 48349033; blocked by process 5992.
        Process 5992 waits for ShareLock on transaction 48349079; blocked by 
process 5983.
        Process 5983: update items set 
lastclock=1351754051,lastns=230190945,prevvalue=lastvalue,lastvalue='2.692934' 
where itemid=200589;
        update items set 
lastclock=1351754050,lastns=329344694,prevvalue=lastvalue,lastvalue='2080' 
where itemid=200650;
        update items set 
lastclock=1351754051,lastns=378134858,prevvalue=lastvalue,lastvalue='1' where 
itemid=200710;
        update items set 
lastclock=1351754051,lastns=381995730,prevorgvalue='56',prevvalue=lastvalue,lastvalue='0'
 where itemid=201150;
        update items set 
lastclock=1351754051,lastns=378331407,prevvalue=lastvalue,lastvalue='Unit: 0 
Slot: 0 Port: 3 Gigabit - Level 0x6000001' where itemid=201300;
        update items set 
lastclock=1351754051,lastns=380237518,prevorgvalue='0',prevvalue=lastvalue,lastvalue='0'
 where itemid=201350;
        update items set 
lastclock=1351754051,lastns=378222316,prevvalue=lastvalue,lastvalue='0' where 
itemid=201670;
        update items set 
lastclock=1351754050,lastns=319926666,prevvalue=lastvalue,lastvalue='893' where 
itemid=201970;
        update items set lastclock=1351754051,lastns=393
        Process 5992: update items set 
lastclock=1351754043,lastns=980316865,prevvalue=lastvalue,lastvalue='65381720064'
 where itemid=23343;
        update items set 
lastclock=1351754044,lastns=392778204,prevvalue=lastvalue,lastvalue='887279616' 
where itemid=23344;
        update items set 
lastclock=1351754044,lastns=3196233,prevvalue=lastvalue,lastvalue='1351754044' 
where itemid=23763;
        update items set 
lastclock=1351754044,lastns=867856940,prevvalue=lastvalue,lastvalue='92.097214' 
where itemid=23764;
        update items set 
lastclock=1351754043,lastns=998890879,prevorgvalue='94217753500',prevvalue=lastvalue,lastvalue='14024'
 where itemid=23943;
        update items set 
lastclock=1351754044,lastns=865008900,prevorgvalue='0',prevvalue=lastvalue,lastvalue='0'
 where itemid=23944;
        update items set 
lastclock=1351754043,lastns=970666655,prevorgvalue='66951216662',prevvalue=lastvalue,lastvalue='605816'
 where itemid=24003;
        update items set 
lastclock=1351754044,lastns=871362232,prevvalue=lastvalue,lastvalue='2715914240'
 where itemid=24004;
        update items set lastclock=13
2012-11-01 00:14:23 PDT zabbix zabbix 127.0.0.1 - HINT:  See server log for 
query details.

Here is our modifications to the default postgresql.conf file:

listen_addresses = '*'
max_connections = 300                   # (change requires restart)
shared_buffers = 40GB                   # min 128kB  #  WSUIT
work_mem = 1024MB                         # WSU IT
maintenance_work_mem = 512MB            # WSU IT
effective_io_concurrency = 4
synchronous_commit = off                # immediate fsync at commit  #  WSUIT
wal_buffers = 1024MB
checkpoint_segments = 512               # in logfile segments, min 1, 16MB each 
 #  WSUIT
checkpoint_completion_target = 0.9      # checkpoint target duration, 0.0 - 1.0 
 #  WSUIT
effective_cache_size = 48GB                                         #  WSUIT
logging_collector = on                  # Enable capturing of stderr and csvlog
log_directory = '/var/log/pg_log'       # directory where log files are 
written, WSUIT
log_filename = 'postgresql-%a.log'      # log file name pattern,
log_truncate_on_rotation = on           # If on, an existing log file of the
log_rotation_size = 100MB               # Automatic rotation of logfiles will
log_min_error_statement = error         # values in order of decreasing detail:
log_min_duration_statement = 1000       # WSUIT -- Log statements over ten 
seconds
log_line_prefix = '%t %u %d %h - '                      # special values:
log_lock_waits = on                     # WSUIT - Turn on logging waits
autovacuum_max_workers = 6              # WSUIT -- Increase these for our 
larger tables
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'                     # locale for system error 
message
lc_monetary = 'en_US.UTF-8'                     # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'                      # locale for number formatting
lc_time = 'en_US.UTF-8'                         # locale for time formatting

postgres=# select version();
                                                   version
--------------------------------------------------------------------------------------------------------------
PostgreSQL 9.1.6 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 
20120305 (Red Hat 4.4.6-4), 64-bit

Hardware:
12 cpu cores,
96 GB of RAM
4-15K RPM drives in a RAID 1+0

TOP output:

top - 12:42:20 up 97 days, 21:17,  8 users,  load average: 1.33, 1.19, 1.14
Tasks: 718 total,   7 running, 711 sleeping,   0 stopped,   0 zombie
Cpu(s): 18.7%us,  1.1%sy,  0.0%ni, 80.1%id,  0.0%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  99060040k total, 96891404k used,  2168636k free,   220860k buffers
Swap: 16777208k total,  1467392k used, 15309816k free, 85556104k cached

You can see the load is really low and virtually no I/O waits.

Reply via email to