Re: [Dovecot] High Load Average on POP/IMAP.

2013-08-21 Thread Urban Loesch

Hi,

Try running the following command while the server is under high load:

# ps -ostat,pid,time,wchan='WCHAN-',cmd ax  |grep D

Do you get back something like this?

STAT   PID     TIME WCHAN- CMD
D    18713 00:00:00 synchronize_srcu           dovecot/imap
D    18736 00:00:00 synchronize_srcu           dovecot/imap
D    18775 00:00:05 synchronize_srcu           dovecot/imap
D    20330 00:00:00 synchronize_srcu           dovecot/imap
D    20357 00:00:00 synchronize_srcu           dovecot/imap
D    20422 00:00:00 synchronize_srcu           dovecot/imap
D    20687 00:00:00 synchronize_srcu           dovecot/imap
S+   20913 00:00:00 pipe_wait                  grep D
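To tally such stuck processes by the kernel function they block in, a rough one-liner sketch along these lines may help (not from the original mail):

```shell
# List processes in uninterruptible sleep (state D) and count them per
# kernel wait channel; a persistently non-zero count inflates the load
# average even when the CPUs are mostly idle.
ps -eo stat,wchan:32,comm | awk '$1 ~ /^D/ {print $2}' | sort | uniq -c | sort -rn
```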

If so, it could be a problem with inotify in your kernel. You can try
disabling inotify in the kernel with:

echo 0 > /proc/sys/fs/inotify/max_user_watches
echo 0 > /proc/sys/fs/inotify/max_user_instances

Full article:
http://thread.gmane.org/gmane.linux.kernel/1315430
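Note that writing to /proc changes the limits only until the next reboot. A sketch of making the change persistent, assuming the usual sysctl mechanism is in place:

```shell
# Runtime change (note the '>' redirection; without it the values are
# only printed to stdout, not written to the kernel):
echo 0 > /proc/sys/fs/inotify/max_user_watches
echo 0 > /proc/sys/fs/inotify/max_user_instances

# To persist across reboots, add to /etc/sysctl.conf:
#   fs.inotify.max_user_watches = 0
#   fs.inotify.max_user_instances = 0
# then reload the settings with:
sysctl -p
```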

For me this resolved the problem; the load went down to < 1.00.


Regards
Urban




On 21.08.2013 12:37, Kavish Karkera wrote:

Hi,

We have a serious issue on our POP/IMAP servers these days. The load
average of the servers spikes up to 400-500 (as reported by uptime) during
particular time periods, mostly around noon and in the evening, but it
lasts only a few minutes.

We have 2 servers running Dovecot 1.1.20 behind a load balancer; we use
keepalived (1.1.13) for load balancing.

Server specification:
Operating System : CentOS 5.5 64bit
CPU cores : 16
RAM : 8GB

Mail and Indexes are mounted on NFS (NetApp).

Below is the dovecot -n output, followed by the top results during a high spike.


#

# 1.1.20: /usr/local/etc/dovecot.conf
# OS: Linux 2.6.28 x86_64 CentOS release 5.5 (Final)
log_path: /var/log/dovecot-info.log
info_log_path: /var/log/dovecot-info.log
syslog_facility: local1
protocols: imap imaps pop3 pop3s
listen(default): *:143
listen(imap): *:143
listen(pop3): *:110
ssl_listen(default): *:993
ssl_listen(imap): *:993
ssl_listen(pop3): *:995
ssl_cert_file: /usr/local/etc/ssl/certs/dovecot.pem
ssl_key_file: /usr/local/etc/ssl/private/dovecot.pem
disable_plaintext_auth: no
login_dir: /usr/local/var/run/dovecot/login
login_executable(default): /usr/local/libexec/dovecot/imap-login
login_executable(imap): /usr/local/libexec/dovecot/imap-login
login_executable(pop3): /usr/local/libexec/dovecot/pop3-login
login_greeting: Welcome to Popserver.
login_process_per_connection: no
max_mail_processes: 1024
mail_max_userip_connections(default): 100
mail_max_userip_connections(imap): 100
mail_max_userip_connections(pop3): 50
verbose_proctitle: yes
first_valid_uid: 99
first_valid_gid: 99
mail_location: maildir:~/Maildir:INDEX=/indexes/%h:CONTROL=/indexes/%h
mmap_disable: yes
mail_nfs_storage: yes
mail_nfs_index: yes
lock_method: dotlock
mail_executable(default): /usr/local/libexec/dovecot/imap
mail_executable(imap): /usr/local/libexec/dovecot/imap
mail_executable(pop3): /usr/local/libexec/dovecot/pop3
mail_plugins(default): quota imap_quota
mail_plugins(imap): quota imap_quota
mail_plugins(pop3): quota
mail_plugin_dir(default): /usr/local/lib/dovecot/imap
mail_plugin_dir(imap): /usr/local/lib/dovecot/imap
mail_plugin_dir(pop3): /usr/local/lib/dovecot/pop3
pop3_no_flag_updates(default): no
pop3_no_flag_updates(imap): no
pop3_no_flag_updates(pop3): yes
pop3_lock_session(default): no
pop3_lock_session(imap): no
pop3_lock_session(pop3): yes
pop3_client_workarounds(default):
pop3_client_workarounds(imap):
pop3_client_workarounds(pop3): outlook-no-nuls
lda:
   postmaster_address: ad...@research.com
   mail_plugins: cmusieve quota mail_log
   mail_plugin_dir: /usr/local/lib/dovecot/lda
   auth_socket_path: /var/run/dovecot/auth-master
auth default:
   worker_max_count: 15
   passdb:
 driver: sql
 args: /usr/local/etc/dovecot-mysql.conf
   userdb:
 driver: sql
 args: /usr/local/etc/dovecot-mysql.conf
   userdb:
 driver: prefetch
   socket:
 type: listen
 client:
   path: /var/run/dovecot/auth-client
   mode: 432
   user: nobody
   group: nobody
 master:
   path: /var/run/dovecot/auth-master
   mode: 384
   user: nobody
   group: nobody
plugin:
   quota_warning: storage=95%% /usr/local/bin/quota-warning.sh 95 %u
   quota_warning2: storage=80%% /usr/local/bin/quota-warning.sh 80 %u
   quota: maildir:storage=64
##

##

top - 12:08:31 up 206 days, 10:45,  3 users,  load average: 189.88, 82.07, 55.97
Tasks: 771 total,   1 running, 767 sleeping,   1 stopped,   2 zombie
Cpu(s):  8.3%us,  7.6%sy,  0.0%ni,  8.3%id, 75.0%wa,  0.0%hi,  0.8%si,  0.0%st
Mem:  16279824k total, 11913788k used,  4366036k free,   334308k buffers
Swap:  

Re: [Dovecot] High Load Average on POP/IMAP.

2013-08-21 Thread Kavish Karkera
Thanks Urban, will try this and will let you know.

Regards,
Kavish Karkera





 From: Urban Loesch b...@enas.net
To: dovecot@dovecot.org
Sent: Wednesday, 21 August 2013 5:34 PM
Subject: Re: [Dovecot] High Load Average on POP/IMAP.
 


Re: [Dovecot] High Load Average on POP/IMAP.

2013-08-21 Thread Stan Hoeppner
On 8/21/2013 5:37 AM, Kavish Karkera wrote:

 We have a serious issue on our POP/IMAP servers these days. The load
 average of the servers spikes up to 400-500 (as reported by uptime) during
 particular time periods, mostly around noon and in the evening, but it
 lasts only a few minutes.
 
 We have 2 servers running Dovecot 1.1.20 behind a load balancer; we use
 keepalived (1.1.13) for load balancing.
 
 Server specification.
 Operating System : CentOS 5.5 64bit
 CPU cores : 16
 RAM : 8GB
 
 Mail and Indexes are mounted on NFS (NetApp).
...

 Cpu(s):  8.3%us,  7.6%sy,  0.0%ni,  8.3%id, 75.0%wa,  0.0%hi,  0.8%si,  0.0%st
  ^^^
   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND

   408 mysql 18   0  384m  38m 4412 S 52.8  0.2  42221:44 mysqld   
 

This doesn't seem to be a dovecot issue.  mysql has apparently 8 (or
more) threads on 8 cores all blocking on IO.  I see a few possible causes.

1.  The NetApp is unable to keep up with the request rate because:

   a.  There are too few spindles in the RAID set backing this NFS
volume and/or the file(s) aren't properly striped across all spindles

   b.  An inappropriate RAID level.  The mysql job is apparently doing
large table updates and you're experiencing massive RMW latency from
RAID5/6.  This is why one should never put a transactional database, or
one that sees large frequent table updates, on a parity RAID
volume--unless the disks are SSD.  SSDs have no mechanical parts, thus
RMW latency is almost nonexistent.
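To make the RAID5 write penalty concrete, here is a back-of-envelope sketch; the disk count and per-disk IOPS are illustrative assumptions, not measurements from this setup:

```shell
# A small random write on RAID5 costs 4 disk I/Os (read old data, read
# old parity, write new data, write new parity), versus 2 on RAID10
# (write both mirror halves). Assumed: 8 spindles at 150 IOPS each.
disks=8
iops_per_disk=150
raid10_wiops=$(( disks * iops_per_disk / 2 ))
raid5_wiops=$(( disks * iops_per_disk / 4 ))
echo "RAID10 random-write IOPS: $raid10_wiops"
echo "RAID5  random-write IOPS: $raid5_wiops"
```

Under these assumptions RAID5 delivers half the random-write throughput of RAID10 on the same spindles, before accounting for the extra rotational latency of each RMW cycle.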

2. Apparently 8 (or more) threads are concurrently accessing the same
file or files.  Thus the massive iowait could simply be the result of
filesystem and/or NFS locking, NFS client caching issues, etc.

The cause of the massive iowait could be one or all of the above, or
could be something else entirely.  These are the typical causes.

You seem to have a database job scheduled to run twice daily that
triggers the problem.  Identify this job, figure out what it does, why
it does it, how necessary it is, and whether it can be scheduled to run
at off-peak hours.  If it can, you may want to simply do so, as it may be
expensive, in hardware and/or labor dollars, to fix the IO latency problem.
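If the job does turn out to be movable, rescheduling it is a one-line crontab change. A hypothetical example (the script path and time are placeholders, not taken from this setup):

```shell
# In the owning user's crontab (crontab -e), move the maintenance job
# to 03:00, outside the reported noon/evening load peaks:
#   0 3 * * * /usr/local/bin/db-maintenance.sh
```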

-- 
Stan