Re: [Dovecot] Restarting dovecot-auth stops authentication

2010-12-22 Thread Attila Nagy

 On 12/21/2010 01:31 PM, Timo Sirainen wrote:

On 21.12.2010, at 13.51, Attila Nagy wrote:


On 11/18/2010 06:45 PM, Timo Sirainen wrote:

On Wed, 2010-11-17 at 21:17 +0100, Attila Nagy wrote:


pop3-login: Error: net_connect_unix(pop3) failed: Connection refused

Right. This is the main problem. So the question becomes why is the
connection being refused.

I would really love to solve this now. :)
What connects here to what? pop3-login to the pop3 service?

Yes. For each login.

service imap {
  client_limit = 8

This definitely doesn't help. I'm not sure if setting client_limit=1 fixes this 
problem entirely or not, but try that first.
It seems it does, at least I haven't got any connection refused errors 
since I set client_limit to 1 on the imap and pop3 services.
BTW, the difference is about +10k open file handles, and 2-3 (judging 
from munin graphs, which aren't so precise) times "active" memory usage 
(this is on FreeBSD).
Are there any chances to solve this, or should I forget running in this 
mode until you implement async IO?


Thanks,


Re: [Dovecot] Restarting dovecot-auth stops authentication

2010-12-21 Thread Attila Nagy

 On 12/21/2010 06:41 PM, Timo Sirainen wrote:

On 21.12.2010, at 19.37, Attila Nagy wrote:


service imap-login {
  client_limit = 8

Why imap-login with this small client_limit? The default should be ok (1000).

Because I think that Dovecot's processes block on IO and not just on distinct 
IO operations, but larger tasks, like opening a maildir with a lot of e-mails 
without indexes.

*-login processes do no disk I/O.
Oh, I've had problem with authentication previously, and that stuck. 
Removed, thanks for noticing.

Am I wrong? Or partly wrong, because it uses blocking IO, but it can multiplex 
them, so while one user struggles with the file system to build indexes of his 
maildir, an other client in the same process can happily do POP/IMAP stuff?
The rationale is to spread IO (and users) amongst processes, because the OS can 
schedule them concurrently, but don't have too many processes, because that 
eats a lot of memory and other resources.

This applies to imap/pop3 service, and that's why only client_limit=1 works 
well with them for now.

Apart from these errors, it works fine. It would be interesting to see a 
response time statistics for both settings, to see how worse it gets 
with raising client_limit.


Re: [Dovecot] Restarting dovecot-auth stops authentication

2010-12-21 Thread Timo Sirainen
On 21.12.2010, at 19.37, Attila Nagy wrote:

>>> service imap-login {
>>>  client_limit = 8
>> Why imap-login with this small client_limit? The default should be ok (1000).
> Because I think that Dovecot's processes block on IO and not just on distinct 
> IO operations, but larger tasks, like opening a maildir with a lot of e-mails 
> without indexes.

*-login processes do no disk I/O.

> Am I wrong? Or partly wrong, because it uses blocking IO, but it can 
> multiplex them, so while one user struggles with the file system to build 
> indexes of his maildir, an other client in the same process can happily do 
> POP/IMAP stuff?
> The rationale is to spread IO (and users) amongst processes, because the OS 
> can schedule them concurrently, but don't have too many processes, because 
> that eats a lot of memory and other resources.

This applies to imap/pop3 service, and that's why only client_limit=1 works 
well with them for now.



Re: [Dovecot] Restarting dovecot-auth stops authentication

2010-12-21 Thread Attila Nagy

 On 12/21/2010 01:31 PM, Timo Sirainen wrote:

On 21.12.2010, at 13.51, Attila Nagy wrote:


On 11/18/2010 06:45 PM, Timo Sirainen wrote:

On Wed, 2010-11-17 at 21:17 +0100, Attila Nagy wrote:


pop3-login: Error: net_connect_unix(pop3) failed: Connection refused

Right. This is the main problem. So the question becomes why is the
connection being refused.

I would really love to solve this now. :)
What connects here to what? pop3-login to the pop3 service?

Yes. For each login.

Yes, that's clear.

service imap-login {
  client_limit = 8

Why imap-login with this small client_limit? The default should be ok (1000).
Because I think that Dovecot's processes block on IO and not just on 
distinct IO operations, but larger tasks, like opening a maildir with a 
lot of e-mails without indexes.
Am I wrong? Or partly wrong, because it uses blocking IO, but it can 
multiplex them, so while one user struggles with the file system to 
build indexes of his maildir, an other client in the same process can 
happily do POP/IMAP stuff?
The rationale is to spread IO (and users) amongst processes, because the 
OS can schedule them concurrently, but don't have too many processes, 
because that eats a lot of memory and other resources.



service imap {
  client_limit = 8

This definitely doesn't help. I'm not sure if setting client_limit=1 fixes this 
problem entirely or not, but try that first.
I've already done experiments with that, but had to switch context and 
forgot the results. Will do again.




Re: [Dovecot] Restarting dovecot-auth stops authentication

2010-12-21 Thread Timo Sirainen
On 21.12.2010, at 13.51, Attila Nagy wrote:

> On 11/18/2010 06:45 PM, Timo Sirainen wrote:
>> On Wed, 2010-11-17 at 21:17 +0100, Attila Nagy wrote:
>> 
>>> pop3-login: Error: net_connect_unix(pop3) failed: Connection refused
>> Right. This is the main problem. So the question becomes why is the
>> connection being refused.
> I would really love to solve this now. :)
> What connects here to what? pop3-login to the pop3 service?

Yes. For each login.

> service imap-login {
>  client_limit = 8

Why imap-login with this small client_limit? The default should be ok (1000).

> service imap {
>  client_limit = 8

This definitely doesn't help. I'm not sure if setting client_limit=1 fixes this 
problem entirely or not, but try that first.

> service pop3-login {
>  client_limit = 8
..
> service pop3 {
>  client_limit = 8

Same for pop3.

Re: [Dovecot] Restarting dovecot-auth stops authentication

2010-12-21 Thread Attila Nagy

 On 11/18/2010 06:45 PM, Timo Sirainen wrote:

On Wed, 2010-11-17 at 21:17 +0100, Attila Nagy wrote:


pop3-login: Error: net_connect_unix(pop3) failed: Connection refused

Right. This is the main problem. So the question becomes why is the
connection being refused.

I would really love to solve this now. :)
What connects here to what? pop3-login to the pop3 service?
My current config is:
# 2.0.8: /usr/local/etc/dovecot/dovecot.conf
# OS: FreeBSD 8.2-PRERELEASE amd64
auth_cache_negative_ttl = 0
auth_cache_size = 100 M
auth_cache_ttl = 1 days
default_process_limit = 2000
disable_plaintext_auth = no
info_log_path = syslog
log_path = /var/log/dovecot-errors.log
mail_fsync = never
mail_gid = 999
mail_location = maildir:~/Maildir
mail_plugins = " quota"
mail_uid = 999
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope 
encoded-character vacation subaddress comparator-i;ascii-numeric 
relational regex imap4flags copy include variables body enotify 
environment mailbox date

passdb {
  args = /usr/local/etc/dovecot/dovecot-ldap.conf.ext
  driver = ldap
}
plugin {
  autocreate = INBOX.Trash
  autocreate2 = INBOX.Drafts
  autocreate3 = INBOX.Sent
  autocreate4 = INBOX.Spam
  autosubscribe = INBOX.Trash
  autosubscribe2 = INBOX.Drafts
  autosubscribe3 = INBOX.Sent
  autosubscribe4 = INBOX.Spam
  mail_log_events = delete undelete expunge copy mailbox_delete 
mailbox_rename flag_change save mailbox_create

  mail_log_fields = uid box msgid size flags vsize from subject
  quota = maildir:User quota
}
protocols = pop3 imap lmtp
service anvil {
  client_limit = 8192
}
service auth {
  client_limit = 8192
  unix_listener auth-userdb {
mode = 0600
user = qmailldap
  }
}
service imap-login {
  client_limit = 8
  process_min_avail = 16
  service_count = 0
}
service imap {
  client_limit = 8
  process_min_avail = 16
  service_count = 0
}
service lmtp {
  inet_listener lmtp {
port = 24
  }
  user = qmailldap
}
service pop3-login {
  client_limit = 8
  process_min_avail = 16
  service_count = 0
}
service pop3 {
  client_limit = 8
  process_min_avail = 32
  service_count = 0
}
ssl = no
userdb {
  driver = prefetch
}
userdb {
  args = /usr/local/etc/dovecot/dovecot-ldap.conf.ext
  driver = ldap
}
protocol lmtp {
  mail_plugins = " quota mail_log notify"
}
protocol imap {
  mail_max_userip_connections = 1024
  mail_plugins = " quota imap_quota autocreate"
}
protocol pop3 {
  mail_max_userip_connections = 1024
  mail_plugins = " quota autocreate"
}

I've raised pop3's process_min_avail, but I still get these errors. 
There is nothing more in the error log.



I've checked, auth's start time is yesterday, so it wasn't restarted. I
guess what remains is the resource limit (client_limit maybe?).

If that happens, there should be a log entry. You can grep process_limit
and client_limit from logs.

Nothing with those.

How does dovecot logs timed out LDAP lookups?

"Connection appears to be hanging, reconnecting"

I usually log errors to a different log file. Normally that file should
be empty, so you can easily see all the errors and warnings that could
be causing problems.

log_path = /var/log/dovecot-errors.log
info_log_path = /var/log/dovecot-info.log

Done that, but I only have these in the errors.log:
Dec 21 12:41:06 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:41:06 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Connection refused
Dec 21 12:43:29 pop3-login: Error: net_connect_unix(pop3) failed: 
Conne

Re: [Dovecot] Restarting dovecot-auth stops authentication

2010-11-19 Thread Attila Nagy

On 11/16/10 14:40, Attila Nagy wrote:
The Dovecot wiki states that Dovecot's master restarts all died 
processes, which is good for availability. But when I kill 
dovecot/auth (to simulate an error condition which happened on a 
machine), the authentication fails with:
Nov 16 14:32:40 be dovecot: imap: Error: net_connect_unix(auth-master) 
failed: No such file or directory

Of course I forgot to tell it's 2.0.6.
BTW, sending SIGUSR2 to dovecot/auth doesn't lot anything, while sending 
SIGHUP logs the "clearing cache" message. The wiki says on USR2 it 
should log cache statistics.


Re: [Dovecot] Restarting dovecot-auth stops authentication

2010-11-18 Thread Timo Sirainen
On Wed, 2010-11-17 at 21:17 +0100, Attila Nagy wrote:

> pop3-login: Error: net_connect_unix(pop3) failed: Connection refused

Right. This is the main problem. So the question becomes why is the
connection being refused.

> I've checked, auth's start time is yesterday, so it wasn't restarted. I 
> guess what remains is the resource limit (client_limit maybe?).

If that happens, there should be a log entry. You can grep process_limit
and client_limit from logs.

> The strange thing is according to tcpdump, dovecot does very few LDAP 
> lookups, and they are answered in a reasonable time, but I haven't done 
> a strict correlation with the above errors.

This shouldn't have anything to do with authentication.

> How does dovecot logs timed out LDAP lookups?

"Connection appears to be hanging, reconnecting"

I usually log errors to a different log file. Normally that file should
be empty, so you can easily see all the errors and warnings that could
be causing problems.

log_path = /var/log/dovecot-errors.log
info_log_path = /var/log/dovecot-info.log




Re: [Dovecot] Restarting dovecot-auth stops authentication

2010-11-17 Thread Attila Nagy

 On 11/17/2010 06:55 PM, Timo Sirainen wrote:

On Wed, 2010-11-17 at 13:45 +0100, Attila Nagy wrote:


Nov 17 11:42:10 be dovecot: pop3-login: Internal login failure (auth
failed, 1 attempts): user=, method=PLAIN, rip=172.28.16.20,
lip=172.16.253.13

Something else really should have been logged just before this. An error
or a warning. There were a few situations where it might not have, which
are fixed by this patch:
http://hg.dovecot.org/dovecot-2.0/rev/aec1f1614028


I'm sorry, I was in a rush and I have a lot of logs.
pop3-login: Error: net_connect_unix(pop3) failed: Connection refused
This precedes all the above log entries.
I've checked, auth's start time is yesterday, so it wasn't restarted. I 
guess what remains is the resource limit (client_limit maybe?).
The strange thing is according to tcpdump, dovecot does very few LDAP 
lookups, and they are answered in a reasonable time, but I haven't done 
a strict correlation with the above errors.
What's the best way to find out what causes this connection refused 
batches once or twice a day?


How does dovecot logs timed out LDAP lookups?


Re: [Dovecot] Restarting dovecot-auth stops authentication

2010-11-17 Thread Timo Sirainen
On Wed, 2010-11-17 at 13:45 +0100, Attila Nagy wrote:

> Nov 17 11:42:10 be dovecot: pop3-login: Internal login failure (auth 
> failed, 1 attempts): user=, method=PLAIN, rip=172.28.16.20, 
> lip=172.16.253.13 

Something else really should have been logged just before this. An error
or a warning. There were a few situations where it might not have, which
are fixed by this patch:
http://hg.dovecot.org/dovecot-2.0/rev/aec1f1614028




Re: [Dovecot] Restarting dovecot-auth stops authentication

2010-11-17 Thread Attila Nagy

On 11/16/10 18:29, Timo Sirainen wrote:

On Tue, 2010-11-16 at 14:52 +0100, Attila Nagy wrote:

Nov 16 14:32:40 be dovecot: imap: Error: net_connect_unix(auth-master)
failed: No such file or directory

Of course I forgot to tell it's 2.0.6.

2.0.7 fixed this.

Thanks, I've upgraded to it.

BTW, I have these in batches:
Nov 17 11:42:10 be dovecot: pop3-login: Internal login failure (auth 
failed, 1 attempts): user=, method=PLAIN, rip=172.28.16.20, 
lip=172.16.253.13
Nov 17 11:42:10 be dovecot: pop3-login: Internal login failure (auth 
failed, 1 attempts): user=, method=PLAIN, rip=172.28.16.20, 
lip=172.16.253.13

[...]
22 from this in the same second, then nothing for hours. This time this 
wasn't because the auth process disappeared.
I suspected LDAP errors, but Dovecot is so effective in LDAP caching 
that there are no 22 LDAP queries in the same second. How could I figure 
out what causes these errors? I don't see any more verbosity in the 
source code in the place, where this comes from, and I have pretty much 
connections, so doing a verbose log for days isn't an option...

Config:
# 2.0.7: /usr/local/etc/dovecot/dovecot.conf
# OS: FreeBSD 8.1-STABLE amd64
auth_cache_negative_ttl = 0
auth_cache_size = 100 M
auth_cache_ttl = 1 days
default_process_limit = 2000
disable_plaintext_auth = no
mail_fsync = never
mail_gid = 999
mail_location = maildir:~/Maildir
mail_plugins = " quota"
mail_uid = 999
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope 
encoded-character vacation subaddress comparator-i;ascii-numeric 
relational regex imap4flags copy include variables body enotify 
environment mailbox date

passdb {
  args = /usr/local/etc/dovecot/dovecot-ldap.conf.ext
  driver = ldap
}
plugin {
  autocreate = INBOX.Trash
  autocreate2 = INBOX.Drafts
  autocreate3 = INBOX.Sent
  autocreate4 = INBOX.Spam
  autosubscribe = INBOX.Trash
  autosubscribe2 = INBOX.Drafts
  autosubscribe3 = INBOX.Sent
  autosubscribe4 = INBOX.Spam
  mail_log_events = delete undelete expunge copy mailbox_delete 
mailbox_rename flag_change save mailbox_create

  mail_log_fields = uid box msgid size flags vsize from subject
  quota = maildir:User quota
}
protocols = pop3 imap lmtp
service anvil {
  client_limit = 8192
}
service auth {
  client_limit = 8192
  unix_listener auth-userdb {
mode = 0600
user = qmailldap
  }
}
service imap-login {
  client_limit = 8
  process_min_avail = 16
  service_count = 0
  vsz_limit = 64 M
}
service imap {
  client_limit = 8
  process_min_avail = 16
  service_count = 0
}
service lmtp {
  inet_listener lmtp {
port = 24
  }
  user = qmailldap
}
service pop3-login {
  client_limit = 8
  process_min_avail = 16
  service_count = 0
}
service pop3 {
  client_limit = 8
  process_min_avail = 16
  service_count = 0
}
ssl = no
userdb {
  driver = prefetch
}
userdb {
  args = /usr/local/etc/dovecot/dovecot-ldap.conf.ext
  driver = ldap
}
protocol lmtp {
  mail_plugins = " quota mail_log notify"
}
protocol imap {
  mail_max_userip_connections = 1024
  mail_plugins = " quota imap_quota autocreate"
}
protocol pop3 {
  mail_max_userip_connections = 1024
  mail_plugins = " quota autocreate"
}
but the process' size barely grows, regardless the large number of 
connections and users:
dovecot   21600  0.9  0.0 32304 14604  ??  S 9:24PM   6:06.91 
dovecot/auth




BTW, sending SIGUSR2 to dovecot/auth doesn't lot anything, while sending
SIGHUP logs the "clearing cache" message. The wiki says on USR2 it
should log cache statistics.

Works here:

Nov 16 17:26:25 auth: Info: Authentication cache hits 0/2 (0%)
Nov 16 17:26:25 auth: Info: Authentication cache inserts: positive: 2 95B, 
negative: 0 0B

So .. Since SIGHUP works, I don't really know. They should be using
exactly the same code right next to each others. I guess something could
disable SIGUSR2 somewhere somehow. What passdb/userdb do you use?


LDAP.
procstat -i says it's OK:
  PID COMM SIG FLAGS
21600 auth HUP  --C
21600 auth INT  --C
21600 auth QUIT ---
21600 auth ILL  ---
21600 auth TRAP ---
21600 auth ABRT ---
[...]
21600 auth USR1 ---
21600 auth USR2 --C




Re: [Dovecot] Restarting dovecot-auth stops authentication

2010-11-16 Thread Timo Sirainen
On Tue, 2010-11-16 at 14:52 +0100, Attila Nagy wrote:
> > Nov 16 14:32:40 be dovecot: imap: Error: net_connect_unix(auth-master) 
> > failed: No such file or directory
> Of course I forgot to tell it's 2.0.6.

2.0.7 fixed this.

> BTW, sending SIGUSR2 to dovecot/auth doesn't lot anything, while sending 
> SIGHUP logs the "clearing cache" message. The wiki says on USR2 it 
> should log cache statistics.

Works here:

Nov 16 17:26:25 auth: Info: Authentication cache hits 0/2 (0%)
Nov 16 17:26:25 auth: Info: Authentication cache inserts: positive: 2 95B, 
negative: 0 0B

So .. Since SIGHUP works, I don't really know. They should be using
exactly the same code right next to each others. I guess something could
disable SIGUSR2 somewhere somehow. What passdb/userdb do you use?



[Dovecot] Restarting dovecot-auth stops authentication

2010-11-16 Thread Attila Nagy

Hi,

The Dovecot wiki states that Dovecot's master restarts all died 
processes, which is good for availability. But when I kill dovecot/auth 
(to simulate an error condition which happened on a machine), the 
authentication fails with:
Nov 16 14:32:40 be dovecot: imap: Error: net_connect_unix(auth-master) 
failed: No such file or directory


It seems -albeit it gets restarted- dovecot/auth doesn't re-create its 
socket file.

Before:
# ls /var/run/dovecot/
anvil   auth-worker doveadm-server
anvil-auth-penalty  config  dovecot.conf
auth-client dictempty
auth-login  director-admin  lmtp
auth-master director-userdb login
auth-userdb dns-client  master.pid
# ps auwx | grep dovecot/auth
dovecot   87455  0.0  0.0 20024  3832  ??  S 2:34PM   0:00.01 
dovecot/auth

# rm /var/run/dovecot/auth-master; kill 87455
# ps auwx | grep dovecot/auth
dovecot   88815  0.0  0.0 20024  3776  ??  S 2:36PM   0:00.01 
dovecot/auth

# ls /var/run/dovecot/
anvil   config  dovecot.conf
anvil-auth-penalty  dictempty
auth-client director-admin  lmtp
auth-login  director-userdb login
auth-userdb dns-client  master.pid
auth-worker doveadm-server

I've deleted the auth-master socket because if I don't, you can't see 
that it's not re-created. :)


Is this a normal behaviour? I understand that killing dovecot/auth is 
not, but Dovecot could survive this easily, if recreating and re-using 
the new socket file would work. And loosing dovecot/auth happens 
sometimes (I don't yet now why).


Thanks,