Re: Too many wait in auth process
>>> service auth-worker {
>>>   client_limit = 1
>>>   idle_kill = 0
>>>   process_limit = 600
>>>   process_min_avail = 0
>>>   service_count = 1
>>>   vsz_limit = 18446744073709551615 B
>>> }
>>
>> What dovecot version is this? With 2.3.17 or later you should probably use service_count=0 here.
>> That would prevent the auth-worker process from dying after each authentication, and the need for a new process to be spawned for each authentication.
>
> Yes, it is 2.3.17.
> I gave it a try; it's slightly better. There are slightly fewer stalled auth processes.
> But I didn't manage to go beyond 2000 clients, although in production there are more than 8000 connections.
> Maybe that's because I hadn't found how to make persistent connections with imaptest, so there were too many logins/logouts.
> I was using a delay to make each client last around 5 seconds.
> So I increased this delay to 120 s; this slows down login/logout and reduces the number of processes stuck in the auth wait queue.
> I think I will go this way to simulate normal load on this server.
> But that doesn't simulate a restart of the service while clients are connected.
>
> Thank you all,
> Ismaël

Hello,

I have made a little progress in my benchmarks.
I found how to use imaptest with a profile to issue the IDLE command and keep connections persistent.
By yesterday I had reached 8000 persistent clients on the bench server.
My target is 6 persistent clients for 250k mailboxes.
The server has 12 procs (24 cores) and 192 GB RAM; the filesystem is ZFS.

Increasing the number of clients beyond 8000 stalls all connections.
Login slows down drastically, but after login, IMAP commands stay fast.

I'm wondering how to go further. I believe I have to tune the imap-login service.
I'm seeing 60 GB of RAM used in my tests; I suppose that's the login processes and the authentication UNIX socket.
Monitoring also alerts about some minor page faults, which could be related.

Conf for now:

service auth-worker {
  client_limit = 1         # because only the master auth process connects to auth worker
  process_limit = 18000    # should be a bit higher than auth_worker_max_count setting
  service_count = 0        # prevent auth-worker process from dying after each authentication
  process_min_avail = 96   # number of CPU cores * 4
}
service imap-login {
  client_limit = 200
  process_limit = 3000
  process_min_avail = 96
  service_count = 0
  vsz_limit = 1G
}
// using high-performance mode: https://doc.dovecot.org/admin_manual/login_processes/

Today I'll try different settings for this imap-login step while trying to increase the number of clients.
If you have any hints on how to achieve that, thank you.

Ismaël Tanguy
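As a rough sanity check of the limits in the configuration above, here is a minimal Python sketch. It assumes that a service's connection ceiling is simply process_limit multiplied by client_limit; the function name and the dictionary layout are hypothetical and are not part of Dovecot.

```python
# Rough capacity check for the service limits quoted in the message above.
# Assumption: a service can hold at most process_limit * client_limit
# simultaneous connections. All names here are illustrative only.

def max_connections(process_limit: int, client_limit: int) -> int:
    """Upper bound on simultaneous connections for one service."""
    return process_limit * client_limit

# Values copied from the "Conf for now" block.
services = {
    "imap-login": {"process_limit": 3000, "client_limit": 200},
    "auth-worker": {"process_limit": 18000, "client_limit": 1},
}

capacity = {name: max_connections(**limits) for name, limits in services.items()}

if __name__ == "__main__":
    for name, limit in capacity.items():
        print(f"{name}: up to {limit} simultaneous connections")
```

Under that assumption the imap-login ceiling is well above the 8000 clients at which the stall appears, so this particular limit is unlikely to be the one being hit. Other limits (for example default_client_limit or OS file descriptor limits) are not modeled here.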
Re: Too many wait in auth process
>> On 8. Feb 2022, at 12.27, itan...@univ-brest.fr wrote:
>>
>> service auth-worker {
>>   client_limit = 1
>>   idle_kill = 0
>>   process_limit = 600
>>   process_min_avail = 0
>>   service_count = 1
>>   vsz_limit = 18446744073709551615 B
>> }
>
> What dovecot version is this? With 2.3.17 or later you should probably use service_count=0 here.
> That would prevent the auth-worker process from dying after each authentication, and the need for a new process to be spawned for each authentication.

Yes, it is 2.3.17.
I gave it a try; it's slightly better. There are slightly fewer stalled auth processes.
But I didn't manage to go beyond 2000 clients, although in production there are more than 8000 connections.
Maybe that's because I hadn't found how to make persistent connections with imaptest, so there were too many logins/logouts.
I was using a delay to make each client last around 5 seconds.
So I increased this delay to 120 s; this slows down login/logout and reduces the number of processes stuck in the auth wait queue.
I think I will go this way to simulate normal load on this server.
But that doesn't simulate a restart of the service while clients are connected.

Thank you all,
Ismaël
Re: Too many wait in auth process
> On 8. Feb 2022, at 12.27, itan...@univ-brest.fr wrote:
>
> service auth-worker {
>   client_limit = 1
>   idle_kill = 0
>   process_limit = 600
>   process_min_avail = 0
>   service_count = 1
>   vsz_limit = 18446744073709551615 B
> }

What dovecot version is this? With 2.3.17 or later you should probably use service_count=0 here.
That would prevent the auth-worker process from dying after each authentication, and the need for a new process to be spawned for each authentication.

Sami
Re: Too many wait in auth process
Hello,
thank you for your advice, and sorry for not having detailed the infrastructure.

ismael> I'm currently benchmarking new hardware aimed to serve around
ismael> 70k users. For now, our IMAP server has 13k users.

> This doesn't help us help you. Is this a new Raspberry Pi 4? Is it a dual-CPU AMD Ryzen with 128 GB of memory and fast NVMe disks? What is your system setup?

Sorry, I have two servers to bench:
- the first one (a model like our current IMAP servers) has 18 TB HDD, 256 GB RAM, 8c/16t
- the second (a new one aimed to serve many more customers) has 24 x 14 TB (HDD SAS), 192 GB DDR4 2.6 GHz, 12c/24t - 2.4 GHz/3.5 GHz
OS is FreeBSD 12.2.

ismael> To run imaptest, I've spawned some bench clients.

> Are these tests run from remote hosts? What kind of network are you using?

Yes, imaptest is running from remote KVM virtual machines in the same DC.
There are some network hops between them, but few.

ismael> Each bench client can run imaptest with 1000 clients.
ismael> More than 1000 clients will load the CPU of this bench client.
ismael> imaptest command (commands are chosen from usage stats on our other IMAP servers):
ismael> imaptest host=x port=xxx userfile=userfile mbox=/root/dovecot-crlf
ismael> pass=s seed=123 clients=1000 select=194 uidfetch=94 noop=70
ismael> status=82 append=49 fetch=276 list=12 store=19 expunge=22
ismael> msubs=4 search=4 logout=1 delete=81 no_pipelining

ismael> With one bench client, everything runs smoothly.
ismael> # ps aux | grep dovecot | awk '{print $11,$12,$13,$14,$15,$16,$17,$18}' | sort | uniq -c
ismael>    1 anvil: [221 connections] (anvil)
ismael>    1 auth: [13 wait, 0 passdb, 0 userdb] (auth)
ismael>    1 dovecot/config
ismael>    1 dovecot/imap
ismael>   84 dovecot/imap-login
ismael>    1 dovecot/log
ismael>   20 dovecot/pop3-login
ismael>    1 grep dovecot
ismael>    1 stats: [1307 connections] (stats)

ismael> When a second bench instance starts imaptest, clients
ismael> of the first and second instances begin to stall:
ismael> 1400 stalled for 20 secs in command: 1 LOGIN "fakeuser644@mailbench" "password"

> So how is your dovecot authentication setup? Are you using a mysql backend? LDAP? Where is the server you're querying against? Are you running mysql on the same server you're running dovecot on?

In production, we use a remote Galera cluster.
For benchmarking, for now, I use static for passdb and a file for userdb.

> Are you running multiple dovecot servers with dovecot director in front of them to help spread the load and to offer resilience if/when a backend server fails?

No. I'm directly benchmarking the backend.

ismael> And:
ismael> # ps aux | grep dovecot | awk '{print $11,$12,$13,$14,$15,$16,$17,$18}' | sort | uniq -c
ismael>    1 anvil: [221 connections] (anvil)
ismael>    1 auth: [1227 wait, 0 passdb, 0 userdb] (auth)
ismael>    1 dovecot/config
ismael>    1 dovecot/imap
ismael>   37 dovecot/imap-login
ismael>    1 dovecot/log
ismael>   20 dovecot/pop3-login
ismael>    1 grep dovecot
ismael>    1 stats: [680 connections] (stats)

ismael> Every auth goes into wait, and the number of connections decreases.
ismael> Using mysql or a password file gives the same results.

> Where is mysql located?

It is a remote one, but for now I'll go with a passwd-file to exclude potential DB problems at the beginning of benchmarking.

ismael> I have used different values for service_count, also with no success.

> Post your configuration details.
# doveconf -n
auth_cache_negative_ttl = 0
auth_cache_size = 100 M
auth_cache_ttl = 2 mins
auth_failure_delay = 5 secs
auth_master_user_separator = *
auth_username_chars = abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890.-_@%+
auth_username_translation = %@
auth_verbose = yes
auth_worker_max_count = 500
base_dir = /var/run/dovecot/
default_client_limit = 10
disable_plaintext_auth = no
imap_idle_notify_interval = 30 secs
listen =
login_greeting = xx
login_trusted_networks = xxx
mail_gid =
mail_uid =
mailbox_list_index = no
namespace {
  inbox = yes
  location =
  prefix = INBOX.
  separator = .
  type = private
}
namespace {
  hidden = yes
  inbox = no
  list = no
  location =
  prefix =
  separator = .
  type = private
}
passdb {
  args = password=#hidden_use-P_to_show#
  driver = static
}
plugin {
  acl = vfile
  quota = maildir:User quota
}
protocols = imap pop3
service anvil {
  client_limit = 97000
  unix_listener anvil-auth-penalty {
    mode = 00
  }
}
service auth-worker {
  client_limit = 1
  idle_kill = 0
  process_limit = 600
  process_min_avail = 0
  service_count = 1
  vsz_limit = 18446744073709551615 B
}
service auth {
  client_limit = 0
  idle_kill = 0
  process_limit = 1
  process_min_avail = 1
  service_count = 0
  vsz_limit = 1000 M
}
service imap-login {
  client_limit = 26000
  process_min_avail = 16
  service_count = 0
  vsz_limit = 1 G
}
service imap {
  drop_priv_before_exec = yes
  process_limit = 1
}
Re: Too many wait in auth process
> "ismael" == ismael tanguy@univ-brest fr > writes:

ismael> I'm currently benchmarking new hardware aimed to serve around
ismael> 70k users. For now, our IMAP server has 13k users.

This doesn't help us help you. Is this a new Raspberry Pi 4? Is it a dual-CPU AMD Ryzen with 128 GB of memory and fast NVMe disks? What is your system setup?

ismael> To run imaptest, I've spawned some bench clients.

Are these tests run from remote hosts? What kind of network are you using?

ismael> Each bench client can run imaptest with 1000 clients.
ismael> More than 1000 clients will load the CPU of this bench client.
ismael> imaptest command (commands are chosen from usage stats on our other IMAP servers):
ismael> imaptest host=x port=xxx userfile=userfile mbox=/root/dovecot-crlf
ismael> pass=s seed=123 clients=1000 select=194 uidfetch=94 noop=70
ismael> status=82 append=49 fetch=276 list=12 store=19 expunge=22
ismael> msubs=4 search=4 logout=1 delete=81 no_pipelining

ismael> With one bench client, everything runs smoothly.
ismael> # ps aux | grep dovecot | awk '{print $11,$12,$13,$14,$15,$16,$17,$18}' | sort | uniq -c
ismael>    1 anvil: [221 connections] (anvil)
ismael>    1 auth: [13 wait, 0 passdb, 0 userdb] (auth)
ismael>    1 dovecot/config
ismael>    1 dovecot/imap
ismael>   84 dovecot/imap-login
ismael>    1 dovecot/log
ismael>   20 dovecot/pop3-login
ismael>    1 grep dovecot
ismael>    1 stats: [1307 connections] (stats)

ismael> When a second bench instance starts imaptest, clients
ismael> of the first and second instances begin to stall:
ismael> 1400 stalled for 20 secs in command: 1 LOGIN "fakeuser644@mailbench" "password"

So how is your dovecot authentication setup? Are you using a mysql backend? LDAP? Where is the server you're querying against? Are you running mysql on the same server you're running dovecot on?

Are you running multiple dovecot servers with dovecot director in front of them to help spread the load and to offer resilience if/when a backend server fails?

ismael> And:
ismael> # ps aux | grep dovecot | awk '{print $11,$12,$13,$14,$15,$16,$17,$18}' | sort | uniq -c
ismael>    1 anvil: [221 connections] (anvil)
ismael>    1 auth: [1227 wait, 0 passdb, 0 userdb] (auth)
ismael>    1 dovecot/config
ismael>    1 dovecot/imap
ismael>   37 dovecot/imap-login
ismael>    1 dovecot/log
ismael>   20 dovecot/pop3-login
ismael>    1 grep dovecot
ismael>    1 stats: [680 connections] (stats)

ismael> Every auth goes into wait, and the number of connections decreases.
ismael> Using mysql or a password file gives the same results.

Where is mysql located?

ismael> I have used different values for service_count, also with no success.

Post your configuration details.

ismael> I think my use of imaptest could be wrong.

It could be. Are you thinking that 2000 users will all be logging into the system at the same time?

ismael> My understanding of service auth is limited for now because
ismael> I'm quite new to Dovecot (I have previously worked with
ismael> Cyrus).

Can't really give you any hints until you tell us more about your setup.

John
Re: Too many wait in auth process
Hey,

please refer to: https://doc.dovecot.org/admin_manual/login_processes/
We are using high-performance mode and it is serving 30k users with no problems.

Best,
Justas

On 2022-02-07 17:33, ismael.tan...@univ-brest.fr wrote:
> Hello,
>
> I'm currently benchmarking new hardware aimed to serve around 70k users.
> For now, our IMAP server has 13k users.
>
> To run imaptest, I've spawned some bench clients.
> Each bench client can run imaptest with 1000 clients.
> More than 1000 clients will load the CPU of this bench client.
>
> imaptest command (commands are chosen from usage stats on our other IMAP servers):
> imaptest host=x port=xxx userfile=userfile mbox=/root/dovecot-crlf pass=s seed=123 clients=1000 select=194 uidfetch=94 noop=70 status=82 append=49 fetch=276 list=12 store=19 expunge=22 msubs=4 search=4 logout=1 delete=81 no_pipelining
>
> With one bench client, everything runs smoothly.
> # ps aux | grep dovecot | awk '{print $11,$12,$13,$14,$15,$16,$17,$18}' | sort | uniq -c
>    1 anvil: [221 connections] (anvil)
>    1 auth: [13 wait, 0 passdb, 0 userdb] (auth)
>    1 dovecot/config
>    1 dovecot/imap
>   84 dovecot/imap-login
>    1 dovecot/log
>   20 dovecot/pop3-login
>    1 grep dovecot
>    1 stats: [1307 connections] (stats)
>
> When a second bench instance starts imaptest, clients of the first and second instances begin to stall:
> 1400 stalled for 20 secs in command: 1 LOGIN "fakeuser644@mailbench" "password"
>
> And:
> # ps aux | grep dovecot | awk '{print $11,$12,$13,$14,$15,$16,$17,$18}' | sort | uniq -c
>    1 anvil: [221 connections] (anvil)
>    1 auth: [1227 wait, 0 passdb, 0 userdb] (auth)
>    1 dovecot/config
>    1 dovecot/imap
>   37 dovecot/imap-login
>    1 dovecot/log
>   20 dovecot/pop3-login
>    1 grep dovecot
>    1 stats: [680 connections] (stats)
>
> Every auth goes into wait, and the number of connections decreases.
> Using mysql or a password file gives the same results.
> I have used different values for service_count, also with no success.
>
> I think my use of imaptest could be wrong.
> My understanding of service auth is limited for now because I'm quite new to Dovecot (I have previously worked with Cyrus).
>
> Thank you for any hints.
>
> Ismaël Tanguy
Too many wait in auth process
Hello,

I'm currently benchmarking new hardware aimed to serve around 70k users.
For now, our IMAP server has 13k users.

To run imaptest, I've spawned some bench clients.
Each bench client can run imaptest with 1000 clients.
More than 1000 clients will load the CPU of this bench client.

imaptest command (commands are chosen from usage stats on our other IMAP servers):
imaptest host=x port=xxx userfile=userfile mbox=/root/dovecot-crlf pass=s seed=123 clients=1000 select=194 uidfetch=94 noop=70 status=82 append=49 fetch=276 list=12 store=19 expunge=22 msubs=4 search=4 logout=1 delete=81 no_pipelining

With one bench client, everything runs smoothly.
# ps aux | grep dovecot | awk '{print $11,$12,$13,$14,$15,$16,$17,$18}' | sort | uniq -c
   1 anvil: [221 connections] (anvil)
   1 auth: [13 wait, 0 passdb, 0 userdb] (auth)
   1 dovecot/config
   1 dovecot/imap
  84 dovecot/imap-login
   1 dovecot/log
  20 dovecot/pop3-login
   1 grep dovecot
   1 stats: [1307 connections] (stats)

When a second bench instance starts imaptest, clients of the first and second instances begin to stall:
1400 stalled for 20 secs in command: 1 LOGIN "fakeuser644@mailbench" "password"

And:
# ps aux | grep dovecot | awk '{print $11,$12,$13,$14,$15,$16,$17,$18}' | sort | uniq -c
   1 anvil: [221 connections] (anvil)
   1 auth: [1227 wait, 0 passdb, 0 userdb] (auth)
   1 dovecot/config
   1 dovecot/imap
  37 dovecot/imap-login
   1 dovecot/log
  20 dovecot/pop3-login
   1 grep dovecot
   1 stats: [680 connections] (stats)

Every auth goes into wait, and the number of connections decreases.
Using mysql or a password file gives the same results.
I have used different values for service_count, also with no success.

I think my use of imaptest could be wrong.
My understanding of service auth is limited for now because I'm quite new to Dovecot (I have previously worked with Cyrus).

Thank you for any hints.

Ismaël Tanguy
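The symptom reported above is the "wait" counter in the auth process title jumping from 13 to 1227. A minimal Python sketch for tracking that counter during a benchmark run could look like the following. It assumes the process title keeps the exact format shown in the ps output of this message; the function name and the sample text are hypothetical.

```python
import re

# Pattern for the auth process title as it appears in the ps output above,
# e.g. "auth: [1227 wait, 0 passdb, 0 userdb] (auth)".
# Assumption: the title format is stable; this is not an official interface.
AUTH_TITLE = re.compile(r"auth: \[(\d+) wait, (\d+) passdb, (\d+) userdb\]")

def parse_auth_title(ps_output: str):
    """Return (wait, passdb, userdb) counts from ps output, or None if absent."""
    match = AUTH_TITLE.search(ps_output)
    if match is None:
        return None
    wait, passdb, userdb = (int(group) for group in match.groups())
    return wait, passdb, userdb

# Sample text shaped like the second ps listing in the message.
sample = """\
   1 anvil: [221 connections] (anvil)
   1 auth: [1227 wait, 0 passdb, 0 userdb] (auth)
  37 dovecot/imap-login
"""

if __name__ == "__main__":
    print(parse_auth_title(sample))
```

Feeding the output of the ps pipeline into parse_auth_title at regular intervals would give a simple time series of the auth wait queue while bench clients are added.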