Actually, all the CPU use is system time, so it's a kernel thing, not a Dovecot
thing, I guess.
Cor
> If there were no error messages logged with 1.1.5, there's nothing I
> can think of that could explain it.
I'm not sure when this started, but I'm seeing very high CPU use in Dovecot.
I recently swapped our systems from FreeBSD to Linux, and I've also moved up
a few versions (now on 1.1.6).
Here
Has there been some issue fixed between 1.1.5 and 1.1.6 that could explain
a huge drop in CPU use?
I was having problems with a 1.1.5 box that kept sitting at about 80%
CPU (a dual quad-core box). I updated it to 1.1.6, and ever since it's
been at < 5% CPU (no change in the number of users).
Cor
I just noticed that about 25% of our servers have imap processes hanging:
97996 xx 1 1010 4096K 2052K CPU3 2 180:54 98.93% imap
All with 98-100% CPU load, taking up 1 or more CPUs.
Dovecot 1.1.2, FreeBSD 6.2. Is this caused by the deadlock issue that's fixed
in 1.1.4?
Cor
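For a quick survey of how widespread this is, a filter over `ps` output can list the imap processes pegging a CPU. This is just a sketch: the 90% threshold is arbitrary, and the `ps axo` keyword syntax is an assumption that happens to work on both FreeBSD and Linux procps.

```shell
# List imap processes using more than 90% CPU (threshold is arbitrary).
# "ps axo pid,pcpu,comm" is accepted by both FreeBSD ps and GNU procps.
ps axo pid,pcpu,comm | awk '$3 == "imap" && $2 > 90 { print $1, $2 }'
```

Run across the farm (e.g. via ssh in a loop), this gives a rough count of affected servers.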
> >I've been following this thread: the bug is related with copying
> >mail across NFS in combination with cache locking, right?
> >Cor uses FBSD; but is this a bug that might impact other platforms
> >as well?
>
> The main bug here was that with lock_method=dotlock when cache file
> was bei
> Conceptually at least, but I don't know if it applies cleanly without
> them.
Cool! I've just applied all the patches you gave me to our production source,
and I'll sync them to a few servers to try it out and see :)
Cor
> > > Sep 10 22:06:41 imap dovecot: IMAP(scorpio):
> > > rename(/var/spool/mail/dovecot-control/indexes/s/sc/scorpio/.Trash/dovecot.index.cache.lock,
> > >
> > > /var/spool/mail/dovecot-control/indexes/s/sc/scorpio/.Trash/dovecot.index.cache)
> > > failed: No such file or directory
> > > Sep 10
> Since it's copying related, try running two instances of imaptest where
> one copies messages and another runs on the destination box:
>
> ./imaptest logout=0 copy=100 copybox=Trash
> ./imaptest box=Trash append=0
It took about 1.5 hours, but I got one, running the above 2 tests at the
same time.
> ./imaptest logout=0 copy=100 copybox=Trash
> ./imaptest box=Trash append=0
Ok, running those 2 now.
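For reference, one way to keep the two instances running side by side is a tiny wrapper; this is just a sketch of the shell pattern, using exactly the imaptest flags quoted above and assuming imaptest sits in the current directory with its connection settings already configured.

```shell
#!/bin/sh
# Start the copier and the Trash-side reader in parallel, then wait for both.
./imaptest logout=0 copy=100 copybox=Trash &
copier=$!
./imaptest box=Trash append=0 &
reader=$!
wait "$copier"
wait "$reader"
```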
> > Still need dovecot -n?
>
> I guess it could still show something useful. :)
# 1.1.2: /usr/local/etc/dovecot.conf
base_dir: /var/run/dovecot/
ssl_ca_file: /etc/ssl/certs/verisign.pem
ssl_cer
> > Sep 10 13:43:13 userimap7.xs4all.nl dovecot: IMAP(xx):
> > rename(/var/spool/mail/dovecot-control/indexes/c/co/xx/.Junk
> > E-mail/dovecot.index.cache.lock,
> > /var/spool/mail/dovecot-control/indexes/c/co/xx/.Junk
> > E-mail/dovecot.index.cache) failed: No such file or directory
> >> So this is Outlook and its Junk Email Filters auto-copying these
> >> messages...
>
> > Nah, thats coincidental, the target folder doesnt seem to matter.
>
> Ok, but what about the client - is this outlook specific?
I doubt it. It's just a COPY command.
Cor
> > da56 UID COPY 11078 "Junk E-mail"
>
> So this is Outlook and its Junk Email Filters auto-copying these messages...
Nah, that's coincidental; the target folder doesn't seem to matter.
Cor
I've been able to grab the dovecot.rawlog output of a few people with this
problem. Basically this is what I see with all of them:
Sep 10 13:43:13 userimap7.xs4all.nl dovecot: IMAP(xx):
rename(/var/spool/mail/dovecot-control/indexes/c/co/xx/.Junk
E-mail/dovecot.index.cache.lock,
/var/sp
> > Neither can i..
> >
> > As a test I set 1 server up with local FS again, but with NFS=yes and
> > mmap/fsync etc as if it's nfs index, and im getting the same errors.
>
> So you can reproduce it easily with imaptest? Could you post your
> dovecot -n output so I could see if I can reproduce it?
> I just tested on my kvm FreeBSD 7.0 installation. I can't reproduce it
> there either with "imaptest logout=0".
Neither can I.
As a test I set one server up with a local FS again, but with NFS=yes and
mmap/fsync etc. as if it were an NFS index, and I'm getting the same errors.
So it doesn't seem NFS-related.
> Yes, although the error message could be changed to "locking timed out".
> But at least now the error shouldn't be visible to clients (other than
> small slowdowns due to the 2 second lock wait).
>
> Anyway, the real problem is one of:
>
> a) Dovecot is really locking dovecot.index.cache file f
Now I'm seeing these:
Sep 9 19:28:25 userimap3 dovecot: IMAP(x): file_dotlock_create() failed
with index cache file
/var/spool/mail/dovecot-control/indexes/u/ul/x/.Collega
Coaching/dovecot.index.cache: Resource temporarily unavailable
Is that any better?
Cor
Oh, now I'm getting an assert failure with those 2 patches applied:
Sep 9 18:33:59 userimap3 dovecot: Panic: IMAP(x): file mail-cache.c: line
572 (mail_cache_lock): assertion failed: ((ret <= 0 && !cache->locked) ||
(ret > 0 && cache->locked))
Cor
> Did a couple of changes:
>
> http://hg.dovecot.org/dovecot-1.1/rev/898e3810c014
> http://hg.dovecot.org/dovecot-1.1/rev/e3c5acf92b53
Ok, will add them.
> You could check how large the dovecot.index.cache file is for those
> users. Normally it's something like 10-20% of the mailbox size, but it
> 60 seconds is also when Dovecot decides the dotlock file is stale. So I
> guess the cache file compression is taking longer than that. Hmm. I
> hadn't thought about that before, since it was supposed to happen rarely
> enough. But I guess during the compression other processes shouldn't be
> stuck.
Oops, I do see a log line, just not in the error log. Weird.
Sep 9 16:42:54 userimap3.xs4all.nl dovecot: IMAP(xxx):
rename(/var/spool/mail/dovecot-control/indexes/d/dr/xxx/.Trash/dovecot.index.cache.lock,
/var/spool/mail/dovecot-control/indexes/d/dr/xxx/.Trash/dovecot.index.cache)
> If copying a single mail takes longer than the dotlocking timeout,
> another process may have overridden the lock file and caused errors.
> And since it took so long, maybe PHP or something timed out.
>
> This should help figuring out if the problem is due to timeouts:
> http://hg.dovecot.or
> >They often say it took a while to get the error, and some people are
> >suggesting it's only with large emails. So it could be a PHP timeout
> >related bug, although im not positive.
> >
> >I dont think it's a coincidence that every single person that
> >complaints
> >has that set of errors in
> > Ive been getting a LOT of these errors since 1.1.2:
>
> What version did you use before?
I used 1.1-rc4 -> 1.1.1 -> 1.1.2
> > So far ive seen this
> > with 1800 customers, and we're getting active complaints about errors in
> > imap clients. When I check the users log I always see this err
I've been getting a LOT of these errors since 1.1.2. So far I've seen this
with 1800 customers, and we're getting active complaints about errors in
IMAP clients. When I check the user's log I always see this error.
It always happens with COPY commands in SquirrelMail.
Sep 2 20:09:35 userimap4.x
> >userimap1# cd /var/spool/mail/dovecot-control/r/ro/rossXXX/
> >INBOX/.Sport.NKR
> >userimap1# ls -al
> >total 8
> >drw--- 2 rossXXX user 4096 Jul 2 18:35 .
> >drwx-- 33 rossXXX user 4096 Jul 31 07:39 ..
>
> That's beginning to sound like Dovecot isn't even creating these
> directories
> >>>find
> >>>as we have millions and millions of folders. In theory it could even
> >>>be an NFS server bug, although we havent seen this in the mailspool
> >>>directories.
> >>
> >>Do you see any errors related to this in log files? Do you see
> >>maildirfolder file inside that directory? If it
> >So this was created after we started using 1.1.1. Yesterday I only
> >found
> >5 after a week of operation. So I realize this can be very hard to
> >find
> >as we have millions and millions of folders. In theory it could even
> >be an NFS server bug, although we havent seen this in the mails
> > Jun 26 00:02:34 userimap13.xs4all.nl dovecot: IMAP(xxx):
> > file_dotlock_create(/var/spool/mail/dovecot-control/g/gl/xxx/INBOX/.Apple
> > Mail To Do/dovecot-uidlist) failed: Permission denied
> >
> > userimap1# ls -al "/var/spool/mail/dovecot-control/g/gl/xxx/INBOX/.Apple
> > Mail To Do" t
I'm still finding a number of control directories that have the wrong
permissions:
Jun 26 00:02:34 userimap13.xs4all.nl dovecot: IMAP(xxx):
file_dotlock_create(/var/spool/mail/dovecot-control/g/gl/xxx/INBOX/.Apple Mail
To Do/dovecot-uidlist) failed: Permission denied
userimap1# ls -al "/var/spool
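One way to sweep for such directories is to look for ones missing the owner execute (search) bit, since a 0600 directory is what turns the dotlock create into "Permission denied". A sketch; the control root path is taken from the logs above, and the repair line is only a suggestion:

```shell
# Find control directories lacking the owner execute/search bit (0100).
CTRL=/var/spool/mail/dovecot-control
find "$CTRL" -type d ! -perm -0100 -print
# Optionally repair them:
# find "$CTRL" -type d ! -perm -0100 -exec chmod u+rwx {} +
```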
Congrats Timo!
Cor
> How large are the (individual) mailboxes you're hosting there?
Most of them are 500MB max, but average use is much less. It's a little
difficult to calculate because almost all POP users empty their mailboxes.
We did some reports a few months ago which showed that the average mailbox
size for POP u
> Just in case I understand you wrong: You're serving 20k concurrent
> users with 1 (one) server?
Wait, I think I misunderstood you. We do not have just 1 IMAP server.
We have 30 IMAP servers (a little overdimensioned at this time).
I was just showing the graph of one of them. The others look similar.
> > This specific server is a dual core 2.8ghz xeon with hyperthreading
> > running on FreeBSD 6.2-STABLE. We have over 1 million mailboxes, with about
> > 75,000 daily active users. At peak maybe 20,000 concurrent, in a mix of
> > webmail and direct IMAP (no POP, that's handled by different servers).
> > We've been running 1.1 on about half of our servers for about a week now.
> > Ive mailed before that I was pleasantly surprised by its better use of
> > resources. Here's a graph showing that fact. Server load in the last 10
> > days.
>
> It may be good to list your hardware, user count, mailb
We've been running 1.1 on about half of our servers for about a week now.
I've mailed before that I was pleasantly surprised by its better use of
resources. Here's a graph showing that fact: server load over the last 10
days.
http://uwimages.smugmug.com/photos/286355874_9FNp2-L.png
Cor
> >
> > Could it be old data that is only showing up because of 1.1 and was
> > ignored or fatal in 1.0?
>
> The code was different in v1.0, but in both cases if you have 0600
> directory it would have failed the same way.. Can you reproduce this if
> you create a new mailbox and select it?
I ca
> This is created in src/lib-storage/index/maildir/maildir-util.c
> maildir_create_subdirs(). But I don't see how box->dir_create_mode could
> be 0600 and not 0700. I also can't reproduce this in my tests. What
> plugins do you use? Does it work if you disable them?
Could it be old data that is only showing up because of 1.1 and was ignored
or fatal in 1.0?
Hi, ever since I switched a few servers to 1.1RC4 I'm finding control dirs
with the wrong permissions:
800236158 drw---2 xxx user 4096
Apr 23 09:46 ./a/an/xxx/INBOX/.INBOX.Maatschappijen
When Dovecot then wants to write a file inside, it gets a permission denied
error.
I recently switched one of our 30 IMAP servers from 1.0.x to 1.1RC4, and
the difference is huge. The load on the server dropped from 5+ to 0.3 or so.
This means there is no more I/O waiting going on.
That's on FreeBSD 6.2.
Great work Timo!
Cor
> "For webmail type setups indexes help a lot. For Outlook/Thunderbird
> they help a lot less."
>
> Very interesting!
>
> I'm scared to use (index) files that go sort of unnoticed (it's not
> calculated in the maildirsize file) and can potentially grow with no
> limit.
Diskspace is a lot cheap
Hi Timo,
> I'm getting that "link() succeeded, but link count=1" error with FreeBSD
> 6.2. So I'd like to know if this is a FreeBSD bug, NFS server bug or a
> more common NFS problem that I should work around..
Same here, I'm getting that error on FreeBSD 6.2.
Cor
Is it somehow possible to configure a fallback for a failed proxy? I am using
SQL-based proxying through Dovecot, but it would be nice if you could fall back
to another host when the proxy destination server is down. High availability
and all...
Regards,
Cor
> Sorry to be so clueless, but all the activity about rquotad drives me to
> admit my puzzlement (or ignorance)...
> I run rquotad on my mail server that also runs DC. rquotad is used by
> the other 3 hosts (a login/FTP server, a mailing list server and a user
> mgmnt server) that NFS mount th
Hi,
> > I know webmail.us use Dovecot, what is the most big dovecot architecture
> > known ?
> > Do you think Dovecot can handle 1 million of active users in a good
> > architecture ?
>
> Yep... and we have 500K very active users on it. We've scaled Dovecot
> horizontally without NFS, just lo
> Running rpc.lockd and rpc.statd on FreeBSD 6.2-STABLE #17: Sun Jun 24
> 22:11:00 EDT 2007
> with a NetApp filer as the server, ran 3 times and compared output (same):
>
Weird, I have the exact same setup and for me this program hangs, and I have
to kill lockd.
Cor
> http://dovecot.org/releases/dovecot-1.0.0.tar.gz
> http://dovecot.org/releases/dovecot-1.0.0.tar.gz.sig
>
> It took almost 5 years, but it's finally ready. I'm not expecting to
> release v1.0.1 anytime soon, unless someone's been sitting on a major
> bug just waiting for v1.0 to be released. :)
Hi Timo,
> OS X. Could you BSD people try if it works there? http://dovecot.org/
> tmp/append.c and see if it says "offset = 0" (bad) or non-zero (yay).
FreeBSD 6.2: offset = 5
FreeBSD 4.10: offset = 5
FreeBSD 4.7: offset = 5
NetBSD 3.0.1: offset = 5
Cor