Replication errors in 2.3.16-8 (do_user bailing out)

2010-11-19 Thread Simpson, John R
I'm running into inconsistent errors from sync_client on a Cyrus 2.3.16-8 
system.  It's driving me crazy because the same sync_client command will 
succeed at some times and fail at others.  I know others are running similar 
and much more complex systems successfully, so I must be missing something 
simple or fundamental.

Thanks in advance,

John

When the problem occurs, sync_client gives a "do_user(u...@example.com): 
bailing out!" error and logs a message about failing to reset the replication 
(cyrusadmin) account.  This occurs both when running sync_client against a 
single user and when using sync_client with a file containing 27,000 user 
names.  The failures do not always affect the same users, and attempts to 
replicate a given user generally succeed after 1-5 tries.  When running the 
27,000 user test the failure can occur for anywhere from a handful to a few 
hundred users.  The problem occurs whether sync_client is run by root, by root 
using "su -c" to cyrus, or as the cyrus user.  Running "reconstruct", 
"reconstruct -G", or "reconstruct -x -f" prior to running sync_client does not 
resolve the problem.
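
Since a given user generally replicates after a handful of tries, one stopgap is to retry per user and report the users that never succeed. The following is a minimal Python sketch of that idea, not part of Cyrus. The binary path and the -l -v -u flags are copied from the commands shown in this message. Treating a non-zero exit status as the "bailing out!" case is an assumption, and the function names are hypothetical. It works around the symptom rather than explaining it, and it would not notice users that are skipped without an error.

```python
import subprocess

# Path and flags copied from the commands shown in this thread.
SYNC_CLIENT = "/usr/lib/cyrus-imapd/sync_client"


def run_sync(user):
    """Run sync_client for one user and report whether it exited cleanly.

    ASSUMPTION: a non-zero exit status corresponds to the "bailing out!"
    failure. The thread does not state what sync_client returns.
    """
    result = subprocess.run(
        [SYNC_CLIENT, "-l", "-v", "-u", user],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def sync_with_retries(users, max_tries=5, sync=run_sync):
    """Try each user up to max_tries times.

    Returns the list of users that failed on every attempt.
    The sync argument is injectable so the loop can be tested
    without a Cyrus installation.
    """
    failed = []
    for user in users:
        for _ in range(max_tries):
            if sync(user):
                break  # this user replicated, move on
        else:
            # The loop ended without a break: every attempt failed.
            failed.append(user)
    return failed
```

Calling sync_with_retries(list_of_users) returns the users that still need attention after max_tries attempts each.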

In addition to the "do_user(u...@example.com): bailing out!" failures, users 
are sometimes skipped without a sync_client error message -- they simply 
aren't replicated on that pass.  sync_client displays the line "USER 
u...@example.com" but none of the ADDSUB lines, similar to when processing a 
user who has never used their mailbox.

Both problems occur whether or not rolling replication is running.

Both the master mailstore (eml-store04) and the replica (eml-replica04) are 
CentOS 5.4 64-bit servers running as virtual machines on a single ESXi 4 
server.  They are using the ext3 filesystem.  The cyrus-imapd and 
cyrus-imapd-utils RPMS were built from the Invoca RPMS on a similar system.  I 
enabled SNMP support, but didn't make any other changes to the spec file before 
running "rpmbuild -ba cyrus-imapd.spec".  No errors were reported during the 
build.

The data was transferred from a Cyrus 2.3.7 system using rsync on /var/lib/imap 
and /var/spool/imap, and appears normal from a Cyrus and IMAP client 
perspective.  Between tests, I clear the replica with "rm -rf 
/var/lib/imap/domain/* /var/lib/imap/sieve/* /var/spool/imap/domain/*".  
Everything under /var/lib/imap and /var/spool/imap is owned by cyrus:mail.

[r...@eml-store04 ~]# uname -a
Linux eml-store04 2.6.18-194.26.1.el5 #1 SMP Tue Nov 9 12:54:20 EST 2010 x86_64 
x86_64 x86_64 GNU/Linux

[r...@eml-replica04 reyrey.net]# rpm -qa | grep cyrus | sort
cyrus-imapd-2.3.16-8
cyrus-imapd-utils-2.3.16-8
cyrus-sasl-2.1.22-5.el5_4.3
cyrus-sasl-lib-2.1.22-5.el5_4.3
cyrus-sasl-lib-2.1.22-5.el5_4.3
cyrus-sasl-plain-2.1.22-5.el5_4.3
cyrus-sasl-plain-2.1.22-5.el5_4.3

Failed attempt:
[r...@eml-store04 ~]# /usr/lib/cyrus-imapd/sync_client -l -v -u -f 
mailboxlist.ldap

eml-store04: output from sync_client:
USER admi...@reptest.org
Error from do_user(admi...@reptest.org): bailing out!

eml-store04: content of /var/log/maillog
Nov 18 11:37:27 eml-store04 sync_client[6328]: USER admi...@reptest.org
Nov 18 11:37:27 eml-store04 sync_client[6328]: USER received NO response: 
IMAP_MAILBOX_NONEXISTENT Failed to access inbox for admi...@reptest.org: System 
I/O error
Nov 18 11:37:27 eml-store04 sync_client[6328]: RESET received NO response: 
Failed to reset account cyrusadmin: Internal Error
Nov 18 11:37:27 eml-store04 sync_client[6328]: Error in 
do_user(admi...@reptest.org): bailing out!

eml-replica04: content of /var/log/maillog
Nov 18 11:37:27 eml-replica04 syncserver[12635]: IOERROR: opening 
/var/spool/imap/domain/r/reptest.org/a/user/admin01/cyrus.header: No such file 
or directory
Nov 18 11:37:27 eml-replica04 syncserver[12635]: Failed to access inbox for 
admi...@reptest.org
Nov 18 11:37:27 eml-replica04 syncserver[12635]: IOERROR: opening 
/var/spool/imap/domain/r/reptest.org/a/user/admin01/cyrus.header: No such file 
or directory
Nov 18 11:37:27 eml-replica04 syncserver[12635]: Unlocked


Successful attempt:
[r...@eml-store04 ~]# /usr/lib/cyrus-imapd/sync_client -l -v -u 
admi...@reptest.org
eml-store04:output from sync_client:
USER admi...@reptest.org
ADDSUB admi...@reptest.org INBOX
ADDSUB admi...@reptest.org INBOX.Drafts
ADDSUB admi...@reptest.org INBOX.Sent Items
ADDSUB admi...@reptest.org INBOX.Trash

eml-store04: content of /var/log/maillog:
Nov 18 11:54:47 eml-store04 sync_client[6416]: USER admi...@reptest.org
Nov 18 11:54:47 eml-store04 sync_client[6416]: USER received NO response: 
IMAP_MAILBOX_NONEXISTENT Failed to access inbox for admi...@reptest.org: 
Mailbox does not exist
Nov 18 11:54:48 eml-store04 sync_client[6416]: ADDSUB admi...@reptest.org INBOX
Nov 18 11:54:48 eml-store04 sync_client[6416]: ADDSUB admi...@reptest.org 
INBOX.Drafts
Nov 18 11:54:48 eml-store04 sync_client[6416]: ADDSUB admi...@reptest.org 
INBOX.Sent Items
Nov 18 11:54:48 eml-store04 sync_client[6416]: ADDSUB admi...@reptest.org 
INBOX.Trash

eml-repl

RE: Does anyone allow unlimited or extremely large quotas?

2010-11-19 Thread Michel Sébastien

> Our biggest currently is about 30GB I think.

>> I think the issue you will encounter first is clients will start to fall
>> down when folders exceed a 'reasonable' number of messages.  Common IMAP
>> clients I've seen start to exhibit severe performance issues beyond a
>> few hundred thousand messages.

> On a 32 bit architecture: we had one folder with over a million messages
> which was causing processes to run out of virtual memory trying to map
> the cache file in.  This wouldn't be a problem with a 64 bit userland.

Very impressive to have so many messages in one folder, and therefore in one 
partition! But with so many messages in one folder, I think that cyrus.index 
and, even more so, cyrus.cache are huge.
Is mmap still efficient? Mapping a gigabyte file should cost a lot of I/O and a 
relatively long response time just to access the records of the most recent 
emails.

Is it time to replace the design of one cyrus.index and cyrus.cache per folder 
with something more scalable?




Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/


RE: Does anyone allow unlimited or extremely large quotas?

2010-11-19 Thread David Carter

On Fri, 19 Nov 2010, Michel Sébastien wrote:


>> On a 32 bit architecture: we had one folder with over a million messages
>> which was causing processes to run out of virtual memory trying to map
>> the cache file in.  This wouldn't be a problem with a 64 bit userland.
>
> very impressive to have so many messages in one folder, and therefore in one 
> partition!


We have hit this limit once, and so far only once, as well.

A user was sorting their email archive (thousands of messages) generating 
copies to the Trash mailbox. They repeated this exercise multiple times.


Each time that the user hit their quota limit (several GBytes), 
they emptied the Trash folder. Consequently the live mailbox itself never 
contained huge numbers of messages. However delayed expunge means that a 
lot of wreckage was left behind: hundreds of thousands of messages.


Easily fixed by a reconstruct which discarded the obsolete information.

> Is mmap still efficient? Mapping a gigabyte file should cost a lot of I/O 
> and a relatively long response time just to access the records of the 
> most recent emails.


The mmap() itself has very little cost.

It would only become a problem if something actually tried to read all of 
the cache entries, causing the data to be paged in from disk.
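
The point that mapping is cheap and only the data actually read gets paged in can be illustrated with a small Python sketch. It maps an entire file but reads only the final record. The fixed RECORD_SIZE, the helper names, and the file layout are hypothetical and are not the real cyrus.cache format. The demo file is created with a seek so that, on filesystems that support sparse files, most of it is never written.

```python
import mmap
import os
import tempfile

RECORD_SIZE = 100  # hypothetical fixed record size, NOT the real cyrus.cache layout


def make_demo_file(size, record_size=RECORD_SIZE):
    """Create a file of `size` bytes whose last record is filled with b"R"."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.seek(size - record_size)      # skip ahead; earlier bytes read back as zeros
        f.write(b"R" * record_size)     # only the final record is written
    return path


def read_last_record(path, record_size=RECORD_SIZE):
    """Map the whole file, then touch only the final record."""
    with open(path, "rb") as f:
        # Length 0 maps the entire file. Setting up the mapping does not
        # read the file contents; pages are loaded when they are accessed.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return m[len(m) - record_size:]


if __name__ == "__main__":
    demo = make_demo_file(64 * 1024 * 1024)
    try:
        print(len(read_last_record(demo)))
    finally:
        os.remove(demo)
```

Reading every record in a loop instead of just the last one is the case described above where the whole file would have to come in from disk.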


--
David Carter Email: david.car...@ucs.cam.ac.uk
University Computing Service,Phone: (01223) 334502
New Museums Site, Pembroke Street,   Fax:   (01223) 334679
Cambridge UK. CB2 3QH.

Re: Running multiple sync_clients

2010-11-19 Thread Michael D. Sofka
Bron Gondwana wrote:
> On Thu, Nov 18, 2010 at 12:58:10PM -0500, Michael D. Sofka wrote:
>> Is it safe to run multiple sync_clients?  Is there an advantage to doing so?
>>
>> I had to restart sync_client once.  After later restarting cyrus two 
>> sync_clients were running, and appeared to do well together.  Still, I 
>> stopped the old process out of caution.
> 
> What version?  In 2.3.x, the sync_server will obtain a lock, so you can
> only run one sync_server at a time.  This means the sync_clients will
> take it in turns sending data.

2.3.16.  This description fits what I was seeing.


> In 2.4, there is no global locking - just locks on each mailbox.  So it's
> possible for two sync_clients to run at the same time.  There is a
> possibility this will lead to the same data being sent twice.  It won't
> cause corruption, but it will cause excess IO!
> 
> Are you sure it's actually two sync_client instances?  In every released
> version, sync_client forks immediately upon starting and then forks again
> to actually send data.

Yes.  There were a total of 4 sync processes.  Two parents, each with a 
child worker.  The logs showed two sync_client processes taking turns.

> Which is a problem because it opens the BDB environment once and closes it
> many times.  So I've now changed it on my development branch to not fork
> the second time, but just keep it all in one process.  We'll see if there
> are actually any memory leaks left!

Another question I have is what are /var/lib/imap/db/*?  File says they 
are Berkeley dbs, but I thought all databases were skiplist in this 
release.  I can't find the configuration options associated with this 
db.  I noticed them when looking at files open by sync_client.  And not 
that all cyrus process have these files open.

I also noticed that when the child process dies, it tends to take out 
the parent.  Shouldn't the parent catch the error and fork another 
child?  This has so far only happened twice, and in both cases appears 
to be associated with xfering accounts from 2.2.12 to 2.3.16.  In at 
least one of the cases there was a back-end directory that has been 
deleted by the most recent rsync, prior to migration.  A reconstruct 
fixed everything.

Mike
-- 
Michael D. Sofka   sof...@rpi.edu
C&MT Sr. Systems Programmer,   Email, HPC, TeX, Epistemology
Rensselaer Polytechnic Institute, Troy, NY.  http://www.rpi.edu/~sofkam/



Re: Running multiple sync_clients

2010-11-19 Thread Michael D. Sofka
Michael D. Sofka wrote:

> Another question I have is what are /var/lib/imap/db/*?  File says they 
> are Berkeley dbs, but I thought all databases were skiplist in this 
> release.  I can't find the configuration options associated with this 
> db.  I noticed them when looking at files open by sync_client.  And not 
> that all cyrus process have these files open.

That s/b "And noticed that all cyrus process have these db files open."

Mike

-- 
Michael D. Sofka   sof...@rpi.edu
C&MT Sr. Systems Programmer,   Email, HPC, TeX, Epistemology
Rensselaer Polytechnic Institute, Troy, NY.  http://www.rpi.edu/~sofkam/



Re: Running multiple sync_clients

2010-11-19 Thread Simon Matter
> Another question I have is what are /var/lib/imap/db/*?  File says they
> are Berkeley dbs, but I thought all databases were skiplist in this
> release.  I can't find the configuration options associated with this

That's not true for 2.3.16. Some db's are still BDB by default in vanilla
2.3.16.

Simon





Re: Does anyone allow unlimited or extremely large quotas?

2010-11-19 Thread Bron Gondwana
On Fri, Nov 19, 2010 at 05:48:22PM +0100, Michel Sébastien wrote:
> 
> > Our biggest currently is about 30GB I think.
> 
> >> I think the issue you will encounter first is clients will start to fall
> >> down when folders exceed a 'reasonable' number of messages.  Common IMAP
> >> clients I've seen start to exhibit severe performance issues beyond a
> >> few hundred thousand messages.
> 
> > On a 32 bit architecture: we had one folder with over a million messages
> > which was causing processes to run out of virtual memory trying to map
> > the cache file in.  This wouldn't be a problem with a 64 bit userland.
> 
> very impressive to have so many messages in one folder, and therefore in one 
> partition! But with so many messages in one folder, I think that cyrus.index 
> and, even more so, cyrus.cache are huge.
> Is mmap still efficient? Mapping a gigabyte file should cost a lot of I/O and a 
> relatively long response time just to access the records of the most recent 
> emails.
> 
> Is it time to replace the design of one cyrus.index and cyrus.cache per folder 
> with something more scalable?

To be honest - it doesn't actually hurt too badly once it's in the memory
cache.  The cyrus.cache file generally doesn't need to be read in its
entirety, and the secret of mmap is that you only read the bits you need
as you need them - it's lazily loaded.

The cyrus.index is still pretty small - about 100MB for a million
messages.  It doesn't take long to speed through that.  Couple of
seconds at most.
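
Taking the round figure above at face value, the arithmetic works out to roughly 100 bytes of cyrus.index per message. The short Python sketch below just scales that estimate to other folder sizes. The constant is derived from "about 100MB for a million messages" (using decimal megabytes), the function name is hypothetical, and the real on-disk record size is not stated in this thread.

```python
# Derived from the estimate "about 100MB for a million messages".
# This is a rough figure, not the documented cyrus.index record size.
INDEX_BYTES_PER_MILLION_MESSAGES = 100 * 10**6


def approx_index_bytes(messages):
    """Scale the rough estimate linearly to a given message count."""
    return messages * INDEX_BYTES_PER_MILLION_MESSAGES // 10**6


if __name__ == "__main__":
    for count in (30_000, 1_000_000):
        print(count, approx_index_bytes(count))
```

On this estimate a folder of 30,000 messages has an index of about 3 MB.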

There's no real answer if you're doing a sort on the messages,
unless you go to multiple indexes (a la database engines).  That's
a whole different ballgame - but the multiplier factor gets
higher.  For sane sizes of N (up to 20-30 thousand messages) the
O(N) of the way Cyrus does it is cheaper than a more complex
database.

Bron.



Re: Running multiple sync_clients

2010-11-19 Thread Bron Gondwana
On Fri, Nov 19, 2010 at 08:11:38PM +0100, Simon Matter wrote:
> > Another question I have is what are /var/lib/imap/db/*?  File says they
> > are Berkeley dbs, but I thought all databases were skiplist in this
> > release.  I can't find the configuration options associated with this
> 
> That's not true for 2.3.16. Some db's are still BDB by default in vanilla
> 2.3.16.

More to the point, that's the BDB environment.  It gets created even if you
don't have any BDB files.  And every time a sync_client child exits the
reference count gets messed up :(

(it opens the environment before forking, and closes it in every child!)
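
The open-once, close-many pattern can be pictured with a toy model in Python. This is not the Berkeley DB API or Cyrus code: ToyEnvironment and both functions are hypothetical, and the thread does not say which counter is affected or how. The sketch only shows why one open followed by a close in every child leaves the bookkeeping unbalanced, compared with a pattern where opens and closes are paired.

```python
class ToyEnvironment:
    """Toy stand-in for a shared environment with a reference count.

    Hypothetical model for illustration only, not the Berkeley DB API.
    """

    def __init__(self):
        self.refcount = 0

    def open(self):
        self.refcount += 1

    def close(self):
        self.refcount -= 1


def open_once_close_per_child(children):
    """Parent opens once before forking; every child closes on exit."""
    env = ToyEnvironment()
    env.open()
    for _ in range(children):
        env.close()  # each exiting child performs its own close
    return env.refcount


def paired_open_close(children):
    """Every open is matched by exactly one close."""
    env = ToyEnvironment()
    for _ in range(children):
        env.open()
        env.close()
    return env.refcount


if __name__ == "__main__":
    print(open_once_close_per_child(3))
    print(paired_open_close(3))
```

In the toy model the first pattern drifts further from zero with every child, while the paired pattern always returns to zero.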

Bron.



Re: Running multiple sync_clients

2010-11-19 Thread Bron Gondwana
On Fri, Nov 19, 2010 at 12:33:44PM -0500, Michael D. Sofka wrote:
> I also noticed that when the child process dies, it tends to take out 
> the parent.  Shouldn't the parent catch the error and fork another 
> child?  This has so far only happened twice, and in both cases appears 
> to be associated with xfering accounts from 2.2.12 to 2.3.16.  In at 
> least one of the cases there was a back-end directory that has been 
> deleted by the most recent rsync, prior to migration.  A reconstruct 
> fixed everything.

The parent only forks another child if the connection dropped or the
remote end sent a [RESTART] - otherwise it drops out to avoid
hammering over and over with something it doesn't know how to fix.
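
The restart rule described above can be written down as a small decision function. This is a Python sketch of the described behaviour only, not the sync_client source: the ChildExit names and the supervise helper are hypothetical, and how the real parent classifies a child's exit is not covered in this thread.

```python
from enum import Enum


class ChildExit(Enum):
    """Hypothetical labels for why a child stopped."""
    CONNECTION_DROPPED = "connection dropped"
    RESTART_REQUESTED = "remote end sent [RESTART]"
    OTHER_ERROR = "anything else"


# The two cases in which the parent forks another child.
RESTARTABLE = {ChildExit.CONNECTION_DROPPED, ChildExit.RESTART_REQUESTED}


def should_fork_again(reason):
    """True if the parent should start a new child after this exit."""
    return reason in RESTARTABLE


def supervise(exit_reasons):
    """Walk through child exit reasons in order.

    Returns the number of children that ran before the parent gave up
    (or before the list of reasons was exhausted).
    """
    started = 0
    for reason in exit_reasons:
        started += 1
        if not should_fork_again(reason):
            break  # unknown failure: drop out rather than retry forever
    return started
```

An external watchdog like the "angel" process mentioned below would sit outside this loop and decide separately whether to start sync_client again.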

My medium term plan is to either have master or another "angel"
process keep an eye on sync_client processes and be able to restart
them and clean up old logs!

Bron.



RE: Replication errors in 2.3.16-8 (do_user bailing out)

2010-11-19 Thread Simpson, John R
I've re-run the replication tests using the same mailstore data on a Cyrus 
2.3.7 (CentOS/RHEL package) replica pair and the same type of errors occurred.  
I then pulled another set of mailstore data from a Cyrus system in one of our 
QA environments (i.e. production, but non-customer data, as opposed to the lab 
data I've been using to this point) and tested it on the 2.3.16-8 
master/replica pair.  The first sync_client run was promising -- there were a 
few "bailing out" errors and a few skipped users, but a much lower percentage 
than usual.  However, when I cleared the replica server and ran a second 
replication test there was a high rate of both types of issues.  As before, 
running sync_client multiple times eventually results in a fully synchronized 
replica.

John

John Simpson 
Senior Software Engineer, I. T. Engineering and Operations



Re: Does anyone allow unlimited or extremely large quotas?

2010-11-19 Thread Ciprian
Adam Tauno Williams wrote:
> I think the issue you will encounter first is clients will start to fall
> down when folders exceed a 'reasonable' number of messages.  Common IMAP
> clients I've seen start to exhibit severe performance issues beyond a
> few hundred thousand messages.
>   
Older versions of Outlook (e.g. Office 2003) will choke well before that 
(I think the limit was around 35,000 on XP SP2). Couple that with the local 
Outlook file store for that IMAP account, which used to be limited to 2 GB. 
We generally advise our users to avoid going past 20,000 messages in one 
folder.
