Re: Problems with mupdate server.

2005-05-20 Thread João Assad
Carol Tardif wrote: [...] Thanks for the explanation. So now that I know that my mailboxes.db can get corrupted what can I do to get away from long downtimes?

edit your cyrus.conf and change the following option

EVENTS {
  # this is required
  checkpoint cmd="ctl_cyrusdb -c" period=30
}

this is how

Re: Problems with mupdate server.

2005-05-20 Thread João Assad
Carol Tardif wrote: [...] One thing I would like to know is, instead of synchronising everything, can I:
1- delete mailboxes.db on mupdate.
2- do "ctl_mboxlist -d -f /var/imap/mailboxes.db > mailbox_backend1.txt" on both backends
3- Then on the mupdate import mailbox via:

Re: Problems with mupdate server.

2005-05-19 Thread João Assad
Carol Tardif wrote: Hello, We are using Cyrus-2.2.10 in a Murder environment with two backends, 200K+ mailboxes each. Since our upgrade from Red Hat 7.3 to RHEL 3.0 AS we get this error: May 19 13:56:07 mupdate mupdate[26023]: DBERROR: skiplist recovery /var/imap/mailboxes.db: 45DF894 should be AD

Re: Running "ctl_mboxlist -m" on a running server

2005-05-19 Thread João Assad
Etienne Goyer wrote: ... RHEL 4, kernel 2.4.21-27.0.4.ELsmp I had many corruption problems with Fedora; it turns out there seems to be a bug in mmap. RHEL 4 might have the same mmap bug. Can you check your logs and see if you find any IOERROR stating that the mailbox file couldn't be mapped? You
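The log check suggested above could be scripted roughly as follows. This is only a sketch: the regular expression is an assumption modelled on the single IOERROR line quoted in the 2005-04-06 message in this listing, and the function name `find_mmap_errors` is hypothetical, not part of Cyrus.

```python
import re

# Assumed pattern, modelled on the one "IOERROR: mapping <path> file: <reason>"
# line quoted in this listing; other Cyrus versions may word it differently.
IOERROR_MAP = re.compile(r"IOERROR: mapping (?P<path>\S+) file: (?P<reason>.+)$")


def find_mmap_errors(lines):
    """Return a (path, reason) tuple for every line reporting a failed mapping."""
    hits = []
    for line in lines:
        match = IOERROR_MAP.search(line.rstrip("\n"))
        if match:
            hits.append((match.group("path"), match.group("reason")))
    return hits


# Two sample lines taken from messages in this listing.
sample = [
    "cyrus/mupdate[12614]: IOERROR: mapping /var/lib/imap/mailboxes.db file: Cannot allocate memory",
    "cyrus/lmtp[30983]: mupdate-client: connect(10.1.5.101): Connection timed out",
]

if __name__ == "__main__":
    for path, reason in find_mmap_errors(sample):
        print(path, "->", reason)
```

To run it against a real log, pass an open file object instead of `sample`; the log location depends on the local syslog setup.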

Re: Running "ctl_mboxlist -m" on a running server

2005-05-18 Thread João Assad
Etienne Goyer wrote: Greeting, folks, I have a Murder with two backends. We have experienced what we believe to be skiplist corruption on the mupdate master server. More precisely, the log shows a few instances of such an error: May 17 09:50:26 mupdate mupdate[19842]: DBERROR: skiplist recovery

Re: mupdate exits with signal 11

2005-04-28 Thread João Assad
Derrick J Brashear wrote: If anyone else wants to try this (I'd guess auth_unix use in mupdate isn't common?) I use saslauthd for authentication which then uses pam. Is there a way to get rid of auth_unix ? sasl_pwcheck_method: saslauthd sasl_mech_list: PLAIN http://www.contrib.andrew.cmu.edu/~s

Re: mupdate exits with signal 11

2005-04-26 Thread João Assad
nice nice, I'll commence testing tonight. On a side note, regarding that old mmap problem: I'm still testing it. I'm pretty sure it's some bug in Fedora's mmap. Derrick J Brashear wrote: On Tue, 26 Apr 2005, João Assad wrote: sometimes mupdate exits with signal 11. I'm clueless why. cyrus-2.2.12 gd

mupdate exits with signal 11

2005-04-26 Thread João Assad
Sometimes mupdate exits with signal 11. I'm clueless why. cyrus-2.2.12 gdb backtrace attached. Regards, João Assad backtrace.gz Description: application/gzip-compressed

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-13 Thread João Assad
João Assad wrote: yet another gdb backtrace of the mmap problem http://www.gazzag.com/gdb.output2.gz regards Hey Derrick, did you find anything useful in this last backtrace? Regards --- Cyrus Home Page: http://asg.web.cmu.edu/cyrus Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu List Archives/In

Re: ctl_mboxlist -u doesnt work on master server

2005-04-12 Thread João Assad
João Assad wrote: also repeating the same process on one of the backends works perfectly. It seems to be master related only. do_undump in ctl_mboxlist.c has this: mboxlist_makeentry(0, partition, acl); making the mailboxes always type 0, never type remote. There should be another option to ctl_

Re: mupdate thread died with SIGSEGV

2005-04-11 Thread João Assad
João Assad wrote: This doesn't seem to be related to the mmap problem. The only errors I got in the cyrus log when it happened were a bunch of these: Apr 10 16:35:03 cyrus-fe1 cyrus/lmtp[30983]: mupdate-client: connect(10.1.5.101): Connection timed out This time I have strace and gdb logs. I'm sending at

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-08 Thread João Assad
João Assad wrote: Derrick J Brashear wrote: curiously, the strace output isn't showing an mmap() call fail, that I see, before the error shows up. I could do a strace -f, which would dump all the traces from all the threads into a single file... but it's a nightmare to read. By reading some strac

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-08 Thread João Assad
João Assad wrote: Derrick J Brashear wrote: On Fri, 8 Apr 2005, João Assad wrote: João Assad wrote: Managed to get a backtrace using debug_command ( thanks for this nifty feature Henrique de Moraes ) 2 gdb backtraces from the production server. curiously, the strace output isn't showing an mmap(

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-08 Thread João Assad
João Assad wrote: Managed to get a backtrace using debug_command ( thanks for this nifty feature Henrique de Moraes ) and now a strace from the production server. I'm sending just the last few lines of it; it's really big.
16:18:02.399469 accept(4, 0, NULL) = 104
16:18:02.470386 getpeername(1

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-08 Thread João Assad
João Assad wrote: Managed to get a backtrace using debug_command ( thanks for this nifty feature Henrique de Moraes ) 2 gdb backtraces from the production server. #18988 0x0804dcd3 in fatal ( s=0x8d52f070 "Internal error: assertion failed: mupdate.c: 586: 0", cod

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-07 Thread João Assad
João Assad wrote: Derrick J Brashear wrote: On Thu, 7 Apr 2005, João Assad wrote: Ok I got a backtrace (I think). I don't really know how to use gdb did you compile without giving gcc the -g option? Probably. Having unstripped binaries with useful symbols would probably make for a more useful

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-07 Thread João Assad
Derrick J Brashear wrote: On Thu, 7 Apr 2005, João Assad wrote: Ok I got a backtrace (I think). I don't really know how to use gdb did you compile without giving gcc the -g option? Probably. Having unstripped binaries with useful symbols would probably make for a more useful backtrace. (at lea

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-07 Thread João Assad
Derrick J Brashear wrote: On Wed, 6 Apr 2005, João Assad wrote: and then give us a backtrace from the core which you will then get? After doing that, the mupdate process now exits with signal 11 as expected. OTOH the core isn't getting dumped to disk for some reason... The only internal resource

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-06 Thread João Assad
cyrus/mupdate[12614]: IOERROR: mapping /var/lib/imap/mailboxes.db file: Cannot allocate memory Resource limited memory, or are you really running out of memory? Letting processes continue running in the face of an mmap failure needs to be re-examined I guess. Hello again. I've been trying to t

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-04 Thread João Assad
The following options seem to have a direct impact on how fast I run out of resources (obviously). The more I increase them, the faster I get the mmap error.
mupdate_workers_start
mupdate_workers_minspare
mupdate_workers_maxspare
mupdate_workers_max
I have them all set to the default values

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-01 Thread João Assad
João Assad wrote: The system has plenty of RAM available and ulimit -a reports the max virtual memory as unlimited
core file size        (blocks, -c) 0
data seg size         (kbytes, -d) unlimited
file size             (blocks, -f) unlimited
max locked memory     (kbytes, -l) 32
max memory size
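The same limits can also be read from inside a process, which may be worth doing because resource limits are per process and a daemon does not necessarily inherit the values an interactive shell reports. The following is a minimal sketch using Python's standard `resource` module on a Unix-like system; the helper names `describe_limit` and `report` are hypothetical. Note that `getrlimit` returns sizes in bytes, whereas `ulimit -a` prints blocks or kbytes.

```python
import resource


def describe_limit(name, which):
    """Format the soft and hard values of one resource limit."""
    soft, hard = resource.getrlimit(which)

    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"


def report():
    """Collect the limits most relevant to mmap failures and core dumps."""
    return [
        describe_limit("core file size (bytes)", resource.RLIMIT_CORE),
        describe_limit("address space (bytes)", resource.RLIMIT_AS),
        describe_limit("locked memory (bytes)", resource.RLIMIT_MEMLOCK),
    ]


if __name__ == "__main__":
    for line in report():
        print(line)
```

A soft core file size of 0 means the kernel will not write a core file for that process, so that value is worth checking whenever a crash leaves no core behind.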

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-31 Thread João Assad
Derrick J Brashear wrote: On Thu, 31 Mar 2005, João Assad wrote: My db just got corrupted again 2 hours ago. It seems that moving 200k mailboxes between backends really speeds it up. I have my corrupted mailboxes.db and I can send it to anyone interested in taking a look. I needed to restart cyrus m

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-30 Thread João Assad
Derrick J Brashear wrote: On Tue, 29 Mar 2005, João Assad wrote: Come on guys, someone must have at least an idea I can try. Anything will help, maybe I'm missing something obvious. Well, you have skiplist corruption, but there's not really anything in your report which is helpful at suggesting wh

cyrus-murder problems with database corruption in the frontend/master

2005-03-29 Thread João Assad
Come on guys, someone must have at least an idea I can try. Anything will help, maybe I'm missing something obvious. A repost of the problem below, a bit more elaborated. We use cyrus-imapd-murder as the solution for our website messaging/e-mail service meaning all our users have an e-mail acco

problems with database corruption in the frontend/master (additional info)

2005-03-28 Thread João Assad
João Assad wrote: Hello everyone, We use cyrus-imapd-murder as the solution for our website messaging/e-mail service . We currently have 1.240.088 users split in 3 backend servers using 1 frontend / master server (both services running on the same server) for a grand total of 3.479.526 mailbox