Re: Cyrus in ISP environment?
On Thu, 17 Feb 2005 14:54:11 +0100 Marco Colombo [EMAIL PROTECTED] wrote:

> 10-15MBps ... then add a few hundred concurrent POP/IMAP sessions plus some
> monitoring/statistical script walking your spool doing various operations,
> and watch that number fall dramatically ... because with random I/O ops you
> increase the time the disk heads spend travelling around and add latency to
> the whole setup.

Just came up with a test: if you have Linux software RAID, you can fail one drive and put it back, forcing a resync; then you can play with the sysctls dev.raid.speed_limit_min|max to establish linear read/write I/O that takes some percentage of your total I/O capacity. Then add your above test to the mix and observe the throughput numbers. Might be interesting :)

> Consider splitting the SMTP incoming part from the IMAP/POP serving one.
> Have the SMTP server receive, queue and scan messages. Once messages are in
> the queue, use a queue runner to deliver them to the IMAP server via LMTP.

I think this happens by default? :) Unless you are willing to accept mail for unknown users (and discover that only later, at the LMTP level), you may need to teach your SMTP server how to recognize valid users. If you have some external DB for user auth, it is relatively trivial to build a postfix (or sendmail...) map that checks for valid users before accepting the mail. What I'd like to see next is an over-quota check at the same level.

-- 
Jure Pečar
http://jure.pecar.org/

---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
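For the curious, the resync experiment described above would look roughly like this on Linux md (device names are examples, and failing a member really degrades the array, so only try this on a test box):

```sh
# force a resync by failing and re-adding a member (example devices)
mdadm /dev/md0 --fail /dev/sdb1
mdadm /dev/md0 --remove /dev/sdb1
mdadm /dev/md0 --add /dev/sdb1

# bound the resync rate (KB/s) so it consumes a chosen share of total I/O
sysctl -w dev.raid.speed_limit_min=10000
sysctl -w dev.raid.speed_limit_max=50000

# watch resync progress while the mail benchmark runs
cat /proc/mdstat
```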
Re: User directory hashing
On Thu, 17 Feb 2005 16:56:57 +0100 Tucsek János [EMAIL PROTECTED] wrote:

> Hi, does anybody know how to turn on directory hashing on the user
> directory? For example, our users' messages are in:
>
>   /imap/domain/foobar.com/user/foo
>   ...
>   /imap/domain/foobar.com/user/bar
>
> Is there any directory hashing patch for cyrus (2.2+) that will make
> something like this with the directories:
>
>   /imap/domain/foobar.com/user/f/fo/foo
>   ...
>   /imap/domain/foobar.com/user/b/ba/bar
>
> Hm?

hashimapspool: 1 in imapd.conf gives me this:

  /imap/domain/f/foobar.com/b/user/bar

... which is OK, for now.

> Because we have approx. 20-25 thousand users under one domain dir (free
> mail service),

Same here :)

> and when doing a backup it takes a lot of time to get the directory
> listing...

Which filesystem? Default ext3 is going to take some time here, yes ... Also, the next limit you're going to hit is the 32k-subdirectories-per-directory maximum on many filesystems (at least ext2/3 and Veritas behave that way). Luckily, reiserfs does not suffer from this. For those who don't want to / can't use reiserfs and find fulldirhash too messy ... it would be nice to hash in the f/fo/user/foo way ... maybe like what postfix does with hash_queue_depth. Anyone?

-- 
Jure Pečar
http://jure.pecar.org/
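To make the proposed scheme concrete, here is a small sketch of the prefix-hashing the example above asks for (this is what the thread wishes for, not what Cyrus's hashimapspool/fulldirhash actually implement; the function name and depth parameter are made up):

```python
def hash_path(name, depth=2):
    """Prefix-hash a mailbox name, one directory level per leading prefix,
    in the spirit of postfix's hash_queue_depth.

    hash_path("foo") -> "f/fo/foo", hash_path("bar") -> "b/ba/bar"
    """
    # one path component per prefix length 1..depth (capped at the name length)
    parts = [name[:i + 1] for i in range(min(depth, len(name)))]
    return "/".join(parts + [name])
```

With a directory tree laid out this way, no single directory holds more than a few hundred entries even with tens of thousands of users, which sidesteps both the slow directory listings and the 32k-subdirectory limit mentioned above.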
Re: Hardware RAID Level Performance
On Thu, 17 Feb 2005 10:19:28 -0800 (PST) Andrew Morgan [EMAIL PROTECTED] wrote:

> You may want to look into Dell's AX100 SAN (a rebranded version of the EMC
> Clariion AX100). These use SATA drives with an FC front end. They are
> relatively inexpensive for the amount of storage you can get, if your I/O
> needs match. You can also go a little more upscale with the CX300/500/700
> models, which support a mix of FC and SATA hard drives and offer greater
> expandability.

Does anyone have any experience with Apple's Xserve RAID offering? It seems to be the cheapest of the bunch.

-- 
Jure Pečar
http://jure.pecar.org/
Re: Hardware RAID Level Performance
On Thu, 17 Feb 2005 20:50:56 +0100 [EMAIL PROTECTED] wrote:

> If you want the benefits of host-independent RAID and cheap SATA disks you
> may have a look at this one:
> http://www.icp-vortex.com/english/product/pci/rz_sata_8/8586rz_e.htm

I'm actually very afraid of all those cards with plenty of cache and no battery backup for it. It has been proven (on certain notebook disks, IIRC) that even the 2MB cache on the disks themselves, if not flushed properly on shutdown, can be a disaster for the filesystem. I don't want to experience what happens if you manage to create a 128MB hole in your data. That's why you see write caching disabled everywhere by default. And write caching is what we in the mail business want the most ... 3ware has gotten the right idea recently and started offering battery backup units for their cards. I'm trying to get one to test ...

Because of that (and because I already have FC infrastructure in place) I'm mostly interested in standalone disk enclosures doing their own RAID with cheap SATA drives and big battery-backed caches.

-- 
Jure Pečar
http://jure.pecar.org/
Re: Hardware RAID Level Performance
On Thu, 17 Feb 2005 14:41:29 -0800 David R Bosso [EMAIL PROTECTED] wrote:

> I just ran across these today:
> http://www.synetic.net/Synetic-Products/SyneRAID-Units/SyneRAID6-16-3U/SyneRAID6-SATA.htm
> No experience with them, but they're the best specs I've seen for SATA
> external RAID - decent processor and NCQ support.

Hm ... $7,345 for just the enclosure, and only 128MB of cache ... For $8,849 Apple gives you 7 x 400GB disks ... but they're standard ATA, and since it's Apple, I'd guess the whole thing is tuned more toward large media files ... hmm ...

-- 
Jure Pečar
http://jure.pecar.org/
Re: Cyrus in ISP environment?
On Wed, 16 Feb 2005 16:21:09 +0100 Attila Nagy [EMAIL PROTECTED] wrote:

> Amavisd was slow like hell, but cyrus could easily put email down to disk
> at a rate of 10-15 MBps. Take the above numbers with a grain of salt,
> because the testing was pretty lame.

10-15MBps ... then add a few hundred concurrent POP/IMAP sessions plus some monitoring/statistical script walking your spool doing various operations, and watch that number fall dramatically ... because with random I/O ops you increase the time the disk heads spend travelling around and add latency to the whole setup.

The only things that help here are lots and lots of disks, or hardware RAID controllers with nice big caches. Just what 'lots' and 'big' mean depends very much on your actual needs.

-- 
Jure Pečar
http://jure.pecar.org/
Re: Hardware RAID Level Performance
On Wed, 16 Feb 2005 13:29:27 -0500 Lee [EMAIL PROTECTED] wrote:

> Do you have a particular suggestion for brand/model of device? It would
> obviously have to be redundant (or capable of being made redundant) and
> cost effectiveness would be critical.

Umem cards seem a preferred choice, since there's a driver for them in the standard Linux kernel ... but they're kind of hard to find these days, so Rob Mueller found these in our last discussion:

http://www.curtisssd.com/products/drives/nitrofc/

Rob, do you mind sharing some experiences with them?

-- 
Jure Pečar
http://jure.pecar.org/
Re: Hardware RAID Level Performance
On Tue, 15 Feb 2005 12:18:57 -0500 Lee [EMAIL PROTECTED] wrote:

> What are the implications of RAID 10 vs. RAID 5 with cyrus? Are they
> significant? Does ext3 play into the discussion?

Cyrus 2.3 CVS code enables you to split the indexes and cyrus db files onto their own partition. That's where most of the I/O activity is concentrated, so you only need to optimize that partition. The mail spool that remains can be RAID 5.

Yes, ext3 does have its problems, depending on how many users you have and how big the mailboxes are. I'd recommend reiserfs.

-- 
Jure Pečar
http://jure.pecar.org/
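For reference, the metadata split in 2.3 looks roughly like this in imapd.conf (the paths here are just examples; check the imapd.conf(5) of your build for the exact option names and the list of file types it supports):

```
# mail spool stays on the big (e.g. RAID 5) partition
partition-default: /var/spool/imap

# hot metadata files go to a separate fast (e.g. RAID 10) partition
metapartition-default: /var/spool/imap-meta
metapartition_files: header index cache expunge squat
```

The win is that the small, constantly rewritten files (cyrus.index, cyrus.cache, ...) stop competing for seeks with the bulk message data.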
Re: Hardware RAID Level Performance
On Tue, 15 Feb 2005 17:37:05 -0200 Henrique de Moraes Holschuh [EMAIL PROTECTED] wrote:

> I've heard bad things about reiserfs' capabilities to withstand corruption
> *and* to be repaired later. Something that I'd take into account when
> choosing the FS for the big spools. But maybe reiserfs has non-joke repair
> utilities these days...

Maybe it really does suffer fatal corruption from a single bit or byte memory error sooner than ext3 ... I can't really compare :) But reiserfsck did some real magic for me at least once ... when I thought the fs was toast, it nicely put it back together again; I lost a few megs and it put ~2GB into /lost+found (out of ~480GB). The only thing that's a bit problematic right now is the time fsck takes on bushy trees like a Cyrus spool.

-- 
Jure Pečar
http://jure.pecar.org/
Re: Performance Monitoring?
On Fri, 4 Feb 2005 12:20:32 -0500 (EST) Bill Earle [EMAIL PROTECTED] wrote:

> - things we would like to monitor:
>   connect to imap port to banner response time

What exactly do you want to measure here? If it's machine responsiveness, use the standard w/free/vmstat/iostat info.

> imap login time

That depends mostly on where your accounts are stored. In my case, MySQL; so I monitor that.

> mailbox selection time

This is an I/O problem. Use iostat for that.

> imap process time (maybe create a new folder, move a few messages, delete
> them and expunge)

Same. iostat.

> - We would also prefer graphing / trending, like an MRTG add-on.

MRTG is a bit too router-oriented to plot everything you want nicely ... RRD is a better solution. As for how to do it ... there are many, many scripts floating around for general things like CPU load, bandwidth and so on, but what we have here is highly specific stuff, so the best way is to roll your own scripts.

-- 
Jure Pečar
http://jure.pecar.org/
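A roll-your-own probe for the first item (connect-to-banner time) can be as small as this sketch; the function name is made up, and the host/port are whatever you monitor:

```python
import socket
import time

def banner_time(host, port=143, timeout=10):
    """Measure seconds from TCP connect to receiving the IMAP greeting.

    Returns (elapsed_seconds, banner_bytes).  Intended as a data source
    for an RRD, not as a complete monitoring tool.
    """
    t0 = time.time()
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.settimeout(timeout)
        banner = s.recv(1024)   # e.g. b"* OK ... Cyrus IMAP4 ... ready"
    return time.time() - t0, banner
```

The resulting number can then be pushed into a round-robin database with something like `rrdtool update imap_banner.rrd N:<seconds>` and graphed from there; the same pattern extends to timed LOGIN and SELECT probes.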
Re: Postfix and real time integration
On Fri, 31 Dec 2004 10:33:18 +0100 Paul Dekkers [EMAIL PROTECTED] wrote:

> Hi, we're using postfix to feed cyrus (over LMTP) but we're missing the
> real-time integration like people described for sendmail and exim. To
> circumvent this I wrote a small perl script that creates a list of users
> and puts these in a file used by local_recipient_maps. It isn't always in
> sync (since I run the script only daily for now), but at least it prevents
> mail for unknown users from being scanned for viruses and spam, fed to
> cyrus and then bounced... (We would scan about 10 times more mail without
> this map! :-S) Is there a better way to do this?

Sure there is. Keep the user accounts in MySQL/LDAP and point local_recipient_maps there.

What I'd like to see for real-time integration is a policy daemon for postfix that would check the quota status of a given account, so postfix could reject mail immediately.

-- 
Jure Pečar
http://jure.pecar.org/
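For example, with Postfix's MySQL map support the lookup could be wired up along these lines (the table and column names are made up; adjust them to your schema, and note that the `query` form of the map file needs a reasonably recent Postfix — older releases used the table/where_field parameters instead):

```
# main.cf
local_recipient_maps = proxy:mysql:/etc/postfix/mysql-recipients.cf

# /etc/postfix/mysql-recipients.cf
hosts = localhost
user = postfix
password = secret
dbname = mail
query = SELECT 1 FROM mailbox WHERE address = '%s'
```

With this in place unknown recipients are rejected at SMTP time, so nothing unnecessary ever reaches the scanners or LMTP, and there is no daily-regenerated file to fall out of sync.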
Re: 64bit capability bug
On Thu, 30 Dec 2004 01:22:48 +0300 Alex Deiter [EMAIL PROTECTED] wrote:

> Hi, Cyrus IMAP on a 64-bit arch incorrectly interprets the default
> numerical parameters of imapd.conf: all of them are equal to zero!

I can't confirm this on alpha, gcc 3.3.5:

alphabox:~# gcc test.c
alphabox:~# file a.out
a.out: ELF 64-bit LSB executable, Alpha (unofficial), version 1 (SYSV), for GNU/Linux 2.2.0, dynamically linked (uses shared libs), not stripped
alphabox:~# ./a.out
autocreatequota=0
berkeley_cachesize=512
berkeley_locks_max=5
berkeley_txns_max=100
client_timeout=10
imapidlepoll=60
ldap_size_limit=1
ldap_time_limit=5
ldap_timeout=5
ldap_version=3
maxmessagesize=0
mupdate_connections_max=128
mupdate_port=3905
mupdate_retry_delay=20
mupdate_workers_start=5
mupdate_workers_minspare=2
mupdate_workers_maxspare=10
mupdate_workers_max=50
plaintextloginpause=0
popexpiretime=-1
popminpoll=0
poptimeout=10
ptscache_timeout=10800
quotawarn=90
quotawarnkb=0
sasl_maximum_layer=256
sasl_minimum_layer=0
sieve_maxscriptsize=32
sieve_maxscripts=5
timeout=30
tls_session_timeout=1440

-- 
Jure Pečar
http://jure.pecar.org/
Re: corrupted quota or index files
On Mon, 20 Dec 2004 12:16:29 +0100 Kjetil Torgrim Homme [EMAIL PROTECTED] wrote:

> we are running Cyrus imapd 2.2.10 with default database types. every
> morning, we run quota -f and parse the output so we can send warning
> messages to people close to or over their quota. on two occasions, this
> has gone terribly wrong: a lot of users have their quota usage jacked way
> up, generally by a factor of 11. here are the first 20 lines of the log,
> which has 1372 lines in total. [snip] re-running quota -f was ineffective;
> the preceding was the result after running a recursive reconstruct on the
> entire spool first. has anyone seen anything like this before?

Yes. I've learned the hard way that quota -f does bad things to my setup. While I have things under control now, I unfortunately still haven't found the time to examine in detail what's going on. For now I suggest you parse the quota files directly.

Maybe we can start by posting our relevant configs:

altnamespace: no
hashimapspool: 1
quotawarn: 90
unixhierarchysep: 1
virtdomains: 1

I think these are the ones that matter to the quota handling code.

-- 
Jure Pečar
http://jure.pecar.org/
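Parsing the flat quota files directly can be done with a few lines like these. This sketch assumes the file holds two whitespace-separated integers — usage in bytes and limit in kilobytes, with a negative limit meaning unlimited — which matches what I've seen, but verify the format against your Cyrus version before trusting any warnings generated from it:

```python
def parse_quota_file(text):
    """Parse the contents of a Cyrus flat quota file.

    Assumed layout: "<used-bytes> <limit-kbytes>" (whitespace- or
    newline-separated); limit < 0 means no quota is enforced.
    """
    used, limit = map(int, text.split())
    percent = (used / 1024) / limit * 100 if limit > 0 else 0.0
    return {"used_bytes": used, "limit_kb": limit, "percent": percent}
```

Walking config/quota/ (or config/domain/*/quota/ with virtdomains) and feeding each file through this gives you the same numbers a nightly quota run would report, without asking quota -f to touch anything.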
Re: Migrating Cyrus to new server AND new mailboxes
On Sun, 19 Dec 2004 15:32:16 +0100 Nikola Milutinovic [EMAIL PROTECTED] wrote:

> Hi all. I have a working Cyrus IMAP 2.1.13. I would like to migrate to a
> new server (new machine), running Cyrus IMAP 2.2.10. Additional trouble is
> I would like to move existing users to new mailboxes/usernames. For
> instance, I would like to do this:
>
>   Old.ev.co.yu (IMAP 2.1.13): user.nikola
>   New.ev.co.yu (IMAP 2.2.10): user.milutinovicn
>
> I imagine that I can create a new mailbox, stop both servers, copy all
> mail files and folders from Old to New and run reconstruct. But is that a
> good thing to do? Or perhaps, stop both MTAs, connect from a client to
> both mailboxes and do a move from the client. It is slower, but is regular.

Since you need an old_server/old_mailbox -> new_server/new_mailbox mapping, I suggest a custom php/perl/python script that would do the same job a client would do. You can use a cyrus admin account to access all mailboxes.

If you have many accounts, you can probably also do some magic with your MTA to deliver mail to the right mail server and the right mailbox. For example, with postfix >= 2 you can set up per-user transport maps that take care of this for you.

-- 
Jure Pečar
http://jure.pecar.org/
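The "script that does the same job a client would do" could be sketched in Python with the standard imaplib, roughly like this. The function name is made up, `src`/`dst` are imaplib connections already logged in as admin accounts with rights on all mailboxes (how you grant that is site-specific), and note this simplified version does not preserve flags or internal dates:

```python
import imaplib  # src and dst below are imaplib.IMAP4/IMAP4_SSL connections

def migrate_mailbox(src, dst, old_name, new_name):
    """Copy every message from old_name on src to new_name on dst."""
    dst.create(new_name)                      # ignore "already exists" errors as needed
    typ, _ = src.select(old_name, readonly=True)
    if typ != "OK":
        raise RuntimeError("cannot select %s" % old_name)
    typ, data = src.search(None, "ALL")       # data[0] is b"1 2 3 ..."
    for num in data[0].split():
        typ, msg = src.fetch(num, "(RFC822)")
        raw = msg[0][1]                       # the raw message body
        dst.append(new_name, None, None, raw) # flags/date dropped in this sketch
```

You would drive this from the old_mailbox -> new_mailbox mapping table, calling it once per user; fetching "(FLAGS INTERNALDATE RFC822)" and passing the parsed values to append() would preserve the extra metadata.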
Re: quota -f output
On Thu, 21 Oct 2004 09:22:09 -0400 Ken Murchison [EMAIL PROTECTED] wrote:

> IIRC, this means that the quota root file for [EMAIL PROTECTED] is missing,
> so all references to this quota root have been removed from the mailboxes.

All references to this quota root ... what do they look like? In which file are they stored? I *need* to put them back, and since the quota tool is not doing that for me, I guess I'll have to do it manually.

-- 
Jure Pečar
http://jure.pecar.org/
Re: (dead)locking ...
On Thu, 11 Nov 2004 08:49:38 -0500 Ken Murchison [EMAIL PROTECTED] wrote:

> Try an strace/truss on the process and see what it's doing.

I did an strace on reconstruct to determine which file it was locking on, and determined it's cyrus.header. Looking at the locking order in the wiki, I see that cyrus.header is the very first file being locked. Now, how can I find the process that is holding the lock and that probably caused all this?

Last night I rebuilt mailboxes.db (which shrank from 98MB to 64MB) and today everything was fine. This again points to some db as the source of the problem ...

-- 
Jure Pečar
http://jure.pecar.org/
Re: (dead)locking ...
On Fri, 12 Nov 2004 13:12:10 -0500 (EST) Igor Brezac [EMAIL PROTECTED] wrote:

> Use lsof.

Maybe my question wasn't clear enough. When fuser somefile shows ~20 processes, how do I figure out which one is the first, the one that caused all the others to block? Or the first two that are fighting over a lock?

> This problem has been around for a while (including 2.1). I run into this
> issue when a MUA (MS Outlook and pine) saves outgoing messages to another
> imap folder. I took a brief look at the code, but nothing jumps out. There
> is an issue somewhere.

I found it strange that it happened with mailboxes that were very near each other namespace-wise. And that I've never seen it before ...

-- 
Jure Pečar
http://jure.pecar.org/
Re: (dead)locking ...
On Wed, 10 Nov 2004 16:17:18 +0100 Jure Pečar [EMAIL PROTECTED] wrote:

> Is this problem on just user/sta* mailboxes a coincidence? Or can it point
> to something with one of the databases?

I'm seeing this again; in the morning it was against user/ab* mailboxes, now it's user/iz*. This never happened on 2.2.0, 2.2.3 and 2.2.6; it started happening about three weeks after I upgraded to 2.2.8.

Anything I can do to track this down?

-- 
Jure Pečar
http://jure.pecar.org/
(dead)locking ...
Hi all,

Today I experienced for the first time on our production server the locking issue mentioned on this mailing list. I found about 25 problematic mailboxes, all belonging to users matching the user/sta* pattern. Even reconstruct would hang on these, waiting on the stat() of cyrus.header.

Without touching anything else, I just restarted cyrus. But the problem came back after a couple of hours, so I went on a little lock hunt, reconstructing problematic mailboxes immediately after a fuser -k on their cyrus.header. Half an hour after that the situation seems normal.

Is this problem on just user/sta* mailboxes a coincidence? Or can it point to something with one of the databases?

-- 
Jure Pečar
http://jure.pecar.org/
Re: how do I reconstruct user.username@domain
On Thu, 04 Nov 2004 10:25:25 +1300 Matthew Cocker [EMAIL PROTECTED] wrote:

> /usr/sbin/reconstruct -rf [EMAIL PROTECTED] returns [EMAIL PROTECTED] not
> [EMAIL PROTECTED] [EMAIL PROTECTED] [EMAIL PROTECTED]
> For the default domain, /usr/sbin/reconstruct -rf user.name will find all
> mailboxes. Is there a command line switch I have missed?

Same here: 2.2.6, 2.2.8, 2.3.

-- 
Jure Pečar
http://jure.pecar.org/
Re: quota -f output
On Thu, 21 Oct 2004 09:22:09 -0400 Ken Murchison [EMAIL PROTECTED] wrote:

> IIRC, this means that the quota root file for [EMAIL PROTECTED] is missing,
> so all references to this quota root have been removed from the mailboxes.

So you interpret this as the config/domain/d/dom.ain/quota/u/user.username file being missing? That can't be the case, as the file is there for sure.

Also, when does quota -f finish its processing? I'd expect it to walk the whole user tree, but I see it quitting early, somewhere in the middle of user.a*. And why would it generate a report only for users beginning with 0-9 and _ ?

> If someone else can confirm this behaviour, then this is a bug. It is also
> possible that this is some side effect of on-disk corruption.

What kind of corruption would cause such behaviour? Where should I look for it?

-- 
Jure Pečar
http://jure.pecar.org/
quota -f output
Hi,

What exactly does this mean in quota -f -d dom.ain output?

dom.ain!user.username: quota root dom.ain!user.username -- (none)
dom.ain!user.username.Trash: quota root dom.ain!user.username -- (none)

-- 
Jure Pečar
http://jure.pecar.org/
Re: Quota inconsistency...
On Mon, 18 Oct 2004 15:20:08 -0400 Scott Adkins [EMAIL PROTECTED] wrote:

> We have a user that was reporting that she was way *way* over her quota.
> She was allotted only 100MB, and for some reason, she was showing that she
> was using over 400MB of space on her IMAP account! Furthermore, looking at
> actual disk space consumption, she was actually only using 261KB, which is
> barely anything at all.

I can confirm this on 2.2.6 and 2.2.8. My quota fixing script ran over our tree tonight and found mailboxes as large as 900MB (our quotas are 10MB by default). It looks like for these the quota was not being updated at all. I also have many situations where users complain about being 10 times over quota when they have just deleted all their unneeded mail. I have yet to figure out a good way to fix this discrepancy.

> I thought that maybe doing a recursive reconstruct on her account would
> sync up her quota entry with what was actually in use. Nope. I thought
> that the quota command would be able to do something with it; again nope,
> unless I am just missing an option somewhere. I even used cyradm to do a
> setquota on the user to see if it would fix it, and again it did not.

I use a shell script that rewrites the quota files. Although I have no idea why cyrus doesn't like to modify them on its own. Permissions are OK ...

I first noticed this quota issue a while ago after running quota -f -d. After that, some quotas were really off ...

-- 
Jure Pečar
http://jure.pecar.org/
Re: Cyrus Aggregator advisories
On Wed, 20 Oct 2004 14:24:00 +0200 Sebastian Hagedorn [EMAIL PROTECTED] wrote:

> We now have a SAN and have been very happy with it. I suppose a
> FibreChannel RAID delivers the same benefits.

I've had no problems with clustering software on AS 2.1, but I've had major problems with fibre storage that caused some serious downtime. Whatever cluster you put up, the storage is still a single point of failure. The storage AND the filesystem on it. I don't know how SAN boxes like the IBM Shark & co. do in such cases; I hope they do much better than JBOD and software RAID :)

-- 
Jure Pečar
http://jure.pecar.org/
Re: Funding Cyrus High Availability
On Sun, 19 Sep 2004 00:52:08 -0700 (PDT) David Lang [EMAIL PROTECTED] wrote:

Nice review of the replication ABC :) Here are my thoughts:

> 1. Active-Slave replication with manual failover

This is really the simplest way to do it. Rsync (and friends) does 90% of the required job here; the only thing it's lacking is the concept of the mailbox as a unit. It would be nice if our daemon here did its job in an atomic way. A few days ago someone was asking for an event notification system that would be able to call some program when a certain action happened on a mailbox. Something like this would come in handy here, I think :)

> 2. Active-Slave replication with automatic failover

2 is really just 1 + your heartbeat package of choice and some scripts to tie it all together.

> 3. Active-Slave replication with Slave able to accept client connections

I think here it would be good to start thinking about the app itself and define connections better. Cyrus has three kinds of connections that modify a mailbox: LMTP, which puts new mail into a mailbox; POP, which (generally) retrieves (and deletes) it; and IMAP, which does both, plus some other things (folder ops and moving mail around). Now if you decide that it does not hurt you if the slave is a bit out of date when it accepts a connection (though I guess most of us would find this unacceptable), you can ditch some of the complexity; but you'd want the changes that were made on the slave in that connection to propagate back up to the master. I don't really like this, because the concepts of master and slave get blurred here and things can easily end in a mess. Once you have mailstores that synchronize each other in a way that is not very well defined, you'll end up with conflicts sooner or later. There are some unpredictable factors, like network latency, that can easily lead you to unexpected situations.

> 4. #3 with automatic failover

Another level of mess over 3 :)

> 5. Active/Active: designate one of the boxes as primary and identify all
> items in the datastore that absolutely must not be subject to race
> conditions between the two boxes (message UUID for example). In addition
> to implementing the replication needed for #1, modify all functions that
> need to update these critical pieces of data to update them on the master
> and let the master update the other box.

Exactly. This is the atomicity I was mentioning above. I'd say this is going to be the larger part of the job.

> 6. active/active/active/...

This is what most of us would want.

> while #6 is the ideal option to have, it can get very complex

Despite everything you've said, I still think this *can* be done in a relatively simple way. See my previous mail where I was dreaming about the whole HA concept in a RAID way. There I assumed murder as the only agent through which clients would be able to access their mailboxes. If you think of murder handling all of the jobs of your daemon in 1-4, one thing you gain immediately is much simpler synchronization of actions between the mailstore machines. If you start empty, or with exactly the same data on two machines, all that murder needs to do is take care that both receive the same commands and data in the same order. Also, if you put all the logic into one place, the backend mailstores need not be taught any special tricks and can remain pretty much as they are today. Or am I missing something?

> personally I would like to see #1 (with a sample daemon or two to provide
> basic functionality and leave the doors open for more creative uses)
> followed by #3 while people try and figure out all the problems with #5
> and #6

And I would like to see us come to a conclusion about what kind of HA setup would be best for all, and focus our energy on only one implementation. I have enough old hardware here (and I'm getting some more in about a month) that I can set up a nice little test environment. Right now it also looks like I'll have plenty of time from February to June 2005, so I can volunteer to be a tester.

> there are a lot of scenarios that are possible with #1 or #3 that are not
> possible with #5

One, I think, is the slave-of-a-slave-of-a-slave (...) kind of setup. Does anybody really need such a setup for mail? I understand it for LDAP, for example, and there are even some things where it is useful for an SQL database, but I see no reason to have it for a mail server.

-- 
Jure Pečar
http://jure.pecar.org/
Re: Funding Cyrus High Availability
On Fri, 17 Sep 2004 08:25:26 +0200 Paul Dekkers [EMAIL PROTECTED] wrote:

> I would say not at an interval, but as soon as there is an action performed
> on one mailbox, the other one would be pushed to do something. I believe
> that is called rolling replication. I would not be really happy with
> interval synchronisation. It would make it harder to use both platforms at
> the same time, and that is what I want as well. So there is a little bit
> of load-balancing involved, but more and more _availability_. Being able
> to use both platforms at the same time maybe implies that there is either
> no master/slave role, or that this is auto-elected between the two and
> that this role is floating...
>
> Paul

I'm jumping back into this thread a bit late ...

My feeling is that most cyrus installations run one or a few domains with many users; at least that is my case. That's why I'd base any kind of replication we come up with on the mailbox as the base unit. As RAID uses the disk block as its unit, so we would use the mailbox (with all its subfolders), in such a way that one could take care of whole domains at a higher level, if needed.

Today we have the option of using murder (or perdition, with some added logic) when more than one backend machine is needed. This brings us a kind of raid linear (linux md speak) or concatenation of space into a single mailstore, with all the 'features' of such a setup: if you lose one machine (disk), all users (data) on that machine (disk) are unavailable.

So what I'm thinking we need is a kind of raid1 or mirroring of mailboxes. Imagine user1 having its mailbox on server1 and server2, user2 on server2 and server3, user3 on server3 and server1 ... for example. Murder is already a central point that knows where a certain mailbox is and how to proxy POP, IMAP and LMTP to it, and in my way of seeing things, it would be best to teach it how to handle this 'mirroring' too. Let's say one of the two mailboxes is primary and the other is secondary; murder connects to the primary, lets the client do whatever it wants, and then replays the exact same actions to the secondary mailbox. Whether this is done after the primary disconnects or while the client is still talking to the primary is an implementation detail. Performance bonus: connect to both mailboxes at once and pronounce as primary the one that responds faster :)

Murder would have to know how to record and play back the whole client-server dialogues. Considering that there's already a system in cyrus that lets the admin see the 'telemetry' of the IMAP conversation, I guess this could be extended and tied into murder.

So far this is just how clients would talk to our system. What else would we need?

Certainly a mechanism to manually move mailboxes between servers in a way that murder knows about the changes. Come to think of it, the mupdate protocol already knows how to push metadata around; why not extend it so it can also move mailboxes? Or should a perl mupdate module be born, and some scripts written with it and imap?

Then maybe a mechanism for murder to decide which servers to put newly created mailboxes on. Ideally this would be plugin-based with different policies (load, disk space, responsiveness, a combination of those, something else), but a simple round robin would do for a start.

For those who do not want to keep mailboxes in sync, a mechanism to delay updates to the secondary mailbox. (In this case, which mailbox is primary and which is secondary should not change.)

Also a way of handling huge piles of backlog in case one of the machines is down for a longer period of time. Maybe a mechanism to sync the mailbox from the other server, discarding the backlog, would be handy in such a case. And a way to manually trigger such a resync on a specific mailbox.

Probably something else I can't think of right now.

So how does this cyrus-in-a-raid view sound? It should probably be called RAIMS, for redundant array of inexpensive mail servers, anyway ;)

This way all the logic is done in one place and you only have to take good care (in an HA sense) of the mupdate master machine. The others can remain cheap and relatively dumb, and can be pulled offline at will. Given fast and reliable enough links, this could also work in a geographically distributed manner.

Ken, is something like this reasonable?

Oh, I'd like to know what the fastmail.fm folks think about all this HA stuff. I'm sure they have some interesting insights :)

-- 
Jure Pečar
Re: Funding Cyrus High Availability
On Fri, 17 Sep 2004 13:28:08 -0700 [EMAIL PROTECTED] wrote:

> My biggest question here is, simply, why recreate what's already out there?

Because none of the existing solutions fits our needs well enough.

> There are a number of projects (LVM, PVFS) which do this kind of
> replication/distribution/virtualization for filesystems.

We're discussing replication at the application level. Block-level replication is nice for many things, but doesn't really take care of consistency, which cyrus relies on pretty heavily.

> There are a number of databases which have active/active clustering
> (mysql, DB2, Oracle, et al) and master/slave. Personally, I would LOVE to
> see a full RDBMS-backed system. You define your database(s) in the config
> file ... and that is all.

You can go with dbmail and one of the existing well-established databases anytime. This may solve the issue we're discussing here, but it brings lots of other problems that cyrus keeps away. Just ask any Exchange admin :)

> The other advantages would be very nice integration with other
> applications which can query against databases. (ex: postfix directly
> supports mysql lookups.)

For mail routing and auth, yes ... many of us are already doing this. However, storing mail in a db gives you about 20% of db overhead (straight from the Oracle sales rep), and I/O is already a very valuable resource ...

> But then, I can't afford to really help with this myself, so take my
> thoughts with a big hope pill. =D

Yup :)

-- 
Jure Pečar