Re: fixing unixheirsep for 2.5
Le 15/02/2013 07:10, Bron Gondwana a écrit : Thanks for this! commit c95da833a910150bb8d55585d221237cf1b673d9 Author: Julien Coloos julien.col...@atos.net Date: Mon Feb 11 15:36:05 2013 +0100 Various fixes for userid with dots and unixhierarchysep Use correct (not internal) userid in ACLs when creating root level folder. Fix ACLs setting when userid contains dots. In pop3d, as for imapd, correctly handle userid with dots. Fixing the various unixheirsep and altnamespace cases to work correctly has been on my todo list for far too long. Anyone who's following along with 2.5 development and uses either of those two set to other than the default - I'd love your test results and bug reports. I really want this rock solid by release. (anyone who can write test cases would be greatly appreciated too!) Thanks, Bron. Here is a test case for the issues I fixed: Repository: git://github.com/worldline-messaging/cassandane.git Branch: ticket/3373 Commit: e866d3f134ed07663d2742113964ed91b80be831 (https://github.com/worldline-messaging/cassandane/commit/e866d3f134ed07663d2742113964ed91b80be831) Notice that there actually still exist incoherences regarding how userid is handled. In most - if not all - places, the current master code deals with the internal representation of the userid, and thus in some places (usually when dealing with ACLs, as with my fix) there are conversions to its external form. The incoherences arise when the external userid contains the character '^', which is the value used to represent dots in internal form when unix hierarchy separator is used. For example both user ids my.id and my^id have the same mailbox (internal name: user.my^id). Thus if you set ACLs for my^id on mailbox (external name) user/my.id, user my^id would be able to login in this mailbox. In places where internal form is converted back to external, other kind of issues can happen: my^id is converted back to my.id which is not the actual external user id. I believe that to fix those last issues, either '^' shall become forbidden with unix hierarchy separator enabled, or the code shall be refactored to receive (a structure knowing) both external and internal form of the user id.
Re: Issue with Emails being changed from UNREAD to READ with no explanation
Le 18/10/2011 12:53, Smith, Michael a écrit : Hello, We (Pegasystems) are working with a customer who are using Cyrus 2.3.13 in production. *Process* ·The application that we have built uses a listener process that polls an email account every 60 minutes. ·All UNREAD emails are processed when the email listener starts ·All UNREAD emails are then deleted when the listener process disconnects ·Any new emails received during the time of the email listener is processing emails are somehow marked as READ when they have not been read or processed ·For all subsequent instantiations of the email listener process the READ emails are not processed *Investigations undertaken* ·All debug code from the application has revealed that there are no java errors from the application or from the JavaMail API ·No Pegasystems code or command is being issued that can change the status of the email from UNREAD to READ Are there any existing issues/bugs that match the symptoms detailed above in Cyrus post 2.3.13? Cheers, Michael ** *Michael Smith | Technical Solutions Director | Pegasystems Ltd* Office: 01189 651627 | Mobile: 07966 768244 | E-Mail: michael.sm...@pega.com mailto:michael.sm...@pega.com | www.pega.com http://www.pega.com Looks similar to an issue discussed some time ago: http://asg.andrew.cmu.edu/archive/message.php?mailbox=archive.cyrus-develmsg=2961 Bug happens in a complex part of the code - that no-one dared to approach. It was fixed in version 2.4 as part of a heavy refactoring. Regards
Re: Graceful restart also for imapd.conf
As proposed by Olivier, there is the 'sighup' branch available on git://github.com/worldline-messaging/cyrus-imapd.git. It is based on current official cyrus 'master' branch. Details about the commits: 20a7e4b32c01e7c8dd7aeb8eae41d35e4c630369 - Recycle running services upon SIGHUP on master. Currently when SIGHUP is sent to the master process, only new and legacy services are being specifically processed: - new ones are being started - legacy ones are stopped by transmitting SIGHUP to the concerned children (to stop them once the client - if connected - leaves) What we propose is to extend the use of SIGHUP to also recycle remaining services so that any change to imapd.conf is taken into account as fast as possible. 5b55aab917d6686c3ba74dc8abad1a3eb906cd12 - Send message to master when service is exiting. This commit is useful to make recycling smoother. Commit log: Master process is notified about children exiting by a SIGCHLD signal, which are taken into account but processed separately (not right when receiving the signal). The master process uses a 'select' call to wait for messages from its children, and manage other events. Signal handlers in the master process are set with the SA_RESTART flag. But the POSIX specifications let the implementation decide whether 'select' restarts or returns EINTR in this case. Thus it may take some time ('select' timeout) before the master process actually reap children (and fork new ones when necessary). It can also happen due to race conditions. Provided that letting children send back a message when they are exiting is not too much time consuming (for both child and master), removing the static MESSAGE_MASTER_ON_EXIT configuration variable is actually useful for smoother janitoring, and faster services recycling when sending SIGHUP to master process. I actually wonder why 'MESSAGE_MASTER_ON_EXIT' was used and set to 0 up to now. Compared to what services and master are already doing, I believe that it should not consume a significant amount of resources to send this 'exit' message. But maybe someone can prove me wrong here ? 4578a0268a1e1c2c1f0b33d68e547fc864dcca4b - Remain root on master process. Currently master process is usually started as root to later become cyrus. With SIGHUP it reloads its configuration and can start newly added services. Sometimes this is however problematic because non privileged users (as cyrus) are not allowed to bind ports below 1024: this results in the added service being unabled to start. To be fair I don't know if there are - for all platforms targeted by cyrus - ways to circumvent this security limitation. At least I tried setcap'ing the exe on Ubuntu, but it did not work as expected ... Now, as far as security is concerned it is best to not be root. But after thinking about it, two things make us think that maybe the master process could stay root: - it does not discuss with external clients as IMAP/POP/etc services do; so no remote exploits, right ? - many daemon services do run as root by default - ok, ok, that does not mean it's the right thing to do :p Or maybe it could be root by default, and the configuration would allow to use a given user/group, as some other daemon services do. What do you think ? As usual, comments are welcomed :) Now, a side note about something I observed while messing with signals in cyrus: from time to time I did get deadlocked processes upon SIGTERM; for example, when sending it to the master process right after starting it (yeah I know, who in their sane mind would want to do that ? - except me). Stacktraces of deadlocked processes: #0 0xf430 in __kernel_vsyscall () #1 0xf7388753 in __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:95 #2 0xf731bbac in _L_lock_10489 () from /lib/i386-linux-gnu/libc.so.6 #3 0xf731a553 in __libc_realloc (oldmem=0x97929e0, bytes=62) at malloc.c:3813 #4 0xf7309cea in _IO_mem_finish (fp=0x9792868, dummy=0) at memstream.c:132 #5 0xf7305949 in _IO_new_fclose (fp=0x9792868) at iofclose.c:66 #6 0xf73764bc in __vsyslog_chk (pri=value optimized out, flag=1, fmt=0x805482e exiting on SIGTERM/SIGINT, ap=0xffb7f43c o+\367\023\t) at ../misc/syslog.c:228 #7 0xf7376896 in __syslog_chk (pri=6, flag=1, fmt=0x805482e exiting on SIGTERM/SIGINT) at ../misc/syslog.c:131 #8 0x08049b7f in syslog (sig=15) at /usr/include/bits/syslog.h:32 #9 sigterm_handler (sig=15) at master.c:1067 #10 signal handler called #11 0xf430 in __kernel_vsyscall () #12 0xf73430d7 in __libc_fork () at ../nptl/sysdeps/unix/sysv/linux/i386/../fork.c:130 #13 0x0804afb5 in spawn_service (si=2) at master.c:633 #14 0x0804cce4 in main (argc=10, argv=0xffb80db4) at master.c:2069 #0 0xf430 in __kernel_vsyscall () #1 0xf7388753 in __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:95 ... (same backtrace) #13 0x0804afb5 in
Re: MESSAGE quota resource implemention
As discussed here and on IRC, I rebased my commits with all the changes: - the 'utility' methods have been promoted to 'command' ones, which are now generic and may (options hash) concern cyrus binaries - the 'start_command_bg' method is now replaced by a 'background' option in 'run_command' - dropped the 'mode' option, leaving only the 'redirects' one - moved 'unpack' method from Cassandane::Unit::TestCase to Cassandane::Instance, leaving the tar utility determine itself the kind of the tar file - by default destination is the current cassandane instance base directory; alternatively one can specify a relative path - 'path/to/file' - to this directory, or an absolute path - '/path/to/file' - use the pre-existing 'user.cassandane' mailbox for quota upgrade tests - fixed a few minor things (like avoiding short-life zombies by harvesting some cassandane dead children) - refactored a few more stuff - removed stuff that became unnecessary Hoping this will work out this time :) Regards Julien
Re: MESSAGE quota resource implemention
Le 12/09/2011 23:09, Bron Gondwana a écrit : - for 'old' mailboxes (those created before the annotation storage usage field in the index header), current annotations usage shall be computed (and added to the quota entry) upon upgrading; this way users won't have to run 'quota -f' for all quotaroots after switching to this new version ;) Definitely. Upgrading usually handles things like that. It's the right way[tm]. Pushed a few lines of code in the 'upgrade_index' function. Though it still lacks the annotation storage usage computing. It's in the branch 'gnb/annotate/fixes'; commit ed61f721487804b205e399c538fc35ddc153bd93. Actually I just gave up the 'old' test: there is no easy way to simulate upgrading mailbox index, or at least I don't feel confident enough to make it in cassandane :( Easiest way is the same way I did for the broken quotas test. Have a tar file with the contents of the old mailbox which you unpack onto the filesystem, and then open the mailbox and check that the upgraded fields are what you would expect. I also did something similar for the crash on thread test, where there were 5 messages which were known to be able to crash the THREAD command. I unpacked the folder contents with those messages in a test. Nice idea :) Pushed a few things. Added an option hash so that when running commands: - test can run system commands (based on the current code which would run cyrus commands inside the current cassandane instance directory) - finer I/O redirections are possible - working directory can be specified While adding those new 'features', I still kept the 'run_utility' method (cyrus commands) and simply added the 'run_command' method for other commands. Then, as you suggested on IRC, I added an 'unpack' method to extract tar/gz/bz2 (and combinations) files using system commands. And I added back my quota upgrade test from a cyrus v2.4(.11) mailbox, using two tar.gz files containing a mailbox content and its quota file. It's in the 'quotamessage/gnb/annotate' branch; commits d2bf4e4f42f7f8c9b53713441116ddbea5b0a265, 86a52daeed5a03f078b88de67d3d10b51a7f8cc4 and fb827e8fc77529a5e23465d2c19d6e88adf7cae8. Maybe you won't keep everything I pushed, but I hope some parts will be helpful :) Regards Julien
Re: MESSAGE quota resource implemention
Le 09/09/2011 14:18, Greg Banks a écrit : Ok, here you go. Not completely tested yet, so caveat emptor. Had to change two things: - mailbox_quota_check now expects a quota diff array which is good :), but change is now really applied if all diffs are ' 0' (instead of '= 0' previously); in some cases '0' is used to check that mailbox is not currently overquota (e.g. in LMTP service), so either we should go back to testing '= 0', or change callers that did rely on that test - changing the value of annotations quota storage in the index (mailbox_use_annot_quota function) do dirty the quota, but not the index; the new value is thus not committed (at least when setting mailbox annotations), and the quota entry becomes false when later deleting the mailbox If needed, you can look in our 'cyrus-imapd' repository at the 'gnb/annotate/fixes' branch. There is one commit for those matters. Remaining things concerning annotations: - when deleting messages, annotations length is not substracted; the solution may not be that simple, since I believe users are allowed to unexpunge mails: so since index entry is still here - until real unlinking -, annotations may have to stay there too - until unlinking too - for 'old' mailboxes (those created before the annotation storage usage field in the index header), current annotations usage shall be computed (and added to the quota entry) upon upgrading; this way users won't have to run 'quota -f' for all quotaroots after switching to this new version ;) Then maybe some of the cassandane tests I pushed on our repository would need to be refreshed (at least the one that checks what happens for legacy mailboxes on which we add one of the newly handled quota resources). That's next week's focus I think. Actually I just gave up the 'old' test: there is no easy way to simulate upgrading mailbox index, or at least I don't feel confident enough to make it in cassandane :( Other tests do work. Once the annotations usage is subtracted upon messages deletion (see before), all tests shall pass :) I rebased the 'quotamessage/gnb/annotate' branch of our 'cassandane' repository today, leaving that test aside. Regards Julien
Re: MESSAGE quota resource implemention
Le 06/09/2011 10:23, Greg Banks a écrit : So I think we'll need another round, sorry :( Given that the annotations quota is broken and I'll be reimplementing it anyway, you may as well ignore QUOTA_ANNOTSTORAGE in all commits, and leave out the function annotatemore_computequota() for now. We'll use something very much like it for reconstruct later. I'm hoping to be able to pull your next round of changes into my annotate branch before I reimplement the annotation quota in the next few days. If the new code is capable of returning actual resource usages upon getquota (I think Bron wanted it that way), then I guess I can drop some more of the functions I added (annotatemore_computequota, but also mboxlist_updatequota and associated code including the quota.sets[]; then maybe I could also change the code as proposed earlier and prevent writing of resource limit in quota entry if it is QUOTA_UNLIMITED). Will look at it tomorrow. Worked on DUMP/UNDUMP today: https://github.com/worldline-messaging/cyrus-imapd/commit/45148b20a4f2343d72aa7436e7255a92508d7bf8 I used an IMAP list to send data, as for annotations. For future usages, maybe the UNDUMP code could try to skip IMAP lists if it finds any instead of the expected file content LITERAL ? (to prevent breaking compatibility too many times, in case other things would need to be transmitted this way through DUMP/UNDUMP). By the way, I noticed that UNDUMPing annotations fails for now (function annotate_state_write uses 'int_mboxname' in the annotation state struct, but it is NULL in this context). I'm still not convinced we'll need quota.sets[], but I'll play along. Thanks again for your work, and sorry that my annotate branch wasn't quite as stable a base as you first thought :) No problem, I was somehow prepared to that kind of scenario :) Regards Julien
Re: MESSAGE quota resource implemention
Le 06/09/2011 19:48, Julien Coloos a écrit : Le 06/09/2011 10:23, Greg Banks a écrit : So I think we'll need another round, sorry :( Given that the annotations quota is broken and I'll be reimplementing it anyway, you may as well ignore QUOTA_ANNOTSTORAGE in all commits, and leave out the function annotatemore_computequota() for now. We'll use something very much like it for reconstruct later. I'm hoping to be able to pull your next round of changes into my annotate branch before I reimplement the annotation quota in the next few days. If the new code is capable of returning actual resource usages upon getquota (I think Bron wanted it that way), then I guess I can drop some more of the functions I added (annotatemore_computequota, but also mboxlist_updatequota and associated code including the quota.sets[]; then maybe I could also change the code as proposed earlier and prevent writing of resource limit in quota entry if it is QUOTA_UNLIMITED). Will look at it tomorrow. When I said 'actual', I meant based on the upcoming mailbox index field. But wait, I got a bit confused with the last point about playing along with quota.sets[]. If what you discussed with Bron is that, in the end, usage (re)computing will only be done with the quota utility (no more automatic computing upon setquota, neither for getquota), then after upgrading people shall call 'quota -f' once and for all and quota.sets[] must disappear - and all resources must be written in the quota entry -, otherwise users would need to call 'quota -f' on a given quotaroot each time they set a new quota resource limit to a mailbox. Regards Julien
Re: MESSAGE quota resource implemention
Le 05/09/2011 06:12, Greg Banks a écrit : Julien, I think we agreed on everything else, right? I'm looking forward to your next iteration. After picking your 'uquota_t removal' commit, I also removed it on my end, and changed the code according to our previous discussions. Adding an helper function which fills a quota_t array with the current messages usage from mailbox struct made the code a bit cleaner and more generic :) I still have to look for transferring limits upon DUMP/UNDUMP. And for the time being, I still kept the quota.sets[] thing (we will see later if we can do it more smartly). Regards Julien
Re: MESSAGE quota resource implemention
be updated to check (LITERAL parameter) that each message about to be received would fit in the mailbox, instead of checking only once at the end ? imap/mailbox.c I'm not sure it's a good idea to set mailbox-i.exists = 0 in mailbox_delete(). Perhaps it would be better to check for (mailbox-i.options OPT_MAILBOX_DELETED) when sampling mailbox-i.exists in mailbox_commit_quota(). Hmm, I wondered about it too. At the end of mailbox_delete, upon calling mailbox_close, all physical files in the mailbox are deleted, and it didn't seem like 'exists' was used anywhere in-between, so I thought it would be alright. But given the complexity of the code at that point (lot of cleaning when deleting the mailbox), and unforeseen future usages, I guess you are right. Actually maybe even quota_mailbox_use shall be treated that way ? I'm thinking that there's now three places in the code which take a mailbox* and fill out an array of quota diffs, interpreting the contents of the struct mailbox. That should really be centralised. I'm not sure to see what you have in mind here. Are you talking about the places where the QUOTA_STORAGE and QUOTA_MESSAGE entries of the quota diff array are computed relatively to the 'quota_mailbox_used' and 'exists' index fields of the struct mailbox ? If so I guess some of the code could indeed be centralised. Regards Julien Coloos
MESSAGE quota resource implemention
Hi, As discussed on IRC last week, I worked on implementing MESSAGE quota resource in cyrus (see RFC 2087; STORAGE resource already being handled). I created a branch based on Greg's 'annotate' one on github, since his work on annotation storage management made mine a lot easier. Details on the changes I made: cyrus-imapd: - added MESSAGE quota resource management - updated cunit test - added 'quotawarnmsg' option which behaviour is similar to 'quotawarnkb'; warning message shown when selecting folder in IMAP tells which resource limit was triggered - added 'autocreatequotamsg' option, to set MESSAGE limit (unlimited by default) when user auto-creates its mailbox - xfer transfers all non-unlimited resources - added helper function to compute annotation storage usage for a given mailbox - changed the way quota entries are read/written - resources presence in fetched entry is remembered - when writing entry, only present resources are written - when setting resources limits, entries which were not present upon fetching are now marked as present and their current usage is computed - quota utility lists and computes (-f) all resources associated to mailbox The branch 'quotamessage/gnb/annotate' is available here: git://github.com/worldline-messaging/cyrus-imapd.git. It is based on Greg's 'annotate' branch on github. cassandane: - added tests for MESSAGE resource - modified current tests to play a bit with subfolders and setting quota after adding messages/annotations - special test to check new resource usage is computed when necessary for pre-existing mailboxes (which only have usage for STORAGE resource) The branch 'quotamessage/gnb/master' is available here: git://github.com/worldline-messaging/cassandane.git. It is based on Greg's 'master' branch on github. Things that may be worth noting: - DUMP/UNDUMP currently does nothing special about MESSAGE or X-ANNOTATION-STORAGE quota resources - should it be transferred ? - without breaking backward compatibility, limits could only be transferred through a 'fake' file entry, as for annotations - quota usage is currently stored in a uquota_t variable, and delta is computed as quota_t; so theorically there could be overflow issues if quota usage to add/substract cannot be held in a quota_t; in practice it should be unlikely since that would mean a usage of over 2^63-1 Why the change concerning quota entries read/write ? The thing is that I wanted to make version upgrading as painless as possible, both for users and in the code. With previous code, when quota entry exists and is read, missing resources are initialised with default values (0 usage, unlimited). Thus only usage delta is tracked, and actual usage computing would only happen if quota entry was missing: this is not nice when upgrading, since lots of mailboxes are likely to have an entry with only the STORAGE resource present. So actual usage has to be computed at some point for newly handled resources. The idea here is to compute it when setting the resource for the first time. To do that it was necessary to know when the resource was not previously present and keep it that way until its limit is finally set for the first time. This scheme has another advantage: for platforms where only STORAGE quota is used, quota entries size remains as it is now. Only people using those new quota resources will have their quota entries grow to store the new data. Comments are welcomed :) I think that there are two things that may also be done concerning quota entries: - always recompute resource usage when changing its limit from unlimited to bounded - currently it is only done once when setting the usage limit for the first time - that way, it may not be necessary to track resource presence when reading/writing quota entries - but maybe it could be too time consuming in some cases, since it seems to be possible to associate a quota resource to a whole domain (recomputing usage for all mailboxes would then take some time) - do not write resource in quota entry when its usage is unlimited - except for the STORAGE resource, for backward compatibility - would also help keeping quota entries size to the bare minimum What do you think ? PS: should I open a new bugzilla ticket for that ? PPS: cunit: on my computer cunit tests succeed except the 'foreach' one in the 'quota' suite which timeouts (it seems messing with 4096 quotalegacy is too much for my computer). cassandane: a few quota tests I added fail due to some of the issues I reported (#5327 and #5329). And poor lemmings (well thought out name in this context :)) are dying hopelessly and endlessly in the Cassandane::Cyrus::Master test, preventing it to complete. Don't know if it is normal.
Future plans for replication
Hi there, Are there any known plans for the future of replication inside cyrus ? There are a few things we dream of having, and that we consider implementing. But if some of those do overlap with cyrus roadmap, maybe we could help you by sharing the work ? In no particular order: 1) Master-master replication There are many difficulties to overcome for this feature, as well as many ways to achieve it: - 2-servers only, multi-master, ... ? - optimistic replication ? - since most meta-data do not (yet ?) have some kind of revision control, how to reconciliate when concurrent changes happen ? - actually, should data be allowed to diverge naturally ? that is, are users allowed to hit any master for the mailbox, or should they be directed to *the* master reference of the mailbox (other masters being hit only when the reference is down) - ... 2) Dedicated (list of) replication server(s) per mailbox For example there is the default server-wide replication configuration, but a new per-mailbox annotation could override that global setting. One of the advantages of this feature is to allow to put the load on more than one server (which is the replica in current situation) when a master is down. 3) Handling messages partition that actually do not need replication For example: - meta-data files are stored on local partitions, for faster access, and do need to be replicated - messages are stored on partitions shared with other masters and do not need to be replicated; think NFS access (or similar), with data already secured (e.g. RAID or proper filesystem replication) That's a peculiar scenario, and we know that feature may be a little tricky (even more if trying to manage it in a generic way, taking into account each kind of data/meta-data). But maybe it would benefit to people other than us ? In our case such a scenario would actually make sense, notably considering the number of mailboxes (tens of millions). Different people have different needs for their architecture, and the more possibilities offered to cyrus users, the better. But not everything can be implemented, and we guess choices will have to be made. Comments and inputs are welcomed :) Regards Julien Coloos
Coding standards for 32/64-bits data
Hi there, I am currently wondering how to properly handle (ex-32/)64-bits data in cyrus code. Since this may be useful for other developpers willing to contribute to cyrus, I am asking on the mailing list instead of IRC channel (that I have yet to join actually). In previous versions of cyrus, some data could be 32 or 64-bits depending on the architecture. For example the quota value (and the associated printf format) was 64-bits if HAVE_LONG_LONG_INT was defined, 32-bits otherwise. To limit the impact on source code, typedef and #define was used: #ifdef HAVE_LONG_LONG_INT typedef unsigned long long int uquota_t; typedef long long int quota_t; #define UQUOTA_T_FMT %llu #define QUOTA_T_FMT %lld #define QUOTA_REPORT_FMT %8llu #else typedef unsigned long uquota_t; typedef long quota_t; #define UQUOTA_T_FMT %lu #define QUOTA_T_FMT %ld #define QUOTA_REPORT_FMT %8lu #endif I used the same scheme for ticket 3374 (selection of most fitting partition/backend) where I had to handle the partition/backend disk size. But it appears a few months ago 64-bits support became mandatory in master, and in the case of quota only the typedef and #define of the 64-bits section were kept. Leaving aside legacy data, the question is how to handle new ones (like the feature I worked on) ? Should we: - keep on using typedef and #define, even if we now only use 64-bits types: in that case it is easier to change the data type if needed (Who will ever need more than 640kB memory ? ;p) or - just use plain (unsigned) long long / (u)int64_t types and associated printf formats Regards
Re: Cyrus 2.3.16 \Seen flag problem
Le 21/04/2011 20:40, Bron Gondwana a écrit : Unfortunately, moving to 2.4.x version is not yet an option for us :( Why not? So ... is there some cyrus guru expert out there with enough knowledge of that part of the code to fix it in the 2.3.x version ? Yes - I already did once... and I'd rather not touch it with a bargepole again! Seriously, there's all sorts of issues with that bit of the code in 2.3.x, and this is just one of them. I'd be interested in knowing your reasons for not upgrading to 2.4, because I'd prefer to fix those than ever look at the 2.3 seen management code again! Bron. First, thanks for your quick reply. Actually we do have quite a few (dozens) patches of our own applied to the cyrus codebase, and are currently on the 2.3.16 version. I say patches because not all of them are quite clean. In the past we already gave back some, so that you could merge them to the codebase if you deemed them interesting. On the long run, we try to check/clean our other patches and decide whether they could be of any interest to be given back too; knowing that many of them are pretty specific to our customers requests, and are of absolutely no interest to the community (some of them should actually be burnt on the public place - the patches I mean ;) - ... but the customer is King and we do not always have the last word). Did I already say we have quite a few patches ? Some of them do alter a lot of the original codebase. So it takes us some time to move to newer versions of cyrus, especially when the codebase is heavily rewritten - be it for the better :) - as it was for 2.4.x. Actually we are currently considering whether the next codebase on which to work will be 2.4.x or (next to-be) 2.5.x. And that's it for our story. We thus tried our luck on the mailing list to see if someone with its bio-hazard suit on would dare to fix the bug in the maintained 2.3.x branch. Regards Julien PS: maybe you remember having received an email from one of us some months ago (see attachment) ? It gave some more details on a few patches we do use on production platforms. ---BeginMessage--- Dear All, We are an European hosting company and we deploy Cyrus for several years with success. Our main clients are major European ISPs, we host 40 million mailboxes on several platforms, 25 million mailboxes on the biggest. We made some code rewrite for better scalability/performance or new features, asked by our clients. Unfortunately, several developments are not RFC compliant and we believe are not interesting to reverse :-( So, we are very happy with Cyrus and we want to contribute more. (we contributed to issue 3187 , 3188 , 3191, 3214, 3215 , 3216 , 3227 , 3238, 3146, 3152, 3066, 3246, 2913, 2926 and the pop3 optimizer added in 2.3.16). Here some features that we can share (if you think they have value). First we plan to merge our code with the next Cyrus 2.4 and we will spend some time to have a clean code (unfortunately we often had to compromise between time limit and good code design) : - split header/body from a given mail size : headers on high-end storage, big body on lowcost storage (emails more that 1Mb are 10% of total emails and 80% of our global storage). It is also possible to compress the body part to save space. It is possible to choose among 3 backends to store the body : filesystem, webdav (not really tested, just for fun) and Grid Storage (our proprietary storage solution). However we think that a better way would be to split at attachment level instead of body level, it could be possible to save base64 overhead, permit de-duping and do data compression (i.e on PDF or Office). May be it would be nice to add an http/webdav daemon to Cyrus to give direct access to the attachments... - RFC 5423 - Internet Message Store Events is mostly implemented to send external notifications. We add an activemq plugin to notifyd to send JMS notifications. - We just finalize and deploy full-text search in Cyrus like dovecot implementation (it breaks IMAP SEARCH). We write a Solr client that not just indexes html/plain text, but also PDF and Office attachments - murderless Cyrus proxy based on a specific mysql cyrusdb backend - a dynamic mail delivery feature. Pop3d and Imapd processes query a daemon after login (1 time per mailbox per day) to get the number of current marketing emails to deliver. If any, the process sleeps a few ms to let the external daemon deliver the emails. Thus, when a campaign is out of date we don't deliver any email to the mailbox (save space and to pollute user's mailbox). - gracefully shutdown (SIGUSR1). The master stop to listen but don't break current connections. We can also start a new master (i.e. to reload the configuration) A better way would be to spread SIGHUP to children All these features are today on production. Sincerly, Sébastien -Message d'origine- De :
Cyrus 2.3.16 \Seen flag problem
Hi, We encountered a bug in cyrus 2.3.16: mails freshly delivered by LMTP were already flagged \Seen while not having been read by anyone. After some investigations, we tracked it down to the code in charge of updating the seen db: function index_checkseen in imap/index.c file. That complicated code seems quite prone to issues considering that 5 sets of data are handled: - the 'old' seen db, listing UIDs with its initial \Seen state - an array in which \Seen state after actions in the current session are stored - the 'new' seen db, loaded right before updating its content (to merge any concurrent sessions changes) - the index file data held in static variables of the index.c code section - lets call it index1 - the mailbox structure, with a possibly newer version of the index file data - lets call it index2 The steps to trigger the issue: 1. The INBOX is opened 2. Mails not already flagged \Seen are read (and so the \Seen flag is set); alternatively mails are not read but the \Seen flag is set on them 3. One or more mails are delivered to the mailbox through LMTP 4. The INBOX is closed (with implicit expunge) Note: using the 'UID' version of the commands (FETCH/STORE) lowers the probability of encountering the issue because index1 is refreshed for those commands. Why those steps trigger the issue ? Upon closing the INBOX folder, an implicit expunge is performed; a side effect is that index2 is refreshed. With the other conditions met: - all known mails (not taking into account the ones delivered by LMTP, which are not visible in index1) now have the \Seen flag - the 'old' and 'new' seen db do not differ the current code updates the seen db believing that UIDs 1 up to mailbox-last_uid have the \Seen flag. The problem is that mailbox was refeshed and so all mails delivered by LMTP are now flagged \Seen. Following the source code with those conditions we would get the following values: - newseen = 0 - newnext = mailbox-last_uid + 1 - neweof = 1 - start = 1 - inrange = 1 - usecomma = 0 - uid = mailbox-last_uid So we would end in the following part of the code: if (inrange) { /* Last message read. */ ... if (!start uid 1) start = 1; if (usecomma++) *save++ = ','; if (start start != uid) { sprintf(save, %u:, start); save += strlen(save); } sprintf(save, %u, uid); save += strlen(save); ... which does output the range [1,mailbox-last_uid] in the seen db. It may be possible that some other set of conditions would trigger a similar issue. We tried to think of a clean way to solve it, but provided the complexity of that code we gave up (mainly fearing nasty side effetcs and introducing more bugs). Maybe that's the reason it was completely rewritten in cyrus 2.4.x (which does not have that bug) ? ;) Unfortunately, moving to 2.4.x version is not yet an option for us :( So ... is there some cyrus guru expert out there with enough knowledge of that part of the code to fix it in the 2.3.x version ? Regards
Re: Selection of most fitting partition/backend upon account creation
On Thu, Dec 16, 2010 at 12:01 AM, Bron Gondwana br...@fastmail.fm wrote: On Wed, Dec 15, 2010 at 05:39:32PM +0100, Julien Coloos wrote: We are not using git yet, but it seems interesting enough to start doing so from now on. We will be looking into it, so it shall take some more time before we can get back to you with a patch. No worries. This looks like something that would be good to get into 2.5 when we release it. I'm thinking the timeframe is 1-2 months, depending what else is ready :) Bugzilla ticket #3374 (http://bugzilla.cyrusimap.org/show_bug.cgi?id=3374) has been created and updated with necessary information. If there is anything missing, just ask. Regards
Re: Selection of most fitting partition/backend upon account creation
On Thu, Dec 16, 2010 at 9:43 AM, Michael Menge michael.me...@zdv.uni-tuebingen.de wrote: Quoting Bron Gondwana br...@fastmail.fm: On Wed, Dec 15, 2010 at 03:31:00PM +0100, Michael Menge wrote: I would like to see this included in the official cyrus. It would be nice if the space reserved by quota could be taken into account. Oooh... interesting. Yes, that does make sense. I don't know about you, but we over-supply quota by heaps. Yes, i think this is common, so the not reserved disk space can get negative. And the code must be able to hanlde this, and the case where the not reserved disk space is 0. We have a balancing task that watches for servers filling up and moves users automagically behind the scenes. We use sync_move, which I'm planning to write into Cyrus for the future as well - where it uses the replication engine rather than the evil XFER :) Much saner, and it means you can sync everything FIRST and then just replicate the diff once you actually shut down the mailbox, so the downtime is much shorter. Very nice You are talking about the quota associated to the mailboxes already present on the partition/server, right ? On our platforms we also give more quota to all users than what we could physically store. Sometimes not all users have the same quota limit: either because some subscribed to a specific service, or because some backends are shared between clients. To precisely determine how much space would be left if all users would fill their mailbox, we would need to retrieve the quota usage of all those mailboxes. Unfortunately we still have legacy quota db (that is: one file per mailbox) on most platforms ... and migrating to other db formats is not always possible since some clients are very picky about the actions we do on 'their' platform ;) Considering some backends do host hundreds of thousands of mailboxes, determining the quota usage of all users would be quite time consuming for us :) Thus we currently still rely on many tools to watch backends usage and perform account moving between backends. For the time being we will focus on the code we currently have. But there could be another selection mode that may make sense: if we consider that backends tend to be homogeneous (as far as quota usage is concerned), then comparing the average available space per mailbox on each partition could also fit some needs. Though that would still require to determine the number of mailboxes on the concerned partition. Regards
Re: Selection of most fitting partition/backend upon account creation
On Wed, Dec 15, 2010 at 2:19 PM, Bron Gondwana br...@fastmail.fm wrote: Yes, it sounds excellent. We use git for version control now - so you could easily fork a copy and put it somewhere (github for example) with your patch, and just point us to the branch :) Or put the diff in bugzilla - whatever suits. We are not using git yet, but it seems interesting enough to start doing so from now on. We will be looking into it, so it shall take some more time before we can get back to you with a patch. One question: do you attempt to keep mailboxes for the same user together on the same partition/same server? While same partition isn't really necessary (just somewhat desirable) - same server is pretty useful. Otherwise things like seen files become a mess (less so in 2.4.x, but still) Indeed, keeping the user mailboxes on the same partition/server seems more logical. Only new user account is concerned by this behaviour. When a new user subfolder is created, its parent server/partition is still retrieved and used unless explicitely given as parameter upon creation. Regards