Re: fixing unixheirsep for 2.5

2013-02-17 Thread Julien Coloos

Le 15/02/2013 07:10, Bron Gondwana a écrit :

Thanks for this!

commit c95da833a910150bb8d55585d221237cf1b673d9
Author: Julien Coloos 
Date:   Mon Feb 11 15:36:05 2013 +0100

 Various fixes for userid with dots and unixhierarchysep
 Use correct (not internal) userid in ACLs when creating root level folder.
 Fix ACLs setting when userid contains dots.
 In pop3d, as for imapd, correctly handle userid with dots.


Fixing the various unixheirsep and altnamespace cases to work correctly has 
been on my todo list for far too long.

Anyone who's following along with 2.5 development and uses either of those two 
set to other than the default - I'd love your test results and bug reports.  I 
really want this rock solid by release.

(anyone who can write test cases would be greatly appreciated too!)

Thanks,

Bron.


Here is a test case for the issues I fixed:
Repository: git://github.com/worldline-messaging/cassandane.git
Branch: ticket/3373
Commit: e866d3f134ed07663d2742113964ed91b80be831 
(https://github.com/worldline-messaging/cassandane/commit/e866d3f134ed07663d2742113964ed91b80be831)



Notice that there actually still exist incoherences regarding how userid 
is handled.
In most - if not all - places, the current master code deals with the 
internal representation of the userid, and thus in some places (usually 
when dealing with ACLs, as with my fix) there are conversions to its 
external form.


The incoherences arise when the external userid contains the character 
'^', which is the value used to represent dots in internal form when 
unix hierarchy separator is used.
For example both user ids "my.id" and "my^id" have the same mailbox 
(internal name: "user.my^id"). Thus if you set ACLs for "my^id" on 
mailbox (external name) "user/my.id", user "my^id" would be able to 
login in this mailbox.
In places where internal form is converted back to external, other kind 
of issues can happen: "my^id" is converted back to "my.id" which is not 
the actual external user id.


I believe that to fix those last issues, either '^' shall become 
forbidden with unix hierarchy separator enabled, or the code shall be 
refactored to receive (a structure knowing) both external and internal 
form of the user id.




Selection of most fitting partition/backend upon account creation

2010-12-15 Thread Julien Coloos
Hi,

In latest cyrus versions (beginning with 2.3.x branch), cyrus is
capable of selecting the most fitting partition when creating a new
account. To do so, the default partition has to be left unspecified,
and in this case cyrus selects the partition with the most free space.
Similarly, when creating an account from a murder proxy, it is
possible to let cyrus select the backend with the most free space.

Due to some of our clients needs, we had to rework those features to
handle other modes of selection (see below). We currently have code
for 2.3.16 version.
If you are interested, we could create a patch out of it, adapted for
2.4.5 version, and add an entry for it on bugzilla.


Here are some details on the changes we made:

Partitions/backend are managed using a new structure which contains
all valuable data. Those data are retrieved the first time it is
needed, and then cached (configuration allows to refresh data).

New configuration options manage partition/backend selection:
   - partition_mode: how most fitting partition is selected; value is one of
  - random: (pseudo-)random selection
  - freespace-most: partition with the most free space (KiB)
 -> same as current cyrus behaviour
  - freespace-percent-most: partition with the most free space (%)
  - freespace-percent-weighted: each partition is weighted
according to its free space (%)
 -> the more free space the partition has, the more chances it
has to be selected
  - freespace-percent-weighted-delta: each partition is weighted
according to its difference of free space (%) compared to the most
used partition
 -> the more the partition is lagging behind the most used
partition, the more chances it has to be selected
 -> actually the code is made so that even the most used
partition has a few chances to be selected, and those chances increase
when other partitions get closer
   - partition_mode_exclude: list of partitions to exclude from selection mode
   - partition_mode_weighted_usage_limit: limit of partition usage (%)
  -> if a partition is over that limit, it is automatically
excluded from selection mode
  -> if all partitions are over that limit, this feature is not used anymore
   - partition_mode_usage_reinit: for a given session, number of
"operations" (e.g. partition selection) for which partitions usage
data are cached
  -> useful for clients that massively create new mailboxes using
the same session

   - serverlist_mode: same as partition_mode, but used on proxy for
selecting most fitting backend
  - random: (pseudo-)random
  - freespace-most: backend with the most (total) free space (KiB)
 -> same as current cyrus behaviour
  - freespace-percent-most: backend whose partition has the most
free space (%)
 -> and not the backend with the most (total) free space (%);
the goal is to compare the most fitting partition of each backend
  - freespace-percent-weighted: same as for partition selection,
comparing the free space (%) of the least used partition of each
backend
 -> again the goal is to compare the most fitting partition of
each backend
  - freespace-percent-weighted-delta: same as for partition
selection, comparing the free space (%) of the least used partition of
each backend
 -> again the goal is to compare the most fitting partition of
each backend
   - serverlist_mode_weighted_usage_limit: same as
partition_mode_weighted_usage_limit
   - serverlist_mode_usage_reinit: same as partition_mode_usage_reinit

Necessary data are fetched from backend through new annotations
(vendor name was replaced by yours):
   - "/vendor/cmu/cyrus-imapd/freespace/total": returns the (sum of)
available and total space of configured partitions, in the form
"available;total"
   - "/vendor/cmu/cyrus-imapd/freespace/percent/most": returns the
available and total space of the partition with the most free space
(%), in the form "available;total"

For the proxy, the function find_free_server was modified to select
the most fitting backend according to the configured mode, thanks to
the two new annotations.

For the backend, the function find_free_partition was replaced to
select the most fitting partition according to the configured mode.


We refactored our code recently, and if necessary the options can be
renamed (as well as the different modes or annotations): better use a
consistent naming scheme if you decide to integrate it in cyrus :)
So, let us know if you are interested in that code.

Regards


Re: Selection of most fitting partition/backend upon account creation

2010-12-15 Thread Julien Coloos
On Wed, Dec 15, 2010 at 2:19 PM, Bron Gondwana  wrote:
> Yes, it sounds excellent.  We use git for version control now - so you
> could easily fork a copy and put it somewhere (github for example) with
> your patch, and just point us to the branch :)  Or put the diff in
> bugzilla - whatever suits.

We are not using git yet, but it seems interesting enough to start
doing so from now on. We will be looking into it, so it shall take
some more time before we can get back to you with a patch.

> One question: do you attempt to keep mailboxes for the same user
> together on the same partition/same server?  While same partition
> isn't really necessary (just somewhat desirable) - same server
> is pretty useful.  Otherwise things like seen files become a mess
> (less so in 2.4.x, but still)

Indeed, keeping the user mailboxes on the same partition/server seems
more logical.
Only new user account is concerned by this behaviour. When a new user
subfolder is created, its parent server/partition is still retrieved
and used unless explicitely given as parameter upon creation.



Regards


Re: Selection of most fitting partition/backend upon account creation

2010-12-16 Thread Julien Coloos
On Thu, Dec 16, 2010 at 9:43 AM, Michael Menge
 wrote:
> Quoting Bron Gondwana :
>
>> On Wed, Dec 15, 2010 at 03:31:00PM +0100, Michael Menge wrote:
>>>
>>> I would like to see this included in the official cyrus.
>>>
>>> It would be nice if the space reserved by quota could be
>>> taken into account.
>>
>> Oooh... interesting.  Yes, that does make sense.  I don't
>> know about you, but we over-supply quota by heaps.
>
> Yes, i think this is common, so the "not reserved" disk
> space can get negative. And the code must be able to hanlde
> this, and the case where the "not reserved" disk space is 0.
>
>> We
>> have a "balancing task" that watches for servers filling
>> up and moves users automagically behind the scenes.
>>
>> We use sync_move, which I'm planning to write into Cyrus
>> for the future as well - where it uses the replication
>> engine rather than the evil XFER :)  Much saner, and it
>> means you can sync everything FIRST and then just
>> replicate the diff once you actually shut down the
>> mailbox, so the downtime is much shorter.
>>
>
> Very nice
>

You are talking about the quota associated to the mailboxes already
present on the partition/server, right ?

On our platforms we also give more quota to all users than what we
could physically store. Sometimes not all users have the same quota
limit: either because some subscribed to a specific service, or
because some backends are shared between clients.
To precisely determine how much space would be left if all users would
fill their mailbox, we would need to retrieve the quota usage of all
those mailboxes. Unfortunately we still have legacy quota db (that is:
one file per mailbox) on most platforms ... and migrating to other db
formats is not always possible since some clients are very picky about
the actions we do on 'their' platform ;)
Considering some backends do host hundreds of thousands of mailboxes,
determining the quota usage of all users would be quite time consuming
for us :)

Thus we currently still rely on many tools to watch backends usage and
perform account moving between backends.

For the time being we will focus on the code we currently have.
But there could be another selection mode that may make sense: if we
consider that backends tend to be homogeneous (as far as quota usage
is concerned), then comparing the average available space per mailbox
on each partition could also fit some needs. Though that would still
require to determine the number of mailboxes on the concerned
partition.


Regards


Re: Selection of most fitting partition/backend upon account creation

2010-12-21 Thread Julien Coloos
On Thu, Dec 16, 2010 at 12:01 AM, Bron Gondwana  wrote:
> On Wed, Dec 15, 2010 at 05:39:32PM +0100, Julien Coloos wrote:
>> We are not using git yet, but it seems interesting enough to start
>> doing so from now on. We will be looking into it, so it shall take
>> some more time before we can get back to you with a patch.
>
> No worries.  This looks like something that would be good to get into 2.5
> when we release it.  I'm thinking the timeframe is 1-2 months, depending
> what else is ready :)

Bugzilla ticket #3374
(http://bugzilla.cyrusimap.org/show_bug.cgi?id=3374) has been created
and updated with necessary information.
If there is anything missing, just ask.


Regards


Cyrus 2.3.16 \Seen flag problem

2011-04-21 Thread Julien Coloos
Hi,

We encountered a bug in cyrus 2.3.16: mails freshly delivered by LMTP
were already flagged \Seen while not having been read by anyone.
After some investigations, we tracked it down to the code in charge of
updating the seen db: function index_checkseen in imap/index.c file.
That complicated code seems quite prone to issues considering that 5
sets of data are handled:
 - the 'old' seen db, listing UIDs with its initial \Seen state
 - an array in which \Seen state after actions in the current session are stored
 - the 'new' seen db, loaded right before updating its content (to
merge any concurrent sessions changes)
 - the index file data held in static variables of the index.c code section
   -> lets call it index1
 - the mailbox structure, with a possibly newer version of the index file data
   -> lets call it index2

The steps to trigger the issue:
1. The INBOX is opened
2. Mails not already flagged \Seen are read (and so the \Seen flag is
set); alternatively mails are not read but the \Seen flag is set on
them
3. One or more mails are delivered to the mailbox through LMTP
4. The INBOX is closed (with implicit expunge)
Note: using the 'UID' version of the commands (FETCH/STORE) lowers the
probability of encountering the issue because index1 is refreshed for
those commands.

Why those steps trigger the issue ?
Upon closing the INBOX folder, an implicit expunge is performed; a
side effect is that index2 is refreshed. With the other conditions
met:
 - all known mails (not taking into account the ones delivered by
LMTP, which are not visible in index1) now have the \Seen flag
 - the 'old' and 'new' seen db do not differ
the current code updates the seen db believing that UIDs 1 up to
mailbox->last_uid have the \Seen flag. The problem is that mailbox was
refeshed and so all mails delivered by LMTP are now flagged \Seen.

Following the source code with those conditions we would get the
following values:
 - newseen = 0
 - newnext = mailbox->last_uid + 1
 - neweof = 1
 - start = 1
 - inrange = 1
 - usecomma = 0
 - uid = mailbox->last_uid

So we would end in the following part of the code:

if (inrange) {
/* Last message read. */
...
if (!start && uid > 1) start = 1;
if (usecomma++) *save++ = ',';
if (start && start != uid) {
sprintf(save, "%u:", start);
save += strlen(save);
}
sprintf(save, "%u", uid);
save += strlen(save);
...

which does output the range [1,mailbox->last_uid] in the seen db.


It may be possible that some other set of conditions would trigger a
similar issue.
We tried to think of a clean way to solve it, but provided the
complexity of that code we gave up (mainly fearing nasty side effetcs
and introducing more bugs).
Maybe that's the reason it was completely rewritten in cyrus 2.4.x
(which does not have that bug) ? ;)
Unfortunately, moving to 2.4.x version is not yet an option for us :(

So ... is there some cyrus guru expert out there with enough knowledge
of that part of the code to fix it in the 2.3.x version ?


Regards


Re: Cyrus 2.3.16 \Seen flag problem

2011-04-22 Thread Julien Coloos

Le 21/04/2011 20:40, Bron Gondwana a écrit :



Unfortunately, moving to 2.4.x version is not yet an option for us :(

Why not?


So ... is there some cyrus guru expert out there with enough knowledge
of that part of the code to fix it in the 2.3.x version ?

Yes - I already did once... and I'd rather not touch it with a
bargepole again!  Seriously, there's all sorts of issues with
that bit of the code in 2.3.x, and this is just one of them.

I'd be interested in knowing your reasons for not upgrading to
2.4, because I'd prefer to fix those than ever look at the
2.3 seen management code again!

Bron.


First, thanks for your quick reply.

Actually we do have quite a few (dozens) "patches" of our own applied to 
the cyrus codebase, and are currently on the 2.3.16 version.
I say "patches" because not all of them are quite "clean". In the past 
we already gave back some, so that you could merge them to the codebase 
if you deemed them interesting.
On the long run, we try to check/clean our other patches and decide 
whether they could be of any interest to be given back too; knowing that 
many of them are pretty specific to our customers requests, and are of 
absolutely no interest to the community (some of them should actually be 
burnt on the public place - the patches I mean ;) - ... but "the 
customer is King" and we do not always have the last word).


Did I already say we have quite a few patches ? Some of them do alter a 
lot of the original codebase. So it takes us some time to move to newer 
versions of cyrus, especially when the codebase is heavily rewritten - 
be it for the better :) - as it was for 2.4.x.
Actually we are currently considering whether the next codebase on which 
to work will be 2.4.x or (next to-be) 2.5.x.


And that's it for our story.

We thus tried our luck on the mailing list to see if someone with its 
bio-hazard suit on would dare to fix the bug in the maintained 2.3.x branch.



Regards
Julien

PS: maybe you remember having received an email from one of us some 
months ago (see attachment) ? It gave some more details on a few patches 
we do use on production platforms.


--- Begin Message ---
Dear All,

We are an European hosting company and we deploy Cyrus for several years with 
success.
Our main clients are major European ISPs, we host 40 million mailboxes on 
several platforms, 25 million mailboxes on the biggest.


We made some code rewrite for better scalability/performance or new features, 
asked by our clients. Unfortunately, several developments are not RFC compliant 
and we believe are not interesting to reverse :-(
So, we are very happy with Cyrus and we want to contribute more. (we 
contributed to issue 3187 , 3188 , 3191, 3214, 3215 , 3216 , 3227 , 3238, 3146, 
3152, 3066, 3246, 2913, 2926 and the pop3 optimizer added in 2.3.16).


Here some features that we can share (if you think they have value). First we 
plan to merge our code with the next Cyrus 2.4 and we will spend some time to 
have a clean code (unfortunately we often had to compromise between time limit 
and good code design) :

 - split header/body from a given mail size : headers on high-end storage, big 
body on lowcost storage (emails more that 1Mb are 10% of total emails and 80% 
of our global storage). It is also possible to compress the body part to save 
space. It is possible to choose among 3 backends to store the body : 
filesystem, webdav (not really tested, just for fun) and Grid Storage (our 
proprietary storage solution).
However we think that a better way would be to split at attachment level 
instead of body level, it could be possible to save base64 overhead, permit 
de-duping and do data compression (i.e on PDF or Office). May be it would be 
nice to add an http/webdav daemon to Cyrus to give direct access to the 
attachments...

 - RFC 5423 - Internet Message Store Events is mostly implemented to send 
external notifications. We add an activemq plugin to notifyd to send JMS 
notifications.

 - We just finalize and deploy full-text search in Cyrus like dovecot 
implementation (it breaks IMAP SEARCH). We write a Solr client that not just 
indexes html/plain text, but also PDF and Office attachments

 - murderless Cyrus proxy based on a specific mysql cyrusdb backend

 - a dynamic mail delivery feature.
Pop3d and Imapd processes query a daemon after login (1 time per mailbox per 
day) to get the number of current marketing emails to deliver. If any, the 
process sleeps a few ms to let the external daemon deliver the emails. Thus, 
when a campaign is out of date we don't deliver any email to the mailbox (save 
space and to pollute user's mailbox).

 - gracefully shutdown (SIGUSR1). The master stop to listen but don't break 
current connections. We can also start a new master (i.e. to reload the 
configuration)
A better way would be to spread SIGHUP to children

All these features are today on production.


Sincerly,
Sébastien

-Message d'origine-
De : info-cyrus-bounces+sebastie

Coding standards for 32/64-bits data

2011-06-29 Thread Julien Coloos

Hi there,

I am currently wondering how to properly handle (ex-32/)64-bits data in 
cyrus code. Since this may be useful for other developpers willing to 
contribute to cyrus, I am asking on the mailing list instead of IRC 
channel (that I have yet to join actually).


In previous versions of cyrus, some data could be 32 or 64-bits 
depending on the architecture. For example the quota value (and the 
associated printf format) was 64-bits if HAVE_LONG_LONG_INT was defined, 
32-bits otherwise. To limit the impact on source code, typedef and 
#define was used:

#ifdef HAVE_LONG_LONG_INT
typedef unsigned long long int uquota_t;
typedef long long int quota_t;
#define UQUOTA_T_FMT "%llu"
#define QUOTA_T_FMT  "%lld"
#define QUOTA_REPORT_FMT "%8llu"
#else
typedef unsigned long uquota_t;
typedef long quota_t;
#define UQUOTA_T_FMT "%lu"
#define QUOTA_T_FMT  "%ld"
#define QUOTA_REPORT_FMT "%8lu"
#endif

I used the same scheme for ticket 3374 (selection of most fitting 
partition/backend) where I had to handle the partition/backend disk size.
But it appears a few months ago 64-bits support became mandatory in 
master, and in the case of quota only the typedef and #define of the 
64-bits section were kept.
Leaving aside legacy data, the question is how to handle new ones (like 
the feature I worked on) ? Should we:
 - keep on using typedef and #define, even if we now only use 64-bits 
types: in that case it is easier to change the data type if needed (Who 
will ever need more than 640kB memory ? ;p)

or
 - just use plain "(unsigned) long long / (u)int64_t" types and 
associated printf formats



Regards



Re: Coding standards for 32/64-bits data

2011-07-06 Thread Julien Coloos

Le 29/06/2011 15:32, Bron Gondwana a écrit :

On Wed, Jun 29, 2011 at 02:21:17PM +0200, Julien Coloos wrote:

Hi there,
...
Leaving aside legacy data, the question is how to handle new ones
(like the feature I worked on) ? Should we:
  - keep on using typedef and #define, even if we now only use
64-bits types: in that case it is easier to change the data type if
needed (Who will ever need more than 640kB memory ? ;p)
or
  - just use plain "(unsigned) long long / (u)int64_t" types and
associated printf formats

I vote for using (u)int64_t and associated formats everywhere.
And stripping all the other stuff.   I did an informal poll and
discovered that approximately nobody compiles anywhere that
doesn't offer a longlong type.  Even on 32 bit platforms.

The future is now people.


In that case, should there be some kind of fallback for people using 
compilers that do not support C99 ?
Because while there should not be much issues concerning the existence 
of (u)intXX_t types, the macro format specifiers usually defined in 
inttypes.h (e.g. PRId64 and PRIu64 for signed/unsigned 64-bits number) 
may not be present.


From time to time I like to do such changes, leaving aside "legacy" 
stuff. But people that are not up-to-date with standards may not like it 
as much as I do ;)



Regards


Re: Coding standards for 32/64-bits data

2011-07-08 Thread Julien Coloos

Le 07/07/2011 08:08, Bron Gondwana a écrit :

I've also seen it suggested that _every_ non-base-integer type _always_
be cast to [u]intmax_t and printed using PRI[ud]MAX:

printf("%" PRIuMAX, (uintmax_t) ival);
printf("%" PRIdMAX, (intmax_t) ival); /* for signed values */

The extra heavy lifting by printf() is trivial compared to the
programmer overhead of trying to define and use the right PRI* macros.

Makes sense.


Seems like a good compromise. It's also future-proof if the variable to 
print changes from 32-bits to 64-bits etc.
For the time being I used that approach on the few concerned lines in 
the feature I worked on. Should not be hard to change later if a better 
solution is found.


Future plans for replication

2011-08-12 Thread Julien Coloos

Hi there,

Are there any known plans for the future of replication inside cyrus ?
There are a few things we dream of having, and that we consider 
implementing. But if some of those do overlap with cyrus roadmap, maybe 
we could help you by sharing the work ?


In no particular order:

1) Master-master replication
There are many difficulties to overcome for this feature, as well as 
many ways to achieve it:

  - 2-servers only, multi-master, ... ?
  - optimistic replication ?
  - since most meta-data do not (yet ?) have some kind of revision 
control, how to reconciliate when concurrent changes happen ?
-> actually, should data be allowed to diverge naturally ? that is, 
are users allowed to hit any master for the mailbox, or should they be 
directed to *the* master reference of the mailbox (other masters being 
hit only when the reference is down)

  - ...


2) Dedicated (list of) replication server(s) per mailbox
For example there is the default server-wide replication configuration, 
but a new per-mailbox annotation could override that global setting.
One of the advantages of this feature is to allow to put the load on 
more than one server (which is the replica in current situation) when a 
master is down.



3) Handling messages partition that actually do not need replication
For example:
  - meta-data files are stored on local partitions, for faster access, 
and do need to be replicated
  - messages are stored on partitions shared with other masters and do 
not need to be replicated; think NFS access (or similar), with data 
already secured (e.g. RAID or proper filesystem replication)
That's a peculiar scenario, and we know that feature may be a little 
tricky (even more if trying to manage it in a generic way, taking into 
account each kind of data/meta-data). But maybe it would benefit to 
people other than us ?
In our case such a scenario would actually make sense, notably 
considering the number of mailboxes (tens of millions).
Different people have different needs for their architecture, and the 
more possibilities offered to cyrus users, the better. But not 
everything can be implemented, and we guess choices will have to be made.



Comments and inputs are welcomed :)


Regards
Julien Coloos



Re: Future plans for replication

2011-08-12 Thread Julien Coloos

Le 12/08/2011 13:37, Bron Gondwana a écrit :
On Fri, 12 Aug 2011 11:16:30 +0200, Julien Coloos 
 wrote:



1) Master-master replication
There are many difficulties to overcome for this feature, as well as 
many ways to achieve it:

   - 2-servers only, multi-master, ... ?


N servers - mesh, chain, whatever.  Ideally any layout will work.

My plan (which is partially done) is named channels for sync - basically
the name of the server you're connecting to.  When you make a change,
it injects a record into the sync log for each channel you have 
configured.

The sync_client for that channel then runs a reconciliation of every
listed item.



Nice, no wonder I though that this channel stuff I saw in the source 
code could be used to separate logs for different servers, since that 
was your intent :)
And proceeding that way allows to create mesh/chain topologies depending 
on the configuration.



If sync_server makes a change, is should suppress the sync_log for the
channel on which the change came in - so it doesn't try to replicate
back.  But it doesn't hurt too badly, because the data will be in sync
anyway, in theory...



Yeah, thinking about it it's not easy to prevent unnecessary 
reconciliations when topologies get complex. Fortunately propagation 
between servers quickly ends once they see they are in sync. And people 
who do care could use a chain topology to limit this.



   - optimistic replication ?


There are some things that I want to add to this, in particular a
"replication cache" which remembers the state of the mailbox at
last sync, allowing a sync_client to optimistically generate a
"change request" to the sync_server, listing all the changes that
would bring the target mailbox up-to-date, and also listing the
current state it EXPECTS on the server.  That way we can check
up-front if anything on the replica side has already changed, and
fall back to a full replication run.

The "full replication run" would first reconcile everything past the
last known "in sync" modseq, and only if that failed would it
send the entire mailbox contents.



Would that cache be per-server, or shared on the platform (kinda like 
the MUPDATE server) ?


   - since most meta-data do not (yet ?) have some kind of revision 
control, how to reconciliate when concurrent changes happen ?


Add revision control to everything.  In particular:

a) modseq to all annotations
b) keep DELETED entries in mailboxes.db with the UIDVALIDITY of
   the deleted mailbox (also on moves)
c) version numbers and "deleted markers" for all sieve scripts



Fair enough.
b is indeed useful to determine whether the mailbox was created on 
replica while master was down (needs to be copied back), or if it was 
deleted on master and was not yet pushed to replica.
A little bit curious about those "deleted markers" :) What do you have 
in mind ?


 -> actually, should data be allowed to diverge naturally ? that 
is, are users allowed to hit any master for the mailbox, or should 
they be directed to *the* master reference of the mailbox (other 
masters being hit only when the reference is down)

   - ...


They should be directed to the primary copy - IMAP and eventual
consistency are not friends.  You can make it work with UID promotion
logic, but flag merging will always be an inexact science unless we
keep MODSEQ and LASTMODIFIED separately for every single flag, and
the UID promotion still sucks when it happens to you.



I also think that this would be easier to handle than trying to manage 
concurrent modifications on all replicas as the nominal case (thinking 
about it already give me headaches).
Guess MODSEQ and LASTMODIFIED for every flag would be a little overkill 
to achieve better merging.


Talking about UID promotion (great idea by the way :)), we wonder if 
there could be some gain using a centralised service (ala MUPDATE; hence 
my previous question about the "replication cache") which would be used 
to store and give the next UID of mailboxes. This could prevent UIDs 
collisions, replacing UID promotion need in general, and should have 
less impact on IMAP clients with agressive caching.



2) Dedicated (list of) replication server(s) per mailbox
For example there is the default server-wide replication 
configuration, but a new per-mailbox annotation could override that 
global setting.
One of the advantages of this feature is to allow to put the load on 
more than one server (which is the replica in current situation) when 
a master is down.


I want to do this in mailboxes.db.  It's a lot of work, but extending the
murder syntax to allow multiple "locations" to be specified for a mailbox
including listing the preferred primary would allow all this to fit in
naturally with murder.  And it would ROCK for XFER, because you would 
just

add the mailbox to a new server, replicate it, set the other end as the
preferr

MESSAGE quota resource implemention

2011-08-31 Thread Julien Coloos

Hi,

As discussed on IRC last week, I worked on implementing MESSAGE quota 
resource in cyrus (see RFC 2087; STORAGE resource already being handled).
I created a branch based on Greg's 'annotate' one on github, since his 
work on annotation storage management made mine a lot easier.


Details on the changes I made:

cyrus-imapd:
  - added MESSAGE quota resource management
-> updated cunit test
-> added 'quotawarnmsg' option which behaviour is similar to 
'quotawarnkb'; warning message shown when selecting folder in IMAP tells 
which resource limit was triggered
-> added 'autocreatequotamsg' option, to set MESSAGE limit 
(unlimited by default) when user auto-creates its mailbox

  - xfer transfers all non-unlimited resources
  - added helper function to compute annotation storage usage for a 
given mailbox

  - changed the way quota entries are read/written
-> resources presence in fetched entry is remembered
-> when writing entry, only present resources are written
-> when setting resources limits, entries which were not present 
upon fetching are now marked as present and their current usage is computed
  - quota utility lists and computes (-f) all resources associated to 
mailbox
The branch 'quotamessage/gnb/annotate' is available here: 
git://github.com/worldline-messaging/cyrus-imapd.git. It is based on 
Greg's 'annotate' branch on github.


cassandane:
  - added tests for MESSAGE resource
  - modified current tests to play a bit with subfolders and setting 
quota after adding messages/annotations
  - special test to check new resource usage is computed when necessary 
for pre-existing mailboxes (which only have usage for STORAGE resource)
The branch 'quotamessage/gnb/master' is available here: 
git://github.com/worldline-messaging/cassandane.git. It is based on 
Greg's 'master' branch on github.



Things that may be worth noting:
  - DUMP/UNDUMP currently does nothing special about MESSAGE or 
X-ANNOTATION-STORAGE quota resources

-> should it be transferred ?
-> without breaking backward compatibility, limits could only be 
transferred through a 'fake' file entry, as for annotations
  - quota usage is currently stored in a uquota_t variable, and delta 
is computed as quota_t; so theorically there could be overflow issues if 
quota usage to add/substract cannot be held in a quota_t; in practice it 
should be unlikely since that would mean a usage of over 2^63-1



Why the change concerning quota entries read/write ?
The thing is that I wanted to make version upgrading as painless as 
possible, both for users and in the code.
With previous code, when quota entry exists and is read, missing 
resources are initialised with default values (0 usage, unlimited). Thus 
only usage delta is tracked, and actual usage computing would only 
happen if quota entry was missing: this is not nice when upgrading, 
since lots of mailboxes are likely to have an entry with only the 
STORAGE resource present.
So actual usage has to be computed at some point for newly handled 
resources. The idea here is to compute it when setting the resource for 
the first time. To do that it was necessary to know when the resource 
was not previously present and keep it that way until its limit is 
finally set for the first time.
This scheme has another advantage: for platforms where only STORAGE 
quota is used, quota entries size remains as it is now. Only people 
using those new quota resources will have their quota entries grow to 
store the new data.


Comments are welcomed :)


I think that there are two things that may also be done concerning quota 
entries:
  - always recompute resource usage when changing its limit from 
unlimited to bounded
-> currently it is only done once when setting the usage limit for 
the first time
-> that way, it may not be necessary to track resource presence 
when reading/writing quota entries
-> but maybe it could be too time consuming in some cases, since it 
seems to be possible to associate a quota resource to a whole domain 
(recomputing usage for all mailboxes would then take some time)

  - do not write resource in quota entry when its usage is unlimited
-> except for the STORAGE resource, for backward compatibility
-> would also help keeping quota entries size to the bare minimum

What do you think ?


PS: should I open a new bugzilla ticket for that ?

PPS:
cunit: on my computer cunit tests succeed except the 'foreach' one in 
the 'quota' suite which timeouts (it seems messing with 4096 quotalegacy 
is too much for my computer).
cassandane: a few quota tests I added fail due to some of the issues I 
reported (#5327 and #5329). And poor lemmings (well thought out name in 
this context :)) are dying hopelessly and endlessly in the 
Cassandane::Cyrus::Master test, preventing it to complete. Don't know if 
it is normal.




Re: MESSAGE quota resource implemention

2011-09-01 Thread Julien Coloos
e only to trash it at the end (when overquota). So actually I wonder 
if the code could be updated to check (LITERAL parameter) that each 
message about to be received would fit in the mailbox, instead of 
checking only once at the end ?



imap/mailbox.c

I'm not sure it's a good idea to set mailbox->i.exists = 0 in 
mailbox_delete().  Perhaps it would be better to check for 
(mailbox->i.options & OPT_MAILBOX_DELETED) when sampling 
mailbox->i.exists in mailbox_commit_quota().
Hmm, I wondered about it too. At the end of mailbox_delete, upon calling 
mailbox_close, all physical files in the mailbox are deleted, and it 
didn't seem like 'exists' was used anywhere in-between, so I thought it 
would be alright.
But given the complexity of the code at that point (lot of cleaning when 
deleting the mailbox), and unforeseen future usages, I guess you are 
right. Actually maybe even quota_mailbox_use shall be treated that way ?


I'm thinking that there's now three places in the code which take a 
mailbox* and fill out an array of quota diffs, interpreting the 
contents of the struct mailbox.  That should really be centralised.

I'm not sure to see what you have in mind here.
Are you talking about the places where the QUOTA_STORAGE and 
QUOTA_MESSAGE entries of the quota diff array are computed relatively to 
the 'quota_mailbox_used' and 'exists' index fields of the struct mailbox 
? If so I guess some of the code could indeed be centralised.



Regards
Julien Coloos


Re: MESSAGE quota resource implemention

2011-09-01 Thread Julien Coloos

Le 01/09/2011 14:22, Bron Gondwana a écrit :


A provisioning system could run quota -f itself after making the
change, of course.
Sure. But since the quota is being changed, clients would wonder why 
they need to call another quota utility to finish the job.
Plus I wanted to have something similar to how it works when the quota 
entry is first created (which previously only managed one resource): 
people didn't need to call the quota utility after creating the 
quotaroot, so if it's doable for other resources it's better.



But I agree that in case of underflow detection throwing a warning
in syslog might help draw the attention when logs are analysed.

"when".  haha.

(maybe at a few sites that care... but for the vast majority of
  sites, if you're depending on them reading syslog you've already
  lost.  Software that understands that and is robust in the face
  of errors is much nicer for the poor suckers on the receiving end
  of all this)


:D
It's sometimes hard to find a good compromise between "we have good 
faith current data are correct, so for non-vital ones we perform minimal 
checks and provide tools to repair punctually" and "we are paranoid and 
check/repair each and every data all the time" ;)
But I am a bit biased here due to past experiences with non-robust/buggy 
code that tries to correct itself and often ends up in endless iterations...




FYI - I'm planning to be a bit more lazy about mailbox deletes
at some point.  It would be super-cool to not even move the files
until someone trys to create a new mailbox with the same name,
otherwise just clean them up in the regular course of cyr_expire.

Need to think about that some more though.



Actually, really I'd like to create a new UNIQUEID - and store
all the files in paths based on uniqueid rather than on folder
name.  This would not only mean renames could be fast and
atomic, but that delayed delete would be fast.  The downside is
a more opaque filesystem layout.  Oh, another upside - file path
limitations don't exist so much any more.

Nice :)



One place please :)  Ideally I'd like to absorb more of the quota
stuff into mailbox.c.  Greg and I have some debate about this - how
much is too much for that file to be doing.  Probably it should be
abstracted into a couple of layers of stuff - but I really do like
the consistency of having just a couple of function calls:

mailbox_append_index_record; and
mailbox_update_index_record

which do all the consistency checking and counter updating inside.
Plus of course a mailbox_check_quota thing that takes a set of
quota checks to do and sees if there will be space for the
planned changes!

Makes sense.


Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-05 Thread Julien Coloos

Le 05/09/2011 06:12, Greg Banks a écrit :


Julien, I think we agreed on everything else, right?  I'm looking
forward to your next iteration.
After picking your 'uquota_t removal' commit, I also removed it on my 
end, and changed the code according to our previous discussions. Adding 
an helper function which fills a quota_t array with the current messages 
usage from mailbox struct made the code a bit cleaner and more generic :)
I still have to look for transferring limits upon DUMP/UNDUMP. And for 
the time being, I still kept the quota.sets[] thing (we will see later 
if we can do it more smartly).



Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-05 Thread Julien Coloos

Le 05/09/2011 12:06, Bron Gondwana a écrit :

On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote:


  - add a 32b mailbox index header entry to track the storage in bytes
used by all annotations for the mailbox itself or for messages in the
mailbox

Why not 64b?  Admittedly hanging gigabytes of annotations off a single
folders is probably evil, but this is the mailbox header - 4 bytes per
mailbox to not even have to think about it is super-cheap.  We keep a
handful of "blanks" around in the header already.

...


  - in mailbox_commit_quota(), if the field is not "unknown", then
calculate the delta in usage and apply to the quota db.


This new field seems useful for the quota utility and when setting a new 
quotaroot: it wouldn't need to check each annotation associated to the 
mailbox (or messages) and would rely on that field, as it currently does 
for total message size and now the number of messages (with my patch).
I don't know what other people do with cyrus, but here those two 
situations do not happen that often:

  - quotaroot is usually set once upon creating the user mailbox
  - quota utility is rarely used, and usually on one mailbox at a time
So maybe I am missing something, but my question would be: is it really 
worth it ?



Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-06 Thread Julien Coloos

Le 06/09/2011 10:23, Greg Banks a écrit :


So I think we'll need another round, sorry :(  Given that the
annotations quota is broken and I'll be reimplementing it anyway, you
may as well ignore QUOTA_ANNOTSTORAGE in all commits, and leave out the
function annotatemore_computequota() for now.  We'll use something very
much like it for reconstruct later.  I'm hoping to be able to pull your
next round of changes into my annotate branch before I reimplement the
annotation quota in the next few days.

If the new code is capable of returning actual resource usages upon 
getquota (I think Bron wanted it that way), then I guess I can drop some 
more of the functions I added (annotatemore_computequota, but also 
mboxlist_updatequota and associated code including the quota.sets[]; 
then maybe I could also change the code as proposed earlier and prevent 
writing of resource limit in quota entry if it is QUOTA_UNLIMITED). Will 
look at it tomorrow.


Worked on DUMP/UNDUMP today: 
https://github.com/worldline-messaging/cyrus-imapd/commit/45148b20a4f2343d72aa7436e7255a92508d7bf8
I used an IMAP list to send data, as for annotations. For future usages, 
maybe the UNDUMP code could try to skip IMAP lists if it finds any 
instead of the expected file content LITERAL ? (to prevent breaking 
compatibility too many times, in case other things would need to be 
transmitted this way through DUMP/UNDUMP).


By the way, I noticed that UNDUMPing annotations fails for now (function 
annotate_state_write uses 'int_mboxname' in the annotation state struct, 
but it is NULL in this context).




I'm still not convinced we'll need quota.sets[], but I'll play along.

Thanks again for your work, and sorry that my annotate branch wasn't
quite as stable a base as you first thought :)

No problem, I was somehow prepared to that kind of scenario :)


Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-06 Thread Julien Coloos

Le 06/09/2011 19:48, Julien Coloos a écrit :

Le 06/09/2011 10:23, Greg Banks a écrit :


So I think we'll need another round, sorry :(  Given that the
annotations quota is broken and I'll be reimplementing it anyway, you
may as well ignore QUOTA_ANNOTSTORAGE in all commits, and leave out the
function annotatemore_computequota() for now.  We'll use something very
much like it for reconstruct later.  I'm hoping to be able to pull your
next round of changes into my annotate branch before I reimplement the
annotation quota in the next few days.

If the new code is capable of returning actual resource usages upon 
getquota (I think Bron wanted it that way), then I guess I can drop 
some more of the functions I added (annotatemore_computequota, but 
also mboxlist_updatequota and associated code including the 
quota.sets[]; then maybe I could also change the code as proposed 
earlier and prevent writing of resource limit in quota entry if it is 
QUOTA_UNLIMITED). Will look at it tomorrow.

When I said 'actual', I meant based on the upcoming mailbox index field.
But wait, I got a bit confused with the last point about playing along 
with quota.sets[]. If what you discussed with Bron is that, in the end, 
usage (re)computing will only be done with the quota utility (no more 
automatic computing upon setquota, neither for getquota), then after 
upgrading people shall call 'quota -f' once and for all and quota.sets[] 
must disappear - and all resources must be written in the quota entry -, 
otherwise users would need to call 'quota -f' on a given quotaroot each 
time they set a new quota resource limit to a mailbox.



Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-07 Thread Julien Coloos

Le 06/09/2011 10:23, Greg Banks a écrit :


a) my commit "Make quota -f less racy" is going to cause lots of clashes
(sorry!)

b) Bron and I both think that your commit "Compute each quota resource
upon setting it for the first time." is unnecessary, given that

   i) quota -f doesn't suck now, and

   ii) soon, all of the quota-able quantities will be tracked in fields
in the index header.

So I think we'll need another round, sorry :(  Given that the
annotations quota is broken and I'll be reimplementing it anyway, you
may as well ignore QUOTA_ANNOTSTORAGE in all commits, and leave out the
function annotatemore_computequota() for now.  We'll use something very
much like it for reconstruct later.  I'm hoping to be able to pull your
next round of changes into my annotate branch before I reimplement the
annotation quota in the next few days.

...

I'm still not convinced we'll need quota.sets[], but I'll play along.

Thanks again for your work, and sorry that my annotate branch wasn't
quite as stable a base as you first thought :)
So, I saved my current branch to 'quotamessage-0/gnb/annotate' and 
rebased my patches on current 'annotate' branch (with less racy 'quota -f').
I removed everything related to recomputing from my patches (as well as 
quota.sets[]).


What is missing now is the new index field, which value will be used in 
mailbox_get_usage function. Since my changes do rely on this function, 
and sometimes computes a delta compared to a previous call of that 
function, it may not need to be updated afterwards ... I hope.
Then maybe some of the cassandane tests I pushed on our repository would 
need to be refreshed (at least the one that checks what happens for 
legacy mailboxes on which we add one of the newly handled quota resources).



Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-12 Thread Julien Coloos

Le 09/09/2011 14:18, Greg Banks a écrit :

Ok, here you go. Not completely tested yet, so caveat emptor.


Had to change two things:
 - mailbox_quota_check now expects a quota diff array which is good :), 
but change is now really applied if all diffs are '> 0' (instead of '>= 
0' previously); in some cases '0' is used to check that mailbox is not 
currently overquota (e.g. in LMTP service), so either we should go back 
to testing '>= 0', or change callers that did rely on that test
 - changing the value of annotations quota storage in the index 
(mailbox_use_annot_quota function) do dirty the quota, but not the 
index; the new value is thus not committed (at least when setting 
mailbox annotations), and the quota entry becomes false when later 
deleting the mailbox


If needed, you can look in our 'cyrus-imapd' repository at the 
'gnb/annotate/fixes' branch. There is one commit for those matters.



Remaining things concerning annotations:
 - when deleting messages, annotations length is not substracted; the 
solution may not be that simple, since I believe users are allowed to 
unexpunge mails: so since index entry is still here - until real 
unlinking -, annotations may have to stay there too - until unlinking too
 - for 'old' mailboxes (those created before the annotation storage 
usage field in the index header), current annotations usage shall be 
computed (and added to the quota entry) upon upgrading; this way users 
won't have to run 'quota -f' for all quotaroots after switching to this 
new version ;)



Then maybe some of the cassandane tests I pushed on our repository would
need to be refreshed (at least the one that checks what happens for
legacy mailboxes on which we add one of the newly handled quota resources).

That's next week's focus I think.
Actually I just gave up the 'old' test: there is no easy way to simulate 
upgrading mailbox index, or at least I don't feel confident enough to 
make it in cassandane :(
Other tests do work. Once the annotations usage is subtracted upon 
messages deletion (see before), all tests shall pass :)


I rebased the 'quotamessage/gnb/annotate' branch of our 'cassandane' 
repository today, leaving that test aside.



Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-14 Thread Julien Coloos

Le 12/09/2011 23:09, Bron Gondwana a écrit :



  - for 'old' mailboxes (those created before the annotation storage
usage field in the index header), current annotations usage shall be
computed (and added to the quota entry) upon upgrading; this way
users won't have to run 'quota -f' for all quotaroots after
switching to this new version ;)

Definitely.  Upgrading usually handles things like that.  It's
the right way[tm].

Pushed a few lines of code in the 'upgrade_index' function. Though it 
still lacks the annotation storage usage computing.
It's in the branch 'gnb/annotate/fixes'; commit 
ed61f721487804b205e399c538fc35ddc153bd93.



Actually I just gave up the 'old' test: there is no easy way to
simulate upgrading mailbox index, or at least I don't feel confident
enough to make it in cassandane :(

Easiest way is the same way I did for the broken quotas test.
Have a tar file with the contents of the "old mailbox" which
you unpack onto the filesystem, and then open the mailbox and
check that the "upgraded" fields are what you would expect.

I also did something similar for the "crash on thread" test,
where there were 5 messages which were known to be able to
crash the THREAD command.  I unpacked the folder contents
with those messages in a test.

Nice idea :)
Pushed a few things.
Added an option hash so that when running commands:
  - test can run system commands (based on the current code which would 
run cyrus commands inside the current cassandane instance directory)

  - finer I/O redirections are possible
  - working directory can be specified
While adding those new 'features', I still kept the 'run_utility' method 
(cyrus commands) and simply added the 'run_command' method for other 
commands.


Then, as you suggested on IRC, I added an 'unpack' method to extract 
tar/gz/bz2 (and combinations) files using system commands.


And I added back my quota upgrade test from a cyrus v2.4(.11) mailbox, 
using two tar.gz files containing a mailbox content and its quota file.


It's in the 'quotamessage/gnb/annotate' branch; commits 
d2bf4e4f42f7f8c9b53713441116ddbea5b0a265, 
86a52daeed5a03f078b88de67d3d10b51a7f8cc4 and 
fb827e8fc77529a5e23465d2c19d6e88adf7cae8.



Maybe you won't keep everything I pushed, but I hope some parts will be 
helpful :)


Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-16 Thread Julien Coloos

As discussed here and on IRC, I rebased my commits with all the changes:
  - the 'utility' methods have been promoted to 'command' ones, which 
are now generic and may (options hash) concern cyrus binaries
  - the 'start_command_bg' method is now replaced by a 'background' 
option in 'run_command'

  - dropped the 'mode' option, leaving only the 'redirects' one
  - moved 'unpack' method from Cassandane::Unit::TestCase to 
Cassandane::Instance, leaving the tar utility determine itself the kind 
of the tar file
-> by default destination is the current cassandane instance base 
directory; alternatively one can specify a relative path - 
'path/to/file' - to this directory, or an absolute path  - '/path/to/file'

  - use the pre-existing 'user.cassandane' mailbox for quota upgrade tests
  - fixed a few minor things (like avoiding short-life zombies by 
harvesting some cassandane dead children)

  - refactored a few more stuff
  - removed stuff that became unnecessary

Hoping this will work out this time :)


Regards
Julien


Re: Graceful restart also for imapd.conf

2011-10-06 Thread Julien Coloos
As proposed by Olivier, there is the 'sighup' branch available on 
git://github.com/worldline-messaging/cyrus-imapd.git. It is based on 
current official cyrus 'master' branch.


Details about the commits:

20a7e4b32c01e7c8dd7aeb8eae41d35e4c630369 - Recycle running services upon 
SIGHUP on master.
Currently when SIGHUP is sent to the master process, only new and legacy 
services are being specifically processed:

  - new ones are being started
  - legacy ones are stopped by transmitting SIGHUP to the concerned 
children (to stop them once the client - if connected - leaves)
What we propose is to extend the use of SIGHUP to also recycle remaining 
services so that any change to imapd.conf is taken into account as fast 
as possible.



5b55aab917d6686c3ba74dc8abad1a3eb906cd12 - Send message to master when 
service is exiting.

This commit is useful to make recycling smoother. Commit log:

  Master process is notified about children exiting by a SIGCHLD 
signal, which are taken into account but processed separately (not right 
when receiving the signal).
  The master process uses a 'select' call to wait for messages from its 
children, and manage other events.
  Signal handlers in the master process are set with the SA_RESTART 
flag. But the POSIX specifications let the implementation decide whether 
'select' restarts or returns EINTR in this case. Thus it may take some 
time ('select' timeout) before the master process actually reap children 
(and fork new ones when necessary). It can also happen due to race 
conditions.


  Provided that letting children send back a message when they are 
exiting is not too much time consuming (for both child and master), 
removing the static MESSAGE_MASTER_ON_EXIT configuration variable is 
actually useful for smoother janitoring, and faster services recycling 
when sending SIGHUP to master process.


I actually wonder why 'MESSAGE_MASTER_ON_EXIT' was used and set to 0 up 
to now. Compared to what services and master are already doing, I 
believe that it should not consume a significant amount of resources to 
send this 'exit' message. But maybe someone can prove me wrong here ?



4578a0268a1e1c2c1f0b33d68e547fc864dcca4b - Remain root on master process.
Currently master process is usually started as root to later become cyrus.
With SIGHUP it reloads its configuration and can start newly added 
services. Sometimes this is however problematic because non privileged 
users (as cyrus) are not allowed to bind ports below 1024: this results 
in the added service being unabled to start.
To be fair I don't know if there are - for all platforms targeted by 
cyrus - ways to circumvent this security limitation. At least I tried 
setcap'ing the exe on Ubuntu, but it did not work as expected ...
Now, as far as security is concerned it is best to not be root. But 
after thinking about it, two things make us think that maybe the master 
process could stay root:
 - it does not discuss with external clients as IMAP/POP/etc services 
do; so no remote exploits, right ?
 - many daemon services do run as root by default - ok, ok, that does 
not mean it's the right thing to do :p
Or maybe it could be root by default, and the configuration would allow 
to use a given user/group, as some other daemon services do.

What do you think ?


As usual, comments are welcomed :)


Now, a side note about something I observed while messing with signals 
in cyrus: from time to time I did get deadlocked processes upon SIGTERM; 
for example, when sending it to the master process right after starting 
it (yeah I know, who in their sane mind would want to do that ? - except 
me).


Stacktraces of deadlocked processes:
#0  0xf430 in __kernel_vsyscall ()
#1  0xf7388753 in __lll_lock_wait_private () at 
../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:95

#2  0xf731bbac in _L_lock_10489 () from /lib/i386-linux-gnu/libc.so.6
#3  0xf731a553 in __libc_realloc (oldmem=0x97929e0, bytes=62) at 
malloc.c:3813

#4  0xf7309cea in _IO_mem_finish (fp=0x9792868, dummy=0) at memstream.c:132
#5  0xf7305949 in _IO_new_fclose (fp=0x9792868) at iofclose.c:66
#6  0xf73764bc in __vsyslog_chk (pri=, flag=1, 
fmt=0x805482e "exiting on SIGTERM/SIGINT", ap=0xffb7f43c " o+\367\023\t")

at ../misc/syslog.c:228
#7  0xf7376896 in __syslog_chk (pri=6, flag=1, fmt=0x805482e "exiting on 
SIGTERM/SIGINT") at ../misc/syslog.c:131

#8  0x08049b7f in syslog (sig=15) at /usr/include/bits/syslog.h:32
#9  sigterm_handler (sig=15) at master.c:1067
#10 
#11 0xf430 in __kernel_vsyscall ()
#12 0xf73430d7 in __libc_fork () at 
../nptl/sysdeps/unix/sysv/linux/i386/../fork.c:130

#13 0x0804afb5 in spawn_service (si=2) at master.c:633
#14 0x0804cce4 in main (argc=10, argv=0xffb80db4) at master.c:2069

#0  0xf430 in __kernel_vsyscall ()
#1  0xf7388753 in __lll_lock_wait_private () at 
../nptl/sysdeps/unix/sysv/linux/i386/i686/../i486/lowlevellock.S:95

... (same backtrace)
#13 0x0804afb5 in spawn_service (si=3) at master.c:633

Re: Issue with Emails being changed from UNREAD to READ with no explanation

2011-10-18 Thread Julien Coloos

Le 18/10/2011 12:53, Smith, Michael a écrit :


Hello,

We (Pegasystems) are working with a customer who are using Cyrus 
2.3.13 in production.


*Process*

·The application that we have built uses a listener process that polls 
an email account every 60 minutes.


·All "UNREAD" emails are processed when the email listener starts

·All "UNREAD" emails are then deleted when the listener process 
disconnects


·Any new emails received during the time of the email listener is 
processing emails are somehow marked as "READ" when they have not been 
read or processed


·For all subsequent instantiations of the email listener process the 
"READ" emails are not processed


*Investigations undertaken*

·All debug code from the application has revealed that there are no 
java errors from the application or from the JavaMail API


·No Pegasystems code or command is being issued that can change the 
status of the email from "UNREAD" to "READ


Are there any existing issues/bugs that match the symptoms detailed 
above in Cyrus post 2.3.13?


Cheers,

Michael

**

*Michael Smith | Technical Solutions Director | Pegasystems Ltd*

Office: 01189 651627 | Mobile: 07966 768244 | E-Mail: 
michael.sm...@pega.com  | www.pega.com 





Looks similar to an issue discussed some time ago: 
http://asg.andrew.cmu.edu/archive/message.php?mailbox=archive.cyrus-devel&msg=2961
Bug happens in a complex part of the code - that no-one dared to 
approach. It was fixed in version 2.4 as part of a heavy refactoring.



Regards


Re: Cyrus SEEN and uidl patch

2012-04-01 Thread Julien Coloos

Le 01/04/2012 14:45, Bron Gondwana a écrit :

On Sun, Apr 01, 2012 at 11:08:36AM +0100, Jeroen van Meeuwen (Kolab Systems) 
wrote:

I'm running an ISP environment with Cyrus 2.3.18.

Given that Cyrus IMAP 2.3 is... well... rather old, it is maintained
by only a small group of people having volunteered to do so. The
main focus for the project is Cyrus IMAP 2.4, and future releases
currently in development, so a question that pops up in my mind is -
can this issue be reproduced in Cyrus IMAP 2.4.14, or even GIT
master (future 2.5)?

No to the main issue, because per-user seen is stored in the mailbox,
so these seen strings don't exist.

And I'm fairly sure the replication protocol won't mind arbitrarily
long SEEN strings either, because it uses a different method.

But - I have no objection to fixing the issue in 2.3.x.

Bron.


I guess the issue is that the original patch replaces prot_printf 
implementation to rely on vsnprintf with a fixed-size (2048 bytes) buffer.
So if any part of cyrus has to write a string (%s) which size is over 
2048 characters (actually less than that, since there is the nul-byte 
ending and possibly extra data to take into account in the format), the 
output is truncated.
This is to compare to the original code of prot_printf which sends the 
string as-is when if parses '%s' in the format.


The revised version of the patch is not really ideal either, since it 
introduces a second prot_printf function only to take care of the root 
UIDL issue related to Outlook.
The current code do already use snprintf to format some data (integers 
etc), so I would say that extending it to recognize formats such as 
'%#.#d' (that is integer and the likes with minimum field width and/or 
precision) could be a solution to have the Outlook UIDL patch use the 
prot_printf function without breaking anything else. But of course it's 
easier said than done ;)


Julien