MESSAGE quota resource implemention

2011-08-31 Thread Julien Coloos

Hi,

As discussed on IRC last week, I worked on implementing MESSAGE quota 
resource in cyrus (see RFC 2087; STORAGE resource already being handled).
I created a branch based on Greg's 'annotate' one on github, since his 
work on annotation storage management made mine a lot easier.


Details on the changes I made:

cyrus-imapd:
  - added MESSAGE quota resource management
-> updated cunit test
-> added 'quotawarnmsg' option which behaviour is similar to 
'quotawarnkb'; warning message shown when selecting folder in IMAP tells 
which resource limit was triggered
-> added 'autocreatequotamsg' option, to set MESSAGE limit 
(unlimited by default) when user auto-creates its mailbox

  - xfer transfers all non-unlimited resources
  - added helper function to compute annotation storage usage for a 
given mailbox

  - changed the way quota entries are read/written
-> resources presence in fetched entry is remembered
-> when writing entry, only present resources are written
-> when setting resources limits, entries which were not present 
upon fetching are now marked as present and their current usage is computed
  - quota utility lists and computes (-f) all resources associated to 
mailbox
The branch 'quotamessage/gnb/annotate' is available here: 
git://github.com/worldline-messaging/cyrus-imapd.git. It is based on 
Greg's 'annotate' branch on github.


cassandane:
  - added tests for MESSAGE resource
  - modified current tests to play a bit with subfolders and setting 
quota after adding messages/annotations
  - special test to check new resource usage is computed when necessary 
for pre-existing mailboxes (which only have usage for STORAGE resource)
The branch 'quotamessage/gnb/master' is available here: 
git://github.com/worldline-messaging/cassandane.git. It is based on 
Greg's 'master' branch on github.



Things that may be worth noting:
  - DUMP/UNDUMP currently does nothing special about MESSAGE or 
X-ANNOTATION-STORAGE quota resources

-> should it be transferred ?
-> without breaking backward compatibility, limits could only be 
transferred through a 'fake' file entry, as for annotations
  - quota usage is currently stored in a uquota_t variable, and delta 
is computed as quota_t; so theorically there could be overflow issues if 
quota usage to add/substract cannot be held in a quota_t; in practice it 
should be unlikely since that would mean a usage of over 2^63-1



Why the change concerning quota entries read/write ?
The thing is that I wanted to make version upgrading as painless as 
possible, both for users and in the code.
With previous code, when quota entry exists and is read, missing 
resources are initialised with default values (0 usage, unlimited). Thus 
only usage delta is tracked, and actual usage computing would only 
happen if quota entry was missing: this is not nice when upgrading, 
since lots of mailboxes are likely to have an entry with only the 
STORAGE resource present.
So actual usage has to be computed at some point for newly handled 
resources. The idea here is to compute it when setting the resource for 
the first time. To do that it was necessary to know when the resource 
was not previously present and keep it that way until its limit is 
finally set for the first time.
This scheme has another advantage: for platforms where only STORAGE 
quota is used, quota entries size remains as it is now. Only people 
using those new quota resources will have their quota entries grow to 
store the new data.


Comments are welcomed :)


I think that there are two things that may also be done concerning quota 
entries:
  - always recompute resource usage when changing its limit from 
unlimited to bounded
-> currently it is only done once when setting the usage limit for 
the first time
-> that way, it may not be necessary to track resource presence 
when reading/writing quota entries
-> but maybe it could be too time consuming in some cases, since it 
seems to be possible to associate a quota resource to a whole domain 
(recomputing usage for all mailboxes would then take some time)

  - do not write resource in quota entry when its usage is unlimited
-> except for the STORAGE resource, for backward compatibility
-> would also help keeping quota entries size to the bare minimum

What do you think ?


PS: should I open a new bugzilla ticket for that ?

PPS:
cunit: on my computer cunit tests succeed except the 'foreach' one in 
the 'quota' suite which timeouts (it seems messing with 4096 quotalegacy 
is too much for my computer).
cassandane: a few quota tests I added fail due to some of the issues I 
reported (#5327 and #5329). And poor lemmings (well thought out name in 
this context :)) are dying hopelessly and endlessly in the 
Cassandane::Cyrus::Master test, preventing it to complete. Don't know if 
it is normal.




Re: MESSAGE quota resource implemention

2011-08-31 Thread Bron Gondwana
On Wed, Aug 31, 2011 at 05:50:36PM +0200, Julien Coloos wrote:
> Things that may be worth noting:
>   - DUMP/UNDUMP currently does nothing special about MESSAGE or
> X-ANNOTATION-STORAGE quota resources
> -> should it be transferred ?

I'd like to replace DUMP/UNDUMP with replication protocol
communications for XFER.

> -> without breaking backward compatibility, limits could only be
> transferred through a 'fake' file entry, as for annotations

But for now, that's definitely the pragmatic way to go.

>   - quota usage is currently stored in a uquota_t variable, and
> delta is computed as quota_t; so theorically there could be overflow
> issues if quota usage to add/substract cannot be held in a quota_t;
> in practice it should be unlikely since that would mean a usage of
> over 2^63-1

I propose we scrap uquota_t - it is un-necessary for the medium-term
future, now that we're requiring 64 bit types.

> I think that there are two things that may also be done concerning
> quota entries:
>   - always recompute resource usage when changing its limit from
> unlimited to bounded

That's not too expensive really - we do it in the "quota" tool anyway.

The bigger issue is locking.  You need to be very careful of deadlock
situations if you do this.  We have that issue in the quota tool now.

The only _real_ way to do it reliably would be to go through and
remove all the quota roots from each mailbox, then go through again
one at a time and add them back, recording usage as you go.  Otherwise
there's no way of knowing if your end result is consistent.

> -> that way, it may not be necessary to track resource presence
> when reading/writing quota entries
> -> but maybe it could be too time consuming in some cases, since
> it seems to be possible to associate a quota resource to a whole
> domain (recomputing usage for all mailboxes would then take some
> time)

Yeah, now this is what you could call all sorts of bad names.  It's
really not sane to keep a quota counter at EVERY possible level in
the tree just incase someone wants to hang a limit there later.
You would get insane amounts of contention on the domain quota
lock for a busy domain, which would be un-necessary.

> PS: should I open a new bugzilla ticket for that ?

Why not - numbers are cheap.

> PPS:
> cunit: on my computer cunit tests succeed except the 'foreach' one
> in the 'quota' suite which timeouts (it seems messing with 4096
> quotalegacy is too much for my computer).

There's stuff you can do with fd limits.  Check out /etc/pam.d/* for
pam_limits.so, and then /etc/security/limits.conf.  I just had to
learn about that stuff today!

Bron.


Re: MESSAGE quota resource implemention

2011-08-31 Thread Greg Banks

Julien Coloos wrote:

Hi,

As discussed on IRC last week, I worked on implementing MESSAGE quota 
resource in cyrus (see RFC 2087; STORAGE resource already being handled).
I created a branch based on Greg's 'annotate' one on github, since his 
work on annotation storage management made mine a lot easier.


Details on the changes I made:

cyrus-imapd:
  - added MESSAGE quota resource management
-> updated cunit test
-> added 'quotawarnmsg' option which behaviour is similar to 
'quotawarnkb'; warning message shown when selecting folder in IMAP 
tells which resource limit was triggered
-> added 'autocreatequotamsg' option, to set MESSAGE limit 
(unlimited by default) when user auto-creates its mailbox

  - xfer transfers all non-unlimited resources
  - added helper function to compute annotation storage usage for a 
given mailbox

  - changed the way quota entries are read/written
-> resources presence in fetched entry is remembered
-> when writing entry, only present resources are written
-> when setting resources limits, entries which were not present 
upon fetching are now marked as present and their current usage is 
computed
  - quota utility lists and computes (-f) all resources associated to 
mailbox
The branch 'quotamessage/gnb/annotate' is available here: 
git://github.com/worldline-messaging/cyrus-imapd.git. It is based on 
Greg's 'annotate' branch on github.


cassandane:
  - added tests for MESSAGE resource
  - modified current tests to play a bit with subfolders and setting 
quota after adding messages/annotations
  - special test to check new resource usage is computed when 
necessary for pre-existing mailboxes (which only have usage for 
STORAGE resource)
The branch 'quotamessage/gnb/master' is available here: 
git://github.com/worldline-messaging/cassandane.git. It is based on 
Greg's 'master' branch on github.




That all sounds promising and I look forward to reading it.


Things that may be worth noting:
  - DUMP/UNDUMP currently does nothing special about MESSAGE or 
X-ANNOTATION-STORAGE quota resources

-> should it be transferred ?


Yes.  I've been lazy updating the dump/undump code.
-> without breaking backward compatibility, limits could only be 
transferred through a 'fake' file entry, as for annotations
  - quota usage is currently stored in a uquota_t variable, and delta 
is computed as quota_t; so theorically there could be overflow issues 
if quota usage to add/substract cannot be held in a quota_t; in 
practice it should be unlikely since that would mean a usage of over 
2^63-1

That seems...unlikely.




Why the change concerning quota entries read/write ?
The thing is that I wanted to make version upgrading as painless as 
possible, both for users and in the code.
With previous code, when quota entry exists and is read, missing 
resources are initialised with default values (0 usage, unlimited). 
Thus only usage delta is tracked, and actual usage computing would 
only happen if quota entry was missing: this is not nice when 
upgrading, since lots of mailboxes are likely to have an entry with 
only the STORAGE resource present.
So actual usage has to be computed at some point for newly handled 
resources. The idea here is to compute it when setting the resource 
for the first time. To do that it was necessary to know when the 
resource was not previously present and keep it that way until its 
limit is finally set for the first time.
This scheme has another advantage: for platforms where only STORAGE 
quota is used, quota entries size remains as it is now. Only people 
using those new quota resources will have their quota entries grow to 
store the new data.


Comments are welcomed :)


I'll look at the code before commenting.



I think that there are two things that may also be done concerning 
quota entries:
  - always recompute resource usage when changing its limit from 
unlimited to bounded
-> currently it is only done once when setting the usage limit for 
the first time
-> that way, it may not be necessary to track resource presence 
when reading/writing quota entries
-> but maybe it could be too time consuming in some cases, since 
it seems to be possible to associate a quota resource to a whole 
domain (recomputing usage for all mailboxes would then take some time)

  - do not write resource in quota entry when its usage is unlimited
-> except for the STORAGE resource, for backward compatibility
-> would also help keeping quota entries size to the bare minimum

What do you think ?


This one's tough, I wasn't sure what to do.  However, I'm happy to leave 
it to the administrator to have to manually run quota -f (maybe twice!) 
if they set a quota on a resource that is already being used.  I'm 
unconvinced that automatically doing the equivalent of -f as a side 
effect of setting the first limit is necessary or wise.  Perhaps we 
should a) document it clearly and b) detect the situation and put an 
obvious message s

Re: MESSAGE quota resource implemention

2011-09-01 Thread Greg Banks

Julien Coloos wrote:

Hi,

As discussed on IRC last week, I worked on implementing MESSAGE quota 
resource in cyrus (see RFC 2087; STORAGE resource already being handled).
I created a branch based on Greg's 'annotate' one on github, since his 
work on annotation storage management made mine a lot easier.


Details on the changes I made:

cyrus-imapd:
  - added MESSAGE quota resource management
-> updated cunit test
-> added 'quotawarnmsg' option which behaviour is similar to 
'quotawarnkb'; warning message shown when selecting folder in IMAP 
tells which resource limit was triggered
-> added 'autocreatequotamsg' option, to set MESSAGE limit 
(unlimited by default) when user auto-creates its mailbox

  - xfer transfers all non-unlimited resources
  - added helper function to compute annotation storage usage for a 
given mailbox

  - changed the way quota entries are read/written
-> resources presence in fetched entry is remembered
-> when writing entry, only present resources are written
-> when setting resources limits, entries which were not present 
upon fetching are now marked as present and their current usage is 
computed
  - quota utility lists and computes (-f) all resources associated to 
mailbox
The branch 'quotamessage/gnb/annotate' is available here: 
git://github.com/worldline-messaging/cyrus-imapd.git. It is based on 
Greg's 'annotate' branch on github.




Reviewed:

commit 2d4fb0 "Added quota MESSAGE resource (RFC 2087) management."

imap/quota.h

Why #include ?  It's not like we're using any of the typedefs 
there.  There's a good argument to be made that we *should* do so, but 
until then the #include doesn't seem useful.


I'm unconvinced about the need for quota.sets[].  After a lot of reading 
it seems to me that (quota.sets[x]) and (quota.limits[x] != 
QUOTA_UNLIMITED) are identical or should be.  Also, are you setting 
sets[] anywhere except when loading from the db?


I like quota_update_useds(), it does seem necessary to be able to update 
several usages together.


Otherwise, looks good.

imap/quota_db.c

In quota_write(), your sanity check should goto the buf_free() at the 
end of the function rather than return early.  Otherwise, looks good.


imap/append.c

Looks good.

imap/append.h

Looks good

imap/cyr_virusscan.c

Looks good

imap/imap_err.et

Looks good

imap/imapd.c

In the 1st hunk in cmd_append(), at this point in the code I believe 
totalsize = 0, so you could easily pass 0 for the new last argument as well.


Otherwise looks good.

BTW, reading your code I realised a number of places I failed to account 
for X-ANNOTATION-STORAGE quota properly :(  I think I'll have to change 
a number of those new pairs of quota_t arguments to a quota_t 
quotadiff[QUOTA_NUMRESOURCES].


imap/index.c

Looks good

imapd/lmtpd.c

Looks good

imap/lmtpengine.c

Looks good

imap/lmtpengine.h

Looks good


imap/mailbox.c

I'm not sure it's a good idea to set mailbox->i.exists = 0 in 
mailbox_delete().  Perhaps it would be better to check for 
(mailbox->i.options & OPT_MAILBOX_DELETED) when sampling 
mailbox->i.exists in mailbox_commit_quota().


Otherwise, looks good.

imap/mailbox.h

Looks good

imap/mbdump.c

Looks good

imap/nntpd.c

Looks good

lib/imapoptions

Looks good.

commit 0e613c "Removed unused code."

Looks good.

commit c8aeef "Compute each quota resource upon setting it for the first 
time."


commit b30c10 "List and compute all set quota resources."

Oh, here's why you include stdint.h:  you're using PRIdMAX and not 
QUOTA_REPORT_FMT, which is used only in imapd.c.


I'm thinking that there's now three places in the code which take a 
mailbox* and fill out an array of quota diffs, interpreting the contents 
of the struct mailbox.  That should really be centralised.



--
Greg.



Re: MESSAGE quota resource implemention

2011-09-01 Thread Julien Coloos

Hi, thanks for your comments and reviewing the code

Le 01/09/2011 03:03, Greg Banks a écrit :
I think that there are two things that may also be done concerning 
quota entries:
  - always recompute resource usage when changing its limit from 
unlimited to bounded
-> currently it is only done once when setting the usage limit 
for the first time
-> that way, it may not be necessary to track resource presence 
when reading/writing quota entries
-> but maybe it could be too time consuming in some cases, since 
it seems to be possible to associate a quota resource to a whole 
domain (recomputing usage for all mailboxes would then take some time)

  - do not write resource in quota entry when its usage is unlimited
-> except for the STORAGE resource, for backward compatibility
-> would also help keeping quota entries size to the bare minimum

What do you think ?


This one's tough, I wasn't sure what to do.  However, I'm happy to 
leave it to the administrator to have to manually run quota -f (maybe 
twice!) if they set a quota on a resource that is already being used.  
I'm unconvinced that automatically doing the equivalent of -f as a 
side effect of setting the first limit is necessary or wise.  Perhaps 
we should a) document it clearly and b) detect the situation and put 
an obvious message saying something like "you may need to run quota -f 
..." where the human making the change will see it.  Also, such a 
message might be useful when usage underflow is detected.


My concern here is that there are times when quota changes are not done 
manually by a human, but happen automatically as part of platform 
provisioning scheme. For example, on many platforms we manage, if a user 
subscribes to some offer his/her mailbox quota will automatically be 
upgraded (IMAP action). So, at least in our case, I though it might be 
useful.


But I agree that in case of underflow detection throwing a warning in 
syslog might help draw the attention when logs are analysed.




Le 01/09/2011 10:15, Greg Banks a écrit :

commit 2d4fb0 "Added quota MESSAGE resource (RFC 2087) management."

imap/quota.h

Why #include ?  It's not like we're using any of the 
typedefs there.  There's a good argument to be made that we *should* 
do so, but until then the #include doesn't seem useful.


As you realised in the next commit, I removed QUOTA_REPORT_FMT 
(appearing only in quota.c) to use PRIdMAX/PRIuMAX and cast to 
(u)intmax_t upon formatting. Using those appeared as a future-proof 
solution in the discussion about 32/64-bit variables thread I started 
some time ago :)
So yeah, the include could have been done *only* in quota.c, but I 
remembered that Bron wanted to get rid of (u)quota_t later. So I decided 
to put the include in quota.h for later use.


I'm unconvinced about the need for quota.sets[].  After a lot of 
reading it seems to me that (quota.sets[x]) and (quota.limits[x] != 
QUOTA_UNLIMITED) are identical or should be.  Also, are you setting 
sets[] anywhere except when loading from the db?


These would be perfectly identical if implementing the last point I 
brought up (not writing the resource in the quota entry if it's usage is 
unlimited). But then I need to compute actual quota usage at the right 
time. And when I saw that the code would allow to associate a quota to a 
whole domain (ouch, performance issue foreseen if quota is to be 
computed too often), I decided on the solution I proposed: compute it 
only once the first time it is set, hence the need of quota.sets[] 
(since when changing from bounded to unlimited limit, the resource is 
still kept in the quota entry to prevent recomputing its usage next time).


Otherwise, sets[] is updated when setting quota limits (in 
mboxlist_setquotas).



imap/quota_db.c

In quota_write(), your sanity check should goto the buf_free() at the 
end of the function rather than return early.  Otherwise, looks good.



Ah, forgot that one.
Actually I took a shortcut here, since buf_free should not be necessary 
with the current code (which does append strings; so if the len is still 
0, no string was malloc'ed). But for sanity reasons you are right, it is 
wiser to goto the end of the function and do the expected cleanup stuff. 
This shall be future-proof.



imap/imapd.c

In the 1st hunk in cmd_append(), at this point in the code I believe 
totalsize = 0, so you could easily pass 0 for the new last argument as 
well.

Yes at this point totalsize is still 0.
The current code, which handles MULTIAPPEND, does a preliminary check to 
see if mailbox is not overquota, then receives all messages, and finally 
checks that all messages can fit in the mailbox.
Since we know for sure at least one message is coming, I though it could 
be a good idea to early-check that at least one more message would fit 
in the mailbox before receiving anything. Otherwise we are staging new 
file only to trash it at the end (when overquota). So actually I wonder 
if the code could be u

Re: MESSAGE quota resource implemention

2011-09-01 Thread Bron Gondwana
On Thu, Sep 01, 2011 at 01:27:00PM +0200, Julien Coloos wrote:
> Le 01/09/2011 03:03, Greg Banks a écrit :
> >This one's tough, I wasn't sure what to do.  However, I'm happy to
> >leave it to the administrator to have to manually run quota -f
> >(maybe twice!) if they set a quota on a resource that is already
> >being used.  I'm unconvinced that automatically doing the
> >equivalent of -f as a side effect of setting the first limit is
> >necessary or wise.  Perhaps we should a) document it clearly and
> >b) detect the situation and put an obvious message saying
> >something like "you may need to run quota -f ..." where the human
> >making the change will see it.  Also, such a message might be
> >useful when usage underflow is detected.
> >
> My concern here is that there are times when quota changes are not
> done manually by a human, but happen automatically as part of
> platform provisioning scheme. For example, on many platforms we
> manage, if a user subscribes to some offer his/her mailbox quota
> will automatically be upgraded (IMAP action). So, at least in our
> case, I though it might be useful.

A provisioning system could run quota -f itself after making the
change, of course.

But more generally, the "update the quotaroot" is atomic-safe,
because the mailbox doesn't add/remove things from the quotaroot
racily - but quota -f IS racy.  The only way around that would
be a dual pass mark and collect thing, where you marked each
mailbox as "we're re-calculating, so don't update the quota usage"
in the first sweep, then came back and removed the mark as you
read the value.  Tricky, but doable.  The downside is that a
crash part way through can leave you in a broken state.  But
that's true of everything.

> But I agree that in case of underflow detection throwing a warning
> in syslog might help draw the attention when logs are analysed.

"when".  haha.

(maybe at a few sites that care... but for the vast majority of
 sites, if you're depending on them reading syslog you've already
 lost.  Software that understands that and is robust in the face
 of errors is much nicer for the poor suckers on the receiving end
 of all this)

> Le 01/09/2011 10:15, Greg Banks a écrit :
> >commit 2d4fb0 "Added quota MESSAGE resource (RFC 2087) management."
> >
> >imap/quota.h
> >
> >Why #include ?  It's not like we're using any of the
> >typedefs there.  There's a good argument to be made that we
> >*should* do so, but until then the #include doesn't seem useful.
> >
> As you realised in the next commit, I removed QUOTA_REPORT_FMT
> (appearing only in quota.c) to use PRIdMAX/PRIuMAX and cast to
> (u)intmax_t upon formatting. Using those appeared as a future-proof
> solution in the discussion about 32/64-bit variables thread I
> started some time ago :)
> So yeah, the include could have been done *only* in quota.c, but I
> remembered that Bron wanted to get rid of (u)quota_t later. So I
> decided to put the include in quota.h for later use.

I don't mind keeping quota_t around.  Using it makes it a bit
clearer that you intend to store a quota in this variable.  It's
like free documentation.  Just uquota_t is pointless in a 64bit
required world.

> >In the 1st hunk in cmd_append(), at this point in the code I
> >believe totalsize = 0, so you could easily pass 0 for the new last
> >argument as well.
> Yes at this point totalsize is still 0.
> The current code, which handles MULTIAPPEND, does a preliminary
> check to see if mailbox is not overquota, then receives all
> messages, and finally checks that all messages can fit in the
> mailbox.
> Since we know for sure at least one message is coming, I though it
> could be a good idea to early-check that at least one more message
> would fit in the mailbox before receiving anything. Otherwise we are
> staging new file only to trash it at the end (when overquota). So
> actually I wonder if the code could be updated to check (LITERAL
> parameter) that each message about to be received would fit in the
> mailbox, instead of checking only once at the end ?

There are some bugs about this already.  In particularl the opposite
case, checking that we actually want to append something before
aborting a sieve delivery - because it may be discarded or redirected
or even duplicate suppressed anyway.  Something to keep in mind.

> >imap/mailbox.c
> >
> >I'm not sure it's a good idea to set mailbox->i.exists = 0 in
> >mailbox_delete().  Perhaps it would be better to check for
> >(mailbox->i.options & OPT_MAILBOX_DELETED) when sampling
> >mailbox->i.exists in mailbox_commit_quota().
> Hmm, I wondered about it too. At the end of mailbox_delete, upon
> calling mailbox_close, all physical files in the mailbox are
> deleted, and it didn't seem like 'exists' was used anywhere
> in-between, so I thought it would be alright.
> But given the complexity of the code at that point (lot of cleaning
> when deleting the mailbox), and unforeseen future usages, I guess
> you are right. Actually maybe even q

Re: MESSAGE quota resource implemention

2011-09-01 Thread Michael Menge

Quoting Bron Gondwana :


Actually, really I'd like to create a new UNIQUEID - and store
all the files in paths based on uniqueid rather than on folder
name.  This would not only mean renames could be fast and
atomic, but that delayed delete would be fast.  The downside is
a more opaque filesystem layout.  Oh, another upside - file path
limitations don't exist so much any more.



How do you restore a folder from a filebased backup?
Or exclude some folders form backup (e.g. trash folders)?




M.MengeTel.: (49) 7071/29-70316
Universität Tübingen   Fax.: (49) 7071/29-5912
Zentrum für Datenverarbeitung  mail:  
michael.me...@zdv.uni-tuebingen.de

Wächterstraße 76
72074 Tübingen

smime.p7s
Description: S/MIME Signatur


Re: MESSAGE quota resource implemention

2011-09-01 Thread Bron Gondwana
On Thu, Sep 01, 2011 at 02:50:30PM +0200, Michael Menge wrote:
> Quoting Bron Gondwana :
> 
> >Actually, really I'd like to create a new UNIQUEID - and store
> >all the files in paths based on uniqueid rather than on folder
> >name.  This would not only mean renames could be fast and
> >atomic, but that delayed delete would be fast.  The downside is
> >a more opaque filesystem layout.  Oh, another upside - file path
> >limitations don't exist so much any more.
> >
> 
> How do you restore a folder from a filebased backup?
> Or exclude some folders form backup (e.g. trash folders)?

Tools :)

More specfically, an actual backup command rather than doing
file-based backups in the first place.  That way we can do
consistent snapshots.

Bron.


Re: MESSAGE quota resource implemention

2011-09-01 Thread Julien Coloos

Le 01/09/2011 14:22, Bron Gondwana a écrit :


A provisioning system could run quota -f itself after making the
change, of course.
Sure. But since the quota is being changed, clients would wonder why 
they need to call another quota utility to finish the job.
Plus I wanted to have something similar to how it works when the quota 
entry is first created (which previously only managed one resource): 
people didn't need to call the quota utility after creating the 
quotaroot, so if it's doable for other resources it's better.



But I agree that in case of underflow detection throwing a warning
in syslog might help draw the attention when logs are analysed.

"when".  haha.

(maybe at a few sites that care... but for the vast majority of
  sites, if you're depending on them reading syslog you've already
  lost.  Software that understands that and is robust in the face
  of errors is much nicer for the poor suckers on the receiving end
  of all this)


:D
It's sometimes hard to find a good compromise between "we have good 
faith current data are correct, so for non-vital ones we perform minimal 
checks and provide tools to repair punctually" and "we are paranoid and 
check/repair each and every data all the time" ;)
But I am a bit biased here due to past experiences with non-robust/buggy 
code that tries to correct itself and often ends up in endless iterations...




FYI - I'm planning to be a bit more lazy about mailbox deletes
at some point.  It would be super-cool to not even move the files
until someone trys to create a new mailbox with the same name,
otherwise just clean them up in the regular course of cyr_expire.

Need to think about that some more though.



Actually, really I'd like to create a new UNIQUEID - and store
all the files in paths based on uniqueid rather than on folder
name.  This would not only mean renames could be fast and
atomic, but that delayed delete would be fast.  The downside is
a more opaque filesystem layout.  Oh, another upside - file path
limitations don't exist so much any more.

Nice :)



One place please :)  Ideally I'd like to absorb more of the quota
stuff into mailbox.c.  Greg and I have some debate about this - how
much is too much for that file to be doing.  Probably it should be
abstracted into a couple of layers of stuff - but I really do like
the consistency of having just a couple of function calls:

mailbox_append_index_record; and
mailbox_update_index_record

which do all the consistency checking and counter updating inside.
Plus of course a mailbox_check_quota thing that takes a set of
quota checks to do and sees if there will be space for the
planned changes!

Makes sense.


Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-01 Thread Rob Mueller



Actually, really I'd like to create a new UNIQUEID - and store
all the files in paths based on uniqueid rather than on folder
name.  This would not only mean renames could be fast and
atomic, but that delayed delete would be fast.  The downside is
a more opaque filesystem layout.  Oh, another upside - file path
limitations don't exist so much any more.


While UNIQUEID is nice, the opaqueness is annoying. Personally, I liked the 
idea that we talked about a while back which I think was:


$spooldir/a/user/aardvark/user.aardvark/
$spooldir/a/user/aardvark/user.aardvark.drafts/
$spooldir/a/user/aardvark/user.aardvark.trash/
$spooldir/a/user/aardvark/user.aardvark.foo/
$spooldir/a/user/aardvark/user.aardvark.foo.bar/
$spooldir/a/user/aardvark/user.aardvark.abc.xyz/

So you end up with every folder for a user in one dir. This solves the 
current messy handling of sub-dirs (eg currently you have to create the 
intermediate dir /abc/ even though there's no entry in mailboxes.db for it), 
and makes renaming any folder very cheap still (because you can do it with a 
single dir rename, rather than having to move each message file), but you 
don't go completely opaque.


Delete handling is still easier, just rename to DELETED.$oldname.$UNIQUEID 
or something like that, because it's cheap to rename anyway.


Of course it means re-organising folder layout for every installation out 
there, but maybe we need to bump major versions anyway, cyrus 3.0 here we 
come :)


Rob



Re: MESSAGE quota resource implemention

2011-09-01 Thread Bron Gondwana
On Fri, Sep 02, 2011 at 09:37:24AM +1000, Rob Mueller wrote:
> 
> >Actually, really I'd like to create a new UNIQUEID - and store
> >all the files in paths based on uniqueid rather than on folder
> >name.  This would not only mean renames could be fast and
> >atomic, but that delayed delete would be fast.  The downside is
> >a more opaque filesystem layout.  Oh, another upside - file path
> >limitations don't exist so much any more.
> 
> While UNIQUEID is nice, the opaqueness is annoying. Personally, I
> liked the idea that we talked about a while back which I think was:
> 
> $spooldir/a/user/aardvark/user.aardvark/
> $spooldir/a/user/aardvark/user.aardvark.drafts/
> $spooldir/a/user/aardvark/user.aardvark.trash/
> $spooldir/a/user/aardvark/user.aardvark.foo/
> $spooldir/a/user/aardvark/user.aardvark.foo.bar/
> $spooldir/a/user/aardvark/user.aardvark.abc.xyz/

I still have a patch somewhere that does that.

> So you end up with every folder for a user in one dir. This solves
> the current messy handling of sub-dirs (eg currently you have to
> create the intermediate dir /abc/ even though there's no entry in
> mailboxes.db for it), and makes renaming any folder very cheap still
> (because you can do it with a single dir rename, rather than having
> to move each message file), but you don't go completely opaque.

It does give you a 256 character folder tree limit on many operating
systems.

> Delete handling is still easier, just rename to
> DELETED.$oldname.$UNIQUEID or something like that, because it's
> cheap to rename anyway.

Sorry, make that 239, once you make space for timestamps and dots.

> Of course it means re-organising folder layout for every
> installation out there, but maybe we need to bump major versions
> anyway, cyrus 3.0 here we come :)

tools/rehash - I have one of those that can handle this plus all the
other existing formats as well.  It's on a git branch somewhere.

Bron.


Re: MESSAGE quota resource implemention

2011-09-02 Thread Greg Banks
On 01/09/11 22:22, Bron Gondwana wrote:
> On Thu, Sep 01, 2011 at 01:27:00PM +0200, Julien Coloos wrote:
>> Le 01/09/2011 03:03, Greg Banks a écrit :

> 
> But more generally, the "update the quotaroot" is atomic-safe,
> because the mailbox doesn't add/remove things from the quotaroot
> racily - but quota -f IS racy.  The only way around that would
> be a dual pass mark and collect thing, where you marked each
> mailbox as "we're re-calculating, so don't update the quota usage"
> in the first sweep, then came back and removed the mark as you
> read the value.  Tricky, but doable.  The downside is that a
> crash part way through can leave you in a broken state.  But
> that's true of everything.
> 
>> But I agree that in case of underflow detection throwing a warning
>> in syslog might help draw the attention when logs are analysed.
> 
> "when".  haha.
> 
> (maybe at a few sites that care... but for the vast majority of
>  sites, if you're depending on them reading syslog you've already
>  lost.  Software that understands that and is robust in the face
>  of errors is much nicer for the poor suckers on the receiving end
>  of all this)

If the software was robust, underflow would not happen and we would not
need to test for it and handle it.  Thus the log messages are not
operational messages intended for the sysadmin, but warnings about
internal Cyrus problems intended for Cyrus developers, and syslog is a
suitable place for them.

How's about this for a strategy?

When a quota resource is first enabled, (i.e. the limit is changed from
UNLIMITED to some finite value), the usage is stored as some special
value which I'll call INDETERMINATE.

Changing a mailbox' quotaroot to an existing quotaroot, resets all the
quotaroot's usages to INDETERMINATE.

Usage deltas applied to the INDETERMINATE value silently yield
INDETERMINATE back.

Usage deltas which would underflow a finite (i.e. not INDETERMINATE)
usage value are clamped with a warning about an internal consistency
problem.

A regularly run process such as cyr_expire finds quotaroots which have
one or more INDETERMINATE usages and runs quota -f on them.

In this model, INDETERMINATE value acts like your "dual pass mark".


> 
>>> In the 1st hunk in cmd_append(), at this point in the code I
>>> believe totalsize = 0, so you could easily pass 0 for the new last
>>> argument as well.
>> Yes at this point totalsize is still 0.
>> The current code, which handles MULTIAPPEND, does a preliminary
>> check to see if mailbox is not overquota, then receives all
>> messages, and finally checks that all messages can fit in the
>> mailbox.
>> Since we know for sure at least one message is coming, I though it
>> could be a good idea to early-check that at least one more message
>> would fit in the mailbox before receiving anything. Otherwise we are
>> staging new file only to trash it at the end (when overquota).

As far as I can see the only point of that first append_check() call is
to fail early in the case of a permission fail.

 So
>> actually I wonder if the code could be updated to check (LITERAL
>> parameter) that each message about to be received would fit in the
>> mailbox, instead of checking only once at the end ?

Literals, and binary, and urls...

I do think there's an argument to made for moving the maintenance of the
totalsize parameter into the struct appendstate, and once you do that
it's possible to do quota/permission checks at finer grained points.
But why bother?  Why slow down the common case of not exceeding quota?

> 
> There are some bugs about this already.  In particularl the opposite
> case, checking that we actually want to append something before
> aborting a sieve delivery - because it may be discarded or redirected
> or even duplicate suppressed anyway.  Something to keep in mind.
> 

> 
>>> I'm thinking that there's now three places in the code which take
>>> a mailbox* and fill out an array of quota diffs, interpreting the
>>> contents of the struct mailbox.  That should really be
>>> centralised.
>> I'm not sure to see what you have in mind here.
>> Are you talking about the places where the QUOTA_STORAGE and
>> QUOTA_MESSAGE entries of the quota diff array are computed
>> relatively to the 'quota_mailbox_used' and 'exists' index fields of
>> the struct mailbox ? 

Yes.


> One place please :)  Ideally I'd like to absorb more of the quota
> stuff into mailbox.c.  Greg and I have some debate about this - how
> much is too much for that file to be doing.  Probably it should be
> abstracted into a couple of layers of stuff - but I really do like
> the consistency of having just a couple of function calls:
> 
> mailbox_append_index_record; and
> mailbox_update_index_record
> 
> which do all the consistency checking and counter updating inside.
> Plus of course a mailbox_check_quota thing that takes a set of
> quota checks to do and sees if there will be space for the
> planned changes!

After implementing the X-ANNOTATION-STORAGE quota, I'm

Re: MESSAGE quota resource implemention

2011-09-02 Thread Bron Gondwana
On Fri, Sep 02, 2011 at 07:36:20PM +1000, Greg Banks wrote:
> If the software was robust, underflow would not happen and we would not
> need to test for it and handle it.  Thus the log messages are not
> operational messages intended for the sysadmin, but warnings about
> internal Cyrus problems intended for Cyrus developers, and syslog is a
> suitable place for them.

In theory, yes - assuming you can keep blackbox control over all the
filesystems that Cyrus is operating over, and the sysadmin never restores
from backup or otherwise screws with any of the underlying files.

> How's about this for a strategy?
> 
> When a quota resource is first enabled, (i.e. the limit is changed from
> UNLIMITED to some finite value), the usage is stored as some special
> value which I'll call INDETERMINATE.

What about 'getquota'?  I don't support any solution which leaves getquota
returning bogus values or failing to respond.  That's just icky and
confusing.

I don't think you can avoid two passes, and I don't even think you can
avoid two values during if you really want to be good about it.

Anyway - moving a new folder into a quotaroot is NOT racy.  You just need
to read quota_mailbox_used on the mailbox, then lock the old quota root,
subtract N bytes and unlock it - then lock the NEW quota root, add N bytes
and unlock it again.  No problem.

The only issue is updating WITHOUT changing the quotaroot, which is an
issue because a particular mailbox doesn't know if it's already been counted
in the new quota value or not, so if it should be updating the value.

There are pure ways to do this, that guarantee consistency.  I think the best
way is probably some sort of A/B thing, where you label the quotaroot as A
or B in the mailbox - AND in the quota root.  So the initial state looks like
this:

ROOT: A
A: $usage
B: INVALID

When you want to run a quota -f you set 'B' to zero, and then run the update
logic over all mailboxes, updating B with the value as you go, and setting
the quotaroot in the mailbox to be in state 'B', so it also updates B.
Any mailbox in state 'B' will update both A AND B, because the root is still
in state A.  Mailboxes in state 'A' will only update 'A', because they match
the root.

When you have finished quota -f, both values are being updated simultaneously
by all mailboxes.  You also have two fields which you can compare, and a
guarantee that they were both atomically updated, so if they're not the same
then there was definitely corruption, not just a race condition.  So you can
report that.

Then you update the quota root to say 'ROOT: B', and 'A' invalid.  Or even
just A: zero.  If anything continues to update the wrong field then you
also have corruption (probably a mailbox outside the quotaroot pointing to
it, which is pretty silly)

That's a real, robust solution.  But it's pretty heavy engineering.

> As far as I can see the only point of that first append_check() call is
> to fail early in the case of a permission fail.

It's nice to fail before the client starts uploading.  Failing any later
than that is kinda pointless, because you're not saving the client
bandwidth - which may matter.  The disk IO is unlikely to matter to the
server, since it's a rare case.

> convinced I got it very wrong, and should have left all the quota
> updating in mailbox_commit_quota().  I was trying hard to avoid adding a
> field to the index header to track the storage used by all the
> annotations for the mailbox and for messages in the mailbox; but I'm
> really not happy with the results :(

Well, we're not committed to keeping it that way - it's not as if it's
even "in the wild" except for some really early adopters of the master
branch, who deserve whatever pain they get (mostly, that's us - and we
know enough to be able to clean up any mess)

Bron.


Re: MESSAGE quota resource implemention

2011-09-04 Thread Greg Banks
On 01/09/11 03:21, Bron Gondwana wrote:
> On Wed, Aug 31, 2011 at 05:50:36PM +0200, Julien Coloos wrote:
>> Things that may be worth noting:
>>   - DUMP/UNDUMP currently does nothing special about MESSAGE or
>> X-ANNOTATION-STORAGE quota resources
>> -> should it be transferred ?
> 
> I'd like to replace DUMP/UNDUMP with replication protocol
> communications for XFER.
> 
>> -> without breaking backward compatibility, limits could only be
>> transferred through a 'fake' file entry, as for annotations
> 
> But for now, that's definitely the pragmatic way to go.
> 
>>   - quota usage is currently stored in a uquota_t variable, and
>> delta is computed as quota_t; so theorically there could be overflow
>> issues if quota usage to add/substract cannot be held in a quota_t;
>> in practice it should be unlikely since that would mean a usage of
>> over 2^63-1
> 
> I propose we scrap uquota_t - it is un-necessary for the medium-term
> future, now that we're requiring 64 bit types.

Agreed, and done:

https://github.com/gnb/cyrus-imapd/commit/836eae7b4bda663882c7f135e76b7453b1665436


-- 
Greg.


Re: MESSAGE quota resource implemention

2011-09-04 Thread Greg Banks
On 01/09/11 21:27, Julien Coloos wrote:

>>
>> [...]  Perhaps
>> we should a) document it clearly and b) detect the situation and put
>> an obvious message saying something like "you may need to run quota -f
>> ..." where the human making the change will see it.  Also, such a
>> message might be useful when usage underflow is detected.
>>
> My concern here is that there are times when quota changes are not done
> manually by a human, but happen automatically as part of platform
> provisioning scheme. For example, on many platforms we manage, if a user
> subscribes to some offer his/her mailbox quota will automatically be
> upgraded (IMAP action). So, at least in our case, I though it might be
> useful.
> 
> But I agree that in case of underflow detection throwing a warning in
> syslog might help draw the attention when logs are analysed.

Ok, done:

https://github.com/gnb/cyrus-imapd/commit/084dcf9b2f586c6f6c0e2298a66b73f46ae43f16

Julien, I think we agreed on everything else, right?  I'm looking
forward to your next iteration.

-- 
Greg.


Re: MESSAGE quota resource implemention

2011-09-04 Thread Greg Banks
On 01/09/11 22:22, Bron Gondwana wrote:
> On Thu, Sep 01, 2011 at 01:27:00PM +0200, Julien Coloos wrote:
>> Le 01/09/2011 03:03, Greg Banks a écrit :
> 
>>> I'm thinking that there's now three places in the code which take
>>> a mailbox* and fill out an array of quota diffs, interpreting the
>>> contents of the struct mailbox.  That should really be
>>> centralised.
>> I'm not sure to see what you have in mind here.
>> Are you talking about the places where the QUOTA_STORAGE and
>> QUOTA_MESSAGE entries of the quota diff array are computed
>> relatively to the 'quota_mailbox_used' and 'exists' index fields of
>> the struct mailbox ? If so I guess some of the code could indeed be
>> centralised.
> 
> One place please :)  Ideally I'd like to absorb more of the quota
> stuff into mailbox.c.  Greg and I have some debate about this - how
> much is too much for that file to be doing.  Probably it should be
> abstracted into a couple of layers of stuff - but I really do like
> the consistency of having just a couple of function calls:
> 
> mailbox_append_index_record; and
> mailbox_update_index_record
> 
> which do all the consistency checking and counter updating inside.
> Plus of course a mailbox_check_quota thing that takes a set of
> quota checks to do and sees if there will be space for the
> planned changes!

Ok, I'm now convinced my first attempt at annotation quotas sucked too
hard, here's how I want to re-implement it.  Let me know what you think.

 - rejig the annotation code to always have an open and locked mailbox*
before opening any annotations db (for per-mailbox and per-message
annotations)

 - add a 32b mailbox index header entry to track the storage in bytes
used by all annotations for the mailbox itself or for messages in the
mailbox

 - at header upgrade time, write the special value ~0 into this field,
meaning "unknown"

 - when setting or clearing annotations, if the field's value is ~0 then
recalculate it by scanning the annotations databases

 - when setting or clearing annotations, after the field is no longer in
it's "unknown" state, increment or decrement the field by the size of
the annotation

 - in mailbox_commit_quota(), if the field is not "unknown", then
calculate the delta in usage and apply to the quota db.

 - make reconstruct scan the annotations db to figure out the correct
value for this new field.

This means that the annotation STORE and SETMETADATA paths will be
updating the quota db in the same place that APPEND et al do,
mailbox_commit_quota().  This should work around Bug #3529 and also make
the code neater.


-- 
Greg.


Re: MESSAGE quota resource implemention

2011-09-05 Thread Bron Gondwana
On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote:
> Ok, I'm now convinced my first attempt at annotation quotas sucked too
> hard, here's how I want to re-implement it.  Let me know what you think.

>  - add a 32b mailbox index header entry to track the storage in bytes
> used by all annotations for the mailbox itself or for messages in the
> mailbox

Why not 64b?  Admittedly hanging gigabytes of annotations off a single
folders is probably evil, but this is the mailbox header - 4 bytes per
mailbox to not even have to think about it is super-cheap.  We keep a
handful of "blanks" around in the header already.

>  - at header upgrade time, write the special value ~0 into this field,
> meaning "unknown"

Funky.  If you didn't want to get all special-value about it, you could
use a flag over in the flags field as well, there are only about 5 of
the 32 in use so far.

>  - in mailbox_commit_quota(), if the field is not "unknown", then
> calculate the delta in usage and apply to the quota db.

I assume this is done the same way it's done now?  By taking a "snapshot"
of the value at the start, and calculating a diff to apply at the end.
That was actually a filthy workaround (one of many, dammit) for using
unsigned datatypes.  If we made this a signed 64 bit datatype as well,
with a flag off to the side for "unknown", then we could actually store
a diff in the mailbox object directly - and update both this field and
the quota's value during the commit.

>  - make reconstruct scan the annotations db to figure out the correct
> value for this new field.

Yep - that makes sense.  Reconstruct already does this for the
quota_mailbox_used field, doesn't it.  And any other derived
"aggregate" values in the header.

> This means that the annotation STORE and SETMETADATA paths will be
> updating the quota db in the same place that APPEND et al do,
> mailbox_commit_quota().  This should work around Bug #3529 and also make
> the code neater.

Nice.

Bron.


Re: MESSAGE quota resource implemention

2011-09-05 Thread Greg Banks
On 02/09/11 20:03, Bron Gondwana wrote:
> On Fri, Sep 02, 2011 at 07:36:20PM +1000, Greg Banks wrote:
> 
>> How's about this for a strategy?
>>
>> When a quota resource is first enabled, (i.e. the limit is changed from
>> UNLIMITED to some finite value), the usage is stored as some special
>> value which I'll call INDETERMINATE.
> 
> What about 'getquota'?  I don't support any solution which leaves getquota
> returning bogus values or failing to respond.  That's just icky and
> confusing.
> 
> I don't think you can avoid two passes, and I don't even think you can
> avoid two values during if you really want to be good about it.
> 
> [...]
> 
> There are pure ways to do this, that guarantee consistency. [...]
> 
> That's a real, robust solution.  But it's pretty heavy engineering.

After some thought, I agreed, decided it's quite doable, and decided to
start. I have an implementation coded, so far the hardest bit is testing it.



-- 
Greg.


Re: MESSAGE quota resource implemention

2011-09-05 Thread Greg Banks
On 05/09/11 20:06, Bron Gondwana wrote:
> On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote:
>> Ok, I'm now convinced my first attempt at annotation quotas sucked too
>> hard, here's how I want to re-implement it.  Let me know what you think.
> 
>>  - add a 32b mailbox index header entry to track the storage in bytes
>> used by all annotations for the mailbox itself or for messages in the
>> mailbox
> 
> Why not 64b?  Admittedly hanging gigabytes of annotations off a single
> folders is probably evil,

Do we even have a db backend that would support that?

> but this is the mailbox header - 4 bytes per
> mailbox to not even have to think about it is super-cheap.  We keep a
> handful of "blanks" around in the header already.
> 
>>  - at header upgrade time, write the special value ~0 into this field,
>> meaning "unknown"
> 
> Funky.  If you didn't want to get all special-value about it, you could
> use a flag over in the flags field as well, there are only about 5 of
> the 32 in use so far.

Sure.

>>  - in mailbox_commit_quota(), if the field is not "unknown", then
>> calculate the delta in usage and apply to the quota db.
> 
> I assume this is done the same way it's done now?

Something like that.

>  By taking a "snapshot"
> of the value at the start, and calculating a diff to apply at the end.
> That was actually a filthy workaround (one of many, dammit) for using
> unsigned datatypes.  If we made this a signed 64 bit datatype as well,
> with a flag off to the side for "unknown", then we could actually store
> a diff in the mailbox object directly - and update both this field and
> the quota's value during the commit.

Ok.

>>  - make reconstruct scan the annotations db to figure out the correct
>> value for this new field.
> 
> Yep - that makes sense.  Reconstruct already does this for the
> quota_mailbox_used field, doesn't it.

Yes.

>  And any other derived
> "aggregate" values in the header.
> 
>> This means that the annotation STORE and SETMETADATA paths will be
>> updating the quota db in the same place that APPEND et al do,
>> mailbox_commit_quota().  This should work around Bug #3529 and also make
>> the code neater.
> 
> Nice.

Yeah, all I have to do is work out how to futz with the annotate API yet
again :(


-- 
Greg.


Re: MESSAGE quota resource implemention

2011-09-05 Thread Julien Coloos

Le 05/09/2011 06:12, Greg Banks a écrit :


Julien, I think we agreed on everything else, right?  I'm looking
forward to your next iteration.
After picking your 'uquota_t removal' commit, I also removed it on my 
end, and changed the code according to our previous discussions. Adding 
an helper function which fills a quota_t array with the current messages 
usage from mailbox struct made the code a bit cleaner and more generic :)
I still have to look for transferring limits upon DUMP/UNDUMP. And for 
the time being, I still kept the quota.sets[] thing (we will see later 
if we can do it more smartly).



Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-05 Thread Julien Coloos

Le 05/09/2011 12:06, Bron Gondwana a écrit :

On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote:


  - add a 32b mailbox index header entry to track the storage in bytes
used by all annotations for the mailbox itself or for messages in the
mailbox

Why not 64b?  Admittedly hanging gigabytes of annotations off a single
folders is probably evil, but this is the mailbox header - 4 bytes per
mailbox to not even have to think about it is super-cheap.  We keep a
handful of "blanks" around in the header already.

...


  - in mailbox_commit_quota(), if the field is not "unknown", then
calculate the delta in usage and apply to the quota db.


This new field seems useful for the quota utility and when setting a new 
quotaroot: it wouldn't need to check each annotation associated to the 
mailbox (or messages) and would rely on that field, as it currently does 
for total message size and now the number of messages (with my patch).
I don't know what other people do with cyrus, but here those two 
situations do not happen that often:

  - quotaroot is usually set once upon creating the user mailbox
  - quota utility is rarely used, and usually on one mailbox at a time
So maybe I am missing something, but my question would be: is it really 
worth it ?



Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-05 Thread Bron Gondwana
On Mon, Sep 05, 2011 at 07:19:00PM +0200, Julien Coloos wrote:
> Le 05/09/2011 12:06, Bron Gondwana a écrit :
> >On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote:
> >
> >>  - add a 32b mailbox index header entry to track the storage in bytes
> >>used by all annotations for the mailbox itself or for messages in the
> >>mailbox
> >Why not 64b?  Admittedly hanging gigabytes of annotations off a single
> >folders is probably evil, but this is the mailbox header - 4 bytes per
> >mailbox to not even have to think about it is super-cheap.  We keep a
> >handful of "blanks" around in the header already.
> >
> >...
> >
> >>  - in mailbox_commit_quota(), if the field is not "unknown", then
> >>calculate the delta in usage and apply to the quota db.
> >
> This new field seems useful for the quota utility and when setting a
> new quotaroot: it wouldn't need to check each annotation associated
> to the mailbox (or messages) and would rely on that field, as it
> currently does for total message size and now the number of messages
> (with my patch).
> I don't know what other people do with cyrus, but here those two
> situations do not happen that often:
>   - quotaroot is usually set once upon creating the user mailbox
>   - quota utility is rarely used, and usually on one mailbox at a time
> So maybe I am missing something, but my question would be: is it
> really worth it ?

I would challenge the "usually on one mailbox at a time" theory - it's
often on one quota root at a time (i.e. - a single user), but also can
be used on the entire server.  Or for someone with a domain quota, on
all the users in a single domain.

This can be quite the race condition if someone has a lot of mailboxes
and a lot of traffic on said mailboxes.  Short of killing their
connections and blocking LMTP traffic, you can't guarantee atomicity
on actions that cross multiple mailboxes.

But - the new field definitely has value.  I like the idea of just
trusting that field in the quota utility and for regular tasks.

Bron.


Re: MESSAGE quota resource implemention

2011-09-05 Thread Greg Banks
On 05/09/11 20:16, Greg Banks wrote:
> On 02/09/11 20:03, Bron Gondwana wrote:
>> On Fri, Sep 02, 2011 at 07:36:20PM +1000, Greg Banks wrote:
>>
>>> How's about this for a strategy?
>>>
>>> When a quota resource is first enabled, (i.e. the limit is changed from
>>> UNLIMITED to some finite value), the usage is stored as some special
>>> value which I'll call INDETERMINATE.
>>
>> What about 'getquota'?  I don't support any solution which leaves getquota
>> returning bogus values or failing to respond.  That's just icky and
>> confusing.
>>
>> I don't think you can avoid two passes, and I don't even think you can
>> avoid two values during if you really want to be good about it.
>>
>> [...]
>>
>> There are pure ways to do this, that guarantee consistency. [...]
>>
>> That's a real, robust solution.  But it's pretty heavy engineering.
> 
> After some thought, I agreed, decided it's quite doable, and decided to
> start. I have an implementation coded, so far the hardest bit is testing it.

Implemented, tested, and I'd appreciate a review.

https://github.com/gnb/cyrus-imapd/commit/af8d5bd3b40c4cb49fb7c943c85cb0aeab209215

https://github.com/gnb/cassandane/commit/c529f745b51aa0b566020ecbffca774e783124ee
https://github.com/gnb/cassandane/commit/569eea2b9a2ad56b98c3c5cea60509689fa49bad



-- 
Greg.


Re: MESSAGE quota resource implemention

2011-09-06 Thread Greg Banks
On 06/09/11 02:55, Julien Coloos wrote:
> Le 05/09/2011 06:12, Greg Banks a écrit :
>>
>> Julien, I think we agreed on everything else, right?  I'm looking
>> forward to your next iteration.
> After picking your 'uquota_t removal' commit, I also removed it on my
> end, and changed the code according to our previous discussions.

Ok, I've read these:

commit "Remove uquota_t"
commit "Minor code cleanup"
commit "Do not mess with mailbox index struct fields upon deleting."
commit "When tracking quota usage changes, check all resources."

and it all looks fine to me, except that:

a) my commit "Make quota -f less racy" is going to cause lots of clashes
(sorry!)

b) Bron and I both think that your commit "Compute each quota resource
upon setting it for the first time." is unnecessary, given that

  i) quota -f doesn't suck now, and

  ii) soon, all of the quota-able quantities will be tracked in fields
in the index header.

So I think we'll need another round, sorry :(  Given that the
annotations quota is broken and I'll be reimplementing it anyway, you
may as well ignore QUOTA_ANNOTSTORAGE in all commits, and leave out the
function annotatemore_computequota() for now.  We'll use something very
much like it for reconstruct later.  I'm hoping to be able to pull your
next round of changes into my annotate branch before I reimplement the
annotation quota in the next few days.

> Adding
> an helper function which fills a quota_t array with the current messages
> usage from mailbox struct made the code a bit cleaner and more generic :)

Yes, I'm very pleased with that :)

> I still have to look for transferring limits upon DUMP/UNDUMP. And for
> the time being, I still kept the quota.sets[] thing (we will see later
> if we can do it more smartly).

I'm still not convinced we'll need quota.sets[], but I'll play along.

Thanks again for your work, and sorry that my annotate branch wasn't
quite as stable a base as you first thought :)

-- 
Greg.


Re: MESSAGE quota resource implemention

2011-09-06 Thread Julien Coloos

Le 06/09/2011 10:23, Greg Banks a écrit :


So I think we'll need another round, sorry :(  Given that the
annotations quota is broken and I'll be reimplementing it anyway, you
may as well ignore QUOTA_ANNOTSTORAGE in all commits, and leave out the
function annotatemore_computequota() for now.  We'll use something very
much like it for reconstruct later.  I'm hoping to be able to pull your
next round of changes into my annotate branch before I reimplement the
annotation quota in the next few days.

If the new code is capable of returning actual resource usages upon 
getquota (I think Bron wanted it that way), then I guess I can drop some 
more of the functions I added (annotatemore_computequota, but also 
mboxlist_updatequota and associated code including the quota.sets[]; 
then maybe I could also change the code as proposed earlier and prevent 
writing of resource limit in quota entry if it is QUOTA_UNLIMITED). Will 
look at it tomorrow.


Worked on DUMP/UNDUMP today: 
https://github.com/worldline-messaging/cyrus-imapd/commit/45148b20a4f2343d72aa7436e7255a92508d7bf8
I used an IMAP list to send data, as for annotations. For future usages, 
maybe the UNDUMP code could try to skip IMAP lists if it finds any 
instead of the expected file content LITERAL ? (to prevent breaking 
compatibility too many times, in case other things would need to be 
transmitted this way through DUMP/UNDUMP).


By the way, I noticed that UNDUMPing annotations fails for now (function 
annotate_state_write uses 'int_mboxname' in the annotation state struct, 
but it is NULL in this context).




I'm still not convinced we'll need quota.sets[], but I'll play along.

Thanks again for your work, and sorry that my annotate branch wasn't
quite as stable a base as you first thought :)

No problem, I was somehow prepared to that kind of scenario :)


Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-06 Thread Julien Coloos

Le 06/09/2011 19:48, Julien Coloos a écrit :

Le 06/09/2011 10:23, Greg Banks a écrit :


So I think we'll need another round, sorry :(  Given that the
annotations quota is broken and I'll be reimplementing it anyway, you
may as well ignore QUOTA_ANNOTSTORAGE in all commits, and leave out the
function annotatemore_computequota() for now.  We'll use something very
much like it for reconstruct later.  I'm hoping to be able to pull your
next round of changes into my annotate branch before I reimplement the
annotation quota in the next few days.

If the new code is capable of returning actual resource usages upon 
getquota (I think Bron wanted it that way), then I guess I can drop 
some more of the functions I added (annotatemore_computequota, but 
also mboxlist_updatequota and associated code including the 
quota.sets[]; then maybe I could also change the code as proposed 
earlier and prevent writing of resource limit in quota entry if it is 
QUOTA_UNLIMITED). Will look at it tomorrow.

When I said 'actual', I meant based on the upcoming mailbox index field.
But wait, I got a bit confused with the last point about playing along 
with quota.sets[]. If what you discussed with Bron is that, in the end, 
usage (re)computing will only be done with the quota utility (no more 
automatic computing upon setquota, neither for getquota), then after 
upgrading people shall call 'quota -f' once and for all and quota.sets[] 
must disappear - and all resources must be written in the quota entry -, 
otherwise users would need to call 'quota -f' on a given quotaroot each 
time they set a new quota resource limit to a mailbox.



Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-07 Thread Julien Coloos

Le 06/09/2011 10:23, Greg Banks a écrit :


a) my commit "Make quota -f less racy" is going to cause lots of clashes
(sorry!)

b) Bron and I both think that your commit "Compute each quota resource
upon setting it for the first time." is unnecessary, given that

   i) quota -f doesn't suck now, and

   ii) soon, all of the quota-able quantities will be tracked in fields
in the index header.

So I think we'll need another round, sorry :(  Given that the
annotations quota is broken and I'll be reimplementing it anyway, you
may as well ignore QUOTA_ANNOTSTORAGE in all commits, and leave out the
function annotatemore_computequota() for now.  We'll use something very
much like it for reconstruct later.  I'm hoping to be able to pull your
next round of changes into my annotate branch before I reimplement the
annotation quota in the next few days.

...

I'm still not convinced we'll need quota.sets[], but I'll play along.

Thanks again for your work, and sorry that my annotate branch wasn't
quite as stable a base as you first thought :)
So, I saved my current branch to 'quotamessage-0/gnb/annotate' and 
rebased my patches on current 'annotate' branch (with less racy 'quota -f').
I removed everything related to recomputing from my patches (as well as 
quota.sets[]).


What is missing now is the new index field, which value will be used in 
mailbox_get_usage function. Since my changes do rely on this function, 
and sometimes computes a delta compared to a previous call of that 
function, it may not need to be updated afterwards ... I hope.
Then maybe some of the cassandane tests I pushed on our repository would 
need to be refreshed (at least the one that checks what happens for 
legacy mailboxes on which we add one of the newly handled quota resources).



Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-07 Thread Greg Banks



Sent from my iPhone

On 08/09/2011, at 0:45, Julien Coloos  wrote:


Le 06/09/2011 10:23, Greg Banks a écrit :


a) my commit "Make quota -f less racy" is going to cause lots of  
clashes

(sorry!)

b) Bron and I both think that your commit "Compute each quota  
resource

upon setting it for the first time." is unnecessary, given that

  i) quota -f doesn't suck now, and

  ii) soon, all of the quota-able quantities will be tracked in  
fields

in the index header.

So I think we'll need another round, sorry :(  Given that the
annotations quota is broken and I'll be reimplementing it anyway, you
may as well ignore QUOTA_ANNOTSTORAGE in all commits, and leave out  
the
function annotatemore_computequota() for now.  We'll use something  
very
much like it for reconstruct later.  I'm hoping to be able to pull  
your
next round of changes into my annotate branch before I reimplement  
the

annotation quota in the next few days.

...

I'm still not convinced we'll need quota.sets[], but I'll play along.

Thanks again for your work, and sorry that my annotate branch wasn't
quite as stable a base as you first thought :)
So, I saved my current branch to 'quotamessage-0/gnb/annotate' and  
rebased my patches on current 'annotate' branch (with less racy  
'quota -f').
I removed everything related to recomputing from my patches (as well  
as quota.sets[]).




Excellent, I'll take a look at these when I get into the office.


What is missing now is the new index field, which value will be used  
in mailbox_get_usage function. Since my changes do rely on this  
function, and sometimes computes a delta compared to a previous call  
of that function, it may not need to be updated afterwards ... I hope.


That seems likely. I have an almost-building diff which adds that field.

Then maybe some of the cassandane tests I pushed on our repository  
would need to be refreshed (at least the one that checks what  
happens for legacy mailboxes on which we add one of the newly  
handled quota resources).




Yep.

Greg.

Re: MESSAGE quota resource implemention

2011-09-09 Thread Greg Banks
On 08/09/11 00:45, Julien Coloos wrote:
> Le 06/09/2011 10:23, Greg Banks a écrit :
>>
>
>>
>> ...
>>
>> I'm still not convinced we'll need quota.sets[], but I'll play along.
>>
>> Thanks again for your work, and sorry that my annotate branch wasn't
>> quite as stable a base as you first thought :)
> So, I saved my current branch to 'quotamessage-0/gnb/annotate' and
> rebased my patches on current 'annotate' branch (with less racy 'quota
> -f').
> I removed everything related to recomputing from my patches (as well as
> quota.sets[]).
> 
> What is missing now is the new index field, which value will be used in
> mailbox_get_usage function. Since my changes do rely on this function,
> and sometimes computes a delta compared to a previous call of that
> function, it may not need to be updated afterwards ... I hope.

Ok, here you go.  Not completely tested yet, so caveat emptor.

The following changes since commit af8d5bd3b40c4cb49fb7c943c85cb0aeab209215:

  Make quota -f less racy. (2011-09-06 15:53:49 +1000)

are available in the git repository at:
  ssh://g...@github.com/gnb/cyrus-imapd.git annotate

Greg Banks (9):
  Fix compile warning in cmd_setquota
  append_setup() et al take an array of quota diffs
  unit: restore simplicity to quota TESTCASE macro
  annotate: kill annotate_state_write_{start,finish}
  annotate: _delete and _msg_copy take mailbox*
  annotate: use mailbox* instead of name in API
  Fix annotation memory leak in sync_server
  annotate: track quota in mailbox header
  annotate: woops, forgot this hunk

Julien Coloos (5):
  Added quota MESSAGE resource (RFC 2087) management.
  List and compute all set quota resources.
  Do not mess with mailbox index struct fields upon deleting.
  When tracking quota usage changes, check all resources. Added
helper function to get messages current usage from mailbox struct.
  Handle quota resources other than STORAGE in DUMP/UNDUMP. Also
fix protocol breaking upon dumping annotations.

 cunit/annotate.c |   84 +
 cunit/quota.c|   72 +--
 imap/annotate.c  |  507
--
 imap/annotate.h  |   19 ++-
 imap/append.c|   30 ++-
 imap/append.h|6 +-
 imap/cyr_virusscan.c |2 +-
 imap/imap_err.et |2 +-
 imap/imapd.c |  219 ---
 imap/index.c |7 +-
 imap/lmtpd.c |   33 +++-
 imap/lmtpengine.c|7 +-
 imap/lmtpengine.h|4 +-
 imap/mailbox.c   |  104 ---
 imap/mailbox.h   |   12 +-
 imap/mbdump.c|  146 ---
 imap/mboxlist.c  |   31 ++--
 imap/nntpd.c |4 +-
 imap/quota.c |   94 +++---
 imap/quota.h |   18 ++-
 imap/quota_db.c  |   36 +++-
 imap/smmapd.c|   15 +-
 imap/sync_client.c   |4 +-
 imap/sync_server.c   |   19 ++-
 imap/sync_support.c  |6 -
 lib/imapoptions  |   11 +-
 26 files changed, 933 insertions(+), 559 deletions(-)



> Then maybe some of the cassandane tests I pushed on our repository would
> need to be refreshed (at least the one that checks what happens for
> legacy mailboxes on which we add one of the newly handled quota resources).
> 


That's next week's focus I think.

-- 
Greg.


Re: MESSAGE quota resource implemention

2011-09-12 Thread Julien Coloos

Le 09/09/2011 14:18, Greg Banks a écrit :

Ok, here you go. Not completely tested yet, so caveat emptor.


Had to change two things:
 - mailbox_quota_check now expects a quota diff array which is good :), 
but change is now really applied if all diffs are '> 0' (instead of '>= 
0' previously); in some cases '0' is used to check that mailbox is not 
currently overquota (e.g. in LMTP service), so either we should go back 
to testing '>= 0', or change callers that did rely on that test
 - changing the value of annotations quota storage in the index 
(mailbox_use_annot_quota function) do dirty the quota, but not the 
index; the new value is thus not committed (at least when setting 
mailbox annotations), and the quota entry becomes false when later 
deleting the mailbox


If needed, you can look in our 'cyrus-imapd' repository at the 
'gnb/annotate/fixes' branch. There is one commit for those matters.



Remaining things concerning annotations:
 - when deleting messages, annotations length is not substracted; the 
solution may not be that simple, since I believe users are allowed to 
unexpunge mails: so since index entry is still here - until real 
unlinking -, annotations may have to stay there too - until unlinking too
 - for 'old' mailboxes (those created before the annotation storage 
usage field in the index header), current annotations usage shall be 
computed (and added to the quota entry) upon upgrading; this way users 
won't have to run 'quota -f' for all quotaroots after switching to this 
new version ;)



Then maybe some of the cassandane tests I pushed on our repository would
need to be refreshed (at least the one that checks what happens for
legacy mailboxes on which we add one of the newly handled quota resources).

That's next week's focus I think.
Actually I just gave up the 'old' test: there is no easy way to simulate 
upgrading mailbox index, or at least I don't feel confident enough to 
make it in cassandane :(
Other tests do work. Once the annotations usage is subtracted upon 
messages deletion (see before), all tests shall pass :)


I rebased the 'quotamessage/gnb/annotate' branch of our 'cassandane' 
repository today, leaving that test aside.



Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-12 Thread Bron Gondwana
On Mon, Sep 12, 2011 at 04:46:52PM +0200, Julien Coloos wrote:
> Le 09/09/2011 14:18, Greg Banks a écrit :
> >Ok, here you go. Not completely tested yet, so caveat emptor.
> Remaining things concerning annotations:
>  - when deleting messages, annotations length is not substracted;
> the solution may not be that simple, since I believe users are
> allowed to unexpunge mails: so since index entry is still here -
> until real unlinking -, annotations may have to stay there too -
> until unlinking too

When they unexpunge, you add it back.  The decision to have
delayed expunge and WHEN to expunge is a server administrator
decision that's outside the control of the user.

With storage, we subtract the usage when we mark the record
expunged.  Annotation length should be handled the same way.
This will have to be done when the message gets expunged,
by finding the size of the annotations and subtracting that
also.

>  - for 'old' mailboxes (those created before the annotation storage
> usage field in the index header), current annotations usage shall be
> computed (and added to the quota entry) upon upgrading; this way
> users won't have to run 'quota -f' for all quotaroots after
> switching to this new version ;)

Definitely.  Upgrading usually handles things like that.  It's
the right way[tm].

> Actually I just gave up the 'old' test: there is no easy way to
> simulate upgrading mailbox index, or at least I don't feel confident
> enough to make it in cassandane :(

Easiest way is the same way I did for the broken quotas test.
Have a tar file with the contents of the "old mailbox" which
you unpack onto the filesystem, and then open the mailbox and
check that the "upgraded" fields are what you would expect.

I also did something similar for the "crash on thread" test,
where there were 5 messages which were known to be able to
crash the THREAD command.  I unpacked the folder contents
with those messages in a test.

Regards,

Bron.


Re: MESSAGE quota resource implemention

2011-09-14 Thread Julien Coloos

Le 12/09/2011 23:09, Bron Gondwana a écrit :



  - for 'old' mailboxes (those created before the annotation storage
usage field in the index header), current annotations usage shall be
computed (and added to the quota entry) upon upgrading; this way
users won't have to run 'quota -f' for all quotaroots after
switching to this new version ;)

Definitely.  Upgrading usually handles things like that.  It's
the right way[tm].

Pushed a few lines of code in the 'upgrade_index' function. Though it 
still lacks the annotation storage usage computing.
It's in the branch 'gnb/annotate/fixes'; commit 
ed61f721487804b205e399c538fc35ddc153bd93.



Actually I just gave up the 'old' test: there is no easy way to
simulate upgrading mailbox index, or at least I don't feel confident
enough to make it in cassandane :(

Easiest way is the same way I did for the broken quotas test.
Have a tar file with the contents of the "old mailbox" which
you unpack onto the filesystem, and then open the mailbox and
check that the "upgraded" fields are what you would expect.

I also did something similar for the "crash on thread" test,
where there were 5 messages which were known to be able to
crash the THREAD command.  I unpacked the folder contents
with those messages in a test.

Nice idea :)
Pushed a few things.
Added an option hash so that when running commands:
  - test can run system commands (based on the current code which would 
run cyrus commands inside the current cassandane instance directory)

  - finer I/O redirections are possible
  - working directory can be specified
While adding those new 'features', I still kept the 'run_utility' method 
(cyrus commands) and simply added the 'run_command' method for other 
commands.


Then, as you suggested on IRC, I added an 'unpack' method to extract 
tar/gz/bz2 (and combinations) files using system commands.


And I added back my quota upgrade test from a cyrus v2.4(.11) mailbox, 
using two tar.gz files containing a mailbox content and its quota file.


It's in the 'quotamessage/gnb/annotate' branch; commits 
d2bf4e4f42f7f8c9b53713441116ddbea5b0a265, 
86a52daeed5a03f078b88de67d3d10b51a7f8cc4 and 
fb827e8fc77529a5e23465d2c19d6e88adf7cae8.



Maybe you won't keep everything I pushed, but I hope some parts will be 
helpful :)


Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-14 Thread Greg Banks
On 15/09/11 03:14, Julien Coloos wrote:
> Le 12/09/2011 23:09, Bron Gondwana a écrit :
>>
>>>   - for 'old' mailboxes (those created before the annotation storage
>>> usage field in the index header), current annotations usage shall be
>>> computed (and added to the quota entry) upon upgrading; this way
>>> users won't have to run 'quota -f' for all quotaroots after
>>> switching to this new version ;)
>> Definitely.  Upgrading usually handles things like that.  It's
>> the right way[tm].
>>
> Pushed a few lines of code in the 'upgrade_index' function. Though it
> still lacks the annotation storage usage computing.
> It's in the branch 'gnb/annotate/fixes'; commit
> ed61f721487804b205e399c538fc35ddc153bd93.

Thanks Julien, I've cherry-picked this commit and the "Fixes" commit
from the other day into my annotate branch.

BTW I'm working on making reconstruct calculate the X-ANNOTATION-STORAGE
quota.  Some of that code may also be useful during upgrade and for
handling message expunging.

> 
>>> Actually I just gave up the 'old' test: there is no easy way to
>>> simulate upgrading mailbox index, or at least I don't feel confident
>>> enough to make it in cassandane :(

No, there isn't a really good way at the moment.

>> Easiest way is the same way I did for the broken quotas test.
>> Have a tar file with the contents of the "old mailbox" which
>> you unpack onto the filesystem, and then open the mailbox and
>> check that the "upgraded" fields are what you would expect.
>>
>> I also did something similar for the "crash on thread" test,
>> where there were 5 messages which were known to be able to
>> crash the THREAD command.  I unpacked the folder contents
>> with those messages in a test.
> Nice idea :)

Bron's method of saving a tarball of the state of a mailbox generated by
an older release is reasonably straightforward.  The difficulty however
is going to be working out whether the result of the upgrade is correct.
 In the general case, you can't just test that the first open of the
mailbox doesn't report an error from upgrading, you need to test at least:

 * the mailbox is re-written with the latest version number

 * all the "new" fields have correct or at least feasible values

 * the messages previously present are still present and in the same
order with the same contents and the same metadata (uid, flags,
internaldate, etc)

To do this properly you really need to save, independently of the Cyrus
mailbox store, a whole lot of information about the expected results of
the mailbox open so that Cassandane can check it.  The two other
alternatives are both worse: a) don't check and b) hardcode checks into
the Cassandane code which depend intimately on details of the data in
the mailbox state tarball.

I'm thinking that what we need is a standalone utility built from the
Cassandane source tree.  Someone would point this utility at a live
working Cyrus instance (not one controlled by Cassandane), and it would
create a new folder and fill the folder with known data, then harvest
the on-disk state left by Cyrus and packages that state up into a
tarball along with a separate description of the messages.  That tarball
could then be checked into the Cassandane tree and used to run upgrade
tests with reasonable post-upgrade checks.  The advantages I see for
this approach are:

 * when (not 'if') we need to improve either the data generated or the
checks run, we can improve this utility and then re-capture new state as
time permits.

 * the versions of Cyrus tested can be on separate machines from
Cassandane itself

   - which allows testing upgrades from really old versions which don't
install or run on modern OSes due to library issues, and

   - allows testing upgrades from Cyrus on other platforms (e.g.
platforms with different endianness or wordsize).


BTW at some point in the future I want Cassandane to be able to handle
multiple installed versions of Cyrus, so that we can test configurations
like cross-version replication.  But that's not really the same problem
as upgrade testing.  For upgrade testing, we don't need functioning code
for the old Cyrus versions, just a snapshot of their spoor.  For
cross-version replication or murder testing we need live installs to
poke.  The difference is critical, because I figure we'll need to
support upgrade from much older versions than we support in a
cross-version murder, and those older versions will be quite challenging
to install or get running on the same platform that we develop on.

> Pushed a few things.

Thanks. I pulled in your commit "Added quota MESSAGE resource (RFC 2087)
test cases." yesterday and then refactored the new tests a bit, but
didn't get a chance to push it out again until today.


> Added an option hash so that when running commands:
>   - test can run system commands (based on the current code which would
> run cyrus commands inside the current cassandane instance directory)



>   - finer I/O redirections are possible
>   - working dire

Re: MESSAGE quota resource implemention

2011-09-16 Thread Julien Coloos

As discussed here and on IRC, I rebased my commits with all the changes:
  - the 'utility' methods have been promoted to 'command' ones, which 
are now generic and may (options hash) concern cyrus binaries
  - the 'start_command_bg' method is now replaced by a 'background' 
option in 'run_command'

  - dropped the 'mode' option, leaving only the 'redirects' one
  - moved 'unpack' method from Cassandane::Unit::TestCase to 
Cassandane::Instance, leaving the tar utility determine itself the kind 
of the tar file
-> by default destination is the current cassandane instance base 
directory; alternatively one can specify a relative path - 
'path/to/file' - to this directory, or an absolute path  - '/path/to/file'

  - use the pre-existing 'user.cassandane' mailbox for quota upgrade tests
  - fixed a few minor things (like avoiding short-life zombies by 
harvesting some cassandane dead children)

  - refactored a few more stuff
  - removed stuff that became unnecessary

Hoping this will work out this time :)


Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-18 Thread Greg Banks
On 17/09/11 01:37, Julien Coloos wrote:
> As discussed here and on IRC, I rebased my commits with all the changes:
>   - the 'utility' methods have been promoted to 'command' ones, which
> are now generic and may (options hash) concern cyrus binaries
>   - the 'start_command_bg' method is now replaced by a 'background'
> option in 'run_command'
>   - dropped the 'mode' option, leaving only the 'redirects' one
>   - moved 'unpack' method from Cassandane::Unit::TestCase to
> Cassandane::Instance, leaving the tar utility determine itself the kind
> of the tar file
> -> by default destination is the current cassandane instance base
> directory; alternatively one can specify a relative path -
> 'path/to/file' - to this directory, or an absolute path  - '/path/to/file'
>   - use the pre-existing 'user.cassandane' mailbox for quota upgrade tests
>   - fixed a few minor things (like avoiding short-life zombies by
> harvesting some cassandane dead children)
>   - refactored a few more stuff
>   - removed stuff that became unnecessary
> 
> Hoping this will work out this time :)

Excellent work, thanks a lot.  I particularly like the refactoring in
Quota.pm that allows the tests to express checks of reported quota
numbers in a way that's concise but not sensitive to the order reported
by the server.

I've pulled your changes and pushed them out to github and cmu.


-- 
Greg.