Re: MESSAGE quota resource implemention

2011-09-05 Thread Bron Gondwana
On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote:
 Ok, I'm now convinced my first attempt at annotation quotas sucked too
 hard, here's how I want to re-implement it.  Let me know what you think.

  - add a 32b mailbox index header entry to track the storage in bytes
 used by all annotations for the mailbox itself or for messages in the
 mailbox

Why not 64b?  Admittedly hanging gigabytes of annotations off a single
folders is probably evil, but this is the mailbox header - 4 bytes per
mailbox to not even have to think about it is super-cheap.  We keep a
handful of blanks around in the header already.

  - at header upgrade time, write the special value ~0 into this field,
 meaning unknown

Funky.  If you didn't want to get all special-value about it, you could
use a flag over in the flags field as well, there are only about 5 of
the 32 in use so far.

  - in mailbox_commit_quota(), if the field is not unknown, then
 calculate the delta in usage and apply to the quota db.

I assume this is done the same way it's done now?  By taking a snapshot
of the value at the start, and calculating a diff to apply at the end.
That was actually a filthy workaround (one of many, dammit) for using
unsigned datatypes.  If we made this a signed 64 bit datatype as well,
with a flag off to the side for unknown, then we could actually store
a diff in the mailbox object directly - and update both this field and
the quota's value during the commit.

  - make reconstruct scan the annotations db to figure out the correct
 value for this new field.

Yep - that makes sense.  Reconstruct already does this for the
quota_mailbox_used field, doesn't it.  And any other derived
aggregate values in the header.

 This means that the annotation STORE and SETMETADATA paths will be
 updating the quota db in the same place that APPEND et al do,
 mailbox_commit_quota().  This should work around Bug #3529 and also make
 the code neater.

Nice.

Bron.


Re: MESSAGE quota resource implemention

2011-09-05 Thread Greg Banks
On 02/09/11 20:03, Bron Gondwana wrote:
 On Fri, Sep 02, 2011 at 07:36:20PM +1000, Greg Banks wrote:
 
 How's about this for a strategy?

 When a quota resource is first enabled, (i.e. the limit is changed from
 UNLIMITED to some finite value), the usage is stored as some special
 value which I'll call INDETERMINATE.
 
 What about 'getquota'?  I don't support any solution which leaves getquota
 returning bogus values or failing to respond.  That's just icky and
 confusing.
 
 I don't think you can avoid two passes, and I don't even think you can
 avoid two values during if you really want to be good about it.
 
 [...]
 
 There are pure ways to do this, that guarantee consistency. [...]
 
 That's a real, robust solution.  But it's pretty heavy engineering.

After some thought, I agreed, decided it's quite doable, and decided to
start. I have an implementation coded, so far the hardest bit is testing it.



-- 
Greg.


Re: MESSAGE quota resource implemention

2011-09-05 Thread Greg Banks
On 05/09/11 20:06, Bron Gondwana wrote:
 On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote:
 Ok, I'm now convinced my first attempt at annotation quotas sucked too
 hard, here's how I want to re-implement it.  Let me know what you think.
 
  - add a 32b mailbox index header entry to track the storage in bytes
 used by all annotations for the mailbox itself or for messages in the
 mailbox
 
 Why not 64b?  Admittedly hanging gigabytes of annotations off a single
 folders is probably evil,

Do we even have a db backend that would support that?

 but this is the mailbox header - 4 bytes per
 mailbox to not even have to think about it is super-cheap.  We keep a
 handful of blanks around in the header already.
 
  - at header upgrade time, write the special value ~0 into this field,
 meaning unknown
 
 Funky.  If you didn't want to get all special-value about it, you could
 use a flag over in the flags field as well, there are only about 5 of
 the 32 in use so far.

Sure.

  - in mailbox_commit_quota(), if the field is not unknown, then
 calculate the delta in usage and apply to the quota db.
 
 I assume this is done the same way it's done now?

Something like that.

  By taking a snapshot
 of the value at the start, and calculating a diff to apply at the end.
 That was actually a filthy workaround (one of many, dammit) for using
 unsigned datatypes.  If we made this a signed 64 bit datatype as well,
 with a flag off to the side for unknown, then we could actually store
 a diff in the mailbox object directly - and update both this field and
 the quota's value during the commit.

Ok.

  - make reconstruct scan the annotations db to figure out the correct
 value for this new field.
 
 Yep - that makes sense.  Reconstruct already does this for the
 quota_mailbox_used field, doesn't it.

Yes.

  And any other derived
 aggregate values in the header.
 
 This means that the annotation STORE and SETMETADATA paths will be
 updating the quota db in the same place that APPEND et al do,
 mailbox_commit_quota().  This should work around Bug #3529 and also make
 the code neater.
 
 Nice.

Yeah, all I have to do is work out how to futz with the annotate API yet
again :(


-- 
Greg.


Re: MESSAGE quota resource implemention

2011-09-05 Thread Julien Coloos

Le 05/09/2011 06:12, Greg Banks a écrit :


Julien, I think we agreed on everything else, right?  I'm looking
forward to your next iteration.
After picking your 'uquota_t removal' commit, I also removed it on my 
end, and changed the code according to our previous discussions. Adding 
an helper function which fills a quota_t array with the current messages 
usage from mailbox struct made the code a bit cleaner and more generic :)
I still have to look for transferring limits upon DUMP/UNDUMP. And for 
the time being, I still kept the quota.sets[] thing (we will see later 
if we can do it more smartly).



Regards
Julien


Re: MESSAGE quota resource implemention

2011-09-05 Thread Bron Gondwana
On Mon, Sep 05, 2011 at 07:19:00PM +0200, Julien Coloos wrote:
 Le 05/09/2011 12:06, Bron Gondwana a écrit :
 On Mon, Sep 05, 2011 at 02:32:40PM +1000, Greg Banks wrote:
 
   - add a 32b mailbox index header entry to track the storage in bytes
 used by all annotations for the mailbox itself or for messages in the
 mailbox
 Why not 64b?  Admittedly hanging gigabytes of annotations off a single
 folders is probably evil, but this is the mailbox header - 4 bytes per
 mailbox to not even have to think about it is super-cheap.  We keep a
 handful of blanks around in the header already.
 
 ...
 
   - in mailbox_commit_quota(), if the field is not unknown, then
 calculate the delta in usage and apply to the quota db.
 
 This new field seems useful for the quota utility and when setting a
 new quotaroot: it wouldn't need to check each annotation associated
 to the mailbox (or messages) and would rely on that field, as it
 currently does for total message size and now the number of messages
 (with my patch).
 I don't know what other people do with cyrus, but here those two
 situations do not happen that often:
   - quotaroot is usually set once upon creating the user mailbox
   - quota utility is rarely used, and usually on one mailbox at a time
 So maybe I am missing something, but my question would be: is it
 really worth it ?

I would challenge the usually on one mailbox at a time theory - it's
often on one quota root at a time (i.e. - a single user), but also can
be used on the entire server.  Or for someone with a domain quota, on
all the users in a single domain.

This can be quite the race condition if someone has a lot of mailboxes
and a lot of traffic on said mailboxes.  Short of killing their
connections and blocking LMTP traffic, you can't guarantee atomicity
on actions that cross multiple mailboxes.

But - the new field definitely has value.  I like the idea of just
trusting that field in the quota utility and for regular tasks.

Bron.