RE: Funding Cyrus High Availability

2004-09-19 Thread David Lang
On Fri, 17 Sep 2004 [EMAIL PROTECTED] wrote:
From: David Lang [mailto:[EMAIL PROTECTED]

Mike, one of the problems with this is that different databases have
different interfaces and capabilities.
if you design it to work on Oracle then if you try to make it work on
MySQL there are going to be quite a few things you need to change.
--snip
another issue in all this is the maintenance of the resulting code. If
this code can be used in many different situations then more people will
use it (probably including CMU) and it will be maintained as a side effect
of any other changes. however if it's tailored towards a very narrow
situation then only the people who have that particular problem will use
it and it's likely to have issues with new changes.
I'd actually figured something like ODBC would be used, with prepared
statements.  /shrug.  Abstract the whole interface issue.
unfortunately there are a few problems with this.
to start with, ODBC is not readily available on all platforms.
secondly, ODBC can't cover up the fact that different database engines have 
vastly differing capabilities. if you don't use any of these capabilities 
then you don't run into this pitfall, but if you want to you will.

I really wish that ODBC did live up to its hype, but in practice only the 
most trivial database users can transparently switch from database to 
database by changing the ODBC config.
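
To make the point about differing capabilities concrete: even "give me the
first N rows" is spelled differently by different engines, and a common
connection layer does not rewrite that SQL for you. A minimal Python sketch
follows. The table name `messages`, the column `msg_id` and the helper
`first_rows_sql` are made up for illustration and have nothing to do with
the Cyrus code.

```python
# Hypothetical helper: table, column and dialect keys are illustrative only.
ROW_LIMIT_TEMPLATES = {
    # MySQL and PostgreSQL take a trailing LIMIT clause.
    "mysql": "SELECT * FROM {table} ORDER BY msg_id LIMIT {n}",
    "postgresql": "SELECT * FROM {table} ORDER BY msg_id LIMIT {n}",
    # Classic Oracle filters on the ROWNUM pseudo-column around a subquery.
    "oracle": (
        "SELECT * FROM (SELECT * FROM {table} ORDER BY msg_id) "
        "WHERE ROWNUM <= {n}"
    ),
    # Microsoft SQL Server puts TOP right after SELECT.
    "mssql": "SELECT TOP {n} * FROM {table} ORDER BY msg_id",
}


def first_rows_sql(dialect: str, table: str, n: int) -> str:
    """Return the dialect-specific SQL for 'first n rows ordered by msg_id'."""
    try:
        template = ROW_LIMIT_TEMPLATES[dialect]
    except KeyError:
        raise ValueError(f"no template for dialect {dialect!r}") from None
    return template.format(table=table, n=int(n))
```

Every such difference needs its own entry and its own testing, which is the
maintenance cost described above.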

David Lang
--
There are two ways of constructing a software design. One way is to make it so simple 
that there are obviously no deficiencies. And the other way is to make it so 
complicated that there are no obvious deficiencies.
 -- C.A.R. Hoare
---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html


Re: Funding Cyrus High Availability

2004-09-19 Thread David Lang
There are many ways of doing High Availability. This is an attempt to 
outline the various methods with their advantages and disadvantages. Ken and 
David (and anyone else who has thoughts on this) please feel free to add to 
this. I'm attempting to outline them roughly in order of complexity.

1. Active-Slave replication with manual failover
  This is where you can configure one machine to output all changes to a 
local daemon and another machine to implement the changes that are read 
from a local daemon.

  Pro:
   simplest implementation. since it makes no assumptions about how you 
are going to use it, it also sets no limits on how it is used.

   This is the basic functionality that all other variations will need so 
it's not wasted work no matter what is done later

   allows for multiple slaves from a single master
   allows for the propagation traffic pattern to be defined by the 
sysadmin (either master directly to all slaves or a tree-like propagation 
to save on WAN bandwidth when multiple slaves are co-located)

   by involving a local daemon at each server there is a lot of 
flexibility in exactly how the replication takes place.
   for example you could
     use netcat as your daemon for instant transmission of the messages
     have a daemon that caches the messages so that if the link drops 
       the messages are saved
     have a daemon that gets an acknowledgement from the far side that 
       the message got through
     have a daemon that batches the messages up and compresses them for 
       more efficient transport
     have a daemon that delays all messages by a given time period to 
       give you a way to recover from logical corruption without having 
       to go to a backup
     have a daemon that filters the messages (say one that updates 
       everything except it won't delete any messages so you have a known 
       safe archive of all messages)
     etc

  Con:
   since it makes no assumptions about how you are going to use it, it 
also gives you no help in using it in any particular way
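
As a rough illustration of the "daemon that caches the messages" and
"daemon that gets an acknowledgement" variants above, here is a minimal
Python sketch. Everything in it is hypothetical: the class name
`SpoolingRelay`, the `send` callable that stands in for the network link,
and the in-memory spool (a real daemon would presumably spool to disk).
It is not Cyrus code.

```python
from collections import deque
from typing import Callable, Deque


class SpoolingRelay:
    """Hypothetical local relay: keep each message until the far side acks it."""

    def __init__(self, send: Callable[[bytes], bool]) -> None:
        # `send` stands in for the network hop. It returns True only when
        # the far side has acknowledged the message.
        self._send = send
        self._spool: Deque[bytes] = deque()

    def submit(self, message: bytes) -> None:
        """Called for every change the mail server emits."""
        self._spool.append(message)
        self.flush()

    def flush(self) -> int:
        """Forward spooled messages in order, stopping at the first failure."""
        delivered = 0
        while self._spool:
            try:
                acked = self._send(self._spool[0])
            except OSError:
                acked = False  # link is down, keep the message spooled
            if not acked:
                break
            self._spool.popleft()
            delivered += 1
        return delivered

    def pending(self) -> int:
        """Number of messages still waiting for an acknowledgement."""
        return len(self._spool)
```

Because the mail server only ever calls `submit`, swapping this relay for a
batching, delaying or filtering one would not touch the server side, which is
the flexibility argued for above.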

2. Active-Slave replication with automatic failover
  This takes #1, limits it to a pair of boxes and through changes to 
murder or other parts of cyrus will swap the active/slave status of the 
two boxes

  Pro:
   makes setting up of a HA pair of boxes easier
   increases availability by decreasing downtime
  Con:
   this functionality can be duplicated without changes to cyrus by the 
use of an external HA/cluster software package.

   Since this now assumes a particular mode of operation it starts to 
limit other uses (for example, if this is implemented as part of murder 
then it won't help much if you are trying to replicate to a DR datacenter 
several thousand miles away).

   Split-brain conditions are the responsibility of cyrus to prevent or 
solve. These are fundamentally hard problems to get right in all cases

3. Active-Slave replication with Slave able to accept client connections
  This takes #1 and then further modifies the slave so that requests that 
would change the contents of things get relayed to the active box and then 
the results of the change get propagated back down before they are visible 
to the client.

  Pro:
   simulates active/active operation although it does cause longer delays 
when clients issue some commands.

   use of slaves for local access can reduce the load on the master 
resulting in higher performance.

   can be cascaded to multiple slaves and multiple tiers of slaves as 
needed

   in case of problems on the master the slaves can continue to operate as 
read-only servers providing degraded service while the master is fixed. 
depending on the problem with the master this may be very preferable to 
having to re-sync the master or recover from a split-brain situation

  Con:
   more extensive modifications needed to trap all changes and propagate 
them up to the master

   how does the slave know when the master has implemented the change (so 
that it can give the result to the client)?

   raises questions about the requirement to get confirmation of all 
updates before the slave can respond to the client (for example, if a 
slave decides to read a message that is flagged as new should the slave 
wait until the master confirms that it knows the message has been read 
before it gives it to the client, or should it give the message to the 
client and not worry if the update fails on the master)

   since the slave needs to send updates to the master the latency of the 
link between them can become a limiting factor in the performance that 
clients see when connecting to the slave
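
One possible answer to the "how does the slave know" question is for the
master to number every change and hand that number back to the slave, which
then waits until its own replication stream has reached it. This is only a
sketch of that idea in Python, not something proposed in the thread: the
`Master` and `Slave` classes, the sequence numbers and the in-process "pull"
are all assumptions made for illustration.

```python
class Master:
    """Hypothetical master: applies changes and numbers them."""

    def __init__(self):
        self.seq = 0
        self.log = []  # ordered (seq, change) pairs, the replication feed

    def apply(self, change):
        self.seq += 1
        self.log.append((self.seq, change))
        return self.seq  # tells the caller which change to wait for


class Slave:
    """Hypothetical slave: serves reads locally, relays writes upward."""

    def __init__(self, master):
        self.master = master
        self.applied_seq = 0
        self.state = []

    def pull_replication(self):
        # Stand-in for the replication stream from #1. Sequence numbers
        # start at 1, so log[applied_seq:] is everything not yet applied.
        for seq, change in self.master.log[self.applied_seq:]:
            self.state.append(change)
            self.applied_seq = seq

    def client_write(self, change):
        wanted = self.master.apply(change)  # relay up to the master
        while self.applied_seq < wanted:    # wait for it to come back down
            self.pull_replication()
        return wanted
```

The wait loop is where the link latency mentioned in the last point shows up:
the client sees at least one round trip to the master per write.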

4. #3 with automatic failover
  Since #3 supports multiple slaves the number of failover scenarios grows 
significantly. you have multiple machines that could be the new master and 
you have the split-brain scenario to watch out for.

  Pro:
   increased availability by decreasing failover time
   potentially easier to set up than with external clustering software
  Con:
   increased complexity
  

Re: Funding Cyrus High Availability

2004-09-19 Thread David Carter
On Sun, 19 Sep 2004, David Lang wrote:
5. Active/Active
designate one of the boxes as primary and identify all items in the 
datastore that absolutely must not be subject to race conditions between 
the two boxes (message UUID for example). In addition to implementing 
the replication needed for #1 modify all functions that need to update 
these critical pieces of data to update them on the master and let the 
master update the other box.
We may be talking at cross purposes (and it's entirely likely that I've
got the wrong end of the stick!), but I consider active-active to be
the case where there is no primary: users can make changes to either
system, and if the two systems lose touch with each other they have
to resolve their differences when contact is reestablished.
UUIDs aren't a problem (each machine in a cluster owns its own fraction of 
the address space). Message UIDs are a big problem. I guess in the case of 
conflict, you could bump the UIDvalidity value on a mailbox and reassign 
UIDs for all the messages, using timestamps to determine the eventual 
ordering of messages. Now that I think about it, maybe that's not a 
totally absurd idea. It would involve a lot of work though.
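
A minimal Python sketch of that renumbering idea, under stated assumptions:
the `Message` record, the `merge_after_split` function and the choice to
break timestamp ties by UUID are all hypothetical, and real mailbox state
(flags, expunges, quota) is ignored entirely.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Message:
    uuid: str         # globally unique, so safe to compare across servers
    timestamp: float  # arrival time, used only to order the merged mailbox
    uid: int          # per-mailbox IMAP UID, may clash after a split


def merge_after_split(
    uidvalidity: int, side_a: List[Message], side_b: List[Message]
) -> Tuple[int, List[Message]]:
    """Merge two diverged copies of one mailbox and renumber every message."""
    # Messages delivered before the split exist on both sides: keep one copy.
    by_uuid = {}
    for msg in side_a + side_b:
        by_uuid.setdefault(msg.uuid, msg)
    # Timestamps decide the order; the UUID makes ties deterministic.
    ordered = sorted(by_uuid.values(), key=lambda m: (m.timestamp, m.uuid))
    renumbered = [
        Message(m.uuid, m.timestamp, new_uid)
        for new_uid, m in enumerate(ordered, start=1)
    ]
    # A larger UIDVALIDITY tells clients to discard their cached UIDs.
    return uidvalidity + 1, renumbered
```

The cost is on the client side: every client has to resynchronise the whole
mailbox after the bump, so this only makes sense as a recovery step.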

 Pro:
  best use of available hardware as the load is split almost evenly between 
the boxes.

best availability because if there is a failure half of the clients won't 
see it at all
Actually this is what I do right now by having two live mailstores. Half 
the mailboxes on each system are active, the remainder are passive.

--
David Carter Email: [EMAIL PROTECTED]
University Computing Service,Phone: (01223) 334502
New Museums Site, Pembroke Street,   Fax:   (01223) 334679
Cambridge UK. CB2 3QH.


Re: Funding Cyrus High Availability

2004-09-19 Thread David Lang
On Sun, 19 Sep 2004, David Carter wrote:
On Sun, 19 Sep 2004, David Lang wrote:
5. Active/Active
designate one of the boxes as primary and identify all items in the 
datastore that absolutely must not be subject to race conditions between 
the two boxes (message UUID for example). In addition to implementing the 
replication needed for #1 modify all functions that need to update these 
critical pieces of data to update them on the master and let the master 
update the other box.
We may be talking at cross purposes (and it's entirely likely that I've
got the wrong end of the stick!), but I consider active-active to be
the case where there is no primary: users can make changes to either
system, and if the two systems lose touch with each other they have
to resolve their differences when contact is reestablished.
UUIDs aren't a problem (each machine in a cluster owns its own fraction of 
the address space). Message UIDs are a big problem. I guess in the case of 
conflict, you could bump the UIDvalidity value on a mailbox and reassign UIDs 
for all the messages, using timestamps to determine the eventual ordering of 
messages. Now that I think about it, maybe that's not a totally absurd idea. 
It would involve a lot of work though.
the problem is that when they are both up you have to have one of them 
allocate the message UIDs or you have to change the UIDVALIDITY for every 
new message that arrives.

here is the problem.
  you have a new message created on both servers at the same time. how do 
you allocate the UID without any possibility of stepping on each other?

the only way to do this is to have some sort of locking so that only one 
machine at a time can allocate UIDs. you can shuffle this responsibility 
back and forth between machines, but there's a significant amount of 
overhead in doing this so the usual answer is just to have one machine 
issue the numbers and the other ask the first for a number when it needs 
it.
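
The "one machine issues the numbers" arrangement can be sketched in a few
lines of Python. The names `UidAllocator` and `RemoteNode` are hypothetical,
and the direct method call stands in for what would really be a network
round trip to the allocating machine.

```python
import threading


class UidAllocator:
    """Hypothetical per-mailbox UID source living on one designated machine."""

    def __init__(self, next_uid: int = 1) -> None:
        self._next_uid = next_uid
        self._lock = threading.Lock()

    def allocate(self) -> int:
        # The lock makes "read, hand out, increment" a single step, so two
        # simultaneous deliveries can never receive the same UID.
        with self._lock:
            uid = self._next_uid
            self._next_uid += 1
            return uid


class RemoteNode:
    """The other box: it never picks a UID itself, it asks the allocator."""

    def __init__(self, allocator: UidAllocator) -> None:
        self._allocator = allocator  # stands in for a network round trip

    def deliver(self, message: str):
        return self._allocator.allocate(), message
```

The price is the one named above: every delivery on the non-allocating box
pays a round trip, and the allocator is a single point of failure unless the
role can be handed over.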

changing UIDVALIDITY while recovering from a split-brain is probably 
going to be needed.

but as you say it's a lot of work (which is why I'm advocating the simpler 
options get released first :-)

 Pro:
  best use of available hardware as the load is split almost evenly 
between the boxes.

best availability because if there is a failure half of the clients won't 
see it at all
Actually this is what I do right now by having two live mailstores. Half the 
mailboxes on each system are active, the remainder are passive.
right, but what this would allow is sharing the load on individual 
mailboxes

usually this won't matter, but I could see it for shared mailboxes
David Lang


Re: Funding Cyrus High Availability

2004-09-19 Thread Jure Pečar
On Sun, 19 Sep 2004 00:52:08 -0700 (PDT)
David Lang [EMAIL PROTECTED] wrote:

Nice review of replication ABC :)

Here are my thoughts:

 1. Active-Slave replication with manual failover

This is really the simplest way to do it. Rsync (and friends) does 90% of
the required job here; the only thing it's lacking is the concept of the
mailbox as a unit. It would be nice if our daemon here would do its job in
an atomic way.
A few days ago someone was asking for an event notification system that
would be able to call some program when a certain action happened on a
mailbox. Something like this would come in handy here i think :)

 2. Active-Slave replication with automatic failover

2 is really just 1 + your heartbeat package of choice and some scripts to
tie it all together.

 3. Active-Slave replication with Slave able to accept client connections

I think here it would be good to start thinking about the app itself and
define connections better. Cyrus has three kinds of connections that modify a
mailbox: lmtp that puts new mails into the mailbox, pop that (generally)
retrieves (and deletes) them and imap that does both plus some others (folder
ops and moving mails around).
Now if you decide that it does not hurt you if the slave is a bit out of date
when it accepts a connection (but i guess most of us would find this
unacceptable), you can ditch some of the complexity; but you'd want the
changes that were made on the slave in that connection to propagate up to
the master. I don't really like this, because the concepts of master and
slave get blurred here and things can easily end in a mess.
Once you have mailstores that are synchronizing with each other in a way that
is not very well defined, you'll end up with conflicts sooner or later. There
are some unpredictable factors like network latency that can easily lead you
to unexpected situations.


 4. #3 with automatic failover

Another level of mess over 3 :)

 5. Active/Active

designate one of the boxes as primary and identify all items in the 
 datastore that absolutely must not be subject to race conditions between 
 the two boxes (message UUID for example). In addition to implementing the 
 replication needed for #1 modify all functions that need to update these 
 critical pieces of data to update them on the master and let the master 
 update the other box.

Exactly. This is the atomicity i was mentioning above. I'd say this is going
to be the larger part of the job.

 6. active/active/active/...

This is what most of us would want.
 
 while #6 is the ideal option to have it can get very complex

Despite everything you've said, i still think this *can* be done in a
relatively simple way. See my previous mail where i was dreaming about the
whole ha concept in a raid way.
There i assumed murder as the only agent through which clients would be able
to access their mailboxes. If you think of murder handling all of the jobs
of your daemon in 1-4, one thing that you gain immediately is much simpler
synchronization of actions between the mailstore machines. If you start
empty or with exactly the same data on two machines, all that murder needs
to do is take care that both receive the same commands and data in the same
order.
Also if you put all the logic into one place, backend mailstores need not be
taught any special tricks and can remain pretty much as they are today.
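
The "same commands in the same order" idea can be sketched in Python as a
single ordering point in front of identical backends. This is only an
illustration under heavy assumptions: `Sequencer` stands in for the front
end, `Backend` for a mailstore, and the two commands are simplified
stand-ins rather than real IMAP or LMTP operations.

```python
class Backend:
    """Hypothetical mailstore that applies commands deterministically."""

    def __init__(self, name):
        self.name = name
        self.mailbox = []  # message bodies, in delivery order
        self.last_seq = 0

    def apply(self, seq, command, arg):
        # Refuse anything out of order, so a gap is noticed immediately.
        if seq != self.last_seq + 1:
            raise RuntimeError(
                f"{self.name}: expected seq {self.last_seq + 1}, got {seq}"
            )
        if command == "APPEND":
            self.mailbox.append(arg)
        elif command == "EXPUNGE":
            self.mailbox.remove(arg)
        else:
            raise ValueError(f"unknown command {command!r}")
        self.last_seq = seq


class Sequencer:
    """Stand-in for the front end: one ordering point for all writes."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.seq = 0

    def submit(self, command, arg):
        self.seq += 1
        for backend in self.backends:
            backend.apply(self.seq, command, arg)
        return self.seq
```

What the sketch leaves out is the hard part: if one backend fails halfway
through `submit`, the copies have diverged and something still has to bring
the lagging one back in step.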

Or am i missing something?

 personally I would like to see #1 (with a sample daemon or two to provide 
 basic functionality and leave the doors open for more creative uses) 
 followed by #3 while people try and figure out all the problems with #5 
 and #6

and i would like to see us come to a conclusion here about what kind of ha
setup would be best for all and focus our energy on only one implementation.
I have enough old hardware here (and i'm getting some more in about a month)
that i can set up a nice little test environment. Right now it also looks
like i'll have plenty of time in February - June 2005 so i can volunteer
to be a tester.

 there are a lot of scenarios that are possible with #1 or #3 that are not 
 possible with #5

One i think is a slave of a slave of a slave (...) kind of setup. Does anybody
really need such a setup for mail? I understand it for ldap for example,
there are even some cases where it is useful for a sql database, but i see
no reason to have it for a mail server.


-- 

Jure Pečar
http://jure.pecar.org/



raw access to imap quotas, with mail user

2004-09-19 Thread Felix Cuello

Hello,

   I wrote a small C program to access the quota files without asking cyrus.
   This program runs under the mail group.
   I noticed that when something changes in the mailbox [deleting mails,
   receiving mails, etc] the /var/imap/quota permissions are reset to:

   -rw---  cyrus.mail

   Then mail users can't have read access to these files and my C program
   doesn't have read access to the files.

   Is it possible to change that?

   Thanks a lot,

   Félix
   

-- 
Felix Cuello
[EMAIL PROTECTED]
- 1504 -

Q:  Why do the police always travel in threes?
A:  One to do the reading, one to do the writing, and the other keeps
an eye on the two intellectuals.



Re: Funding Cyrus High Availability

2004-09-19 Thread David Lang
please don't misunderstand my posts. it's not that I don't think that 
active/active/active is possible, it's just that I think it's far more 
complicated.

assuming that the simplest method would cost ~$3000 to code I would make a 
wild guess that the ballpark figures would be

1. active/passive without automatic failover $3k
2. active/passive with automatic failover (limited to two nodes or within 
a murder cluster) $4k

3. active/passive with updates pushed to the master $5k
4. #3 with auto failover (failover not limited to two nodes or a single 
murder cluster) $7k

5. active/active (limited to a single geographic location) $10k
6. active/active/active (no limits) $30k
in addition, automatically re-merging things after a split-brain has 
happened would probably be another $5k

now this doesn't mean that all of these must be done in this funded project. 
I believe that people would end up going from #1 or #3 to #2 or #4 by 
individuals coding the required pieces and sharing them (#4 has as much of 
a jump over #3 because of all the different scenarios that are involved, 
each one is individually simple). however #5 and #6 are significantly more 
difficult and I would not expect them to just happen (they are also much 
more intrusive to the code so there is some possibility of them not 
getting merged into the core code quickly)

David Lang
-- There are two ways of constructing a software design. One way is to 
make it so simple that there are obviously no deficiencies. And the other 
way is to make it so complicated that there are no obvious deficiencies.
 -- C.A.R. Hoare

P.S. #1-4 all could qualify as the first way, #5 and #6 are both 
complicated enough to start with that it is really hard to keep them out 
of the second way


Re: Funding Cyrus High Availability

2004-09-19 Thread Michael Loftis

--On Monday, September 20, 2004 00:43 +0200 Jure Pečar 
[EMAIL PROTECTED] wrote:

On Sun, 19 Sep 2004 00:52:08 -0700 (PDT)
David Lang [EMAIL PROTECTED] wrote:
Nice review of replication ABC :)
Here are my thoughts:
1. Active-Slave replication with manual failover
This is really the simplest way to do it. Rsync (and friends) does 90% of
the required job here; the only thing it's lacking is the concept of the
mailbox as a unit. It would be nice if our daemon here would do its job
in an atomic way.
A few days ago someone was asking for an event notification system that
would be able to call some program when a certain action happened on a
mailbox. Something like this would come in handy here i think :)
we were doing this but really, rsync does not scale well.  when you get 
lots of small files it takes longer to figure out what to transfer than 
it'd take to just transfer almost everything over (assuming a small 768kbit 
to about 1.5mbit link and a mailstore of average-sized messages).  and unless 
you break it up into smaller chunks, it'll gobble up wads of RAM during the 
process.  insane amounts, like well over a gig or so for our mailstore with 
about, hmm, 51GB of mail.  Not exactly sure of the number of files off the 
top of my head, though it could be figured out if wanted.



Re: raw access to imap quotas, with mail user

2004-09-19 Thread Derrick J Brashear
On Sun, 19 Sep 2004, Felix Cuello wrote:
Hello,
  I wrote a small C program to access the quota files without asking cyrus.
  This program runs under the mail group.
  I noticed that when something changes in the mailbox [deleting mails,
  receiving mails, etc] the /var/imap/quota permissions are reset to:
  -rw---  cyrus.mail
  Then mail users can't have read access to these files and my C program
  doesn't have read access to the files.
  Is it possible to change that?
In general programs which access the mail store run as the cyrus user. 
Inasmuch as this should be being done at all, your program should be 
setuid cyrus. I don't think doing this is a good idea in general, though.



Re: raw access to imap quotas, with mail user

2004-09-19 Thread Derrick J Brashear
On Sun, 19 Sep 2004, Felix Cuello wrote:
Inasmuch as this should be being done at all, your program should be
setuid cyrus. I don't think doing this is a good idea in general, though.

Then i wrote some simple C code and compiled it as a Perl package, so i have
direct access to their quota. The apache user is in the mail group [i don't
know why... but that's the truth] and i want to run the Perl script with
apache user rights [and so mail rights]... i don't want to put apache into
the cyrus group, just because i don't want apache to be able to read
mailboxes.
So write simple C code and exec it (and collect the result) from perl?


Re: raw access to imap quotas, with mail user

2004-09-19 Thread Felix Cuello
On Sun, Sep 19, 2004 at 10:15:20PM -0400, Derrick J Brashear wrote:
 In general programs which access the mail store run as the cyrus user. 
 Inasmuch as this should be being done at all, your program should be 
 setuid cyrus. I don't think doing this is a good idea in general, though.

It is a requirement of our Students web portal to show the mailbox usage of
each user. The whole students web portal is written in Perl, but IMAP::Admin
is a little bit slow because it logs in as the cyrus admin [or the user] using
LDAP and... that takes a lot of time.

Then i wrote some simple C code and compiled it as a Perl package, so i have
direct access to their quota. The apache user is in the mail group [i don't
know why... but that's the truth] and i want to run the Perl script with
apache user rights [and so mail rights]... i don't want to put apache into
the cyrus group, just because i don't want apache to be able to read
mailboxes.

That's the whole problem...

thanks,

Félix


-- 
Felix Cuello
[EMAIL PROTECTED]
- 1506 -

A gift of a flower will soon be made to you.



Re: raw access to imap quotas, with mail user

2004-09-19 Thread Derrick J Brashear
On Sun, 19 Sep 2004, Felix Cuello wrote:
So write simple C code and exec it (and collect the result) from perl?
That's the problem... PERL is running as apache access. I wrote C code and
Right, that's why I suggested writing a C program, making it setuid and 
executing, not linking in via xs. Setuid makes it run as who you want, and 
life goes on.



Re: raw access to imap quotas, with mail user

2004-09-19 Thread Carl P. Corliss
Michael Loftis wrote:
[snip]
 Read it once, and then cache the result in the session information (or
 even in a cookie) along with a 'freshness' -- and when the timeout has
 expired, re-check it (say 1 minute, or five).  Same thing with the LDAP
 auth. Re-authing every single page load is not necessary.
Better yet, only update it when you absolutely need to (meaning: only when you 
are checking mail or making a change to your mailbox by deleting, moving or 
renaming). That should work, provided of course that your web portal is 
functioning as a mail client (checking mail/etc) and not -only- interacting 
with imap to retrieve the quota.
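
Both suggestions (a freshness timeout, and refreshing only after a change)
fit in one small cache. The sketch below is in Python rather than the
portal's Perl, purely to show the shape of it. `QuotaCache`, its `fetch`
callable (whatever slow lookup the portal already does) and the 60 second
default are all assumptions.

```python
import time
from typing import Callable, Dict, Tuple


class QuotaCache:
    """Hypothetical per-user quota cache with a freshness timeout."""

    def __init__(
        self,
        fetch: Callable[[str], int],
        max_age: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch      # the slow lookup (IMAP login, LDAP, ...)
        self._max_age = max_age  # seconds a cached value stays fresh
        self._clock = clock      # injectable so it can be tested
        self._entries: Dict[str, Tuple[float, int]] = {}

    def get(self, user: str) -> int:
        """Return the cached quota usage, refetching only when stale."""
        now = self._clock()
        entry = self._entries.get(user)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]
        value = self._fetch(user)
        self._entries[user] = (now, value)
        return value

    def invalidate(self, user: str) -> None:
        """Call after the portal deletes, moves or renames something."""
        self._entries.pop(user, None)
```

In a web portal the entries would live in the session store rather than in
process memory, but the freshness check is the same.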

HTH,
--
Carl