Re: Replication specifics

2006-05-24 Thread Kjetil Torgrim Homme
On Tue, 2006-05-23 at 17:16 -0400, Patrick Radtke wrote:
 On May 23, 2006, at 4:48 PM, David Korpiewski wrote:
  that currently exist only on the defunct master?  If the replica
  updates every 10 seconds, then we have the potential to lose 10
  seconds of email.  Or, worst case, the sync_client dies and we lose
  30 minutes or more of email before we fail over!
 
 Once we have the primary/master backend machine working again after a  
 failover (assuming its RAID is still intact) we do a find for any  
 messages that have timestamps just prior to the machine failing.
 We then compare this list to the messages on the replica.  Since we  
 have delayed expunge on, we can still determine if a specific message  
 was replicated even if the user deleted it.

we use a different approach: our MTA (Exim) delivers a copy of every
message to a separate server with a very simple configuration (no LDAP
lookups to verify addresses or anything).  it just stores the messages
as batched SMTP, one file per user and day.  if anything goes awry, we
can replay (parts of) this file and redeliver the messages.  in most
cases, we do this to supplement the tape backup when users delete all
their e-mail by mistake, and in that case we need to reset Cyrus'
duplicate delivery database, or else the redelivered messages will be
dropped on the floor.  in the incomplete replica scenario, however, the
duplicate delivery database will actually help us avoid duplicating
e-mail from the period of the crash.
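
To make the replay idea concrete, here is a minimal Python sketch of
splitting a batched SMTP file into individual messages and picking the
ones to redeliver.  It is not the tooling described above: the names
parse_bsmtp and messages_to_replay are hypothetical, the file is assumed
to contain plain MAIL FROM / RCPT TO / DATA sequences, and the set of
already-delivered Message-IDs is assumed to come from somewhere else
(for instance a listing of the replica).

```python
from dataclasses import dataclass, field
from email import message_from_string


@dataclass
class BatchedMessage:
    sender: str = ""
    recipients: list = field(default_factory=list)
    body: str = ""


def _address(arg):
    """Extract the address from '<user@host> PARAMS' (or a bare address)."""
    arg = arg.strip()
    if arg.startswith("<") and ">" in arg:
        return arg[1:arg.index(">")]
    return arg.split()[0] if arg else ""


def parse_bsmtp(text):
    """Split batched-SMTP text into messages (envelope plus body)."""
    messages, current, in_data, body = [], None, False, []
    for line in text.splitlines():
        if in_data:
            if line == ".":  # a lone dot ends the DATA section
                current.body = "\n".join(body) + "\n"
                messages.append(current)
                current, in_data, body = None, False, []
            else:
                # undo SMTP dot-stuffing: drop the extra leading dot
                body.append(line[1:] if line.startswith(".") else line)
            continue
        upper = line.upper()
        if upper.startswith("MAIL FROM:"):
            current = BatchedMessage(sender=_address(line[10:]))
        elif upper.startswith("RCPT TO:") and current is not None:
            current.recipients.append(_address(line[8:]))
        elif upper.strip() == "DATA" and current is not None:
            in_data = True
    return messages


def messages_to_replay(messages, already_delivered_ids):
    """Yield messages whose Message-ID is not in the already-delivered set."""
    for msg in messages:
        msg_id = message_from_string(msg.body)["Message-ID"]
        if msg_id is None or msg_id.strip() not in already_delivered_ids:
            yield msg
```

The Message-ID filter is only a convenience for inspecting what would be
redelivered; as noted above, Cyrus' own duplicate suppression would
already drop copies it has seen, unless that database has been reset.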

(we don't use Murder or replication yet, so such replica restoration
hasn't been tried for real.)
-- 
Kjetil T.



Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html


Replication specifics

2006-05-23 Thread David Korpiewski
So I got into a big argument with the people in my department about how 
replication works and I'm seeking some guidance from the community:


(1) The worst fear of any prof here at UMASS is the potential of losing a
single email.  So my question is this: If we set up replication, and
we have to fail over to the replica, is there any way to get back email
that may not have been replicated -- messages that currently exist only
on the defunct master?  If the replica updates every 10 seconds, then we
have the potential to lose 10 seconds of email.  Or, worst case, the
sync_client dies and we lose 30 minutes or more of email before we
fail over!


Do other folks out there plan for this potential for lost email, or do
you just fail over and not worry if a few messages get lost?


(2) Also, is there a master sync transaction log file somewhere that
specifies what is being done?  In other words, if we failed over, could 
we find a transaction log that would tell us what was not committed and 
then manually run through it to make the updates?  I found the log files 
in /var/lib/imap/sync, but these are very uninformative:

for example:
SEEN davidk user.davidk
SEEN davidk user.davidk
SEEN davidk user.davidk

It would be nice to see something like "SEEN update message READ 12020"
for user.davidk.INBOX, but I don't know if this detailed information is
somewhere on the system or just resides in memory.
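
Even without per-message detail, the log lines above do say which users
and mailboxes were touched.  A rough Python sketch for collapsing the
repeats follows; it assumes every line is whitespace-separated with the
action first, as in the three sample lines, and the function name
summarize_sync_log is made up for this example.  I have not checked what
other record types the log may contain.

```python
from collections import Counter


def summarize_sync_log(lines):
    """Count identical (action, arguments) entries in sync log lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        action, args = fields[0], tuple(fields[1:])
        counts[(action, args)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "SEEN davidk user.davidk",
        "SEEN davidk user.davidk",
        "SEEN davidk user.davidk",
    ]
    for (action, args), n in summarize_sync_log(sample).items():
        print(action, " ".join(args), "x", n)
```

The result is only a list of mailboxes worth inspecting after a
failover; it cannot say which messages changed.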



(3) My final question is this: If we do a manual sync_client update, is
the update a full copy or is it a differential copy?  In other words, if
we run sync_client manually, will it overwrite the replica's entire
mailstore, or just find what is different and update those portions?



Thank you kindly
David







--

David Korpiewski Phone: 413-545-4319
Software Specialist IFax:   413-577-2285
Department of Computer Science   ICQ:   7565766
University of Massachusetts Amherst





Re: Replication specifics

2006-05-23 Thread Patrick Radtke


On May 23, 2006, at 4:48 PM, David Korpiewski wrote:

 So I got into a big argument with the people in my department about
 how replication works and I'm seeking some guidance from the
 community:

 (1) The worst fear of any prof here at UMASS is the potential of
 losing a single email.  So my question is this: If we set up
 replication, and we have to fail over to the replica, is there any
 way to get back email that may not have been replicated -- messages
 that currently exist only on the defunct master?  If the replica
 updates every 10 seconds, then we have the potential to lose 10
 seconds of email.  Or, worst case, the sync_client dies and we lose
 30 minutes or more of email before we fail over!




Once we have the primary/master backend machine working again after a  
failover (assuming its RAID is still intact) we do a find for any  
messages that have timestamps just prior to the machine failing.
We then compare this list to the messages on the replica.  Since we  
have delayed expunge on, we can still determine if a specific message  
was replicated even if the user deleted it.
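
For anyone wanting to script the comparison described above, here is a
rough Python sketch.  It is not our actual procedure verbatim: it
assumes both spools are readable from one machine under the same
relative layout, that message files are named "<uid>." in each mailbox
directory, and that file mtime is a good enough stand-in for arrival
time.  The function names and the 30-minute default window are made up
for the example.

```python
import os


def is_message_file(name):
    # Assumption: message files are named "<uid>." (digits plus a trailing
    # dot); anything else, such as index or cache files, is skipped.
    return name.endswith(".") and name[:-1].isdigit()


def recent_messages(spool_root, crash_time, window_seconds):
    """Return {relative_path: mtime} for message files written in the
    window_seconds leading up to crash_time (both in epoch seconds)."""
    earliest = crash_time - window_seconds
    found = {}
    for dirpath, _dirnames, filenames in os.walk(spool_root):
        for name in filenames:
            if not is_message_file(name):
                continue
            path = os.path.join(dirpath, name)
            mtime = os.path.getmtime(path)
            if earliest <= mtime <= crash_time:
                found[os.path.relpath(path, spool_root)] = mtime
    return found


def missing_on_replica(master_root, replica_root, crash_time,
                       window_seconds=1800):
    """List recent master messages with no file at the same relative
    path on the replica."""
    candidates = recent_messages(master_root, crash_time, window_seconds)
    return sorted(rel for rel in candidates
                  if not os.path.exists(os.path.join(replica_root, rel)))
```

With delayed expunge on, a message the user has deleted still has its
file on disk, so the existence check above keeps working for it.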


We also monitor the sync_client process and someone gets alerted if  
it goes away.
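
A minimal sketch of such a check in Python, assuming a Linux host where
/proc/<pid>/comm holds the process name.  The function names and the
alert callback are placeholders for whatever paging or mail mechanism
you already have; this is not the monitoring we actually run.

```python
import os


def process_running(name, proc_root="/proc"):
    """Return True if any process under proc_root has the given name."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():  # only numeric entries are process IDs
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                if fh.read().strip() == name:
                    return True
        except OSError:
            # the process exited between listdir() and open(); ignore it
            continue
    return False


def check_sync_client(alert, proc_root="/proc"):
    """Call alert(message) if no sync_client process is found."""
    if process_running("sync_client", proc_root):
        return True
    alert("sync_client is not running; the replica may be falling behind")
    return False
```

Run from cron every minute or so, something like this would bound how
long a dead sync_client can go unnoticed.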


Of course, some messages can be lost. But the same is true for any of
your SMTP machines: if one suffers a catastrophic failure, any
messages queued on that machine would be lost.


 Do other folks out there plan for this potential for lost email,
 or do you just fail over and not worry if a few messages get lost?


 (2) Also, is there a master sync transaction log file somewhere that
 specifies what is being done?  In other words, if we failed over,
 could we find a transaction log that would tell us what was not
 committed and then manually run through it to make the updates?  I
 found the log files in /var/lib/imap/sync, but these are very
 uninformative:

 for example:
 SEEN davidk user.davidk
 SEEN davidk user.davidk
 SEEN davidk user.davidk

 It would be nice to see something like "SEEN update message READ 12020"
 for user.davidk.INBOX, but I don't know if this detailed information
 is somewhere on the system or just resides in memory.


We look there as well (and back it up first). Then we just look in
the users' folders for the timestamps on messages.




 (3) My final question is this: If we do a manual sync_client
 update, is the update a full copy or is it a differential copy?
 In other words, if we run sync_client manually, will it
 overwrite the replica's entire mailstore, or just find
 what is different and update those portions?


I believe it does a diff (I haven't looked at the code)
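
To be clear about what I mean by "a diff", here is a toy Python
illustration.  It is not taken from the sync_client source, which I have
not read.  It assumes each side can be summarized as a mapping from
message UID to some content digest, and the function name is made up.

```python
def plan_differential_sync(master, replica):
    """Compare two {uid: digest} mappings and return three sorted lists:
    UIDs to copy, UIDs to re-copy, and UIDs to remove from the replica."""
    to_copy = sorted(uid for uid in master if uid not in replica)
    to_update = sorted(uid for uid in master
                       if uid in replica and master[uid] != replica[uid])
    to_remove = sorted(uid for uid in replica if uid not in master)
    return to_copy, to_update, to_remove


if __name__ == "__main__":
    master = {1: "a", 2: "b", 3: "c"}
    replica = {1: "a", 2: "x", 4: "d"}
    print(plan_differential_sync(master, replica))
```

A full copy would transfer every message regardless; a differential
update only touches the UIDs in those three lists.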

-Patrick
