Re: looking for Cyrus mail format documentation

2003-02-06 Thread John Alton Tamplin
Phil Howard wrote:


That would result in doubling the bandwidth on the inside server connection
since it would be dealing with the mail first coming in to the MX, then
being replicated back out to the other server.  By delivering outside mail
to the outside server first, the only bandwidth usage is replicating to
the inside server (reverse the scenario for mail originating inside).
 

Is the cost of bandwidth to your inside server really so expensive as to 
justify the expense of complicated development, hosting an offsite 
server with that much bandwidth, and maintaining a remote system?  It 
really sounds like you are overengineering the problem.

If there was a way to track when the flags got changed.  I feel it's OK
to trust the clocks on the servers, and simply decide which flag state
prevails based on which has the later timestamp.  But I bet that metadata
isn't in the current mailstore design.


No, the time a flag was changed isn't kept.  In fact for seen flags 
which are cached in memory while a mailbox is open, only a single bit is 
kept.

--
John A. Tamplin   Unix System Administrator
Emory University, School of Public Health +1 404/727-9931





Re: looking for Cyrus mail format documentation

2003-02-06 Thread Phil Howard
On Thu, Feb 06, 2003 at 09:58:30AM -0500, John Alton Tamplin wrote:

| Phil Howard wrote:
| 
| That would result in doubling the bandwidth on the inside server connection
| since it would be dealing with the mail first coming in to the MX, then
| being replicated back out to the other server.  By delivering outside mail
| to the outside server first, the only bandwidth usage is replicating to
| the inside server (reverse the scenario for mail originating inside).
|   
| 
| Is the cost of bandwidth to your inside server really so expensive as to 
| justify the expense of complicated development, hosting an offsite 
| server with that much bandwidth, and maintaining a remote system?  It 
| really sounds like you are overengineering the problem.

Under the original plan, the development was not complicated and thus
not expensive.  The new plan changes the picture.


| If there was a way to track when the flags got changed.  I feel it's OK
| to trust the clocks on the servers, and simply decide which flag state
| prevails based on which has the later timestamp.  But I bet that metadata
| isn't in the current mailstore design.
| 
| No, the time a flag was changed isn't kept.  In fact for seen flags 
| which are cached in memory while a mailbox is open, only a single bit is 
| kept.

And hence with a conflict in flags, it's not trivial, maybe impossible,
to resolve.

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-



Re: looking for Cyrus mail format documentation

2003-02-05 Thread Patrick Welche
On Sat, Feb 01, 2003 at 11:31:13AM -0500, Rob Siemborski wrote:
 On Fri, 31 Jan 2003, Phil Howard wrote:
 
  | Of course replicating some things such as seen state will be quite
  | painful, and you may need to do some hacks to keep uids unique between
  | the machines.
 
  How does Cyrus manage uids?  I hope these are not uids in /etc/passwd.
 
 No, they're the unique identifier numbers for each message.  I believe the
 problem John was asking about is, what happens if you have, say, an APPEND
 happen to a mailbox on both servers while they are not in communication
 with each other.
 
 When they resync, each has a new message with the same unique identifier,
 but different contents.  This isn't a situation that can be recovered from
 just by looking at the contents of the filesystem.
 
 Doing replicated IMAP stores (especially geographically distant ones) is
 not an easy problem.

All this sounds remarkably similar to the postgres-r database replication
problem cf nice paper by Bettina Kemme
  http://www.cs.mcgill.ca/~kemme/papers/vldb00.html

Here it would be: a client connects to IMAP server A and says APPEND. Server A
then sends the APPEND to both server A and server B using a group communication
protocol (cf. Spread) which guarantees the ordering of the commands. Server A
and server B then receive the APPEND and apply it. If server B received an APPEND
at nearly the same time, that APPEND would still appear in the same place in
the input queue of both servers => the UID would come out the same. You still
have the hard problem of conflict resolution after network partitioning :(
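
A minimal sketch of the ordering idea above, in Python. The names `Replica` and `Sequencer` are hypothetical, and the sequencer is only an in-process stand-in for a group communication layer such as Spread; none of this is Cyrus code. It shows only that if every replica applies the same APPENDs in the same order, every replica assigns the same UIDs.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One server's view of a single mailbox (hypothetical model)."""
    name: str
    uidnext: int = 1
    messages: dict = field(default_factory=dict)  # uid -> message body

    def apply_append(self, body: str) -> int:
        # Assign the next UID locally; no coordination happens here.
        uid = self.uidnext
        self.messages[uid] = body
        self.uidnext += 1
        return uid


class Sequencer:
    """Stand-in for a totally ordered broadcast: every replica sees
    every APPEND, and all replicas see them in the same order."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def broadcast_append(self, body: str) -> int:
        uids = {r.apply_append(body) for r in self.replicas}
        # Same order everywhere implies the same UID everywhere.
        assert len(uids) == 1
        return uids.pop()


a, b = Replica("A"), Replica("B")
bus = Sequencer([a, b])
uid_from_a = bus.broadcast_append("mail accepted by server A")
uid_from_b = bus.broadcast_append("mail accepted by server B")
```

This only holds while the ordering layer can reach both servers; it does nothing for the partition case discussed in the rest of the thread.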

Just 2 uninformed cents,

Patrick



Re: looking for Cyrus mail format documentation

2003-02-05 Thread John Alton Tamplin
Patrick Welche wrote:


All this sounds remarkably similar to the postgres-r database replication
problem cf nice paper by Bettina Kemme
 http://www.cs.mcgill.ca/~kemme/papers/vldb00.html

Here it would be: a client connects to IMAP server A and says APPEND. Server A
then sends the APPEND to both server A and server B using a group communication
protocol (cf. Spread) which guarantees the ordering of the commands. Server A
and server B then receive the APPEND and apply it. If server B received an APPEND
at nearly the same time, that APPEND would still appear in the same place in
the input queue of both servers => the UID would come out the same. You still
have the hard problem of conflict resolution after network partitioning :(
 

The issue is not when the two servers can talk -- that is easily solved 
with techniques such as two phase commit.  The problem is when server A 
and B are not able to communicate and you want both of them to be able 
to continue taking updates yet build a consistent view of the database 
once they can communicate.

--
John A. Tamplin   Unix System Administrator
Emory University, School of Public Health +1 404/727-9931





Re: looking for Cyrus mail format documentation

2003-02-05 Thread Phil Howard
On Tue, Feb 04, 2003 at 10:49:21AM -0500, Rob Siemborski wrote:

| Well, for one, this requires change to Cyrus, which Phil doesn't seem to
| want to do.

As long as the change is simple, I would not mind doing so.  Making the UID
so UID % NumberOfServers == ServerID holds true should be easy and simple.
Making them catch up when they are in communication might not be.  The
replicator would have to have a means to step the UID.  Maybe that's easy.
I just don't know (yet).
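
A rough sketch of the UID % NumberOfServers == ServerID rule, in Python. The function name `next_uid` is made up for illustration and is not part of Cyrus. "Stepping" the UID after a catch-up is modelled by passing in the highest UID seen from any server.

```python
def next_uid(last_uid: int, server_id: int, num_servers: int) -> int:
    """Return the smallest UID greater than last_uid that satisfies
    UID % num_servers == server_id."""
    if not 0 <= server_id < num_servers:
        raise ValueError("server_id must be in range(num_servers)")
    candidate = last_uid + 1
    # Distance from candidate up to the next UID in this server's residue class.
    offset = (server_id - candidate) % num_servers
    return candidate + offset


# Two servers: one only ever hands out even UIDs, the other only odd ones.
even_server = next_uid(100, 0, 2)
odd_server = next_uid(100, 1, 2)

# Catch-up: after seeing UID 107 from the peer, the even server steps past it.
even_after_sync = next_uid(107, 0, 2)
```

The UIDs never collide, but as noted elsewhere in the thread this does not address flag changes.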


| Also, it still doesn't solve the problem of flag changes.

And that certainly can be an issue.

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-



Re: looking for Cyrus mail format documentation

2003-02-05 Thread Phil Howard
On Wed, Feb 05, 2003 at 11:41:12AM +0900, Mark Keasling wrote:

| It sounds like you may need to design a distributed mailstore that will
| satisfy both your requirements and those of IMAP and then implement a
| server around that mailstore.

That was my original plan to do on top of Maildir.  But that plan
had flaws not only in that Cyrus didn't support it (some other needs
suggested Cyrus to be a better solution), but that Maildir is not
really very good for an IMAP mailstore for performance reasons (also
important).

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-



Re: looking for Cyrus mail format documentation

2003-02-05 Thread Phil Howard
On Tue, Feb 04, 2003 at 09:35:34PM -0800, David Lang wrote:

| you stated that you want to have the outside box act as a secondary MX for
| the inside one, if you do this and accept the extra bandwidth used then
| you could still do this and have the mail only delivered to the inside box
| and then replicated out to the outside one.

That would result in doubling the bandwidth on the inside server connection
since it would be dealing with the mail first coming in to the MX, then
being replicated back out to the other server.  By delivering outside mail
to the outside server first, the only bandwidth usage is replicating to
the inside server (reverse the scenario for mail originating inside).


| this doesn't solve the problem of changing flags, but does solve the
| problem of getting the messages in correctly.
| 
| for the flags the real question is do you HAVE to allow them to be updated
| when the primary can't be reached? or can your users tolerate being able
| to see their mail, but not have the flags change if you have a connection
| problem? (or possibly allow some flags to be changed and queued up, seen
| flags can be reconciled by changing both sides to the OR of the two
| when they reconnect, deletes can be queued and processed later, etc)

If there was a way to track when the flags got changed.  I feel it's OK
to trust the clocks on the servers, and simply decide which flag state
prevails based on which has the later timestamp.  But I bet that metadata
isn't in the current mailstore design.
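
For illustration only, a Python sketch of the "later timestamp wins" rule. It assumes a per-flag change time is stored with each flag, which the current mailstore does not do, so the `(is_set, changed_at)` layout and the name `merge_flags_lww` are hypothetical. Ties are broken in favour of the flag being set so that both servers reach the same answer.

```python
def merge_flags_lww(side_a: dict, side_b: dict) -> dict:
    """Merge two flag maps of the form {flag: (is_set, changed_at)}.

    For each flag the entry with the later changed_at wins; on equal
    timestamps the entry with is_set == True wins, so the result does
    not depend on which server runs the merge."""
    merged = {}
    for flag in side_a.keys() | side_b.keys():
        candidates = [side[flag] for side in (side_a, side_b) if flag in side]
        merged[flag] = max(candidates, key=lambda entry: (entry[1], entry[0]))
    return merged


inside = {"\\Seen": (True, 100), "\\Flagged": (False, 250)}
outside = {"\\Seen": (False, 180), "\\Answered": (True, 90)}
merged = merge_flags_lww(inside, outside)
```

The result is only as good as the clocks: if one server's clock runs ahead, its flag changes win regardless of the real order.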

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-



Re: looking for Cyrus mail format documentation

2003-02-04 Thread Phil Howard
On Tue, Feb 04, 2003 at 02:16:36AM -0500, Rob Siemborski wrote:

| On Tue, 4 Feb 2003, Phil Howard wrote:
| 
|  Does the RFC say that the IMAP UIDs have to be the file name?
| 
| No, of course not.
| 
|  Do the IMAP UIDs have to be the same between different sessions?
| 
| They cannot change without also changing the UIDVALIDITY of the mailbox,
| which is an expensive operation for disconnected clients (it forces them
| to resync)
| 
| So yes, every time you need to resync, you can increment the uidvalidity,
| but your disconnected users are going to hate you for it, and this isn't a
| tremendously good solution for the real world (where temporary outages
| between distant nodes is the norm).

So the message with UID 123 during one session has to still have UID 123
during the next session.  That indeed will break the ability to have
unique remote synchronization.

What's curious to me is how, with a Maildir format, that IMAP could be
implemented to retain that state without either storing some extra data
or updating the files in place.  I had thought that real unique message
IDs were the same as in RFC 822.  I didn't read RFC 2060 because I had
been talked out of implementing my own IMAP daemon.  But I guess I
should have read it, anyway, to understand its limitations.  Probably
better do that soon before I design something else that can't work :-)

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-



Re: looking for Cyrus mail format documentation

2003-02-04 Thread Mike Brodbelt
Rob Siemborski wrote:
 On Sat, 1 Feb 2003, Phil Howard wrote:
| Doing replicated IMAP stores (especially geographically distant ones) is
| not an easy problem.

It's easy if every message is a separate file.
 
 This is not true.  It has nothing to do with the implementation of the
 mailstore.

I've never actually had the need to try this, so caveat emptor, but I've
heard of people having some success replicating Cyrus mailstores with
drbd - http://www.complang.tuwien.ac.at/reisner/drbd/.

There are limitations to this approach which may make it non-viable for
you - your machines would need to be Linux hosts, and you'd need a
dedicated network link, but it's the most promising approach I've seen
to simple mailstore replication between two machines so far.

Mike.




Re: looking for Cyrus mail format documentation

2003-02-04 Thread Phil Howard
On Tue, Feb 04, 2003 at 10:19:27AM +, Mike Brodbelt wrote:

| Rob Siemborski wrote:
|  On Sat, 1 Feb 2003, Phil Howard wrote:
| | Doing replicated IMAP stores (especially geographically distant ones) is
| | not an easy problem.
| 
| It's easy if every message is a separate file.
|  
|  This is not true.  It has nothing to do with the implementation of the
|  mailstore.
| 
| I've never actually had the need to try this, so caveat emptor, but I've
| heard of people having some success replicating Cyrus mailstores with
| drbd - http://www.complang.tuwien.ac.at/reisner/drbd/.
| 
| There are limitations to this approach which may make it non-viable for
| you - your machines would need to be Linux hosts, and you'd need a
| dedicated network link, but it's the most promising approach I've seen
| to simple mailstore replication between two machines so far.

I wonder how well that method of replication works when both nodes
cannot reach each other, and both are doing updates.  And I wonder
just how much bandwidth it uses.  Suppose one node is connected over
a 28.8k analog modem connection because for the rate of SMTP flow
in and out, it's enough.  Now adding replication delta, will it be
any more than what is locally added to the mailstore?  I would think
the block layer replication would blindly replicate every block on
the device that gets changed, rather than a small piece of metadata
that would only need to be sent if the replicator understands the
meaning of the change.

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-



Re: looking for Cyrus mail format documentation

2003-02-04 Thread Phil Howard
On Tue, Feb 04, 2003 at 07:20:57PM +0900, Mark Keasling wrote:

| Hi,
| 
| On Tue, 4 Feb 2003 03:19:12 -0600, Phil Howard [EMAIL PROTECTED] wrote...
|  On Tue, Feb 04, 2003 at 02:16:36AM -0500, Rob Siemborski wrote:
|  
|  | On Tue, 4 Feb 2003, Phil Howard wrote:
|  | 
|  |  Does the RFC say that the IMAP UIDs have to be the file name?
|  | 
|  | No, of course not.
|  | 
|  |  Do the IMAP UIDs have to be the same between different sessions?
|  | 
|  | They cannot change without also changing the UIDVALIDITY of the mailbox,
|  | which is an expensive operation for disconnected clients (it forces them
|  | to resync)
|  | 
|  | So yes, every time you need to resync, you can increment the uidvalidity,
|  | but your disconnected users are going to hate you for it, and this isn't a
|  | tremendously good solution for the real world (where temporary outages
|  | between distant nodes is the norm).
|  
|  So the message with UID 123 during one session has to still have UID 123
|  during the next session.  That indeed will break the ability to have
|  unique remote synchronization.
| 
| Not really, it just makes it much more complex.  When the machines
| reconnect, they would need to have a way to determine if the same uid
| was referring to two different messages.  If so, appending those messages
| to the end of the folder with new uids and deleting the old uid would
| re-uniquify the messages.  You may want to do that with all of the
| messages that two machines received while they were incommunicado.
| Essentially, the machines would have to remember their UIDNEXT value at
| the time that communication was lost.  Once communication was reestablished,
| all of the messages from the old UIDNEXT up to the current UIDNEXT held
| by each machine would need to be moved (append and delete) to the end of
| the folder.  The new uids would start from the largest UIDNEXT held by
| either machine. I think something like that would work, at least in theory.

I guess moving them all as a group would help retain the apparent order.
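
The renumbering procedure quoted above could look roughly like this in Python. This is a sketch under stated assumptions: each mailbox is modelled as a dict of uid -> message, both servers are assumed to still agree on everything below the UIDNEXT they shared when contact was lost, and the function name `renumber_after_partition` is hypothetical.

```python
def renumber_after_partition(side_a: dict, side_b: dict, uidnext_at_split: int):
    """Move every message that arrived during the partition to the end of
    the mailbox under a fresh UID. Returns (merged_mailbox, new_uidnext)."""
    # Messages below the remembered UIDNEXT are assumed identical on both sides.
    merged = {uid: msg for uid, msg in side_a.items() if uid < uidnext_at_split}

    # New UIDs start from the largest UIDNEXT held by either machine.
    uidnext_a = max(max(side_a, default=0) + 1, uidnext_at_split)
    uidnext_b = max(max(side_b, default=0) + 1, uidnext_at_split)
    new_uid = max(uidnext_a, uidnext_b)

    # Append each side's partition-time messages as a group, in their old order.
    for side in (side_a, side_b):
        for uid in sorted(side):
            if uid >= uidnext_at_split:
                merged[new_uid] = side[uid]
                new_uid += 1
    return merged, new_uid


inside = {1: "m1", 2: "m2", 3: "inside-new"}
outside = {1: "m1", 2: "m2", 3: "outside-new-1", 4: "outside-new-2"}
mailbox, uidnext = renumber_after_partition(inside, outside, uidnext_at_split=3)
```

A client that cached UID 3 from either server during the outage will find that UID gone afterwards, which is the "append and delete" cost of this approach.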


| The main problem would be if the two machines appeared to users as
| the same machine due to DNS trickery and a user could connect to either
| one but the machines still could not connect to each other.  (One leg out
| of the triangle)  In that case, the user's client could see the mailbox
| having one message one time and a different message the next even though
| the uid was the same.  So if the client connected, fetched enough info to
| make a message cache list and then disconnected.  If the next time it
| connected to the other machine to fetch the message, the result would not
| be what was expected.

The intent was to have a faster machine for road warrior users to connect
to while the office is connected via a slow 28.8k analog modem.  The idea
is to avoid overloading the 28.8k line.  And to further help doing that,
the outside server would be the internet's MX host (fallback to the inside
server anyway), whereas the intranet would see the MX hosts the other way
around.  Mail from the internet would be delivered on the outside server
and replicated in.  Mail from the intranet would be delivered on the inside
server and replicated out.

One thing I was thinking of would be a hack to make one server always use
only odd UIDs, and the other always use only even UIDs, and to do catchups
while they are reachable with each other.  But this is getting into hacking
code I know nothing about, yet.  Maybe a later time.

My original design was for a Maildir based mailstore, and would work at the
file replication level (somewhat like rsync, but with some differences to
handle it two-ways).

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-



Re: looking for Cyrus mail format documentation

2003-02-04 Thread Henrique de Moraes Holschuh
On Tue, 04 Feb 2003, Phil Howard wrote:
 I wonder how well that method of replication works when both nodes
 cannot reach each other, and both are doing updates.  And I wonder

They don't.  If they cannot reach each other, at most one of them must allow
updates.

-- 
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh



Re: looking for Cyrus mail format documentation

2003-02-04 Thread John Alton Tamplin
Henrique de Moraes Holschuh wrote:


On Tue, 04 Feb 2003, Phil Howard wrote:
 

I wonder how well that method of replication works when both nodes
cannot reach each other, and both are doing updates.  And I wonder
   

They don't.  If they cannot reach each other, at most one of them must allow
updates.
 

Which, as far as I know, is how all commercial databases also do 
replication.  Only those servers involved in a quorum (even if it is 
because the primary is treated as having more weight) can accept updates 
-- the other replicas can only be read-only.  Trying to allow arbitrary 
updates during disconnected operation and then merge the results into a 
consistent and deterministic state is not just hard, but impossible. 
For example, if you allow flags to be set differently on the same 
message on each node, there is no way to resolve the two updates without 
discarding at least some of the work of one of the sessions. If you 
carefully constrain the updates that can be performed during 
disconnected operation (and for consistency that means even not in 
disconnected operation), then you can transform the problem from 
impossible to merely extremely difficult.
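
As a small illustration of the quorum rule, here is a Python sketch of a weighted-majority check. The function name and the weights are made up for the example; giving the primary a weight of 2 and the secondary a weight of 1 models "the primary is treated as having more weight".

```python
def may_accept_updates(reachable_weights, total_weight: int) -> bool:
    """A node may accept updates only if the nodes it can currently
    reach (itself included) hold a strict majority of the total weight."""
    return 2 * sum(reachable_weights) > total_weight


PRIMARY, SECONDARY = 2, 1
TOTAL = PRIMARY + SECONDARY

both_connected = may_accept_updates([PRIMARY, SECONDARY], TOTAL)
primary_alone = may_accept_updates([PRIMARY], TOTAL)
secondary_alone = may_accept_updates([SECONDARY], TOTAL)
```

With these weights at most one side of a two-server partition can ever take writes, so there is nothing to merge afterwards; the isolated secondary stays read-only.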

--
John A. Tamplin   Unix System Administrator
Emory University, School of Public Health +1 404/727-9931





Re: looking for Cyrus mail format documentation

2003-02-04 Thread Phil Howard
On Tue, Feb 04, 2003 at 09:51:21AM -0500, John Alton Tamplin wrote:

| Henrique de Moraes Holschuh wrote:
| 
| On Tue, 04 Feb 2003, Phil Howard wrote:
|   
| 
| I wonder how well that method of replication works when both nodes
| cannot reach each other, and both are doing updates.  And I wonder
| 
| 
| They don't.  If they cannot reach each other, at most one of them must allow
| updates.
|   
| 
| Which, as far as I know, is how all commercial databases also do 
| replication.  Only those servers involved in a quorum (even if it is 
| because the primary is treated as having more weight) can accept updates 
| -- the other replicas can only be read-only.  Trying to allow arbitrary 
| updates during disconnected operation and then merge the results into a 
| consistent and deterministic state is not just hard, but impossible. 
|  For example, if you allow flags to be set differently on the same 
| message on each node, there is no way to resolve the two updates without 
| discarding at least some of the work of one of the sessions. If you 
| carefully constrain the updates that can be performed during 
| disconnected operation (and for consistency that means even not in 
| disconnected operation), then you can transform the problem from 
| impossible to merely extremely difficult.

However, had message IDs been the RFC822 message ID, then it would have
been possible to receive new mail on each node.  Since the IDs would not
collide, that would be OK.  If you did get the same ID delivered to both
somehow, it should be the same mail.  If not, it violates the RFC, so
then you decide to replace one of them or change one of them and move on.

Deleted mail might be resurrected.

But certainly many other things can be a problem, and in the database
world, replicators can't really know a lot of the particulars about the
data involved, and wouldn't readily be able to do the things you could
get away with for email.

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-



Re: looking for Cyrus mail format documentation

2003-02-04 Thread Kendrick Vargas
I've been sorta following this thread, and I don't claim to know a whole 
lot about programming, but I'm wondering why something simple like what 
I'm about to suggest wouldn't work... Here goes:

If a group of servers are gonna be in constant communication, why not just 
have each server assign UID's in increments of the number of servers in 
the group? For example, if you have 3 servers, and they realize they've 
disconnected, each of the servers could number in increments of 3 starting 
from different points.

Assuming the last UID was 502, server 1 would receive messages and number
them as 503, 506, 509, etc... Server 2 would number at 504, 507, 510, etc. 
And Server 3 would number at 505, 508, 511, etc. When the servers 
re-connect, the file numbers would be unique, the biggest issue you'd run 
into would be that the incoming sort order will be a little off, and not 
really. If the user sets their MUA to sort on date, it would be even less 
of an issue.

Why wouldn't something like this work? I mean, assuming that there's some 
logic involved in the software, the individual servers could figure out 
which incremented number they could use as their UID next. 
-peace

On Tue, 4 Feb 2003, Mark Keasling wrote:

 On Tue, 4 Feb 2003 03:19:12 -0600, Phil Howard [EMAIL PROTECTED] wrote...

  On Tue, Feb 04, 2003 at 02:16:36AM -0500, Rob Siemborski wrote:
  
  | On Tue, 4 Feb 2003, Phil Howard wrote:
  | 
  |  Does the RFC say that the IMAP UIDs have to be the file name?
  | 
  | No, of course not.
  | 
  |  Do the IMAP UIDs have to be the same between different sessions?
  | 
  | They cannot change without also changing the UIDVALIDITY of the mailbox,
  | which is an expensive operation for disconnected clients (it forces them
  | to resync)
  | 
  | So yes, every time you need to resync, you can increment the uidvalidity,
  | but your disconnected users are going to hate you for it, and this isn't a
  | tremendously good solution for the real world (where temporary outages
  | between distant nodes is the norm).
  
  So the message with UID 123 during one session has to still have UID 123
  during the next session.  That indeed will break the ability to have
  unique remote synchronization.
 
 Not really, it just makes it much more complex.  When the machines
 reconnect, they would need to have a way to determine if the same uid
 was referring to two different messages.  If so, appending those messages
 to the end of the folder with new uids and deleting the old uid would
 re-uniquify the messages.  You may want to do that with all of the
 messages that two machines received while they were incommunicado.
 Essentially, the machines would have to remember their UIDNEXT value at
 the time that communication was lost.  Once communication was reestablished,
 all of the messages from the old UIDNEXT up to the current UIDNEXT held
 by each machine would need to be moved (append and delete) to the end of
 the folder.  The new uids would start from the largest UIDNEXT held by
 either machine. I think something like that would work, at least in theory.
 
 The main problem would be if the two machines appeared to users as
 the same machine due to DNS trickery and a user could connect to either
 one but the machines still could not connect to each other.  (One leg out
 of the triangle)  In that case, the user's client could see the mailbox
 having one message one time and a different message the next even though
 the uid was the same.  So if the client connected, fetched enough info to
 make a message cache list and then disconnected.  If the next time it
 connected to the other machine to fetch the message, the result would not
 be what was expected.
 
  What's curious to me is how, with a Maildir format, that IMAP could be
  implemented to retain that state without either storing some extra data
  or updating the files in place.  I had thought that real unique message
  IDs were the same as in RFC 822.  I didn't read RFC 2060 because I had
  been talked out of implementing my own IMAP daemon.  But I guess I
  should have read it, anyway, to understand its limitations.  Probably
  better do that soon before I design something else that can't work :-)
  
  -- 
  -
  | Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
  | [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
  -
 
 
 Regards,
 Mark Keasling [EMAIL PROTECTED]
 
 

-- 
Let he who is without clue kiss my ass




Re: looking for Cyrus mail format documentation

2003-02-04 Thread Jason Fesler
 However, had message IDs been the RFC822 message ID, then it would have
 been possible to receive new mail on each node.  Since the IDs would not
 collide, that would be OK.  If you did get the same ID delivered to both
 somehow, it should be the same mail.  If not, it violates the RFC, so
 then you decide to replace one of them or change one of them and move on.

Perhaps the same mail, but with different Received: headers, and possibly
with mailing list software headers added.  IMO, Message-Id:  is not
strong enough of a UIDL, *especially* for list owners.



Re: looking for Cyrus mail format documentation

2003-02-04 Thread Mark Keasling
Hi,

On Tue, 4 Feb 2003 05:57:53 -0600, Phil Howard [EMAIL PROTECTED] wrote...
 One thing I was thinking of would be a hack to make one server always use
 only odd UIDs, and the other always use only even UIDs, and to do catchups
 while they are reachable with each other.  But this is getting into hacking
 code I know nothing about, yet.  Maybe a later time.

Using odd and even UIDs and merging would be a novel solution.  However,
it may be `too' novel.  There is an IMAP requirement that doesn't permit
merging (reordering).  Messages can only be appended to the end of the
mailbox because new messages must get a UID >= UIDNEXT.  The situation
is never permitted to occur where the client opens a mailbox which has
UIDs 1,3,5 and then at some time later it has UIDs 1,2,3,4,5 without a
change to UIDVALIDITY.  When UIDVALIDITY changes (which should be larger
than before), the client knows that what it knows about UIDs is now useless.
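
The constraint can be expressed as a small check, sketched here in Python. The function name `violates_uid_rule` is hypothetical; it takes what a client saw in an earlier session and what the mailbox contains now, with UIDVALIDITY unchanged, and reports whether a message has appeared below the UIDNEXT the client was told about.

```python
def violates_uid_rule(seen_uids, seen_uidnext: int, current_uids) -> bool:
    """True if the mailbox now holds a UID that is below the UIDNEXT the
    client saw earlier but was not present at that time. Expunged
    messages (UIDs that disappeared) are fine and are not flagged."""
    earlier = set(seen_uids)
    return any(uid < seen_uidnext and uid not in earlier for uid in current_uids)


# Odd-only server seen first, then the even UIDs get merged in behind it.
merged_in = violates_uid_rule([1, 3, 5], 6, [1, 2, 3, 4, 5])

# The same new mail appended at the end instead.
appended = violates_uid_rule([1, 3, 5], 6, [1, 3, 5, 6, 7])
```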

 around.  Mail from the internet would be delivered on the outside server
 and replicated in.  Mail from the intranet would be delivered on the inside
 server and replicated out.

It could be left at this point; but, I'm sure people would want mail in the
same order at both places and changes to be reflected in both places...

 My original design was for a Maildir based mailstore, and would work at the
 file replication level (somewhat like rsync, but with some differences to
 handle it two-ways).

It sounds like you may need to design a distributed mailstore that will
satisfy both your requirements and those of IMAP and then implement a
server around that mailstore.

Regards,
Mark Keasling [EMAIL PROTECTED]




Re: looking for Cyrus mail format documentation

2003-02-04 Thread David Lang
you stated that you want to have the outside box act as a secondary MX for
the inside one, if you do this and accept the extra bandwidth used then
you could still do this and have the mail only delivered to the inside box
and then replicated out to the outside one.

this doesn't solve the problem of changing flags, but does solve the
problem of getting the messages in correctly.

for the flags the real question is do you HAVE to allow them to be updated
when the primary can't be reached? or can your users tolerate being able
to see their mail, but not have the flags change if you have a connection
problem? (or possibly allow some flags to be changed and queued up, seen
flags can be reconciled by changing both sides to the OR of the two
when they reconnect, deletes can be queued and processed later, etc)
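
A minimal Python sketch of that reconciliation, assuming the seen state is available as a set of UIDs per server and deletes made during the outage were queued as a set of UIDs. The function name `reconcile` and this data layout are illustrative only, not how Cyrus stores seen state.

```python
def reconcile(uids, seen_inside, seen_outside, queued_deletes):
    """Apply queued deletes, then set the seen state on both sides to
    the OR (set union) of what each side recorded."""
    remaining = set(uids) - set(queued_deletes)
    # Union of the two seen sets, restricted to messages that still exist.
    seen = (set(seen_inside) | set(seen_outside)) & remaining
    return remaining, seen


remaining, seen = reconcile(
    uids={1, 2, 3, 4},
    seen_inside={1, 2},
    seen_outside={2, 3},
    queued_deletes={3},
)
```

One limitation of the OR rule: a message that a user marked unread again on one server will come back as seen if the other server still has it marked seen.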

there are some usage patterns here that can narrow the scope of the
limitation more than the generic database two-way-sync problem

David Lang

 On Tue, 4 Feb 2003, Phil Howard wrote:

 Date: Tue, 4 Feb 2003 03:19:12 -0600
 From: Phil Howard [EMAIL PROTECTED]
 To: Rob Siemborski [EMAIL PROTECTED]
 Cc: [EMAIL PROTECTED]
 Subject: Re: looking for Cyrus mail format documentation

 On Tue, Feb 04, 2003 at 02:16:36AM -0500, Rob Siemborski wrote:

 | On Tue, 4 Feb 2003, Phil Howard wrote:
 |
 |  Does the RFC say that the IMAP UIDs have to be the file name?
 |
 | No, of course not.
 |
 |  Do the IMAP UIDs have to be the same between different sessions?
 |
 | They cannot change without also changing the UIDVALIDITY of the mailbox,
 | which is an expensive operation for disconnected clients (it forces them
 | to resync)
 |
 | So yes, every time you need to resync, you can increment the uidvalidity,
 | but your disconnected users are going to hate you for it, and this isn't a
 | tremendously good solution for the real world (where temporary outages
 | between distant nodes is the norm).

 So the message with UID 123 during one session has to still have UID 123
 during the next session.  That indeed will break the ability to have
 unique remote synchronization.

 What's curious to me is how, with a Maildir format, that IMAP could be
 implemented to retain that state without either storing some extra data
 or updating the files in place.  I had thought that real unique message
 IDs were the same as in RFC 822.  I didn't read RFC 2060 because I had
 been talked out of implementing my own IMAP daemon.  But I guess I
 should have read it, anyway, to understand its limitations.  Probably
 better do that soon before I design something else that can't work :-)

 --
 -
 | Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
 | [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
 -




Re: looking for Cyrus mail format documentation

2003-02-03 Thread Phil Howard
On Mon, Feb 03, 2003 at 09:18:47AM -0500, Rob Siemborski wrote:

| On Sun, 2 Feb 2003, Phil Howard wrote:
| 
|  Apparently the way Cyrus does it, there are problems.  But that does
|  not mean it cannot be done in general.  By keeping a sequential number
|  and naming the files by that number alone, of course there can be
|  collisions.  If the original design of the mailstore required being
|  able to do two-way replication reliably, it would be a matter of
|  making the file names be more unique, such as using a timestamp plus
|  hostname.
| 
| How?  IMAP UIDs are defined as strictly increasing integers.  See RFC 2060
| section 2.3.1.1.  This has nothing to do with Cyrus's implementation.

Does the RFC say that the IMAP UIDs have to be the file name?

Do the IMAP UIDs have to be the same between different sessions?


| You can, of course, disable updates when you don't have a quorum of
| replicated servers, but I don't think this is what you're asking for.


No.  There would be 2 such servers, and periodic unreachability.
There would need to be an ability to add new mail separately on
each while still avoiding message ID collisions.  If there is no
need to keep the same ID mapping between sessions, then there may
be a way to do it by mapping.

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-



Re: looking for Cyrus mail format documentation

2003-02-03 Thread Rob Siemborski
On Tue, 4 Feb 2003, Phil Howard wrote:

 Does the RFC say that the IMAP UIDs have to be the file name?

No, of course not.

 Do the IMAP UIDs have to be the same between different sessions?

They cannot change without also changing the UIDVALIDITY of the mailbox,
which is an expensive operation for disconnected clients (it forces them
to resync).

So yes, every time you need to resync, you can increment the uidvalidity,
but your disconnected users are going to hate you for it, and this isn't a
tremendously good solution for the real world (where temporary outages
between distant nodes are the norm).
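
To make the cost concrete, here is a minimal Python sketch of what a disconnected client has to do when it reconnects. The `ClientCache` class and its `resync` method are hypothetical names made up for illustration; they are not part of Cyrus or of any IMAP library. The only rule taken from the protocol is that cached UIDs are meaningless once UIDVALIDITY differs.

```python
from dataclasses import dataclass, field


@dataclass
class ClientCache:
    """Hypothetical offline cache a disconnected client keeps for one mailbox."""
    uidvalidity: int
    bodies: dict = field(default_factory=dict)  # UID -> cached message text

    def resync(self, server_uidvalidity: int, server_uids: set) -> set:
        """Return the set of UIDs that must be downloaded from the server."""
        if server_uidvalidity != self.uidvalidity:
            # The old UIDs no longer identify anything: discard everything.
            self.bodies.clear()
            self.uidvalidity = server_uidvalidity
        return server_uids - set(self.bodies)


cache = ClientCache(uidvalidity=1, bodies={1: "first", 2: "second"})

# Same UIDVALIDITY: only the message the client has not seen is fetched.
unchanged = cache.resync(1, {1, 2, 3})

# Bumped UIDVALIDITY: the whole mailbox has to be fetched again.
bumped = cache.resync(2, {1, 2, 3})

print(sorted(unchanged), sorted(bumped))
```

With a large mailbox on a slow link, the second case is the one users would notice.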

-Rob

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Rob Siemborski * Andrew Systems Group * Cyert Hall 207 * 412-268-7456
Research Systems Programmer * /usr/contributed Gatekeeper




Re: looking for Cyrus mail format documentation

2003-02-02 Thread Henrique de Moraes Holschuh
On Sat, 01 Feb 2003, Phil Howard wrote:
 | Doing replicated IMAP stores (especially geographically distant ones) is
 | not an easy problem.
 
 It's easy if every message is a separate file.

Unless the UIDs are UUIDs, it is NOT simple.  And Cyrus does not use a UUID
for every message, but rather a UID that is unique to that mailbox: an
incremental integer.

-- 
  One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie. -- The Silicon Valley Tarot
  Henrique Holschuh



Re: looking for Cyrus mail format documentation

2003-02-02 Thread Rob Siemborski
On Sat, 1 Feb 2003, Phil Howard wrote:

 So this new message would be appended to the same FILE?  That sounds
 more like the old UNIX mailbox format.

No.  Same mailbox.

Two servers are in sync, both with a UIDNEXT of 1000 for a particular
mailbox.  They suffer a netsplit and both have an APPEND happen.
Regardless of the mailstore implementation, they now both have a different
concept of what UID 1000 is.
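
The scenario can be sketched in a few lines of Python. This is a toy model, not Cyrus code: the `Replica` class and `conflicting_uids` function are invented for illustration, and the only assumption carried over from the discussion is that each server hands out the next integer UID on its own.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy model of one server's copy of a single mailbox."""
    name: str
    uidnext: int = 1000
    messages: dict = field(default_factory=dict)  # UID -> message body

    def append(self, body: str) -> int:
        """Assign the next UID locally, as an isolated server would."""
        uid = self.uidnext
        self.messages[uid] = body
        self.uidnext += 1
        return uid


def conflicting_uids(a: Replica, b: Replica) -> list:
    """Return UIDs that both replicas assigned to different content."""
    shared = a.messages.keys() & b.messages.keys()
    return sorted(uid for uid in shared if a.messages[uid] != b.messages[uid])


inside = Replica("inside")
outside = Replica("outside")

# During the netsplit each server accepts one APPEND on its own.
inside.append("mail delivered to the inside server")
outside.append("mail delivered to the outside server")

print(conflicting_uids(inside, outside))  # prints [1000]
```

Nothing in the model says how the messages are stored on disk, which is why the conflict does not depend on the mailstore format.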

 | Doing replicated IMAP stores (especially geographically distant ones) is
 | not an easy problem.

 It's easy if every message is a separate file.

This is not true.  It has nothing to do with the implementation of the
mailstore.

-Rob

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Rob Siemborski * Andrew Systems Group * Cyert Hall 207 * 412-268-7456
Research Systems Programmer * /usr/contributed Gatekeeper




Re: looking for Cyrus mail format documentation

2003-02-02 Thread Phil Howard
On Sun, Feb 02, 2003 at 08:20:03PM -0500, Rob Siemborski wrote:

| On Sat, 1 Feb 2003, Phil Howard wrote:
| 
|  So this new message would be appended to the same FILE?  That sounds
|  more like the old UNIX mailbox format.
| 
| No.  Same mailbox.
| 
| Two servers are in sync, both with a UIDNEXT of 1000 for a particular
| mailbox.  They suffer a netsplit and both have an APPEND happen,
| regardless of the mailstore implementation, they now both have a different
| concept of what UID 1000 is.
| 
|  | Doing replicated IMAP stores (especially geographically distant ones) is
|  | not an easy problem.
| 
|  It's easy if every message is a separate file.
| 
| This is not true.  It has nothing to do with the implementation of the
| mailstore.

Apparently the way Cyrus does it, there are problems.  But that does
not mean it cannot be done in general.  By keeping a sequential number
and naming the files by that number alone, of course there can be
collisions.  If the original design of the mailstore required being
able to do two-way replication reliably, it would be a matter of
making the file names be more unique, such as using a timestamp plus
hostname.
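
As a rough sketch of the naming idea, the Python below builds a Maildir-style name from a timestamp, the process ID, a per-process counter, and the hostname. The `unique_name` function and the example hostnames are made up for illustration. It only shows that file names can be kept distinct across two servers; the IMAP UID is a separate value and is not addressed here.

```python
import itertools
import os
import socket
import time

# Per-process counter so two deliveries in the same second still differ.
_counter = itertools.count()


def unique_name(hostname=None, now=None):
    """Build a file name of the form <seconds>.P<pid>Q<counter>.<hostname>."""
    host = hostname or socket.gethostname()
    stamp = int(now if now is not None else time.time())
    return f"{stamp}.P{os.getpid()}Q{next(_counter)}.{host}"


# Two servers delivering in the same second get different names,
# and so do two deliveries on the same server.
a = unique_name("inside.example", now=1044230400)
b = unique_name("outside.example", now=1044230400)
c = unique_name("inside.example", now=1044230400)

print(a)
print(b)
print(c)
```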

So basically it comes down to, this isn't possible with Cyrus without
major hacking.

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-



Re: looking for Cyrus mail format documentation

2003-02-01 Thread Rob Siemborski
On Fri, 31 Jan 2003, Phil Howard wrote:

 | Of course replicating some things such as seen state will be quite
 | painful, and you may need to do some hacks to keep uids unique between
 | the machines.

 How does Cyrus manage uids?  I hope these are not uids in /etc/passwd.

No, they're the unique identifier numbers for each message.  I believe the
problem John was asking about is, what happens if you have, say, an APPEND
happen to a mailbox on both servers while they are not in communication
with each other.

When they resync, each has a new message with the same unique identifier,
but different contents.  This isn't a situation that can be recovered from
just by looking at the contents of the filesystem.

Doing replicated IMAP stores (especially geographically distant ones) is
not an easy problem.

-Rob

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Rob Siemborski * Andrew Systems Group * Cyert Hall 207 * 412-268-7456
Research Systems Programmer * /usr/contributed Gatekeeper




Re: looking for Cyrus mail format documentation

2003-02-01 Thread Phil Howard
On Sat, Feb 01, 2003 at 11:31:13AM -0500, Rob Siemborski wrote:

| On Fri, 31 Jan 2003, Phil Howard wrote:
| 
|  | Of course replicating some things such as seen state will be quite
|  | painful, and you may need to do some hacks to keep uids unique between
|  | the machines.
| 
|  How does Cyrus manage uids?  I hope these are not uids in /etc/passwd.
| 
| No, they're the unique identifier numbers for each message.  I believe the
| problem John was asking about is, what happens if you have, say, an APPEND
| happen to a mailbox on both servers while they are not in communication
| with each other.
| 
| When they resync, each has a new message with the same unique identifier,
| but different contents.  This isn't a situation that can be recovered from
| just by looking at the contents of the filesystem.

So this new message would be appended to the same FILE?  That sounds
more like the old UNIX mailbox format.


| Doing replicated IMAP stores (especially geographically distant ones) is
| not an easy problem.

It's easy if every message is a separate file.

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-



looking for Cyrus mail format documentation

2003-01-31 Thread Phil Howard
A couple people have suggested to me that I use Cyrus-IMAP as
opposed to Courier-IMAP, and have given some good arguments
for that decision direction.  However, I still have one
show stopper for that switch: some external programs that work
directly with the storage space of all the mail.  Due to the
nature of some of these programs, accessing that mail by means
of the IMAP protocol or any delivery protocol is not an option.

What I want to examine at this point is the potential ease of
converting those programs to work with the format Cyrus-IMAP
stores its mail.  Had Cyrus-IMAP used the Maildir format, this
would be a simple matter of unplugging Courier and plugging in Cyrus.  The
issue is not about converting existing messages (the transition
will be done with all empty mailboxes).  The issue is knowing
the details of the format in its entirety.

I've looked around the web site and the source file tree and I
find no documentation on this format.  I have been told two
different stories about references to other formats it is like.
But then, I've also heard people tell me Cyrus-IMAP really
does use Maildir format (and as far as I can see, that simply
is not true).

So basically, I'm asking if any documentation exists which
would describe (preferably in a standards style) just what
the format is.  Please don't refer me to the source code, as
I already have that, and I've never found that method to be
a clean way to deal with all the issues (too often semantics
are missed because the implementation doesn't push requirements
to the edge).  Documents in ASCII, HTML, or PDF preferred.

I was also looking for documentation on SASL.  That I found in
the RFCs.  That's the kind of thing I'm looking for regarding
the file formats.

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-



Re: looking for Cyrus mail format documentation

2003-01-31 Thread Earl R Shannon
Hello,

One of the disadvantages of using Cyrus might be that there is
no API to the mail store other than the IMAP protocol. You simply
cannot go mucking around the mail store with external programs
without the potential to cause problems.

That said, mail is stored in directories that map onto folders
and each message has its own file. Seems pretty straightforward
until you realize that the file names and directories have
metadata associated with them that the IMAP server process needs
and maintains. One simply does not mkdir in someone's account
and expect the corresponding folder to show up. Nor can you
simply create a file with what appears to be an appropriate name
and have the message show up in a folder.

Cyrus documentation calls the IMAP server a black box. This is
defined to mean that the users do not have access to the
account/data except through the well defined ( :-/ ) IMAP protocol.
This black box concept also extends to a certain extent to
the administrators of the servers.

Best way to learn something is through experience. Set up a
server and look at how it does things. If you opt for compiling
it yourself choose the flat file options for all the databases.
This will leave the data in a format that is human readable,
sorta, and you can figure out what is going on.

Regards,
Earl Shannon










Re: looking for Cyrus mail format documentation

2003-01-31 Thread Adam Tauno Williams
A couple people have suggested to me that I use Cyrus-IMAP as
opposed to Courier-IMAP, and have given some good arguments
for that decision direction.  However, I have still have one
show stopper for that switch: some external programs that work
directly with the storage space of all the mail.  Due to the
nature of some of these programs, accessing that mail by means
of the IMAP protocol or any delivery protocol is not an option.

You'd need to change how those applications work.  Cyrus is a sealed box,
external access to the mailstore is forbidden.  One of the many reasons Cyrus is
so fast and stable (stable, at least for us, in almost a geological sense, very
impressive).

What I want to examine at this point is the potential ease of
converting those programs to work with the format Cyrus-IMAP
stores its mail.  

I think you pretty much can't.

I've looked around the web site and the source file tree and I
find no documentation on this format.  I have been told two
different stories about references to other formats it is like.
But then, I've also heard people tell me Cyrus-IMAP really
does use Maildir format (and as far as I can see, that simply
is not true).

It does not use maildir.  It actually can use several storage backends, flat
file to Sleepycat and some others.  Rumor of an SQL backend; that might be what
you're looking for.

I was also looking for documentation on SASL.  That I found in
the RFCs.  That's the kind of thing I'm looking for regarding
the file formats.

SASL documentation?!  There is some floating about.  I love SASL, but the shreds
of documentation are universally terrible.



Re: looking for Cyrus mail format documentation

2003-01-31 Thread Rob Siemborski
On Fri, 31 Jan 2003, Earl R Shannon wrote:

 Cyrus documentation calls the IMAP server a black box. This is defined
 to mean that the users do not have access to the account/data except
 through the well defined ( :-/ ) IMAP protocol. This black box concept
 also extends to a certain extent to the administrators of the servers.

Obviously the mail store access is not limited to the IMAP protocol, one
can also use LMTP and POP3, and NNTP if you're using the 2.2 branch.

There's also some utilities such as deliver that can let you do various
things to the mail store.

 Best way to learn something is through experience. Set up a server and
 look at how it does things. If you opt for compiling it yourself choose
 the flat file options for all the databases. This will leave the data in
 a format that is human readable, sorta, and you can figure out what is
 going on.

This is really only useful for educational purposes, not for actually
running a mail store where other programs want direct access.  The only
way to accomplish the latter legitimately is to use the cyrus source
directly.

Often times mailbox access semantics change slightly from version to
version, and if you're not using all the same code (which assumes it is
only dealing with one version of itself), then you can run into trouble.

-Rob

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Rob Siemborski * Andrew Systems Group * Cyert Hall 207 * 412-268-7456
Research Systems Programmer * /usr/contributed Gatekeeper




Re: looking for Cyrus mail format documentation

2003-01-31 Thread Rob Siemborski
On Fri, 31 Jan 2003, Adam Tauno Williams wrote:

 SASL documentation?!  There is some floating about.  I love SASL, but
 the shreds of documentation are universally terrible.

If you have specific suggestions as to what needs to be added, please let
us know.  If you actually have text, that'd be even better.

-Rob

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Rob Siemborski * Andrew Systems Group * Cyert Hall 207 * 412-268-7456
Research Systems Programmer * /usr/contributed Gatekeeper




Re: looking for Cyrus mail format documentation

2003-01-31 Thread Brian

Adam Tauno Williams said:

 SASL documentation?!  There is some floating about.  I love SASL, but
 the shreds of documentation are universally terrible.

AFAIK, this is also true for much of the software that CMU produces,
however the quality of the software has been so good that no one seems to
mind too much.

I think O'Reilly could tap a big market by writing books on some of the
CMU software projects.

-- 
Brian





Re: looking for Cyrus mail format documentation

2003-01-31 Thread Earl R Shannon
Hello,

You are correct on all counts. I was simply trying to make a point,
and IMAP is the major protocol used to access the mail store, at
least it is here. Nor did I mean to imply that any server that one
sets up to see how things worked should be a production machine.
In fact, because one would be changing things to see how they
affected clients, etc. I would expect it to be a test platform.

But that does now beg the question. There must be some form of
coordination between the various processes as they access the
mail store. Can this not be abstracted out and put in an API to
make it easier for people to write their own applications?  I would
venture a guess to say that the API already exists in some form,
it just needs to be formalized and published.

Regards,
Earl Shannon









Re: looking for Cyrus mail format documentation

2003-01-31 Thread Rob Siemborski
On Fri, 31 Jan 2003, Earl R Shannon wrote:

 But that does now beg the question. There must be some form of
 coordination between the various processes as they access the
 mail store. Can this not be abstracted out and put in an API to
 make it easier for people to write their own applications?  I would
 venture a guess to say that the API already exists in some form,
 it just needs to be formalized and published.

Of course there is.  It's (mostly) localized within libimap.a, and the
headers that go along with that library.

I still don't recommend this to someone who just wants to get their
applications to play nice with cyrus, though; the externally defined
protocols and applications are much better defined than the contents of
libimap.a ever will be (since changing external APIs is generally
considered bad form ;).

If someone wants to do work to document the internal API, I'd love to see
it.  I suspect the best way to do this would be to further comment all
the stuff in the header files directly, since documentation maintained
separately is likely to go out of date.  Perhaps a general overview could
go in doc/internal.

Actually, we've been talking for a while about refactoring the mboxlist
API, since it's grown in somewhat unsightly ways (especially with the
murder code).

-Rob

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Rob Siemborski * Andrew Systems Group * Cyert Hall 207 * 412-268-7456
Research Systems Programmer * /usr/contributed Gatekeeper





Re: looking for Cyrus mail format documentation

2003-01-31 Thread John Alton Tamplin
Earl R Shannon wrote:


But that does now beg the question. There must be some form of
coordination between the various processes as they access the
mail store. Can this not be abstracted out and put in an API to
make it easier for people to write their own applications?  I would
venture a guess to say that the API already exists in some form,
it just needs to be formalized and published.


The point is if you expose the internal API for accessing the mailstore 
you are stuck with it and can't make changes.  I can't imagine there is 
a big need for this or other people wanting to write code to implement 
that API, so if you really want to do this it is probably better as Rob 
suggested to just link to the Cyrus code that manipulates it (and watch 
for version skew between programs accessing the mail store).

--
John A. Tamplin   Unix System Administrator
Emory University, School of Public Health +1 404/727-9931





Re: looking for Cyrus mail format documentation

2003-01-31 Thread Rob Siemborski
On Fri, 31 Jan 2003, John Alton Tamplin wrote:

 The point is if you expose the internal API for accessing the mailstore
 you are stuck with it and can't make changes.  I can't imagine there is
 a big need for this or other people wanting to write code to implement
 that API, so if you really want to do this it is probably better as Rob
 suggested to just link to the Cyrus code that manipulates it (and watch
 for version skew between programs accessing the mail store).

Well, I don't recommend this for anything that isn't part of the cyrus
distribution.  The chance of version skew (especially in subtle ways) is
very high, and certainly this isn't a good option if the library is being
linked statically.

For any external program, I strongly recommend you only use the
standard/provided interfaces (protocols, and deliver) and *NOT* the
internal APIs.

Documentation of the internal APIs is useful from a cyrus developer
standpoint, but not from a usefulness to users standpoint.

-Rob

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Rob Siemborski * Andrew Systems Group * Cyert Hall 207 * 412-268-7456
Research Systems Programmer * /usr/contributed Gatekeeper




Re: looking for Cyrus mail format documentation

2003-01-31 Thread Phil Howard
On Fri, Jan 31, 2003 at 08:57:50AM -0500, Adam Tauno Williams wrote:

| It does not use maildir.  It actually can use several storage backends, flat
| file to sleepcat and some others.  Rumor of an SQL backend,  that might be what
| your looking for.

SQL would be harder to do for what I'm doing.  Discrete files made
it easier (e.g. Maildir).

That it can do several backends might make it possible to plug
something else in.  I could look into that.

 
| I was also looking for documentation on SASL.  That I found in
| the RFCs.  That's the kind of thing I'm looking for regarding
| the file formats.
| 
| SASL documentation?!  There is some floating about.  I love SASL, but the shreds
| of documentation are universally terrible.

What I need in SASL isn't nearly as involved.  I may be able to plug
my auth data in somewhere or convert it to LDAP or something.

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-



Re: looking for Cyrus mail format documentation

2003-01-31 Thread Phil Howard
On Fri, Jan 31, 2003 at 09:57:40AM -0500, John Alton Tamplin wrote:

| Earl R Shannon wrote:
| 
|  But that does now beg the question. There must be some form of
|  coordination between the various processes as they access the
|  mail store. Can this not be abstracted out and put in an API to
|  make it easier for people to write their own applications?  I would
|  venture a guess to say that the API already exists in some form,
|  it just needs to be formalized and published.
| 
| The point is if you expose the internal API for accessing the mailstore 
| you are stuck with it and can't make changes.  I can't imagine there is 
| a big need for this or other people wanting to write code to implement 
| that API, so if you really want to do this it is probably better as Rob 
| suggested to just link to the Cyrus code that manipulates it (and watch 
| for version skew between programs accessing the mail store).

One of the needs I have is to build a two-way mail store replica.  Either
node may be delivered to, and either node may be accessed by the user but
only one at a time.  The two nodes are topologically and geographically
far apart, and bandwidth between them is to be considered costly and thus
should be not much more than the cost of actually transferring content.
If mail arrives at one, it should be replicated to the other ASAP.  If
mail is deleted at one, it should be deleted from the other ASAP.  If
mail is moved around between folders unchanged, it should be moved the
same on the other without transferring content.  Now here is the big one:
If the two nodes are unreachable between each other, changes have to be
stored in a way they can be re-synchronized when reachability is again
established.  And this may involve some changes to both and some issues
that have to be dealt with as best as possible such as noting dates of
changes (it can be assumed the two nodes are time synchronized).

This is one of the needs I have.

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-



Re: looking for Cyrus mail format documentation

2003-01-31 Thread John A. Tamplin
Phil Howard wrote:


One of the needs I have is to build a two-way mail store replica.  Either
node may be delivered to, and either node may be accessed by the user but
only one at a time.  The two nodes are topologically and geographically
far apart, and bandwidth between them is to be considered costly and thus
should be not much more than the cost of actually transferring content.
If mail arrives at one, it should be replicated to the other ASAP.  If
mail is deleted at one, it should be deleted from the other ASAP.  If
mail is moved around between folders unchanged, it should be moved the
same on the other without transferring content.  Now here is the big one:
If the two nodes are unreachable between each other, changes have to be
stored in a way they can be re-syncronized when reachability is again
established.  And this may involve some changes to both and some issues
that have to be dealt with as best as possible such as noting dates of
changes (it can be assumed the two nodes are time syncronized).

This is one of needs I have.


Then I would suggest a better way of doing that than trying to figure 
out what changes have happened by looking at low-level data structures 
would be to put proxies in front of Cyrus (LMTP, IMAP, and if you use it 
POP).  The proxies would pass the data on to the local Cyrus to do the 
action as well as contacting the other proxy to duplicate the work.  If 
the other proxy is not accessible, keep a log of the work that needs to 
be performed (but allowing disconnected operation when the other node is 
not truly down will likely lead to changes that can't be automatically 
resolved -- better would be to have 3, run two-phase commit and only 
commit if you get agreement of two, but that may not be practical) and 
do those changes when the other proxy comes back up.  The level of 
abstraction you want is precisely the level you get at the higher level 
protocols rather than having to dig through all the folders and see what 
has changed.
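
A minimal Python sketch of the log-and-replay part of this idea follows. The `ReplicatingProxy` class, the tuple-shaped operations, and the `send_to_peer` callback are all hypothetical; a real proxy would have to speak LMTP/IMAP/POP and persist its log to disk. The sketch also does nothing about conflicting changes made on both sides while disconnected.

```python
from collections import deque


class ReplicatingProxy:
    """Apply each operation locally, then forward it to the peer in order."""

    def __init__(self, apply_local, send_to_peer):
        self.apply_local = apply_local
        # send_to_peer is assumed to raise ConnectionError when the
        # other node cannot be reached.
        self.send_to_peer = send_to_peer
        self.pending = deque()  # operations not yet acknowledged by the peer

    def handle(self, op):
        self.apply_local(op)
        self.pending.append(op)
        self.flush()

    def flush(self):
        """Replay the log oldest-first; stop at the first failure."""
        while self.pending:
            try:
                self.send_to_peer(self.pending[0])
            except ConnectionError:
                return False
            self.pending.popleft()
        return True


link = {"up": False}
local, delivered = [], []


def send_to_peer(op):
    if not link["up"]:
        raise ConnectionError("peer unreachable")
    delivered.append(op)


proxy = ReplicatingProxy(local.append, send_to_peer)
proxy.handle(("APPEND", "INBOX", "msg-1"))
proxy.handle(("STORE", "INBOX", "msg-1", "\\Seen"))

link["up"] = True   # the other proxy comes back
proxy.flush()
print(delivered == local, len(proxy.pending))
```

Keeping the log strictly ordered means the peer sees the APPEND before the flag change that refers to it.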

Of course replicating some things such as seen state will be quite 
painful, and you may need to do some hacks to keep uids unique between 
the machines.

--
John A. Tamplin
Unix Systems Administrator





Re: looking for Cyrus mail format documentation

2003-01-31 Thread Phil Howard
On Fri, Jan 31, 2003 at 08:33:41PM -0500, John A. Tamplin wrote:

| Phil Howard wrote:
| 
| One of the needs I have is to build a two-way mail store replica.  Either
| node may be delivered to, and either node may be accessed by the user but
| only one at a time.  The two nodes are topologically and geographically
| far apart, and bandwidth between them is to be considered costly and thus
| should be not much more than the cost of actually transferring content.
| If mail arrives at one, it should be replicated to the other ASAP.  If
| mail is deleted at one, it should be deleted from the other ASAP.  If
| mail is moved around between folders unchanged, it should be moved the
| same on the other without transferring content.  Now here is the big one:
| If the two nodes are unreachable between each other, changes have to be
| stored in a way they can be re-syncronized when reachability is again
| established.  And this may involve some changes to both and some issues
| that have to be dealt with as best as possible such as noting dates of
| changes (it can be assumed the two nodes are time syncronized).
| 
| This is one of needs I have.
| 
| Then I would suggest a better way of doing that than trying to figure 
| out what changes have happened by looking at low-level data structures 
| would be to put proxies in front of Cyrus (LMTP, IMAP, and if you use it 
| POP).  The proxies would pass the data on to the local Cyrus to do the 
| action as well as contacting the other proxy to duplicate the work.  If 
| the other proxy is not accessible, keep a log of the work that needs to 
| be performed (but allowing disconnected operation when the other node is 
| not truly down will likely lead to changes that can't be automatically 
| resolved -- better would be to have 3, run two-phase commit and only 
| commit if you get agreement of two, but that may not be practical) and 
| do those changes when the other proxy comes back up.  The level of 
| abstraction you want is precisely the level you get at the higher level 
| protocols rather than having to dig through all the folders and see what 
| has changed.

If by proxy in front of Cyrus you mean to implement a layer of IMAP that
is connected to by clients, and then connected to Cyrus on some hidden
port, then I'd say that's not really practical.  That would mean doing
an implementation of IMAP, and it is things like this that I was trying
to avoid in the first place.  I might as well just directly access the
files, and hence have my own IMAP implementation.  But it's to avoid this
that I get so many suggestions to use Cyrus (or Courier) instead.  Doing
the synchronizing at the filesystem layer won't be too hard, although a
few hacks are needed (for example deletes are saved as an empty file
with zero permissions, dated when the delete happened, until after the
synchronization clears it everywhere).
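
For illustration, the delete marker described above could look roughly like this in Python on a POSIX filesystem. The function names `mark_deleted` and `is_tombstone` and the example file name are invented here; this is a sketch of the idea, not code from any existing tool.

```python
import os
import tempfile
from pathlib import Path


def mark_deleted(path: Path, deleted_at: float) -> None:
    """Replace a message file with an empty, mode-000 marker."""
    path.write_bytes(b"")                      # drop the content
    os.utime(path, (deleted_at, deleted_at))   # record when the delete happened
    path.chmod(0o000)                          # zero permissions flag the marker


def is_tombstone(path: Path) -> bool:
    """True if the file is empty and has no permission bits set."""
    st = path.stat()
    return st.st_size == 0 and (st.st_mode & 0o777) == 0


with tempfile.TemporaryDirectory() as tmp:
    msg = Path(tmp) / "1044050000.P123Q0.inside"
    msg.write_text("Subject: hello\n\nbody\n")
    mark_deleted(msg, deleted_at=1044053600.0)
    result = (is_tombstone(msg), msg.stat().st_mtime)

print(result)
```

The synchronizer on the other node can compare the marker's timestamp with its own copy's modification time before removing the message there.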


| Of course replicating some things such as seen state will be quite 
| painful, and you may need to do some hacks to keep uids unique between 
| the machines.

How does Cyrus manage uids?  I hope these are not uids in /etc/passwd.

-- 
-
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| [EMAIL PROTECTED] | Texas, USA | http://ka9wgn.ham.org/|
-