Problem:

- People leave mail in the mailbox. Scanning the mailbox every time is
an I/O-hungry operation.

- Rewriting a partially updated mailbox is very expensive: UIDL updates,
partial mailbox deletion, mail arriving while the popper is running...

Solution:

A simple and efficient key/value database used to store messages, for
example Berkeley DB (http://www.sleepycat.com/).

Qpopper would have six operations:

- Translate standard mailboxes into the database.

- Serve mails from database.

- An additional tool to show statistics about users: messages in
database, length, last login, quota...

- An additional tool to list and delete a specific user message.

- An additional tool to delete a user and all of their messages.

- An additional tool to kill all popper processes, disable POP3 logins
and reconstruct the database if necessary. This operation typically
takes 4-5 seconds.

We could have another tool to delete messages that have already been
read and are older than a month, for example.
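As a rough illustration of such a cleanup tool, here is a minimal Python
sketch. It is not qpopper code. It assumes a plain dict standing in for
the key/value database, and the "meta:"/"body:" key names, the JSON
metadata and its "read"/"arrived" fields are all hypothetical.

```python
import json

MONTH = 30 * 24 * 3600  # "a month" approximated as 30 days, in seconds


def expire_read_messages(db, now, max_age=MONTH):
    """Delete messages that are marked read and older than max_age.

    `db` is any dict-like store mapping bytes keys to bytes values.
    Hypothetical layout: b"meta:<uid>" holds JSON with "read" (bool)
    and "arrived" (epoch seconds); b"body:<uid>" holds the message.
    Returns the number of messages deleted.
    """
    doomed = []
    for key, value in db.items():
        if key.startswith(b"meta:"):
            meta = json.loads(value)
            if meta["read"] and now - meta["arrived"] > max_age:
                doomed.append(key[len(b"meta:"):])
    # Delete after the scan so the store is not modified while iterating.
    for uid in doomed:
        del db[b"meta:" + uid]
        db.pop(b"body:" + uid, None)
    return len(doomed)
```

With a transactional database, each deletion (metadata plus body) would
presumably run inside one transaction so a crash cannot leave half a
message behind.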

Example:

You could have a central mailbox database. Every email in the database
would have a unique UID. Every message resides in two records, for
example. One record contains the message body. The other record holds
the message headers, which can be modified by qpopper (UIDL, Status,
etc).

There are per-user records that keep data such as message UIDs, message
lengths, quota, last login, and perhaps UIDL and Status.

There is a global record that keeps a global serial number (used as a
UID generator), atomically updated every time a message is added to the
database.
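The layout described above could look roughly like the following Python
sketch. It only illustrates the idea: a plain dict stands in for the
key/value database, and every key name ("global:serial",
"msg:<uid>:headers", "msg:<uid>:body", "user:<name>:index") is an
assumption made for the example.

```python
def next_uid(db):
    """Bump the global serial number and return it as the new UID.

    With a real transactional store this read-modify-write would have
    to run inside a transaction to be atomic; a dict gives no such
    guarantee.
    """
    serial = int(db.get(b"global:serial", b"0")) + 1
    db[b"global:serial"] = str(serial).encode()
    return serial


def add_message(db, user, headers, body):
    """Store one message as two records and index it for `user`."""
    uid = next_uid(db)
    # Headers and body live in separate records, so headers can be
    # rewritten later without touching the (possibly huge) body.
    db[b"msg:%d:headers" % uid] = headers
    db[b"msg:%d:body" % uid] = body
    # Per-user index: comma-separated "uid:length" entries.
    index_key = b"user:" + user.encode() + b":index"
    entry = b"%d:%d" % (uid, len(headers) + len(body))
    old = db.get(index_key, b"")
    db[index_key] = (old + b"," + entry) if old else entry
    return uid
```

Because the per-user index already carries UIDs and lengths, answering
STAT or LIST would only need that one small record.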

When a user logs in via POP3, qpopper would migrate new messages from
the user's standard mailbox into the database (erasing the original
mailbox). Then the messages are served from the database. The migration
can also be done by a cron job, for mailboxes with infrequent logins.
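A minimal sketch of that migration step, using Python's standard
"mailbox" module to read an mbox file. The dict-like "db" and the key
names are assumptions for illustration only; a real implementation
would write to the database inside a transaction and commit before
emptying the mailbox.

```python
import mailbox


def migrate_mbox(path, db, user):
    """Move all messages from the mbox file at `path` into `db`.

    `db` is any dict-like store (bytes keys and values). Key names are
    hypothetical. Returns the list of UIDs assigned.
    """
    uids = []
    box = mailbox.mbox(path)
    box.lock()  # keep the local mailer out while we migrate
    try:
        for key in box.keys():
            raw = box.get_bytes(key)
            # Headers end at the first blank line.
            headers, _, body = raw.partition(b"\n\n")
            uid = int(db.get(b"global:serial", b"0")) + 1
            db[b"global:serial"] = str(uid).encode()
            db[b"msg:%d:headers" % uid] = headers
            db[b"msg:%d:body" % uid] = body
            uids.append(uid)
            box.discard(key)  # mark for removal from the mbox
        # Append the new UIDs to the per-user record.
        index_key = b"user:" + user.encode() + b":uids"
        parts = [p for p in db.get(index_key, b"").split(b",") if p]
        parts += [b"%d" % u for u in uids]
        db[index_key] = b",".join(parts)
    finally:
        box.close()  # flushes the removals and releases the lock
    return uids
```

The same function could be called from the POP3 login path or from a
cron job, since it only needs the mailbox path and the user name.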

The only remaining problem would be quotas, which are already a very
problematic issue for current qpopper. If you control the local mailer,
it can talk to the database and enforce quotas there.

Advantages:

- You don't need to scan anything once the messages are in the
database. You know, at any time, how many messages a user has, their
lengths, and so on. If new email arrives, you migrate it to the database.

- You can delete individual messages without rewriting the mailbox.

- You can modify headers without expensive I/O, since headers (typically
< 2 KB) are kept separate from message bodies.

- New messages arriving while qpopper is working don't require mailbox
rewriting.

- Berkeley DB, for example, can retrieve partial records. That is, you
can have a 15 MB message and you don't need to read it in one shot.
In fact, you can read the message in 64 KB chunks, for example, to
keep memory and I/O small.

- Berkeley DB overhead in disk space and CPU is fairly small.

- Berkeley DB implements atomic transactions. In fact, you have full
ACID semantics. A popper process can die at any time and the database
is always consistent.

- Berkeley DB detects and resolves deadlocks when multiple processes
access the database.

- Berkeley DB is free for non-commercial use.

- The latest Berkeley DB version supports replication.

- You can support multiple mailbox formats: mailbox and maildir, for
example. The only impact would be writing the mailbox-to-database
converter. This step is fairly simple.
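To make the chunked-read and single-message-delete points concrete,
here is a small Python sketch. A dict stands in for the database, so
"read_partial" is only a stand-in for a real partial-record read (it
slices a value that is already in memory). The function names and key
names are hypothetical.

```python
CHUNK = 64 * 1024  # 64 KB, as in the example above


def read_partial(db, key, offset, length):
    """Stand-in for a partial-record read.

    A dict has no partial reads, so this slices the stored value; a
    real key/value database would fetch only the requested byte range.
    """
    return db[key][offset:offset + length]


def iter_body(db, uid, chunk=CHUNK):
    """Yield a message body piece by piece, never more than `chunk`."""
    key = b"msg:%d:body" % uid
    offset = 0
    while True:
        piece = read_partial(db, key, offset, chunk)
        if not piece:
            break
        yield piece
        offset += len(piece)


def delete_message(db, uid):
    """Remove one message; no other message is read or rewritten."""
    db.pop(b"msg:%d:headers" % uid, None)
    db.pop(b"msg:%d:body" % uid, None)
```

A RETR handler could write each piece from "iter_body" straight to the
client socket, so memory use stays bounded by the chunk size regardless
of message size.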

PS: I'm advocating Berkeley DB because I've been using it for years in
big (millions of records) and critical environments, and its
performance and safety are stunning. But any similar DB will do the
job. Note that I'm not talking about SQL databases. That's not the
way. I'm talking about key/value databases with full ACID semantics.

-- 
Jesus Cea Avion                         _/_/      _/_/_/        _/_/_/
[EMAIL PROTECTED] http://www.argo.es/~jcea/ _/_/    _/_/  _/_/    _/_/  _/_/
                                      _/_/    _/_/          _/_/_/_/_/
PGP Key Available at KeyServ   _/_/  _/_/    _/_/          _/_/  _/_/
"Things are not so easy"      _/_/  _/_/    _/_/  _/_/    _/_/  _/_/
"My name is Dump, Core Dump"   _/_/_/        _/_/_/      _/_/  _/_/
"El amor es poner tu felicidad en la felicidad de otro" - Leibniz
