On 12 Feb 2009, at 16:51, Kenneth Kalmer wrote:
A deletion is effectively a set-deleted-flag operation. Compaction then
takes care of getting rid of the file.
So, taken from other threads, you're effectively tasked with running
compaction outside of your peak time. This is a no-brainer if the other
benefits are within reach.
Right.
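
A rough sketch of what an off-peak compaction job could look like,
assuming CouchDB's POST /{db}/_compact endpoint. The base URL and the
database name "mail" are placeholders, and a real server will usually
also want admin credentials, which are left out here:

```python
import urllib.request


def compaction_request(base_url, db):
    """Build (but do not send) the request that asks CouchDB to compact `db`."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{db}/_compact",
        data=b"",  # empty body; the endpoint takes no parameters
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical usage from a cron job scheduled outside peak hours:
#   req = compaction_request("http://localhost:5984", "mail")
#   urllib.request.urlopen(req)
```

The request is only built here, so the scheduling itself (cron or
anything else) stays outside the sketch.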
While I'm here, can the docs still be recovered before compaction? The
reason I ask is that it would be a bonus to be able to access the mails
and do some statistical reporting before compacting the database. If
not, no issues. Mail server admins (and those footing the bills) love
excessive reporting...
They can, but you're still not advised to do it. If you need any data,
it should be in the latest version of a document. Reports could be run
on the side and stored in a secondary database.
Hmm, not too much information. Let's see, if you have any more specific
questions, just send a follow-up :)
Well, let's try to keep this as close to couch as we can and not wander
off into the nasty world of email systems (except for effectively
CRUD-ing messages).
So mail arrives at our SMTP server. What would give us the best
performance for ingesting mail: directly writing each doc as it arrives,
or having small queues that empty out every X messages / Y seconds
(whichever comes first)? For context, one of our mail clouds does about
15GB an hour during office hours. I know this size isn't anything when
you consider larger providers, but we're growing constantly and some
time in the future we're going to have to become creative in how we
store mail.
Single doc inserts get batch-written to disk each second. This might
work for your data and memory requirements, but you can fine-tune that
with custom queueing and bulk doc inserts.
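
To make the "X messages / Y seconds" idea concrete, here is a minimal
sketch of such a queue. It assumes the body is meant for CouchDB's
POST /{db}/_bulk_docs endpoint, which takes {"docs": [...]}. The class
name, the thresholds and the `send` callback are all made up for
illustration; `send` is where the actual HTTP POST would go:

```python
import json
import time


class BulkQueue:
    """Collect docs and hand them to `send` as one _bulk_docs body."""

    def __init__(self, send, max_docs=100, max_age=1.0, clock=time.monotonic):
        self.send = send          # hypothetical callback that POSTs the body
        self.max_docs = max_docs  # "X messages"
        self.max_age = max_age    # "Y seconds"
        self.clock = clock
        self.docs = []
        self.first_at = None      # arrival time of the oldest queued doc

    def add(self, doc):
        if not self.docs:
            self.first_at = self.clock()
        self.docs.append(doc)
        self.maybe_flush()

    def maybe_flush(self):
        """Flush if either limit is hit. Also call this from a timer,
        otherwise an idle queue only notices its age on the next add()."""
        if not self.docs:
            return
        too_many = len(self.docs) >= self.max_docs
        too_old = self.clock() - self.first_at >= self.max_age
        if too_many or too_old:
            self.flush()

    def flush(self):
        if not self.docs:
            return
        body = json.dumps({"docs": self.docs})
        self.docs = []
        self.first_at = None
        self.send(body)
```

Anything still sitting in the queue is lost if the process dies, so
the limits are a trade-off between throughput and how much mail you
are willing to hold only in memory.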
Retrieving mail also becomes interesting. We can use one view to get the
total number of messages for the mailbox, and then another (with
parameters) to batch them from couchdb as we deliver them to the client.
Would bulk updates here be the cheapest way of "mark all as read" or
"delete", or would you again handle documents individually?
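
For illustration, the two view queries described above could be built
roughly like this. The sketch assumes the usual
/{db}/_design/{ddoc}/_view/{view} URL layout and JSON-encoded query
values; the design doc name "mailbox", the view name "by_owner" and
the key "kenneth" are invented for the example:

```python
import json
from urllib.parse import urlencode


def view_url(base_url, db, ddoc, view, **params):
    """Build a view query URL; each parameter value is JSON-encoded."""
    query = urlencode({k: json.dumps(v) for k, v in params.items()})
    return f"{base_url}/{db}/_design/{ddoc}/_view/{view}?{query}"


base = "http://localhost:5984"  # placeholder server

# Total message count for one mailbox (assumes the view has a reduce step).
count_url = view_url(base, "mail", "mailbox", "by_owner",
                     key="kenneth", reduce=True)

# One batch of 50 messages for delivery to the client.
page_url = view_url(base, "mail", "mailbox", "by_owner",
                    key="kenneth", reduce=False, limit=50, skip=0)
```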
You can set the _deleted member in a bulk-update operation to delete a
bunch of docs in one go.
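
As a sketch, the request bodies for such bulk operations (to be POSTed
to /{db}/_bulk_docs) could be built like this. The function names and
the "read" field are assumptions made up for the example; only _id,
_rev and _deleted are CouchDB's own members:

```python
import json


def bulk_delete_body(rows):
    """rows: iterable of (doc_id, current_rev) pairs to delete in one go."""
    docs = [{"_id": doc_id, "_rev": rev, "_deleted": True}
            for doc_id, rev in rows]
    return json.dumps({"docs": docs})


def bulk_mark_read_body(docs):
    """docs: full documents as fetched (including _id and _rev).

    Each doc is resent with a hypothetical "read" flag set to True.
    """
    return json.dumps({"docs": [dict(doc, read=True) for doc in docs]})
```

Each entry needs the document's current _rev, otherwise that entry
comes back as a conflict while the rest of the batch goes through.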
Cheers
Jan
--