Well,
I beleive a flag on the messageblock would be a good start, since
it is not that intrusive.
But I am very unsusre about how much performance we would gain by
searching directly in heavily optimized fast-header tables. Is
the pain worth the gain?
Small installations do not need the extra performance.
Large installations can probably get away with more expensive
hardware
Huge installations may run out of steam on the best availible
hardware, here the fast-headers might be of some use.
My beleif is that we should go for the easiest implementation,
that is the least intrusive and get a feeling for what we need.
I know that MySQL supports queries with regular expressions.
These queries will probably be quite fast.
I think we should se what happens when we implement server side
sorting, based on clever SQL statements using the messageblocks.
The huge question is: How much do we gain in performance by using
special header tables compared to searching the messageblocks?
The gain might be quite small.
well, that my $0.02
Magnus
Aaron Stone wrote:
I'm thinking in terms of searches primarily, where a query might be:
select distinct(message_id) from fastheaders
where header = 'to' and contents like '%bob%'
You're looking for displaying the message, where the webmail application
probably doesn't want to read in the header blocks of the message and parse
them just to show the Date, From, Subject, To, CC, Bcc, Fcc, Reply-To fields.
Originally I suggested using a table that has the header fields as the
columns, thinking that those were the only commonly requested ones, but it was
either Ilja or Roel at the time who watched for a couple of IMAP queries and
saw all kinds of headers being requested both for searching and for listing
messages. So the configurable columns thing means that the admin needs to be
watching to see what fields his clients are requesting, and to add those, and
it means quite a bit more complexity in DBMail, which would need to either
find the header in the header cache table or parse the full headers.
For your webmail application, since you know exactly which header fields you
want and you're writing your own queries, you can just do this:
select header, contents from fastheaders
where message_id = 82 and (header = 'To' or header = 'From' or ...)
My instinct is that this would still be reasonably fast, completely flexible
and generic. Even if the query took slightly longer, the much smaller result
set size and the negligible parsing needed would offset the query speed.
Aaron
""Mark Mackay - Orcon"" <[EMAIL PROTECTED]> said:
Surely for webmail applications, etc you're after a join where you get the
headers as columns for displaying a message list; to avoid having nested
queries.
E.g. avoid "select * from messages"; then for each message "select * from
headers where messageid=X"
I know the goal is to have the headers table really flexible so that
additional headers can be added, etc -- but maybe this could be done
differently. A run-time config file where you mapped the column
names/numbers to header names, etc.
That way if a developer wanted to add a new column and start caching a new
header they could alter the table and simply update the config files,
one-by-one. Or disable the headercache altogether.
I could be missing something here -- is there a query which can turn a
one-to-many join into a single line?
/Mark
-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Aaron Stone
Sent: Thursday, 18 March 2004 6:30 p.m.
To: DBMAIL Developers Mailinglist
Subject: Re: [Dbmail-dev] NNTP
Seems terribly complicated, and it would make the messages
very difficult to
dump by hand. I'm also not sure why you want to have two
tables; is that what
you're proposing:
header_labels (index on label)
id label
--------------
1 to
2 from
3 subject
message_headers (index on messageid, labelid, header and
labelid, header)
-----------------------------------------------------
id message_id label_id header
1 1 1 [EMAIL PROTECTED]
2 1 2 [EMAIL PROTECTED]
3 1 3 Hey Bob, it's Joe!
I think that unless your database has miraculously good
JOINs, this is a
nightmare; note that you cannot reassemble by returning rows
in ascending
order of the 'id' column, another column would be needed to
keep the order of
the headers. This is probably the most compact was to store
the headers,
though, but I think it is at the cost of being severely slow
and obtuse.
My proposed header looks like this:
fastheaders (index on messageid, header, contents and
header, contents)
id message_id header contents
1 1 to [EMAIL PROTECTED]
2 1 from [EMAIL PROTECTED]
Suffers from using a lot more space, but would have faster
search times.
Administrators would be able to more easily query the table
by hand. As it is
intended to only be a cache, one could freely zap this table
and rebuild it
from the original headers. Naturally, I'm strongly advocating
my idea ;-)
Aaron
""Ed K."" <[EMAIL PROTECTED]> said:
[snip]
What if a solution was proposed in which the order and the
formatting of
the headers could be preserved. A documented method to
deconstruct and
construct the headers, and working code to be included in
db.c. Then a flag
in the messageblks row that would indicate if the header is
either only,
also, or not in the fastheaders table. i suggest we call
the table pair
message_headers and header_labels.
ed
Security on the internet is impossible without strong, open,
and unhindered encryption.
_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev
--
_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev
_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev