In some email I received from Ryan Butler <[EMAIL PROTECTED]> on 29 Jul 2003 09:32:09 -0500, wrote:

> That's completely false.



> You're still storing it on a disk, so the reliability is the same, store
> your db on a raid5, or store flat files on a raid5.... its the same
> thing, raid5 loses more than one disk at a time and you're hosed either
> way, so the only way to be sure is to have archived backups, or hot
> systems (replicated db's).

To understand why, you have to know the failure probabilities involved.
Generally speaking, if you keep everything in one database (excluding
the case where that database is replicated), you've got yourself a
single point of failure. With a FS it is a bit different (I'm
underlining _FS_ rather than _disk_, because the database itself depends
on both of them, except when you run the database on a RAM disk or NAS):
splitting the FS into separate partitions gives you more redundancy and
in some cases more security (not only in terms of consistency, but also
general security features such as ACLs and so on).

What's wrong with a mirrored RAID? Go with RAID 1+0 and IMHO this will
solve your problems. There are tons of applications, distributed
filesystems and so on which you can use to provide consistency and
redundancy, as well as one major thing: _security_.

ACLs in databases suck: you can put permissions on a table or on a
database, but can you put permissions on every record? I guess not in
the everyday database.
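
To illustrate the per-record point: with one file per message, every
message can carry its own permissions. This is only a minimal sketch,
assuming a POSIX system; store_message() and the file names are made up
for the example and are not taken from any MTA.

```python
import os
import stat
import tempfile

def store_message(maildir, name, body, mode=0o600):
    """Write one message to its own file and set that file's permission bits.

    Hypothetical helper: 'maildir' is just a directory, not a real Maildir.
    """
    path = os.path.join(maildir, name)
    with open(path, "w") as fh:
        fh.write(body)
    os.chmod(path, mode)  # the permissions apply to this one message only
    return path

maildir = tempfile.mkdtemp()
private = store_message(maildir, "1.msg", "for my eyes only\n", 0o600)
shared = store_message(maildir, "2.msg", "team announcement\n", 0o640)

# Read the permission bits back for each message.
private_mode = stat.S_IMODE(os.stat(private).st_mode)
shared_mode = stat.S_IMODE(os.stat(shared).st_mode)
```

Two messages in the same mailbox end up with different access rights,
which is the per-record granularity asked about above.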

> As for the access time, that's true, in part, but the second part of
> your statement is false.  file system mta's with large numbers of users
> will suffer tremendously compared to a db.  A db is an indexed,
> organized storage, flat files are sequential only.  So when you have
> large mailboxes, a database will win over flat file for access time
> since it doesn't have to seek the whole file to search a message in the
> middle. 

B-trees and hash tables are used in filesystems as well.

> Of course, if you're only running pop3 and your clients are
> behaving appropriately (downloading all messages every time) a flat file
> might show a performance benefit (since it is a sequential read and all
> the other indexing isn't needed).  But some clients will skip messages
> they've already downloaded, requiring an expensive seek without an index
> as to where to look, and pop3 is becoming less popular as imap (as a
> direct protocol, or the backend of a webmail system) becomes more
> popular, which don't regularly do a sequential read.

The main strength of a database, I'd say, is in its cache and the way
it's handled, plus write-ahead logs and so on (that's where metadata
comes in). Look at reiserfs: it uses B-trees, it also has hash tables,
not to mention the metadata. Now look at reiserfs 4 :-). Even the
metadata and data journaling alone gives you more secure storage than a
database. XFS, EXT3, REISERFS, JFS, NTFS: all of these are journaling
filesystems.

One thing I know for sure is that most of these features are derived
from database designs.

Let's be honest, would you really prefer speed over being consistent,
secure and redundant?! I'd think not. Hardware is cheap, horsepower is
cheap.
Building a cluster out of distributed filesystems is much more feasible
than doing it with a database. For one, most databases use the lazy
replication approach rather than eager, so all the updates are
propagated _after_ the transaction is done. With eager you have
synchronous replication, where the transactions are spread across the
whole cluster, not to mention eager partial replication, where different
bits can reside on different servers, and that:

1) Gives you more granularity.
2) Gives you more consistency in the sense of data writes.
3) Creates a whole variety of options to add/remove servers depending on
   the load.
4) ... and so on.
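
The difference between lazy and eager can be sketched roughly like this.
It is a toy in-memory model under my own assumptions (the Replica,
LazyCluster and EagerCluster names are invented, and a dict stands in
for real storage), not how any particular database implements it.

```python
class Replica:
    """A hypothetical storage node; a dict stands in for its data."""
    def __init__(self):
        self.data = {}

class LazyCluster:
    """Lazy: commit on the primary first, ship the update to the rest later."""
    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.pending = []  # committed on the primary, not yet propagated

    def write(self, key, value):
        self.replicas[0].data[key] = value  # the transaction is "done" here
        self.pending.append((key, value))   # replication happens afterwards

    def propagate(self):
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica.data[key] = value
        self.pending.clear()

class EagerCluster:
    """Eager: the write returns only after every replica has applied it."""
    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]

    def write(self, key, value):
        for replica in self.replicas:
            replica.data[key] = value

lazy = LazyCluster(3)
lazy.write("msg-1", "hello")
# Right after the write only the primary has the message.
lazy_before = ["msg-1" in r.data for r in lazy.replicas]
lazy.propagate()
lazy_after = ["msg-1" in r.data for r in lazy.replicas]

eager = EagerCluster(3)
eager.write("msg-1", "hello")
eager_after = ["msg-1" in r.data for r in eager.replicas]
```

In the lazy case there is a window in which the secondaries are stale;
in the eager case there is no such window, at the cost of a slower
write.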

> In addition, file access becomes increasingly slow when you have
> thousands of files in a directory, or thousands of directories in a
> directory with hundreds of files in those subdirectories...  Flat files
> do not win in any scenario other than ease of installation on extremely
> small setups.

Actually files aren't a bad idea; dismissing them outright in this case
looks like a biased opinion. Look at Plan 9: it has concepts other OSes
can only dream of ;-)

Also, how many files to put in a folder is the user's choice, and a
database can just as easily start coughing when it has too many indices
and blah blah..
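
For example, nothing forces you to drop every message into one huge
directory. A common trick is to fan the files out over hashed
subdirectories. A minimal sketch, assuming a made-up message_path()
helper and an arbitrary choice of SHA-1 with two levels of two hex
characters each:

```python
import hashlib
from pathlib import PurePosixPath

def message_path(root, message_id, levels=2, width=2):
    """Map a message id to root/<bucket>/<bucket>/<message_id>.

    Hypothetical layout: the buckets are taken from the hex digest of the
    id, so with levels=2 and width=2 there are 256 * 256 leaf directories.
    """
    digest = hashlib.sha1(message_id.encode("utf-8")).hexdigest()
    buckets = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *buckets, message_id)

# Only computes the path; nothing is created on disk.
path = message_path("/var/mail/lou", "1059489129.msg")
```

The same id always maps to the same path, so a lookup is a couple of
small directory reads instead of a scan of thousands of entries.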

I mean, if we're going to give reasonable facts, let's put all the cards
on the table and not hide anything from the user. A good FS vs DB
comparison would be very nice to have.



-- 
Lou Kamenov Researcher/Security Analyst

AEYE  R&D - http://www.aeye.net [EMAIL PROTECTED]
AEYE Tech - http://www.aeye.biz [EMAIL PROTECTED]
phone: +44 (0) 20 8879 9832 fax: +44 (0) 7092 129079
mobile: +44 (0) 79 3945 3026 PGP Key ID - 0xA297084A

AEYE(=AI) stands for Artificial Intelligence.
