In an email I received on 29 Jul 2003 09:32:09 -0500, Ryan Butler <[EMAIL PROTECTED]> wrote:
> That's completely false.
> You're still storing it on a disk, so the reliability is the same, store
> your db on a raid5, or store flat files on a raid5.... its the same
> thing, raid5 loses more than one disk at a time and you're hosed either
> way, so the only way to be sure is to have archived backups, or hot
> systems (replicated db's).

To understand why, you have to know the probabilities of failure.
Generally speaking, if you keep everything in one database (leaving
aside the case where that database is replicated), you have a single
point of failure. With a filesystem it is a bit different. (I'm
underlining _FS_ rather than _disk_ because the database itself depends
on both, except when you run the database from a RAM disk or a NAS.)
Splitting the FS into different partitions gives you more redundancy
and, in some cases, more security, not only in terms of consistency but
also in terms of general security features such as ACLs.

What's wrong with mirrored RAID? Use RAID 1+0 and IMHO that will solve
your problems. There are tons of applications, distributed filesystems
and so on that you can use to provide consistency and redundancy, as
well as one major thing: _security_. ACLs in databases suck. You can put
permissions on a table or a database, but can you put permissions on
every record? I guess not in the everyday database.

> As for the access time, that's true, in part, but the second part of
> your statement is false. file system mta's with large numbers of users
> will suffer tremendously compared to a db. A db is an indexed,
> organized storage, flat files are sequential only. So when you have
> large mailboxes, a database will win over flat file for access time
> since it doesn't have to seek the whole file to search a message in the
> middle.

B-trees and hash tables are used in filesystems as well.

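As an illustration of the point that indexing is not exclusive to
databases, here is a minimal sketch of a byte-offset index over a flat
mailbox file. It assumes a simplified mbox-style layout where every
message starts with a line beginning with "From " (real mbox files also
need "From " escaping inside bodies, which is ignored here). The names
build_offset_index and read_message are hypothetical, not taken from any
MTA or mail library.

```python
import io


def build_offset_index(fh):
    """Scan the file once and record the byte offset where each message starts."""
    offsets = []
    fh.seek(0)
    while True:
        pos = fh.tell()
        line = fh.readline()
        if not line:
            break
        # Simplifying assumption: any line starting with "From " opens a message.
        if line.startswith(b"From "):
            offsets.append(pos)
    return offsets


def read_message(fh, offsets, n):
    """Seek straight to message n instead of reading everything before it."""
    start = offsets[n]
    fh.seek(start)
    if n + 1 < len(offsets):
        return fh.read(offsets[n + 1] - start)
    return fh.read()


# An in-memory stand-in for a mailbox file on disk.
mbox = io.BytesIO(
    b"From a@example.org\nSubject: one\n\nfirst body\n"
    b"From b@example.org\nSubject: two\n\nsecond body\n"
    b"From c@example.org\nSubject: three\n\nthird body\n"
)

index = build_offset_index(mbox)
middle = read_message(mbox, index, 1)
```

The index costs one sequential pass to build; after that, fetching a
message from the middle of the mailbox is a single seek plus a bounded
read, which is the same idea a database index or a filesystem B-tree
applies at a larger scale.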
> Of course, if you're only running pop3 and your clients are
> behaving appropriately (downloading all messages every time) a flat file
> might show a performance benefit (since it is a sequential read and all
> the other indexing isn't needed). But some clients will skip messages
> they've already downloaded, requiring an expensive seek without an index
> as to where to look, and pop3 is becoming less popular as imap (as a
> direct protocol, or the backend of a webmail system) becomes more
> popular, which don't regularly do a sequential read.

The main strength of a database, I'd say, is in its cache and the way
it is handled, the write-ahead logs and so on (that's where metadata
comes in). Look at reiserfs: it uses B-trees, it also has a hash table,
not to mention the metadata. Now look at reiserfs 4 :-). Even metadata
and data journaling gives you more secure storage than a database. XFS,
EXT3, REISERFS, JFS, NTFS: all of these are journaling filesystems. One
thing I know for sure is that most of these features are derived from
database designs.

Let's be honest: would you prefer speed over being consistent, secure
and redundant?! I don't think so. Hardware is cheap, horsepower is
cheap. Building a cluster of distributed filesystems is much more
feasible than doing the same with a database. For one, most databases
use the lazy replication approach rather than eager, so all the updates
are propagated _after_ the transaction has committed. With eager
replication you have synchronous replication, where the transactions are
spread across the whole cluster, not to mention eager partial
replication, where different bits can reside on different servers. This
buys you:

1) More granularity
2) More consistency in the sense of data writes
3) A whole variety of options to add/remove servers depending on the
   load
4) ... and so on.

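The difference between lazy and eager replication described above can
be sketched with a toy in-memory model. This is only an illustration
under stated assumptions: the classes Replica, LazyPrimary and
EagerPrimary are hypothetical, there is no network, no failure
handling, and no real commit protocol (a real eager scheme would
typically need something like two-phase commit).

```python
class Replica:
    """A passive copy that simply applies whatever updates it is sent."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class LazyPrimary:
    """Commits locally first; replicas receive the update some time later."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = []  # updates committed locally but not yet shipped

    def write(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))

    def propagate(self):
        # Runs after the transaction is already done on the primary.
        for key, value in self.pending:
            for replica in self.replicas:
                replica.apply(key, value)
        self.pending.clear()


class EagerPrimary:
    """Applies the update on every replica as part of the write itself."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas

    def write(self, key, value):
        for replica in self.replicas:
            replica.apply(key, value)
        self.data[key] = value


lazy_replica = Replica()
lazy = LazyPrimary([lazy_replica])
lazy.write("msg1", "hello")
stale_before = "msg1" not in lazy_replica.data  # replica lags until propagate()
lazy.propagate()

eager_replica = Replica()
eager = EagerPrimary([eager_replica])
eager.write("msg1", "hello")
in_sync_at_commit = eager_replica.data == eager.data
```

In the lazy case there is a window in which the primary has the write
and the replica does not; in the eager case the write only returns once
every copy has it, which is where the extra consistency (and the extra
latency) comes from.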
> In addition, file access becomes increasingly slow when you have
> thousands of files in a directory, or thousands of directories in a
> directory with hundreds of files in those subdirectories... Flat files
> do not win in any scenario other than ease of installation on extremely
> small setups.

Actually, files aren't a bad idea; flatly rejecting them in this case
looks like a biased opinion. Look at Plan 9: it has concepts that other
OSes can only dream of ;-) Also, how many files to put in a directory is
the user's choice, and a database can just as easily start coughing when
it has too many indices and so on.

I mean, if we're going to give reasonable facts, let's put all the cards
on the table and not hide anything from the user. A good FS vs DB
comparison would be very nice to have.

-- 
Lou Kamenov
Researcher/Security Analyst
AEYE R&D - http://www.aeye.net [EMAIL PROTECTED]
AEYE Tech - http://www.aeye.biz [EMAIL PROTECTED]
phone: +44 (0) 20 8879 9832
fax: +44 (0) 7092 129079
mobile: +44 (0) 79 3945 3026
PGP Key ID - 0xA297084A
AEYE(=AI) stands for Artificial Intelligence.
