Steve,

I know you've been pointed at an interesting resource, and others are saying
they've done this successfully, but I think you should get some balance.
I've worked in document management for years, and I think it's a very bad
idea.  Have you ever had to restore a system from loads of incremental
backups?

I'd recommend you look at storing the files on the files system in a proper
storage array that can be expanded when needed.  Backing up and restoring
files is much easier in this scenario, as when someone deletes 1 document by
mistake or a file gets corrupted, you only have to restore 1 file.  How do
you do this in your database system?  The storage array will give much more
flexible backup options.

It's also much easier to use a full text indexer such as Lucene on the raw
files.

In the database, you store all of the document metadata and a unique id.  On
the file system, name the files as the same as the id.  That way, it's easy
to get between the file system and the database record, or the full text
index and the database record.

I have to say that reading the PHP article made me feel very uneasy - it all
seemed fundamentally flawed to me.

However, good luck if you decide to go that way.  I'd be interested to hear
if you genuinely feel it has been a success, especially after you've
recovered from your first major data loss :-) 
(This is not meant to be sarcastic)

Best regards,

Andy


-----Original Message-----
From: Steve Folly [mailto:[EMAIL PROTECTED] 
Sent: 07 January 2004 20:56
To: MySQL MySQL
Subject: MySQL as document storage?

Hi,

(disclaimer - this thread could easily go off topic; I'm interested 
only in the MySQL aspects of what follows...)

At work we are currently investigating ways of filing all our 
electronic documents.

There is commercial software that will do this I know, but I was 
wondering whether MySQL would be suitable for this type of thing.

The 'documents' could be literally any binary file. My idea would be to 
create a table with a blob column for the document itself, and document 
title, reference number, keywords, other meta-data. And a web-based 
front-end to search and serve documents.

Although the documents could be any file, the majority would be textual 
documents (Word documents, PDF, etc). How would one go about indexing 
such data, since full text searches operate on textual columns?

How to cope with columns exceeding the max packet length? Why is there 
a max_packet_length setting; surely this is low-level stuff that 
shouldn't affect query and result sizes?

Is storing the actual documents in the database such a good idea 
anyway? Perhaps store the file in a file system somwhere and just store 
the filename?


If anyone has experience in doing (or been dissuaded from doing) this 
kind of application your thoughts and comments would be appreciated. 
(If only to tell me "don't be so stupid, it'll never work" :)

Thanks.

-- 
Regards,
Steve.


-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/[EMAIL PROTECTED]




-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/[EMAIL PROTECTED]

Reply via email to