Steve, I know you've been pointed at an interesting resource, and others are saying they've done this successfully, but I think you should get some balance. I've worked in document management for years, and I think it's a very bad idea. Have you ever had to restore a system from loads of incremental backups?
I'd recommend you look at storing the files on the file system, in a proper storage array that can be expanded when needed. Backing up and restoring files is much easier in this scenario: when someone deletes one document by mistake, or a file gets corrupted, you only have to restore that one file. How do you do that in your database system? The storage array also gives you much more flexible backup options, and it's much easier to run a full-text indexer such as Lucene over the raw files.

In the database, store all of the document metadata plus a unique id. On the file system, name each file after that id. That way it's easy to get from the file system to the database record, or from the full-text index to the database record. (A sketch of such a table appears below, after the quoted message.)

I have to say that reading the PHP article made me feel very uneasy - it all seemed fundamentally flawed to me. However, good luck if you decide to go that way. I'd be interested to hear whether you genuinely feel it has been a success, especially after you've recovered from your first major data loss :-) (This is not meant to be sarcastic.)

Best regards,
Andy

-----Original Message-----
From: Steve Folly [mailto:[EMAIL PROTECTED]
Sent: 07 January 2004 20:56
To: MySQL
Subject: MySQL as document storage?

Hi,

(Disclaimer - this thread could easily go off topic; I'm interested only in the MySQL aspects of what follows...)

At work we are currently investigating ways of filing all our electronic documents. There is commercial software that will do this, I know, but I was wondering whether MySQL would be suitable for this kind of thing. The 'documents' could be literally any binary file.

My idea would be to create a table with a BLOB column for the document itself, plus document title, reference number, keywords and other metadata, and a web-based front end to search and serve documents.

Although the documents could be any file, the majority would be textual documents (Word documents, PDFs, etc.). How would one go about indexing such data, since full-text searches operate on textual columns? How does one cope with columns exceeding the maximum packet length? And why is there a max packet length setting at all - surely this is low-level stuff that shouldn't affect query and result sizes?

Is storing the actual documents in the database such a good idea anyway? Perhaps store the file in a file system somewhere and just store the filename?

If anyone has experience of doing (or being dissuaded from doing) this kind of application, your thoughts and comments would be appreciated. (If only to tell me "don't be so stupid, it'll never work" :)

Thanks.

--
Regards,
Steve.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/[EMAIL PROTECTED]
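To make the scheme Andy describes concrete - metadata and a unique id in MySQL, with each file on disk named after that id - here is a minimal sketch. The table layout, column names, directory path and sample values are illustrative assumptions only; none of them come from the thread itself.

    -- Metadata table (illustrative): everything searchable lives in MySQL,
    -- the document body itself lives on the file system.
    CREATE TABLE document (
        id           INT UNSIGNED NOT NULL AUTO_INCREMENT,
        title        VARCHAR(255) NOT NULL,
        reference_no VARCHAR(64),
        keywords     TEXT,
        mime_type    VARCHAR(100),
        added_on     DATETIME NOT NULL,
        PRIMARY KEY (id)
    );

    -- Store the metadata first; the generated id doubles as the filename.
    INSERT INTO document (title, reference_no, keywords, mime_type, added_on)
    VALUES ('Example document', 'REF-001', 'example keywords',
            'application/pdf', NOW());

    -- LAST_INSERT_ID() returns the new id; the application then writes the
    -- uploaded file to something like /var/docstore/<id> (a hypothetical
    -- path) and hands that same file to the full-text indexer (e.g. Lucene),
    -- so the database row, the file on disk and the index entry are all
    -- joined on the one id.
    SELECT LAST_INSERT_ID();

One practical consequence of this layout is that restoring a single accidentally deleted document means restoring one file from the storage array and, at most, one row of metadata, which is the backup flexibility Andy refers to above.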
