On Tuesday 05 March 2002 10:05, Thomas Spahni wrote:

There are some good reasons for wanting to store data directly in the 
database, sometimes.

For instance I built an application recently that used blobs. The reason is 
simplicity and security. In a web application it is nice to just dump the 
files into a directory and point to them, but consider that if you are 
building NOTHING but a database server, you don't want to have to support and 
manage security for another protocol, like ftp or http. In addition in my 
case it would have made the client code much more complicated since the 
client could already deal with an ODBC data source. I would have had to code 
in an entire new piece of functionality to deal with the ftp or http 
protocol, then figure out how to engineer it so the client code would 
automatically go and get the data, etc. Plus now I can't just manage data 
security with SQL GRANT instructions.

On top of that huge directories are very slow in most file systems. We had 
over 24 million records! That would mean that some elaborate scheme would 
have to be developed, like MD5 hash the filename and walk a series of 
directory levels down based on characters in the hash, etc. Since data is 
getting added and removed constantly there would then have to be a way to 
manage that tree to keep it from getting unbalanced etc. In the final 
analysis this would have been pretty close to the equivalent of building our 
own b-tree index! 

Given that database already have highly optimized indexes, it seemed more 
sensible to use the existing functionality of blobs and indexes. I think what 
all this points out is that MySQL developers might want to consider how to 
better support this type of application in the future. Currently HUGE tables 
of large blobs ARE very awkward. Our blob tables are terabytes in size, and 
the corruption of one record can force the tape restore from lower regions... 
Not to mention the horrors of backing the thing up, or any operation that 
mysql tries to perform by copying the table (of which there are several I 
believe). Maybe a special table handler module could be written that would 
cater to the needs of this type of application, plus a few special tools for 
doing repairs and such. Just an idea :o). 

> On Mon, 4 Mar 2002, Dang Nguyen wrote:
> > I'd like to know how to load files, such as MS-Word documents or PDF
> > documents, in a mySQL database.  I've setup a blob type in a table, but
> > where do I go from there?
> >
> > The purpose of this is to store files uploaded from a web page and
> > processed with Java servlets.  Then, the files should be retrievable
> > (displayed or downloaded) back to a client browser.  My environment:
> > Apache Web server 1.3.x on Solaris 2.8 with Java servlets environment.
>
> MS-Word and PDF files are essentially binary files (very much like
> images). I suggest leaving them as they are in a separate directory.
>
> These MS-Word and PDF files may be converted to plain ASCII using filters
> (mswordview, pdftotext). The resulting ASCII versions are then loaded into
> a table with one column of type TEXT along with a VARCHAR pointer to the
> filename. Create a fulltext index on the TEXT field.
>
> A fulltext search for some keywords will return a sorted list of
> filenames. These files can be returned to the client.
>
> This scenario works well for me on a collection of some 15'000 HTML
> documents (I used w3m as a filter to convert to ASCII). All conversions
> are done with a simple shell script calling the filters.
>
> Thomas Spahni
>
>
> ---------------------------------------------------------------------
> Before posting, please check:
>    http://www.mysql.com/manual.php   (the manual)
>    http://lists.mysql.com/           (the list archive)
>
> To request this thread, e-mail <[EMAIL PROTECTED]>
> To unsubscribe, e-mail
> <[EMAIL PROTECTED]> Trouble
> unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <[EMAIL PROTECTED]>
To unsubscribe, e-mail <[EMAIL PROTECTED]>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php

Reply via email to