Steve Atkins wrote:
On Jun 17, 2009, at 8:43 AM, Mike Kay wrote:

Now that's an interesting way of doing this that I never thought about before.
Using a fileserver though, how would I categorize and index the files?

I was planning on using multiple databases to hold the data - one for each
client and a separate database for each file type. Yes, they would be
hosted on the same server.  I see the bottleneck.

I suppose that instead of saving the files, indexes and categories all in the same database, I could simply reference the location and file names in
the database - and index and categorize in this manner. Does this make
sense?

Storing all the metadata in the database and the content on the filesystem
of the webserver lets both do what they're good at.

Serving static files from the filesystem of the webserver is ridiculously
cheap compared with retrieving the data from the database, as it's
something that everything from the kernel up is optimized to do.
Backups are much simpler too.
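The split described above can be sketched in a few lines: metadata (client, category, filename, checksum) goes in a table, while the bytes land on the webserver's filesystem. This is a minimal sketch only; the schema and `save_file` helper are hypothetical, and sqlite3 is used purely so the example is self-contained — with PostgreSQL the same pattern applies through a driver such as psycopg2.

```python
import hashlib
import os
import sqlite3
import tempfile

# Hypothetical layout: the database holds only metadata; file content
# lives on disk where the webserver can serve it directly.
store_dir = tempfile.mkdtemp()
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE files (
    id       INTEGER PRIMARY KEY,
    client   TEXT NOT NULL,
    category TEXT NOT NULL,
    filename TEXT NOT NULL,
    path     TEXT NOT NULL,   -- location on the webserver's filesystem
    sha256   TEXT NOT NULL,   -- checksum, useful for detecting drift
    nbytes   INTEGER NOT NULL)""")

def save_file(client, category, filename, data):
    """Write content to disk; record only metadata in the database."""
    digest = hashlib.sha256(data).hexdigest()
    path = os.path.join(store_dir, digest)
    with open(path, "wb") as f:
        f.write(data)
    db.execute(
        "INSERT INTO files (client, category, filename, path, sha256, nbytes)"
        " VALUES (?, ?, ?, ?, ?, ?)",
        (client, category, filename, path, digest, len(data)))
    db.commit()
    return path

save_file("acme", "invoices", "jan.pdf", b"%PDF-1.4 ...")
row = db.execute("SELECT client, category, filename FROM files").fetchone()
print(row)  # ('acme', 'invoices', 'jan.pdf')
```

Queries for categorizing and indexing then run against the small metadata table, while the actual download is a plain static-file read.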


Whether to store BLOBs in the database or on the filesystem is a very old debate, going back and forth with the common refrain that the DB is slower and the filesystem faster. Using a DB eases maintenance, simplifies indexing and security, and gives transaction protection to the files.

In my view, the only argument still holding water for storing large binary files on the filesystem instead of the DB is the overhead/access-time cost of connecting to and reading data from the DB. The filesystem simply wins there, yet it has several drawbacks.

Also consider that one no longer needs to use the large object interface; the bytea type with TOAST simplifies that problem. The drawbacks are that you can't jump around within the binary stream, and the size is limited to 1 GB per record.

One of the big drawbacks to using the filesystem with a DB for indexing/metadata is keeping the two up to date and linked. If files get accidentally deleted or moved to different directories, the database index becomes useless. This by itself can cause a maintenance nightmare as the number of files and directories gets into the tens of thousands. It also complicates disaster recovery, because the directory structure has to be recreated exactly to get everything working again.
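The drift problem above can at least be detected cheaply with a periodic consistency check: walk the metadata table and flag rows whose file is gone, plus on-disk files no row references. This is a hypothetical helper, not anything from the thread; sqlite3 stands in for PostgreSQL so the sketch is self-contained, and the `files` table with a `path` column is an assumed schema.

```python
import os
import sqlite3
import tempfile

def check_integrity(db, store_dir):
    """Return (missing, orphans): DB ids whose file no longer exists,
    and on-disk paths that no database row references."""
    rows = db.execute("SELECT id, path FROM files").fetchall()
    missing = [fid for fid, path in rows if not os.path.exists(path)]
    known = {path for _, path in rows}
    on_disk = {os.path.join(store_dir, n) for n in os.listdir(store_dir)}
    return missing, sorted(on_disk - known)

# Demo: one intact row, one row whose file was "accidentally deleted".
store = tempfile.mkdtemp()
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, path TEXT)")
kept = os.path.join(store, "kept.bin")
with open(kept, "wb") as f:
    f.write(b"data")
db.execute("INSERT INTO files VALUES (1, ?)", (kept,))
db.execute("INSERT INTO files VALUES (2, ?)", (os.path.join(store, "gone.bin"),))

missing, orphans = check_integrity(db, store)
print(missing, orphans)  # [2] []
```

Running such a check from cron doesn't prevent the mismatch, but it turns a silent maintenance nightmare into a report you can act on before the file count reaches the tens of thousands.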

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
