On Mon, Aug 3, 2009 at 3:08 PM, Wade Preston Shearer < [email protected]> wrote:
> I saved this long ago and therefore no longer have the source URL, but here > is some information about saving binary files in a database: > > ____________________________________________ > > Subject: Why should I not put images in a database? > Author: vincent > Posted: 04/11/2001 > Updated: 04/11/2001 > Because it makes no sense at all. > > The base of all evil > When you store binary data in a database, you are just using the database > as a filesystem. But where does the database store it's data? That's right: > on the filesystem. > > So all you do with queries to store binary data in a database is create a > very complex way of saving your files to disk. > > Fata morgana > It sounds clean and neat; just run a query and out pops your file. But is > it really that simple? Nope it is not. > > Data that is stored in a database cannot be accessed directly. It must > always be fetched using a query, and you can only run a query from a script, > so whenever you want to use a file that is stored in the database you _must_ > use a script to fetch it. > > Why is this bad? > Imagine you have an image gallery and you want to display an index page > with 9 images from the database. That means you need one script to fetch the > IDs of the images, print page with nine IMG tags that all link to a script > that fetches and displays the image. That means that you need to run 10 > scripts and do 10 database queries just to display the index. That's 10 hits > just for the page and the nine images. On average, each query will take > 30-100 ms to connect to the database, and another 200-300ms to fetch the > image, that's around 325ms per image, times nine is just under three seconds > just to make the page available (excluding download time). Yes those nine > will be concurrent and not sequencial, but higher concurrency never > increases the speed of a process. > > What else is bad news > Well, at 10 queries per pageview, you also need 10 concurrent database > connections per pageview. MySQL defaults to a max of 100, so you only need > 10 concurrent users on your site to go over that limit. > Then you will have to up the max connections of MySQL, but at 2.5MByte per > connection (plus whatever data you use in the query), you need a minimum of > 250MB for the first 100 connections, and another 250MB for the next 100. > Plus 128MB for the OS. You'd need 628MB just for the database server alone. > And remember, that is without the memory required for loading the files. > Apache also uses a fair share of memory, so if you are going to do this on > a (semi) professional level, you'd better have some serious servers to work > with. > > Just think of the table size. Try adding a few hundred megs worth of > binaries into a table and then do a sequencuial scan... ouch... > > Disaster > Also, think about what would happen if you has a busy site, say 30 > concurrent users, and you display an index of 25 images. Thats 780 > concurrent processes on apache, 780 php scripts making 780 concurrent > database connections, 780*2.5MB = 1950MB of memory required by the database > server. read: MySQL dies bigtime. > > > The solution: > The above should be more than enough to convince any sane person that > storing binary data in a database is a very bad thing. > > So what's the right way to store binaries? > Right where they belong, on disk. Put only the the name and location (and > whatever else you want) of the image or file in the database. > Then when you display a page you can print IMG or HREF tags that link > directly to the file on disk, so the webserver can serve the images > directly. That saves a massive number of scripts and database connections. > > Added bonuses: > - putting files on disk means your tables are _much_ smaller and thus > faster. > - files on disk can be backed up with any standard backup tool. > - files on disk can be edited without having to export them from there > database and re-import them later. > - by putting files on disk, you can share the load accross many disks. > - files on disk can break the 'magical' 2GB storage barrier. > > _______________________________________________ > Thanks, Wade. That was a great explanation. Dave _______________________________________________ UPHPU mailing list [email protected] http://uphpu.org/mailman/listinfo/uphpu IRC: #uphpu on irc.freenode.net
