On Mon, Nov 03, 2008 at 10:13:49AM -0200, Gilberto Camara wrote:
> Dear all
>
> Concerning the benefits of having raster data
> stored together with vector data in a spatial
> database, let me first quote from an excellent
> paper from the late Jim Gray
> ("Scientific Data Management in the Coming Decade"):
>
> "What’s wrong with files?
> Everything builds from files as a base. HDF uses files.
> Database systems use files. But, file systems have no
> metadata beyond a hierarchical directory structure and file
> names. They encourage a do-it-yourself-data-model that
> will not benefit from the growing suite of data analysis
> tools. They encourage do-it-yourself-access-methods that
> will not do parallel, associative, temporal, or spatial
> search. They also lack a high-level query language.
> Lastly, most file systems can manage millions of files, but
> by the time a file system can deal with billions of files, it
> has become a database system."
>
> In other words, if you have substantial amounts of raster
> data (as is increasingly the case in geospatial application),
> you will need to develop a significant amount of software
> to manage your files. Unless... your data is handled by a
> raster-enabled spatial database.

I don't see anything in that paragraph that indicates that storing the *image data* in the database is important. (A link to the paper online or something could change that, of course.)

Specifically, I don't think there's any doubt that if you have a great many files, the *queryable image information* -- things like spatial extent, temporal extent, etc. -- belongs in a database. The question is whether the "data" column stores a file path or the image data. Until/unless databases have image manipulation tools built in, I can't see the value of storing the image data itself in the database.

So far as I can tell, the points above argue against file-system-based metadata storage/retrieval (sorting files by date, searching through index files, etc.), but I don't see a compelling argument there for putting the image data in the database.

Of course, this assumes that the image data access pattern is the same "in the database" and "on disk": for example, storing GeoTIFF data, then using GDAL to parse the string from the database as a GeoTIFF file. If the database you're using has a different (faster) image access algorithm, then of course there can be benefits. However, those same benefits could presumably be realized with sufficiently complete libraries for accessing the image externally: if Oracle's database product, for example, internally tiles the image, and Oracle had a library to access the image in the same way, presumably you could store those bits on disk as well. However, if that library depends internally on a database, then integrating everything into the same database might help in some ways.

In any case, I think there are obvious reasons to store your image metadata in a database -- but *using the same tools for accessing the images*, I don't think we've yet seen a compelling argument for storing image blobs in the database.
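
To make the "file path in the data column" option concrete, here is a minimal sketch of a metadata catalog. It is only an illustration under assumptions: it uses Python's built-in sqlite3 module rather than a real spatial database, and the table name, column names, and file paths are all hypothetical. The pixels stay on disk; only the queryable extent, date, and path go in the database.

```python
import sqlite3


def build_catalog(conn):
    # Hypothetical schema: queryable metadata plus a file path, no image blob.
    conn.execute(
        """CREATE TABLE image_catalog (
               id INTEGER PRIMARY KEY,
               path TEXT NOT NULL,
               min_x REAL, min_y REAL, max_x REAL, max_y REAL,
               acquired TEXT  -- ISO 8601 date, so string comparison sorts by time
           )"""
    )


def add_image(conn, path, bbox, acquired):
    min_x, min_y, max_x, max_y = bbox
    conn.execute(
        "INSERT INTO image_catalog (path, min_x, min_y, max_x, max_y, acquired) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (path, min_x, min_y, max_x, max_y, acquired),
    )


def find_images(conn, bbox, start, end):
    """Return paths of images whose extent intersects bbox within [start, end]."""
    q_min_x, q_min_y, q_max_x, q_max_y = bbox
    rows = conn.execute(
        """SELECT path FROM image_catalog
           WHERE min_x <= ? AND max_x >= ?
             AND min_y <= ? AND max_y >= ?
             AND acquired BETWEEN ? AND ?
           ORDER BY path""",
        (q_max_x, q_min_x, q_max_y, q_min_y, start, end),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
build_catalog(conn)
# Hypothetical files; in practice these paths would point at real rasters on disk.
add_image(conn, "/data/scenes/a.tif", (0, 0, 10, 10), "2008-10-01")
add_image(conn, "/data/scenes/b.tif", (20, 20, 30, 30), "2008-10-15")
add_image(conn, "/data/scenes/c.tif", (5, 5, 15, 15), "2007-01-01")

paths = find_images(conn, (8, 8, 12, 12), "2008-01-01", "2008-12-31")
print(paths)
```

The returned paths would then be handed to whatever library already reads the files (GDAL, say), which is the "same tools" case discussed above: the database answers the spatial and temporal question, and the image access pattern is unchanged.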

Of course, all things are not equal :) If your database has built-in MrSID support, for example, you could imagine using database storage for images, because you'd get the automatic compression combined with the querying -- but that's not about the database specifically, just the image storage/reading library that comes along with it.

Regards,
-- 
Christopher Schmidt
Web Developer