I believe ST_Intersects() works on out-of-db rasters in the 2.0 series, possibly 2.0.1.
As for performance of in-db vs out-db, in-db is slightly faster but my benchmarks are rather old. I hope to do some testing soon to see if I can improve out-db performance. Tile size is critical regardless of whether or not you're going to store your rasters in-db or out-db. Generally, tiles should be 100x100 or smaller. Ideal tile size depends upon the input raster's dimensions and what tile dimension is cleanly divisible from the raster's dimension. I wonder what your benchmark's performance would be like if the raster is out-db. I'd expect a flat line with little change regardless the # of bands. -bborie On 10/29/2012 04:23 PM, James Hiebert wrote: >> If you've got a large number of bands (100s or more), you may want to >> consider having the rasters be out-of-db. > > I had considered that (better, actually, than duplicating our data, right?), > but was finding that st_intersects wasn't yet implemented for out of db > storage. Looking through the trunk code, though, it appears that maybe > you've gone ahead and implemented that since 2.0.1? If so, great! > ST_PixelAsPoints() is another good reason for me to seriously consider > working out of trunk... > >> Part of the problem is that >> anything stored in PostgreSQL (in-db) is TOASTed so needs to be >> deserialized (and probably decompressed). So, if the serialized raster >> is big (more bands), the deTOASTing will take longer. > > Thanks; good to know. > >> Another problem with your benchmark query is that the ST_Clip() is >> running twice (for height and width). > > Ah, that changes the picture pretty dramatically (see attached plot). Since > it improves by a lot more than a factor of two, I suspect maybe I'm having > some desktop scaling issues or something. I'll go ahead and actually put > this on our database server, try the trunk version, and go from there. This > is at least somewhat encouraging :) Thanks for the suggestions. > > ~James > > On Mon, Oct 29, 2012 at 03:50:04PM -0700, Bborie Park wrote: >> James, >> >> I use PostGIS raster for a similar purpose (model outputs) though my >> model outputs are for a specific day (average temperature for a specific >> date). So, one raster with one band per day per variable. I could >> combine a year's worth of bands into one raster but I decided against that. >> >> If you've got a large number of bands (100s or more), you may want to >> consider having the rasters be out-of-db. Part of the problem is that >> anything stored in PostgreSQL (in-db) is TOASTed so needs to be >> deserialized (and probably decompressed). So, if the serialized raster >> is big (more bands), the deTOASTing will take longer. >> >> Another problem with your benchmark query is that the ST_Clip() is >> running twice (for height and width). >> >> If you're in the evaluation stage and you're compiling PostGIS yourself, >> I'd recommend trying SVN -trunk (will become 2.1) as it has additional >> capabilities and performance improvements. I'm already using -trunk in >> production as I needed the new features (full disclosure: I wrote almost >> the new features in -trunk). >> >> -bborie >> >> On 10/29/2012 03:32 PM, James Hiebert wrote: >>> Hi All, >>> >>> I'm considering using PostGIS rasters for storage of raster data at my >>> organization and I'm looking for some advice (or perhaps a reality check). >>> I work for a region climate services provider and the vast majority of our >>> data (by volume, not necessarily complexity) are output from climate >>> models. These are generally a n-by-m raster with one band for each >>> timestep. There could be upwards of 36k to 72k timesteps for a typical >>> model run. We have hundreds of model runs. >>> >>> So my question is, is it insane to be thinking of storing that many bands >>> in a PostGIS raster? Or more specifically, is this _not_ a use case for >>> which PostGIS rasters were designed? I notice that most of the examples in >>> the docs and in "PostGIS In Action" focus only on images and I can imagine >>> that handling multispectral satellite images as being more of the intended >>> use case. >>> >>> I did a little benchmarking of a typical use case of ours ("What's the >>> average temperature inside a some polygon, e.g. a river basin?"). I >>> noticed that the run time for doing a ST_Clip(raster, band, geometry) and >>> ST_Intersects(raster, band, geometry) appears to be super-linear even when >>> doing it on just a single band. I ran the following query: >>> SELECT rid, st_height(st_clip(rast, 1, the_geom)), st_width(st_clip(rast, >>> the_geom)) FROM basins INNER JOIN bcsd ON ST_Intersects(rast, 1, the_geom) >>> WHERE rid = <rid> (where basins is table of river basins with one single >>> polygon and bcsd is a table with a raster column "rast"). >>> for a set of rasters with increasing number of bands, and the time to run >>> the query is shown in the attached plot. Since the raster properties are >>> presumably shared across all the bands, it seems odd to me that run time >>> would increase. I would expect it to be _contant_ (with constant number of >>> pixels), but I suppose that that's my own ignorance as to how the PG type >>> extensions work? >>> >>> Comments or explanations are welcome. >>> >>> ~James > > > > _______________________________________________ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users > -- Bborie Park Programmer Center for Vectorborne Diseases UC Davis 530-752-8380 bkp...@ucdavis.edu _______________________________________________ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users