> I believe ST_Intersects() works on out-of-db rasters in the 2.0 series,
> possibly 2.0.1.

Hmmm, for me it it fails for the (raster, integer, geometry) signature:

raster_test=> SELECT rid FROM basins INNER JOIN bcsd ON ST_Intersects(rast, 1, 
the_geom) WHERE rid = 39;
ERROR:  rt_raster_intersects not implemented yet for OFFDB bands
CONTEXT:  PL/pgSQL function "_st_intersects" line 20 at RETURN

but it appears that you're right for the (geometry, raster, integer) signature:

raster_test=> SELECT rid FROM basins INNER JOIN bcsd ON ST_Intersects(the_geom, 
rast, 1) WHERE rid = 39;
 rid 
-----
  39
(1 row)

> I wonder what your benchmark's performance would be like if the raster
> is out-db.  I'd expect a flat line with little change regardless the #
> of bands.

Ah ha!  Yes, that's definitely the case.  With out of db storage, each of 
intersects/clip queries comes back in < 200ms, regardless of num bands.  That's 
more of the behaviour that I was expecting, too.  Thanks for helping me put a 
finger on it!

~James

On Mon, Oct 29, 2012 at 04:33:36PM -0700, Bborie Park wrote:
> I believe ST_Intersects() works on out-of-db rasters in the 2.0 series,
> possibly 2.0.1.
> 
> As for performance of in-db vs out-db, in-db is slightly faster but my
> benchmarks are rather old.  I hope to do some testing soon to see if I
> can improve out-db performance.
> 
> Tile size is critical regardless of whether or not you're going to store
> your rasters in-db or out-db.  Generally, tiles should be 100x100 or
> smaller.  Ideal tile size depends upon the input raster's dimensions and
> what tile dimension is cleanly divisible from the raster's dimension.
> 
> I wonder what your benchmark's performance would be like if the raster
> is out-db.  I'd expect a flat line with little change regardless the #
> of bands.
> 
> -bborie
> 
> On 10/29/2012 04:23 PM, James Hiebert wrote:
> >> If you've got a large number of bands (100s or more), you may want to
> >> consider having the rasters be out-of-db.
> > 
> > I had considered that (better, actually, than duplicating our data, 
> > right?), but was finding that st_intersects wasn't yet implemented for out 
> > of db storage.  Looking through the trunk code, though, it appears that 
> > maybe you've gone ahead and implemented that since 2.0.1?  If so, great!  
> > ST_PixelAsPoints() is another good reason for me to seriously consider 
> > working out of trunk...
> > 
> >> Part of the problem is that
> >> anything stored in PostgreSQL (in-db) is TOASTed so needs to be
> >> deserialized (and probably decompressed).  So, if the serialized raster
> >> is big (more bands), the deTOASTing will take longer.
> > 
> > Thanks; good to know.
> > 
> >> Another problem with your benchmark query is that the ST_Clip() is
> >> running twice (for height and width).
> > 
> > Ah, that changes the picture pretty dramatically (see attached plot).  
> > Since it improves by a lot more than a factor of two, I suspect maybe I'm 
> > having some desktop scaling issues or something.  I'll go ahead and 
> > actually put this on our database server, try the trunk version, and go 
> > from there.  This is at least somewhat encouraging :)  Thanks for the 
> > suggestions.
> > 
> > ~James
> > 
> > On Mon, Oct 29, 2012 at 03:50:04PM -0700, Bborie Park wrote:
> >> James,
> >>
> >> I use PostGIS raster for a similar purpose (model outputs) though my
> >> model outputs are for a specific day (average temperature for a specific
> >> date).  So, one raster with one band per day per variable.  I could
> >> combine a year's worth of bands into one raster but I decided against that.
> >>
> >> If you've got a large number of bands (100s or more), you may want to
> >> consider having the rasters be out-of-db.  Part of the problem is that
> >> anything stored in PostgreSQL (in-db) is TOASTed so needs to be
> >> deserialized (and probably decompressed).  So, if the serialized raster
> >> is big (more bands), the deTOASTing will take longer.
> >>
> >> Another problem with your benchmark query is that the ST_Clip() is
> >> running twice (for height and width).
> >>
> >> If you're in the evaluation stage and you're compiling PostGIS yourself,
> >> I'd recommend trying SVN -trunk (will become 2.1) as it has additional
> >> capabilities and performance improvements.  I'm already using -trunk in
> >> production as I needed the new features (full disclosure: I wrote almost
> >> the new features in -trunk).
> >>
> >> -bborie
> >>
> >> On 10/29/2012 03:32 PM, James Hiebert wrote:
> >>> Hi All,
> >>>
> >>> I'm considering using PostGIS rasters for storage of raster data at my 
> >>> organization and I'm looking for some advice (or perhaps a reality 
> >>> check).  I work for a region climate services provider and the vast 
> >>> majority of our data (by volume, not necessarily complexity) are output 
> >>> from climate models.  These are generally a n-by-m raster with one band 
> >>> for each timestep.  There could be upwards of 36k to 72k timesteps for a 
> >>> typical model run.  We have hundreds of model runs.
> >>>
> >>> So my question is, is it insane to be thinking of storing that many bands 
> >>> in a PostGIS raster?  Or more specifically, is this _not_ a use case for 
> >>> which PostGIS rasters were designed?  I notice that most of the examples 
> >>> in the docs and in "PostGIS In Action" focus only on images and I can 
> >>> imagine that handling multispectral satellite images as being more of the 
> >>> intended use case.
> >>>
> >>> I did a little benchmarking of a typical use case of ours ("What's the 
> >>> average temperature inside a some polygon, e.g. a river basin?").  I 
> >>> noticed that the run time for doing a ST_Clip(raster, band, geometry) and 
> >>> ST_Intersects(raster, band, geometry) appears to be super-linear even 
> >>> when doing it on just a single band.  I ran the following query:
> >>> SELECT rid, st_height(st_clip(rast, 1, the_geom)), st_width(st_clip(rast, 
> >>> the_geom)) FROM basins INNER JOIN bcsd ON ST_Intersects(rast, 1, 
> >>> the_geom) WHERE rid = <rid> (where basins is table of river basins with 
> >>> one single polygon and bcsd is a table with a raster column "rast").
> >>> for a set of rasters with increasing number of bands, and the time to run 
> >>> the query is shown in the attached plot.  Since the raster properties are 
> >>> presumably shared across all the bands, it seems odd to me that run time 
> >>> would increase.  I would expect it to be _contant_ (with constant number 
> >>> of pixels), but I suppose that that's my own ignorance as to how the PG 
> >>> type extensions work?
> >>>
> >>> Comments or explanations are welcome.
> >>>
> >>> ~James

-- 
James Hiebert
Lead, Computational Support
Pacific Climate Impacts Consortium
http://www.pacificclimate.org
Room 112, University House 1, University of Victoria
PO Box 1700 Sta CSC, Victoria, BC V8V 2Y2
E-mail: hieb...@uvic.ca
Tel: (250) 472-4521
Fax: (250) 472-4830
_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users

Reply via email to