Re: [postgis-users] Database design for LIDAR data

2011-06-24 Thread Howard Butler

On Jun 24, 2011, at 4:04 PM, Jonathan Greenberg wrote:

> Interesting.  I came across this paper detailing the design of
> opentopography.org's lidar system, and they indicate they are doing
> something akin to loading the LAS data in and then building a spatial
> index (I'm too early in this game to know the difference between what
> they are describing and how the GiST index works):
> http://www.springerlink.com/content/x5q937840983un76/fulltext.pdf

That was their first (failed) attempt. Now I think they are storing bounding 
boxes that point to LAS files, similar to many raster management systems that 
are built with PostGIS et al.  The US Army Corps of Engineers group that I am 
a part of has been driving lots of Oracle SDO_PC development, especially from 
the data management side of storing the actual point data in the database. 
We're 10x+ faster for loading data than when we started, developed a few 
algorithms to speed things up, and it has supported the ongoing development of 
libspatialindex/Rtree, PDAL, GDAL, libgeotiff, proj.4 and libLAS in the 
process. I have been pounding my head into this concrete wall for at least a 
couple of years now :)

> 
> Once I build an index for this 3-d data, setting aside the file size
> issues, should the spatial querying be relatively efficient?  

The cost of indexing all points, in processing time and index storage space, is 
simply not worth it when we start talking about billions to trillions of 
points.  The index just becomes some (significant) percentage of the existing 
burden that the points brought.  You most likely will never touch every point 
with windowed queries except for the case when you ask "give me all the 
points", which doesn't need an index anyway.  An index of bounds of tiles, even 
3d ones, is going to be much more efficient.  If all of your queries are for 
windows that are smaller than your blocks, just decrease the block size instead 
of attempting to index the individual points.

libLAS does have an octree index you can use to generate indexes with optional 
z-binning, and the Point Cloud Library also has an easy to use octree (no 
hookups for LAS yet though).

The most common spatial query for point cloud data is "here is my box, give me 
the points in this box".  Tiling the data, and then quickly throwing out 
candidate tiles eliminates much more cpu and data touching than having a giant 
index of 50 billion points and walking the index in some way to find candidates.
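
As a rough illustration of the tile-first approach described above, here is a
minimal Python sketch. The Tile class, the function names and the sample data
are all hypothetical, and the tile bounds are scanned linearly here; in a
database the bounds would sit in a spatially indexed table instead.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]        # (x, y, z)
Box = Tuple[float, float, float, float]   # (xmin, ymin, xmax, ymax)


@dataclass
class Tile:
    """Hypothetical tile: a bounding box plus the points stored under it."""
    bounds: Box
    points: List[Point]


def boxes_intersect(a: Box, b: Box) -> bool:
    # Two boxes overlap (or touch) when they overlap on both axes.
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def window_query(tiles: List[Tile], window: Box) -> List[Point]:
    hits = []
    for tile in tiles:
        # Cheap test first: discard whole tiles that cannot match.
        if not boxes_intersect(tile.bounds, window):
            continue
        # Only the surviving tiles have their points examined.
        for x, y, z in tile.points:
            if window[0] <= x <= window[2] and window[1] <= y <= window[3]:
                hits.append((x, y, z))
    return hits


tiles = [
    Tile((0, 0, 10, 10), [(1, 1, 5.0), (9, 9, 6.0)]),
    Tile((10, 0, 20, 10), [(11, 2, 7.0), (19, 9, 8.0)]),
    Tile((0, 10, 10, 20), [(5, 15, 9.0)]),
]
result = window_query(tiles, (8, 0, 12, 10))
print(result)
```

The number of bounding boxes tested grows with the number of tiles, not with
the number of points, which is the whole argument for indexing tiles only.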

When the most common query becomes "here is my box, give me the points in this 
box that match these attributes", something else can be done to index those 
data inside the blocks.  We're simply not there yet, as most processing of 
point cloud data happens in exploitation/visualization software, and the act of 
doing windowed queries already lowers their I/O costs significantly.

From a data management perspective, I think point cloud data are best treated 
as a specialized type of raster data.  They just get too unwieldy if you start 
treating them as points in the vector sense.

> If so,
> how would I go about doing a cross-tile query?

In Oracle's case, you select blocks that cross your window and then go and 
unpack the point data to inspect the raw data within those blocks that cross.  
All the blocks completely contained within your window already satisfy your 
query.
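
The two cases above (blocks fully inside the window versus blocks that merely
cross it) can be sketched in Python as follows. This is only an illustration
of the logic, not Oracle's implementation; the block layout, the function
names and the sample data are assumptions made up for the example.

```python
def contains(outer, inner):
    # True when box `inner` lies completely within box `outer`.
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def query_blocks(blocks, window):
    """Return (matching points, number of points tested individually)."""
    matches, inspected = [], 0
    for bounds, points in blocks:
        if contains(window, bounds):
            # Every point in this block satisfies the query already.
            matches.extend(points)
        elif intersects(bounds, window):
            # The block crosses the window edge: inspect each point.
            for x, y, z in points:
                inspected += 1
                if window[0] <= x <= window[2] and window[1] <= y <= window[3]:
                    matches.append((x, y, z))
    return matches, inspected


# Hypothetical blocks: (xmin, ymin, xmax, ymax) bounds plus their points.
blocks = [
    ((0, 0, 10, 10), [(2, 2, 1.0), (8, 8, 2.0)]),      # crosses the window
    ((10, 0, 20, 10), [(12, 3, 3.0), (18, 7, 4.0)]),   # inside the window
    ((30, 0, 40, 10), [(35, 5, 5.0)]),                 # disjoint
]
matches, inspected = query_blocks(blocks, (5, 0, 25, 10))
print(matches, inspected)
```

Only the two points of the crossing block are tested individually; the
contained block is accepted wholesale and the disjoint one is never touched.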

> 
> Howard, I am interested in checking out your tools but I don't have
> access to Oracle, just open source databases.  Can I use postgresql to
> utilize your algorithms?

You can set up and install Oracle for demonstration and development purposes 
without cost.  I would suggest going to Oracle's site and fetching one of the 
"Developer Days" VirtualBox VMs and save yourself the hassle of trying to 
figure out how to set up the darn thing.  Hop on the PDAL list if you want to 
discuss that stuff more. We shouldn't burden this list with the minutiae of it.


Howard


___
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users


Re: [postgis-users] Database design for LIDAR data

2011-06-24 Thread Jonathan Greenberg
Interesting.  I came across this paper detailing the design of
opentopography.org's lidar system, and they indicate they are doing
something akin to loading the LAS data in and then building a spatial
index (I'm too early in this game to know the difference between what
they are describing and how the GiST index works):
http://www.springerlink.com/content/x5q937840983un76/fulltext.pdf

Once I build an index for this 3-d data, setting aside the file size
issues, should the spatial querying be relatively efficient?  If so,
how would I go about doing a cross-tile query?

Howard, I am interested in checking out your tools but I don't have
access to Oracle, just open source databases.  Can I use postgresql to
utilize your algorithms?

Thanks!

--j

On Fri, Jun 24, 2011 at 12:39 PM, Shaun Langley  wrote:
> Hi Howard,
>
> I think this is an excellent question!  I'm actually in the process of 
> developing a manuscript that outlines the different methods for storage and 
> querying of spatial data such as LIDAR.  In my situation, I'm leaning towards 
> using triggers to create dynamic views that would allow me to simultaneously 
> query all tables of a given type.  I intend to explore a variety of different 
> storage types though... I would love to hear about what you decide to do!  
> Keep in touch!
>
> Cheers,
> Shaun
>
> On Jun 24, 2011, at 2:46 PM, Jonathan Greenberg wrote:
>
>> Folks:
>>
>> This topic I believe has been brought up before, but I thought I'd
>> send an email since I'm a bit of a noob with POSTGIS.  We have a large
>> collection of Lidar points that I would like to perform spatial
>> querying on (e.g. give me all points within a certain bounding box).
>> The data (currently in LAS format, but easily loadable into the DB),
>> is tiled up into smaller subsets.  The data is x,y,z,intensity (and
>> some other attributes that aren't so important)  I have a few
>> questions:
>>
>> 1) Should I load ALL of the LAS files into one massive table for
>> querying (this is going to be a LOT of points).
>> 2) If not, is there a trick where if I load up each LAS file into a
>> separate table (which would, in theory be preferable since I'd like to
>> do some testing before dealing with a database of this size), but
>> somehow when I do a spatial query, the query can span multiple tables
>> (e.g. say the query box is at the intersection of two adjacent tiles)?
>>
>> Related: what is the most efficient way to do a spatial query that
>> effectively "rasterizes" this data, e.g. the min z value between x1
>> and x2, and y1 and y2, where x2-x1 and y2-y1 are the x and y pixel
>> sizes?  I'm not talking about interpolation, I'm talking an exact
>> query.
>>
>> Thanks!
>>
>> --j
>>
>> --
>> Jonathan A. Greenberg, PhD
>> Assistant Project Scientist
>> Center for Spatial Technologies and Remote Sensing (CSTARS)
>> Department of Land, Air and Water Resources
>> University of California, Davis
>> One Shields Avenue
>> Davis, CA 95616
>> Phone: 415-763-5476
>> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307
>
> --
> Shaun Langley
> Graduate Student, PhD
> Department of Geography
> Michigan State University
> Home: (517) 974-9346
>
>
>


Re: [postgis-users] Database design for LIDAR data

2011-06-24 Thread Shaun Langley
Hi Howard,

I think this is an excellent question!  I'm actually in the process of 
developing a manuscript that outlines the different methods for storage and 
querying of spatial data such as LIDAR.  In my situation, I'm leaning towards 
using triggers to create dynamic views that would allow me to simultaneously 
query all tables of a given type.  I intend to explore a variety of different 
storage types though... I would love to hear about what you decide to do!  Keep 
in touch!
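
A minimal sketch of the "one view over many per-tile tables" idea, using
Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table
names, the view name and the rebuild_view helper are hypothetical; the
trigger part is left out, and the view is simply rebuilt by hand whenever the
list of tile tables changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per tile (hypothetical names and columns).
tile_tables = ["tile_a", "tile_b"]
for name in tile_tables:
    conn.execute(
        f"CREATE TABLE {name} (x REAL, y REAL, z REAL, intensity INTEGER)"
    )
conn.executemany(
    "INSERT INTO tile_a VALUES (?, ?, ?, ?)",
    [(1.0, 1.0, 10.0, 5), (9.5, 4.0, 11.0, 7)],
)
conn.executemany(
    "INSERT INTO tile_b VALUES (?, ?, ?, ?)",
    [(10.5, 4.0, 12.0, 9), (19.0, 9.0, 13.0, 3)],
)


def rebuild_view(conn, tables):
    # Recreate a single view that stacks every tile table.
    conn.execute("DROP VIEW IF EXISTS all_points")
    selects = " UNION ALL ".join(
        f"SELECT x, y, z, intensity FROM {t}" for t in tables
    )
    conn.execute(f"CREATE VIEW all_points AS {selects}")


rebuild_view(conn, tile_tables)

# A window that straddles the boundary between the two tiles.
rows = conn.execute(
    "SELECT x, y, z FROM all_points "
    "WHERE x BETWEEN ? AND ? AND y BETWEEN ? AND ? ORDER BY x",
    (9.0, 11.0, 0.0, 10.0),
).fetchall()
print(rows)
```

The query is written once against the view and returns points from both
tables, which is the cross-tile behaviour asked about in the original post.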

Cheers,
Shaun

On Jun 24, 2011, at 2:46 PM, Jonathan Greenberg wrote:

> Folks:
> 
> This topic I believe has been brought up before, but I thought I'd
> send an email since I'm a bit of a noob with POSTGIS.  We have a large
> collection of Lidar points that I would like to perform spatial
> querying on (e.g. give me all points within a certain bounding box).
> The data (currently in LAS format, but easily loadable into the DB),
> is tiled up into smaller subsets.  The data is x,y,z,intensity (and
> some other attributes that aren't so important)  I have a few
> questions:
> 
> 1) Should I load ALL of the LAS files into one massive table for
> querying (this is going to be a LOT of points).
> 2) If not, is there a trick where if I load up each LAS file into a
> separate table (which would, in theory be preferable since I'd like to
> do some testing before dealing with a database of this size), but
> somehow when I do a spatial query, the query can span multiple tables
> (e.g. say the query box is at the intersection of two adjacent tiles)?
> 
> Related: what is the most efficient way to do a spatial query that
> effectively "rasterizes" this data, e.g. the min z value between x1
> and x2, and y1 and y2, where x2-x1 and y2-y1 are the x and y pixel
> sizes?  I'm not talking about interpolation, I'm talking an exact
> query.
> 
> Thanks!
> 
> --j
> 
> -- 
> Jonathan A. Greenberg, PhD
> Assistant Project Scientist
> Center for Spatial Technologies and Remote Sensing (CSTARS)
> Department of Land, Air and Water Resources
> University of California, Davis
> One Shields Avenue
> Davis, CA 95616
> Phone: 415-763-5476
> AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307

--
Shaun Langley
Graduate Student, PhD
Department of Geography
Michigan State University
Home: (517) 974-9346




Re: [postgis-users] Database design for LIDAR data

2011-06-24 Thread Howard Butler

On Jun 24, 2011, at 1:46 PM, Jonathan Greenberg wrote:

> Folks:
> 
> This topic I believe has been brought up before, but I thought I'd
> send an email since I'm a bit of a noob with POSTGIS.  We have a large
> collection of Lidar points that I would like to perform spatial
> querying on (e.g. give me all points within a certain bounding box).
> The data (currently in LAS format, but easily loadable into the DB),
> is tiled up into smaller subsets.  The data is x,y,z,intensity (and
> some other attributes that aren't so important)  I have a few
> questions:
> 
> 1) Should I load ALL of the LAS files into one massive table for
> querying (this is going to be a LOT of points).
> 2) If not, is there a trick where if I load up each LAS file into a
> separate table (which would, in theory be preferable since I'd like to
> do some testing before dealing with a database of this size), but
> somehow when I do a spatial query, the query can span multiple tables
> (e.g. say the query box is at the intersection of two adjacent tiles)?
> 
> Related: what is the most efficient way to do a spatial query that
> effectively "rasterizes" this data, e.g. the min z value between x1
> and x2, and y1 and y2, where x2-x1 and y2-y1 are the x and y pixel
> sizes?  I'm not talking about interpolation, I'm talking an exact
> query.
> 

Jonathan,

Paul Ramsey and I have discussed what loading point cloud data into PostGIS 
would mean, and it's pretty clear it doesn't mean storing each point individually 
as a geometry :)  Oracle has something called SDO_PC which is a cloud object 
which references a table of "blocks".  Each of these blocks has a geometry that 
describes the bounds of the points within that block, and the points themselves 
are stored as a packed array of dumb bytes (blob).  The user does their spatial 
querying using the bounding boxes of the blocks, rather than the individual 
points themselves, and then unpacks the block data of blocks that match the 
query only when they need to.
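
To make the block idea concrete, here is a small Python sketch of a block
that keeps its bounds next to a packed byte array of points. The record
layout (three float64 values plus a 16-bit intensity) and the dictionary keys
are assumptions for illustration only; this is not the actual SDO_PC storage
format.

```python
import struct

# Hypothetical record layout: x, y, z as little-endian float64 values,
# followed by intensity as an unsigned 16-bit integer (26 bytes, no padding).
POINT_FORMAT = "<dddH"
POINT_SIZE = struct.calcsize(POINT_FORMAT)


def pack_block(points):
    """Pack points into one blob and record the bounds that cover them."""
    blob = b"".join(struct.pack(POINT_FORMAT, *p) for p in points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    bounds = (min(xs), min(ys), max(xs), max(ys))
    return {"bounds": bounds, "num_points": len(points), "blob": blob}


def unpack_block(block):
    """Decode the blob back into (x, y, z, intensity) tuples."""
    return [
        struct.unpack_from(POINT_FORMAT, block["blob"], i * POINT_SIZE)
        for i in range(block["num_points"])
    ]


points = [(1.0, 2.0, 3.0, 100), (4.0, 0.5, 6.0, 200)]
block = pack_block(points)
print(block["bounds"], len(block["blob"]))
print(unpack_block(block))
```

A spatial query only needs block["bounds"]; unpack_block is called for the
blocks that match, so the per-point decoding cost is paid only where needed.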

I have been working on libLAS (and now PDAL) to load LAS data (and other point 
cloud format types) into Oracle, and except for the part that actually uses 
psql to write the block data into the database, most of the pieces are done.  
The essential piece to make this work is a blocking algorithm that optimizes 
fill capacity to minimize the number of blocks that are required to store the 
points. While a quad tree or other spatial indexing structure could be used, 
these are often optimized for query speed or neighborhood generation, and would 
end up creating lots more tiles than necessary for storage in the blocks table.  
libLAS has a utility, lasblock (http://liblas.org/utilities/lasblock.html), that 
can be used for doing this operation.  It is integrated into the PDAL library 
too as part of a loading pipeline for loading LAS data into Oracle.
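
One way to get blocks that are filled to capacity is sketched below in
Python. To be clear, this is not the lasblock algorithm; it is a simple
recursive split, made up for illustration, that always cuts at a multiple of
the block capacity so that at most one block ends up partially filled. The
function name and sample data are hypothetical.

```python
import math


def chip(points, capacity, depth=0):
    """Split points into spatially compact blocks of at most `capacity`."""
    n = len(points)
    if n <= capacity:
        return [points]
    # Alternate the split axis: x at even depth, y at odd depth.
    axis = depth % 2
    ordered = sorted(points, key=lambda p: p[axis])
    # Cut at a multiple of capacity so the left half fills whole blocks.
    n_blocks = math.ceil(n / capacity)
    cut = (n_blocks // 2) * capacity
    return (chip(ordered[:cut], capacity, depth + 1)
            + chip(ordered[cut:], capacity, depth + 1))


# Ten made-up (x, y) points, at most four points per block.
points = [((i * 3) % 10, (i * 7) % 10) for i in range(10)]
blocks = chip(points, capacity=4)
print([len(b) for b in blocks])
```

A plain median split would also produce compact blocks, but it tends to leave
many of them half empty, which is the storage overhead the message above is
warning about.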

Another component of this is description of the schema of the point cloud data 
being loaded.  PDAL has that one taken care of for you now, and it produces an 
XML document that describes the layout and arrangement of the points in the 
points blob for Oracle SDO_PC storage.  This is generic to all point cloud data 
types, and would be easily reusable inside a PostGIS context.

That said, it could be much more advantageous to have point cloud be an actual 
type so that PostGIS can take care of a lot of things for you.  Paul has a 
proposal looking for funding to do just that.  See  
http://opengeo.org/technology/postgis/coredevelopment/pointclouds/ for more 
details.

Feel free to drop by the PDAL mailing list if you want to investigate 
developing a (C++) driver to load PostGIS data.

Howard




[postgis-users] Database design for LIDAR data

2011-06-24 Thread Jonathan Greenberg
Folks:

This topic I believe has been brought up before, but I thought I'd
send an email since I'm a bit of a noob with PostGIS.  We have a large
collection of Lidar points that I would like to perform spatial
querying on (e.g. give me all points within a certain bounding box).
The data (currently in LAS format, but easily loadable into the DB)
is tiled up into smaller subsets.  The data is x,y,z,intensity (and
some other attributes that aren't so important).  I have a few
questions:

1) Should I load ALL of the LAS files into one massive table for
querying (this is going to be a LOT of points)?
2) If not, is there a trick where if I load up each LAS file into a
separate table (which would, in theory be preferable since I'd like to
do some testing before dealing with a database of this size), but
somehow when I do a spatial query, the query can span multiple tables
(e.g. say the query box is at the intersection of two adjacent tiles)?

Related: what is the most efficient way to do a spatial query that
effectively "rasterizes" this data, e.g. the min z value between x1
and x2, and y1 and y2, where x2-x1 and y2-y1 are the x and y pixel
sizes?  I'm not talking about interpolation, I'm talking an exact
query.
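
For concreteness, the exact (non-interpolated) "min z per pixel" operation
described above could look like the following Python sketch. The function
name, the grid origin parameters and the sample points are assumptions made
up for the example; in SQL the same idea would be a GROUP BY on the floored
cell coordinates with min(z).

```python
import math


def min_z_grid(points, x0, y0, dx, dy):
    """Map each occupied (col, row) cell to the lowest z found in it."""
    cells = {}
    for x, y, z in points:
        # Cell indices relative to the grid origin (x0, y0).
        key = (math.floor((x - x0) / dx), math.floor((y - y0) / dy))
        if key not in cells or z < cells[key]:
            cells[key] = z
    return cells


points = [
    (0.2, 0.3, 5.0),
    (0.8, 0.9, 3.0),
    (1.5, 0.1, 7.0),
    (1.9, 0.4, 6.5),
    (0.1, 1.2, 4.0),
]
grid = min_z_grid(points, x0=0.0, y0=0.0, dx=1.0, dy=1.0)
print(grid)
```

Cells that contain no points are simply absent from the result, so nothing
is interpolated.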

Thanks!

--j

-- 
Jonathan A. Greenberg, PhD
Assistant Project Scientist
Center for Spatial Technologies and Remote Sensing (CSTARS)
Department of Land, Air and Water Resources
University of California, Davis
One Shields Avenue
Davis, CA 95616
Phone: 415-763-5476
AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307