Howard,
My experience with this is extracting lidar subsets with Liblas. And I
thought that if it was possible to just sort the file by coordinate
(i.e. rewriting it ordered) it would enable to only scan a part of the
file. Maybe ordering the file would take some time (probably a lot, do
you have an idea ?), but if you are using it frequently for an
extraction, it could worth it, I don't really know what are the gains.
Etienne
Howard Butler a écrit :
I have been investigating and prototyping linking spatialindex with
libLAS, but I have a few questions of the practicality of this
approach. LiDAR point cloud data are typically very large (millions
to 100s of millions of points), contain attribute-like data such as
echo intensity, and are somewhat spatially sequential.
By spatially sequential, I mean that points near each other as they
are stored in the sequential LAS file format are frequently near each
other spatially (but not always). Many coders take advantage of this
property of the data by skipping data as they read it in -- naturally
thinning the volume for a small accuracy penalty. This is fine for
visualization, but in a warehousing scenario, tossing out data is a
big no-no. But we also want a warehouse that can be (somewhat) fast,
especially for spatial queries :)
A first naive attempt to use the DiskStorageManager to store an index
in parallel with a 32 million point (90mb) LAS file resulted in some
disappointing results. I used the default (70%) fill factor, 10 for
the index and leaf capacity, and star for the rtree variant. Nearly
4.5 hours of insertion later (I used a very small pagesize of 3, which
is the reason for the slow insert, so I could get a feel for the
minimum size required to make an index), the index of my 90mb, 32
million point LAS file turned into 517mb of .idx + and 190mb of .dat
file. This isn't going to be practical.
I need a plan B. Anyone have any ideas how to tackle this problem in
a way that balances space and time efficiency (though I'm not quite
sure what defines "balance" in this instance) for insert and query?
Should I start pursuing something like a KDB tree? Should I figure
out some sort of stripping/patching/tiling algorithm for the points to
lessen the indexing load (this tends to be the standard approach to
this problem)? Work out a different StorageManager approach that is a
clustered index of spatially clustered points?
Howard
PS sending to both lists because I hope to include spatial indexing
functionality in libLAS if I can find a reasonable approach, and if
any of the LiDAR folks have this one licked and would be willing to
share, I'd be happy to take it.
_______________________________________________
Liblas-devel mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/liblas-devel
_______________________________________________
Liblas-devel mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/liblas-devel