All,
libLAS was recently updated to allow a user to attach extra data to each point. I
think this is technically within the bounds of the spec, and a few files in the
wild already do this (none whose owners have let me put them on the public
samples repository, however). As background, each point format (0 to 3 for LAS
1.2, or 0 to 5 for 1.3, but we're not doing that right now) has a fixed width
in bytes and a specified set of stored dimensions (X, Y, Z, R, G, B, T, I,
etc). LASPoint::{Get|Set}ExtraData lets you provide a byte array to tag on to
the point beyond what the header's point format specifies. For example, if the
point format in the header were 0, the point would have a nominal width of 20
bytes containing X, Y, Z, and everything else in the base format type. If a
libLAS user set the point format to 0 and the data record length to 40 in the
header, they would have 20 extra bytes per point in which to store anything
they want using {Get|Set}ExtraData.
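To make the arithmetic concrete, here is a minimal Python sketch (not libLAS code) of how the extra bytes fall out of the header values. The base sizes are the LAS 1.2 record widths for formats 0 to 3; the function names are made up for illustration.

```python
# Base point record sizes in bytes for LAS 1.2 point formats 0-3.
BASE_POINT_SIZE = {0: 20, 1: 28, 2: 26, 3: 34}


def extra_bytes_per_point(point_format, data_record_length):
    """Return how many bytes per point lie beyond the base format.

    Hypothetical helper: the name is illustrative, not a libLAS function.
    """
    base = BASE_POINT_SIZE[point_format]
    if data_record_length < base:
        raise ValueError("record length is smaller than the base point format")
    return data_record_length - base


def split_record(point_format, record):
    """Split one raw point record into (base bytes, extra bytes)."""
    base = BASE_POINT_SIZE[point_format]
    return record[:base], record[base:]


# Point format 0 with a 40-byte data record length leaves 20 bytes to play with.
print(extra_bytes_per_point(0, 40))
```

A record whose length equals the base size simply has an empty extra-data array, so existing files are unaffected.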
This development raises the question of how to describe extra
dimensionality. The LAS specification is deficient in that it requires bytes
to be reserved for mandatory items even when they are not filled with actual
data. This means extra bloat in the format, and a developer must scan through
the data to determine statistics about it. What would be nice is if there were
header information to describe the dimensions that exist in the file, whether
they are used or not, and what their size(s) might be.
The Oracle Point Cloud (OPC) work that I am currently doing highlights this
issue even more. OPC allows you to store up to 12 dimensions on the point data
(in aligned 8 byte BLOB form) but provides no regularized way to describe the
dimensions used. I would like to propose that we provide a liblas.org VLR that
contains an XML document to describe the dimensions, their sizes, if they are
used, etc. libLAS (and Oracle Point Cloud) would then be updated to provide
support for interpreting this information, but an unaware reader should be
able to work without knowing how to interpret it.
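As a rough sketch of why an unaware reader keeps working: every VLR starts with a 54-byte header (reserved, user ID, record ID, record length after header, description, per the LAS 1.2 layout), so a reader can step over any record it does not recognize. The "liblas" user ID, the record ID 7, and the helper names below are made up for illustration and are not an agreed-upon assignment.

```python
import struct

# LAS 1.2 VLR header: reserved (2), user ID (16), record ID (2),
# record length after header (2), description (32) = 54 bytes.
VLR_HEADER = struct.Struct("<H16sHH32s")

# Hypothetical identifiers for the proposed schema record.
SCHEMA_USER_ID = "liblas"
SCHEMA_RECORD_ID = 7


def pack_vlr(user_id, record_id, description, payload):
    """Serialize one VLR; the length field is 16 bits, so payloads cap at 65535 bytes."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for a VLR")
    header = VLR_HEADER.pack(0, user_id.encode("ascii"), record_id,
                             len(payload), description.encode("ascii"))
    return header + payload


def iter_vlrs(buf, count):
    """Yield (user_id, record_id, payload) for each VLR in buf."""
    offset = 0
    for _ in range(count):
        _, user_id, record_id, length, _ = VLR_HEADER.unpack_from(buf, offset)
        offset += VLR_HEADER.size
        yield user_id.rstrip(b"\0").decode("ascii"), record_id, buf[offset:offset + length]
        offset += length  # an unaware reader just skips ahead by the stated length


def find_schema(buf, count):
    """Return the XML payload of the schema VLR, or None if the file has none."""
    for user_id, record_id, payload in iter_vlrs(buf, count):
        if user_id == SCHEMA_USER_ID and record_id == SCHEMA_RECORD_ID:
            return payload
    return None
```

One consequence of the 16-bit length field is that the XML description has to stay under 64 KB, which seems like plenty for a list of dimensions.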
I propose each entry in the file have the following attributes:
* Name
* Description
* Position
* Size (in units of Type)
* Type (bits, bytes)
* Data interpretation type (integer, double, float, etc)
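One way to picture an entry with these attributes is as a small record type. The following Python sketch is only an illustration of the proposal; the class and field names are assumptions, not an existing libLAS structure.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Width in bits of each allowed "Type" unit.
BITS_PER_UNIT = {"bit": 1, "byte": 8}


@dataclass
class Dimension:
    """Hypothetical in-memory form of one entry in the proposed description."""
    name: str
    position: int                          # ordinal position within the parent
    size: int                              # counted in units of `unit`
    unit: str                              # the "Type" attribute: "bit" or "byte"
    interpretation: Optional[str] = None   # e.g. "integer", "double", "float"
    description: str = ""
    children: List["Dimension"] = field(default_factory=list)

    def size_in_bits(self):
        # Raises KeyError for a unit other than "bit" or "byte".
        return self.size * BITS_PER_UNIT[self.unit]


classification = Dimension("Classification", position=5, size=1,
                           unit="byte", interpretation="uchar")
print(classification.size_in_bits())
```

Keeping size and unit separate, as the attribute list does, lets the same structure describe both whole-byte fields and packed bit fields.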
Additionally, I think it should be possible to nest entries. For example, we
should have something like:
<Dimension name="Sensor Attributes" size="1" type="byte" position="4">
    <Dimension name="Return Number" size="3" type="bit" position="0" />
    <Dimension name="Number of Returns" size="3" type="bit" position="1" />
    <Dimension name="Scan Direction" size="1" type="bit" position="2" />
    <Dimension name="Edge of Flight Line" size="1" type="bit" position="3" />
</Dimension>
<Dimension name="Classification" size="1" type="byte" position="5"
           interpretation="uchar" />
...
All this would be properly namespaced XML (liblas.org or something) along with
whatever we can find for standards such as those for describing data types,
etc. Maybe there's already an existing standard for something like this; I
don't know.
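For what it's worth, here is a minimal Python sketch of how a reader might walk the nested Dimension example above and check that the nested bit fields exactly fill their parent byte. The <Dimensions> wrapper element and the function names are assumptions added to make the fragment parseable, and the sketch ignores namespacing entirely.

```python
import xml.etree.ElementTree as ET

# The example entries, wrapped in a hypothetical <Dimensions> root element.
SAMPLE = """
<Dimensions>
  <Dimension name="Sensor Attributes" size="1" type="byte" position="4">
    <Dimension name="Return Number" size="3" type="bit" position="0" />
    <Dimension name="Number of Returns" size="3" type="bit" position="1" />
    <Dimension name="Scan Direction" size="1" type="bit" position="2" />
    <Dimension name="Edge of Flight Line" size="1" type="bit" position="3" />
  </Dimension>
  <Dimension name="Classification" size="1" type="byte" position="5"
             interpretation="uchar" />
</Dimensions>
"""

BITS_PER_UNIT = {"bit": 1, "byte": 8}


def size_in_bits(elem):
    """Width of one Dimension element in bits."""
    return int(elem.get("size")) * BITS_PER_UNIT[elem.get("type")]


def check_nesting(elem):
    """Raise ValueError if nested dimensions do not exactly fill their parent."""
    children = elem.findall("Dimension")
    if not children:
        return
    total = sum(size_in_bits(child) for child in children)
    if total != size_in_bits(elem):
        raise ValueError("children of %r use %d bits, parent has %d"
                         % (elem.get("name"), total, size_in_bits(elem)))
    for child in children:
        check_nesting(child)


def flatten(parent, prefix=""):
    """Return (path, bits) pairs for every dimension, parents before children."""
    rows = []
    for dim in parent.findall("Dimension"):
        path = prefix + dim.get("name")
        rows.append((path, size_in_bits(dim)))
        rows.extend(flatten(dim, path + "/"))
    return rows


root = ET.fromstring(SAMPLE)
for top_level in root.findall("Dimension"):
    check_nesting(top_level)
for path, bits in flatten(root):
    print(path, bits)
```

In the example the four nested fields add up to 3 + 3 + 1 + 1 = 8 bits, so they account for the whole "Sensor Attributes" byte and the check passes.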
What do you think?
Howard