All,

libLAS was recently updated to allow a user to attach extra data to each point.
I think this is technically within the bounds of the spec, and there are a few
files in the wild that do this (none whose owners have let me put them in the
public samples repository, however).  The point format, which is specified as
0...3 (or 0...5 for 1.3, but we're not doing that right now), has a fixed width
in bytes and a fixed set of stored dimensions (X, Y, Z, R, G, B, T, I, etc).
What LASPoint::{Get|Set}ExtraData allows is for you to provide a byte array to
tag on to the point, beyond what the header's point format specifies.  For
example, if the point format in the header were 0, each point would have a
nominal width of 20 bytes and would contain X, Y, Z, and everything else in the
base format.  If a libLAS user set the point format to 0 and the data record
length to 40 in the header, they would have 20 extra bytes per point in which
they could use {Get|Set}ExtraData to store anything they want.
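
To make the arithmetic concrete, here is a minimal Python sketch of how a raw
point record divides into the base format and the extra bytes.  This is not
libLAS code; split_record and the constant are hypothetical names used only
for illustration.

```python
# Illustrative only: these names are hypothetical, not part of libLAS.
POINT_FORMAT_0_SIZE = 20  # nominal width of point format 0, in bytes


def split_record(record: bytes, base_size: int = POINT_FORMAT_0_SIZE):
    """Split one raw point record into (base bytes, extra bytes)."""
    if len(record) < base_size:
        raise ValueError("record is shorter than the base point format")
    return record[:base_size], record[base_size:]


# A header declaring point format 0 with a data record length of 40
# leaves 40 - 20 = 20 bytes per point for user data.
record_length = 40
record = bytes(range(record_length))  # stand-in for one point read from disk
base, extra = split_record(record)
print(len(base), len(extra))  # 20 20
```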

This development raises the question of how to describe extra dimensionality.
The LAS specification is deficient in that it requires bytes to be reserved for
mandatory items even when they are not filled with actual data.  This bloats
the format, and a developer must scan through the data to determine statistics
about it.  What would be nice is if there were header information to describe
the dimensions that exist in the file, whether they are used or not, and what
their size(s) might be.

The Oracle Point Cloud work that I am currently doing highlights this issue
even more.  OPC allows you to store up to 12 dimensions on the point data (in
aligned 8 byte BLOB form) but provides no regularized way to describe the
dimensions used.  I would like to propose that we provide a liblas.org VLR
that contains an XML document to describe the dimensions, their sizes, whether
they are used, etc.  libLAS (and Oracle Point Cloud) would then be updated to
interpret this information, but an unaware reader should still be able to work
without knowing how to interpret it.
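
As a rough sketch of what carrying the XML in a VLR could look like, the Python
below packs a payload behind the standard 54-byte VLR header (reserved, user
ID, record ID, record length after header, description).  The user ID "liblas",
the record ID 7, and the description string are placeholders I made up, not
registered values.  Note that the length field is 16 bits, so the XML would
have to fit in 65535 bytes.

```python
import struct

# LAS VLR header: reserved (uint16), user ID (char[16]), record ID (uint16),
# record length after header (uint16), description (char[32]).
# Little-endian, 54 bytes in total.
VLR_HEADER = struct.Struct("<H16sHH32s")

# Placeholder values for illustration only; nothing here is registered.
USER_ID = b"liblas"
RECORD_ID = 7
DESCRIPTION = b"Dimension schema (XML)"


def pack_schema_vlr(xml_text: str) -> bytes:
    """Wrap an XML document in a VLR header."""
    payload = xml_text.encode("utf-8")
    if len(payload) > 0xFFFF:
        raise ValueError("payload does not fit in a 16-bit VLR length field")
    # struct pads the fixed-width char fields with NUL bytes.
    header = VLR_HEADER.pack(0, USER_ID, RECORD_ID, len(payload), DESCRIPTION)
    return header + payload


def vlr_size(buf: bytes, offset: int = 0) -> int:
    """Total size of the VLR at offset: all an unaware reader needs to skip it."""
    _, _, _, length, _ = VLR_HEADER.unpack_from(buf, offset)
    return VLR_HEADER.size + length


vlr = pack_schema_vlr("<Schema/>")
print(VLR_HEADER.size, vlr_size(vlr))
```

A reader that does not recognize the user ID and record ID only needs
vlr_size() to step over the record, which is why existing software should
keep working.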

I propose each entry in the file have the following attributes:

* Name
* Description
* Position
* Size (in units of Type)
* Type (the unit of Size: bits or bytes)
* Data interpretation type (integer, double, float, etc)
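
For what it's worth, these attributes map onto a small record type.  The
Python sketch below is only one possible in-memory shape; the class name,
field names, and defaults are my assumptions, not an existing libLAS
structure.

```python
from dataclasses import dataclass

BITS_PER_UNIT = {"bit": 1, "byte": 8}


@dataclass
class Dimension:
    """One schema entry; the fields mirror the proposed attribute list."""
    name: str
    position: int             # ordering of the entry within its parent
    size: int                 # counted in units of `type`
    type: str = "byte"        # "bit" or "byte"
    interpretation: str = ""  # e.g. "uchar", "double"
    description: str = ""

    def size_in_bits(self) -> int:
        return self.size * BITS_PER_UNIT[self.type]


classification = Dimension("Classification", position=5, size=1,
                           type="byte", interpretation="uchar")
print(classification.size_in_bits())  # 8
```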

Additionally, I think it should be possible to nest entries.  For example, we 
should have something like:

<Dimension name="Sensor Attributes" size="1" type="byte" position="4">
 <Dimension name="Return Number" size="3" type="bit" position="0" />
 <Dimension name="Number of Returns" size="3" type="bit" position="1" />
 <Dimension name="Scan Direction" size="1" type="bit" position="2" />
 <Dimension name="Edge of Flight Line" size="1" type="bit" position="3" />
</Dimension>
<Dimension name="Classification" size="1" type="byte" position="5"
           interpretation="uchar" />
...
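
Here is a hedged Python sketch of how a reader might consume the nested form.
It makes two assumptions that the proposal does not settle: the entries are
wrapped in a root element (called Schema here), and "position" is an ordinal
index within the parent, so bit offsets are the running sum of the preceding
siblings' sizes.

```python
import xml.etree.ElementTree as ET

BITS_PER_UNIT = {"bit": 1, "byte": 8}

# The <Schema> wrapper is an assumption; the fragment above has no root.
SAMPLE = """
<Schema>
  <Dimension name="Sensor Attributes" size="1" type="byte" position="4">
    <Dimension name="Return Number" size="3" type="bit" position="0" />
    <Dimension name="Number of Returns" size="3" type="bit" position="1" />
    <Dimension name="Scan Direction" size="1" type="bit" position="2" />
    <Dimension name="Edge of Flight Line" size="1" type="bit" position="3" />
  </Dimension>
  <Dimension name="Classification" size="1" type="byte" position="5"
             interpretation="uchar" />
</Schema>
""".strip()


def size_in_bits(elem):
    return int(elem.get("size")) * BITS_PER_UNIT[elem.get("type")]


def bit_layout(parent):
    """Return (name, bit offset, bit width) per child, ordered by position.

    Assumes position is an ordinal index, not a bit or byte offset.
    """
    children = sorted(parent.findall("Dimension"),
                      key=lambda e: int(e.get("position")))
    layout, offset = [], 0
    for child in children:
        width = size_in_bits(child)
        layout.append((child.get("name"), offset, width))
        offset += width
    if offset > size_in_bits(parent):
        raise ValueError("nested entries are wider than their parent")
    return layout


root = ET.fromstring(SAMPLE)
sensor = root.find("Dimension[@name='Sensor Attributes']")
for name, offset, width in bit_layout(sensor):
    print(name, offset, width)
```

Under those assumptions the four nested entries fill the parent byte exactly
(3 + 3 + 1 + 1 = 8 bits), which is a cheap consistency check a reader could
run on any nested entry.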

All of this would be properly namespaced XML (liblas.org or something), reusing
whatever existing standards we can find, such as those for describing data
types.  Maybe there's already an existing standard for something like this; I
don't know.

What do you think?

Howard

_______________________________________________
Liblas-devel mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/liblas-devel
