Re: [GRASS-dev] v.univar question: Why not lines and areas?

Helena Mitasova Mon, 28 Jan 2008 07:52:55 -0800


On Jan 28, 2008, at 10:22 AM, Michael Barton wrote:

On Jan 28, 2008, at 5:50 AM, Moritz Lennert wrote:
On 27/01/08 20:30, Michael Barton wrote:
v.univar only works with points. But since it is calculating stats on a field in the attributes table, it should work the same for all vector objects. Can we get rid of the limitation that it only works with points?
There was some debate [1] about the statistical validity of working with the other types, as the way it was programmed, the statistics were calculated with weights which corresponded to line length / area surface.
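To make the weighting issue concrete, here is a minimal sketch of a weighted mean and variance, where each weight would be a feature's length or area. The function name `weighted_mean_var` and the exact variance formula are assumptions for illustration only; this is not taken from the v.univar source.

```c
#include <stddef.h>

/* Hypothetical helper (not from v.univar): weighted mean and weighted
 * population variance of x[0..n-1] with non-negative weights w[0..n-1],
 * e.g. line lengths or area sizes.
 * Returns 0 on success, -1 if n == 0 or the weights do not sum to > 0. */
int weighted_mean_var(const double *x, const double *w, size_t n,
                      double *mean, double *var)
{
    double wsum = 0.0, wxsum = 0.0, acc = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        wsum += w[i];
        wxsum += w[i] * x[i];
    }
    if (n == 0 || wsum <= 0.0)
        return -1;

    *mean = wxsum / wsum;

    /* Second pass: weighted squared deviations from the weighted mean. */
    for (i = 0; i < n; i++)
        acc += w[i] * (x[i] - *mean) * (x[i] - *mean);
    *var = acc / wsum;

    return 0;
}
```

With all weights equal this reduces to the ordinary mean and population variance, which is what a pure attribute statistic would report regardless of feature type.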
I guess we might want to distinguish between a v.univar which works on the actual vector objects and a v.db.univar which works on any arbitrary attribute (or combination of attributes). We could write a C replacement of the current v.db.univar script based on the code I have for the classification algorithms used in v.class.
AFAICT, v.univar does not calculate anything from vector topology, only from an attribute column. That is, it behaves the way you describe v.db.univar. For some weird (probably historical) reason, it won't calculate anything but N, max, and min of an attribute column linked to a non-point vector object.
"v.univar calculates univariate statistics of vector map features. This includes the number of features counted, minimum and maximum values, and range. Variance and standard deviation is calculated only for points if type=point is defined. Extended statistics adds median, 1st and 3rd quartiles, and 90th percentile."
An attribute is the same whether it's linked to a point, line, or area.
It would be nice to be able to calculate some stats from topology, but that is not possible at the moment without loading topology.
As mentioned earlier, it might be better that I move the code from v.class into a library which can then be accessed by different modules...
Currently, I have defined the following statistics:

typedef struct
{
    double count;
    double min;
    double max;
    double sum;
    double sumsq;
    double mean;
    double var;
    double stdev;
} STATS;
But this could easily be extended according to needs, and v.db.univar could also use the quantile classification algorithm to extract percentiles.
What are the statistics most people need?
median, mode, and percentiles would be nice for any attribute or topological data. Any diversity or other non-parametric stats that would actually be useful here?
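For the median and percentiles, one simple approach is to sort the values and interpolate between the closest ranks. The sketch below assumes that definition; v.univar and the v.class quantile code may well use a different one, and the names `sort_values` and `percentile_sorted` are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort() on doubles. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;

    return (x > y) - (x < y);
}

/* Hypothetical helper: sort attribute values in ascending order. */
void sort_values(double *values, size_t n)
{
    qsort(values, n, sizeof(double), cmp_double);
}

/* Hypothetical helper: p-th percentile (0 <= p <= 100) of an already
 * sorted array, using linear interpolation between closest ranks.
 * Returns 0.0 for an empty array. */
double percentile_sorted(const double *sorted, size_t n, double p)
{
    double pos, frac;
    size_t lo;

    if (n == 0)
        return 0.0;

    pos = (p / 100.0) * (double)(n - 1);
    lo = (size_t)pos;            /* index of the lower neighbour */
    frac = pos - (double)lo;     /* distance towards the upper neighbour */

    if (lo + 1 >= n)
        return sorted[n - 1];
    return sorted[lo] + frac * (sorted[lo + 1] - sorted[lo]);
}
```

The median is then `percentile_sorted(values, n, 50.0)`, and the quartiles and the 90th percentile follow from p = 25, 75, and 90.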

I would like to add the mean of absolute values of the attribute. This is useful when the attribute is a deviation or error, to measure the accuracy of interpolation/approximation methods (there are some papers that explain why MAE is better than RMSE for this). Although it is not in the man pages, v.univar computes it for the point option. Writing out the range is a nice convenience too (see the v.univar output for points; it has a pretty comprehensive set of stats measures).

Helena

It would also be nice to get v.report type information (count, length, area) summed by value of a string attribute column or binned numeric column. But this might better be an extension of v.report.


Thanks for checking into it.

Michael


Moritz

[1] http://lists.osgeo.org/pipermail/grass-dev/2004-July/014976.html




____________________
C. Michael Barton, Professor of Anthropology
Director of Graduate Studies
School of Human Evolution & Social Change
Center for Social Dynamics & Complexity
Arizona State University

Phone: 480-965-6262
Fax: 480-965-7671
www: <www.public.asu.edu/~cmbarton>

_______________________________________________
grass-dev mailing list
grass-dev@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/grass-dev

