[Perldl] Understanding PDL::Stats::Kmeans

Hernán De Angelis Fri, 27 Sep 2013 04:14:08 -0700

Dear colleagues,

I am trying to implement simple unsupervised classification of stacks of
satellite images in PDL using k-means clustering.


I have no problem getting the module PDL::Stats to work by following the
examples on the web page (http://pdl-stats.sourceforge.net/Kmeans.htm). The
module accepts tabular data as input, akin to a data frame in R, so feeding
the images to the algorithm is just a matter of clumping each of the n 2D
piddles (images) into a single dimension of size m, and stacking them in a
new 2D piddle of dimensions n x m.
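
The layout described above can be sketched roughly as follows. The image
size, the number of bands and the variable names are made up for
illustration, and random data stands in for the real images. Only core PDL
calls (clump, cat, xchg) are used.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical stand-in data: n = 3 bands of a 4 x 5 scene.
my ($nx, $ny, $n) = (4, 5, 3);
my @bands = map { random($nx, $ny) } 1 .. $n;

# Flatten each image into a 1D piddle of m = nx * ny pixels.
my @flat = map { $_->clump(-1) } @bands;

# cat() joins the flattened bands along a new last dimension: m x n.
my $by_pixel = cat(@flat);

# Swapping the two dimensions gives the n x m layout mentioned in the text.
my $stack = $by_pixel->xchg(0, 1);

print "by_pixel: ", join(' x ', $by_pixel->dims), "\n";
print "stack:    ", join(' x ', $stack->dims), "\n";
```

With these two orientations, dim(0) counts pixels in the first piddle and
images in the second. The PDL::Stats documentation describes its data
convention as observation x variable, so it may be worth checking which of
the two is being passed on.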

The problems appear when I try to specify the number of desired clusters.
As suggested in the web page, if I use:

$cluster = random_cluster( $stack->dim(0), $k );

where $k is the number of desired clusters, the algorithm complains ("more
cluster than obs!") if $k > n. I do not see why this should be so: as far
as I understand, there is in principle no limit on the number of clusters
relative to the number of data dimensions.
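
To make the call concrete, here is a small sketch with hypothetical sizes
and random stand-in data. It assumes that the first argument of
random_cluster is taken as the number of observations, which is how I read
the documentation, and it only prints what dim(0) amounts to in each
orientation before calling the function with the pixel count.

```perl
use strict;
use warnings;
use PDL;
use PDL::Stats::Kmeans;

# Hypothetical sizes: n images, m pixels per image, k desired clusters.
my ($n, $m, $k) = (3, 20, 5);

my $stack    = random($n, $m);        # n x m, as described in the text
my $by_pixel = $stack->xchg(0, 1);    # m x n, one row entry per pixel

print "n x m: dim(0) = ", $stack->dim(0),    ", k = $k\n";
print "m x n: dim(0) = ", $by_pixel->dim(0), ", k = $k\n";

# Assumption: the first argument is the number of observations, so k must
# not exceed it. Here k = 5 is well below m = 20.
my $cluster = random_cluster($by_pixel->dim(0), $k);

print "mask:  ", join(' x ', $cluster->dims), "\n";
```

In this sketch $k exceeds dim(0) only for the n x m orientation, which is
the situation in which the message appears for me.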

Another thing that left me scratching my head is how to associate every
vector (image pixel) with a cluster number, so that I can fold the result
back into a classified image. The algorithm puts the result in a hash, but
I see no obvious way to relate this to the observation vectors.
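
One possible way to do the folding is sketched below. It assumes that the
result hash holds a membership mask of dimensions m x k (pixels x clusters,
with a single 1 per pixel), which is how the PDL::Stats::Kmeans
documentation describes its cluster entry. To keep the sketch independent
of the module, the mask is built by hand here; the sizes and names are
hypothetical.

```perl
use strict;
use warnings;
use PDL;

# Hypothetical sizes: a 4 x 5 image and k = 3 clusters.
my ($nx, $ny, $k) = (4, 5, 3);
my $m = $nx * $ny;

# Hand-made labels standing in for a real result: pixel i is in cluster i % k.
my $labels_true = sequence(long, $m) % $k;

# Membership mask, m x k: 1 where the pixel belongs to that cluster.
my $mask = ($labels_true->dummy(1, $k) == sequence($k)->dummy(0, $m));

# Move the cluster dimension to the front; maximum_ind then returns, for
# every pixel, the position of the 1 in its row, i.e. the cluster number.
my $labels = $mask->xchg(0, 1)->maximum_ind;

# Fold the m labels back into the original image shape.
my $classified = $labels->copy->reshape($nx, $ny);

print $classified;
```

The reshape restores the pixel order only if the images were flattened with
clump(-1) in the first place, since both use the same element ordering.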

Any hint will be appreciated!

Thanks in advance,

Hernán


-- 
Hernán De Angelis
http://talesoficeandstone.blogspot.se/

