The first param to random_cluster() should be n, the number of images. You can use which_cluster() <http://pdl-stats.sourceforge.net/Kmeans.htm#which_cluster> on the output of random_cluster() to get the cluster index for each image.
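For illustration, here is a minimal sketch of those two steps plus a direct kmeans() call, assuming toy random data in place of real images. The variable names ($stack, $n, $m, $k) and the toy sizes are made up for the example. The calls are random_cluster(), which_cluster() and kmeans() from PDL::Stats::Kmeans, and the stack is assumed to be laid out as n x m so that dim 0 indexes the images.

```perl
use strict;
use warnings;
use PDL;
use PDL::Stats::Kmeans;

# Hypothetical toy data: 12 "images" of 4 pixels each, stacked so that
# dim 0 indexes the images (observations) and dim 1 the pixels (variables).
my $n     = 12;                 # number of images
my $m     = 4;                  # pixels per flattened image
my $stack = random( $n, $m );   # stand-in for the real n x m stack

my $k = 3;                      # number of clusters; cannot exceed $n

# random_cluster() only produces an initial random assignment.
# It returns a byte mask with dims ($n x $k).
my $mask = random_cluster( $n, $k );

# which_cluster() collapses the mask into one cluster index per image.
my $idx = $mask->which_cluster;
print "random assignment: $idx\n";

# kmeans() does the actual clustering. The result hash holds a mask of
# the same shape under the 'cluster' key.
my %result = $stack->kmeans( { NCLUS => $k, V => 0 } );
my $label  = $result{cluster}->which_cluster;
print "kmeans assignment: $label\n";
```

Both $idx and $label have one element per image, with values from 0 to $k - 1.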
I'm curious why you need to use random_cluster() though? You can use kmeans() <http://pdl-stats.sourceforge.net/Kmeans.htm#kmeans> directly to cluster the images. random_cluster() provides only the initial random assignment of images to clusters.

Best,
Maggie

On Sat, Sep 28, 2013 at 10:09 AM, Chris Marshall <[email protected]> wrote:
> On Fri, Sep 27, 2013 at 7:08 AM, Hernán De Angelis
> <[email protected]> wrote:
> >
> > I have no problems in getting the module PDL::Stats
> > to work, following the examples in the web page
> > (http://pdl-stats.sourceforge.net/Kmeans.htm). The module
> > accepts tabular data as input, akin to a data frame in R,
> > so feeding the images to the algorithm is just a matter of
> > clumping the n 2D piddles (images) to a single dimension of
> > size m, and stacking them in a new 2D piddle of dimensions n x
> > m.
> >
> > The problems appear when I try to specify the number of desired
> > clusters. As suggested in the web page, if I use:
> >
> > $cluster = random_cluster( $stack->dim(0), $k );
>
> $ pdldoc random_cluster
>
> Module PDL::Stats::Kmeans
>   random_cluster
>     Signature: (byte [o]cluster(o,c); int obs=>o; int clu=>c)
>
>     Creates masks for random mutually exclusive clusters. Accepts two
>     parameters, num_obs and num_cluster. Extra parameter turns into extra
>     dim in mask. May loop a long time if num_cluster approaches num_obs
>     because empty cluster is not allowed.
>
>     my $cluster = random_cluster( $num_obs, $num_cluster );
>
> Is it m or n that is the observations? I'm guessing it may
> be the other one.
>
> > where $k is the number of desired clusters, the algorithm
> > complains ("more cluster than obs!") if $k > n, although I do
> > not see any reason why this should be so because as far as I
> > understand there is in principle no limitation to the number of
> > clusters with regard to the number of data dimensions.
> >
> > Another thing that left me scratching my head is how to
> > associate every vector (image pixel) to a cluster number, so I
> > can fold them back into a classified image. The algorithm puts
> > the result in a hash, but I see no obvious way to relate this
> > to the observation vectors.
>
> It says that random_cluster returns a piddle of vectors x number
> of clusters where I presume the slice for each value of the
> cluster index is a mask of which pixels fall in that cluster.
> You can use which to select coordinates, make a cluster number
> image by doing an inner product along the cluster dimension
> of something like pdl [1..numclusters] which would give you
> an "image" where the "color" of each value is the cluster
> number (note the use of 1-based count).
>
> Hope this helps,
> Chris
>
> _______________________________________________
> Perldl mailing list
> [email protected]
> http://mailman.jach.hawaii.edu/mailman/listinfo/perldl
>
