I am trying to understand how the SOM algorithm works, using the SOM function from library(class). I have a 1000*10 matrix and I want to summarize the different types of 10-element vectors (rows) it contains. In my real-world case it is likely that most of the 1000 vectors are of one kind and the rest of another (this is an oversimplification). Say, for example:
library(class)

# 900 rows with a cos pattern, 100 rows with a sin pattern
InputA <- matrix(cos(1:10), nrow = 900, ncol = 10, byrow = TRUE)
InputB <- matrix(sin(5:14), nrow = 100, ncol = 10, byrow = TRUE)
Input <- rbind(InputA, InputB)

I thought that a small 3*3 grid would be enough to extract the patterns in such a simple matrix:

GridWidth <- 3
GridLength <- 3
gr <- somgrid(xdim = GridWidth, ydim = GridLength, topo = "hexagonal")
test.som <- SOM(Input, gr)

# One panel per grid unit, showing its reference vector
par(mfrow = c(GridLength, GridWidth))
for (i in 1:(GridWidth * GridLength)) plot(test.som$codes[i, ], type = "l")

Only when I use a larger grid (say, 7*3) do I get some representatives of the sin pattern. This must have something to do with the initialization of the grid: since the sin pattern is so rare, it is unlikely to be chosen as an initial reference vector. Afterwards, because the samples used for training are also selected at random, the sin rows are also unlikely to be picked.

I have been trying to modify some of the other SOM parameters as well, but I would appreciate some input to keep me going until I receive the reference books from my bookstore. Are my suspicions right? Should I be using the SOM for my study, or should I look somewhere else?

NOTE: I have no prior knowledge of whether the datasets I want to analyse will have rare cases or not, or where they will be located.

Thanks,
Manuel

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
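A rough way to check the initialization suspicion is a back-of-the-envelope calculation. This is only a sketch: it assumes that SOM(), when no init argument is supplied, takes its initial reference vectors as distinct random rows of the data (that is how I read the help page), and the variable names below are made up for the illustration.

```r
# Assumption: with no `init`, SOM() picks the initial codes as distinct
# random rows of the input matrix (sampling without replacement).
n_common <- 900    # rows carrying the cos pattern
n_rare   <- 100    # rows carrying the sin pattern
n_codes  <- 3 * 3  # number of units in the 3*3 grid

# Hypergeometric probability that none of the initial codes is a sin row
p_none_rare <- dhyper(0, m = n_rare, n = n_common, k = n_codes)
p_none_rare

# Same calculation for the 7*3 grid
p_none_rare_large <- dhyper(0, m = n_rare, n = n_common, k = 7 * 3)
p_none_rare_large
```

If the assumption holds, the chance of starting with no sin row at all is roughly 0.39 for the 3*3 grid and roughly 0.11 for the 7*3 grid. This only covers the initial draw, not the random selection of training samples afterwards.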