Re: [R] clustering in R

2010-05-28 Thread Tal Galili
Hi Ayesha, hclust is the way to go (much better than trying to reinvent the wheel here). Please add what you used to create distA, and create a sample data set with dput to show us what you did. Best, Tal

Re: [R] clustering in R

2010-05-28 Thread Joris Meys
As Tal said. Next to that, I read that column1 (and column2?) are supposed to be seen as factors, not as numerical variables. Did you take that into account somehow? It's easy to reproduce the error: n <- NULL; if (n < 2) print("This is OK") Error in if (n < 2) print("This is OK") : argument is of
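A minimal sketch of the failure described above, assuming the comparison in the truncated snippet was `n < 2`. Comparing NULL with a number yields a zero-length logical, which `if` cannot evaluate. The `length()` guard is only one possible way to protect the test, not something proposed in the thread.

```r
n <- NULL

# NULL < 2 evaluates to logical(0), so if() has nothing to test
cmp <- n < 2
length(cmp)

# Capture the error message instead of stopping the script
msg <- tryCatch(
  if (n < 2) print("This is OK"),
  error = function(e) conditionMessage(e)
)
msg

# Guarded version: only compare when n holds exactly one value
is_small <- length(n) == 1 && n < 2
is_small
```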

Re: [R] clustering in R

2010-05-28 Thread Ayesha Khan
Thanks Tal & Joris! I created my distance matrix distA by using the dist() function in R and manipulating the output in order to get a matrix. distA = as.matrix(dist(t(x2))) # x2 being my original dataset, since according to the documentation on dist(): For the default method, a dist object, or a matrix (of

Re: [R] clustering in R

2010-05-28 Thread Tal Galili
Hi Ayesha, I wish to help you, but without a simple self-contained example that shows your issue, I will not be able to. Try using the dput command (see ?dput) to create some simple data, and let us see what you are doing. Best, Tal
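For readers unfamiliar with dput, here is a small sketch of how it makes data shareable. The matrix `x` and its column names are made up for illustration; they are not the poster's data.

```r
# Hypothetical toy data standing in for the poster's data set
set.seed(1)
x <- matrix(round(rnorm(20), 2), nrow = 5,
            dimnames = list(NULL, paste0("s", 1:4)))

# dput() prints R code that recreates the object; paste that text into an email
dput(x)

# Round trip: capture the printed code and rebuild the object from it
txt <- capture.output(dput(x))
x_rebuilt <- eval(parse(text = txt))
isTRUE(all.equal(x, x_rebuilt))
```

Anyone on the list can then paste the dput output into their own session and work with exactly the same object.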

Re: [R] clustering in R

2010-05-28 Thread Joris Meys
Errr, forget about the output of dput(q), but keep it in mind for next time. f = dist(t(q)); hclust(f, method = "single") It's as simple as that. Cheers Joris On Fri, May 28, 2010 at 10:39 PM, Ayesha Khan ayesha.diamond...@gmail.com wrote: v <- dput(x, "sampledata.txt") dim(v) q <- v[1:10, 1:10] f
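The two-line recipe above can be tried end to end as follows. This is a sketch on random data: `q` here is a made-up 10 x 10 matrix, and the call to cutree with `k = 3` is an illustrative extra step that is not part of the original message.

```r
# Hypothetical stand-in for q <- v[1:10, 1:10]
set.seed(42)
q <- matrix(rnorm(100), nrow = 10,
            dimnames = list(NULL, paste0("V", 1:10)))

# dist() works on rows, so transpose to get distances between columns
f <- dist(t(q))

# Single-linkage hierarchical clustering on the dist object
hc <- hclust(f, method = "single")

# Illustrative follow-up: cut the tree into 3 groups
groups <- cutree(hc, k = 3)
groups
```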

Re: [R] clustering in R

2010-05-28 Thread Ayesha Khan
Yes Joris, I did try that and it does produce the results. I am now wondering why I wanted a matrix-like structure in the first place. However, I do want 'f' to contain only values less than 2. But when I try to get rid of values greater than 2 by doing N <- (f[f < 2]), the structure of f is disrupted and hclust

Re: [R] clustering in R

2010-05-28 Thread Ayesha Khan
v <- dput(x, "sampledata.txt")
dim(v)
q <- v[1:10, 1:10]
f = as.matrix(dist(t(q)))
distB = NULL
for (k in 1:(nrow(f) - 1))
  for (m in (k + 1):ncol(f)) {
    if (f[k, m] < 2) distB = rbind(distB, c(k, m, f[k, m]))
  }
# now distB looks like this
distB
     [,1] [,2]      [,3]
[1,]    1    2 1.6275568
[2,]    1    3
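The pair-collecting loop can also be written without explicit loops. The sketch below is an alternative, not the poster's code: the data are random, and the cutoff is taken as the median distance purely so the example returns some rows (the poster's cutoff was 2).

```r
# Hypothetical data; columns play the role of the items being compared
set.seed(1)
q <- matrix(rnorm(50), nrow = 10)
f <- as.matrix(dist(t(q)))

# Illustrative cutoff (assumption): the median of the pairwise distances
cutoff <- median(f[upper.tri(f)])

# Row/column indices of upper-triangle entries below the cutoff
idx <- which(upper.tri(f) & f < cutoff, arr.ind = TRUE)

# Same three columns as distB: row index, column index, distance
distB <- cbind(idx, f[idx])
distB
```

Note that which() walks the matrix column by column, so the rows come out in a different order than in the nested loop.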

Re: [R] clustering in R

2010-05-28 Thread Ayesha Khan
I assume my matrix should look something like this?
round(distance, 4)
       P00A   P00B   M02A   M02B   P04A   P04B   M06A   M06B   P08A   P08B   M10A
P00B 0.9678
M02A 1.0054 1.0349
M02B 1.0258 1.0052 1.2106
P04A 1.0247 0.9928 1.0145 0.9260
P04B 0.9898 0.9769 0.9875 0.9855 0.6075
M06A 1.0159

Re: [R] clustering in R

2010-05-28 Thread Joris Meys
I can't run your code. Please, just give me whatever comes on your screen when you run: dput(q) On Fri, May 28, 2010 at 10:57 PM, Ayesha Khan ayesha.diamond...@gmail.com wrote: I assume my matrix should look something like this?.. round(distance, 4) P00A P00B M02A M02B P04A

Re: [R] clustering in R

2010-05-28 Thread Joris Meys
Ah OK, I didn't get your question then. A dist object is actually a vector of numbers with a couple of attributes. You can't just cut out values like that. The hclust function needs a complete distance matrix for its calculations. The shortcut is easy: just do f <- f/2*max(f), and all values are
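The point about attributes can be checked directly. The sketch below uses random data and an illustrative cutoff. The rescaling line is an assumption about the intent of the truncated message: R evaluates f/2*max(f) as (f/2)*max(f), so if the goal is for no distance to exceed 2, a form such as f * 2 / max(f) does that.

```r
set.seed(7)
q <- matrix(rnorm(50), nrow = 10)
f <- dist(t(q))

# A dist object is a numeric vector carrying attributes such as Size
attr(f, "Size")

# Logical subsetting drops those attributes: the result is a plain vector
cutoff <- median(f)            # illustrative cutoff (assumption)
sub <- f[f < cutoff]
class(sub)

# Scalar arithmetic keeps the attributes, so the result is still a dist object
f_scaled <- f * 2 / max(f)     # assumed intent: largest distance becomes 2
class(f_scaled)

# The rescaled object is still valid input for hclust()
hc <- hclust(f_scaled, method = "single")
```

Rescaling multiplies every distance by the same constant, so the merge order in the tree is unchanged; only the heights differ.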

[R] clustering in R

2010-05-27 Thread Ayesha Khan
I have a matrix with dimensions 136 x 3, and it looks something like
     [,1] [,2]     [,3]
[1,]  402  675 1.802758
[2,]  402  696 1.938902
[3,]  402  699 1.994253
[4,]  402  945 1.898619
[5,]  424  470 1.812857
[6,]  424  905 1.816345
[7,]  470  905 1.871252
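One way to feed a three-column pair list like this to hclust is to rebuild a square matrix first. The sketch below rests on two assumptions that the message does not confirm: that the columns are two item IDs followed by their distance, and that pairs missing from the list may be given a large placeholder distance (`far`, a made-up name). Only the seven rows shown are used.

```r
pairs <- rbind(
  c(402, 675, 1.802758), c(402, 696, 1.938902), c(402, 699, 1.994253),
  c(402, 945, 1.898619), c(424, 470, 1.812857), c(424, 905, 1.816345),
  c(470, 905, 1.871252)
)

ids <- sort(unique(c(pairs[, 1], pairs[, 2])))

# Assumption: unlisted pairs are treated as far apart
far <- 2 * max(pairs[, 3])
m <- matrix(far, nrow = length(ids), ncol = length(ids),
            dimnames = list(as.character(ids), as.character(ids)))
diag(m) <- 0

# Fill in the known distances symmetrically
i <- match(pairs[, 1], ids)
j <- match(pairs[, 2], ids)
m[cbind(i, j)] <- pairs[, 3]
m[cbind(j, i)] <- pairs[, 3]

# as.dist() turns the square matrix into the object hclust() expects
hc <- hclust(as.dist(m), method = "single")
```

The choice of placeholder affects the resulting tree, so it should reflect what a missing pair actually means in the data.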

[R] Clustering with R - efficient processing of large sparse data sets (text data)

2009-09-27 Thread dataguru
I checked the R procedure HCLUST (hierarchical clustering) but it looks like it requires a full triangular n x n similarity matrix as input, where n = number of observations. The number of variables is 200. My data set has n = 50,000 observations (keywords), and I use ad-hoc similarity measures
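A quick back-of-the-envelope check of why n = 50,000 is a problem, assuming each pairwise value is stored as an 8-byte double, which is how dist objects are stored in R. Note also that hclust expects dissimilarities, so a similarity measure would need to be converted first.

```r
n <- 50000

# Number of entries in the lower triangle of an n x n matrix
n_pairs <- n * (n - 1) / 2

# Storage for those entries at 8 bytes each
bytes <- n_pairs * 8
gib <- bytes / 2^30   # roughly 9.3 GiB, before any working copies

n_pairs
gib
```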

[R] Clustering In R. (rookie)

2008-11-04 Thread paul murima
Hi all. I have a large microarray data set that I would like to analyze using hierarchical clustering. The problem is that when I use the command below, hc <- hclust(dist(array), "ave") I get this feedback... Error in as.vector(x, mode) : cannot coerce type 'closure' to vector of type 'any' Can some
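One plausible cause of this error, assuming no data object named `array` exists in the session: R then finds the base function array(), and a function (a "closure") cannot be turned into a numeric matrix. The sketch below reproduces that situation and shows the same call on a made-up matrix with a hypothetical name, `expr_data`.

```r
# With no data object called array, the name refers to the base function
is.function(array)

# Passing the function to dist() fails; capture the message for inspection
msg <- tryCatch(
  hclust(dist(array), "average"),
  error = function(e) conditionMessage(e)
)
msg

# Same call on actual data stored under a non-clashing, hypothetical name
set.seed(3)
expr_data <- matrix(rnorm(60), nrow = 12)
hc <- hclust(dist(expr_data), "average")
```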