Ah OK, I didn't get your question then. A dist object is actually a vector of numbers with a couple of attributes, so you can't just cut out values like that. The hclust function needs a complete distance matrix for its calculations.
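To make that concrete, here is a minimal sketch on made-up toy data (the matrix m and the names g, hc and groups are only for illustration, they are not from the thread). It shows what a dist object carries, what is lost when you subset it, and one possible way to apply the threshold of 2 without touching the dist object at all: cut the single-linkage tree at that height with cutree().

```r
# Toy data: 4 points in 2 dimensions (made up for illustration)
m <- matrix(c(0, 0,
              1, 0,
              0, 1.5,
              3, 3), ncol = 2, byrow = TRUE)
f <- dist(m)

# A dist object stores the lower triangle as a plain numeric vector...
length(f)          # choose(4, 2) pairwise distances
# ...plus attributes; hclust reads the number of objects from "Size"
attr(f, "Size")

# Subsetting keeps the numbers but drops the attributes
g <- f[f < 2]
class(g)           # a bare numeric vector, no longer of class "dist"
attr(g, "Size")    # NULL

# One possible alternative: keep f complete and apply the threshold
# after clustering, by cutting the single-linkage tree at height 2
hc <- hclust(f, method = "single")
groups <- cutree(hc, h = 2)
groups
```

With single linkage, cutting at h = 2 puts two items in the same group whenever they are connected by a chain of pairs that are each at most 2 apart, which is the "friends of friends" idea. Note that this uses "at most 2" rather than "less than 2", so a pair at exactly 2 would still be linked.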
The shortcut is easy: just rescale with f <- 2*f/max(f), and no value is above 2. Otherwise this function could do that for you:

to.dist <- function(x){
  # x: three columns (item 1, item 2, distance); a matrix or a data frame
  x.names <- sort(unique(c(x[,1], x[,2])))
  n <- length(x.names)
  # pairs that do not occur in x keep the initial distance 0
  x.dist <- matrix(0, n, n)
  dimnames(x.dist) <- list(x.names, x.names)
  # fill both (i,j) and (j,i) so the matrix is symmetric
  x.ind <- rbind(cbind(match(x[,1], x.names), match(x[,2], x.names)),
                 cbind(match(x[,2], x.names), match(x[,1], x.names)))
  x.dist[x.ind] <- rep(x[,3], 2)
  x.dist <- as.dist(x.dist)
  return(x.dist)
}

d <- to.dist(distB)
hclust(d)

Cheers
Joris

On Sat, May 29, 2010 at 12:04 AM, Ayesha Khan <ayesha.diamond...@gmail.com> wrote:

> Yes Joris. I did try that and it does produce the results. I am now
> wondering why I wanted a matrix-like structure in the first place. However,
> I do want 'f' to contain values less than 2 only. But when I try to get rid
> of values greater than 2 by doing N <- f[f<2], the structure of f is disrupted
> and hclust doesn't want to recognize it anymore. Because obviously the data
> frame changes again with that. Any ideas on how to do that?
>
>
> On Fri, May 28, 2010 at 4:13 PM, Joris Meys <jorism...@gmail.com> wrote:
>
>> errr, forget about the output of dput(q), but keep it in mind for next
>> time.
>>
>> f = dist(t(q))
>> hclust(f,method="single")
>>
>> it's as simple as that.
>> Cheers
>> Joris
>>
>>
>> On Fri, May 28, 2010 at 10:39 PM, Ayesha Khan <ayesha.diamond...@gmail.com> wrote:
>>
>>> v <- dput(x,"sampledata.txt")
>>> dim(v)
>>> q <- v[1:10,1:10]
>>> f =as.matrix(dist(t(q)))
>>>
>>> distB=NULL
>>> for(k in 1:(nrow(f)-1)) for( m in (k+1):ncol(f)) {
>>>   if(f[k,m] <2) distB=rbind(distB,c(k,m,f[k,m]))
>>> }
>>> #now distB looks like this
>>>
>>> > distB
>>>       [,1] [,2]      [,3]
>>>  [1,]    1    2 1.6275568
>>>  [2,]    1    3 0.5252058
>>>  [3,]    1    4 0.7323116
>>>  [4,]    1    5 1.9966001
>>>  [5,]    1    6 1.6664110
>>>  [6,]    1    7 1.0800540
>>>  [7,]    1    8 1.8698925
>>>  [8,]    1   10 0.5161808
>>>  [9,]    2    3 1.7325811
>>> [10,]    2    5 0.8267843
>>> [11,]    2    6 0.5963280
>>> [12,]    2    7 0.8787230
>>>
>>> #now from this output, I want to cluster all 1's, friends of 1 and
>>> friends of friends of 1 in one cluster. The same goes for 2, 3 and so on.
>>> But when I do that using hclust, I get the following error. I think what
>>> I need to do is convert my current matrix somehow into a format that would
>>> be accepted by the hclust function but I don't know how to achieve that.
>>> distclust <- hclust(distB,method="single")
>>>
>>> Error in if (n < 2) stop("must have n >= 2 objects to cluster") :
>>> argument is of length zero
>>>
>>> P.S: Please let me know if this makes things more clear? 'Cause I don't know
>>> how looking at the original data set would help because the matrix under
>>> consideration right now is the distance matrix and how it can be altered. I
>>> have tried as.dist, doesn't work because my matrix as I mentioned earlier is
>>> not a square matrix.
>>> On Fri, May 28, 2010 at 2:37 PM, Tal Galili <tal.gal...@gmail.com> wrote:
>>>
>>>> Hi Ayesha,
>>>> I wish to help you, but without a simple self contained example that
>>>> shows your issue, I will not be able to help.
>>>> Try using the ?dput command to create some simple data, and let us see
>>>> what you are doing.
>>>>
>>>> Best,
>>>> Tal
>>>> ----------------Contact Details:-------------------------------------------------------
>>>> Contact me: tal.gal...@gmail.com | 972-52-7275845
>>>> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew)
>>>> | www.r-statistics.com (English)
>>>> ----------------------------------------------------------------------------------------------
>>>>
>>>> On Fri, May 28, 2010 at 9:04 PM, Ayesha Khan <ayesha.diamond...@gmail.com> wrote:
>>>>
>>>>> Thanks Tal & Joris!
>>>>> I created my distance matrix distA by using the dist() function in R
>>>>> manipulating my output in order to get a matrix.
>>>>> distA =as.matrix(dist(t(x2))) # x2 being my original dataset
>>>>> as according to the documentation on dist()
>>>>>
>>>>> For the default method, a "dist" object, or a matrix (of distances) or
>>>>> an object which can be coerced to such a matrix using as.matrix()
>>>>>
>>>>> On Fri, May 28, 2010 at 6:34 AM, Joris Meys <jorism...@gmail.com> wrote:
>>>>>
>>>>>> As Tal said.
>>>>>>
>>>>>> Next to that, I read that column1 (and column2?) are supposed to be
>>>>>> seen as factors, not as numerical variables. Did you take that into
>>>>>> account somehow?
>>>>>>
>>>>>> It's easy to reproduce the error code :
>>>>>> > n <- NULL
>>>>>> > if(n<2)print("This is OK")
>>>>>> Error in if (n < 2) print("This is OK") : argument is of length zero
>>>>>>
>>>>>> In the hclust code, you find following line :
>>>>>> n <- as.integer(attr(d, "Size"))
>>>>>> where d is the distance object entered in the hclust function. Looking
>>>>>> at the error you get, this means that the size attribute of your
>>>>>> distance is NULL. Which tells me that distA is not a dist-object.
>>>>>>
>>>>>> > A <- matrix(1:4,ncol=2)
>>>>>> > A
>>>>>>      [,1] [,2]
>>>>>> [1,]    1    3
>>>>>> [2,]    2    4
>>>>>> > hclust(A,method="single")
>>>>>>
>>>>>> Error in if (n < 2) stop("must have n >= 2 objects to cluster") :
>>>>>> argument is of length zero
>>>>>>
>>>>>> Did you actually put in a distance object? see also ?dist or ?as.dist.
>>>>>>
>>>>>> Cheers
>>>>>> Joris
>>>>>>
>>>>>> On Fri, May 28, 2010 at 1:41 AM, Ayesha Khan <ayesha.diamond...@gmail.com> wrote:
>>>>>>
>>>>>>> I have a matrix with the following dimensions
>>>>>>> 136 3
>>>>>>>
>>>>>>> and it looks something like
>>>>>>>
>>>>>>>      [,1] [,2]     [,3]
>>>>>>> [1,]  402  675 1.802758
>>>>>>> [2,]  402  696 1.938902
>>>>>>> [3,]  402  699 1.994253
>>>>>>> [4,]  402  945 1.898619
>>>>>>> [5,]  424  470 1.812857
>>>>>>> [6,]  424  905 1.816345
>>>>>>> [7,]  470  905 1.871252
>>>>>>> [8,]  504  780 1.958191
>>>>>>> [9,]  504  848 1.997111
>>>>>>> ...
>>>>>>>
>>>>>>> so you get the idea. I want to group similar items in one group/cluster
>>>>>>> following the "friends of friends" approach. I tried doing
>>>>>>>
>>>>>>> distclust <- hclust(distA,method="single")
>>>>>>> However, I got the following error.
>>>>>>>
>>>>>>> Error in if (n < 2) stop("must have n >= 2 objects to cluster") :
>>>>>>> argument is of length zero
>>>>>>> which probably means there's something wrong with my input here. Is there
>>>>>>> another way of doing this kind of clustering without getting into all the
>>>>>>> looping and ifelse etc.? Basically, if 402 is close to 675, 696, and 699 and
>>>>>>> thus falls in cluster A, then all items close to 675, 696, and 699 should also
>>>>>>> fall into the same cluster A following a friends of friends strategy.
>>>>>>> Any help would be highly appreciated.
>>>>>>>
>>>>>>> --
>>>>>>> Ayesha Khan
>>>>>>>
>>>>>>> MS Bioengineering
>>>>>>> Dept. of Bioengineering
>>>>>>> Rice University, TX
>>>>>>>
>>>>>>> ______________________________________________
>>>>>>> R-help@r-project.org mailing list
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>> PLEASE do read the posting guide
>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>
>>>>>> --
>>>>>> Joris Meys
>>>>>> Statistical Consultant
>>>>>>
>>>>>> Ghent University
>>>>>> Faculty of Bioscience Engineering
>>>>>> Department of Applied mathematics, biometrics and process control
>>>>>>
>>>>>> Coupure Links 653
>>>>>> B-9000 Gent
>>>>>>
>>>>>> tel : +32 9 264 59 87
>>>>>> joris.m...@ugent.be
>>>>>> -------------------------------
>>>>>> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php