>>>>> "TS" == Tao Shi <shi...@hotmail.com> >>>>> on Wed, 10 Oct 2007 06:15:53 +0000 writes:
TS> Thank you very much, Benilton and Prof. Ripley, for the TS> speedy replies! TS> Looking forward to the fix! TS> ....Tao I have finally re-stumbled onto this e-mail thread, and indeed found fixed the problem. Version 1.12.0 of 'cluster' should become visible within a few days, and will allow to call silhoutte(g, dis) on a grouping vector of k different integer values which need *not* necessarily be in 1:k. Martin Maechler, ETH Zurich >> From: Prof Brian Ripley <rip...@stats.ox.ac.uk> >> To: Benilton Carvalho <bcarv...@jhsph.edu> >> CC: Tao Shi <shi...@hotmail.com>, maech...@stat.math.ethz.ch, >> r-help@r-project.org >> Subject: Re: [R] silhouette: clustering labels have to be consecutive >> intergers starting from 1? >> Date: Wed, 10 Oct 2007 05:33:03 +0100 (BST) >> >> It is a C-level problem in package cluster: valgrind gives >> >> ==11377== Invalid write of size 8 >> ==11377== at 0xA4015D3: sildist (sildist.c:35) >> ==11377== by 0x4706D8: do_dotCode (dotcode.c:1750) >> >> This is a matter for the package maintainer (Cc:ed here), not R-help. >> >> On Tue, 9 Oct 2007, Benilton Carvalho wrote: >> >>> that happened to me with R-2.4.0 (alpha) and was fixed on R-2.4.0 >>> (final)... >>> >>> http://tolstoy.newcastle.edu.au/R/e2/help/06/11/5061.html >>> >>> then i stopped using... now, the problem seems to be back. The same >>> examples still apply. >>> >>> This fails: >>> >>> require(cluster) >>> set.seed(1) >>> x <- rnorm(100) >>> g <- sample(2:4, 100, rep=T) >>> for (i in 1:100){ >>> print(i) >>> tmp <- silhouette(g, dist(x)) >>> } >>> >>> and this works: >>> >>> require(cluster) >>> set.seed(1) >>> x <- rnorm(100) >>> g <- sample(2:4, 100, rep=T) >>> for (i in 1:100){ >>> print(i) >>> tmp <- silhouette(as.integer(factor(g)), dist(x)) >>> } >>> >>> and here's the sessionInfo(): >>> >>> > sessionInfo() >>> R version 2.6.0 (2007-10-03) >>> x86_64-unknown-linux-gnu >>> >>> locale: >>> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.U >>> TF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF- >>> 8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_ID >>> ENTIFICATION=C >>> >>> attached base packages: >>> [1] stats graphics grDevices utils datasets methods base >>> >>> other attached packages: >>> [1] cluster_1.11.9 >>> >>> >>> (Red Hat EL 2.6.9-42 smp - AMD opteron 848) >>> >>> b >>> >>> On Oct 9, 2007, at 8:35 PM, Tao Shi wrote: >>> >>>> Hi list, >>>> >>>> When I was using 'silhouette' from the 'cluster' package to >>>> calculate clustering performances, R crashed. I traced the problem >>>> to the fact that my clustering labels only have 2's and 3's. when >>>> I replaced them with 1's and 2's, the problem was solved. Is the >>>> function purposely written in this way so when I have clustering >>>> labels, "2" and "3", for example, the function somehow takes the >>>> 'missing' cluster "2" into account when it calculates silhouette >>>> widths? >>>> >>>> Thanks, >>>> >>>> ....Tao >>>> >>>> ##============================================ >>>> ## sorry about the long attachment >>>> >>>>> R.Version() >>>> $platform >>>> [1] "i386-pc-mingw32" >>>> >>>> $arch >>>> [1] "i386" >>>> >>>> $os >>>> [1] "mingw32" >>>> >>>> $system >>>> [1] "i386, mingw32" >>>> >>>> $status >>>> [1] "" >>>> >>>> $major >>>> [1] "2" >>>> >>>> $minor >>>> [1] "5.1" >>>> >>>> $year >>>> [1] "2007" >>>> >>>> $month >>>> [1] "06" >>>> >>>> $day >>>> [1] "27" >>>> >>>> $`svn rev` >>>> [1] "42083" >>>> >>>> $language >>>> [1] "R" >>>> >>>> $version.string >>>> [1] "R version 2.5.1 (2007-06-27)" >>>> >>>>> library(cluster) >>>>> cl1 ## clustering labels >>>> [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 2 2 2 >>>> [30] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 >>>> [59] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 >>>> [88] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 >>>> [117] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 >>>> [146] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 >>>> [175] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 >>>> [204] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 >>>>> x1 ## 1-d input vector >>>> [1] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963 >>>> [6] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963 >>>> [11] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963 >>>> [16] 1.5707963 1.5707963 1.5707963 1.5707963 1.5707963 >>>> [21] 1.0163758 0.7657763 0.7370084 0.6999689 0.7366476 >>>> [26] 0.7883921 0.6925395 0.7729240 0.7202391 0.7910149 >>>> [31] 0.7397698 0.7958092 0.6978596 0.7350255 0.7294362 >>>> [36] 0.6125713 0.7174000 0.7413046 0.7044205 0.7568104 >>>> [41] 0.7048469 0.7334515 0.7143170 0.7002311 0.7540981 >>>> [46] 0.7627527 0.7712762 0.8193611 0.7801148 0.9061762 >>>> [51] 0.8248195 0.7932630 0.7248037 0.7423547 0.6419314 >>>> [56] 0.6001092 0.7572272 0.7631742 0.7085384 0.8710853 >>>> [61] 0.6589563 0.7464943 0.7487340 0.7751280 0.7946542 >>>> [66] 0.7666081 0.8508109 0.8314308 0.7442471 0.8006093 >>>> [71] 0.7949156 0.7852447 0.7630048 0.7104764 0.6768218 >>>> [76] 0.6806351 0.7255355 0.7431389 0.7523627 0.7670515 >>>> [81] 0.8118214 0.7215615 0.8186164 0.6941610 0.8285453 >>>> [86] 0.8395170 0.8088044 0.8182706 0.7550723 0.7948639 >>>> [91] 0.7204830 0.7109068 0.7756949 0.6837856 0.7055604 >>>> [96] 0.6126666 0.7201964 0.6849890 0.7779753 0.7845284 >>>> [101] 0.9370788 0.8242935 0.6908860 0.6446151 0.7660386 >>>> [106] 0.8141526 0.8111984 0.8624186 0.7865335 0.8213035 >>>> [111] 0.8059171 0.6735751 0.7815353 0.6972508 0.6699396 >>>> [116] 0.6293971 0.7475913 0.7700821 0.8258339 0.8096144 >>>> [121] 0.7058171 0.7516635 0.7323909 0.7229136 0.8344846 >>>> [126] 0.7205433 0.8287774 0.8322097 0.7767547 0.7402277 >>>> [131] 0.7939879 0.7797308 0.7112453 0.7091554 0.6417382 >>>> [136] 0.6369171 0.7059020 0.7496380 0.7298359 0.8202566 >>>> [141] 0.7331830 0.7344492 0.8316894 0.7323979 0.7977615 >>>> [146] 0.7841205 0.7587060 0.8056685 0.7895643 0.8140731 >>>> [151] 0.7890221 0.8016008 0.7381577 0.6936453 0.7133525 >>>> [156] 0.7121459 0.6851448 0.7946275 0.8077618 0.7899059 >>>> [161] 0.7128826 0.7546289 0.7042451 0.6606403 0.7525233 >>>> [166] 0.7527548 0.8098887 0.8254190 0.7873064 0.8139340 >>>> [171] 0.7903462 0.8377651 0.6709983 0.7423632 0.6632082 >>>> [176] 0.5676717 0.6925125 0.7077083 0.7488877 0.7630604 >>>> [181] 0.7843001 0.7524471 0.6871823 0.7144443 0.7692206 >>>> [186] 0.8690710 0.9282786 0.7844991 0.7094671 0.7578409 >>>> [191] 0.8026643 0.7759241 0.6997376 0.6167209 0.6682289 >>>> [196] 0.6572018 0.7615807 0.7415752 0.7659161 0.7040360 >>>> [201] 0.6874460 0.7052109 0.8290970 0.6915149 0.7173107 >>>> [206] 0.7848961 0.7943846 0.8437946 0.7817344 0.8867006 >>>> [211] 0.7575857 0.8390473 0.7382348 0.6789859 0.7129010 >>>> [216] 0.6938173 0.7384170 0.6747648 0.7203337 0.7278963 >>>>> silhouette(cl1, dist(x1)^2) ##### CRASHED! ###### >>>>> silhouette(ifelse(cl1==3,2,1), dist(x1)^2) >>>> cluster neighbor sil_width >>>> [1,] 2 1 1.0000000 >>>> [2,] 2 1 1.0000000 >>>> [3,] 2 1 1.0000000 >>>> [4,] 2 1 1.0000000 >>>> [5,] 2 1 1.0000000 >>>> [6,] 2 1 1.0000000 >>>> [7,] 2 1 1.0000000 >>>> [8,] 2 1 1.0000000 >>>> [9,] 2 1 1.0000000 >>>> [10,] 2 1 1.0000000 >>>> [11,] 2 1 1.0000000 >>>> [12,] 2 1 1.0000000 >>>> [13,] 2 1 1.0000000 >>>> [14,] 2 1 1.0000000 >>>> [15,] 2 1 1.0000000 >>>> [16,] 2 1 1.0000000 >>>> [17,] 2 1 1.0000000 >>>> [18,] 2 1 1.0000000 >>>> [19,] 2 1 1.0000000 >>>> [20,] 2 1 1.0000000 >>>> [21,] 1 2 0.7592857 >>>> [22,] 1 2 0.9934455 >>>> [23,] 1 2 0.9937880 >>>> [24,] 1 2 0.9909544 >>>> [25,] 1 2 0.9937769 >>>> [26,] 1 2 0.9912442 >>>> [27,] 1 2 0.9900156 >>>> [28,] 1 2 0.9929499 >>>> [29,] 1 2 0.9929125 >>>> [30,] 1 2 0.9908637 >>>> [31,] 1 2 0.9938610 >>>> [32,] 1 2 0.9900958 >>>> [33,] 1 2 0.9906993 >>>> [34,] 1 2 0.9937227 >>>> [35,] 1 2 0.9934823 >>>> [36,] 1 2 0.9740954 >>>> [37,] 1 2 0.9926948 >>>> [38,] 1 2 0.9938924 >>>> [39,] 1 2 0.9914623 >>>> [40,] 1 2 0.9938250 >>>> [41,] 1 2 0.9915088 >>>> [42,] 1 2 0.9936633 >>>> [43,] 1 2 0.9924367 >>>> [44,] 1 2 0.9909855 >>>> [45,] 1 2 0.9938891 >>>> [46,] 1 2 0.9936028 >>>> [47,] 1 2 0.9930799 >>>> [48,] 1 2 0.9848568 >>>> [49,] 1 2 0.9922685 >>>> [50,] 1 2 0.9371272 >>>> [51,] 1 2 0.9832647 >>>> [52,] 1 2 0.9905154 >>>> [53,] 1 2 0.9932217 >>>> [54,] 1 2 0.9939101 >>>> [55,] 1 2 0.9810071 >>>> [56,] 1 2 0.9708675 >>>> [57,] 1 2 0.9938131 >>>> [58,] 1 2 0.9935827 >>>> [59,] 1 2 0.9918943 >>>> [60,] 1 2 0.9628701 >>>> [61,] 1 2 0.9844965 >>>> [62,] 1 2 0.9939491 >>>> [63,] 1 2 0.9939495 >>>> [64,] 1 2 0.9927610 >>>> [65,] 1 2 0.9902895 >>>> [66,] 1 2 0.9933968 >>>> [67,] 1 2 0.9734481 >>>> [68,] 1 2 0.9811285 >>>> [69,] 1 2 0.9939341 >>>> [70,] 1 2 0.9892304 >>>> [71,] 1 2 0.9902461 >>>> [72,] 1 2 0.9916649 >>>> [73,] 1 2 0.9935909 >>>> [74,] 1 2 0.9920846 >>>> [75,] 1 2 0.9876779 >>>> [76,] 1 2 0.9882868 >>>> [77,] 1 2 0.9932665 >>>> [78,] 1 2 0.9939213 >>>> [79,] 1 2 0.9939182 >>>> [80,] 1 2 0.9933699 >>>> [81,] 1 2 0.9868129 >>>> [82,] 1 2 0.9930074 >>>> [83,] 1 2 0.9850624 >>>> [84,] 1 2 0.9902300 >>>> [85,] 1 2 0.9820895 >>>> [86,] 1 2 0.9781906 >>>> [87,] 1 2 0.9875197 >>>> [88,] 1 2 0.9851569 >>>> [89,] 1 2 0.9938688 >>>> [90,] 1 2 0.9902547 >>>> [91,] 1 2 0.9929304 >>>> [92,] 1 2 0.9921257 >>>> [93,] 1 2 0.9927096 >>>> [94,] 1 2 0.9887702 >>>> [95,] 1 2 0.9915856 >>>> [96,] 1 2 0.9741195 >>>> [97,] 1 2 0.9929094 >>>> [98,] 1 2 0.9889500 >>>> [99,] 1 2 0.9924910 >>>> [100,] 1 2 0.9917552 >>>> [101,] 1 2 0.9047049 >>>> [102,] 1 2 0.9834247 >>>> [103,] 1 2 0.9897916 >>>> [104,] 1 2 0.9815845 >>>> [105,] 1 2 0.9934304 >>>> [106,] 1 2 0.9862375 >>>> [107,] 1 2 0.9869624 >>>> [108,] 1 2 0.9677353 >>>> [109,] 1 2 0.9914973 >>>> [110,] 1 2 0.9843076 >>>> [111,] 1 2 0.9881568 >>>> [112,] 1 2 0.9871393 >>>> [113,] 1 2 0.9921114 >>>> [114,] 1 2 0.9906240 >>>> [115,] 1 2 0.9865148 >>>> [116,] 1 2 0.9781846 >>>> [117,] 1 2 0.9939511 >>>> [118,] 1 2 0.9931681 >>>> [119,] 1 2 0.9829519 >>>> [120,] 1 2 0.9873341 >>>> [121,] 1 2 0.9916130 >>>> [122,] 1 2 0.9939273 >>>> [123,] 1 2 0.9936196 >>>> [124,] 1 2 0.9930999 >>>> [125,] 1 2 0.9800620 >>>> [126,] 1 2 0.9929347 >>>> [127,] 1 2 0.9820138 >>>> [128,] 1 2 0.9808614 >>>> [129,] 1 2 0.9926103 >>>> [130,] 1 2 0.9938711 >>>> [131,] 1 2 0.9903987 >>>> [132,] 1 2 0.9923097 >>>> [133,] 1 2 0.9921578 >>>> [134,] 1 2 0.9919558 >>>> [135,] 1 2 0.9809652 >>>> [136,] 1 2 0.9799023 >>>> [137,] 1 2 0.9916220 >>>> [138,] 1 2 0.9939454 >>>> [139,] 1 2 0.9935022 >>>> [140,] 1 2 0.9846059 >>>> [141,] 1 2 0.9936526 >>>> [142,] 1 2 0.9937017 >>>> [143,] 1 2 0.9810402 >>>> [144,] 1 2 0.9936199 >>>> [145,] 1 2 0.9897557 >>>> [146,] 1 2 0.9918058 >>>> [147,] 1 2 0.9937665 >>>> [148,] 1 2 0.9882099 >>>> [149,] 1 2 0.9910776 >>>> [150,] 1 2 0.9862575 >>>> [151,] 1 2 0.9911553 >>>> [152,] 1 2 0.9890393 >>>> [153,] 1 2 0.9938209 >>>> [154,] 1 2 0.9901624 >>>> [155,] 1 2 0.9923515 >>>> [156,] 1 2 0.9922418 >>>> [157,] 1 2 0.9889731 >>>> [158,] 1 2 0.9902939 >>>> [159,] 1 2 0.9877542 >>>> [160,] 1 2 0.9910280 >>>> [161,] 1 2 0.9923092 >>>> [162,] 1 2 0.9938784 >>>> [163,] 1 2 0.9914431 >>>> [164,] 1 2 0.9848184 >>>> [165,] 1 2 0.9939159 >>>> [166,] 1 2 0.9939125 >>>> [167,] 1 2 0.9872706 >>>> [168,] 1 2 0.9830805 >>>> [169,] 1 2 0.9913937 >>>> [170,] 1 2 0.9862925 >>>> [171,] 1 2 0.9909633 >>>> [172,] 1 2 0.9788584 >>>> [173,] 1 2 0.9866989 >>>> [174,] 1 2 0.9939102 >>>> [175,] 1 2 0.9853007 >>>> [176,] 1 2 0.9617883 >>>> [177,] 1 2 0.9900120 >>>> [178,] 1 2 0.9918102 >>>> [179,] 1 2 0.9939489 >>>> [180,] 1 2 0.9935882 >>>> [181,] 1 2 0.9917836 >>>> [182,] 1 2 0.9939170 >>>> [183,] 1 2 0.9892708 >>>> [184,] 1 2 0.9924478 >>>> [185,] 1 2 0.9932287 >>>> [186,] 1 2 0.9640487 >>>> [187,] 1 2 0.9150126 >>>> [188,] 1 2 0.9917589 >>>> [189,] 1 2 0.9919865 >>>> [190,] 1 2 0.9937946 >>>> [191,] 1 2 0.9888295 >>>> [192,] 1 2 0.9926884 >>>> [193,] 1 2 0.9909269 >>>> [194,] 1 2 0.9751339 >>>> [195,] 1 2 0.9862132 >>>> [196,] 1 2 0.9841566 >>>> [197,] 1 2 0.9936557 >>>> [198,] 1 2 0.9938973 >>>> [199,] 1 2 0.9934375 >>>> [200,] 1 2 0.9914201 >>>> [201,] 1 2 0.9893087 >>>> [202,] 1 2 0.9915481 >>>> [203,] 1 2 0.9819092 >>>> [204,] 1 2 0.9898774 >>>> [205,] 1 2 0.9926876 >>>> [206,] 1 2 0.9917091 >>>> [207,] 1 2 0.9903339 >>>> [208,] 1 2 0.9764847 >>>> [209,] 1 2 0.9920887 >>>> [210,] 1 2 0.9526866 >>>> [211,] 1 2 0.9938025 >>>> [212,] 1 2 0.9783714 >>>> [213,] 1 2 0.9938230 >>>> [214,] 1 2 0.9880267 >>>> [215,] 1 2 0.9923108 >>>> [216,] 1 2 0.9901850 >>>> [217,] 1 2 0.9938279 >>>> [218,] 1 2 0.9873388 >>>> [219,] 1 2 0.9929195 >>>> [220,] 1 2 0.9934017 >>>> attr(,"Ordered") >>>> [1] FALSE >>>> attr(,"call") >>>> silhouette.default(x = ifelse(cl1 == 3, 2, 1), dist = dist(x1)^2) >>>> attr(,"class") >>>> [1] "silhouette" >>>> >>>> ## other examples >>>>> set.seed(1234) >>>>> cl.tmp <- rep(2:3, each=5) >>>>> x.tmp <- c(rep(-1,5), abs(rnorm(5)+3)) >>>>> silhouette(cl.tmp, dist(x.tmp)) >>>> cluster neighbor sil_width >>>> [1,] 2 1 NaN >>>> [2,] 2 1 NaN >>>> [3,] 2 1 NaN >>>> [4,] 2 1 NaN >>>> [5,] 2 1 NaN >>>> [6,] 3 2 -0.5736515 >>>> [7,] 3 2 -0.1557143 >>>> [8,] 3 2 -0.2922523 >>>> [9,] 3 2 -0.8340174 >>>> [10,] 3 2 -0.1511875 >>>> attr(,"Ordered") >>>> [1] FALSE >>>> attr(,"call") >>>> silhouette.default(x = cl.tmp, dist = dist(x.tmp)) >>>> attr(,"class") >>>> [1] "silhouette" >>>>> silhouette(ifelse(cl.tmp==2,1,2), dist(x.tmp)) >>>> cluster neighbor sil_width >>>> [1,] 1 2 1.0000000 >>>> [2,] 1 2 1.0000000 >>>> [3,] 1 2 1.0000000 >>>> [4,] 1 2 1.0000000 >>>> [5,] 1 2 1.0000000 >>>> [6,] 2 1 0.4136253 >>>> [7,] 2 1 0.7038917 >>>> [8,] 2 1 0.6467668 >>>> [9,] 2 1 -0.3360695 >>>> [10,] 2 1 0.7054709 >>>> attr(,"Ordered") >>>> [1] FALSE >>>> attr(,"call") >>>> silhouette.default(x = ifelse(cl.tmp == 2, 1, 2), dist = dist(x.tmp)) >>>> attr(,"class") >>>> [1] "silhouette" >>>>> silhouette(ifelse(cl.tmp==2,1,3), dist(x.tmp)) >>>> cluster neighbor sil_width >>>> [1,] 1 2 NaN >>>> [2,] 1 2 NaN >>>> [3,] 1 2 NaN >>>> [4,] 1 2 NaN >>>> [5,] 1 2 NaN >>>> [6,] 3 1 -0.7694686 >>>> [7,] 3 1 -0.8167313 >>>> [8,] 3 1 -0.6054665 >>>> [9,] 3 1 -0.9037412 >>>> [10,] 3 1 0.1875360 >>>> attr(,"Ordered") >>>> [1] FALSE >>>> attr(,"call") >>>> silhouette.default(x = ifelse(cl.tmp == 2, 1, 3), dist = dist(x.tmp)) >>>> attr(,"class") >>>> [1] "silhouette" >>>> >>>> _________________________________________________________________ >>>> >>>> It?s free. http://im.live.com/messenger/im/home/?source=TAGHM >>>> >>>> <mime-attachment.txt> >>> >>> ______________________________________________ >>> R-help@r-project.org mailing list >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide >>> http://www.R-project.org/posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >>> >> >> -- >> Brian D. Ripley, rip...@stats.ox.ac.uk >> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ >> University of Oxford, Tel: +44 1865 272861 (self) >> 1 South Parks Road, +44 1865 272866 (PA) >> Oxford OX1 3TG, UK Fax: +44 1865 272595 TS> ______________________________________________ TS> R-help@r-project.org mailing list TS> https://stat.ethz.ch/mailman/listinfo/r-help TS> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html TS> and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.