Hi, thank you very much for your reply. :-)
- So I really have only six objects in this data set. It looks like this:

objects  cat1   cat2   cat3   cat4  ...
A        TRUE   FALSE  FALSE  FALSE
B        TRUE   FALSE  TRUE   FALSE
C        TRUE   FALSE  FALSE  FALSE
D        FALSE  TRUE   TRUE   TRUE
E        TRUE   TRUE   TRUE   TRUE
F        TRUE   FALSE  TRUE   FALSE

- I changed the field separator of the CSV file from a comma to | because I do
other specific parsing on it. The original data have integer values 1 (TRUE)
and 0 (FALSE).

- I now use this procedure to convert 1 and 0 to TRUE/FALSE coding (see above)
without duplicated columns:

dummyVar <- db[-1] > 0
x <- dummyVar

- The result is the same as in my previous mail. It is also the same whether I
use predict or fitted:

rp <- predict(rc, x)
rf <- fitted(rc)

Do you know what the difference between predict and fitted is, please? And
what values of the beta and theta parameters are optimal, please?

So my clusters are: ABC - cluster 1, DEF - cluster NA. What does "NA" mean?
So these objects (ABC, DEF) are the most similar to each other.

I will apply this algorithm to the next data set, which includes many more
objects...

I will also have a question about the Proximus algorithm (in my next mail),
because it will be the second algorithm for binary clustering of my data
sets...

Thanks.

--
Best Regards
Matej Zuzcak

On 16.8.2016 at 8:42, PIKAL Petr wrote:
> Hi
>
> see in line
>
>> -----Original Message-----
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Matej
>> Zuzčák
>> Sent: Monday, August 15, 2016 11:23 AM
>> To: r-help@r-project.org
>> Subject: [R] Need help with use of ROCK algorithm in R for binary data
>>
>> Dear list members,
>>
>> I have one appeal for you.
>>
>> I need use ROCK (RockCluster) algorithm for binary data in R. My binary data
>> looks this:
>>
>> objects  cat1   cat2   cat3   cat4  ...
>> A        TRUE   FALSE  FALSE  FALSE
>> B        TRUE   FALSE  TRUE   FALSE
>> C        TRUE   FALSE  FALSE  FALSE
>> D        FALSE  TRUE   TRUE   TRUE
>> E        TRUE   TRUE   TRUE   TRUE
>> F        TRUE   FALSE  TRUE   FALSE
> Better to show your data with dput command.
> Just copy the output of
>
> dput(head(db, 20))
>
> to your mail.
>> Now I need classify these objects A-F to clusters. I apply this procedure
>> https://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Clustering/RockCluster#Dataset
>> But I have several problems.
>>
>> 1. I import data from CSV file.
>> db <- read.csv(file="file.csv", header=TRUE, sep="|")
>> Fields are 1 (TRUE) and 0 (FALSE).
> Hm. Why do you use csv if you set the separator to "|". I would use
> read.table.
>
>> 2. I convert this data: x <- as.dummy(db[-1]). After this step all
>> columns in x are duplicated with 1 and 0. Why? It is correct please?
> Hm. Strange. In help page the result is TRUE/FALSE coding. Again posting real
> data would help us to understand your problem.
>
> > x <- as.integer(sample(3,10,rep=TRUE))
> > x
> [1] 1 1 1 3 1 3 1 3 2 2
> > as.dummy(x)
>        [,1]  [,2]  [,3]
>  [1,]  TRUE FALSE FALSE
>  [2,]  TRUE FALSE FALSE
>  [3,]  TRUE FALSE FALSE
>  [4,] FALSE FALSE  TRUE
>  [5,]  TRUE FALSE FALSE
>  [6,] FALSE FALSE  TRUE
>  [7,]  TRUE FALSE FALSE
>  [8,] FALSE FALSE  TRUE
>  [9,] FALSE  TRUE FALSE
> [10,] FALSE  TRUE FALSE
> attr(,"levels")
> [1] "1" "2" "3"
>
> As I understand from help page, each column is repeated the levels(column)
> times and each column in result has coding T/F based on that particular
> factor level.
>
>> 3. rc <- rockCluster(x, n=4, debug=TRUE)
>> 4. rf <- fitted(rc)
>> Why fitted and when rather use predict(rc, x)?
>> 5. table(db$objects, rf$cl)
>> After I get this output:
>>
>>     1 NA
>>   A 1  0
>>   B 1  0
>>   C 1  0
>>   D 0  1
>>   E 0  1
>>   F 0  1
>>
>> What way I can read this output? What objects are in clusters with other?
>> What objects are the most similar please?
> There are only 2 clusters with levels 1 and NA. ABC belongs to cluster 1, DEF
> belongs to cluster NA. And what is the most weird, you have only 6 values in
> your db data ???
>
> So again presenting your data either by dput or str is vital for evaluating
> your problem.
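
The request to present the data with dput or str can be illustrated with a
small sketch. It uses only base R and assumes a data frame db rebuilt by hand
from the six-object table shown earlier in this thread (the real file has
more category columns, which are left out here). It also assumes that the
base R function head() is the intended call for taking the first rows.

```r
# Hypothetical reconstruction of the six-object example from this thread.
# The real db has further category columns (the "..." in the table).
db <- data.frame(
  objects = c("A", "B", "C", "D", "E", "F"),
  cat1 = c(1L, 1L, 1L, 0L, 1L, 1L),
  cat2 = c(0L, 0L, 0L, 1L, 1L, 0L),
  cat3 = c(0L, 1L, 0L, 1L, 1L, 1L),
  cat4 = c(0L, 0L, 0L, 1L, 1L, 0L),
  stringsAsFactors = FALSE
)

# Text representation of the first rows; it can be pasted into a mail
# and turned back into the same object by the reader.
dput(head(db, 20))

# Compact overview of the column names and types.
str(db)

# Convert the 0/1 columns to a logical matrix, dropping the label column.
x <- as.matrix(db[-1] > 0)
rownames(x) <- db$objects
```

The last two lines mirror the db[-1] > 0 conversion from the mail; the row
names are added only so that the objects stay identifiable in later output.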
>
> And BTW do not post in HTML, your messages are more or less scrambled.
>
> Cheers
> Petr
>
>
>> Many thanks for your help.
>>
>> --
>> Best Regards
>> Matej Zuzcak
>>
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.