Hi,

thank you very much for your reply. :-)

- So I really have only six objects in this data set. It looks like this:

objects  cat1   cat2   cat3   cat4   ...
A        TRUE   FALSE  FALSE  FALSE
B        TRUE   FALSE  TRUE   FALSE
C        TRUE   FALSE  FALSE  FALSE
D        FALSE  TRUE   TRUE   TRUE
E        TRUE   TRUE   TRUE   TRUE
F        TRUE   FALSE  TRUE   FALSE

- I changed the field separator of the CSV file from comma to |
because I do other specific parsing on it. The original data have integer
values 1 (TRUE) and 0 (FALSE).

- I now use this procedure to convert 1 and 0 to the TRUE/FALSE coding (see
above) without duplicated columns:

dummyVar <- db[-1] > 0
x <- dummyVar
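
The conversion above can be tried on a small stand-alone example. This is
only a sketch: the inline text stands in for file.csv, and it contains just
the four categories shown in the table (the elided columns are left out).

```r
# Inline stand-in for the pipe-separated file (assumption: 0/1 values,
# only the four categories shown in the table above).
txt <- "objects|cat1|cat2|cat3|cat4
A|1|0|0|0
B|1|0|1|0
C|1|0|0|0
D|0|1|1|1
E|1|1|1|1
F|1|0|1|0"

db <- read.table(text = txt, header = TRUE, sep = "|")

# Comparing the data frame (minus the id column) with 0 yields a logical
# matrix with one column per category, so no column is duplicated.
x <- db[-1] > 0
rownames(x) <- as.character(db$objects)
print(x)
```

The result is a plain logical matrix, which is the TRUE/FALSE coding shown
in the table.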

- The result is the same as in my previous mail. It is also the same
whether I use predict or fitted (rp <- predict(rc, x) / rf <- fitted(rc)).
Do you know what the difference between predict and fitted is, please?

- Which values of the beta and theta parameters are optimal, please?

- So my clusters are: ABC - cluster 1, DEF - cluster NA. What does "NA"
mean? So the objects within each group (ABC, DEF) are the most similar.

- I will apply this algorithm to my next data set, which includes many more
objects... I will also have a question about the Proximus algorithm (in my
next mail), because it will be the second algorithm for binary clustering
of my data sets...
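
On the question of which objects are most similar, here is a minimal sketch
in base R. It assumes only the four categories shown above (the elided
columns are left out, so the numbers will differ on the full data). ROCK
treats two objects as neighbours when their similarity is at least theta;
here the similarity is taken as 1 minus the Jaccard ("binary") distance
from dist(), which I believe corresponds to the rockCluster defaults, but
that should be checked against the cba help page. The names x, sim, theta
and neighbours are only illustrative, and theta = 0.5 is just an example
value, not a recommendation.

```r
# The six objects with the four categories shown above (assumption: the
# elided columns are ignored).
x <- matrix(c(1, 0, 0, 0,
              1, 0, 1, 0,
              1, 0, 0, 0,
              0, 1, 1, 1,
              1, 1, 1, 1,
              1, 0, 1, 0) > 0,
            ncol = 4, byrow = TRUE,
            dimnames = list(LETTERS[1:6], paste0("cat", 1:4)))

# Jaccard distance between rows; 1 - distance is the Jaccard similarity.
d   <- as.matrix(dist(x + 0, method = "binary"))
sim <- 1 - d

# Neighbour relation for an example threshold: similarity >= theta.
theta <- 0.5
neighbours <- sim >= theta
diag(neighbours) <- FALSE

print(round(sim, 2))
print(neighbours)
```

With these four columns, A and C are identical (similarity 1), as are B and
F, while A and D share no category (similarity 0). Trying a few values of
theta on such a table shows how the neighbour relation, and with it the
clustering, changes.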

Thanks.

-- 

Best Regards
Matej Zuzcak

On 16.8.2016 at 8:42, PIKAL Petr wrote:

> Hi
>
> see in line
>
>> -----Original Message-----
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Matej
>> Zuzčák
>> Sent: Monday, August 15, 2016 11:23 AM
>> To: r-help@r-project.org
>> Subject: [R] Need help with use of ROCK algorithm in R for binary data
>>
>> Dear list members,
>>
>> I have a request for you.
>>
>> I need to use the ROCK (RockCluster) algorithm for binary data in R. My
>> binary data looks like this:
>>
>> objects  cat1   cat2   cat3   cat4   ...
>> A        TRUE   FALSE  FALSE  FALSE
>> B        TRUE   FALSE  TRUE   FALSE
>> C        TRUE   FALSE  FALSE  FALSE
>> D        FALSE  TRUE   TRUE   TRUE
>> E        TRUE   TRUE   TRUE   TRUE
>> F        TRUE   FALSE  TRUE   FALSE
>>
> Better to show your data with dput command. Just copy the output of
>
> dput(head(db, 20))
>
> to your mail.
>> Now I need to classify these objects A-F into clusters. I apply this procedure:
>> https://en.wikibooks.org/wiki/Data_Mining_Algorithms_In_R/Clustering/RockCluster#Dataset
>> But I have several problems.
>>
>>  1. I import data from CSV file. |db <- read.csv(file="file.csv",
>>     header=TRUE, sep="|")| Fields are 1 (TRUE) and 0 (FALSE).
> Hm. Why do you use read.csv if you set the separator to "|"? I would use 
> read.table.
>
>>  2. I convert this data: |x <- as.dummy(db[-1])|. After this step all
>>     columns in x are duplicated with 1 and 0. Why? Is this correct, please?
> Hm. Strange. In help page the result is TRUE/FALSE coding. Again posting real 
> data would help us to understand your problem.
>
> > x <- as.integer(sample(3,10,rep=TRUE))
> > x
>  [1] 1 1 1 3 1 3 1 3 2 2
> > as.dummy(x)
>        [,1]  [,2]  [,3]
>  [1,]  TRUE FALSE FALSE
>  [2,]  TRUE FALSE FALSE
>  [3,]  TRUE FALSE FALSE
>  [4,] FALSE FALSE  TRUE
>  [5,]  TRUE FALSE FALSE
>  [6,] FALSE FALSE  TRUE
>  [7,]  TRUE FALSE FALSE
>  [8,] FALSE FALSE  TRUE
>  [9,] FALSE  TRUE FALSE
> [10,] FALSE  TRUE FALSE
> attr(,"levels")
> [1] "1" "2" "3"
>
> As I understand from the help page, each column is repeated once for each of 
> its levels, and each resulting column is coded T/F for that particular 
> factor level.
>
>>  3. |rc <- rockCluster(x, n=4, debug=TRUE)|
>>  4. |rf <- fitted(rc)| Why |fitted|, and when should I rather use
>>     |predict(rc, x)|?
>>  5. |table(db$objects, rf$cl)| After that I get this output:
>>
>> |    1   NA
>> A   1    0
>> B   1    0
>> C   1    0
>> D   0    1
>> E   0    1
>> F   0    1
>> |
>>
>> How can I read this output? Which objects are clustered together?
>> Which objects are the most similar, please?
> There are only 2 clusters with levels 1 and NA. ABC belongs to cluster 1, DEF 
> belongs to cluster NA. And what is weirdest, you have only 6 values in 
> your db data???
>
> So again presenting your data either by dput or str is vital for evaluating 
> your problem.
>
> And BTW do not post in HTML, your messages are more or less scrambled.
>
> Cheers
> Petr
>
>
>> Many thanks for your help.
>>
>> --
>> Best Regards
>> Matej Zuzcak
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
