[R] Outer function in R
Dear members,

I am trying to apply the function kl.dist (Kullback-Leibler distance measure) to multiple matrices. I tried the following:

veckldist <- Vectorize(kl.dist)
distancematrix <- outer(matrix1, matrix2, veckldist)

But the code complains that the length of the object does not match, even though my matrices have the same length. How can I fix this error?

Thanks

--
View this message in context: http://r.789695.n4.nabble.com/Outer-function-in-R-tp4670738.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
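One possible reading of the error: outer() expands its two arguments into all pairs of individual elements and calls the function once on two long vectors, so with matrices the vectorized function sees pairs of single numbers rather than pairs of columns. A minimal sketch of a workaround is to run outer() over column indices instead. The helper kl_div and the random matrices are hypothetical stand-ins (kl.dist itself is not used here), and the sketch assumes each column is one probability distribution.

```r
# Hypothetical helper (not kl.dist): KL divergence between two probability vectors
kl_div <- function(p, q) {
  keep <- p > 0                      # terms with p == 0 contribute nothing
  sum(p[keep] * log(p[keep] / q[keep]))
}

# Made-up example data: each column is normalised to sum to 1
set.seed(1)
matrix1 <- matrix(runif(12), nrow = 4)
matrix2 <- matrix(runif(8), nrow = 4)
matrix1 <- sweep(matrix1, 2, colSums(matrix1), "/")
matrix2 <- sweep(matrix2, 2, colSums(matrix2), "/")

# Vectorize over column indices i and j, not over the matrix elements
pair_kl <- Vectorize(function(i, j) kl_div(matrix1[, i], matrix2[, j]))

# Entry [i, j] is the divergence between column i of matrix1 and column j of matrix2
distancematrix <- outer(seq_len(ncol(matrix1)), seq_len(ncol(matrix2)), pair_kl)
print(dim(distancematrix))
```

The same pattern should work with any function that takes two vectors and returns a single number.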
[R] Distance Measurement between probability distributions
Dear R users,

I would like to know about existing functions (other than Euclidean distance) for computing the distance between multiple histograms. I have found some examples, such as the Kullback-Leibler divergence, but I could not find the syntax for it. Does anybody have an idea?

Thanks

--
View this message in context: http://r.789695.n4.nabble.com/Distance-Measurement-between-probability-distributions-tp4670646.html
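As a rough illustration of the kind of measures asked about, the sketch below bins two samples on shared breaks, normalises the counts, and computes three common distances in base R. The function names kl, hellinger and total_variation, the smoothing constant eps, and the simulated data are all assumptions made for this example, not functions from a package.

```r
# Made-up samples standing in for two data sets
set.seed(42)
x <- rnorm(500)
y <- rnorm(500, mean = 1)

# Shared breaks so that both histograms use identical bins
breaks <- seq(floor(min(x, y)), ceiling(max(x, y)), by = 0.5)
p <- hist(x, breaks = breaks, plot = FALSE)$counts
q <- hist(y, breaks = breaks, plot = FALSE)$counts
p <- p / sum(p)
q <- q / sum(q)

# Kullback-Leibler divergence; eps smooths empty bins to avoid log(0)
kl <- function(p, q, eps = 1e-10) {
  p <- (p + eps) / sum(p + eps)
  q <- (q + eps) / sum(q + eps)
  sum(p * log(p / q))
}

# Hellinger distance, bounded between 0 and 1
hellinger <- function(p, q) sqrt(0.5 * sum((sqrt(p) - sqrt(q))^2))

# Total variation distance, also bounded between 0 and 1
total_variation <- function(p, q) 0.5 * sum(abs(p - q))

print(c(KL = kl(p, q), Hellinger = hellinger(p, q), TV = total_variation(p, q)))
```

Note that the Kullback-Leibler divergence is not symmetric, so kl(p, q) and kl(q, p) generally differ.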
[R] Bhattacharyya in R
Dear R users,

I am trying to apply a Bhattacharyya distance function to my data. Has anybody used it before? My code is the following:

# Bhattacharyya distance measure
# a and b are vectors
a <- (1,2,3,4,2,2,2,2,2,2,2,1,4,5,6,-1,-1,-1,-1,-1,-3,-3,-3)
b <- (1.1,1.1,1.2,1.2,1.2,1.2,1.2,2.1,2.1,2.2,2.2,2,0,0,0,0,2,2,2,2,2,3.1,3.1)
dist <- bhattacharyya.matrix(a, b, missclasification = TRUE)
plot(dist)

Could somebody give me some guidance on the syntax?

Thanks
Dizem

--
View this message in context: http://r.789695.n4.nabble.com/Bhattacharyya-in-R-tp4670671.html
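Two syntax points seem relevant: numeric vectors are built with c(...), and the two samples need to be turned into distributions before a Bhattacharyya distance can be computed. If the function in the post is the one from the fpc package (an assumption), it appears to expect cluster means and covariance matrices rather than two raw vectors. The sketch below avoids that function entirely and computes a histogram-based Bhattacharyya distance in base R; the bin width of 1 is an arbitrary choice for illustration.

```r
# Vectors from the post, built with c()
a <- c(1, 2, 3, 4, 2, 2, 2, 2, 2, 2, 2, 1, 4, 5, 6, -1, -1, -1, -1, -1, -3, -3, -3)
b <- c(1.1, 1.1, 1.2, 1.2, 1.2, 1.2, 1.2, 2.1, 2.1, 2.2, 2.2, 2,
       0, 0, 0, 0, 2, 2, 2, 2, 2, 3.1, 3.1)

# Shared bins covering both samples (bin width 1 is an assumption)
breaks <- seq(floor(min(a, b)), ceiling(max(a, b)), by = 1)
p <- hist(a, breaks = breaks, plot = FALSE)$counts / length(a)
q <- hist(b, breaks = breaks, plot = FALSE)$counts / length(b)

# Bhattacharyya coefficient and distance for discrete distributions
bc  <- sum(sqrt(p * q))
d_b <- -log(bc)
print(c(coefficient = bc, distance = d_b))
```

The result depends on the binning, so the bin width should be chosen with the data in mind.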
[R] K-means results understanding!!!
Dear members,

I am having problems understanding the kmeans results in R. I am applying the kmeans algorithm to my big data file, and it produces the clusters.

Q1) Does anybody know how to find out which data were assigned to which cluster (I fixed the number of clusters at 5)?

Command:
(kmeans.results <- kmeans(mydata, centers = 5, iter.max = 1000, nstart = 1))

Q2) When I call kmeans.results I get the following output:

K-means clustering with 5 clusters of sizes 17, 1, 6, 4, 32

Cluster means:
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]     [,11]        [,12]
1    0    0    0    0    0    0    0    0    0     0     0.000 0.0008235294
2    0    0    0    0    0    0    0    0    0     0     0.000         0.00
3    0    0    0    0    0    0    0    0    0     0     0.000         0.00
4    0    0    0    0    0    0    0    0    0     0     0.000     0.004000
5    0    0    0    0    0    0    0    0    0     0 0.0003125     0.000375
         [,13]       [,14]       [,15]       [,16]       [,17]      [,18]
1 0.0008235294 0.001176471 0.005176471 0.012471295 0.041181652 0.10663935
2         0.00         0.0         0.0         0.0 0.169491525 0.61016949
3         0.00         0.0         0.0     0.00233     0.00667 0.07695015
4     0.003000     0.00150     0.00100     0.01750     0.02900     0.0615
5 0.0015625000 0.003437500 0.010687500 0.046375000 0.100062500 0.14306250
       [,19]     [,20]     [,21]     [,22]      [,23]      [,24]   [,25]
1 0.12946535 1.0017347 0.3360283 0.2455259 0.08565672 0.02553212 0.00600
2 0.94915254 0.1694915 0.1016949     0.000         0.         0.     0.0
3 0.09376439 1.3857837 0.2659812 0.1015707 0.03804953 0.02023362 0.00767
4     0.1710 0.6665000     0.786     0.186     0.0465     0.0145 0.01200
5     0.1810 0.5200625 0.4156875 0.3461250 0.16925000 0.04918750 0.01150
         [,26]       [,27] [,28] [,29] [,30] [,31] [,32] [,33] [,34] [,35]
1 0.0005882353 0.001176471     0     0     0     0     0     0     0     0
2         0.00         0.0     0     0     0     0     0     0     0     0
3     0.001000         0.0     0     0     0     0     0     0     0     0
4         0.00         0.0     0     0     0     0     0     0     0     0
5 0.0013125000         0.0     0     0     0     0     0     0     0     0
  [,36] [,37] [,38] [,39] [,40]
1     0     0     0     0     0
2     0     0     0     0     0
3     0     0     0     0     0
4     0     0     0     0     0
5     0     0     0     0     0

Clustering vector:
 [1] 1 5 5 3 1 5 5 5 5 1 4 1 5 5 5 5 4 5 2 3 5 5 1 5 5 5 5 1 3 1 4 5 5 1 5 5 5 1
[39] 3 1 5 5 3 1 1 1 1 5 5 1 4 1 3 5 5 5 5 5 5 1

Within cluster sum of squares by cluster:
[1] 0.6702803 0.000 0.2453294 0.1860180 1.3535263
 (between_SS / total_SS = 76.8 %)

Available components:
[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"

Q3) I would like to understand which raw data are in which cluster. Does somebody know how to access the table of raw data that are in the same cluster?

Thanks for your help
DZU

--
View this message in context: http://r.789695.n4.nabble.com/K-means-results-understanding-tp4670171.html
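For Q1 and Q3, the "Clustering vector" in the output is the cluster component of the result: element i is the cluster number assigned to row i of the input. A minimal sketch on simulated data (the matrix mydata and the choice of three clusters are made up for illustration) shows how to attach that vector to the data and split the rows by cluster.

```r
# Simulated data: three well-separated groups of 20 rows each
set.seed(123)
mydata <- rbind(matrix(rnorm(60, mean = 0, sd = 0.3), ncol = 3),
                matrix(rnorm(60, mean = 3, sd = 0.3), ncol = 3),
                matrix(rnorm(60, mean = 6, sd = 0.3), ncol = 3))

km <- kmeans(mydata, centers = 3, iter.max = 1000, nstart = 10)

# One cluster label per input row, in the original row order
head(km$cluster)

# Number of rows per cluster; matches km$size
table(km$cluster)

# Raw data with the cluster label as an extra column
labelled <- data.frame(mydata, cluster = km$cluster)

# A list with one data frame of raw rows per cluster
members <- split(as.data.frame(mydata), km$cluster)
print(sapply(members, nrow))
```

With this layout, members[["2"]] would hold all raw rows assigned to cluster 2.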
Re: [R] K-means results understanding!!!
Hi,

Thanks for the reply, but I have already read the help page. I am new to R and did not understand the description of the kmeans output. That is why I wanted to ask the experts in this group. My point is that I do not understand which data are combined in a specific cluster. I tried the following:

(kmeans.results <- kmeans(mydata, centers = 4, iter.max = 1000, nstart = 1))
# The output data type is logical; cl1 is cluster 1
cl1 <- data.frame(as.numeric(kmeans.results$cluster == 1))
nbcl1 <- sum(cl1, na.rm = 1)
# The number of logical 1 values in cl1 is, for example, 22,
# which means there are 22 vectors that are similar

But when I call

mydata[kmeans.results$cluster == 1, ]

I get only 1 vector, not the 22 vectors that are in cluster 1. I thought cluster 1 contained many vectors that are similar according to the kmeans function, but the output is only one vector!

--
View this message in context: http://r.789695.n4.nabble.com/K-means-results-understanding-tp4670171p4670187.html
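The logical index used in the post should return every row of cluster 1, provided mydata is a matrix or data frame with one observation per row (an assumption about the poster's data). The sketch below uses the built-in iris measurements as stand-in data and checks that the number of selected rows agrees with the count. Note that cluster numbers are arbitrary labels and can change between runs, so counts from one kmeans call should not be compared with subsets from another.

```r
# Stand-in data: the four numeric columns of the built-in iris data set
set.seed(10)
mydata <- iris[, 1:4]
kmeans.results <- kmeans(mydata, centers = 4, iter.max = 1000, nstart = 5)

# Logical index: TRUE for every row assigned to cluster 1
in_cl1 <- kmeans.results$cluster == 1

# Count of rows in cluster 1 (summing a logical vector counts the TRUEs)
nbcl1 <- sum(in_cl1)

# All rows of cluster 1; drop = FALSE keeps a table even for a single row
cl1_rows <- mydata[in_cl1, , drop = FALSE]

print(c(counted = nbcl1, selected = nrow(cl1_rows)))
```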
[R] hist function in a for loop
Dear all,

I need to create a for loop in which I can compute multiple histograms. My code is the following:

# singlefile contains a huge csv file
# I want to specify the bin size
# I would like to compute the histograms in the for loop
numfiles <- length(singlefile)
for (i in 1:51) {
  binsize <- -20:20/2
  hist(singlefile(singlefile$GVC[singlefile$new_id==i]], break=seq(), by = binsize)))

What do I have to do? How can I specify the range for i? I am totally lost.

Thanks for your support
D.U

--
View this message in context: http://r.789695.n4.nabble.com/hist-function-in-a-for-loop-tp4669797.html
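A minimal sketch of such a loop, assuming singlefile is a data frame with a numeric column GVC and a grouping column new_id (the column names come from the post, the data here are simulated). The range for i is taken from the ids actually present, the break points are passed through the breaks argument, and the plots are written to a multi-page PDF in a temporary directory so that none are overwritten.

```r
# Simulated stand-in for the combined data
set.seed(1)
singlefile <- data.frame(new_id = rep(1:4, each = 50), GVC = rnorm(200))

# -20:20/2 is (-20:20)/2, i.e. break points from -10 to 10 in steps of 0.5
breaks <- -20:20 / 2

ids <- sort(unique(singlefile$new_id))      # the range for the loop
hists <- vector("list", length(ids))

pdf(file.path(tempdir(), "histograms.pdf")) # one page per histogram
for (k in seq_along(ids)) {
  x <- singlefile$GVC[singlefile$new_id == ids[k]]
  hists[[k]] <- hist(x, breaks = breaks,
                     main = paste("new_id", ids[k]), xlab = "GVC")
}
dev.off()
```

Storing the return value of hist() keeps the counts and breaks of each group available for later comparison.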
Re: [R] hist function in a for loop
Hello,

Thanks for the reply. I want to compute several histograms in a for loop. I am trying to set a constant bin size at the beginning.

# compute the histograms
for (i in 1:12) {
  binsize <- -20:20/2
  hist(singlefile$GVC(singlefile$new_id[,i], freq = FALSE, xlab = "Graph i", col = "pink", main = "Example Histogram", ylim = c(-3.0,3.0)))
  singlefile$GVCmin <- min(singlefile$GVC[1])
  singlefile$GVCmin <- min(singlefile$GVC[1])
  x1 <- seq(-3.0,3.0,by=.01)
  lines(x1,dnorm(x1),col = "black")
}

I also tried this, but it does not do anything. I also tried your proposal, but it says that breaks = binsize is not allowed. I think I am far away from what I want to do with my code. Plotting and computing one single histogram is easy, but inside the loop the syntax for feeding the function with the counter i is not working.

Thanks
Dizem

--
View this message in context: http://r.789695.n4.nabble.com/hist-function-in-a-for-loop-tp4669797p4669816.html
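One possible cause of the rejected breaks (a guess, since the exact error is not quoted) is that hist() stops when the supplied break points do not cover every data value. The sketch below first provokes that error on toy numbers, then derives breaks of constant width from the range of the data and overlays the standard normal density. The data frame and the width of 0.5 are assumptions; xlim rather than ylim is used to limit the horizontal axis to -3..3, since a density axis cannot be negative.

```r
# hist() fails if a value lies outside the breaks: 5 is not within 0..3
res <- tryCatch(hist(c(0, 5), breaks = 0:3, plot = FALSE),
                error = function(e) e)

# Simulated stand-in for the poster's data
set.seed(2)
singlefile <- data.frame(new_id = rep(1:3, each = 100), GVC = rnorm(300))

# Constant bin width, with break points extended to cover all values
width <- 0.5
lo <- floor(min(singlefile$GVC) / width) * width
hi <- ceiling(max(singlefile$GVC) / width) * width
breaks <- seq(lo, hi, by = width)

x1 <- seq(-3, 3, by = 0.01)
pdf(file.path(tempdir(), "density_histograms.pdf"))
for (i in sort(unique(singlefile$new_id))) {
  x <- singlefile$GVC[singlefile$new_id == i]
  hist(x, breaks = breaks, freq = FALSE, col = "pink",
       xlim = c(-3, 3), xlab = paste("Graph", i), main = "Example Histogram")
  lines(x1, dnorm(x1), col = "black")   # standard normal density for comparison
}
dev.off()
```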
Re: [R] hist function in a for loop
Dear all,

I want to do the following:

# I have created a huge csv file with 44 columns
# I want to select specific columns from this file
# CL1 contains the data from which I want to compute the histograms; CL2 is the
# column whose numbers identify the line from which the data for my next
# histogram should start

The csv file looks like this:

CL1 CL2 CL3 CL4 ..CLn
0.316.7 4.3 ... ... .. ...
0.82 .. .

My target is to select only CL1 and CL2 and to compute a histogram of the CL1 data for each CL2 block, for example from CL2 [1:2] up to CL2 [1:60]. I could print the histograms, but only one by one. I want to compute all of them with the same bin size! Therefore I wrote this code:

# combine different csv files into one
files <- list.files(path = "./Inputfiles", ".csv")
numfiles <- length(files)
print(files)
singlefile <- list()
# for loop
offset <- 1
mytotaldata <- list()
# mytotaldata includes the merged csv files
for (i in 1:numfiles) {
  mytotaldata[[files[i]]] <- read.csv(files[i], header = TRUE, sep = ",", quote = "\"")
  # add CL5 as an identification
  mytotaldata[[files[i]]]["CL5"] <- i
  # add CL2 as an identification for the number of lines
  mytotaldata[[files[i]]]["CL2"] <- as.character(floor(as.numeric(rownames(mytotaldata[[files[i]]]))/1000)+offset)
  offset <- as.numeric(tail(mytotaldata[[files[i]]],1)["CL2"]) + 1
  # create a single file for the whole data
  singlefile <- rbind(singlefile, mytotaldata[[files[i]]])
}
# Now I have a combined csv file with 2 added columns, CL2 and CL5
# Compute the histograms
# library(lattice)
numfiles <- length(singlefile)  ### Is this necessary???
for (i in 1:i) {
  # all the histograms with the same csv file
  binsize <- -20:20/2
  hist(singlefile$CL1(singlefile$CL2[,1], freq = FALSE, xlab = "Graph i", col = "pink", main = "Example Histogram", ylim = c(-3.0,3.0)))
  singlefile$GVCmin <- min(singlefile$CL1[1])
  singlefile$GVCmin <- min(singlefile$CL1[1])
  x1 <- seq(-3.0,3.0,by=.01)
  lines(x1,dnorm(x1),col = "black")
}

My sticking point is the for loop: computing the histograms inside the loop and using the bin size I have specified. Maybe the question is clear now! If somebody has faced a similar problem, please let me know about any tricks or ideas! I have tried many different things to make this for loop work but did not find a solution, so I decided to ask in the forum.

Thanks in advance
DZU

--
View this message in context: http://r.789695.n4.nabble.com/hist-function-in-a-for-loop-tp4669797p4669823.html
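A minimal sketch of the per-block step, on simulated data in place of the combined csv files. It assumes blocks of 1000 consecutive rows, numbered from 1 with integer division (a slightly different rule from the floor(rownames/1000) formula in the post), and loops over the block numbers that actually occur. Two details from the posted loop are worth noting: length() of a data frame is its number of columns, not rows, and `for (i in 1:i)` reuses whatever value i was left with by the previous loop.

```r
# Simulated stand-in for the combined data (column names follow the post)
set.seed(3)
singlefile <- data.frame(CL1 = rnorm(2500))

# Block number per row: rows 1-1000 are block 1, rows 1001-2000 block 2, ...
block_size <- 1000
singlefile$CL2 <- (seq_len(nrow(singlefile)) - 1) %/% block_size + 1

length(singlefile)   # number of columns
nrow(singlefile)     # number of rows

# Same break points for every block
breaks <- -20:20 / 2
blocks <- unique(singlefile$CL2)

# Bin counts per block, one column per block, without plotting
counts <- sapply(blocks, function(b) {
  hist(singlefile$CL1[singlefile$CL2 == b], breaks = breaks, plot = FALSE)$counts
})

# One density histogram per block, written to a multi-page PDF
x1 <- seq(-3, 3, by = 0.01)
pdf(file.path(tempdir(), "block_histograms.pdf"))
for (b in blocks) {
  hist(singlefile$CL1[singlefile$CL2 == b], breaks = breaks, freq = FALSE,
       col = "pink", xlab = paste("Graph", b), main = "Example Histogram")
  lines(x1, dnorm(x1), col = "black")
}
dev.off()
```

Because every block shares the same breaks, the columns of counts can be compared bin by bin.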
[R] Reading multiple csv files with a for loop
Dear R-help users,

I am quite new to R. I have multiple csv files of different sizes. I would like to read them using a for loop and, while reading them, add a new column that I can specify myself. But my for loop does not work! Could somebody give me an idea? Many thanks!

myfiles <- list()
for (i in 1:11)
  myfiles[i] <- read.csv(toread, header = TRUE, sep = "")
names(myfiles) <- paste(myfiles)
mytotalfiles <- myfiles
# sample the data by the number of the columns by adding a new column
sample(i1, 1000, replace = FALSE, prob = NULL)
for n <- 1000
sample <- myfiles[sample(nrow(df), 1000), ]

--
View this message in context: http://r.789695.n4.nabble.com/Reading-multiple-csv-files-with-a-for-loop-tp4669681.html
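A minimal sketch of one way to do this, using three small csv files written to a temporary directory so the example runs on its own. The directory, the file names, the column name source_id and the sample size are all made up for illustration. The main change from the posted loop is the double bracket: `myfiles[i] <- ...` with a single bracket keeps only the first column of each data frame, whereas `myfiles[[...]] <- ...` stores the whole data frame.

```r
# Create three example csv files of different sizes in a temporary directory
set.seed(4)
dir_in <- tempfile("Inputfiles")
dir.create(dir_in)
for (k in 1:3) {
  write.csv(data.frame(x = rnorm(5 * k), y = runif(5 * k)),
            file.path(dir_in, sprintf("file%d.csv", k)), row.names = FALSE)
}

# Read every csv file in the directory into a named list
files <- list.files(dir_in, pattern = "\\.csv$", full.names = TRUE)
myfiles <- list()
for (i in seq_along(files)) {
  d <- read.csv(files[i], header = TRUE)
  d$source_id <- i                       # new column chosen by the user
  myfiles[[basename(files[i])]] <- d     # double brackets keep the whole data frame
}

# Stack all files and draw a random sample of rows without replacement
combined <- do.call(rbind, myfiles)
n <- min(10, nrow(combined))
smp <- combined[sample(nrow(combined), n), ]
print(dim(smp))
```

Using seq_along(files) instead of a fixed 1:11 lets the loop adapt to however many files are found.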