Re: [R] How to add a new row in to an existing data set in R Language?
Hi Ista Zahn, thanks for your advice. Please see the following image: http://r.789695.n4.nabble.com/file/n4645113/data.png I am expecting the result to be the same as in the image. -- View this message in context: http://r.789695.n4.nabble.com/How-to-add-a-new-row-in-to-an-existing-data-set-in-R-Language-tp4644855p4645113.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to add a new row in to an existing data set in R Language?
Hi. I have one dataset with 3 variables and 5 observations. I want to add some more observations to the existing dataset. Data: Mydata

center drug change
101 P 18
102 P 14
103 D 12
104 D 18
105 P 10

This is the data set. I want to add some more rows to the existing dataset; those rows are:

106 P 17
107 D 18
108 D NA
109 P 13
110 P 12

Is this possible without using any combine functions? Please reply. -- View this message in context: http://r.789695.n4.nabble.com/How-to-add-a-new-row-in-to-an-existing-data-set-in-R-Language-tp4645114.html
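The request above — appending rows without any combine function — can be sketched with plain row indexing. This is a minimal sketch; the column names (`center`, `drug`, `change`) are assumed from the posted data:

```r
# Reconstruct the posted data set (column names assumed from the post).
mydata <- data.frame(center = c(101, 102, 103, 104, 105),
                     drug   = c("P", "P", "D", "D", "P"),
                     change = c(18, 14, 12, 18, 10),
                     stringsAsFactors = FALSE)

new.rows <- data.frame(center = c(106, 107, 108, 109, 110),
                       drug   = c("P", "D", "D", "P", "P"),
                       change = c(17, 18, NA, 13, 12),
                       stringsAsFactors = FALSE)

# Assigning to row indices past the current last row grows the data frame,
# so no explicit combine function (rbind/merge) is called.
mydata[nrow(mydata) + seq_len(nrow(new.rows)), ] <- new.rows
```

Under the hood this still copies the data frame, so for large data `rbind()` is no worse; the indexing form merely satisfies the "no combine functions" constraint.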
Re: [R] How to add a new row in to an existing data set in R Language?
Thanks for your answer, Arun. The image above shows the merge in SAS; is the same possible in R? Please let me know. -- View this message in context: http://r.789695.n4.nabble.com/How-to-merge-data-set-in-R-Language-tp4644855p4645119.html
Re: [R] Rainflow, range pair counting
That is right. It is part of materials engineering; you could say it is a standard method there. A better description is on Wikipedia: http://en.wikipedia.org/wiki/Rainflow-counting_algorithm I have found packages for all the other big statistics software, like Matlab or DIAdem, but none for R. And I thought R could do everything except make coffee. 2012/10/4 Bert Gunter gunter.ber...@gene.com I'm going to make a wild guess that this might have something to do with bump hunting (= local conditional extrema). If so, the prim package might be applicable. Otherwise (...cycles in time series) time series methods might be appropriate. But I basically agree with others that the OP has not clearly specified the issue. Cheers, Bert On Thu, Oct 4, 2012 at 10:34 AM, Marc Schwartz marc_schwa...@me.com wrote: A quick Google search suggests that the methods are used in material fatigue studies where varying stress loads are placed upon the material, which puts this into a materials engineering context. Using rseek.org does not give any joy. Perhaps someone with experience in that domain will chime in. Regards, Marc Schwartz On Oct 4, 2012, at 10:27 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: R is used by many, many folks, and most of us don't have the domain knowledge to make heads or tails of what you're mentioning here. You'll need to greatly clarify, and perhaps ask on the appropriate R-SIG-* list. (Ecology maybe?) Perhaps rseek.org is also of help. Cheers, Michael On Thu, Oct 4, 2012 at 1:19 PM, Dandy Ehlert dandy.ehl...@gmail.com wrote: Hello, I have a question about R. I'm searching for a function or R code which computes some collectives. In the literature they are called: level crossing counting, rainflow counting, range pair counting. I can't believe that's not available for R; maybe I am just looking for the wrong words. These functions are used to count the cycles (ranges and means) of a time series.
Thx for helping :) -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
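None of the counting methods named in this thread appear to ship with base R, but the simplest of them, level crossing counting, is easy to sketch by hand. The helper below is hypothetical (not a function from any package): it counts, for each given level, how often the series crosses that level upward.

```r
# Hypothetical sketch of level crossing counting: for each level L,
# count index pairs (i, i+1) where the series steps from below L to >= L.
level_crossings <- function(x, levels) {
  vapply(levels,
         function(L) sum(x[-length(x)] < L & x[-1] >= L),
         numeric(1))
}

set.seed(1)
y <- sin(seq(0, 20, by = 0.1)) + rnorm(201, sd = 0.2)
level_crossings(y, levels = c(-0.5, 0, 0.5))
```

Rainflow and range pair counting are considerably more involved (they pair local extrema into closed hysteresis cycles), so this sketch covers only the first method on the list.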
Re: [R] problem with convergence in mle2/optim function
Presumably you checked out the CRAN Optimization task view to see if you could find any joy there, right? (I doubt there is, but ...) -- Bert On Thu, Oct 4, 2012 at 10:12 PM, Adam Zeilinger zeil0...@umn.edu wrote: Hello R Help, I am trying to solve an MLE convergence problem: I would like to estimate four parameters, p1, p2, mu1, mu2, which relate to the probabilities, P1, P2, P3, of a multinomial (trinomial) distribution. I am using the mle2() function and feeding it a time series dataset composed of four columns: time point, number of successes in category 1, number of successes in category 2, and number of successes in category 3. The column headers are: t, n1, n2, and n3. The mle2() function converges only occasionally, and I need to improve the rate of convergence when it is used in a stochastic simulation with multiple stochastically generated datasets. When mle2() does not converge, it returns an error: Error in optim(par = c(2, 2, 0.001, 0.001), fn = function (p) : L-BFGS-B needs finite values of 'fn'. I am using the L-BFGS-B optimization method with a lower box constraint of zero for all four parameters. While I do not know any theoretical upper limit(s) to the parameter values, I have not seen any parameter estimates above 2 when using empirical data. It seems that when I start with certain 'true' parameter values, the rate of convergence is quite high, whereas other true parameter values are very difficult to estimate. For example, the true parameter values p1 = 2, p2 = 2, mu1 = 0.001, mu2 = 0.001 cause convergence problems, but the parameter values p1 = 0.3, p2 = 0.3, mu1 = 0.08, mu2 = 0.08 lead to a high convergence rate. I've chosen these two sets of values because they represent the upper and lower estimates of parameter values derived from graphical methods. First, do you have any suggestions on how to improve the rate of convergence and avoid the finite values of 'fn' error? Perhaps it has to do with the true parameter values being so close to the boundary?
If so, any suggestions on how to estimate parameter values that are near zero? Here is reproducible and relevant code from my stochastic simulation: library(bbmle) library(combinat) # define multinomial distribution dmnom2 <- function(x, prob, log=FALSE) { r <- lgamma(sum(x) + 1) + sum(x * log(prob) - lgamma(x + 1)) if (log) r else exp(r) } # vector of time points tv <- 1:20 # Negative log likelihood function NLL.func <- function(p1, p2, mu1, mu2, y){ t <- y$tv n1 <- y$n1 n2 <- y$n2 n3 <- y$n3 P1 <- (p1*((-1 + exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t))*((-mu2)*(mu2 - p1 + p2) + mu1*(mu2 + 2*p2)) - mu2*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) - exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t)* mu2*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) + 2*exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2*t)*mu2* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)/ exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2*t)/(2*(mu2*p1 + mu1*(mu2 + p2))* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2 P2 <- (p2*((-1 + exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t))*(-mu1^2 + 2*mu2*p1 + mu1*(mu2 - p1 + p2)) - mu1*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) - exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t)* mu1*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) + 2*exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2*t)*mu1* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)/ exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2*t)/(2*(mu2*p1 + mu1*(mu2 + p2))* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2 P3 <- 1 - P1 - P2 p.all <- c(P1, P2, P3) -sum(dmnom2(c(n1, n2, n3), prob = p.all, log = TRUE)) } ## Generate simulated data # Model equations as expressions, P1 <-
expression((p1*((-1 + exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t))*((-mu2)*(mu2 - p1 + p2) + mu1*(mu2 + 2*p2)) - mu2*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) - exp(sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)))*t)* mu2*sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2))) + 2*exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2*t)*mu2* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2)/ exp((1/2)*(mu1 + mu2 + p1 + p2 + sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 + mu1*(mu2 + p2*t)/(2*(mu2*p1 + mu1*(mu2 + p2))* sqrt((mu1 + mu2 + p1 + p2)^2 - 4*(mu2*p1 +
Re: [R] How to add a new row in to an existing data set in R Language?
Hi, ?merge or ?rbind Regards Petr -----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of killerkarthick Sent: Friday, October 05, 2012 6:39 AM To: r-help@r-project.org Subject: [R] How to add a new row in to an existing data set in R Language? [...snip...]
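Following the `?rbind` pointer, a minimal sketch with the posted data (column names assumed from the original question):

```r
existing <- data.frame(center = 101:105,
                       drug   = c("P", "P", "D", "D", "P"),
                       change = c(18, 14, 12, 18, 10))

extra    <- data.frame(center = 106:110,
                       drug   = c("P", "D", "D", "P", "P"),
                       change = c(17, 18, NA, 13, 12))

# rbind() stacks the rows of data frames whose columns match by name.
combined <- rbind(existing, extra)
```

`merge()`, by contrast, joins on shared key columns and is the analogue of the SAS merge discussed elsewhere in this thread.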
Re: [R] barplot with some 0 frequencies
On Oct 4, 2012, at 9:05 PM, David Winsemius wrote: On Oct 4, 2012, at 4:49 PM, Guillaume2883 wrote: Hi all, I am back with a new question! I recorded the occurrence of 4 different events at 20 places over a given time period. Now I want to make barplots of the frequency of these events for each place, so it should be easy. My problem is that I want to see the frequencies of all 4 events on my barplots even if the frequency of some of them is 0. How can I do that? Put a big zero on the axis? barchart(abs(-3:3) ~ letters[1:7], origin=0) require(grid) trellis.focus("panel", 1, 1) grid.text(label=paste(0), x = convertX(unit(4, "native"), "npc"), y = convertY(unit(0, "native"), "npc")) trellis.unfocus() -- David Winsemius, MD Alameda, CA, USA
[R] Format of numbers in plotmath expressions.
I want to do something like: TH <- sprintf("%1.1f", c(0.3,0.5,0.7,0.9,1)) plot(1:10) legend("bottomright", pch=1:5, legend=parse(text=paste("theta ==", TH))) Notice that the final 1 comes out in the legend as just plain 1 and NOT as 1.0, although TH is [1] "0.3" "0.5" "0.7" "0.9" "1.0" I can get plotmath to preserve 1.0 as 1.0 and NOT convert it to 1 if I use substitute, as in text(2, 5, labels=substitute(list(theta == a), list(a=TH[5]))) but substitute doesn't work appropriately with vectors. Can anyone tell me how to get a 1.0 rather than 1 in the legend? Ta. cheers, Rolf Turner P.S. Just figured out a way using sapply(): leg <- sapply(TH, function(x){substitute(list(theta == a), list(a=x))}) plot(1:10) legend("bottomright", pch=1:5, legend=parse(text=leg)) Note that the use of parse (pace Thomas Lumley! :-) ) is required --- legend=leg does NOT work. Getting here required an enormous amount of trial and error. And it seems pretty kludgy. Is there a sexier way? R. T.
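One possibly less kludgy route (a sketch, not tested against older R versions): build the expression list with bquote() instead of round-tripping through parse(). Character strings substituted into a plotmath expression render without quotes, so the formatted "1.0" survives:

```r
TH <- sprintf("%1.1f", c(0.3, 0.5, 0.7, 0.9, 1))

# bquote() splices each formatted string into the plotmath call;
# as.expression() turns the resulting list into something legend() accepts.
leg <- as.expression(lapply(TH, function(a) bquote(theta == .(a))))

plot(1:10)
legend("bottomright", pch = 1:5, legend = leg)
```

The key point is that the values stay as character strings all the way through, so plotmath never gets a chance to reformat them as numbers.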
[R] many-facet rasch analysis in R
Dear R-list, is there any R package that I can use to carry out a many-facet Rasch analysis in R? Thank you. Best wishes Alain
Re: [R] Problem with colors in contour plot
On 10/04/2012 10:25 PM, Loukia Spineli wrote: Dear R users, I have a 51 by 51 matrix of p-values (named pvalue_MA). I want to present this matrix graphically in a filled contour plot where both axes represent probabilities. I have also added a grid to this plot. I want to highlight in white the cells of the grid that represent p-values smaller than the (common) significance threshold, 0.05. The code for this plot is colored in blue (I had to copy-paste all of it in order to get to the plot). I suspect that the problem might be in the col parameter of the filled.contour function. Honestly, I cannot understand why the plot appears to be completely white! I checked the values of my variable and they range from 0 to 1. Any suggestion/comment would really help me to move on with my project. Hi Loukia, Do you really need a contour plot for this? pvals <- matrix(runif(2601), nrow=51) library(plotrix) pcols <- ifelse(pvals <= 0.05, "white", "gray") color2D.matplot(pvals, cellcolors=pcols) Jim
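If plotrix is not available, base image() can paint the same two-colour map. This is a sketch: the breaks vector is derived from the 0.05 threshold in the question, and the axis labels are assumed placeholders.

```r
set.seed(1)
pvals <- matrix(runif(2601), nrow = 51)

# Two colours, three breaks: cells with p in (0, 0.05] are white,
# the rest gray. length(breaks) must be length(col) + 1.
image(x = seq(0, 1, length.out = 51),
      y = seq(0, 1, length.out = 51),
      z = pvals,
      breaks = c(0, 0.05, 1), col = c("white", "gray"),
      xlab = "probability 1", ylab = "probability 2")
```

Like color2D.matplot(), this drops the contour interpolation entirely and just colours each cell by its threshold class, which matches what the question actually asks for.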
Re: [R] Creating vegetation distance groups from one column
Thank you! That has worked for me when creating graphs. In plyr I used the script: # Veg Index data.to.analyze$VegIndex <- cut(data.to.analyze$Veg, breaks=seq(0, 35, 5), include.lowest=TRUE) VegIndex <- data.to.analyze$VegIndex plot(VegIndex) But the vegetation distances on the x-axis of the graph show up as: [-5,0] (0,5] (5,10] (10,15] (15,20] (20,25] (25,30] I am concerned these vegetation classes are not grouped properly and that there is overlap between the classes. It should read, preferably without brackets or with only one kind (): (-5-0) (1-5) (6-10) (11-15) (16-20) (21-25) (26-30) How do I fix this? Please advise, Jean -- View this message in context: http://r.789695.n4.nabble.com/Creating-vegetation-distance-groups-from-one-column-tp4644970p4645127.html
[R] Anova
Hi R-listers, I am trying to do an ANOVA for the following scatterplot and received the following error: library(car) scatterplot(HSuccess ~ Veg, data = data.to.analyze, xlab = "Vegetation border (m)", ylab = "Hatching success (%)") anova(HSuccess ~ Veg, data=data.to.analyze) Error in UseMethod("anova") : no applicable method for 'anova' applied to an object of class "formula" I am wondering if there is a better way to do this? Please advise, Jean -- View this message in context: http://r.789695.n4.nabble.com/Anova-tp4645130.html
[R] Error in lmer: asMethod(object) : matrix is not symmetric [1, 2]
Dear R Users, I am having trouble with lmer. I am looking at recombinant versus non-recombinant individuals. In the response variable, recombinant individuals are coded as 1's and non-recombinant as 0's. I built a model with 2 fixed factors and 1 random effect. Sex (males/females) is the first fixed effect and sexual genotype (XY, YY, WX and WY) the second one. Sexual genotype is nested in sex: XY and YY individuals are males and WX and WY females. I crossed 8 XY males with 8 WY females and 8 YY males with 8 WX females. Each cross corresponds to a family (i.e. families 1 to 8 for XY x WY and families 9 to 16 for YY x WX). For each family I have 20 offspring. Family is nested in sexual genotype as a random factor. My data look like: reps <- factor(sample(c(0,1), 640, replace=TRUE, prob=c(0.95,0.05))) sex <- factor(rep(c("M","F"), each=320)) geno <- factor(rep(c("XY","YY","WY","WX"), each=160)) fam <- factor(rep(c(1:8,9:16,1:8,9:16), each=20)) dat <- data.frame(reps, sex, geno, fam) and I built the following model: lmer(reps~sex+geno+(1|fam), data=dat, family=binomial) and tried also lmer(reps~sex/geno+(1|fam), data=dat, family=binomial) but I keep getting this error: asMethod(object) : matrix is not symmetric [1,2] Does someone have an idea where the error might come from? Thank you very much in advance! Olivier
[R] seasonal ARFIMA for fractional values for lag non-seasonal and seasonal lag
I need help: does anybody know how to analyse data using the seasonal ARFIMA method where the values of seasonal and non-seasonal d are fractional? Please help. -- View this message in context: http://r.789695.n4.nabble.com/seasonal-ARFIMA-for-fractional-values-for-lag-non-seasonal-and-seasonal-lag-tp4645131.html
Re: [R] Problem with colors in contour plot
Thank you very much, Jim! Your function colors the grid the way I want, but it can handle only one variable, if I have understood the details of this function correctly. The plot must display percentages on the axes and p-values in the grid. I have incorporated 3 different variables in a contour plot. On Fri, Oct 5, 2012 at 12:08 PM, Jim Lemon j...@bitwrit.com.au wrote: [...snip...]
Re: [R] problem with convergence in mle2/optim function
On 05-10-2012, at 07:12, Adam Zeilinger wrote: [...snip: original message quoted in full...]
Re: [R] Anova
Hello, You're thinking of ?aov. anova() does _not_ have a formula interface; it would be anova(lm(HSuccess ~ Veg, data = data.to.analyze)) or aov(HSuccess ~ Veg, data = data.to.analyze) Hope this helps, Rui Barradas Em 05-10-2012 09:27, Jhope escreveu: [...snip...]
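A self-contained sketch of both routes from the replies, with simulated stand-ins for Jean's data (the variable names HSuccess and Veg are taken from the thread; the numbers are invented):

```r
set.seed(42)
d <- data.frame(Veg = runif(30, 0, 30))
d$HSuccess <- 50 + 1.2 * d$Veg + rnorm(30, sd = 5)

# Route 1: fit the model first, then ask for the ANOVA table.
fit <- lm(HSuccess ~ Veg, data = d)
anova(fit)

# Route 2: aov() takes the formula directly; summary() prints the table.
summary(aov(HSuccess ~ Veg, data = d))
```

Both print the same one-way ANOVA table; the difference is only whether the model object is built explicitly (lm) or implicitly (aov).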
Re: [R] Anova
Dear Jean, On Fri, 5 Oct 2012 01:27:55 -0700 (PDT) Jhope jeanwaij...@gmail.com wrote: [...snip...] anova() needs a model object, not a formula, as its first argument: anova(lm(HSuccess ~ Veg, data=data.to.analyze)) Alternatively, you can use aov(), with summary(), to get the ANOVA table: summary(aov(HSuccess ~ Veg, data=data.to.analyze)) I hope this helps, John John Fox Sen. William McMaster Prof. of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/
Re: [R] Creating vegetation distance groups from one column
Hi, Try this:

data.to.analyze <- read.table(text="
Area Veg
456 0
3400 10
79 25
56 18
467 4
67 7
839 30
1120 16
3482 32
", header=TRUE)
data.to.analyze$VegIndex <- cut(data.to.analyze$Veg,
    breaks=c(-5,0,5,10,15,20,25,30,35),
    labels=c("(-5-0)", "(1-5)", "(6-10)", "(11-15)",
             "(16-20)", "(21-25)", "(26-30)", "(31-35)"))
VegIndex <- data.to.analyze$VegIndex
plot(VegIndex)

A.K. - Original Message - From: Jhope jeanwaij...@gmail.com To: r-help@r-project.org Cc: Sent: Friday, October 5, 2012 3:38 AM Subject: Re: [R] Creating vegetation distance groups from one column [...snip...]
Re: [R] data structure for plsr
Emma Jones evjo...@ualberta.ca writes: My current data structure consists of a .csv file read into R containing 15 columns (a charcoal dilution series going from 100% to 0%) and 1050 rows of absorbance data from 400 nm to 2500 nm at 2 nm intervals. I think I need to transpose the data such that the specific wavelengths become my columns and dilutions are defined in rows. Yes, you need to transpose the data so a column corresponds to a variable (response or predictor). Should I (and how do I) make my absorbance data into individual matrices that read into a data frame with only two columns? It is best to put all predictors (wavelengths) together in one matrix, yes. The same for the responses, if you have more than one response column. This is untested, so there might be errors: Assuming that your spectroscopic data is read into a data frame called origspec: ## This should create a matrix with the wavelengths as columns: spec <- t(as.matrix(origspec)) I don't know what your response is, so I'm just assuming it is in a vector called resp. # This would create a data frame suitable for plsr(): mydata <- data.frame(resp = resp, spec = I(spec)) Then you can analyse like this: plsr(resp ~ spec, data = mydata, ...) -- Bjørn-Helge Mevik
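A runnable sketch of the layout described in the reply, with made-up spectra standing in for the real absorbance data (the names origspec, resp, and mydata follow the reply; the dimensions follow the question):

```r
set.seed(1)
# Fake stand-in for the original layout: 1050 wavelengths in rows,
# 15 dilution columns.
origspec <- as.data.frame(matrix(runif(1050 * 15), nrow = 1050))

spec <- t(as.matrix(origspec))        # transpose: 15 samples x 1050 wavelengths
resp <- seq(100, 0, length.out = 15)  # e.g. the dilution percentages

# I() keeps the matrix together as a single "column" of the data frame,
# which is the structure plsr(resp ~ spec, data = mydata) expects.
mydata <- data.frame(resp = resp, spec = I(spec))
```

The pls-package step itself (plsr()) is omitted here so the sketch needs only base R; the point is the matrix-inside-data-frame structure.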
Re: [R] How to add a new row in to an existing data set in R Language?
Hi Karthick, I am not sure I understand your message correctly. Are you saying that the SAS merge as shown in the image (Nabble) is correct? The fourth row in the merged data (105 21JUN1980 21JAN2006) is incorrect. A.K. - Original Message - From: killerkarthick karthick@gmail.com To: r-help@r-project.org Cc: Sent: Friday, October 5, 2012 2:18 AM Subject: Re: [R] How to add a new row in to an existing data set in R Language? [...snip...]
Re: [R] Creating vegetation distance groups from one column
Hello, That now seems to be a presentation problem. See this example:

x <- 0:30 + runif(31)
y <- cut(x, breaks = seq(0, 35, 5))
l <- levels(y)
l1 <- sub("\\]", ")", l[1])
l2 <- as.numeric(sub("\\(([[:digit:]]+),.*", "\\1", l[-1])) + 1
l3 <- sub(".*,([[:digit:]]+).*", "\\1", l[-1])
l.new <- c(l1, paste0("(", l2, ",", l3, ")"))
levels(y) <- l.new
str(y)
barplot(table(y))

Instead of 'y' use data.to.analyze$VegIndex and it should give what you want. Hope this helps, Rui Barradas

Em 05-10-2012 08:38, Jhope escreveu: Thank you! That has worked for me when creating graphs. In plyr I used the script:

# Veg Index
data.to.analyze$VegIndex <- cut(data.to.analyze$Veg, breaks=seq(0, 35, 5), include.lowest=TRUE)
VegIndex <- data.to.analyze$VegIndex
plot(VegIndex)

But the vegetation distances on the x-axis in the graph are showing up as: [-5,0] (0,5] (5,10] (10,15] (15,20] (20,25] (25,30] I am concerned these vegetation classes are not grouped properly and there is overlap between the classes. It should read, preferably without brackets or only one kind (): (-5-0) (1-5) (6-10) (11-15) (16-20) (21-25) (26-30) How do I fix this? Please advise, Jean -- View this message in context: http://r.789695.n4.nabble.com/Creating-vegetation-distance-groups-from-one-column-tp4644970p4645127.html Sent from the R help mailing list archive at Nabble.com.
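If editing levels() after the fact feels fragile, cut() can also be given the finished labels directly through its labels= argument. A sketch with made-up data (the break points match the thread, the label style is one of the forms Jean asked for):

```r
## Build "1-5", "6-10", ... style labels up front, then hand them to cut()
breaks <- seq(0, 35, 5)
lab <- paste(head(breaks, -1) + 1, tail(breaks, -1), sep = "-")  # "1-5" "6-10" ...
lab[1] <- paste(breaks[1], breaks[2], sep = "-")                 # first bin stays "0-5"

x <- c(0.5, 3, 7, 12, 33)   # toy distances
y <- cut(x, breaks = breaks, labels = lab, include.lowest = TRUE)
table(y)
```

This skips the sub()/levels() round trip entirely; the factor is created with the display labels you want.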
[R] avoid <<- in specific case
Hi all, I improved a function drawing horizontal histograms (code below) which uses barplot and works fine. It does, however, need to assign a function to the global environment to later find the actual location on the vertical axis, and not the number of bars used by barplot. Hopefully, running the examples below will illustrate that. As said, it works perfectly fine and does exactly what I want. The problem arises now that I'm building a package (just for myself, for now) and run into R CMD check telling me 'no visible binding for '<<-' assignment'. (wording below) Similar problem as in http://r.789695.n4.nabble.com/R-CMD-check-tells-me-no-visible-binding-for-global-variable-what-does-it-mean-td1837236.html My question is: Is there a way to avoid assigning the function to the global environment with <<- but still have it available? I know it is generally not good practice. Or is it OK in a case like this, and is there a way to avoid the notes from the R CMD check (make the function package-compatible)? Or should I just ignore these notes? (I'm completely new to building packages and cannot judge the importance yet.) I'd be very thankful for any hints! Berry PS: I recently read about barcharts in lattice, but by now I'm already used to my function. (And I learned a lot writing it a couple of years ago.)

# Function
horiz.hist <- function(Data, breaks="Sturges", col="transparent", las=1,
                       ylim=range(HBreaks), labelat=pretty(ylim), labels=labelat,
                       border=par("fg"), ...)
{a <- hist(Data, plot=FALSE, breaks=breaks)
HBreaks <- a$breaks
HBreak1 <- a$breaks[1]
# Assign a function to the global environment with values calculated inside the main function.
hpos <<- function(Pos) (Pos-HBreak1)*(length(HBreaks)-1)/diff(range(HBreaks))
barplot(a$counts, space=0, horiz=TRUE, ylim=hpos(ylim), col=col, border=border, ...)
axis(2, at=hpos(labelat), labels=labels, las=las, ...)
print("use hpos() to address y-coordinates") }

# Data and basic concept
set.seed(8); ExampleData <- rnorm(50,8,5)+5
hist(ExampleData)
horiz.hist(ExampleData, xlab="absolute frequency")
# Caution: the labels at the y-axis are not the real coordinates!
# abline(h=2) will draw above the second bar, not at the label value 2. Use hpos:
abline(h=hpos(11), col=2)
# Further arguments
horiz.hist(ExampleData, xlim=c(-8,20))
horiz.hist(ExampleData, main="the ... argument worked!", col.axis=3)
hist(ExampleData, xlim=c(-10,40)) # with xlim
horiz.hist(ExampleData, ylim=c(-10,40), border="red") # with ylim
horiz.hist(ExampleData, breaks=20, col="orange")
axis(2, hpos(0:10), labels=FALSE, col=2) # another use of hpos()
# One shortcoming: doesn't work with breakpoints provided as a vector with different widths of the bars

Wording from the R CMD check when building a package:
* checking R code for possible problems ... NOTE
horiz.hist: no visible binding for '<<-' assignment to 'hpos'
horiz.hist: no visible global function definition for 'hpos'

[[alternative HTML version deleted]]
Re: [R] avoid <<- in specific case
On 05/10/2012 8:19 AM, Berry Boessenkool wrote: Hi all, I improved a function drawing horizontal histograms (code below) which uses barplot and works fine. It does, however, need to assign a function to the global environment to later find the actual location on the vertical axis, and not the number of bars used by barplot. Hopefully, running the examples below will illustrate that. As said, it works perfectly fine and does exactly what I want. The problem arises now that I'm building a package (just for myself, for now) and run into R CMD check telling me 'no visible binding for '<<-' assignment'. (wording below) Similar problem as in http://r.789695.n4.nabble.com/R-CMD-check-tells-me-no-visible-binding-for-global-variable-what-does-it-mean-td1837236.html My question is: Is there a way to avoid assigning the function to the global environment with <<- but still have it available? I know it is generally not good practice.

You can return the function as the value of your function. A bonus: if it is created within the body of your function, it will have access to all the local variables there. You shouldn't write to the global environment, because the global environment belongs to the user, not to you. If the user wants your function in the global environment s/he can just assign the value of your function to a variable there.

Duncan Murdoch

Or is it OK in a case like this, and is there a way to avoid the notes from the R CMD check (make the function package-compatible)? Or should I just ignore these notes? (I'm completely new to building packages and cannot judge the importance yet.) I'd be very thankful for any hints! Berry PS: I recently read about barcharts in lattice, but by now I'm already used to my function. (And I learned a lot writing it a couple of years ago.)

# Function
horiz.hist <- function(Data, breaks="Sturges", col="transparent", las=1,
                       ylim=range(HBreaks), labelat=pretty(ylim), labels=labelat,
                       border=par("fg"), ...)
{a <- hist(Data, plot=FALSE, breaks=breaks)
HBreaks <- a$breaks
HBreak1 <- a$breaks[1]
# Assign a function to the global environment with values calculated inside the main function.
hpos <<- function(Pos) (Pos-HBreak1)*(length(HBreaks)-1)/diff(range(HBreaks))
barplot(a$counts, space=0, horiz=TRUE, ylim=hpos(ylim), col=col, border=border, ...)
axis(2, at=hpos(labelat), labels=labels, las=las, ...)
print("use hpos() to address y-coordinates") }

# Data and basic concept
set.seed(8); ExampleData <- rnorm(50,8,5)+5
hist(ExampleData)
horiz.hist(ExampleData, xlab="absolute frequency")
# Caution: the labels at the y-axis are not the real coordinates!
# abline(h=2) will draw above the second bar, not at the label value 2. Use hpos:
abline(h=hpos(11), col=2)
# Further arguments
horiz.hist(ExampleData, xlim=c(-8,20))
horiz.hist(ExampleData, main="the ... argument worked!", col.axis=3)
hist(ExampleData, xlim=c(-10,40)) # with xlim
horiz.hist(ExampleData, ylim=c(-10,40), border="red") # with ylim
horiz.hist(ExampleData, breaks=20, col="orange")
axis(2, hpos(0:10), labels=FALSE, col=2) # another use of hpos()
# One shortcoming: doesn't work with breakpoints provided as a vector with different widths of the bars

Wording from the R CMD check when building a package:
* checking R code for possible problems ... NOTE
horiz.hist: no visible binding for '<<-' assignment to 'hpos'
horiz.hist: no visible global function definition for 'hpos'

[[alternative HTML version deleted]]
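Duncan's suggestion, returning the helper as the function's value so the closure carries its local variables, can be sketched like this (the names and break values are illustrative, not the actual package code):

```r
## A factory that builds the coordinate-converter as a closure:
## HBreak1 and HBreaks are captured from this call, no global needed.
make.hpos <- function(HBreaks) {
  HBreak1 <- HBreaks[1]
  function(Pos) (Pos - HBreak1) * (length(HBreaks) - 1) / diff(range(HBreaks))
}

## The user assigns the returned function wherever they like:
hpos <- make.hpos(seq(-5, 25, 5))
hpos(0)    # -> 1: value 0 sits one bar-width above the first break (-5)
```

In horiz.hist() itself, the same idea would be `invisible(hpos)` as the last line, so `h <- horiz.hist(...)` gives the caller the converter without R CMD check seeing any `<<-`.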
Re: [R] Kolmogorov-Smirnov test and the plot of max distance between two ecdf curves
Rui, Your response nearly answered a similar question of mine, except that I also have ecdfs of different lengths. Do you know how I can adjust

x <- seq(min(loga, logb), max(loga, logb), length.out=length(loga))

to account for this? It must be in length.out, but I'm unsure how to proceed. Any advice is much appreciated. -L

Rui Barradas wrote: Hello, Try the following. (I've changed the color of the first ecdf.)

loga <- log10(a+1) # do this
logb <- log10(b+1) # only once
f.a <- ecdf(loga)
f.b <- ecdf(logb)
# (2) max distance D
x <- seq(min(loga, logb), max(loga, logb), length.out=length(loga))
x0 <- x[which( abs(f.a(x) - f.b(x)) == max(abs(f.a(x) - f.b(x))) )]
y0 <- f.a(x0)
y1 <- f.b(x0)
plot(f.a, verticals=TRUE, do.points=FALSE, col="blue")
plot(f.b, verticals=TRUE, do.points=FALSE, col="green", add=TRUE)
## alternative, use standard R plot of ecdf
#plot(f.a, col="blue")
#lines(f.b, col="green")
points(c(x0, x0), c(y0, y1), pch=16, col="red")
segments(x0, y0, x0, y1, col="red", lty="dotted")
## alternative, down to x axis
#segments(x0, 0, x0, y1, col="red", lty="dotted")

Hope this helps, Rui Barradas

maxbre wrote: Hi all, given this example

#start
a <- c(0,70,50,100,70,650,1300,6900,1780,4930,1120,700,190,940,
760,100,300,36270,5610,249680,1760,4040,164890,17230,75140,1870,22380,5890,2430)
length(a)
b <- c(0,0,10,30,50,440,1000,140,70,90,60,60,20,90,180,30,90,
3220,490,20790,290,740,5350,940,3910,0,640,850,260)
length(b)
out <- ks.test(log10(a+1), log10(b+1))
# max distance D
out$statistic
f.a <- ecdf(log10(a+1))
f.b <- ecdf(log10(b+1))
plot(f.a, verticals=TRUE, do.points=FALSE, col="red")
plot(f.b, verticals=TRUE, do.points=FALSE, col="green", add=TRUE)
#inverse of ecdf a
x.a <- get("x", environment(f.a))
y.a <- get("y", environment(f.a))
# inverse of ecdf b
x.b <- get("x", environment(f.b))
y.b <- get("y", environment(f.b))
#end

I want to plot the max distance between the two ecdf curves as in the above given chart. Is that possible, and how? Thanks for your help. PS: this is an amended version of a previous thread (but no reply followed) that I've deleted from the Nabble repository because I realised it was not clear enough (now I hope it's a little better, sorry for that). -- View this message in context: http://r.789695.n4.nabble.com/Kolmogorov-Smirnov-test-and-the-plot-of-max-distance-between-two-ecdf-curves-tp4631437p4645140.html Sent from the R help mailing list archive at Nabble.com.
[R] glm (probit/logit) optimizer
Dear all, I am using the glm function in order to estimate a logit model, i.e. glm(Y ~ data[,2] + data[,3], family = binomial(link = "logit")). I also created a function that estimates a logit model and I would like to compare it with the glm function. So, does anyone know what optimizer or optimization method glm uses in order to derive the result? Thank you, Dimitris -- View this message in context: http://r.789695.n4.nabble.com/glm-probit-logit-optimizer-tp4645141.html Sent from the R help mailing list archive at Nabble.com.
[R] Calculating the mean in one column with empty cells
Hi all, I recently tried to calculate the mean and the median for just one column. In this column I have numbers with some empty cells due to missing data. So how can I calculate the mean just for the filled cells? I tried:

mean(dataSet2$ac_60d_4d_after_ann[!is.na(master$ac_60d_4d_after_ann)], na.rm=TRUE)

But the output was different to the calculation I did in Microsoft Excel. Thanks in advance, Felix -- View this message in context: http://r.789695.n4.nabble.com/Calculating-the-mean-in-one-column-with-empty-cells-tp4645135.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Calculating all possible ratios
Hello, Comment out the second apply and all following instructions using 'r2'. In the end return 'r1', not the cbind. Hope this helps, Rui Barradas

Em 04-10-2012 23:38, genome1976 escreveu: Hi Rui, A while ago you helped me with calculating all possible ratios from a dataset. This is the code I am using as suggested by you.

data <- read.table("new_data.txt", header=TRUE, row.names=1, sep="\t")
pairwise.ratios <- function(x, prefix="probeset", char=":"){
  n <- ncol(x)
  cn <- colnames(x)
  if(length(cn) == 0){
    cn <- gsub(" ", "0", formatC(seq.int(n), width=nchar(n)))
    cn <- paste(prefix, cn, sep="")
  }
  cmb <- combn(n, 2)
  r1 <- apply(cmb, 2, function(j) x[, j[1]]/x[, j[2]])
  r2 <- apply(cmb, 2, function(j) x[, j[2]]/x[, j[1]])
  colnames(r1) <- apply(cmb, 2, function(j) paste(cn[j], collapse=char))
  colnames(r2) <- apply(cmb, 2, function(j) paste(cn[rev(j)], collapse=char))
  cbind(r1, r2)[, order(c(colnames(r1), colnames(r2)))]
}
results <- pairwise.ratios(data.t)
write.table(t(results), "ratios_results.txt", sep="\t")

It works perfectly fine, only that it gives both pairs of ratios a:b and b:a for any two variables a and b. Can you suggest a way so that I get only one ratio and not both (a combination, without caring about the order, rather than a permutation)? Thanks for any help. Best Regards, Som. Date: Sat, 12 May 2012 15:20:52 -0700 From: ml-node+s789695n4629656...@n4.nabble.com To: genome1...@hotmail.com Subject: RE: Calculating all possible ratios Hello, Nothing wrong with me, maybe your R session has some conflicting objects.
Running the function in the previous post on the first 4 rows and first 6 columns of your dataset, the result was (copy & paste into your session):

result <- structure(c(8.74714923153198, 1.83094400392095, 9.92065138471113,
1.77145415014708, 1.01515180575001, 0.167175438316099, 0.222321656865252,
0.155576771874649, 3.09417748158541, 0.469647988505747, 1.29398633565582,
0.524043736521509, 3.75969597954255, 0.422694576901317, 9.75471698113208,
0.290397651827521, 4.9035575319622, 1.00105273231888, 1.01093964697178,
0.26895145631068, 0.114322960947685, 0.546166347992352, 0.100799832714726,
0.564507977763338, 0.11605516024473, 0.0913055986191245, 0.0224099858208782,
0.0878243288779063, 0.353735531392494, 0.256505926724138, 0.130433606169248,
0.295826869963301, 0.42981957664441, 0.230861553382365, 0.983273839877614,
0.163931791180376, 0.56058921623124, 0.546741314958369, 0.10190254729944,
0.151825242718447, 0.9850743448771, 5.98173996175908, 4.49798734905118,
6.4276947512815, 8.61659229879359, 10.9522309159971, 44.622964422,
11.3863665430362, 3.04799485560622, 2.8093121408046, 5.82033416762497,
3.36839317468124, 3.70358005398494, 2.52844904226946, 43.8765935747068,
1.86658746243623, 4.83036872336483, 5.98803713273998, 4.5471937427,
1.72873786407767, 0.323187666496628, 2.12925430210325, 0.772805687699305,
1.90823767237023, 2.82697074863659, 3.89854539725884, 7.66673581578674,
3.38035554418724, 0.328084543240185, 0.35595902124055, 0.1718114409242,
0.296877457036954, 1.21508737036511, 0.900024246342843, 7.53850076491586,
0.554147739185128, 1.58476931628683, 2.13149583692219, 0.781259909100518,
0.513223300970874, 0.265978952936953, 2.36577437858509, 0.102514506769826,
3.44355401535389, 2.32655759378615, 4.33160041310018, 1.01701068353905,
6.10009805175427, 0.270009014365446, 0.395499368696959, 0.0227911949977918,
0.535737017484743, 0.822986086753186, 1.11108117816092, 0.132652370966651,
1.8045729131197, 1.30424309801742, 2.36826490573261, 0.103635979283374,
0.926148867313916, 0.203933571388086, 0.998948374760994, 0.989178733859585,
3.71814309436142, 1.78383738225087, 1.82901853699522, 9.81329737579089,
6.58652001534723, 0.207023533247665, 0.166999632405824, 0.219915855047535,
0.578456699988768, 0.631006664328306, 0.469154094827586, 1.27998376513563,
1.9484696000908, 0.76672822844154, 0.422250060615857, 9.64915859255482,
1.07974002376127), .Dim = c(4L, 30L), .Dimnames = list(c("S1", "S2",
"S3", "S4"), c("P1:P2", "P1:P3", "P1:P4", "P1:P5", "P1:P6", "P2:P1",
"P2:P3", "P2:P4", "P2:P5", "P2:P6", "P3:P1", "P3:P2", "P3:P4", "P3:P5",
"P3:P6", "P4:P1", "P4:P2", "P4:P3", "P4:P5", "P4:P6", "P5:P1", "P5:P2",
"P5:P3", "P5:P4", "P5:P6", "P6:P1", "P6:P2", "P6:P3", "P6:P4", "P6:P5")))

Rui Barradas

genome1976 wrote: Hi Rui, Thanks once again. I really appreciate it. I tried using the code with the following dataset:

Sample     P1      P2      P3      P4      P5      P6      P7      P8      P9     P10
S1     5292.9   605.1   5213.9  1710.6  1407.8  1079.4  1379.6  9321.4  6951    1205.8
S2      104.6    57.129  625.69  222.72  247.46  104.49  330.29  1863.7  389.67  216.29
S3      191.29   19.282  860.42  147.83   19.61  189.22  203.27  1799    369.9   175.73
S4       41.553  23.457  267.09   79.293 143.09  154.5    52.567  613.54 408.86   61.715
S5
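Rui's suggestion (drop r2 and the cbind) amounts to something like the stripped-down variant below, each unordered pair then yields exactly one ratio. This is a sketch on toy numbers, not tested against the original dataset:

```r
## Combinations-only version: combn() already enumerates each pair once,
## so keeping only r1 gives a/b but never b/a.
pairwise.ratios1 <- function(x, char = ":") {
  cmb <- combn(ncol(x), 2)
  r1 <- apply(cmb, 2, function(j) x[, j[1]] / x[, j[2]])
  colnames(r1) <- apply(cmb, 2, function(j) paste(colnames(x)[j], collapse = char))
  r1
}

m <- cbind(P1 = c(2, 4), P2 = c(1, 2), P3 = c(4, 8))
pairwise.ratios1(m)
## columns are P1:P2, P1:P3, P2:P3 -- 3 ratios instead of 6
```

For n columns this returns choose(n, 2) ratio columns, half of what the cbind(r1, r2) version produced.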
[R] BCA Package
Hi... In R, how does the rankscore() function calculate the estimation/validation/holdout samples? I am waiting for a reply. Please reply. -- View this message in context: http://r.789695.n4.nabble.com/BCA-Package-tp4645146.html Sent from the R help mailing list archive at Nabble.com.
[R] Setting the desired reference category with contr.sum
Hi, I have 6 career types, represented as a factor in R, coded from 1 to 6. I need to use effect coding (also known as deviation coding), which is normally done by contr.sum, e.g.

contrasts(career) <- contr.sum(6)

However, this results in the 6th category being the reference, that is, being coded as -1:

$contrasts
  [,1] [,2] [,3] [,4] [,5]
1    1    0    0    0    0
2    0    1    0    0    0
3    0    0    1    0    0
4    0    0    0    1    0
5    0    0    0    0    1
6   -1   -1   -1   -1   -1

In fact, I need number 1 to be the reference. How do I achieve that? Thank you. Regards, Maxim -- View this message in context: http://r.789695.n4.nabble.com/Setting-the-desired-reference-category-with-contr-sum-tp4645150.html Sent from the R help mailing list archive at Nabble.com.
[R] Bayesian inference distribution parameters
Dear all, I'm having a hard time trying to estimate, in a Bayesian way, the parameters alpha and beta for a beta-distributed set of data (frequencies of occurrence). Do you have any suggestions or tutorials on how to estimate this? Best regards, JC [[alternative HTML version deleted]]
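One simple approach to JC's question, offered as a hedged sketch rather than a recommendation: a grid approximation to the posterior of (alpha, beta) under a flat prior, in base R. The data below are simulated, and the grid bounds are assumptions; dedicated tools (e.g. MCMC packages or Stan) would be the usual route for anything serious:

```r
## Grid-approximate posterior for Beta(alpha, beta) with a flat prior
set.seed(1)
dat <- rbeta(100, shape1 = 2, shape2 = 5)   # stand-in frequency data in (0,1)

alpha <- seq(0.5, 6, length.out = 120)      # assumed plausible ranges
beta  <- seq(0.5, 12, length.out = 120)

## log-likelihood on every (alpha, beta) grid point
loglik <- outer(alpha, beta,
                Vectorize(function(a, b) sum(dbeta(dat, a, b, log = TRUE))))

post <- exp(loglik - max(loglik))           # unnormalised posterior (flat prior)
post <- post / sum(post)                    # normalise over the grid

## posterior means of the two parameters
alpha.hat <- sum(rowSums(post) * alpha)
beta.hat  <- sum(colSums(post) * beta)
c(alpha.hat, beta.hat)                      # should land near the true (2, 5)
```

The same grid also yields credible intervals (cumulate the marginals) and is a quick sanity check before moving to a proper sampler.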
Re: [R] glm (probit/logit) optimizer
Dimitris.Kapetanakis dimitrios.kapetanakis at gmail.com writes: I am using the glm function in order to estimate a logit model, i.e. glm(Y ~ data[,2] + data[,3], family = binomial(link = "logit")). I also created a function that estimates a logit model and I would like to compare it with the glm function. So, does anyone know what optimizer or optimization method glm uses in order to derive the result?

Iteratively reweighted least squares. If you need the gory details, look at the code of glm.fit() by typing glm.fit at the prompt.
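For comparison purposes, the textbook IRLS iteration for a logit model fits in a few lines. This is the algorithm glm implements, not glm.fit()'s actual code (which adds deviance-based convergence checks, step-halving, and QR decomposition):

```r
## Minimal IRLS for logistic regression; X includes an intercept column.
irls.logit <- function(X, y, tol = 1e-10, maxit = 25) {
  beta <- rep(0, ncol(X))
  for (it in seq_len(maxit)) {
    eta <- drop(X %*% beta)
    mu  <- plogis(eta)                 # inverse logit
    w   <- mu * (1 - mu)               # working weights
    z   <- eta + (y - mu) / w          # working response
    beta.new <- solve(crossprod(X, w * X), crossprod(X, w * z))  # weighted LS
    if (max(abs(beta.new - beta)) < tol) { beta <- beta.new; break }
    beta <- beta.new
  }
  drop(beta)
}

set.seed(42)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(1 + 2 * x))
X <- cbind(1, x)
fit.glm <- glm(y ~ x, family = binomial(link = "logit"))
max(abs(irls.logit(X, y) - coef(fit.glm)))   # difference is tiny
```

If your hand-rolled estimator disagrees with glm, comparing it against this bare-bones IRLS helps locate whether the discrepancy is in the likelihood, the weights, or the convergence criterion.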
Re: [R] Kolmogorov-Smirnov test and the plot of max distance between two ecdf curves
Hello, Try

length.out = max(length(loga), length(logb))

Note also that all of the previous code, and the line above, assumes that we are interested in the max distance, whereas the KS statistic computes the supremum of the distance. If it's a two-sample test then their values are almost surely the same, but not if it's a one-sample test. Hope this helps, Rui Barradas

Em 05-10-2012 12:15, user1234 escreveu: Rui, Your response nearly answered a similar question of mine, except that I also have ecdfs of different lengths. Do you know how I can adjust

x <- seq(min(loga, logb), max(loga, logb), length.out=length(loga))

to account for this? It must be in length.out, but I'm unsure how to proceed. Any advice is much appreciated. -L

Rui Barradas wrote: Hello, Try the following. (I've changed the color of the first ecdf.)

loga <- log10(a+1) # do this
logb <- log10(b+1) # only once
f.a <- ecdf(loga)
f.b <- ecdf(logb)
# (2) max distance D
x <- seq(min(loga, logb), max(loga, logb), length.out=length(loga))
x0 <- x[which( abs(f.a(x) - f.b(x)) == max(abs(f.a(x) - f.b(x))) )]
y0 <- f.a(x0)
y1 <- f.b(x0)
plot(f.a, verticals=TRUE, do.points=FALSE, col="blue")
plot(f.b, verticals=TRUE, do.points=FALSE, col="green", add=TRUE)
## alternative, use standard R plot of ecdf
#plot(f.a, col="blue")
#lines(f.b, col="green")
points(c(x0, x0), c(y0, y1), pch=16, col="red")
segments(x0, y0, x0, y1, col="red", lty="dotted")
## alternative, down to x axis
#segments(x0, 0, x0, y1, col="red", lty="dotted")

Hope this helps, Rui Barradas

maxbre wrote: Hi all, given this example

#start
a <- c(0,70,50,100,70,650,1300,6900,1780,4930,1120,700,190,940,
760,100,300,36270,5610,249680,1760,4040,164890,17230,75140,1870,22380,5890,2430)
length(a)
b <- c(0,0,10,30,50,440,1000,140,70,90,60,60,20,90,180,30,90,
3220,490,20790,290,740,5350,940,3910,0,640,850,260)
length(b)
out <- ks.test(log10(a+1), log10(b+1))
# max distance D
out$statistic
f.a <- ecdf(log10(a+1))
f.b <- ecdf(log10(b+1))
plot(f.a, verticals=TRUE, do.points=FALSE, col="red")
plot(f.b, verticals=TRUE, do.points=FALSE, col="green", add=TRUE)
#inverse of ecdf a
x.a <- get("x", environment(f.a))
y.a <- get("y", environment(f.a))
# inverse of ecdf b
x.b <- get("x", environment(f.b))
y.b <- get("y", environment(f.b))
#end

I want to plot the max distance between the two ecdf curves as in the above given chart. Is that possible, and how? Thanks for your help. PS: this is an amended version of a previous thread (but no reply followed) that I've deleted from the Nabble repository because I realised it was not clear enough (now I hope it's a little better, sorry for that). -- View this message in context: http://r.789695.n4.nabble.com/Kolmogorov-Smirnov-test-and-the-plot-of-max-distance-between-two-ecdf-curves-tp4631437p4645140.html Sent from the R help mailing list archive at Nabble.com.
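With samples of different lengths, an alternative sketch is to skip the evaluation grid entirely: in the two-sample case both ecdfs are step functions that only jump at data points, so |F.a - F.b| is constant between pooled jump points and its supremum is attained at one of them. Shortened versions of the thread's vectors are used here for brevity:

```r
## Evaluate both ecdfs at every pooled jump point; no seq() grid needed,
## and the sample sizes may differ freely.
a <- c(0, 70, 50, 100, 70, 650, 1300, 6900, 1780)
b <- c(0, 0, 10, 30, 50, 440, 1000)
loga <- log10(a + 1); logb <- log10(b + 1)
f.a <- ecdf(loga); f.b <- ecdf(logb)

x  <- sort(unique(c(loga, logb)))   # all jump points of either ecdf
d  <- abs(f.a(x) - f.b(x))
D  <- max(d)                        # the two-sample KS distance
x0 <- x[which.max(d)]               # where it occurs, for plotting

## D matches ks.test's statistic (ties in the data trigger a warning):
suppressWarnings(all.equal(D, unname(ks.test(loga, logb)$statistic)))
```

This makes the max-vs-supremum caveat above moot for the two-sample case, since every point where the distance can change is inspected.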
Re: [R] (no subject)
DL, Looks like you have a typo in the expression() function. You had only one s. This works for me ...

m <- 10
beta.q <- matrix(rnorm(81), ncol=9)
for (i in 1:9){
  plot(seq(1/m, 1-1/m, 1/m), beta.q[, i], type="l", col=1,
       ylim=range(beta.q), xlab="quantile", ylab=expression(beta[i]))
}

Jean

dick liang dickliang...@gmail.com wrote on 10/04/2012 08:12:07 AM: producing a multi-figure plot, I am trying to add beta_1, beta_2, ... beta_9 to ylab using expression or substitution, but cannot work it out, like:

for (i in 1:9){
  plot(seq(1/m, 1-1/m, 1/m), beta.q[,i], type="l", col=1,
       ylim=range(beta.q), xlab="quantile", ylab=expresion(beta[i]))
}

any suggestions will be greatly appreciated. DL [[alternative HTML version deleted]]
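One caveat worth adding to Jean's fix: expression(beta[i]) renders a literal subscript "i" in every panel. Since DL wants beta_1, beta_2, ..., beta_9, the loop index has to be spliced into the plotmath call, which bquote() does:

```r
## bquote() substitutes the current value of i into the plotmath expression,
## so each panel's y-label reads beta with the panel number as subscript.
m <- 10
beta.q <- matrix(rnorm(81), ncol = 9)
for (i in 1:9) {
  plot(seq(1/m, 1 - 1/m, 1/m), beta.q[, i], type = "l", col = 1,
       ylim = range(beta.q), xlab = "quantile",
       ylab = bquote(beta[.(i)]))   # subscript is the number, not the letter i
}
```

substitute(beta[i], list(i = i)) wrapped in as.expression() is an equivalent spelling; bquote() is just shorter.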
Re: [R] Calculating the mean in one column with empty cells
Hi, see inline.

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of fxen3k Sent: Friday, October 05, 2012 11:28 AM To: r-help@r-project.org Subject: [R] Calculating the mean in one column with empty cells

Hi all, I recently tried to calculate the mean and the median for just one column. In this column I have numbers with some empty cells due to missing data. So how can I calculate the mean just for the filled cells? I tried: mean(dataSet2$ac_60d_4d_after_ann[!is.na(master$ac_60d_4d_after_ann)], na.rm=TRUE)

mean(dataSet2$ac_60d_4d_after_ann, na.rm=TRUE) shall suffice.

But the output was different to the calculation I died in Microsoft Excel.

Hm. I am also sometimes dying from Excel performance. There could be 2 options: Excel is wrong, or you did not transfer the values from Excel to R correctly and they are screwed up somehow. Which one is true is difficult to decide based on the information you revealed. Regards, Petr

Thanks in advance, Felix -- View this message in context: http://r.789695.n4.nabble.com/Calculating-the-mean-in-one-column-with-empty-cells-tp4645135.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Calculating the mean in one column with empty cells
On 05-10-2012, at 11:27, fxen3k wrote: Hi all, I recently tried to calculate the mean and the median for just one column. In this column I have numbers with some empty cells due to missing data. So how can I calculate the mean just for the filled cells? I tried: mean(dataSet2$ac_60d_4d_after_ann[!is.na(master$ac_60d_4d_after_ann)], na.rm=TRUE) But the output was different to the calculation I did in Microsoft Excel.

No data means no way to answer the question. What did you expect? What did Excel give you? But note you are trying to calculate the mean of dataSet2$ac_60d_4d_after_ann while indexing with the indices of non-NA entries of master$ac_60d_4d_after_ann, a different object.

mean(dataSet2$ac_60d_4d_after_ann, na.rm=TRUE)

should do what you want. Berend
Re: [R] Calculating the mean in one column with empty cells
fxen3k f.sehardt at gmail.com writes: Hi all, I recently tried to calculate the mean and the median for just one column. In this column I have numbers with some empty cells due to missing data. So how can I calculate the mean just for the filled cells? I tried: mean(dataSet2$ac_60d_4d_after_ann[!is.na(master$ac_60d_4d_after_ann)], na.rm=TRUE) But the output was different to the calculation I did in Microsoft Excel. Thanks in advance, Felix

Hi Felix, Assuming that you have the data frame formatted properly,

mean(yourdata$column, na.rm = TRUE)

should work. When converting an Excel table to R, I usually fill in blank cells with NA before importing. If you try to import a data frame with empty cells, you usually get an error using read.table(). But since you seem to have already got your data into R, that may not be the problem. HTH, Ken
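A minimal, self-contained illustration of the na.rm behaviour the replies above describe (toy numbers, not Felix's column):

```r
## NA propagates through mean() unless na.rm=TRUE drops it first
ac <- c(1.5, NA, 2.5, NA, 4.0)
mean(ac)                 # -> NA: missing values propagate
mean(ac, na.rm = TRUE)   # -> 2.666667: mean of the three observed values
```

Excel's AVERAGE() silently skips blank cells, so mean(..., na.rm = TRUE) on the same column is the comparable calculation; indexing a different data frame's NA pattern, as in the original post, is what produced the mismatch.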
Re: [R] How to add a new row in to an existing data set in R Language?
Hi, On Fri, Oct 5, 2012 at 12:27 AM, killerkarthick karthick@gmail.com wrote: Hi Ista Zahn, Thanks for your advice. Please see the following image. http://r.789695.n4.nabble.com/file/n4645113/data.png

To me this example looks plain wrong. You end up with a row that contains the date of birth for subject 104 and the date of informed consent for subject 105. If this really is the situation you face, I suggest correcting the IDs before merging.

I am expecting the result to be the same as in the image.

OK, then fix your ID numbers and use merge(DF1, DF2). Best, Ista -- View this message in context: http://r.789695.n4.nabble.com/How-to-add-a-new-row-in-to-an-existing-data-set-in-R-Language-tp4644855p4645113.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] avoid <<- in specific case
Hi Berry, You might look at scatterplot3d in the scatterplot3d package for an example of how a similar problem was handled, precisely as Duncan suggests.

Sarah

On Fri, Oct 5, 2012 at 8:25 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote: On 05/10/2012 8:19 AM, Berry Boessenkool wrote: Hi all, I improved a function drawing horizontal histograms (code below) which uses barplot and works fine. It does, however, need to assign a function to the global environment to later find the actual location on the vertical axis, and not the number of bars used by barplot. Hopefully, running the examples below will illustrate that. As said, it works perfectly fine and does exactly what I want. The problem arises now that I'm building a package (just for myself, for now) and run into R CMD check telling me 'no visible binding for '<<-' assignment'. (wording below) Similar problem as in http://r.789695.n4.nabble.com/R-CMD-check-tells-me-no-visible-binding-for-global-variable-what-does-it-mean-td1837236.html My question is: Is there a way to avoid assigning the function to the global environment with <<- but still have it available? I know it is generally not good practice.

You can return the function as the value of your function. A bonus: if it is created within the body of your function, it will have access to all the local variables there. You shouldn't write to the global environment, because globalenv belongs to the user, not to you. If the user wants your function in the global environment s/he can just assign the value of your function to a variable there.

Duncan Murdoch

Or is it OK in a case like this, and is there a way to avoid the notes from R CMD check (make the function package-compatible)? Or should I just ignore these notes? (I'm completely new to building packages and cannot judge the importance yet.) I'd be very thankful for any hints! Berry

PS: I recently read about barcharts in lattice, but by now I'm already used to my function. (And I learned a lot writing it a couple of years ago.)

# Function
horiz.hist <- function(Data, breaks="Sturges", col="transparent", las=1,
       ylim=range(HBreaks), labelat=pretty(ylim), labels=labelat,
       border=par("fg"), ... )
{a <- hist(Data, plot=FALSE, breaks=breaks)
HBreaks <- a$breaks
HBreak1 <- a$breaks[1]
hpos <<- function(Pos) (Pos-HBreak1)*(length(HBreaks)-1)/diff(range(HBreaks))
# Assign a function to the global environment with values calculated inside the main function.
barplot(a$counts, space=0, horiz=T, ylim=hpos(ylim), col=col, border=border, ...)
axis(2, at=hpos(labelat), labels=labels, las=las, ...)
print("use hpos() to address y-coordinates") }

# Data and basic concept
set.seed(8); ExampleData <- rnorm(50,8,5)+5
hist(ExampleData)
horiz.hist(ExampleData, xlab="absolute frequency")
# Caution: the labels at the y-axis are not the real coordinates!
# abline(h=2) will draw above the second bar, not at the label value 2. Use hpos:
abline(h=hpos(11), col=2)

# Further arguments
horiz.hist(ExampleData, xlim=c(-8,20))
horiz.hist(ExampleData, main="the ... argument worked!", col.axis=3)
hist(ExampleData, xlim=c(-10,40)) # with xlim
horiz.hist(ExampleData, ylim=c(-10,40), border="red") # with ylim
horiz.hist(ExampleData, breaks=20, col="orange")
axis(2, hpos(0:10), labels=F, col=2) # another use of hpos()
# One shortcoming: doesn't work with breakpoints provided as a vector with different widths of the bars

Wording from the R CMD check when building a package:
* checking R code for possible problems ... NOTE
horiz.hist: no visible binding for '<<-' assignment to 'hpos'
horiz.hist: no visible global function definition for 'hpos'

-- Sarah Goslee http://www.functionaldiversity.org
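Duncan's suggestion (return the function as the value of your function) can be sketched in a few lines. `make.hpos()` is an invented name for illustration, and the sketch omits the plotting side of Berry's horiz.hist:

```r
# Sketch of the suggested alternative to <<-: build hpos inside a function
# and return it. The returned closure still sees HBreaks/HBreak1 in its
# enclosing environment, so no global assignment is needed.
make.hpos <- function(Data, breaks = "Sturges") {
  a <- hist(Data, plot = FALSE, breaks = breaks)
  HBreaks <- a$breaks
  HBreak1 <- HBreaks[1]
  function(Pos) (Pos - HBreak1) * (length(HBreaks) - 1) / diff(range(HBreaks))
}

set.seed(8)
ExampleData <- rnorm(50, 8, 5) + 5
hpos <- make.hpos(ExampleData)   # the *caller* decides where to store it
hpos(pretty(range(ExampleData)))
```

This also silences the R CMD check NOTE, since nothing is assigned outside the function's own frame.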
Re: [R] Setting the desired reference category with contr.sum
Hi, The contr.* functions return matrices, so you can create any arbitrary matrix you like. For example

contrasts(career) <- cbind(c1=c(-1, 1, 0, 0, 0, 0),
                           c2=c(-1, 0, 1, 0, 0, 0),
                           c3=c(-1, 0, 0, 1, 0, 0),
                           c4=c(-1, 0, 0, 0, 1, 0),
                           c5=c(-1, 0, 0, 0, 0, 1))

Best, Ista

On Fri, Oct 5, 2012 at 8:01 AM, autarkie kovla...@hotmail.com wrote: Hi, I have 6 career types, represented as a factor in R, coded from 1 to 6. I need to use effect coding (also known as deviation coding), which is normally done by contr.sum, e.g.

contrasts(career) <- contr.sum(6)

However, this results in the 6th category being the reference, that is, being coded as -1:

$contrasts
  [,1] [,2] [,3] [,4] [,5]
1    1    0    0    0    0
2    0    1    0    0    0
3    0    0    1    0    0
4    0    0    0    1    0
5    0    0    0    0    1
6   -1   -1   -1   -1   -1

In fact, I need number 1 to be the reference. How do I achieve that? Thank you. Regards, Maxim -- View this message in context: http://r.789695.n4.nabble.com/Setting-the-desired-reference-category-with-contr-sum-tp4645150.html Sent from the R help mailing list archive at Nabble.com.
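The same matrix can also be produced mechanically from contr.sum() itself, by moving its -1 row from the last level to the first. This is a sketch; `career` here is a made-up 6-level factor standing in for Maxim's variable:

```r
# Build the deviation-coding matrix with level 1 as reference by
# reordering the rows of contr.sum(6); the result equals Ista's cbind().
career <- factor(rep(1:6, length.out = 60))  # invented example factor
m <- contr.sum(6)[c(6, 1:5), ]               # row 1 now carries the -1s
contrasts(career) <- m
contrasts(career)
```

Either way, the key point is that `contrasts<-` accepts any full-rank coding matrix, not just the ones the contr.* helpers emit.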
Re: [R] help with making figures
It's helpful to provide reproducible code in your posting to R-help. The dput() function can be used to share some of your data. For example, you might have used dput(mydata[1:10, 1:10]).

# here's some data I made up as an example ...
df <- structure(list(
  `2000` = c(44L, 31L, 55L, 83L, 39L, 12L, 21L, 20L, 52L, 63L, 92L, 90L, 22L, 71L, 23L, 46L, 84L, 9L, 98L, 47L),
  `2001` = c(88L, 11L, 61L, 86L, 6L, 78L, 97L, 70L, 10L, 72L, 14L, 37L, 94L, 60L, 8L, 19L, 73L, 57L, 2L, 30L),
  `2002` = c(29L, 87L, 56L, 17L, 4L, 95L, 3L, 77L, 53L, 24L, 79L, 48L, 59L, 42L, 54L, 28L, 25L, 18L, 43L, 15L),
  `2003` = c(16L, 40L, 58L, 65L, 13L, 38L, 76L, 41L, 1L, 66L, 32L, 45L, 5L, 51L, 33L, 82L, 68L, 74L, 91L, 69L),
  `2004` = c(67L, 7L, 75L, 80L, 99L, 89L, 81L, 93L, 62L, 85L, 64L, 35L, 100L, 34L, 50L, 49L, 27L, 96L, 36L, 26L)),
  .Names = c("2000", "2001", "2002", "2003", "2004"),
  row.names = c("Site A", "Site B", "Site C", "Site D", "Site E", "Site F", "Site G", "Site H", "Site I", "Site J", "Site K", "Site L", "Site M", "Site N", "Site O", "Site P", "Site Q", "Site R", "Site S", "Site T"),
  class = "data.frame")
# transpose the data (switch columns and rows)
df.turned <- as.data.frame(t(df))
# site names
sites <- names(df.turned)
# years
year <- as.numeric(dimnames(df.turned)[[1]])
# a separate plot for each site
for(i in seq(sites)) {
  plot(year, df.turned[, i], type="b", xlab="Year", ylab="My data", main=sites[i])
}

Jean

megalops megalop...@hotmail.com wrote on 10/04/2012 03:01:17 PM: I need to make about 30 figures and I am trying to create a program in R that will make my life a lot easier. First I will tell you how my data is set up. I have 30 sites and then data for each year at the site. I have 10 years of data for each site. Below is a small chunk of my data to show the format.

       2000 2001 2002 2003 2004
Site A   50   75   25   55   60
Site B   58   22   68   77   30

I am trying to write a program in R that will create figures showing the annual data for each individual site, as opposed to making 30 individual graphs in Excel. Any help would be greatly appreciated.
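If the 30 figures should end up in files rather than on the screen, Jean's loop can be wrapped around a graphics device. A sketch on a two-site subset; the file-naming scheme is invented:

```r
# Write each site's figure to its own PDF file so all figures can be
# generated unattended; data are the two example rows from the post.
df.turned <- data.frame("Site A" = c(50, 75, 25, 55, 60),
                        "Site B" = c(58, 22, 68, 77, 30),
                        check.names = FALSE)
year <- 2000:2004
for (site in names(df.turned)) {
  pdf(paste0(gsub(" ", "_", site), ".pdf"), width = 6, height = 4)
  plot(year, df.turned[[site]], type = "b",
       xlab = "Year", ylab = "My data", main = site)
  dev.off()
}
list.files(pattern = "\\.pdf$")
```

png() works the same way if bitmap output is preferred.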
Re: [R] avoid <<- in specific case
I agree with this. But there is (in my opinion) a much more dangerous point: a<-b. If there is an object called b in the function, it uses this b. But if you forgot to initialise b in the function, it looks in the global environment. So the following happens:

b <- 100
# inside a function:
a <- b   # there is no b in the function, so the global b is used; a is 100
b <- 9   # writes to b in the local environment

Now b in the global environment is 100, while b in the function is 9. It would be more logical if a <- b gave an error if b does not exist in the function (just like Pascal does).
---
Frans

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Duncan Murdoch
Sent: Friday, October 5, 2012 14:26
To: Berry Boessenkool
CC: R Help
Subject: Re: [R] avoid <<- in specific case

On 05/10/2012 8:19 AM, Berry Boessenkool wrote:
> Hi all, I improved a function drawing horizontal histograms (code below) which uses barplot and works fine. It does, however, need to assign a function to the global environment to later find the actual location on the vertical axis, and not the number of bars used by barplot. [...] My question is: Is there a way to avoid assigning the function to the global environment with <<- but still have it available? I know it is generally not good practice.

You can return the function as the value of your function. A bonus: if it is created within the body of your function, it will have access to all the local variables there. You shouldn't write to the global environment, because globalenv belongs to the user, not to you. If the user wants your function in the global environment s/he can just assign the value of your function to a variable there.

Duncan Murdoch
Re: [R] Error in lmer: asMethod(object) : matrix is not symmetric [1, 2]
Olivier Blaser olivier.blaser at unil.ch writes:

I am having trouble with lmer. I am looking at recombinant versus non-recombinant individuals. In the response variable recombinant individuals are coded as 1's and non-recombinant as 0's. I built a model with 2 fixed factors and 1 random effect. Sex (males/females) is the first fixed effect and sexual genotype (XY, YY, WX and WY) the second one. Sexual genotype is nested in sex. XY and YY individuals are males and WX and WY females. I crossed 8 XY males with 8 WY females and 8 YY males with 8 WX females. Each cross corresponds to a family (i.e. family 1 to 8 for XY x WY and family 9 to 16 for YY x WX). For each family I have 20 offspring. Family is nested in sexual genotype as a random factor. My data looks like:

reps <- factor(sample(c(0,1), 640, replace=TRUE, prob=c(0.95, 0.05)))
sex  <- factor(rep(c("M","F"), each=320))
geno <- factor(rep(c("XY","YY","WY","WX"), each=160))
fam  <- factor(rep(c(1:8, 9:16, 1:8, 9:16), each=20))
dat  <- data.frame(reps, sex, geno, fam)
with(dat, table(sex, geno))
with(dat, table(sex, geno, fam))

Follow-ups to this question should go to r-sig-mixed-models.

The slightly obscure error message is caused by a glitch in lme4's printing code, where it tries to print a model that hasn't been successfully fitted in the first place. I actually get a **slightly** more informative error message (maybe with a slightly later version of the package):

g1 <- glmer(reps~sex+geno+(1|fam), data=dat, family=binomial)
Error in mer_finalize(ans) : Downdated X'X is not positive definite, 1.

[although lmer() automatically calls glmer() for GLMMs, I prefer to use glmer() explicitly]

To diagnose/show what the problem is, let's try fitting in glm. It works, but the answer indicates that your design matrix is rank-deficient/overfitted:

(g0 <- glm(reps~sex+geno, data=dat, family=binomial))

i.e., the 'genoYY' estimate is NA, because once 'sexM', 'genoWY', and 'genoXY' parameters are fitted the model is completely determined.

Unlike [g]lm, [g]lmer can't handle rank-deficient/overfitted fixed-effect design matrices (as I understand it, because the somewhat fancier/more efficient linear algebra primitives used in lme4 don't handle this case). Therefore, you have to figure out a way to code your design matrix that will work. At the moment I don't know of an elegant way to do that, but here's a hack:

## extract model matrix and drop intercept and genoYY dummy variables
mm <- model.matrix(g0)[, 2:4]
dat2 <- data.frame(reps, mm, fam)
g2 <- glmer(reps~sexM+genoWY+genoXY+(1|fam), data=dat2, family=binomial)

I don't know if this gives you the answer you want or not ...
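The rank deficiency described above can be checked directly, without fitting anything, by comparing the number of columns of the fixed-effects model matrix with its numerical rank. A sketch using the design vectors from the post:

```r
# Diagnose the overfitted design before calling [g]lmer: the sexM column
# is exactly genoXY + genoYY, so one of the five columns is redundant.
sex  <- factor(rep(c("M", "F"), each = 320))
geno <- factor(rep(c("XY", "YY", "WY", "WX"), each = 160))
X <- model.matrix(~ sex + geno)
ncol(X)      # 5 columns requested
qr(X)$rank   # only 4 are linearly independent
```

When the two numbers differ, the model matrix needs recoding (as in the hack above) before lme4 will accept it.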
[R] Question about the Anova function (car package) - repeated measures
Hi everyone, I am running a repeated-measures ANOVA using the Anova function from the car package. I measured algal biovolume every two days for 10 days (in the dataset below I only include up to day 4). I have 2 treatments (c, t) and the experiment was run in triplicate (A, B, C).

Pa2
   Day Type Replicate logbiovolume
1    0    c         A        19.34
2    0    c         B        18.27
3    0    c         C        18.56
4    0    t         A        18.41
5    0    t         B        18.68
6    0    t         C        18.86
7    2    c         A        18.81
8    2    c         B        18.84
9    2    c         C        18.52
10   2    t         A        18.29
11   2    t         B        17.91
12   2    t         C        17.67
13   4    c         A        19.16
14   4    c         B        18.85
15   4    c         C        19.36
16   4    t         A        19.05
17   4    t         B        19.09
18   4    t         C        18.26
.
.
.

Pa2.teste = within(Pa2, {group = factor(Type)
                         time = factor(Day)
                         id = factor(Replicate)})
matrix = with(Pa2.teste, cbind(Pa2[,VAR][group=="c"], Pa2[,VAR][group=="t"]))
matrix
       [,1]  [,2]
 [1,] 19.34 18.41
 [2,] 18.27 18.68
 [3,] 18.56 18.86
 [4,] 18.81 18.29
 [5,] 18.84 17.91
 [6,] 18.52 17.67
 [7,] 19.16 19.05
 [8,] 18.85 19.09
 [9,] 19.36 18.26
[10,] 19.63 18.96
[11,] 19.94 18.06
[12,] 19.54 18.37
[13,] 19.98 17.96
[14,] 20.99 17.93
[15,] 20.45 17.74
[16,] 21.12 17.60
[17,] 21.66 17.33
[18,] 21.51 18.12

model <- lm(matrix ~ 1)
design <- factor(c("c","t"))
options(contrasts=c("contr.sum", "contr.poly"))
aov <- Anova(model, idata=data.frame(design), idesign=~design, type="III")
summary(aov, multivariate=F)

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

                 SS num Df Error SS den Df         F    Pr(>F)
(Intercept) 12951.2      1   6.3312     17 34775.336 < 2.2e-16 ***
design         19.1      1  17.3901     17    18.697 0.0004606 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The problem is that I think this function is not taking the days or the replicates into account. How do I bring them into the analysis? Do you know of a corresponding non-parametric function for this test? Something like a Friedman test with two groups (treatment and replicate) and one block (time)?

Many thanks
Diego PJ
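One way to bring Day and Replicate into the analysis (a sketch under assumptions, not a definitive recommendation) is to keep the data in long format and let aov() treat Day as a within-unit factor, with the replicate-within-treatment combinations as the error units. The data below are the 18 rows shown in the post:

```r
# Long-format repeated-measures sketch: each Type x Replicate combination
# is one experimental unit measured at three Days.
Pa2 <- data.frame(
  Day  = factor(rep(c(0, 2, 4), each = 6)),
  Type = factor(rep(c("c", "c", "c", "t", "t", "t"), times = 3)),
  Replicate = factor(rep(c("A", "B", "C"), times = 6)),
  logbiovolume = c(19.34, 18.27, 18.56, 18.41, 18.68, 18.86,
                   18.81, 18.84, 18.52, 18.29, 17.91, 17.67,
                   19.16, 18.85, 19.36, 19.05, 19.09, 18.26))
Pa2$unit <- interaction(Pa2$Type, Pa2$Replicate)  # 6 units
fit <- aov(logbiovolume ~ Type * Day + Error(unit/Day), data = Pa2)
summary(fit)

# For the non-parametric question: a Friedman test needs one observation
# per group/block cell, so average over replicates first (an assumption
# about what the poster wants).
agg <- aggregate(logbiovolume ~ Type + Day, data = Pa2, mean)
friedman.test(logbiovolume ~ Type | Day, data = agg)
```

With only 3 days and 2 treatments the Friedman test has very little power; the aov() stratification is closer to the SAS-style repeated-measures layout the poster seems to expect.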
Re: [R] avoid <<- in specific case
On 05/10/2012 9:38 AM, Frans Marcelissen wrote:
> I agree with this. But there is (in my opinion) a much more dangerous point: a<-b. If there is an object called b in the function, it uses this b. But if you forgot to initialise b in the function, it looks in the global environment. [...] It would be more logical if a <- b gave an error if b does not exist in the function (just like Pascal does).

If functions didn't have access to objects outside the local frame, they couldn't do much of anything at all. Unlike Pascal, functions in R are first class objects. Even <<- is an object (found in the base package). So functions need to be able to look outside themselves to find things like mean(), var(), +, -, etc.

Duncan Murdoch
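The scoping behaviour Frans and Duncan are discussing can be demonstrated in a few lines:

```r
# <- binds in the local frame, a free-variable read falls through to the
# enclosing/global environment, and <<- reassigns the outer binding.
b <- 100
f <- function() {
  a <- b              # no local b yet: the global b (100) is read
  b <- 9              # creates a local b; the global b is untouched
  c(a = a, local_b = b)
}
f()                   # a = 100, local_b = 9
b                     # still 100

g <- function() b <<- 9   # <<- walks outward and reassigns the global b
g()
b                     # now 9
```

This is exactly why `f()`'s `a <- b` cannot raise an error: the same lookup rule that finds the global `b` is the one that finds `mean`, `+`, and every other function a body calls.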
Re: [R] avoid <<- in specific case
Inline.

On Fri, Oct 5, 2012 at 6:38 AM, Frans Marcelissen frans.marcelis...@digipsy.nl wrote:
> I agree with this. But there is (in my opinion) a much more dangerous point: a<-b. If there is an object called b in the function, it uses this b. But if you forgot to initialise b in the function, it looks in the global environment. [...] It would be more logical if a <- b gave an error if b does not exist in the function (just like Pascal does).

Oy! What you find logical others (like me) would find objectionable. You have just asked to break R's entire scoping procedure. Do you really think that is reasonable? It is the duty of a programmer to learn and use the paradigms of the language in which he/she programs. These will, of course, be different from language to language -- appropriately so. To complain that one language does not follow the procedures of another, especially without understanding the big picture of each language's design (which few of us do), just does not make sense to me. All IMHO, of course.

Cheers, Bert
Re: [R] Question about the Anova function (car package) - repeated measures
Homework? We don't do homework. You could at least have tried to run that code...

Rui Barradas

On 05-10-2012 14:56, Diego Pujoni wrote:
> Hi everyone, I am running a repeated-measures ANOVA using the Anova function from the car package. I measured algal biovolume every two days for 10 days. I have 2 treatments (c, t) and the experiment was run in triplicate (A, B, C). [...] The problem is that I think this function is not taking the days or the replicates into account. How do I bring them into the analysis? Do you know of a corresponding non-parametric function for this test? Something like a Friedman test with two groups (treatment and replicate) and one block (time)?
>
> Many thanks
> Diego PJ
Re: [R] Calculating the mean in one column with empty cells
Hi, I am not sure what your dataset looks like:

set.seed(1)
dat1 <- data.frame(col1=c(sample(1:50, 5, replace=TRUE), NA,
                          sample(1:25, 3, replace=TRUE), NA,
                          sample(1:25, 2, replace=TRUE)),
                   col2=c(NA, NA, rnorm(8, 15), NA, NA))
mean(dat1$col1[!is.na(dat1$col1)])
#[1] 20.1
mean(dat1$col1, na.rm=TRUE)
#[1] 20.1

But there is one obvious problem: your code mixes dataSet2 and master. It looks like you have two datasets.

A.K.

----- Original Message -----
From: fxen3k f.seha...@gmail.com
To: r-help@r-project.org
Sent: Friday, October 5, 2012 5:27 AM
Subject: [R] Calculating the mean in one column with empty cells

Hi all, I recently tried to calculate the mean and the median just for one column. In this column I have numbers with some empty cells due to missing data. So how can I calculate the mean just for the filled cells? I tried:

mean(dataSet2$ac_60d_4d_after_ann[!is.na(master$ac_60d_4d_after_ann)], na.rm=TRUE)

But the output was different to the calculation I did in Microsoft Excel. Thanks in advance, Felix -- View this message in context: http://r.789695.n4.nabble.com/Calculating-the-mean-in-one-column-with-empty-cells-tp4645135.html Sent from the R help mailing list archive at Nabble.com.
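The mixed dataSet2/master subsetting is very likely why the result differs from Excel; the NA mask comes from one object but is applied to another. A sketch on invented toy objects (the column name is the poster's, the values are made up):

```r
# Toy illustration of the bug: the mask from 'master' drops/keeps the
# wrong rows of 'dataSet2'.
dataSet2 <- data.frame(ac_60d_4d_after_ann = c(1, 2, NA, 4))
master   <- data.frame(ac_60d_4d_after_ann = c(NA, 2, 3, 4))

# wrong: master's NA pattern drops row 1 and keeps dataSet2's NA in row 3
mean(dataSet2$ac_60d_4d_after_ann[!is.na(master$ac_60d_4d_after_ann)],
     na.rm = TRUE)                               # 3
# right: let mean() skip the NAs of the column itself
mean(dataSet2$ac_60d_4d_after_ann, na.rm = TRUE)  # 2.333...
```

In short: subset a column only with a mask built from that same column, or simply rely on na.rm = TRUE alone.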
Re: [R] jointModel error messages
I contacted the package developer and that led to me removing events at time 0 (or subjects with only 1 longitudinal measurement). I then still had the error message "Can't fit a Cox model with 0 failures", which I have managed to avoid by adding 1.8*10^(-15) to all my survival times; any number greater than this also works but nothing smaller! Any explanation of this would help!

From there I have been able to fit a few models, but I run into problems when I try to include a variable called logrna (=log(rna)) in the fixed effects of the lme model; it produces the error "Error in if (t1 || t2) { : missing value where TRUE/FALSE needed". However, if I include the term exp(logrna) in the fixed effects, the joint model works fine, but this is not what I want! This is where I really need help; what is happening?

Charles

Charles Graham wrote:
I am trying to use the jointModel function in the JM package to fit a simple joint model to longitudinal and survival data. I have come across a range of errors when trying different things and just can't seem to get around them all. The code I use is as follows:

fitLME = lme(cd4~trt+time, random=~time|num, data=mnuts2); summary(fitLME)
fitSURV = coxph(Surv(fail.time, SI.code)~trt, x=TRUE, data=cov); summary(fitSURV)
fitJM = jointModel(fitLME, fitSURV, timeVar="time", method="piecewise-PH-GH"); summary(fitJM)

Both the lme and coxph functions work fine and both give the same sample size (though sometimes I get the unequal sample size error that I see others have experienced without a solution). My current error says "Can't fit a Cox model with 0 failures" despite the coxph function working fine. Previously I had an error message saying there were longitudinal measurements after the event (which I wouldn't have thought would be a problem!); I dealt with that by removing all measurements after the failure time. I want to know if there is somewhere I can find all the requirements (in terms of data structures, lme and coxph limitations etc.) for the jointModel function to work, or someone who can help me with my errors.

Charles -- View this message in context: http://r.789695.n4.nabble.com/jointModel-error-messages-tp4644812p4645164.html Sent from the R help mailing list archive at Nabble.com.
[R] R: machine for moderately large data
Dear all, I would like to ask your advice about a suitable computer for the following usage. I (am starting to) work with moderately big data in R:
- approx. 2-20 million rows * 100-1000 columns (market basket data)
- mainly clustering, classification trees, association analysis (e.g. libraries rpart, cba, proxy, party)
Can you recommend a sufficient computer for this volume? I am routinely working in Windows but feel that a Mac or some Linux machine might be needed. Please respond directly to my email. Many thanks! Zdenek Skala zdenek.sk...@gfk.com
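Before choosing hardware, it helps to estimate how much RAM the data would need if held as an ordinary dense numeric matrix (R stores 8 bytes per cell, and most algorithms want 2-3 working copies):

```r
# Back-of-the-envelope RAM estimate for the upper end of the stated range.
rows <- 20e6
cols <- 1000
gb_one_copy <- rows * cols * 8 / 2^30
gb_one_copy   # roughly 149 GB for a single dense copy
```

Since ~149 GB per copy will not fit on any desktop of this era regardless of OS, the usual suggestions (offered here as options to investigate, not requirements) are: exploit sparsity (market-basket data is mostly zeros; the arules package uses sparse matrices for association analysis), work on samples, or use out-of-core packages such as ff or bigmemory.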
[R] Minute Difference
Hi, I have a time along with a date, e.g. 10/5/2012 5:05:00 AM. I need to subtract 10 minutes from it, like this: 10/5/2012 4:55:00 AM. Thanks in advance, Antony -- View this message in context: http://r.789695.n4.nabble.com/Minute-Difference-tp4645157.html
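A minimal sketch of one way to do this in base R (the format string assumes month/day/year with a 12-hour clock, as in the example):

```r
# Parse the timestamp; %I is the 12-hour clock hour, %p the AM/PM marker
t1 <- as.POSIXct("10/5/2012 5:05:00 AM",
                 format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

# POSIXct arithmetic is in seconds, so 10 minutes is 10 * 60
t2 <- t1 - 10 * 60

format(t2, "%m/%d/%Y %I:%M:%S %p")  # "10/05/2012 04:55:00 AM"
```

The same arithmetic works on a whole vector of timestamps at once.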
Re: [R] Calculating the mean in one column with empty cells
I imported the whole dataset with read.csv2() and it works fine (2 for German is correct ;) ). I already checked the numbers, and I also tried to calculate the mean of a range of numbers where no NA is given (as mentioned in my last post above). -- View this message in context: http://r.789695.n4.nabble.com/Calculating-the-mean-in-one-column-with-empty-cells-tp4645135p4645166.html
[R] Missing data (Na) and chi-square tests
Dear everyone, I am a bit of a computer imbecile and am having problems with R. I am using R in my research project to do chi-square tests on data imported from Excel. However, I have some missing data in one of my variables (columns), and I need R to exclude these and run the chi-square test on the data that I have. I use a formula to make 2x2 tables, which is: data <- matrix(c(sum(!Variable[Group==1]), sum(Variable[DAAC==1]), sum(!Variable[Group==0]), sum(Variable[DAAC==0])), 2, 2) How can I get R to ignore NAs in this formula? Many regards -- View this message in context: http://r.789695.n4.nabble.com/Missing-data-Na-and-chi-square-tests-tp4645167.html
Re: [R] Calculating the mean in one column with empty cells
I'm sorry! Now I tried it again with just these numbers (random numbers) and Excel gives a different output than R. Here are the numbers I used: 0,2006160108532920 0,1321167173880490 0,0563941428921262 0,0264198664609803 0,0200581303857603 -0,2971754213679500 -0,2353086361784190 0,0667195538296534 0,1755852636926560 And this is the command in R: nums <- as.numeric(as.character(dataSet2$ac_bhar_60d_4d_after_ann[2:10])) m <- mean(nums, na.rm = TRUE) m The output of R is: print(m, digits = 12) [1] 0.01667 The output in Excel is: 0,0161584031062386 The numbers are imported correctly. Or does R reduce the imported numbers to some number of decimal places? (I don't think so ;-) ) Best regards, Felix -- View this message in context: http://r.789695.n4.nabble.com/Calculating-the-mean-in-one-column-with-empty-cells-tp4645135p4645165.html
[R] Likelihood: Multiple Maxima
Hi everyone, I estimate structural equation models and use maximum likelihood estimation. However, the estimates differ depending on the starting values I choose, so I guess there are multiple local maxima. I'm new to R (and statistics... ;) ); does anybody know how I can best solve this? Thanks a lot! ;) Felix -- View this message in context: http://r.789695.n4.nabble.com/Likelihood-Multiple-Maxima-tp4645170.html
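One generic remedy (not specific to any SEM package) is a multi-start search: refit from many random starting values and keep the fit with the best likelihood. A sketch on a toy objective function, which stands in for a negative log-likelihood with two local minima:

```r
set.seed(1)

# Toy objective with two local minima; the global one is near x = -2
f <- function(x) (x^2 - 4)^2 + x

# Optimize from many random starting points and keep the best result
starts <- runif(20, min = -5, max = 5)
fits   <- lapply(starts, function(s) optim(s, f, method = "BFGS"))
best   <- fits[[which.min(sapply(fits, function(fit) fit$value))]]

best$par  # close to the global minimum near -2.03
```

If different starts still disagree badly in the achieved likelihood, that is usually a sign of an ill-identified model rather than an optimizer problem.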
Re: [R] smoothScatter plot
In line. John Kane Kingston ON Canada -Original Message- From: zhyjiang2...@hotmail.com Sent: Fri, 5 Oct 2012 05:41:29 +0800 To: jrkrid...@inbox.com Subject: RE: [R] smoothScatter plot Hi John, Thanks for your email. Your way works well. However, I was wondering if you can help with a smoothed scatter plot that has shadows, with darker blue colours representing higher density of points. Zhengyu Do you mean something like what is being discussed here? http://andrewgelman.com/2012/08/graphs-showing-uncertainty-using-lighter-intensities-for-the-lines-that-go-further-from-the-center-to-de-emphasize-the-edges/ If so, I think there has been some discussion and accompanying ggplot2 code on the ggplot2 Google Groups site. Otherwise, can you explain a bit more clearly? _ Date: Thu, 4 Oct 2012 05:46:46 -0800 From: jrkrid...@inbox.com Subject: RE: [R] smoothScatter plot To: zhyjiang2...@hotmail.com CC: r-help@r-project.org Hi, Do you mean something like this? = scatter.smooth(x, y) = It looks like invoking dcols <- densCols(x, y) is calling in some package that is masking graphics::smoothScatter() and applying some other version of smoothScatter, but I am not expert enough to be sure. Another way to get the same result as mine with smoothScatter is to use the ggplot2 package. It looks a bit more complicated, but it is very good and in some ways easier to see exactly what is happening. To try it you would need to install the ggplot2 package (install.packages("ggplot2")), then with your original x and y data frames: === library(ggplot2) xy <- cbind(x, y) names(xy) <- c("xx", "yy") p <- ggplot(xy, aes(xx, yy)) + geom_point() + geom_smooth(method = "loess", se = FALSE) p === Thanks for the data set. However, it really is easier to use dput(). To use dput(), simply issue the command dput(myfile), where myfile is the object you are working with. 
It will give you something like this: == > dput(x) structure(c(0.4543462924, 0.2671718761, 0.1641577016, 1.1593356462, 0.0421177346, 0.3127782861, 0.4515537795, 0.5332559665, 0.0913911528, 0.1472054054, 0.1340672893, 1.2599304224, 0.3872026125, 0.0368560053, 0.0371828779, 0.3999714282, 0.0175815783, 0.8871547761, 0.2706762487, 0.7401904063, 0.0991320236, 0.2565567348, 0.5854167363, 0.7515717421, 0.7220388222, 1.3528297744, 0.9339971349, 0.0128652431, 0.4102527051), .Dim = c(29L, 1L), .Dimnames = list(NULL, "V1")) > dput(y) structure(list(V1 = c(0.8669898448, 0.6698647266, 0.1641577016, 0.4779091929, 0.2109900366, 0.2915241414, 0.2363116664, 0.3808731568, 0.379908928, 0.2565868263, 0.1986675964, 0.7589866876, 0.6496236922, 0.1327986663, 0.4196107999, 0.3436442638, 0.1910728051, 0.5625817464, 0.1429791079, 0.6441837334, 0.1477153617, 0.369079266, 0.3839842979, 0.39044223, 0.4186374286, 0.7611640016, 0.446291999, 0.2943343355, 0.3019098386)), .Names = "V1", class = "data.frame", row.names = c(NA, -29L)) === That is your x in dput() form. You just copy it from the R terminal and paste it into your email message. It is handy if you add the x <- and y <- to the output. Your method works just fine, but it's a bit more cumbersome with a lot of data. Also, please reply to the R-help list as well. It is a source of much more expertise than me, and it also can reply when a single person is unavailable. I hope this helps. John Kane Kingston ON Canada -Original Message- From: zhyjiang2...@hotmail.com Sent: Thu, 4 Oct 2012 05:19:14 +0800 To: jrkrid...@inbox.com Subject: RE: [R] smoothScatter plot Hi John, Thanks for your reply. But I cannot figure out how to use dput(). I included data and code below. Is it possible to make a plot similar to the attached smoothing effect? 
Zhengyu ### x <- read.table(text = "0.4543462924 0.2671718761 0.1641577016 1.1593356462 0.0421177346 0.3127782861 0.4515537795 0.5332559665 0.0913911528 0.1472054054 0.1340672893 1.2599304224 0.3872026125 0.0368560053 0.0371828779 0.3999714282 0.0175815783 0.8871547761 0.2706762487 0.7401904063 0.0991320236 0.2565567348 0.5854167363 0.7515717421 0.7220388222 1.3528297744 0.9339971349 0.0128652431 0.4102527051", header = FALSE) y <- read.table(text = "0.8669898448 0.6698647266 0.1641577016 0.4779091929 0.2109900366
Re: [R] Calculating the mean in one column with empty cells
If the numbers were imported correctly you wouldn't need to do as.numeric(as.character(yourdata)). Please use dput() to provide your data, as in: dput(dataSet2$ac_bhar_60d_4d_after_ann[2:10]) Otherwise it's impossible for us to diagnose your problem or reproduce your error. testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262, 0.0264198664609803, 0.0200581303857603, -0.2971754213679500, -0.2353086361784190, 0.0667195538296534, 0.1755852636926560) mean(testdata) [1] 0.0161584 Sarah On Fri, Oct 5, 2012 at 9:14 AM, fxen3k f.seha...@gmail.com wrote: [quoted message trimmed] -- Sarah Goslee http://www.functionaldiversity.org
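A likely culprit, sketched below: when a decimal-comma column comes through as text, as.numeric() turns every value into NA, and the mean is then computed over whatever happens to parse. Converting the commas first reproduces the Excel value (values copied from the post above):

```r
# The posted values, with German-style decimal commas, as text
raw <- c("0,2006160108532920", "0,1321167173880490", "0,0563941428921262",
         "0,0264198664609803", "0,0200581303857603", "-0,2971754213679500",
         "-0,2353086361784190", "0,0667195538296534", "0,1755852636926560")

as.numeric(raw)  # all NA: a comma is not a valid decimal point

# Replace the comma with a dot before converting
nums <- as.numeric(sub(",", ".", raw, fixed = TRUE))
print(mean(nums), digits = 12)  # 0.0161584031062, matching Excel
```

read.csv2() normally handles the comma itself; the conversion above is only needed when a column has already been read as character or factor.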
Re: [R] t-test
Many, many thanks for the input in this context. Nico On Thu, Oct 4, 2012 at 4:56 PM, Jose Iparraguirre jose.iparragui...@ageuk.org.uk wrote: Hi, You can also have a look at this paper (no subscription needed): Herberich, E.; Sikorski, J.; and Hothorn, T. (2010). A Robust Procedure for Comparing Multiple Means under Heteroscedasticity in Unbalanced Designs. PLoS ONE, Vol. 5, Issue 3, e9788. doi:10.1371/journal.pone.0009788 In this paper, Herberich et al. have included a piece of R code to run their procedure. Their procedure is a modification of Tukey, and as it says in the title of the paper, it can be used regardless of whether the samples have different sizes or distributions. José José Iparraguirre Chief Economist Age UK T 020 303 31482 E jose.iparragui...@ageuk.org.uk Twitter @jose.iparraguirre@ageuk Tavis House, 1-6 Tavistock Square London, WC1H 9NB www.ageuk.org.uk | ageukblog.org.uk | @ageukcampaigns For a copy of our new Economic Monitor and the full Chief Economist's report, and for evidence and statistics on the older population, visit the Age UK Knowledge Hub http://www.ageuk.org.uk/professional-resources-home/knowledge-hub-evidence-statistics/ -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of arun Sent: 04 October 2012 13:59 To: John Kane Cc: R help Subject: Re: [R] t-test Hi John, You are right. With more than two groups, the type 1 error rate should be a problem. A.K. - Original Message - From: John Kane jrkrid...@inbox.com To: arun smartpink...@yahoo.com Cc: Sent: Thursday, October 4, 2012 8:40 AM Subject: Re: [R] t-test My stats are lousy, but isn't Nico doing some multiple t-tests when an ANOVA with some post hoc comparisons, complete with a Tukey or Bonferroni correction, looks more suitable? Of course I have no idea of the topic area, and maybe he has already done this. 
John Kane Kingston ON Canada -Original Message- From: smartpink...@yahoo.com Sent: Thu, 4 Oct 2012 05:31:55 -0700 (PDT) To: nicome...@gmail.com Subject: Re: [R] t-test Hi, Try this: sapply(split(dat, dat$Name), function(x) t.test(x[,2], dat[,2])$p.value) # CTK100 CTK103 CTK121 # 0.86330310 0.32706859 0.02023357 A.K. - Original Message - From: Nico Met nicome...@gmail.com To: Rui Barradas ruipbarra...@sapo.pt Cc: r-help@r-project.org; r-help r-h...@stat.math.ethz.ch Sent: Thursday, October 4, 2012 6:37 AM Subject: Re: [R] t-test Dear Rui, Many thanks for your help. "Mean for CTK and all" = comparison between the mean of all groups (that is, the second column) vs. each group, like CTK100, CTK121, etc. Regards Nico On Thu, Oct 4, 2012 at 12:28 PM, Rui Barradas ruipbarra...@sapo.pt wrote: Hello, I'm not quite sure I understand, but something like this? tapply(dat$Score, dat$Name, FUN = mean) sapply(unique(dat$Name), function(un){ with(dat, t.test(Score[Name == un], Score[Name != un])$p.value)}) My doubt is in what you mean by "mean for CTK and all". ?t.test gives a confidence interval for the difference in the means, so maybe you'll have to look there for what you want. Hope this helps, Rui Barradas On 04-10-2012 10:34, Nico Met wrote: Dear Group, I want to do a t-test calculation on a large data set. 
I am pasting some part of it: structure(list(Name = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("CTK100", "CTK103", "CTK121"), class = "factor"), Score = c(236.9726, 207.0055, 237.3464, 224.4774, 236.5034, 206.7382, 233.94, 240.31, 240.9, 235.15, 223.36, 248.67, 249.25, 201.4051, 244.1689, 182.2756, 229.001, 241.3211, 196.0453, 232.6055, 225.0783, 196.0453, 232.6055, 225.0783)), .Names = c("Name", "Score"), class = "data.frame", row.names = c(NA, 24L)) I want to compare group CTK100 with all the groups, and want to save the p-values and mean for each particular group (for example: mean for CTK and all). Similarly for other groups, like CTK121 etc. Is there any way to automate this process? Thanks for your advice! Nico
Re: [R] problems with printing and plotting aareg
It's a bug in summary.aareg which no one found until now. What's wrong: if dfbeta=TRUE, then a second estimate of variance is calculated, labeled as test.var2. If maxtime is set, then both estimates of variance need to be recalculated by the summary routine. An incorrect if-then-else flow led it to look for test.var2 when it wasn't relevant. My test cases with maxtime also happened to have dfbeta=TRUE. Short-term solution: set dfbeta=TRUE. A bit more computation time, though. Long term: I'll fix it, and a new version of survival will one day appear. (With 200+ packages that depend on survival, new releases now require a lot of checking. No overnight fixes.) Terry T. On 10/05/2012 05:00 AM, r-help-requ...@r-project.org wrote: Hi all, I've ventured into the world of nonparametric survival and I would like to use the maxtime option for printing and plotting my aareg fit. However, my fit does not have test.var2 and this stops the print and plot when adding a maxtime. My code is as follows: Response <- Surv(Time, Event) Model <- aareg(Response ~ Factor1*Factor2) Model2 <- aareg(Response ~ Factor1+Factor2) # Just did this to see if the interaction term had anything to do with it Model, print(Model), summary(Model), and plot(Model) seem to work fine, but as soon as I try summary/print/plot(Model, maxtime=400) it tells me that test.var2 is not found, and when I look at summary(Model), there is indeed a NULL under test.var2. Anyone know why it doesn't include test.var2? Is this a compatibility problem? I'm using R version 2.13 (I know it's quite old, but updating is a pain when you don't have admin rights to your computer) and just updated the survival package (no warning messages). Any input would be much appreciated. Cheers, Freya
Re: [R] Calculating the mean in one column with empty cells
You need to show us the verbatim output of the following R command dput(dataSet2$ac_bhar_60d_4d_after_ann[2:10]) to make any further progress. Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of fxen3k Sent: Friday, October 05, 2012 6:15 AM To: r-help@r-project.org Subject: Re: [R] Calculating the mean in one column with empty cells [quoted message trimmed]
Re: [R] Missing data (Na) and chi-square tests
Hello, There are two ways: 1. ?sum # see argument na.rm: sum(whatever, na.rm = TRUE) 2. ?table # produces the 2x2 contingency table, if there are only 2 values. Also, you should provide us with a data example, especially since your code clearly doesn't work. Use ?dput like this: dput( head(MyData, 20) ) # then paste the output of this in a post. Hope this helps, Rui Barradas On 05-10-2012 14:26, Rerda wrote: [quoted message trimmed]
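To make the table() suggestion concrete, a small sketch with made-up data (the variable names Group and Outcome are invented for illustration; the poster's real data are not shown):

```r
# Made-up data with missing values in the outcome
Group   <- c(1, 1, 1, 0, 0, 0, 1, 0)
Outcome <- c(TRUE, FALSE, NA, TRUE, TRUE, FALSE, TRUE, NA)

# table() silently drops rows where either variable is NA,
# so only complete cases enter the 2x2 table
tab <- table(Group, Outcome)
tab

# Chi-square test on the complete cases (tiny n here, so R will
# warn that the chi-square approximation may be inaccurate)
suppressWarnings(chisq.test(tab))
```

To see how many NAs were dropped, compare sum(tab) with length(Group), or use table(Group, Outcome, useNA = "ifany").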
Re: [R] Minute Difference
On Oct 5, 2012, at 5:56 AM, Rantony wrote: Hi, I have a time along with a date, e.g. 10/5/2012 5:05:00 AM. I need to subtract 10 minutes, like this: 10/5/2012 4:55:00 AM. There are both 'cut' and 'seq' methods for vectors of class POSIXt. These support operations that specify particular units. ?cut.POSIXt # has a link to the seq method help page I am not posting code when questions fail to adhere to the reproducible-code request. -- David Winsemius, MD Alameda, CA, USA
Re: [R] LMMs with some variance terms forced constant
Erica E.M. Moodie, Dr. erica.moodie at mcgill.ca writes: I have been asked to help perform a meta-analysis with individual- and aggregate-level data. I found a nice article on this, and the idea is easy to understand: use a mixed-effects model, but for the studies where there is only aggregate-level data, force the variance to be that which was observed. Unfortunately, I am struggling to see how to implement this in R. I am familiar with standard mixed-model formulations using nlme and lmer, but not able to see how to force this specific variance structure on my model so that some variance terms are not estimated. No solutions, but a few ideas: * I assume this is beyond the capacities/flexibility of the metafor package? * It's possible that the regress package http://cran.r-project.org/web/packages/regress/index.html could be made to do this * There *may* be a way to do this with the pdStruct structures in nlme (read the relevant bits of Pinheiro and Bates 2000, then stare at your computer really hard for a while and see if anything comes to you), although I suspect not * You can 'roll your own' mixed models in a flexible way with WinBUGS/JAGS, and maybe now Stan (if you want to be an early adopter!), all of which have good R interfaces; it's conceivable that ASReml can do this too * Follow-ups to r-sig-mixed-models please
Re: [R] Missing data (Na) and chi-square tests
On Oct 5, 2012, at 6:26 AM, Rerda wrote: [quoted message trimmed] I use a formula to make 2x2 tables which is: data <- matrix(c(sum(!Variable[Group==1]), sum(Variable[DAAC==1]), sum(!Variable[Group==0]), sum(Variable[DAAC==0])), 2, 2) This might be as simple as: with(Variable, table( Grp.1 = Group == 1, DAAC.1 = DAAC == 1) ) table will produce an object that inherits from matrix, and the counts of logical values will get cross-tabulated. The NA values will neither be equal to 1 nor not-equal to 1, but will rather be... NA. So they will not appear in the counts. This, of course, depends on my guess regarding your data structure being correct and my guess regarding your goals also being correct. Neither of those critical bits of information has yet been offered. -- David Winsemius, MD Alameda, CA, USA
Re: [R] Error in lmer: asMethod(object) : matrix is not symmetric [1, 2]
Ben Bolker bbolker at gmail.com writes: PS: this error message is listed in http://glmm.wikidot.com/faq -- if you google 'lmer matrix is not symmetric' it's in the hits (although pretty far from the top, alas). I've added a little bit more discussion.
Re: [R] Calculating the mean in one column with empty cells
Hi, -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of fxen3k Sent: Friday, October 05, 2012 3:18 PM To: r-help@r-project.org Subject: Re: [R] Calculating the mean in one column with empty cells I imported the whole dataset with read.csv2() and it works fine. (2 for German is correct ;) ) I already checked the numbers and I also tried to calculate the mean of a range of numbers where there is no NA given (as mentioned in my last post above). But as Sarah pointed out, the result in R from your values (when correctly read) is the same as in Excel. Therefore the problem seems to be in ***your*** data. Output from str(dataSet2$ac_bhar_60d_4d_after_ann[2:10]) can be helpful in diagnosing what may be going on. Regards, Petr
Re: [R] R: machine for moderately large data
Hi, -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Skála, Zdeněk (INCOMA GfK) Sent: Friday, October 05, 2012 3:38 PM To: r-help@r-project.org Subject: [R] R: machine for moderately large data [quoted message trimmed] If I compute correctly, such a big matrix (20e6 * 1000) needs about 160 GB just to be in memory. Are you prepared for this? Maybe some suitable database interface would be preferable. Regards, Petr
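Petr's figure can be checked with a one-line back-of-the-envelope calculation: a dense numeric matrix in R costs 8 bytes per cell, so the upper end of the stated range needs on the order of 150 GiB (~160 GB) of RAM before any analysis even starts:

```r
rows <- 20e6   # upper end of 2-20 million rows
cols <- 1000   # upper end of 100-1000 columns

bytes <- rows * cols * 8   # 8 bytes per double-precision cell
gib   <- bytes / 2^30
gib                        # about 149 GiB (1.6e11 bytes, i.e. ~160 GB)
```

A sparse representation (e.g. the Matrix package, since market-basket data are mostly zeros) or an out-of-memory backend can shrink this dramatically.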
[R] Multiple graphs boxplot
Dear all, I am trying to represent a dependent variable (treatment) against different independent variables (v1, v2, v3, ..., v20). I am using the following command: boxplot(v1 ~ treatment, data = y, main = "xx", xlab = "xx", ylab = "xx") However, it gives me only one graph, for v1 ~ treatment. For the other comparisons, I have to repeat the same command but change the parameters. My intention is to get the different plots on just one sheet using only one command. Is it possible to run the same command for all the comparisons at once? Thanks, David
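One common approach, sketched below with made-up data (the columns v1..v4 and the 2x2 grid are illustrative; for v1..v20 use something like mfrow = c(4, 5)): split the device with par(mfrow = ...) and loop over the variable names.

```r
set.seed(42)

# Toy data standing in for the poster's data frame y
y <- data.frame(treatment = factor(rep(c("A", "B"), each = 25)),
                v1 = rnorm(50), v2 = rnorm(50),
                v3 = rnorm(50), v4 = rnorm(50))

vars <- paste0("v", 1:4)   # for the real data: paste0("v", 1:20)

op <- par(mfrow = c(2, 2))             # 2 x 2 grid on one sheet
for (v in vars) {
  boxplot(y[[v]] ~ y$treatment,
          main = v, xlab = "treatment", ylab = v)
}
par(op)                                # restore plotting settings
```

Indexing the data frame with y[[v]] avoids having to build a formula for each variable name.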
Re: [R] Kolmogorov-Smirnov test and the plot of max distance between two ecdf curves
Another alternative is to put the data in a linear model structure (one column for the response, another column for an indicator variable indicating group) and estimate all possible quantile regressions with rq() in the quantreg package, using a model of the form y ~ intercept + indicator (0,1) variable for group. The estimated quantiles for the intercept will be the quantiles of the ecdf for one group, and the estimated quantiles for the indicator grouping variable will be the differences in quantiles (ecdf) between the two groups. There is useful built-in graphing capability in quantreg with the plot.rqs() function. Brian Brian S. Cade, PhD U. S. Geological Survey Fort Collins Science Center 2150 Centre Ave., Bldg. C Fort Collins, CO 80526-8818 email: brian_c...@usgs.gov tel: 970 226-9326 From: user1234 mehenderso...@gmail.com To: r-help@r-project.org Date: 10/05/2012 06:46 AM Subject: Re: [R] Kolmogorov-Smirnov test and the plot of max distance between two ecdf curves Sent by: r-help-boun...@r-project.org Rui, Your response nearly answered a similar question of mine, except that I also have ecdfs of different lengths. Do you know how I can adjust x <- seq(min(loga, logb), max(loga, logb), length.out=length(loga)) to account for this? It must be in length.out, but I'm unsure how to proceed. Any advice is much appreciated. -L Rui Barradas wrote: Hello, Try the following. (I've changed the color of the first ecdf.) 
loga <- log10(a+1) # do this logb <- log10(b+1) # only once f.a <- ecdf(loga) f.b <- ecdf(logb) # (2) max distance D x <- seq(min(loga, logb), max(loga, logb), length.out=length(loga)) x0 <- x[which( abs(f.a(x) - f.b(x)) == max(abs(f.a(x) - f.b(x))) )] y0 <- f.a(x0) y1 <- f.b(x0) plot(f.a, verticals=TRUE, do.points=FALSE, col="blue") plot(f.b, verticals=TRUE, do.points=FALSE, col="green", add=TRUE) ## alternative: use standard R plot of ecdf #plot(f.a, col="blue") #lines(f.b, col="green") points(c(x0, x0), c(y0, y1), pch=16, col="red") segments(x0, y0, x0, y1, col="red", lty="dotted") ## alternative: down to x axis #segments(x0, 0, x0, y1, col="red", lty="dotted") Hope this helps, Rui Barradas maxbre wrote: Hi all, given this example: #start a <- c(0,70,50,100,70,650,1300,6900,1780,4930,1120,700,190,940, 760,100,300,36270,5610,249680,1760,4040,164890,17230,75140,1870,22380,5890,2430) length(a) b <- c(0,0,10,30,50,440,1000,140,70,90,60,60,20,90,180,30,90, 3220,490,20790,290,740,5350,940,3910,0,640,850,260) length(b) out <- ks.test(log10(a+1), log10(b+1)) # max distance D out$statistic f.a <- ecdf(log10(a+1)) f.b <- ecdf(log10(b+1)) plot(f.a, verticals=TRUE, do.points=FALSE, col="red") plot(f.b, verticals=TRUE, do.points=FALSE, col="green", add=TRUE) # inverse of ecdf a x.a <- get("x", environment(f.a)) y.a <- get("y", environment(f.a)) # inverse of ecdf b x.b <- get("x", environment(f.b)) y.b <- get("y", environment(f.b)) #end I want to plot the max distance between the two ecdf curves as in the above given chart. Is that possible, and how? Thanks for your help. PS: this is an amended version of a previous thread (but no reply followed) that I've deleted from the Nabble repository because I realised it was not clear enough (now I hope it's a little better, sorry for that) -- View this message in context: http://r.789695.n4.nabble.com/Kolmogorov-Smirnov-test-and-the-plot-of-max-distance-between-two-ecdf-curves-tp4631437p4645140.html 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Setup Alias Working Directory
I have 2 questions: 1- I want to set up an alias for dbConnect in the user's .Rprofile: mydb <- dbConnect(MySQL(), user='ghulam', password='password', dbname='ghulam_test', host='localhost') 2- I want to set up two working directories; one is the user's personal directory. I have set up .Rprofile in the user's home directory with this command: setwd("/home/ghulam/R") I also want to set up a shared working directory for all users. Can you please help me figure this out?
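Not a tested setup, but a sketch of what the two pieces might look like. The paths and credentials are the ones from the question, the RMySQL package is an assumption, and the site-wide file location comes from R's Startup documentation (?Startup):

```r
# ~/.Rprofile -- per-user startup file (sketch only; adjust paths/credentials)
.First <- function() {
  # personal working directory
  if (file.exists("/home/ghulam/R"))
    setwd("/home/ghulam/R")

  # open the connection only in interactive sessions, and only if the
  # driver package is actually available
  if (interactive() && suppressWarnings(require("RMySQL", quietly = TRUE))) {
    mydb <<- dbConnect(MySQL(), user = "ghulam", password = "password",
                       dbname = "ghulam_test", host = "localhost")
  }
}

# A shared, site-wide startup file for all users can go in
# file.path(R.home("etc"), "Rprofile.site") -- see ?Startup.
```

Guarding the dbConnect call this way keeps non-interactive scripts and machines without RMySQL from failing at startup.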
[R] survival predictions
Dear All, I have built a survival Cox model which includes a covariate * time interaction (non-proportionality detected). I am now wondering how I could most easily get survival predictions from my model. My model was specified: coxph(formula = Surv(event_time_mod, event_indicator_mod) ~ Sex + ageC + HHcat_alt + Main_Branch + Acute_seizure + TreatmentType_binary + ICH + IVH_dummy + IVH_dummy:log(event_time_mod)) And now I was hoping to get a prediction using survfit, providing newdata for the combination of variables for which I am doing the predictions: survfit(cox, newdata=new) Now, as I have event_time_mod on the right-hand side of my model, I need to specify it in the new data frame passed on to survfit. This event time would need to be set at the individual times of the predictions. Is there an easy way to tell survfit to set event_time_mod to the correct time? Or are there any other options for achieving predictions from my model? Of course I could create as many rows in the new data frame as there are distinct times in the predictions, setting event_time_mod to the correct values, but that feels really cumbersome and I thought there must be a better way. Best, Mitja
[R] Test for Random Points on a Sphere
Dear All, I implemented an algorithm for (uniform) random rotations. In order to test it, I can apply it to a unit vector (0,0,1) in Cartesian coordinates. The result is supposed to be a set of random, uniformly distributed points on a sphere (not the point of the algorithm, but a way to test it). This is what the points look like when I plot them, but other than eyeballing them, can anyone suggest a test to ensure that I am really generating uniform random points on a sphere? Many thanks Lorenzo
Re: [R] Broken Links on http://www.r-project.org
On 04.10.2012 21:03, Hasan Diwan wrote: The R Graphics Gallery has moved to http://gallery.r-enthusiasts.com/ Thanks, fixed. Uwe Ligges and there's another R Graphics Manual at http://rgm2.lab.nig.ac.jp/RGM2 -- H On 26 September 2012 04:56, Viechtbauer Wolfgang (STAT) wolfgang.viechtba...@maastrichtuniversity.nl wrote: I was not sure who I should contact about this, so I am posting this here. There are a few broken links on the R website. 1) http://www.r-project.org/search.html - link to the Nabble R Forum. I believe the correct/new URL should be: http://r.789695.n4.nabble.com/ 2) http://www.r-project.org/other-docs.html - link to Auswertung ökologischer Daten. Not sure if there is a new URL. 3) http://www.r-project.org/other-projects.html - link to Jim Lindsey's R page. I believe the correct/new URL should be: http://www.commanster.eu/rcode.html Best, Wolfgang -- Wolfgang Viechtbauer, Ph.D., Statistician Department of Psychiatry and Psychology School for Mental Health and Neuroscience Faculty of Health, Medicine, and Life Sciences Maastricht University, P.O. Box 616 (VIJV1) 6200 MD Maastricht, The Netherlands +31 (43) 388-4170 | http://www.wvbauer.com
Re: [R] Multiple graphs boxplot
Hello, I've once written a function that does more or less what you want, but it has no formula interface.

# Input:
# x  - matrix or data.frame of numeric vectors to graph
# by - a factor or coercible to factor
multi.boxplot <- function(x, by, col=0, ...){
    x <- as.data.frame(x)
    uniq.by <- unique(by)
    by <- factor(by)
    len <- length(uniq.by) - 1
    n <- ncol(x)
    n1 <- n + 1
    col <- rep(col, n)[seq_len(n)]
    boxplot(x[[1]] ~ by, at = 0:len*n1 + 1,
        xlim = c(0, (len + 1)*n1), ylim = range(unlist(x)),
        xaxt = "n", col=col[1], ...)
    for(i in seq_len(n)[-1])
        boxplot(x[[i]] ~ by, at = 0:len*n1 + i, xaxt = "n",
            add = TRUE, col=col[i], ...)
    axis(1, at = 0:len*n1 + n1/2, labels = uniq.by, tick = TRUE)
}

a <- matrix(data=runif(300, max=2), nrow=100, ncol=3)
fac <- sample(letters[1:4], 100, TRUE)
multi.boxplot(a, fac)

Hope this helps, Rui Barradas Em 05-10-2012 17:01, David Gramaje escreveu: Dear all, I am trying to represent a dependent variable (treatment) against different independent variables (v1, v2, v3 ... v20). I am using the following command: boxplot(v1~treatment, data=y, main="xx", xlab="xx", ylab="xx") However, it provides me only one graph, for v1~treatment. For the other comparisons I have to repeat the same command, changing the parameters. My intention is to get the different plots on just one sheet using only one command. Is it possible to run the same command for all the comparisons at once? Thanks David
Re: [R] smoothScatter plot
Hi John, Thanks for your link. Those plots look pretty, but way too complicated in terms of writing R code. Maybe my description is not clear. But could you take a look at the attached png? I saw several publications showing smoothed plots like this but am not sure how to make one... Thanks, Best, Zhengyu Date: Fri, 5 Oct 2012 06:36:38 -0800 From: jrkrid...@inbox.com Subject: RE: [R] smoothScatter plot To: zhyjiang2...@hotmail.com CC: r-help@r-project.org In line John Kane Kingston ON Canada -Original Message- From: zhyjiang2...@hotmail.com Sent: Fri, 5 Oct 2012 05:41:29 +0800 To: jrkrid...@inbox.com Subject: RE: [R] smoothScatter plot Hi John, Thanks for your email. Your way works well. However, I was wondering if you can help with a smoothed scatter plot that has shading, with darker blue representing higher density of points. Zhengyu Do you mean something like what is being discussed here? http://andrewgelman.com/2012/08/graphs-showing-uncertainty-using-lighter-intensities-for-the-lines-that-go-further-from-the-center-to-de-emphasize-the-edges/ If so, I think there has been some discussion and accompanying ggplot2 code on the ggplot2 google group. Otherwise, can you explain a bit more clearly? Date: Thu, 4 Oct 2012 05:46:46 -0800 From: jrkrid...@inbox.com Subject: RE: [R] smoothScatter plot To: zhyjiang2...@hotmail.com CC: r-help@r-project.org Hi, Do you mean something like this? = scatter.smooth(x, y) = It looks like invoking dcols <- densCols(x, y) is calling in some package that is masking the base smoothScatter() and applying some other version of smoothScatter, but I am not expert enough to be sure. Another way to get the same result as mine with smoothScatter is to use the ggplot2 package. It looks a bit more complicated, but it is very good and in some ways it is easier to see exactly what is happening.
To try it you would need to install the ggplot2 package (install.packages("ggplot2")), then with your original x and y data frames:

library(ggplot2)
xy <- cbind(x, y)
names(xy) <- c("xx", "yy")
p <- ggplot(xy, aes(xx, yy)) + geom_point() +
    geom_smooth(method="loess", se=FALSE)
p

Thanks for the data set. However, it really is easier to use dput(). To use dput() simply issue the command dput(myfile), where myfile is the file you are working with. It will give you something like this:

> dput(x)
structure(c(0.4543462924, 0.2671718761, 0.1641577016, 1.1593356462, 0.0421177346, 0.3127782861, 0.4515537795, 0.5332559665, 0.0913911528, 0.1472054054, 0.1340672893, 1.2599304224, 0.3872026125, 0.0368560053, 0.0371828779, 0.3999714282, 0.0175815783, 0.8871547761, 0.2706762487, 0.7401904063, 0.0991320236, 0.2565567348, 0.5854167363, 0.7515717421, 0.7220388222, 1.3528297744, 0.9339971349, 0.0128652431, 0.4102527051), .Dim = c(29L, 1L), .Dimnames = list(NULL, "V1"))
> dput(y)
structure(list(V1 = c(0.8669898448, 0.6698647266, 0.1641577016, 0.4779091929, 0.2109900366, 0.2915241414, 0.2363116664, 0.3808731568, 0.379908928, 0.2565868263, 0.1986675964, 0.7589866876, 0.6496236922, 0.1327986663, 0.4196107999, 0.3436442638, 0.1910728051, 0.5625817464, 0.1429791079, 0.6441837334, 0.1477153617, 0.369079266, 0.3839842979, 0.39044223, 0.4186374286, 0.7611640016, 0.446291999, 0.2943343355, 0.3019098386)), .Names = "V1", class = "data.frame", row.names = c(NA, -29L))

That is your x in dput() form. You just copy it from the R terminal and paste it into your email message. It is handy if you add the x <- and y <- to the output. Your method works just fine, but it's a bit more cumbersome with a lot of data. Also, please reply to the R-help list as well. It is a source of much more expertise than me, and it can also reply when a single person is unavailable.
I hope this helps John Kane Kingston ON Canada -Original Message- From: zhyjiang2...@hotmail.com Sent: Thu, 4 Oct 2012 05:19:14 +0800 To: jrkrid...@inbox.com Subject: RE: [R] smoothScatter plot Hi John, Thanks for your reply. But I cannot figure out how to use dput(). I included data and code below. Is it possible to make a plot similar to the attached smoothing effect? Zhengyu ###
x <- read.table(text="0.4543462924 0.2671718761 0.1641577016 1.1593356462 0.0421177346 0.3127782861 0.4515537795 0.5332559665 0.0913911528 0.1472054054 0.1340672893 1.2599304224 0.3872026125 0.0368560053 0.0371828779 0.3999714282 0.0175815783 0.8871547761 0.2706762487 0.7401904063 0.0991320236 0.2565567348 0.5854167363 0.7515717421 0.7220388222 1.3528297744 0.9339971349 0.0128652431 0.4102527051", header=FALSE)
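For the density-shaded effect being asked about here, base R's smoothScatter() already draws darker blue where points are denser, and densCols() gives per-point colours on the same density scale. A minimal sketch with simulated data (the data below are made up for illustration, not the poster's):

```r
set.seed(1)
x <- rnorm(2000)
y <- x + rnorm(2000, sd = 0.5)

# smoothed 2-D density shading: darker blue = higher point density
smoothScatter(x, y, nbin = 128,
              colramp = colorRampPalette(c("white", "lightblue", "darkblue")))

# densCols() returns one colour per point, graded by local density,
# useful for an ordinary scatterplot coloured by density
dc <- densCols(x, y)
plot(x, y, col = dc, pch = 20)
```

smoothScatter() ships with the base graphics package (it uses KernSmooth internally), so no extra installation is needed.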
Re: [R] R: machine for moderately large data
On Fri, Oct 5, 2012 at 12:09 PM, PIKAL Petr petr.pi...@precheza.cz wrote: Hi -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Skála, Zdeněk (INCOMA GfK) Sent: Friday, October 05, 2012 3:38 PM To: r-help@r-project.org Subject: [R] R: machine for moderately large data Dear all, I would like to ask your advice about a suitable computer for the following usage. I (am starting to) work with moderately big data in R: - ca. 2 - 20 million rows * 100 - 1000 columns (market basket data) - mainly clustering, classification trees, association analysis (e.g. libraries rpart, cba, proxy, party) If I compute correctly, such a big matrix (20e6*1000) needs about 160 GB just to be in memory. Are you prepared for this? This is not as outrageous as one might think -- you can get a Mac Pro with 32 GB of memory for around $3,500. Best, Ista Maybe some suitable database interface would be preferable. Regards Petr Can you recommend a sufficient computer for this volume? I am routinely working in Windows but feel that a Mac or some Linux machine might be needed. Please respond directly to my email. Many thanks! Zdenek Skala zdenek.sk...@gfk.com
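Petr's 160 GB figure is easy to verify: a numeric (double) matrix costs 8 bytes per element, so for the worst case of 20 million rows by 1000 columns:

```r
rows  <- 20e6
cols  <- 1000
bytes <- rows * cols * 8   # 8 bytes per double-precision element
bytes / 1e9                # 160 GB in decimal units
bytes / 2^30               # about 149 GiB in binary units
```

And that is just one copy; many R operations temporarily duplicate their input, so the practical requirement is higher still.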
Re: [R] Format of numbers in plotmath expressions.
On 05.10.2012 09:30, Rolf Turner wrote: I want to do something like:

TH <- sprintf("%1.1f", c(0.3, 0.5, 0.7, 0.9, 1))
plot(1:10)
legend("bottomright", pch=1:5, legend=parse(text=paste("theta ==", TH)))

Notice that the final 1 comes out in the legend as just plain 1 and NOT as 1.0, although TH is [1] "0.3" "0.5" "0.7" "0.9" "1.0" I can get plotmath to preserve 1.0 as 1.0 and NOT convert it to 1 if I use substitute, as in

text(2, 5, labels=substitute(list(theta == a), list(a=TH[5])))

but substitute doesn't work appropriately with vectors. Can anyone tell me how to get a 1.0 rather than 1 in the legend? Ta. cheers, Rolf Turner P.S. Just figured out a way using sapply():

leg <- sapply(TH, function(x){substitute(list(theta == a), list(a=x))})
plot(1:10)
legend("bottomright", pch=1:5, legend=parse(text=leg))

Note that the use of parse (pace Thomas Lumley! :-) ) is required --- legend=leg does NOT work. Getting here required an enormous amount of trial and error. And it seems pretty kludgy. Is there a sexier way? Not sure what is sexier here. I'd stay with

leg <- lapply(TH, function(x) bquote(list(theta == .(x))))
plot(1:10)
legend("bottomright", pch=1:5, legend=as.expression(leg))

Best, Uwe Ligges R. T.
Re: [R] R: machine for moderately large data
Hi, On Fri, Oct 5, 2012 at 1:41 PM, Ista Zahn istaz...@gmail.com wrote: On Fri, Oct 5, 2012 at 12:09 PM, PIKAL Petr petr.pi...@precheza.cz wrote: [snip] If I compute correctly, such a big matrix (20e6*1000) needs about 160 GB just to be in memory. Are you prepared for this? This is not as outrageous as one might think -- you can get a mac pro with 32 gigs of memory for around $3,500 And even so, I suspect the matrices that will be worked with are sparse, so you can get more savings there (although I'm not sure which of the packages the OP listed work with sparse input). That having been said, if you don't want to sample from your data, sometimes R isn't the best solution. There are projects being developed specifically to deal with such big data. For one, you might consider looking at the graphlab/graphchi stuff: http://graphlab.org (GraphChi is meant to process big data on a modest machine). If you go to the Toolkits menu, you'll see they have an implementation of kmeans++ clustering that might be suitable for your clustering analysis (perhaps some matrix factorizations are useful here, too -- perhaps your market basket data can be viewed as some type of collaborative filtering problem, in which case their collaborative filtering toolkit is right up your alley ;-). The OP also mentioned classification trees. Perhaps rf-ace might be useful: http://code.google.com/p/rf-ace/ From their website: RF-ACE implements both Random Forest (RF) and Gradient Boosting Tree (GBT) algorithms, and is strongly related to ACE, originally outlined in http://jmlr.csail.mit.edu/papers/volume10/tuv09a/tuv09a.pdf If you scroll down to the case study section of their main page, you can see there is some talk about how they used this in a distributed manner ... perhaps it is applicable in your case as well (in which case you might be able to rig up AWS to help you).
HTH, -steve -- Steve Lianoglou Graduate Student: Computational Systems Biology | Memorial Sloan-Kettering Cancer Center | Weill Medical College of Cornell University Contact Info: http://cbio.mskcc.org/~lianos/contact
Re: [R] Test for Random Points on a Sphere
On Fri, Oct 5, 2012 at 5:39 PM, Lorenzo Isella lorenzo.ise...@gmail.com wrote: Dear All, I implemented an algorithm for (uniform) random rotations. In order to test it, I can apply it to a unit vector (0,0,1) in Cartesian coordinates. The result is supposed to be a set of random, uniformly distributed points on a sphere (not the point of the algorithm, but a way to test it). This is what the points look like when I plot them, but other than eyeballing them, can anyone suggest a test to ensure that I am really generating uniform random points on a sphere? Many thanks Gut says to divide the surface into n bits of equal area and see if the points appear uniformly in those using something chi-squared-ish, but I'm not aware of a canonical way to do so. Cheers, Michael Lorenzo
Re: [R] Test for Random Points on a Sphere
-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of R. Michael Weylandt Sent: Friday, October 05, 2012 11:17 AM To: Lorenzo Isella Cc: r-help@r-project.org Subject: Re: [R] Test for Random Points on a Sphere On Fri, Oct 5, 2012 at 5:39 PM, Lorenzo Isella lorenzo.ise...@gmail.com wrote: Dear All, I implemented an algorithm for (uniform) random rotations. In order to test it, I can apply it to a unit vector (0,0,1) in Cartesian coordinates. The result is supposed to be a set of random, uniformly distributed points on a sphere (not the point of the algorithm, but a way to test it). This is what the points look like when I plot them, but other than eyeballing them, can anyone suggest a test to ensure that I am really generating uniform random points on a sphere? Many thanks Gut says to divide the surface into n bits of equal area and see if the points appear uniformly in those using something chi-squared-ish, but I'm not aware of a canonical way to do so. Cheers, Michael Lorenzo I would be more inclined to use a method which is known to produce points uniformly distributed on the surface of a sphere, and not worry about testing your results. You might find the discussion at the following link useful: http://mathworld.wolfram.com/SpherePointPicking.html Hope this is helpful, Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204
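A concrete sketch combining both suggestions: generate reference points by normalising 3-D Gaussian vectors (one of the standard methods on the MathWorld page), then use the fact that for a uniform distribution on the sphere the z-coordinate is Uniform(-1, 1) (the equal-area "hat-box" property Michael's binning idea relies on) and the longitude is Uniform(-pi, pi), both of which ks.test() can check. The same two tests can be applied to the points produced by the rotation algorithm:

```r
set.seed(1)
n <- 5000

# uniform points on the unit sphere via normalised 3-D Gaussians
v <- matrix(rnorm(3 * n), ncol = 3)
v <- v / sqrt(rowSums(v^2))

# for a uniform sphere: z ~ Uniform(-1, 1), longitude ~ Uniform(-pi, pi)
ks.z   <- ks.test(v[, 3], "punif", -1, 1)
ks.lon <- ks.test(atan2(v[, 2], v[, 1]), "punif", -pi, pi)
c(ks.z$p.value, ks.lon$p.value)  # large p-values are consistent with uniformity
```

This is a necessary rather than sufficient check (a cleverly non-uniform cloud could pass both marginals), but it catches the common failure modes such as clustering at the poles.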
Re: [R] Minute Difference
Hi, Try this:

date1 <- "10/5/2012 5:05:00 AM"
format(as.POSIXct(strptime(date1, format="%m/%d/%Y %H:%M:%S")) - 600, format="%m/%d/%Y %r")
#[1] "10/05/2012 04:55:00 AM"

A.K. - Original Message - From: Rantony antony.akk...@ge.com To: r-help@r-project.org Cc: Sent: Friday, October 5, 2012 8:56 AM Subject: [R] Minute Difference Hi, here I have a time along with a date, e.g. "10/5/2012 5:05:00 AM". I need to subtract 10 minutes from it, like this: "10/5/2012 4:55:00 AM". Thanks in Advance Antony -- View this message in context: http://r.789695.n4.nabble.com/Minute-Difference-tp4645157.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Minute Difference
Mr Rantony could also probably benefit from taking a look at the lubridate package. Cheers, Michael On Fri, Oct 5, 2012 at 8:04 PM, arun smartpink...@yahoo.com wrote: Hi, Try this:

date1 <- "10/5/2012 5:05:00 AM"
format(as.POSIXct(strptime(date1, format="%m/%d/%Y %H:%M:%S")) - 600, format="%m/%d/%Y %r")
#[1] "10/05/2012 04:55:00 AM"

A.K. - Original Message - From: Rantony antony.akk...@ge.com To: r-help@r-project.org Cc: Sent: Friday, October 5, 2012 8:56 AM Subject: [R] Minute Difference Hi, here I have a time along with a date, e.g. "10/5/2012 5:05:00 AM". I need to subtract 10 minutes from it, like this: "10/5/2012 4:55:00 AM". Thanks in Advance Antony
Re: [R] convert multi dimensional array to list
See fortune("toad") for a bit more on this concept. On Thu, Oct 4, 2012 at 3:19 PM, David Winsemius dwinsem...@comcast.net wrote: On Oct 4, 2012, at 8:57 AM, anto.r wrote: Hi Michael, thanks! That was the option if I kept it an array. I prefer the list format with the $ sign since it leaves me feeling that the names are there and can be easily accessed. Why would you rather not use the $ sign? It would be better in programming to learn to use the [[ operator, for which '$' is just a particular application that is less flexible because it won't evaluate its argument. I use RStudio, and there names can be selected from a drop-down list; I have found it easier, but that could be my lack of proper training in R. You should be able to do that with column names in data frames using object[["col"]] Cheers Anto -- View this message in context: http://r.789695.n4.nabble.com/convert-multi-dimensional-array-to-list-tp4645011p4645036.html Sent from the R help mailing list archive at Nabble.com. Learn to post context, please. The all too typical habit of Nabble users of failing to include context is a constant source of annoyance to regular R-help readers. -- David Winsemius, MD Alameda, CA, USA -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com
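A small illustration of David's point that `[[` evaluates its argument while `$` takes the name literally (the data frame here is made up for the example):

```r
df <- data.frame(a = 1:3, b = 4:6)

wanted <- "a"               # the column name held in a variable
df[[wanted]]                # [[ evaluates its argument: returns column a
df$wanted                   # $ looks for a column literally named "wanted": NULL
identical(df$a, df[["a"]])  # TRUE -- for a literal name the two agree
```

This is why `[[` is the safer choice inside functions and loops, where the column of interest is usually held in a variable.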
Re: [R] barplot with some 0 frequencies
Others have shown barchart commands that include 0's. My guess (which could be wrong; feel free to ignore this e-mail if so) is that your problem is due to the way you are creating/summarizing the data before the plotting command is used. R needs some way to know about which bars you want, even if they are 0. This is information that should be a property of the data rather than a property of the graph (if made a property of the data, then the graph will take care of itself). Compare the 2 plots created by the following code:

par(mfrow=c(2,1))
set.seed(1)
tmp <- sample(LETTERS[1:7], 10, TRUE)
barplot(table(tmp))
tmp2 <- factor(tmp, levels=LETTERS[1:7])
barplot(table(tmp2))

Does that show what your problem is? And what you would like the results to look like? The key is to create the data object (the factor 'tmp2' in this case) which includes the information about the levels, even if they are not present in this dataset. On Thu, Oct 4, 2012 at 5:49 PM, Guillaume2883 guillaume.bal@gmail.com wrote: Hi all, I am back with a new question! I recorded the occurrence of 4 different events at 20 places for a given time period. Now I want to do some barplots of the frequency of these events for each place, so it should be easy. My problem is that I want to see the frequencies of the 4 events on my barplots even if the frequency of some of them is 0. How could I do that? Thanking you in advance for your help! Guillaume -- View this message in context: http://r.789695.n4.nabble.com/barplot-with-some-0-frequencies-tp4645102.html Sent from the R help mailing list archive at Nabble.com. -- Gregory (Greg) L. Snow Ph.D.
538...@gmail.com
Re: [R] BCA Package
Kokila, Note that the quality of replies on this mailing list tends to correlate with the quality of the questions. Follow the advice found in the last 2 lines of this (and every) message, ask a better question, and you are more likely to get a better reply. On Fri, Oct 5, 2012 at 5:53 AM, kokila kokila.krish...@quantilez.com wrote: Hi... In R, for the rankscore() function, how do I calculate estimation/validation/holdout? I am waiting for a reply. Please reply. -- View this message in context: http://r.789695.n4.nabble.com/BCA-Package-tp4645146.html Sent from the R help mailing list archive at Nabble.com. -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] loop for column substraction of a matrix
Dear useRs, I have a matrix with 38 columns and 365 rows. What I want to do is the following: 1. Subtract from each column, first itself and then all the remaining columns. More precisely, from column number 1 I will first subtract itself (column 1) and then the remaining 37 columns. Afterwards I will take column number 2 and do the same. I want to proceed in this way up to the 38th column. 2. I want to print the resulting matrix on A4 paper. Bundle of thanks in advance. best regards eliza
Re: [R] Bayesian inference distribution parameters
Maybe this link could help you: http://sirs.agrocampus-ouest.fr/bayes_V2/index_menu.html You will be able to download some code! -- View this message in context: http://r.789695.n4.nabble.com/Bayesian-inference-distribution-parameters-tp4645155p4645207.html Sent from the R help mailing list archive at Nabble.com.
[R] Using variables from different environments in one function
Dear R-community, I have been experiencing this issue with functions within a function, and I can kind of feel where the problem is (environments, closures etc.), but I don't manage to solve it. I will sketch it with an example:

#We have two simple inner functions:
## INNER FUNCTION 1 #
innerfunction1 <- function()
{
    ab <- a+b
    cd <- c+d
    innerfunctionlist <- list(ab=ab, cd=cd)
}
## INNER FUNCTION 2 #
innerfunction2 <- function()
{
    print(AB+CD)
}

#And we have a main script with a number of predefined variables.
#This shapes an environment in which variables that will be used
#by the inner functions are defined.
# MAIN SCRIPT
a = 1
b = 2
c = 3
d = 4
#source("filepath/to/innerfunction1")
outcome1 <- innerfunction1()
AB <- outcome1$ab
CD <- outcome1$cd
#source("filepath/to/innerfunction2")
innerfunction2()

#So far so good. No problem if you run this.
#The problem emerges if you want to make the main script a function.
#So now I wrap it into a function:
main <- function()
{
    a = 1
    b = 2
    c = 3
    d = 4
    #source("filepath/to/innerfunction1")
    outcome1 <- innerfunction1()
    AB <- outcome1$ab
    CD <- outcome1$cd
    #source("filepath/to/innerfunction2")
    innerfunction2()
}

When I now run main() in an environment where all these variables are not defined, I get this error message: Error in innerfunction1() : object 'a' not found. I can solve this by defining the variables a, b, c and d, but I then get the next error message: Error in print(AB + CD) : object 'AB' not found. I think I see what happens: the right variables are not present in the right environment. Probably I did something silly, but I really don't see how to solve this. Can someone please help me? Many thanks! Rik Verdonck University of Leuven
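The diagnosis in the question is right: the inner functions look up a, b, AB, CD in their enclosing (global) environment, not inside main(). One standard cure, sketched here with the names kept from the question, is to pass everything the inner functions need as arguments instead of relying on the environment:

```r
# inner functions take their inputs explicitly
innerfunction1 <- function(a, b, c, d) {
  list(ab = a + b, cd = c + d)
}

innerfunction2 <- function(AB, CD) {
  print(AB + CD)
}

main <- function() {
  a <- 1; b <- 2; c <- 3; d <- 4
  outcome1 <- innerfunction1(a, b, c, d)
  innerfunction2(outcome1$ab, outcome1$cd)
}

main()  # prints 10, even when a, b, c, d don't exist outside main()
```

Passing arguments keeps each function self-contained, which is generally preferable to the alternatives (defining the inner functions inside main() so they share its environment, or resorting to environment-poking tools).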
Re: [R] help with making figures
Bert, Can you help me understand your suggestion? I don't understand how I can include all 30 sites under the label called site in the xyplot code example you provided. -- View this message in context: http://r.789695.n4.nabble.com/help-with-making-figures-tp4645074p4645216.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] barplot with some 0 frequencies
Thanks for all your answers! Here is another way I discovered this morning: barplot(table(factor(variable, levels = 1:4)), names=c(1,2,3,4)) -- View this message in context: http://r.789695.n4.nabble.com/barplot-with-some-0-frequencies-tp4645102p4645208.html Sent from the R help mailing list archive at Nabble.com.
[R] LaTeX consistent publication graphics from R and Comparison of GLE and R
Hi everyone,

I am currently preparing my thesis and am looking at producing a few organigrams / flow charts (unrelated to the calculations in R) as well as a range of charts (barcharts, histograms, ...) based on calculations in R. For the organigrams I am looking at an open-source package called GLE on SourceForge, which typesets the text parts of figures in LaTeX; that is very neat and matches the style of the thesis, which I wrote in LaTeX. It also offers a range of graphical features, and I am quite tempted: it produces barcharts and histograms with options for legends etc. I have done most of my graphs so far with R, but with organigrams and flow charts I am at a loss (a pointer here would also be very welcome). For some charts I have used MS Visio, but it would be convenient to use just one program for graphing throughout the thesis (i.e. same colour coding etc.).

Does anybody have experience with GLE, ideally feeding it CSV tables generated within R? Or is there another way to generate 'visually LaTeX-consistent' graphics within R? Any takers?

- Christian Langkamp
christian.langkamp-at-gmxpro.de

--
View this message in context: http://r.789695.n4.nabble.com/LaTeX-consistent-publication-graphics-from-R-and-Comparison-of-GLE-and-R-tp4645218.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] loop for column substraction of a matrix
Hello,

1. Let me rephrase it a bit: for each column, sum all the others and take the negative.

mat <- matrix(1:30, ncol = 3)
sapply(seq_len(ncol(mat)), function(i) -rowSums(mat[, -i]))

2. write.table (maybe using sep = "\t"?) and send the file to the printer.

Hope this helps,
Rui Barradas

Em 05-10-2012 21:05, eliza botto escreveu:
Dear useRs, I have a matrix with 38 columns and 365 rows. What I want to do is the following. 1. Subtract from each column first itself and then all the remaining columns. More precisely, from column number 1 I will first subtract itself (column 1) and then the remaining 37 columns; afterwards I will take column number 2 and do the same, proceeding this way up to the 38th column. 2. I want to print the resulting matrix on A4 paper. Bundle of thanks in advance. Best regards, eliza
Re: [R] Calculating all possible ratios
Thanks Rui, I actually did exactly that.

Date: Fri, 5 Oct 2012 05:49:20 -0700
From: ml-node+s789695n4645154...@n4.nabble.com
To: genome1...@hotmail.com
Subject: Re: Calculating all possible ratios

Hello,

Comment out the second apply and all following instructions using 'r2'. In the end return 'r1', not the cbind.

Hope this helps,
Rui Barradas

Em 04-10-2012 23:38, genome1976 escreveu:
Hi Rui, a while ago you helped me with calculating all possible ratios from a dataset. This is the code I am using, as suggested by you:

data <- read.table("new_data.txt", header = TRUE, row.names = 1, sep = "\t")

pairwise.ratios <- function(x, prefix = "probeset", char = ":") {
  n  <- ncol(x)
  cn <- colnames(x)
  if (length(cn) == 0) {
    cn <- gsub(" ", "0", formatC(seq.int(n), width = nchar(n)))
    cn <- paste(prefix, cn, sep = "")
  }
  cmb <- combn(n, 2)
  r1 <- apply(cmb, 2, function(j) x[, j[1]] / x[, j[2]])
  r2 <- apply(cmb, 2, function(j) x[, j[2]] / x[, j[1]])
  colnames(r1) <- apply(cmb, 2, function(j) paste(cn[j], collapse = char))
  colnames(r2) <- apply(cmb, 2, function(j) paste(cn[rev(j)], collapse = char))
  cbind(r1, r2)[, order(c(colnames(r1), colnames(r2)))]
}

results <- pairwise.ratios(data.t)
write.table(t(results), "ratios_results.txt", sep = "\t")

It works perfectly fine, except that it gives both ratios a:b and b:a for any two variables a and b. Can you suggest a way so that I get only one ratio and not both (combinations without regard to order, not permutations)? Thanks for any help.

Best regards,
Som.

Date: Sat, 12 May 2012 15:20:52 -0700
From: [hidden email]
To: [hidden email]
Subject: RE: Calculating all possible ratios

Hello,

Nothing wrong here; maybe your R session has some conflicting objects.
Running the function in the previous post on the first 4 rows and first 6 columns of your dataset, the result was (copy & paste into your session):

result <- structure(c(8.74714923153198, 1.83094400392095, 9.92065138471113,
1.77145415014708, 1.01515180575001, 0.167175438316099, 0.222321656865252,
0.155576771874649, 3.09417748158541, 0.469647988505747, 1.29398633565582,
0.524043736521509, 3.75969597954255, 0.422694576901317, 9.75471698113208,
0.290397651827521, 4.9035575319622, 1.00105273231888, 1.01093964697178,
0.26895145631068, 0.114322960947685, 0.546166347992352, 0.100799832714726,
0.564507977763338, 0.11605516024473, 0.0913055986191245, 0.0224099858208782,
0.0878243288779063, 0.353735531392494, 0.256505926724138, 0.130433606169248,
0.295826869963301, 0.42981957664441, 0.230861553382365, 0.983273839877614,
0.163931791180376, 0.56058921623124, 0.546741314958369, 0.10190254729944,
0.151825242718447, 0.9850743448771, 5.98173996175908, 4.49798734905118,
6.4276947512815, 8.61659229879359, 10.9522309159971, 44.622964422,
11.3863665430362, 3.04799485560622, 2.8093121408046, 5.82033416762497,
3.36839317468124, 3.70358005398494, 2.52844904226946, 43.8765935747068,
1.86658746243623, 4.83036872336483, 5.98803713273998, 4.5471937427,
1.72873786407767, 0.323187666496628, 2.12925430210325, 0.772805687699305,
1.90823767237023, 2.82697074863659, 3.89854539725884, 7.66673581578674,
3.38035554418724, 0.328084543240185, 0.35595902124055, 0.1718114409242,
0.296877457036954, 1.21508737036511, 0.900024246342843, 7.53850076491586,
0.554147739185128, 1.58476931628683, 2.13149583692219, 0.781259909100518,
0.513223300970874, 0.265978952936953, 2.36577437858509, 0.102514506769826,
3.44355401535389, 2.32655759378615, 4.33160041310018, 1.01701068353905,
6.10009805175427, 0.270009014365446, 0.395499368696959, 0.0227911949977918,
0.535737017484743, 0.822986086753186, 1.11108117816092, 0.132652370966651,
1.8045729131197, 1.30424309801742, 2.36826490573261, 0.103635979283374,
0.926148867313916, 0.203933571388086, 0.998948374760994, 0.989178733859585,
3.71814309436142, 1.78383738225087, 1.82901853699522, 9.81329737579089,
6.58652001534723, 0.207023533247665, 0.166999632405824, 0.219915855047535,
0.578456699988768, 0.631006664328306, 0.469154094827586, 1.27998376513563,
1.9484696000908, 0.76672822844154, 0.422250060615857, 9.64915859255482,
1.07974002376127), .Dim = c(4L, 30L), .Dimnames = list(c("S1", "S2", "S3", "S4"),
c("P1:P2", "P1:P3", "P1:P4", "P1:P5", "P1:P6", "P2:P1", "P2:P3", "P2:P4",
"P2:P5", "P2:P6", "P3:P1", "P3:P2", "P3:P4", "P3:P5", "P3:P6", "P4:P1",
"P4:P2", "P4:P3", "P4:P5", "P4:P6", "P5:P1", "P5:P2", "P5:P3", "P5:P4",
"P5:P6", "P6:P1", "P6:P2", "P6:P3", "P6:P4", "P6:P5")))

Rui Barradas

genome1976 wrote:
Hi Rui, thanks once again, I really appreciate it. I tried using the code with the following dataset:

Sample    P1      P2     P3      P4      P5      P6      P7  P8  P9  P10
S1        5292.9  605.1  5213.9  1710.6  1407.8  1079.4
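Following Rui's advice above (keep only the 'r1' half), a minimal sketch of the single-direction variant, shown on a tiny made-up data frame; since combn() enumerates each pair exactly once, only a:b is produced and never b:a:

```r
# Sketch: pairwise ratios in one direction only (assumes a numeric data frame x
# with column names; 'pairwise.ratios1' is an illustrative name, not from the thread)
pairwise.ratios1 <- function(x, char = ":") {
  cn  <- colnames(x)
  cmb <- combn(ncol(x), 2)   # each pair (i, j) with i < j appears exactly once
  r1  <- apply(cmb, 2, function(j) x[, j[1]] / x[, j[2]])
  colnames(r1) <- apply(cmb, 2, function(j) paste(cn[j], collapse = char))
  r1
}

d <- data.frame(P1 = c(2, 4), P2 = c(1, 2), P3 = c(4, 8))
pairwise.ratios1(d)   # columns P1:P2, P1:P3, P2:P3 only
```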
Re: [R] loop for column substraction of a matrix
Hi,

Try this:

set.seed(1)
mat1 <- matrix(sample(1:500, 380, replace = TRUE), ncol = 38)
list1 <- list()
for (i in 1:ncol(mat1)) {
  list1[[i]] <- t(apply(mat1, 1, function(x) x[i] - x))
}
list1

A.K.

- Original Message -
From: eliza botto eliza_bo...@hotmail.com
To: r-help@r-project.org
Sent: Friday, October 5, 2012 4:05 PM
Subject: [R] loop for column substraction of a matrix
[snip]
Re: [R] LaTeX consistent publication graphics from R and Comparison of GLE and R
On Oct 5, 2012, at 3:32 PM, clangkamp christian.langk...@gmxpro.de wrote: Hi Everyone I am at the moment preparing my thesis and am looking at producing a few Organigrams / Flow charts (unrelated to the calculations in R) as well as a range of charts (barcharts, histograms, ...) based on calculations in R. For the Organigrams I am looking at an Opensource package called GLE at sourceforge, which produces the text part in Latex figures which is very neat and also in the same style of the thesis, which I wrote in LaTeX. It also offers a range of graphical features, and I am quite tempted. It also produces barcharts and histograms with the options of legends etc. I have done most of my graphs so far with R, but with Organigrams and flow charts I am at a loss (A pointer here would also be very welcome). For some charts I have used MS Visio, but it would be convenient to use just one program for graphing throughout the thesis (i.e. same colour coding etc.). Does anybody have any experience with GLE, ideally working with it with CSV tables generated within R ? Or does there exist another way to generate 'visually LaTeX consistent' graphics within R ? Any takers ? If you are comfortable in LaTeX, I would suggest that you look at PSTricks: http://tug.org/PSTricks/main.cgi I use that for creating subject disposition flow charts for clinical trials with Sweave. I can then use \Sexpr{}'s to fill in various annotations in the boxes, etc. so that all content is programmatically created in a reproducible fashion. There are some examples of flow charts and tree diagrams here: http://tug.org/PSTricks/main.cgi?file=pst-node/psmatrix/psmatrix#flowchart and there are various other online resources for using PSTricks. Keep in mind that since this is PostScript based, you need to use a latex + dvips + ps2pdf sequence, rather than just pdflatex. 
Regards,

Marc Schwartz
Re: [R] Using variables from different environments in one function
Hi Rik,

I don't fully understand it, but setting local=TRUE in your source() calls will avoid the specific error you noted.

Best,
Ista

On Fri, Oct 5, 2012 at 3:47 PM, Rik Verdonck rik.verdo...@bio.kuleuven.be wrote:
[snip]
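As an illustrative sketch (not from the thread), the usual way to avoid this scoping problem altogether is to pass the values explicitly as arguments, so that neither inner function depends on what happens to be visible in the calling environment:

```r
# Sketch: pass values explicitly instead of relying on lookup
# in the calling environment
innerfunction1 <- function(a, b, c, d) {
  list(ab = a + b, cd = c + d)
}

innerfunction2 <- function(AB, CD) {
  print(AB + CD)
}

main <- function() {
  outcome1 <- innerfunction1(a = 1, b = 2, c = 3, d = 4)
  innerfunction2(AB = outcome1$ab, CD = outcome1$cd)
}

main()  # prints [1] 10
```

With arguments, main() works the same whether it is run as a script or wrapped in a function, because no free variables need to be found by environment lookup.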
Re: [R] loop for column substraction of a matrix
Hi,

Sorry, I think I misunderstood your question (after reading Rui's solution). You can also try any of these to get the result, if this is what you meant:

set.seed(1)
mat1 <- matrix(sample(1:500, 380, replace = TRUE), ncol = 38)
res1 <- t(do.call(rbind, lapply(1:ncol(mat1), function(i) mat1[, i] - apply(mat1, 1, sum))))
# or
res1 <- sapply(1:ncol(mat1), function(i) mat1[, i] - apply(mat1, 1, sum))
# or
res1 <- mapply(FUN = function(i) mat1[, i] - rowSums(mat1), 1:ncol(mat1))

A.K.

- Original Message -
From: eliza botto eliza_bo...@hotmail.com
To: r-help@r-project.org
Sent: Friday, October 5, 2012 4:05 PM
Subject: [R] loop for column substraction of a matrix
[snip]
Re: [R] Dúvida função Anova pacote car - Medidas repetidas
Dear Diego,

This is close enough to Spanish for me to understand it (I think). Using Anova() in the car package for repeated-measures designs requires a multivariate linear model for all of the responses, which in turn requires that the data set be in wide format, with each response as a variable. In your case, there are two crossed within-subjects factors and no between-subjects factors. If this understanding is correct (but see below), then you could proceed as follows, where the crucial step is reshaping the data from long to wide:

------- snip --------

Pa2$type.day <- with(Pa2, paste(Type, Day, sep="."))
(Wide <- reshape(Pa2, direction="wide", v.names="logbiovolume",
                 idvar="Replicate", timevar="type.day", drop=c("Type", "Day")))
day <- ordered(rep(c(0, 2, 4), each=2))
type <- factor(rep(c("c", "t"), 3))
(idata <- data.frame(day, type))
mod <- lm(cbind(logbiovolume.c.0, logbiovolume.t.0, logbiovolume.c.2,
                logbiovolume.t.2, logbiovolume.c.4, logbiovolume.t.4) ~ 1,
          data=Wide)
Anova(mod, idata=idata, idesign=~day*type)

------- snip --------

This serves to analyze the data that you showed; you'll have to adapt it for the full data set. I'm assuming that the replicates are independent units, and that the design is therefore entirely within replicate. If that's wrong, then the analysis I've suggested is also incorrect.

I hope this helps,
John

---
John Fox
Senator William McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Diego Pujoni
Sent: Friday, October 05, 2012 9:57 AM
To: r-help@r-project.org
Subject: [R] Dúvida função Anova pacote car - Medidas repetidas

Hi everyone, I am running a repeated-measures ANOVA using the Anova function from the car package. I measured algal biovolume every two days for 10 days (in the data set below I only included up to day 4). I have 2 treatments (c, t) and the experiment was run in triplicate (A, B, C).
Pa2
   Day Type Replicate logbiovolume
1    0    c         A        19.34
2    0    c         B        18.27
3    0    c         C        18.56
4    0    t         A        18.41
5    0    t         B        18.68
6    0    t         C        18.86
7    2    c         A        18.81
8    2    c         B        18.84
9    2    c         C        18.52
10   2    t         A        18.29
11   2    t         B        17.91
12   2    t         C        17.67
13   4    c         A        19.16
14   4    c         B        18.85
15   4    c         C        19.36
16   4    t         A        19.05
17   4    t         B        19.09
18   4    t         C        18.26
.
.
.

Pa2.teste <- within(Pa2, {
  group <- factor(Type)
  time  <- factor(Day)
  id    <- factor(Replicate)
})
matrix <- with(Pa2.teste, cbind(Pa2[, VAR][group == "c"], Pa2[, VAR][group == "t"]))

matrix
       [,1]  [,2]
 [1,] 19.34 18.41
 [2,] 18.27 18.68
 [3,] 18.56 18.86
 [4,] 18.81 18.29
 [5,] 18.84 17.91
 [6,] 18.52 17.67
 [7,] 19.16 19.05
 [8,] 18.85 19.09
 [9,] 19.36 18.26
[10,] 19.63 18.96
[11,] 19.94 18.06
[12,] 19.54 18.37
[13,] 19.98 17.96
[14,] 20.99 17.93
[15,] 20.45 17.74
[16,] 21.12 17.60
[17,] 21.66 17.33
[18,] 21.51 18.12

model <- lm(matrix ~ 1)
design <- factor(c("c", "t"))
options(contrasts = c("contr.sum", "contr.poly"))
aov <- Anova(model, idata = data.frame(design), idesign = ~design, type = "III")
summary(aov, multivariate = FALSE)

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

                 SS num Df Error SS den Df         F    Pr(>F)
(Intercept) 12951.2      1   6.3312     17 34775.336 < 2.2e-16 ***
design         19.1      1  17.3901     17    18.697 0.0004606 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The problem is that I think this function is not taking the days or the replicates into account. How do I introduce this into the analysis? Do you know of a corresponding non-parametric function for this test? Something like a Friedman test with two groups (treatment and replicate) and one block (time)?

Many thanks,
Diego PJ
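To make the crucial long-to-wide step concrete, here is a toy illustration (made-up mini data, not from the thread) of how reshape() turns one row per (Replicate, type.day) combination into one row per Replicate:

```r
# Toy long-format data: two replicates, two combined type.day conditions
long <- data.frame(
  Replicate = rep(c("A", "B"), each = 2),
  type.day  = rep(c("c.0", "t.0"), 2),
  y         = c(1.1, 2.2, 3.3, 4.4)
)

# direction="wide" spreads 'y' across columns keyed by 'type.day',
# leaving one row per Replicate with columns y.c.0 and y.t.0
reshape(long, direction = "wide", v.names = "y",
        idvar = "Replicate", timevar = "type.day")
```

The resulting wide columns (here y.c.0, y.t.0) are what go into the cbind() on the left-hand side of the multivariate lm() in John's answer.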
Re: [R] loop for column substraction of a matrix
Dear Arun and Barradas, millions of thanks. You guys always rock.

Regards,
eliza

Date: Fri, 5 Oct 2012 14:41:38 -0700
From: smartpink...@yahoo.com
Subject: Re: [R] loop for column substraction of a matrix
To: eliza_bo...@hotmail.com
CC: ruipbarra...@sapo.pt; r-help@r-project.org
[snip]
Re: [R] loop for column substraction of a matrix
Or, using your mapply solution, compute rs <- rowSums(mat1) once and then use it for all the subtractions. With a larger dataset this would probably be faster.

Rui Barradas

Em 05-10-2012 22:41, arun escreveu:
Hi, sorry, I think I misunderstood your question (after reading Rui's solution). You can also try any of these to get the result, if this is what you meant:

set.seed(1)
mat1 <- matrix(sample(1:500, 380, replace = TRUE), ncol = 38)
res1 <- t(do.call(rbind, lapply(1:ncol(mat1), function(i) mat1[, i] - apply(mat1, 1, sum))))
# or
res1 <- sapply(1:ncol(mat1), function(i) mat1[, i] - apply(mat1, 1, sum))
# or
res1 <- mapply(FUN = function(i) mat1[, i] - rowSums(mat1), 1:ncol(mat1))

A.K.
[snip]
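A minimal sketch of Rui's suggestion, precomputing the row sums once; note that since rowSums() returns one value per row, plain vector recycling even does the whole subtraction in a single step:

```r
set.seed(1)
mat1 <- matrix(sample(1:500, 380, replace = TRUE), ncol = 38)

rs <- rowSums(mat1)  # row sums computed once, reused for every column

# reuse rs in the mapply call instead of recomputing rowSums 38 times
res1 <- mapply(function(i) mat1[, i] - rs, 1:ncol(mat1))

# equivalently, recycling subtracts rs from each column in one vectorized step
res2 <- mat1 - rs

stopifnot(all.equal(res1, res2))  # both give column i minus the full row sums
```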