[R] is there any option like cex.axis in ggplot2?
Dear list,

I made boxplots using ggplot2 and want to control the text size of the x- and y-axis. With plot() I can do this by setting cex.axis to any size, but I can't figure out how to do it with ggplot2.

    ggplot(dat, aes(x = factor(time), y = volume)) +
      opts(axis.title.x = theme_text(size = 8),
           axis.title.y = theme_text(size = 8)) +
      geom_boxplot() +
      geom_jitter(aes(colour = id)) +
      labs(x = "time", y = "volume")

Thanks for your help,
Tom

[[alternative HTML version deleted]]
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ggplot2: labels points with different colors or idnumbers
Dear list,

Using ggplot2 I could produce both a boxplot and points in the same plot, but instead of plain points I would like to label the different subjects with different colors or with their id numbers. Is there a way to do it? Also, how can I put three plots on the same page with ggplot2? mfrow=c(3,1) did not do the job.

dat
   group time   id  freq
1      1   00 0018  5.21
2      1   00 3026  3.13
3      1   00 5030  5.04
4      1   00 5108  3.23
5      1   00 5152  3.97
6      1   00 6080  0.16
7      1   01 0018  4.89
8      1   01 3026  6.58
9      1   01 5030  7.42
10     1   01 5108 10.10
11     1   01 5152  3.74
12     1   01 6080  0.81

    library(ggplot2)
    qplot(factor(dat$time), dat$freq, data = dat, geom = c("boxplot", "jitter"),
          ylab = names(dat)[4], xlab = "time")
[R] select specific listcomponents to calculate the means
Dear list,

I have the following list:

[[1]]
          Pnr timeCA CACen
1 62083014541  0.008  TRUE
2 62083014542  0.008  TRUE
3 62083014543  0.008 FALSE
4 62083014544  0.013  TRUE
5 62083014545  0.007 FALSE

[[2]]
          Pnr timeCA CACen
1 64031471161  0.020 FALSE
2 64031471162  0.089 FALSE
3 64031471163  0.020 FALSE
4 64031471164  0.025 FALSE
5 64031471165  0.012 FALSE

[[3]]
          Pnr timeCA CACen
1 49051274131  0.008  TRUE
2 49051274132  0.007  TRUE
3 49051274133  0.003  TRUE
4 49051274134  0.006  TRUE

[[4]]
          Pnr timeCA CACen
1 50092771371  0.008  TRUE
2 50092771372  0.009  TRUE
3 50092771373  0.008 FALSE
4 50092771374  0.009 FALSE
5 50092771375  0.008 FALSE

How do I tell R to select the list elements containing both TRUE and FALSE to calculate the weighted means, and those with only TRUE or only FALSE to calculate the arithmetic means, and then put all the means together in a data frame? The result should look like

       Pnr  Mean
6208301454  weighted mean
6403147116  arith. mean
4905127413  arith. mean
5009277137  weighted mean

Thanks for any help,
Tom
[R] error when using logistic.display within a loop
Dear list,

I tried to apply logistic regression to different response variables from a data frame and would like to store the results in a list, using the function logistic.display from the epicalc package, but I got the error message

    Error in eval(expr, envir, enclos) : y values must be 0 <= y <= 1

All the response variables have values of 0 or 1. It worked well for summary and anova, but not for logistic.display in the loop.

Thanks for any help,
Tom

    logreg1      <- vector("list", length(l))
    logreg.anov1 <- vector("list", length(l))
    logreg.summ1 <- vector("list", length(l))
    logreg.conf1 <- vector("list", length(l))
    for (i in c(13:16)) {
      logreg1[[i-x]]      <- glm(dat[,i] ~ group + age, family = binomial, data = Ndat)
      logreg.anov1[[i-x]] <- anova(logreg1[[i-x]])
      logreg.summ1[[i-x]] <- summary(logreg1[[i-x]])
      logreg.conf1[[i-x]] <- logistic.display(logreg1[[i-x]], crude = FALSE)
    }
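The loop above takes the response from dat[, i] while passing data = Ndat, so the response and the predictors come from different objects. A minimal sketch of one way to keep everything in a single data frame is to build each formula by name with reformulate(). The data frame and the column names resp1 and resp2 below are made up for illustration, and the thread does not confirm whether this removes the logistic.display error.

```r
# Synthetic stand-in for the poster's data; all column names are hypothetical.
set.seed(1)
Ndat <- data.frame(
  group = factor(rep(c("a", "b"), each = 50)),
  age   = round(runif(100, 20, 70)),
  resp1 = rbinom(100, 1, 0.4),
  resp2 = rbinom(100, 1, 0.6)
)
responses <- c("resp1", "resp2")

fits <- vector("list", length(responses))
names(fits) <- responses
for (r in responses) {
  # reformulate() builds e.g. resp1 ~ group + age, so every variable
  # in the model is looked up by name in Ndat.
  f <- reformulate(c("group", "age"), response = r)
  fits[[r]] <- glm(f, family = binomial, data = Ndat)
}

# Post-processing can then be applied to each stored model.
summaries <- lapply(fits, summary)
anovas    <- lapply(fits, anova)
```

Storing the models in a named list also avoids the index arithmetic (i - x) used in the original loop.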
[R] help with calculating the differences between dates
Dear list,

How can I calculate the difference in days between the eventdate and basedate in the dataset below?

     id   basedate outcome.3  eventdate daydiff
1  1001 1999-09-28         2 1999-10-01       3
2  1002 1999-09-22         1
3  1003 2000-01-19         1
4  1004 2004-01-25         2 2004-02-03       9
5  1005 2005-08-11         1
6  1006 2000-07-04         1 2001-05-29
7  1007 2004-02-12         1 2004-11-18
8  1008 2006-01-18         2 2006-02-02
9  1009 2005-04-29         2 2005-06-14
10 1010 2006-03-17         2 2006-03-31
11 1011 2000-03-21         2 2000-03-28
12 1012 2004-07-12         1 2006-11-28
13 1013 2000-02-24         1
14 1014 2003-04-17         1
15 1015 2000-04-05         1

Thanks for any help,
Tom
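One way to get daydiff, sketched here in base R on four rows copied from the table above: convert both columns with as.Date() and subtract. Representing the empty eventdate cells as NA is an assumption about how the data were read in.

```r
# Four rows modelled on the table in the post; NA marks a missing eventdate.
dat <- data.frame(
  id        = c(1001, 1002, 1004, 1006),
  basedate  = c("1999-09-28", "1999-09-22", "2004-01-25", "2000-07-04"),
  eventdate = c("1999-10-01", NA, "2004-02-03", "2001-05-29"),
  stringsAsFactors = FALSE
)

# ISO dates (yyyy-mm-dd) are parsed by as.Date() without a format string.
dat$basedate  <- as.Date(dat$basedate)
dat$eventdate <- as.Date(dat$eventdate)

# Subtracting two Date vectors gives a difftime in days;
# rows without an eventdate stay NA.
dat$daydiff <- as.numeric(dat$eventdate - dat$basedate)
dat
```

The first and third rows reproduce the daydiff values 3 and 9 that were filled in by hand in the post.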
[R] Reorder the x-axis using lattice
Dear list,

Is there a way to reorder the x-axis using lattice? With the following data, the x-axis is ordered as BP GH MH PF RE RP SF VT, but I would like it to be ordered as PF RP BP GH VT SF RE MH.

       Kön Skalor     Tillfälle Medelvärde
1  Kvinnor     BP 1-inskrivning   36.45283
2  Kvinnor     GH 1-inskrivning   38.62255
3  Kvinnor     MH 1-inskrivning   62.9
4  Kvinnor     PF 1-inskrivning   39.80710
5  Kvinnor     RE 1-inskrivning   41.50943
6  Kvinnor     RP 1-inskrivning   22.2
7  Kvinnor     SF 1-inskrivning   59.19811
8  Kvinnor     VT 1-inskrivning   34.84568
9  Kvinnor     BP 2-utskrivning   43.14815
10 Kvinnor     GH 2-utskrivning   44.11321
11 Kvinnor     MH 2-utskrivning   77.2
12 Kvinnor     PF 2-utskrivning   44.74280
13 Kvinnor     RE 2-utskrivning   68.95425
14 Kvinnor     RP 2-utskrivning   39.90385
15 Kvinnor     SF 2-utskrivning   64.62264
16 Kvinnor     VT 2-utskrivning   51.97531

    bwplot(Medelvärde ~ Skalor | Kön, kt, panel = panel.superpose, groups = Tillfälle,
           scales = list(x = list(rot = 45), cex = 0.7, alternating = 2),
           panel.groups = panel.linejoin, lty = c(1:3), lwd = 3,
           col = c("steelblue", "grey50", "green4"),
           ylab = list(label = "skalpoäng (0-100)", cex = 0.8),
           xlab = list(label = "skalor", cex = 0.8),
           key = list(lines = Rows(list(col = c("steelblue", "grey50", "green4"),
                                        lty = c(1:3)), c(1:3, 0)),
                      cex = 0.8,
                      text = list(lab = as.character(unique(kt$Tillfälle))),
                      columns = 2,
                      title = "SF-36: Skalpoäng för respektive kön vid 3 mättillfälle",
                      cex.title = 0.9))

Thanks in advance,
Tom
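lattice draws the categories of a factor in the order of its levels, and a factor created from text gets alphabetical levels by default. A minimal sketch of resetting the level order in base R follows; the vector names Skalor, wanted and Skalor2 are illustrative only.

```r
# The scale codes from the post; factor() sorts the levels alphabetically.
Skalor <- factor(c("BP", "GH", "MH", "PF", "RE", "RP", "SF", "VT"))
levels(Skalor)

# The order requested in the post.
wanted <- c("PF", "RP", "BP", "GH", "VT", "SF", "RE", "MH")

# Re-creating the factor with explicit levels changes only the ordering
# of the categories, not the values themselves.
Skalor2 <- factor(Skalor, levels = wanted)
levels(Skalor2)
```

Applied to the data in the post, this would be something like kt$Skalor <- factor(kt$Skalor, levels = wanted) before the bwplot() call.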
[R] options in 'rnorm' to set the lower bound of normal distribution to 0 ?
Dear list,

I have a dataset containing values obtained from two different instruments (x and y). I want to generate 5 samples from a normal distribution for each instrument, based on their means and standard deviations. The problem is that values from both instruments are non-negative, so with rnorm I would get some negative values. Is there an option to set the lower bound of the normal distribution to 0, or can I simulate the samples in a different way to avoid the negative values?

dat
    id     x         y
75 101 0.134 0.1911315
79 102 0.170 0.1610306
76 103 0.134 0.1911315
84 104 0.170 0.1610306
74 105 0.134 0.1911315
80 106 0.170 0.1610306
77 107 0.134 0.1911315
81 108 0.170 0.1610306
82 109 0.170 0.1610306
78 111 0.170 0.1610306
83 112 0.170 0.1610306
85 113 0.097 0.278
2  201 1.032 1.5510434
1  202 0.803 1.0631001
5  203 1.032 1.5510434

    mu    <- apply(dat[,-1], 2, mean)
    sigma <- apply(dat[,-1], 2, sd)
    len <- 5
    n   <- 20
    s1  <- vector("list", len)
    set.seed(7)
    for (i in 1:len) {
      s1[[i]] <- cbind.data.frame(x = rnorm(n*i, mean = mu[1], sd = sigma[1]),
                                  y = rnorm(n*i, mean = mu[2], sd = sigma[2]))
    }

Thanks for any help,
Tom
Re: [R] options in 'rnorm' to set the lower bound of normal distribution to 0 ?
Thanks Prof Brian for your suggestion. I should have known that for right-skewed data one should generate the samples from a lognormal.

My problem is that x and y are two instruments that were thought to measure the same thing, but the confidence interval of the difference between the two instruments is wide. It may be true that the two measure differently, but it can also be due to the small number of observations. So the idea is that if I increase the sample size, by generating samples based on the means and standard deviations of x and y, I may get better precision for the difference between the two instruments.

I am using 'urlnorm', which allows sampling from a truncated distribution, since I want the samples to take values from 0 to max(x) and max(y) respectively. I am unsure how to specify the means and standard deviations in 'urlnorm'. Based on the x- and y-values I have standard deviations sd_x=0.3372137, sd_y=0.5120841 and means mean_x=0.3126667, mean_y=0.4223137, which are not on the log scale as required in urlnorm. To convert sd_x, sd_y and mean_x, mean_y to the log scale I did sd_logx=sqrt(log(1.3372137))=0.54, sd_logy=sqrt(log(1.5120841))=0.64, mean_logx=-(0.54^2)/2 and mean_logy=-(0.64^2)/2.

Can anyone tell me if these are correctly calculated? Are these the values to be specified in urlnorm? Do the lower and upper bounds have to be on the log scale as well, or on which scale?

    set.seed(7)
    for (i in 1:len) {
      s1[[i]] <- cbind.data.frame(
        x = urlnorm(n*i, meanlog = mean_logx, sdlog = sd_logx, lb = 0, ub = max(x)),
        y = urlnorm(n*i, meanlog = mean_logy, sdlog = sd_logy, lb = 0, ub = max(y)))
    }

Thanks again for any suggestions.

Prof Brian Ripley [EMAIL PROTECTED] wrote:

On Thu, 27 Mar 2008, Tom Cohen wrote:

Dear list, I have a dataset containing values obtained from two different instruments (x and y). I want to generate 5 samples from normal distribution for each instrument based on their means and standard deviations. The problem is values from both instruments are non-negative, so if using rnorm I would get some negative values. Is there any options to determine the lower bound of normal distribution to be 0 or can I simulate the samples in different ways to avoid the negative values?

Well, that would not be a normal distribution. If you want a _truncated_ normal distribution it is very easy by inversion. E.g.

    trunc_rnorm <- function(n, mean = 0, sd = 1, lb = 0) {
      lb <- pnorm(lb, mean, sd)
      qnorm(runif(n, lb, 1), mean, sd)
    }

but I suggest you may rather want samples from a lognormal.

[...]

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK
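As a rough illustration of the two suggestions in this thread, the sketch below samples from a truncated normal by inversion and from a lognormal whose parameters are matched to a given mean and standard deviation. The helper lnorm_params() is a name made up here, and it uses the standard moment identities sdlog = sqrt(log(1 + (sd/mean)^2)) and meanlog = log(mean) - sdlog^2/2. These differ from the values in the post, which used log(1 + sd) and set the mean term aside. The sketch uses rlnorm() from base R rather than urlnorm, so it does not impose an upper bound.

```r
# Truncated normal by inversion, following the function quoted above:
# draw uniforms on [P(X <= lb), 1] and map them back through qnorm().
trunc_rnorm <- function(n, mean = 0, sd = 1, lb = 0) {
  p_lb <- pnorm(lb, mean, sd)
  qnorm(runif(n, p_lb, 1), mean, sd)
}

# Hypothetical helper: lognormal parameters that reproduce a given
# mean m and standard deviation s on the original scale.
lnorm_params <- function(m, s) {
  sdlog   <- sqrt(log(1 + (s / m)^2))
  meanlog <- log(m) - sdlog^2 / 2
  c(meanlog = meanlog, sdlog = sdlog)
}

set.seed(7)
mean_x <- 0.3126667   # values quoted in the post
sd_x   <- 0.3372137

x_tn <- trunc_rnorm(1000, mean = mean_x, sd = sd_x, lb = 0)

px   <- lnorm_params(mean_x, sd_x)
x_ln <- rlnorm(1000, meanlog = px[["meanlog"]], sdlog = px[["sdlog"]])

# Mean and sd implied by the lognormal parameters, for checking.
implied_mean <- exp(px[["meanlog"]] + px[["sdlog"]]^2 / 2)
implied_sd   <- sqrt((exp(px[["sdlog"]]^2) - 1) *
                     exp(2 * px[["meanlog"]] + px[["sdlog"]]^2))
```

Note that truncating a normal at 0 changes its mean and standard deviation, so the truncated samples will not have the mean and sd passed to trunc_rnorm().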
[R] Repeated measures using lme
Dear list,

I am trying to do a repeated measures analysis using lme in R and am a little unsure whether I have set up the right model. IL6 (interleukin 6) was measured 5 times on each individual in each of 6 companies. The hypotheses are whether there is a relationship between IL6 and the total dust in each of the companies, and whether IL6 changes across time points. So the fixed effects are total dust and company, and the random effect is individual. The model would be like this, with time as the repeated measure:

    lme(IL6 ~ dust + time*company, random = ~time|individual,
        correlation = corAR1(form = ~time|individual), data = dat)

The analysis in SAS would be

    proc mixed data=dat;
      class time individual company;
      model IL6 = dust company time company*time;
      repeated time / sub=individual(company) type=AR(1) r rcorr;
      random individual;
      lsmeans company time company*time / slice=time;
    run;

Am I writing the right code in R to get the same results as the SAS analysis? Also, is there a command in R that does the same thing as SLICE in SAS, to test at which time points the companies differ?

Thanks for any help,
Tom
[R] Change the color and lines of the legend using bwplot
Dear list,

I have the following plot, where I have set the colors (red and green) and line types (lty=2:3) in panel.groups, but I cannot figure out how to change the lines and colors of the legend in the key to match those in panel.groups.

    bwplot(means ~ age | scales, dat, panel = panel.superpose, groups = sex,
           scales = list(x = list(rot = 45), cex = 0.7, alternating = 2),
           panel.groups = panel.linejoin, lwd = 1.2, lty = c(2:3), type = "b",
           col = c("red", "green"),
           ylab = list(label = "mean value", cex = 0.8),
           xlab = list(label = "scales", cex = 0.8),
           key = list(lines = Rows(trellis.par.get("superpose.line"), c(1:2, 0)),
                      cex = 0.8,
                      text = list(lab = as.character(unique(dat$sex))),
                      columns = 2, title = "age sex", cex.title = 0.9))

Thanks for any help,
Tom
[R] Two bwplots in one single graph
Dear list,

With the code below I get 8 bwplots, but I would like to put 2 bwplots in one single graph, so that instead of 8 separate bwplots I would have 4 graphs, each containing 2 bwplots. How can I do that? Another question: how do I add the mean as a point to each boxplot in the bwplot while also keeping the median line?

    for (i in 1:length(dat)) {
      windows()
      with(dat[[i]], print(bwplot(value ~ time | sex + age,
        scales = list(x = list(rot = 45)),
        ylab = list(label = paste(sname[i], "-value", sep = ""), cex = 0.8),
        xaxis = list(cex = 0.6),
        panel = function(x, y) {
          panel.bwplot(x, y, pch = '|', horiz = F, stats = boxplot.stats,
                       fill = "khaki2", varwidth = T)
        })))
    }

Thanks,
Tom
[R] help with bwplot
Dear list,

I have the following data set, for which I want to plot the Scale variable on the x-axis and Mean on the y-axis for each Ageclass and each sex. The Mean values of each Ageclass within each sex should be connected by a line. In total there should be 6 lines: three per sex, one for each Ageclass. Is there an easy way to do this in R?

   Ageclass Scale     Mean    Sex
1     21-40    BP 40.26667 female
2     41-60    BP 34.10714 female
3     61-79    BP 37.3     female
4     21-40    GH 30.25000 female
5     41-60    GH 39.00926 female
6     61-79    GH 49.3     female
7     21-40    MH 56.5     female
8     41-60    MH 62.42857 female
9     61-79    MH 72.72727 female
10    21-40    PF 25.86111 female
11    41-60    PF 42.42063 female
12    61-79    PF 52.17172 female
13    21-40    RE 38.09524 female
14    41-60    RE 42.85714 female
15    61-79    RE 42.42424 female
16    21-40    RP 20.0     female
17    41-60    RP 25.89286 female
18    61-79    RP 15.90909 female
19    21-40    SF 51.7     female
20    41-60    SF 63.9     female
21    61-79    SF 57.95455 female
22    21-40    VT 32.1     female
23    41-60    VT 36.96429 female
24    61-79    VT 33.18182 female
25    21-40    BP 35.0     male
26    41-60    BP 37.75000 male
27    61-79    BP 36.0     male
28    21-40    GH 42.16667 male
29    41-60    GH 41.89062 male
30    61-79    GH 41.4     male
31    21-40    MH 72.0     male
32    41-60    MH 66.60417 male
33    61-79    MH 75.2     male
34    21-40    PF 41.85185 male
35    41-60    PF 55.31250 male
36    61-79    PF 47.0     male
37    21-40    RE 37.03704 male
38    41-60    RE 54.16667 male
39    61-79    RE 46.7     male
40    21-40    RP 27.8     male
41    41-60    RP 28.12500 male
42    61-79    RP 20.0     male
43    21-40    SF 61.1     male
44    41-60    SF 66.40625 male
45    61-79    SF 60.0     male
46    21-40    VT 38.9     male
47    41-60    VT 30.93750 male
48    61-79    VT 42.0     male

Thanks for any help,
Tom
[R] Calculate the difference between dates
Dear list,

I have two data columns (part of a big data frame) containing the dates when two measurements (M1 and M2) were taken. The data consist of 73 individuals divided into different groups. Each group was examined at a different time point (see M1date). The measurements M1 and M2 within each group should have been taken on the same day, but due to some practical issues some M2 measurements were taken on another day. All individuals measured on the same day in column M1date belong to one group. For example, 24 measurements were taken on 8/15/2005 (M1date column, one group of individuals), but only 4 M2 measurements were taken on the same day; the other 24-4=20 M2 measurements in this group were taken 2-65 days later.

For the analysis, I need to know on which day the M2 measurements were taken when they were not taken on the same day as the M1 measurements within each group. Each distinct date in column M1date is the starting date for one group; for example, 8/15/2005 is the starting date for one group and 8/8/2005 is the starting date for another. If an M2 measurement in the first group was taken on 8/15/2005, we say it was taken at day 1; if it was taken on 8/16/2005, we say it was taken at day 2 for that group, and so on. Likewise, for the group with starting date 8/8/2005, M2 measurements taken on that date count as day 1, and those taken on 8/9/2005 count as day 2.

In the data below I have manually calculated the day number of the M2 measurement for some individuals (column Days). Is there an automatic way to do this in R?
      M1date     M2date Days
75 8/15/2005  8/15/2005    1
79 8/15/2005  8/16/2005    2
76 8/15/2005  8/15/2005    1
84 8/15/2005  8/16/2005    2
74 8/15/2005  8/15/2005    1
80 8/15/2005  8/16/2005    2
77 8/15/2005  8/15/2005    1
81 8/15/2005  8/16/2005    2
82 8/15/2005  8/16/2005    2
78 8/15/2005  8/16/2005    2
83 8/15/2005  8/16/2005    2
85 8/15/2005 10/17/2005   62
2  8/15/2005 10/19/2005   64
1  8/15/2005 10/18/2005   63
5  8/16/2005 10/19/2005   65
3  8/15/2005 10/19/2005   65
4  8/15/2005 10/19/2005   65
6  8/15/2005 10/19/2005   65
12  8/8/2005   8/9/2005    2
10  8/8/2005   8/9/2005    2
11  8/8/2005   8/9/2005    2
8   8/8/2005  11/7/2005
7   8/8/2005  11/8/2005
9   8/8/2005   8/8/2005
29  8/8/2005 11/10/2005
25  8/8/2005  11/9/2005
30  8/8/2005 11/10/2005
28  8/8/2005  11/9/2005
32 8/15/2005 11/10/2005
33 8/15/2005 11/10/2005
31 8/15/2005 11/10/2005
24 8/15/2005  11/9/2005
26 8/15/2005  11/9/2005
27 8/15/2005  11/9/2005
14 7/31/2006 11/15/2006
18 7/31/2006 11/13/2006
13 7/31/2006 11/15/2006
16 7/31/2006 11/16/2006
20 7/31/2006 11/14/2006
17 7/31/2006 11/16/2006
19 7/31/2006 11/13/2006
37  8/7/2006   8/7/2006
39  8/7/2006   8/7/2006
42  8/7/2006  9/20/2006
49  8/7/2006  9/21/2006
52  8/7/2006  9/21/2006
50  8/7/2006  9/21/2006
47  8/7/2006  9/21/2006
38  8/7/2006   8/7/2006
45  8/7/2006  9/19/2006
43  8/7/2006  9/20/2006
48  8/7/2006  9/21/2006
36  8/7/2006   8/7/2006
44  8/7/2006  9/20/2006
46  8/7/2006  9/19/2006
51  8/7/2006  9/21/2006
41  8/7/2006   8/8/2006
40  8/7/2006   8/8/2006
68 7/31/2006  9/28/2006
59 7/31/2006   8/1/2006
69 7/31/2006  9/28/2006
71 7/31/2006  9/28/2006
58  8/1/2006  9/27/2006
60 7/31/2006   8/1/2006
70 7/31/2006  9/28/2006
66 7/31/2006  9/28/2006
67 7/31/2006  9/28/2006
72 7/31/2006  9/28/2006
64  8/1/2006   8/1/2006
61 7/31/2006   8/1/2006
62 7/31/2006   8/1/2006
63 7/31/2006   8/1/2006
65 7/31/2006   8/1/2006

Thanks for any help,
Tom
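A possible base R sketch for the day count described above: parse the month/day/year strings with an explicit format and count the starting date as day 1. It assumes that each row's M1date is the starting date of its group; rows whose M1date deviates from the group start (such as the 8/16/2005 entry among the 8/15/2005 rows) would need a separate group start column, which is not shown here.

```r
# Four rows taken from the table in the post.
d <- data.frame(
  M1date = c("8/15/2005", "8/15/2005", "8/8/2005", "8/8/2005"),
  M2date = c("8/15/2005", "8/16/2005", "8/9/2005", "8/8/2005"),
  stringsAsFactors = FALSE
)

# The dates are month/day/year, so as.Date() needs an explicit format.
m1 <- as.Date(d$M1date, format = "%m/%d/%Y")
m2 <- as.Date(d$M2date, format = "%m/%d/%Y")

# The starting date counts as day 1, so add 1 to the plain difference.
d$Days <- as.numeric(m2 - m1) + 1
d
```

For these four rows the result matches the hand-calculated Days column where the post gives one.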
[R] adding the mean and standard deviation to boxplots
Dear list,

How can I add the mean and standard deviation to each of the boxplots, using the example provided in the help for the boxplot function?

    boxplot(len ~ dose, data = ToothGrowth,
            boxwex = 0.25, at = 1:3 - 0.2,
            subset = supp == "VC", col = "yellow",
            main = "Guinea Pigs' Tooth Growth",
            xlab = "Vitamin C dose mg",
            ylab = "tooth length", ylim = c(0, 35), yaxs = "i")
    boxplot(len ~ dose, data = ToothGrowth, add = TRUE,
            boxwex = 0.25, at = 1:3 + 0.2,
            subset = supp == "OJ", col = "orange")
    legend(2, 9, c("Ascorbic acid", "Orange juice"),
           fill = c("yellow", "orange"))

Thanks for any help,
Tom
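One way this could be done with base graphics, as a sketch: compute the group means and standard deviations with tapply(), then overlay them with points() and arrows() at the same x positions used in the at= argument. The symbol and bar styles are arbitrary choices, not part of the original example.

```r
# Means and standard deviations per dose, separately for each supplement.
vc <- subset(ToothGrowth, supp == "VC")
oj <- subset(ToothGrowth, supp == "OJ")
m_vc <- tapply(vc$len, vc$dose, mean)
s_vc <- tapply(vc$len, vc$dose, sd)
m_oj <- tapply(oj$len, oj$dose, mean)
s_oj <- tapply(oj$len, oj$dose, sd)

# The two boxplot calls from the post.
boxplot(len ~ dose, data = ToothGrowth, boxwex = 0.25, at = 1:3 - 0.2,
        subset = supp == "VC", col = "yellow",
        main = "Guinea Pigs' Tooth Growth", xlab = "Vitamin C dose mg",
        ylab = "tooth length", ylim = c(0, 35), yaxs = "i")
boxplot(len ~ dose, data = ToothGrowth, add = TRUE, boxwex = 0.25,
        at = 1:3 + 0.2, subset = supp == "OJ", col = "orange")

# Overlay mean (diamond) and mean +/- 1 sd (bar) on each box.
x_vc <- 1:3 - 0.2
x_oj <- 1:3 + 0.2
points(x_vc, m_vc, pch = 18)
points(x_oj, m_oj, pch = 18)
arrows(x_vc, m_vc - s_vc, x_vc, m_vc + s_vc, angle = 90, code = 3, length = 0.05)
arrows(x_oj, m_oj - s_oj, x_oj, m_oj + s_oj, angle = 90, code = 3, length = 0.05)
```

The medians drawn by boxplot() are left untouched, since the overlay only adds to the existing plot.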
[R] help with checking out-of-range values in each column in data frame
Dear list,

I have the following data, where I want to check whether any value in each column is out of range. For example, column f1 can only take values 1-5, so any value less than 1 or greater than 5 should be defined as a missing value (i.e. NA); column f4 can only take values 1-3, and any value outside this interval should be considered missing. The data below is a subset of a big survey sample, and I want to create an automatic procedure to check whether all participants gave a reasonable answer. How can I do this in R, and also replace the empty values with NA?

dat
id f1 f2 f3 f4 f5 f6 f7 f8 f9 f10
1 1 5 3 1 1 1 1 2 1 1 1
2 2 5 5 1 1 1 1 2 1 1 2
3 3 3 4 1 1 1 1 2 1 1 1
4 4 5 5 1 1 1 1 1 1 1 1
5 5 4 3 2 1 2 2 1 2 3
6 6 4 4 1 2 2 1 2 1 1 1
7 7 4 4 1 1 1 2 3 2 2 2
8 8 4 5 2 2 2 2 2 2 2 2
9 9 4 4 2 3 3 3 3 3 3 3
10 10 4 3 1 2 3 1 2 1 2 3
11 11 2 5 1 1 2 1 3 1 1 2
12 12 4 3 1 2 3 3 3 3 2 3
13 13 5 5 1 1 1 1 2 1 1 2
14 14 5 3 3 3 3 2 1 3 1 1
15 15 4 3 1 1 1 2 2 2 1 2
16 16 3 2 2 3 2 3 3 2 2 3
17 17 4 5 1 1 1 1 2 1 1 1
18 18 3 3 2 2 3 2 3 2 3 3
19 19 4 4 1 2 2 2 3 2 3 3
20 20 4 4 1 2 3 3 3 2 3 3

Thanks in advance,
Tom
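A minimal sketch of such a check in base R: keep the allowed range of each column in a named list and set anything outside it to NA. The small data frame and the list name ranges are made up for illustration; only the limits for f1 (1-5) and f4 (1-3) come from the post.

```r
# Toy data with some deliberately out-of-range and missing entries.
dat <- data.frame(id = 1:5,
                  f1 = c(5, 0, 3, 7, 4),
                  f4 = c(1, 2, 4, NA, 3))

# Allowed minimum and maximum per column (from the post: f1 is 1-5, f4 is 1-3).
ranges <- list(f1 = c(1, 5), f4 = c(1, 3))

for (v in names(ranges)) {
  r <- ranges[[v]]
  # Flag values below the minimum or above the maximum; existing NAs are skipped.
  bad <- !is.na(dat[[v]]) & (dat[[v]] < r[1] | dat[[v]] > r[2])
  dat[[v]][bad] <- NA
}
dat
```

If the empty values arise when reading a text file, the na.strings argument of read.table() (for example na.strings = c("", "NA")) may already turn them into NA on import, depending on how the file is delimited.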
[R] help with replacing all comma in a data frame with a dot
Dear list,

I have imported an SPSS data file into R in which a comma is used as the decimal separator, e.g. 3,567 instead of 3.567 as in R. How can I replace the comma with a dot for all values in the data frame?

kk
                   a      b  c     d     e    f
1  ,0925199910320613  82523  8  ,855  ,803  ,69
2   ,278314161923372  91657 26 1,285 1,032 ,823
3   ,278314161923372  91657 26 1,285 1,032 ,823
4   ,278314161923372  91657 26 1,285 1,032 ,823
5   ,278314161923372  91657 26 1,285 1,032 ,823
6   ,278314161923372  91657 26 1,285 1,032 ,823
7   ,203740581833404  72026 94  ,479    ,3 ,061
8   ,243694169416943  23684 77 1,375  ,437 ,054
9                 ,3  21857 86  ,829  ,315 ,029
10                ,6 111569 93  ,764    ,4 ,076
11                ,6 111569 93  ,764    ,4 ,076
12                ,6 111569 93  ,764    ,4 ,076
13        ,419788431  35744 95   ,44  ,298 ,076
14        ,419788431  35744 95   ,44  ,298 ,076
16         ,39266161  29098 90  ,361  ,256 ,076
17         ,39266161  29098 90  ,361  ,256 ,076
18        ,691736472  40135 85  ,864  ,284  ,09
19        ,691736472  40135 85  ,864  ,284  ,09
20        ,442407817  48673 86   ,44  ,279 ,088
24 ,0925199910320613  22482 64  ,104  ,082 ,054

Thanks in advance,
Tom
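Assuming the comma-decimal values arrived as text (character or factor columns), one base R approach is to swap the comma for a dot with gsub() and convert with as.numeric(), column by column. The three-column data frame below only mimics the one in the post.

```r
# A small stand-in for kk, with decimal commas stored as text.
kk <- data.frame(a = c(",0925", ",278"),
                 b = c("82523", "91657"),
                 d = c(",855", "1,285"),
                 stringsAsFactors = FALSE)

# Replace "," by "." in every column, then convert to numeric.
# as.character() makes the same code work for factor columns.
kk[] <- lapply(kk, function(col) {
  as.numeric(gsub(",", ".", as.character(col), fixed = TRUE))
})
str(kk)
```

For data read from a plain text file, read.table() and read.csv2() also accept dec = ",", which handles the conversion at import time.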
[R] Help with reshape data into long format
Dear list,

I have the following data set

id        1  2  3  4  5  6  7  8  9 10
disease   a  b  c  d  e  f  g  h  i  j
age      23 40 32 34 25 32 22 35 29 21
city     NY LD NY SG NY LD VG SA LD SG
sex       1  1  2  2  2  2  1  1  1  2
treat_a            y  y  y        y
treat_b   n  n  n           n  n     n
ques1_1   2  4  5  6  8  3  1  2  4  5
ques1_2   6  4  5 12 10  9  8  4  5  7
ques1_3  17 23 32 25 14 24 23 22 32 29
ques2_1   4  7  9 10  6  8  5  7  8  9
ques2_2   8  9 10 12 17 19 14 21 22 19
ques2_3  23 18 19 20 23 24 26 28 29 22
ques3_1   5  7  9  1  4  7  9  8 10  5
ques3_2  34 35 32 23 31 29 27 25 32 33
ques3_3  29 33 27 25 27 23 24 29 27 24

where the first row is the header row in a data frame. First I want to merge the two variables treat_a and treat_b into a new variable called treat, which is given n where treat_a is blank and y where treat_b is blank. The new data set will look like

id        1  2  3  4  5  6  7  8  9 10
disease   a  b  c  d  e  f  g  h  i  j
age      23 40 32 34 25 32 22 35 29 21
city     NY LD NY SG NY LD VG SA LD SG
sex       1  1  2  2  2  2  1  1  1  2
treat     n  n  n  y  y  y  n  n  y  n
ques1_1   2  4  5  6  8  3  1  2  4  5
ques1_2   6  4  5 12 10  9  8  4  5  7
ques1_3  17 23 32 25 14 24 23 22 32 29
ques2_1   4  7  9 10  6  8  5  7  8  9
ques2_2   8  9 10 12 17 19 14 21 22 19
ques2_3  23 18 19 20 23 24 26 28 29 22
ques3_1   5  7  9  1  4  7  9  8 10  5
ques3_2  34 35 32 23 31 29 27 25 32 33
ques3_3  29 33 27 25 27 23 24 29 27 24

Now I want to reshape the data into long format, with target output

id disease age city sex treat ques
 1       a  23   NY   1     n  1_1
 1       a  23   NY   1     n  1_2
 1       a  23   NY   1     n  1_3
 1       a  23   NY   1     n  2_1
 1       a  23   NY   1     n  2_2
 1       a  23   NY   1     n  2_3
 1       a  23   NY   1     n  3_1
 1       a  23   NY   1     n  3_2
 1       a  23   NY   1     n  3_3
 2       b  40   LD   1     n  1_1
 2       b  40   LD   1     n  1_2
 2       b  40   LD   1     n  1_3
 2       b  40   LD   1     n  2_1
 2       b  40   LD   1     n  2_2
 2       b  40   LD   1     n  2_3
 2       b  40   LD   1     n  3_1
 2       b  40   LD   1     n  3_2
 2       b  40   LD   1     n  3_3
 .
 .
 .
10       j  21   SG   2     n  3_3

How can I do this in R?
Thanks a lot for any help,
Tom
[R] help with reshaping data into long format (correct question)
Dear list,

I have the following data set

id        1  2  3  4  5  6  7  8  9 10
disease   a  b  c  d  e  f  g  h  i  j
age      23 40 32 34 25 32 22 35 29 21
city     NY LD NY SG NY LD VG SA LD SG
sex       1  1  2  2  2  2  1  1  1  2
treat_a            y  y  y        y
treat_b   n  n  n           n  n     n
ques1_1   2  4  5  6  8  3  1  2  4  5
ques1_2   6  4  5 12 10  9  8  4  5  7
ques1_3  17 23 32 25 14 24 23 22 32 29
ques2_1   4  7  9 10  6  8  5  7  8  9
ques2_2   8  9 10 12 17 19 14 21 22 19
ques2_3  23 18 19 20 23 24 26 28 29 22
ques3_1   5  7  9  1  4  7  9  8 10  5
ques3_2  34 35 32 23 31 29 27 25 32 33
ques3_3  29 33 27 25 27 23 24 29 27 24

where the first row is the header row in a data frame. First I want to merge the two variables treat_a and treat_b into a new variable called treat, which is given n where treat_a is blank and y where treat_b is blank. The new data set will look like

id        1  2  3  4  5  6  7  8  9 10
disease   a  b  c  d  e  f  g  h  i  j
age      23 40 32 34 25 32 22 35 29 21
city     NY LD NY SG NY LD VG SA LD SG
sex       1  1  2  2  2  2  1  1  1  2
treat     n  n  n  y  y  y  n  n  y  n
ques1_1   2  4  5  6  8  3  1  2  4  5
ques1_2   6  4  5 12 10  9  8  4  5  7
ques1_3  17 23 32 25 14 24 23 22 32 29
ques2_1   4  7  9 10  6  8  5  7  8  9
ques2_2   8  9 10 12 17 19 14 21 22 19
ques2_3  23 18 19 20 23 24 26 28 29 22
ques3_1   5  7  9  1  4  7  9  8 10  5
ques3_2  34 35 32 23 31 29 27 25 32 33
ques3_3  29 33 27 25 27 23 24 29 27 24

Now I want to reshape the data into long format, with target output

id disease age city sex treat ques ques_value
 1       a  23   NY   1     n  1_1          2
 1       a  23   NY   1     n  1_2          6
 1       a  23   NY   1     n  1_3         17
 1       a  23   NY   1     n  2_1          4
 1       a  23   NY   1     n  2_2          8
 1       a  23   NY   1     n  2_3         23
 1       a  23   NY   1     n  3_1          5
 1       a  23   NY   1     n  3_2         34
 1       a  23   NY   1     n  3_3         29
 2       b  40   LD   1     n  1_1          4
 2       b  40   LD   1     n  1_2          4
 2       b  40   LD   1     n  1_3         23
 2       b  40   LD   1     n  2_1          7
 2       b  40   LD   1     n  2_2          9
 2       b  40   LD   1     n  2_3         18
 2       b  40   LD   1     n  3_1          7
 2       b  40   LD   1     n  3_2         35
 2       b  40   LD   1     n  3_3         33
..
..
..
10       j  21   SG   2     n  3_3         24

How can I do this in R?
Thanks a lot for any help, Tom
[R] Append the sum of each row and column to a table matrix
Dear list,

I have the following table

    ee <- table(ID, Day)
    ee
        Day
    ID     2   3   4   5   6   7   9  10  14  16
      35   5   0   0   3   1   0   0   5   0   0
      36   0   0   0   0   0   0   0   1   0   0
      43  13  15  15   0   0  13  13  15  13  15
      46   0   1   0   0   0   0   0   0   0   0
      58   0   0   0   0   0   0   4   4   0   0

and want to calculate the sum for each row and column and then append the sums to the table matrix.

    kk <- cbind(ee, as.matrix(apply(ee, 1, sum)))
    dd <- rbind(kk, apply(ee, 2, sum))
    Warning message:
    number of columns of result is not a multiple of vector length (arg 2) in: rbind(1, kk, apply(ee, 2, sum))
    rownames(dd) <- c(rownames(dd)[-6], "Total:")
    colnames(dd) <- c(colnames(dd)[-11], "Total:")

I got the table I wanted (see below), except that the variable names Day and ID are missing. Is there a way to add these variable names back to the table dd, as shown in ee? I also got a warning message that I don't know how to avoid. Can I make the table dd in a different and easier way? Any suggestions are highly appreciated.

Thanks, Tom

    dd
              2   3   4   5   6   7   9  10  14  16  Total:
    35        5   0   0   3   1   0   0   5   0   0      14
    36        0   0   0   0   0   0   0   1   0   0       1
    43       13  15  15   0   0  13  13  15  13  15     112
    46        0   1   0   0   0   0   0   0   0   0       1
    58        0   0   0   0   0   0   4   4   0   0       8
    Total:   18  16  15   3   1  13  17  25  13  15      18
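As a hedged pointer rather than a reply from the list: the warning most likely comes from rbind() receiving 10 column sums for the 11-column matrix kk, so the first sum (18) is recycled into the bottom-right cell instead of a grand total. One simpler route may be addmargins() from the stats package, which appends the sums and keeps the ID and Day labels. The sketch below uses a small table built from three of the rows and columns above, since the original ID and Day vectors are not given.

```r
# Small table built from three of the rows and columns in the post;
# the full ID and Day vectors are not available, so this is a subset.
ee <- as.table(matrix(c( 5,  0,  5,
                         0,  0,  1,
                        13, 15, 15),
                      nrow = 3, byrow = TRUE,
                      dimnames = list(ID  = c("35", "36", "43"),
                                      Day = c("2", "3", "10"))))

# addmargins() appends row and column sums (labelled "Sum") and keeps
# the ID and Day dimension names.
dd <- addmargins(ee)
print(dd)
```

The margins are labelled Sum by default rather than Total:, and the bottom-right cell holds the grand total of the table.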
Re: [R] help with making a function of scatter plot with multiple variables
Thanks Jim for the excellent solution. Can I make this function more flexible, so that it can be used with different numbers of parameters?

Tom

jim holtman [EMAIL PROTECTED] wrote:

The simple way is to enclose it in a 'function' and pass parameters. Assuming that you have the same number of parameters, then the following will do:

    my.func <- function(x, y, d1, v1, s1, t1, s2, t2, s3, t3, s4, t4, s5, t5)
    {
        op <- par(bg = "grey97")
        par(mfrow = c(1, 2))
        plot(d1, v1, pch = "v", col = "orange", cex = 0.6, lwd = 2,
             xlab = "day", ylab = "resp", cex.main = 1, font.main = 1,
             main = "Surv data", ylim = y, xlim = x,
             col.main = "navyblue", col.lab = "navyblue", cex.lab = 0.7)
        points(s1, t1, pch = "A", col = "green4", cex = 1)
        points(s2, t2, pch = "B", col = "navyblue", cex = 1)
        points(s3, t3, pch = "C", col = "red", cex = 1)
        points(s4, t4, pch = "D", col = "darkviolet", cex = 1)
        points(s5, t5, pch = "E", col = "blue", cex = 1)
        legend("topright", lbels,
               col = c("orange", "green4", "navyblue", "red", "darkviolet", "blue"),
               text.col = c("orange", "green4", "navyblue", "red", "darkviolet", "steelblue"),
               pch = c("v", "A", "B", "C", "D", "E"),
               bg = 'gray100', cex = 0.7, box.lty = 1, box.lwd = 1)
        abline(h = -1:9, v = 0:8, col = "lightgray", lty = 3)
        par(op)
    }

    # call it with
    my.func(x, y, d1, v1, s1, t1, s2, t2, s3, t3, s4, t4, s5, t5)

You might also include the data in a list to make it easier.

On 9/20/07, Tom Cohen wrote:

Dear list,

I have made a scatter plot of multiple variables in the same graph, with different col and pch. I managed to do it with the following code, but I do not know how to turn it into a function, so that the next time I want a similar graph with new variables I don't have to copy the code and replace the old variables with the new ones, but can just call a function with the new variables. I don't have any experience in writing functions and would be very grateful if you could help me. A function will shorten my program dramatically, since I repeat this type of graph a lot in my analysis.
Thanks in advance, Tom

    op <- par(bg = "grey97")
    par(mfrow = c(1, 2))
    plot(d1, v1, pch = "v", col = "orange", cex = 0.6, lwd = 2,
         xlab = "day", ylab = "resp", cex.main = 1, font.main = 1,
         main = "Surv data", ylim = y, xlim = x,
         col.main = "navyblue", col.lab = "navyblue", cex.lab = 0.7)
    points(s1, t1, pch = "A", col = "green4", cex = 1)
    points(s2, t2, pch = "B", col = "navyblue", cex = 1)
    points(s3, t3, pch = "C", col = "red", cex = 1)
    points(s4, t4, pch = "D", col = "darkviolet", cex = 1)
    points(s5, t5, pch = "E", col = "blue", cex = 1)
    legend("topright", lbels,
           col = c("orange", "green4", "navyblue", "red", "darkviolet", "blue"),
           text.col = c("orange", "green4", "navyblue", "red", "darkviolet", "steelblue"),
           pch = c("v", "A", "B", "C", "D", "E"),
           bg = 'gray100', cex = 0.7, box.lty = 1, box.lwd = 1)
    abline(h = -1:9, v = 0:8, col = "lightgray", lty = 3)
    par(op)

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
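Jim's closing suggestion, passing the data in a list, is one way to handle a varying number of point sets. The sketch below is only an illustration of that idea, not code from the thread: the function name plot_series, its arguments, and the example data are all made up, and the styling is trimmed to the essentials of the original call.

```r
# Hypothetical helper: 'series' is a list of list(x = ..., y = ...) elements,
# so the same function handles any number of extra point sets (up to five,
# the number of colours defined below).
plot_series <- function(d1, v1, series, labels, xlim, ylim) {
  cols <- c("green4", "navyblue", "red", "darkviolet", "blue")
  n <- length(series)
  stopifnot(n <= length(cols), length(labels) == n + 1)

  op <- par(bg = "grey97")
  on.exit(par(op))                      # restore the background on exit

  plot(d1, v1, pch = "v", col = "orange", cex = 0.6,
       xlab = "day", ylab = "resp", main = "Surv data",
       xlim = xlim, ylim = ylim)
  for (i in seq_len(n)) {
    # Each extra series gets the next letter and the next colour.
    points(series[[i]]$x, series[[i]]$y, pch = LETTERS[i], col = cols[i])
  }
  legend("topright", legend = labels,
         col = c("orange", cols[seq_len(n)]),
         pch = c("v", LETTERS[seq_len(n)]), cex = 0.7)
  abline(h = -1:9, v = 0:8, col = "lightgray", lty = 3)
  invisible(n)                          # number of extra series drawn
}

# Made-up example data: one base series plus two extra series.
d1 <- 1:5
v1 <- c(2, 3, 5, 4, 6)
series <- list(list(x = 1:5, y = c(1, 2, 3, 4, 5)),
               list(x = 1:5, y = c(5, 4, 3, 2, 1)))
n_drawn <- plot_series(d1, v1, series, labels = c("v", "A", "B"),
                       xlim = c(0, 8), ylim = c(-1, 9))
```

Adding a third or fourth group then only means appending another list(x = ..., y = ...) element and one more label, rather than changing the function signature.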
[R] help with making a function of scatter plot with multiple variables
Dear list,

I have made a scatter plot of multiple variables in the same graph, with different col and pch. I managed to do it with the following code, but I do not know how to turn it into a function, so that the next time I want a similar graph with new variables I don't have to copy the code and replace the old variables with the new ones, but can just call a function with the new variables. I don't have any experience in writing functions and would be very grateful if you could help me. A function will shorten my program dramatically, since I repeat this type of graph a lot in my analysis.

Thanks in advance, Tom

    op <- par(bg = "grey97")
    par(mfrow = c(1, 2))
    plot(d1, v1, pch = "v", col = "orange", cex = 0.6, lwd = 2,
         xlab = "day", ylab = "resp", cex.main = 1, font.main = 1,
         main = "Surv data", ylim = y, xlim = x,
         col.main = "navyblue", col.lab = "navyblue", cex.lab = 0.7)
    points(s1, t1, pch = "A", col = "green4", cex = 1)
    points(s2, t2, pch = "B", col = "navyblue", cex = 1)
    points(s3, t3, pch = "C", col = "red", cex = 1)
    points(s4, t4, pch = "D", col = "darkviolet", cex = 1)
    points(s5, t5, pch = "E", col = "blue", cex = 1)
    legend("topright", lbels,
           col = c("orange", "green4", "navyblue", "red", "darkviolet", "blue"),
           text.col = c("orange", "green4", "navyblue", "red", "darkviolet", "steelblue"),
           pch = c("v", "A", "B", "C", "D", "E"),
           bg = 'gray100', cex = 0.7, box.lty = 1, box.lwd = 1)
    abline(h = -1:9, v = 0:8, col = "lightgray", lty = 3)
    par(op)