[R] Question about logistic regression with ordered factor variable using the rms package (prev.Design)
Dear R users,

Hopefully someone can help me; maybe I just misunderstand the function in the package. I am working with a logistic regression model. Until now I always worked with the basic glm function, where the model was:

glm(disease ~ test.value + cnct, family = binomial(link = "logit"))

This works fine when test.value and concentration (cnct) are continuous variables. However, concentration is in fact a grouping variable over 5 experiments with 5 concentrations (25, 50, 100, 200, 400). Therefore I believe concentration to be an ordered factor (in the model: cnct_o). To fit this model I used the "rms" library (previously known as Design) and its functions lrm (or Glm). lrm (or Glm) returns the log odds of disease, and inv.logit(log odds) gives the probability of disease, but I have to do this with the Predict function of the "rms" package.

# The resulting model (with lrm or Glm) would be:

             Coef     S.E.    Wald Z  Pr(>|Z|)
Intercept    2.3800   0.8891   2.68    0.0074
test.value   2.0806   0.3409   6.10   <0.0001
cnct_o      -0.1127   0.0268  -4.21   <0.0001
cnct_o=100   7.7393   1.7542   4.41   <0.0001
cnct_o=200  20.4291   4.5080   4.53   <0.0001
cnct_o=400  42.7829   9.8180   4.36   <0.0001

# The results of the standard glm function are very different:

# Standard glm
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.7361  -0.2750   0.2177   0.5143   1.6897

Coefficients:
             Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)  -0.9022     0.3370    -2.677   0.007427 **
test.value    2.0806     0.3409     6.103   1.04e-09 ***
cnct_o.L      1.4363     0.4722     3.042   0.002352 **
cnct_o.Q      1.2208     0.4934     2.474   0.013359 *
cnct_o.C     -2.0649     0.5610    -3.681   0.000232 ***
cnct_o^4      0.5599     0.4760     1.176   0.239485
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 252.32 on 220 degrees of freedom
Residual deviance: 167.68 on 215 degrees of freedom
AIC: 179.68
Number of Fisher Scoring iterations: 6

# As I use the model parameters of the standard glm model to calculate the log odds and the probability manually, I believe cnct_o = 25 to be the reference category and that cnct_o.L = 50, cnct_o.Q = 100, cnct_o.C = 200 and cnct_o^4 = 400. But I am not sure of this.

The formulas used are:

log odds <- intercept + slope * test.value + cnct_o, where cnct_o is the corresponding value for the given concentration
probability <- inv.logit(log odds), with the function inv.logit from package "car"

The results of the glm are in the tables below: first the log odds and then the probabilities (inv.logit(log odds)).

# glm log odds, by cnct:

test.value   25         cnct_o.L   cnct_o.Q   cnct_o.C   cnct_o^4
0            -0.90218   0.534169   0.318649   -2.96706   -0.34231
0.5           0.138132  1.574477   1.358957   -1.92675    0.698001
1             1.17844   2.614785   2.399265   -0.88644    1.738309
1.5           2.218747  3.655093   3.439572    0.153864   2.778616
2             3.259055  4.6954     4.47988     1.194172   3.818924

# glm probabilities, by cnct:

test.value   25         cnct_o.L   cnct_o.Q   cnct_o.C   cnct_o^4
0            0.288604   0.630455   0.578995   0.048936   0.415249
0.5          0.534478   0.828421   0.79559    0.127111   0.667744
1            0.764667   0.931807   0.916771   0.291844   0.850472
1.5          0.90192    0.974793   0.968919   0.53839    0.941509
2            0.962997   0.990946   0.988792   0.767486   0.97852

# If I compare this with the result of the Predict function in rms, the results seem very different. It may be because I misinterpret the glm model parameters for the ordered factor. How can I be sure which model parameter corresponds to which factor level in the standard glm?
# Results of lrm (Predict.lrm), by cnct:

test.value   25         50         100        200        400
0            -0,43815   -3,25628   -1,15323   0,264036   0,072751
0,50580154    0,614227  -2,2039    -0,10085   1,316414   1,125129
1,01160308    1,06      -1,15153    0,951526  2,368793   2,177508
1,505068      2,693316  -0,12482    1,978236  3,395503   3,204218
2,01086954    3,745695   0,927563   3,030615  4,447882   4,256597

# Prob. (inv.logit(odds)), by cnct:

test.value   25         50         100        200        400
0            0,392182   0,037102   0,239899   0,565628   0,51818
0,50580154   0,648904   0,0994     0,474808   0,788585   0,754939
1,01160308   0,841123   0,24021    0,721422   0,914416   0,898211
1,505068
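A note on reading the glm coefficients for the ordered factor. By default R codes an ordered factor with orthogonal polynomial contrasts (contr.poly), so cnct_o.L, cnct_o.Q, cnct_o.C and cnct_o^4 are linear, quadratic, cubic and quartic trend terms rather than the levels 50, 100, 200 and 400, and every level uses all four of them. The sketch below is only an illustration under that assumption: it takes the glm coefficients as printed in the post and rebuilds the log odds per concentration at test.value = 0. The names b0, b, P and eta are made up for the example.

```r
# Coefficients copied from the glm summary in the post
b0 <- -0.9022                               # (Intercept)
b  <- c(1.4363, 1.2208, -2.0649, 0.5599)    # cnct_o.L, .Q, .C, ^4

# Contrast matrix R uses by default for an ordered factor with 5 levels
P <- contr.poly(5)
rownames(P) <- c("25", "50", "100", "200", "400")

# Log odds and probability per concentration at test.value = 0
eta  <- drop(b0 + P %*% b)
prob <- plogis(eta)
print(round(cbind(log_odds = eta, prob = prob), 4))
```

With this reading the glm log odds come out at about -0.44 for concentration 25 and -3.26 for concentration 50, which is what the first row of the Predict output shows, so the two fits may well describe the same model in different parameterisations.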
[R] Logit regression, I observed different results for glm or lrm (Design) for ordered factor variables
Dear useRs,

I was comparing results for a logistic regression model between different libraries. The model formula is arranged as follows:

response ~ (intercept) + value + group

or:

glm(response ~ (intercept) + value + group, family = binomial(link = 'logit'))
lrm(response ~ (intercept) + value + group)
ROC(from = response ~ (intercept) + value + group, plot = 'ROC')

The response is a binary variable, the independent predictor 'value' is a continuous variable, and the grouping factor is an ordered factor with 5 levels (25, 50, 100, 200, 400).

When I compare the GLM model with the ROC model and the LRM model, setting 'group' as a factor variable, the resulting coefficients are similar to each other. When I set 'group' as an ordered factor variable (as it should be), the GLM and ROC model coefficients are still comparable, but the LRM coefficients are completely different.

I have looked up the Design package, and there is a function 'cr.setup', which sets up an ordinal logistic response; this is however not the case here. The model has a binary response (0 or 1), a continuous predictor and an ordered grouping factor.

Does anybody know what I am doing wrong?

Thanks for your time,
Tom

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
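One way to check whether differing coefficients mean a differing model is to compare fitted probabilities instead of coefficients. The sketch below uses simulated data (all variable names and the data-generating values are invented for the illustration) and plain glm only; it fits the same binary model once with the group as an unordered factor and once as an ordered factor.

```r
set.seed(1)
n <- 200
group_num <- sample(c(25, 50, 100, 200, 400), n, replace = TRUE)
value <- rnorm(n)
response <- rbinom(n, 1, plogis(-0.5 + 1.2 * value + 0.3 * log(group_num / 100)))

d <- data.frame(response, value,
                g_fac = factor(group_num),                  # treatment contrasts
                g_ord = factor(group_num, ordered = TRUE))  # polynomial contrasts

fit_fac <- glm(response ~ value + g_fac, family = binomial, data = d)
fit_ord <- glm(response ~ value + g_ord, family = binomial, data = d)

# The group coefficients differ between the two codings ...
print(round(coef(fit_fac), 3))
print(round(coef(fit_ord), 3))

# ... but the fitted probabilities are the same
print(all.equal(fitted(fit_fac), fitted(fit_ord), tolerance = 1e-6))
```

If the same comparison on predicted values holds between glm and lrm, the difference in coefficients would come from how the ordered factor is coded, not from a different fit.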
[R] [R-sig-ME] lmer() - no applicable method for 'profile' under R version 2.15.1
Hi all,

I was working with the MEMSS and lme4 libraries under R version 2.15.1. Apparently some practical functions do not work under R 2.15.1. After searching the archives I found a mail thread on this subject, stating that these problems were partially solved for R 2.12.0, but only for lmer(), not for glmer(). Is someone aware of an available update of these libraries? Or should I install R 2.12.0 and lme4a?

Kind regards,
Tom.

ref:
List: r-sig-mixed-models
Subject: Re: [R-sig-ME] lmer() - no applicable method for 'profile'
From: Ben Bolker bbolker () gmail ! com
Date: 2011-01-06 15:03:52
Message-ID: 4D25D9D8.6030402 () gmail ! com

I believe you're stuck for the time being: profiling is not yet implemented for GLMMs. REML is not implemented for GLMMs either: there is some debate as to whether a useful analogue of REML can be defined: see https://stat.ethz.ch/pipermail/r-sig-mixed-models/2009q1/002104.html for example.

I don't know of any canned approach to computing likelihood profiles for GLMMs: there are MCMC approaches (e.g. MCMCglmm, or AD Model Builder followed by MCMC sampling) which give you a marginal posterior distribution ... in principle AD Model Builder can profile over the marginal likelihood, although the last time I checked profiling didn't actually work with random-effects models.

If you are simply trying to get confidence intervals on your parameters, your best *simple* bet is to take the Wald test (results of summary()). If you want a better answer than that, then I think your choices are either an MCMC-based approach or bootstrapping (see http://glmm.wikidot.com/basic-glmm-simulation to get started).

(Since you have 16 variables in the model, I hope you have at least 200-300 observations -- and that's assuming you have only main effects ...)

On 11-01-06 09:35 AM, sam steyaert wrote:

Thank you for helping out before. I could install lme4a, and it ran fine for all chunks of chapter 1.
Anyhow, if I try with my own data, it works until I specify REML = FALSE in the model script, or use the update() function. Then I get the following message (it is in fact a warning message):

In glmer(mymodelstructure), : extra arguments REML are disregarded

I can still get the parameter estimates by calling the model name. I would like to get the confidence intervals around the parameter estimates, and this appears not to work.

prM1 = profile(Model1)
Error: is(fm@resp, "lmerResp") is not TRUE

confint(prM1)   (this function logically does not work after the former one)
Error in UseMethod("vcov") : no applicable method for 'vcov' applied to an object of class "data.frame"

So I guess there is something wrong with my data structure? I use logistic regression to model habitat use, and have 16 variables included in the model and one random factor (as a character). Does anyone have some advice?

Thanks a lot,
Sam

2010/12/29 Douglas Bates ba...@stat.wisc.edu
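The Wald intervals mentioned in the quoted reply are the estimate plus or minus a normal quantile times the standard error from summary(). The sketch below shows only that arithmetic, on a plain glm with simulated data as a stand-in (no mixed-model package is assumed, and all names and values are invented); for a glmer fit the same two columns of the coefficient table would presumably be used in the same way.

```r
set.seed(42)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- rbinom(n, 1, plogis(0.3 + 0.8 * x1 - 0.5 * x2))
fit <- glm(y ~ x1 + x2, family = binomial)

# Estimates and standard errors as printed by summary()
tab <- coef(summary(fit))
est <- tab[, "Estimate"]
se  <- tab[, "Std. Error"]

# 95% Wald interval: estimate +/- z * SE
z <- qnorm(0.975)
wald <- cbind(lower = est - z * se, upper = est + z * se)
print(round(wald, 3))

# For a glm this agrees with confint.default()
print(round(confint.default(fit), 3))
```

These intervals are symmetric on the log-odds scale and rely on the normal approximation, which is why the reply points to bootstrapping or MCMC for a better answer.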
[R] font size on graphics
Dear R users,

My question is about finding the proper font size for graphics. For this I had written code that creates 4 different graphics and saves them as PNG files. From these PNG graphics I select one of the proper size and paste it into a Word document.

I have experimented with lots of settings and lost track a bit. There are cex, cex.axis, cex.lab and so on, and I lost track of which cex does what exactly. Furthermore, these graphs work well in an R window, yet when I open the PNG file the labels of the bottom x axis are not printed. One other problem I have is that often '10' on the bottom x axis is not printed. I tried to patch it up this way (below), yet it is unsatisfactory: sometimes it works, sometimes it doesn't.

axis(1, at = c(28:30), c('', '10', ''), padj = -1.5, cex.axis = ifelse((rol == 1), setting[rol, 8] - 0.15, setting[rol, 8] - 0.1))

And finally, when creating the graphs, I prefer the size of graf_1_small. It pastes easily into a Word document (2 on one line), yet the labels are not always clear to read, so I copy the one without suffix to Word and then size it to 75%, which means it loses some quality.

The question here is: can anyone help me with the right cex sizes? I guess that is the question in general, since it is the cex-sized text that is either printed or not on all these graphs.
Thanks in advance,
T

Here is my example:

# saving directory
WinDir <- "C:/Documents and Settings/towil/Projecten"

# data
trials <- rep(1:10, 3)
test <- c(rep("Low", 10), rep("Normal", 10), rep("Randomised", 10))
means <- rbeta(1:30, 11, 3) + rbeta(1:30, 8, 3)
ci <- rbeta(1:30, 1, 2)
meanstable <- data.frame(Trial = trials, Test = test, Means = means, Upper = means + ci, Lower = means - ci)
graf_1 <- "Means 1"
graf_2 <- "Means 2"

# settings for the different graphs
setting <- data.frame('adname' = c(paste(graf_1, '_small'), graf_1, paste(graf_1, '_big')),
                      'width' = c(300, 400, 500), 'height' = c(300, 400, 500),
                      'pointsetting' = c(10, 12, 14), 'Directory' = rep(WinDir, 3),
                      'cex.lab' = c(0.85, 0.9, 1), 'cexsize' = c(0.8, 1, 1.2),
                      'cex.axis' = c(0.6, 0.9, 1))

# loop for the different graphs
# (ylimits and lg10_label are used below but are not defined in this example)
for (rol in 1:3) {
  save_at <- WinDir
  setwd(save_at)
  x11()
  png(filename = paste(graf_1, setting[rol, 1], ".png", sep = ""), width = setting[rol, 2],
      height = setting[rol, 3], pointsize = setting[rol, 4], bg = "white", res = NA)
  plot(1:nrow(meanstable), meanstable$Mean, xlim = c(0, nrow(meanstable)), ylim = ylimits,
       xaxt = 'n', pch = 19, frame.plot = F, xlab = '', ylab = lg10_label,
       cex.lab = setting[rol, 6], cex.axis = setting[rol, 8])
  arrows(1:nrow(meanstable), meanstable$Upper, 1:nrow(meanstable), meanstable$Lower,
         lty = 3, code = 3, angle = 45, length = .1)
  # upper x axis
  axis(3, at = 1:2, unique(meanstable$Test)[c(1, 2)], las = 1, hadj = 0.5, padj = c(1, 0), cex.axis = setting[rol, 8])
  axis(3, at = 2:3, unique(meanstable$Test)[c(2, 2)], las = 1, hadj = 0.5, padj = c(0, -1), cex.axis = setting[rol, 8])
  axis(3, at = 3, unique(meanstable$Test)[3], las = 1, hadj = 0, padj = -1, cex.axis = setting[rol, 8])
  # lower x axis: this does not show in the PNG files, yet it works in an R graphics window
  axis(1, at = c(1:3), c('', '1', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(4:6), c('', '2', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(7:9), c('', '3', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(10:12), c('', '4', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(13:15), c('', '5', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(16:18), c('', '6', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(19:21), c('', '7', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(22:24), c('', '8', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(25:27), c('', '9', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(28:30), c('', '10', ''), padj = -1, cex.axis = setting[rol, 8])
  dev.off()
}
# graphics.off()

End of example.
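For orientation on the cex family: in base graphics cex scales plotting symbols and text in general, cex.axis scales tick labels, cex.lab scales the axis titles and cex.main the main title, all relative to the pointsize of the device. axis() is also documented to leave out tick labels that would overlap labels it has already drawn, which is one possible reason for a missing '10' at small sizes. The sketch below is a simplified variant of the example, not a fix of it: it writes one small PNG to a temporary file (the file name is made up) and draws the lower axis with a single axis() call.

```r
out_file <- file.path(tempdir(), "means_small.png")  # hypothetical file name

set.seed(1)
means <- rbeta(30, 11, 3) + rbeta(30, 8, 3)
ci    <- rbeta(30, 1, 2)
x     <- seq_along(means)

tick_at  <- seq(2, 29, by = 3)  # centre of each block of three points
tick_lab <- 1:10

png(out_file, width = 300, height = 300, pointsize = 10)
plot(x, means, pch = 19, xaxt = "n",
     ylim = range(means - ci, means + ci),
     xlab = "Trial", ylab = "Mean",
     cex = 0.8,        # plotting symbols
     cex.lab = 0.85,   # axis titles (xlab, ylab)
     cex.axis = 0.6)   # tick labels drawn by plot(), here the y axis
segments(x, means - ci, x, means + ci, lty = 3)
# one call for the whole lower axis instead of ten separate ones
axis(1, at = tick_at, labels = tick_lab, cex.axis = 0.6)
dev.off()
```

Creating the PNG directly at the size it will have in the document, rather than shrinking a larger image to 75%, keeps the text at the point size that was asked for.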
[R] font size on graphics question (correction in example,sorry)
Dear R users,

My question is about finding the proper font size for graphics. For this I had written code that creates 4 different graphics and saves them as PNG files. From these PNG graphics I select one of the proper size and paste it into a Word document.

I have experimented with lots of settings and lost track a bit. There are cex, cex.axis, cex.lab and so on, and I lost track of which cex does what exactly. Furthermore, these graphs work well in an R window, yet when I open the PNG file the labels of the bottom x axis are not printed. One other problem I have is that often '10' on the bottom x axis is not printed. I tried to patch it up this way (below), yet it is unsatisfactory: sometimes it works, sometimes it doesn't.

axis(1, at = c(28:30), c('', '10', ''), padj = -1.5, cex.axis = ifelse((rol == 1), setting[rol, 8] - 0.15, setting[rol, 8] - 0.1))

And finally, when creating the graphs, I prefer the size of graf_1_small. It pastes easily into a Word document (2 on one line), yet the labels are not always clear to read, so I copy the one without suffix to Word and then size it to 75%, which means it loses some quality.

The question here is: can anyone help me with the right cex sizes? I guess that is the question in general, since it is the cex-sized text that is either printed or not on all these graphs.
Thanks in advance,
T

Here is my example:

# saving directory
WinDir <- "C:/Documents and Settings/Project"
dir.create(file.path("C:/Documents and Settings", "Project"))

# data
trials <- rep(1:10, 3)
test <- c(rep("Low", 10), rep("Normal", 10), rep("Randomised", 10))
means <- rbeta(1:30, 11, 3) + rbeta(1:30, 8, 3)
ci <- rbeta(1:30, 1, 2)
meanstable <- data.frame(Trial = trials, Test = test, Means = means, Upper = means + ci, Lower = means - ci)
graf_1 <- "Means 1"
graf_2 <- "Means 2"

# settings for the different graphs
setting <- data.frame('adname' = c(paste(graf_1, '_small'), graf_1, paste(graf_1, '_big')),
                      'width' = c(300, 400, 500), 'height' = c(300, 400, 500),
                      'pointsetting' = c(10, 12, 14), 'Directory' = rep(WinDir, 3),
                      'cex.lab' = c(0.85, 0.9, 1), 'cexsize' = c(0.8, 1, 1.2),
                      'cex.axis' = c(0.6, 0.9, 1))

# loop for the different graphs
# (ylimits and lg10_label are used below but are not defined in this example)
for (rol in 1:3) {
  save_at <- WinDir
  setwd(save_at)
  x11()
  png(filename = paste(graf_1, setting[rol, 1], ".png", sep = ""), width = setting[rol, 2],
      height = setting[rol, 3], pointsize = setting[rol, 4], bg = "white", res = NA)
  plot(1:nrow(meanstable), meanstable$Mean, xlim = c(0, nrow(meanstable)), ylim = ylimits,
       xaxt = 'n', pch = 19, frame.plot = F, xlab = '', ylab = lg10_label,
       cex.lab = setting[rol, 6], cex.axis = setting[rol, 8])
  arrows(1:nrow(meanstable), meanstable$Upper, 1:nrow(meanstable), meanstable$Lower,
         lty = 3, code = 3, angle = 45, length = .1)
  # upper x axis
  axis(3, at = 1:2, unique(meanstable$Test)[c(1, 2)], las = 1, hadj = 0.5, padj = c(1, 0), cex.axis = setting[rol, 8])
  axis(3, at = 2:3, unique(meanstable$Test)[c(2, 2)], las = 1, hadj = 0.5, padj = c(0, -1), cex.axis = setting[rol, 8])
  axis(3, at = 3, unique(meanstable$Test)[3], las = 1, hadj = 0, padj = -1, cex.axis = setting[rol, 8])
  # lower x axis: this does not show in the PNG files, yet it works in an R graphics window
  axis(1, at = c(1:3), c('', '1', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(4:6), c('', '2', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(7:9), c('', '3', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(10:12), c('', '4', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(13:15), c('', '5', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(16:18), c('', '6', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(19:21), c('', '7', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(22:24), c('', '8', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(25:27), c('', '9', ''), padj = -1, cex.axis = setting[rol, 8])
  axis(1, at = c(28:30), c('', '10', ''), padj = -1, cex.axis = setting[rol, 8])
  dev.off()
}
# graphics.off()

End of example.
[R] Bootstrap problem
Dear R-users,

I'm having a small problem while bootstrapping data. What I would like to do is resample the data and calculate a function on the resamples, so I can estimate a measure of reproducibility for this data. The function I wrote works fine, even while bootstrapping. The only problem is the bootstrapping itself.

The dataset consists of 10 trials, each divided into 3 groups: high (3), medium (2) and low (1). A bootstrap sample (trial) should always consist of 5 observations taken from each group population, so as to be representative.

Example:

original data:
trial 1 : group(1) = (0,0,1,0,0); group(2) = (0,1,1,0,1); group(3) = (1,1,1,1,1)
...
bootstrapped data:
trial 1 : group(1) = (0,0,0,0,1); group(2) = (1,1,0,0,1); group(3) = (1,0,1,1,1)
NOT bootstrapped data like this:
trial 1 : group(1) = (0,0,0,0,1,1,0); group(2) = (1,0,1); group(3) = (1,0,1,1,1,1,1,0,1,1)

I am familiar with the function bootstrap (pkg bootstrap), and I read about a function called boot (pkg boot), but I can't seem to master the latter. The explanation (help('boot')) isn't making me any wiser. I know I can always split the data up (which is what I am doing), but I was wondering whether this would have an effect on the bootstrap; maybe it is better to keep all the groups together?

Here is a (this time WORKING) code example of what I did.
## proc
library(bootstrap)  # provides bootstrap(), used further down

## generate data
datas <- data.frame(
  protection = c(rep(c(0,1,0,1,0,0,1,0,1,1,1,0,1,1,1), 2),
                 c(0,0,0,0,1,0,1,0,1,1,1,0,1,0,1),
                 rep(c(0,1,1,1,0,0,1,1,0,1,1,1,1,0,1), 2),
                 c(0,1,0,0,1,0,1,1,1,1,1,1,1,1,1),
                 rep(c(0,1,0,1,0,0,1,1,1,1,1,0,1,1,1), 2),
                 c(0,1,0,0,0,0,1,1,0,1,1,1,1,1,1),
                 c(0,0,1,1,0,0,1,1,1,0,0,1,1,0,1)),
  group = rep(1:3, 50),
  trial = c(rep(1,15), rep(2,15), rep(3,15), rep(4,15), rep(5,15),
            rep(6,15), rep(7,15), rep(8,15), rep(9,15), rep(10,15)))

## describe function
Vacc.Vcon <- function(dataset1, trialdata, groupdata) {
  groups <- unique(groupdata)
  trials <- unique(trialdata)
  Tr <- length(trials)
  G <- length(groups)
  Gl <- length(dataset1) / (G * Tr)
  Tl <- length(dataset1) / (Tr)
  iterg <- data.frame(1:G, as.vector(groups))
  trials <- unique(trialdata)
  Tr <- length(trials)
  itert <- data.frame(1:Tr, as.vector(trials))
  triallist <- c()
  grouplist <- c()
  for (x in 1:G) {
    ifelse(x == 1, y <- x, y <- y + Tr)
    grouplist[c(y:(y + Tr - 1))] <- rep(iterg[x, 2], Tr)
  }
  iter <- data.frame(1:(Tr), rep(trials, G), grouplist)
  VACC <- data.frame()
  # counts and proportions of 1s and 0s within one trial/group cell
  VACC.sub <- function(dataset1, trialn, groupn) {
    p0 <- sum(ifelse(dataset1 == 1 & trialdata == trialn & groupdata == groupn, 1, 0))
    p1 <- sum(ifelse(dataset1 == 0 & trialdata == trialn & groupdata == groupn, 1, 0))
    p <- p0 + p1
    VACC.group <- list('Trial' = trialn, 'Group' = groupn,
                       'Vacc' = sum((p0/p)^2, (p1/p)^2), p0 = (p0/p), p1 = (p1/p),
                       n0 = as.numeric(p0), n1 = as.numeric(p1), 'n' = as.numeric(p))
    VACC.group
  }
  for (i in 1:(G * Tr)) {
    VACC[i, 1] <- VACC.sub(dataset1, iter[i, 2], iter[i, 3])[1]
    VACC[i, 2] <- VACC.sub(dataset1, iter[i, 2], iter[i, 3])[2]
    VACC[i, 3] <- VACC.sub(dataset1, iter[i, 2], iter[i, 3])[3]
    VACC[i, 4] <- VACC.sub(dataset1, iter[i, 2], iter[i, 3])[4]
    VACC[i, 5] <- VACC.sub(dataset1, iter[i, 2], iter[i, 3])[5]
    VACC[i, 6] <- VACC.sub(dataset1, iter[i, 2], iter[i, 3])[6]
    VACC[i, 7] <- VACC.sub(dataset1, iter[i, 2], iter[i, 3])[7]
    VACC[i, 8] <- VACC.sub(dataset1, iter[i, 2], iter[i, 3])[8]
    VACC
  }
  rownames(VACC) <- NULL
  rownames(VACC) <- paste(iter[, 2], iter[, 3], sep = '_')
  Pcalc <- function(x) {
    out <- (1 / (Tr)) * sum(x)
    out
  }
  P0 <- tapply(VACC$p0, VACC$Group, Pcalc)
  P1 <- tapply(VACC$p1, VACC$Group, Pcalc)
  Vcon <- mean(cbind(P0^2 + P1^2))
  Vacc.total <- mean(tapply(VACC$Vacc, VACC$Group, mean))
  out <- list(all = VACC, N = G, P0 = P0, P1 = P1, Vcon = Vcon * 100, Vacc.total = Vacc.total * 100)
  out
}
## end describe function

Vacc.Vcon(datas[, 1], datas[, 3], datas[, 2])  # example of how the function works

## data needs to be in matrix form for the bootstrap function
xdata <- matrix(cbind(datas$protection, datas$group, datas$trial), ncol = 3, byrow = F)

## function for bootstrap
vacc.boot <- function(x, xdata) {
  Vacc.Vcon(xdata[x, 1], xdata[x, 3], xdata[x, 2])
}
bootk <- 10
results <- bootstrap(1:150, bootk, vacc.boot, xdata)
taccs <- list(); Vaccs <- vector(); Vcons <- vector()
boot.amp.vac2 <- for (i in 1:bootk) {
  m.i <- results$thetastar[[i]]
  taccs[i] <- list(m.i)
  G.Vacc <- round(tapply(taccs[[i]]$all$Vacc, rownames(taccs[[i]]$all), mean) * 100, digits = 3)
  Vaccs <- round(mean(taccs[[i]]$Vacc.total), digits = 3)
  Vcons <- round(mean(taccs[[i]]$Vcon), digits = 3)
  tacc <- list(data = taccs, Booted.means = list(Vacc.grouped = G.Vacc, Vacc.Total = Vaccs, Vcon.Total = Vcons))
  tacc
}
Rep.table <- tacc$Booted.mean
Rep.table

## problem area: n should always be 5 in each group, as in the original data
# values based on the original data, last column: n = 5
Vacc.Vcon(datas[, 1], datas[, 3], datas[, 2])$all[1:5, ]
# values based on the bootstrapped data: n is not 5!
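A bootstrap that keeps n = 5 in every trial-by-group cell is a stratified bootstrap: the row indices are resampled with replacement inside each cell rather than over all 150 rows at once. The sketch below shows the idea in base R on toy data with the same layout as in the post; strat_index and the toy probabilities are invented for the illustration. The boot function in package boot also has a strata argument meant for this situation, which is not used here.

```r
set.seed(123)
# Toy data: 10 trials x 3 groups x 5 observations
d <- expand.grid(obs = 1:5, group = 1:3, trial = 1:10)
d$protection <- rbinom(nrow(d), 1, c(0.2, 0.6, 0.9)[d$group])

# Resample row indices separately inside every trial-by-group cell
strat_index <- function(data) {
  cells <- split(seq_len(nrow(data)), list(data$trial, data$group), drop = TRUE)
  unlist(lapply(cells, function(i) i[sample.int(length(i), replace = TRUE)]),
         use.names = FALSE)
}

idx    <- strat_index(d)
boot_d <- d[idx, ]

# Every cell of the bootstrap sample still holds 5 rows
print(table(boot_d$trial, boot_d$group))
```

The resulting index vector can be passed to a statistic in the same way as the indices that bootstrap() hands to vacc.boot, so the cell sizes seen by Vacc.Vcon stay at 5.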
[R] PNG file don't run on mac's?
Dear R users,

I'm having problems with creating PNG graphic outputs. Usually I create reports in HTML format, containing PNG graphics, so they can easily be exported to Word, Excel and so on. On the Windows PC that I use at work all works fine, but it never works on my Mac. The HTML files I create on Windows open under Safari, but the graphics never open. Is there a way of creating GIFs, for example?

Then I was wondering whether there is a less coding-intensive way to create plots. What I mean is: when I create a plot consisting of 4 subplots, and I want to save it as a PNG to incorporate it into an HTML output, I need to create every plot under the PNG declaration.

# png(filname..
# parset <- par(mfrow = c(2,2), oma = c(0,0,1,0),..])
# plot(...
# plot(...
# plot(...
# plot(...
# par(parset)
# dev.off()

Now, because I create graphs for different people, I usually generate them in different sizes and configurations, so they just have to choose which one to use in their text. Therefore I am looking for a way of programming that can save time and space, yet the example below does not work; it only writes tables of data.

# plot.1 <- plot(...
# plot.2 <- plot(...
# plot.3 <- plot(...
# plot.4 <- plot(...
#
# png(..
# plot.1
# dev.off()
# ...
# png(filname..
# parset <- par(mfrow = c(2,2), oma = c(0,0,1,0),..])
# plot.1
# plot.2
# plot.3
# plot.4
# par(parset)
# dev.off()

Here is an example of how I create the PNG files that I use in the HTML outputs.
# size <- data.frame(adname = c("_small", "", "_big", "_html"), width = c(300,400,500,300), height = c(300,400,500,300),
#                    pointsize = c(10,12,14,10), Directory = c(rep("graph",3), "html"), cexsize = c(0.6,0.65,0.7,0.65))
#
# for (rol in 1:4) {
#   save_at <- file.path(ResultDir, as.character(size[rol,5]))
#   setwd(save_at)
#   x11()
#   png(filename = paste(graf_1, size[rol,1], ".png", sep = ""), width = size[rol,2], height = size[rol,3],
#       pointsize = size[rol,4], bg = "white", res = NA, restoreConsole = TRUE)
#   parset <- par(mfrow = c(2,2), oma = c(0,0,1,0), cex.axis = size[rol,6], cex = size[rol,6])
#   boxplot(bxpdtrail$value, ylab = "log10 (x)", main = paste("Mean", testname, "at", fase, sep = ' '))
#   points(1, mean(bxpdtrail$value, na.rm = T), pch = 19)
#   arrows(1, upper(bxpdtrail$value, na.rm = T), 1, lower(bxpdtrail$value, na.rm = T), lty = 3, code = 3, angle = 45, length = .1)
#   pyramid.plot(pyramiddata$yes, pyramiddata$no, main = "Observation of Protection", labels = pyramiddata$L1,
#                labelcex = size[rol,6], top.labels = c("Protected", "log10(x)", "Unprotected"),
#                xycol = xycol, xxcol = xxcol, gap = 8, unit = "# animals")
#   barplot(pyramiddata$prob_dens, names = pyramiddata$L1, ylab = "EPP %", xlab = "log10(x)", main = "Probability density plot")
#   plot(pyramiddata$L1, pyramiddata$prob_cum, ylab = "EPP %", xlab = "log10(x)", main = "Cumulative Probability plot", type = "S")
#   par(parset)
#   dev.off()
# }

Kind regards,
Tom.
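Base graphics calls such as plot() draw on the current device and do not return the picture, which is why storing them in plot.1, plot.2 and so on does not give something that can be replayed. A common way around this is to put the drawing code in a function and call that function once per output device. The sketch below does this with invented data and an invented function name (draw_panels), writing three sizes to temporary files; restoreConsole is left out because it is an argument of the Windows devices only.

```r
# All drawing code in one place; nothing is returned, the device is drawn on
draw_panels <- function(x, cex_size = 1) {
  old <- par(mfrow = c(2, 2), oma = c(0, 0, 1, 0), mar = c(4, 4, 2, 1),
             cex = cex_size, cex.axis = cex_size)
  on.exit(par(old))
  boxplot(x, main = "Boxplot")
  hist(x, main = "Histogram")
  plot(ecdf(x), main = "Cumulative")
  plot(x, type = "b", main = "Sequence")
  invisible(NULL)
}

set.seed(1)
x <- rnorm(50)

sizes <- data.frame(suffix    = c("_small", "", "_big"),
                    width     = c(300, 400, 500),
                    pointsize = c(10, 12, 14))
files <- file.path(tempdir(), paste0("graf_1", sizes$suffix, ".png"))

for (k in seq_len(nrow(sizes))) {
  png(files[k], width = sizes$width[k], height = sizes$width[k],
      pointsize = sizes$pointsize[k])
  draw_panels(x, cex_size = 0.7)
  dev.off()
}
print(files)
```

The same function can be called after pdf() or any other device function, so adding a new output format needs one extra device call instead of a copy of all the plotting code.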
[R] help with SQL, how can i use functions in sql (pkg :sqldf)
Dear R users,

I was trying to summarise data with SQL, from the sqldf package. It seemed like a promising solution, yet all I can do with it is calculate avg, count and sum, where I'd like to use confidence intervals and standard deviations as well. I was trying to find a solution myself, but the closest I got was sqlite3_create_function16, explained on http://www.sqlite.org/c3ref/create_function.html. Sadly I don't understand much of the explanation. Now I hoped somebody could give me another SQL solution for this.

The function I hoped to use is this one:

mean.CI <- function(X, na.rm = T) {
  names(X) <- NULL
  if (is.vector(X)) { nn <- length(X) } else { nn <- nrow(X) }
  NAs <- sum(is.na(X))
  n <- nn - NAs
  if (na.rm) avg <- mean(X, na.rm = T)
  Sd <- sd(X, na.rm = T)
  Var <- var(X, na.rm = T)
  if (is.matrix(X)) {
    apply(X, 2, sd, na.rm = T)
  } else if (is.vector(X)) {
    confin <- qt(0.975, df = n - 1) * (sd(X, na.rm = T) / sqrt(n))
  } else if (is.data.frame(X)) {
    confin <- qt(0.975, df = n - 1) * ((sapply(X, sd, na.rm = T)) / sqrt(n))
  } else {
    confin <- qt(0.975, df = n - 1) * (sd(as.vector(X), na.rm = T)) / sqrt(n)
  }
  out <- round(c(avg - confin, avg + confin), digits = 3)
  out
}

Kind regards,
Tom
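Two routes may help here, sketched below on invented data. The first stays in R and applies a small interval function per group, which avoids SQL altogether. The second shows that a standard deviation can be rebuilt from a count, a sum and a sum of squares, which are the kind of aggregates the post says are available in SQL; the R lines only mimic what such a grouped query would return. The function name mean_ci and the column names are made up for the example.

```r
# 95% t-interval for the mean of one numeric vector
mean_ci <- function(x, level = 0.95) {
  x <- x[!is.na(x)]
  n <- length(x)
  half <- qt(1 - (1 - level) / 2, df = n - 1) * sd(x) / sqrt(n)
  c(n = n, mean = mean(x), sd = sd(x),
    lower = mean(x) - half, upper = mean(x) + half)
}

set.seed(7)
d <- data.frame(grp   = rep(c("a", "b", "c"), each = 20),
                value = rnorm(60, mean = rep(c(1, 2, 3), each = 20)))

# Route 1: summary per group in plain R
res <- do.call(rbind, lapply(split(d$value, d$grp), mean_ci))
print(round(res, 3))

# Route 2: sd from count, sum and sum of squares only
n   <- tapply(d$value, d$grp, length)
sx  <- tapply(d$value, d$grp, sum)
sxx <- tapply(d$value^2, d$grp, sum)
sd_from_sums <- sqrt((sxx - sx^2 / n) / (n - 1))
print(round(sd_from_sums, 3))
```

With n, the mean and this standard deviation per group, the interval follows in R as mean plus or minus qt(0.975, n - 1) * sd / sqrt(n), so only the three simple aggregates have to come from the database.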
[R] how do i save data to txt file? are there marcos to save tables to word?
Dear R users,

I would like to save a newly created data set out of R into a text file. It is a rather big data set, and recalculating the new variables takes a long time. The quickest way to read data is when it is saved as .txt; this is why I hope to read the data from the old .txt file, then calculate a new set of variables based on the old set, and save them in a new data file, also in .txt format.

One other thing I'd like to know is how to save table output from R directly to a table in Word. Copy and paste destroys the layout of the table.

Many thanks in advance,
Tom
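A minimal sketch of the round trip with write.table and read.table, assuming a tab-separated file and using invented data and an invented file name in a temporary directory:

```r
# Small stand-in for the data set with one newly calculated variable
d <- data.frame(id = 1:5,
                x = c(0.5, 1.25, NA, 3, 4.75),
                label = c("a", "b", "c", "d", "e"),
                stringsAsFactors = FALSE)
d$x2 <- d$x^2

txt_file <- file.path(tempdir(), "newdata.txt")  # hypothetical file name

# Write as tab-separated text, without row names and without quotes
write.table(d, txt_file, sep = "\t", row.names = FALSE, quote = FALSE)

# Read it back in a later session
back <- read.table(txt_file, header = TRUE, sep = "\t",
                   stringsAsFactors = FALSE)
print(all.equal(d, back))
```

A tab-separated file written this way is also a format that spreadsheets and word processors can usually import as a table, which may be an easier route into Word than copy and paste.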
[R] how do i creat multiple back to back histograms?
Dear R users,

I would like to create a graph in which my data is presented as vertically oriented histograms, which give the frequency of the measured values, grouped per measurement method used. So the X axis should hold the grouping variable and the Y axis a continuous variable. Well, not really continuous, but it should show the values representing the class intervals.

A small example to clarify this.

This one I can create:
[Graph 1: ASCII sketch of only the positive results per test, one vertical histogram each for test 1, test 2 and test 3, with the value classes 0 to 5 on the Y axis.]

This is what I want to create:
[Graph 2: ASCII sketch of a back-to-back histogram plot of the negative and positive results ("Neg. | Pos."), grouped per test (test 1, test 2, test 3), with the value classes 0 to 5 on the Y axis.]

I'd like to create the figure of graph 2, a back-to-back plot of the positive and negative results of a test, and this with the 3 tests in one graph.

I have been searching for examples; the only trouble is that the one I found is way too complex (http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=109). Here is one other example of a back-to-back plot (graph 2): histbackback(fool).

After trying to understand what happens in the complex example, I identified a part that does what I need. It does create a graph similar to what I want (graph 1), see the code below. Yet I do not understand it well enough, so I can't create the more complex graph 2. What can I do to understand these graph functions better, without spending too much time on them?

Kind regards,
Tom.
This is the code which I use to create graph 1:

# data for plot:
freqs <- data.frame(
  value = c(0.000,1.204,1.301,1.362,1.447,1.505,1.602,1.653,1.756,1.806,1.903,1.959,
            2.053,2.107,2.204,2.258,2.354,2.408,2.505,2.559,2.656,2.709,2.806),
  tp1 = c(8,1,0,13,0,6,0,25,0,5,0,15,0,4,0,7,0,0,0,1,0,0,0),
  tn1 = c(17,0,0,2,0,0,0,1,0,2,0,1,0,0,0,0,0,0,0,0,0,0,0),
  tp2 = c(10,0,2,0,9,0,8,0,19,0,4,0,5,0,2,0,5,0,2,0,1,0,2),
  tn2 = c(13,0,1,0,1,0,2,0,2,0,0,0,2,0,0,0,0,0,0,0,0,0,0),
  tp3 = c(9,0,0,0,0,0,0,0,0,10,0,10,0,21,0,10,0,11,0,8,0,5,0),
  tn3 = c(15,0,0,0,0,0,0,0,0,3,0,2,0,2,0,1,0,0,0,0,0,0,0))
test <- c(1,2,3,4,5,6)
testname <- c('test1 p','test1 n','test2 p','test2 n','test3 p','test3 n')

# parameters for plot
xlim = c(min(test), max(test))
ylim = c(0, length(freqs$value))
barscale = 0.2
barcol = 8

# plot
win.graph()  # Windows only
for (i in seq_along(test)) {  # the post looped over 1:length(freqs), which runs one column too far
  par(new = TRUE)
  xmin <- -test[i] + xlim[1]
  xmax <- xlim[2] - test[i]
  ser <- freqs[, i+1]
  ser <- ser/max(ser) * barscale
  barplot(ser, horiz = TRUE, axes = FALSE, xlim = c(xmin, xmax), ylim = ylim, col = barcol, space = 0)
}
axis(1, labels = testname, at = c(0,0.2,0.4,0.6,0.8,1))
axis(2, labels = freqs$value, at = c((0:22)/23))

This is the code which I hoped would create graph 2, but it doesn't work:

for (i in 1:length(freqs)) {
  par(new = TRUE)
  xmin <- -test[i] + xlim[1]
  xmax <- xlim[2] - test[i]
  serx <- freqs[, i+1]
  sery <- freqs[, i+2]
  ser <- list((serx/sum(serx) * barscale), (sery/sum(sery) * barscale))
  histbackback(ser, axes = FALSE, xlim = c(xmin, xmax), ylim = ylim)
}
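One way to get a back-to-back picture with base graphics only is to draw the negative counts as horizontal bars pointing left (by plotting them with a minus sign) and then add the positive counts on the same axes pointing right, once per test. The sketch below is an illustration of that idea on a shortened, made-up version of the data, written to a temporary PDF; back_to_back and the other names are invented and it is not a reconstruction of the gallery example.

```r
value <- c(0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4)   # class values (toy data)
pos <- list(test1 = c(8, 1, 13, 6, 25, 5, 15, 4),
            test2 = c(10, 2, 9, 8, 19, 4, 5, 2),
            test3 = c(9, 0, 0, 10, 10, 21, 10, 11))
neg <- list(test1 = c(17, 0, 2, 0, 1, 2, 1, 0),
            test2 = c(13, 1, 1, 2, 2, 0, 2, 0),
            test3 = c(15, 0, 0, 3, 2, 2, 1, 0))

# Negative counts to the left of zero, positive counts to the right
back_to_back <- function(neg, pos, value, main) {
  lim  <- max(neg, pos)
  mids <- barplot(-neg, horiz = TRUE, space = 0, xlim = c(-lim, lim),
                  axes = FALSE, col = "grey40", main = main,
                  xlab = "Neg. | Pos.")
  barplot(pos, horiz = TRUE, space = 0, add = TRUE, axes = FALSE, col = "grey80")
  ticks <- pretty(c(-lim, lim))
  axis(1, at = ticks, labels = abs(ticks))        # show counts as positive numbers
  axis(2, at = mids, labels = value, las = 1, tick = FALSE)
  abline(v = 0)
  mids
}

pdf(file.path(tempdir(), "back_to_back.pdf"), width = 9, height = 4)
old <- par(mfrow = c(1, 3), mar = c(4, 4, 3, 1))
mids <- lapply(names(pos), function(nm) back_to_back(neg[[nm]], pos[[nm]], value, nm))
par(old)
dev.off()
```

Using one panel per test with par(mfrow = c(1, 3)) avoids the par(new = TRUE) overlay and the shifted xlim values, so each panel keeps its own readable count axis.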
[R] biserial correlation with pkg polycor
Dear R users,

While looking for a way to calculate the association between a continuous and a binary variable, I found a procedure called point biserial correlation. Not being a mathematician, I did my very best to understand what it was all about, and then I found an easily understandable paper (by Steve Simon) on how to calculate this.
ref ## http://www.childrens-mercy.org/stats/definitions/biserial.htm (this page has the same example)

Further I discovered the polycor package in R. Now I'm having trouble with the fact that the polycor package never gives me the same output as the manual application of the formula. In the example below I found manually r(biserial) = 0.49 between fb and age, and using the function polyserial (polycor package) r(biserial) = -0.8591. This is a rather big difference, not due to rounding or floating point.

Is there someone who is familiar with biserial correlation and the appropriate way to calculate it?

Kind regards,
Tom.

Here is the example; at the end is the R file.
First I create the input:

library(abind)
library(polycor)
### data input
no <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8)
fb <- c(19, 30, 20, 19, 29, 25, 21, 24, 50, 25, 21, 17, 15, 14, 14, 22, 17)
ss <- c(14, 41, 18, 11, 16, 24, 18, 21, 37, 17, 10, 16, 22, 12, 14, 12, 18)
age <- c('elderly', 'elderly', 'elderly', 'elderly', 'elderly', 'elderly', 'elderly', 'elderly', 'elderly',
         'young', 'young', 'young', 'young', 'young', 'young', 'young', 'young')
dataset <- data.frame(no, fb, ss, age)
dataset <- subset(dataset, select=c(fb:age))
nrow(dataset)
[1] 17
data_eld <- subset(dataset, age=='elderly', select=fb)
data_young <- subset(dataset, age=='young', select=fb)

Here I calculate R_bis (the biserial correlation) manually:

### point biserial correlation
fb <- subset(dataset, select=fb)
fb0 <- subset(dataset, age!='elderly', select=fb)
fb1 <- subset(dataset, age=='elderly', select=fb)
meanfb0 <- mean(fb0, na.rm=T)
meanfb1 <- mean(fb1, na.rm=T)
sdfb <- sd(dataset$fb, na.rm=T)
ss <- subset(dataset, select=ss)
ss0 <- subset(dataset, age!='elderly', select=ss)
ss1 <- subset(dataset, age=='elderly', select=ss)
meanss0 <- mean(ss0, na.rm=T)
meanss1 <- mean(ss1, na.rm=T)
sdss <- sd(dataset$ss, na.rm=T)
age <- subset(dataset, select=age)
n <- nrow(dataset)

This is the formula from http://www.childrens-mercy.org/stats/definitions/biserial.htm

R_bis <- function(x, x1, x0, n){
  p <- (nrow(x1)/n)
  (((mean(x1,na.rm=T)-mean(x0,na.rm=T))/sd(x,na.rm=T)) * sqrt(p*(1-p)))
}

This is the corrected formula from http://en.wikipedia.org/wiki/Point-biserial_correlation_coefficient

R_bis2 <- function(x, x1, x0, n){
  ((mean(x1,na.rm=T)-mean(x0,na.rm=T))/sd(x,na.rm=T)) * (sqrt((nrow(x1)*nrow(x0))/(n*(n-1))))
}

R_bis(fb, fb1, fb0, n)
       fb
0.4798873    # result in paper was 0.49
R_bis2(fb, fb1, fb0, n)
       fb
0.4946565    # equals result in paper, 0.49

Then I use the polycor package. The function hetcor gives all the different correlation results:

hetcor(dataset$fb, dataset$ss, dataset$age, ML=TRUE)

Maximum-Likelihood Estimates

Correlations/Type of Correlation:
            dataset$fb dataset.ss dataset.age
dataset$fb           1    Pearson  Polyserial
dataset.ss       0.703          1  Polyserial
dataset.age    -0.8591    -0.6685           1

Standard Errors:
            dataset$fb dataset.ss
dataset$fb
dataset.ss      0.1215
dataset.age     0.1106     0.2497

n = 17

P-values for Tests of Bivariate Normality:
            dataset$fb dataset.ss
dataset$fb
dataset.ss      0.1782
dataset.age     0.4269     0.4034

hetcor(dataset, ML=TRUE)

Maximum-Likelihood Estimates

Correlations/Type of Correlation:
         fb      ss        age
fb        1 Pearson Polyserial
ss    0.703       1 Polyserial
age -0.8591 -0.6685          1

Standard Errors:
        fb     ss
fb
ss  0.1215
age 0.1106 0.2497

n = 17

P-values for Tests of Bivariate Normality:
        fb     ss
fb
ss  0.1782
age 0.4269 0.4034

Here a quick two-step method is used to calculate the polyserial correlation:

polyserial(dataset$fb, dataset$age)
[1] -0.6205737

# same method as in hetcor, only for the indicated variables
polyserial(dataset$fb, dataset$age, ML=TRUE, std.err=TRUE)

Polyserial Correlation, ML est. = -0.8591 (0.1106)
Test of bivariate normality: Chisquare = 4.91, df = 5, p = 0.4269

               1
Threshold 0.1811
Std.Err.  0.1849

### for side to side (ss): in case no 9 is an outlier in fb, this will not be the case in ss
R_bis(ss, ss1, ss0, n)
       ss
0.4153681    # result in paper was 0.43
R_bis2(ss, ss1, ss0, n)
       ss
0.4281516    # equals result in paper, 0.43

polyserial(dataset$ss, dataset$age)
[1] -0.5371397
polyserial(dataset$ss, dataset$age, ML=TRUE, std.err=TRUE)

Polyserial Correlation, ML est. = -0.6685 (0.2497)
Test of bivariate normality: Chisquare = 5.103, df = 5, p = 0.4034

               1
Threshold 0.1504
Std.Err.  0.2583
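A note on the comparison above: the point-biserial correlation is simply the Pearson correlation between the continuous variable and a 0/1 indicator of the binary one, so it can be checked with cor() alone. The sketch below re-enters the fb and age values from the post; the names elderly, r_pb, r_formula and r_flipped are introduced here only for illustration. It also recomputes the group-means formula with sd(), which divides by n - 1 and therefore pairs with the n*(n-1) version of the formula.

```r
fb  <- c(19, 30, 20, 19, 29, 25, 21, 24, 50, 25, 21, 17, 15, 14, 14, 22, 17)
age <- factor(rep(c("elderly", "young"), times = c(9, 8)))

# Point-biserial correlation = Pearson correlation with a 0/1 indicator.
elderly <- as.numeric(age == "elderly")
r_pb <- cor(fb, elderly)

# Same value from the group means, using sd() (n - 1 in the denominator).
n  <- length(fb)
n1 <- sum(elderly)
n0 <- n - n1
r_formula <- (mean(fb[elderly == 1]) - mean(fb[elderly == 0])) / sd(fb) *
  sqrt(n1 * n0 / (n * (n - 1)))

# Coding "young" as 1 instead changes only the sign, not the magnitude.
r_flipped <- cor(fb, as.numeric(age == "young"))

print(c(r_pb = r_pb, r_formula = r_formula, r_flipped = r_flipped))
```

The polyserial estimate is a different quantity: it is the correlation with an assumed normally distributed latent variable underlying the binary one, so it is not expected to equal the point-biserial value. The negative sign is probably a matter of coding, since the factor levels sort alphabetically ("elderly" before "young") and the elderly group has the higher fb values.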
Re: [R] plot for binomial glm
Dear Jonh,

There is probably an easier way, but I find this gives nice smooth plots. Good luck with it.

### R-file
library(boot)  # provides logit() and inv.logit()
alive <- data$num - data$numdead
numdead <- data$numdead
temp <- data$temp
data.table <- cbind(numdead, alive)
points.graph <- data$alive/data$num
glm.mort <- glm(data.table ~ temp, family=binomial)
fit <- predict(glm.mort, type='response')
# write the model parameters to named variables; you can also use them directly in a function, as you like
a <- glm.mort$coef[1]
b <- glm.mort$coef[2]
x2 <- c((logit(fit)-(a))/b)
p2 <- c((inv.logit(a+b*x2)))
y2 <- c(a+b*x2)
plot(c(30,55), c(0,1), type="n", main="survival", xlab="Log x", ylab="Probability")
lines(sortedXyData((logit(p2)-(a))/b, p2), type="l", lty=1, col="blue", ylim=c(0,1.2))
points(temp, fit, pch=4, type="p", col="black")

## This will plot a smooth curve
x <- c(x=(rep(33:55,1)))
p <- c((inv.logit(a+b*x)))
y <- c(a+b*x)
plot(c(30,55), c(0,1), type="n", main="survival", xlab="Log x", ylab="Probability")
lines(sortedXyData((logit(p)-(a))/b, p), type="l", lty=1, col="blue", ylim=c(0,1.2))
points(temp, fit, pch=4, type="p", col="black")
### END

Willems Tom
E-mail: [EMAIL PROTECTED]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
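For comparison, the same kind of smooth curve can usually be obtained without inverting the logit by hand, by calling predict() on a fine grid of temperatures. This is only a sketch under stated assumptions: the data frame dat and its values are made up for illustration (they are not the data from the thread), and the column names temp, num and numdead merely mirror those used in the reply.

```r
# Hypothetical mortality data, invented for this example.
dat <- data.frame(
  temp    = c(33, 36, 39, 42, 45, 48, 51, 54),
  num     = rep(20, 8),
  numdead = c(1, 2, 4, 8, 12, 16, 18, 19)
)

# Binomial GLM with a two-column (dead, alive) response.
fit <- glm(cbind(numdead, num - numdead) ~ temp, family = binomial, data = dat)

# Predict on a fine grid to get a smooth curve on the probability scale.
grid <- data.frame(temp = seq(30, 55, length.out = 200))
grid$p <- predict(fit, newdata = grid, type = "response")

# Observed proportions as points, fitted curve as a line.
plot(dat$temp, dat$numdead / dat$num, pch = 4, ylim = c(0, 1),
     xlab = "Temperature", ylab = "Probability of death", main = "Mortality")
lines(grid$temp, grid$p, col = "blue")
```

Because the grid is already sorted, no sorting helper is needed before calling lines().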
[R] Question: Rcmdr and macbooks
Hello dear R users,

I have noticed that certain very handy functions, like plotMeans and ci.plot, only run under Rcmdr. Rcmdr does not run on a Mac, so I hope someone out there knows about an Rcmdr module for Macs.

Kind regards,
Tom

Willems Tom
www.var.fgov.be
[R] Q: confidence intervals plotMeans, how do i discard NA's
Dear R users,

I noticed a small problem when using, for example, the function plotMeans under Rcmdr. It plots the means on the Y-axis by a factor on the X-axis. All is correct when using 'sd' for the error bars. The problem occurs when using confidence intervals for the error bars. Here, R assumes that every group on the X-axis has the same number of observations, and calculates the confidence interval as follows:

N <- length(X)
means <- mean(X, na.rm=T)
sds <- if (is.matrix(X)) apply(X, 2, sd, na.rm=T)
  else if (is.vector(X)) qt(0.975, df=N-1)*(sd(X, na.rm=T)/sqrt(N))
  else if (is.data.frame(X)) qt(0.975, df=N-1)*((sapply(X, sd, na.rm=T))/sqrt(N))
  else qt(0.975, df=N-1)*(sd(as.vector(X), na.rm=T))/sqrt(N)
yrange <- if (error.bars != "none") c(min(means - sds, na.rm=TRUE), max(means + sds, na.rm=TRUE))

This works fine when you have a well-balanced data set, yet in most cases you don't! R will still calculate the confidence interval for each group mean using the size of the largest group.

Example: when you have 3 groups, A with 10 measurements, B with 5 and C with 8, all confidence intervals will be calculated using a total N of 10.

I would like to correct this so that the plots I create have correct CI error bars, yet I did not find a way to omit NAs in the functions length, NROW or nrow. Can anybody help me solve this problem, please?

Kind regards,
Tom
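One possible way around this, sketched here with made-up data rather than the Rcmdr internals: length() has no na.rm argument, but sum(!is.na(x)) counts the non-missing values, and applying it per group gives each group its own N. The data frame dat and the names n_obs and ci_half are hypothetical and exist only for this illustration.

```r
# Hypothetical data: groups A, B, C with 10, 5 and 8 rows, one value missing in A.
set.seed(1)
dat <- data.frame(
  group = factor(rep(c("A", "B", "C"), times = c(10, 5, 8))),
  value = rnorm(23, mean = 10, sd = 2)
)
dat$value[3] <- NA

by_group <- split(dat$value, dat$group)

# Count only the non-missing observations in each group.
n_obs <- sapply(by_group, function(x) sum(!is.na(x)))
means <- sapply(by_group, mean, na.rm = TRUE)
sds   <- sapply(by_group, sd, na.rm = TRUE)

# Half-width of the 95% confidence interval, using each group's own n.
ci_half <- qt(0.975, df = n_obs - 1) * sds / sqrt(n_obs)

print(data.frame(n = n_obs, mean = means,
                 lower = means - ci_half, upper = means + ci_half))
```

With these per-group half-widths, the error bars of a smaller group are no longer computed as if it had as many observations as the largest one.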