Re: [R] memory.size help
At 13:27 31/08/2007, dxc13 wrote: >I keep getting the 'memory.size' error message when I run a program I have >been writing. It always says it cannot allocate a vector of a certain size. I >believe the error comes in the code fragment below where I have multiple >arrays that could be taking up space. Does anyone know a good way around >this? It is a bit hard without knowing the dimensions of the arrays but see below. >w1 <- outer(xk$xk1, data[,x1], function(y,z) abs(z-y)) >w2 <- outer(xk$xk2, data[,x2], function(y,z) abs(z-y)) >w1[w1 > d1] <- NA >w2[w2 > d2] <- NA >i1 <- ifelse(!is.na(w1),yvals[col(w1)],NA) >i2 <- ifelse(!is.na(w2),yvals[col(w2)],NA) If I read this correctly, after this point you no longer need w1 and w2, so what happens if you remove them? >zk <- numeric(nrow(xk)) #DEFINING AN EMPTY VECTOR TO HOLD ZK VALUES >for(x in 1:nrow(xk)) { > k <- intersect(i1[x,], i2[x,]) > zk[x] <- mean(unlist(k), na.rm = TRUE) >} >xk$zk <- zk >data <- na.omit(xk) >-- >View this message in context: >http://www.nabble.com/memory.size-help-tf4359846.html#a12425401 >Sent from the R help mailing list archive at Nabble.com. Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
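The advice above can be sketched in miniature: once the derived objects are built, remove the large intermediates and call gc(). The matrices here are random stand-ins for the poster's outer() results, and the i1/i2 step is simplified.

```r
## Random stand-ins for the large outer() results in the original code
w1 <- matrix(runif(1e6), nrow = 1000)
w2 <- matrix(runif(1e6), nrow = 1000)

## Simplified versions of the derived objects that are actually kept
i1 <- !is.na(w1)
i2 <- !is.na(w2)

## Drop the big intermediates and ask the garbage collector to
## return the space before the rest of the script runs
rm(w1, w2)
invisible(gc())
```

Whether this helps depends on how large w1 and w2 are relative to the address space available, but it costs nothing to try.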
Re: [R] Exact Confidence Intervals for the Ratio of Two Binomial
At 18:28 23/08/2007, [EMAIL PROTECTED] wrote: >Hello everyone, > >I would like to know if there is an R package to compute exact confidence >intervals for the ratio of two binomial proportions. If you go to https://www.stat.math.ethz.ch/pipermail/r-help//2006-November/thread.html and search that part of the archive for "relative risk" you will find a number of suggestions. Unfortunately the responses are not all threaded so you need to search the whole thing. >Tony. > [[alternative HTML version deleted]] Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
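No exact interval is shown in the thread itself, but the usual asymptotic log-ratio interval for the ratio of two binomial proportions is a useful baseline and takes only a few lines. A sketch with made-up counts:

```r
## Made-up counts: events / trials in each group
x1 <- 30; n1 <- 100
x2 <- 15; n2 <- 100

## Ratio of proportions (relative risk)
rr <- (x1 / n1) / (x2 / n2)

## Delta-method standard error of log(RR)
se <- sqrt(1/x1 - 1/n1 + 1/x2 - 1/n2)

## Asymptotic (not exact) 95% interval
ci <- exp(log(rr) + qnorm(c(0.025, 0.975)) * se)
```

This is the standard large-sample interval, not the exact one the poster asked about, so it only serves as a cross-check against whatever exact method is eventually used.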
Re: [R] How to fit a linear model without intercept
At 10:56 23/08/2007, Michal Kneifl wrote: >Please could anyone help me? >How can I fit a linear model where an intercept has no sense? Well the line has to have an intercept somewhere I suppose. If you use the site search facility and look for "known intercept" you will get some clues. >Thanks in advance.. > >Michael > > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
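The two cases hinted at in the reply can be sketched with toy data (true slope 2): `- 1` in the formula drops the intercept entirely, while a *known* intercept is handled with offset() rather than estimated.

```r
set.seed(1)
x <- 1:20
y <- 2 * x + rnorm(20)

## Regression through the origin: no intercept at all
fit0 <- lm(y ~ x - 1)

## Intercept known to be 5: fix it with an offset instead of estimating it
fit5 <- lm(y ~ x - 1 + offset(rep(5, length(y))))
```

Both fits return a single slope coefficient; which one is appropriate depends on whether the intercept is genuinely zero or merely known in advance.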
Re: [R] GLMM: MEEM error due to dichotomous variables
At 14:31 07/08/2007, Elva Robinson wrote: >I am trying to run a GLMM on some binomial data. My fixed factors >include 2 dichotomous variables, day, and distance. When I run the model: > >modelA<-glmmPQL(Leaving~Trial*Day*Dist,random=~1|Indiv,family="binomial") > >I get the error: > >iteration 1 >Error in MEEM(object, conLin, control$niterEM) : >Singularity in backsolve at level 0, block 1 > > From looking at previous help topics,( > http://tolstoy.newcastle.edu.au/R/help/02a/4473.html) >I gather this is because of the dichotomous predictor variables - >what approach should I take to avoid this problem? I seem to remember a similar error message (although possibly not from glmmPQL). Does every combination of Trial * Day * Dist occur in your dataset? You would find it easier to read your code if you used your space bar. Computer storage is cheap. >Thanks, Elva. > >_ >Got a favourite clothes shop, bar or restaurant? Share your local knowledge > > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
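The question "does every combination occur?" can be answered with table(); a toy sketch (variable names follow the post, the data are invented, and one combination is deliberately dropped):

```r
## Toy data with one Trial x Day x Dist combination deliberately missing
dat <- expand.grid(Trial = 1:2, Day = c("d1", "d2"), Dist = c("near", "far"))
dat <- dat[-1, ]

## Any zero cell flags a combination absent from the data set
tab <- with(dat, table(Trial, Day, Dist))
any(tab == 0)   # TRUE here: one empty cell
```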
Re: [R] proportional odds model in R
At 08:51 02/08/2007, Ramon Martínez Coscollà wrote: >Hi all!! There is no need to post twice, nor to also post on allstat. Pages 204-205 of MASS, for which this software is a support tool, provide ample information on how to compare models. >I am using a proportional odds model to study some ordered categorical >data. I am trying to predict one ordered categorical variable taking >into account only another categorical variable. > >I am using polr from the R MASS library. It seems to work ok, but I'm >still getting familiar and I don't know how to assess goodness of fit. >I have this output, when using response ~ independent variable: > >Residual Deviance: 327.0956 >AIC: 333.0956 > > polr.out$df.residual >[1] 278 > > polr.out$edf >[1] 3 > >When taking out every variable... (i.e., making >formula: response ~ 1), I have: > >Residual Deviance: 368.2387 >AIC: 372.2387 > >How can I test if the model fits well? How can I check that the >independent variable effectively explains the model? Is there any >test? > >Moreover, sending summary(polr.out) I get this error: > > >Error in optim(start, fmin, gmin, method = "BFGS", hessian = Hess, ...) : >initial value in 'vmmin' is not finite > >Something to do with the optimisation procedure... but, how can I fix >it? Any help would be greatly appreciated. > >Thanks. Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
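The MASS-style comparison referred to above amounts to a likelihood-ratio test between nested polr() fits. A sketch using the housing data shipped with MASS (a single covariate here for brevity, not the poster's data):

```r
library(MASS)

## Covariate model and intercept-only model on the same data
fit1 <- polr(Sat ~ Infl, weights = Freq, data = housing)
fit0 <- polr(Sat ~ 1,    weights = Freq, data = housing)

## Likelihood-ratio test of the Infl effect
anova(fit0, fit1)
```

The drop in residual deviance between the two fits, on the difference in degrees of freedom, is exactly the comparison the poster's output (368.24 vs 327.10) invites.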
Re: [R] generating symmetric matrices
At 16:29 30/07/2007, Gregory Gentlemen wrote: >Douglas Bates <[EMAIL PROTECTED]> wrote: On 7/27/07, Gregory >Gentlemen wrote: > > Greetings, > > > I have a seemingly simple task which I have not been able to > solve today. I want to construct a symmetric matrix of arbitrary > size w/o using loops. The following I thought would do it: > > > p <- 6 > > Rmat <- diag(p) > > dat.cor <- rnorm(p*(p-1)/2) > > Rmat[outer(1:p, 1:p, "<")] <- Rmat[outer(1:p, 1:p, ">")] <- dat.cor > > > However, the problem is that the matrix is filled by column and > so the resulting matrix is not symmetric. > >Could you provide more detail on the properties of the symmetric >matrices that you would like to generate? It seems that you are >trying to generate correlation matrices. Is that the case? Do you >wish the matrices to be a random sample from a specific distribution? >If so, what distribution? > >Yes, my goal is to generate correlation matrices whose entries have >been sampled independently from a normal with a specified mean and variance. Would it be sufficient to use one of the results of RSiteSearch("random multivariate normal", restrict = "functions") or have I completely misunderstood what you want? (I appreciate this is not exactly what you say you want.) >Thanks for the help. > >Greg > > >- > > [[alternative HTML version deleted]] Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
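For reference, the original filling problem has a direct fix: fill one triangle, then mirror it with a transpose so the entries correspond. As the poster intended, the off-diagonal entries are plain normal draws, so the result is symmetric but not guaranteed to be a valid correlation matrix.

```r
set.seed(42)
p <- 6
Rmat <- diag(p)
dat.cor <- rnorm(p * (p - 1) / 2)

## Fill the lower triangle (by column), then copy it into the
## upper triangle via the transpose so the matrix is symmetric
Rmat[lower.tri(Rmat)] <- dat.cor
Rmat[upper.tri(Rmat)] <- t(Rmat)[upper.tri(Rmat)]

isSymmetric(Rmat)   # TRUE
```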
Re: [R] SLLOOOWWW function ...
At 12:32 17/07/2007, Johannes Graumann wrote: >Does anybody have any insight into how to make this faster? I am not an expert on R programming by any means but I notice you are growing your new data frame row by row. I believe it is normally recommended to allocate enough space to start with. >I suspect, that the rounding going on may be an issue, as is the stepping >through data frame rows using integers ... > >If you have the patience to teach a noob, he will highly appreciate it ;0) > >Joh > >digit <- 4 >for (minute in seq(from=25,to=lrange[2])){ > # Extract all data associtaed with the current time (minute) > frame <- subset(mylist,mylist[["Time"]] == minute) > # Sort by Intensity > frame <- frame[order(frame[["Intensity"]],decreasing = TRUE),] > # Establish output frame using the most intense candidate > newframe <- frame[1,] > # Establish overlap-checking vector using the most intense candidate > lowppm <- round(newframe[1,][["Mass"]]-newframe[1, >[["Mass"]]/1E6*ppmrange,digits=digit) > highppm <- round(newframe[1,][["Mass"]]+newframe[1, >[["Mass"]]/1E6*ppmrange,digits=digit) > presence <- seq(from=lowppm,to=highppm,by=10^(-digit)) > # Walk through the entire original frame and check whether peaks are >overlap-free ... 
do so until max of 2000 entries > for (int in seq(from=2,to=nrow(frame))) { > if(nrow(newframe) < 2000) { > lowppm <- round(frame[int,][["Mass"]]-frame[int, >[["Mass"]]/1E6*ppmrange,digits=digit) > highppm <- round(frame[int,][["Mass"]]+frame[int, >[["Mass"]]/1E6*ppmrange,digits=digit) > windowrange <- seq(from=lowppm,to=highppm,by=10^(-digit)) > if (sum(round(windowrange,digits=digit) %in% >round(presence,digits=digit)) < 1) { > newframe <- rbind(newframe,frame[int,]) > presence <- c(presence,windowrange) > } > } else { > break() > } > } Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
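The preallocation point above can be illustrated in miniature: collect the pieces in a list and bind once at the end, instead of calling rbind() inside the loop. The toy frames here stand in for the poster's rows.

```r
rows <- lapply(1:10, function(i) data.frame(Mass = i, Intensity = runif(1)))

## Slow pattern from the original code: grow the frame one row at a time,
## which copies the whole accumulated frame on every iteration
grown <- rows[[1]]
for (i in 2:10) grown <- rbind(grown, rows[[i]])

## Faster pattern: one rbind over a pre-collected list
pre <- do.call(rbind, rows)
```

With 2000 rows, as in the original code, the difference between the two patterns is substantial because the growing version is quadratic in the number of rows.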
Re: [R] Partial Proportional Odds model using vglm
At 13:01 16/07/2007, Rizwan Younis wrote: >Hi: >I am trying to fit a PPO model using vglm from VGAM, and get an error while >executing the code. You seem to keep posting the same problem. Since the only person who can tell you what is happening inside VGAM is probably the maintainer a more efficient strategy would be to email him as the instructions ask you to do. However if that fails then try simplifying your problem to see if the error goes away 1) try merging ages 1 and 2, and ages 4 and 5 2) try merging column 3 "2" with 2 "1" or 4 "3" >Here is the data, code, and error: > >Data = rc13, first row is the column names. a = age, and 1,2,3, 4 and 5 are >condition grades. > > a 1 2 3 4 5 > 1 1 0 0 0 0 > 2 84 2 7 10 2 > 3 16 0 6 6 2 > 4 13 0 3 4 0 > 5 0 0 0 1 0 > >Library(VGAM) > >rc13<-read.table("icg_rcPivot_group2.txt",header=F) >names(rc13)<-c("a","1","2","3","4","5") > >ppo<-vglm(cbind(rc13[,2],rc13[,3],rc13[,4],rc13[,5],rc13[,6])~a,family = >cumulative(link = logit, parallel = F , reverse = F),na.action=na.pass, >data=rc13) >summary(ppo) > >I get the following error: > >Error in "[<-"(`*tmp*`, , index, value = c(1.13512932539841, >0.533057528200189, : > number of items to replace is not a multiple of replacement length >In addition: Warning messages: >1: NaNs produced in: log(x) >2: fitted values close to 0 or 1 in: tfun(mu = mu, y = y, w = w, res = >FALSE, eta = eta, extra) >3: 19 elements replaced by 1.819e-12 in: checkwz(wz, M = M, trace = trace, >wzeps = control$wzepsilon) > >I will appreciate any help to fix this problem. >Thanks > >Reez You >Grad Student >University of Waterloo >Waterloo, ON Canada Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
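The collapsing suggested in points 1 and 2 can be done on the count table directly before refitting. A sketch using the table from the post (rows are ages, columns condition grades), merging ages 1+2 and 4+5:

```r
## Count table from the post: 5 ages x 5 condition grades
counts <- matrix(c( 1, 0, 0, 0, 0,
                   84, 2, 7, 10, 2,
                   16, 0, 6, 6, 2,
                   13, 0, 3, 4, 0,
                    0, 0, 0, 1, 0), nrow = 5, byrow = TRUE)

## Merge the sparse age groups 1+2 and 4+5 (point 1 above)
merged <- rbind(counts[1, ] + counts[2, ],
                counts[3, ],
                counts[4, ] + counts[5, ])
```

Refitting vglm() on the merged table would then show whether the sparse cells are what triggers the backsolve error; columns can be pooled the same way with cbind().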
Re: [R] Difference in linear regression results for Stata and R
At 17:17 12/07/2007, kdestler wrote: >Hi >I recently imported data from R into Stata. I then ran the linear >regression model I've been working on, only to discover that the results are >somewhat (though not dramatically) different. The standard errors vary more >between the two programs than do the coefficients themselves. Any >suggestions on what I've done that causes this mismatch? You really need to find a small example which exhibits the problem and then post that. >Thanks, >Kate >-- >View this message in context: >http://www.nabble.com/Difference-in-linear-regression-results-for-Stata-and-R-tf4069072.html#a11563283 >Sent from the R help mailing list archive at Nabble.com. Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Meta-Analysis of proportions
At 09:58 28/06/2007, Chung-hong Chan wrote: >OpenBUGS should be something related to Bayesian statistics. > >You may refer to Chapter 12 of Handbook >http://cran.r-project.org/doc/vignettes/HSAUR/Ch_meta_analysis.pdf >It talks about meta-regression. > > > >On 6/28/07, Monica Malta <[EMAIL PROTECTED]> wrote: >>Dear colleagues, >> >>I'm conducting a meta-analysis of studies evaluating adherence of >>HIV-positive drug users into AIDS treatment, therefore I'm looking >>for some advice and syntax suggestion for running the >>meta-regression using proportions, not the usual OR/RR frequently >>used on RCT studies. Monica, you have a number of options. 1 - weight each study equally 2 - weight each individual equally 3 - use the usual inverse variance procedure, possibly transforming the proportions first 4 - something else I have not thought of You could do 3 using rmeta which is available from CRAN. Programming 1 or 2 is straightforward. Of course, you do need to decide which corresponds to your scientific question. >>Have already searched several handbooks, R-manuals, mailing >>lists, professors, but... no clue at all... >> >>Has anyone already tried this? A colleague of mine recently >>published a similar study on JAMA, but he used OpenBUGS - a >>software I'm not familiar with... >> >>If there is any tip/suggestion for a possible syntax, could someone >>send me? I need to finish this paper before my PhD qualify, but I'm >>completely stuck... >> >>So, any tip will be more than welcome...I will really appreciate it!!! >> >>Thanks in advance and congrats on the amazing mailing-list. >> >> >> >>Bests from Rio de Janeiro, Brazil. 
>> >>Monica >> >> >> >> >> >>Monica Malta >>Researcher >>Oswaldo Cruz Foundation - FIOCRUZ >>Social Science Department - DCS/ENSP >>Rua Leopoldo Bulhoes, 1480 - room 905 >>Manguinhos >>Rio de Janeiro - RJ 21041-210 >>Brazil >>phone +55.21.2598-2715 >>fax +55.21.2598-2779 >> [[alternative HTML version deleted]] >> >>__ >>R-help@stat.math.ethz.ch mailing list >>https://stat.ethz.ch/mailman/listinfo/r-help >>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>and provide commented, minimal, self-contained, reproducible code. > > >-- >"The scientists of today think deeply instead of clearly. One must be >sane to think clearly, but one can think deeply and be quite insane." >Nikola Tesla >http://www.macgrass.com > > Michael Dewey [EMAIL PROTECTED] http://www.aghmed.fsnet.co.uk/home.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
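Option 3 in the reply can be hand-rolled in a few lines; a sketch with invented adherence counts, pooling logit-transformed proportions by inverse variance:

```r
## Invented adherence counts for three studies: events out of n
events <- c(20, 35, 12)
n      <- c(50, 80, 40)

## Logit-transform each study's proportion
lg <- log(events / (n - events))

## Approximate variance of a logit, and inverse-variance weights
v <- 1 / events + 1 / (n - events)
w <- 1 / v

## Pooled logit, back-transformed to the proportion scale
pooled <- sum(w * lg) / sum(w)
prop   <- plogis(pooled)
```

This is the fixed-effect version of option 3; rmeta (or a random-effects extension) would be the next step once the choice between options 1-3 is settled.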
Re: [R] merging dataframes with different row numbers
At 09:09 18/06/2007, Thomas Hoffmann wrote: >Dear R-Helpers, > >I have the following problem: > >I do have two data frames dat1 and dat2 with a common column BNUM >(long integer). dat1 has a larger number of BNUM than dat2 and >different rows of dat2 have equal BNUM. The numbers of rows in dat1 >and dat2 are not equal. I applied the tapply-function to dat2 with >BNUM as index. I would like to add the columns from dat1 to the results of > >b.sum <- tapply(dat2, BNUM, sum). > >However the BNUM of b.sum are only a subset of the dat1. > >Does anybody know an elegant way to solve the problem? If I understand you correctly, ?merge should help you here. >Thanks in advance > >Thomas H. > > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
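A toy sketch of the ?merge route (column names follow the post, the data are invented; aggregate() is used rather than tapply() so the per-BNUM sums arrive as a data frame ready for merging):

```r
dat1 <- data.frame(BNUM = 1:5, site = letters[1:5])
dat2 <- data.frame(BNUM = c(1, 1, 2, 4), val = c(10, 5, 3, 7))

## Sum dat2 within BNUM, then attach the sums to dat1,
## keeping dat1 rows that have no match (all.x = TRUE gives NA there)
b.sum <- aggregate(val ~ BNUM, data = dat2, sum)
out <- merge(dat1, b.sum, by = "BNUM", all.x = TRUE)
```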
Re: [R] Responding to a posting in the digest
At 08:26 14/06/2007, Moshe Olshansky wrote: >Is there a convenient way to respond to a particular >posting which is a part of the digest? >I mean something that will automatically quote the >original message, subject, etc. Yes, if you use appropriate mailing software. I use Eudora, receive the emails in MIME format and the responses do get properly threaded as far as I can see from the mailing list archives. >Thank you! > >Moshe Olshansky >[EMAIL PROTECTED] Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] export to a dat file that SAS can read
At 09:05 13/06/2007, Rina Miehs wrote: >Hello > >i have a data frame in R that some SAS users need to use in their >programs, and they want it in a dat file, is that possible? >and which functions to use for that? Does library(foreign) ?write.foreign get you any further forward? > >my data frame is like this: > > > out13[1:100,]
>        farid      niveau1       niveau3           p1           p3 antal1
> 2   10007995  0.0184891394  4.211306e-10 5.106471e-02 2.594580e-02      3
> 9   10076495  0.0140812953  3.858757e-10 1.065804e-01 3.743271e-02      3
> 10  10081892  0.0241760590  7.429612e-10 1.628295e-02 3.021538e-04      6
> 13  10101395  0.0319517576  3.257375e-10 2.365204e-03 6.633232e-02     19
> 16  10104692  0.0114040787  3.661169e-10 1.566721e-01 4.550663e-02      4
> 17  10113592  0.0167586526  4.229634e-10 6.922003e-02 2.543987e-02      2
> 18  10113697  0.0259205504  2.888646e-10 1.096366e-02 9.118995e-02      6
> 22  10121697 -0.0135341273 -5.507914e-10 1.157417e-01 5.501947e-03     16
> 28  10146495  0.0093514076  3.493487e-10 2.041883e-01 5.340801e-02      4
> 29  10150497  0.0091611504  3.455925e-10 2.089893e-01 5.531904e-02      4
> 36  10171895  0.0089116669  2.956742e-10 2.153844e-01 8.614259e-02      4
> 42  10198295  0.0078515166  3.147140e-10 2.437943e-01 7.314111e-02      5
>Thanks > >Rina > > > > > [[alternative HTML version deleted]] Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
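A sketch of the write.foreign() route (foreign ships with R): it writes a plain text data file plus a SAS program that reads it. The two-column frame below is a cut-down stand-in for out13.

```r
library(foreign)

## Cut-down stand-in for the poster's data frame
df <- data.frame(farid = c(10007995, 10076495), niveau1 = c(0.0185, 0.0141))

## write.foreign() produces a data file for SAS plus a .sas
## program the SAS users can run to read it in
datafile <- tempfile(fileext = ".dat")
codefile <- tempfile(fileext = ".sas")
write.foreign(df, datafile = datafile, codefile = codefile, package = "SAS")
```

If the SAS users only need a flat file and will write their own input step, plain write.table() with sep="\t" would also do.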
Re: [R] A function for raising a matrix to a power?
At 15:48 06/05/2007, Ron E. VanNimwegen wrote: >> Hi, >> >>Is there a function for raising a matrix to a power? For example if >>you like to compute A%*%A%*%A, is there an abbreviation similar to A3? >> >>Atte Tenkanen I may be revealing my ignorance here, but is MatrixExp in the msm package (available from CRAN) not relevant here? [snip other suggestion] Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
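A caveat on the suggestion above: MatrixExp in msm computes the matrix *exponential* e^A, which is not the same thing as an integer power. For something like A %*% A %*% A, repeated multiplication is enough; a small sketch:

```r
## k-th power of a square matrix by repeated multiplication
matpow <- function(A, k) Reduce(`%*%`, replicate(k, A, simplify = FALSE))

A <- matrix(c(2, 0, 1, 3), 2, 2)
identical(matpow(A, 3), A %*% A %*% A)   # TRUE
```

For very large k, repeated squaring would be more efficient, but for small powers this one-liner is clear and dependency-free.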
Re: [R] digest readability/ thunderbird email client/ R-help
At 21:00 28/04/2007, Martin Maechler wrote: >Hi Sam, I posted a request for information on how various mailers handled this so I could summarise the results and put it on a helpful webpage. Nobody was willing to sing the praises of his/her favourite mailer so the only one I can suggest is the one I use myself, Eudora, although that is now no longer under development. My offer is still open if anyone has anything to contribute. > >>>>> "SamMc" == Sam McClatchie <[EMAIL PROTECTED]> > >>>>> on Fri, 27 Apr 2007 11:21:56 -0700 writes: > > SamMc> System:Linux kernel 2.6.15 Ubuntu dapper > ... > > > SamMc> Has anyone figured out how to make the R-help digest > SamMc> more easily readable in the Thunderbird mail client? > SamMc> I think the digest used to be more readable than is > SamMc> currently the case with all of the messages as > SamMc> attachments. > > SamMc> I know the work around is to get individual messages > SamMc> rather than the digest, use another mail client, or > SamMc> just look at the archives on the web... > > {and there are at least two alternatives to the standard > archives, notably Gmane. > BTW: Just found an URL that should list all R lists carried > by them : http://dir.gmane.org/search.php?match=.r. > > >It had been pointed out more than once that Thunderbird >unfortunately is not adhering to the standard(s) of such >digests-- too bad it still is not. > >One alternative is to get the digest as "plain" digest >((which BTW another standard-complying digest format, that > e.g. 
emacs Rmail or VM can easily deal with)) >which will most probably just appear as one long e-mail in >Thunderbird, with "table of contents" of all the subject lines, >but nothing "clickable" > >Regards, >Martin > >Martin Maechler, ETH Zurich Michael Dewey [EMAIL PROTECTED] http://www.aghmed.fsnet.co.uk/home.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] graphs superimposed on pictures?
At 14:39 11/04/2007, Robert Biddle wrote: >Hi: > >I am doing some work that involves plotting points of interest >superimposed on photographs and maps. I can produce the plots fine >in R, but so far >I have had to do the superimposition externally, which makes it >tedious to do exploratory work. >I have looked to see if there is some capability to put a background >picture on a plot window, >but I have not found anything. >Advice, anyone? Although my situation was not exactly the same as yours you may find http://finzi.psych.upenn.edu/R/Rhelp02a/archive/42884.html a help >Cheers >Robert Biddle > > Michael Dewey http://www.aghmed.fsnet.co.uk __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
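In later versions of R (2.11.0 onwards) base graphics can do this directly: read the photograph into a raster object and draw it with rasterImage() before plotting the points. A sketch with a random raster standing in for the photo, drawn to a temporary PDF:

```r
## Random grey-level raster standing in for a photograph or map
img <- as.raster(matrix(runif(12), nrow = 3))

f <- tempfile(fileext = ".pdf")
pdf(f)
plot(0:10, 0:10, type = "n", xlab = "", ylab = "")
rasterImage(img, 0, 0, 10, 10)                         # picture as background
points(c(2, 5, 8), c(3, 6, 4), pch = 19, col = "red")  # points of interest
dev.off()
```

A real image would come in via a reader such as the png package's readPNG() rather than a random matrix.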
Re: [R] summary polr
At 14:41 20/02/2007, you wrote: Please do not just reply to me, 1 - I might not know 2 - it breaks the threading >Hi > >here there is an example extracted from polr help in MASS: > >The function could be: > >temp <- function(form, dat) { > house.plr <- polr(formula = > form, weights = Freq, data = dat) > coeff <- summary(house.plr) >return(coeff)} Why do you try to redefine the coeff extractor function? Try calling the results of summary summ or some other obvious name and I think you will find the problem goes away. See also > library(fortunes) > fortune("dog") Firstly, don't call your matrix 'matrix'. Would you call your dog 'dog'? Anyway, it might clash with the function 'matrix'. -- Barry Rowlingson R-help (October 2004) >the function can be called by: > > temp(Sat ~ Infl + Type + Cont, housing) > >where all data is available from MASS, as it is >an example in R Help on 'polr'. > >Results are: > > Re-fitting to get Hessian > >Error in eval(expr, envir, enclos) : object "dat" not found > >Paolo Accadia > > > >>> Michael Dewey <[EMAIL PROTECTED]> 20/02/07 1:43 PM >>> >At 15:21 19/02/2007, Paolo Accadia wrote: > >Hi all, > > > >I have a problem to estimate Std. Error and > >t-value by 'polr' in library MASS. > >They result from the summary of a polr object. > > > >I can obtain them working in the R environment > with the following statements: > > > > temp <- polr(formula = formula1, data = data1) > > coeff <- summary(temp), > > > >but when the above statements are enclosed in a > >function, summary reports the following error: > > > >Error in eval(expr, envir, enclos) : object "dat" not found > > > >Someone knows how I can solve the problem? > >By giving us a short reproducible example? > >Specifically we do not know: >1 - what formula1 is >2 - what the structure of data1 is >3 - what the enclosing function looks like >4 - what dat is > > > >Thanks for any help. 
> >Paolo > >Michael Dewey >http://www.aghmed.fsnet.co.uk Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
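One workaround worth recording here (a sketch against the MASS housing example, not Paolo's real data): ask polr() for the Hessian up front with Hess = TRUE, so that summary() does not need to re-fit the model inside the function's environment, which is where the 'object "dat" not found' error arises.

```r
library(MASS)

temp <- function(form, dat) {
  ## Hess = TRUE stores the Hessian with the fit, so summary()
  ## does not re-fit and never needs to find 'dat' again
  fit <- polr(formula = form, weights = Freq, data = dat, Hess = TRUE)
  summary(fit)
}

summ <- temp(Sat ~ Infl + Type + Cont, housing)
```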
Re: [R] summary polr
At 15:21 19/02/2007, Paolo Accadia wrote: >Hi all, > >I have a problem to estimate Std. Error and >t-value by 'polr' in library MASS. >They result from the summary of a polr object. > >I can obtain them working in the R environment with the following statements: > > temp <- polr(formula = formula1, data = data1) > coeff <- summary(temp), > >but when the above statements are enclosed in a >function, summary reports the following error: > >Error in eval(expr, envir, enclos) : object "dat" not found > >Someone knows how I can solve the problem? By giving us a short reproducible example? Specifically we do not know: 1 - what formula1 is 2 - what the structure of data1 is 3 - what the enclosing function looks like 4 - what dat is >Thanks for any help. >Paolo Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] enhanced question / standardized coefficients
At 15:51 07/02/2007, Brian S Cade wrote: >There was a nice paper in The American Statistician by Johan Bring (1994. >How to standardize regression coefficients. The American Statistician >48(3):209-213) pointing out that comparing ratios of t-test statistic >values (for null hypothesis that parameter = 0) is equivalent to comparing >ratios of standardized coefficients where standardization is based on the >partial (conditional) standard deviations of the parameter estimates. And >this is equivalent to thinking about the incremental improvement in >R-squared that is obtained by including a variable in the regression model >after all others are already in the model. It would seem possible to >extend this idea to categorical factor variables with more than 2 levels >(>1 indicator variable), given the relation between an F and t-test >statistic. You may also be interested in http://www.tfh-berlin.de/~groemp/rpack.html As well as her package relaimpo this also links to her article in JSS which I found thought provoking. >Any way something to think about, though there are no doubt still >limitations in trying to equate effects of variables measured on disparate >scales. > >Brian > >Brian S. Cade > >U. S. Geological Survey >Fort Collins Science Center >2150 Centre Ave., Bldg. C >Fort Collins, CO 80526-8818 > >email: [EMAIL PROTECTED] >tel: 970 226-9326 > > > >"John Fox" <[EMAIL PROTECTED]> >Sent by: [EMAIL PROTECTED] >02/07/2007 07:49 AM > >To >"'Simon P. Kempf'" <[EMAIL PROTECTED]> >cc >r-help@stat.math.ethz.ch >Subject >Re: [R] enhanced question / standardized coefficients > > > > > > >Dear Simon, > >In my opinion, standardized coefficients only offer the illusion of >comparison for quantitative explanatory variables, since there's no deep >reason that the standard deviation of one variable has the same meaning as >the standard deviation of another. 
Indeed, if the variables are in the >same >units of measurement in the first place, permitting direct comparison of >unstandardized coefficients, then separate standardization of each X is >like >using a rubber ruler. > >That said, as you point out, it makes no sense to standardize the dummy >regressors for a factor, so you can just standardize the quantitative >variables (Y and X's) in the regression equation. > >I hope that this helps, > John > > >John Fox >Department of Sociology >McMaster University >Hamilton, Ontario >Canada L8S 4M4 >905-525-9140x23604 >http://socserv.mcmaster.ca/jfox > > > > -Original Message- > > From: [EMAIL PROTECTED] > > [mailto:[EMAIL PROTECTED] On Behalf Of Simon P. Kempf > > Sent: Wednesday, February 07, 2007 9:27 AM > > To: r-help@stat.math.ethz.ch > > Subject: [R] enhanced question / standardized coefficients > > > > Hello, > > > > > > > > I would like to repost the question of Joerg: > > > > > > > > Hello everybody, > > > > a question that connect to the question of Frederik Karlsons > > about 'how to stand. betas' > > With the stand. betas i can compare the influence of the > > different explaning variables. What do i with the betas of > > factors? I can't use the solution of JohnFox, because there > > is no sd of an factor. How can i compare the influence of the > > factor with the influence of the numeric variables? > > > > I got the same problem. In my regression equation there are > > several categorical variables and I would like to compute > > the standard coefficients. How can I do this? > > > > > > > > Simon > > > > > > > > > > > > > > > > > > > > > >[[alternative HTML version deleted]] > > > > __ > > R-help@stat.math.ethz.ch mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > > http://www.R-project.org/posting-guide.html > > and provide commented, minimal, self-contained, reproducible code. 
> >__ >R-help@stat.math.ethz.ch mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. > > > [[alternative HTML version deleted]] Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
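John Fox's point about standardizing only where it makes sense can be sketched as follows: scale the response and the quantitative predictor, leave the factor alone, and the quantitative coefficient becomes the standardized one. Toy data, made-up names:

```r
set.seed(2)
d <- data.frame(y = rnorm(50), x = rnorm(50, sd = 3), g = gl(2, 25))

fit <- lm(y ~ x + g, data = d)

## Standardize y and x only; the factor g is left as it is
dstd <- transform(d, y = as.numeric(scale(y)), x = as.numeric(scale(x)))
fitstd <- lm(y ~ x + g, data = dstd)
```

Because standardizing is just an affine reparametrization, the new x coefficient equals the old one multiplied by sd(x)/sd(y), which is the usual standardized coefficient.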
Re: [R] How to find series of small numbers in a big vector?
At 01:49 30/01/2007, Ed Holdgate wrote: >Hello: > >I have a vector with 120,000 reals >between 0.0 and 0. > >They are not sorted but the vector index is the >time-order of my measurements, and therefore >cannot be lost. > >How do I use R to find the starting and ending >index of ANY and ALL the "series" or "sequences" >in that vector where ever there are 5 or more >members in a row between 0.021 and 0.029 ? You could look at rle which codes into runs >For example: > >search_range <- c (0.021, 0.029) # inclusive searching >search_length <- 5 # find ALL series of 5 members within search_range >my_data <- c(0.900, 0.900, 0.900, 0.900, 0.900, > 0.900, 0.900, 0.900, 0.900, 0.900, > 0.900, 0.028, 0.024, 0.027, 0.023, > 0.022, 0.900, 0.900, 0.900, 0.900, > 0.900, 0.900, 0.024, 0.029, 0.023, > 0.025, 0.026, 0.900, 0.900, 0.900, > 0.900, 0.900, 0.900, 0.900, 0.900, > 0.900, 0.900, 0.900, 0.900, 0.022, > 0.023, 0.025, 0.333, 0.027, 0.028, > 0.900, 0.900, 0.900, 0.900, 0.900) > >I seek the R program to report: >start_index of 12 and an end_index of 16 >-- and also -- >start_index of 23 and an end_index of 27 >because that is were there happens to be >search_length numbers within my search_range. > >It should _not_ report the series at start_index 40 >because that 0.333 in there violates the search_range. > >I could brute-force hard-code an R program, but >perhaps an expert can give me a tip for an >easy, elegant existing function or a tactic >to approach? > >Execution speed or algorithm performance is not, >for me in this case, important. Rather, I >seek an easy R solution to find the time windows >(starting & ending indicies) where 5 or more >small numbers in my search_range were measured >all in a row. > >Advice welcome and many thanks in advance. 
> >Ed Holdgate Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
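The rle() suggestion works directly on the poster's example: code the in-range test as a logical vector, run-length encode it, and keep the runs of TRUE with length 5 or more.

```r
my_data <- c(0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.028, 0.024, 0.027, 0.023,
             0.022, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.024, 0.029, 0.023,
             0.025, 0.026, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.900,
             0.900, 0.900, 0.900, 0.900, 0.022,
             0.023, 0.025, 0.333, 0.027, 0.028,
             0.900, 0.900, 0.900, 0.900, 0.900)

## Run-length encode the in-range indicator
r <- rle(my_data >= 0.021 & my_data <= 0.029)

## Start and end index of every run, then keep TRUE runs of length >= 5
ends  <- cumsum(r$lengths)
start <- ends - r$lengths + 1
keep  <- r$values & r$lengths >= 5
cbind(start = start[keep], end = ends[keep])   # rows 12-16 and 23-27
```

The run at index 40 is correctly rejected because the 0.333 splits it into runs of length 3 and 2.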
Re: [R] replicating the odds ratio from a published study
At 21:13 28/01/2007, Bob Green wrote: >Michael, > >Thanks. Yes, clearly the volume number for the Schanda paper I cited is wrong. > >Where things are a bit perplexing, is that I used the same method as >Peter suggested on two papers by Eronen (referenced below). I can >reproduce in R a similar odds ratio to the first published paper e.g >OR = 9.7 (CI= 7.4-12.6) whereas I obtained quite different results >from the second published paper (Eronen 2) of OR = 10.0 (8.1-12.5). >One reason why I wanted to work out the calculations was so I could >analyse data from studies using the same method, for confirmation. > >Now the additional issue, is that Woodward, who is also the author >of an epidemiological text, says in a review that Eronen used >wrong formula in a 1995 paper and indicates that this comment >applies also to later studies - he stated the "they use methods >designed for use with binomial data when they really have Poisson >data. Consequently, they quote odds ratios when they really have >relative rates and their confidence intervals are >inaccurate". Eronen1 cites the formula that was used for OR. >Schanda sets out his table for odds ratio the same as Eronen1 There do seem to be difficulties in what they are doing: they have not observed all the non-homicides, so they estimate how many there are and then estimate the number of people with a given diagnosis using prevalence estimates from another study. I think you are moving towards writing an article criticising the statistical methods used in this whole field, which goes beyond the resources of R-help. >For the present purpose, my primary question is: as you have now >seen the Schanda paper, would you consider Schanda calculated odds >or relative risk? > >Also, when I tried the formula suggested by Peter (below) I obtained >an error - do you know what M might be or the source of the error? 
> >exp(log(41*2936210/920/20068)+qnorm(c(.025,.975))*sqrt(sum(1/M))) >Error in sum(1/M) : object "M" not found > > > > eronen1 <- as.table(matrix(c(58,852,13600-58,1947000-13600-852), > ncol = 2 , dimnames = list(group=c("scz", "nonscz"), who= > c("sample", "population" > > fisher.test(eronen1) > > >p-value < 2.2e-16 >alternative hypothesis: true odds ratio is not equal to 1 >95 percent confidence interval: > 7.309717 12.690087 >sample estimates: >odds ratio > 9.713458 > > > eronen2 > <- as.table(matrix(c(86,1302,13530-86,1933000-13530-1302), ncol = > 2 , dimnames = list(group=c("scz", "nonscz"), who= c("sample", > "population" > > fisher.test(eronen2) > >p-value < 2.2e-16 >alternative hypothesis: true odds ratio is not equal to 1 >95 percent confidence interval: > 7.481272 11.734136 >sample estimates: >odds ratio >9.42561 > >References > >Eronen, M. et al. (1996 - 1) Mental disorders and homicidal behavior >in Finland. Archives of General Psychiatry, 53, 497-501 > >Eronen, M et al (1996 - 2). Schizophrenia & homicidal >behavior. Schizophrenia Bulletin, 22, 83-89 > >Woodward, Mental disorder & homicide. Epidemiologia E Psichiatria >Sociale, 9, 171-189 > >Any comments are welcomed, > >Bob > >At 01:57 PM 28/01/2007 +, Michael Dewey wrote: >>At 22:01 26/01/2007, Peter Dalgaard wrote: >>>Bob Green wrote: >>>>Peetr & Michael, >>>> >>>>I now see my description may have confused the issue. I do want >>>>to compare odds ratios across studies - in the sense that I want >>>>to create a table with the respective odds ratio for each study. >>>>I do not need to statistically test two sets of odds ratios. >>>> >>>>What I want to do is ensure the method I use to compute an odds >>>>ratio is accurate and intended to check my method against published sources. >>>> >>>>The paper I selected by Schanda et al (2004). Homicide and major >>>>mental disorders. Acta Psychiatr Scand, 11:98-107 reports a total >>>>sample of 1087. Odds ratios are reported separately for men and >>>>women. 
There were 961 men all of whom were convicted of homicide. >>>>Of these 961 men, 41 were diagnosed with schizophrenia. The >>>>unadjusted odds ratio is for this group of 41 is cited as >>>>6.52 (4.70-9.00). They also report the general population aged >>>>over 15 with schizophrenia =20,109 and the total population =2,957,239. >> >>Looking at the paper (which is in volume 110 by the way) suggests >>that Peter's reading of the situation is correct and that is what the authors have done.
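As to what M might be: my guess (and it is only a guess, Peter did not define it in the text quoted above) is that it held the four cell counts of the two by two table, so that sum(1/M) is the usual large-sample variance of the log odds ratio:

```r
# Assumed definition of M: the four cell counts of the 2 by 2 table
M <- c(41, 20068, 920, 2936210)
ci <- exp(log(41 * 2936210 / 920 / 20068) +
          qnorm(c(.025, .975)) * sqrt(sum(1/M)))
ci
# [1] 4.767384 8.918216  -- the interval Peter quoted
```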
Re: [R] replicating the odds ratio from a published study
At 22:01 26/01/2007, Peter Dalgaard wrote: >Bob Green wrote: >>Peetr & Michael, >> >>I now see my description may have confused the issue. I do want to >>compare odds ratios across studies - in the sense that I want to >>create a table with the respective odds ratio for each study. I do >>not need to statistically test two sets of odds ratios. >> >>What I want to do is ensure the method I use to compute an odds >>ratio is accurate and intended to check my method against published sources. >> >>The paper I selected by Schanda et al (2004). Homicide and major >>mental disorders. Acta Psychiatr Scand, 11:98-107 reports a total >>sample of 1087. Odds ratios are reported separately for men and >>women. There were 961 men all of whom were convicted of homicide. >>Of these 961 men, 41 were diagnosed with schizophrenia. The >>unadjusted odds ratio is for this group of 41 is cited as >>6.52 (4.70-9.00). They also report the general population aged >>over 15 with schizophrenia =20,109 and the total population =2,957,239. Looking at the paper (which is in volume 110 by the way) suggests that Peter's reading of the situation is correct and that is what the authors have done. >>Any further clarification is much appreciated, >> >> >A fisher.test on the following matrix seems about right: > > matrix(c(41,920,20109-41,2957239-20109-920),2) > > [,1][,2] >[1,] 41 20068 >[2,] 920 2936210 > > > fisher.test(matrix(c(41,920,20109-41,2957239-20109-920),2)) > >Fisher's Exact Test for Count Data > >data: matrix(c(41, 920, 20109 - 41, 2957239 - 20109 - 920), 2) >p-value < 2.2e-16 >alternative hypothesis: true odds ratio is not equal to 1 >95 percent confidence interval: >4.645663 8.918425 >sample estimates: >odds ratio > 6.520379 > >The c.i. is not precisely the same as your source. 
This could be >down to a different approximation (R's is based on the noncentral >hypergeometric distribution), but the classical asymptotic formula gives > > > exp(log(41*2936210/920/20068)+qnorm(c(.025,.975))*sqrt(sum(1/M))) >[1] 4.767384 8.918216 > >which is closer, but still a bit narrower. > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] replicating the odds ratio from a published study
At 09:04 26/01/2007, Bob Green wrote: >I wanted to compare odds ratio across studies and tried to replicate >the results from a study but have not been able to determine how to >do this in R. > >The study reported a sample of 961 men, of whom 41 had a disorder. >The reported raw odds ratio was 6.52 (4.70-9.00) For an odds ratio you require two odds from which you form the odds ratio. You only have one odds. Do you have another one lying around somewhere? >I did a search of the archives and came across script that looks >like it should perform this task, however the results I obtained did not match > > > prop.test(41,961) > > 1-sample proportions test with continuity correction > >data: 41 out of 961, null probability 0.5 >X-squared = 802.1686, df = 1, p-value < 2.2e-16 >alternative hypothesis: true p is not equal to 0.5 >95 percent confidence interval: > 0.03115856 0.05795744 >sample estimates: > p >0.04266389 > > > > > ci.p <- prop.test(920, 961)$conf > > ci.odds <- ci.p/(1-ci.p) > > ci.odds >[1] 16.25404 31.09390 >attr(,"conf.level") >[1] 0.95 > > >Any advice regarding the script I require is appreciated, > >regards > >bob Green > > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
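To make the point concrete, here is a sketch of what forming the second odds might look like, using general population figures of the kind which emerged later in this thread (the numbers are illustrative, not a reanalysis of the paper):

```r
# offenders: 41 of 961 men convicted of homicide had the diagnosis
# population: 20109 of 2957239 had the diagnosis (the assumed second group)
tab <- matrix(c(41, 961 - 41,
                20109, 2957239 - 20109), ncol = 2)
odds <- tab[1, ] / tab[2, ]      # odds of diagnosis in each group
or <- unname(odds[1] / odds[2])  # crude odds ratio, roughly 6.5
or
```

The point is simply that without the second column there is nothing to divide by.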
Re: [R] selecting rows for inclusion in lm
At 08:19 18/01/2007, David Barron wrote: >Why not use the subset option? Something like: > >lm(diff ~ Age + Race, data=data, subset=data$Meno=="PRE") > >should do the trick, and be much easier to read! And indeed the advice in > library(fortunes) > fortune("dog") Firstly, don't call your matrix 'matrix'. Would you call your dog 'dog'? Anyway, it might clash with the function 'matrix'. -- Barry Rowlingson R-help (October 2004) > also helps to make life clearer I find >On 18/01/07, John Sorkin <[EMAIL PROTECTED]> wrote: >>I am having trouble selecting rows of a dataframe that will be included >>in a regression. I am trying to select those rows for which the variable >>Meno equals PRE. I have used the code below: >> >>difffitPre<-lm(data[,"diff"]~data[,"Age"]+data[,"Race"],data=data[data[,"Meno"]=="PRE",]) >>summary(difffitPre) >> >>The output from the summary indicates that more than 76 rows are >>included in the regression: >> >>Residual standard error: 2.828 on 76 degrees of freedom >> >>where in fact only 22 rows should be included as can be seen from the >>following: >> >>print(data[length(data[,"Meno"]=="PRE","Meno"])) >>[1] 22 >> >>I would appreciate any help in modifying the data= parameter of the lm >>so that I include only those subjects for which Meno=PRE. >> >>R 2.3.1 >>Windows XP >> >>Thanks, >>John >> >>John Sorkin M.D., Ph.D. >>Chief, Biostatistics and Informatics >>Baltimore VA Medical Center GRECC, >>University of Maryland School of Medicine Claude D. 
Pepper OAIC, >>University of Maryland Clinical Nutrition Research Unit, and >>Baltimore VA Center Stroke of Excellence >> >>University of Maryland School of Medicine >>Division of Gerontology >>Baltimore VA Medical Center >>10 North Greene Street >>GRECC (BT/18/GR) >>Baltimore, MD 21201-1524 >> >>(Phone) 410-605-7119 >>(Fax) 410-605-7913 (Please call phone number above prior to faxing) >>[EMAIL PROTECTED] >> >>Confidentiality Statement: >>This email message, including any attachments, is for the so...{{dropped}} >> >>__ >>R-help@stat.math.ethz.ch mailing list >>https://stat.ethz.ch/mailman/listinfo/r-help >>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >>and provide commented, minimal, self-contained, reproducible code. > > >-- >= >David Barron >Said Business School >University of Oxford >Park End Street >Oxford OX1 1HP > > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to format R code in LaTex documents
At 14:12 15/01/2007, Frank E Harrell Jr wrote: >Benjamin Dickgiesser wrote: >>Hi, >>I am planning on putting some R script in an appendix of a LaTex >>document. Can anyone recommend me a way of how to format it? Is there >>a way to keep all line breaks without having to insert \\ in every >>single line? >>Thank you! >>Benjamin > >Here's one way and I would appreciate anyone's >improvements. I've also included solutions from >two others. Please let me know what you decide to use. -Frank Some interesting ideas below, so I shall just add that I often find I have to use the linewidth parameter to avoid lines wrapping. This is valuable if you are also going to include bits of R output (you can control the width of your own code fragments, of course). >\usepackage{listings,relsize} >\lstloadlanguages{R} >\lstset{language=R,basicstyle=\smaller[2],commentstyle=\rmfamily\smaller, > showstringspaces=false,% > xleftmargin=4ex,literate={<-}{{$\leftarrow$}}1 {~}{{$\sim$}}1} >\lstset{escapeinside={(*}{*)}} % for (*\ref{ }*) inside lstlistings (S code) > >. . . >\begin{lstlisting} >. . . S code . . . >\end{lstlisting} > >The following code was provided by Vincent Goulet: > > >listings is a great package to highlight R keywords and comments and --- that >was my main use of the package --- index those keywords. I found that I had >to slightly redefine the list of keywords included in listings. I still did >not take the time to submit a patch to the author, though... 
> >In any case, here's what I use, if it can be of any help to anyone: > >\lstloadlanguages{R} >\lstdefinelanguage{Renhanced}[]{R}{% > morekeywords={acf,ar,arima,arima.sim,colMeans,colSums,is.na,is.null,% > mapply,ms,na.rm,nlmin,replicate,row.names,rowMeans,rowSums,seasonal,% > sys.time,system.time,ts.plot,which.max,which.min}, > deletekeywords={c}, > alsoletter={.\%},% > alsoother={:_\$}} >\lstset{language=Renhanced,extendedchars=true, > basicstyle=\small\ttfamily, > commentstyle=\textsl, > keywordstyle=\mdseries, > showstringspaces=false, > index=[1][keywords], > indexstyle=\indexfonction} > >with > > [EMAIL PROTECTED] > >-- Vincent Goulet, Associate Professor École >d'actuariat Université Laval, Québec >[EMAIL PROTECTED] http://vgoulet.act.ulaval.ca > >Anupam Tyagi provided the following: > >\documentclass{report} >\usepackage{listings} >\begin{document} > >Somethings . > >\lstset{% general command to set parameter(s) >basicstyle=\small, % print whole in small >stringstyle=\ttfamily, % typewriter type for strings >numbers=left, % numbers on the left >numberstyle=\tiny, % Tiny numbers >stepnumber=2, % number every second line of code >numbersep=5pt, % 5pt seperation between numbering and code listing >language=R } > >\lstinputlisting{text1.R} > >\end{document} > > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] proc GLM with R
At 05:15 18/12/2006, Cressoni, Massimo \(NIH/NHLBI\) [F] wrote: >I want to migrate from SAS to R. >I used proc mixed to do comparison between multiple groups and to perform >multiple comparison between groups since, as far as I know, proc >mixed does not make assumptions about the data and so >it is better than a simple anova (data must only be normal). >Es. how can I translate a code like this (two way anova with a factor of >repetition) : > > >proc mixed; >class kind PEEP codice; >model PaO2_FiO2 = kind PEEP kind*PEEP; >repeated /type = un sub=codice; >lsmeans kind*PEEP /adjust=bon; >run; > >codice is a unique identifier of patient >kind is a variable which subdivided the patient (i.e. red or brown hairs) >PEEP is positive end expiratory pressure. These are the steps of a clinical >trial. Patient did the trial at PEEP = 5 and PEEP = 10 You could investigate either nlme or lme4. The best documentation for nlme (which should be included in your system) is @BOOK{pinheiro00, author = {Pinheiro, J C and Bates, D M}, year = 2000, title = {Mixed-effects models in {S} and {S-PLUS}}, publisher = {Springer-Verlag}, address = {New York}, keywords = {glm; mixed models} } lme4 is a more recent development by Bates which as yet has slightly fewer helper functions and no book. Since you are assuming normal error you can use nlme. I am afraid I do not read SAS so I think it would be wrong of me to try to translate your example (traduttore, traditore, and all that). >Thank you > >Massimo Cressoni > >run; Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
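Since I do not read SAS I shall not attempt a faithful translation, but for what it is worth here is a rough sketch, on simulated data, of the kind of call nlme expects. Note that a random intercept per patient is simpler than the unstructured covariance your repeated statement asks for, so this illustrates the syntax only:

```r
library(nlme)   # one of the recommended packages shipped with R

# simulated stand-in for the real data: 10 patients, each measured
# at PEEP = 5 and PEEP = 10, with kind fixed within patient
set.seed(1)
dat <- expand.grid(codice = factor(1:10), PEEP = factor(c(5, 10)))
dat$kind <- factor(ifelse(as.integer(dat$codice) <= 5, "red", "brown"))
dat$PaO2_FiO2 <- 300 + rnorm(nrow(dat), sd = 20)

# kind * PEEP expands to both main effects plus the interaction,
# much like 'model PaO2_FiO2 = kind PEEP kind*PEEP'
fit <- lme(PaO2_FiO2 ~ kind * PEEP, random = ~ 1 | codice, data = dat)
anova(fit)
```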
Re: [R] ifelse misusage become more and more frequent...
At 10:42 13/12/2006, Martin Maechler wrote: > >>>>> "[EMAIL PROTECTED]" == [EMAIL PROTECTED] fr <[EMAIL PROTECTED]> > >>>>> on Tue, 12 Dec 2006 22:24:33 +0100 writes: > > [EMAIL PROTECTED]> ...ifelse, a function of three **vector** > [EMAIL PROTECTED]> arguments Yes !! I misunderstood the > [EMAIL PROTECTED]> functioning of ifelse. > >Seems to happen more an more often. >When I teach "R programming" I nowadays usually emphasize that people >should often *NOT* use ifelse(). >In other words, I think ifelse() is much over-used in situations >where something else would be both clearer and more efficient. > >Is there a document / book around which lures people into >misusing ifelse() so frequently? Perhaps it is because two concepts are involved here which may be less familiar to people: program control and vectorisation. I wonder whether the manual pages for ifelse and if need to do more than just 'See also' the other one. I notice that the page for if has a very helpful health warning about putting braces round everything. This is unusual in the manual pages, which usually describe what is rather than tell you explicitly what to do to achieve salvation. Perhaps I could suggest that the if page tell you what error message you get when you wanted ifelse, and that the ifelse page carry a health warning along the lines of 'ifelse is not if, and the two are often confused, with difficult-to-understand results'. [snip] Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
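A minimal illustration of the two concepts, for what it is worth:

```r
x <- c(-2, 0, 3)

# ifelse() is vectorised: it returns a value for every element of the test
res <- ifelse(x > 0, "pos", "nonpos")
res
# [1] "nonpos" "nonpos" "pos"

# if() controls program flow and wants a single TRUE or FALSE; handing it
# a vector condition such as x > 0 is the classic confusion
msg <- if (any(x > 0)) "at least one positive" else "none positive"
msg
# [1] "at least one positive"
```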
Re: [R] Multiple Imputation / Non Parametric Models / Combining Results
At 08:12 08/12/2006, Simon P. Kempf wrote: >Dear R-Users, > > > >The following question is more of general nature than a merely technical >one. Nevertheless I hope someone get me some answers. > > I am in no sense an expert in this area, but since it seems that no one else has answered so far, I wonder whether the mitools package from CRAN helps? >I have been using the mice package to perform the multiple imputations. So >far, everything works fine with the standard regressions analysis. > > > >However, I am wondering, if it is theoretically correct to perform >nonparametric models (GAM, spline smoothing etc.) with multiple imputed >datasets. If yes, how can I combine the results in order to show the >uncertainty? > > > >In the research field of real estate economics, the problem of missing data >is often ignored respectively unmentioned. However, GAM, spline smoothing >etc. become increasingly popular. In my research, I would like to use >multiple imputed datasets and GAM, but I am unsure how present single >results. > > > >Again I want to apologize that this is a rather theoretical statistical >question than a technical question on R. > > > >Thanks in advance for any hints and advices. > > > >Simon > > > > > > > > > > > >Simon P. Kempf > >Dipl.-Kfm. MScRE Immobilienökonom (ebs) > >Wissenschaftlicher Assistent > > > >Büro: > >IREBS Immobilienakademie > >c/o ebs Immobilienakademie GmbH > >Berliner Str. 26a > >13507 Berlin > > > >Privat: > >Dunckerstraße 60 > >10439 Berlin > > > >Mobil: 0176 7002 6687 > >Email: <mailto:[EMAIL PROTECTED]> [EMAIL PROTECTED] > > > > > [[alternative HTML version deleted]] Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
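Whichever package you settle on, the combining step itself is just Rubin's rules, which apply to any quantity for which each imputed dataset yields an estimate and a variance (with some care, this includes a coefficient or a fitted value from a GAM). A rough base R sketch with made-up numbers:

```r
# Rubin's rules: combine an estimate and its variance across m imputations
combine_mi <- function(est, var_within) {
  m <- length(est)
  qbar <- mean(est)               # pooled estimate
  ubar <- mean(var_within)        # average within-imputation variance
  b <- var(est)                   # between-imputation variance
  total <- ubar + (1 + 1/m) * b   # total variance
  c(estimate = qbar, se = sqrt(total))
}

# made-up estimates and variances from m = 3 imputed datasets
combine_mi(c(1.0, 1.2, 0.9), c(0.040, 0.050, 0.045))
# estimate about 1.03, se about 0.28
```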
Re: [R] beginning my R-learning
At 22:16 04/12/2006, Michael McCulloch wrote: >Hello, >I'm just beginning to learn R. What books/online learning modules >with datasets would you suggest? >Thank you! If a) you already know some statistics b) you want to use R as a tool in your applied statistical work then @BOOK{venables02, author = {Venables, W N and Ripley, B D}, year = 2002, title = {Modern applied statistics with {S}}, publisher = {Springer-Verlag}, address = {New York}, keywords = {statistics, general, software} } is worth considering. It is quite terse though (it is one of the few books I have which I wish were longer) and not so suitable if you are also learning statistics at the same time. >Best wishes, >Michael > > > > >Michael McCulloch >Pine Street Clinic >Pine Street Foundation >124 Pine Street, San Anselmo, CA 94960-2674 >tel 415.407.1357 >fax 415.485.1065 >email: [EMAIL PROTECTED] >web:www.pinest.org > www.pinestreetfoundation.org > www.medepi.net/meta > > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] prop.trend.test issue
At 13:46 03/12/2006, Ethan Johnsons wrote: >I don't find any other test avail for this? >Am I missing something? I do not want to seem unhelpful but the only response that springs to mind is a knowledge of statistics. I hope people's lives are not at stake with the results of your analysis >thx > >ej > >On 12/3/06, Michael Dewey <[EMAIL PROTECTED]> wrote: >>At 05:55 03/12/2006, Ethan Johnsons wrote: >> >I have the clinical study data. >> > >> >Year 0 Year 3 >> >Retinol (nmol/L)N Mean +-sd Mean +-sd >> >Vitamin A group 73 1.89+-0.36 2.06+-0.53 >> >Trace group57 1.83+-0.31 1.78+-0.30 >> > >> >where N is the number of male for the clinical study. >> > >> >I want to test if the mean serum retinol has increased over 3 years >> >among subjects in the vitamin A group. >> >>If you want to test means why did you think a test for proportions >>was a good idea? >> >> >> >> 1.89+0.36 >> >[1] 2.25 >> >>1.89-0.36 >> >[1] 1.53 >> >>2.06+0.53 >> >[1] 2.59 >> >>2.06-0.53 >> >[1] 1.53 >> > >> > >> >>prop.trend.test(c(2.25, 1.53),c( 2.59, 1.53)) >> > >> >Chi-squared Test for Trend in Proportions >> > >> >data: c(2.25, 1.53) out of c(2.59, 1.53) , >> >using scores: 1 2 >> >X-squared = 0.2189, df = 1, p-value = 0.6399 >> > >> >The issue I am seeing that N of Vitamin A group = 73 seems not reflected. >> >This leads me to think that I can't implement the test based on the >> >data just presented. >> >Nor a two-tailed test is possible. >> > >> >2-sample test for equality of proportions with continuity >> correction >> > >> >data: c(2.25, 1.53) out of c(2.59, 1.53) >> >X-squared = 0, df = 1, p-value = 1 >> >alternative hypothesis: two.sided >> >95 percent confidence interval: >> >-0.6738203 0.4112720 >> >sample estimates: >> > prop 1prop 2 >> >0.8687259 1.000 >> > >> >Warning message: >> >Chi-squared approximation may be incorrect in: prop.test(c(2.25, >> >1.53), c(2.59, 1.53)) >> > >> >Any ideas, please? 
>> > >> >thx >> > >> >ej >> > >> > >> >>Michael Dewey >>http://www.aghmed.fsnet.co.uk >> > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] prop.trend.test issue
At 05:55 03/12/2006, Ethan Johnsons wrote: >I have the clinical study data. > >Year 0 Year 3 >Retinol (nmol/L)N Mean +-sd Mean +-sd >Vitamin A group 73 1.89+-0.36 2.06+-0.53 >Trace group57 1.83+-0.31 1.78+-0.30 > >where N is the number of male for the clinical study. > >I want to test if the mean serum retinol has increased over 3 years >among subjects in the vitamin A group. If you want to test means why did you think a test for proportions was a good idea? >> 1.89+0.36 >[1] 2.25 >>1.89-0.36 >[1] 1.53 >>2.06+0.53 >[1] 2.59 >>2.06-0.53 >[1] 1.53 > > >>prop.trend.test(c(2.25, 1.53),c( 2.59, 1.53)) > >Chi-squared Test for Trend in Proportions > >data: c(2.25, 1.53) out of c(2.59, 1.53) , >using scores: 1 2 >X-squared = 0.2189, df = 1, p-value = 0.6399 > >The issue I am seeing that N of Vitamin A group = 73 seems not reflected. >This leads me to think that I can't implement the test based on the >data just presented. >Nor a two-tailed test is possible. > >2-sample test for equality of proportions with continuity correction > >data: c(2.25, 1.53) out of c(2.59, 1.53) >X-squared = 0, df = 1, p-value = 1 >alternative hypothesis: two.sided >95 percent confidence interval: >-0.6738203 0.4112720 >sample estimates: > prop 1prop 2 >0.8687259 1.000 > >Warning message: >Chi-squared approximation may be incorrect in: prop.test(c(2.25, >1.53), c(2.59, 1.53)) > >Any ideas, please? > >thx > >ej > > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
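For completeness, if the question really is about the change in means, a Welch two-sample t statistic can be computed from the published summary statistics alone; note this ignores the pairing (the same 73 men were measured at both years), so it is a rough sketch rather than the correct paired analysis:

```r
# Welch t test from summary statistics (hand-rolled, not a stats function)
t_from_summary <- function(m1, s1, n1, m2, s2, n2) {
  se <- sqrt(s1^2/n1 + s2^2/n2)
  tval <- (m2 - m1) / se
  df <- se^4 / ((s1^2/n1)^2 / (n1 - 1) + (s2^2/n2)^2 / (n2 - 1))
  c(t = tval, df = df, p = 2 * pt(-abs(tval), df))
}

# vitamin A group, year 0 versus year 3: mean 1.89 (sd 0.36) vs 2.06 (sd 0.53)
t_from_summary(1.89, 0.36, 73, 2.06, 0.53, 73)
# t about 2.27, p about 0.025 -- but a proper paired analysis needs the raw data
```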
Re: [R] Is there a better way for inputing data manually?
At 04:40 02/12/2006, Duncan Murdoch wrote: >On 12/1/2006 11:00 PM, Wei-Wei Guo wrote: >>Dear All, >>I have worked with R for some time. It's a great tool for data >>analysis. But it's too hard to inputing raw data manually with R (I >>don't mean importing data. R is good at importing data). Maybe it's >>not a focused topic in this list, but I don't know other place >>where I can ask the question. How do you do when inputing data from >>a paper material, such as questionnaire, or what do you use ? > >I would not use R for this. Depending on how many questionnaires I >had, from small number to large, I would use: > >1. A text file. >2. A spreadsheet, like Excel, or the OpenOffice one, or the R data editor. >3. A custom program written specifically to handle the particular >questionnaire. > >You can do 1 and 2 in R, but you can't do them as well as programs >dedicated to those tasks, and you can't do 3 at all well. It >depends a lot on the specific conventions of the platform you're >working on. R is aimed at writing cross-platform programs, and >isn't particularly good at writing GUI programs, which is what you >want here. I would personally use Delphi for this, but there are >lots of alternatives. I usually use Epidata, available from http://www.epidata.dk/, which is free and enables you to write your own questionnaire file and then input your data. It also enables you to implement checks during data entry which in my view is an important feature. It enables export in a number of formats but its native output format, the .rec file, can be read into R using the foreign package. >Duncan Murdoch > > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] GLM and LM singularities
At 23:12 26/11/2006, Tim Sippel wrote: >Hi- > >I'm wrestling with some of my data apparently not being called into a GLM or >an LM. I'm looking at factors affecting fish annual catch rates (ie. CPUE) >over 30 years. Two of the factors I'm using are sea surface temperature and >sea surface temperature anomaly. A small sample of my data is below: > > I could not make much sense of the dataset and your output is incomplete (what happened to Year?) but usually the message you received means that one of your predictors is a linear combination of some of the others.
>CPUE      Year  Vessel_ID  Base_Port  Boat_Lgth  Planing       SST   Anomaly
>0.127     1976   2         BOI        40         Planing       19.4  -0.308
>0.059     1976  30         BOI        40         Displacement  19.4  -0.308
>0.03      1977  46         BOI        45         Displacement  18.5  -1.008
>0.169231  1977   2         BOI        40         Planing       18.5  -1.008
>0.044118  1977  19         BOI        34         Planing       18.5  -1.008
>0.114286  1977  29         WHANGAROA  42         Displacement  18.5  -1.008
>Have defined Year, Vessel_ID, Base_Port, Boat_Lgth, Planing as factors, and >CPUE, SST, Anomaly are continuous variables. > > > >The formula I'm calling is: glm(formula = log(CPUE) ~ Year + Vessel_ID + >SST, data = marlin). A summary of the glm includes the following output: > > > >Coefficients: (1 not defined because of singularities) > > Estimate Std. Error t value Pr(>|t|) > >Vessel_ID80 -0.54093 0.23380 -2.314 0.021373 * > >Vessel_ID83 -0.36499 0.20844 -1.751 0.080977 . > >Vessel_ID84 -0.23386 0.19098 -1.225 0.221718 > >SST NA NA NA NA > > > >Can someone explain the output "Coefficients: (1 not defined because of >singularities)"? What does this mean? I'm guessing that something to do >with my SST data is causing a "singularity" but I don't know where to go >from there? Have inspected various R discussions on this, but I haven't >found the information I need. 
> > > >Many thanks, > > > >Tim Sippel (MSc) > >School of Biological Sciences > >Auckland University > >Private Bag 92019 > >Auckland 1020 > >New Zealand > >+64-9-373-7599 ext. 84589 (work) > >+64-9-373-7668 (Fax) > >+64-21-593-001 (mobile) > > > > > [[alternative HTML version deleted]] Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
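A tiny reproducible illustration of that message, with a deliberately collinear predictor. My guess (only a guess from the sample shown) is that in the real data SST is constant within each Year, which would make SST an exact linear combination of the Year dummy variables:

```r
set.seed(42)
d <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
d$x3 <- d$x1 + d$x2          # x3 is an exact linear combination
d$y <- rnorm(20)

fit <- lm(y ~ x1 + x2 + x3, data = d)
coef(fit)["x3"]              # NA: "1 not defined because of singularities"
alias(fit)                   # shows the dependency: x3 = x1 + x2
```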
Re: [R] random effect question and glm
At 06:28 26/11/2006, you wrote: >Below is the output for p5.random.p,p5.random.p1 and m0 >My question is >in p5.random.p, variance for P is 5e-10. >But in p5.random.p1,variance for P is 0.039293. >Why they are so different? Please do as the posting guide asks and reply to the list, not just the individual. a - I might not know the answer b - the discussion might help others You give very brief details of the underlying problem so it is hard to know what information will help you most. Remember, if a computer estimates a non-negative quantity as very small perhaps it is really zero. I think you might benefit from reading Pinheiro and Bates, again see the posting list. >Is that wrong to write Y~P+(1|P) if I consider P as random effect? I suppose terminology differs but I would have said the model with Y~1+(1|P) was a random intercept model >Also in p5.random.p and m0, it seems that there are little >difference in estimate for P. Why? > >thanks, > >Aimin Yan > > > p5.random.p <- > lmer(Y~P+(1|P),data=p5,family=binomial,control=list(usePQL=FALSE,msV=1)) > > summary(p5.random.p) >Generalized linear mixed model fit using Laplace >Formula: Y ~ P + (1 | P) >Data: p5 > Family: binomial(logit link) > AIC BIC logLik deviance > 1423 1452 -705.4 1411 >Random effects: > Groups NameVariance Std.Dev. > P (Intercept) 5e-102.2361e-05 >number of obs: 1030, groups: P, 5 > >Estimated scale (compare to 1 ) 0.938 > >Fixed effects: > Estimate Std. Error z value Pr(>|z|) >(Intercept) 0.153404 0.160599 0.9552 0.3395 >P8ABP -0.422636 0.202181 -2.0904 0.0366 * >P8adh0.009634 0.194826 0.0495 0.9606 >P9pap0.108536 0.218875 0.4959 0.6200 >P9RNT0.376122 0.271957 1.3830 0.1667 >--- >Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 
0.1 ' ' 1 > >Correlation of Fixed Effects: > (Intr) P8ABP P8adh P9pap >P8ABP -0.794 >P8adh -0.824 0.655 >P9pap -0.734 0.583 0.605 >P9RNT -0.591 0.469 0.487 0.433 > > p5.random.p1 <- > lmer(Y~1+(1|P),data=p5,family=binomial,control=list(usePQL=FALSE,msV=1)) > > summary(p5.random.p1) >Generalized linear mixed model fit using Laplace >Formula: Y ~ 1 + (1 | P) >Data: p5 > Family: binomial(logit link) > AIC BIC logLik deviance > 1425 1435 -710.6 1421 >Random effects: > Groups NameVariance Std.Dev. > P (Intercept) 0.039293 0.19823 >number of obs: 1030, groups: P, 5 > >Estimated scale (compare to 1 ) 0.9984311 > >Fixed effects: > Estimate Std. Error z value Pr(>|z|) >(Intercept) 0.1362 0.1109 1.2280.219 > > > m0<-glm(Y~P,data=p5,family=binomial(logit)) > > summary(m0) > >Call: >glm(formula = Y ~ P, family = binomial(logit), data = p5) > >Deviance Residuals: > Min 1Q Median 3Q Max >-1.4086 -1.2476 0.9626 1.1088 1.2933 > >Coefficients: > Estimate Std. Error z value Pr(>|z|) >(Intercept) 0.154151 0.160604 0.960 0.3371 >P8ABP -0.422415 0.202180 -2.089 0.0367 * >P8adh0.009355 0.194831 0.048 0.9617 >P9pap0.108214 0.218881 0.494 0.6210 >P9RNT0.374693 0.271945 1.378 0.1683 >--- >Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > >(Dispersion parameter for binomial family taken to be 1) > > Null deviance: 1425.5 on 1029 degrees of freedom >Residual deviance: 1410.8 on 1025 degrees of freedom >AIC: 1420.8 > >Number of Fisher Scoring iterations: 4 > > >At 06:13 AM 11/24/2006, you wrote: >>At 21:52 23/11/2006, Aimin Yan wrote: >>>consider p as random effect with 5 levels, what is difference between these >>>two models? >>> >>> > p5.random.p <- lmer(Y >>>~p+(1|p),data=p5,family=binomial,control=list(usePQL=FALSE,msV=1)) >>> > p5.random.p1 <- lmer(Y >>>~1+(1|p),data=p5,family=binomial,control=list(usePQL=FALSE,msV=1)) >> >>Well, try them and see. 
Then if you cannot understand the output tell us >>a) what you found >>b) how it differed from what you expected >> >>>in addtion, I try these two models, it seems they are same. >>>what is the difference between these two model. Is this a cell means model? >>>m00 <- glm(sc ~aa-1,data = p5) >>>m000 <- glm(sc ~1+aa-1,data = p5) >> >>See above >> >> >>>thanks, >>> >>>Aimin Yan >> >>Michael Dewey >>http://www.aghmed.fsnet.co.uk > > > > > >-- >No virus found in this incoming message. >Checked by AVG Free Edition. >Version: 7.5.431 / Virus Database: 268.14.16/551 - Release Date: >25/11/2006 10:55 Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] random effect question and glm
At 21:52 23/11/2006, Aimin Yan wrote: >consider p as random effect with 5 levels, what is difference between these >two models? > > > p5.random.p <- lmer(Y >~p+(1|p),data=p5,family=binomial,control=list(usePQL=FALSE,msV=1)) > > p5.random.p1 <- lmer(Y >~1+(1|p),data=p5,family=binomial,control=list(usePQL=FALSE,msV=1)) Well, try them and see. Then if you cannot understand the output tell us a) what you found b) how it differed from what you expected >in addtion, I try these two models, it seems they are same. >what is the difference between these two model. Is this a cell means model? >m00 <- glm(sc ~aa-1,data = p5) >m000 <- glm(sc ~1+aa-1,data = p5) See above >thanks, > >Aimin Yan > > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
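Acting on the "try them and see" advice for the second pair of models: a minimal sketch with made-up data (the poster's p5 data frame is not available, so the factor and response here are invented) showing that the two glm() calls specify the identical cell-means model:

```r
# Hypothetical stand-in for the poster's data: a numeric response
# and a factor with three levels
set.seed(1)
p5 <- data.frame(aa = gl(3, 10, labels = c("a", "b", "c")),
                 sc = rnorm(30))

# With a factor on the right-hand side, '-1' removes the intercept,
# so each coefficient is simply that level's mean: a cell-means model
m00  <- glm(sc ~ aa - 1,     data = p5)
m000 <- glm(sc ~ 1 + aa - 1, data = p5)

# The '1 +' is redundant once '-1' is present: the fits are identical
all.equal(coef(m00), coef(m000))
all.equal(unname(coef(m00)["aaa"]),
          mean(p5$sc[p5$aa == "a"]))   # coefficient = cell mean
```

So the two formulas are just two spellings of the same design matrix, which is why the poster saw no difference between them.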
Re: [R] show unique combination of two factor
At 22:02 23/11/2006, Aimin Yan wrote: >The p factor has 5 levels >the aa factor has 19 levels. >In total there should be 95 combinations, >but I find only 92 combinations. >Does anyone know how to code to find which combinations are missing? Does which(table(p, aa) == 0, arr.ind = TRUE) do what you want? >Thanks, > >Aimin > > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
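Dewey's one-liner can be checked on a small synthetic version of the problem (5 x 19 levels with three combinations deliberately removed):

```r
# Build the full 5 x 19 cross-classification, then drop three rows
# so that three (p, aa) combinations never occur together
d <- expand.grid(p = factor(1:5), aa = factor(letters[1:19]))
d <- d[-c(3, 40, 77), ]

# table() counts every combination of levels; cells equal to zero are
# the missing ones, and arr.ind = TRUE returns their row/column indices
# (i.e. which p level and which aa level each missing cell belongs to)
miss <- which(table(d$p, d$aa) == 0, arr.ind = TRUE)
nrow(miss)   # one row per missing combination
```

The row names of the returned matrix give the offending level of the first factor directly, so the answer can be read off without further work.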
Re: [R] Newbie problem ... Forest plot
At 15:16 22/11/2006, you wrote: >When I use the studlab parameter I don't have (don't know how to put) >names inside the graph, X and Y axis, and I have a black/white forest >plot, and I need it coloured. 1 - please cc to r-help so that others can learn from your experience 2 - when you use the studlab parameter what happens, and how does that differ from what you wanted to happen? Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Summary, was Re: Confidence interval for relative risk
At 14:43 10/11/2006, Michael Dewey wrote: After considerable help from list members and some digging of my own I have prepared a summary of the findings which I have posted (see link below). Broadly there were four suggestions: 1 - Wald-type intervals, 2 - transforming the odds ratio confidence interval, 3 - a method based on the score test, 4 - a method based on the likelihood. Method 3 was communicated to me off-list === I haven't followed all of the thread either but has someone already pointed out to you confidence intervals that use the score method? For example Agresti (Categorical Data Analysis 2nd edition, p. 77-78) notes that 'although computationally more complex, these methods often perform better'. However, he also notes that 'currently they are not available in standard software'. But then again, R is not standard software: the code (riskscoreci) can be found from here: http://www.stat.ufl.edu/~aa/cda/R/two_sample/R2/index.html best regards, Jukka Jokinen === and so I reproduce it here. Almost a candidate for the fortunes package there. The other three can be found from the archive under the same subject although not all in the same thread. Methods 3 and 4 seem to have more going for them as far as I can judge. Thanks to David Duffy, Spencer Graves, Jukka Jokinen, Terry Therneau, and Wolfgang Viechtbauer. The views and calculations presented here and in the summary are my own and any errors are my responsibility not theirs. The summary document is available from here http://www.zen103156.zen.co.uk/rr.pdf Original post follows. >The concrete problem is that I am refereeing >a paper where a confidence interval is >presented for the risk ratio and I do not find >it credible. I show below my attempts to >do this in R. The example is slightly changed >from the authors'. 
> >I can obtain a confidence interval for >the odds ratio from fisher.test of >course > >=== fisher.test example === > > > outcome <- matrix(c(500, 0, 500, 8), ncol = 2, byrow = TRUE) > > fisher.test(outcome) > > Fisher's Exact Test for Count Data > >data: outcome >p-value = 0.00761 >alternative hypothesis: true odds ratio is not equal to 1 >95 percent confidence interval: > 1.694792 Inf >sample estimates: >odds ratio >Inf > >=== end example === > >but in epidemiology authors often >prefer to present risk ratios. > >Using the facility on CRAN to search >the site I find packages epitools and Epi >which both offer confidence intervals >for the risk ratio > >=== Epi example === > > > library(Epi) > > twoby2(outcome[c(2,1),c(2,1)]) >2 by 2 table analysis: >-- >Outcome : Col 1 >Comparing : Row 1 vs. Row 2 > > Col 1 Col 2P(Col 1) 95% conf. interval >Row 1 8 500 0.01570.0079 0.0312 >Row 2 0 500 0.0. NaN > >95% conf. interval > Relative Risk:Inf NaN Inf > Sample Odds Ratio:Inf NaN Inf >Conditional MLE Odds Ratio:Inf1.6948 Inf > Probability difference: 0.01570.0027 0.0337 > > Exact P-value: 0.0076 > Asymptotic P-value: NaN >-- > >=== end example === > >So Epi gives me a lower limit of NaN but the same confidence >interval and p-value as fisher.test > >=== epitools example === > > > library(epitools) > > riskratio(outcome) >$data > Outcome >Predictor Disease1 Disease2 Total > Exposed1 5000 500 > Exposed2 5008 508 > Total10008 1008 > >$measure > risk ratio with 95% C.I. >Predictor estimate lower upper > Exposed11NANA > Exposed2 Inf NaN Inf > >$p.value > two-sided >Predictor midp.exact fisher.exact chi.square > Exposed1 NA NA NA > Exposed2 0.00404821 0.007610478 0.004843385 > >$correction >[1] FALSE > >attr(,"method") >[1] "Unconditional MLE & normal approximation (Wald) CI" >Warning message: >Chi-squared approximation may be incorrect in: chisq.test(xx, correct = >correction) > >=== end example === > >And epitools also gives a lower limit >of NaN. 
> >=== end all examples === > >I would prefer not to have to tell the authors of the >paper I am refereeing that >I think they are wrong unless I can help them with what they >should have done. > >Is there another package
Re: [R] Newbie problem ... Forest plot
At 15:59 16/11/2006, Peter Bolton wrote: >Hello! >I have some data stored in 2 separate csv files. File 1 (called >A.csv) holds 12 results (named Group1, Group2, Group3, etc...) as odds >ratios, file 2 (called B.csv) the 12 corresponding errors. >How do I import that data into R and make a forest plot like I saw >in the help files of rmeta and meta, with different font colours >and names on the X and Y axis. You need to show us what you tried, and why it did not do what you want, for us to be able to point you in the right direction. I think you also need to get a good book on meta-analysis if you think that meta.MH from rmeta is the right tool for your dataset @BOOK{sutton00, author = {Sutton, A J and Abrams, K R and Jones, D R and Sheldon, T A and Song, F}, year = 2000, title = {Methods for meta-analysis in medical research}, publisher = {Wiley}, address = {Chichester}, keywords = {meta-analysis, general} } may help you. >I know for the meta lib >... >out <- metagen(name1,name2) >plot(out,xlab="abcd") > > >But I need for my data to look like this? (copy from help file rmeta) > >library(rmeta) >op <- par(lend="square", no.readonly=TRUE) >data(catheter) >a <- meta.MH(n.trt, n.ctrl, col.trt, col.ctrl, data=catheter, > names=Name, subset=c(13,6,5,3,7,12,4,11,1,8,10,2)) ># angry fruit salad >metaplot(a$logOR, a$selogOR, nn=a$selogOR^-2, a$names, > summn=a$logMH, sumse=a$selogMH, sumnn=a$selogMH^-2, > logeffect=TRUE, colors=meta.colors(box="magenta", > lines="blue", zero="red", summary="orange", > text="forestgreen")) >par(op) # reset parameters > >Of course if someone has a better idea to import data - not through a >csv file no problem any format is OK - data frame or something else >and to make the forest plot ... no problem ... I need them on the forest plot. >Greetings... >Peter Boolton >MCA adcc.. 
Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Confidence interval for relative risk
At 12:35 12/11/2006, Peter Dalgaard wrote: >Michael Dewey <[EMAIL PROTECTED]> writes: > > > At 15:54 10/11/2006, Viechtbauer Wolfgang (STAT) wrote: > > > > Thanks for the suggestion Wolfgang, but whatever the original authors > > did that is not it. > >Did you ever say what result they got? > > -p > No, because I did not want to use the original numbers in the request. So as the snippet below indicates I changed the numbers. If I apply Wolfgang's suggestion (which I had already thought of but discarded) I get about 13 for the real example where the authors quote about 5. My question still remains though as to how I can get a confidence interval for the risk ratio without adding a constant to each cell. > > > > it credible. I show below my attempts to > > > > do this in R. The example is slightly changed > > > > from the authors'. Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Confidence interval for relative risk
At 15:54 10/11/2006, Viechtbauer Wolfgang (STAT) wrote: Thanks for the suggestion Wolfgang, but whatever the original authors did that is not it. >Hello, > >A common way to calculate the CI for a relative risk is the >following. Given a 2x2 table of the form: > >a b >c d > >the log of the risk ratio is given by: > >lrr = log[ a/(a+b) / ( c/(c+d) ) ] > >which is asymptotically normal with variance: > >vlrr = 1/a - 1/(a+b) + 1/c - 1/(c+d). > >So an approximate 95% CI for the risk ratio is given by: > >exp[ lrr - 1.96*sqrt(vlrr) ], exp[ lrr + 1.96*sqrt(vlrr) ]. > >A common convention is to add 1/2 to each cell when there are zeros. > >So, for the table: > >Col 1 Col 2 >Row 1 8 500 >Row 2 0 500 > >lrr = log[ 8.5/509 / ( 0.5/501 ) ] = 2.817 >vllr = 1/8.5 - 1/509 + 1/0.5 - 1/501 = 2.1137 > >exp[ 2.817-1.96*sqrt(2.1137) ] = .97 >exp[ 2.817+1.96*sqrt(2.1137) ] = 289.04 > >Maybe that is what the authors did. > >Best, > >-- >Wolfgang Viechtbauer > Department of Methodology and Statistics > University of Maastricht, The Netherlands > http://www.wvbauer.com/ > > > > -Original Message- > > From: [EMAIL PROTECTED] [mailto:r-help- > > [EMAIL PROTECTED] On Behalf Of Michael Dewey > > Sent: Friday, November 10, 2006 15:43 > > To: r-help@stat.math.ethz.ch > > Subject: [R] Confidence interval for relative risk > > > > The concrete problem is that I am refereeing > > a paper where a confidence interval is > > presented for the risk ratio and I do not find > > it credible. I show below my attempts to > > do this in R. The example is slightly changed > > from the authors'. 
> > > > I can obtain a confidence interval for > > the odds ratio from fisher.test of > > course > > > > === fisher.test example === > > > > > outcome <- matrix(c(500, 0, 500, 8), ncol = 2, byrow = TRUE) > > > fisher.test(outcome) > > > > Fisher's Exact Test for Count Data > > > > data: outcome > > p-value = 0.00761 > > alternative hypothesis: true odds ratio is not equal to 1 > > 95 percent confidence interval: > > 1.694792 Inf > > sample estimates: > > odds ratio > > Inf > > > > === end example === > > > > but in epidemiology authors often > > prefer to present risk ratios. > > > > Using the facility on CRAN to search > > the site I find packages epitools and Epi > > which both offer confidence intervals > > for the risk ratio > > > > === Epi example === > > > > > library(Epi) > > > twoby2(outcome[c(2,1),c(2,1)]) > > 2 by 2 table analysis: > > -- > > Outcome : Col 1 > > Comparing : Row 1 vs. Row 2 > > > >Col 1 Col 2P(Col 1) 95% conf. interval > > Row 1 8 500 0.01570.0079 0.0312 > > Row 2 0 500 0.0. NaN > > > > 95% conf. interval > > Relative Risk:Inf NaN Inf > > Sample Odds Ratio:Inf NaN Inf > > Conditional MLE Odds Ratio:Inf1.6948 Inf > > Probability difference: 0.01570.0027 0.0337 > > > > Exact P-value: 0.0076 > > Asymptotic P-value: NaN > > -- > > > > === end example === > > > > So Epi gives me a lower limit of NaN but the same confidence > > interval and p-value as fisher.test > > > > === epitools example === > > > > > library(epitools) > > > riskratio(outcome) > > $data > >Outcome > > Predictor Disease1 Disease2 Total > >Exposed1 5000 500 > >Exposed2 5008 508 > >Total10008 1008 > > > > $measure > >risk ratio with 95% C.I. 
> > Predictor estimate lower upper > >Exposed11NANA > >Exposed2 Inf NaN Inf > > > > $p.value > >two-sided > > Predictor midp.exact fisher.exact chi.square > >Exposed1 NA NA NA > >Exposed2 0.00404821 0.007610478 0.004843385 > > > > $correction > > [1] FALSE > > > > attr(,"method") > > [1] "Unconditional MLE & normal approximation (Wald) CI" > > Warning message: > > Chi-squared approximation may be incorrect in: chisq.test(x
[R] Confidence interval for relative risk
The concrete problem is that I am refereeing a paper where a confidence interval is presented for the risk ratio and I do not find it credible. I show below my attempts to do this in R. The example is slightly changed from the authors'. I can obtain a confidence interval for the odds ratio from fisher.test of course === fisher.test example === > outcome <- matrix(c(500, 0, 500, 8), ncol = 2, byrow = TRUE) > fisher.test(outcome) Fisher's Exact Test for Count Data data: outcome p-value = 0.00761 alternative hypothesis: true odds ratio is not equal to 1 95 percent confidence interval: 1.694792 Inf sample estimates: odds ratio Inf === end example === but in epidemiology authors often prefer to present risk ratios. Using the facility on CRAN to search the site I find packages epitools and Epi which both offer confidence intervals for the risk ratio === Epi example === > library(Epi) > twoby2(outcome[c(2,1),c(2,1)]) 2 by 2 table analysis: -- Outcome : Col 1 Comparing : Row 1 vs. Row 2 Col 1 Col 2 P(Col 1) 95% conf. interval Row 1 8 500 0.0157 0.0079 0.0312 Row 2 0 500 0.0. NaN 95% conf. interval Relative Risk: Inf NaN Inf Sample Odds Ratio: Inf NaN Inf Conditional MLE Odds Ratio: Inf 1.6948 Inf Probability difference: 0.0157 0.0027 0.0337 Exact P-value: 0.0076 Asymptotic P-value: NaN -- === end example === So Epi gives me a lower limit of NaN but the same confidence interval and p-value as fisher.test === epitools example === > library(epitools) > riskratio(outcome) $data Outcome Predictor Disease1 Disease2 Total Exposed1 500 0 500 Exposed2 500 8 508 Total 1000 8 1008 $measure risk ratio with 95% C.I. 
Predictor estimate lower upper Exposed1 1 NA NA Exposed2 Inf NaN Inf $p.value two-sided Predictor midp.exact fisher.exact chi.square Exposed1 NA NA NA Exposed2 0.00404821 0.007610478 0.004843385 $correction [1] FALSE attr(,"method") [1] "Unconditional MLE & normal approximation (Wald) CI" Warning message: Chi-squared approximation may be incorrect in: chisq.test(xx, correct = correction) === end example === And epitools also gives a lower limit of NaN. === end all examples === I would prefer not to have to tell the authors of the paper I am refereeing that I think they are wrong unless I can help them with what they should have done. Is there another package I should have tried? Is there some other way of doing this? Am I doing something fundamentally wrong-headed? Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
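One of the replies earlier in this thread sets out the standard Wald interval for the log risk ratio, with a half added to each cell because of the zero count. Transcribed into R, it reproduces the figures quoted there (this is the textbook calculation, not a claim about what the paper's authors actually did):

```r
# 2x2 table from the thread, with 1/2 added to each cell because of the
# zero count; cc and dd stand for the c and d of the posted formulae
a  <- 8   + 0.5; b  <- 500 + 0.5
cc <- 0   + 0.5; dd <- 500 + 0.5

lrr  <- log((a / (a + b)) / (cc / (cc + dd)))   # log risk ratio
vlrr <- 1/a - 1/(a + b) + 1/cc - 1/(cc + dd)    # its asymptotic variance

# 95% Wald interval on the risk-ratio scale
ci <- exp(lrr + c(-1, 1) * qnorm(0.975) * sqrt(vlrr))
round(exp(lrr), 2)   # point estimate, about 16.73
round(ci, 2)         # roughly (0.97, 289), matching the thread
```

The very wide interval, driven by the 1/cc = 2 term from the corrected zero cell, is exactly why the score and likelihood methods mentioned in the summary are preferred here.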
Re: [R] struggling to plot subgroups
At 04:06 05/11/2006, Sumitrajit Dhar wrote: >Hi Folks, > >I have data that looks like this: > >freqgender xBar >1000m 2.32 >1000f 3.22 >2000m 4.32 >2000f 4.53 >3000m 3.21 >3000f 3.44 >4000m 4.11 >4000f 3.99 > >I want to plot two lines (with symbols) for the two groups "m" and >"f". I have tried the following: > >plot(xBar[gender=="m"]~freq[gender=="f"]) followed by So you want to plot the value of xBar for a man against the corresponding freq for a woman? But how are we to tell which man maps to which woman? >lines(xBar[gender=="m"]~freq[gender=="f"]) > >with different symbols and lines colors (which I know how to do). > >However, these initial plots are not giving me the right output. The >initial plot command generates the following error. >Error in plot.window(xlim, ylim, log, asp, ...) : > need finite 'xlim' values >In addition: Warning messages: >1: no non-missing arguments to min; returning Inf >2: no non-missing arguments to max; returning -Inf >3: no non-missing arguments to min; returning Inf >4: no non-missing arguments to max; returning -Inf > >A second issue, I would also like to offset the second set of points >along the x-axis so that the error bars will be visible. > >Any help is appreciated. > >Regards, >Sumit > > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
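A sketch of the usual idiom for this kind of plot, indexing x and y by the same group so each point pairs a group's own freq and xBar (data transcribed from the post; the "no non-missing arguments" warnings suggest the poster's subsets may have been empty, perhaps because the data frame was not attached, which is worth checking first):

```r
# Data as given in the post
freq   <- rep(c(1000, 2000, 3000, 4000), each = 2)
gender <- rep(c("m", "f"), times = 4)
xBar   <- c(2.32, 3.22, 4.32, 4.53, 3.21, 3.44, 4.11, 3.99)

m <- gender == "m"
# Index x and y by the SAME subset, so each group's xBar is paired
# with its own freq (the posted code mixed the "m" and "f" subsets)
plot(freq[m], xBar[m], type = "b", pch = 16, col = "blue",
     ylim = range(xBar), xlab = "freq", ylab = "xBar")
# Offset the second group slightly along x, as the poster wanted,
# so error bars added later would not overlap
lines(freq[!m] + 50, xBar[!m], type = "b", pch = 17, col = "red")
```

The pch, col, and 50-unit offset are arbitrary choices for illustration.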
Re: [R] latent class model
At 19:59 26/10/2006, karen xie wrote: >Dear List, > >I try to implement the latent class model with the unknown number of >classes. I wonder whether someone can provide me some sample codes. RSiteSearch("latent class") brings up 160 documents some of which may be relevant >Thank you for your help. > >Karen > > [[alternative HTML version deleted]] Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error when naming rows of dataset
At 12:44 25/10/2006, yongchuan wrote: >I'm pretty new with R, so the only error message I see is >the below that I pasted. I'm attaching the first few rows >of the file for reference. You seem to have 2 rows called 2.1 >The layout looks screwy when I >attach it here but 'Start' to 'closingCoupon' is the first >row in the .txt file. Thx! > > Start Stop PrepayDate modBalance closingCoupon >1.1 6 7 0 811.2769 8.35 >1.2 7 8 0 811.2769 8.35 >1.3 8 9 1 811.2769 8.35 >2.1 4 5 0 2226.0825 8.7 >2.2 5 6 0 2226.0825 8.7 >2.3 6 7 0 2226.0825 8.7 >2.4 7 8 0 2226.0825 8.7 >2.5 8 9 0 2226.0825 8.7 >2.6 9 10 0 2226.0825 8.7 >2.7 10 11 0 2226.0825 8.7 >2.8 11 12 0 2226.0825 8.7 >2.9 12 13 0 2226.0825 8.7 >2.1 13 14 0 2226.0825 8.7 > > > > > From: Michael Dewey <[EMAIL PROTECTED]> > > Date: Wed 25/10/2006 6:38 PM SGT > > To: yongchuan <[EMAIL PROTECTED]>, > > Subject: Re: [R] Error when naming rows of dataset > > > > At 17:30 24/10/2006, yongchuan wrote: > > >I get the following error when I try reading in a table. > > >How are 1.1, 1.2, 1.3 duplicate row names? Thx. > > > > R gives you brief details of where it was when it fell over. > > Have you checked in 'latestWithNumber.txt' to see whether R is right? > > > > > > > > table <- read.table('latestWithNumber.txt', header=T) > > >Error in "row.names<-.data.frame"(`*tmp*`, value = c("1.1", "1.2", "1.3", : > > > duplicate 'row.names' are not allowed > > > > > >Yongchuan > > > > Michael Dewey > > http://www.aghmed.fsnet.co.uk > > > > > > > > >-- >No virus found in this incoming message. >Checked by AVG Free Edition. >Version: 7.1.408 / Virus Database: 268.13.11/496 - Release Date: 24/10/2006 Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error when naming rows of dataset
At 17:30 24/10/2006, yongchuan wrote: >I get the following error when I try reading in a table. >How are 1.1, 1.2, 1.3 duplicate row names? Thx. R gives you brief details of where it was when it fell over. Have you checked in 'latestWithNumber.txt' to see whether R is right? > > table <- read.table('latestWithNumber.txt', header=T) >Error in "row.names<-.data.frame"(`*tmp*`, value = c("1.1", "1.2", "1.3", : > duplicate 'row.names' are not allowed > >Yongchuan Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
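For what it's worth, the error arises because read.table() takes the first column as row names when the header line has one fewer field than the data lines, and row names must be unique. A sketch of how to read such a file anyway and locate the duplicate (a text connection with invented values stands in for latestWithNumber.txt):

```r
# Stand-in for the offending file. Note that "2.1" appears twice in
# the first column, which is what trips the row-names check.
txt <- "id Start Stop
1.1 6 7
2.1 4 5
2.1 13 14"

# Because the header here names ALL the columns, read.table() does not
# try to use column 1 as row names, so the duplicate causes no error;
# reading the ids as character keeps "2.1" from becoming a number
d <- read.table(textConnection(txt), header = TRUE,
                colClasses = c("character", "integer", "integer"))

d$id[duplicated(d$id)]   # the culprit row label(s)
```

Once the duplicate is found, either fix it in the file or keep the ids as an ordinary column as above.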
Re: [R] glm cannot find valid starting values
At 16:12 15/10/2006, Ronaldo Reis Junior wrote: >On Saturday 14 October 2006 11:15, Gabor Grothendieck wrote: > > Try using OLS starting values: > > > > glm(Y~X,family=gaussian(link=log), start = coef(lm(Y~X))) > > > >OK, using a starting value the model works. > >But some time ago it worked without any starting values. Why do I now need >starting values? Since this involves a prediction about how a computer will work in the future I would have thought the quote from Samuel Goldwyn in your .sig was helpful. More seriously I think you need to give more detail about when it worked for those more expert than me to tell you what happened. >Thanks >Ronaldo >-- >Never make predictions, especially about the future. > -- Samuel Goldwyn >-- > > Prof. Ronaldo Reis Júnior >| .''`. UNIMONTES/Depto. Biologia Geral/Lab. Ecologia Evolutiva >| : :' : Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia >| `. `'` CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil >| `- Fone: (38) 3229-8190 | [EMAIL PROTECTED] >| ICQ#: 5692561 | LinuxUser#: 205366 Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] glm cannot find valid starting values
At 15:31 13/10/2006, Ronaldo Reis Junior wrote: >Hi, > >I have a similar problem. Some time ago this problem did not exist. > >Look at this simple example: > > > Y <- c(0,0,0,1,4,8,16) > > X <- c(1,2,3,4,5,6,7) > > > m <- glm(Y~X,family=gaussian(link=log)) >Error in eval(expr, envir, enclos) : cannot find valid starting >values: please >specify some Have you tried specifying some starting values as it asks? Try start=c(-3,1) for instance. Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
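A runnable version of the fix suggested in the reply, using the data from the post (remember that the start values are coefficients on the log-link scale, so c(-3, 1) means mu = exp(-3 + X)):

```r
Y <- c(0, 0, 0, 1, 4, 8, 16)
X <- c(1, 2, 3, 4, 5, 6, 7)

# Without start values glm() cannot find a valid starting point for the
# log link; a rough guess in the right neighbourhood is enough
m1 <- glm(Y ~ X, family = gaussian(link = log), start = c(-3, 1))

coef(m1)   # slope near log(2), since Y roughly doubles per unit of X

# The follow-up message in this thread reports that seeding with the
# OLS coefficients also works:
#   glm(Y ~ X, family = gaussian(link = log), start = coef(lm(Y ~ X)))
```

Either choice just gives the iteratively reweighted least squares algorithm a finite, positive fitted mean to begin from.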
Re: [R] error from pmvnorm
At 20:11 30/09/2006, [EMAIL PROTECTED] wrote: >Hi all, > >Can anyone tell me what the following error message means? >" Error in mvt(lower = lower, upper = upper, df = 0, corr = corr, >delta = mean, >: NA/NaN/Inf in foreign function call (arg 6)" > >It was generated when I used the 'pmvnorm' function in the 'mvtnorm' package. You are supposed I think to send questions about contributed packages to the maintainer, but before you do, or repost to this list, it might be a good idea to collect more information. What was the call of pmvnorm? What version of mvtnorm do you have? What version of R? >Thanks a lot. > >Yonghai Li Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Variable im Data Frame Namen
At 12:03 20/09/2006, Thorsten Muehge wrote: >Hello R Experts, >how can I incorporate a variable in a data frame definition? Thorsten, you have already had a reply about this, but I wonder whether you _really_ want to do this. You do not say what the scientific question is you are trying to answer, but I find that when I am tempted to do this it is almost always better to make a list (in your case of data frames) and then access the elements of the list >Example: >week <- 28 >test(week) <- data.frame(a,b,s,c); >test28 Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
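Dewey's list-of-data-frames suggestion looks like this in practice (the a, b, s, c columns from the post are replaced here by a single invented value column):

```r
# Instead of trying to create variables test28, test29, ...
# programmatically, keep one list keyed by the week number
set.seed(1)
test <- list()
for (week in 28:30) {
  # illustrative data frame; the poster's real columns would go here
  test[[as.character(week)]] <- data.frame(week = week, value = rnorm(3))
}

test[["28"]]   # the data frame for week 28
names(test)    # one element per week
```

This keeps all the weekly results in one object, so later code can loop over names(test) instead of pasting variable names together.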
Re: [R] Consulta sobre bases en R
At 17:04 31/08/2006, Marcela Corrales wrote: >Estimados, > >Quisiera saber cuanto es el tamaño máximo de datos contenidos en mi base de >datos, que el software R puede llegar a procesar y soporta. [I would like to know the maximum size of data in my database that the R software can process and support.] Marcela, you will get more help if you (a) post in English (b) give us more information about your problem and your system. At the moment the answer to your question is 'How long is a piece of string?' Try using the site search facility for "large databases" >De antemano agradezco y envió un cordial saludo. [Thank you in advance and best regards.] De nada [You're welcome] > > >Marcela Corrales > >CPData Optimum And please do not send HTML. > [[alternative HTML version deleted]] Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Vector Join
At 13:23 13/08/2006, Michael Zatorsky wrote: >Hi, > >I'm working on producing a simple cumulative frequency >distribution. > >Thanks to the help of the good people on this list I >now have four vectors that I'd like to join/relate >into a table. e.g. > > >v1 <- myHistogram$breaks # classes >v2 <- myHistogram$counts # freqs >v3 <- cumsum(v2) # cumulative freq >v4 <- ((v3 / length(myData)) * 100) # cumulative % data.frame(v1 = myHistogram$breaks, v2 = myHistogram$counts, and so on ...) >What is the recommend approach to turning these into a >single table with four columns? ie effectively doing >a relational join on row id? > >The goal is to ultimately have the data with one row >per class in a format I can write out to a text file >as: > > v1v2v3v4 > v1v2v3v4 > etc... > >Any advice will be appreciated. > >Regards >Michael. > > > > > >________ > >Coming soon: Celebrity Survivor - 11 celebrities, 25 days, unlimited drama Michael Dewey [EMAIL PROTECTED] http://www.aghmed.fsnet.co.uk/home.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
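Dewey's partial data.frame() call, completed and written out. One wrinkle the original snippet glosses over: hist()'s breaks component is one element longer than counts, so one break has to be dropped before the four columns can be bound (myData here is simulated):

```r
# Reproducible stand-in for the poster's objects
set.seed(1)
myData      <- rnorm(200)
myHistogram <- hist(myData, plot = FALSE)

# breaks has length(counts) + 1, so drop the first break to label each
# class by its upper limit (dropping the last would label by lower limit)
tab <- data.frame(v1 = myHistogram$breaks[-1],      # class upper limits
                  v2 = myHistogram$counts,          # frequencies
                  v3 = cumsum(myHistogram$counts),  # cumulative freq
                  v4 = 100 * cumsum(myHistogram$counts) / length(myData))

# One row per class, written out as plain text
write.table(tab, file = tempfile(fileext = ".txt"),
            row.names = FALSE, quote = FALSE)
```

Because the columns live in one data frame, they stay aligned row by row, which is the "relational join on row id" the poster was after.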
Re: [R] How to reply to a thread if receiving R-help mails in digest form
At 14:21 13/08/2006, Dirk Enzmann wrote: >I receive R-help messages in digest form that makes it difficult to >answer a post. I noticed that my answer is not added to the thread >(instead, a new thread is started) although I use the same subject >line (starting with "Re: ") as the original post. Is there a >solution (I prefer the digest to separate messages for several >reasons and don't want to change my email reader)? Dirk, I asked some while ago for people to tell me how they read digests in their news reader so I could put together a list of helpful hints. Unfortunately I got nowhere. In the email client I use (Eudora) there is an option to receive digests as attachments (as Ted Harding has also outlined in another email. It is rather hidden so I suspect if you can do it in Thunderbird it will be similarly hidden. Try looking for such an option and let me know if you find it. >The way I answer post up to now is: >1) I press the reply button of my email program (Mozilla / >Thunderbird, Windows) >2) I delete all contents of the digest except for the post >(including name and mail address of the posting person) I want to >answer so that the original question will be included (cited) in my answer. >3) I add the email address to the individual sender to "cc:" to the >automatically generated address of the R-help list. >4) I replace the automatically generated subject line (for example >"Re: R-help Digest, Vol 42, Issue 13" by "Re: " followed by a copy >of the original subject line of the post. >5) I write my answer and send the mail to the mailing list. > >It's not that this is tedious - the problem is that the thread is >broken. Is there a better way even if I want to keep receiving >messages in digest form? The posting guide is silent about this. > >Dirk > >* >Dr. Dirk Enzmann >Institute of Criminal Sciences >Dept. 
of Criminology >Edmund-Siemers-Allee 1 >D-20146 Hamburg >Germany > >phone: +49-(0)40-42838.7498 (office) >+49-(0)40-42838.4591 (Billon) >fax: +49-(0)40-42838.2344 >email: [EMAIL PROTECTED] >www: >http://www2.jura.uni-hamburg.de/instkrim/kriminologie/Mitarbeiter/Enzmann/Enzmann.html > > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Great R documentation
At 10:09 31/07/2006, hadley wickham wrote: >Dear all, > >I'm trying to improve the documentation I provide my R packages, and >to that end I'd like to find out what you think is great R >documentation. I'm particularly interested in function documentation, Hadley, I do not think any bit of function documentation, if you mean the manual page type documents, is ever that enlightening unless you already know what the function does. For me the useful documents are the more extended narrative documents. What I find helpful is:
- start with what is the aim of this document
- tell me what I am assumed to know first (and ideally tell me where to look if I do not)
- start with an example using the defaults
- tell me what the output means in terms of the scientific problem
- tell me what action I might take when it gives me a warning (if that is predictable)
- tell me about other packages with the same aim and why I should use this one
- tell me where to go for more information
>but great vignettes, websites or books are also of interest. > >What is your favourite bit of R documentation, and why? > >Thanks, > >Hadley > > Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Joining variables
At 21:01 24/05/2006, [EMAIL PROTECTED] wrote: >Hello, > >Although I cannot speak for the original individual posting this >question but I would like to see an example, if possible, of how to >construct a factor of several factors. ?interaction (assuming I understand you correctly) >I think this would be more efficient than "paste" if such a >procedure were possible and, of course, would also answer the >postee's original question. > >Many thanks, > >Mark LoPresti >Thomson Financial/Research Group > >-Original Message- >From: [EMAIL PROTECTED] >[mailto:[EMAIL PROTECTED] Behalf Of Berton Gunter >Sent: Wednesday, May 24, 2006 2:54 PM >To: 'Guenther, Cameron'; r-help@stat.math.ethz.ch >Subject: Re: [R] Joining variables > > >What does "combination of both" mean exactly. I can think of two >interpretations that have two different answers. If you give a small example >(as the posting guide suggests) it would certainly help me provide an >answer. > >-- Bert Gunter >Genentech Non-Clinical Statistics >South San Francisco, CA > > > > > > > -Original Message- > > From: [EMAIL PROTECTED] > > [mailto:[EMAIL PROTECTED] On Behalf Of > > Guenther, Cameron > > Sent: Wednesday, May 24, 2006 11:46 AM > > To: r-help@stat.math.ethz.ch > > Subject: [R] Joining variables > > > > Hello, > > > > If I have two variables that are factors or characters and I want to > > create a new variable that is the combination of both what > > function can > > I use to accomplish this? > > > > Ex. > > > > Var1 Var2 > > SA100055113 19851113 > > > > And I want > > > > NewVar > > SA10005511319851113 > > > > Thanks in advance. > > > > Cameron Guenther, Ph.D. > > Associate Research Scientist > > FWC/FWRI, Marine Fisheries Research > > 100 8th Avenue S.E. > > St. Petersburg, FL 33701 > > (727)896-8626 Ext. 4305 > > [EMAIL PROTECTED] > > > > __ > > R-help@stat.math.ethz.ch mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide! 
> > http://www.R-project.org/posting-guide.html > > > >__ >R-help@stat.math.ethz.ch mailing list >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Michael Dewey [EMAIL PROTECTED] http://www.aghmed.fsnet.co.uk/home.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
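A minimal sketch of the two approaches discussed above (the data values are copied from the question; the factor example is invented for illustration):

```r
## Plain concatenation of two character (or factor) columns: paste()
d <- data.frame(Var1 = "SA100055113", Var2 = "19851113",
                stringsAsFactors = FALSE)
d$NewVar <- paste(d$Var1, d$Var2, sep = "")
d$NewVar                          # "SA10005511319851113"

## A factor whose levels are combinations of two factors: interaction()
f1 <- factor(c("a", "a", "b"))
f2 <- factor(c("x", "y", "x"))
interaction(f1, f2)               # levels a.x, b.x, a.y, b.y
interaction(f1, f2, drop = TRUE)  # keep only combinations that occur
```

interaction() builds the combined factor directly rather than going via character strings, which is what makes it the more efficient choice when both inputs are factors.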
Re: [R] logistic regression model with non-integer weights
At 17:12 09/04/06, Ramón Casero Cañas wrote: I have not seen a reply to this so far apologies if I missed something. >When fitting a logistic regression model using weights I get the >following warning > > > data.model.w <- glm(ABN ~ TR, family=binomial(logit), weights=WEIGHT) >Warning message: >non-integer #successes in a binomial glm! in: eval(expr, envir, enclos) > >Details follow > >*** > >I have a binary dependent variable of abnormality > >ABN = T, F, T, T, F, F, F... > >and a continous predictor > >TR = 1.962752 1.871123 1.893543 1.685001 2.121500, ... > > > >As the number of abnormal cases (ABN==T) is only 14%, and there is large >overlapping between abnormal and normal cases, the logistic regression >found by glm is always much closer to the normal cases than for the >abnormal cases. In particular, the probability of abnormal is at most 0.4. > >Coefficients: > Estimate Std. Error z value Pr(>|z|) >(Intercept) 0.7607 0.7196 1.057 0.2905 >TR2 -1.4853 0.4328 -3.432 0.0006 *** >--- > >I would like to compensate for the fact that the a priori probability of >abnormal cases is so low. I have created a weight vector I am not sure what the problem you really want to solve is but it seems that a) abnormality is rare b) the logistic regression predicts it to be rare. If you want a prediction system why not try different cut-offs (other than 0.5 on the probability scale) and perhaps plot sensitivity and specificity to help to choose a cut-off? > > WEIGHT <- ABN > > WEIGHT[ ABN == TRUE ] <- 1 / na / 2 > > WEIGHT[ ABN == FALSE ] <- 1 / nn / 2 > >so that all weights add up to 1, where ``na'' is the number of abnormal >cases, and ``nn'' is the number of normal cases. That is, normal cases >have less weight in the model fitting because there are so many. > >But then I get the warning message at the beginning of this email, and I >suspect that I'm doing something wrong. Must weights be integers, or at >least greater than one? 
> >Regards, > >-- >Ramón Casero Cañas > >http://www.robots.ox.ac.uk/~rcasero/wiki >http://www.robots.ox.ac.uk/~rcasero/blog Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
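The cut-off suggestion can be sketched as follows (all data simulated here, with the variable names taken from the question):

```r
## Simulated stand-in for the real data: a rare binary outcome
set.seed(1)
TR  <- rnorm(200)
ABN <- rbinom(200, 1, plogis(-2 + TR))

fit <- glm(ABN ~ TR, family = binomial)
p   <- fitted(fit)               # predicted probabilities of abnormal

## Sensitivity and specificity over a range of cut-offs
cuts <- seq(0.05, 0.50, by = 0.05)
sens <- sapply(cuts, function(k) mean(p[ABN == 1] >= k))
spec <- sapply(cuts, function(k) mean(p[ABN == 0] <  k))
round(cbind(cuts, sens, spec), 2)
```

Plotting sens against 1 - spec gives an ROC-style view from which a cut-off well below 0.5 can be chosen, without reweighting the likelihood at all.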
Re: [R] Wikis etc.
At 16:06 09/01/06, Gabor Grothendieck wrote: [snip various earlier posts] >In addition to books, the various manuals, contributed documents and >mailing list archives, all of which one should review, >the key thing to do if you want to really learn R is to read source code >and lots of it. I think there is no other way. Furthermore, the fact that >you can do this is really a key advantage of open source. But that is the solution to a different problem. Reading the source for merge tells you how R merges two dataframes, the beginning user wants to know how to link together the information s/he has in two files but does not know what the name of the relevant command is, or indeed whether it is even possible. To give you some idea of how ignorant some of us are it was only quite recently that I realised (despite several years reading free documentation, books on R and R-help) that if I type cor at the prompt what I see looks like source code but is not _the_ source code. Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Wikis etc.
At 20:12 08/01/06, Jack Tanner wrote: >Philippe's idea to start a wiki that grows out of the content on >http://zoonek2.free.fr/UNIX/48_R/all.html is really great. Here's why. > >My hypothesis is that the basic reason that people ask questions on R-help >rather than first looking elsewhere is that looking elsewhere doesn't get >them the info they need. > >People think in terms of the tasks they have to do. The documentation for >R, which can be very good, is organized in terms of the structure of R, >its functions. This mismatch -- people think of tasks, the documentation >"thinks in" functions -- causes people to turn to the mailing list. Further to that I feel that (perhaps because they do not like to blow their own trumpet too much) the authors of books on R do not stress how much most questioners could gain by buying and reading at least one of the many books on R. When I started I found the free documents useful but I made most progress when I bought MASS. I do realise that liking books is a bit last millennium. Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] How do I get sub to insert a single backslash?
Something about the way R processes backslashes is defeating me. Perhaps this is because I have only just started using R for text processing. I would like to change occurrences of the ampersand & into ampersand preceded by a backslash.

> temp <- "R & D"
> sub("&", "\&", temp)
[1] "R & D"
> sub("&", "\\&", temp)
[1] "R & D"
> sub("&", "\\\&", temp)
[1] "R & D"
> sub("&", "\\\\&", temp)
[1] "R \\& D"
>

So I can get zero, or two backslashes, but not one. I am sure this is really simple but I did not find the answer by doing, for example ?regexp or ?Quotes

Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
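For what it is worth, the one-backslash result is in fact already there: print() escapes backslashes when displaying a string, while cat() shows the characters themselves:

```r
temp <- "R & D"
out <- sub("&", "\\\\&", temp)  # four backslashes in the source text
print(out)      # shows "R \\& D" -- print() doubles the backslash
cat(out, "\n")  # shows R \& D   -- only one backslash is really there
nchar(out)      # 6 characters: R, space, backslash, &, space, D
```

The four backslashes are needed because the string parser halves them (giving the two-character replacement \\&) and the regular-expression replacement engine halves them again.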
Re: [R] How to create a correlation table for categorical data???
At 19:07 04/01/06, fabio crepaldi wrote: >Hi, > I need to create the correlation table of a set of categorical data > (sex, marital_status, car_type, etc.) for a given population. > Basically what I'm looking for is a function like cor( ) working on > factors (and, if possible, considering NAs as a level). If they are ordered have you considered downloading the polycor package? Michael Dewey [EMAIL PROTECTED] http://www.aghmed.fsnet.co.uk/home.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
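A minimal sketch of the polycor suggestion (simulated ordered factors, since no data were posted):

```r
library(polycor)  # assumed installed from CRAN

## Two ordered factors cut from correlated latent normals
set.seed(1)
z <- rnorm(500)
x <- cut(z + rnorm(500), breaks = c(-Inf, -1, 0, 1, Inf),
         ordered_result = TRUE)
y <- cut(z + rnorm(500), breaks = c(-Inf, 0, Inf),
         ordered_result = TRUE)

polychor(x, y)             # quick two-step estimate
polychor(x, y, ML = TRUE)  # full maximum-likelihood estimate
```

hetcor() in the same package handles a whole data frame of mixed numeric and ordered variables; neither function treats NA as a level, so that part of the request would still need recoding by hand.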
Re: [R] from Colombia - help
At 13:50 20/12/05, andres felipe wrote:

>Hi, my name is Andres Felipe Barrientos, I'm a student of Statistics and
>don't speak English. In my degree project ('trabajo de grado') I need to
>implement the smooth.spline function and I need to know what type of
>spline it works with (B-splines or natural splines).

Since I have not yet seen a reply here goes. It is very honest of you to tell us it is your 'trabajo de grado' but should you not ask your tutor for help? Have you tried ?smooth.spline which tells you what the function does (admittedly fairly tersely)?

>I would also like to know which basis is used to find these splines,
>for example cosines, sines, polynomials among others. Another question
>I have is what the relation is between the smooth.spline function and
>ns. Thanking you for your attention and hoping for a reply, from the
>Universidad del Valle, yours, Andres Felipe
>
> [[alternative HTML version deleted]]

Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Comments please, how to set your mailer to read R-help in digest format
There are occasional comments on this list about how difficult it is to read the digest format. Since it took me a few false starts to configure my mailer to do it I have (with the encouragement of the list owner) put together a few hints. All I need now is for people who use other mailers to fill in the details in the same way as I have done for Eudora. The draft is available at http://www.aghmed.fsnet.co.uk/r/digest.htm Any other comments welcome of course. Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Digest reading is tedious
At 11:43 01/09/05, Martin Maechler wrote: [snip section about Trevor Hastie's experience] >What are other readers' experiences with mailman mailing lists >in digest mode -- using "MIME" type delivery? I use Eudora 6.2.1.2 (which is not the very latest version) running under Windows 98 or Windows XP and the digests are fine once you have found the option 'Receive MIME digest as mailbox attachment' and turned it on. I can then see the whole set of messages, sort them by subject and reply to them individually without having to change the subject of the message. Hope that helps. >Regards, >Martin Michael Dewey http://www.aghmed.fsnet.co.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] (sans objet)
At 11:12 26/06/05, pir2.jv wrote:

>My first question: may I write in French, my English is worse than bad.
>I shall try in French; otherwise I shall try a translation!
>
>I am a beginner. My question should therefore be simple, I imagine.

I think ?merge is your friend here.

>I am working on tables of statistics. Let me take an example: I have a
>data frame which I call "eurostat" which gives me trade statistics:
>
>eurostat:  Pays  Produit  Tonnage
>           CI    801      10123
>...
>
>I have another frame, cp, which tells me
>
>Produit  Nom
>801      Coconut
>
>and another which tells me that CI is "Côte d'Ivoire".
>I want to create a new table, for example "Importation", which gives me
>the following data:
>
>Importation  Pays           Produit  Tonnage
>             Côte d'Ivoire  Coconut  10123
>
>... and I cannot manage it. Yet it must surely be basic.
>
>Thank you for your help.
>
>Jacques Vernin
>PiR2
>10, Boulevard de Brazza
>13008 Marseille
>0491 734 736
>[EMAIL PROTECTED]

Michael Dewey [EMAIL PROTECTED] http://www.aghmed.fsnet.co.uk/home.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
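A sketch of the ?merge suggestion, with the frames reconstructed from the question (the name and columns of the country lookup frame are invented):

```r
eurostat <- data.frame(Pays = "CI", Produit = 801, Tonnage = 10123)
cp       <- data.frame(Produit = 801, Nom = "Coconut")
pays     <- data.frame(Pays = "CI", NomPays = "Cote d'Ivoire")

## merge() joins two data frames on their common column(s);
## nesting two calls handles both lookup tables
importation <- merge(merge(eurostat, cp, by = "Produit"),
                     pays, by = "Pays")
importation
```

The result carries the product name and country name alongside the tonnage; the code columns can then be dropped if they are not wanted.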
Re: [R] Problem trying to use boot and lme together
At 23:09 21/06/05, Prof Brian Ripley wrote: >On Tue, 21 Jun 2005, Douglas Bates wrote: > >>On 6/21/05, Søren Højsgaard <[EMAIL PROTECTED]> wrote: Thanks everyone for your help, more comments at the foot >>>The problem with simulate.lme is that it only returns logL for a given >>>model fitted to a simulated data set - not the simulated data set >>>itself (which one might have expected a function with that name to >>>do...). It would be nice with that functionality... >>>Søren >> >>You could add it. You just need to create a matrix that will be large >>enough to hold all the simulated data sets and fill a column (or row >>if you prefer but column is probably better because of the way that >>matrices are stored) during each iteration of the simulation and >>remember to include that matrix in the returned object. > >Note: you don't need to store it: you can do the analysis at that point >and return the statistics you want, rather than just logL. > >I did say `see simulate.lme', not `use simulate.lme'. I know nlme is no >longer being developed, but if it were I would be suggesting/contributing >a modification that allowed the user to specify an `extraction' function >from the fit -- quite a few pieces of bootstrap code work that way. > >-- >Brian D. Ripley, [EMAIL PROTECTED] >Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ >University of Oxford, Tel: +44 1865 272861 (self) >1 South Parks Road, +44 1865 272866 (PA) >Oxford OX1 3TG, UKFax: +44 1865 272595 This has been most informative for me. Given the rather stern comments in Pinheiro and Bates that most things do not have the reference distribution you might naively think I now realise that simulate.lme is more important than its rather cursory treatment in the book. As suggested I have looked at the code but although I can see broadly what each section does I lack the skill to modify it myself. I will have to wait for someone more gifted. 
If there is to be a successor edition to Pinheiro and Bates perhaps I could suggest that this topic merits a bit more discussion? Michael Dewey [EMAIL PROTECTED] http://www.aghmed.fsnet.co.uk/home.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Problem trying to use boot and lme together
The outcome variable in my dataset is cost. I prefer not to transform it as readers will really be interested in pounds, not log(pounds) or sqrt(pounds). I have fitted my model using lme and now wish to use boot on it. I have therefore plagiarised the example in the article in R News 2/3 December 2002. After loading the dataset I source this file

require(boot)
require(nlme)
model.0 <- lme(tot ~ time + timepost + pct + totpat +
   (time + timepost) : single + single +
   (time + timepost) : train + train +
   time:gpattend + timepost:gpattend + gpattend,
   data = common,
   random = ~ time + timepost | gp)
ints.9 <- intervals(model.0, which = "fixed")$fixed[9,]
#
common$fit <- fitted(model.0)
common$res <- resid(model.0, type = "response")
cats.fit <- function(data) {
   mod <- lme(tot ~ time + timepost + pct + totpat +
      (time + timepost) : single + single +
      (time + timepost) : train + train +
      time:gpattend + timepost:gpattend + gpattend,
      data = data,
      random = ~ time + timepost | gp)
   summ <- summary(mod)
   c(fixef(summ), diag(summ$varFix))
}
model.fun <- function(d, i) {
   d$tot <- d$fit + d$res[i]
   cats.fit(d)
}
tot.boot <- boot(common, model.fun, R = 999)

So I fit the model and then generate fitted values and residuals which I use within the model.fun function to generate the bootstrap resample. If I do this the plot looks fine as does the jack.after.boot plot but the confidence intervals are about 10% of the width of the ones from the lme output. I wondered whether I was using the wrong fitted and residuals so I added level = 0 to the calls of fitted and resid respectively (level = 1 is the default) but this seems to lead to resamples to which lme cannot fit the model. Can anyone spot what I am doing wrong? I would appreciate a cc'ed response as my ISP has taken to bouncing the digest posts from R-help with probability approximately 0.3. FWIW I am using 2.1.0 under XP (SP2) with the versions of boot and nlme which shipped with the binary. 
Michael Dewey [EMAIL PROTECTED] http://www.aghmed.fsnet.co.uk/home.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How I calculate nCr with R ? (Como calculo nCr con R? )
At 10:37 19/03/05, Mario Morales wrote:

>En español (In Spanish)
>Necesito calcular el numero de combinaciones de n cosas tomando k al
>tiempo.

In English we usually read this as N choose r and with that clue you might go

?choose

Incidentally your English is fine although I see the logic in giving us both.

>Como hago eso en R ??? Yo escribí mi propia función pero pienso que de
>esa forma no es fácil para mis estudiantes. He estado buscando en la
>ayuda y no he encontrado información sobre una función que calcule eso
>directamente. Podrían ayudarme
>
>In English (en Inglés)
>I need to calculate the combination of n things taking k at a time.
>How do I do that in R? I wrote my own function but in this form it
>isn't easy for my students. Can you help me?

Michael Dewey [EMAIL PROTECTED] http://www.aghmed.fsnet.co.uk/home.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
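The hint expands to very little code:

```r
choose(5, 2)   # 10: the number of ways to choose 2 things from 5
## the same quantity from first principles, but less numerically stable
factorial(5) / (factorial(2) * factorial(3))
combn(5, 2)    # the combinations themselves, one per column
```

choose() is vectorised and stays accurate for large n, which is why it is preferable to rolling your own from factorial().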
[R] Tetrachoric and polychoric coefficients (for sem) - any tips?
About two years ago there was a thread about this which suggested that at that time nobody had these coefficients ready to go.

(a) has anyone in the meanwhile programmed them?
(b) I think I can see how to do the tetrachoric one with mvtnorm on similar lines to an example on the help page so will try that if nobody else already has
(c) looking at the polychoric one makes me realise yet again that I wish I knew more mathematics. I would also find it difficult to test it as I do not have any worked examples. Does anyone have any tips about how to program it and how to test the resultant code?
(d) I appreciate this last item is not strictly an R question, but my intention is to use these as input into the sem package for structural equation models. If anyone thinks that is misguided I would be interested to hear.

Michael Dewey [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
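For point (b), an untested sketch of the mvtnorm approach for the tetrachoric case: the thresholds come from the margins of the 2 x 2 table and the correlation is found by maximising the multinomial likelihood (the function and its layout are my own invention, not an established implementation):

```r
library(mvtnorm)

tetra <- function(tab) {          # tab: 2 x 2 table, rows = variable 1
  n <- sum(tab)
  a <- qnorm(sum(tab[1, ]) / n)   # threshold for the row variable
  b <- qnorm(sum(tab[, 1]) / n)   # threshold for the column variable
  negll <- function(rho) {
    p00 <- pmvnorm(upper = c(a, b),
                   sigma = matrix(c(1, rho, rho, 1), 2))
    p <- c(p00,                          # cell probabilities in
           pnorm(b) - p00,               # column-major order to
           pnorm(a) - p00,               # match c(tab)
           1 - pnorm(a) - pnorm(b) + p00)
    -sum(tab * log(pmax(p, 1e-10)))
  }
  optimize(negll, interval = c(-0.99, 0.99))$minimum
}

tetra(matrix(c(40, 10, 10, 40), 2))  # strongly positive association
```

Checking this against published worked examples would still be needed, which is exactly the difficulty raised in (c) for the polychoric case.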
Summary, was [R] Plotting panels at arbitrary places on a map, rather than on a lattice
At 17:58 01/10/04, Michael Dewey wrote: I think it is easiest to describe what I want in terms of the concrete problem I have. I have data from a number of countries in each of which a sample of people was interviewed. In presenting the results in a forthcoming collaborative publication much emphasis will be placed on the multi-centre nature of the study. Although I suspect colleagues may do this with shaded maps I would prefer to avoid them as (a) they present one fact per country per map (b) they are unfair to the Netherlands and other high density countries. What I would like to do is to make the background represent Europe (ideally with a map but that is a frill) then place simple scattergrams (or radar plots) on it located roughly where the country is. Another way of describing it might be to say that I want something like the panels produced by lattice but at arbitrary coordinates rather than on a rectangular grid. I suspect I have to do this from scratch and I would welcome hints. Am I right that there is no off the shelf way to do this? Is grid the way to go? Looking at the article in Rnews 2(2) and a brief scan of the documentation suggests so. If grid is the way to go then bearing in mind I have never used grid before (a) any hints about the overall possible solution structure would be welcome (b) is this realistic to do within a week or shall I revert to lattice and lose the geography? Is there a simple way to draw a map in the background? It needs to cover as far as Sweden, Spain and Greece. It can be crude, as long as Italy looks roughly like a boot that is fine. I am an epidemiologist not a geographer. I received some very helpful hints and was able to get a satisfactory solution. Roger Bivand pointed me in the right way with map. 
After loading maps and mapproj I go

map("world",
    region = c("France", "Belgium", "Greece", "Spain", "Italy",
               "Switzerland", "Sweden", "Germany", "Netherlands",
               "Austria", "Denmark", "Sicily", "Sardinia"),
    xlim = c(-10, 30), ylim = c(30, 60),
    projection = "albers", parameters = c(30, 60),
    col = "lightgreen")

then bearing in mind that my data is in a file called merg which contains the coordinates and the data to plot

merg$x <- mapproject(merg$long, merg$lat)$x
merg$y <- mapproject(merg$long, merg$lat)$y

gets the coordinates in the new system. Deepayan Sarkar reminded me about stars which I should have remembered myself from reading MASS. I now go

stars(2 * merg[, ord + 4],
      scale = FALSE, len = 0.07,
      locations = cbind(merg$x, merg$y),
      labels = NULL, cex = 0.3,
      key.loc = c(mapproject(-10, 60)$x, mapproject(-10, 60)$y),
      add = TRUE)

and get a plot which does what I wanted. Roger had pointed out that gridBase was probably the way to put scatterplots on the map but I decided after looking at the stars that scatterplots would end up too small to use so I stayed with lattice for them. (When I said scattergrams I meant what nearly everyone else calls scatterplots.) Thanks to both of them, and to the people who made maps and stars so easy when you know what to look for. 
[R] Plotting panels at arbitrary places on a map, rather than on a lattice
I think it is easiest to describe what I want in terms of the concrete problem I have. I have data from a number of countries in each of which a sample of people was interviewed. In presenting the results in a forthcoming collaborative publication much emphasis will be placed on the multi-centre nature of the study. Although I suspect colleagues may do this with shaded maps I would prefer to avoid them as (a) they present one fact per country per map (b) they are unfair to the Netherlands and other high density countries. What I would like to do is to make the background represent Europe (ideally with a map but that is a frill) then place simple scattergrams (or radar plots) on it located roughly where the country is. Another way of describing it might be to say that I want something like the panels produced by lattice but at arbitrary coordinates rather than on a rectangular grid. I suspect I have to do this from scratch and I would welcome hints. Am I right that there is no off the shelf way to do this? Is grid the way to go? Looking at the article in Rnews 2(2) and a brief scan of the documentation suggests so. If grid is the way to go then bearing in mind I have never used grid before (a) any hints about the overall possible solution structure would be welcome (b) is this realistic to do within a week or shall I revert to lattice and lose the geography? Is there a simple way to draw a map in the background? It needs to cover as far as Sweden, Spain and Greece. It can be crude, as long as Italy looks roughly like a boot that is fine. I am an epidemiologist not a geographer. Michael Dewey [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] rbind with missing columns
At 11:03 01/05/04, you wrote:

Message: 3
Date: Fri, 30 Apr 2004 13:47:35 +0200
From: <[EMAIL PROTECTED]>
Subject: [R] rbind with missing columns
To: <[EMAIL PROTECTED]>
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset="us-ascii"

Hello, I have several data sets, which I want to combine column-by-column using "rbind" (the data sets have ~100 columns). The problem is, that in some data sets some of the columns are missing. Simply using rbind gives me: "Error in match.names(clabs, names(xi)) : names don't match previous names" Is there a simple way to make R do the rbind despite the missing columns and to fill out the missing data with NA's? (I could program this somehow, but probably there is one very simple solution available)

To make it clear here a simplified example. "unknown.command" is what I am looking for.

A <- data.frame(a = c(1,2,3), b = c(4,5,6))
B <- data.frame(a = c(1,2,3))
unknown.command(A, B) - should give

  a  b
1 1  4
2 2  5
3 3  6
4 1 NA
5 2 NA
6 3 NA

Does

A$id <- 1:nrow(A)
B$id <- 1:nrow(B) + nrow(A)
AandB <- merge(A, B, all = TRUE)

> A
  a b id
1 1 4  1
2 2 5  2
3 3 6  3
> B
  a id
1 1  4
2 2  5
3 3  6
> AandB
  a id  b
1 1  1  4
2 1  4 NA
3 2  2  5
4 2  5 NA
5 3  3  6
6 3  6 NA
>

give you what you wanted? You can always strip out the "id" column if you wish (after sorting on it if you would like the original order back).

Thank you for your help
Klaus

Michael Dewey [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
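An alternative that keeps rbind(), by first adding the missing columns to B as NA (this works when A's columns are a superset of B's):

```r
A <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
B <- data.frame(a = c(1, 2, 3))

B[setdiff(names(A), names(B))] <- NA  # create the absent columns
rbind(A, B[names(A)])                 # same column order, then stack
```

For many frames with arbitrarily overlapping columns the same idea applies with the union of all the names, filling each frame out before the final rbind().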
[R] Re: R-help Digest, Vol 14, Issue 3
At 12:01 03/04/04 +0200, you wrote: Content-Transfer-Encoding: 8bit From: [EMAIL PROTECTED] Precedence: list MIME-Version: 1.0 Cc: To: [EMAIL PROTECTED] Date: Fri, 2 Apr 2004 12:47:48 -0300 (ART) Message-ID: <[EMAIL PROTECTED]> Content-Type: text/plain;charset=iso-8859-1 Subject: [R] convert excell file to text with RODBC package Message: 27 Hi, i can convert excell to list in R with package RODBC ()but i don't understand 2 mistake 1) Don't read the last row of the table excell 2) Don' t take the hours See below My excell file call prueba4.xls and have the following rows: where prueba4.xls was make in excell (office xp) and have one spreadsheet call "Hoja1", you see each rows of she: Día Horacol1col2col3col4col5col6col7col8 15/12/2003 12:14:59217 27608,2 35 79,686,4 15/12/2003 12:15:00217 27648,2 35 79,686,4 15/12/2003 12:15:01217 27588,3 35 79,686,4 15/12/2003 12:15:02217 27608,3 35 79,686,4 15/12/2003 12:15:03217 27558,3 35 79,686,4 15/12/2003 12:15:04217 27668,3 35 79,686,4 15/12/2003 12:15:05217,1 27668,3 35,179,686,4 15/12/2003 12:15:06217,1 27588,3 35,179,686,4 15/12/2003 12:15:07217,1 27688,3 35,179,686,4 That seems to have 9 rows of data. There seem to be either 8 columns or 10 columns of data. My code (i use the R 1.7.1 for windows xp) is the following: library(RODBC) > canal<-odbcConnectExcel("c:/prueba4.xls") > tablas<-sqlTables(canal) > tablas TABLE_CAT TABLE_SCHEM TABLE_NAME TABLE_TYPE REMARKS 1 c:\\prueba4 Hoja1$ SYSTEM TABLE 2 c:\\prueba4 Hoja2$ SYSTEM TABLE 3 c:\\prueba4 Hoja3$ SYSTEM TABLE > tbl<-sqlFetch(canal,substr(tablas[1,3],1,nchar(tablas[1,3])-1)) > tbl[1] Día 1 2003-12-15 00:00:00 2 2003-12-15 00:00:00 3 2003-12-15 00:00:00 4 2003-12-15 00:00:00 5 2003-12-15 00:00:00 6 2003-12-15 00:00:00 7 2003-12-15 00:00:00 8 2003-12-15 00:00:00 9 2003-12-15 00:00:00 ¿Which is the mistake?. Well that has 9 rows the same as Hoja1 in Pruebas4.xls, so I do not understand your question 1. What exactly do you mean by question 2? 
Do you mean it is not reading the hours column (Hora) or do you mean it takes a long time? It looks to me as though column one, despite its label of day (Día), actually contains a much more complex Excel format. Michael Dewey [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Re: How transform excell file to text file in R
At 12:00 30/03/04 +0200, you wrote:

From: [EMAIL PROTECTED]
Precedence: list
MIME-Version: 1.0
Cc:
To: [EMAIL PROTECTED]
Date: Mon, 29 Mar 2004 17:22:14 -0300 (ART)
Message-ID: <[EMAIL PROTECTED]>
Content-Type: text/plain;charset=iso-8859-1
Subject: [R] how transform excell file to text file in R
Message: 56

Hello, I want to know if there is a library that transforms an Excel file (*.xls) to a text file (*.txt, *.prn, *.csv). Thanks

library(RODBC) will read your Excel file into R. You do not need a copy of Excel to do this.

Michael Dewey [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
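A sketch of the round trip (Windows with the Excel ODBC driver; the file and sheet names here are invented):

```r
library(RODBC)

channel <- odbcConnectExcel("c:/datos.xls")   # hypothetical file
sqlTables(channel)                            # list the sheets
dat <- sqlFetch(channel, "Hoja1")             # hypothetical sheet name
odbcClose(channel)

write.csv(dat, "datos.csv", row.names = FALSE)  # plain-text copy
```

write.csv (or write.table for *.txt and *.prn layouts) then produces the text file asked about.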
Re: [R] Lattice, skip= and layout= problem, plotting object from nlme output
At 13:28 22/03/04 -0600, you wrote: [snip my original problem] You are seeing documented lattice behavior, but looks like that's inconsistent with what happens in S-PLUS. lattice replicates the skip vector to be as long as the number of panels per page and uses that for every page, while S-PLUS replicates it to be as long as the total number of panels spanning all pages. I can easily fix this for 1.9.0, but this may break old (but probably rare) lattice code. For example, layout = c(2,2,2), skip = c(T,F,F) will now expand to page 1: T, F, F, T page 2: F, F, T, F as opposed to the current behavior page 1: T, F, F, T page 2: T, F, F, T Any objections to that ? From my point of view the current behaviour is not very useful, and in your example either behaviour seems a bit strange so I would not object to it changing. Thanks for the quick response, and thanks for making lattice available. I find myself making so many more graphical displays because they are (a) easy (b) aesthetically pleasing (c) revealing about my dataset. Deepayan Michael Dewey [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Lattice, skip= and layout= problem, plotting object from nlme output
I generate a groupedData object

library(nlme)
obj <- groupedData(mg10 ~ time | gp, data = common, outer = ~pct)

gp has 101 levels, and pct has 3. There are 38, 25, 38 gps in each of the levels of pct respectively. I fit my model

fit.rtg <- lme(mg10 ~ time * group, data = obj,
   random = ~time * group | gp)

Now I try to plot the results. I would like to print 40 panels on each page and have a new level of pct start a new page. The effect of using outer in the call of groupedData is that the values of gp are presented by pct.

plot.rtg <- plot(augPred(fit.rtg),
   skip = c(rep(FALSE, 38), TRUE, TRUE,
            rep(FALSE, 25), rep(TRUE, 15),
            rep(FALSE, 38), TRUE, TRUE),
   layout = c(5, 8, 3),
   strip = FALSE)

What I get is 38 panels on page 1 with 2 blank panels top right. I had hoped to get 25 on the next page with 15 blanks but I get 38 again with 2 blanks. If I alter layout to (say)

layout = c(12, 10)

I get blanks as expected on the single page produced so I surmise that skip is forcing the same format on each page. There is an example in \cite{pinheiro00} which suggests that what I wanted can be done (p113, p445-447) so I suspect I am doing something stupid here. I am using R 1.8.1 with the versions of lattice and nlme that came with it. I am using Windows 98SE if that is relevant.

@BOOK{pinheiro00,
  author = {Pinheiro, J C and Bates, D M},
  year = 2000,
  title = {Mixed-effects models in {S} and {S-PLUS}},
  publisher = {Springer-Verlag}
}

Michael Dewey [EMAIL PROTECTED] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Specifying suitable PC to run R
If I am buying a PC where the most compute intensive task will be running R and I do not have unlimited resources what trade-offs should I make? Specifically should I go for

1 - more memory, or
2 - faster processor, or
3 - something else?

If it makes a difference I shall be running Windows on it and I am thinking about getting a portable which I understand makes upgrading more difficult.

Extra background: the tasks I notice going slowly at the moment are fitting models with lme which have complex random effects and bootstrapping. By the standards of r-help posters I have small datasets (few thousand cases, few hundred variables). In order to facilitate working with colleagues I need to stick with Windows even if Linux would be more efficient.

Michael Dewey [EMAIL PROTECTED] http://www.aghmed.fsnet.co.uk/home.html __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help