[R] Questions concerning function 'svm' in e1071 package
Greetings everyone, I have the following problem (illustrating R-code at bottom of mail): Given a training sample with binary outcomes (-1/+1), I train a linear Support Vector Machine to separate them. Afterwards, I compute the weight vector w in the usual way, and obtain the fitted values as w'x + b 0 == yfitted = 1, otherwise -1. However, upon verifying with the 'predict' method, the outcomes do not match up as they should. I've already tried to find information concerning this issue on the R-help board, but to no avail. Can any of you point me in the right direction? Signed, Johan Van Kerckhoven ORSTAT and University Center of Statistics Katholieke Universiteit Leuven -- #initialization of the problem rm(list=ls()) library(e1071) set.seed(2) n = 50 d = 4 p = 0.5 x = matrix(rnorm(n*d), ncol=d) mushift = c(1, -1, rep(0, d-2)) y = runif(n) p y = factor(2*y - 1) x = x - outer(rep(1, n), mushift) x[y == 1, ] = x[y == 1] + 2*outer(rep(1, sum(y == 1)), mushift) svclass = svm(x, y, scale=FALSE, kernel=linear) #Computation of the weight vector w = t(svclass$coefs) %*% svclass$SV if (y[1] == -1) { w = -w } #Derivation of predicted class lavels #Using method in documentation yfit = (x %*% t(w) + svclass$rho) 0 yfit = factor(2*yfit - 1) #Extracting them directly from the model yfit2 = svclass$fitted #Display where predictions differ from each other yfit != yfit2 Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] [R-pkgs] ggplot: a new system for drawing graphics in R
ggplot provides a new system for drawing graphics in R, based on the Grammar of Graphics. It combines the advantages of both base and lattice graphics: conditioning and shared axes are handled automatically, and you can still build up a plot step by step from multiple data sources. It also implements a more sophisticated multidimensional conditioning system and a consistent interface to map data to visual attributes. ggplot (along with reshape) received the John Chambers Award for Statistical Computing. ggplot is available now from CRAN (install.packages(ggplot)) and more information is available at my website (http://had.co.nz/ggplot) including copies of talks, examples, and a guide showing how to convert your existing lattice code. To get started I recommend you look at: * the introductory vignette: vignette(introduction, ggplot) * help for the quick plotting command: ?qplot * help for the full plotting commands: ?ggplot I want to provide great documentation, so if there is anything you think I am missing, please let me know. Regards, Hadley ___ R-packages mailing list [EMAIL PROTECTED] https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] could i change the ouput style on summary?
Dear friends, summary() doesn't give a good ouput style,e.g.: grasssoiltem airtem gheight humidity altitude diluo :38 Min. :15.90 Min. :17.70 Min. : 8.00 Min. : 0.2360 high: 43 huanghuacai:32 1st Qu.:19.32 1st Qu.:22.60 1st Qu.:40.00 1st Qu.: 0.3190 low :119 hucao :46 Median :20.20 Median :25.30 Median :60.00 Median : 0.3399 yuhao :46 Mean :20.52 Mean :26.43 Mean :51.93 Mean : 0.3423 3rd Qu.:21.48 3rd Qu.:29.73 3rd Qu.:70.00 3rd Qu.: 0.3627 Max. :28.60 Max. :42.10 Max. :90.00 Max. : 0.4453 could i get a result like the following: colums are statistical indices and the rows are variables *min 1st Qu. Median Mean 3rd Qu. Max. variance grass* *soiltem * *airtem* which could be copy and used in the papers easily? Thanks very much! -- Kind Regards, Zhi Jie,Zhang ,PHD Department of Epidemiology School of Public Health Fudan University Tel:86-21-54237149 [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] sparse matrix tools
JJTh == Thaden, John J [EMAIL PROTECTED] on Sun, 2 Jul 2006 07:38:21 -0500 writes: JJTh Dear R-Help list: I'm using the Matrix library to JJTh operate on 600 X ~5000 element unsymmetrical sparse JJTh arrays. So far, so good, but if I find I need more JJTh speed or functionality, Can you be more specific? Functionality: For asymmetric matrices, in our view, the most important gap to fill is the LU decomposition and hence solve() features. Speed: Are you sure the time your R code spends is spent in functions from Matrix? {Did you use 'Rprof()' ?} If yes, which ones? JJTh how hard would it be to JJTh utilize other sparse matrix toolsets from within R, JJTh say MUMPS, PARDISO or UMFPACK, that do not have JJTh explicit R interfaces? More information on these is JJTh available here JJTh www.cise.ufl.edu/research/sparse/umfpack/ JJTh www.computational.unibas.ch/cs/scicomp/software/pardiso JJTh www.enseeiht.fr/lima/apo/MUMPS/ From these, only the first one is open source. Unfortunately, the PARDISO people seem to believe in money making with scientific software -- a particular shame for since they only work at most an hour away from me. MUMPS is said to be public domain, but then you only get it after filling out a web form and only for a specific hardware. Also, it is about massively parallel computation, very interesting for sparse matrices, but AFAIK not yet in our main focus. UMFPACK is different, and even ships with the Matrix package, because we have planned to interface to it, but haven't yet got to that. What parts of UMFPACK functionality would you be interested in ? JJTh and in these reviews JJTh ftp://ftp.numerical.rl.ac.uk/pub/reports/ghsNAGIR20051r1.pdf JJTh http://www.cise.ufl.edu/research/sparse/codes/ JJTh neither of which reviewed the R Matrix package, unfortunately. JJTh Thanks, - John Thaden, Ph.D., U. Arkansas for JJTh Med. Sci., Little Rock. Regards, Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Calculation of lags
Hi, If I have the follow situation: A dependent variable (i.e. number of insects) that is affected by an independent variable (i.e. rain). The problem is that the measure of rain affect the population in other moment. So there exit a lag between the rain and the number of insects. Exist in R any tool to find what is this lag? Explain better. Suppose that I have a linear relationship between rain and insects. More rain make more insects. If I try to model the rain and insects in the same moment I cant see this relationship because the rain today affect the number of insects in the future. Thus, I need to model the present rain with the number of insect in the future. But when is this future? 1 day after, 2 days after, etc. How the best way to calculate this lag? Thanks Ronaldo -- Todos chegamos um dia como a agua e nos vamos como o vento -- Graham Greene -- Prof. Ronaldo Reis Júnior | .''`. UNIMONTES/Depto. Biologia Geral/Lab. Ecologia Evolutiva | : :' : Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia | `. `'` CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil | `- Fone: (38) 3229-8190 | [EMAIL PROTECTED] | ICQ#: 5692561 | LinuxUser#: 205366 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] lme convergence
Hi, Dimitris: The change you suggested sounds constructive. Unfortunately, it did NOT solve the problem, at least for the modification of the example from the 'lme' help page I tested. However, one other similar change (and adding 'nlme:::' to the calls for functions hidden in an nlme namespace, identified partly with 'traceback()') produced a version of 'lme.formula' that actually returned an answer for my test case: fm1 - lme(distance ~ age, data = Orthodont, +control=lmeControl(msMaxIter=1, + returnObject=TRUE)) Warning message: nlminb problem, convergence error code = 1 ; message = iteration limit reached without convergence (9) in: lme.formula(distance ~ age, data = Orthodont, control = lmeControl(msMaxIter = 1, Below please find (1) the 2 changes and (2) a complete version of 'lme.formula' that worked for this example. Thanks for your contribution. Spencer Graves ### change 1 ## if (!needUpdate(lmeSt)) { if(optRes$convergence){ msg - paste(controlvals$opt, problem, convergence error code =, optRes$convergence, ; message =, optRes$message) { if(!controlvals$returnObject) stop(msg) else warning(msg) } break } } ### change 2 # if (numIter controlvals$maxIter) { msg - paste(Maximum number of iterations, (lmeControl(maxIter)) reached without convergence.) if (controlvals$returnObject) { warning(msg) break } else { stop(msg) } } ### lme.formula revised ### lme.formula - function(fixed, data = sys.frame(sys.parent()), random = pdSymm( eval( as.call( fixed[ -2 ] ) ) ), correlation = NULL, weights = NULL, subset, method = c(REML, ML), na.action = na.fail, control = list(), contrasts = NULL, keep.data = TRUE) { Call - match.call() miss.data - missing(data) || !is.data.frame(data) ## control parameters controlvals - lmeControl() if (!missing(control)) { if(!is.null(control$nlmStepMax) control$nlmStepMax 0) { warning(Negative control$nlmStepMax - using default value) control$nlmStepMax - NULL } controlvals[names(control)] - control } ## ## checking arguments ## if (!inherits(fixed, formula) || length(fixed) != 3) { stop(\nFixed-effects model must be a formula of the form \resp ~ pred\) } method - match.arg(method) REML - method == REML reSt - reStruct(random, REML = REML, data = NULL) groups - getGroupsFormula(reSt) if (is.null(groups)) { if (inherits(data, groupedData)) { groups - getGroupsFormula(data) namGrp - rev(names(getGroupsFormula(data, asList = TRUE))) Q - length(namGrp) if (length(reSt) != Q) { # may need to repeat reSt if (length(reSt) != 1) { stop(Incompatible lengths for \random\ and grouping factors) } randL - vector(list, Q) names(randL) - rev(namGrp) for(i in 1:Q) randL[[i]] - random randL - as.list(randL) reSt - reStruct(randL, REML = REML, data = NULL) } else { names(reSt) - namGrp } } else { ## will assume single group groups - ~ 1 names(reSt) - 1 } } ## check if corStruct is present and assign groups to its formula, ## if necessary if (!is.null(correlation)) { if(!is.null(corGrpsForm - getGroupsFormula(correlation, asList = TRUE))) { corGrpsForm - unlist(lapply(corGrpsForm, function(el) deparse(el[[2]]))) corQ - length(corGrpsForm) lmeGrpsForm - unlist(lapply(splitFormula(groups), function(el) deparse(el[[2]]))) lmeQ - length(lmeGrpsForm) if (corQ = lmeQ) { if (any(corGrpsForm != lmeGrpsForm[1:corQ])) { stop(paste(Incompatible formulas for groups in \random\, and \correlation\)) } if (corQ lmeQ) { warning(paste(Cannot use smaller level of grouping for, \correlation\ than for \random\. Replacing, the former with the latter.)) attr(correlation, formula) - eval(parse(text = paste(~, deparse(getCovariateFormula(formula(correlation))[[2]]), |, deparse(groups[[2]] } } else { if (any(lmeGrpsForm != corGrpsForm[1:lmeQ])) { stop(paste(Incompatible formulas for groups in \random\, and \correlation\)) } } } else { ## using the same grouping as in random attr(correlation, formula) - eval(parse(text = paste(~,
Re: [R] large dataset!
JENNIFER HILL jh1030 at columbia.edu writes: Hi, I need to analyze data that has 3.5 million observations and about 60 variables and I was planning on using R to do this but I can't even seem to read in the data. It just freezes and ties up the whole system -- and this is on a Linux box purchased about 6 months ago on a dual-processor PC that was pretty much the top of the line. I've tried expanding R the memory limits but it doesn't help. I'll be hugely disappointed if I can't use R b/c I need to do build tailor-made models (multilevel and other complexities). My fall-back is the SPlus big data package but I'd rather avoid if anyone can provide a solution Thanks Jennifer Hill Dear Jennifer, you may want to look at the R newsletters. A few years ago it had an article on using DBMS with R, like MySQL, Oracle, etc. This is a frequently asked question: There are also some posts over the past few years that may be helpful. I have successfully read large database into MySQL, and accessed it from R---it was larger than your database. I hope that helps. Anupam Tyagi. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] [R-pkgs] Rassist - Student-friendly package
The Rassist package has been loaded to CRAN. This package is designed to make R easier for new users, by providing extra checks and feedback. Presently the package functionality includes: * offers an alternative help facility, eg(.), with examples first, with additional examples included. eg() offers a start help menu, and eg(.) incorporates help.search(.) automatically. It also locates the relevant package if not loaded. * Some mathematical functions validate input received. They give feedback on units used, and when in error, prompt the user to correct the input. * plot(.) offers feedback and cautions about input format. * deskcheck incorporates debug(), in straight forward testing. * source(.), under development, in error produces a directory listing. We believe that much remains to be done, but we wanted to solicit community feedback from interested parties. Please pass on suggestions for other functions, or better ways to achieve the same goals, to me. Cheers Andrew -- Andrew Robinson Department of Mathematics and StatisticsTel: +61-3-8344-9763 University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599 Email: [EMAIL PROTECTED] http://www.ms.unimelb.edu.au ___ R-packages mailing list [EMAIL PROTECTED] https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] rownames, colnames, and date and time
Hi all I was wondering whether there has ever been an update on the rownames and colnames behaviour as described by Eric below? I still get the same behaviour, exactly as described by Eric, on my WinXP installation of R-2.3.0. I also posted a message to r-help on Friday but looking through the online archives it seems to have not made it to the list. I would agree with Eric that a consistent use of the typecast would be a reasonable solution. Any comments? Tobias Brandt Quantitative Analyst Taquanta Asset Managers Nedbank Clock Tower Victoria Alfred Waterfront, Cape Town 8001 Tel : +27 (0) 21 416 6602 Fax : +27 (0) 21 416 9945 Email : [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Erich Neuwirth Sent: 21 March 2006 01:31 PM To: R-help@stat.math.ethz.ch Cc: Christian Prinoth Subject: [R] rownames, colnames, and date and time I noticed something surprising (in R 2.2.1 on WinXP) According to the documentation, rownames and colnames are character vectors. Assigning a vector of class POSIXct or POSIXlt as rownames or colnames therefore is not strictly according to the rules. In some cases, R performs a reasonable typecast, but in some other cases where the same typecast also would be possible, it does not. Assigning a vector of class POSIXct to the rownames or names of a dataframe creates a reasonable string representation of the dates (and possibly times). Assigning such a vector to the rownames or colnames of a matrix produces rownames or colnames consisting of the integer representation of the date-time value. Trying to assign a vector of class POSIXlt in all cases (dataframes and matrices, rownames, colnames, names) produces an error. Demonstration code is given below. This is somewhat inconsistent. Perhaps a reasonable solution could be that the typecast used for POSIXct and dataframes is used in all the other cases also. Code: mymat-matrix(1:4,nrow=2,ncol=2) mydf-data.frame(mymat) mydates-as.POSIXct(c(2001-1-24,2005-12-25)) rownames(mydf)-mydates names(mydf)-mydates rownames(mymat)-mydates colnames(mymat)-mydates print(deparse(mydates)) print(deparse(rownames(mydf))) print(deparse(names(mydf))) print(deparse(rownames(mymat))) print(deparse(colnames(mymat))) mydates1-as.POSIXlt(mydates) # the following lines will not work and # produce errors rownames(mydf)-mydates1 names(mydf)-mydates1 rownames(mymat)-mydates1 colnames(mymat)-mydates1 -- Erich Neuwirth Institute for Scientific Computing and Didactic Center for Computer Science University of Vienna phone: +43-1-4277-39464 fax: +43-1-4277-39459 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Nedbank Limited Reg No 1951/09/06. The following link displays the names of the Nedbank Board of Directors and Company Secretary. [ http://www.nedbank.co.za/terms/DirectorsNedbank.htm ] This email is confidential and is intended for the addressee only. The following link will take you to Nedbank's legal notice. [ http://www.nedbank.co.za/terms/EmailDisclaimer.htm ] [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] median of gamma distribution
Hi, to compute the median (or expectation, var, sd, IQR, mad, ...) you can also use package distrEx. library(distrEx) (G - Gammad()) median(G) Matthias - original Nachricht Betreff: Re: [R] median of gamma distribution Gesendet: Fri, 30. Jun 2006 Von: [EMAIL PROTECTED] On 30-Jun-06 Philip He wrote: Doese anyone know a R function to find the median of a gamma distribution? qgamma will do it. Test: -log(0.5) [1] 0.6931472 qgamma(0.5,1) [1] 0.6931472 Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 30-Jun-06 Time: 16:53:16 -- XFMail -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html --- original Nachricht Ende -- Dr. rer. nat. Matthias Kohl [EMAIL PROTECTED] www.stamats.de __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] could i change the ouput style on summary?
Try this function: #create some data x - data.frame(a=runif(100), b=runif(100,100,200), c=runif(100,1000,2000)) # need to transpose the output of sapply t(sapply(x, function(z){ + .res - c(quantile(z, c(0, .25, .5, .75, 1)), mean(z), var(z)) + names(.res) - c('Min', '1st Qu', 'Median', '3rd Qu', 'Max', 'Mean', 'Var') + .res + } + )) Min 1st Qu Median 3rd Qu Max Mean Var a 8.147703e-030.24776050.53650740.76446150.995281 0.5142281 8.596908e-02 b 1.000605e+02 122.9294392 147.0742970 176.0785627 199.484326 147.973 9.092492e+02 c 1.010171e+03 1173.8183349 1498.1997677 1719.2519408 1996.363311 1467.8446420 8.932657e+04 On 7/3/06, zhijie zhang [EMAIL PROTECTED] wrote: Dear friends, summary() doesn't give a good ouput style,e.g.: grasssoiltem airtem gheight humidity altitude diluo :38 Min. :15.90 Min. :17.70 Min. : 8.00 Min. : 0.2360 high: 43 huanghuacai:32 1st Qu.:19.32 1st Qu.:22.60 1st Qu.:40.00 1st Qu.: 0.3190 low :119 hucao :46 Median :20.20 Median :25.30 Median :60.00 Median : 0.3399 yuhao :46 Mean :20.52 Mean :26.43 Mean :51.93 Mean : 0.3423 3rd Qu.:21.48 3rd Qu.:29.73 3rd Qu.:70.00 3rd Qu.: 0.3627 Max. :28.60 Max. :42.10 Max. :90.00 Max. : 0.4453 could i get a result like the following: colums are statistical indices and the rows are variables *min 1st Qu. Median Mean 3rd Qu. Max. variance grass* *soiltem * *airtem* which could be copy and used in the papers easily? Thanks very much! -- Kind Regards, Zhi Jie,Zhang ,PHD Department of Epidemiology School of Public Health Fudan University Tel:86-21-54237149 [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Jim Holtman Cincinnati, OH +1 513 646 9390 (Cell) +1 513 247 0281 (Home) What is the problem you are trying to solve? [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] help a newbie with a loop
Hi, I am new in R and stumbled on a problem my (more experienced) friends can not help with with. Why isnt this code working? The function is working, also with the loop and the graph appears, only when I build another loop around it (for different values of p) , R stays in a loop? Can't it take more then 2 loops in one program? powerb-function(x,sp2,a,b,b1,m) { sx-(sum(x^2)-(sum(x)^2)/length(x))/length(x) n0-ceilingqnorm(1-(a/2))+qnorm(1-b))/b1)^2)*(((m+1)/m)*sp2/sx)) repeat { n1-ceilingqt(1-(a/2),n0-4)+qt(1-b,n0-4))/b1)^2)*(((m+1)/m)*sp2/sx)) if(n0==n1) break n0-n1 } return(c(sx,n1)) } x-rnorm(1000,0,1) x-x[order(x)] res-matrix(0,1000,2) #use the function and plot for different values of ind and p for ( p in c(0.05,0.10,0.15,0.20,0.25,0.30,0.40,0.50)) { risk-p*(2-p) nonrisk-(1-p)^2 m-nonrisk/risk for (ind in 1:500) {res[ind,]-powerb(x[c(1:(500-ind),(500+ind):1000)],4,0.05,0.20,0.1,m)} plot(res[,1],res[,2],type=p,ylab=n,xlab=var(x),main=b=0.1,power=0 .80,alpha=0.05,dominant met p=0.25)} I would appreciate the help, Marco MPM Boks, MD PhD, Department of Psychiatry, B01.206 University Medical Centre Utrecht, PO box 85500, 3508 GA Utrecht The Netherlands. Tel: +31 30 2506370 Fax: +31 30 2505509 Email: [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Determine file access modifiers at file creation
Hi, is in R a command or option to a command which allows to set the file access modifiers for newly created files under Linux ? Currently a new file (e.g. with cat) is created with rw-r--r-- and I would like to have rw-rw-r. And I do not want to use umask. Thanks in advance Sigbert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] macro facility in R
R 2.2 on windows XP I have a dataset with multiple columns. Some of the columns represent independent variables, some represent dependent variables. I would like to run the same analyses on a fixed set of independent variables, changing only the dependent variable, e.g. y1-y2=x1+x2+x3 y3-y4=x1+x2+x3 y5-y6=x1+x2+x3, etc. I know I can write a function to perform the analyses, however in order to make the analyses easy to do, I really need a macro scripting language that will allow preprocessing of the function and substitution of macro variables with parameters passed to the macro, i.e. I need someting akin to the macro facility in SAS. Is there a macro facility that I can use with R? I have tried help.search(macro) and did not have any success. Thanks, John John Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics Baltimore VA Medical Center GRECC, University of Maryland School of Medicine Claude D. Pepper OAIC, University of Maryland Clinical Nutrition Research Unit, and Baltimore VA Center Stroke of Excellence University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 410-605-7119 [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] legend with filled boxes AND lines
Dear all, Is there a straightforward way to create a legend box that has both filled boxes and lines? So far I have built around this problem by creating two legends (with bty = n) and manually drawing a box around both (but this is cumbersome, because I have to check upon the y coordinates of the legends every time). If I do something like legend( ...,c(X1,X2, mean), fill = c(red, blue, 0), lty = (0,0,2)) , I cannot get rid of the unfilled box or change the color of the fill box border (from its default color black), and I end up with two filled boxes and an empty, black-lined box plus the line as a legend for the third argument mean. This trick therefore only works if I define black as the bg color for the complete legend box (because it masks the empty box from the fill argument). So, if there is a command to modify the color of the fill box border line (not the legend box border line), this would help me, too (still not ideal, though...). Thanks, Florian __ Florian Koller GfK Fernsehforschung GmbH Research Consulting Development Nordwestring 101 D-90319 Nürnberg Fon +49 (0)911 395-3554 Fax +49 (0)911 395-4130 www.gfk.de / www.gfk.com _ Diese E-Mail (ggf. nebst Anhang) enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind, oder diese E-Mail irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Mail. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Mail ist nicht gestattet. This e-mail (and any attachment/s) contains confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorised copying, disclosure or distribution of the material in this e-mail is strictly forbidden. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] rownames, colnames, and date and time
Note that if you are really trying to represent time series using rownames as the time variable then you might consider using the zoo or its packages (or ts class if they are regular) instead. library(zoo) mymat-matrix(1:4,nrow=2,ncol=2) mydates-as.POSIXct(c(2001-1-24,2005-12-25)) z - zoo(mymat, mydates) z On 7/3/06, Brandt, T. (Tobias) [EMAIL PROTECTED] wrote: Hi all I was wondering whether there has ever been an update on the rownames and colnames behaviour as described by Eric below? I still get the same behaviour, exactly as described by Eric, on my WinXP installation of R-2.3.0. I also posted a message to r-help on Friday but looking through the online archives it seems to have not made it to the list. I would agree with Eric that a consistent use of the typecast would be a reasonable solution. Any comments? Tobias Brandt Quantitative Analyst Taquanta Asset Managers Nedbank Clock Tower Victoria Alfred Waterfront, Cape Town 8001 Tel : +27 (0) 21 416 6602 Fax : +27 (0) 21 416 9945 Email : [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Erich Neuwirth Sent: 21 March 2006 01:31 PM To: R-help@stat.math.ethz.ch Cc: Christian Prinoth Subject: [R] rownames, colnames, and date and time I noticed something surprising (in R 2.2.1 on WinXP) According to the documentation, rownames and colnames are character vectors. Assigning a vector of class POSIXct or POSIXlt as rownames or colnames therefore is not strictly according to the rules. In some cases, R performs a reasonable typecast, but in some other cases where the same typecast also would be possible, it does not. Assigning a vector of class POSIXct to the rownames or names of a dataframe creates a reasonable string representation of the dates (and possibly times). Assigning such a vector to the rownames or colnames of a matrix produces rownames or colnames consisting of the integer representation of the date-time value. Trying to assign a vector of class POSIXlt in all cases (dataframes and matrices, rownames, colnames, names) produces an error. Demonstration code is given below. This is somewhat inconsistent. Perhaps a reasonable solution could be that the typecast used for POSIXct and dataframes is used in all the other cases also. Code: mymat-matrix(1:4,nrow=2,ncol=2) mydf-data.frame(mymat) mydates-as.POSIXct(c(2001-1-24,2005-12-25)) rownames(mydf)-mydates names(mydf)-mydates rownames(mymat)-mydates colnames(mymat)-mydates print(deparse(mydates)) print(deparse(rownames(mydf))) print(deparse(names(mydf))) print(deparse(rownames(mymat))) print(deparse(colnames(mymat))) mydates1-as.POSIXlt(mydates) # the following lines will not work and # produce errors rownames(mydf)-mydates1 names(mydf)-mydates1 rownames(mymat)-mydates1 colnames(mymat)-mydates1 -- Erich Neuwirth Institute for Scientific Computing and Didactic Center for Computer Science University of Vienna phone: +43-1-4277-39464 fax: +43-1-4277-39459 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Nedbank Limited Reg No 1951/09/06. The following link displays the names of the Nedbank Board of Directors and Company Secretary. [ http://www.nedbank.co.za/terms/DirectorsNedbank.htm ] This email is confidential and is intended for the addressee only. The following link will take you to Nedbank's legal notice. [ http://www.nedbank.co.za/terms/EmailDisclaimer.htm ] [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Calculation of lags
Represent your series as a ts object and see ?lag Also lag.plot may or may not be of use to you. On 7/2/06, Ronaldo Reis-Jr. [EMAIL PROTECTED] wrote: Hi, If I have the follow situation: A dependent variable (i.e. number of insects) that is affected by an independent variable (i.e. rain). The problem is that the measure of rain affect the population in other moment. So there exit a lag between the rain and the number of insects. Exist in R any tool to find what is this lag? Explain better. Suppose that I have a linear relationship between rain and insects. More rain make more insects. If I try to model the rain and insects in the same moment I cant see this relationship because the rain today affect the number of insects in the future. Thus, I need to model the present rain with the number of insect in the future. But when is this future? 1 day after, 2 days after, etc. How the best way to calculate this lag? Thanks Ronaldo -- Todos chegamos um dia como a agua e nos vamos como o vento -- Graham Greene -- Prof. Ronaldo Reis Júnior | .''`. UNIMONTES/Depto. Biologia Geral/Lab. Ecologia Evolutiva | : :' : Campus Universitário Prof. Darcy Ribeiro, Vila Mauricéia | `. `'` CP: 126, CEP: 39401-089, Montes Claros - MG - Brasil | `- Fone: (38) 3229-8190 | [EMAIL PROTECTED] | ICQ#: 5692561 | LinuxUser#: 205366 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Determine file access modifiers at file creation
Try using 'system': system(chmod 664 filename) On 7/3/06, statwi01 [EMAIL PROTECTED] wrote: Hi, is in R a command or option to a command which allows to set the file access modifiers for newly created files under Linux ? Currently a new file (e.g. with cat) is created with rw-r--r-- and I would like to have rw-rw-r. And I do not want to use umask. Thanks in advance Sigbert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Jim Holtman Cincinnati, OH +1 513 646 9390 (Cell) +1 513 247 0281 (Home) What is the problem you are trying to solve? [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] help a newbie with a loop
You need to debug your function. If you put some 'cat(p, ind)' statements: for (ind in 1:500) {cat(p, ind, '\n') res[ind,]-powerb(x[c(1:(500-ind),(500+ind):1000)],4,0.05,0.20,0.1,m)} Here is the place where it seemed to 'stall' on me: 0.3 250 0.3 251 0.3 252 0.3 253 0.3 254 0.3 255 0.3 256 It looks like at this point there was probably a loop in your function. You will have to learn how to use debug to trace through what it is doing at this point. It had produced the graphs for the other values of 'p'. So there must be a 'bug' there. On 7/3/06, Boks, M.P.M. [EMAIL PROTECTED] wrote: Hi, I am new in R and stumbled on a problem my (more experienced) friends can not help with with. Why isnt this code working? The function is working, also with the loop and the graph appears, only when I build another loop around it (for different values of p) , R stays in a loop? Can't it take more then 2 loops in one program? powerb-function(x,sp2,a,b,b1,m) { sx-(sum(x^2)-(sum(x)^2)/length(x))/length(x) n0-ceilingqnorm(1-(a/2))+qnorm(1-b))/b1)^2)*(((m+1)/m)*sp2/sx)) repeat { n1-ceilingqt(1-(a/2),n0-4)+qt(1-b,n0-4))/b1)^2)*(((m+1)/m)*sp2/sx)) if(n0==n1) break n0-n1 } return(c(sx,n1)) } x-rnorm(1000,0,1) x-x[order(x)] res-matrix(0,1000,2) #use the function and plot for different values of ind and p for ( p in c(0.05,0.10,0.15,0.20,0.25,0.30,0.40,0.50)) { risk-p*(2-p) nonrisk-(1-p)^2 m-nonrisk/risk for (ind in 1:500) {res[ind,]-powerb(x[c(1:(500-ind),(500+ind):1000)],4,0.05,0.20,0.1,m)} plot(res[,1],res[,2],type=p,ylab=n,xlab=var(x),main=b=0.1,power=0 .80,alpha=0.05,dominant met p=0.25)} I would appreciate the help, Marco MPM Boks, MD PhD, Department of Psychiatry, B01.206 University Medical Centre Utrecht, PO box 85500, 3508 GA Utrecht The Netherlands. Tel: +31 30 2506370 Fax: +31 30 2505509 Email: [EMAIL PROTECTED] mailto:[EMAIL PROTECTED] [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Jim Holtman Cincinnati, OH +1 513 646 9390 (Cell) +1 513 247 0281 (Home) What is the problem you are trying to solve? [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Determine file access modifiers at file creation
Hi, Try using 'system': system(chmod 664 filename) That's my plan, if I can not do it in R directly. Thanks Sigbert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] sparse matrix tools
So far I've used only 'new(dgC.Matrix,...)' and 'new(dgT.Matrix,...)'! I did not mean to malign Matrix speed/functionality -- I've not tested these yet -- nor do I quite know yet what function(s) I needto perform on my matrices! My questions were hypothetical. Thanks for the additional information on packages I mentioned. As I become able to reply regarding my needs for (Matrix or UMFPACK) functions, I shall reply offlist. Thanks and best regards, -John Thaden (operating R v.2.3.0 within Windows XP) MM = Martin Maechler [EMAIL PROTECTED] replied on Monday, July 03, 2006 2:45 AM: JJTh == Thaden, John J [EMAIL PROTECTED] on Sun, 2 Jul 2006 07:38:21 -0500 writes: JJTh Dear R-Help list: I'm using the Matrix library to JJTh operate on 600 X ~5000 element unsymmetrical sparse JJTh arrays. So far, so good, but if I find I need more JJTh speed or functionality, MM Can you be more specific? MM Functionality: MMFor asymmetric matrices, in our view, the most MMimportant gap to fill is the LU decomposition MMand hence solve() features. MM MM Speed: Are you sure the time your R code spends is MMspent in functions from Matrix? {Did you use MM 'Rprof()' ?}? If yes, which ones? JJTh how hard would it be to JJTh utilize other sparse matrix toolsets from within R, JJTh say MUMPS, PARDISO or UMFPACK, that do not have JJTh explicit R interfaces? More information on these is JJTh available here JJTh www.cise.ufl.edu/research/sparse/umfpack/ JJTh www.computational.unibas.ch/cs/scicomp/software/pardiso JJTh www.enseeiht.fr/lima/apo/MUMPS/ From these, only the first one is open source. Unfortunately, the PARDISO people seem to believe in money making with scientific software -- a particular shame for since they only work at most an hour away from me. MUMPS is said to be public domain, but then you only get it after filling out a web form and only for a specific hardware. Also, it is about massively parallel computation, very interesting for sparse matrices, but AFAIK not yet in our main focus. UMFPACK is different, and even ships with the Matrix package, because we have planned to interface to it, but haven't yet got to that. What parts of UMFPACK functionality would you be interested in ? JJTh and in these reviews JJTh ftp://ftp.numerical.rl.ac.uk/pub/reports/ghsNAGIR20051r1.pdf JJTh http://www.cise.ufl.edu/research/sparse/codes/ JJTh neither of which reviewed the R Matrix package, unfortunately. JJTh Thanks, - John Thaden, Ph.D., U. Arkansas for JJTh Med. Sci., Little Rock. Regards, Martin Maechler, ETH Zurich Confidentiality Notice: This e-mail message, including any a...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] macro facility in R
John Sorkin [EMAIL PROTECTED] writes: R 2.2 on windows XP (that's a bit old...) I have a dataset with multiple columns. Some of the columns represent independent variables, some represent dependent variables. I would like to run the same analyses on a fixed set of independent variables, changing only the dependent variable, e.g. y1-y2=x1+x2+x3 y3-y4=x1+x2+x3 y5-y6=x1+x2+x3, etc. I know I can write a function to perform the analyses, however in order to make the analyses easy to do, I really need a macro scripting language that will allow preprocessing of the function and substitution of macro variables with parameters passed to the macro, i.e. I need someting akin to the macro facility in SAS. Is there a macro facility that I can use with R? I have tried help.search(macro) and did not have any success. Thanks, John Some people have indicated that they might want to try their hand and write a macro facility for R, at some time in the future. (There are some parts of the R internals that would benefit from such a facility too). Meanwhile, update() is your friend. mdl - lm(y1-y2 ~ x1+x2+x3,) summary(mdl) summary(update(mdl, y3 - y4 ~ .)) ...etc... -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] macro facility in R
On 7/3/2006 7:39 AM, John Sorkin wrote: R 2.2 on windows XP I have a dataset with multiple columns. Some of the columns represent independent variables, some represent dependent variables. I would like to run the same analyses on a fixed set of independent variables, changing only the dependent variable, e.g. y1-y2=x1+x2+x3 y3-y4=x1+x2+x3 y5-y6=x1+x2+x3, etc. I know I can write a function to perform the analyses, however in order to make the analyses easy to do, I really need a macro scripting language that will allow preprocessing of the function and substitution of macro variables with parameters passed to the macro, i.e. I need someting akin to the macro facility in SAS. Is there a macro facility that I can use with R? I have tried help.search(macro) and did not have any success. I think the substitute function in R does most of what you would want from a macro facility. For example, analyze - function(dependent, data) { + formula - as.formula(substitute(dep ~ x1 + x2 + x3, + list(dep=substitute(dependent + lm(formula=formula, data=data) + } dat - data.frame(y = rnorm(10)+1:10, x1=rnorm(10)+1:10, x2=rnorm(10), x3=rnorm(10)) analyze(y, dat) Call: lm(formula = formula, data = data) Coefficients: (Intercept) x1 x2 x3 1.1256 0.8248 0.1671 -0.1907 I used substitute twice: the inner call gets the unevaluated expression that was passed as dependent; the outer one puts that in place of the dep variable. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Workspace size
Hi, Can I determine the approximate size of a workspace on the harddisk before saving it via sys.save.image(name) ? Thanks in advance Sigbert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] macro facility in R
John Sorkin wrote: R 2.2 on windows XP I have a dataset with multiple columns. Some of the columns represent independent variables, some represent dependent variables. I would like to run the same analyses on a fixed set of independent variables, changing only the dependent variable, e.g. y1-y2=x1+x2+x3 y3-y4=x1+x2+x3 y5-y6=x1+x2+x3, etc. I know I can write a function to perform the analyses, however in order to make the analyses easy to do, I really need a macro scripting language that will allow preprocessing of the function and substitution of macro variables with parameters passed to the macro, i.e. I need someting akin to the macro facility in SAS. Is there a macro facility that I can use with R? I have tried help.search(macro) and did not have any success. Also try RSiteSearch(macro, restrict=functions). And have a look at Thomas Lumley's column in the following issue of Rnews: http://cran.r-project.org/doc/Rnews/Rnews_2001-3.pdf Thanks, John John Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics Baltimore VA Medical Center GRECC, University of Maryland School of Medicine Claude D. Pepper OAIC, University of Maryland Clinical Nutrition Research Unit, and Baltimore VA Center Stroke of Excellence University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 410-605-7119 [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Chuck Cleland, Ph.D. NDRI, Inc. 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] macro facility in R
Try this: # test data set.seed(1) mat - matrix(rnorm(900), nc = 9) colnames(mat) - letters[1:9] DF - as.data.frame(mat) # run lm's and display coefs for(i in seq(1, 5, 2)) { dat - cbind(z = DF[,i] - DF[,i+1], DF[7:9]) Coef - coef(lm(z ~., dat)) cat(y: DF[,, i, ] - DF[,, i+1, ], coef: , Coef, \n) } On 7/3/06, John Sorkin [EMAIL PROTECTED] wrote: R 2.2 on windows XP I have a dataset with multiple columns. Some of the columns represent independent variables, some represent dependent variables. I would like to run the same analyses on a fixed set of independent variables, changing only the dependent variable, e.g. y1-y2=x1+x2+x3 y3-y4=x1+x2+x3 y5-y6=x1+x2+x3, etc. I know I can write a function to perform the analyses, however in order to make the analyses easy to do, I really need a macro scripting language that will allow preprocessing of the function and substitution of macro variables with parameters passed to the macro, i.e. I need someting akin to the macro facility in SAS. Is there a macro facility that I can use with R? I have tried help.search(macro) and did not have any success. Thanks, John John Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics Baltimore VA Medical Center GRECC, University of Maryland School of Medicine Claude D. Pepper OAIC, University of Maryland Clinical Nutrition Research Unit, and Baltimore VA Center Stroke of Excellence University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 410-605-7119 [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] help a newbie with a loop
On 7/3/06, Boks, M.P.M. [EMAIL PROTECTED] wrote: Hi, I am new in R and stumbled on a problem my (more experienced) friends can not help with with. Why isnt this code working? The function is working, also with the loop and the graph appears, only when I build another loop around it (for different values of p) , R stays in a loop? Can't it take more then 2 loops in one program? You can have as many as you like (as far as I know) powerb-function(x,sp2,a,b,b1,m) { sx-(sum(x^2)-(sum(x)^2)/length(x))/length(x) n0-ceilingqnorm(1-(a/2))+qnorm(1-b))/b1)^2)*(((m+1)/m)*sp2/sx)) repeat { n1-ceilingqt(1-(a/2),n0-4)+qt(1-b,n0-4))/b1)^2)*(((m+1)/m)*sp2/sx)) if(n0==n1) break n0-n1 } return(c(sx,n1)) } x-rnorm(1000,0,1) x-x[order(x)] res-matrix(0,1000,2) #use the function and plot for different values of ind and p for ( p in c(0.05,0.10,0.15,0.20,0.25,0.30,0.40,0.50)) { risk-p*(2-p) nonrisk-(1-p)^2 m-nonrisk/risk for (ind in 1:500) {res[ind,]-powerb(x[c(1:(500-ind),(500+ind):1000)],4,0.05,0.20,0.1,m)} plot(res[,1],res[,2],type=p,ylab=n,xlab=var(x),main=b=0.1,power=0 .80,alpha=0.05,dominant met p=0.25)} I modified your function as follows: powerb-function(x,sp2,a,b,b1,m){ sx-(sum(x^2)-(sum(x)^2)/length(x))/length(x) n0-ceilingqnorm(1-(a/2))+qnorm(1-b))/b1)^2)*(((m+1)/m)*sp2/sx)) repeat{ n1-ceilingqt(1-(a/2),n0-4)+qt(1-b,n0-4))/b1)^2)*(((m+1)/m)*sp2/sx)) if(n0==n1) break else cat(p = , p, , ind = , ind, , n0 = , n0, , n1 = , n1, \n, sep='') n0-n1 } return(c(sx,n1)) } and got, for example, the following output: p = 0.2, ind = 278, n0 = 2288, n1 = 2289 p = 0.2, ind = 278, n0 = 2289, n1 = 2288 p = 0.2, ind = 278, n0 = 2288, n1 = 2289 p = 0.2, ind = 278, n0 = 2289, n1 = 2288 p = 0.2, ind = 278, n0 = 2288, n1 = 2289 p = 0.2, ind = 278, n0 = 2289, n1 = 2288 p = 0.2, ind = 278, n0 = 2288, n1 = 2289 p = 0.2, ind = 278, n0 = 2289, n1 = 2288 ... so your in an infinite loop. Maybe you can check the difference between n0 and n1 instead, or perhaps better, do something like: powerb-function(x,sp2,a,b,b1,m){ sx-(sum(x^2)-(sum(x)^2)/length(x))/length(x) n0-ceilingqnorm(1-(a/2))+qnorm(1-b))/b1)^2)*(((m+1)/m)*sp2/sx)) n0.1 - n1.1 - Inf repeat{ n1-ceilingqt(1-(a/2),n0-4)+qt(1-b,n0-4))/b1)^2)*(((m+1)/m)*sp2/sx)) if(n0==n1) break if(n0.1==n1 n1.1==n0) break n0.1 - n0 n1.1 - n1 n0-n1 } return(c(sx,n1)) } __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] macro facility in R
- Original Message - From: Gabor Grothendieck [EMAIL PROTECTED] To: John Sorkin [EMAIL PROTECTED] Cc: r-help@stat.math.ethz.ch Sent: Monday, July 03, 2006 3:01 PM Subject: Re: [R] macro facility in R Try this: # test data set.seed(1) mat - matrix(rnorm(900), nc = 9) colnames(mat) - letters[1:9] DF - as.data.frame(mat) # run lm's and display coefs for(i in seq(1, 5, 2)) { dat - cbind(z = DF[,i] - DF[,i+1], DF[7:9]) Coef - coef(lm(z ~., dat)) cat(y: DF[,, i, ] - DF[,, i+1, ], coef: , Coef, \n) } In case of lm(), this could also be written as: y - mat[, c(1, 3, 5)] - mat[, c(2, 4, 6)] x - mat[, 7:9] lm(y ~ x) Best, Dimitris Dimitris Rizopoulos Ph.D. Student Biostatistical Centre School of Public Health Catholic University of Leuven Address: Kapucijnenvoer 35, Leuven, Belgium Tel: +32/(0)16/336899 Fax: +32/(0)16/337015 Web: http://med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm On 7/3/06, John Sorkin [EMAIL PROTECTED] wrote: R 2.2 on windows XP I have a dataset with multiple columns. Some of the columns represent independent variables, some represent dependent variables. I would like to run the same analyses on a fixed set of independent variables, changing only the dependent variable, e.g. y1-y2=x1+x2+x3 y3-y4=x1+x2+x3 y5-y6=x1+x2+x3, etc. I know I can write a function to perform the analyses, however in order to make the analyses easy to do, I really need a macro scripting language that will allow preprocessing of the function and substitution of macro variables with parameters passed to the macro, i.e. I need someting akin to the macro facility in SAS. Is there a macro facility that I can use with R? I have tried help.search(macro) and did not have any success. Thanks, John John Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics Baltimore VA Medical Center GRECC, University of Maryland School of Medicine Claude D. Pepper OAIC, University of Maryland Clinical Nutrition Research Unit, and Baltimore VA Center Stroke of Excellence University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 410-605-7119 [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Optional variables in function?
jim holtman wrote: ?missing On 7/2/06, Jonathan Greenberg [EMAIL PROTECTED] wrote: I'm a bit new to writing R functions and I was wondering what the best practice for having optional variables in a function is, and how to test for optional and non-optional variables? e.g. if I have the following function: helpme - function(a,b,c) { } In this example, I want c to be an optional variable, but a and b to be required. How do I: 1) test to see if the user has inputted c 2) break out of the function of the user has NOT inputted a or b. if(missing(c)) stop(need c) or if(missing(c)) return() depending on your problem it might be better to provide sensible default values for a,b,c in the function definition helpme-function(a=A, b=B, c=C) {...} instead of enforcing that every argument is specified? joerg Thanks! --j -- Jonathan A. Greenberg, PhD NRC Research Associate NASA Ames Research Center MS 242-4 Moffett Field, CA 94035-1000 Phone: 415-794-5043 AIM: jgrn3007 MSN: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] macro facility in R
Here is one user's perspective on this, with no pretense of being definitive. In the S (R) world, the expression computing on the language is used to encompass what I would call the tasks of macro programming. This involves uses of various S (R) expressions that convert between names of objects and the object themselves; pasting together expressions out of fixed and variable text; and executing them. I for one have been able to do everything I could do in the SAS macro language, but it has taken trial and error. To get started, you might consult section 3.5 of S Programming by Venables and Ripley. MHP John Sorkin wrote on 7/3/2006 7:39 AM: R 2.2 on windows XP I have a dataset with multiple columns. Some of the columns represent independent variables, some represent dependent variables. I would like to run the same analyses on a fixed set of independent variables, changing only the dependent variable, e.g. y1-y2=x1+x2+x3 y3-y4=x1+x2+x3 y5-y6=x1+x2+x3, etc. I know I can write a function to perform the analyses, however in order to make the analyses easy to do, I really need a macro scripting language that will allow preprocessing of the function and substitution of macro variables with parameters passed to the macro, i.e. I need someting akin to the macro facility in SAS. Is there a macro facility that I can use with R? I have tried help.search(macro) and did not have any success. Thanks, John John Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics Baltimore VA Medical Center GRECC, University of Maryland School of Medicine Claude D. Pepper OAIC, University of Maryland Clinical Nutrition Research Unit, and Baltimore VA Center Stroke of Excellence University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 410-605-7119 [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Michael H. Prager, Ph.D. Population Dynamics Team NOAA Center for Coastal Habitat and Fisheries Research NMFS Southeast Fisheries Science Center Beaufort, North Carolina 28516 USA http://shrimp.ccfhrb.noaa.gov/~mprager/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] problem with --vanilla in R
Hallo, I'm using the R Version 2.3.0 (2006-04-24) on Suse Linux 10.1. With an older R and Linux version I could write a R-function into a file and execute it with the command: R --vanilla script.r In the file script.r was code like this: postscript(file=results.eps) x-2 y-3 plot(x,y) dev.off() It worked fine, but now nothing happens. In the manual page this option is still mentioned. I don't know what to do, to make it working again. Thanks for your help. With kind regards, Katrin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] problem with --vanilla in R
Katrin Braesel wrote: Hallo, I'm using the R Version 2.3.0 (2006-04-24) on Suse Linux 10.1. With an older R and Linux version I could write a R-function into a file and execute it with the command: R --vanilla script.r In the file script.r was code like this: postscript(file=results.eps) x-2 y-3 plot(x,y) dev.off() It worked fine, but now nothing happens. In the manual page this option is still mentioned. I don't know what to do, to make it working again. Works for me with R-2.3.1 using an appropriate shell. Maybe you misspelled the filename or don't have write access? What happens after R --vanilla script.r BTW: I'd rather use R CMD BATCH, and I'd call the script script.R... Uwe Ligges Thanks for your help. With kind regards, Katrin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] do i set the correct argument?
Dear friends, In gls() of nlme package, there is some explanation on correlation: gls(model, data, correlation, weights, subset, method, na.action, control, verbose) correlation: an optional 'corStruct' object describing the within-group correlation structure. See the documentation of 'corClasses' for a description of the available 'corStruct' classes. If a grouping variable is to be used, it must be specified in the 'form' argument to the 'corStruct' constructor. Defaults to 'NULL', corresponding to uncorrelated errors. Now i have two categorical variables: grass and altitude, *gls.fit -gls(log(snail)~grass+altitude+gheight+humidity+soiltemr+airtemr,data=model,correlation=corAR1( form=~grass+altitude))* Is the above argument that i use right? thanks very much! -- Kind Regards, Zhi Jie,Zhang ,PHD Department of Epidemiology School of Public Health Fudan University Tel:86-21-54237149 [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] large dataset!
Jennifer, we had a little discussion about this topic last May when I had a similar problem. It is archived at http://finzi.psych.upenn.edu/R/Rhelp02a/archive/76401.html You can follow the thread to see the various arguments and solutions. I tried to summarize the possible suggested approachs at http://finzi.psych.upenn.edu/R/Rhelp02a/archive/76583.html HTH, Rogerio Porto. -- Cabeçalho original --- De: [EMAIL PROTECTED] Para: r-help@stat.math.ethz.ch Cópia: Data: Sun, 2 Jul 2006 10:12:25 -0400 (EDT) Assunto: [R] large dataset! Hi, I need to analyze data that has 3.5 million observations and about 60 variables and I was planning on using R to do this but I can't even seem to read in the data. It just freezes and ties up the whole system -- and this is on a Linux box purchased about 6 months ago on a dual-processor PC that was pretty much the top of the line. I've tried expanding R the memory limits but it doesn't help. I'll be hugely disappointed if I can't use R b/c I need to do build tailor-made models (multilevel and other complexities). My fall-back is the SPlus big data package but I'd rather avoid if anyone can provide a solution Thanks Jennifer Hill __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] how to get the studentized residuals in lm()
ronggui == ronggui [EMAIL PROTECTED] on Mon, 3 Jul 2006 12:29:42 +0800 writes: help.search(studentized) ronggui You will see: ronggui studres(MASS) Extract Studentized Residuals from a Linear Model Yes. But you don't even need another package. Thanks to John Fox, there's rstudent() rstandard() etc, even in the 'stats' package, with methods for lm and for glm objects. And now, in R-devel (2.4.0-to-be), help.search(studentized) will show the 'influence.measures' help page which contains rstudent(). Martin Maechler, ETH Zurich ronggui 2006/7/3, zhijie zhang [EMAIL PROTECTED]: Dear friends, In s-plus, lm() generates the the studentized residuals automatically for us, and In R, it seems don't have the results: After i fitted lm(), i use attibutes() to see the objects and didn't find studentized residuals . How to get the the studentized residuals in lm(),have i missed something? thanks very much! -- Kind Regards, Zhi Jie,Zhang ,PHD Department of Epidemiology School of Public Health Fudan University Tel:86-21-54237149 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html ronggui -- ronggui __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] macro facility in R
On Mon, 3 Jul 2006, John Sorkin wrote: R 2.2 on windows XP I have a dataset with multiple columns. Some of the columns represent independent variables, some represent dependent variables. I would like to run the same analyses on a fixed set of independent variables, changing only the dependent variable, e.g. y1-y2=x1+x2+x3 y3-y4=x1+x2+x3 y5-y6=x1+x2+x3, etc. Other people have mentioned my defmacro() function in R-news, however when I have to do this sort of thing I usually use substitute or bquote For example for(i in 1+(1:5)*2){ A-as.name(paste(y,i,sep=)) B-as.name(paste(y,i+1,sep=) eval(bquote(lm(I(.(A)-.(B))~x1+x2+x3,data=dat))) } -thomas __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] legend with filled boxes AND lines
Did you try legend(.., lty=..., fill=..., merge = TRUE) ? In an example I just tried, this allowed to give filled boxes *and* lines. Please give a reproducible example of what you did -- maybe by modifying one of the many example(legend) examples. Martin Maechler, ETH Zurich florian == florian koller [EMAIL PROTECTED] on Mon, 3 Jul 2006 13:40:33 +0200 writes: florian Dear all, florian Is there a straightforward way to create a legend florian box that has both filled boxes and lines? So far I florian have built around this problem by creating two florian legends (with bty = n) and manually drawing a box florian around both (but this is cumbersome, because I have florian to check upon the y coordinates of the legends florian every time). florian If I do something like legend( ...,c(X1,X2, florian mean), fill = c(red, blue, 0), lty = (0,0,2)) florian , I cannot get rid of the unfilled box or change florian the color of the fill box border (from its default florian color black), and I end up with two filled boxes florian and an empty, black-lined box plus the line as a florian legend for the third argument mean. This trick florian therefore only works if I define black as the bg florian color for the complete legend box (because it masks florian the empty box from the fill argument). So, if there florian is a command to modify the color of the fill box florian border line (not the legend box border line), this florian would help me, too (still not ideal, though...). florian Thanks, florian Florian florian __ florian Florian Koller florian GfK Fernsehforschung GmbH florian Research Consulting Development florian Nordwestring 101 florian D-90319 Nürnberg florian Fon +49 (0)911 395-3554 florian Fax +49 (0)911 395-4130 florian www.gfk.de / www.gfk.com florian _ florian Diese E-Mail (ggf. nebst Anhang) enthält vertrauliche und/oder rechtlich florian geschützte Informationen. Wenn Sie nicht der richtige Adressat sind, oder florian diese E-Mail irrtümlich erhalten haben, informieren Sie bitte sofort den florian Absender und vernichten Sie diese Mail. Das unerlaubte Kopieren sowie die florian unbefugte Weitergabe dieser Mail ist nicht gestattet. florian This e-mail (and any attachment/s) contains confidential and/or privileged florian information. If you are not the intended recipient (or have received this florian e-mail in error) please notify the sender immediately and destroy this florian e-mail. Any unauthorised copying, disclosure or distribution of the florian material in this e-mail is strictly forbidden. florian __ florian R-help@stat.math.ethz.ch mailing list florian https://stat.ethz.ch/mailman/listinfo/r-help florian PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] help with coxme
When I tried it, I seemed to get inconsistent results, the first of which was irreproducible. The first error message said that 'rats2' was not a data.frame. I don't know what I did to get that, but when I tried it again, I got the following: Error in max(kindex) : object kindex not found traceback() 3: max(kindex) 2: max(kindex) 1: coxme(Surv(time, status) ~ rx + x2, data = rats2, random = ~(1 + x2) | litter) This sounds to me like a logic error in 'coxme'; I am therefore cc-ing the maintainer listed in help(package='kinship'). If it were my problem, I might list coxme and make a local copy of it. It's easy to find the offending 'max(kindex)'. It's harder to figure out what kindex should be in this context. To attempt that, I might request debug(coxme), then try again the 'coxme' call that produced the error message and trace through it line by line until I figured it out. Hope this helps. Spencer Graves Lei Liu wrote: Hi there, I have a question on fitting data by coxme. In particular I want to fit a random intercept and random slope cox model. Using the rats dataset as an example, I generated another covariate x2 and want to specify a random slope for x2. Here is my code: x2=matrix(rep(runif(50), 3), 50, 3) x2=as.vector(t(x2)) rats2=cbind(rats, x2) But when I used the coxme function as follows, it gave an error message. What is the right way to do it? Thanks a lot! coxme(Surv(time, status) ~ rx+x2, data=rats2, random=~ (1+x2)|litter ) Lei Liu Assistant Professor Division of Biostatistics and Epidemiology Dept. of Public Health Sciences School of Medicine University of Virginia 3181 Hospital West Complex Charlottesville, VA 22908-0717 1-434-982-3364 (o) 1-434-806-8086 (c) [EMAIL PROTECTED] [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Antwort: Re: legend with filled boxes AND lines
Hi Martin, I know about the merge command, but I want a line without the box in the legend. Ideally I would need some argument that tells the fill subcommand not only to suppress the box color (see example below), but also to suppress the frame of the fill box. Alternatively if someone could tell me how to modify the color of the frame, this would help, too, because I could simply set it to the bg command. x1 - rnorm(100) x2 - rnorm(100, 2) hist(x1, main = , col = orange,ylab = density, xlab = x, freq = F, density = 55, xlim = c(-2, 5), ylim = c(0, 0.5)) par(new = T) hist(x2, main = , col = green, ylab = , xlab = ,axes = F, xlim = c(-2, 5), ylim = c(0, 0.5), density = 45, freq = F) abline(v = mean(x1), col = orange, lty = 2, lwd = 2.5) abline(v = mean(x2), col = green, lty = 2, lwd = 2.5) legend(3, 0.45, legend = c(x1, x2, mean(x1), mean(x2)), col = c(orange, green), fill=c(orange,green, 0, 0), lty = c(0, 0, 2, 2), merge = T) Thank you, Florian Koller Martin Maechler [EMAIL PROTECTED] schrieb am 03/07/2006 18:41:54: Did you try legend(.., lty=..., fill=..., merge = TRUE) ? In an example I just tried, this allowed to give filled boxes *and* lines. Please give a reproducible example of what you did -- maybe by modifying one of the many example(legend) examples. Martin Maechler, ETH Zurich florian == florian koller [EMAIL PROTECTED] on Mon, 3 Jul 2006 13:40:33 +0200 writes: florian Dear all, florian Is there a straightforward way to create a legend florian box that has both filled boxes and lines? So far I florian have built around this problem by creating two florian legends (with bty = n) and manually drawing a box florian around both (but this is cumbersome, because I have florian to check upon the y coordinates of the legends florian every time). florian If I do something like legend( ...,c(X1,X2, florian mean), fill = c(red, blue, 0), lty = (0,0,2)) florian , I cannot get rid of the unfilled box or change florian the color of the fill box border (from its default florian color black), and I end up with two filled boxes florian and an empty, black-lined box plus the line as a florian legend for the third argument mean. This trick florian therefore only works if I define black as the bg florian color for the complete legend box (because it masks florian the empty box from the fill argument). So, if there florian is a command to modify the color of the fill box florian border line (not the legend box border line), this florian would help me, too (still not ideal, though...). florian Thanks, florian Florian florian __ florian Florian Koller florian GfK Fernsehforschung GmbH florian Research Consulting Development florian Nordwestring 101 florian D-90319 Nürnberg florian Fon +49 (0)911 395-3554 florian Fax +49 (0)911 395-4130 florian www.gfk.de / www.gfk.com florian _ florian Diese E-Mail (ggf. nebst Anhang) enthält vertrauliche und/oder rechtlich florian geschützte Informationen. Wenn Sie nicht der richtige Adressat sind, oder florian diese E-Mail irrtümlich erhalten haben, informieren Sie bitte sofort den florian Absender und vernichten Sie diese Mail. Das unerlaubte Kopieren sowie die florian unbefugte Weitergabe dieser Mail ist nicht gestattet. florian This e-mail (and any attachment/s) contains confidential and/or privileged florian information. If you are not the intended recipient (or have received this florian e-mail in error) please notify the sender immediately and destroy this florian e-mail. Any unauthorised copying, disclosure or distribution of the florian material in this e-mail is strictly forbidden. florian __ florian R-help@stat.math.ethz.ch mailing list florian https://stat.ethz.ch/mailman/listinfo/r-help florian PLEASE do read the posting guide! http://www.R-project. org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] gamm
Hello, I am a bit confused about gamm in mgcv. Consulting Wood (2006) or Ruppert et al. (2003) hasn't taken away my confusion. In this code from the gamm help file: b2-gamm(y~s(x0)+s(x1)+s(x2)+s(x3),family=poisson,random=list(fac=~1)) Am I correct in assuming that we have a random intercept herebut that the amount of smoothing is also changing per level of the factor?? Or is it only the intercept that is changing? And where can I find some explanation on the magic output below? Thanks Piet summary(b2$lme) Random effects: Formula: ~Xr.1 - 1 | g.1 Structure: pdIdnot Xr.11Xr.12Xr.13Xr.14Xr.15Xr.16Xr.17Xr.18 StdDev: 1.680679 1.680679 1.680679 1.680679 1.680679 1.680679 1.680679 1.680679 Formula: ~Xr.2 - 1 | g.2 %in% g.1 Structure: pdIdnot Xr.21 Xr.22 Xr.23 Xr.24 Xr.25 Xr.26 Xr.27 Xr.28 StdDev: 1.57598 1.57598 1.57598 1.57598 1.57598 1.57598 1.57598 1.57598 Formula: ~Xr.3 - 1 | g.3 %in% g.2 %in% g.1 Structure: pdIdnot Xr.31Xr.32Xr.33Xr.34Xr.35Xr.36Xr.37Xr.38 StdDev: 20.06377 20.06377 20.06377 20.06377 20.06377 20.06377 20.06377 20.06377 Formula: ~Xr.4 - 1 | g.4 %in% g.3 %in% g.2 %in% g.1 Structure: pdIdnot Xr.41Xr.42Xr.43Xr.44Xr.45 StdDev: 0.0001063304 0.0001063304 0.0001063304 0.0001063304 0.0001063304 Xr.46Xr.47Xr.48 StdDev: 0.0001063304 0.0001063304 0.0001063304 Formula: ~1 | fac %in% g.4 %in% g.3 %in% g.2 %in% g.1 (Intercept) Residual StdDev: 0.6621173 1.007227 Variance function: Structure: fixed weights Formula: ~invwt Fixed effects: y.0 ~ X - 1 Value Std.Error DF t-value p-value X(Intercept) 2.0870364 0.3337787 392 6.252755 0. Xs(x0)Fx1-0.325 0.1028794 392 -0.000316 0.9997 Xs(x1)Fx1 0.3831694 0.0957323 392 4.002509 0.0001 Xs(x2)Fx1 1.4584330 0.3909237 392 3.730736 0.0002 Xs(x3)Fx1-0.0123951 0.0143162 392 -0.865809 0.3871 Correlation: - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] lme and SAS Proc mixed
Your example is entirely too complicated for me to parse in the time available, but I have a few questions that I hope might help: First, have you examined str(fit.lme) plus all the other help pages listed under See Also in the lme help page, especially lmeObject? With luck, this may answer your questions. Second, are all your random effects nested, or are some crossed? If the model is all standard nesting with no complicated inhomogeneity of variance or correlation structure, you may be working too hard: The 'nlme' package is designed to make standard nested mixed-model analysis as easily as Doug Bates could make it with a general tool in the S language paradigm. Functions like 'pdBlocked' are relatively inefficient attempts to adapt the 'nlme' paradigm to support crossed random effects, etc. If you have both crossed and nested random effects, you might try Bates' current project, 'lmer' associated with the 'lme4' package. Unfortunately, that package currently has fewer helper functions for things like confidence intervals. (It's Bates' leading edge development effort.) Third, can generate a much simpler, self-contained problem that exhibits the same features as your more complicated example? If your email had contained a few lines of R code that someone like me could copy and paste into R and get what you got, it would increase your chances of getting a more informative reply quicker. Without it, I'm left to guessing. Fourth, if you want to know, which is more correct, have you considered Monte Carlo? The nlme package includes a function simulate.lme, which might help you. The lme4 package includes mcmcsamp. Hope this helps. Spencer Graves Kellie J. Archer, Ph.D. wrote: I am trying to use lme to fit a mixed effects model to get the same results as when using the following SAS code: proc mixed; class refseqid probeid probeno end; model expression=end logpgc / ddfm=satterth; random probeno probeid / subject=refseqid type=cs; lsmeans end / diff cl; run; There are 3 genes (refseqid) which is the large grouping factor, with 2 probeids nested within each refseqid, and 16 probenos nested within each of the probeids. I have specified in the SAS Proc Mixed procedure that the variance-covariance structure is to be compound symmetric. Therefore, the variance-covariance matrix is a block diagonal matrix of the form, V_1 0 0 0 V_2 0 00 V3 where each V_i represents a RefSeqID. Moreover, for each V_i the structure within the block is v_{11} v{12} v_{21} v{22} where v_{11} and v_{22} are different probeids nested within the refseqid, and so are correlated. The structure of both v_{11} and v_{22} are compound symmetric, and v_{12} and v{21} contain a constant for all elements of the matrix. I have tried to reproduce this using lme, but it is unclear from the documentation (and Pinheiro Bates text) how the pdBlocked and compound symmetric structure can be combined. fit.lme-lme(expression~End+logpgc,random=list(RefSeqID=pdBlocked(list (~1,~ProbeID-1),pdClass=pdSymm)),data=dataset,correlation=corCompSym m(form=~1|RefSeqID/ProbeID/ProbeNo)) The point estimates are essentially the same comparing R and SAS for the fixed effects, but the 95% confidence intervals are much shorter using lme(). In order to find the difference in the algorithms used by SAS and R I tried to extract the variance-covariance matrix to look at its structure. I used the getVarCov() command, but it tells me that this function is not available for nested structures. Is there another way to extract the variance-covariance structure for nested models? Does anyone know how I could get the var-cov structure above using lme? Kellie J. Archer, Ph.D. Assistant Professor, Department of Biostatistics Fellow, Center for the Study of Biological Complexity Virginia Commonwealth University 1101 East Marshall St., B1-066 Richmond, VA 23298-0032 phone: (804) 827-2039 fax: (804) 828-8900 e-mail: [EMAIL PROTECTED] website: www.people.vcu.edu/~kjarcher __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] xlab, ylab in balloonplot(tab)?
I'm not understanding something. I'm trying to add xlab ylab to a balloon plot of a table object. From docs I thought following should work: require(gplots) # From balloonplot example: # Create an example using table xnames - sample( letters[1:3], 50, replace=2) ynames - sample( 1:5, 50, replace=2) tab - table(xnames, ynames) balloonplot(tab) # Try xlab, ylab: balloonplot(tab, xlab = MyX, ylab = MyY) But second plot is no different from first. R.version.string: Version 2.3.1 (2006-06-01) gplots version: 2.3.0 on WinXP SP1 -- TIA, Jim Porzak Loyalty Matrix Inc. San Francisco, CA [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] difficult data manipulation question
hi everyone : suppose i have a matrix in which some column names are identical so, for example, TEMP AAA, BBB, CCC, DDD,AAA, BBB 0 2 1 2 0 0 2 3 7 6 0 1 1.54 9 9 6 0 1.06 1011 3 3 I didn't even check yet whether identical column names are allowed in a matrix but i hope they are. assuming that they are, then i would like to be able to take the matrix and make a new matrix with the following requirements. 1) whenever there is a unique column name, just take that column for the new matrix 2) whenever the column name is not unique, take the one that has the most non zero elements ? ( in the case of ties, i don't care which one is picked ). so, in this case, the resulting matrix would just be the first 4 columns. i realize ( or atleast i think ) that sum( TEMP[(TEMP[,columnname] !=0) ,columnname) will give me the number of non elements in a column with the name columnmame but how to use this deal with the non uniqueness to solve my particular problem is beyond me. plus, i think the command will bomb because columnname will not always be unique ? Thanks for any help. I realize this is not a trivial problem so I really appreciate it. Mark __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] difficult data manipulation question
Try this: # test data # read in header separately so R does not make column names unique Lines - AAA BBB CCC DDD AAA BBB 0 2 1 2 0 0 2 3 7 6 0 1 1.54 9 9 6 0 1.06 1011 3 3 DF - read.table(textConnection(Lines), skip = 1) names(DF) - scan(textConnection(Lines), what = , nlines = 1) f - function(x) x[which.max(colSums(DF[x]!=0))] tapply(seq(DF), names(DF), f) On 7/3/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: hi everyone : suppose i have a matrix in which some column names are identical so, for example, TEMP AAA, BBB, CCC, DDD,AAA, BBB 0 2 1 2 0 0 2 3 7 6 0 1 1.54 9 9 6 0 1.06 1011 3 3 I didn't even check yet whether identical column names are allowed in a matrix but i hope they are. assuming that they are, then i would like to be able to take the matrix and make a new matrix with the following requirements. 1) whenever there is a unique column name, just take that column for the new matrix 2) whenever the column name is not unique, take the one that has the most non zero elements ? ( in the case of ties, i don't care which one is picked ). so, in this case, the resulting matrix would just be the first 4 columns. i realize ( or atleast i think ) that sum( TEMP[(TEMP[,columnname] !=0) ,columnname) will give me the number of non elements in a column with the name columnmame but how to use this deal with the non uniqueness to solve my particular problem is beyond me. plus, i think the command will bomb because columnname will not always be unique ? Thanks for any help. I realize this is not a trivial problem so I really appreciate it. Mark __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] analogue of group option of SAS MIXED/random in R
Dear list, I am trying to use lme to build the analogue of the following SAS MIXED random specification: random int+Variable1+Variable2 /subject = Subject group=Condition type=vc; which gives a Condition-blocked heterogeneity in the random effects variance-covariance matrix. Needless to say, I have a hard time in specifying Condition-specific heterogeneities in the variance-covariance parameters. I initially tried the following commands (without Condition-heterogeneity in the random effects): G.Data-groupedData(Response~1|Subject,data=In.Data) Fit1-lme(Response~1+Variable1+Variable2*Condition,random=pdDiag(~1+Variable1+Variable2),method=REML,data=G.Data) but have no idea about where to go from here (note that I don't want to nest Subject in Condition). Thanks!! Bruno __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] gamm and binomial data
Hello, I have a response variable that is a time series of 0's and 1's. And a couple of continous explanatory variables. I would like to fit a gamm with auto-correlation and binomial distribution using gamm in mgcv. Something simple like: tmp-gamm(y ~ s(x), correlation=corAR1(), binomial) However, I read in various messages on this newsgroup that glmmPQL (used by gamm) is using an overdispersion parameter. Does this make my approach invalid? Thanks for any answer, Piet Bell - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] xlab, ylab in balloonplot(tab)?
Dear ListRs, I'm not understanding something. I'm trying to add xlab ylab to a balloon plot of a table object. From gplots docs I thought following should work: require(gplots) # From balloonplot example: # Create an example using table xnames - sample( letters[1:3], 50, replace=2) ynames - sample( 1:5, 50, replace=2) tab - table(xnames, ynames) balloonplot(tab) # Try xlab, ylab: balloonplot(tab, xlab = MyX, ylab = MyY) But second plot is no different from first. R.version.string: Version 2.3.1 (2006-06-01) gplots version: 2.3.0 on WinXP SP1 -- TIA, Jim Porzak Loyalty Matrix Inc. San Francisco, CA [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] difficult data manipulation question
Here is a modification of Gabor's solution that will return the dataframe with just the maximum columns: # test data # read in header separately so R does not make column names unique Lines - AAA BBB CCC DDD AAA BBB 0 2 1 2 0 0 2 3 7 6 0 1 1.54 9 9 6 0 1.06 1011 3 3 DF - read.table(textConnection(Lines), skip = 1) names(DF) - scan(textConnection(Lines), what = , nlines = 1) f - function(x) x[which.max(colSums(DF[x]!=0))] tapply(seq(DF), names(DF), f) #added code# # compute the number of non-zeros in each column MostZeros - colSums(DF != 0) # determine which column is the maximum x.max - lapply(unique(names(DF)), function(.name){ .col - which(names(DF) == .name) # find columns of matching names .max - which.max(MostZeros[.col]) # determine max .col[.max] # return the column number of the max }) DF[unlist(x.max)] # select only the unique maximums On 7/3/06, Gabor Grothendieck [EMAIL PROTECTED] wrote: Try this: # test data # read in header separately so R does not make column names unique Lines - AAA BBB CCC DDD AAA BBB 0 2 1 2 0 0 2 3 7 6 0 1 1.54 9 9 6 0 1.06 1011 3 3 DF - read.table(textConnection(Lines), skip = 1) names(DF) - scan(textConnection(Lines), what = , nlines = 1) f - function(x) x[which.max(colSums(DF[x]!=0))] tapply(seq(DF), names(DF), f) On 7/3/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: hi everyone : suppose i have a matrix in which some column names are identical so, for example, TEMP AAA, BBB, CCC, DDD,AAA, BBB 0 2 1 2 0 0 2 3 7 6 0 1 1.54 9 9 6 0 1.06 1011 3 3 I didn't even check yet whether identical column names are allowed in a matrix but i hope they are. assuming that they are, then i would like to be able to take the matrix and make a new matrix with the following requirements. 1) whenever there is a unique column name, just take that column for the new matrix 2) whenever the column name is not unique, take the one that has the most non zero elements ? ( in the case of ties, i don't care which one is picked ). so, in this case, the resulting matrix would just be the first 4 columns. i realize ( or atleast i think ) that sum( TEMP[(TEMP[,columnname] !=0) ,columnname) will give me the number of non elements in a column with the name columnmame but how to use this deal with the non uniqueness to solve my particular problem is beyond me. plus, i think the command will bomb because columnname will not always be unique ? Thanks for any help. I realize this is not a trivial problem so I really appreciate it. Mark __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Jim Holtman Cincinnati, OH +1 513 646 9390 (Cell) +1 513 247 0281 (Home) What is the problem you are trying to solve? [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] curiosity question: new graphics vs. old graphics subsystem
Hi ivo welch wrote: I just read paul murrell's new book, R graphics. now, I have always used the traditional graphics system. apparently, the new (trellis?) system is an entirely separate graphics system. after reading the book, I cannot figure out what the intrinsic capability advantage of the old graphics system is that cannot be replicated in the trellis system. if the new system's capabilities are practically a subset of the old system, why don't we design a compatibility layer so that we can just have one graphics subsystem, instead? it seems weird that newbies learn the standard system first, and then, instead of building on it with more complex functions, are told to forget about things and start with something new. I do like the simplicity of learning of the old system, but this would be the same if it were to come through a compatibility layer, too. and then it would be easy to build learning on it. but maybe I have it all wrong. maybe there is something unique about the old system that the new system cannot do. curious: what is it? The main problem is that traditional graphics thinks everything is a plot. This is good because it makes things like this possible ... plot(1) text(1, 1, whoop-dee do) ... the important bit is that the 'text()' call has a reliable meaning: it adds text to the current plot. The downside is that if what you want to draw is not a plot, you have to fight the system to do it. The grid system thinks everything is a picture. This is good because if you want to draw something other than a plot, you can. It is bad because if I do something like ... grid.text(la-dee da) ... that is not necessarily going to be added to a plot (unless I first make sure that I am in an appropriate viewport). There are other issues such as the fact that grid is generally more memory hungry and slightly slower. A compatibility layer has been discussed and it is perhaps possible, BUT it would require reimplementing the traditional graphics system on top of grid, which would be a pretty icky task. Contributions always welcome :) I also found the naming of the new system confusing. there is trellis, there is lattice, there is grid. how exactly should the new system be called? paul calls the old system traditional. the new one seems to rear its head in different forms. Reread Section 1.2 :) There are two basic graphical systems (traditional and grid) and numerous graphical packages built on top of each. Lots of packages build on top of the traditional system. Several packages now exist on top of 'grid' (notably 'lattice', 'vcd', 'ggplot', and 'hexbin'). some other opinions (which follow the old rule that everyone has one): * if we had one graphics subsystem, paul's book, and for this matter any explanation of the R graphics system, would become more parsimonious. * R is, IMHO, the premier programmed graphics package today. I may be complaining, but I also recognize that it is great. so, please consider this to be only a suggestion. * we have a pixmap image function. we should also have a pdf includegraphics function, which can import an existing graphics image. if a device (X11) is incapable of displaying it, we should just display a rectangle of the bounding box. this would open up even more avenues to the ability of R to create graphics. There is now a 'grImport' package on CRAN for this sort of thing (also see http://www.r-project.org/useR-2006/Slides/Murrell.pdf). Paul -- Dr Paul Murrell Department of Statistics The University of Auckland Private Bag 92019 Auckland New Zealand 64 9 3737599 x85392 [EMAIL PROTECTED] http://www.stat.auckland.ac.nz/~paul/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] postscript file too large : maybe an R question
Hi [EMAIL PROTECTED] wrote: i created a postscipt file in R and then i downloaded a free version of ghostview to view it. unfortunately, i get the message fata error : dynamic memory exhausted when i try to view it. when i do a dir on windows xp, the file size is 149,034,475 and i know there about 17,000 graphs. is there a way of possibly viewing this size postscript file in R itself ? This postscript file presumably has more than one page(?). Take a look at the 'onefile' argument in '?postscript'. Paul -- Dr Paul Murrell Department of Statistics The University of Auckland Private Bag 92019 Auckland New Zealand 64 9 3737599 x85392 [EMAIL PROTECTED] http://www.stat.auckland.ac.nz/~paul/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] parameter las (function par / graphics) on right axis
Dear R users, When trying to produce a graph with a scale on the right axis, I came across a possible problem with graph making in R. When using plot or barplot, when the parameter las (used by the graphics function par) is set to 1 (axis labels always horizontal), labels on the right axis are cropped when they have more than two digits. It seems that the graphics window does not allow for the extra space needed when scale labels are horizontal and have more than two digits. Resizing the graphics window does not make the graphs to show correctly. Should outer regions of the graph, or figure regions, or figure margins be somehow redefined when las = 1? I use R 2.3.1, running under Windows XP. My monitor is a Samsung SyncMaster 794MB+, resolution is 1024 x 768 pixels. (The problem also occurs at the lower resolution 800 x 600 pixels). Some simple examples - barplot: # right scale shows correctly, labels are vertical y - c(100,100,100,150,150,150,150) barplot(y, ylim=c(0,200), axes=FALSE) box() axis(side=4) # the rightmost digit of (horizontal) scale labels is cropped y - c(100,100,100,150,150,150,150) barplot(y, ylim=c(0,200), axes=FALSE) box() axis(side=4,las=1) Some simple examples - plot: # right scale shows correctly, labels are vertical x - runif(25,0,2000) y - runif(25,0,2000) plot(x,y,axes=FALSE) box() axis(side=4) # the two rightmost digits of (horizontal) scale labels are cropped x - runif(25,0,2000) y - runif(25,0,2000) plot(x,y,axes=FALSE) box() axis(side=4,las=1) Regards, Paulo Barata --- Paulo Barata Fundacao Oswaldo Cruz Rua Leopoldo Bulhoes 1480 - 8A 21041-210 Rio de Janeiro - RJ Brasil Fax: 55-21-2232-9218 E-mail: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Problems on testing moderating effect (or interactive effect).
Hi everyone, I want to do test on moderating effect. I have three factors, A, B, and C. A has influence on B, and C moderating the influence. The relationship looks like this: A - B ^ | C A, B, and C are all scale variables. I think I can test the moderating effect by adding a interactive variable between A and C. But I'm not sure how to do. Is there a default way to do it in package sem? I'm also thinking about create a interaction variable of A and C, but I don't know how to it. A has n (n = 27) items and p (p = 288) cases and C has m (m = 16) iterms and p (p = 288) cases. Does anyone have any suggestion? Thanks in advance. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] analogue of group option of SAS MIXED/random in R
Well, just in case somebody is interested, the following R code gives the same estimates as the SAS code below: ##R code### G.Data-groupedData(Response~1|Subject,data=In.Data) G.Data$Condition-as.ordered(G.Data$Condition) G.Data$Const-rep(1,length(Variable1)) tmp-pdDiag(~Condition:Const+Condition:Variable1+Condition:Variable2-1) Fit1-lme(Response~1+Variable1+Variable2*Condition, random=tmp, method=REML,data=G.Data) SAS code proc mixed data=InData;class Subject Condition; model Response=Variable1 Variable2 Condition Variable2*Condition; random int Variable1 Variable2/subject = Subject group=Condition type=vc; run; Bruno - Original Message - From: Bruno L. Giordano [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent: Monday, July 03, 2006 5:56 PM Subject: [R] analogue of group option of SAS MIXED/random in R Dear list, I am trying to use lme to build the analogue of the following SAS MIXED random specification: random int+Variable1+Variable2 /subject = Subject group=Condition type=vc; which gives a Condition-blocked heterogeneity in the random effects variance-covariance matrix. Needless to say, I have a hard time in specifying Condition-specific heterogeneities in the variance-covariance parameters. I initially tried the following commands (without Condition-heterogeneity in the random effects): G.Data-groupedData(Response~1|Subject,data=In.Data) Fit1-lme(Response~1+Variable1+Variable2*Condition,random=pdDiag(~1+Variable1+Variable2),method=REML,data=G.Data) but have no idea about where to go from here (note that I don't want to nest Subject in Condition). Thanks!! Bruno __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html ~~ Bruno L. Giordano, PhD CIRMMT Schulich School of Music, McGill University 555 Sherbrooke Street West Montréal, QC H3A 1E3 Canada http://www.music.mcgill.ca/~bruno/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] randomization test for a two-way ANOVA
Hi, I've looked into ways of implementing this procedure, i.e. repeating the two-way ANOVA many times, scrambling the order of cases across the treatments, to produce a distribution of F ratios for each effect. This seemed a job for the 'boot' package. However, I'm not sure I'm doing an actual randomization test, as opposed to a bootstrap here. This is how I've coded the test (using the poisons data): --cut here---start- require(boot) boot.lm - function(data, i) { mod - lm(time ~ treat * poison, data=data[i, ]) anova(mod)[F value][[1]][-4] # the F ratios for each effect } poisons.boot - boot(poisons, boot.lm, R=1000, sim=permutation) --cut here---end--- Is this the right way to ask for a randomization test using 'boot'? Thanks in advance. Cheers, -- Seb __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html