[R] R CMD SHLIB help
Hello, I have been having a great deal of difficulty using R CMD SHLIB to generate the .so file from my C file. I am using Win XP and have Cygwin installed. I updated to the latest MinGW make files to get rid of some errors, but I still get a few warnings (shown below) and no shared file (.so file) is produced at the end.

C:\cygwin\home\Deepak> R CMD SHLIB ssa.h
C:/R/src/gnuwin32/MkRules:235: warning: overriding commands for target `.c.d'
C:/R/src/gnuwin32/MkRules:223: warning: ignoring old commands for target `.c.d'
C:/R/src/gnuwin32/MkRules:255: warning: overriding commands for target `.c.o'
C:/R/src/gnuwin32/MkRules:243: warning: ignoring old commands for target `.c.o'

On another Windows computer (without Cygwin, but with Rtools), I tried the same and get the error:

c:\R\Rtools\MinGW\bin\windres.exe: _res.rc:8: syntax error
make: *** [_res.o] Error 1

Additionally, there was an error saying gcc-sjlj is not found. Is this supposed to be gcc -sjlj? As a temporary workaround I made a copy of gcc.exe called gcc-sjlj.exe -- I don't know whether this will cause other problems.

Thank you very much

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] learning S4
Robin Hankin wrote: Christophe, you might find the Brobdingnag package on CRAN helpful here.

Yep, I read it and I find it very useful. A question anyway: is there a way to change a slot without using `<-`? Instead of

new(obj)
[EMAIL PROTECTED] <- 3

I would like to have

new(obj)
setSlotA(obj, 3)

In your vignette, 'Section 8: Get and Set methods', you define 4 Gets, but no Set...

Christophe
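For what it's worth, S4 does support "set" methods via replacement functions. A minimal sketch, assuming nothing about the Brobdingnag vignette (the class name `Acc` and slot `a` are made up for illustration):

```r
library(methods)

setClass("Acc", representation(a = "numeric"))

# a replacement generic plus method, so that  a(obj) <- 3  updates the slot
setGeneric("a<-", function(obj, value) standardGeneric("a<-"))
setReplaceMethod("a", "Acc", function(obj, value) {
  obj@a <- value
  obj
})

x <- new("Acc", a = 1)
a(x) <- 3
x@a
```

A plain setter such as `setSlotA(obj, 3)` is also possible as an ordinary method, but note that it must return the modified object, since R copies objects on assignment.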
[R] survey package: proportion estimate confidence intervals using svymean
Using the survey package I find it convenient and easy to get estimated proportions using svymean, along with their estimated standard errors. But is there any elegant/simple way to calculate corresponding confidence intervals for those proportions? Of course +/- 1.96 s.e. is a reasonable approximation for a 95% CI, but it (incorrectly) assumes a symmetric distribution for a proportion. About all I can think of is to transform onto some scale like the logit scale, use the delta method to estimate standard errors, and transform back for appropriate confidence intervals. Or perhaps use the machinery of the survey package's svyglm function with a logit link (family = quasibinomial) to do the calculations for me. Still, that seems rather cumbersome. Is there a better solution? Thanks for any suggestions, or for pointing out the errors in my logic!

Peter Holck
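The logit/delta-method route the poster describes takes only a few lines of base R. A hedged sketch (the function name `logit_ci` is made up; `p` and `se` stand for a proportion estimate and its standard error, e.g. as reported by svymean):

```r
# 95% CI for a proportion: delta method on the logit scale, back-transformed
logit_ci <- function(p, se, level = 0.95) {
  z  <- qnorm(1 - (1 - level) / 2)
  l  <- qlogis(p)                # log(p / (1 - p))
  sl <- se / (p * (1 - p))       # delta-method SE of the logit
  plogis(l + c(-1, 1) * z * sl)  # back-transform; asymmetric around p
}

logit_ci(0.1, 0.02)
```

Unlike p +/- 1.96 se, the resulting bounds always stay inside (0, 1), and the interval is wider on the side away from the nearest boundary.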
Re: [R] R CMD SHLIB help
On Tue, 12 Feb 2008, Deepak Chandran wrote:

Hello, I have been having a great deal of difficulty using R CMD SHLIB to generate the .so file from my C file. I am using Win XP, and I have cygwin installed.

Not surprising: there are no .so objects on Windows. Also, we don't support Cygwin, which also does not have .so objects. It is not clear if you are trying to do this with R built for Windows or for Cygwin (which is possible but unsupported).

I updated to the latest MinGW make files in order to get rid of some error. But I still get a few warnings (shown below), and no shared file (.so file) produced at the end.

C:\cygwin\home\Deepak> R CMD SHLIB ssa.h
C:/R/src/gnuwin32/MkRules:235: warning: overriding commands for target `.c.d'
C:/R/src/gnuwin32/MkRules:223: warning: ignoring old commands for target `.c.d'
C:/R/src/gnuwin32/MkRules:255: warning: overriding commands for target `.c.o'
C:/R/src/gnuwin32/MkRules:243: warning: ignoring old commands for target `.c.o'

So you have a Makefile locally, or a make that is setting .c.o etc. rules.

On another Windows computer (without Cygwin, but with Rtools), I tried the same. I get the error:

c:\R\Rtools\MinGW\bin\windres.exe: _res.rc:8: syntax error
make: *** [_res.o] Error 1

That looks like a naming problem, but we did ask for a reproducible example (see the footer of this and every R-help message, and the posting guide). Whilst reading the posting guide, please note what it says about using R-devel for programming questions unintelligible to non-programmers, and the request for 'at a minimum' sessionInfo output.

Additionally, there was an error that says gcc-sjlj is not found. Is this supposed to be gcc -sjlj? As a temporary solution I made a copy of gcc.exe called gcc-sjlj.exe -- I don't know whether this will cause other problems.

Yes, there is, and so you don't have the correct Rtools.
Thank you very much

Please DO.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
Re: [R] Finding LD50 from an interaction Generalised Linear model
Many thanks indeed Bill. I understand now what you're getting at with the a/b - 1! Many thanks again.

Graeme Duncan

Bill.Venables wrote:

The trick is to fit the model in a form which has the two separate intercepts and the two separate slopes as the parameters. You do have to realise that a*b, a*b - 1, a/b, a/b - 1, ... all specify the same model; they just use different parameters for the job. (Yes, really!)

> dat
   ldose sex numdead
1      0   M       0
2      1   M       3
3      2   M       9
4      3   M      16
5      4   M      18
6      5   M      20
7      0   F       0
8      1   F       2
9      2   F       6
10     3   F      10
11     4   F      11
12     5   F      14

dat <- transform(dat, Tot = 20)
fm <- glm(numdead/20 ~ sex/ldose - 1, binomial, dat, weights = Tot)
coef(fm)
#       sexF       sexM sexF:ldose sexM:ldose
# -2.7634338 -3.4853625  0.7793144  1.5877754

dose.p(fm, c(1, 3))  ## females
#              Dose        SE
# p = 0.5: 3.545981 0.3025148
dose.p(fm, c(2, 4))  ## males
#              Dose        SE
# p = 0.5: 2.195123 0.1790317

In fact if you look at the book from which dose.p comes, you will find an example not unlike this one done this way. At least I think that's what I, er, read.

W.

Bill Venables
CSIRO Laboratories
PO Box 120, Cleveland, 4163 AUSTRALIA
Office Phone (email preferred): +61 7 3826 7251
Fax (if absolutely necessary): +61 7 3826 7304
Mobile: +61 4 8819 4402
Home Phone: +61 7 3286 7700
mailto:[EMAIL PROTECTED]
http://www.cmis.csiro.au/bill.venables/
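Bill's point that the different parametrisations specify the same model can be checked directly. A sketch that rebuilds the data frame from the listing above: the two fits have identical deviance, and the female LD50 pops out as -intercept/slope, matching dose.p's 3.546:

```r
dat <- data.frame(
  ldose   = rep(0:5, 2),
  sex     = factor(rep(c("M", "F"), each = 6)),
  numdead = c(0, 3, 9, 16, 18, 20, 0, 2, 6, 10, 11, 14),
  Tot     = 20
)

fm1 <- glm(numdead/20 ~ sex * ldose,   binomial, dat, weights = Tot)
fm2 <- glm(numdead/20 ~ sex/ldose - 1, binomial, dat, weights = Tot)
deviance(fm1) - deviance(fm2)   # same fit, different parameters

cf <- coef(fm2)
ld50_F <- -cf["sexF"] / cf["sexF:ldose"]   # females: about 3.546
```

The second parametrisation is the convenient one here precisely because each group's intercept and slope appear as separate coefficients.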
[R] Package for sample size calculation
Dear list,

Is anyone aware of a package for sample size calculation in R, similar to NQuery? I have to give a course in this area, and I would like to enable the students to play around with this.

Best wishes,
Matthias
Re: [R] passing username and password to source()
[Ricardo Rodriguez] Your XEN ICT Team webmaster at xen.net writes:

Please, is it possible to pass a username and a password to source() to access code stored on a restricted-access web server? I know I can use commands like

source("http://mire.environmentalchange.net/~webmaster/R/3Dsurface.r")

but I am not able to find out whether it is possible to pass authorization parameters together with the link.

One way to handle this is to use Sys.setenv/Sys.getenv to pass parameters via the environment. Make sure, however, that this does not violate security, because these parameters could be read by other programs. A quick workaround would be for the receiving program to delete the environment string immediately after having read it.

Dieter
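Dieter's suggestion can be sketched in a few lines; the variable names `MY_USER`/`MY_PASS` are made up, and the caveat about other processes reading the environment still applies:

```r
# pass credentials via environment variables, wiping them once read
Sys.setenv(MY_USER = "webmaster", MY_PASS = "secret")   # hypothetical values

creds <- Sys.getenv(c("MY_USER", "MY_PASS"))
Sys.unsetenv(c("MY_USER", "MY_PASS"))   # delete immediately after reading

creds["MY_USER"]
```

The sourced script would then build its authenticated URL (or curl/wget call) from `creds` rather than from anything hard-coded in the file.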
Re: [R] stumped by eval
Berwin A Turlach wrote:

G'day Peter,

On Wed, 13 Feb 2008 08:03:07 +0100 Peter Dalgaard [EMAIL PROTECTED] wrote:

Ross Boylan wrote: In the following example, the inner evaluation pulls in the global value of subset (a function) rather than the one I thought I was passing in (a vector). Can anyone help me understand what's going on, and what I need to do to fix the problem? [...]

The point is that subset (and offset) arguments are subject to the same evaluation rules as the terms inside the formula: first look in data, then in the environment of the formula, which in this case is the global environment.

Perhaps I have a [senior|blonde] day today, but this does not seem to me to be the full explanation of what is going on. According to this explanation the following should not work:

lm(Reading ~ 0 + Spec + Reader, netto, subset = c(1))

Call:
lm(formula = Reading ~ 0 + Spec + Reader, data = netto, subset = c(1))

Coefficients:
  Spec  Reader
     1      NA

since the value passed to subset is not part of data and not in the global environment. But, obviously, it works.

It is, however, an expression that can be evaluated in the global environment, and that works.

OTOH, if we change f0 to

f0 <- function(formula, data, subset, na.action) {
  lm(formula, data, subset = subset, na.action = na.action)
}

then we get the same behaviour as with Ross's use of f1 inside of f0:

t3 <- f0(Reading ~ 0 + Spec + Reader, netto, c(1))
Error in xj[i] : invalid subscript type 'closure'

I told you it was elusive... The thing is that lm() is using nonstandard evaluation, so it sees the _symbol_ subset --- er, wait a minute, too many subsets; let us define it like this instead:

f0 <- function(formula, data, s, na.action) {
  lm(formula, data, subset = s, na.action = na.action)
}

OK, lm() sees the _symbol_ s passed as subset and then looks for it in data and then in environment(formula). It never gets the idea to look for s in the evaluation frame of f0.
One workaround is this:

f0 <- function(formula, data, s, na.action) {
  eval(bquote(lm(formula, data, subset = .(s), na.action = na.action)))
}

Another, I think better, way is

f0 <- function(formula, data, s, na.action) {
  eval(bquote(lm(formula, data, subset = .(substitute(s)), na.action = na.action)))
}

The latter also allows

f0(Reading ~ 0 + Spec + Reader, netto, Spec > 0)

Call:
lm(formula = formula, data = data, subset = Spec > 0, na.action = na.action)

Coefficients:
   Spec  Reader
-1.2976  0.7934

Moreover, with the original definition of f0:

f0 <- function(formula, data, subset, na.action) {
  f1(formula, data, subset, na.action)
}

f1(Reading ~ 0 + Spec + Reader, netto, subset = Spec == 1)
  Reading Spec Reader
1       1    1      1

f0(Reading ~ 0 + Spec + Reader, netto, subset = Spec == 1)
Error in xj[i] : invalid subscript type 'closure'

Given your explanation, I would have expected this to work.

I think the issue here is still that model.matrix ends up being called with subset = subset rather than subset = Spec == 1.

Reading up on `subset' in ?model.frame also does not seem to shed light onto what is going on. Remaining confused.

Cheers,
Berwin

--
O__ Peter Dalgaard, Øster Farimagsgade 5, Entr. B
c/ /'_ --- Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen, Denmark, Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907
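The failure and the substitute()-based fix can be seen in a fully self-contained example (the data frame and function names below are made up; only the fixed wrapper is actually run):

```r
df <- data.frame(y = 1:6, x = 1:6, g = rep(1:2, 3))

# fails: lm() looks up the symbol `s` in df, then in environment(formula),
# but never in f_bad's evaluation frame
f_bad <- function(formula, data, s) lm(formula, data, subset = s)

# works: splice the caller's subset *expression* into the call before evaluating,
# so lm() sees  subset = g == 1  rather than the symbol `s`
f_ok <- function(formula, data, s)
  eval(bquote(lm(formula, data, subset = .(substitute(s)))))

fit <- f_ok(y ~ x, df, g == 1)
coef(fit)
```

With y identical to x on the selected rows, the fitted slope is exactly 1, confirming that the subset expression was evaluated inside `df` as intended.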
Re: [R] Markov and Hidden Markov models
See CRAN:

http://cran-r.c3sl.ufpr.br/src/contrib/Descriptions/HiddenMarkov.html
http://cran-r.c3sl.ufpr.br/src/contrib/Descriptions/hmm.discnp.html

On 12/02/2008, David Kaplan [EMAIL PROTECTED] wrote:

Hi, is there a package that will estimate simple Markov models and hidden Markov models for discrete-time processes in R?

Thanks in advance,
David

--
David Kaplan, Ph.D.
Professor, Department of Educational Psychology
University of Wisconsin - Madison
Educational Sciences, Room 1061
1025 W. Johnson Street, Madison, WI 53706
email: [EMAIL PROTECTED]
homepage: http://www.education.wisc.edu/edpsych/facstaff/kaplan/kaplan.htm
Phone: 608-262-0836

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O
Re: [R] summary statistics
stephen sefick wrote: below is my data frame. I would like to compute summary statistics for mgl for each river mile (mean, median, mode). My apologies in advance -- I would like to get something like the SAS printout of PROC UNIVARIATE. I have performed an ANOVA and a Tukey LSD, and I would just like the summary statistics.

Hi Stephen, have a look at describe in the prettyR package. You can specify the summary stats that you want, and the formatting may suit you.

Jim
Re: [R] stumped by eval
G'day Peter,

On Wed, 13 Feb 2008 08:03:07 +0100 Peter Dalgaard [EMAIL PROTECTED] wrote:

Ross Boylan wrote: In the following example, the inner evaluation pulls in the global value of subset (a function) rather than the one I thought I was passing in (a vector). Can anyone help me understand what's going on, and what I need to do to fix the problem? [...]

The point is that subset (and offset) arguments are subject to the same evaluation rules as the terms inside the formula: first look in data, then in the environment of the formula, which in this case is the global environment.

Perhaps I have a [senior|blonde] day today, but this does not seem to me to be the full explanation of what is going on. According to this explanation the following should not work:

lm(Reading ~ 0 + Spec + Reader, netto, subset = c(1))

Call:
lm(formula = Reading ~ 0 + Spec + Reader, data = netto, subset = c(1))

Coefficients:
  Spec  Reader
     1      NA

since the value passed to subset is not part of data and not in the global environment. But, obviously, it works.

OTOH, if we change f0 to

f0 <- function(formula, data, subset, na.action) {
  lm(formula, data, subset = subset, na.action = na.action)
}

then we get the same behaviour as with Ross's use of f1 inside of f0:

t3 <- f0(Reading ~ 0 + Spec + Reader, netto, c(1))
Error in xj[i] : invalid subscript type 'closure'

Moreover, with the original definition of f0:

f0 <- function(formula, data, subset, na.action) {
  f1(formula, data, subset, na.action)
}

f1(Reading ~ 0 + Spec + Reader, netto, subset = Spec == 1)
  Reading Spec Reader
1       1    1      1

f0(Reading ~ 0 + Spec + Reader, netto, subset = Spec == 1)
Error in xj[i] : invalid subscript type 'closure'

Given your explanation, I would have expected this to work. Reading up on `subset' in ?model.frame also does not seem to shed light onto what is going on. Remaining confused.
Cheers,
Berwin
Re: [R] Is there a simple way to use one-way ANOVA comparing the means of groups?
See ?TukeyHSD

On 13/02/2008, Kes Knave [EMAIL PROTECTED] wrote:

Dear all, I have finally managed to get an analysis of variance table:

Response: LogHand
           Df  Sum Sq Mean Sq F value    Pr(>F)
Prey        3  5.3125  1.7708  20.672 2.066e-11 ***
Residuals 164 14.0488  0.0857

I want to compare the means of the four different groups, and see if there is a difference. I searched for help, and think that maybe Tukey (stats) would be appropriate. Is there anyone who could give advice on this?

Regards,
Kes
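TukeyHSD operates on an aov fit. A minimal runnable sketch on simulated data (the column names merely echo the table above; they are not the poster's data):

```r
set.seed(1)
d <- data.frame(Prey    = factor(rep(letters[1:4], each = 10)),
                LogHand = rnorm(40) + rep(c(0, 0, 1, 1), each = 10))

fit <- aov(LogHand ~ Prey, data = d)
TukeyHSD(fit)   # all pairwise mean differences, family-wise adjusted CIs
```

With four groups this reports all 6 pairwise comparisons, each with a difference, confidence bounds, and an adjusted p-value.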
Re: [R] Formulae for R functions
Can someone direct me to a resource or resources that list the formulae used by R functions (e.g. predict.lm) to calculate the statistics reported? I am not a programmer, and studying the R code is extremely slow going. I have searched r-project.org and all the function help files without success. For example, I have attempted to replicate by hand the se.fit calculation from an lm object produced by a call to the predict function and have not been able to reproduce the results.

The information you want is not readily available for most functions in R. The standard documentation files are, by tradition, very short synopses of the arguments. For a few of the larger packages, such as linear mixed models (lme) and the survival code, the authors have written a book: when it exists, this is your best bet for a comprehensive methods description. Many details can be found in the excellent Venables and Ripley text, which is often the first place I look.

For your question, the standard statistical formulae are quite well established; R will of course use them. If you are reduced to reading source code, make sure you fetch it from the R site, rather than printing out the R internal code, which has been stripped of all comments.

Terry Therneau
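For the se.fit example specifically, the standard formula does reproduce predict.lm: the standard error of a fitted value x0'b is sqrt(x0' V x0), with V the coefficient covariance matrix. A sketch on the built-in cars data:

```r
fit <- lm(dist ~ speed, data = cars)
new <- data.frame(speed = c(10, 20))

X0 <- model.matrix(~ speed, new)                # rows are the x0 vectors
se <- sqrt(diag(X0 %*% vcov(fit) %*% t(X0)))    # sqrt(x0' V x0) for each row

pr <- predict(fit, new, se.fit = TRUE)
all.equal(unname(se), unname(pr$se.fit))
```

The usual pitfall when doing this by hand is forgetting the intercept column in X0, or using the raw sigma^2 (X'X)^-1 with the wrong degrees of freedom; vcov() already handles both.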
Re: [R] Cox model
What you appear to want are all of the univariate models. You can get this with a loop (and patience -- it won't be fast).

ngene <- ncol(genes)
coefmat <- matrix(0., nrow = ngene, ncol = 2)
for (i in 1:ngene) {
  tempfit <- coxph(Surv(time, relapse) ~ genes[, i])
  coefmat[i, ] <- c(tempfit$coef, sqrt(tempfit$var))
}

However, the fact that R can do this for you does not mean it is a good idea. In fact, doing all of the univariate tests for a microarray has been shown by many people to be a very bad idea. There are several approaches to deal with the key issues, which you should research before going forward.

Terry Therneau
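Terry's loop, made runnable on simulated data (survival ships with R; the `genes` matrix, `time`, and `relapse` below are fabricated purely for illustration):

```r
library(survival)
set.seed(2)

n <- 50; ngene <- 5
genes   <- matrix(rnorm(n * ngene), n, ngene)
time    <- rexp(n)
relapse <- rbinom(n, 1, 0.7)

coefmat <- matrix(0, nrow = ngene, ncol = 2,
                  dimnames = list(NULL, c("coef", "se")))
for (i in 1:ngene) {
  tempfit <- coxph(Surv(time, relapse) ~ genes[, i])
  coefmat[i, ] <- c(tempfit$coef, sqrt(tempfit$var))
}
coefmat
```

With one covariate per model, tempfit$var is a 1x1 matrix, so sqrt() of it gives the standard error directly.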
[R] dimnames
Hi, I used the write.table function to save data in a txt file, and this is the output:

       V1 V2 V3 V4
1 YAL005C 21 14 11
2 YAL007C  2  1  4
3 YAL012W  8 16  3
4 YAL016W 24 23 23
5 YAL019W  3  3  2
6 YAL020C  2  4  2
7 YAL021C  7  5  5
8 YAL022C  3  1  2

but I need to remove the dimnames (first column). I tried to use the dimnames function to remove them and then save, but still the output is the same. These are the command lines:

XX  # matrix
dimnames(XX) <- NULL
write.table(XX, "XX.txt", quote = FALSE, sep = "\t")

Thanks in advance,
Roberto
Re: [R] dimnames
Try this:

write.table(XX, "XX.txt", quote = FALSE, sep = "\t", row.names = FALSE)

On 13/02/2008, Roberto Olivares Hernandez [EMAIL PROTECTED] wrote:

Hi, I used the write.table function to save data in a txt file [...] but I need to remove the dimnames (first column). I tried to use the dimnames function to remove them and then save, but still the output is the same.
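A quick check of the difference, on a made-up matrix and a temp file (the leading "column" in Roberto's output is really the row-name column, which row.names = FALSE suppresses; col.names = FALSE likewise drops the V1 V2 ... header):

```r
XX <- matrix(1:6, nrow = 2, dimnames = list(c("r1", "r2"), NULL))
f  <- tempfile()

# row.names = FALSE drops the leading label column,
# col.names = FALSE drops the header line
write.table(XX, f, quote = FALSE, sep = "\t",
            row.names = FALSE, col.names = FALSE)
readLines(f)
```

Note that `dimnames(XX) <- NULL` alone does not help, because write.table then invents default row and column names for the output.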
Re: [R] Cox model
Hmm... I see. I think I will give the univariate analysis a try nonetheless... I intend to collect the p-values for each gene and select the most significant from these... I have seen it done in several papers.

Best Regards,
Eleni

On Feb 13, 2008 2:59 PM, Terry Therneau [EMAIL PROTECTED] wrote:

What you appear to want are all of the univariate models. You can get this with a loop (and patience -- it won't be fast).

ngene <- ncol(genes)
coefmat <- matrix(0., nrow = ngene, ncol = 2)
for (i in 1:ngene) {
  tempfit <- coxph(Surv(time, relapse) ~ genes[, i])
  coefmat[i, ] <- c(tempfit$coef, sqrt(tempfit$var))
}

However, the fact that R can do this for you does not mean it is a good idea. In fact, doing all of the univariate tests for a microarray has been shown by many people to be a very bad idea. There are several approaches to deal with the key issues, which you should research before going forward.

Terry Therneau
Re: [R] dimnames
What if you just removed the first column from your matrix:

XX <- XX[, 2:length(XX[1, ])]

so you have a new matrix without the first column, and then save this second one to a file?

Regards,
Eleni

On Feb 13, 2008 3:06 PM, Roberto Olivares Hernandez [EMAIL PROTECTED] wrote:

Hi, I used the write.table function to save data in a txt file [...] but I need to remove the dimnames (first column). I tried to use the dimnames function to remove them and then save, but still the output is the same.
Re: [R] Cox model
Hi Eleni,

The problem with this approach is easily explained: under the null hypothesis, the P values of a significance test are random variables, uniformly distributed in the interval [0, 1]. It is easily seen that the lowest of these P values is not any 'better' than the highest of the P values.

Best wishes,
Matthias

Eleni Christodoulou schrieb:

Hmm... I see. I think I will give the univariate analysis a try nonetheless... I intend to collect the p-values for each gene and select the most significant from these... I have seen it done in several papers. [...]
Re: [R] Cox model
On Feb 13, 2008 2:37 PM, Matthias Gondan [EMAIL PROTECTED] wrote:

Hi Eleni, the problem with this approach is easily explained: under the null hypothesis, the P values of a significance test are random variables, uniformly distributed in the interval [0, 1]. It is easily seen that the lowest of these P values is not any 'better' than the highest of the P values.

Correct me if I'm wrong, but isn't that the point? I assume that the hypothesis is that one or more of these genes are true predictors, i.e. for these genes the p-value should be significant. For all the other genes, the p-value is uniformly distributed. Using a significance level of 0.01, and the a priori knowledge that there are significant genes, you will end up with on the order of 20 genes, some of which are the true predictors, and the rest being false positives. This set of 20 genes can then be further analysed. A much smaller and easier problem to solve, no?

/Gustaf
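Both sides of this exchange can be seen in a small simulation with base R's p.adjust; the counts depend on the random seed, so they are indicative only:

```r
set.seed(3)
p <- runif(20000)            # 20000 all-null p-values, uniform on [0, 1]

hits_raw <- sum(p < 0.01)    # on the order of 200 false positives by chance
hits_bh  <- sum(p.adjust(p, method = "BH") < 0.05)   # FDR-adjusted: typically 0

c(raw = hits_raw, BH = hits_bh)
```

This is Matthias's point in numbers: with no true effects, a raw 1% threshold still flags hundreds of genes, whereas a false-discovery-rate adjustment flags (almost) none; any genuine signal then has to rise above that multiplicity-adjusted bar.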
Re: [R] Cox model
On Feb 13, 2008 3:06 PM, Gustaf Rydevik [EMAIL PROTECTED] wrote: [...]

Sorry, it should say 200 genes instead of 20.

--
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address: Essingetorget 40, 112 66 Stockholm, SE
skype: gustaf_rydevik
[R] writing a simple function
Dear list,

Any suggestion on how to write a function that returns TRUE if the interval [a,b] overlaps the interval [c,d]? I've tried it, but an error occurs saying 'could not find function v'; in addition, warning messages occur. Below is my code. Please help me sort out this problem, as I'm in the process of learning to write a function. Many thanks.

overlap <- function(m, n) {
  t = length(m)
  v = length(n)
  tt = logical(t)
  tv = logical(v)
  for (i in 1:m) {
    for (i in 1:n) {
      if (v(i, j) >= t(i, j)) {
        tt = T
        tv = T
      }
    }
  }
  k = c(tt, tv)
  return(k)
}

Regards,
Anisah
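The 'could not find function v' error comes from `v(i, j)`: round brackets call `v` as a function, whereas indexing needs square brackets, `v[i]`. For the question as stated, the test needs no loops at all; a hedged one-line sketch (function name made up):

```r
# [a, b] overlaps [c, d] exactly when each interval starts before the other ends
overlap <- function(a, b, c, d) a <= d & c <= b

overlap(1, 5, 4, 8)   # TRUE  (they share [4, 5])
overlap(1, 2, 3, 4)   # FALSE
```

Because `&` is vectorized, the same function also works element-wise on vectors of interval endpoints.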
Re: [R] summary statistics
Jim,

prettyR looks like it will work, but with the way my data frame is set up I still cannot get what I want out of it. I am lacking in my knowledge of manipulating data frames and of general R programming. Is there a reference that will give me all of these wonderful data-manipulation tools that previous posters in this thread have cited, e.g.

tapply(x$mgl, x$RM, summary)

I read the help file for tapply and I still don't entirely understand it. I am a biologist trying to learn R because it serves my purposes; I am not a programmer. However, I see the utility enough to be right stubborn when it comes to learning this. I am fully aware that I am not competent in programming, but I do know the theory behind the analyses that I am performing, which, granted, are not all that sophisticated (mostly descriptive statistics, community data ordination, and time series analysis). My main complaint is that I don't know which way to put a data frame into R to let it proceed with the analysis. I preprocess most of my data in Excel -- a convention imposed by the kind of data and by the knowledge of the rest of the folks in my lab. Is there a succinct reference for programming in R (S+) on how to rearrange data once inside R, and maybe a general guide to matrix/data frame setup and what shape the data should take for different kinds of analysis? I am learning, and most of the time I can generate graphs etc. faster than in Excel, but sometimes I spend hours trying to get the data in the right format and then three minutes on the actual coding and result generation. Maybe I'm out of my league, but I am not content with giving my data over to somebody else to do the analysis.

Thanks everybody for the continuing help,

Stephen

On Feb 13, 2008 5:03 AM, Jim Lemon [EMAIL PROTECTED] wrote:

stephen sefick wrote: below is my data frame. I would like to compute summary statistics for mgl for each river mile (mean, median, mode). My apologies in advance -- I would like to get something like the SAS printout of PROC UNIVARIATE. I have performed an ANOVA and a Tukey LSD, and I would just like the summary statistics.

Hi Stephen, have a look at describe in the prettyR package. You can specify the summary stats that you want, and the formatting may suit you.

Jim

--
Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis
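A toy version of the tapply call discussed in this thread (the column names RM and mgl are assumed from the thread; the values are made up), together with aggregate, which returns a tidier data frame when one statistic per group is enough:

```r
x <- data.frame(RM  = rep(c(10, 20), each = 5),   # river mile
                mgl = c(1:5, 6:10))

tapply(x$mgl, x$RM, summary)                # full five-number summary per RM
aggregate(mgl ~ RM, data = x, FUN = mean)   # one statistic per group
```

tapply splits its first argument by the grouping factor and applies the function to each piece; that "split, apply to each piece" pattern is the key idea behind most of R's data-summarising tools.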
Re: [R] Package for sample size calculation
Thanks a lot! I now have three packages: asypow, pwr, and http://www.cs.uiowa.edu/~rlenth/Power/ This should solve most of the problems we will encounter in this introductory course. Best wishes, Matthias Mitchell Maltenfort schrieb: the library is called pwr and is based on the Cohen book. On Feb 13, 2008 4:52 AM, Matthias Gondan [EMAIL PROTECTED] wrote: Dear list, Is anyone aware of a library for sample size calculation in R, similar to NQuery? I have to give a course in this area, and I would like to let the students play around with this. Best wishes, Matthias
Re: [R] Is there a simple way to use one-way ANOVA comparing the means of groups?
On 13/02/2008, Kes Knave kestrel78 at gmail.com wrote: I have finally managed to get an Analysis of Variance Table: [snip] I want to compare the means of the four different groups, and see if there is a difference. I searched for help, and think that maybe Tukey (stats) would be proper..? Is there anyone that could give advice on this? I would strongly recommend getting a copy of Peter Dalgaard's book, Introductory Statistics with R ... Ben Bolker
Re: [R] Is there a simple way to use one-way ANOVA comparing the means of groups?
see also the 'multcomp' and 'multcompView' packages. Henrique Dallazuanna wrote: See ?TukeyHSD On 13/02/2008, Kes Knave [EMAIL PROTECTED] wrote: Dear all, I have finally managed to get an Analysis of Variance Table:

Response: LogHand
           Df  Sum Sq Mean Sq F value    Pr(>F)
Prey        3  5.3125  1.7708  20.672 2.066e-11 ***
Residuals 164 14.0488  0.0857

I want to compare the means of the four different groups, and see if there is a difference. I searched for help, and think that maybe Tukey (stats) would be proper..? Is there anyone that could give advice on this? Regards Kes
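To make the ?TukeyHSD suggestion concrete, here is a sketch of the usual workflow on simulated data. The column names Prey and LogHand mirror the ones in Kes's table; the data, group means, and sample sizes are invented.

```r
set.seed(1)
# Invented data: four prey groups with different mean log handling times.
d <- data.frame(
  Prey    = factor(rep(c("A", "B", "C", "D"), each = 10)),
  LogHand = rnorm(40, mean = rep(1:4, each = 10), sd = 0.3)
)

fit <- aov(LogHand ~ Prey, data = d)  # one-way ANOVA (TukeyHSD needs an aov fit)
summary(fit)                          # the ANOVA table
TukeyHSD(fit)                         # all pairwise differences of group means,
                                      # with family-wise adjusted confidence intervals
```

plot(TukeyHSD(fit)) gives the corresponding confidence-interval plot, which makes it easy to see which pairs of groups differ.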
[R] Newbie HLM with lme4 questions
Dear R listers, I know I'm breaking the rules by asking a homework-related question- I hope you'll forgive me. I am a social psychology graduate student, and the only one in my department who uses R. I successfully completed my multiple regression and structural equation modeling courses using R (John Fox's car and sem packages were a big help, as was his book). Now I am taking a hierarchical linear modeling course, and am hoping that I can use R for this also. I've searched the list archives, and consulted the on-line version of Pinheiro and Bates (available through my library), but I'm having a great deal of difficulty translating what I'm learning in class into lmer syntax. Specifically, the instructor is using HLM 6.0. In this program, one specifies the level-one and level-two models explicitly, and I'm having trouble understanding what the equivalent of this is in lmer. Most of the examples we cover in class are change models, i.e., we are working with longitudinal data. Specific questions: in HLM 6.0, we build the following model:

Level-1 Model
Y = P0 + P1*(CONFLICT) + P2*(TIMEYRS) + E

Level-2 Model
P0 = B00 + B01*(H0MCITOT) + R0
P1 = B10 + B11*(H0MCITOT) + R1
P2 = B20 + B21*(H0MCITOT) + R2

Can someone explain to me how to represent this in lmer syntax? I've tried e.g.

lmer(MAT ~ 1 + CONFLICT + TIMEYRS + (1 + CONFLICT + TIMEYRS | H0MCITOT))

But I don't get the same result. More generally: Should I be using the lme4 package, the nlme package, or something else entirely? Is there any documentation for the lme4 package that is geared more towards novice users? Sorry for the long post, and thanks in advance for any help you can offer. --Ista
Re: [R] writing a simple function
Your code looks like matlab, but I don't see how it would work even there. The specific problem is the to refer to an array, you do so with square brackets [..] and not round ones, (though your v is not an array, anyway, so your troubles are just starting). Let's tackle what I think may be your problem directly. You say you want to know if the interval (a,b) overlaps (c,d) or not. OK, suppose a, b, c and d are four vectors defining intervals (a[i],b[i]) and (c[i],d[i]), and you want a vector of logicals as the result. Here is a suggestion. overlap - function(a, b, c, d) !((pmax(a, b) pmin(c, d)) | (pmax(c, d) pmin(a, b))) Here is a little demo set.seed(2008) w - runif(10) x - runif(10) y - runif(10) z - runif(10) o - overlap(w, x, y, z) data.frame(a=w, b=x, c=y, d=z, o) a b c d o 1 0.5711326 0.32063318 0.02002140 0.3581223 TRUE 2 0.6422710 0.09871573 0.02017792 0.6659485 TRUE 3 0.4645480 0.38348609 0.57137732 0.5682764 FALSE 4 0.8695450 0.99505570 0.78598674 0.8700771 TRUE 5 0.6338629 0.62817830 0.03717119 0.6008403 FALSE 6 0.3897689 0.34575269 0.70860446 0.7215733 FALSE 7 0.5236651 0.91862887 0.27005734 0.6000776 TRUE 8 0.7730075 0.02215747 0.76042526 0.9626952 TRUE 9 0.6267855 0.64686826 0.81222122 0.6340692 TRUE 10 0.1752022 0.66873049 0.58638984 0.1629897 TRUE Bill Venables CSIRO Laboratories PO Box 120, Cleveland, 4163 AUSTRALIA Office Phone (email preferred): +61 7 3826 7251 Fax (if absolutely necessary): +61 7 3826 7304 Mobile: +61 4 8819 4402 Home Phone: +61 7 3286 7700 mailto:[EMAIL PROTECTED] http://www.cmis.csiro.au/bill.venables/ -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of mohamed nur anisah Sent: Thursday, 14 February 2008 12:22 AM To: r-help@r-project.org Subject: [R] writing a simple function Dear lists, any suggestion on how to write a function to return a TRUE if interval [a,b] overlaps the interval [c,d]. 
I've tried it but an error occur saying that 'could not find function v ; in addition warning message occur'. Below is my codes.Please help me sort this problem as i'm in process of learning of writing a function. Many thanks overlap-function(m,n){ t=length(m) v=length(n) tt=logical(t) tv=logical(v) for(i in 1:m){ for(i in 1:n){ if(v(i,j)=t(i,j)){ tt=T tv=T } } } k=c(tt,tv) return(k) } Regards, Anisah - [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] writing a simple function
Hi, mohamed nur anisah wrote: Dear lists, any suggestion on how to write a function to return a TRUE if interval [a,b] overlaps the interval [c,d]. I've tried it but an error occur saying that 'could not find function v ; in addition warning message occur'. Below is my codes.Please help me sort this problem as i'm in process of learning of writing a function. Many thanks does this do what you want? overlap - function(a,b,c,d) { all(c:d %in% a:b) } overlap(1,5,3,4) overlap(1,2,3,4) Best, Roland overlap-function(m,n){ t=length(m) v=length(n) tt=logical(t) tv=logical(v) for(i in 1:m){ for(i in 1:n){ if(v(i,j)=t(i,j)){ tt=T tv=T } } } k=c(tt,tv) return(k) } Regards, Anisah - [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] [R-pkgs] FinTS_0.2-7
Hi, All: FinTS version 0.2-7 is now available on CRAN. This version adds two new functions: * ArchTest to compute the Engle (1982) Lagrange multiplier test for conditional heteroscedasticity, discussed on pp. 101-102 of Tsay, with examples on those pages worked in the R script in ~R\library\FinTS\scripts\ch03.R, where ~R is your local R installation directory. The code for this was kindly provided by Bernhard Pfaff. * Acf and plot.Acf to plot the autocorrelation function without the noninformative unit spike at lag 0, facilitating the production of many ACF plots in Tsay (2005) Analysis of Financial Time Series, 2nd ed. (Wiley) that follow this secondary standard. Spencer Graves ___ R-packages mailing list [EMAIL PROTECTED] https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Newbie HLM with lme4 questions
Hi, I suggest you take a look at the mixed model materials in Zelig, which has both a structural-form and a reduced-form model specification for multilevel models. The structural-form specification is what you need. Shige On Feb 13, 2008 11:53 PM, Ista Zahn [EMAIL PROTECTED] wrote: Dear R listers, I know I'm breaking the rules by asking a homework-related question- I hope you'll forgive me. [snip] Can someone explain to me how to represent this in lmer syntax? [snip] Sorry for the long post, and thanks in advance for any help you can offer. --Ista
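One possible translation of Ista's HLM specification, offered as a sketch rather than a definitive answer: H0MCITOT is a level-2 covariate, so in lmer it belongs in the fixed effects (interacting with the level-1 predictors), not after the `|` as a grouping factor. The grouping factor should be the level-2 unit identifier, here assumed to be a column called id, which does not appear in the original post; the data below are simulated stand-ins.

```r
library(lme4)

set.seed(101)
# Simulated stand-in data: 30 subjects ('id' is an assumed identifier),
# 4 occasions each; H0MCITOT is constant within subject (a level-2 covariate).
d <- data.frame(id = rep(1:30, each = 4), TIMEYRS = rep(0:3, 30))
d$CONFLICT <- rnorm(120)
d$H0MCITOT <- rep(rnorm(30), each = 4)
d$MAT      <- 10 + 0.5 * d$TIMEYRS + 0.3 * d$CONFLICT + rnorm(120)

# Level-2 equations substituted into the level-1 equation give cross-level
# interactions (B01, B11, B21) plus random intercept and slopes (R0, R1, R2):
fit <- lmer(MAT ~ (CONFLICT + TIMEYRS) * H0MCITOT +
              (1 + CONFLICT + TIMEYRS | id), data = d)
```

With this small a simulated data set lmer may warn about convergence or a singular fit; the point is only the formula structure, where the H0MCITOT main effect and interactions correspond to B01, B11, B21.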
Re: [R] Is there a simple way to use one-way ANOVA comparing the means of groups?
Ben Bolker wrote: On 13/02/2008, Kes Knave kestrel78 at gmail.com wrote: I have finally managed to get an Analysis of Variance Table: [snip] I want to compare the means of the four different groups, and see if there is a difference. I searched for help, and think that maybe Tukey (stats) would be proper..? Is there anyone that could give advice on this? I would strongly recommend getting a copy of Peter Dalgaard's book, Introductory Statistics with R ... Thanks, Ben... The book is a bit simplistic on this particular issue, though, so I'd also recommend looking into the multcomp package. -p -- Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen, Øster Farimagsgade 5, Entr. B, PO Box 2099, 1014 Cph. K, Denmark. Ph: (+45) 35327918, FAX: (+45) 35327907
[R] Modelling disease spread
I was at a lecture the other day and I saw a presentation of a very neat (short) animation modeling epidemic disease spread over a map region. When I asked what software they used, they mentioned SAS. Do you know if there are equivalent resources in R to model the spread of disease with animation output? My search in R-help and Google didn't lead to any document (though I found a couple of documents for SAS). However, my intuition tells me there must be a similar program in R. Thanks
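As a starting point before reaching for a dedicated package, a simple deterministic SIR (susceptible-infected-recovered) model can be iterated and plotted in a few lines of base R. This is a sketch with invented parameter values, not the SAS approach from the lecture.

```r
# Minimal discrete-time SIR sketch (parameter values are invented).
sir <- function(beta = 0.3, gamma = 0.1, S0 = 0.99, I0 = 0.01, steps = 200) {
  S <- I <- R <- numeric(steps)
  S[1] <- S0; I[1] <- I0; R[1] <- 0
  for (t in 2:steps) {
    new_inf <- beta * S[t - 1] * I[t - 1]   # new infections this step
    new_rec <- gamma * I[t - 1]             # new recoveries this step
    S[t] <- S[t - 1] - new_inf
    I[t] <- I[t - 1] + new_inf - new_rec
    R[t] <- R[t - 1] + new_rec
  }
  data.frame(t = seq_len(steps), S = S, I = I, R = R)
}

out <- sir()
matplot(out$t, out[, c("S", "I", "R")], type = "l", lty = 1,
        xlab = "time", ylab = "proportion")
```

Redrawing a map or grid once per time step (with Sys.sleep() between plots) gives a crude animation; spatial spread over a region would need a lattice or network extension of the same idea.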
Re: [R] User defined split function in Rpart
Direction corresponds to goodness: for the split represented by goodness[i], direction[i] = -1 means that values less than the split at goodness[i] will go left, greater than will go right. If direction[i] = 1 then they will be sent to opposite sides. The long and short of it is that, for most trees, we want to send splits smaller than the split value left, and greater than right, so direction should be -1 for all values, i.e., direction = rep(-1, length(goodness)). The vector is only needed if you want to customize the structure of your tree. Hope that helps, Sam On Jan 3, 2007 12:56 PM, Paolo Radaelli [EMAIL PROTECTED] wrote: Dear all, I'm trying to work with a user-defined split function in rpart (file rpart\tests\usersplits.R in http://cran.r-project.org/src/contrib/rpart_3.1-34.tar.gz - see bottom of the email). Suppose we have the following data frame (note that x's values are already sorted):

D
    y     x
1   7 0.428
2   3 0.876
3   1 1.467
4   6 1.492
5   3 1.703
6   4 2.406
7   8 2.628
8   6 2.879
9   5 3.025
10  3 3.494
11  2 3.496
12  6 4.623
13  4 4.824
14  6 4.847
15  2 6.234
16  7 7.041
17  2 8.600
18  4 9.225
19  5 9.381
20  8 9.986

Running rpart with minbucket=1 and maxdepth=1 we get the following tree (which uses, by default, deviance):

rpart(D$y ~ D$x, control = rpart.control(minbucket = 1, maxdepth = 1))
n= 20
node), split, n, deviance, yval
      * denotes terminal node
1) root 20 84.8 4.60
  2) D$x < 9.6835 19 72.63158 4.421053 *
  3) D$x >= 9.6835 1 0.0 8.00 *

This means that the first 19 observations have been sent to the left side of the tree and one observation to the right. This is correct when we observe goodness (the maximum is the last element of the vector). The thing I really don't understand is the direction vector.

# direction = -1 = send y < cutpoint to the left side of the tree
#              1 = send y < cutpoint to the right

What does it mean? In the example considered here we have

sign(lmean)
 [1]  1  1 -1 -1 -1 -1 -1  1  1  1 -1 -1 -1 -1 -1 -1 -1 -1 -1

What is the criterion used? 
In my opinion we should have all the values equal to -1, given that they have to be sent to the left side of the tree. Can someone help me? Thank you

###
# The split function, where most of the work occurs.
# Called once per split variable per node.
# If continuous=T (the case considered here)
#   The actual x variable is ordered
#   y is supplied in the sort order of x, with no missings,
#   return two vectors of length (n-1):
#   goodness  = goodness of the split, larger numbers are better.
#               0 = couldn't find any worthwhile split
#               the ith value of goodness evaluates splitting obs 1:i vs (i+1):n
#   direction = -1 = send y < cutpoint to the left side of the tree
#                1 = send y < cutpoint to the right
#               this is not a big deal, but making larger mean y's move towards
#               the right of the tree, as we do here, seems to make it easier to
#               read
# If continuous=F, x is a set of integers defining the groups for an
#   unordered predictor. In this case:
#   direction = a vector of length m = # groups. It asserts that the
#               best split can be found by lining the groups up in this order
#               and going from left to right, so that only m-1 splits need to
#               be evaluated rather than 2^(m-1)
#   goodness  = m-1 values, as before.
#
# The reason for returning a vector of goodness is that the C routine
# enforces the minbucket constraint. It selects the best return value
# that is not too close to an edge. 
The vector wt of weights in our case is:

wt
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

temp2 <- function(y, wt, x, parms, continuous)
{
  # Center y
  n <- length(y)
  y <- y - sum(y*wt)/sum(wt)

  if (continuous) {
    # continuous x variable
    temp     <- cumsum(y*wt)[-n]
    left.wt  <- cumsum(wt)[-n]
    right.wt <- sum(wt) - left.wt
    lmean    <- temp/left.wt
    rmean    <- -temp/right.wt
    goodness <- (left.wt*lmean^2 + right.wt*rmean^2)/sum(wt*y^2)
    list(goodness = goodness, direction = sign(lmean))
  }
}

Paolo Radaelli Dipartimento di Metodi Quantitativi per le Scienze Economiche ed Aziendali Facoltà di Economia Università degli Studi di Milano-Bicocca P.zza dell'Ateneo Nuovo, 1 20126 Milano Italy e-mail [EMAIL PROTECTED]
Re: [R] survey package: proportion estimate confidence intervals using svymean
On Tue, 12 Feb 2008, Peter Holck wrote: Using the survey package I find it is convenient and easy to get estimated proportions using svymean, and their corresponding estimated standard errors. But is there any elegant/simple way to calculate corresponding confidence intervals for those proportions? Of course +/- 1.96 s.e. is a reasonable approximation for a 95% CI, but (incorrectly) assumes a symmetrical distribution for a proportion. 'Incorrectly' is a bit strong, I think. About all I can think of is to transform onto some scale like the logit scale, use the delta method to estimate standard errors, and transform back for appropriate confidence intervals. Or perhaps use the machinery of the survey package's svyglm function with logit link (family=quasibinomial) to do the calculations for me. Still, seems rather cumbersome. Is there a better solution? I think the most straightforward solution would be svyglm(). You could also use svycontrast() to do the delta-method calculations for you. -thomas Thomas Lumley, Assoc. Professor, Biostatistics, University of Washington, Seattle
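A sketch of the svyglm() route Thomas suggests, using the api example data and the one-stage cluster design that ship with the survey package (the design specification below follows the package's own examples; whether it matches Peter's actual design is of course an assumption):

```r
library(survey)
data(api)

# One-stage cluster sample design from the survey package's api examples:
dclus1 <- svydesign(id = ~dnum, weights = ~pw, fpc = ~fpc, data = apiclus1)

# Intercept-only logistic model: the intercept is the logit of the proportion
# of schools meeting the school-wide growth target.
fit <- svyglm(I(sch.wide == "Yes") ~ 1, design = dclus1,
              family = quasibinomial)

# Confidence interval on the logit scale, back-transformed to a proportion,
# so the interval is asymmetric and stays inside (0, 1):
plogis(confint(fit))
```

The back-transformed interval respects the [0, 1] range, which is the advantage over the symmetric +/- 1.96 s.e. interval around svymean's estimate.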
Re: [R] Cox model
On 2/13/2008 9:08 AM, Gustaf Rydevik wrote: On Feb 13, 2008 3:06 PM, Gustaf Rydevik [EMAIL PROTECTED] wrote: On Feb 13, 2008 2:37 PM, Matthias Gondan [EMAIL PROTECTED] wrote: Hi Eleni, The problem with this approach is easily explained: under the null hypothesis, the P values of a significance test are random variables, uniformly distributed in the interval [0, 1]. It is easily seen that the lowest of these P values is not any 'better' than the highest of the P values. Best wishes, Matthias Correct me if I'm wrong, but isn't that the point? I assume that the hypothesis is that one or more of these genes are true predictors, i.e. for these genes the p-value should be significant. For all the other genes, the p-value is uniformly distributed. Using a significance level of 0.01, and a priori knowledge that there are significant genes, you will end up with on the order of 20 genes, some of which are the true predictors, and the rest being false positives. This set of 20 genes can then be further analysed. A much smaller and easier problem to solve, no? /Gustaf Sorry, it should say 200 genes instead of 20. I agree with your general point, but want to make one small quibble: the choice of 0.01 as a cutoff depends pretty strongly on the distribution of the p-value under the alternative. With a small sample size and/or a small effect size, that may miss the majority of the true predictors. You may need it to be 0.1 or higher to catch most of them, and then you'll have 10 times as many false positives to wade through (but still 10 times fewer than you started with, so your main point still holds). Duncan Murdoch
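Matthias's point about the null distribution is easy to see by simulation. This sketch only mimics the gene-screening setting with pure noise (all nulls true), so every gene passing the cutoff is a false positive; the gene count and sample sizes are invented.

```r
set.seed(42)
n_genes <- 2000

# p-values from two-sample t-tests on pure noise: 2000 true null hypotheses.
p <- replicate(n_genes, t.test(rnorm(10), rnorm(10))$p.value)

hist(p, breaks = 40)   # roughly flat: uniform on [0, 1] under the null
sum(p < 0.01)          # roughly 2000 * 0.01 = 20 "significant" genes by chance
```

This matches the arithmetic in the thread: a 0.01 cutoff over thousands of null genes yields on the order of 1% of them as false positives, alongside whatever true predictors exist.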
Re: [R] writing a simple function
Roland Rau roland.rproject at gmail.com writes: does this do what you want?

overlap <- function(a,b,c,d) {
  all(c:d %in% a:b)
}
overlap(1,5,3,4)
overlap(1,2,3,4)

Do you really want this to be discrete? How about

overlap <- function(a,b,c,d) { a < c & b > d }

(although this assumes that a < b and c < d ...)
Re: [R] rolling sum (like in Rmetrics package)
See ?filter, e.g.

output <- filter(x, rep(1, 20))

HTH, Chuck On Wed, 13 Feb 2008, joshv wrote: Hello, I'm new to R and would like to know how to create a vector of rolling sums. (I have seen the Rmetrics package and the rollMean function and I would like to do the same thing except Sum instead of Mean.) I imagine someone has done this, I just can't find it anywhere. Example:

x <- somevector   # where x is 'n' entries long
# what I would like to do is:
x1 <- x[1:20]
output1 <- sum(x1)
x2 <- x[2:21]
output2 <- sum(x2)
x3 <- ...
output <- c(output1, output2, ...)

Thanks, JV Charles C. Berry, Dept of Family/Preventive Medicine, UC San Diego, La Jolla, CA 92093-0901, (858) 534-2098, http://famprevmed.ucsd.edu/faculty/cberry/
Re: [R] rolling sum (like in Rmetrics package)
Hello, rollapply in zoo can probably help you:

x <- zoo(1:10)
rollapply(x, 4, sum)
 2  3  4  5  6  7  8 
10 14 18 22 26 30 34 
sum(x[1:4])
[1] 10
sum(x[2:5])
[1] 14

Cheers, Romain -- Mango Solutions, data analysis that delivers. Introduction to R training course :: London :: 06-07 March 2008 http://www.mango-solutions.com/services/rtraining/r_intro.html joshv wrote: [snip]
Re: [R] rolling sum (like in Rmetrics package)
Have you tried 'filter'?

x <- 1:20
filter(x, filter = rep(1, 5))
Time Series:
Start = 1
End = 20
Frequency = 1
 [1] NA NA 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 NA NA

On 2/13/08, joshv [EMAIL PROTECTED] wrote: [snip] -- Jim Holtman, Cincinnati, OH, +1 513 646 9390. What is the problem you are trying to solve?
[R] model construction
I buy flowers at a local market on a fairly regular basis. The flower vendors post their prices, and if I want to buy only one or two flowers I will generally get the posted price. From time to time I want to buy large quantities of flowers, and sometimes a vendor will give me a better price than their posted price for the bulk order, but more often I have to offer them a higher price than the posted price to get my desired quantity. I have collected the outcomes of several thousand visits to the flower market and I want to analyze whether there is any relationship between the amount of flowers I am buying and the 'average' increment above the posted price that I end up needing to pay. Moreover, I am interested in the right-hand side of this relationship, since tomorrow, being Valentine's Day, I am contemplating purchasing a very large number of flowers. So, if

amt    = a vector of quantities of flowers bought on various days
deltaP = a vector of the differences between the purchase price and the posted price on those days

two simple models might be:

mottle1 = lm(deltaP ~ amt)

or

mottle2 = lm(deltaP ~ amt - 1)

But I have the urge to set the model up as follows:

mottle3 = lm(deltaP ~ amt - 1, weights = amt)

because I want the big purchases to weigh much more in the calculation of the slope than the small purchases, but I have an uneasy feeling that this amounts to double dipping/counting. Can anyone explain to me if/why this is a bad idea? Thanks, Dan
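For comparison, the three candidate models can be fit side by side on simulated data. This is a sketch with invented data and effect sizes; whether weighting by amt is statistically justified is exactly Dan's question, and it turns on the error variance structure rather than on wanting large purchases to count more.

```r
set.seed(7)
# Invented data: purchase quantities and price increments with a linear trend.
amt    <- rpois(200, 20) + 1
deltaP <- 0.05 * amt + rnorm(200, sd = 0.5)

m1 <- lm(deltaP ~ amt)                     # slope and intercept
m2 <- lm(deltaP ~ amt - 1)                 # through the origin
m3 <- lm(deltaP ~ amt - 1, weights = amt)  # weight each visit by quantity

# Compare the estimated slopes from the three specifications:
c(m1 = coef(m1)[["amt"]], m2 = coef(m2)[[1]], m3 = coef(m3)[[1]])
```

In weighted least squares the weights should be proportional to 1/Var(error), so weights = amt is appropriate if the variance of deltaP shrinks like 1/amt; plotting residuals of m2 against amt is one way to check whether that assumption is plausible before trusting m3.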
Re: [R] rolling sum (like in Rmetrics package)
If there are no missing values then a rolling sum is just n * rolling mean, so you can use your rolling mean. Also see: http://tolstoy.newcastle.edu.au/R/help/04/10/5161.html In the zoo package see rollapply and rollmean. In the caTools package see runmean and runsum.exact. runmean is particularly fast. On Feb 13, 2008 11:25 AM, joshv [EMAIL PROTECTED] wrote: [snip]
Re: [R] stumped by eval
On Wed, 2008-02-13 at 11:18 +0100, Peter Dalgaard wrote: Berwin A Turlach wrote: G'day Peter, On Wed, 13 Feb 2008 08:03:07 +0100 Peter Dalgaard [EMAIL PROTECTED] wrote: Ross Boylan wrote: In the following example, the inner evaluation pulls in the global value of subset (a function) rather than the one I thought I was passing in (a vector). Can anyone help me understand what's going on, and what I need to do to fix the problem? [...] The point is that subset (and offset) arguments are subject to the same evaluation rules as the terms inside the formula: first look in data, then in the environment of the formula, which in this case is the global environment. Oh my! Thank you; in several hours of staring/debugging I didn't get that, and I'm not sure how many more it would have taken. Is there any way I could have discovered that through debugging? The fact that eval was a primitive made it hard to see what it was doing, although I guess the relevant behavior was in model.frame. I thought at one point that it mattered that subset was not a named argument of model.frame, although it is an argument for the default model.frame method. Does that make any difference? Also, when I return to my original problem in which the subset argument (to f0) is missing, will I encounter more difficulties? Maybe the safest thing to do would be to test for missing at the top level and eliminate the corresponding entries in the call. Finally, the sample code I gave was a distillation of some code that started with the code of lm. Ross I told you it was elusive... The thing is that lm() is using nonstandard evaluation, so it sees the _symbol_ subset -- er, wait a minute, too many subsets, let us define it like this instead:

f0 <- function(formula, data, s, na.action) {
  lm(formula, data, subset = s, na.action = na.action)
}

OK, lm() sees the _symbol_ s passed as subset and then looks for it in data and then in environment(formula). 
It never gets the idea to look for s in the evaluation frame of f0. One workaround is this: f0 <- function(formula, data, s, na.action) { eval(bquote(lm(formula, data, subset=.(s), na.action=na.action))) } Another, I think better, way is f0 <- function(formula, data, s, na.action) { eval(bquote(lm(formula, data, subset=.(substitute(s)), na.action=na.action))) } The latter also allows f0(Reading~0+Spec+Reader, netto, Spec > 0) Call: lm(formula = formula, data = data, subset = Spec > 0, na.action = na.action) Coefficients: Spec Reader -1.2976 0.7934 Moreover, with the original definition of f0: f0 <- function(formula, data, subset, na.action) { f1(formula, data, subset, na.action) } (f1(Reading~0+Spec+Reader, netto, subset= Spec==1 )) Reading Spec Reader 1 11 1 f0(Reading~0+Spec+Reader, netto, subset= Spec==1 ) Error in xj[i] : invalid subscript type 'closure' Given your explanation, I would have expected this to work. I think the issue here is still that model.matrix ends up being called with subset=subset rather than subset= Spec==1. Reading up on `subset' in ?model.frame also does not seem to shed light onto what is going on. Remaining confused. Cheers, Berwin
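The substitute()/bquote() trick above can be tried out without fitting a model, using subset() — another nonstandard-evaluation function — as a stand-in (a small sketch; the data frame and function name are made up):

```r
# Splice the caller's *expression* into the call before evaluating,
# so the nonstandard-evaluation callee never needs to find `cond` by name.
df <- data.frame(y = 1:4, g = c("a", "a", "b", "b"))

f_spliced <- function(d, cond) {
  eval(bquote(subset(d, .(substitute(cond)))))
}

f_spliced(df, g == "a")  # the two rows where g is "a"
```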
Re: [R] gWidgets process management
Thanks for that link to the Mac Gtk2, it's been very helpful. To work around my original problem I've decided to just have the GUI be a separate function that returns the parameter values entered. Then I'll call that function from within a non-GUI function -- so have the window close and then the process start, all in a while loop. This won't allow for a button to stop the simulation, but it will do. Thanks for all your help everybody. Peter On Feb 12, 2008, at 1:56 PM, Michael Lawrence wrote: On Feb 12, 2008 1:51 PM, Peter McMahan [EMAIL PROTECTED] wrote: Thanks, that's very helpful. Unfortunately Gtk2 is difficult to get running on a Mac, so I've been trying the gWidgetstcltk interface. It sounds like the behavior you're describing is exactly what I want, so it may just be a difference in the RGtk2 and tcltk event loops? In your example, can you think of a way to have a cancel button that would be able to kill reallySlowFunction() early? For the time being, I guess I'll just work on getting the gtk28 package from MacPorts working... There are GTK+ binaries available that should work with the RGtk2 CRAN binary for the Mac. http://r.research.att.com/gtk2-runtime.dmg Thanks, Peter On Feb 12, 2008, at 1:31 PM, John Verzani wrote: Dear Peter, I think this issue has more to do with the event loop than gWidgets. I've cc'ed Michael Lawrence, who may be able to shed more light on this. Perhaps gIdleAdd from RGtk2 can work around this, but I didn't get anywhere. My understanding is that the event loop is preventing the console from being interactive, but not GTK events. So for instance, the GUI in the following example is responsive during the execution, but the command line is not. library(gWidgets); options(guiToolkit="RGtk2"); reallySlowFunction <- function(n=20) { for(i in 1:n) { cat("z"); Sys.sleep(1) }; cat("\n") }; w <- gwindow("test"); g <- ggroup(cont=w, horizontal=FALSE); b <- gbutton("click me", cont=g, handler = function(h,...) 
reallySlowFunction()); r <- gradio(1:3, cont=g, handler = function(h,...) print(svalue(h$obj))) ## you can click the radio button and get a response, but not the console --John Hello, I'm trying to make a graphical interface for an R function I've written. A common use for the function is to call it with specific parameters, and then watch the output as it evolves. There's not necessarily a logical stopping point, so I usually use Ctrl-C when I'm done to stop it. I've made a gWidgets interface to the function that gets some user info and then on a button click calls the function. The gWidgets window, however, seems to be frozen as long as the function is running, and the function output seems to be happening in the background in my R session, so Ctrl-C (Esc on the Mac GUI) does not work to stop it. I have to kill R entirely to stop the process. So after all that setup here is my question: Is there some way to have a gWidgets window interrupt a function that it has called via a widget handler? Thanks, Peter -- John Verzani CUNY/CSI Department of Mathematics [EMAIL PROTECTED]
Re: [R] merging more than 2 data frames
I have created a merged data frame and in turn merged that with another data frame. I could see how it could become a pain if you had lots of data frames. I would try running a for loop... but have never done it. joseph [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Hi, merge() takes only 2 data frames. What can you do to make it take more than two data frames? Or is there another function that does that? Thanks joseph
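Rather than a hand-written for loop, the pairwise merges can be folded over a list with Reduce() (a minimal sketch with made-up data, assuming all frames share a key column, here "id"):

```r
# Fold merge() over a list of data frames sharing the key column "id".
d1 <- data.frame(id = 1:3, x = c(10, 20, 30))
d2 <- data.frame(id = 1:3, y = c("a", "b", "c"))
d3 <- data.frame(id = 1:3, z = c(TRUE, FALSE, TRUE))

merged <- Reduce(function(a, b) merge(a, b, by = "id"), list(d1, d2, d3))
merged  # one data frame with columns id, x, y, z
```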
Re: [R] Newbie HLM with lme4 questions
On Feb 13, 2008 9:53 AM, Ista Zahn [EMAIL PROTECTED] wrote: Dear R listers, I know I'm breaking the rules by asking a homework related question -- I hope you'll forgive me. I am a social psychology graduate student, and the only one in my department who uses R. I successfully completed my multiple regression and structural equation modeling courses using R (John Fox's car and sem packages were a big help, as was his book). Now I am taking a hierarchical linear modeling course, and am hoping that I can use R for this also. I think that learning to use R for a course that is based on other software doesn't fall within the no homework questions rule. I've searched the list archives, and consulted the one-line version of Pinheiro and Bates (available through my library), but I'm having a great deal of difficulty translating what I'm learning in class into lmer syntax. Wow! A one-line version! I'd like to read that. Must be a really long line. Specifically, the instructor is using HLM 6.0. In this program, one specifies the level one and level two models explicitly, and I'm having trouble understanding what the equivalent of this is in lmer. Most of the examples we cover in class are change models, i.e., we are working with longitudinal data. Specific questions: in HLM 6.0, we build the following model: Y = P0 + P1*(CONFLICT) + P2*(TIMEYRS) + E Level-2 Model: P0 = B00 + B01*(H0MCITOT) + R0 P1 = B10 + B11*(H0MCITOT) + R1 P2 = B20 + B21*(H0MCITOT) + R2 What determines the level 2 groups? I don't see it in the specification, although I will admit that I have difficulty reading those specifications. To me they confound fixed effects and random effects and end up expressing fixed-effects interactions in obscure and confusing ways. The way I have to understand this is to substitute the P0, P1 and P2 into the first equation, expand the terms, then collect the fixed effects and the random effects. 
That would give me Y = B00 + B01*(H0MCITOT) + B10*(CONFLICT) + B11*(CONFLICT)*(H0MCITOT) + B20*(TIMEYRS) + B21*(TIMEYRS)*(H0MCITOT) + R0 + R1*(CONFLICT) + R2*(TIMEYRS) + E To me the specification is not complete because I don't know what groups the random effects R0, R1 and R2 are associated with. Also, are R0, R1 and R2 allowed to be correlated within groups or are they required to be independent? Let's say that GRP is the variable that defines the groups that share a common level of the random effects. Then the uncorrelated random effects model is written Y ~ CONFLICT*H0MCITOT + TIMEYRS*H0MCITOT + (1|GRP) + (CONFLICT-1|GRP) + (TIMEYRS-1|GRP) and the model with correlated random effects is Y ~ CONFLICT*H0MCITOT + TIMEYRS*H0MCITOT + (1+CONFLICT+TIMEYRS|GRP) Can someone explain to me how to represent this in lmer syntax? I've tried e.g., lmer(MAT ~ 1 + CONFLICT + TIMEYRS + (1 + CONFLICT + TIMEYRS | H0MCITOT)) But I don't get the same result. More generally: Should I be using the lme4 package, the nlme package, or something else entirely? Is there any documentation for the lme4 package that is geared more towards novice users? To me the answer to the first question is to use lme4, except that the answer to the second question is that there isn't documentation geared to novice users. To a large degree that is my fault because I keep changing the package so that previously written introductory documentation is no longer applicable. The lmer function should be faster and more reliable than lme. It allows for fitting generalized linear mixed models and for fitting models with crossed random effects (as in having random effects for subject and for item) or partially crossed random effects (e.g. longitudinal data indexed by students and teachers within schools where students get exposed to different teachers and may move between schools). 
It can handle very large data sets - I showed an example of the analysis of 1.6 million grade point scores for 55,000 students, 8,000 instructors and 100 departments in a message to the MULTILEVEL listserv. Sorry for the long post, and thanks in advance for any help you can offer. --Ista
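For readers new to lmer, the two random-effects shapes described above can be tried on lme4's bundled sleepstudy data (a standard example, not the poster's data; assumes lme4 is installed):

```r
library(lme4)

# Correlated random intercept and slope within Subject --
# the same "(1 + x1 + x2 | GRP)" shape as the model above.
fm_corr <- lmer(Reaction ~ Days + (1 + Days | Subject), sleepstudy)

# Uncorrelated version, analogous to "(1|GRP) + (x - 1|GRP)".
fm_uncorr <- lmer(Reaction ~ Days + (1 | Subject) + (Days - 1 | Subject),
                  sleepstudy)

summary(fm_corr)
```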
Re: [R] writing a simple function
Hi, Ben Bolker wrote: Roland Rau roland.rproject at gmail.com writes: does this do what you want? overlap <- function(a,b,c,d) { all(c:d %in% a:b) } overlap(1,5,3,4) overlap(1,2,3,4) Do you really want this to be discrete? How about overlap <- function(a,b,c,d) { a < c & b > d } you are absolutely right.[1] I assume with discrete you mean integers? I think the bigger problem with my function is that it makes far more comparisons than are actually necessary (memory and time problems). What about the following function: overlap <- function(intval1, intval2) { (min(intval1) < min(intval2)) & (max(intval1) > max(intval2)) } Best, Roland [1] I realized the problems with my solution almost as soon as I sent the email. But I was on my way to a meeting and there was no more time to correct it at that moment.
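A runnable version of the continuous interval test (strict inequalities assumed; switch to <=/>= if endpoint-touching should count as containment):

```r
# Does interval 1 strictly contain interval 2?  Each interval is a
# numeric vector whose min/max give its endpoints.
contains <- function(intval1, intval2) {
  (min(intval1) < min(intval2)) & (max(intval1) > max(intval2))
}

contains(c(1, 5), c(3, 4))  # TRUE
contains(c(1, 2), c(3, 4))  # FALSE
```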
Re: [R] Modelling disease spread
The simecol package is maybe what you want: http://hhbio.wasser.tu-dresden.de/projects/simecol/ Another possibility is to use a program I've written. Here is a flash presentation that may also be interesting for you: http://www.uni-bielefeld.de/biologie/Oekosystembiologie/bio7app/flashtut/animaterplot.htm With kind regards Marcel -- View this message in context: http://www.nabble.com/Modelling-disease-spread-tp15459834p15460869.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] model construction
Weighting is entirely OK, if you think that the large transactions are ones for which the observed price is more accurate. (This sounds unlikely to me, but it might be possible; the seller might devote more careful calculation to larger sales.) Weights in lm() should be chosen as w_i = 1/v_i, where v_i is the conditional variance of the ith observation. You can of course check to see whether variability of prices is increasing in quantity -- this would be prudent. Roger Koenker, Department of Economics, University of Illinois, Champaign, IL 61820 -- url: www.econ.uiuc.edu/~roger, email: [EMAIL PROTECTED], vox: 217-333-4558, fax: 217-244-6678 On Feb 13, 2008, at 10:57 AM, Daniel Dunn wrote: I buy flowers at a local market on a fairly regular basis. The flower vendors post their prices and if I want to buy only one or two flowers I will generally get the posted price. From time to time I want to buy large quantities of flowers, and sometimes a vendor will give me a better price than their posted price for the bulk order, but more often I have to offer them a higher price than the posted price to get my desired quantity. I have collected the outcome of several thousand visits to the flower market and I want to analyze whether there is any relationship between the amount of flowers I am buying and the 'average' increment above the posted price that I end up needing to pay. Moreover, I am interested in the right hand side of this relationship since tomorrow, being Valentine's Day, I am contemplating purchasing a very large number of flowers. 
So, if amt = a vector of quantities of flowers bought on various days, and deltaP = a vector of the differences between the purchase price and the posted price on those days, two simple models might be: mottle1 = lm( deltaP ~ amt ) or mottle2 = lm( deltaP ~ amt - 1 ) But, I have the urge to set the model up as follows: mottle3 = lm( deltaP ~ amt - 1, weights = amt ) because I want the big purchases to weigh much more in the calculation of the slope than the small purchases, but I have an uneasy feeling that this amounts to double-dipping/counting. Can anyone explain to me if/why this is a bad idea? Thanks, Dan
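The w_i = 1/v_i advice can be sketched with simulated data (all numbers hypothetical): if the variance of the price increment shrinks in proportion to 1/amt, then weights = amt is exactly the 1/v_i weighting.

```r
# Simulated data where var(deltaP | amt) = 1/amt, so weights = amt.
set.seed(42)
amt    <- runif(200, 1, 100)
deltaP <- 0.05 * amt + rnorm(200, sd = sqrt(1 / amt))

fit <- lm(deltaP ~ amt - 1, weights = amt)  # weights = 1/v_i
coef(fit)  # close to the true slope 0.05
```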
[R] Generalized nonlinear mixed model function?
I am wondering if there is an R function that could estimate a generalized nonlinear mixed model. From my reading it seems that nlme from the nlme package can fit nonlinear mixed models, while lmer from the lme4 package can fit generalized linear mixed models. One alternative I've found is gnlmix from the repeated package, although this only allows for a single random effect. Is there anything else out there that I have missed? Thanks, Phil Phillip van Mantgem USGS Western Ecological Research Center Sequoia and Kings Canyon Field Station 47050 Generals Highway #4 Three Rivers, CA 93271-9651 USA --- The motivating problem is estimating average trends in forest mortality rates with unequally spaced census intervals. The census interval has an exponential effect on survival (i.e., p = annual survival, year.interval = census interval, and p^year.interval; annual mortality = 1 - p), leading me to use a nonlinear model. Our data are composed of counts of live and dead trees, so I'll need to use a binomial or Poisson model. Our data look like the following... # plot identifier (random effect) # eventually I'll need to add another random term for species identity plot <- rep(c("A", "B", "C"), each = 3) # census identifier census <- rep(1:3, 3) # year of census year <- c(1960, 1989, 2004, 1960, 1989, 2004, 1955, 1989, 2004) # interval between census years year.interval <- c(NA, 29, 15, NA, 29, 15, NA, 34, 15) # count of live trees n.live <- c(1509, 1249, 1106, 1986, 1616, 1383, 3529, 2831, 2511) # count of dead trees n.dead <- c(NA, 260, 143, NA, 370, 233, NA, 698, 320) forest.mort <- data.frame(plot, census, year, year.interval, n.live, n.dead)
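The p^year.interval relation above can be illustrated with a back-of-the-envelope check on plot A's second census (arithmetic only, not a model fit):

```r
# Annualizing survival over an unequal census interval, using plot A:
# 1509 trees alive at the 1960 census, 260 dead by the 1989 census.
n.start  <- 1509
n.dead2  <- 260
interval <- 29

surv.total  <- (n.start - n.dead2) / n.start  # 29-year survival
p.annual    <- surv.total^(1 / interval)      # annual survival p
mort.annual <- 1 - p.annual                   # annual mortality, ~0.0065
```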
Re: [R] matching last argument in function
Add envir = parent.frame() to the do.call: with.options <- function(...) { L <- as.list(match.call())[-1]; len <- length(L); old.options <- do.call(options, L[-len], envir = parent.frame()); on.exit(options(old.options)); invisible(eval.parent(L[[len]])) } # test with.options(width = 40, print(1:25)) test <- function(w) with.options(width = w, print(1:25)) test(40) On Feb 13, 2008 1:29 PM, Alistair Gee [EMAIL PROTECTED] wrote: Hi Gabor, That almost works ... but it fails when I nest with.options() within another function: with.options <- function(...) { L <- as.list(match.call())[-1]; len <- length(L); old.options <- do.call(options, L[-len]); on.exit(options(old.options)); invisible(eval.parent(L[[len]])) } with.width <- function(w) with.options(width=w, print(1:25)) with.width(10) Error in function (...) : object 'w' not found Enter a frame number, or 0 to exit 1: with.width(10) 2: with.options(width = w, print(1:25)) 3: do.call(options, L[-len]) 4: function (...) I tried, unsuccessfully, to fix the problem by using eval.parent() around do.call() and around L[-len]. This problem does not occur if I use my original implementation: with.options <- function(..., expr) { options0 <- options(...); tryCatch(expr, finally=options(options0)) } I realize that I can use my original implementation in this particular case, but I'd like to have a single implementation that works correctly, while not requiring explicitly naming the expr argument. TIA On Feb 12, 2008 12:43 PM, Gabor Grothendieck [EMAIL PROTECTED] wrote: Try this: with.options <- function(...) 
{ L <- as.list(match.call())[-1]; len <- length(L); old.options <- do.call(options, L[-len]); on.exit(options(old.options)); invisible(eval.parent(L[[len]])) } with.options(width = 40, print(1:25)) [1] 1 2 3 4 5 6 7 8 9 10 11 12 [13] 13 14 15 16 17 18 19 20 21 22 23 24 [25] 25 with.options(width = 80, print(1:25)) [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 On Feb 12, 2008 12:45 PM, Alistair Gee [EMAIL PROTECTED] wrote: I often want to temporarily modify the options() options, e.g. a <- seq(1001, 1001 + 10) # some wide object with.options <- function(..., expr) { options0 <- options(...); tryCatch(expr, finally=options(options0)) } Then I can use: with.options(width=160, expr = print(a)) But I'd like to avoid explicitly naming the expr argument, as in: with.options(width=160, print(a)) How can I do this with R's argument matching? (I prefer the expr as the last argument since it could be a long code block. Also, I'd like with.options to take multiple options.) TIA
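The reason envir = parent.frame() fixes the nested case: the list built by match.call() holds the quoted symbol w, and do.call() by default evaluates the constructed call where do.call() itself runs, where no w exists. A toy demonstration of the mechanism (function names made up, no options() involved):

```r
# The quoted symbol `w` captured by match.call() must be evaluated in
# the *caller's* frame, not in the frame where do.call() runs.
identity_list <- function(...) list(...)

inner_fn <- function(...) {
  L <- as.list(match.call())[-1]
  # Without envir = parent.frame(), `w` would be looked up here, where
  # it does not exist, reproducing the "object 'w' not found" error.
  do.call(identity_list, L, envir = parent.frame())
}

outer_fn <- function(w) inner_fn(width = w)
outer_fn(40)$width  # 40
```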
Re: [R] matching last argument in function
Hi Gabor, That almost works ... but it fails when I nest with.options() within another function: with.options <- function(...) { L <- as.list(match.call())[-1]; len <- length(L); old.options <- do.call(options, L[-len]); on.exit(options(old.options)); invisible(eval.parent(L[[len]])) } with.width <- function(w) with.options(width=w, print(1:25)) with.width(10) Error in function (...) : object 'w' not found Enter a frame number, or 0 to exit 1: with.width(10) 2: with.options(width = w, print(1:25)) 3: do.call(options, L[-len]) 4: function (...) I tried, unsuccessfully, to fix the problem by using eval.parent() around do.call() and around L[-len]. This problem does not occur if I use my original implementation: with.options <- function(..., expr) { options0 <- options(...); tryCatch(expr, finally=options(options0)) } I realize that I can use my original implementation in this particular case, but I'd like to have a single implementation that works correctly, while not requiring explicitly naming the expr argument. TIA On Feb 12, 2008 12:43 PM, Gabor Grothendieck [EMAIL PROTECTED] wrote: Try this: with.options <- function(...) { L <- as.list(match.call())[-1]; len <- length(L); old.options <- do.call(options, L[-len]); on.exit(options(old.options)); invisible(eval.parent(L[[len]])) } with.options(width = 40, print(1:25)) [1] 1 2 3 4 5 6 7 8 9 10 11 12 [13] 13 14 15 16 17 18 19 20 21 22 23 24 [25] 25 with.options(width = 80, print(1:25)) [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 On Feb 12, 2008 12:45 PM, Alistair Gee [EMAIL PROTECTED] wrote: I often want to temporarily modify the options() options, e.g. a <- seq(1001, 1001 + 10) # some wide object with.options <- function(..., expr) { options0 <- options(...); tryCatch(expr, finally=options(options0)) } Then I can use: with.options(width=160, expr = print(a)) But I'd like to avoid explicitly naming the expr argument, as in: with.options(width=160, print(a)) How can I do this with R's argument matching? 
(I prefer the expr as the last argument since it could be a long code block. Also, I'd like with.options to take multiple options.) TIA
[R] Does goodfit() require frequency count of the numbers or numbers themselves?
Hello, I am a student and for a project I need R. I have one problem regarding the function goodfit(). Does goodfit() require frequency counts of the numbers or the numbers themselves? For example, suppose I have data, say 150 readings. Do I need to use goodfit() directly on the data, or should I make a suitable number of bins and then apply goodfit()? Aswad
Re: [R] gWidgets process management
One more note on stopping at user click. The following is a bit of a hack but it seems to work. (Note: In the slow function I used for(i in 1:500){} instead of Sys.sleep(1) to better simulate a real slow R-only function): reallySlowFunction <- function(n=10){ for(i in 1:n){ cat("z"); while(gtkEventsPending()){ gtkMainIteration(); if(svalue(cbutton)=="stopping..."){ cat("interrupted"); svalue(cbutton) <- "stop simulation"; return() } }; for(i in 1:500){} }; cat("\n") } w <- gwindow("test"); g <- ggroup(cont=w, horizontal=FALSE); b <- gbutton("click me", cont=g, handler = function(h,...) reallySlowFunction()); cbutton <- gbutton("stop simulation", cont=g, handler=function(h,...){ svalue(h$obj) <- "stopping..." }); r <- gradio(1:3, cont=g, handler = function(h,...) print(svalue(h$obj))) Not very robust but a useful workaround. (I was hoping to avoid using RGtk2-specific code in reallySlowFunction, but I can add an extra argument to conditionally check if it's called in interactive mode.) Thanks again to everybody for their help. Peter On Feb 13, 2008, at 11:18 AM, Peter McMahan wrote: Thanks for that link to the Mac Gtk2, it's been very helpful. To work around my original problem I've decided to just have the GUI be a separate function that returns the parameter values entered. Then I'll call that function from within a non-GUI function -- so have the window close and then the process start, all in a while loop. This won't allow for a button to stop the simulation, but it will do. Thanks for all your help everybody. Peter On Feb 12, 2008, at 1:56 PM, Michael Lawrence wrote: On Feb 12, 2008 1:51 PM, Peter McMahan [EMAIL PROTECTED] wrote: Thanks, that's very helpful. Unfortunately Gtk2 is difficult to get running on a Mac, so I've been trying the gWidgetstcltk interface. It sounds like the behavior you're describing is exactly what I want, so it may just be a difference in the RGtk2 and tcltk event loops? 
In your example, can you think of a way to have a cancel button that would be able to kill reallySlowFunction() early? For the time being, I guess I'll just work on getting the gtk28 package from MacPorts working... There are GTK+ binaries available that should work with the RGtk2 CRAN binary for the Mac. http://r.research.att.com/gtk2-runtime.dmg Thanks, Peter On Feb 12, 2008, at 1:31 PM, John Verzani wrote: Dear Peter, I think this issue has more to do with the event loop than gWidgets. I've cc'ed Michael Lawrence, who may be able to shed more light on this. Perhaps gIdleAdd from RGtk2 can work around this, but I didn't get anywhere. My understanding is that the event loop is preventing the console from being interactive, but not GTK events. So for instance, the GUI in the following example is responsive during the execution, but the command line is not. library(gWidgets); options(guiToolkit="RGtk2"); reallySlowFunction <- function(n=20) { for(i in 1:n) { cat("z"); Sys.sleep(1) }; cat("\n") }; w <- gwindow("test"); g <- ggroup(cont=w, horizontal=FALSE); b <- gbutton("click me", cont=g, handler = function(h,...) reallySlowFunction()); r <- gradio(1:3, cont=g, handler = function(h,...) print(svalue(h$obj))) ## you can click the radio button and get a response, but not the console --John Hello, I'm trying to make a graphical interface for an R function I've written. A common use for the function is to call it with specific parameters, and then watch the output as it evolves. There's not necessarily a logical stopping point, so I usually use Ctrl-C when I'm done to stop it. I've made a gWidgets interface to the function that gets some user info and then on a button click calls the function. The gWidgets window, however, seems to be frozen as long as the function is running, and the function output seems to be happening in the background in my R session, so Ctrl-C (Esc on the Mac GUI) does not work to stop it. I have to kill R entirely to stop the process. 
So after all that setup here is my question: Is there some way to have a gWidgets window interrupt a function that it has called via a widget handler? Thanks, Peter -- John Verzani CUNY/CSI Department of Mathematics [EMAIL PROTECTED]
[R] How to handle Which on two matrices that do not have same number of rows
R-newbie question: I have 2 matrices: (a) P1 has only one column of 32K rows; (b) PC has 2 columns {P, C} of 3200 rows. Every value in P1 matches a value in PC[,p] (column p). I would like to use which() to search for all values in P1 that match PC[,p] and get the corresponding PC[,c]. However, because P1 and PC do not have the same number of rows, I got lots of NA. Thanks for your help. Example: P1 <- {'p001', 'p001', 'p002', 'p010', ...} PC <- { c('p001','class a'), c('p002', 'class b'), ..., c('p010', 'class 10') } Result <- { c('p001', 'class a'), c('p001', 'class a'), c('p002', 'class b'), c('p010', 'class 10'), ... } My D. Coyne
[R] Generalized nonlinear mixed model function?
(Sorry for the re-posting. My first attempt failed.) I am wondering if there is an R function that could estimate a generalized nonlinear mixed model. From my reading it seems that nlme from the nlme package can fit nonlinear mixed models, while lmer from the lme4 package can fit generalized linear mixed models. One alternative I've found is gnlmix from the repeated package, although this only allows for a single random effect. Is there anything else out there that I have missed? Thanks, Phil Phillip van Mantgem USGS Western Ecological Research Center Sequoia and Kings Canyon Field Station 47050 Generals Highway #4 Three Rivers, CA 93271-9651 USA --- The motivating problem is estimating average trends in forest mortality rates with unequally spaced census intervals. The census interval has an exponential effect on survival (i.e., p = annual survival, year.interval = census interval, and p^year.interval; annual mortality = 1 - p), leading me to use a nonlinear model. Our data are composed of counts of live and dead trees, so I'll need to use a binomial or Poisson model. Our data look like the following... # plot identifier (random effect) # eventually I'll need to add another random term for species identity plot <- rep(c("A", "B", "C"), each = 3) # census identifier census <- rep(1:3, 3) # year of census year <- c(1960, 1989, 2004, 1960, 1989, 2004, 1955, 1989, 2004) # interval between census years year.interval <- c(NA, 29, 15, NA, 29, 15, NA, 34, 15) # count of live trees n.live <- c(1509, 1249, 1106, 1986, 1616, 1383, 3529, 2831, 2511) # count of dead trees n.dead <- c(NA, 260, 143, NA, 370, 233, NA, 698, 320) forest.mort <- data.frame(plot, census, year, year.interval, n.live, n.dead)
Re: [R] How to handle Which on two matrices that do not have same number of rows
Try this: merge(data.frame(P1), data.frame(PC), by.x=1, by.y=1) On 13/02/2008, My Coyne [EMAIL PROTECTED] wrote: R-newbie question: I have 2 matrices: (a) P1 has only one column of 32K rows; (b) PC has 2 columns {P, C} of 3200 rows. Every value in P1 matches a value in PC[,p] (column p). I would like to use which() to search for all values in P1 that match PC[,p] and get the corresponding PC[,c]. However, because P1 and PC do not have the same number of rows, I got lots of NA. Thanks for your help. Example: P1 <- {'p001', 'p001', 'p002', 'p010', ...} PC <- { c('p001','class a'), c('p002', 'class b'), ..., c('p010', 'class 10') } Result <- { c('p001', 'class a'), c('p001', 'class a'), c('p002', 'class b'), c('p010', 'class 10'), ... } My D. Coyne -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O
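A match()-based lookup is another option when the order and repeats of P1 must be preserved in the result (a small sketch using the example values from the question):

```r
# Find each P1 value in PC's key column and pull the class from column 2.
P1 <- c("p001", "p001", "p002", "p010")
PC <- cbind(c("p001", "p002", "p010"),
            c("class a", "class b", "class 10"))

classes <- PC[match(P1, PC[, 1]), 2]
Result  <- cbind(P1, classes)
classes  # "class a" "class a" "class b" "class 10"
```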
Re: [R] Does goodfit() require frequency count of the numbers or numbers themselves?
On Thu, 14 Feb 2008, Aswad Gurjar wrote: Hello, I am a student and for a project I need R. I have one problem regarding the function goodfit(). ...from package vcd. Does goodfit() require the frequency count of the numbers or the numbers themselves? For example, suppose I have data, say 150 readings. Do I need to use goodfit() directly on the data, or should I make a suitable number of bins and then apply goodfit()? Let me read the help page to you. ?goodfit says x: either a vector of counts, a 1-way table of frequencies of counts or a data frame or matrix with frequencies in the first column and the corresponding counts in the second column. The examples on the same man page illustrate various usages of goodfit(). Let me also read the footer of this e-mail to you: PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. best, Z Aswad [[alternative HTML version deleted]]
Re: [R] Survival (KM/Cox) plots with lattice?
On 2/13/08, Dieter Menne [EMAIL PROTECTED] wrote: Dear gRaphers, I am looking for a lattice-panel for survival (KM/Cox) plots. I know it's not standard, but maybe someone has already tried? There are some half-formed ideas in http://dsarkar.fhcrc.org/talks/extendingLattice.pdf but nothing packaged (mostly because I'm not all that familiar with survival analysis). If you have some well-defined use cases, I would be happy to collaborate on something generally useful. -Deepayan
Re: [R] How to handle Which on two matrices that do not have same number of rows
How about this? PC[na.omit(match(P1[, 1], PC[, 1])), ] HTH, DYH -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of My Coyne Sent: February 13, 2008 2:07 PM To: [EMAIL PROTECTED] Subject: [R] How to handle Which on two matrices that do not have same number of rows R-newbie question. I have 2 matrices: (a) P1 has only one column of 32K rows; (b) PC has 2 columns {P, C} of 3200 rows. Every value in P1 matches a value in PC[, "P"] (column P). I would like to use which() to search for all values in P1 that match PC[, "P"] and get the corresponding PC[, "C"]. However, because P1 and PC do not have the same number of rows, I got lots of NA. Thanks for your help. Example:

P1 <- c('p001', 'p001', 'p002', 'p010', ...)
PC <- rbind(c('p001', 'class a'), c('p002', 'class b'), ..., c('p010', 'class 10'))
Result: rbind(c('p001', 'class a'), c('p001', 'class a'), c('p002', 'class b'), c('p010', 'class 10'), ...)

My D. Coyne [[alternative HTML version deleted]]
[R] use of poly()
Hi, I am curious about how to interpret the results of a polynomial regression -- using poly(raw=TRUE) vs. poly(raw=FALSE).

set.seed(123456)
x <- rnorm(100)
y <- jitter(1*x + 2*x^2 + 3*x^3, 250)
plot(y ~ x)
l.poly <- lm(y ~ poly(x, 3))
l.poly.raw <- lm(y ~ poly(x, 3, raw = TRUE))
s <- seq(-3, 3, by = 0.1)
lines(s, predict(l.poly, data.frame(x = s)), col = 1)
lines(s, predict(l.poly.raw, data.frame(x = s)), col = 2)

The results are the same, but the regression coefficients are different:

as.vector(coef(l.poly))
1.806618 88.078858 16.194423 58.051642
as.vector(coef(l.poly.raw))
-0.1025114 1.5265248 2.0617970 2.7393995

When using continuous data in both Y and X, does the difference between raw and orthogonal polynomials have any practical meaning? Thanks, Dylan -- Dylan Beaudette Soil Resource Laboratory http://casoilresource.lawr.ucdavis.edu/ University of California at Davis 530.754.7341
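A quick check (re-running the non-graphical part of the code above) confirms that the two parameterizations describe the same fitted curve, even though the coefficients differ:

```r
set.seed(123456)
x <- rnorm(100)
y <- jitter(1*x + 2*x^2 + 3*x^3, 250)
l.poly     <- lm(y ~ poly(x, 3))              # orthogonal basis
l.poly.raw <- lm(y ~ poly(x, 3, raw = TRUE))  # raw basis
# Same fitted values, different coefficient parameterization:
all.equal(fitted(l.poly), fitted(l.poly.raw))
```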
[R] do.call() browser() hang with large data.frame
I have a problem similar to this: http://tolstoy.newcastle.edu.au/R/help/06/02/21987.html If I try to debug (via browser()) and the call stack has a do.call() call with a large data frame, R hangs for a very long time. If, instead, I replace the do.call() call with a direct call, there is no hang. What can I do? I am using R 2.6.2.patched built from SVN today on Linux AMD64. Here is my sample code:

# options(deparse.max.lines=10)
### Create a big data frame with 1 million rows and 100 columns.
### This do.call() is NOT the problem.
a <- do.call(data.frame, sapply(sprintf("col%d", 1:100), function(x) rnorm(1e6), simplify = FALSE, USE.NAMES = TRUE))
### function I want to debug via browser()
foo <- function(aa, bb) {
  browser()  # I want to stop here to debug.
}
### This goes into the browser() immediately:
foo(aa = a, bb = 2)
### This hangs for a very long time:
do.call(foo, list(aa = a, bb = 2))
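One possible workaround (my own suggestion, not from the thread) is to keep the huge object out of the constructed call, either by limiting deparsing via the commented-out option above, or by passing the object as a symbol so there is nothing large to deparse:

```r
# Sketch with a small stand-in data frame so the example runs quickly.
options(deparse.max.lines = 10)        # limit deparsing of long calls

a <- data.frame(col1 = rnorm(1e3))     # stand-in for the 1-million-row frame
foo <- function(aa, bb) nrow(aa) + bb  # stand-in for the function to debug

# quote(a) puts the symbol `a` into the call instead of the data itself;
# it is only evaluated when foo's arguments are used.
do.call(foo, list(aa = quote(a), bb = 2))
```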
[R] Error in model.frame.default: invalid type (list) for variable ...
I'm getting the following errors when I try to execute the glm() procedure. If my matrix of predictors is a data.frame, R shows me this error: Error in model.frame.default(formula = Ycentro ~ metodocentro - 1, data = metodocentro, : invalid type (list) for variable 'metodocentro' If my matrix of predictors isn't a data.frame, R shows me this one: Error in model.frame.default(formula = Ycentro ~ metodocentro - 1, data = metodocentro, : 'data' must be a data.frame, not a matrix or an array I'm using this version of R: platform i386-pc-mingw32, arch i386, os mingw32, system i386, mingw32, major 2, minor 6.1, year 2007, month 11, day 26, svn rev 43537, language R, version.string R version 2.6.1 (2007-11-26). And here are some examples of the response variable and the predictors. Ycentro is my response variable; it has 248 rows of 0's and 1's:

      [,1]
 [1,]    1
 [2,]    1
 [3,]    1
 [4,]    1
 [5,]    1
 [6,]    1
 [7,]    1
 [8,]    1
 [9,]    1
[10,]    1

while metodocentro is my matrix of predictors, in this format:

      vies13 centro13x1 centro13x2 vies23 centro23x1 centro23x2
 [1,]      1  18.009066  27.414592      0        0.0       0.00
 [2,]      1  10.457802  15.485570      0        0.0       0.00
 [3,]      1  16.828764  27.652697      0        0.0       0.00
 [4,]      1  18.807218  40.238635      0        0.0       0.00
 [5,]      1  23.025113  45.918713      0        0.0       0.00
 [6,]      1  27.861487  30.397026      0        0.0       0.00
 [7,]      1  23.845959  27.193986      0        0.0       0.00
 [8,]      1  10.557285  31.366402      0        0.0       0.00
 [9,]      1  22.811078  27.074102      0        0.0       0.00
[10,]      1  12.309584  44.193660      0        0.0       0.00

Thank you for your patience. Diego [[alternative HTML version deleted]]
Re: [R] passing username and password to source()
Dieter Menne wrote: One way to handle this is to use Sys.setenv/Sys.getenv to pass parameters via the environment. Make sure, however, that this does not violate security, because these parameters could be read by other programs. A quick workaround would be for the receiving program to delete the environment string immediately after having read it. Dieter Thanks Dieter. But considering I am just a newbie, could you point me to any example to get this working? Thanks! All the best, Ricardo -- Ricardo Rodríguez Your XEN ICT Team
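Since Ricardo asked for an example, here is a minimal sketch of Dieter's suggestion (the variable name is invented for illustration):

```r
# In the calling script: put the parameter into the environment.
Sys.setenv(MY_SECRET = "s3cr3t")

# In the sourced script: read it, then clear it immediately so it is
# not left visible to other processes spawned later.
secret <- Sys.getenv("MY_SECRET")
Sys.setenv(MY_SECRET = "")   # overwrite right after reading
# Sys.unsetenv("MY_SECRET")  # or remove the variable entirely
```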
Re: [R] Creating a data.frame
Joe,

d <- data.frame(x = x, y = y, z = z)
sapply(d, class)
        x         y         z
 "factor" "numeric" "numeric"
d <- data.frame(x = x, y = y, z = z, stringsAsFactors = FALSE)
sapply(d, class)
          x           y           z
"character"   "numeric"   "numeric"

HTH Peter Alspach -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Joe Trubisz Sent: Thursday, 14 February 2008 11:18 a.m. To: r-help@r-project.org Subject: [R] Creating a data.frame OK...newbie question here. Either I'm reading the docs wrong, or I'm totally confused. Given the following:

x <- c("aaa", "bbb", "ccc")
y <- rep(0, 3)
z <- rep(0, 3)
is.character(x)
[1] TRUE
is.numeric(y)
[1] TRUE

Now...I want to create a data frame, but keep the data types. In reading the docs, I assume you do it this way:

d <- data.frame(cbind(x = I(x), y = y, z = z))

But when I do str(d), I get the following:

'data.frame': 3 obs. of 3 variables:
 $ x: Factor w/ 3 levels "aaa","bbb","ccc": 1 2 3
 $ y: Factor w/ 1 level "0": 1 1 1
 $ z: Factor w/ 1 level "0": 1 1 1

I thought I() prevents characters from becoming factors, right? Secondly, how do I force y and z in the data frame to become numeric? Thanks in advance, Joe
Re: [R] Creating a data.frame
On 2/13/2008 5:17 PM, Joe Trubisz wrote: OK...newbie question here. Either I'm reading the docs wrong, or I'm totally confused. Given the following:

x <- c("aaa", "bbb", "ccc")
y <- rep(0, 3)
z <- rep(0, 3)
is.character(x)
[1] TRUE
is.numeric(y)
[1] TRUE

Now...I want to create a data frame, but keep the data types. In reading the docs, I assume you do it this way:

d <- data.frame(cbind(x = I(x), y = y, z = z))

But when I do str(d), I get the following:

'data.frame': 3 obs. of 3 variables:
 $ x: Factor w/ 3 levels "aaa","bbb","ccc": 1 2 3
 $ y: Factor w/ 1 level "0": 1 1 1
 $ z: Factor w/ 1 level "0": 1 1 1

I thought I() prevents characters from becoming factors, right? Secondly, how do I force y and z in the data frame to become numeric? Don't use cbind() inside of data.frame(). Using cbind() coerces the variables into a matrix, where all variables share a common type. I think you want this: d <- data.frame(x = I(x), y = y, z = z) Thanks in advance, Joe -- Chuck Cleland, Ph.D. NDRI, Inc. 71 West 23rd Street, 8th floor New York, NY 10010 tel: (212) 845-4495 (Tu, Th) tel: (732) 512-0171 (M, W, F) fax: (917) 438-0894
Re: [R] Creating a data.frame
On 13-Feb-08 22:17:32, Joe Trubisz wrote: OK...newbie question here. Either I'm reading the docs wrong, or I'm totally confused. Given the following:

x <- c("aaa", "bbb", "ccc")
y <- rep(0, 3)
z <- rep(0, 3)
is.character(x)
[1] TRUE
is.numeric(y)
[1] TRUE

Now...I want to create a data frame, but keep the data types. In reading the docs, I assume you do it this way:

d <- data.frame(cbind(x = I(x), y = y, z = z))

But when I do str(d), I get the following:

'data.frame': 3 obs. of 3 variables:
 $ x: Factor w/ 3 levels "aaa","bbb","ccc": 1 2 3
 $ y: Factor w/ 1 level "0": 1 1 1
 $ z: Factor w/ 1 level "0": 1 1 1

I thought I() prevents characters from becoming factors, right? Secondly, how do I force y and z in the data frame to become numeric? Thanks in advance, Joe

You don't need to force it! It's not obvious what made you think of using cbind() internally, but it doesn't do what you intended. Simply:

d <- data.frame(x = I(x), y = y, z = z)
d
#     x y z
# 1 aaa 0 0
# 2 bbb 0 0
# 3 ccc 0 0
str(d)
# 'data.frame': 3 obs. of 3 variables:
#  $ x: Class 'AsIs' chr [1:3] "aaa" "bbb" "ccc"
#  $ y: num 0 0 0
#  $ z: num 0 0 0

The trouble was that cbind(x=I(x), y=y, z=z) makes a matrix, and you cannot mix types in a matrix, so this coerces all the variables to character type. So it's your original way which does the forcing! Hoping this helps, Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 13-Feb-08 Time: 23:24:30 -- XFMail --
[R] apply on large arrays
I have a big contingency table, approximately of size 60*2*500*500, and I need to count the number of cells containing a count of 1 for each of the factor values defining the first dimension. Here is my attempt:

tab1 <- with(pisa1, table(CNT, GENDER, ISCOF, ISCOM))
tab2 <- apply(tab1, 1:4, function(x) ifelse(sum(x) == 1, 1, 0))
tab3 <- apply(tab2, 1, sum)

Computing tab2 is very slow. Is there a faster and/or more elegant way of doing this? -- Erich Neuwirth, University of Vienna Faculty of Computer Science Computer Supported Didactics Working Group Visit our SunSITE at http://sunsite.univie.ac.at Phone: +43-1-4277-39464 Fax: +43-1-4277-39459
Re: [R] apply on large arrays
Your code is

tab1 <- with(pisa1, table(CNT, GENDER, ISCOF, ISCOM))
tab2 <- apply(tab1, 1:4, function(x) ifelse(sum(x) == 1, 1, 0))
tab3 <- apply(tab2, 1, sum)

As far as I can see, step 2 (the problematic one) merely replaces any entries in tab1 that are not equal to one by zeros. I think this would do the same job a bit faster:

tab2 <- tab1 <- with(pisa1, table(CNT, GENDER, ISCOF, ISCOM))
tab2[] <- 0
tab2[which(tab1 == 1, arr.ind = TRUE)] <- 1
tab3 <- rowSums(tab2)

If you don't need to keep tab1, you would make things even better by removing it. Bill Venables. Bill Venables CSIRO Laboratories PO Box 120, Cleveland, 4163 AUSTRALIA Office Phone (email preferred): +61 7 3826 7251 Fax (if absolutely necessary): +61 7 3826 7304 Mobile: +61 4 8819 4402 Home Phone: +61 7 3286 7700 mailto:[EMAIL PROTECTED] http://www.cmis.csiro.au/bill.venables/ -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Erich Neuwirth Sent: Thursday, 14 February 2008 9:52 AM To: r-help Subject: [R] apply on large arrays I have a big contingency table, approximately of size 60*2*500*500, and I need to count the number of cells containing a count of 1 for each of the factor values defining the first dimension. Here is my attempt:

tab1 <- with(pisa1, table(CNT, GENDER, ISCOF, ISCOM))
tab2 <- apply(tab1, 1:4, function(x) ifelse(sum(x) == 1, 1, 0))
tab3 <- apply(tab2, 1, sum)

Computing tab2 is very slow. Is there a faster and/or more elegant way of doing this? -- Erich Neuwirth, University of Vienna Faculty of Computer Science Computer Supported Didactics Working Group Visit our SunSITE at http://sunsite.univie.ac.at Phone: +43-1-4277-39464 Fax: +43-1-4277-39459
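An even shorter route (my own sketch, not from the thread) exploits the fact that comparing the whole table to 1 is vectorized, so no per-cell apply() is needed at all:

```r
# Small stand-in array, since pisa1 is not available here; the real
# table would be 60 x 2 x 500 x 500.
tab1 <- array(rpois(60 * 2 * 5 * 5, 1), dim = c(60, 2, 5, 5))
# Per level of the first dimension, count the cells equal to 1.
# rowSums() on an array sums over all dimensions except the first.
tab3 <- rowSums(tab1 == 1)
```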
Re: [R] use of poly()
On Wednesday 13 February 2008, [EMAIL PROTECTED] wrote: You ask: When using continuous data in both Y and X, does the difference between raw and orthogonal polynomials have any practical meaning? Yes, indeed it does, even if X is not 'continuous'. There are (at least) two practical differences: 1. With orthogonal polynomials you are using an orthogonal basis, so the estimates of the regression coefficients are statistically independent. This makes it much easier in model building to get an idea of the degree of polynomial warranted by the data. You can usually do it from a single model fit. 2. With an orthogonal polynomial basis your model matrix has, in principle, an optimal condition number, and the numerical properties of the least squares fitting algorithm can be much better. If you really want the raw coefficients and their standard errors, you can unravel this a bit, but why would you want to? If all you are interested in is the fitted curve, though (and this is indeed the key thing, not the coefficients), then what kind of basis you use is pretty irrelevant. Regards, W. This is exactly the kind of explanation I was looking for. Thanks! Dylan Bill Venables CSIRO Laboratories PO Box 120, Cleveland, 4163 AUSTRALIA Office Phone (email preferred): +61 7 3826 7251 Fax (if absolutely necessary): +61 7 3826 7304 Mobile: +61 4 8819 4402 Home Phone: +61 7 3286 7700 mailto:[EMAIL PROTECTED] http://www.cmis.csiro.au/bill.venables/ -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dylan Beaudette Sent: Thursday, 14 February 2008 6:42 AM To: r-help@r-project.org Subject: [R] use of poly() Hi, I am curious about how to interpret the results of a polynomial regression -- using poly(raw=TRUE) vs. poly(raw=FALSE).
set.seed(123456)
x <- rnorm(100)
y <- jitter(1*x + 2*x^2 + 3*x^3, 250)
plot(y ~ x)
l.poly <- lm(y ~ poly(x, 3))
l.poly.raw <- lm(y ~ poly(x, 3, raw = TRUE))
s <- seq(-3, 3, by = 0.1)
lines(s, predict(l.poly, data.frame(x = s)), col = 1)
lines(s, predict(l.poly.raw, data.frame(x = s)), col = 2)

The results are the same, but the regression coefficients are different:

as.vector(coef(l.poly))
1.806618 88.078858 16.194423 58.051642
as.vector(coef(l.poly.raw))
-0.1025114 1.5265248 2.0617970 2.7393995

When using continuous data in both Y and X, does the difference between raw and orthogonal polynomials have any practical meaning? Thanks, Dylan -- Dylan Beaudette Soil Resource Laboratory http://casoilresource.lawr.ucdavis.edu/ University of California at Davis 530.754.7341
[R] Bayesian function in R
Hi all, I was looking for a Bayesian function in R that can take a matrix (containing gene expression values) as input and give as output a pair of genes that are related. Precisely, giving as output a pair of genes (A and B) and whether A depends on B or vice versa. Please let me know if anyone is aware of such a function, or a related function, in R. Regards, Shweta.
[R] Principal component analysis PCA
Hi, I am trying to run PCA on a set of data with dimension 115*300,000. The columns represent the SNPs and the rows represent the individuals. So this is what I did:

# load the data
code <- read.table("code.txt", sep = '\t', header = F, nrows = 30)
# do PCA
pr <- prcomp(code, retx = T, center = T)

I am getting the following error message: Error: cannot allocate vector of size 275.6 Mb I tried to increase the memory size: memory.size(4000) but it did not work. Is there a solution for this? Or is there other software that can handle large data sets? Thanks
Re: [R] Creating a data.frame
Just make d <- data.frame(x, y, z). cbind forces x, y and z to be of the same type. --- Joe Trubisz [EMAIL PROTECTED] wrote: OK...newbie question here. Either I'm reading the docs wrong, or I'm totally confused. Given the following:

x <- c("aaa", "bbb", "ccc")
y <- rep(0, 3)
z <- rep(0, 3)
is.character(x)
[1] TRUE
is.numeric(y)
[1] TRUE

Now...I want to create a data frame, but keep the data types. In reading the docs, I assume you do it this way:

d <- data.frame(cbind(x = I(x), y = y, z = z))

But when I do str(d), I get the following:

'data.frame': 3 obs. of 3 variables:
 $ x: Factor w/ 3 levels "aaa","bbb","ccc": 1 2 3
 $ y: Factor w/ 1 level "0": 1 1 1
 $ z: Factor w/ 1 level "0": 1 1 1

I thought I() prevents characters from becoming factors, right? Secondly, how do I force y and z in the data frame to become numeric? Thanks in advance, Joe
[R] Package for Multiple Additive Regression Trees
Hello List, I've been unsuccessful in tracking down a package that does MART. J. Friedman's website has a page that mentions an R implementation, but the links lead nowhere. There is also a nice walkthrough/vignette PDF on his site. But I cannot find this package anywhere. Perhaps it's commercial? Does anyone know where it might be available, or of any other R package (or Matlab toolbox) that can do MART? Many thanks, Christopher
Re: [R] Installation of R on UNIX - Sparc Solaris v8 [SEC=UNCLASSIFIED]
Dear Prof Ripley and Don, Thank you all for your kind reply. The UNIX system we have is Sparc Solaris v8 (a 64-bit version). The precompiled binary distributions of the base system and contributed packages are only available for Linux, MacOS X and Windows. The only thing available for UNIX seems to be "Source Code for all Platforms". Could you please point me to the prebuilt versions of R available for Solaris and the Solaris binary? By the way, given that the UNIX system is pretty old (Solaris v8), can the latest version of R be installed on such an old system, or do we have to install an earlier version of R? Thanks a lot in advance. Regards, Jin -Original Message- From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] Sent: Tuesday, 12 February 2008 5:06 To: Li Jin Cc: r-help@r-project.org Subject: Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED] On Tue, 12 Feb 2008, [EMAIL PROTECTED] wrote: Thanks to all for your kind suggestions. After some discussion with our IT staff, I was told the UNIX system we have is Solaris and installation of R is very time consuming because "Given that this software is not standard, and given the amount of time required to compile the software (and potentially its dependencies), it will need to be resourced as a project ..." From my experience with IT staff, it may take quite a long time for them to set up such a project, let alone the installation. Prebuilt versions of R are available for Solaris -- and the 'R Installation and Administration' manual told them so. Given that, I wonder if it is possible to install it myself. As I have mentioned before, I have no experience in using UNIX, but I will have access to the UNIX system soon. Any suggestions and help are greatly appreciated. It is easy to install R from the sources if you have the compilers and e.g. Tcl/Tk installed. But a Solaris box quite possibly does not, and then a binary install is much easier.
Regards, Jin -Original Message- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: Monday, 28 January 2008 11:38 To: Li Jin Cc: r-help@r-project.org Subject: Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED] On the PC there is a built-in GUI but not on UNIX, and there are some packages that are OS-specific, in which case you might get more or less selection, but probably more. Also, depending on the specific system, you may have greater difficulty installing certain packages due to the need to compile them on UNIX, and the possibility exists that you don't quite have the right libraries. On Windows you get binaries, so this is not a problem. I have repeatedly found that common packages that I took for granted on Windows had some problem with installation on UNIX, and I had to hunt around and figure out what the problem was with my UNIX libraries or possibly some other problem. For pure R packages this won't be a problem, but for packages that use C and FORTRAN this can be. Although I am lumping all UNIX systems together, I think this varies quite a bit from one particular type/distro of UNIX/Linux to another, and I suspect if you are careful in picking out the right one (if you have a choice) you will actually have zero problems. On Jan 23, 2008 6:08 PM, [EMAIL PROTECTED] wrote: Dear All, I am currently using R on a Windows PC with 2 GB of RAM. Some pretty large datasets are expected soon, perhaps on the order of several GB. I am facing a situation similar to Ralph's: either to get a new PC with bigger RAM, or something else. I am just wondering if R is faster on other systems like UNIX or Linux. Any suggestions are appreciated.
Regards, Jin Jin Li, PhD Spatial Modeller/ Computational Statistician Marine Coastal Environment Geoscience Australia Ph: 61 (02) 6249 9899 Fax: 61 (02) 6249 9956 email: [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Prof Brian Ripley Sent: Thursday, 24 January 2008 12:05 To: Ralph79 Cc: r-help@r-project.org Subject: Re: [R] Problems with XP32-3GB-patch?/ Worth upgrading to Vista X64? On Wed, 23 Jan 2008, Ralph79 wrote: Dear R-Users, as I will start a huge simulation in a few weeks, I am about to buy a new and fast PC. I have noticed, that the RAM has been the limiting factor in many of my calculations up to now (I had 2 GB in my old system, but Windows still used quite a lot of virtual memory), hence my new computer will have 4 GB of fast DDR2-800 RAM. However, I know that 1.) Windows 32 bit cannot make use of more than about 3,2 GB RAM and 2.) it is normally not allowed to allocate more than 2 GB of RAM to one single application (at least under XP, I don't know if that has changed under Vista?). I remember from the R-FAQ that you can manually adjust XP so that it allocates up to 3 GB to one application (the 3GB
Re: [R] non-interactive connection to SQL Server
On Wed, 13 Feb 2008, Moshe Olshansky wrote: Hi everyone, I am afraid that I have already asked this question in the past (or at least I knew an answer to it), but I am unable to do it now. I have an SQL Server database. I used the GUI interface of odbcDriverConnect to create a .dsn file for this database, and every time I want to connect I invoke odbcDriverConnect(), which opens a GUI from which I choose my .dsn file and the connection is established. Now I want to do this automatically (without the GUI), and all my attempts fail. If I remember correctly, the dsn must be a string containing all the connection details, but it does not work for me. Clearly it does, as you did that via odbcDriverConnect. Look at the object it returns: it contains the string it used. You can use that directly. Could anybody help, please! Thank you! Moshe. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
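Prof. Ripley's point, as a sketch: the returned channel object records the full connection string, which can then be passed to odbcDriverConnect() directly. The server and database names below are placeholders, and a reachable SQL Server is assumed, so the connect call itself is left commented out:

```r
# library(RODBC)  # assumed installed

# Build a DSN-less connection string and pass it directly,
# bypassing the GUI dialog:
con.string <- paste("driver={SQL Server}",
                    "server=myserver",
                    "database=mydb",
                    "trusted_connection=yes",
                    sep = ";")
# ch <- odbcDriverConnect(con.string)  # no dialog appears
```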
Re: [R] Installation of R on UNIX - Sparc Solaris v8 [SEC=UNCLASSIFIED]
On Thu, 14 Feb 2008, [EMAIL PROTECTED] wrote: Dear Prof Ripley and Don, Thank you all for your kind reply. The UNIX system we have is Sparc Solaris v8 (a 64-bit version). The precompiled binary distributions of the base system and contributed packages are only available for Linux, MacOS X and Windows. The only thing available for UNIX seems to be "Source Code for all Platforms". Could you please point me to the prebuilt versions of R available for Solaris and the Solaris binary? There are binaries for that too, as mentioned in the manual. I don't know why I have to read the manual for you, but see http://www.sunfreeware.com/indexintel8.html. They have R 2.6.0. By the way, given that the UNIX system is pretty old (Solaris v8), can the latest version of R be installed on such an old system, or do we have to install an earlier version of R? R 2.6.2 will install from the sources. Thanks a lot in advance. Regards, Jin -Original Message- From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] Sent: Tuesday, 12 February 2008 5:06 To: Li Jin Cc: r-help@r-project.org Subject: Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED] On Tue, 12 Feb 2008, [EMAIL PROTECTED] wrote: Thanks to all for your kind suggestions. After some discussion with our IT staff, I was told the UNIX system we have is Solaris and installation of R is very time consuming because "Given that this software is not standard, and given the amount of time required to compile the software (and potentially its dependencies), it will need to be resourced as a project ..." From my experience with IT staff, it may take quite a long time for them to set up such a project, let alone the installation. Prebuilt versions of R are available for Solaris -- and the 'R Installation and Administration' manual told them so. Given that, I wonder if it is possible to install it myself. As I have mentioned before, I have no experience in using UNIX, but I will have access to the UNIX system soon.
Any suggestions and help are greatly appreciated. It is easy to install R from the sources if you have the compilers and e.g. Tcl/Tk installed. But a Solaris box quite possibly does not, and then a binary install is much easier. Regards, Jin -Original Message- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: Monday, 28 January 2008 11:38 To: Li Jin Cc: r-help@r-project.org Subject: Re: [R] Linux, UNIX, XP32, Vista X64 or ...? [SEC=UNCLASSIFIED] On the PC there is a built-in GUI but not on UNIX, and there are some packages that are OS-specific, in which case you might get more or less selection, but probably more. Also, depending on the specific system, you may have greater difficulty installing certain packages due to the need to compile them on UNIX, and the possibility exists that you don't quite have the right libraries. On Windows you get binaries, so this is not a problem. I have repeatedly found that common packages that I took for granted on Windows had some problem with installation on UNIX, and I had to hunt around and figure out what the problem was with my UNIX libraries or possibly some other problem. For pure R packages this won't be a problem, but for packages that use C and FORTRAN this can be. Although I am lumping all UNIX systems together, I think this varies quite a bit from one particular type/distro of UNIX/Linux to another, and I suspect if you are careful in picking out the right one (if you have a choice) you will actually have zero problems. On Jan 23, 2008 6:08 PM, [EMAIL PROTECTED] wrote: Dear All, I am currently using R on a Windows PC with 2 GB of RAM. Some pretty large datasets are expected soon, perhaps on the order of several GB. I am facing a situation similar to Ralph's: either to get a new PC with bigger RAM, or something else. I am just wondering if R is faster on other systems like UNIX or Linux. Any suggestions are appreciated.
Regards, Jin Jin Li, PhD Spatial Modeller / Computational Statistician Marine Coastal Environment Geoscience Australia Ph: 61 (02) 6249 9899 Fax: 61 (02) 6249 9956 email: [EMAIL PROTECTED] -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Prof Brian Ripley Sent: Thursday, 24 January 2008 12:05 To: Ralph79 Cc: r-help@r-project.org Subject: Re: [R] Problems with XP32-3GB-patch? / Worth upgrading to Vista X64? On Wed, 23 Jan 2008, Ralph79 wrote: Dear R-Users, as I will start a huge simulation in a few weeks, I am about to buy a new and fast PC. I have noticed that the RAM has been the limiting factor in many of my calculations up to now (I had 2 GB in my old system, but Windows still used quite a lot of virtual memory), hence my new computer will have 4 GB of fast DDR2-800 RAM. However, I know that 1.) Windows 32 bit cannot make use of more than about
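Since this thread keeps coming back to building R from source on a UNIX box, the standard configure/make sequence from the 'R Installation and Administration' manual looks roughly like the following; the version number and install prefix here are only illustrative:

```shell
# Fetch and unpack the R sources (2.6.2 was current at the time of this thread)
wget https://cran.r-project.org/src/base/R-2/R-2.6.2.tar.gz
tar xzf R-2.6.2.tar.gz
cd R-2.6.2

# Configure for an install under your home directory (no root access needed)
./configure --prefix="$HOME/R"
make          # build R
make check    # optional: run the test suite
make install  # installs into $HOME/R; then add $HOME/R/bin to your PATH
```

This is why the prebuilt Solaris binaries are attractive: the source route also requires working C and FORTRAN compilers, and optionally Tcl/Tk for the tcltk package.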
Re: [R] summary statistics
Hi, one of the good starting points is Paul Johnson's StatsRus (the first hit in Google, and I believe it is in the R Wiki too). It helped me when I started with R about 10 years ago. For me, usually the best way to arrange data is in database form. That means each column is a variable (numerical, categorical, logical) and each row is a record for one particular event (sample, day, hour, minute, etc.). From this rectangular data frame you can easily choose a subset with ?subset or data[condition, ], and you can use apply-like functions (see also ?aggregate, ?by) to get summaries. And you can fit various models based on data frames constructed this way. You can also look at ?merge and/or maybe ?reshape to join data frames and to transform them into other forms, sometimes more suitable for printing. And last but not least, you can look into almost any of the suggested publications on CRAN; usually all of them have some intro on how to operate with data inside R. Or you can briefly go through the R-intro manual you probably have installed with your R distribution, especially chapters 2-6. For editing commands I use an external editor as a more convenient option: Tinn-R, which is easier for non-Unix people than Emacs, though probably less powerful. Regards, Petr [EMAIL PROTECTED] [EMAIL PROTECTED] wrote on 13.02.2008 15:07:53: Jim, prettyR looks like it will work, but with the way that my data frame is set up I still cannot get what I want out of it. I am lacking in my knowledge of manipulating data frames and of general R programming. Is there a reference that will give me all of these wonderful data manipulation tools that previous posters in this thread have cited, e.g. tapply(x$mgl, x$RM, summary)? I read the help file for tapply and I still don't entirely understand it. I am a biologist trying to learn R because it serves my purposes; I am not a programmer. However, I see the utility enough to be right stubborn when it comes to learning this. 
I am fully aware that I am not competent in programming, but I do know the theory behind the analyses that I am performing, which, granted, are not all that sophisticated (mostly descriptive statistics, community data ordination, and time series analysis). My main complaint is I don't know which way to put a data frame into R to let it proceed with the analysis. I preprocess most of my data in Excel; this is a convention imposed by the kind of data and the knowledge of the rest of the folks in my lab. Is there a succinct reference for programming in R (S+) on how to rearrange data once inside of R, and maybe a general guide for matrix/data frame setup, and what way something should look for different kinds of analysis? I am learning, and, most of the time, I can generate graphs etc. faster than in Excel, but sometimes I spend hours trying to get the data in the right format and then three minutes on actual coding and result generation. Maybe I'm out of my league, but I am not content with giving my data over to somebody else to do the analysis, because I have relatively more knowledge of how a river system works. Thanks, everybody, for the continuing help. Stephen On Feb 13, 2008 5:03 AM, Jim Lemon [EMAIL PROTECTED] wrote: stephen sefick wrote: below is my data frame. I would like to compute summary statistics for mgl for each river mile (mean, median, mode). My apologies in advance -- I would like to get something like the SAS printout of PROC UNIVARIATE. I have performed an ANOVA and a Tukey LSD, and I would just like the summary statistics. Hi Stephen, Have a look at describe in the prettyR package. You can specify the summary stats that you want, and the formatting may suit you. Jim -- Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
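To make the subset/tapply/aggregate suggestions in this thread concrete, here is a small self-contained sketch. The column names mgl and RM follow Stephen's example, but the numbers are invented:

```r
# A data frame in "database" form: one row per observation,
# one column per variable (the layout Petr recommends)
x <- data.frame(
  RM  = rep(c(10, 25, 40), each = 4),   # river mile (grouping variable)
  mgl = c(5.1, 4.8, 5.6, 5.0,
          7.2, 6.9, 7.5, 7.1,
          3.3, 3.6, 3.1, 3.4)           # measured concentration
)

# Summary statistics of mgl for each river mile, as suggested earlier:
# returns one five-number summary (plus mean) per value of RM
tapply(x$mgl, x$RM, summary)

# The same idea with aggregate(), returning a data frame of group means
aggregate(x$mgl, by = list(RM = x$RM), FUN = mean)

# Picking out the records for a single river mile
subset(x, RM == 25)
```

The key point for a PROC UNIVARIATE-style printout is that the grouping variable (RM) lives in its own column, so the summary functions can split the data by it.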
Re: [R] passing username and password to source()
[Ricardo Rodriguez] Your XEN ICT Team webmaster at xen.net writes: But considering I am just a newbie, could you point me to any example to get this working? Thanks! I was thinking of something like this (note: you must store PassEnv2.r on disk as a file).

# File PassEnv.r
Sys.setenv(PASSW = "mypass")
source("PassEnv2.r")
Sys.unsetenv("PASSW")
# Try again (to show the variable is removed)
source("PassEnv2.r")

# File PassEnv2.r
passw <- Sys.getenv("PASSW")
# Sys.unsetenv("PASSW")  ## This would also be possible, to clear the environment
# Show that it worked
cat("The password is", passw, "\n")

Again: this is fine for your own computer, but should not be used when security constraints are relevant. The password is visible to everyone for the time PassEnv2.r is running. Dieter
[R] need help
Hi, is anybody using the tweedie.profile function to determine profile log-likelihood values? Thank you for your attention. Roslinazairimah Zakaria PhD Candidate School of Maths & Stats University of South Australia Mawson Lakes, SA 5095. Ph: 83025296
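For what it is worth, tweedie.profile() lives in the tweedie package on CRAN. A minimal sketch of a call might look like the following; the data are invented and the argument names should be checked against ?tweedie.profile for your installed version:

```r
# install.packages("tweedie")  # if not already installed
library(tweedie)

set.seed(1)
x <- rnorm(50)
y <- rgamma(50, shape = 2, rate = 1)   # invented positive response

# Profile the Tweedie index parameter p over a grid of candidate values
out <- tweedie.profile(y ~ x, p.vec = seq(1.2, 1.8, by = 0.1), do.plot = FALSE)

out$p.max   # the p that maximises the profile log-likelihood
out$L.max   # the maximised profile log-likelihood value
```

The returned object also carries the whole grid of (p, log-likelihood) pairs, which is what the do.plot option draws.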