Re: [R] apply model predictions over larger area with predict()
K. Fleischer wrote: Dear all, I have built glm models based on presences/absences and a number of predictor maps and would like to compute habitat suitability based on the modelled coefficients. I thought this is pretty straight forward and wanted to use predict() and supply the new data in a data frame, with one column for each predictor. However, I do get an error msg warning me that the number of rows for old and new data do not match. the script looks like that: model-glm(species~temp+prec+elev,family=binomial(link=logit)) #whereby temp,prec,elev are in vector format and contain the elements on species presence/absence; species is vector of 0's and 1's (length=319) wholearea-data.frame(cbind(as.vector(temperature),as.vector (precipitation),as.vector(elevation)) # (length=7526) predict(model, newdata=wholearea,type=response) Warning message: 'newdata' had 7526 rows but variable(s) found have 319 rows. Ive searched quite a while for the answer now, has anyone encountered that problem before?? thanx in advance. Your newdata does not contain the variables named temp, prec, and elev that your model refers to. (And you should get rid of that cbind, but that is not the problem.) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R on 64-bit Windows
On Wed, 15 Oct 2008, Göran Broström wrote: In the R for Windows FAQ (2.7.2) we have under 2.2: There is no specific version for x64 Windows, but the standard 32-bit version works well enough. Does that include the handling of huge data sets? Yes, but 'huge' is not the largest size in the Huber/Wegman classification. Also, few people with enormous datasets use Windows. What I have read in other places indicates that the answer is No. If so, will there be a 64-bit version availble soon? When I websearch for 64-bit MinGW it seems to exist out there. Please do read all the FAQ: this is covered in Q8.1. You are welcome to try building a 64-bit version using the work that has already been done, and you are welcome to pay for someone else to do so (and I understand that such projects are under way). If I wanted to do this I would buy the PGI compilers and expect the port to be relatively simple. I've wasted far too much time on mingw64: several times I have built a complete version of R and it just crashed or failed even to start. -- Göran Broström -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] A cluster package question and request for images
Dear R developers, i'm developing a Java application with a very efficient image transfer api between ImageJ and R. In my next release i can transfer bytes to R very fast with the help of the Rserve application. Furthermore i added an easy to use interface to the clustering algorithm clara in the cluster package. At the moment i can cluster an R,G,B image with size 4000*4000 without any problem on a 32-bit package (with 3GB ram) with a good speed . For the clustering i have to convert the byte vectors to integer vectors and add them to a matrix. My question is if it is possible to change the package in that way that the function clara which i use for clustering also accepts smaller datatypes so that i can cluster bigger images (or more dimensions) without running out of memory. In addition is there a better way to transfer the vectors to this function maybe without the need to bind the vectors to a matrix (which costs little bit more time)? Any answers or comments are appreciated. With kind regards M.Austenfeld -- View this message in context: http://www.nabble.com/A-%22cluster%22-package-question-and-request-for-images-tp19989680p19989680.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fw: Logistic regresion - Interpreting (SENS) and (SPEC)
Gad Abraham wrote: This approach leaves much to be desired. I hope that its practitioners start gauging it by the mean squared error of predicted probabilities. Is the logic here is that low MSE of predicted probabilities equals a better calibrated model? What about discrimination? Perfect calibration Almost. I was addressed more the wish for the use of strategies that maximize precision while keeping bias to a minimim. implies perfect discrimination, but I often find that you can have two That doesn't follow. You can have perfect calibration in the large with no discrimination. competing models, the first with higher discrimination (AUC) and worse calibration, and the the second the other way round. Which one is the better model? I judge models on the basis of both discrimination (best measured with log likelihood measures, 2nd best AUC) and calibration. It's a two-dimensional issue and we don't always know how to weigh the two. For many purposes calibration is a must. In those we don't look at discrimination until calibration-in-the-small is verified at high resolution. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] request: How can we ignore a component of list having no element
An alternative approach would be to store 0 x 0 matrices instead of NULLs. This way every object in your list is a consistent type. Hadley On Wed, Oct 15, 2008 at 5:23 AM, Muhammad Azam [EMAIL PROTECTED] wrote: Dear friends There is a list of arrays comprising different no of rows and columns even sometimes NULL, such as [[2]] given below. How can we ignore [[2]] or others like this in the complete list. Any help in this regard is needed. Thanks [[1]] [,1] [,2] [1,]31 [2,]31 [3,]31 [[2]] NULL [[3]] [,1] [,2] [,3] [,4] [,5] [,6] [,7] [1,]3100000 [2,]3100000 [3,]3100000 [4,]3131321 [5,]3131321 [6,]3131320 [[4]] [,1] [,2] [,3] [,4] [1,]3000 [2,]3133 [3,]3133 [4,]3130 OR x1=c(1,2,3); x2=c(1,2,3,4,6); x3=c(); x=list(x1,x2,x3) M.Azam [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] parameter assessment in differential equation
Benoit Boulinguiez benoit.boulinguiez at ensc-rennes.fr writes: I'd like to know whether R is capable to assess parameters in a model describing the kinetic of a pollutant adsorption onto activated carbon. Start with the deSolve package (especially demo(CCL4model)) Ben Bolker __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] MLE Constraints
Dears, I'm trying to find the parameters (a,b, ... l) that optimize the function (Model) described below. 1) How can I set some constraints with MLE2 function? I want to set p10, p20, p30, p1p3. 2) The code is giving the following warning. Warning: optimization did not converge (code 1) How can I solve this problem? Can someone help me? M - 14 Y = c(0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1) x1 = c(0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.50, 0.25, 0.25, 0.25, 0.25) x2 = c(-1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1) x3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0) states = c(1, 1, 2, 3, 1, 2, 3, 1, 1, 2, 2, 3, 1, 1) prob_fn = rep(0,M) Model=function(a, b, c, d, e, f, g, h, i, j, k, l) { p1 = exp(-(a g*x1 d*x2 j*x3)) p2 = exp(-(b h*x1 e*x2 k*x3)) p3 = exp(-(c i*x1 f*x2 l*x3)) ### Set P t5 = 0 while(t5M) { t5 = t5 1 if(states[t5]==1) {prob_ok = p1[1]} if(states[t5]==2) {prob_ok = p2[1]} if(states[t5]==3) {prob_ok = p3[1]} prob_fn[t5] = c(prob_ok) } prob_fn[prob_fn==0] = 0.1 ### LL ll_calc = -(sum(Y*log(prob_fn))) return(ll_calc) } res = mle2(Model, start=list(a=1, b=1, c=1, d=0.15, e=0.15, f=0.15, g=0.9, h=0.9, i=0.9, j=0.1, k=0.1, l=0.1), method = Nelder- Mead) res Best regards, LFRC -- View this message in context: http://www.nabble.com/MLE-Constraints-tp19994553p19994553.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice key title color
Meesters, Erik Erik.Meesters at wur.nl writes: is there a way to define the color of the title for the legend in lattice? Getting the right par to set in lattice can be intimidating. I keep the result of trellis.par.get() in a text file and search for the closest match. Dieter library(lattice) show.settings() # does not help much here (wish this were more complete..) # capture the following output into a text file trellis.par.get() # end of trying around, let's work # you need to know that the title called main pars = trellis.par.get(par.main.text) pars$col=red trellis.par.set(par.main.text,pars) xyplot(rnorm(100)~rnorm(100),main=Hello) # I normally use the following method to set many of my pars at once DMlatticeOptions - function(superpose.polygon.col=NULL, superpose.line.col=NULL) { require(lattice) require(RColorBrewer) ltheme = canonical.theme(color=TRUE) if (!is.null(superpose.polygon.col)) ltheme$superpose.line$col = superpose.line.col else ltheme$superpose.line$col = c('black',red,blue,#e3,darkgreen, gray) # ltheme$superpose.line$col = rev(brewer.pal(8,Set1)) ltheme$superpose.fill$col = ltheme$superpose.line$col if (!is.null(superpose.polygon.col)) ltheme$superpose.polygon$col = superpose.polygon.col ltheme$strip.shingle$col = ltheme$superpose.polygon$col ltheme$superpose.symbol$pch = c(16,17,18,1,2,3,4,8) ltheme$superpose.symbol$col = ltheme$superpose.line$col ltheme$superpose.symbol$cex = 0.4 ltheme$strip.background$col = c(gray90, gray80) ltheme$background$col = transparent ltheme$par.main.text$cex = 0.9 # default is 1.2 ltheme$par.ylab.text$cex =0.8 ltheme$par.ylab.text$cex =0.8 ltheme$add.text$cex = 1 ltheme$axis.text$cex = 0.6 ltheme$box.rectangle$col = black ltheme$box.umbrella$col = black ltheme$dot.symbol$col = black ltheme$plot.symbol$col = black ltheme$plot.line$col = black ltheme$plot.symbol$cex = 0.3 ltheme$plot.symbol$pch = c(16) ltheme$plot.polygon$col = #A6D96A ltheme$par.sub.text$cex=0.7 ltheme$par.sub.text$font=1 lattice.options(default.theme=ltheme) } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help Help with sampling
Hi everyone, I have a dataset(named Mydata) which includes 4 different variables named; s1,s2,s3,s4 .Each variable(symptom) has 14 patients. I need to use random sampling to make, 5 different samples from my data with 5 patients in each sample. i.e. using all 4 variables I need to make 5 different samples by changing patients(with 5 patients in each sample). X8 X9 X10 X102 X110 X177 X283 X284 X286 X292 X297 X306 X308 X314 s1 0 1 000100000000 s2 0 0 001000000010 s3 0 1 000000000010 s4 1 0 001000010000 I used this code: temp=list(NULL) for(i in 1:5) {temp[i]-sample(Mydata,5, replace=F)} show(temp) but I get the following error: number of items to replace is not a multiple of replacement length any idea why I get this eeror message and how can I fix it? Thanks a lot. -- View this message in context: http://www.nabble.com/Help-Help-with-sampling-tp19994275p19994275.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] apply model predictions over larger area with predict()
You did not read Peters response close enough. Do: names(areadata) And you will see that the names of your areadata object do not match with what the predict function is expecting. When you create areadata, you need to name the columns (or name them afterwards), it does not automatically generate the correct names. Try something like: areadata- data.frame(veld=as.vector(veld), water=as.vector(water), moeras=as.vector(moeras)) Or use the same definition of areadata, but follow with the command: names(areadata) - c('veld','water','moeras') Whenever R complains about not finding something in your data, it is a good idea to take a close look at your data to make sure that it matches what it should (and predict is very picky about names matching exactly). Functions that help with this include names, head, tail, View, str, and print. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] 801.408.8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of K. Fleischer Sent: Wednesday, October 15, 2008 4:06 AM To: ONKELINX, Thierry Cc: r-help@r-project.org Subject: Re: [R] apply model predictions over larger area with predict() Dear all, thanx for the quick reply but your ideas have not gotten me further unfortuantely.. after implementing the idea from Thierry my code looks like that now: #b00all is the dataframe with only element used for model fitting #variables .._zui contain the whole are over which to predict veld-b00all$veld00all_zui water-b00all$kmoi_water_zui moeras-b00all$kveg_moeras_zui specvec-b00all$specvec mydata-data.frame(specvec,veld,water,moeras) b00all_mdl6-glm(specvec~veld+water+moeras, data=mydata, family=binomial(link=logit)) veld-veld00all_zui water-kmoi_water_zui moeras-kveg_moeras_zui areadata- data.frame(as.vector(veld),as.vector(water),as.vector(moeras)) zuid-predict(b00all_mdl6,newdata=areadata,type=response) Error: variables 'veld', 'water', 'moeras' were specified with different types from the fit In addition: Warning message: 'newdata' had 7526 rows but variable(s) found have 106 rows So again the problem is as follows: I have a dataframe with each column representing one predictor, whereby first columns contains species presence/absences and every row is one observation. now I want to take the model output and apply it over the whole area for which I have predictor values for, which are stored in .asc maps. That cannot be too difficult I am sure!! thanx again, Katrin MSc student Computational Geo-Ecology University of Amsterdam - Original Message - From: ONKELINX, Thierry [EMAIL PROTECTED] Date: Wednesday, October 15, 2008 9:56 am Subject: RE: [R] apply model predictions over larger area with predict() To: K. Fleischer [EMAIL PROTECTED], r-help@r-project.org Dear Katrin, Store the old data in a dataframe instead of vectors and supply the name of that dataframe to your model. eg Model - glm(species ~ temp + prec + elev, data = your.data.frame, family = binomial(link = logit)) Notice that I've added some spaces which makes your code more easy to read. I think this will solve your problem. HTH, Thierry --- - ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 [EMAIL PROTECTED] www.inbo.be To call in the statistician after the experiment is done may be no morethan asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Oorspronkelijk bericht- Van: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org]Namens K. Fleischer Verzonden: woensdag 15 oktober 2008 9:34 Aan: r-help@r-project.org Onderwerp: [R] apply model predictions over larger area with predict() Dear all, I have built glm models based on presences/absences and a number of predictor maps and would like to compute habitat suitability based on the modelled coefficients. I thought this is pretty straight forward and wanted to use predict() and supply the new data in a data frame, with one column for each predictor. However, I do get an error msg warning me that the number of rows for old and new data do not match. the script looks like that: model-glm(species~temp+prec+elev,family=binomial(link=logit))
Re: [R] Maximum number of pasted 'code' lines?
Hello, Writing 19000 lines of R-script in Excel sounds terrible. Could you provide us some code (please NOT 19000 lines), and the way you generate this with Excel, maybe we can figure out a much shorter/faster way to accomplish the same result? Bart Duncan Murdoch-2 wrote: On 10/14/2008 2:39 PM, Michael Just wrote: Erik, Roger, others: Why I use excel: the ability to concatenate and 'drag' formulas. I use it because it is what I know. Apparently, excel is frowned upon, what should I be using? I don't know how else to create many very similar lines of R code. I think you're solving the wrong problem. It's unlikely that you really need a 19000 line R script, especially if it's a script that Excel could create: it's likely that a loop in R would be much more compact and easier to read. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Maximum-number-of-pasted-%27code%27-lines--tp19978992p19994869.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] apply model predictions over larger area with predict()
Dear all, thanx for the quick reply but your ideas have not gotten me further unfortuantely.. after implementing the idea from Thierry my code looks like that now: #b00all is the dataframe with only element used for model fitting #variables .._zui contain the whole are over which to predict veld-b00all$veld00all_zui water-b00all$kmoi_water_zui moeras-b00all$kveg_moeras_zui specvec-b00all$specvec mydata-data.frame(specvec,veld,water,moeras) b00all_mdl6-glm(specvec~veld+water+moeras, data=mydata, family=binomial(link=logit)) veld-veld00all_zui water-kmoi_water_zui moeras-kveg_moeras_zui areadata-data.frame(as.vector(veld),as.vector(water),as.vector(moeras)) zuid-predict(b00all_mdl6,newdata=areadata,type=response) Error: variables ‘veld’, ‘water’, ‘moeras’ were specified with different types from the fit In addition: Warning message: 'newdata' had 7526 rows but variable(s) found have 106 rows So again the problem is as follows: I have a dataframe with each column representing one predictor, whereby first columns contains species presence/absences and every row is one observation. now I want to take the model output and apply it over the whole area for which I have predictor values for, which are stored in .asc maps. That cannot be too difficult I am sure!! thanx again, Katrin MSc student Computational Geo-Ecology University of Amsterdam - Original Message - From: ONKELINX, Thierry [EMAIL PROTECTED] Date: Wednesday, October 15, 2008 9:56 am Subject: RE: [R] apply model predictions over larger area with predict() To: K. Fleischer [EMAIL PROTECTED], r-help@r-project.org Dear Katrin, Store the old data in a dataframe instead of vectors and supply the name of that dataframe to your model. eg Model - glm(species ~ temp + prec + elev, data = your.data.frame, family = binomial(link = logit)) Notice that I've added some spaces which makes your code more easy to read. I think this will solve your problem. HTH, Thierry --- - ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 [EMAIL PROTECTED] www.inbo.be To call in the statistician after the experiment is done may be no morethan asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -Oorspronkelijk bericht- Van: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org]Namens K. Fleischer Verzonden: woensdag 15 oktober 2008 9:34 Aan: r-help@r-project.org Onderwerp: [R] apply model predictions over larger area with predict() Dear all, I have built glm models based on presences/absences and a number of predictor maps and would like to compute habitat suitability based on the modelled coefficients. I thought this is pretty straight forward and wanted to use predict() and supply the new data in a data frame, with one column for each predictor. However, I do get an error msg warning me that the number of rows for old and new data do not match. the script looks like that: model-glm(species~temp+prec+elev,family=binomial(link=logit)) #whereby temp,prec,elev are in vector format and contain the elements on species presence/absence; species is vector of 0's and 1's (length=319) wholearea-data.frame(cbind(as.vector(temperature),as.vector (precipitation),as.vector(elevation)) # (length=7526) predict(model, newdata=wholearea,type=response) Warning message: 'newdata' had 7526 rows but variable(s) found have 319 rows. Ive searched quite a while for the answer now, has anyone encountered that problem before?? thanx in advance. Katrin Fleischer MSc Student Computational Geo-ecology University of Amsterdam __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is door een geldig ondertekend document. The views expressed in this message and any annex are purely those of the writer and may not be regarded as stating an official position of INBO, as long as the message is not confirmed by a duly signed document.
Re: [R] Defining a list
maybe: result - list() for (i in 1:10) result[[i]] - rnorm(i) Bart Megh Dal wrote: Can anyone please tell me how to define a list. Suppose I want to define a list object result with length n then want to fill each place of result with different objects. For e.g. i=1 result[1] = rnorm(1) i=2 result[2] = rnorm(2) ... i=n result[n] = rnorm(n) What would be the best way to do that? Regards, __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Defining-a-%22list%22-tp19991526p19994922.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] comparing with lead function
On Wed, Oct 15, 2008 at 5:59 AM, Michael Pearmain [EMAIL PROTECTED] wrote: Hi All, I've been trying to compare if the previous value in a variable is equal to a binary value..(i.e i want to check if the last event was a yes or no) i've been trying to write some code for this, but it seems overly elaborate, can anyone suggest a better / shorter / neater way? The below doesn't quite work but shows my idea of splitting by the factor id, then creating a new vector that is lead, then i was going to use an ifelse clause.. But as i suggested this seem very elaborate.. my sample code below How about: library(plyr) ddply(DF, .(id), transform, diff = c(NA, tail(time, -1))) This splits the data frame up by id, transforms each piece in the same way as your code (but expressed a little more elegantly) and then joins the pieces back together. More info about plyr is available at http://had.co.nz/plyr Regards, Hadley -- http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help Help with sampling
Alex99 wrote: Hi everyone, I have a dataset(named Mydata) which includes 4 different variables named; s1,s2,s3,s4 .Each variable(symptom) has 14 patients. I need to use random sampling to make, 5 different samples from my data with 5 patients in each sample. i.e. using all 4 variables I need to make 5 different samples by changing patients(with 5 patients in each sample). X8 X9 X10 X102 X110 X177 X283 X284 X286 X292 X297 X306 X308 X314 s1 0 1 000100000000 s2 0 0 001000000010 s3 0 1 000000000010 s4 1 0 001000010000 I used this code: temp=list(NULL) for(i in 1:5) {temp[i]-sample(Mydata,5, replace=F)} show(temp) but I get the following error: number of items to replace is not a multiple of replacement length any idea why I get this eeror message and how can I fix it? Thanks a lot. The direct cause is that you are not using temp[[i]]-, but I don't think that sample() construct does what I think you think it does either. You might want to use replicate() instead, as in replicate(2,airquality[sample(1:153,5, replace=F),], simplify=F) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help Help with sampling
Dear Alex, Is this what you want? # Data set my=read.table(textConnection( X8 X9 X10 X102 X110 X177 X283 X284 X286 X292 X297 X306 X308 X314 0 1 000100000000 0 0 001000000010 0 1 000000000010 1 0 00100001000 0),header=TRUE) closeAllConnections() rownames(my)=paste('s',1:4,sep=) # Samples res=list() for(i in 1:5) res[[i]]- my[,sample(colnames(my),5)] res HTH, Jorge On Wed, Oct 15, 2008 at 10:07 AM, Alex99 [EMAIL PROTECTED] wrote: Hi everyone, I have a dataset(named Mydata) which includes 4 different variables named; s1,s2,s3,s4 .Each variable(symptom) has 14 patients. I need to use random sampling to make, 5 different samples from my data with 5 patients in each sample. i.e. using all 4 variables I need to make 5 different samples by changing patients(with 5 patients in each sample). X8 X9 X10 X102 X110 X177 X283 X284 X286 X292 X297 X306 X308 X314 s1 0 1 000100000000 s2 0 0 001000000010 s3 0 1 000000000010 s4 1 0 001000010000 I used this code: temp=list(NULL) for(i in 1:5) {temp[i]-sample(Mydata,5, replace=F)} show(temp) but I get the following error: number of items to replace is not a multiple of replacement length any idea why I get this eeror message and how can I fix it? Thanks a lot. -- View this message in context: http://www.nabble.com/Help-Help-with-sampling-tp19994275p19994275.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] request: How can we ignore a component of list having no element
Try x[ !sapply(x, is.null) ] hadley wickham wrote: An alternative approach would be to store 0 x 0 matrices instead of NULLs. This way every object in your list is a consistent type. Hadley On Wed, Oct 15, 2008 at 5:23 AM, Muhammad Azam [EMAIL PROTECTED] wrote: Dear friends There is a list of arrays comprising different no of rows and columns even sometimes NULL, such as [[2]] given below. How can we ignore [[2]] or others like this in the complete list. Any help in this regard is needed. Thanks [[1]] [,1] [,2] [1,]31 [2,]31 [3,]31 [[2]] NULL [[3]] [,1] [,2] [,3] [,4] [,5] [,6] [,7] [1,]3100000 [2,]3100000 [3,]3100000 [4,]3131321 [5,]3131321 [6,]3131320 [[4]] [,1] [,2] [,3] [,4] [1,]3000 [2,]3133 [3,]3133 [4,]3130 OR x1=c(1,2,3); x2=c(1,2,3,4,6); x3=c(); x=list(x1,x2,x3) M.Azam [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Extracting standard error from survreg?
Hello, I would like to extract the standard error for the coefficients returned when using survreg. Does anyone know how to do this? Thank you. Joan -- View this message in context: http://www.nabble.com/Extracting-standard-error-from-survreg--tp19994742p19994742.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help Help with sampling
Thanks for the reply Peter, that works BUT the reason I didn't use it, is because I need to calculate the mean and Standard deviation for S1,S2,S3 and S4 in each sample. which means I'll have 5 means for S1, and 5 means for S2 and 5 means for S3 and 5 means for S4 (and at the end I have to get the average of the means for each 'S'). I used the following: for(i in 1:5) {temp-sample(A3,3, replace=F) + Trans=t(temp) + Avg=mean(Trans) + show(temp) + show(Trans) + show(Avg)} the reason I used Transpose is to get the R to give me the mean for S's. but what I get is just one mean for each sample. not the mean for each S. any suggestion how to do it? thanks again for all your help Peter Dalgaard wrote: Alex99 wrote: Hi everyone, I have a dataset(named Mydata) which includes 4 different variables named; s1,s2,s3,s4 .Each variable(symptom) has 14 patients. I need to use random sampling to make, 5 different samples from my data with 5 patients in each sample. i.e. using all 4 variables I need to make 5 different samples by changing patients(with 5 patients in each sample). X8 X9 X10 X102 X110 X177 X283 X284 X286 X292 X297 X306 X308 X314 s1 0 1 000100000000 s2 0 0 001000000010 s3 0 1 000000000010 s4 1 0 001000010000 I used this code: temp=list(NULL) for(i in 1:5) {temp[i]-sample(Mydata,5, replace=F)} show(temp) but I get the following error: number of items to replace is not a multiple of replacement length any idea why I get this eeror message and how can I fix it? Thanks a lot. The direct cause is that you are not using temp[[i]]-, but I don't think that sample() construct does what I think you think it does either. You might want to use replicate() instead, as in replicate(2,airquality[sample(1:153,5, replace=F),], simplify=F) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Help-Help-with-sampling-tp19994275p19995210.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Doing a Task Without Using a For Loop
Run Rprof on your script that is updating the dataframe. A dataframe is a list and everytime you access something in the list it can be expensive. Rprof will probably show that a lot of time is spent in the function [[ which is accessing portions of the dataframe. Vectors are much faster because they are typically sequentially in memory and can be accessed easily. Rprof is always helpful in answering the question of why is something taking so long. It helps you to find where the potential bottlenecks are. On Wed, Oct 15, 2008 at 7:33 AM, Tom La Bone [EMAIL PROTECTED] wrote: I want to thank everyone for the help. I ended up having to use a loop to assign values from the table to NinYear. However, as I have played with the full datasets I have noticed that R is MUCH faster if I use vectors in the loop rather than columns of a dataframe. In the specific case of 43,000 lines of data, assigning values from the table to the 43,000 elements of a vector took 6 seconds whereas assigning values from the table to 43,000 elements of a dataframe took 21 minutes. Why is there such a huge difference? Tom Tom La Bone wrote: Assume that I have the dataframe data1, which is listed at the end of this message. I want count the number of lines that each person has for each year. For example, the person with ID=213 has 15 entries (NinYear) for 1953. The following bit of code calculates NinYear: for (i in 1:length(data1$ID)) { data1$NinYear[i] - length(data1[data1$Year==data1$Year[i] data1$ID==data1$ID[i],1]) } This seems to work but is horribly slow (some files I am working with have over 500,000 lines). Can anyone suggest a faster way of doing this, perhaps a way that does not use a for loop? Thanks. Tom IDYearNinYear 209 19710 209 19710 213 19510 213 19510 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19540 213 19540 213 19540 213 19540 213 19540 213 19540 213 19540 213 19540 213 19540 213 19540 213 19540 213 19550 213 19550 234 19530 234 19530 234 19530 234 19530 234 19530 234 19580 234 19580 234 19650 234 19650 234 19650 249 19520 249 19520 -- View this message in context: http://www.nabble.com/Doing-a-Task-Without-Using-a-For-Loop-tp19974078p19991682.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help Help with sampling
Thanks for the reply Jorge, that works BUT the reason I didn't use it, is because I need to calculate the mean and Standard deviation for S1,S2,S3 and S4 in each sample. which means I'll have 5 means for S1, and 5 means for S2 and 5 means for S3 and 5 means for S4 (and at the end I have to get the average of the means for each 'S'). I used the following: for(i in 1:5) {temp-sample(A3,3, replace=F) + Trans=t(temp) + Avg=mean(Trans) + show(temp) + show(Trans) + show(Avg)} the reason I used Transpose is to get the R to give me the mean for S's. but what I get is just one mean for each sample. not the mean for each S. any suggestion how to do it? thanks again for all your help Jorge Ivan Velez wrote: Dear Alex, Is this what you want? # Data set my=read.table(textConnection( X8 X9 X10 X102 X110 X177 X283 X284 X286 X292 X297 X306 X308 X314 0 1 000100000000 0 0 001000000010 0 1 000000000010 1 0 00100001000 0),header=TRUE) closeAllConnections() rownames(my)=paste('s',1:4,sep=) # Samples res=list() for(i in 1:5) res[[i]]- my[,sample(colnames(my),5)] res HTH, Jorge On Wed, Oct 15, 2008 at 10:07 AM, Alex99 [EMAIL PROTECTED] wrote: Hi everyone, I have a dataset(named Mydata) which includes 4 different variables named; s1,s2,s3,s4 .Each variable(symptom) has 14 patients. I need to use random sampling to make, 5 different samples from my data with 5 patients in each sample. i.e. using all 4 variables I need to make 5 different samples by changing patients(with 5 patients in each sample). X8 X9 X10 X102 X110 X177 X283 X284 X286 X292 X297 X306 X308 X314 s1 0 1 000100000000 s2 0 0 001000000010 s3 0 1 000000000010 s4 1 0 001000010000 I used this code: temp=list(NULL) for(i in 1:5) {temp[i]-sample(Mydata,5, replace=F)} show(temp) but I get the following error: number of items to replace is not a multiple of replacement length any idea why I get this eeror message and how can I fix it? Thanks a lot. -- View this message in context: http://www.nabble.com/Help-Help-with-sampling-tp19994275p19994275.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Help-Help-with-sampling-tp19994275p19995327.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice key title color
Another way to look through the possible par settings that hopefully makes it a little less intimidating is: library(TeachingDemos) TkListView(trellis.par.get()) This requires the TeachingDemos package (obviously) and the tcltk package (most have it, but sometimes you need to run R a certain way for it to work). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] 801.408.8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of Dieter Menne Sent: Wednesday, October 15, 2008 8:18 AM To: [EMAIL PROTECTED] Subject: Re: [R] Lattice key title color Meesters, Erik Erik.Meesters at wur.nl writes: is there a way to define the color of the title for the legend in lattice? Getting the right par to set in lattice can be intimidating. I keep the result of trellis.par.get() in a text file and search for the closest match. Dieter library(lattice) show.settings() # does not help much here (wish this were more complete..) # capture the following output into a text file trellis.par.get() # end of trying around, let's work # you need to know that the title called main pars = trellis.par.get(par.main.text) pars$col=red trellis.par.set(par.main.text,pars) xyplot(rnorm(100)~rnorm(100),main=Hello) # I normally use the following method to set many of my pars at once DMlatticeOptions - function(superpose.polygon.col=NULL, superpose.line.col=NULL) { require(lattice) require(RColorBrewer) ltheme = canonical.theme(color=TRUE) if (!is.null(superpose.polygon.col)) ltheme$superpose.line$col = superpose.line.col else ltheme$superpose.line$col = c('black',red,blue,#e3,darkgreen, gray) # ltheme$superpose.line$col = rev(brewer.pal(8,Set1)) ltheme$superpose.fill$col = ltheme$superpose.line$col if (!is.null(superpose.polygon.col)) ltheme$superpose.polygon$col = superpose.polygon.col ltheme$strip.shingle$col = ltheme$superpose.polygon$col ltheme$superpose.symbol$pch = c(16,17,18,1,2,3,4,8) ltheme$superpose.symbol$col = ltheme$superpose.line$col ltheme$superpose.symbol$cex = 0.4 ltheme$strip.background$col = c(gray90, gray80) ltheme$background$col = transparent ltheme$par.main.text$cex = 0.9 # default is 1.2 ltheme$par.ylab.text$cex =0.8 ltheme$par.ylab.text$cex =0.8 ltheme$add.text$cex = 1 ltheme$axis.text$cex = 0.6 ltheme$box.rectangle$col = black ltheme$box.umbrella$col = black ltheme$dot.symbol$col = black ltheme$plot.symbol$col = black ltheme$plot.line$col = black ltheme$plot.symbol$cex = 0.3 ltheme$plot.symbol$pch = c(16) ltheme$plot.polygon$col = #A6D96A ltheme$par.sub.text$cex=0.7 ltheme$par.sub.text$font=1 lattice.options(default.theme=ltheme) } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice key title color
Greg Snow Greg.Snow at imail.org writes: Another way to look through the possible par settings that hopefully makes it a little less intimidating is (completed by DM): library(lattice) library(TeachingDemos) TkListView(trellis.par.get()) Nice; should come handy in other contexts. But with lattice pars, searching in a text editor can be very efficient. Dieter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extracting standard error from survreg?
jpl wrote: Hello, I would like to extract the standard error for the coefficients returned when using survreg. Does anyone know how to do this? Thank you. Joan Apparently, it is summary(mymodel)$table[,2] (Why, oh why, can't Terry allow coef(summary(...)) like we have in lm/glm??? Oh well...) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Network meta-analysis, varConstPower in nlme
Dear Thomas Lumley, and R-help list members, I have read your article Network meta-analysis for indirect treatment comparisons (Statist Med, 2002) with great interest. I found it very helpful that you included the R code to replicate your analysis; however, I have had a problem replicating your example and wondered if you are able to give me a hint. When I use the code from the article: lme1 - lme(Y1 ~ trt.B + trt.C + trt.D + trt.E, random = ~ 1 | trtpair, data=lumley1, var = varConstPower(form=~sigma, fixed=list(power=1))) I get an error message: Error in lme(Y1 ~ trt.B + trt.C + trt.D + trt.E, random = ~1 | trtpair, : unused argument(s) (var = list(const = numeric(0), power = numeric(0))) The problem seems to be in the varConstPower component, but I don't understand exactly what is wrong. (I have pasted the complete code to reproduce the result at the end of this email.) I use R 2.6.1 and nlme 3.1-86. Any help would be greatly appreciated! Best, Christian Gold --- lumley1 - scan() 2 1 -2.428 -1.451 2 5 3.233 3.819 3 2 -0.493 -0.405 1 2 0.577 1.868 3 1 -2.06 -0.709 3 4 1.976 1.196 2 3 1.084 2.055 2 3 1.255 1.278 3 4 0.807 1.058 3 4 0.254 1.672 1 4 3.008 3.634 1 5 3.817 4.063 2 1 -0.755 -0.405 1 3 1.339 2.100 lumley1 - data.frame(matrix(lumley1, byrow=T, ncol=4)) names(lumley1) - c(Active, Control, Y1, Y2) trt - matrix(0, ncol=5, nrow=nrow(lumley1)); colnames(trt) - paste(trt, LETTERS[1:5], sep=.) for (i in 1:nrow(trt)){trt[i, lumley1$Active[i]] = 1; trt[i, lumley1$Control[i]] = -1} lumley1 - data.frame(data.frame(trt), lumley1) lumley1 - transform(lumley1, trtpair = paste(Active, Control)) sigma - rep(sqrt(.5), nrow=lumley1) library(nlme) lme1 - lme(Y1 ~ trt.B + trt.C + trt.D + trt.E, random = ~ 1 | trtpair, data=lumley1, var = varConstPower(form=~sigma, fixed=list(power=1))) summary(lme1) lme2 - lme(Y2 ~ trt.B + trt.C + trt.D + trt.E, random = ~ 1 | trtpair, data=lumley1), var = varConstPower(form=~sigma, fixed=list(power=1))) summary(lme2) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error in Switch in KhmaladzeTest
Hey, My dataset has 1 dependent variable(Logloss) and 7 independent dummy variables(AS,AM,CB,CF,RB,RBR,TS) , it's attached in this email. The problem is I cant finish Khmaladze test because there's an error Error in switch(mode(x), NULL = structure(NULL, class = formula), : invalid formula which I really dont know how to fix. My R version is 2.7.2. The packages loaded are Quantreq and Sparse M, the process is as follows: CPBP-read.table(CPBP.txt) local({pkg - select.list(sort(.packages(all.available = TRUE))) + if(nchar(pkg)) library(pkg, character.only=TRUE)}) Loading required package: SparseM Package SparseM (0.78) loaded. To cite, see citation(SparseM) Package quantreg (4.22) loaded. To cite, see citation(quantreg) fit-rqProcess(Logloss~AS+AM+CB+CF+RB+RBR,data=CPBP,taus=-1) KhmaladzeTest(fit,nullH=location) Error in switch(mode(x), NULL = structure(NULL, class = formula), : invalid formula I would greatly appreciate if you can help me out, thank you very much (See attached file: CPBP.txt) Best Regards, Lulu Ren - ** This E-mail is confidential. It may also be legally privileged. If you are not the addressee you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return E-mail. Internet communications cannot be guaranteed to be timely, secure, error or virus-free. The sender does not accept liability for any errors or omissions. ** SAVE PAPER - THINK BEFORE YOU PRINT!Logloss AS AM CB CF RB RBR TS 1 8.760 0 0 0 1 0 0 2 7.710 0 0 0 1 0 0 3 7.390 0 0 0 1 0 0 4 7.310 0 0 0 1 0 0 5 7.180 0 0 0 1 0 0 6 7.090 0 0 0 1 0 0 7 6.960 0 0 0 1 0 0 8 6.910 0 0 0 1 0 0 9 6.850 0 0 0 1 0 0 10 6.850 0 0 0 1 0 0 11 6.850 0 0 0 1 0 0 12 6.820 0 0 0 1 0 0 13 6.740 0 1 0 0 0 0 14 6.720 0 0 0 1 0 0 15 6.700 0 1 0 0 0 0 16 6.690 0 0 0 1 0 0 17 6.570 0 0 1 0 0 0 18 6.490 1 0 0 0 0 0 19 6.490 0 0 0 1 0 0 20 6.480 0 0 0 1 0 0 21 6.470 0 0 1 0 0 0 22 6.430 0 0 0 0 0 1 23 6.381 0 0 0 0 0 0 24 6.330 0 0 0 1 0 0 25 6.310 0 0 0 1 0 0 26 6.300 0 0 1 0 0 0 27 6.201 0 0 0 0 0 0 28 6.180 0 0 0 1 0 0 29 6.120 0 0 0 1 0 0 30 6.080 0 1 0 0 0 0 31 6.080 0 0 0 0 1 0 32 6.040 0 0 0 1 0 0 33 6.020 0 1 0 0 0 0 34 6.000 0 0 0 1 0 0 35 6.000 0 0 0 1 0 0 36 5.960 0 0 0 1 0 0 37 5.920 0 0 0 1 0 0 38 5.900 0 0 1 0 0 0 39 5.890 0 0 0 0 1 0 40 5.850 0 0 0 1 0 0 41 5.750 0 0 0 1 0 0 42 5.700 0 1 0 0 0 0 43 5.700 0 0 0 1 0 0 44 5.700 0 1 0 0 0 0 45 5.700 0 1 0 0 0 0 46 5.700 0 0 0 0 1 0 47 5.690 0 0 0 1 0 0 48 5.680 1 0 0 0 0 0 49 5.631 0 0
Re: [R] Lattice key title color
Yes, I agree that for searching for specific terms, the file approach is great. I suggested the TkListView more for browsing through the possible parameters to learn what is there and can be changed (and to see what the current values are of those of interest). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] 801.408.8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of Dieter Menne Sent: Wednesday, October 15, 2008 9:27 AM To: [EMAIL PROTECTED] Subject: Re: [R] Lattice key title color Greg Snow Greg.Snow at imail.org writes: Another way to look through the possible par settings that hopefully makes it a little less intimidating is (completed by DM): library(lattice) library(TeachingDemos) TkListView(trellis.par.get()) Nice; should come handy in other contexts. But with lattice pars, searching in a text editor can be very efficient. Dieter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extracting standard error from survreg?
On Wed, 15 Oct 2008, Peter Dalgaard wrote: jpl wrote: Hello, I would like to extract the standard error for the coefficients returned when using survreg. Does anyone know how to do this? Apparently, it is summary(mymodel)$table[,2] In order not to compute on the internal structure but to use extractor methods, you could use vcov(mymodel) and sqrt(diag(vcov(mymodel))) for the covariance matrix and the standard errors respectively. (Why, oh why, can't Terry allow coef(summary(...)) like we have in lm/glm??? Oh well...) Is that desired or just a by-product? From the description of the coef() generic, it seems that coef() should only extract coefficients (not their standard errors, Wald statistics, and p values). Z __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ggplot2: sub/super-script axes labels
?mathplot but this should work: qplot(1:10,1:10)+scale_y_continuous(expression(Respiration, pmol *O[2]*h^-1)) On Wed, Oct 15, 2008 at 7:45 AM, Adam Marsh [EMAIL PROTECTED] wrote: How are sub/superscripts designated for text in axis labels? As in this y-axis label: scale_y_continuous(Respiration, pmol O2 h-1) where the 2 needs to be subscripted and the -1 needs to be superscripted. Thanks. Adam --- Adam G. Marsh, Ph.D. Associate Professor Marine Biological Sciences University of Delaware Lewes, DE 19958 302.645.4367 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Stephen Sefick Research Scientist Southeastern Natural Sciences Academy Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice key title color
Also: str(trellis.par.get()) show it compactly. On Wed, Oct 15, 2008 at 11:18 AM, Greg Snow [EMAIL PROTECTED] wrote: Another way to look through the possible par settings that hopefully makes it a little less intimidating is: library(TeachingDemos) TkListView(trellis.par.get()) This requires the TeachingDemos package (obviously) and the tcltk package (most have it, but sometimes you need to run R a certain way for it to work). -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] 801.408.8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of Dieter Menne Sent: Wednesday, October 15, 2008 8:18 AM To: [EMAIL PROTECTED] Subject: Re: [R] Lattice key title color Meesters, Erik Erik.Meesters at wur.nl writes: is there a way to define the color of the title for the legend in lattice? Getting the right par to set in lattice can be intimidating. I keep the result of trellis.par.get() in a text file and search for the closest match. Dieter library(lattice) show.settings() # does not help much here (wish this were more complete..) # capture the following output into a text file trellis.par.get() # end of trying around, let's work # you need to know that the title called main pars = trellis.par.get(par.main.text) pars$col=red trellis.par.set(par.main.text,pars) xyplot(rnorm(100)~rnorm(100),main=Hello) # I normally use the following method to set many of my pars at once DMlatticeOptions - function(superpose.polygon.col=NULL, superpose.line.col=NULL) { require(lattice) require(RColorBrewer) ltheme = canonical.theme(color=TRUE) if (!is.null(superpose.polygon.col)) ltheme$superpose.line$col = superpose.line.col else ltheme$superpose.line$col = c('black',red,blue,#e3,darkgreen, gray) # ltheme$superpose.line$col = rev(brewer.pal(8,Set1)) ltheme$superpose.fill$col = ltheme$superpose.line$col if (!is.null(superpose.polygon.col)) ltheme$superpose.polygon$col = superpose.polygon.col ltheme$strip.shingle$col = ltheme$superpose.polygon$col ltheme$superpose.symbol$pch = c(16,17,18,1,2,3,4,8) ltheme$superpose.symbol$col = ltheme$superpose.line$col ltheme$superpose.symbol$cex = 0.4 ltheme$strip.background$col = c(gray90, gray80) ltheme$background$col = transparent ltheme$par.main.text$cex = 0.9 # default is 1.2 ltheme$par.ylab.text$cex =0.8 ltheme$par.ylab.text$cex =0.8 ltheme$add.text$cex = 1 ltheme$axis.text$cex = 0.6 ltheme$box.rectangle$col = black ltheme$box.umbrella$col = black ltheme$dot.symbol$col = black ltheme$plot.symbol$col = black ltheme$plot.line$col = black ltheme$plot.symbol$cex = 0.3 ltheme$plot.symbol$pch = c(16) ltheme$plot.polygon$col = #A6D96A ltheme$par.sub.text$cex=0.7 ltheme$par.sub.text$font=1 lattice.options(default.theme=ltheme) } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to make the connection between MikTEX and R
Hi: After a couple of days trying to synchonize or connect MiKTex and R I'm getting a little frustrated. My knowledge about creating and running batch files is kind of limited so, I was wondering if someone could explain in plain english how to do this task. I'm able to run MiKtex by itself and accomplish what I am trying to do, but I want to be able to use the Sweave function from within R (or Tinn-R)and generate my dynamic reports from there. I'm running windows XP and R 2.7.2. Any help will be greatly appreciated. Thanks Felipe D. Carrillo Supervisory Fishery Biologist Department of the Interior US Fish Wildlife Service California, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help Help with sampling
Dear Alex, Is this what you want? my=read.table(textConnection( X8 X9 X10 X102 X110 X177 X283 X284 X286 X292 X297 X306 X308 X314 0 1 000100000000 0 0 001000000010 0 1 000000000010 1 0 00100001000 0),header=TRUE) closeAllConnections() rownames(my)=paste('s',1:4,sep=) # Using 5 samples only res=replicate(5,my[,sample(colnames(my),5)],simplify=FALSE) # mean lapply(res,function(x) rbind(rowMeans(x)) # standard deviation lapply(res,function(x) apply(x,1,sd,na.rm=TRUE)) HTH, Jorge On Wed, Oct 15, 2008 at 10:58 AM, Alex99 [EMAIL PROTECTED] wrote: Thanks for the reply Jorge, that works BUT the reason I didn't use it, is because I need to calculate the mean and Standard deviation for S1,S2,S3 and S4 in each sample. which means I'll have 5 means for S1, and 5 means for S2 and 5 means for S3 and 5 means for S4 (and at the end I have to get the average of the means for each 'S'). I used the following: for(i in 1:5) {temp-sample(A3,3, replace=F) + Trans=t(temp) + Avg=mean(Trans) + show(temp) + show(Trans) + show(Avg)} the reason I used Transpose is to get the R to give me the mean for S's. but what I get is just one mean for each sample. not the mean for each S. any suggestion how to do it? thanks again for all your help Jorge Ivan Velez wrote: Dear Alex, Is this what you want? # Data set my=read.table(textConnection( X8 X9 X10 X102 X110 X177 X283 X284 X286 X292 X297 X306 X308 X314 0 1 000100000000 0 0 001000000010 0 1 000000000010 1 0 00100001000 0),header=TRUE) closeAllConnections() rownames(my)=paste('s',1:4,sep=) # Samples res=list() for(i in 1:5) res[[i]]- my[,sample(colnames(my),5)] res HTH, Jorge On Wed, Oct 15, 2008 at 10:07 AM, Alex99 [EMAIL PROTECTED] wrote: Hi everyone, I have a dataset(named Mydata) which includes 4 different variables named; s1,s2,s3,s4 .Each variable(symptom) has 14 patients. I need to use random sampling to make, 5 different samples from my data with 5 patients in each sample. i.e. using all 4 variables I need to make 5 different samples by changing patients(with 5 patients in each sample). X8 X9 X10 X102 X110 X177 X283 X284 X286 X292 X297 X306 X308 X314 s1 0 1 000100000000 s2 0 0 001000000010 s3 0 1 000000000010 s4 1 0 001000010000 I used this code: temp=list(NULL) for(i in 1:5) {temp[i]-sample(Mydata,5, replace=F)} show(temp) but I get the following error: number of items to replace is not a multiple of replacement length any idea why I get this eeror message and how can I fix it? Thanks a lot. -- View this message in context: http://www.nabble.com/Help-Help-with-sampling-tp19994275p19994275.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Help-Help-with-sampling-tp19994275p19995327.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] YALAQ - Yet Another LApply Question
Please forgive this repost, it's been a week without a squeak. No comments? Original post: https://stat.ethz.ch/pipermail/r-help/2008-October/176340.html Hello, Two lapply questions (system info and sample data below): 1) Why does the first form of command1 add the name of y _after_ the str() output rather than before as does the second (preferred) form? # command1 version1 invisible(lapply(ls(pattern='bn'), function(y) cat(y, \n, str(get(y)), \n) )) # command1 version2 (preferred output) invisible(lapply(ls(pattern='bn'), function(y) { cat(y, \n) ; str(get(y)) ; cat(\n) } )) 2) Why does the same method as command1 version2 above not work with the summary() command as does the second (preferred) set of commands? # command2 version1 lapply(ls(pattern='bn'), function(y) { cat(y, \n) ; summary(get(y)) ; cat(\n) } ) # command2 version2 (preferred output) smry.list - lapply(ls(pattern='bn'), function(y) summary(get(y))) names(smry.list) - ls(pattern='bn') smry.list Thanx, DaveT. ### # system info: sessionInfo() ; cat(\n) ; Sys.info()[c(1:3,5)] R version 2.7.2 (2008-08-25) i386-pc-mingw32 locale: LC_COLLATE=English_Canada.1252;LC_CTYPE=English_Canada.1252;LC_MONETARY= English_Canada.1252;LC_NUMERIC=C;LC_TIME=English_Canada.1252 attached base packages: [1] stats graphics grDevices datasets tcltk utils methods base other attached packages: [1] debug_1.1.0 mvbutils_1.1.1 lattice_0.17-14 plotrix_2.4-7 svSocket_0.9-5 TinnR_1.0.2 R2HTML_1.59 Hmisc_3.4-3 loaded via a namespace (and not attached): [1] cluster_1.11.11 grid_2.7.2 svMisc_0.9-5tools_2.7.2 sysname release version machine Windows XP build 2600, Service Pack 2x86 # system info: ### ### # sample data # dput(ban.nat.1994[sample(row.names(ban.nat.1994), 20),]) bn94 - structure(list(oplt = c(18L, 50L, 11L, 16L, 54L, 35L, 45L, 40L, 15L, 50L, 38L, 45L, 53L, 15L, 1L, 54L, 33L, 13L, 30L, 21L), tree = c(144L, 824L, 47L, 525L, 291L, 702L, 717L, 615L, 821L, 551L, 750L, 639L, 664L, 813L, 31L, 346L, 689L, 59L, 200L, 658L), bd1 = c(NA, NA, 3.6, 3.1, 4.72, 2.03, 2.88, 1.65, 5.39, 3.04, 2.75, 3.06, 2.81, 2.78, NA, 6.5, 4.62, 4.76, NA, 2.69), bd2 = c(NA, NA, 3.41, 3.06, 4.86, 2.09, 2.78, 1.8, 5.08, 3.26, 2.71, 3.1, 2.87, 2.73, NA, 6.6, 4.53, 4.97, NA, 2.81), bd = c(NA, 4.25, 3.51, 3.08, 4.79, 2.06, 2.83, 1.72, 5.23, 3.15, 2.73, 3.08, 2.84, 2.76, NA, 6.55, 4.58, 4.87, NA, 2.75), ht = c(NA, 20.4, 18.1, 18, 25.8, 13.1, 15.7, 4, 16, 14, 12.7, 8.6, 8.1, 16.2, NA, 52.7, 31.7, 23.7, NA, 17.6), spr = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_), stat = c(1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L), dam = c(NA, 2L, NA, NA, NA, NA, NA, 2L, NA, NA, 2L, NA, 1L, NA, NA, NA, 1L, NA, NA, NA), com = c(, from partial data set, tend = 1, , , , , , , , , , , , , , , , , , )), .Names = c(oplt, tree, bd1, bd2, bd, ht, spr, stat, dam, com ), row.names = c(1049L, 2845L, 549L, 884L, 3128L, 1887L, 2432L, 2087L, 879L, 2729L, 1990L, 2421L, 3089L, 871L, 3L, 3158L, 1801L, 704L, 1653L, 1215L), class = data.frame) # dput(ban.nat.1995[sample(row.names(ban.nat.1995), 20),]) bn95 - structure(list(oplt = c(2L, 27L, 54L, 8L, 8L, 51L, 3L, 4L, 20L, 35L, 15L, 22L, 31L, 7L, 4L, 31L, 34L, 6L, 20L, 51L), tree = c(773L, 308L, 331L, 234L, 235L, 170L, 211L, 701L, 745L, 709L, 798L, 350L, 207L, 736L, 718L, 240L, 193L, 266L, 735L, 244L), bd1 = c(8.41, NA, 6.93, 2.74, 6.06, 2.36, 5.87, 3.53, 2.48, NA, 9, 4.18, 1.74, 3.42, 5.54, 3.74, 3.26, 2.38, 4.42, 3.65), bd2 = c(8.76, NA, 7.17, 2.82, 6.16, 2.33, 6.05, 2.58, 2.5, NA, 9.04, 4.22, 1.68, 3.39, 5.52, 3.68, 3.18, 2.38, 4.47, 3.74), bda = c(8.59, NA, 7.05, 2.78, 6.11, 2.34, 5.96, 3.05, 2.49, NA, 9.02, 4.2, 1.71, 3.41, 5.53, 3.71, 3.22, 2.38, 4.44, 3.7), ht = c(69.2, NA, 55.2, 25.7, 47.8, 17.1, 35.6, 11, 12, NA, 52.5, 33.2, 10.4, 16.2, 32.7, 22.1, 15, 13.6, 24.5, 22.4), spr = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_), stat = c(0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), dam = c(NA, NA, NA, NA, NA, NA, NA, 1L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), com = c(, , , , , , , , , , , , , , , , , , , )), .Names = c(oplt, tree, bd1, bd2, bda, ht, spr, stat, dam, com ), row.names = c(34L, 462L, 724L, 227L, 228L, 673L, 76L, 123L, 368L, 632L, 281L, 391L, 495L, 199L, 140L, 528L, 604L, 191L, 358L, 689L), class =
Re: [R] Labour Statistics
Ruben, Thankyou for the advice. I'll do what I can with it. Ruben Wrote: If I understand your problem correctly, you have that the magnitude of deviations from the mean/median/mode in the volume of your requests for background checks in month m predicts a multivariate response that represents the macroeconomic situation in month m+1. First, regarding your original question, a statistician judging your product would like to see a measure of predictive success. If you have a model to relate your predictor (the deviation in volume of requests) with your response (several variables representing the macroeconomic status) then you could run the model for many months (say from Jan 2000 to Sep 2008) and predict the macroeconomic status with the model and compare it with the actual macroeconomic status observed. This would be framed into a measure of predictive success and predictive mean squared error. Second, regarding what method to use to fit the relation between your predictor and the multivariate response, you have a number of options. One simple alternative that would reduce your problem to a simple univariate modeling problem would be to research the economic literature to define an index of macroeconomic status that would reduce your multivariate response to an univariate response. Additionally, if the variables in the multivariate response are strongly correlated, you can define your own index by using principal component analysis on the multivariate response, and later use the first principal component as a univariate response. After that, many options are again available, such as forecasting methods or regular time series analysis. A more complex but probably more precise approach would be to model the multivariate response as such. This depends on the nature of the variables in the multivariate response. If they can be considered as multinomial counts then you have a very good solution using multinomial logistic regression with function multinom in package nnet. Maybe this can get you started. Regards Rubén Gad Abraham explained : Max wrote: Hi everyone, This is not so much of an R question as a statistics question. I currently work for the largest pre employment screening company in Canada. Upper management has noticed that noticed that usually a month or so before any big kind of economic shock happens, that our incoming files (requests for a background check) jump up or down. As the company statistician, they've asked me to see if the relationship is strong enough to put together a product that can be sold to any kind of firm or organization (brokerages or any kind of investing firm, federal ministry of finance, statistics canada (like the bureau of stats in the USA), universities etc) In Canada on the 10th of every month, statistics canada releases labour statistics for the previous month. The way CFO sees it, *ideally* on the (1st to 10th, something like that) every month, the firm I work for could be releasing data for the rest of the month. What I'm trying to figure out is if you were in the position of evaluating the final product for purchase, what kind of information would make the product credible/viable? Summary statistics? Variance covariance matrices? Graphs of the data? Cross Correlation matrices for time series analysis? It's frustrating because I can see a noticeable relationship between our file volume and the unemployment rate (in particular,) but I'm not sure how to appropriately frame it in a way that another statistician/modeler would want the data. Why not start with some simple plots of the relationships between your variables? Once you have a feel for the problem, you can look into modelling it more formally using a suitable regression model. Gad, the issue I have is that I technically have one predictor for multiple response. The data is not very clean for simple univariate models. Unfortunately, my knowledge of multivariate response models is poor, and how to set up the problem in R as a multivariate regression is a total mystery to me. (Multivariate was the one course that I wasn't able to take in my undergrad math/stats degree. ) The other issue is that if I view the problem as a time series problem, it's multiple time series analysis, which I don't have any books on. The more I look at the data and the problem the more I feel like I'm in way over my head. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Defining a list
Can anyone please tell me how to define a list. Suppose I want to define a list object result with length n then want to fill each place of result with different objects. For e.g. i=1 result[1] = rnorm(1) i=2 result[2] = rnorm(2) ... i=n result[n] = rnorm(n) What would be the best way to do that? Regards, __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extracting standard error from survreg?
Thanks! This is exactly what I needed. Peter Dalgaard wrote: jpl wrote: Hello, I would like to extract the standard error for the coefficients returned when using survreg. Does anyone know how to do this? Thank you. Joan Apparently, it is summary(mymodel)$table[,2] (Why, oh why, can't Terry allow coef(summary(...)) like we have in lm/glm??? Oh well...) -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Extracting-standard-error-from-survreg--tp19994742p19996412.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Heuristic optimisation?
I wondered was people on this list felt about this article: http://www.voxeu.org/index.php?q=node/2363 which talks about the problems of obtaining sound answers in numerical optimisation in settings such as MLE or NLS. -- Ajay Shah http://www.mayin.org/ajayshah [EMAIL PROTECTED] http://ajayshahblog.blogspot.com *(:-? - wizard who doesn't know the answer. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to make the connection between MikTEX and R
Suppose foo.Rnw is in \a\b and you have placed MiKTeX in the default place when you installed it and suppose that when you installed R you did not prevent it from writing to the reigstry. Then: 1. Go to the link I gave already, read everything there, go to the download page and download batchfiles_0.3-2.zip 2. Unzip that, read the README and place sweave.bat in \a\b (or anywhere in your PATH). 3. Press Windows-R to get into Windows console and type: cd \a\b Sweave foo.Rnw On Wed, Oct 15, 2008 at 12:12 PM, Felipe Carrillo [EMAIL PROTECTED] wrote: Hi: After a couple of days trying to synchonize or connect MiKTex and R I'm getting a little frustrated. My knowledge about creating and running batch files is kind of limited so, I was wondering if someone could explain in plain english how to do this task. I'm able to run MiKtex by itself and accomplish what I am trying to do, but I want to be able to use the Sweave function from within R (or Tinn-R)and generate my dynamic reports from there. I'm running windows XP and R 2.7.2. Any help will be greatly appreciated. Thanks Felipe D. Carrillo Supervisory Fishery Biologist Department of the Interior US Fish Wildlife Service California, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error in Switch in KhmaladzeTest
Hey back, If you had RTFM you would know that the first argument of the function KhmaladzeTest was supposed to be a formula, so try: T - KhmaladzeTest(Logloss~AS+AM+CB+CF+RB+RBR, data=CPBP, nullH=location) url:www.econ.uiuc.edu/~rogerRoger Koenker email[EMAIL PROTECTED]Department of Economics vox: 217-333-4558University of Illinois fax: 217-244-6678Champaign, IL 61820 On Oct 15, 2008, at 10:39 AM, [EMAIL PROTECTED] wrote: Hey, My dataset has 1 dependent variable(Logloss) and 7 independent dummy variables(AS,AM,CB,CF,RB,RBR,TS) , it's attached in this email. The problem is I cant finish Khmaladze test because there's an error Error in switch(mode(x), NULL = structure(NULL, class = formula), : invalid formula which I really dont know how to fix. My R version is 2.7.2. The packages loaded are Quantreq and Sparse M, the process is as follows: CPBP-read.table(CPBP.txt) local({pkg - select.list(sort(.packages(all.available = TRUE))) + if(nchar(pkg)) library(pkg, character.only=TRUE)}) Loading required package: SparseM Package SparseM (0.78) loaded. To cite, see citation(SparseM) Package quantreg (4.22) loaded. To cite, see citation(quantreg) fit-rqProcess(Logloss~AS+AM+CB+CF+RB+RBR,data=CPBP,taus=-1) KhmaladzeTest(fit,nullH=location) Error in switch(mode(x), NULL = structure(NULL, class = formula), : invalid formula I would greatly appreciate if you can help me out, thank you very much (See attached file: CPBP.txt) Best Regards, Lulu Ren - ** This E-mail is confidential. It may also be legally privileged. If you are not the addressee you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return E-mail. Internet communications cannot be guaranteed to be timely, secure, error or virus-free. The sender does not accept liability for any errors or omissions. ** SAVE PAPER - THINK BEFORE YOU PRINT! CPBP.txt__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice key title color
Gabor Grothendieck ggrothendieck at gmail.com writes: str(trellis.par.get()) Never thought of that. Really nice. Dieter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Help with matplot
Hi, I apologise in advance for the naïve question. I have large matrices that I want to plot. I currently use color2D.matplot. However, these matrices contain many values of no interest (i.e. where there is no data, the figure -999 is automatically displayed). Is there any way of removing these from the matrices to be plotted by matplot? An obvious possibility is setting them all to 0, but that introduces 'false data'. I also see that matplot does not have a way of dealing with NaNs. Is this actually the case? Thanks in advance for any help you can give me!! Bernardo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Extracting standard error from survreg?
Achim Zeileis wrote: On Wed, 15 Oct 2008, Peter Dalgaard wrote: jpl wrote: Hello, I would like to extract the standard error for the coefficients returned when using survreg. Does anyone know how to do this? Apparently, it is summary(mymodel)$table[,2] In order not to compute on the internal structure but to use extractor methods, you could use vcov(mymodel) and sqrt(diag(vcov(mymodel))) for the covariance matrix and the standard errors respectively. (Why, oh why, can't Terry allow coef(summary(...)) like we have in lm/glm??? Oh well...) Is that desired or just a by-product? From the description of the coef() generic, it seems that coef() should only extract coefficients (not their standard errors, Wald statistics, and p values). Good question. It's likely mostly coincidental, but it wouldn't hurt to use the construction idiomatically. -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with matplot
What do you mean by removing them? Do you want the whole row/column deleted? Have you tried setting them to NA so that the points are not plotted? You have to be a little more specific on what type of action you want to have happen when the criteria is met. Of course the answer is anything is possible in R. On Wed, Oct 15, 2008 at 6:19 AM, Garcia Carreras, Bernardo [EMAIL PROTECTED] wrote: Hi, I apologise in advance for the naïve question. I have large matrices that I want to plot. I currently use color2D.matplot. However, these matrices contain many values of no interest (i.e. where there is no data, the figure -999 is automatically displayed). Is there any way of removing these from the matrices to be plotted by matplot? An obvious possibility is setting them all to 0, but that introduces 'false data'. I also see that matplot does not have a way of dealing with NaNs. Is this actually the case? Thanks in advance for any help you can give me!! Bernardo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Network meta-analysis, varConstPower in nlme
Hi Christian, I believe that the argument var has changed name to weights. The following lines work for me: ... sigma - rep(sqrt(.5), nrow(lumley1)) # not nrow= lme1 - lme(Y1 ~ trt.B + trt.C + trt.D + trt.E, random = ~ 1 | trtpair, data=lumley1, weights = varConstPower(form=~sigma, fixed=list(power=1))) lme2 - lme(Y2 ~ trt.B + trt.C + trt.D + trt.E, random = ~ 1 | trtpair, data=lumley1), weights = varConstPower(form=~sigma, fixed=list(power=1))) I hope you can now make it work!? Christian __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice key title color
On 10/15/08, Dieter Menne [EMAIL PROTECTED] wrote: Gabor Grothendieck ggrothendieck at gmail.com writes: str(trellis.par.get()) Never thought of that. Really nice. There's also http://lmdvr.r-forge.r-project.org/figures/figures.html?chapter=07;figure=07_03 All of which doesn't really answer the original question. Searching in page(draw.key) reveals the following snippet: if (!is.null(key$title)) key.gf - placeGrob(key.gf, textGrob(label = key$title, gp = gpar(cex = key$cex.title, lineheight = key$lineheight)), row = 1, col = NULL) So, the title color cannot be currently specified, either directly or through the settings system. -Deepayan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] investigating interaction term for a model of Gross Primary Productivity
I am trying to investigate the interaction term in the below. The paradigm in aquatic systems is that algal production is either nitrogen (TIN) or Phosphorus limited, and I am trying to investigate this- what is the best way to go about investigating the interaction term. I have some thoughts on the above, but I will withhold them to see what others think. Thanks for your help. d - (structure(list(d.GPP = c(1.213695235, 3.817313822, 1.267930498, 10.45692825, 3.268295623, 3.505286001, 4.468225245, 0.915653726, 1.635617261, 3.726133898, 1.363453706, 13.99650967, 0.417618143, 0.741080504, 0.412440872, 3.515743675, 8.248491445, 1.537773727, 1.537773727, 3.249103284, 3.249103284, 0.768531626, 2.633107621, 3.113199095, 0.773824094, 0.680150068, 0.680150068, 0.026385752, 0.369310858, 8.049276658, 7.487378383, 0.950072035, 0.950072035, 0.763580377, 0.333244629, 5.475999014, 9.235631398), d.TIN = c(0.53, 0.52, 0.446, 0.217, 0.39, 0.34, 0.45, 0.29, 0.23, 0.308, 0.3, 0.1, 0.52, 0.13, 0.38, 0.36, 0.226, 0.345, 0.345, 0.217, 0.217, 0.47, 0.34, 0.31, 0.35, 0.41, 0.41, 0.432, 0.333, 0.168, 0.22, 0.37, 0.37, 0.44, 0.38, 0.3, 0.105), d.Phosphorus = c(0.17, 0.19, 0.2, 0.012, 0.13, 0.14, 0.069, 0.13, 0.13, 0.13, 0.099, 0.013, 0.16, 0.076, 0.12, 0.037, 0.011, 0.1, 0.1, 0.021, 0.021, 0.12, 0.099, 0.088, 0.036, 0.15, 0.15, 0.12, 0.12, 0.0076, 0.014, 0.14, 0.14, 0.15, 0.15, 0.12, 0.011), d.TSS = c(2.8, 8.4, 11, 1.3, 4.2, 2, 3.4, 14, 8.2, 3.1, 1.4, 0.9, 6.1, 9.2, 11, 1.2, 1.3, 11, 11, 8.5, 8.5, 13, 4.4, 1.4, 2.1, 25, 25, 9.3, 6.1, 1.6, 1.5, 19, 19, 24, 9.6, 1.8, 1.4)), .Names = c(d.GPP, d.TIN, d.Phosphorus, d.TSS), row.names = c(NA, -37L), class = data.frame)) #here is the model model.GPP - lm(GPP~TIN*Phosphorus+I(1/TSS), data=d) #regression equation GPP - function(TIN, P, TSS){ 8.765-(22.896*TIN)-(63.513*P)+(5.836/TSS)+(170.735*(TIN*P)) } #predicting GPP from the data - This is kind of useless in this context because I am predicting GPP from the data that I modeled it with, but I think I am going to have to fool around #with these values to investigate the interaction term. This is kind of where I am stuck. a - GPP(d$TIN, d$Phosphorus, d$TSS) #plotting the data plot(a~d$TSS) -- Stephen Sefick Research Scientist Southeastern Natural Sciences Academy Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R on 64-bit Windows
In the R for Windows FAQ (2.7.2) we have under 2.2: There is no specific version for x64 Windows, but the standard 32-bit version works well enough. Does that include the handling of huge data sets? What I have read in other places indicates that the answer is No. If so, will there be a 64-bit version availble soon? When I websearch for 64-bit MinGW it seems to exist out there. -- Göran Broström __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] [R-pkgs] New versions of two packages: granova and PSAgraphics
Dear R users, This is to announce new versions of our packages granova and PSAgraphics, both now v.1.2. granova derives from ‘graphical analysis of variance.’ The package consists of four functions that facilitate seeing basic data as well as standard summary statistics for various ANOVA applications. The functions are especially flexible, so they are able to handle widely different data specifications. Focus is directed to basic questions that drive methods, so that the graphics can help to learn how data play out in answer to question(s) posed by methods. Experience suggests these functions can be particularly helpful for students and non-statistician analysts, since they help understand the methods themselves, as well as the data. Dynamic graphics are used to depict two-way data. In all cases numerical results accompany the graphical displays. PSAgraphics aims chiefly to facilitate visualization of various results in the context of propensity score analysis. One set of functions concerns assessment of covariate balance, with respect to defined propensity scores; both numerical and graphical functions are provided. Two other functions help to understand PSA effects after adjustment for propensity scores. The functions have been designed to be flexible, and in all cases numerical results complement the graphic displays. Various documents and displays that illustrate graphic results for these methods are available; write rmp at [EMAIL PROTECTED] . All comments, suggestions for improvement and bug-reports are solicited. Bob Pruzek James Helmreich ___ R-packages mailing list [EMAIL PROTECTED] https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Labour Statistics
Gad Abraham explained : Max wrote: Hi everyone, This is not so much of an R question as a statistics question. I currently work for the largest pre employment screening company in Canada. Upper management has noticed that noticed that usually a month or so before any big kind of economic shock happens, that our incoming files (requests for a background check) jump up or down. As the company statistician, they've asked me to see if the relationship is strong enough to put together a product that can be sold to any kind of firm or organization (brokerages or any kind of investing firm, federal ministry of finance, statistics canada (like the bureau of stats in the USA), universities etc) In Canada on the 10th of every month, statistics canada releases labour statistics for the previous month. The way CFO sees it, *ideally* on the (1st to 10th, something like that) every month, the firm I work for could be releasing data for the rest of the month. What I'm trying to figure out is if you were in the position of evaluating the final product for purchase, what kind of information would make the product credible/viable? Summary statistics? Variance covariance matrices? Graphs of the data? Cross Correlation matrices for time series analysis? It's frustrating because I can see a noticeable relationship between our file volume and the unemployment rate (in particular,) but I'm not sure how to appropriately frame it in a way that another statistician/modeler would want the data. Why not start with some simple plots of the relationships between your variables? Once you have a feel for the problem, you can look into modelling it more formally using a suitable regression model. Gad, the issue I have is that I technically have one predictor for multiple response. The data is not very clean for simple univariate models. Unfortunately, my knowledge of multivariate response models is poor, and how to set up the problem in R as a multivariate regression is a total mystery to me. (Multivariate was the one course that I wasn't able to take in my undergrad math/stats degree. ) The other issue is that if I view the problem as a time series problem, it's multiple time series analysis, which I don't have any books on. The more I look at the data and the problem the more I feel like I'm in way over my head. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Lattice key title color
Dear R users, is there a way to define the color of the title for the legend in lattice? The help page on xyplot has a lot of details on key options just as the new book, but no mentioning of a color attribute for the title. Should I use ltext or is there any other way? Best wishes, Erik [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Maximum number of pasted 'code' lines?
Michael Just wrote: Hello, I write most of my R code in excel and then paste it into R. I am wondering if there is a limit to how much I can paste? I want to paste about 19,000 lines of code should this work? I am doing this because when I did it chunks it took about an hour and half. I thought if I could insert it all and leave it for that long that would be better time management. I am still waiting for 19,000 paste to process. Any thoughts or suggestions? Thanks, Michael [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hi, For windows I use the Tinn-R text editor that integrates nicely with R. For linux I use Kate (standard KDE text editor). 19000 lines of code, pfieuw, that is a lot of codeyou could export the lines to a text file and load them into R using source(). cheers, Paul -- Drs. Paul Hiemstra Department of Physical Geography Faculty of Geosciences University of Utrecht Heidelberglaan 2 P.O. Box 80.115 3508 TC Utrecht Phone: +31302535773 Fax:+31302531145 http://intamap.geo.uu.nl/~paul __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Defining a list
Try this: lapply(1:n, rnorm) On Wed, Oct 15, 2008 at 8:19 AM, Megh Dal [EMAIL PROTECTED] wrote: Can anyone please tell me how to define a list. Suppose I want to define a list object result with length n then want to fill each place of result with different objects. For e.g. i=1 result[1] = rnorm(1) i=2 result[2] = rnorm(2) ... i=n result[n] = rnorm(n) What would be the best way to do that? Regards, __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] parameter assessment in differential equation
Hi, I'd like to know whether R is capable to assess parameters in a model describing the kinetic of a pollutant adsorption onto activated carbon. A common relation is for instance the Adam-Bohart-Thomas' one: dx/dt = K1 * (qm-x)*C - K2x where {K1,K2} are the unknown paramters and {qm,C} are known parameters Of course I get experimental data sets of measured x as a function of time. If R can help to handle that, which functions have to be used (diff, optim, nls...) Regards/Cordialement - Benoit Boulinguiez Ph.D Ecole de Chimie de Rennes (ENSCR) Bureau 1.20 Equipe CIP UMR CNRS 6226 Sciences Chimiques de Rennes Campus de Beaulieu, 263 Avenue du Général Leclerc 35700 Rennes, France Tel 33 (0)2 23 23 80 83 Fax 33 (0)2 23 23 81 20 http://www.ensc-rennes.fr/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] trouble with combining png and pdf
L.S., I would like to add a downsized image to a pdf file and got stuck on the code to use. My goal is to open a pdf device, add a plot, add the downsized image, and then close the pdf device. Making the downsized image is easy using png(), png(file=x.png, width = 600, height = 600, units=px, pointsize=12) image(very large image) dev.off() making a pdf with a plot as well using pdf(), pdf(file = y.pdf, paper=letter, width=8.5, height=11) plot(1:100) dev.off() However this produces two separate files and my goal is to export the downsized image (as png) into the pdf file (behind the plot)! Maybe anyone has some suggestions? Regards, Sander Botter Erasmus Medical Centre, Rotterdam, The Netherlands [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] problems using AFDM(FactoMineR)
Hello R-list members! I tried to do the following with my dataset that contains factor and numerics, (80columns,about 600 rows) Dataset.afdm-AFDM(Dataset[282:595,], type=TypeVector, ncp=3) Fehler in svd(X) : infinite or missing values in 'x' TypeVector [1] n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n n [46] n n n n n n n n n n n n n n n n n s s s s s s n n n n n n n n n n n n If I do is.na(Dataset) is.infinite(Dataset) I get only FALSEs as answer or perhaps I should better say I can only see FALSE but tha dataset is huge. The given example for AFDM works. Can somebody give me a hint, perhaps also how I could let me show only NA or infinite values? Many thanks for help in advance and sorry for not giving reproducible code but there is maybe something going wrong with my dataset. B. - The art of living is more like wrestling than dancing. (Marcus Aurelius) -- View this message in context: http://www.nabble.com/problems-using-AFDM%28FactoMineR%29-tp19998607p19998607.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problems with R memory usage on Linux
Hello all, I'm working with a large data-set, and upgraded my RAM to 4GB to help with the mem use. I've got a 32bit kernel with 64GB memory support compiled in. gnome-system-monitor and free both show the full 4GB as being available. In R I was doing some processing and I got the following message (when collecting 100 307200*8 dataframes into a single data-frame (for plotting): Error: cannot allocate vector of size 2.3 Mb So I checked the R memory usage: $ ps -C R -o size SZ 3102548 I tried removing some objects and running gc() R then shows much less memory being used: $ ps -C R -o size SZ 2732124 Which should give me an extra 300MB in R. I still get the same error about R being unable to allocate another 2.3MB. I deleted well over 2.3MB of objects... Any suggestions as to get around this? Is the only way to use all 4GB in R to use a 64bit kernel? Thanks all, B. Bogart __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] gamboost partial fit prediction
Dear useRs, I am struggling to use gamboost function form the 'mboost' package. More precisely, I am trying to extract the *partial fit* for each of the covariates estimated in a model and I usually end up with this annoying: Error in newdata[[xname]] : subscript out of bounds . I hope that the lack of details in my query can be straightforwardly compensated by examining the following code (I am aware of the additive structure of the estimation, therefore the excerpt below is supposed to be only a part of my final solution). If the problem does not become more obvious I will provide more details. Thanks in advance, AS R code rm(list=ls(all=T)) library(mboost) data(bodyfat) ## Model fit bf_gam - gamboost(DEXfat ~ ., data = bodyfat) aic - AIC(bf_gam) fit - bf_gam[mstop(aic)] ## partial prediction input - fit$data$input i - sort(unique(fit$ensemble[,1]))[1] a - fit$ensembles[fit$ensemble[,1]==i] ### Why this does not work? a[[1]][[1]]$predict(newdata=90) [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help Help with sampling
Thanks again Jorge, but when I type in lapply(res,function(x) rbind(rowMeans(x)) it's like it's waiting for something else, it doesnt run. I tried help(lapply) to learn more about lappy but didnt give me any result. could you tell me why it doesnt run and what is lapply for? Thanks a lot for all your help Jorge Ivan Velez wrote: Dear Alex, Is this what you want? my=read.table(textConnection( X8 X9 X10 X102 X110 X177 X283 X284 X286 X292 X297 X306 X308 X314 0 1 000100000000 0 0 001000000010 0 1 000000000010 1 0 00100001000 0),header=TRUE) closeAllConnections() rownames(my)=paste('s',1:4,sep=) # Using 5 samples only res=replicate(5,my[,sample(colnames(my),5)],simplify=FALSE) # mean lapply(res,function(x) rbind(rowMeans(x)) # standard deviation lapply(res,function(x) apply(x,1,sd,na.rm=TRUE)) HTH, Jorge On Wed, Oct 15, 2008 at 10:58 AM, Alex99 [EMAIL PROTECTED] wrote: Thanks for the reply Jorge, that works BUT the reason I didn't use it, is because I need to calculate the mean and Standard deviation for S1,S2,S3 and S4 in each sample. which means I'll have 5 means for S1, and 5 means for S2 and 5 means for S3 and 5 means for S4 (and at the end I have to get the average of the means for each 'S'). I used the following: for(i in 1:5) {temp-sample(A3,3, replace=F) + Trans=t(temp) + Avg=mean(Trans) + show(temp) + show(Trans) + show(Avg)} the reason I used Transpose is to get the R to give me the mean for S's. but what I get is just one mean for each sample. not the mean for each S. any suggestion how to do it? thanks again for all your help Jorge Ivan Velez wrote: Dear Alex, Is this what you want? # Data set my=read.table(textConnection( X8 X9 X10 X102 X110 X177 X283 X284 X286 X292 X297 X306 X308 X314 0 1 000100000000 0 0 001000000010 0 1 000000000010 1 0 00100001000 0),header=TRUE) closeAllConnections() rownames(my)=paste('s',1:4,sep=) # Samples res=list() for(i in 1:5) res[[i]]- my[,sample(colnames(my),5)] res HTH, Jorge On Wed, Oct 15, 2008 at 10:07 AM, Alex99 [EMAIL PROTECTED] wrote: Hi everyone, I have a dataset(named Mydata) which includes 4 different variables named; s1,s2,s3,s4 .Each variable(symptom) has 14 patients. I need to use random sampling to make, 5 different samples from my data with 5 patients in each sample. i.e. using all 4 variables I need to make 5 different samples by changing patients(with 5 patients in each sample). X8 X9 X10 X102 X110 X177 X283 X284 X286 X292 X297 X306 X308 X314 s1 0 1 000100000000 s2 0 0 001000000010 s3 0 1 000000000010 s4 1 0 001000010000 I used this code: temp=list(NULL) for(i in 1:5) {temp[i]-sample(Mydata,5, replace=F)} show(temp) but I get the following error: number of items to replace is not a multiple of replacement length any idea why I get this eeror message and how can I fix it? Thanks a lot. -- View this message in context: http://www.nabble.com/Help-Help-with-sampling-tp19994275p19994275.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Help-Help-with-sampling-tp19994275p19995327.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
Re: [R] YALAQ - Yet Another LApply Question
For the first question, ?str tells us that the str function does not return anything, but has a side effect of printing information to the console. So in version 1 when you call the cat function it evaluates its arguments and as it evaluates str(...) the information is printed, then nothing is passed into cat which then prints the name of the object (after str has done its printing). In the second version cat finishes printing the name before str is called so that it is in the correct order. You might prefer to use lapply/sapply to create a list with all the pieces of interest, then just do str on the resulting list, or look at the TkListView function in the TeachingDemos package for another way to look at the structure of a list. For example: TkListView( sapply( ls(pattern='bn'), function(y) get(y), simplify=FALSE, USE.NAMES=TRUE ) ) For the 2nd question, the difference is that the summary function does not print anything, just returns the summary information. When you type summary at the command line the parser automatically calls the print method for the summary object returned, but within functions/loops/apply the summary object is returned (included in the output of lapply) but not printed. You need to explicitly print the results of summary. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] 801.408.8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of Thompson, David (MNR) Sent: Wednesday, October 15, 2008 10:19 AM To: r-help@r-project.org Subject: Re: [R] YALAQ - Yet Another LApply Question Please forgive this repost, it's been a week without a squeak. No comments? Original post: https://stat.ethz.ch/pipermail/r-help/2008-October/176340.html Hello, Two lapply questions (system info and sample data below): 1) Why does the first form of command1 add the name of y _after_ the str() output rather than before as does the second (preferred) form? # command1 version1 invisible(lapply(ls(pattern='bn'), function(y) cat(y, \n, str(get(y)), \n) )) # command1 version2 (preferred output) invisible(lapply(ls(pattern='bn'), function(y) { cat(y, \n) ; str(get(y)) ; cat(\n) } )) 2) Why does the same method as command1 version2 above not work with the summary() command as does the second (preferred) set of commands? # command2 version1 lapply(ls(pattern='bn'), function(y) { cat(y, \n) ; summary(get(y)) ; cat(\n) } ) # command2 version2 (preferred output) smry.list - lapply(ls(pattern='bn'), function(y) summary(get(y))) names(smry.list) - ls(pattern='bn') smry.list Thanx, DaveT. ### # system info: sessionInfo() ; cat(\n) ; Sys.info()[c(1:3,5)] R version 2.7.2 (2008-08-25) i386-pc-mingw32 locale: LC_COLLATE=English_Canada.1252;LC_CTYPE=English_Canada.1252;LC_MONETARY = English_Canada.1252;LC_NUMERIC=C;LC_TIME=English_Canada.1252 attached base packages: [1] stats graphics grDevices datasets tcltk utils methods base other attached packages: [1] debug_1.1.0 mvbutils_1.1.1 lattice_0.17-14 plotrix_2.4-7 svSocket_0.9-5 TinnR_1.0.2 R2HTML_1.59 Hmisc_3.4-3 loaded via a namespace (and not attached): [1] cluster_1.11.11 grid_2.7.2 svMisc_0.9-5tools_2.7.2 sysname release version machine Windows XP build 2600, Service Pack 2x86 # system info: ### ### # sample data # dput(ban.nat.1994[sample(row.names(ban.nat.1994), 20),]) bn94 - structure(list(oplt = c(18L, 50L, 11L, 16L, 54L, 35L, 45L, 40L, 15L, 50L, 38L, 45L, 53L, 15L, 1L, 54L, 33L, 13L, 30L, 21L), tree = c(144L, 824L, 47L, 525L, 291L, 702L, 717L, 615L, 821L, 551L, 750L, 639L, 664L, 813L, 31L, 346L, 689L, 59L, 200L, 658L), bd1 = c(NA, NA, 3.6, 3.1, 4.72, 2.03, 2.88, 1.65, 5.39, 3.04, 2.75, 3.06, 2.81, 2.78, NA, 6.5, 4.62, 4.76, NA, 2.69), bd2 = c(NA, NA, 3.41, 3.06, 4.86, 2.09, 2.78, 1.8, 5.08, 3.26, 2.71, 3.1, 2.87, 2.73, NA, 6.6, 4.53, 4.97, NA, 2.81), bd = c(NA, 4.25, 3.51, 3.08, 4.79, 2.06, 2.83, 1.72, 5.23, 3.15, 2.73, 3.08, 2.84, 2.76, NA, 6.55, 4.58, 4.87, NA, 2.75), ht = c(NA, 20.4, 18.1, 18, 25.8, 13.1, 15.7, 4, 16, 14, 12.7, 8.6, 8.1, 16.2, NA, 52.7, 31.7, 23.7, NA, 17.6), spr = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_), stat = c(1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 0L), dam = c(NA, 2L, NA, NA, NA, NA, NA, 2L, NA, NA, 2L, NA, 1L, NA, NA, NA, 1L, NA, NA, NA), com = c(, from partial data set, tend = 1, , , , , , , , , , , , , , , , , , )), .Names = c(oplt, tree, bd1, bd2, bd, ht,
[R] Standard deviation for rows
Hi everyone, I have just started using R, and I have a simple question. How can I get the Standard deviation for rows. basically I am looking for something like rowMeans() but for Standard deviation (I tried rowSds() didn't exist) Thanks, -- View this message in context: http://www.nabble.com/Standard-deviation-for-rows-tp19998106p19998106.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Doing a Task Without Using a For Loop
I want to thank everyone for the help. I ended up having to use a loop to assign values from the table to NinYear. However, as I have played with the full datasets I have noticed that R is MUCH faster if I use vectors in the loop rather than columns of a dataframe. In the specific case of 43,000 lines of data, assigning values from the table to the 43,000 elements of a vector took 6 seconds whereas assigning values from the table to 43,000 elements of a dataframe took 21 minutes. Why is there such a huge difference? Tom Tom La Bone wrote: Assume that I have the dataframe data1, which is listed at the end of this message. I want count the number of lines that each person has for each year. For example, the person with ID=213 has 15 entries (NinYear) for 1953. The following bit of code calculates NinYear: for (i in 1:length(data1$ID)) { data1$NinYear[i] - length(data1[data1$Year==data1$Year[i] data1$ID==data1$ID[i],1]) } This seems to work but is horribly slow (some files I am working with have over 500,000 lines). Can anyone suggest a faster way of doing this, perhaps a way that does not use a for loop? Thanks. Tom IDYearNinYear 209 19710 209 19710 213 19510 213 19510 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19530 213 19540 213 19540 213 19540 213 19540 213 19540 213 19540 213 19540 213 19540 213 19540 213 19540 213 19540 213 19550 213 19550 234 19530 234 19530 234 19530 234 19530 234 19530 234 19580 234 19580 234 19650 234 19650 234 19650 249 19520 249 19520 -- View this message in context: http://www.nabble.com/Doing-a-Task-Without-Using-a-For-Loop-tp19974078p19991682.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] help about how can R compute AIC?
Em Ter, 2008-10-14 às 17:13 +0200, Arnau Mir Torres escreveu: Hello. I need to know how can R compute AIC when I study a regression model? For example, if I use these data: growth tannin 1 12 0 2 10 1 3 8 2 4 11 3 5 6 4 6 7 5 7 2 6 8 3 7 9 3 8 and I do model - lm (growth ~ tannin) AIC(model) R responses: 38.75990 I know the following formula to compute AIC: AIC= -2*log-likelihood + 2*(p+1) In my example, it would be: AIC=-2*log-likelihood + 2*2 but I don't know how R computes log-likelihood: logLik(model) 'log Lik.' -16.37995 (df=3) Arnau, LogLik= -16.37995 AIC= -2*log-likelihood + 2*(p+1) AIC=-2*-16.37995 + 2*(p+1) AIC= 32.7599+2*(p+1) # # this is very important the model have two # parameter, because sigma is a parameter to. # so # AIC= 32.7599+2*(2+1) AIC= 32.7599+6 AIC= 38.7599 -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems with R memory usage on Linux
See ?Memory-size On Wed, 15 Oct 2008, B. Bogart wrote: Hello all, I'm working with a large data-set, and upgraded my RAM to 4GB to help with the mem use. I've got a 32bit kernel with 64GB memory support compiled in. gnome-system-monitor and free both show the full 4GB as being available. In R I was doing some processing and I got the following message (when collecting 100 307200*8 dataframes into a single data-frame (for plotting): Error: cannot allocate vector of size 2.3 Mb So I checked the R memory usage: $ ps -C R -o size SZ 3102548 I tried removing some objects and running gc() R then shows much less memory being used: $ ps -C R -o size SZ 2732124 Which should give me an extra 300MB in R. I still get the same error about R being unable to allocate another 2.3MB. I deleted well over 2.3MB of objects... Any suggestions as to get around this? Is the only way to use all 4GB in R to use a 64bit kernel? Thanks all, B. Bogart __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Standard deviation for rows
?apply e.g. apply(matrix,1,sd) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alex99 Sent: Wednesday, October 15, 2008 1:17 PM To: r-help@r-project.org Subject: [R] Standard deviation for rows Hi everyone, I have just started using R, and I have a simple question. How can I get the Standard deviation for rows. basically I am looking for something like rowMeans() but for Standard deviation (I tried rowSds() didn't exist) Thanks, -- View this message in context: http://www.nabble.com/Standard-deviation-for-rows-tp19998106p19998106.ht ml Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S. News World Report (2008). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ggplot2: sub/super-script axes labels
How are sub/superscripts designated for text in axis labels? As in this y-axis label: scale_y_continuous(Respiration, pmol O2 h-1) where the 2 needs to be subscripted and the -1 needs to be superscripted. Thanks. Adam --- Adam G. Marsh, Ph.D. Associate Professor Marine Biological Sciences University of Delaware Lewes, DE 19958 302.645.4367 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Lattice key title color
Deepayan Sarkar deepayan.sarkar at gmail.com writes: All of which doesn't really answer the original question. ... So, the title color cannot be currently specified, either directly or through the settings system. Krrr.. it was key$title Dieter __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] request: How can we ignore a component of list having no element
Dear Mahbub Thanks a lot for the help. - Original Message From: Mahbub Latif [EMAIL PROTECTED] To: Muhammad Azam [EMAIL PROTECTED] Sent: Wednesday, October 15, 2008 12:30:13 PM Subject: Re: [R] request: How can we ignore a component of list having no element try this, junk - sapply(x,function(i) !is.null(i)) y - x[junk] On Wed, Oct 15, 2008 at 11:23 AM, Muhammad Azam [EMAIL PROTECTED] wrote: Dear friends There is a list of arrays comprising different no of rows and columns even sometimes NULL, such as [[2]] given below. How can we ignore [[2]] or others like this in the complete list. Any help in this regard is needed. Thanks [[1]] [,1] [,2] [1,]31 [2,]31 [3,]31 [[2]] NULL [[3]] [,1] [,2] [,3] [,4] [,5] [,6] [,7] [1,]3100000 [2,]3100000 [3,]3100000 [4,]3131321 [5,]3131321 [6,]3131320 [[4]] [,1] [,2] [,3] [,4] [1,]3000 [2,]3133 [3,]3133 [4,]3130 OR x1=c(1,2,3); x2=c(1,2,3,4,6); x3=c(); x=list(x1,x2,x3) M.Azam [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- A.H.M. Mahbub Latif Assistant Professor Applied Statistics Institute of Statistical Research and Training University of Dhaka, Dhaka 1000, Bangladesh web : http://www.isrt.ac.bd/mlatif [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] legend
Em Ter, 2008-10-14 às 18:44 -0700, Lavan escreveu: Hi R users, I'm trying to have the symbols for sigma[1] in my legend. the code is given below, please have a look. Thanks, lavan try legend(bottom,legend=expression(sigma[1]),...) legend(bottom,legend=paste(Sigma1=, c(0.01,0.1,0.2,0.5,1,1.5,2,4,6,9.5), sep=), fill=c(red,green,blue,black,pink,brown,purple,yellow,lightblue,orange)) -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] investigating interaction term for a model of Gross Primary Productivity
Dear Stephen, I'm not sure exactly what you have in mind, but you might take a look at the effects package. I hope this helps, John On Wed, 15 Oct 2008 12:59:33 -0400 stephen sefick [EMAIL PROTECTED] wrote: I am trying to investigate the interaction term in the below. The paradigm in aquatic systems is that algal production is either nitrogen (TIN) or Phosphorus limited, and I am trying to investigate this- what is the best way to go about investigating the interaction term. I have some thoughts on the above, but I will withhold them to see what others think. Thanks for your help. d - (structure(list(d.GPP = c(1.213695235, 3.817313822, 1.267930498, 10.45692825, 3.268295623, 3.505286001, 4.468225245, 0.915653726, 1.635617261, 3.726133898, 1.363453706, 13.99650967, 0.417618143, 0.741080504, 0.412440872, 3.515743675, 8.248491445, 1.537773727, 1.537773727, 3.249103284, 3.249103284, 0.768531626, 2.633107621, 3.113199095, 0.773824094, 0.680150068, 0.680150068, 0.026385752, 0.369310858, 8.049276658, 7.487378383, 0.950072035, 0.950072035, 0.763580377, 0.333244629, 5.475999014, 9.235631398), d.TIN = c(0.53, 0.52, 0.446, 0.217, 0.39, 0.34, 0.45, 0.29, 0.23, 0.308, 0.3, 0.1, 0.52, 0.13, 0.38, 0.36, 0.226, 0.345, 0.345, 0.217, 0.217, 0.47, 0.34, 0.31, 0.35, 0.41, 0.41, 0.432, 0.333, 0.168, 0.22, 0.37, 0.37, 0.44, 0.38, 0.3, 0.105), d.Phosphorus = c(0.17, 0.19, 0.2, 0.012, 0.13, 0.14, 0.069, 0.13, 0.13, 0.13, 0.099, 0.013, 0.16, 0.076, 0.12, 0.037, 0.011, 0.1, 0.1, 0.021, 0.021, 0.12, 0.099, 0.088, 0.036, 0.15, 0.15, 0.12, 0.12, 0.0076, 0.014, 0.14, 0.14, 0.15, 0.15, 0.12, 0.011), d.TSS = c(2.8, 8.4, 11, 1.3, 4.2, 2, 3.4, 14, 8.2, 3.1, 1.4, 0.9, 6.1, 9.2, 11, 1.2, 1.3, 11, 11, 8.5, 8.5, 13, 4.4, 1.4, 2.1, 25, 25, 9.3, 6.1, 1.6, 1.5, 19, 19, 24, 9.6, 1.8, 1.4)), .Names = c(d.GPP, d.TIN, d.Phosphorus, d.TSS), row.names = c(NA, -37L), class = data.frame)) #here is the model model.GPP - lm(GPP~TIN*Phosphorus+I(1/TSS), data=d) #regression equation GPP - function(TIN, P, TSS){ 8.765-(22.896*TIN)-(63.513*P)+(5.836/TSS)+(170.735*(TIN*P)) } #predicting GPP from the data - This is kind of useless in this context because I am predicting GPP from the data that I modeled it with, but I think I am going to have to fool around #with these values to investigate the interaction term. This is kind of where I am stuck. a - GPP(d$TIN, d$Phosphorus, d$TSS) #plotting the data plot(a~d$TSS) -- Stephen Sefick Research Scientist Southeastern Natural Sciences Academy Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods. We are mammals, and have not exhausted the annoying little problems of being mammals. -K. Mullis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. John Fox, Professor Department of Sociology McMaster University Hamilton, Ontario, Canada http://socserv.mcmaster.ca/jfox/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fw: Logistic regresion - Interpreting (SENS) and (SPEC)
This approach leaves much to be desired. I hope that its practitioners start gauging it by the mean squared error of predicted probabilities. Is the logic here is that low MSE of predicted probabilities equals a better calibrated model? What about discrimination? Perfect calibration implies perfect discrimination, but I often find that you can have two competing models, the first with higher discrimination (AUC) and worse calibration, and the the second the other way round. Which one is the better model? -- Gad Abraham Dept. CSSE and NICTA The University of Melbourne Parkville 3010, Victoria, Australia email: [EMAIL PROTECTED] web: http://www.csse.unimelb.edu.au/~gabraham __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] YALAQ - Yet Another LApply Question
Thank you very much Greg. Apparently, my learning curve is still well below it's horizontal asymptote. And no wonder in the vast universe of the R-project, eh? Thanx, DaveT. -Original Message- From: Greg Snow [mailto:[EMAIL PROTECTED] Sent: October 15, 2008 01:52 PM To: Thompson, David (MNR); r-help@r-project.org Subject: RE: YALAQ - Yet Another LApply Question For the first question, ?str tells us that the str function does not return anything, but has a side effect of printing information to the console. So in version 1 when you call the cat function it evaluates its arguments and as it evaluates str(...) the information is printed, then nothing is passed into cat which then prints the name of the object (after str has done its printing). In the second version cat finishes printing the name before str is called so that it is in the correct order. You might prefer to use lapply/sapply to create a list with all the pieces of interest, then just do str on the resulting list, or look at the TkListView function in the TeachingDemos package for another way to look at the structure of a list. For example: TkListView( sapply( ls(pattern='bn'), function(y) get(y), simplify=FALSE, USE.NAMES=TRUE ) ) For the 2nd question, the difference is that the summary function does not print anything, just returns the summary information. When you type summary at the command line the parser automatically calls the print method for the summary object returned, but within functions/loops/apply the summary object is returned (included in the output of lapply) but not printed. You need to explicitly print the results of summary. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] 801.408.8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of Thompson, David (MNR) Sent: Wednesday, October 15, 2008 10:19 AM To: r-help@r-project.org Subject: Re: [R] YALAQ - Yet Another LApply Question Please forgive this repost, it's been a week without a squeak. No comments? Original post: https://stat.ethz.ch/pipermail/r-help/2008-October/176340.html Hello, Two lapply questions (system info and sample data below): 1) Why does the first form of command1 add the name of y _after_ the str() output rather than before as does the second (preferred) form? # command1 version1 invisible(lapply(ls(pattern='bn'), function(y) cat(y, \n, str(get(y)), \n) )) # command1 version2 (preferred output) invisible(lapply(ls(pattern='bn'), function(y) { cat(y, \n) ; str(get(y)) ; cat(\n) } )) 2) Why does the same method as command1 version2 above not work with the summary() command as does the second (preferred) set of commands? # command2 version1 lapply(ls(pattern='bn'), function(y) { cat(y, \n) ; summary(get(y)) ; cat(\n) } ) # command2 version2 (preferred output) smry.list - lapply(ls(pattern='bn'), function(y) summary(get(y))) names(smry.list) - ls(pattern='bn') smry.list Thanx, DaveT. ### # system info: sessionInfo() ; cat(\n) ; Sys.info()[c(1:3,5)] R version 2.7.2 (2008-08-25) i386-pc-mingw32 locale: LC_COLLATE=English_Canada.1252;LC_CTYPE=English_Canada.1252;LC_MONETARY = English_Canada.1252;LC_NUMERIC=C;LC_TIME=English_Canada.1252 attached base packages: [1] stats graphics grDevices datasets tcltk utils methods base other attached packages: [1] debug_1.1.0 mvbutils_1.1.1 lattice_0.17-14 plotrix_2.4-7 svSocket_0.9-5 TinnR_1.0.2 R2HTML_1.59 Hmisc_3.4-3 loaded via a namespace (and not attached): [1] cluster_1.11.11 grid_2.7.2 svMisc_0.9-5tools_2.7.2 sysname release version machine Windows XP build 2600, Service Pack 2x86 # system info: ### ### # sample data # dput(ban.nat.1994[sample(row.names(ban.nat.1994), 20),]) bn94 - structure(list(oplt = c(18L, 50L, 11L, 16L, 54L, 35L, 45L, 40L, 15L, 50L, 38L, 45L, 53L, 15L, 1L, 54L, 33L, 13L, 30L, 21L), tree = c(144L, 824L, 47L, 525L, 291L, 702L, 717L, 615L, 821L, 551L, 750L, 639L, 664L, 813L, 31L, 346L, 689L, 59L, 200L, 658L), bd1 = c(NA, NA, 3.6, 3.1, 4.72, 2.03, 2.88, 1.65, 5.39, 3.04, 2.75, 3.06, 2.81, 2.78, NA, 6.5, 4.62, 4.76, NA, 2.69), bd2 = c(NA, NA, 3.41, 3.06, 4.86, 2.09, 2.78, 1.8, 5.08, 3.26, 2.71, 3.1, 2.87, 2.73, NA, 6.6, 4.53, 4.97, NA, 2.81), bd = c(NA, 4.25, 3.51, 3.08, 4.79, 2.06, 2.83, 1.72, 5.23, 3.15, 2.73, 3.08, 2.84, 2.76, NA, 6.55, 4.58, 4.87, NA, 2.75), ht = c(NA, 20.4, 18.1, 18, 25.8, 13.1, 15.7, 4, 16, 14, 12.7, 8.6, 8.1, 16.2, NA, 52.7, 31.7, 23.7, NA, 17.6), spr = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_,
[R] Iterative estimation of linear regression model
Dear all I am intrested in making iterative estimation (thro' loop statements) of, say, linear regression model. For this purpose, I have written the following programme and that I have made use of a sample data (viz., exp.txt): Programme: # Linear regression modelling with sample data (try5.txt) # Repeated estimation through loop statement x=read.table(try5.txt,header=T,sep=\t) nm=names(x) out=slrout.txt sink(out) nvr=10;nv1=4;nv2=nvr-nv1 for(i in 1:nv1){ dep=nm[i+1] # The 1st dependent variable is C1 (at column no. 2) for(j in 1:nv2){ ind=nm[j+nv1+1] # The first independent variable is C5 (at column no. 6) # Estimation of simple linear regression equation slr=lm(dep ~ ind,data=x) smr=summary(slr) print(slr) tsp=ts.plot(dep) tsp } } Data file (try.txt): Year C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 Y1 3.5 13.8 9.5 6.8 0.4 24.0 3 7.1 24.6 0.03 Y2 3.8 13.9 9.9 7.6 0.7 12.8 5 9.0 32.5 0.01 Y3 4.5 14.5 14.2 9.2 0.6 14.5 1 7.5 22.9 0.05 Y4 5.9 16.2 24.6 12.7 0.2 24.3 3 8.0 26.8 0.04 Y5 7.2 20.4 40.6 18.2 0.8 28.2 5 9.2 44.3 0.02 Y6 5.9 18.6 37.4 14.5 0.3 36.9 8 9.5 32.9 0.10 Y7 8.0 16.1 88.6 24.1 0.1 34.6 2 8.7 11.1 0.02 Y8 13.6 21.1 56.3 19.0 0.7 33.3 6 9..5 10.8 0.06 Y9 11.2 20.4 40.7 12.9 1.1 40.3 3 12.2 6.5 0.04 Y10 7.6 18.3 27.5 8.1 2.3 41.9 2 5.9 2.9 1.00 Y11 8.8 22.2 33.3 8.8 0.6 44.4 4 6.6 55.5 0.09 Y12 9.4 16.5 35.6 16.2 0.7 50.2 5 8..8 31.4 0.07 (These files have been given as attachments as well) The data file has 10 variables in all, divided into two groups: G1 consisting of the first 4 variables and G2 of the remaining 6 variables. We wish to iteratively pick one variable from the group G1 and regress it upon (an iteratively picked) variable from G2 and save the summary results. We also wish to save the repeatedly generated diagrams in an output file. But the programme has failed to work. Kindly help; I'm a raw hand in R-language. Regards ajss Be the first one to try the new Messenger 9 Beta! Go to http://in.messenger.yahoo.com/win/# Linear regression modelling with sample data (try5.txt) # Repeated estimation through loop statement x=read.table(try5.txt,header=T,sep=\t) nm=names(x) out=slrout.txt sink(out) nvr=10;nv1=4;nv2=nvr-nv1 for(i in 1:nv1){ dep=nm[i+1] # The 1st dependent variable is C1 (at column no. 2) for(j in 1:nv2){ ind=nm[j+nv1+1] # The first independent variable is C5 (at column no. 6) # Estimation of simple linear regression equation slr=lm(dep ~ ind,data=x) smr=summary(slr) print(slr) tsp=ts.plot(dep) tsp } } YearC1 C2 C3 C4 C5 C6 C7 C8 C9 C10 Y1 3.5 13.89.5 6.8 0.4 24.03 7.1 24.6 0.03 Y2 3.8 13.99.9 7.6 0.7 12.85 9.0 32.5 0.01 Y3 4.5 14.514.29.2 0.6 14.51 7.5 22.9 0.05 Y4 5.9 16.224.612.70.2 24.33 8.0 26.8 0.04 Y5 7.2 20.440.618.20.8 28.25 9.2 44.3 0.02 Y6 5.9 18.637.414.50.3 36.98 9.5 32.9 0.10 Y7 8.0 16.188.624.10.1 34.62 8.7 11.1 0.02 Y8 13.621.156.319.00.7 33.36 9.5 10.8 0.06 Y9 11.220.440.712.91.1 40.33 12.26.5 0.04 Y10 7.6 18.327.58.1 2.3 41.92 5.9 2.9 1.00 Y11 8.8 22.233.38.8 0.6 44.44 6.6 55.5 0.09 Y12 9.4 16.535.616.20.7 50.25 8.8 31.4 0.07__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rgl-snapshot failed (err-msg: failed)
On 10/10/2008 8:55 AM, Oliver Bandel wrote: Zitat von Duncan Murdoch [EMAIL PROTECTED]: On 10/10/2008 8:13 AM, Oliver Bandel wrote: Hello, I tried to use rgl.snapshot and it failed. The error message was not very verbose: == plot3d( motion[[idx+2]], motion[[idx+1]], motion[[idx]] ) rgl.snapshot(filename=/tmp/shot_01.png, fmt=png) [1] failed == There was a graphic created by rgl, but the snapshot was not created. The same problem occurs, when I use example(rgl.snapshot) The graphic/animation will be created, but there is no possibility to create the output-files. I use R version 2.4.0 Patched (2006-11-25 r39997). Is this a problem of this old version? or is there maybe a general problem, independent of the version? This is a problem with the build. rgl is not easy to build, because it links to a lot of external libraries. In this case it looks as though your build (the one from R-forge? You should say...) did not link to libpng. [...] I installed a binary (Debian etch). I may try the same things at my laptop. There I have Ubuntu, and it uses much newer R versions. maybe they also have fixed this problem. If the error messages would be a littlebid more verbose, this could help. Is it possible to configure the verboseness of the error messages? Just a small followup: the R-forge folks are in the process of making changes so their binary builds should include PNG support, and I've made changes to rgl so that its error messages in case PNG support is lacking are more informative. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems with R memory usage on Linux
Doen't work. \misiek Prof Brian Ripley wrote: See ?Memory-size On Wed, 15 Oct 2008, B. Bogart wrote: [...] [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Standard deviation for rows
apply( mymatrix, 1, sd ) Will give the row standard deviations of a matrix or matrix like data frame. There are also some functions in add-on packages that do row sd's or var's. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] 801.408.8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of Alex99 Sent: Wednesday, October 15, 2008 11:17 AM To: r-help@r-project.org Subject: [R] Standard deviation for rows Hi everyone, I have just started using R, and I have a simple question. How can I get the Standard deviation for rows. basically I am looking for something like rowMeans() but for Standard deviation (I tried rowSds() didn't exist) Thanks, -- View this message in context: http://www.nabble.com/Standard-deviation- for-rows-tp19998106p19998106.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Standard deviation for rows
Thanks for the reply, that's exactly what I wanted. I also figured that the function rowSds() does exist. I just needed to load the generfilter package. Thanks again. Nutter, Benjamin wrote: ?apply e.g. apply(matrix,1,sd) -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alex99 Sent: Wednesday, October 15, 2008 1:17 PM To: r-help@r-project.org Subject: [R] Standard deviation for rows Hi everyone, I have just started using R, and I have a simple question. How can I get the Standard deviation for rows. basically I am looking for something like rowMeans() but for Standard deviation (I tried rowSds() didn't exist) Thanks, -- View this message in context: http://www.nabble.com/Standard-deviation-for-rows-tp19998106p19998106.ht ml Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. P Please consider the environment before printing this e-mail Cleveland Clinic is ranked one of the top hospitals in America by U.S. News World Report (2008). Visit us online at http://www.clevelandclinic.org for a complete listing of our services, staff and locations. Confidentiality Note: This message is intended for use\...{{dropped:13}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Standard-deviation-for-rows-tp19998106p1312.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R/Parallel
Hi there, I am looking for R/parallel package or some other package that would speed up the analysis.I am working on computatioanly intensive data so any suggestions would be really helpful. Kindly let me know if any -- View this message in context: http://www.nabble.com/R-Parallel-tp1425p1425.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] integrate problem
i'm in the process of switch from Maple to R and am trying to code the following function: w(d)=\int -Inf_Inf A(x) |int 0_(t(d)-x) Fy dydx can some one point me in the right direction? i don't seem to be able to figure it out on my own. Dr. Wade Winterhalter University of Central Florida Department of Biology ph: 407-260-0134 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] LME gives estimates with a random factor that has one 1 datapoint per group
Dear all, I have been fitting models to do meta-analysis in R. This includes a mixed effect model with a weighting function. This weighting function is the sample size and the random factor is study or species for instance (Nakagawa 2007). I have tried this method and I find some strange things. I can fit a model that has a random factor that has only one data point per group and this influences the effect the weight function has on the intercept. I do not get how this works because with one datapoint there is no within group variance. Please see the code below. I really appriciate if someone could explain me what happens, because I want to use this method to do some meta-analyses. Mirre Univ. of Groningen The Netherlands library(nlme) rm(list=ls()) Status-read.csv(Status2.csv,header=T) attach(Status) str(Status) model1-glm(ES~1) model2-lme(ES~1,random=~1|ID) model3-gls(ES~1,weights=varFixed(~(Size))) model4-lme(ES~1,random=~1|ID, weights=varFixed(~(Size))) summary(model1) summary(model2) summary(model3) summary(model4) The data file is: ID,ES,Size 1,0.53,25 2,0.57,13 3,0.52,10 4,0.89,14 5,0.32,9 6,0.04,6 7,0.33,11 8,0.54,9 9,0.33,9 10,0.88,10 11,0.483,19 12,0.488,41 13,0.147,20 14,0.37,22 15,0.1,28 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R/Parallel
Hi, have a look to Dirks tutorial at the UseR2008. This should be a good starting point: http://www.statistik.uni-dortmund.de/useR-2008/tutorials/eddelbuettel.html Markus Rajasekaramya wrote: Hi there, I am looking for R/parallel package or some other package that would speed up the analysis.I am working on computatioanly intensive data so any suggestions would be really helpful. Kindly let me know if any -- Dipl.-Tech. Math. Markus Schmidberger Ludwig-Maximilians-Universität München IBE - Institut für medizinische Informationsverarbeitung, Biometrie und Epidemiologie Marchioninistr. 15, D-81377 Muenchen URL: http://www.ibe.med.uni-muenchen.de Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de Tel: +49 (089) 7095 - 4599 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problems with R memory usage on Linux
Or ?Memory-limits (and the posting guide of course). On Wed, 15 Oct 2008, Prof Brian Ripley wrote: See ?Memory-size On Wed, 15 Oct 2008, B. Bogart wrote: Hello all, I'm working with a large data-set, and upgraded my RAM to 4GB to help with the mem use. I've got a 32bit kernel with 64GB memory support compiled in. gnome-system-monitor and free both show the full 4GB as being available. In R I was doing some processing and I got the following message (when collecting 100 307200*8 dataframes into a single data-frame (for plotting): Error: cannot allocate vector of size 2.3 Mb So I checked the R memory usage: $ ps -C R -o size SZ 3102548 I tried removing some objects and running gc() R then shows much less memory being used: $ ps -C R -o size SZ 2732124 Which should give me an extra 300MB in R. I still get the same error about R being unable to allocate another 2.3MB. I deleted well over 2.3MB of objects... Any suggestions as to get around this? Is the only way to use all 4GB in R to use a 64bit kernel? Thanks all, B. Bogart __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] dbAppy questions/clarifications
In the example in the documentation, I see: rs - dbSendQuery(con, select Agent, ip_addr, DATA from pseudo_data order by Agent) out - dbApply(rs, INDEX = Agent, FUN = function(x, grp) quantile(x$DATA, names=FALSE)) Maybe I am a bit thick, but it took me a while, and a kind hint from Phil, to figure much of this out. It is clear that the SQL orders the data by Agent, and the INDEX parameter tells dbApply that FUN is to be applied to each group of values defined by Agent (like applying SUM(DATA) in SQL using a GROUP BY clause). If my understanding is correct, out will be an array holding ordered pairs, with the value of Agent and the corresponding values returned by FUN. I take it FUN = function(x, grp) quantile(x$DATA, names=FALSE) is the function definition for a function called FUN. I would guess, then, that the opening and closing braces are optional. Is that correct? Or is this something else? I did not see a definition of 'grp'. What is it? Suppose the function I want to apply is fitdistr(x,exponential). Would I just replace quantile(x$DATA, names=FALSE) by fitdistr(x,exponential)? Finally, suppose the query I need to run is more complex, such as: SELECT group_id,YEAR(my_date),WEEK(my_date),ndays FROM myTable ORDER BY group_id,YEAR(my_date),WEEK(my_date); Can dbApply handle applying fitdistr(x,exponential) to each group of values defined by group_id,YEAR(my_date),WEEK(my_date)? If so, how would I change the call to dbsendQuery, and how would I insert the resulting estimates using something like INSERT INTO myResults (group_id,year,week,rate,sd) VALUES (?,?,?,?);? Once I get this, I can do everything else within a stored procedure in MySQL. I get the idea of using,e.g., sprintf to interpolate values I need to insert into a query string, but it is a question of how to get the values I need from 'out' (to use the above example), and how to iterate over them to do the SQL INSERT. Actually, would 'dbWriteTable' handle inserting these values efficiently? If so, how do I ensure it maps the group_id,year, week, c. from 'out' to the right columns in my results table (what I have in mind involves a table with a couple extra columns that would take appropriate default values)? Thanks Ted -- View this message in context: http://www.nabble.com/dbAppy-questions-clarifications-tp2632p2632.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Can R scripts executed in batch mode take a commandline argument?
I have examined the documentation for batch mode use of R: R CMD BATCH [options] infile [outfile] The documentation for this seems rather spartan. Running R CMD BATCH --help gives me info on only two options: one for getting help and the other to get the version. I see, further on, that there are options for retoring and saving sessions (which I do not need to do in this case), but are there other options defined? If so, what are they and how are they to be used? However, it goes on to say: Further arguments starting with a '-' are considered as options as long as '--' was not encountered, and are passed on to the R process, which by default is started with '--restore --save'. I see here it says further arguments starting with a '-' are passed to the R process, but usage is not clear. For example, if I write a script that should take as a commandline argument the name of a file that contains a series of numbers that I want to place into a vector which, in turn I want to pass to fitdistr(x,exponential), what do I do to get that file name from the commandline and pass it to, say, read.csv? BTW: How would I tell it that there is no need to restore and save? If I can't pass a commandline argument, do I have to write the arguments in afile, and have that file read each time I need to run the script? Thanks Ted -- View this message in context: http://www.nabble.com/Can-R-scripts-executed-in-batch-mode-take-a-commandline-argument--tp2914p2914.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plot - central limit theorem
Hi, Is there a way to simulate a population with R and pull out m samples, each with n values for calculating m means? I need that kind of data to plot a graphic, demonstrating the central limit theorem and I don't know how to begin. So, perhaps someone can give me some tips and hints how to start and which functions to use. thanks for any help, joerg __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot - central limit theorem
Look at the clt.examp function in the TeachingDemos package, it generates samples from normal, uniform, gamma (default exponential), and beta (default U-shaped) distributions and plots histograms of the means along with a reference line of a normal distribution with the same mean and sd. The default sample size is 1 which will show the shape of the population, then you can run it again with larger values of n to show how all of them become more and more normal (the exponential is the slowest). Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] 801.408.8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of Jörg Groß Sent: Wednesday, October 15, 2008 1:49 PM To: r-help@r-project.org Subject: [R] plot - central limit theorem Hi, Is there a way to simulate a population with R and pull out m samples, each with n values for calculating m means? I need that kind of data to plot a graphic, demonstrating the central limit theorem and I don't know how to begin. So, perhaps someone can give me some tips and hints how to start and which functions to use. thanks for any help, joerg __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Removing characters and periods from character strings
Hello R-users, I have code that gives me the important variables from an analysis. I need to input these variables into a different analysis. To do this, I need to modify them slightly... 1) remove all numbers at the end of the variables, 2) remove all periods. I tried to do it with the awkward code below. It works to remove all the numbers, but when I try to remove the period everything is lost. Does anyone have a solution? And perhaps a more elegant way? pick=c((Intercept), Clear, factor(Hab)3, factor(Hab)4, factor(Hab)5,factor(Hab)7, factor(Log)1, Hunt, Pop, s(PrimRd).1, s(PrimRd).2, Unlog, Xcoord, Ycoord) vars=as.character(as.vector(strsplit(pick,1))) vars=unique(as.character(as.vector(strsplit(vars,2 vars=unique(as.character(as.vector(strsplit(vars,3 vars=unique(as.character(as.vector(strsplit(vars,4 vars=unique(as.character(as.vector(strsplit(vars,5 vars=unique(as.character(as.vector(strsplit(vars,7 vars=unique(as.character(as.vector(strsplit(vars,. Thanks for your help! John __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot - central limit theorem
On 10/15/2008 3:48 PM, Jörg Groß wrote: Hi, Is there a way to simulate a population with R and pull out m samples, each with n values for calculating m means? I need that kind of data to plot a graphic, demonstrating the central limit theorem and I don't know how to begin. The easiest way to do this is to put the simulations into a matrix, and calculate row means (or column means). For example: par(mfrow=c(2,2)) m - 1 for (n in c(1,2,4,8)) { sims - matrix(runif(n*m), ncol=n) means - rowMeans(sims) hist(means, main=paste(n = , n), xlim=c(0,1)) } Duncan Murdoch So, perhaps someone can give me some tips and hints how to start and which functions to use. thanks for any help, joerg __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] help with a for loop
Hi all, I am running OLS models with a power parameter: y=a+bx^p, and I want to iterate through p with a set increment. The function I constructed is something like below: test-function(data, model,...){ f-formula(model) #model=y~I(x^p) p-0.1 n-10 result-list() while (p=1) { for (i in 1:n) { result[[i]]-lm(data, formula=f...) p-p+0.1 } } return(result) } The problem I am facing is that the loop does not assign or update p values. I also tried a nested for loop starting with for (p in seq(0.1,1,0.1)){for (i in 1:n) {... but it did not work either. After the initial value of p is assigned, the inner loop is executed but p is not updated. Any help is greatly appreciated. best, quan __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot - central limit theorem
Hi Joerg, Is there a way to simulate a population with R and pull out m samples, each with n values for calculating m means? I need that kind of data to plot a graphic, demonstrating the central limit theorem and I don't know how to begin. So, perhaps someone can give me some tips and hints how to start and which functions to use. Have a look at library(TeachingDemos) ?clt.examp HTH, Tobias __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Removing characters and periods from character strings
A case for [g]sub: gsub(., , sub([0-9]*$, , pick), fixed = TRUE) On Wed, 15 Oct 2008, John Poulsen wrote: Hello R-users, I have code that gives me the important variables from an analysis. I need to input these variables into a different analysis. To do this, I need to modify them slightly... 1) remove all numbers at the end of the variables, 2) remove all periods. I tried to do it with the awkward code below. It works to remove all the numbers, but when I try to remove the period everything is lost. Does anyone have a solution? And perhaps a more elegant way? pick=c((Intercept), Clear, factor(Hab)3, factor(Hab)4, factor(Hab)5,factor(Hab)7, factor(Log)1, Hunt, Pop, s(PrimRd).1, s(PrimRd).2, Unlog, Xcoord, Ycoord) vars=as.character(as.vector(strsplit(pick,1))) vars=unique(as.character(as.vector(strsplit(vars,2 vars=unique(as.character(as.vector(strsplit(vars,3 vars=unique(as.character(as.vector(strsplit(vars,4 vars=unique(as.character(as.vector(strsplit(vars,5 vars=unique(as.character(as.vector(strsplit(vars,7 vars=unique(as.character(as.vector(strsplit(vars,. Thanks for your help! John __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Removing characters and periods from character strings
Try this; gsub(\\.|[0-9], , pick) On Wed, Oct 15, 2008 at 4:42 PM, John Poulsen [EMAIL PROTECTED] wrote: Hello R-users, I have code that gives me the important variables from an analysis. I need to input these variables into a different analysis. To do this, I need to modify them slightly... 1) remove all numbers at the end of the variables, 2) remove all periods. I tried to do it with the awkward code below. It works to remove all the numbers, but when I try to remove the period everything is lost. Does anyone have a solution? And perhaps a more elegant way? pick=c((Intercept), Clear, factor(Hab)3, factor(Hab)4, factor(Hab)5,factor(Hab)7, factor(Log)1, Hunt, Pop, s(PrimRd).1, s(PrimRd).2, Unlog, Xcoord, Ycoord) vars=as.character(as.vector(strsplit(pick,1))) vars=unique(as.character(as.vector(strsplit(vars,2 vars=unique(as.character(as.vector(strsplit(vars,3 vars=unique(as.character(as.vector(strsplit(vars,4 vars=unique(as.character(as.vector(strsplit(vars,5 vars=unique(as.character(as.vector(strsplit(vars,7 vars=unique(as.character(as.vector(strsplit(vars,. Thanks for your help! John __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Removing characters and periods from character strings
To remove all numbers and periods at the end do: vars - sub([0-9.]+$, , pick ) If there are other periods that this does not remove (not at the end) then you can do a second pass: vars - sub(\\., , vars) Part of the reason that your code for the period does not work is that an unescaped period matches any single character (hence the \\. in the above). You may also want to look at the unique function if you want to remove the duplicates that result. Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] 801.408.8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of John Poulsen Sent: Wednesday, October 15, 2008 1:43 PM To: r-help@r-project.org Subject: [R] Removing characters and periods from character strings Hello R-users, I have code that gives me the important variables from an analysis. I need to input these variables into a different analysis. To do this, I need to modify them slightly... 1) remove all numbers at the end of the variables, 2) remove all periods. I tried to do it with the awkward code below. It works to remove all the numbers, but when I try to remove the period everything is lost. Does anyone have a solution? And perhaps a more elegant way? pick=c((Intercept), Clear, factor(Hab)3, factor(Hab)4, factor(Hab)5,factor(Hab)7, factor(Log)1, Hunt, Pop, s(PrimRd).1, s(PrimRd).2, Unlog, Xcoord, Ycoord) vars=as.character(as.vector(strsplit(pick,1))) vars=unique(as.character(as.vector(strsplit(vars,2 vars=unique(as.character(as.vector(strsplit(vars,3 vars=unique(as.character(as.vector(strsplit(vars,4 vars=unique(as.character(as.vector(strsplit(vars,5 vars=unique(as.character(as.vector(strsplit(vars,7 vars=unique(as.character(as.vector(strsplit(vars,. Thanks for your help! John __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] dbAppy questions/clarifications
I can't help with your inquiry generally, but I can address the issue of ``optional braces''. On 16/10/2008, at 8:34 AM, Ted Byers wrote: snip I take it FUN = function(x, grp) quantile(x$DATA, names=FALSE) is the function definition for a function called FUN. I would guess, then, that the opening and closing braces are optional. Is that correct? Yes. This is correct. If the body of the function doesn't need braces, you don't *have* to put them in. (They never hurt, but.) E.g. foo - function(x) x^2 works ``just as well as'' foo - function(x){x^2} cheers, Rolf ## Attention:\ This e-mail message is privileged and confid...{{dropped:9}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot - central limit theorem
Jörg Groß wrote: Hi, Is there a way to simulate a population with R and pull out m samples, each with n values for calculating m means? I need that kind of data to plot a graphic, demonstrating the central limit theorem and I don't know how to begin. So, perhaps someone can give me some tips and hints how to start and which functions to use. thanks for any help, joerg x - runif(1,0,1) y - runif(1,0,1) z - runif(1,0,1) u - runif(1,0,1) v - runif(1,0,1) w - runif(1,0,1) t - x+y+z+u+v+w hist(t,breaks=100,xlab=sum of i.i.d r.v with finite mean and variance,ylab=Frequency,main=) Change runif for the corresponding function of the distribution of your choice. Not exactly what you wanted but it does verify the C.L.T. Rubén __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plot - central limit theorem
Galton's 19th century mechanical version of this is the quincunx. I have a (very primitive) version of this for R at: http://www.econ.uiuc.edu/~roger/courses/476/routines/quincunx.R url:www.econ.uiuc.edu/~rogerRoger Koenker email[EMAIL PROTECTED]Department of Economics vox: 217-333-4558University of Illinois fax: 217-244-6678Champaign, IL 61820 Jörg Groß wrote: Hi, Is there a way to simulate a population with R and pull out m samples, each with n values for calculating m means? I need that kind of data to plot a graphic, demonstrating the central limit theorem and I don't know how to begin. So, perhaps someone can give me some tips and hints how to start and which functions to use. thanks for any help, joerg __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Condition warning: has length 1 and only the first element will be used
Hello, I've been learning R functions recently and I've come across a problem that perhaps someone could help me with. # I have a a chron() object of times hours=chron(time=c(01:00:00,18:00:00,13:00:00,10:00:00)) # I would like to subtract 12 hours from each time element, so I created a vector of 12 hours (actually, more like a vector of noons) less.hours=chron(time=rep(12:00:00,4)) # If I just subtract the two vectors, I get some negative values, as fractions of a day hours-less.hours [1] -0.4583 0.2500 0.0417 -0.0833 # I would like those negative values to cycle around and subtract the amount 0 from midnight # That is to say, 01:00:00 - 12:00:00 should be 13:00:00 # because 01:00:00-12:00:00 = -11:00:00, and 24:00:00-11:00:00 = 13:00:00 # It's sort of like going back to the previous day, but without actually including information about which day it is # This is what I tried test.function=function(x,y) { + sub = x-y + if(sub0) x+y + } test.function(hours,less.hours) Time in days: [1] 0.5416667 1.250 1.0416667 0.917 Warning message: In if (sub 0) x + y : the condition has length 1 and only the first element will be used # My questions are, why does it only use the first element?? Why does it not apply to all? Also, what happened to the elements where sub= 0, it looks like they followed the rules # of if(sub0). I feel like I must not be understanding something rather basic about how logical expressions operate within R # Help would be appreciated... [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Condition warning: has length 1 and only the first element will be used
The if statement is for program flow control (do this bunch of code if this condition is true, else do this), the vectorized version is the ifelse function, that is the one that you want to use. Also note (this is for times in general, not sure about chron specifically) subtracting noon from the times gives you time differences, subtracting a time difference from a time will give you another time, so it may be simpler to subtract just 12, rather than noon from your times. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] 801.408.8111 -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] project.org] On Behalf Of Joe Kaser Sent: Wednesday, October 15, 2008 2:19 PM To: r-help@r-project.org Subject: [R] Condition warning: has length 1 and only the first element will be used Hello, I've been learning R functions recently and I've come across a problem that perhaps someone could help me with. # I have a a chron() object of times hours=chron(time=c(01:00:00,18:00:00,13:00:00,10:00:00)) # I would like to subtract 12 hours from each time element, so I created a vector of 12 hours (actually, more like a vector of noons) less.hours=chron(time=rep(12:00:00,4)) # If I just subtract the two vectors, I get some negative values, as fractions of a day hours-less.hours [1] -0.4583 0.2500 0.0417 -0.0833 # I would like those negative values to cycle around and subtract the amount 0 from midnight # That is to say, 01:00:00 - 12:00:00 should be 13:00:00 # because 01:00:00-12:00:00 = -11:00:00, and 24:00:00-11:00:00 = 13:00:00 # It's sort of like going back to the previous day, but without actually including information about which day it is # This is what I tried test.function=function(x,y) { + sub = x-y + if(sub0) x+y + } test.function(hours,less.hours) Time in days: [1] 0.5416667 1.250 1.0416667 0.917 Warning message: In if (sub 0) x + y : the condition has length 1 and only the first element will be used # My questions are, why does it only use the first element?? Why does it not apply to all? Also, what happened to the elements where sub= 0, it looks like they followed the rules # of if(sub0). I feel like I must not be understanding something rather basic about how logical expressions operate within R # Help would be appreciated... [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Condition warning: has length 1 and only the first element will be used
Joe Kaser wrote: Hello, I've been learning R functions recently and I've come across a problem that perhaps someone could help me with. # I have a a chron() object of times hours=chron(time=c(01:00:00,18:00:00,13:00:00,10:00:00)) # I would like to subtract 12 hours from each time element, so I created a vector of 12 hours (actually, more like a vector of noons) less.hours=chron(time=rep(12:00:00,4)) # If I just subtract the two vectors, I get some negative values, as fractions of a day hours-less.hours [1] -0.4583 0.2500 0.0417 -0.0833 # I would like those negative values to cycle around and subtract the amount 0 from midnight # That is to say, 01:00:00 - 12:00:00 should be 13:00:00 # because 01:00:00-12:00:00 = -11:00:00, and 24:00:00-11:00:00 = 13:00:00 # It's sort of like going back to the previous day, but without actually including information about which day it is # This is what I tried test.function=function(x,y) { + sub = x-y + if(sub0) x+y + } test.function(hours,less.hours) Time in days: [1] 0.5416667 1.250 1.0416667 0.917 Warning message: In if (sub 0) x + y : the condition has length 1 and only the first element will be used # My questions are, why does it only use the first element?? Why does it not apply to all? Also, what happened to the elements where sub= 0, it looks like they followed the rules # of if(sub0). I feel like I must not be understanding something rather basic about how logical expressions operate within R # Help would be appreciated... OK: help(ifelse), ordinary if doesn't vectorize. Actually, ifelse does awkward things with classed objects. You might want something like sub - x - y sub[sub0] - x+y instead. And don't forget to return sub in all cases. -- O__ Peter Dalgaard Øster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Forecasting using ARIMAX
Hi Siang Li. It would help if you explained what packages you are using. auto.arima() is in the forecast package and arimax appears to be from the TSA package. Using auto.arima() to select the orders is inappropriate because you are ignoring the regressors. auto.arima() does not currently handle regressors, but you can get something that is at least consistent by using lm() to fit a linear model with the regressors and then apply auto.arima() to the residuals to select the order. Sometime, I'll add regressors into auto.arima(). On your specific questions: 1. Fit the model using all the data. 2. Ask the author of the TSA package. Rob siang.li.chua at acceval-intl.com writes: Dear R-helpers, I would appreicate if someone can help me on the transfer parameter in ARIMAX and also see what I am doing is correct. I am using ARIMAX with 2 Exogeneous Variables and 10 years data are as follows: DepVar Period, depVar, IndepVar1 Period, indepVar1, IndepVar2 Period, indepVar2 Jan 1998,708,Jan 1998,495,Jan 1998,245.490 Feb 1998,670,Feb 1998,421.25,Feb 1998,288.170 Mar 1998,642.5,Mar 1998,395,Mar 1998,254.950 Apr 1998,610,Apr 1998,377.5,Apr 1998,230.640 : (nrowDepVar - nrow(depVar)) [1] 545 (nTest - nInstance + nHorizon - 1) #number of latest points reserved for testing [1] 13 (nTrain - nrowDepVar - nTest) [1] 532 First I use auot.arima to find the best (p,d,q). modArima - auto.arima(depVar[1:nTrain,], trace=TRUE) ARIMA(2,1,2) with drift : 4402.637 ARIMA(0,1,0) with drift : 4523.553 ARIMA(1,1,0) with drift : 4410.036 ARIMA(0,1,1) with drift : 4442.558 ARIMA(1,1,2) with drift : 4401.178 ARIMA(1,1,1) with drift : 4399.421 ARIMA(1,1,1): 4398.502 ARIMA(0,1,1): 4443.709 ARIMA(2,1,1): 4400.818 ARIMA(1,1,0): 4409.569 ARIMA(1,1,2): 4400.196 ARIMA(0,1,0): 4526.782 ARIMA(2,1,2): 4401.824 Best model: ARIMA(1,1,1) (bestOrder - cbind(modArima$arma[1],modArima$arma[5],modArima$arma[2])) [,1] [,2] [,3] [1,]111 (bestSessionOrder - cbind(modArima$arma[3],modArima$arma[6],modArima$arma[4])) [,1] [,2] [,3] [1,]010 modArimax - arimax(depVar[1:nTrain,], order=bestOrder, xtransf=data.frame(indepVar[1:nTrain,])) After testing and validation, I think the model is robust enough go for real forecasting. Q1. Since my model is trained until Jul 2007, I shall 'update' the model to to include values up to Sep 2007, how can I 'update' it? Q2. Now, say I am forecasting for next month Nov 2008. But I yet to have Nov 08 data for the 2 independent variables In fact currently, I only have Sep 2008 values. I think the parameter transfer is the solution? Would appreciate someone can shed some light on how I can proceed. Many Thanks. siangli _ Rob J Hyndman Professor of Statistics, Monash University Editor-in-Chief, International Journal of Forecasting http://www.robjhyndman.com/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] combining same-day lab measurements with 'apply'
Another request for help implementing the 'apply' functions to avoid a loop structure... I am working with a data set that includes lab measurements taken at different dates for the subjects, with some subjects having more results than others. I would like to average lab results for each subject that were taken on the same day. I can do this using a for loop, but would like to know how to efficiently accomplish the same thing without looping as I will likely have to do the same with a much larger data set. At the end of this post are examples of what I'm starting with and what I want the result to look like: I tried another suggestion I saw on this list using a list object for the index of a call to 'tapply' as in: new.x - tapply(x, list(id, date), mean) but this produced a table-like object referencing every subject id with every date in the dataset - too large for the full data set and also would require serious re-working (at least with the tools I know) to return to the original dataframe structure. Another attempt was pasting the id and date together to create a single indexing vector. I could get this to work, but it seems clumsy to be substring'ing the names attribute of the resulting dataframe and implementing this with id's that range from 1 to 3 digits further complicates things: new.x - tapply(x, paste(id, date),mean) data.frame( + id = substr(names(new.x),start=1,stop=1), + x = new.x, + date = as.Date(substr(names(new.x),start=3,stop=100))) idx date 2 2005-12-15 2 21.0 2005-12-15 2 2006-01-13 2 22.5 2006-01-13 3 2000-04-05 3 17.0 2000-04-05 4 2003-05-23 4 18.0 2003-05-23 4 2003-07-08 4 27.0 2003-07-08 4 2003-11-30 4 24.5 2003-11-30 5 2001-04-19 5 23.0 2001-04-19 I could get this to work, but it seems clumsy to be substring'ing the names attribute of the resulting dataframe and implementing. Also, the full data set has subject id's that range from 1 to 3 digits further complicates things the 'substr' call (although it just occurred to me that I could use strsplit as well..). It may be irrelevant, but the 'date' variable is a Date class object. I've tried first converting this to a character object but didn't get anywhere. Further, I'll use the dates later with difftime to figure the subjects' age at the onset of their condition, so I'd like to avoid converting between classes too much. Any advice would be greatly appreciated. Here is the code to build the sample data and the working for loop as well: dum - data.frame( + id = c(2,2,2,3,4,4,4,4,5,5), + x = sample(15:30,length(id)), + date = as.Date(c(12/15/2005,1/13/2006,1/13/2006,4/5/2000,5/23/2003, + 7/8/2003,11/30/2003,11/30/2003,4/19/2001,4/19/2001),format=%m/%d/%Y) + ) id.list - unique(id) dum id x date 1 2 21 2005-12-15 2 2 22 2006-01-13 3 2 23 2006-01-13 4 3 17 2000-04-05 5 4 18 2003-05-23 6 4 27 2003-07-08 7 4 25 2003-11-30 8 4 24 2003-11-30 9 5 26 2001-04-19 10 5 20 2001-04-19 output - NULL for (i in seq(along=id.list)) { + sel - dum$id==id.list[i] + x.averaged - tapply(dum$x[sel], dum$date[sel], mean, na.rm=TRUE) + dat - data.frame(id.list[i], x.averaged, names(x.averaged)) + output - rbind(output, dat) + } names(output) - names(dum) rownames(output) - NULL output idx date 1 2 24.0 2005-12-15 2 2 22.0 2006-01-13 3 3 19.0 2000-04-05 4 4 22.0 2003-05-23 5 4 26.0 2003-07-08 6 4 28.5 2003-11-30 7 5 21.0 2001-04-19 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.